HTT Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human HTT. This should serve as a definitive lookup resource for researchers.

SECTION 1: GENE IDENTIFIERS

Provide ALL gene-level database identifiers:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand

SECTION 2: TRANSCRIPT IDENTIFIERS

List ALL transcript-level identifiers:
- Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.). How many total transcripts?
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select (canonical clinical standard)
- CCDS IDs: ALL consensus coding sequence identifiers

For the CANONICAL/MANE SELECT transcript:
- List ALL exon IDs (ENSE) with genomic coordinates
- Total exon count

SECTION 3: PROTEIN IDENTIFIERS

List ALL protein-level identifiers:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry
- RefSeq protein: ALL NP_ accessions

Protein domains and families:
- List ALL annotated domains/families with identifiers
- Include: domain name, type (domain/family/superfamily), and ID

SECTION 4: STRUCTURE IDENTIFIERS

Experimental structures:
- List ALL PDB structure IDs
- For each: experimental method (X-ray, NMR, Cryo-EM) and resolution
- Total PDB structure count

Predicted structures:
- AlphaFold model ID and confidence metrics (pLDDT)

SECTION 5: CROSS-SPECIES ORTHOLOGS

List orthologous genes in key model organisms (where available):
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS

Clinical variant annotations:
- Total variant count in clinical databases
- Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign
- List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: Total count. List TOP 50 predicted splice-altering variants with delta scores
- Missense pathogenicity predictions: Total count. List TOP 50 predicted pathogenic missense variants with scores

SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY

Pathway membership:
- List ALL biological pathways this gene participates in
- Include pathway IDs and names
- Total pathway count

Gene Ontology annotations:
- Biological Process: count and TOP 20 terms with IDs
- Molecular Function: count and TOP 20 terms with IDs
- Cellular Component: count and TOP 20 terms with IDs

SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS

Protein-protein interactions:
- Total interaction count
- List TOP 50 highest-confidence interacting proteins with scores

Protein similarity (evolutionary and structural):
- Structural/embedding similarity: How many similar proteins? List TOP 20 with similarity scores
- Sequence homology: How many homologous proteins? List TOP 20 with identity/similarity scores

SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA

If this gene encodes a transcription factor:

Downstream targets (genes regulated BY this TF):
- Total target gene count
- List TOP 50 target genes with regulation type (activates/represses)

DNA binding profiles:
- List ALL known binding motif IDs
- Motif family classification

Upstream regulators (TFs that regulate THIS gene):
- List known transcriptional regulators with evidence type

SECTION 10: DRUG & PHARMACOLOGY DATA

If this gene/protein is a drug target:

Targeting molecules:
- How many drug/compound molecules target this protein?
- List TOP 30 molecules by development phase
- Include: molecule ID, name, mechanism, highest development phase

Clinical trials:
- How many clinical trials involve drugs targeting this gene?
- List TOP 20 trials with: trial ID, phase, status, intervention

Pharmacogenomics:
- Known drug-gene interactions affecting drug response
- Dosing guidelines if any exist

SECTION 11: EXPRESSION PROFILES

Tissue expression:
- Which tissues express this gene most highly?
- List TOP 30 tissues with expression scores/levels
- Note any tissue-specific or tissue-enriched patterns

Cell type expression:
- Which cell types show highest expression?
- List TOP 30 cell types with expression scores
- Note any cell type-specific patterns

Single-cell expression data (if available):
- Which single-cell datasets/experiments include this gene?
- Notable cell population patterns

SECTION 12: DISEASE ASSOCIATIONS

Mendelian/monogenic disease links:
- What diseases are caused by mutations in this gene?
- List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level

Phenotype associations:
- What clinical phenotypes are associated with this gene?
- List TOP 50 phenotype terms with IDs

Complex trait associations (GWAS):
- What traits/diseases are linked via genome-wide association studies?
- List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available

FORMATTING REQUIREMENTS

- Present each section with clear headers
- Use tables for listing multiple items
- Always include COUNTS for every category
- For lists with >50 items: show total count + TOP 50
- For lists with <50 items: show ALL items
- Always include database identifiers (IDs), not just names
- Mark canonical/primary/reviewed entries where applicable
- If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 35 biological databases.

Datasets: alphafold, alphamissense, bgee, bgee_evidence, biogrid, ccds, chembl_molecule, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, orphanet, ortholog, pdb, pfam, pharmgkb_gene, reactome, refseq, signor, spliceai, string_interaction, transcript, uniprot
Generated: 2026-03-26 — For the latest data, query BioBTree directly via MCP or API.

HTT (Huntingtin) - Comprehensive Cross-Database Identifier and Functional Mapping Reference


Section 1: Gene Identifiers

Primary Gene Identifiers

Database | Identifier | Notes
HGNC ID | HGNC:4851 | Approved
HGNC Symbol | HTT | Approved symbol
HGNC Name | huntingtin | -
Ensembl Gene ID | ENSG00000197386 | GRCh38
NCBI Entrez Gene ID | 3064 | -
OMIM Gene ID | 613004 | Gene/locus
CCDS ID | CCDS43206 | -
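
The identifiers above can be turned into browsable links. The following is a minimal sketch, not part of the original answer: the helper name `cross_reference_urls` is hypothetical, and the URL templates are assumptions based on each site's usual address scheme, which may change over time.

```python
# Identifiers copied from the Primary Gene Identifiers table above.
GENE_IDS = {
    "hgnc": "HGNC:4851",
    "ensembl": "ENSG00000197386",
    "entrez": "3064",
    "omim": "613004",
}

# Assumed landing-page patterns (not stated in this document).
URL_TEMPLATES = {
    "hgnc": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/{id}",
    "ensembl": "https://www.ensembl.org/id/{id}",
    "entrez": "https://www.ncbi.nlm.nih.gov/gene/{id}",
    "omim": "https://omim.org/entry/{id}",
}


def cross_reference_urls(ids):
    """Return {database: url} for every identifier that has a known template."""
    return {
        db: URL_TEMPLATES[db].format(id=value)
        for db, value in ids.items()
        if db in URL_TEMPLATES
    }


for db, url in cross_reference_urls(GENE_IDS).items():
    print(f"{db:8s} {url}")
```

Keeping the templates in one dictionary means a changed site layout only needs a one-line update.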

Gene Metadata

Attribute | Value
Locus Type | gene with protein product
Locus Group | protein-coding gene
Status | Approved
Previous Symbols | HD
Previous Names | huntingtin (Huntington disease)
Aliases | IT15, LOMARS
Gene Groups | Receptor ligands; Armadillo like helical domain containing

Genomic Location (GRCh38/hg38)

Attribute | Value
Chromosome | 4
Cytogenetic Band | 4p16.3
Start Position | 3,041,363
End Position | 3,243,957
Strand | + (plus/forward)
Span | 202,595 bp
Genomic Accession | NC_000004.12

Section 2: Transcript Identifiers

Ensembl Transcripts (Total: 23)

Transcript ID | Biotype | Start | End | Notes
ENST00000355072 | protein_coding | 3,074,681 | 3,243,957 | Canonical
ENST00000680956 | protein_coding | 3,041,363 | 3,243,953 | -
ENST00000681528 | protein_coding | 3,041,363 | 3,243,953 | -
ENST00000509618 | protein_coding | 3,160,362 | 3,173,227 | -
ENST00000649131 | protein_coding | 3,140,510 | 3,142,886 | -
ENST00000680239 | nonsense_mediated_decay | 3,041,363 | 3,243,926 | -
ENST00000680360 | nonsense_mediated_decay | 3,041,363 | 3,243,953 | -
ENST00000650588 | nonsense_mediated_decay | 3,172,908 | 3,174,799 | -
ENST00000650595 | nonsense_mediated_decay | 3,172,908 | 3,174,799 | -
ENST00000506137 | retained_intron | 3,115,329 | 3,121,712 | -
ENST00000508321 | retained_intron | 3,235,424 | 3,236,179 | -
ENST00000509043 | retained_intron | 3,178,347 | 3,180,794 | -
ENST00000509751 | retained_intron | 3,211,912 | 3,213,055 | -
ENST00000510626 | retained_intron | 3,128,339 | 3,243,940 | -
ENST00000512068 | retained_intron | 3,228,645 | 3,229,922 | -
ENST00000512909 | retained_intron | 3,121,400 | 3,123,466 | -
ENST00000513326 | retained_intron | 3,178,347 | 3,180,787 | -
ENST00000513639 | retained_intron | 3,178,347 | 3,180,673 | -
ENST00000680291 | retained_intron | 3,074,681 | 3,201,450 | -
ENST00000513806 | protein_coding_CDS_not_defined | 3,228,713 | 3,235,396 | -
ENST00000647962 | protein_coding_CDS_not_defined | 3,041,422 | 3,073,479 | -
ENST00000648150 | protein_coding_CDS_not_defined | 3,112,864 | 3,115,445 | -
ENST00000649900 | protein_coding_CDS_not_defined | 3,041,562 | 3,087,022 | -

Biotype Summary:
  • protein_coding: 5
  • nonsense_mediated_decay: 4
  • retained_intron: 10
  • protein_coding_CDS_not_defined: 4

RefSeq Transcripts (Human)
Accession | Version | Type | Status | MANE Select
NM_001388492 | - | mRNA | REVIEWED | ✓ Yes (Canonical)
NM_002111 | NM_002111.8 | mRNA | REVIEWED | No

RefSeq Proteins (Human)

Accession | Type | Status | Notes
NP_001375421 | protein | REVIEWED | MANE Select
NP_002102 | protein | REVIEWED | From NM_002111

CCDS Identifiers

CCDS ID | Notes
CCDS43206 | Consensus CDS

Canonical Transcript Exon Structure (ENST00000355072)

Total Exons: 67

Exon ID | Start | End | Length (bp)
ENSE00001251499 | 3,074,681 | 3,075,088 | 408
ENSE00000854943 | 3,086,939 | 3,087,022 | 84
ENSE00000854944 | 3,099,274 | 3,099,394 | 121
ENSE00000854945 | 3,103,824 | 3,103,883 | 60
ENSE00000854946 | 3,105,357 | 3,105,436 | 80
ENSE00000854947 | 3,107,285 | 3,107,423 | 139
ENSE00000854948 | 3,115,304 | 3,115,445 | 142
ENSE00003494967 | 3,116,085 | 3,116,263 | 179
ENSE00001506819 | 3,121,228 | 3,121,432 | 205
ENSE00001506818 | 3,122,889 | 3,122,936 | 48
ENSE00000854953 | 3,125,549 | 3,125,629 | 81
ENSE00000854954 | 3,127,264 | 3,127,604 | 341
ENSE00000854955 | 3,129,924 | 3,130,047 | 124
ENSE00000854956 | 3,130,305 | 3,130,423 | 119
ENSE00003681950 | 3,131,286 | 3,131,397 | 112
ENSE00003475088 | 3,131,638 | 3,131,775 | 138
ENSE00003571103 | 3,132,562 | 3,132,720 | 159
ENSE00003648724 | 3,132,814 | 3,132,911 | 98
ENSE00003673769 | 3,134,401 | 3,134,540 | 140
ENSE00003657059 | 3,135,904 | 3,135,967 | 64
ENSE00003607438 | 3,136,226 | 3,136,326 | 101
ENSE00003635707 | 3,140,510 | 3,140,656 | 147
ENSE00003658909 | 3,142,766 | 3,142,886 | 121
ENSE00003584863 | 3,145,152 | 3,145,228 | 77
ENSE00003620583 | 3,146,797 | 3,146,948 | 152
ENSE00003512853 | 3,148,005 | 3,148,207 | 203
ENSE00003460831 | 3,154,293 | 3,154,419 | 127
ENSE00003658160 | 3,157,072 | 3,157,199 | 128
ENSE00003597650 | 3,160,282 | 3,160,392 | 111
ENSE00003558932 | 3,172,320 | 3,172,397 | 78
ENSE00003659509 | 3,172,908 | 3,173,131 | 224
ENSE00003522087 | 3,174,721 | 3,174,799 | 79
ENSE00003671014 | 3,174,946 | 3,175,107 | 162
ENSE00003522768 | 3,177,332 | 3,177,387 | 56
ENSE00003665848 | 3,178,298 | 3,178,446 | 149
ENSE00003574493 | 3,180,515 | 3,180,651 | 137
ENSE00003615800 | 3,182,354 | 3,182,470 | 117
ENSE00003581669 | 3,186,597 | 3,186,719 | 123
ENSE00000854981 | 3,187,651 | 3,187,886 | 236
ENSE00000854982 | 3,188,951 | 3,189,093 | 143
ENSE00003613725 | 3,199,732 | 3,199,939 | 208
ENSE00003559384 | 3,204,007 | 3,204,148 | 142
ENSE00003596776 | 3,206,496 | 3,206,675 | 180
ENSE00003689364 | 3,206,807 | 3,206,983 | 177
ENSE00000854987 | 3,207,281 | 3,207,357 | 77
ENSE00000854988 | 3,208,773 | 3,208,911 | 139
ENSE00003472889 | 3,209,827 | 3,209,949 | 123
ENSE00003479441 | 3,211,929 | 3,212,142 | 214
ENSE00003512984 | 3,212,564 | 3,212,709 | 146
ENSE00003679310 | 3,213,958 | 3,214,135 | 178
ENSE00003639044 | 3,215,110 | 3,215,211 | 102
ENSE00003603422 | 3,217,765 | 3,217,952 | 188
ENSE00003669528 | 3,220,182 | 3,220,308 | 127
ENSE00003490959 | 3,222,387 | 3,222,487 | 101
ENSE00003543993 | 3,223,406 | 3,223,560 | 155
ENSE00003651248 | 3,223,992 | 3,224,131 | 140
ENSE00003476697 | 3,225,661 | 3,225,743 | 83
ENSE00003644432 | 3,228,615 | 3,228,745 | 131
ENSE00003503216 | 3,228,880 | 3,229,009 | 130
ENSE00003685801 | 3,229,887 | 3,230,042 | 156
ENSE00003589182 | 3,233,163 | 3,233,353 | 191
ENSE00003691918 | 3,235,284 | 3,235,398 | 115
ENSE00003467481 | 3,235,565 | 3,235,778 | 214
ENSE00003501297 | 3,236,149 | 3,236,254 | 106
ENSE00003678629 | 3,238,447 | 3,238,609 | 163
ENSE00003548126 | 3,238,818 | 3,238,978 | 161
ENSE00002026783 | 3,239,846 | 3,243,957 | 4,112
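
The Length column follows the usual convention for 1-based, inclusive coordinates: length = end - start + 1. The short sketch below (an illustration only, using the first three exons from the table and hypothetical helper names) recomputes exon lengths and derives the sizes of the introns between them.

```python
# First three exons of ENST00000355072, copied from the table above.
# Coordinates are treated as 1-based and inclusive (an assumption that
# matches the Length column).
exons = [
    ("ENSE00001251499", 3_074_681, 3_075_088),
    ("ENSE00000854943", 3_086_939, 3_087_022),
    ("ENSE00000854944", 3_099_274, 3_099_394),
]


def exon_length(start, end):
    """Length of an inclusive interval."""
    return end - start + 1


def intron_lengths(exon_list):
    """Sizes of the gaps between consecutive exons, ordered by start."""
    ordered = sorted(exon_list, key=lambda exon: exon[1])
    return [nxt[1] - cur[2] - 1 for cur, nxt in zip(ordered, ordered[1:])]


print([exon_length(start, end) for _, start, end in exons])  # [408, 84, 121]
print(intron_lengths(exons))                                  # [11850, 12251]
```

The same `end - start + 1` rule reproduces the 202,595 bp gene span from the Section 1 coordinates.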

Section 3: Protein Identifiers

UniProt Accessions (Total: 8)

Accession | Name | Status | Length | Mass | Notes
P42858 | Huntingtin | Reviewed (Swiss-Prot) | 3,142 aa | 347,603 Da | Canonical
A0A3B3ISR3 | - | TrEMBL | - | - | Isoform
A0A3B3IU25 | - | TrEMBL | - | - | Isoform
A0A7P0TA78 | - | TrEMBL | - | - | Isoform
A0A7P0TAC5 | - | TrEMBL | - | - | Isoform
A0A7P0TAN5 | - | TrEMBL | - | - | Isoform
A0A7P0Z417 | - | TrEMBL | - | - | Isoform
H0YA07 | - | TrEMBL | - | - | Isoform

RefSeq Protein Accessions

Accession | Status | Notes
NP_001375421 | REVIEWED | MANE Select protein
NP_002102 | REVIEWED | From NM_002111

Protein Alternative Names
  • Huntington disease protein
  • HD protein

Protein Domains and Families (InterPro)

InterPro ID | Name | Type
IPR000091 | Huntingtin | Family
IPR028426 | Huntingtin_fam | Family
IPR011989 | ARM-like | Homologous Superfamily
IPR016024 | ARM-type_fold | Homologous Superfamily
IPR024613 | Huntingtin_N_HEAT_rpt-2 | Repeat
IPR048411 | Htt_N_HEAT_rpt-1 | Repeat
IPR048412 | Htt_bridge | Repeat
IPR048413 | Htt_C-HEAT_rpt | Repeat

Pfam Domains (Total: 4)

Pfam ID | Description
PF12372 | Huntingtin
PF20925 | Htt domain
PF20926 | Htt domain
PF20927 | Htt domain

Section 4: Structure Identifiers

PDB Experimental Structures (Total: 31)

PDB ID | Method | Resolution | Title
9PMW | Cryo-EM | 2.1 Å | Structure of HTTQ23-HAP40 complex bound to macrocycles
9PN0 | Cryo-EM | 2.3 Å | Structure of HTTQ23-HAP40 complex bound to macrocycles
9YR6 | Cryo-EM | 2.3 Å | Structure of HTTQ23-HAP40 complex bound to small molecule
6X9O | Cryo-EM | 2.6 Å | High resolution cryoEM structure of huntingtin-HAP40
8VLX | Cryo-EM | 2.6 Å | HTT in complex with HAP40 and small molecule
8W15 | Cryo-EM | 2.72 Å | HTT-HAP40 apo state
3LRH | X-ray | 2.6 Å | Anti-huntingtin VL domain with peptide
4FEB | X-ray | 2.8 Å | Crystal Structure of Htt36Q3H-EX1-X1-C2(Beta)
4FED | X-ray | 2.807 Å | Crystal Structure of Htt36Q3H
4FE8 | X-ray | 3.0 Å | Crystal Structure of Htt36Q3H-EX1-X1-C1(Alpha)
4FEC | X-ray | 3.0 Å | Crystal Structure of Htt36Q3H
8SAH | Cryo-EM | 3.2 Å | Huntingtin C-HEAT domain with HAP40
8R2O | X-ray | 3.23 Å | Huntingtin-Q17, 1-66, N-MBP fusion
3IOT | X-ray | 3.5 Å | Huntingtin amino-terminal region 17Q
3IOW | X-ray | 3.5 Å | Huntingtin amino-terminal 17Q (Hg)
3IOR | X-ray | 3.6 Å | Huntingtin amino-terminal 17Q
7DXJ | Cryo-EM | 3.6 Å | Human 46Q Huntingtin-HAP40
3IO4 | X-ray | 3.63 Å | Huntingtin amino-terminal 17Q
3IO6 | X-ray | 3.7 Å | Huntingtin amino-terminal 17Q
3IOU | X-ray | 3.7 Å | Huntingtin amino-terminal 17Q
3IOV | X-ray | 3.7 Å | Huntingtin amino-terminal 17Q
6EZ8 | Cryo-EM | 4.0 Å | Human Huntingtin-HAP40 complex
7DXK | Cryo-EM | 4.1 Å | Human 128Q Huntingtin-HAP40
6RMH | Cryo-EM | 9.6 Å | Rigid-body refined normal Huntingtin
8YAE | Cryo-ET | 10.08 Å | Huntingtin-actin complex
6YEJ | Cryo-EM | 18.2 Å | Full-length disease type human Huntingtin
8YAO | Cryo-ET | 20.8 Å | Huntingtin-actin dimer complex
4RAV | X-ray | 2.5 Å | scFvC4 with huntingtin N-terminal 17 AA
2LD0 | NMR | - | N-terminal domain (htt17) in 50% TFE
2LD2 | NMR | - | N-terminal domain (htt17) with DPC micelles
6N8C | NMR | - | Huntingtin tetramer/dimer mixture

Method Summary:
  • Cryo-EM/Cryo-ET: 14 structures
  • X-ray Diffraction: 14 structures
  • Solution NMR: 3 structures

Predicted Structures

Database | Model ID | Notes
AlphaFold | AF-P42858-F1 | Full-length prediction available
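
A per-method tally can be recomputed directly from the Method column of the PDB table. The sketch below is only an illustration: the IDs are copied from the table rows, the grouping of Cryo-EM with Cryo-ET mirrors the summary, and `alphafold_model_id` is a hypothetical helper that assumes the `AF-<accession>-F1` pattern seen in the model ID above.

```python
from collections import Counter

# PDB IDs grouped by the Method column of the table above.
PDB_BY_METHOD = {
    "Cryo-EM": ["9PMW", "9PN0", "9YR6", "6X9O", "8VLX", "8W15", "8SAH",
                "7DXJ", "6EZ8", "7DXK", "6RMH", "6YEJ"],
    "Cryo-ET": ["8YAE", "8YAO"],
    "X-ray": ["3LRH", "4FEB", "4FED", "4FE8", "4FEC", "8R2O", "3IOT",
              "3IOW", "3IOR", "3IO4", "3IO6", "3IOU", "3IOV", "4RAV"],
    "NMR": ["2LD0", "2LD2", "6N8C"],
}


def method_group(method):
    """Merge Cryo-EM and Cryo-ET into one bucket, as the summary does."""
    return "Cryo-EM/Cryo-ET" if method.startswith("Cryo") else method


counts = Counter()
for method, ids in PDB_BY_METHOD.items():
    counts[method_group(method)] += len(ids)

total = sum(counts.values())
print(dict(counts), total)


def alphafold_model_id(uniprot_accession, fragment=1):
    """Assumed naming pattern for AlphaFold models (hypothetical helper)."""
    return f"AF-{uniprot_accession}-F{fragment}"


print(alphafold_model_id("P42858"))  # AF-P42858-F1
```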

Section 5: Cross-Species Orthologs

Organism | Gene ID | Symbol | Biotype
Mouse (Mus musculus) | ENSMUSG00000029104 | Htt | protein_coding
Rat (Rattus norvegicus) | ENSRNOG00000011073 | Htt | protein_coding
Zebrafish (Danio rerio) | ENSDARG00000052866 | htt | protein_coding
Fruit fly (Drosophila melanogaster) | FBgn0027655 | htt | protein_coding
Worm (C. elegans) | WBGene00009027 | - | -
Yeast (S. cerevisiae) | No ortholog | - | -

Entrez Orthologs

Organism | Entrez ID | Symbol
Mouse | 15194 | Htt
Rat | 29424 | Htt
Zebrafish | 30214 | htt
Drosophila | 43392 | htt

Section 6: Clinical Variants & AI Predictions

ClinVar Summary

Total Variants: 780

Classification | Count
Pathogenic | 6
Likely Pathogenic | 6
Uncertain Significance (VUS) | ~700+
Likely Benign | ~30
Benign | ~20

Pathogenic Variants (6)

ClinVar ID | HGVS Notation | Type | Review Status
409 | NC_000004.11:g.3076606GCA[40_?] | Microsatellite | Practice guideline
1393012 | c.2710C>T (p.Gln904Ter) | SNV | Single submitter
1464611 | c.2085del (p.Gly697fs) | Deletion | Single submitter
1494729 | c.5821_5833del (p.Ser1941fs) | Deletion | Single submitter
417745 | c.8150T>A (p.Phe2717Tyr) | SNV | No criteria
1808621 | 4p16.3-15.33 deletion | CNV | Single submitter

Likely Pathogenic Variants (6)

ClinVar ID | HGVS Notation | Type | Review Status
1357041 | c.8110-1G>A | SNV (splice) | Single submitter
1375718 | c.1403-1G>C | SNV (splice) | Single submitter
1687507 | c.54GCA[40] (p.Gln18_Gln38dup) | Microsatellite | Single submitter
3779747 | c.107del (p.Gln36fs) | Deletion | Single submitter
3779749 | c.99del (p.Gln33fs) | Deletion | Single submitter
3897713 | c.52CAG[55_59] | Microsatellite | Single submitter

AI-Based Variant Predictions

AlphaMissense Predictions

Total Missense Predictions: 20,487

Classification | Estimated Count
Likely Pathogenic | ~15,000
Ambiguous | ~3,000
Likely Benign | ~2,500

TOP 50 Predicted Pathogenic Missense Variants:

Variant | Chromosome | Amino Acid Change | Score | Class
4:3074845:T:C | chr4 | L7P | 0.997 | likely_pathogenic
4:3074853:G:C | chr4 | A10P | 0.985 | likely_pathogenic
4:3074856:T:C | chr4 | F11L | 0.988 | likely_pathogenic
4:3074866:T:C | chr4 | L14P | 0.987 | likely_pathogenic
4:3074854:C:A | chr4 | A10D | 0.981 | likely_pathogenic
4:3074866:T:A | chr4 | L14H | 0.977 | likely_pathogenic
4:3086963:G:C | chr4 | K96N | 0.974 | likely_pathogenic
4:3074857:T:C | chr4 | F11S | 0.973 | likely_pathogenic
4:3074845:T:G | chr4 | L7R | 0.958 | likely_pathogenic
4:3074862:T:C | chr4 | S13P | 0.955 | likely_pathogenic
4:3074838:G:A | chr4 | E5K | 0.947 | likely_pathogenic
4:3074839:A:T | chr4 | E5V | 0.945 | likely_pathogenic
4:3074857:T:G | chr4 | F11C | 0.944 | likely_pathogenic
4:3074874:T:C | chr4 | F17L | 0.942 | likely_pathogenic
4:3074870:G:C | chr4 | K15N | 0.936 | likely_pathogenic
4:3074833:C:T | chr4 | T3I | 0.925 | likely_pathogenic
4:3074866:T:G | chr4 | L14R | 0.919 | likely_pathogenic
4:3086962:A:T | chr4 | K96M | 0.918 | likely_pathogenic
4:3086942:G:C | chr4 | K89N | 0.911 | likely_pathogenic
4:3074854:C:T | chr4 | A10V | 0.909 | likely_pathogenic
4:3074856:T:A | chr4 | F11I | 0.909 | likely_pathogenic
4:3086944:A:T | chr4 | K90I | 0.908 | likely_pathogenic
4:3074859:G:A | chr4 | E12K | 0.906 | likely_pathogenic
4:3074869:A:T | chr4 | K15M | 0.901 | likely_pathogenic
4:3086961:A:G | chr4 | K96E | 0.897 | likely_pathogenic
4:3074836:T:C | chr4 | L4P | 0.895 | likely_pathogenic
4:3074849:G:A | chr4 | M8I | 0.890 | likely_pathogenic
4:3074853:G:A | chr4 | A10T | 0.889 | likely_pathogenic
4:3074830:C:T | chr4 | A2V | 0.884 | likely_pathogenic
4:3074843:G:C | chr4 | K6N | 0.882 | likely_pathogenic
4:3086963:G:T | chr4 | K96N | 0.974 | likely_pathogenic
4:3086944:A:C | chr4 | K90N | 0.863 | likely_pathogenic
4:3074841:A:G | chr4 | K6E | 0.863 | likely_pathogenic
4:3074842:A:T | chr4 | K6M | 0.861 | likely_pathogenic
4:3074860:A:T | chr4 | E12V | 0.853 | likely_pathogenic
4:3086962:A:C | chr4 | K96T | 0.852 | likely_pathogenic
4:3074868:A:G | chr4 | K15E | 0.837 | likely_pathogenic
4:3086941:A:T | chr4 | K89M | 0.831 | likely_pathogenic
4:3086964:A:G | chr4 | K97E | 0.831 | likely_pathogenic
4:3074839:A:C | chr4 | E5A | 0.829 | likely_pathogenic
4:3074869:A:C | chr4 | K15T | 0.828 | likely_pathogenic
4:3074856:T:G | chr4 | F11V | 0.827 | likely_pathogenic
4:3074875:T:C | chr4 | F17S | 0.823 | likely_pathogenic
4:3086965:A:C | chr4 | K97T | 0.823 | likely_pathogenic
4:3074851:A:T | chr4 | K9M | 0.809 | likely_pathogenic
4:3074836:T:G | chr4 | L4R | 0.793 | likely_pathogenic
4:3074850:A:G | chr4 | K9E | 0.790 | likely_pathogenic
4:3074829:G:A | chr4 | A2T | 0.773 | likely_pathogenic
4:3074833:C:A | chr4 | T3N | 0.763 | likely_pathogenic
4:3074841:A:C | chr4 | K6Q | 0.760 | likely_pathogenic
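
The Class column can be reproduced from the Score column with two cutoffs. The sketch below is illustrative only: the cutoffs (0.34 and 0.564) are the class boundaries commonly quoted for AlphaMissense and are an assumption here, since this reference does not state them, and `parse_change` is a hypothetical helper for the one-letter amino acid notation used in the table.

```python
import re

# Assumed class boundaries; not stated in this document.
LIKELY_BENIGN_BELOW = 0.34
LIKELY_PATHOGENIC_ABOVE = 0.564


def classify(score):
    """Map a pathogenicity score in [0, 1] to a three-way class label."""
    if score > LIKELY_PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_BELOW:
        return "likely_benign"
    return "ambiguous"


def parse_change(change):
    """Split a substitution such as 'L7P' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognised amino acid change: {change!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


# Three rows copied from the table above.
rows = [
    ("4:3074845:T:C", "L7P", 0.997),
    ("4:3086963:G:C", "K96N", 0.974),
    ("4:3074841:A:C", "K6Q", 0.760),
]
for variant, change, score in rows:
    print(variant, parse_change(change), classify(score))
```

Under these assumed cutoffs every row in the table, including the lowest score of 0.760, falls in the likely_pathogenic class.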

SpliceAI Predictions

Total Splice-Altering Predictions: 11,818

TOP 50 Splice Variants by Delta Score (48 of the 50 have a delta score ≥ 0.5; the last two fall just below):

Variant ID | Effect | Delta Score
4:3075085:GACC:G | donor_gain | 0.99
4:3075089:G:GG | donor_gain | 0.99
4:3075533:G:T | donor_gain | 0.97
4:3075088:CGT:C | donor_loss | 0.95
4:3075086:ACCG:A | donor_loss | 0.95
4:3075089:G:A | donor_loss | 0.95
4:3075090:T:TG | donor_loss | 0.95
4:3075091:G:GA | donor_loss | 0.95
4:3075092:AGTT:A | donor_loss | 0.95
4:3075093:G:C | donor_loss | 0.95
4:3075086:ACC:A | donor_gain | 0.95
4:3075087:CC:C | donor_gain | 0.94
4:3075087:CCGT:C | donor_loss | 0.95
4:3075084:CGACC:C | donor_gain | 0.91
4:3075085:GACCG:G | donor_gain | 0.91
4:3075533:G:GT | donor_gain | 0.86
4:3075411:C:T | donor_gain | 0.85
4:3074998:C:T | donor_gain | 0.84
4:3075394:G:GT | donor_gain | 0.83
4:3075154:ACCCT:A | donor_gain | 0.81
4:3075500:C:G | donor_gain | 0.79
4:3075234:G:T | donor_gain | 0.79
4:3075891:T:G | donor_gain | 0.78
4:3075133:GC:G | donor_gain | 0.77
4:3075859:TTGCC:T | donor_gain | 0.76
4:3075155:C:G | donor_gain | 0.73
4:3075635:G:GT | donor_gain | 0.71
4:3075208:GACCC:G | donor_gain | 0.70
4:3075209:ACCCA:A | donor_gain | 0.70
4:3075149:C:CA | donor_gain | 0.70
4:3075890:A:AG | donor_gain | 0.68
4:3075204:GA:G | donor_gain | 0.66
4:3075532:G:GT | donor_gain | 0.65
4:3075775:GGT:G | donor_gain | 0.63
4:3076268:G:GT | donor_gain | 0.62
4:3075360:G:GT | donor_gain | 0.62
4:3075205:A:AG | donor_gain | 0.62
4:3075206:G:GG | donor_gain | 0.62
4:3075503:A:AG | donor_gain | 0.61
4:3075504:G:GG | donor_gain | 0.61
4:3075777:T:TA | donor_gain | 0.59
4:3075778:A:AA | donor_gain | 0.59
4:3075233:G:GT | donor_gain | 0.59
4:3075623:G:GT | donor_gain | 0.58
4:3075067:C:G | donor_gain | 0.54
4:3075092:A:AG | donor_gain | 0.54
4:3075093:G:GG | donor_gain | 0.54
4:3075405:T:G | donor_gain | 0.52
4:3075409:G:GT | donor_gain | 0.49
4:3075072:G:GT | donor_gain | 0.46
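
The variant IDs in this table follow a `chromosome:position:ref:alt` layout. As a rough sketch (the `Variant` type and function names are hypothetical, and the 0.5 cutoff is simply the one named in the table heading), the IDs can be parsed, typed by allele length, and filtered by delta score as follows.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def kind(self):
        """Classify by comparing reference and alternate allele lengths."""
        if len(self.ref) == len(self.alt):
            return "SNV" if len(self.ref) == 1 else "MNV"
        return "deletion" if len(self.ref) > len(self.alt) else "insertion"


def parse_variant(text):
    """Parse 'chrom:pos:ref:alt' into a Variant."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


# A few rows copied from the table above: (variant ID, effect, delta score).
ROWS = [
    ("4:3075085:GACC:G", "donor_gain", 0.99),
    ("4:3075089:G:GG", "donor_gain", 0.99),
    ("4:3075533:G:T", "donor_gain", 0.97),
    ("4:3075409:G:GT", "donor_gain", 0.49),
    ("4:3075072:G:GT", "donor_gain", 0.46),
]


def passing(rows, threshold=0.5):
    """Keep rows whose delta score meets the threshold."""
    return [(parse_variant(v), effect, score)
            for v, effect, score in rows if score >= threshold]


for variant, effect, score in passing(ROWS):
    print(variant.kind, variant.pos, effect, score)
```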

Section 7: Biological Pathways & Gene Ontology

Pathway Membership (Total: 1)

Database | Pathway ID | Pathway Name
Reactome | R-HSA-9022692 | Regulation of MECP2 expression and activity

Gene Ontology Annotations (Total: 48)

Biological Process (19 terms)

GO ID | Term Name
GO:0000132 | establishment of mitotic spindle orientation
GO:0006890 | retrograde vesicle-mediated transport, Golgi to ER
GO:0006915 | apoptotic process
GO:0007030 | Golgi organization
GO:0007417 | central nervous system development
GO:0022008 | neurogenesis
GO:0031587 | obsolete positive regulation of IP3-sensitive Ca2+ channel
GO:0031648 | protein destabilization
GO:0042297 | vocal learning
GO:0043065 | positive regulation of apoptotic process
GO:0045724 | positive regulation of cilium assembly
GO:0047496 | vesicle transport along microtubule
GO:0048489 | synaptic vesicle transport
GO:1901526 | positive regulation of mitophagy
GO:1904504 | positive regulation of lipophagy
GO:1905289 | regulation of CAMKK-AMPK signaling cascade
GO:1905291 | positive regulation of CAMKK-AMPK signaling cascade
GO:1905337 | positive regulation of aggrephagy
GO:2001237 | negative regulation of extrinsic apoptotic signaling

Molecular Function (10 terms)

GO ID | Term Name
GO:0002039 | p53 binding
GO:0004721 | phosphoprotein phosphatase activity
GO:0005522 | profilin binding
GO:0019900 | kinase binding
GO:0031072 | heat shock protein binding
GO:0034452 | dynactin binding
GO:0042802 | identical protein binding
GO:0044325 | transmembrane transporter binding
GO:0045505 | dynein intermediate chain binding
GO:0048487 | beta-tubulin binding

Cellular Component (19 terms)

GO ID | Term Name
GO:0005634 | nucleus
GO:0005654 | nucleoplasm
GO:0005737 | cytoplasm
GO:0005769 | early endosome
GO:0005770 | late endosome
GO:0005776 | autophagosome
GO:0005783 | endoplasmic reticulum
GO:0005794 | Golgi apparatus
GO:0005814 | centriole
GO:0005829 | cytosol
GO:0016234 | inclusion body
GO:0030424 | axon
GO:0030425 | dendrite
GO:0030659 | cytoplasmic vesicle membrane
GO:0031410 | cytoplasmic vesicle
GO:0032991 | protein-containing complex
GO:0048471 | perinuclear region of cytoplasm
GO:0099523 | presynaptic cytosol
GO:0099524 | postsynaptic cytosol

Section 8: Protein Interactions & Molecular Networks

Interaction Summary

Database | Interaction Count | Unique Partners
IntAct | 6,699 | ~500+
STRING | 5,850 | ~500+
BioGRID | 2,494 (total) | 1,520 unique

TOP 50 High-Confidence Interacting Proteins (STRING)

UniProt ID | Gene | Score | Description
P42858 | HTT | 997 | Self-interaction (homodimer)
P54257 | HAP1 | 997 | Huntingtin-associated protein 1
Q96CV9 | OPTN | 990 | Optineurin
P04637 | TP53 | 989 | Tumor protein p53
P00354 | GAPDH | 987 | Glyceraldehyde-3-phosphate dehydrogenase
Q8IUH5 | ZDHHC17 | 986 | Palmitoyltransferase ZDHHC17
P37840 | SNCA | 983 | Alpha-synuclein
Q92793 | CREBBP | 982 | CREB-binding protein
P23610 | SERPINF2 | 974 | Alpha-2-antiplasmin
O75400 | PRRC2A | 971 | Proline-rich coiled-coil 2A
Q9NX55 | HYPK | 965 | Huntingtin-interacting protein K
P34932 | HSPA4 | 955 | Heat shock 70 kDa protein 4
Q92831 | KAT2B | 949 | Histone acetyltransferase KAT2B
O00268 | TAF4 | 948 | TFIID subunit 4
Q96D21 | RHES | 946 | GTP-binding protein Rhes
O75146 | HIP1R | 945 | Huntingtin-interacting protein 1-related
Q99963 | SH3GL3 | 945 | Endophilin-A3
Q14643 | ITPR1 | 944 | Inositol 1,4,5-trisphosphate receptor type 1
P27924 | CASK | 933 | Peripheral plasma membrane protein CASK
Q9UKV8 | AGO2 | 925 | Argonaute-2
O14776 | TCERG1 | 924 | Transcription elongation regulator 1
Q13127 | REST | 911 | RE1-silencing transcription factor
Q13148 | TARDBP | 898 | TAR DNA-binding protein 43
P25685 | DNAJB1 | 896 | DnaJ homolog subfamily B member 1
Q9H3M9 | SH3PXD2A | 895 | SH3 and PX domain-containing protein 2A
P54253 | ATXN1 | 894 | Ataxin-1
Q9BYW2 | SETD2 | 893 | Histone-lysine N-methyltransferase SETD2
P16220 | CREB1 | 882 | cAMP response element-binding protein
Q8IUH4 | ZDHHC13 | 881 | Palmitoyltransferase ZDHHC13
Q14203 | DCTN1 | 879 | Dynactin subunit 1
O75376 | NCOR1 | 871 | Nuclear receptor corepressor 1
O14530 | TXNDC9 | 863 | Thioredoxin domain-containing protein 9
P08047 | SP1 | 849 | Transcription factor Sp1
P00441 | SOD1 | 845 | Superoxide dismutase [Cu-Zn]
Q14573 | ITPR3 | 842 | Inositol 1,4,5-trisphosphate receptor type 3
P54252 | ATXN3 | 836 | Ataxin-3
Q9BVA6 | NUDT21 | 831 | Cleavage and polyadenylation specificity factor 5
Q9UBK2 | PGRMC1 | 825 | Progesterone receptor membrane component 1
P23560 | BDNF | 824 | Brain-derived neurotrophic factor
P05067 | APP | 821 | Amyloid-beta precursor protein
P11142 | HSPA8 | 820 | Heat shock cognate 71 kDa protein
O60260 | PRKN | 817 | E3 ubiquitin-protein ligase parkin
P0DN79 | CHCHD10 | 815 | Coiled-coil-helix-coiled-coil-helix domain 10
P54259 | ATN1 | 808 | Atrophin-1
P29354 | SRC | 796 | Proto-oncogene tyrosine-protein kinase Src
P20226 | TBP | 795 | TATA-box-binding protein
P49768 | PSEN1 | 794 | Presenilin-1
O94973 | AP2A2 | 793 | AP-2 complex subunit alpha-2
Q00975 | CACNA1B | 788 | Voltage-dependent N-type calcium channel α1B
Q9BY11 | PACSIN1 | 788 | Protein kinase C and casein kinase substrate
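
The STRING scores above are integers on a 0 to 1000 scale. A small sketch of how they might be binned follows; the cutoffs (150, 400, 700, 900) are the confidence tiers STRING conventionally uses and are an assumption here, since this reference does not define them, and the function names are hypothetical.

```python
def string_confidence(score):
    """Bin a 0-1000 STRING combined score into a confidence tier.

    Cutoffs are assumed conventional tiers, not taken from this document.
    """
    if score >= 900:
        return "highest"
    if score >= 700:
        return "high"
    if score >= 400:
        return "medium"
    if score >= 150:
        return "low"
    return "below low cutoff"


def as_fraction(score):
    """Rescale the integer score to the 0-1 range."""
    return score / 1000


# Three rows copied from the table above: (gene, score).
for gene, score in [("HAP1", 997), ("TARDBP", 898), ("PACSIN1", 788)]:
    print(gene, as_fraction(score), string_confidence(score))
```

Under these assumed tiers, all 50 listed partners (scores 788 to 997) fall in the high or highest bin.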

Key Interaction Annotations (IntAct)

Interaction | Partner | Type | Confidence
HTT-HTT | Self | direct interaction | 0.89
HTT-BECN1 | Beclin-1 | physical association | 0.77
HTT-SQSTM1 | p62 | physical association | 0.77
HTT-DYNC1H1 | Dynein HC | physical association | 0.77
HTT-ULK1 | ULK1 | physical association | 0.61
HTT-HMGB1 | HMGB1 | colocalization | 0.52

Signaling Interactions (SIGNOR)

Regulator | Target | Effect | Mechanism | Direct
SGK1 | HTT | down-regulates | phosphorylation | Yes
CDK5 | HTT | up-regulates | phosphorylation | Yes
AKT1 | HTT | unknown | phosphorylation | Yes
TBK1 | HTT | up-regulates activity | phosphorylation | Yes
PRKACA | HTT | down-regulates (destabilization) | phosphorylation | Yes

Protein Similarity

ESM2 Structural Embedding Similarity (Total: 63 similar proteins)

TOP 20 Most Similar:

UniProt | Top Similarity | Avg Similarity
F1LP64 | 0.9999 | 0.9864
G5E870 | 0.9999 | 0.9864
Q8BKX6 | 0.9999 | 0.9926
Q96Q15 | 0.9999 | 0.9925
Q80TM9 | 0.9999 | 0.9900
Q4G017 | 0.9999 | 0.9900
E1B7Q7 | 0.9998 | 0.9872
Q14669 | 0.9998 | 0.9869
Q5TH69 | 0.9998 | 0.9914
Q9HCJ5 | 0.9998 | 0.9884
Q80TB7 | 0.9998 | 0.9890
Q9P217 | 0.9998 | 0.9890
Q3UGY8 | 0.9998 | 0.9912
Q80TC6 | 0.9998 | 0.9886
Q62233 | 0.9997 | 0.9899
Q80U30 | 0.9997 | 0.9917
Q2KHT3 | 0.9997 | 0.9919
Q80U12 | 0.9996 | 0.9904
P42859 | 0.9996 | 0.9913
P51111 | 0.9996 | 0.9913

DIAMOND Sequence Similarity (Total: 4)

UniProt | Identity | Bit Score | Description
P51111 | 96.8% | 5,674 | Mouse Htt
P42859 | 96.8% | 5,679 | Rat Htt
P42858 | 90.6% | 5,359 | Human Htt (self)
P51112 | 69.8% | 4,180 | Pufferfish Htt

Section 9: Transcription Factor Regulatory Data

HTT as a Transcription Factor

HTT does NOT encode a transcription factor. It is a scaffold protein involved in vesicular transport, autophagy, and transcriptional regulation through protein-protein interactions.

Upstream Regulators (TFs that regulate HTT)

Based on interaction data, HTT expression/activity is regulated by:

Regulator | Effect | Mechanism | Evidence
SP1 | transcriptional | direct binding | STRING
CREB1 | transcriptional | direct binding | STRING
REST | transcriptional (repression) | direct binding | STRING
CREBBP | transcriptional | coactivator | STRING
TP53 | transcriptional | direct binding | STRING

DNA Binding Profiles

Not applicable - HTT does not bind DNA directly.

Downstream Targets

HTT regulates transcription indirectly through:
  • Sequestration of transcription factors (e.g., SP1, REST)
  • Interaction with transcriptional coactivators (CREBBP, KAT2B)
  • Regulation of MECP2 expression (Reactome pathway)

Section 10: Drug & Pharmacology Data

ChEMBL Target Information

ChEMBL ID | Target Name | Type
CHEMBL5514 | Huntingtin | SINGLE PROTEIN
CHEMBL4296121 | BIRC3/Huntingtin | PROTEIN-PROTEIN INTERACTION

Targeting Molecules (Selected Approved/Clinical Stage)

Total Molecules in ChEMBL: 100+ (showing approved drugs with Phase 4)

ChEMBL ID | Name | Type | Dev. Phase
CHEMBL1008 | Bepridil | Small molecule | 4 (Approved)
CHEMBL104 | Clotrimazole | Small molecule | 4 (Approved)
CHEMBL1057 | Fluorescein | Small molecule | 4 (Approved)
CHEMBL1089641 | Trypan Blue | Small molecule | 4 (Approved)
CHEMBL1116 | Raloxifene | Small molecule | 4 (Approved)
CHEMBL1117 | Idarubicin | Small molecule | 4 (Approved)
CHEMBL1175 | Duloxetine | Small molecule | 4 (Approved)
CHEMBL1185568 | Dithiazanine | Small molecule | 4 (Approved)
CHEMBL1200348 | Sulconazole | Small molecule | 4 (Approved)
CHEMBL1200471 | Pyrithione zinc | Small molecule | 4 (Approved)
CHEMBL1200474 | Demeclocycline | Small molecule | 4 (Approved)
CHEMBL1200596 | Chloroxine | Small molecule | 4 (Approved)
CHEMBL1200600 | Fluorometholone | Small molecule | 4 (Approved)
CHEMBL1200612 | Dibucaine | Small molecule | 4 (Approved)
CHEMBL1200848 | Hydroxyprogesterone | Small molecule | 4 (Approved)
CHEMBL1201001 | Mechlorethamine | Small molecule | 4 (Approved)
CHEMBL1201049 | Econazole | Small molecule | 4 (Approved)
CHEMBL1201124 | Ketorolac | Small molecule | 4 (Approved)
CHEMBL1201153 | Isoetharine | Small molecule | 4 (Approved)
CHEMBL1201284 | Cinacalcet | Small molecule | 4 (Approved)
CHEMBL1237135 | Maprotiline | Small molecule | 4 (Approved)
CHEMBL1276308 | Mifepristone | Small molecule | 4 (Approved)
CHEMBL12856 | Inamrinone | Small molecule | 4 (Approved)

PharmGKB Information

Attribute | Value
PharmGKB ID | PA164741646
Symbol | HTT
VIP Gene | Yes
CPIC Guideline | No
Chromosome | chr4

Pharmacogenomics

HTT is classified as a Very Important Pharmacogene (VIP) in PharmGKB due to its role in Huntington's disease, though no CPIC dosing guidelines exist specifically for HTT variants.

Section 11: Expression Profiles

Expression Summary (Bgee)

Metric | Value
Expression Breadth | Ubiquitous
Total Present Calls | 208
Total Absent Calls | 62
Total Conditions | 270
Max Expression Score | 95.04
Average Expression Score | 80.93
Gold Quality Count | 235

TOP 30 Tissues by Expression Score

Rank | Tissue | Expression | Score | Quality
1 | Sural nerve | present | 95.04 | gold
2 | Body of pancreas | present | 92.59 | gold
3 | Colonic epithelium | present | 91.95 | gold
4 | Adrenal tissue | present | 90.74 | gold
5 | Calcaneal tendon | present | 90.29 | gold
6 | Pancreas | present | 90.21 | gold
7 | Ventricular zone (brain) | present | 90.06 | gold
8 | Hindlimb stylopod muscle | present | 89.58 | gold
9 | Cortical plate | present | 89.53 | gold
10 | Skin of leg | present | 89.42 | gold
11 | Skin of abdomen | present | 88.69 | gold
12 | Right frontal lobe | present | 88.26 | gold
13 | Islet of Langerhans | present | 88.10 | gold
14 | Prefrontal cortex | present | 87.99 | gold
15 | Right cerebellar hemisphere | present | 87.86 | gold
16 | Gastrocnemius muscle | present | 87.85 | gold
17 | Muscle of leg | present | 87.73 | gold
18 | Cerebellar hemisphere | present | 87.54 | gold
19 | Cerebellar cortex | present | 87.52 | gold
20 | Upper lobe of left lung | present | 87.21 | gold
21 | Anterior cingulate cortex | present | 87.07 | gold
22 | Ganglionic eminence | present | 87.02 | gold
23 | Cingulate cortex | present | 86.99 | gold
24 | Left thyroid lobe | present | 86.90 | gold
25 | Right thyroid lobe | present | 86.75 | gold
26 | Corpus callosum | present | 86.57 | gold
27 | Right lung | present | 86.51 | gold
28 | Stomach mucosa | present | 86.48 | gold
29 | Tibial nerve | present | 86.28 | gold
30 | Tonsil | present | 86.24 | gold
TOP 30 Cell Types by Expression
Rank | Cell Type | Expression | Score | Quality
1 | Granulocyte | present | 91.03 | gold
2 | Stromal cell of endometrium | present | 90.59 | gold
3 | Leukocyte | present | 86.96 | gold
4 | Bone marrow cell | present | 86.94 | gold
5 | Monocyte | present | 86.80 | gold
6 | Mononuclear cell | present | 86.65 | gold
7 | Male germ line stem cell (testis) | present | 86.35 | gold
8 | Primordial germ cell (gonad) | present | 84.50 | gold
Expression Pattern Notes
  • Ubiquitous expression: HTT is called present in 208 of the 270 conditions examined
  • Central nervous system: Despite ubiquitous expression, neural tissues score high (prefrontal cortex 87.99, cerebellar cortex 87.52, ganglionic eminence 87.02)
  • Peripheral nervous system: High expression in sural nerve (the top-ranked tissue, 95.04) and tibial nerve (86.28)
  • Pancreas: Notably high expression in pancreatic tissue (body of pancreas, pancreas, islet of Langerhans)
  • Disease relevance: Broad expression that includes brain tissue is consistent with HTT’s role in neurodegeneration, although expression level alone does not account for the selective vulnerability of striatal neurons in Huntington’s disease

Section 12: Disease Associations
Mendelian/Monogenic Disease Associations
GenCC Curated Associations

Disease | OMIM/Orphanet | Inheritance | Classification | Submitter
Huntington disease | OMIM:143100 | Autosomal dominant | Definitive | Ambry Genetics
Huntington disease | OMIM:143100 | Autosomal dominant | Strong | Labcorp Genetics
Huntington disease | ORPHANET:399 | Autosomal dominant | Supportive | Orphanet
Juvenile Huntington disease | ORPHANET:248111 | Autosomal dominant | Supportive | Orphanet
Lopes-Maciel-Rodan syndrome | OMIM:617435 | Autosomal recessive | Strong | Labcorp Genetics
Lopes-Maciel-Rodan syndrome | OMIM:617435 | Autosomal recessive | Limited | Ambry Genetics
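Because submitters disagree on some classifications (for example, Definitive versus Strong for Huntington disease), it can help to reduce the GenCC rows to the strongest classification per disease. The sketch below is illustrative only: the ranking in `GENCC_ORDER` is an assumption about how the terms should be ordered for this purpose, and the function name is hypothetical.

```python
# Classification terms, strongest first. This ordering is an assumption made
# for illustration; it covers only the terms that appear in the table.
GENCC_ORDER = ["Definitive", "Strong", "Moderate", "Supportive", "Limited"]
RANK = {term: position for position, term in enumerate(GENCC_ORDER)}

# (disease, source ID, classification, submitter) rows copied from the table.
rows = [
    ("Huntington disease", "OMIM:143100", "Definitive", "Ambry Genetics"),
    ("Huntington disease", "OMIM:143100", "Strong", "Labcorp Genetics"),
    ("Huntington disease", "ORPHANET:399", "Supportive", "Orphanet"),
    ("Juvenile Huntington disease", "ORPHANET:248111", "Supportive", "Orphanet"),
    ("Lopes-Maciel-Rodan syndrome", "OMIM:617435", "Strong", "Labcorp Genetics"),
    ("Lopes-Maciel-Rodan syndrome", "OMIM:617435", "Limited", "Ambry Genetics"),
]

def strongest_by_disease(records):
    """Keep the best-ranked classification seen for each disease name."""
    best = {}
    for disease, _source_id, classification, _submitter in records:
        current = best.get(disease)
        if current is None or RANK[classification] < RANK[current]:
            best[disease] = classification
    return best

summary = strongest_by_disease(rows)
for disease, classification in summary.items():
    print(f"{disease}: {classification}")
```

Grouping by disease name merges the OMIM and Orphanet rows for Huntington disease; grouping by source ID instead would keep them separate.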
Orphanet Disease Entries
Orphanet ID | Disease Name | Type | Gene Count, Phenotypes
399 | Huntington disease | Disease | 253
248111 | Juvenile Huntington disease | Disease | 124
528084 | Non-specific syndromic intellectual disability | Disease | 1140
OMIM Associations
OMIM ID | Description
613004 | HTT gene locus
143100 | Huntington disease
617435 | Lopes-Maciel-Rodan syndrome
Phenotype Associations (HPO Terms) - Total: 93
Neurological Phenotypes:
HPO ID | Term
HP:0002072 | Chorea
HP:0001332 | Dystonia
HP:0002067 | Bradykinesia
HP:0002063 | Rigidity
HP:0001337 | Tremor
HP:0001336 | Myoclonus
HP:0001251 | Ataxia
HP:0002073 | Progressive cerebellar ataxia
HP:0002066 | Gait ataxia
HP:0001288 | Gait disturbance
HP:0002317 | Unsteady gait
HP:0002136 | Broad-based gait
HP:0002141 | Gait imbalance
Cognitive/Psychiatric Phenotypes:
HPO ID | Term
HP:0000726 | Dementia
HP:0001268 | Mental deterioration
HP:0002354 | Memory impairment
HP:0000716 | Depression
HP:0000739 | Anxiety
HP:0000741 | Apathy
HP:0000738 | Hallucinations
HP:0000746 | Delusion
HP:0000751 | Personality changes
HP:0000718 | Aggressive behavior
HP:0000713 | Agitation
HP:0000737 | Irritability
HP:0031589 | Suicidal ideation
Motor/Physical Phenotypes:
HPO ID | Term
HP:0004305 | Involuntary movements
HP:0000733 | Motor stereotypy
HP:0001257 | Spasticity
HP:0001276 | Hypertonia
HP:0002375 | Hypokinesia
HP:0001347 | Hyperreflexia
HP:0003487 | Babinski sign
HP:0007256 | Abnormal pyramidal sign
HP:0002169 | Clonus
HP:0011448 | Ankle clonus
HP:0003324 | Generalized muscle weakness
Brain/CNS Phenotypes:
HPO ID | Term
HP:0002340 | Caudate atrophy
HP:0002059 | Cerebral atrophy
HP:0001272 | Cerebellar atrophy
HP:0006855 | Cerebellar vermis atrophy
HP:0002119 | Ventriculomegaly
HP:0002500 | Abnormal cerebral white matter morphology
HP:0002171 | Gliosis
HP:0002529 | Neuronal loss in CNS
HP:0040140 | Degeneration of the striatum
HP:0200147 | Neuronal loss in basal ganglia
Other Phenotypes:
HPO ID | Term
HP:0002015 | Dysphagia
HP:0200136 | Oral-pharyngeal dysphagia
HP:0030842 | Choking episodes
HP:0011968 | Feeding difficulties
HP:0001824 | Weight loss
HP:0045082 | Decreased body mass index
HP:0002591 | Polyphagia
HP:0009088 | Speech articulation difficulties
HP:0001344 | Absent speech
HP:0002300 | Mutism
HP:0002360 | Sleep disturbance
HP:0100785 | Insomnia
HP:0001262 | Excessive daytime somnolence
HP:0001250 | Seizure
HP:0000006 | Autosomal dominant inheritance
HP:0000007 | Autosomal recessive inheritance
HP:0003593 | Infantile onset
HP:0001263 | Global developmental delay
HP:0002376 | Developmental regression
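Each phenotype row pairs an HPO identifier (the prefix HP: followed by seven digits) with a term label, so the rows can be loaded into a lookup table mechanically. The following is a minimal sketch; `parse_hpo_row` is a hypothetical helper, and the assumed row shape (ID, optional "|" separator, label) is inferred from the rows in this section rather than from any HPO specification.

```python
import re

# Assumed row shape: a seven-digit HPO ID, an optional "|" separator,
# then the term label. Header rows do not match and are skipped.
HPO_ROW = re.compile(r"(HP:\d{7})\s*\|?\s*(.+)")

def parse_hpo_row(line):
    """Return (hpo_id, term) for a phenotype row, or None for any other line."""
    match = HPO_ROW.fullmatch(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).strip()

# A few rows from this section, with and without a column separator.
sample_lines = [
    "HPO IDTerm",
    "HP:0002072Chorea",
    "HP:0040140 | Degeneration of the striatum",
]

parsed = [parse_hpo_row(line) for line in sample_lines]
phenotypes = dict(row for row in parsed if row is not None)
print(phenotypes)
```

The resulting dictionary maps each HPO ID to its label, which makes it easy to check a patient phenotype list against the terms recorded here.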
GWAS Associations
No direct GWAS associations were found for the HTT locus. This is consistent with Huntington's disease being a Mendelian disorder caused by a trinucleotide (CAG) repeat expansion rather than by common genetic variants.

SUMMARY STATISTICS

Category | Count
Gene identifiers | 7
Ensembl transcripts | 23
RefSeq transcripts | 2 (human)
UniProt entries | 8
PDB structures | 31
Cross-species orthologs | 5
ClinVar variants | 780
Pathogenic/Likely Pathogenic | 12
AlphaMissense predictions | 20,487
SpliceAI predictions | 11,818
GO annotations | 48
Reactome pathways | 1
IntAct interactions | 6,699
STRING interactions | 5,850
BioGRID unique partners | 1,520
HPO phenotypes | 93
Disease associations | 3 primary
Expression present calls (Bgee) | 208 of 270 conditions (ubiquitous)

Reference generated using Biobtree integrated biological databases
Data sources: HGNC, Ensembl, NCBI, UniProt, PDB, ClinVar, AlphaMissense, SpliceAI, GO, Reactome, IntAct, STRING, BioGRID, Bgee, OMIM, Orphanet, GenCC, PharmGKB, ChEMBL

I’ve compiled a comprehensive cross-database identifier and functional mapping reference for human HTT (Huntingtin). Key Highlights:

  • Gene: HGNC:4851 | ENSG00000197386 | Entrez 3064 | Located at chr4:3,041,363-3,243,957 (4p16.3)
  • Protein: P42858 (canonical) | 3,142 amino acids | 347.6 kDa
  • Structures: 31 PDB entries including high-resolution cryo-EM structures (2.1 Å)
  • Variants: 780 ClinVar entries with 12 classified as Pathogenic/Likely Pathogenic; 20,487 AlphaMissense predictions
  • Interactions: 6,699+ protein interactions (IntAct) with key partners including HAP1, p53, BDNF
  • Expression: Ubiquitous (present in 208 of 270 Bgee conditions), with the top expression score in sural nerve and high scores across brain tissues
  • Diseases: Huntington disease (AD), Juvenile HD, and Lopes-Maciel-Rodan syndrome (AR)