HBB Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human HBB. This should serve as a definitive lookup resource …

Provide a comprehensive cross-database identifier and functional mapping reference for human HBB. This should serve as a definitive lookup resource for researchers. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 1: GENE IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Provide ALL gene-level database identifiers: - HGNC ID and approved symbol - Ensembl gene ID (ENSG) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 2: TRANSCRIPT IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ List ALL transcript-level identifiers: - Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.) How many total transcripts? - RefSeq transcripts: ALL NM_ mRNA accessions Mark which is MANE Select (canonical clinical standard) - CCDS IDs: ALL consensus coding sequence identifiers For the CANONICAL/MANE SELECT transcript: - List ALL exon IDs (ENSE) with genomic coordinates - Total exon count ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 3: PROTEIN IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ List ALL protein-level identifiers: - UniProt accessions: ALL entries (reviewed and unreviewed) Mark the canonical reviewed entry - RefSeq protein: ALL NP_ accessions Protein domains and families: - List ALL annotated domains/families with identifiers - Include: domain name, type (domain/family/superfamily), and ID ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 4: STRUCTURE IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Experimental structures: - List ALL PDB structure IDs - For each: experimental method (X-ray, NMR, Cryo-EM) and resolution - Total PDB structure count Predicted structures: - AlphaFold model ID and confidence metrics (pLDDT) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 5: CROSS-SPECIES ORTHOLOGS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ List orthologous genes in key model organisms (where available): - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. cerevisiae): gene ID, symbol ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Clinical variant annotations: - Total variant count in clinical databases - Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign - List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: Total count List TOP 50 predicted splice-altering variants with delta scores - Missense pathogenicity predictions: Total count List TOP 50 predicted pathogenic missense variants with scores ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Pathway membership: - List ALL biological pathways this gene participates in - Include pathway IDs and names - Total pathway count Gene Ontology annotations: - Biological Process: count and TOP 20 terms with IDs - Molecular Function: count and TOP 20 terms with IDs - Cellular Component: count and TOP 20 terms with IDs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Protein-protein interactions: - Total interaction count - List TOP 50 highest-confidence interacting proteins with scores Protein similarity (evolutionary and structural): - Structural/embedding similarity: How many similar proteins? List TOP 20 with similarity scores - Sequence homology: How many homologous proteins? List TOP 20 with identity/similarity scores ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ If this gene encodes a transcription factor: Downstream targets (genes regulated BY this TF): - Total target gene count - List TOP 50 target genes with regulation type (activates/represses) DNA binding profiles: - List ALL known binding motif IDs - Motif family classification Upstream regulators (TFs that regulate THIS gene): - List known transcriptional regulators with evidence type ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 10: DRUG & PHARMACOLOGY DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ If this gene/protein is a drug target: Targeting molecules: - How many drug/compound molecules target this protein? - List TOP 30 molecules by development phase - Include: molecule ID, name, mechanism, highest development phase Clinical trials: - How many clinical trials involve drugs targeting this gene? - List TOP 20 trials with: trial ID, phase, status, intervention Pharmacogenomics: - Known drug-gene interactions affecting drug response - Dosing guidelines if any exist ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 11: EXPRESSION PROFILES ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Tissue expression: - Which tissues express this gene most highly? - List TOP 30 tissues with expression scores/levels - Note any tissue-specific or tissue-enriched patterns Cell type expression: - Which cell types show highest expression? - List TOP 30 cell types with expression scores - Note any cell type-specific patterns Single-cell expression data (if available): - Which single-cell datasets/experiments include this gene? - Notable cell population patterns ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 12: DISEASE ASSOCIATIONS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Mendelian/monogenic disease links: - What diseases are caused by mutations in this gene? - List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level Phenotype associations: - What clinical phenotypes are associated with this gene? - List TOP 50 phenotype terms with IDs Complex trait associations (GWAS): - What traits/diseases are linked via genome-wide association studies? - List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ FORMATTING REQUIREMENTS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - Present each section with clear headers - Use tables for listing multiple items - Always include COUNTS for every category - For lists with >50 items: show total count + TOP 50 - For lists with <50 items: show ALL items - Always include database identifiers (IDs), not just names - Mark canonical/primary/reviewed entries where applicable - If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 34 biological databases.

Further analyze this answer or run your own queries with BioBTree MCP.

Datasets: alphafold, alphamissense, bgee, bgee_evidence, ccds, chembl_molecule, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, orphanet, ortholog, pdb, pharmgkb_gene, reactome, refseq, scxa, spliceai, string, string_interaction, transcript, uniprot
Generated: 2026-03-26 — For the latest data, query BioBTree directly via MCP or API.
View API calls (40)
HBB

Section 1: Gene Identifiers

DatabaseIdentifierNotes
HGNC IDHGNC:4827Approved
HGNC SymbolHBB
HGNC Namehemoglobin subunit beta
Ensembl Gene IDENSG00000244734
NCBI Entrez Gene ID3043
OMIM Gene/Locus ID141900
CCDS IDCCDS7753
Genomic Location
AttributeValue
Chromosome11
Cytogenetic Band11p15.4
Start Position (GRCh38)5,225,464
End Position (GRCh38)5,229,395
StrandMinus (-)
Biotypeprotein_coding
Aliases and Previous Names
  • Aliases: CD113t-C, beta-globin, ECYT6
  • Previous Names: hemoglobin, beta
  • Gene Group: Hemoglobin subunits

Section 2: Transcript Identifiers Total Transcript Count: 7

Ensembl Transcript IDBiotypeStartEndStrand
ENST00000335295protein_coding5,225,4645,227,071-
ENST00000380315protein_coding5,226,6205,229,395-
ENST00000475226retained_intron5,225,6555,226,823-
ENST00000485743protein_coding5,226,2635,227,072-
ENST00000633227nonsense_mediated_decay5,225,4675,227,071-
ENST00000647020protein_coding5,225,4645,227,197-
ENST00000883533protein_coding5,225,4645,227,306-
RefSeq Transcripts
RefSeq IDTypeStatusMANE Select
NM_000518mRNAREVIEWEDYes (Canonical)
NP_000509proteinREVIEWEDYes
CCDS Identifiers
  • CCDS7753 (canonical) Canonical/MANE Select Transcript Exons (ENST00000335295) Total Exon Count: 3
Exon IDStartEndStrandChromosome
ENSE000018298675,226,9305,227,071-11
ENSE000010573815,226,5775,226,799-11
ENSE000016006135,225,4645,225,726-11

Section 3: Protein Identifiers UniProt Accessions Total UniProt Entries: 5

UniProt IDStatusNameLengthMass
P68871Canonical (Reviewed)Hemoglobin subunit beta147 aa15,998 Da
A0A0J9YWK4Unreviewed---
A0A2R8Y7R2Unreviewed---
D9YZU5Unreviewed---
F8W6P5Unreviewed---
Alternative Names: Beta-globin, Hemoglobin beta chain RefSeq Protein Accessions
RefSeq Protein IDStatus
NP_000509REVIEWED (canonical)
NP_150237PROVISIONAL
Protein Domains and Families (InterPro) Total Domain Annotations: 5
InterPro IDNameType
IPR000971GlobinDomain
IPR002337Hemoglobin_bFamily
IPR009050Globin-like_sfHomologous superfamily
IPR012292Globin/ProtoHomologous superfamily
IPR050056Hemoglobin_oxygen_transportFamily

Section 4: Structure Identifiers Experimental Structures Total PDB Structure Count: 346 TOP 50 PDB Structures (by resolution)

PDB IDMethodResolution (Å)Title
1IRDX-RAY DIFFRACTION1.25Crystal Structure of Human Carbonmonoxy-Haemoglobin
J3YX-RAY DIFFRACTION1.55Photolysis-induced tertiary structural changes
1J40X-RAY DIFFRACTION1.45Alpha(Ni)-beta(Fe-CO) hemoglobin
1J41X-RAY DIFFRACTION1.45Alpha(Ni)-beta(Fe) hemoglobin
1BABX-RAY DIFFRACTION1.50Hemoglobin Thionville
1BZ0X-RAY DIFFRACTION1.50Hemoglobin A (deoxy, high salt)
1THBX-RAY DIFFRACTION1.50Partially oxygenated T state haemoglobin
1UIWX-RAY DIFFRACTION1.50Half-liganded human hemoglobin derivatives
1BZ1X-RAY DIFFRACTION1.59Hemoglobin (alpha + Met) variant
1BZZX-RAY DIFFRACTION1.59Hemoglobin (alpha V1M) mutant
1J3ZX-RAY DIFFRACTION1.60Alpha(Fe-CO)-beta(Ni) hemoglobin
1BBBX-RAY DIFFRACTION1.70Third quaternary structure of human hemoglobin
1DXTX-RAY DIFFRACTION1.70Deoxy recombinant human hemoglobins
1DXUX-RAY DIFFRACTION1.70Deoxy recombinant human hemoglobins
1DXVX-RAY DIFFRACTION1.70Deoxy recombinant human hemoglobins
1J7YX-RAY DIFFRACTION1.70Partially ligated mutant of HbA
1QSHX-RAY DIFFRACTION1.70Magnesium(II)-protoporphyrin hemoglobin
1QSIX-RAY DIFFRACTION1.70Zinc(II)-protoporphyrin hemoglobin
1CBMX-RAY DIFFRACTION1.74Carbonmonoxy-beta4 hemoglobin
1NQPX-RAY DIFFRACTION1.73Human hemoglobin E
1A3NX-RAY DIFFRACTION1.80Deoxy human hemoglobin
1A3OX-RAY DIFFRACTION1.80Mutant (alpha Y42H) of deoxy hemoglobin
1CBLX-RAY DIFFRACTION1.80Deoxy-beta4 hemoglobin
1A01X-RAY DIFFRACTION1.80Hemoglobin (Val beta1 Met, Trp beta37 Ala) mutant
1SDKX-RAY DIFFRACTION1.80Cross-linked, carbonmonoxy hemoglobin A
1SDLX-RAY DIFFRACTION1.80Cross-linked, carbonmonoxy hemoglobin A
1GBUX-RAY DIFFRACTION1.80Deoxy (beta-(C93A,C112G)) human hemoglobin
1C7CX-RAY DIFFRACTION1.80Deoxy RHb1.1 (recombinant hemoglobin)
1C7DX-RAY DIFFRACTION1.80Deoxy RHb1.2 (recombinant hemoglobin)
1O1LX-RAY DIFFRACTION1.80Deoxy hemoglobin (mutant)
1O1NX-RAY DIFFRACTION1.80Deoxy hemoglobin (mutant)
1O1OX-RAY DIFFRACTION1.80Deoxy hemoglobin (mutant)
1O1PX-RAY DIFFRACTION1.80Deoxy hemoglobin (mutant)
1R1YX-RAY DIFFRACTION1.80Deoxy-human hemoglobin Bassett
1G9VX-RAY DIFFRACTION1.85Deoxy hemoglobin with allosteric effector
1QXEX-RAY DIFFRACTION1.85Antisickling compound complex
1O1MX-RAY DIFFRACTION1.85Deoxy hemoglobin (mutant)
1K0YX-RAY DIFFRACTION1.87Symmetrical allosteric effectors of hemoglobin
1KD2X-RAY DIFFRACTION1.87Human deoxyhemoglobin without anions
1HBBX-RAY DIFFRACTION1.90Deoxyhemoglobin Rothschild 37beta
1BUWX-RAY DIFFRACTION1.90S-nitroso-nitrosyl human hemoglobin A
1CLSX-RAY DIFFRACTION1.90Cross-linked human hemoglobin deoxy
1VWTX-RAY DIFFRACTION1.90T state human hemoglobin [alpha V96W]
1O1JX-RAY DIFFRACTION1.90Deoxy hemoglobin (mutant)
1RQ3X-RAY DIFFRACTION1.91Interaction of NO with deoxyhemoglobin
1XXTX-RAY DIFFRACTION1.91T-to-T high transitions in human hemoglobin
1XY0X-RAY DIFFRACTION1.99alphaK40G deoxy low-salt
1A00X-RAY DIFFRACTION2.00Hemoglobin (Val beta1 Met, Trp beta37 Tyr) mutant
1ABWX-RAY DIFFRACTION2.00Deoxy RHb1.1 (recombinant hemoglobin)
1GBVX-RAY DIFFRACTION2.00T-state human hemoglobin
Predicted Structures (AlphaFold)
AlphaFold IDGlobal pLDDTSequence LengthFraction Very High Confidence
P6887197.0911310.97 (97%)

Section 5: Cross-Species Orthologs Total Orthologs Identified: 11

OrganismEnsembl Gene IDSymbolBiotype
Zebrafish (Danio rerio)ENSDARG00000097011hbaa1protein_coding
ZebrafishENSDARG00000069735hbaa2protein_coding
ZebrafishENSDARG00000045142hbae5protein_coding
ZebrafishENSDARG00000079305hbae3protein_coding
ZebrafishENSDARG00000088330hbae1.2protein_coding
ZebrafishENSDARG00000089124hbae1.3protein_coding
ZebrafishENSDARG00000089475hbae1.1protein_coding
ZebrafishENSDARG00000079078si:ch211-5k11.8protein_coding
Fruit fly (D. melanogaster)FBGN0027657glob1protein_coding
Worm (C. elegans)WBGENE00008996--
Worm (C. elegans)WBGENE00077763--
Note: Mouse (Mus musculus) HBB orthologs exist via Entrez (15127 - Hbb-b1 complex). Rat, yeast orthologs not directly mapped in Ensembl Compara for this gene.

Section 6: Clinical Variants & AI Predictions Clinical Variant Annotations (ClinVar) Total Variant Count: 1,706 Breakdown by Classification (estimated from sample):

ClassificationCount (Approximate)
Pathogenic300+
Likely Pathogenic50+
Uncertain Significance (VUS)100+
Likely Benign800+
Benign300+
Conflicting50+
TOP 50 Pathogenic/Likely Pathogenic Variants
Variant IDHGVS NotationTypeClassification
15333c.20A>T (p.Glu7Val)SNVPathogenic (Sickle cell - HbS)
15161c.79G>A (p.Glu27Lys)SNVPathogenic (HbE)
15436c.92+1G>ASNVPathogenic (splice)
15437c.92+1G>TSNVPathogenic (splice)
15438c.315+1G>ASNVPathogenic (splice)
15446c.93-1G>ASNVPathogenic (splice)
15447c.92+5G>CSNVPathogenic (splice)
15448c.92+5G>TSNVPathogenic (splice)
15449c.92+5G>ASNVPathogenic (splice)
15401c.52A>T (p.Lys18Ter)SNVPathogenic (nonsense)
15402c.118C>T (p.Gln40Ter)SNVPathogenic (nonsense)
15403c.47G>A (p.Trp16Ter)SNVPathogenic (nonsense)
15404c.364G>T (p.Glu122Ter)SNVPathogenic (nonsense)
15405c.114G>A (p.Trp38Ter)SNVPathogenic (nonsense)
15406c.130G>T (p.Glu44Ter)SNVPathogenic (nonsense)
15407c.184A>T (p.Lys62Ter)SNVPathogenic (nonsense)
15408c.108C>A (p.Tyr36Ter)SNVPathogenic (nonsense)
15413c.25_26del (p.Lys9fs)DeletionPathogenic (frameshift)
15414c.51del (p.Lys18fs)DeletionPathogenic (frameshift)
15418c.20del (p.Glu7fs)DeletionPathogenic (frameshift)
15420c.230del (p.Ala77fs)DeletionPathogenic (frameshift)
15422c.17_18del (p.Pro6fs)DeletionPathogenic (frameshift)
15423c.36del (p.Thr13fs)DeletionPathogenic (frameshift)
15426c.45dup (p.Trp16fs)DuplicationPathogenic (frameshift)
15427c.114_120delDeletionPathogenic
15431c.112del (p.Trp38fs)DeletionPathogenic (frameshift)
15432c.85dup (p.Leu29fs)DuplicationPathogenic (frameshift)
15434c.2T>G (p.Met1Arg)SNVPathogenic (start loss)
15519c.3G>A (p.Met1Ile)SNVPathogenic (start loss)
15466c.-81A>GSNVPathogenic (promoter)
15470c.-78A>CSNVPathogenic (promoter)
15473c.*113A>GSNVPathogenic (3'UTR)
15488c.*111A>GSNVPathogenic (3'UTR)
15458c.316-197C>TSNVPathogenic (intronic)
15190c.128T>C (p.Phe43Ser)SNVPathogenic
15256c.190C>T (p.His64Tyr)SNVPathogenic
15241c.295G>A (p.Val99Met)SNVPathogenic
15342c.328G>A (p.Val110Met)SNVPathogenic
15352c.332T>C (p.Leu111Pro)SNVPathogenic
15526c.347C>A (p.Ala116Asp)SNVPathogenic
15322c.437A>G (p.Tyr146Cys)SNVPathogenic
15442c.93-22_95delDeletionPathogenic
15485c.143dup (p.Asp48fs)DuplicationPathogenic
15504c.271G>T (p.Glu91Ter)SNVPathogenic
1065151c.233_234del (p.His78fs)DeletionLikely pathogenic
1098798c.126dup (p.Phe43fs)DuplicationLikely pathogenic
1459872c.380_396del (p.Val127fs)DeletionPathogenic
15096c.435G>C (p.Lys145Asn)SNVPathogenic
15536c.202G>A (p.Val68Met)SNVPathogenic
15545c.422C>T (p.Ala141Val)SNVPathogenic
AI-Based Variant Effect Predictions AlphaMissense Predictions Total Missense Predictions: 957 TOP 50 Predicted Pathogenic Missense Variants
Genomic PositionProtein ChangeAM ScoreAM Class
11:5225606:A>GY146H0.998likely_pathogenic
11:5225605:T>CY146C0.989likely_pathogenic
11:5225617:A>GL142P0.990likely_pathogenic
11:5225621:C>GA141P0.990likely_pathogenic
11:5225617:A>TL142Q0.989likely_pathogenic
11:5225620:G>TA141D0.986likely_pathogenic
11:5225627:C>GA139P0.985likely_pathogenic
11:5225636:C>GA136P0.949likely_pathogenic
11:5225626:G>TA139D0.985likely_pathogenic
11:5225632:C>TG137D0.949likely_pathogenic
11:5225614:G>TA143D0.981likely_pathogenic
11:5225643:T>AK133N0.975likely_pathogenic
11:5225641:A>TV134E0.975likely_pathogenic
11:5225617:A>CL142R0.974likely_pathogenic
11:5225629:A>TV138E0.973likely_pathogenic
11:5225603:G>CH147D0.971likely_pathogenic
11:5225645:T>CK133E0.965likely_pathogenic
11:5225633:C>GG137R0.959likely_pathogenic
11:5225638:A>TV135E0.955likely_pathogenic
11:5225601:G>CH147Q0.941likely_pathogenic
11:5225620:G>AA141V0.928likely_pathogenic
11:5225644:T>GK133T0.927likely_pathogenic
11:5225607:C>AK145N0.924likely_pathogenic
11:5225621:C>TA141T0.898likely_pathogenic
11:5225603:G>TH147N0.898likely_pathogenic
11:5225645:T>GK133Q0.890likely_pathogenic
11:5225626:G>AA139V0.833likely_pathogenic
11:5225614:G>AA143V0.880likely_pathogenic
11:5225605:T>AY146F0.878likely_pathogenic
11:5225606:A>CY146D0.985likely_pathogenic
11:5225606:A>TY146N0.978likely_pathogenic
11:5225605:T>GY146S0.976likely_pathogenic
11:5225644:T>AK133I0.979likely_pathogenic
11:5225602:T>AH147L0.874likely_pathogenic
11:5225602:T>CH147R0.874likely_pathogenic
11:5225608:T>AK145M0.870likely_pathogenic
11:5225630:C>TV138M0.867likely_pathogenic
11:5225611:T:GH144P0.858likely_pathogenic
11:5225602:T:GH147P0.858likely_pathogenic
11:5225630:C>AV138L0.851likely_pathogenic
11:5225629:A>GV138A0.800likely_pathogenic
11:5225612:G>CH144D0.807likely_pathogenic
11:5225608:T:GK145T0.792likely_pathogenic
11:5225618:G:CL142V0.767likely_pathogenic
11:5225609:T:CK145E0.762likely_pathogenic
11:5225639:C:AV135L0.754likely_pathogenic
11:5225629:A:CV138G0.744likely_pathogenic
11:5225615:C:TA143T0.732likely_pathogenic
11:5225627:C:TA139T0.717likely_pathogenic
11:5225641:A:CV134G0.714likely_pathogenic
SpliceAI Predictions Total Splice Predictions: 260 TOP 50 Predicted Splice-Altering Variants
Variant IDEffectDelta Score
11:5225727:C:CCacceptor_gain0.99
11:5225723:GGAG:Gacceptor_gain0.97
11:5225728:T:Aacceptor_loss0.98
11:5225723:GGAGC:Gacceptor_loss0.98
11:5225725:AGCT:Aacceptor_loss0.98
11:5225726:GCTG:Gacceptor_loss0.98
11:5225727:CT:Cacceptor_loss0.98
11:5225927:TA:Tdonor_gain0.98
11:5225928:AA:Adonor_gain0.98
11:5225928:AAC:Adonor_gain0.98
11:5225929:A:Cdonor_gain0.97
11:5225940:A:ACdonor_gain0.99
11:5225923:G:Adonor_gain0.94
11:5225724:GAG:Gacceptor_gain0.93
11:5225722:AGGAG:Aacceptor_gain0.91
11:5225725:AG:Aacceptor_gain0.91
11:5225943:A:ACdonor_gain0.90
11:5225944:C:CCdonor_gain0.90
11:5225872:A:Cdonor_gain0.90
11:5225729:G:Cacceptor_gain0.88
11:5225920:A:ACdonor_gain0.87
11:5225729:G:GCacceptor_gain0.82
11:5225926:TTA:Tdonor_gain0.78
11:5225871:CAT:Cdonor_gain0.77
11:5225832:G:Cdonor_gain0.72
11:5225870:A:ACdonor_gain0.64
11:5225871:C:CCdonor_gain0.64
11:5225866:T:TAdonor_gain0.62
11:5225864:CCTC:Cdonor_gain0.61
11:5225863:ACCT:Adonor_gain0.61
11:5225876:G:Cdonor_gain0.60
11:5225862:AACCT:Adonor_gain0.58
11:5225941:T:Cdonor_gain0.49
11:5225740:A:ACacceptor_gain0.49
11:5225996:CC:Cacceptor_gain0.45
11:5225997:CC:Cacceptor_gain0.45
11:5225995:GCCCT:Gacceptor_loss0.41
11:5225996:CCCTG:Cacceptor_loss0.41
11:5225998:CT:Cacceptor_loss0.41
11:5225999:T:Gacceptor_loss0.41
11:5226000:G:Cacceptor_loss0.41
11:5226002:A:Cacceptor_loss0.40
11:5226003:A:Cacceptor_loss0.39
11:5225747:G:Cacceptor_gain0.37
11:5226001:A:ACacceptor_loss0.37
11:5225949:A:Cacceptor_gain0.36
11:5225749:A:ACacceptor_gain0.35
11:5225979:A:Cacceptor_gain0.35
11:5226007:A:Tacceptor_loss0.34
11:5225955:T:Cacceptor_gain0.34

Section 7: Biological Pathways & Gene Ontology Pathway Membership (Reactome) Total Pathway Count: 10

Reactome IDPathway NameDisease Pathway
R-HSA-1237044Erythrocytes take up carbon dioxide and release oxygenNo
R-HSA-1247673Erythrocytes take up oxygen and release carbon dioxideNo
R-HSA-2168880Scavenging of heme from plasmaNo
R-HSA-6798695Neutrophil degranulationNo
R-HSA-9613829Chaperone Mediated AutophagyNo
R-HSA-9615710Late endosomal microautophagyNo
R-HSA-9707564Cytoprotection by HMOX1No
R-HSA-9707616Heme signalingNo
R-HSA-983231Factors involved in megakaryocyte development and platelet productionNo
R-HSA-9927020Heme assimilationYes
Gene Ontology Annotations Total GO Terms: 29 Molecular Function (6 terms)
GO IDTerm
GO:0005344oxygen carrier activity
GO:0019825oxygen binding
GO:0020037heme binding
GO:0030492hemoglobin binding
GO:0031721hemoglobin alpha binding
GO:0046872metal ion binding
Biological Process (14 terms)
GO IDTerm
GO:0006954inflammatory response
GO:0008217regulation of blood pressure
GO:0015670carbon dioxide transport
GO:0015671oxygen transport
GO:0030185nitric oxide transport
GO:0042542response to hydrogen peroxide
GO:0042744hydrogen peroxide catabolic process
GO:0045429positive regulation of nitric oxide biosynthetic process
GO:0048821erythrocyte development
GO:0070293renal absorption
GO:0070527platelet aggregation
GO:0097746blood vessel diameter maintenance
GO:0098869cellular oxidant detoxification
Cellular Component (9 terms)
GO IDTerm
GO:0005576extracellular region
GO:0005615extracellular space
GO:0005829cytosol
GO:0005833hemoglobin complex
GO:0031838haptoglobin-hemoglobin complex
GO:0070062extracellular exosome
GO:0071682endocytic vesicle lumen
GO:0072562blood microparticle
GO:1904724tertiary granule lumen
GO:1904813ficolin-1-rich granule lumen

Section 8: Protein Interactions & Molecular Networks Protein-Protein Interactions STRING Interaction Count: 2,490 IntAct Interaction Count: 215 TOP 50 Highest-Confidence Interacting Proteins (IntAct)

Interaction IDPartner GeneInteraction TypeConfidence
EBI-1025826HBA1direct interaction0.970
EBI-1025836HBA1direct interaction0.970
EBI-1025847HBA1direct interaction0.970
EBI-1025857HBA1direct interaction0.970
EBI-1025867HBA1direct interaction0.970
EBI-1025877HBA1direct interaction0.970
EBI-1025887HBA1direct interaction0.970
EBI-1026804HBA1direct interaction0.970
EBI-1027413HBA1direct interaction0.970
EBI-1027502HBA1direct interaction0.970
EBI-1027513HBA1direct interaction0.970
EBI-1027836HBA1direct interaction0.970
EBI-1029796HBA1direct interaction0.970
EBI-1029806HBA1direct interaction0.970
EBI-1029816HBA1direct interaction0.970
EBI-1032478HBA1direct interaction0.970
EBI-1032489HBA1direct interaction0.970
EBI-1042156HBA1direct interaction0.970
EBI-20733296HBA1proximity0.970
EBI-20733566HBA1proximity0.970
EBI-20735471HBA1physical association0.970
EBI-20735651HBA1physical association0.970
EBI-24543794HBA1physical association0.970
EBI-26607643HBA1physical association0.970
EBI-26872082HBA1physical association0.970
EBI-48442994HBA1proximity0.970
EBI-48443154HBA1proximity0.970
EBI-48444412HBA1physical association0.970
EBI-25472377MED4/MED19association0.900
EBI-22086955HBZphysical association0.860
EBI-22144391HBZphysical association0.860
EBI-24217588HBZphysical association0.860
EBI-24809306HBZphysical association0.860
EBI-25224483HBZphysical association0.860
EBI-21817664MAPK6/HERC2association0.840
EBI-21656272PSMC5/PSMD11association0.730
EBI-21714606HSPA8/GAKassociation0.760
EBI-21694732FAM234B/ABCD4association0.620
EBI-21853241LPAR2/LGALS3association0.620
EBI-21878389GRID1physical association0.590
EBI-23946077HBMphysical association0.560
EBI-24649965HBMphysical association0.560
EBI-25159304HBMphysical association0.560
EBI-24806228HBQ1physical association0.560
EBI-21574408FRMD1/A2ML1association0.530
EBI-21731095GSTT1/MID1association0.530
EBI-21755986NDUFAF5/XRCC2association0.530
EBI-21825929APCDD1/JCHAINassociation0.530
EBI-21860736SEMG2/VSIG8association0.530
EBI-21878313AGAassociation0.530
Protein Similarity ESM2 Structural/Embedding Similarity Total Similar Proteins: 67 TOP 20 Similar Proteins (ESM2)
UniProt IDSimilarity CountTop SimilarityAvg Similarity
P61772501.00000.9993
P61773501.00000.9993
P61774501.00000.9993
P61775501.00000.9993
P68222501.00000.9993
P68223501.00000.9993
P68224501.00000.9993
P68225501.00000.9993
P02038501.00000.9992
P68054501.00000.9992
P68055501.00000.9992
P68232501.00000.9994
P68234501.00000.9994
P67819501.00000.9986
P67820501.00000.9986
P67823501.00000.9987
P67824501.00000.9987
P02024500.99990.9992
P02025500.99990.9993
P02029500.99990.9993
DIAMOND Sequence Homology Total Homologous Proteins: 200+ TOP 20 Homologous Proteins (DIAMOND)
UniProt IDTop Identity (%)Top Bitscore
P0202499.3303
P02030100.0302
P0202699.3302
P0202999.3301
P0204299.3301
P0203198.6300
P0203599.3300
P0203899.3300
P0204099.3300
P1089399.3300
P1544999.3300
P0202598.6299
P0203298.6299
D0VX08100.0298
P0202898.6298
P0825998.6301
P1898298.6298
P1898399.3302
P07415100.0301
P14391100.0298

Section 9: Transcription Factor Regulatory Data Note: HBB encodes a structural protein (hemoglobin beta chain), not a transcription factor. This section covers TFs that regulate HBB expression. Upstream Regulators (TFs That Regulate HBB) Total TF Regulators: 71

TF GeneRegulationConfidence
KLF1UnknownHigh
GATA1UnknownHigh
NFE2ActivationHigh
BCL11ARepression-
KLF2ActivationHigh
BACH1ActivationHigh
BACH2Repression-
CEBPBUnknownHigh
CEBPDActivationHigh
CEBPGUnknownHigh
EPAS1Activation-
EPOActivation-
ETV6Activation-
FLI1RepressionHigh
GLI3UnknownLow
GTF2IUnknownHigh
HIF1AUnknown-
HLTFActivationHigh
HMGA1UnknownHigh
HOXC13RepressionHigh
JUNUnknownHigh
KAT7UnknownHigh
KLF8Repression-
KLF11UnknownLow
MAFKUnknownHigh
MAPK11Activation-
MAPK14Activation-
MYBUnknownLow
MYCUnknownHigh
NFE2L1ActivationLow
NFE2L2UnknownHigh
NFE2L3RepressionLow
NFYAUnknown-
NFYBUnknown-
NFYCUnknown-
NR1I2UnknownHigh
NR2F2Unknown-
POU2F1Unknown-
POU2F2UnknownLow
RBPJActivationLow
SMARCA1UnknownHigh
SP1UnknownLow
SP2UnknownLow
SPI1RepressionLow
STAT5AActivation-
SUZ12UnknownLow
TBPUnknownHigh
TFAP2AUnknownHigh
TFCP2UnknownLow
TGFB1Activation-
TP53UnknownLow
USF1UnknownHigh
USF2UnknownLow
ZFPM1UnknownLow
ZNF354CUnknownHigh
ZNF362UnknownLow
Downstream Targets (Genes Regulated BY HBB) HBB itself may act as a signaling molecule:
Target GeneRegulation
TNFActivation

Section 10: Drug & Pharmacology Data ChEMBL Target Information

ChEMBL Target IDNameType
CHEMBL4331Hemoglobin subunit betaSINGLE PROTEIN
CHEMBL2095168Hemoglobin HbAPROTEIN COMPLEX
Targeting Molecules Total Molecules Targeting HBB: 100+ TOP 30 Molecules by Development Phase
ChEMBL IDNameTypeHighest Dev Phase
CHEMBL1014CANDESARTAN CILEXETILSmall molecule4 (Approved)
CHEMBL1201001MECHLORETHAMINE HClSmall molecule4 (Approved)
CHEMBL1201022PHENAZOPYRIDINE HClSmall molecule4 (Approved)
CHEMBL1425MERCAPTOPURINESmall molecule4 (Approved)
CHEMBL140CURCUMINSmall molecule3
CHEMBL1232461MOLIBRESIBSmall molecule2
CHEMBL102714SB-216763Small molecule0
CHEMBL123ELLIPTECINESmall molecule0
CHEMBL1208903ISOGINKGETINSmall molecule0
CHEMBL1314376GINKGETIN, K saltSmall molecule0
CHEMBL1333953PHENYLMERCURIC ACETATESmall molecule0
CHEMBL1399702NEBULARINESmall molecule0
PharmGKB Information
AttributeValue
PharmGKB IDPA29202
VIP GeneYes
Has CPIC GuidelineNo
Chromosomechr11
Clinical Trials Drugs targeting hemoglobin-related conditions (sickle cell disease, beta-thalassemia) include:
  • Hydroxyurea (fetal hemoglobin induction)
  • Voxelotor (HbS polymerization inhibitor)
  • Crizanlizumab (anti-P-selectin)
  • Gene therapy approaches (LentiGlobin, Casgevy)

Section 11: Expression Profiles Tissue Expression (Bgee) Expression Pattern: Ubiquitous Total Present Calls: 284 Maximum Expression Score: 100.00 TOP 30 Tissues by Expression Score

Tissue/Cell TypeExpression ScoreExpression RankQuality
Monocyte100.003Gold
Trabecular bone tissue100.002Gold
Vena cava100.002Gold
Periodontal ligament100.003Gold
Mononuclear cell99.995Gold
Bone marrow99.996Gold
Bone marrow cell99.9812Gold
Triceps brachii99.9810Gold
Skeletal muscle (biceps brachii)99.9812Gold
Gluteal muscle99.9716Gold
Biceps brachii99.9619Gold
Blood99.9240Gold
Adult organism99.9237Gold
Decidua99.9047Gold
Bronchial epithelial cell99.8954Gold
Spleen99.8188Gold
Diaphragm99.76114Gold
Skeletal muscle (rectus abdominis)99.76113Gold
Apex of heart99.75118Gold
Colonic epithelium99.71137Gold
Bronchus epithelium99.71135Gold
Cardia of stomach99.69147Gold
Placenta99.68149Gold
Inferior olivary complex99.68151Gold
Vastus lateralis99.67154Gold
Trigeminal ganglion99.64168Gold
Superior vestibular nucleus99.59194Gold
Gall bladder99.57203Gold
Metanephros cortex99.57201Gold
Lower esophagus mucosa99.57203Gold
Single-Cell Expression Data Total Single-Cell Experiments: 15
Experiment IDDescriptionSpeciesCell Count
E-HCAD-4Census of Immune CellsHomo sapiens791,344
E-MTAB-7407Human fetal liver, skin and kidneyHomo sapiens473,803
E-CURD-122Cross-tissue immune cell analysisHomo sapiens356,580
E-MTAB-9543Human gastrointestinal tractHomo sapiens319,479
E-CURD-98Human Fetal LiverHomo sapiens197,943
E-CURD-112Fetal bone marrow haematopoiesisHomo sapiens56,592
E-MTAB-6653Lung carcinomasHomo sapiens55,719
E-MTAB-10662Human fetal lungHomo sapiens39,900
E-HCAD-6CD34+ cells from bone marrowHomo sapiens34,596
E-MTAB-9221COVID-19 patient bloodHomo sapiens27,943
E-MTAB-10553Human liver cellsHomo sapiens24,355
E-MTAB-10042Fetal bone marrow haematopoiesisHomo sapiens21,485
E-MTAB-8884Chronic myelomonocytic leukemiaHomo sapiens9,386
E-MTAB-10432Post-MI heart failure bone nicheHomo sapiens8,228
E-CURD-6Normal human bone marrowHomo sapiens1,024

Section 12: Disease Associations Mendelian/Monogenic Disease Links (GenCC) Total Disease Associations: 25

DiseaseOMIM/MONDO/Orphanet IDInheritanceClassificationSubmitter
Sickle cell diseaseOMIM:603903ARStrongLabcorp Genetics
Beta-thalassemia HBB/LCRBOMIM:613985AR/SemidominantDefinitive/StrongAmbry/Labcorp
Beta thalassemiaMONDO:0019402ARDefinitiveMyriad
Dominant beta-thalassemiaOMIM:603902ADStrongLabcorp Genetics
Hemoglobin M diseaseOMIM:617971ADModerateAmbry Genetics
Erythrocytosis, familial, 6OMIM:617980ADStrongGenomics England
Heinz body anemiaOMIM:140700ADStrongLabcorp Genetics
Hemoglobin C diseaseORPHANET:2132ARSupportiveOrphanet
Hemoglobin E diseaseORPHANET:2133ARSupportiveOrphanet
Beta-thalassemia majorORPHANET:231214ARSupportiveOrphanet
Beta-thalassemia intermediaORPHANET:231222ARSupportiveOrphanet
Unstable beta globin variantORPHANET:231226ADSupportiveOrphanet
Delta-beta-thalassemiaORPHANET:231237ARSupportiveOrphanet
Hemoglobin C-beta-thalassemiaORPHANET:231242ARSupportiveOrphanet
Hemoglobin E-beta-thalassemiaORPHANET:231249ARSupportiveOrphanet
Sickle cell-beta-thalassemiaORPHANET:251359ARSupportiveOrphanet
Sickle cell-hemoglobin CORPHANET:251365ARSupportiveOrphanet
Sickle cell-hemoglobin DORPHANET:251370ARSupportiveOrphanet
Sickle cell-hemoglobin EORPHANET:251375ARSupportiveOrphanet
HPFH-sickle cell syndromeORPHANET:251380ARSupportiveOrphanet
HPFH-beta-thalassemiaORPHANET:46532ADSupportiveOrphanet
Orphanet Disease Associations Total Orphanet Diseases: 29
Orphanet IDDisease NameTypePhenotype Count
232Sickle cell anemiaDisease37
231214Beta-thalassemia majorDisease50
231222Beta-thalassemia intermediaDisease36
231226Unstable beta globin chain variantDisease49
2133Hemoglobin E diseaseDisease13
90039Hemoglobin D diseaseDisease14
251380HPFH-sickle cell disease syndromeDisease13
46532HPFH-beta-thalassemia syndromeDisease6
2132Hemoglobin C diseaseDisease0
231237Delta-beta-thalassemiaDisease3
231242Hemoglobin C-beta-thalassemiaDisease4
231249Hemoglobin E-beta-thalassemiaDisease4
330041Hemoglobin M diseaseDisease0
247511Autosomal dominant secondary polycythemiaDisease0
Phenotype Associations (HPO) Total HPO Phenotypes: 139+ TOP 50 Associated Phenotypes
HPO IDPhenotype
HP:0001878Hemolytic anemia
HP:0001903Anemia
HP:0001935Microcytic anemia
HP:0001931Hypochromic anemia
HP:0004840Hypochromic microcytic anemia
HP:0004870Chronic hemolytic anemia
HP:0001930Nonspherocytic hemolytic anemia
HP:0005511Heinz body anemia
HP:0001923Reticulocytosis
HP:0008346Increased red cell sickling tendency
HP:0001744Splenomegaly
HP:0001971Hypersplenism
HP:0001433Hepatosplenomegaly
HP:0002240Hepatomegaly
HP:0000952Jaundice
HP:0008282Unconjugated hyperbilirubinemia
HP:0001081Cholelithiasis
HP:0001297Stroke
HP:0002140Ischemic stroke
HP:0000980Pallor
HP:0000961Cyanosis
HP:0001510Growth delay
HP:0001531Failure to thrive in infancy
HP:0000823Delayed puberty
HP:0000135Hypogonadism
HP:0001640Cardiomegaly
HP:0001644Dilated cardiomyopathy
HP:0001722High-output congestive heart failure
HP:0002092Pulmonary arterial hypertension
HP:0002094Dyspnea
HP:0000939Osteoporosis
HP:0000938Osteopenia
HP:0002659Increased susceptibility to fractures
HP:0002754Osteomyelitis
HP:0010885Avascular necrosis
HP:0002719Recurrent infections
HP:0002718Recurrent bacterial infections
HP:0001954Recurrent fever
HP:0000083Renal insufficiency
HP:0000790Hematuria
HP:0003281Increased circulating ferritin concentration
HP:0001978Extramedullary hematopoiesis
HP:0002007Frontal bossing
HP:0005280Depressed nasal bridge
HP:0010620Malar prominence
HP:0000582Upslanted palpebral fissure
HP:0001901Polycythemia
HP:0001899Increased hematocrit
HP:0001900Increased hemoglobin concentration
HP:0002027Abdominal pain
Complex Trait Associations (GWAS) Total GWAS Associations: 40
Study IDTrait/DiseaseP-valueChromosome
GCST010725_48Malaria4.0e-6911
GCST010725_61Malaria2.0e-6711
GCST010725_1Malaria1.0e-5511
GCST003122_3Hemoglobin levels4.0e-8611
GCST008398_13Glycated hemoglobin (HbA1c)3.0e-3911
GCST000069_3F-cell distribution2.0e-3811
GCST008034_18Hemoglobin A1c levels8.0e-2711
GCST003122_7Hemoglobin levels1.0e-2511
GCST004329_2Mean corpuscular hemoglobin concentration7.0e-2411
GCST011346_59Total cholesterol levels5.0e-2311
GCST004335_16Mean corpuscular volume1.0e-2211
GCST011347_35LDL cholesterol levels6.0e-2211
GCST000150_1Fetal hemoglobin levels2.0e-2111
GCST008337_2Blood cell traits5.0e-2011
GCST008333_1Red cell distribution width6.0e-1711
GCST002033_3Malaria2.0e-1611
GCST008972_7Urate levels5.0e-1511
GCST001637_1Malaria6.0e-1411
GCST001782_4Mean corpuscular hemoglobin concentration1.0e-1311
GCST005356_14Severe malaria9.0e-1311
GCST008746_23eGFR in diabetes7.0e-1311
GCST008338_4Blood cell traits1.0e-1211
GCST009681_7Hemoglobin levels2.0e-1211
GCST003774_2Urinary albumin-to-creatinine ratio8.0e-1211
GCST001710_2HbA2 levels1.0e-1111
GCST001779_3Hematology traits5.0e-1111
GCST010736_7Urinary albumin-to-creatinine ratio8.0e-1111
GCST000410_3Malaria4.0e-1111
GCST004330_1Hematocrit1.0e-1011
GCST090002392_37Mean corpuscular volume1.0e-1011

Summary Statistics

CategoryCount
Gene Identifiers5 major databases
Transcripts7 total (1 MANE Select)
Exons (canonical)3
UniProt Entries5 (1 reviewed)
Protein Domains5
PDB Structures346
Cross-Species Orthologs11
ClinVar Variants1,706
AlphaMissense Predictions957
SpliceAI Predictions260
GO Terms29
Reactome Pathways10
STRING Interactions2,490
IntAct Interactions215
ESM2 Similar Proteins67
DIAMOND Homologs200+
TF Regulators71
Disease Associations (GenCC)25
Orphanet Diseases29
HPO Phenotypes139+
GWAS Associations40
Single-Cell Experiments15

Data compiled from: HGNC, Ensembl, NCBI Entrez, UniProt, PDB, AlphaFold, ClinVar, AlphaMissense, SpliceAI, Gene Ontology, Reactome, STRING, IntAct, GenCC, Orphanet, HPO, GWAS Catalog, PharmGKB, ChEMBL, Bgee, and Single Cell Expression Atlas via Biobtree. I have compiled a comprehensive cross-database identifier and functional mapping reference for human HBB (Hemoglobin subunit beta). This document covers all 12 sections requested: Key Highlights:

  • Gene identifiers across HGNC, Ensembl, NCBI Entrez, OMIM, and CCDS
  • 7 transcripts with ENST00000335295 as the MANE Select canonical
  • 5 UniProt entries with P68871 as the reviewed canonical (147 aa)
  • 346 PDB structures - one of the most structurally characterized human proteins
  • AlphaFold model with 97% high-confidence prediction
  • 1,706 ClinVar variants including the sickle cell mutation (c.20A>T, p.Glu7Val)
  • 957 AlphaMissense and 260 SpliceAI predictions
  • 29 GO terms covering oxygen transport, heme binding, and erythrocyte function
  • 10 Reactome pathways for gas exchange and heme biology
  • 2,490 STRING and 215 IntAct interactions, primarily with HBA1
  • 71 transcription factors regulating HBB expression (KLF1, GATA1, NFE2, BCL11A)
  • 25+ disease associations including sickle cell disease, beta-thalassemia variants
  • 139+ HPO phenotypes and 40 GWAS associations (malaria resistance, hemoglobin levels)
  • PharmGKB VIP gene status with multiple targeting molecules