GBA Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human GBA, a definitive lookup resource covering:

### Section 1: Gene identifiers
For human gene GBA, list ALL gene-level database identifiers. Required:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG...)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand (GRCh38)

### Section 2: Transcript identifiers
For human gene GBA, list ALL transcript-level identifiers. Required:
- Ensembl transcripts: ALL ENST IDs with biotype. Total count.
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select.
- CCDS IDs.
- For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count.

### Section 3: Protein identifiers
For human gene GBA protein product(s), list ALL protein-level identifiers. Required:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry.
- RefSeq protein: ALL NP_ accessions.
- Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID.
- Antibody availability: known antibody resources for the protein.

### Section 4: Structure
For human gene GBA protein, list ALL structural data. Required:
- Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count.
- Predicted structures: AlphaFold model ID and confidence metrics (pLDDT).

### Section 5: Cross-species orthologs
For human gene GBA, list orthologous genes in key model organisms. Organisms:
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

### Section 6: Clinical variants & AI predictions
For human gene GBA, summarize clinical variants and AI predictions.
Clinical variant annotations (ClinVar):
- Total variant count (approximate is fine)
- Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition
AI-based variant effect predictions:
- Splice effect predictions: total count + TOP 30 with delta scores if known
- Missense pathogenicity from AlphaMissense: total count + TOP 30 likely-pathogenic with am_pathogenicity scores.

### Section 7: Pathways & Gene Ontology
For human gene GBA, list biological pathways and Gene Ontology annotations.
Pathway membership:
- ALL biological pathways this gene participates in, with pathway IDs and names
- Total pathway count
Gene Ontology:
- Biological Process: count and TOP 20 terms with GO IDs
- Molecular Function: count and TOP 20 terms with GO IDs
- Cellular Component: count and TOP 20 terms with GO IDs

### Section 8: Protein interactions & networks
For human gene GBA protein, summarize protein interactions and networks.
Protein-protein interactions (STRING, IntAct, BioGRID, etc.):
- Total interaction count (approximate)
- TOP 30 highest-confidence interacting proteins with scores/evidence
Protein similarity:
- Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores
- Sequence homology: TOP 20 homologous proteins with identity/similarity

### Section 9: Transcription factor regulatory data
For human gene GBA, summarize transcription factor regulatory data.
If GBA is a transcription factor:
- Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence
- DNA binding motifs from JASPAR: all known motif IDs and motif family classification.
Regardless:
- Upstream regulators: TFs that regulate GBA, with names and evidence type (ChIP-seq / predicted / experimentally validated)
If GBA is not a transcription factor, say so briefly and skip the downstream/motif sections.

### Section 10: Drug & pharmacology data
For human gene GBA protein as a drug target, summarize pharmacology data.
If GBA is a known drug target:
- Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase)
- Clinical trials: TOP 20 involving drugs targeting this gene, with trial ID, phase, status, intervention
- Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any
If GBA is not currently a drug target, say so briefly.

### Section 11: Expression profiles
For human gene GBA, summarize expression profiles.
Tissue expression (GTEx, HPA, Bgee, etc.):
- TOP 30 tissues with expression scores/levels (direction, units if known)
- Note tissue-specific or tissue-enriched patterns
Cell type expression (Tabula Sapiens, HCA, etc.):
- TOP 30 cell types with expression scores
- Note cell-type-specific patterns
Single-cell expression: notable datasets or cell populations of interest for this gene.

### Section 12: Disease associations
For human gene GBA, summarize disease associations.
Mendelian / monogenic disease:
- Diseases caused by mutations in GBA: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level
- Include all directly linked conditions
Phenotype associations:
- Clinical phenotypes associated with the gene (HPO terms where known)
- TOP 30 phenotype terms with HPO IDs
Complex-disease / GWAS:
- Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known
- TOP 30 GWAS associations

GBA

Executive summary

GBA (approved symbol GBA1, HGNC:4177) encodes lysosomal acid glucosylceramidase, an enzyme that hydrolyzes glucosylceramide to ceramide and glucose within the lysosome. Biallelic loss-of-function mutations cause all forms of Gaucher disease, one of the most prevalent lysosomal storage disorders, spanning a spectrum from the non-neuronopathic type I through the perinatal-lethal form, all inherited in an autosomal recessive manner. Heterozygous GBA mutations are also one of the strongest known genetic risk factors for Parkinson’s disease: GWAS signals at the locus reach p-values as low as 3×10⁻³⁵, and the protein’s interaction network includes key Parkinson’s-linked proteins such as alpha-synuclein (SNCA), LRRK2, PINK1, and parkin. The gene is expressed ubiquitously across 134 tissues, highest in immune and endocrine compartments, consistent with its housekeeping lysosomal role. Therapeutically, Gaucher disease has two approved (Phase 4) treatment strategies: enzyme replacement (imiglucerase, velaglucerase alfa, taliglucerase alfa) and substrate reduction (miglustat, eliglustat). Migalastat, a pharmacological chaperone that also appears among the Phase 4 molecules listed for this target, is approved for Fabry disease (α-galactosidase A), not Gaucher disease.

GBA — Reference

Cross-database identifier and functional mapping reference for GBA.

Gene identifiers

Field | Value
HGNC ID | HGNC:4177
Approved symbol | GBA1
Previous symbol | GBA
Ensembl gene ID | ENSG00000177628
NCBI Entrez Gene ID | 2629
OMIM gene/locus ID | 606463
Chromosome | 1
Start position (GRCh38) | 155,234,447 bp
End position (GRCh38) | 155,245,178 bp
Strand | − (negative)
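
As a quick sanity check on the coordinates above, the locus span can be computed directly. This is a minimal sketch that assumes the GRCh38 positions are 1-based and inclusive (the usual Ensembl convention); the dictionary and function names are illustrative and not part of any database API.

```python
# GRCh38 coordinates copied from the gene identifier table.
# Assumption: positions are 1-based and inclusive.
GBA1_LOCUS = {
    "chromosome": "1",
    "start": 155_234_447,
    "end": 155_245_178,
    "strand": "-",
}


def locus_length(locus):
    """Length in bp of a 1-based, inclusive interval."""
    return locus["end"] - locus["start"] + 1


def region_string(locus):
    """Format the locus as 'chromosome:start-end'."""
    return f'{locus["chromosome"]}:{locus["start"]}-{locus["end"]}'


span_bp = locus_length(GBA1_LOCUS)
region = region_string(GBA1_LOCUS)
print(f"{region} spans {span_bp} bp on the {GBA1_LOCUS['strand']} strand")
```

The region string is a convenient form for pasting into a genome browser search box.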

Transcript identifiers

Ensembl Transcripts

Total: 27 transcripts

Transcript ID | Biotype
ENST00000327247 | protein_coding
ENST00000368373 | protein_coding
ENST00000427500 | protein_coding
ENST00000428024 | protein_coding
ENST00000460156 | protein_coding_CDS_not_defined
ENST00000464536 | protein_coding_CDS_not_defined
ENST00000467918 | protein_coding_CDS_not_defined
ENST00000470104 | protein_coding_CDS_not_defined
ENST00000473570 | protein_coding_CDS_not_defined
ENST00000478472 | protein_coding_CDS_not_defined
ENST00000484489 | protein_coding_CDS_not_defined
ENST00000491081 | protein_coding_CDS_not_defined
ENST00000493842 | protein_coding_CDS_not_defined
ENST00000497670 | protein_coding_CDS_not_defined
ENST00000852359 | protein_coding
ENST00000852360 | protein_coding
ENST00000852361 | protein_coding
ENST00000852362 | protein_coding
ENST00000852363 | protein_coding
ENST00000852364 | protein_coding
ENST00000852365 | protein_coding
ENST00000852366 | protein_coding
ENST00000852367 | protein_coding
ENST00000852368 | protein_coding
ENST00000948996 | protein_coding
ENST00000948997 | protein_coding
ENST00000948998 | protein_coding

RefSeq mRNA Accessions

Total: 13 NM_ accessions

Accession | MANE Select
NM_000157 | Yes
NM_001005741 | No
NM_001005742 | No
NM_001077411 | No
NM_001127639 | No
NM_001171811 | No
NM_001171812 | No
NM_001409958 | No
NM_001409959 | No
NM_001409960 | No
NM_001409961 | No
NM_001409962 | No
NM_008094 | No

CCDS Identifiers

  • CCDS1102
  • CCDS53373
  • CCDS53374

MANE Select Transcript: ENST00000368373 (RefSeq: NM_000157)

Exons: 11 total

Exon ID | Start | End | Strand | Chromosome
ENSE00001917720 | 155234452 | 155235100 | − | 1
ENSE00003644399 | 155235195 | 155235311 | − | 1
ENSE00003506590 | 155235681 | 155235844 | − | 1
ENSE00001231060 | 155236245 | 155236469 | − | 1
ENSE00003488376 | 155237341 | 155237578 | − | 1
ENSE00003562842 | 155238134 | 155238306 | − | 1
ENSE00003618264 | 155238517 | 155238650 | − | 1
ENSE00003499798 | 155239616 | 155239762 | − | 1
ENSE00003675620 | 155239886 | 155240077 | − | 1
ENSE00003469059 | 155240630 | 155240717 | − | 1
ENSE00001890492 | 155241086 | 155241249 | − | 1
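
The exon coordinates above are enough to derive per-exon lengths and the total exonic length of the MANE Select transcript. The sketch below assumes 1-based, inclusive coordinates; the variable and function names are illustrative. The total is the summed exon length (untranslated regions included), not the coding sequence length.

```python
# Exon coordinates copied from the MANE Select exon table (GRCh38, chromosome 1).
# Assumption: positions are 1-based and inclusive.
EXONS = [
    ("ENSE00001917720", 155234452, 155235100),
    ("ENSE00003644399", 155235195, 155235311),
    ("ENSE00003506590", 155235681, 155235844),
    ("ENSE00001231060", 155236245, 155236469),
    ("ENSE00003488376", 155237341, 155237578),
    ("ENSE00003562842", 155238134, 155238306),
    ("ENSE00003618264", 155238517, 155238650),
    ("ENSE00003499798", 155239616, 155239762),
    ("ENSE00003675620", 155239886, 155240077),
    ("ENSE00003469059", 155240630, 155240717),
    ("ENSE00001890492", 155241086, 155241249),
]


def exon_length(start, end):
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


exon_lengths = {exon_id: exon_length(start, end) for exon_id, start, end in EXONS}
total_exonic_bp = sum(exon_lengths.values())

for exon_id, length in exon_lengths.items():
    print(f"{exon_id}\t{length} bp")
print(f"{len(EXONS)} exons, {total_exonic_bp} bp in total")
```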

Protein identifiers

UniProt Accessions

  • P04062 (Lysosomal acid glucosylceramidase) — canonical reviewed entry ✓
  • A0A068F658 — unreviewed

RefSeq Protein (NP_ accessions)

  • NP_000148 (MANE Select, REVIEWED) ✓
  • NP_001005741 (REVIEWED)
  • NP_001005742 (REVIEWED)
  • NP_001070879 (VALIDATED)
  • NP_001121111 (PROVISIONAL)
  • NP_001165282 (REVIEWED)
  • NP_001165283 (REVIEWED)
  • NP_001396887 (VALIDATED)
  • NP_001396888 (VALIDATED)
  • NP_001396889 (VALIDATED)
  • NP_001396890 (VALIDATED)
  • NP_001396891 (VALIDATED)
  • NP_032120 (VALIDATED)

Protein Domains and Families

ID | Name | Type
IPR001139 | Glycoside hydrolase family 30 | Family
IPR017853 | Glycoside hydrolase superfamily | Homologous superfamily
IPR033452 | Glycosyl hydrolase family 30, beta sandwich domain | Domain
IPR033453 | Glycosyl hydrolase family 30, TIM-barrel domain | Domain
PF02055 | (name not provided) | Pfam domain
PF17189 | (name not provided) | Pfam domain
SSF51011 | (name not provided) | SUPERFAM superfamily
SSF51445 | (name not provided) | SUPERFAM superfamily
PTHR11069 | Glycosidase, family 1 subfamily | PANTHER family
PTHR11069:SF33 | Beta-glucosidase | PANTHER subfamily

Antibody Availability

No antibody resources found in biobtree antibody database.

Structure

Experimental Structures

Total: 58 PDB entries

PDB ID | Method | Resolution (Å)
1OGS | X-ray | 2.0
1Y7V | X-ray | 2.4
2F61 | X-ray | 2.5
2J25 | X-ray | 2.9
2NSX | X-ray | 2.11
2NT0 | X-ray | 1.79
2NT1 | X-ray | 2.3
2V3D | X-ray | 1.96
2V3E | X-ray | 2.0
2V3F | X-ray | 1.95
2VT0 | X-ray | 2.15
2WCG | X-ray | 2.3
2WKL | X-ray | 2.7
2XWD | X-ray | 2.66
2XWE | X-ray | 2.31
3GXD | X-ray | 2.5
3GXF | X-ray | 2.4
3GXI | X-ray | 1.84
3GXM | X-ray | 2.2
3KE0 | X-ray | 2.7
3KEH | X-ray | 2.8
3RIK | X-ray | 2.48
3RIL | X-ray | 2.4
5LVX | X-ray | 2.2
6MOZ | X-ray | 2.1
6Q1N | X-ray | 2.526
6Q1P | X-ray | 2.796
6Q6K | X-ray | 1.92
6Q6L | X-ray | 1.81
6Q6N | X-ray | 1.63
6T13 | X-ray | 1.85
6TJJ | X-ray | 1.59
6TJK | X-ray | 1.56
6TJQ | X-ray | 1.41
6TN1 | X-ray | 0.98
6YTP | X-ray | 1.7
6YTR | X-ray | 1.7
6YUT | X-ray | 1.76
6YV3 | X-ray | 1.8
6Z39 | X-ray | 1.7
6Z3I | X-ray | 1.8
7NWV | X-ray | 1.86
8AWK | X-ray | 1.58
8AWR | X-ray | 1.49
8AX3 | X-ray | 1.59
8P3E | X-ray | 1.75
8P41 | X-ray | 1.83
9ENA | X-ray | 1.7
9F9Z | X-ray | 2.279
9FA3 | X-ray | 1.36
9FA6 | X-ray | 1.49
9FAD | X-ray | 1.8
9FAL | X-ray | 1.39
9FAY | X-ray | 1.4
9FAZ | X-ray | 1.63
9FB2 | X-ray | 1.14
9FDI | X-ray | 1.41
9FJF | Cryo-EM | 3.7

Experimental methods: 57 X-ray structures, 1 Cryo-EM structure
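
When choosing a structure from a list this long, it often helps to pick the best-resolved entry per method or to filter by a resolution cutoff. The sketch below does both on a hand-picked subset of rows from the table; the function names and the 1.5 Å cutoff are illustrative assumptions, not recommendations from any database.

```python
# A hand-picked subset of rows from the PDB table: (PDB ID, method, resolution in Å).
STRUCTURES = [
    ("1OGS", "X-ray", 2.0),
    ("2J25", "X-ray", 2.9),
    ("6TN1", "X-ray", 0.98),
    ("9FB2", "X-ray", 1.14),
    ("9FJF", "Cryo-EM", 3.7),
]


def best_per_method(structures):
    """Return {method: (pdb_id, resolution)} keeping the lowest Å value per method."""
    best = {}
    for pdb_id, method, resolution in structures:
        if method not in best or resolution < best[method][1]:
            best[method] = (pdb_id, resolution)
    return best


def at_or_below(structures, cutoff):
    """PDB IDs whose resolution is at or below the cutoff (lower Å is sharper)."""
    return [pdb_id for pdb_id, _, resolution in structures if resolution <= cutoff]


best = best_per_method(STRUCTURES)
high_res = at_or_below(STRUCTURES, 1.5)
print(best)
print(high_res)
```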

Predicted Structures

Model ID | Global pLDDT | pLDDT >70 (%)
P04062 | 93.46 | 89

Cross-species orthologs

Organism | Gene ID | Symbol
Mouse (Mus musculus) | ENSMUSG00000028048 | Gba1
Rat (Rattus norvegicus) | ENSRNOG00000049281 | Gba1
Zebrafish (Danio rerio) | ENSDARG00000076058 | gba1
Fruit fly (Drosophila melanogaster) | FBgn0051148 / FBgn0051414 | Gba1a / Gba1b
Worm (C. elegans) | WBGene00016335 / WBGene00016340 / WBGene00008706 / WBGene00021160 | gba-1 / gba-2 / gba-3 / gba-4
Yeast (S. cerevisiae) | none | none
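
The ortholog IDs come from several databases, each with its own prefix. A small helper can report the source of an identifier from its prefix. This is a sketch: the mapping covers only the prefixes seen in this table, matching is case-insensitive because exported IDs are sometimes upper-cased, and the function name is hypothetical.

```python
# Prefix-to-source mapping for the identifier styles used in the ortholog table.
# Keys are upper-cased so that matching can ignore case.
ID_PREFIXES = {
    "ENSMUSG": "Ensembl (mouse)",
    "ENSRNOG": "Ensembl (rat)",
    "ENSDARG": "Ensembl (zebrafish)",
    "ENSG": "Ensembl (human)",
    "FBGN": "FlyBase",
    "WBGENE": "WormBase",
}


def id_source(identifier):
    """Return the source database for a gene ID, or 'unknown' if no prefix matches."""
    upper = identifier.strip().upper()
    for prefix, source in ID_PREFIXES.items():
        if upper.startswith(prefix):
            return source
    return "unknown"


for gene_id in ["ENSMUSG00000028048", "FBgn0051148", "WBGene00016335", "none"]:
    print(gene_id, "->", id_source(gene_id))
```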

Clinical variants & AI predictions

ClinVar Summary

Total variants: ~538

Classification breakdown:

Classification | Count (estimated from 200-variant sample)
Uncertain significance | ~110
Likely benign | ~25
Likely pathogenic | ~30
Pathogenic | ~15
Benign | ~10
Conflicting classifications | ~15
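
Because the breakdown comes from a sample rather than the full set, one rough way to relate it to the ~538 total is proportional scaling. The sketch below assumes the sample is representative of all ClinVar records for the gene, which may not hold; the names are illustrative. Note that the approximate sample counts sum to 205 rather than exactly 200.

```python
# Approximate per-class counts from the sampled breakdown above.
SAMPLE_COUNTS = {
    "Uncertain significance": 110,
    "Likely benign": 25,
    "Likely pathogenic": 30,
    "Pathogenic": 15,
    "Benign": 10,
    "Conflicting classifications": 15,
}
TOTAL_VARIANTS = 538  # approximate total reported for the gene


def project_counts(sample_counts, total):
    """Scale sample counts to the full total, assuming a representative sample."""
    sample_size = sum(sample_counts.values())
    return {
        label: round(count / sample_size * total)
        for label, count in sample_counts.items()
    }


sample_size = sum(SAMPLE_COUNTS.values())
projected = project_counts(SAMPLE_COUNTS, TOTAL_VARIANTS)
for label, count in projected.items():
    print(f"{label}: ~{count}")
```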

TOP 30 Pathogenic/Likely Pathogenic variants:

ClinVar ID | HGVS Notation | Classification | Associated Condition
1321421 | c.256C>T (p.Arg86Ter) | Pathogenic | Gaucher disease
1321450 | c.203del (p.Pro68fs) | Pathogenic | Gaucher disease
1322984 | c.1388+1G>A | Pathogenic | Gaucher disease
1322986 | c.408_412del (p.Pro137fs) | Pathogenic/Likely pathogenic | Gaucher disease
193611 | c.1265_1319del (p.Leu422fs) | Pathogenic | Gaucher disease
1119997 | c.604C>T (p.Arg202Ter) | Pathogenic/Likely pathogenic | Gaucher disease
1211295 | c.1249T>G (p.Trp417Gly) | Pathogenic/Likely pathogenic | Gaucher disease
21070 | c.1505G>A (p.Arg502His) | Pathogenic/Likely pathogenic | Gaucher disease
21072 | c.703T>C (p.Ser235Pro) | Pathogenic/Likely pathogenic | Gaucher disease
1722541 | c.1361C>T (p.Pro454Leu) | Pathogenic | Gaucher disease
1799756 | c.236_237del (p.Arg78_Tyr79insTer) | Pathogenic | Gaucher disease
2503944 | c.260G>A (p.Arg87Gln) | Pathogenic | Gaucher disease
2436490 | c.1260G>A (p.Trp420Ter) | Pathogenic | Gaucher disease
2581173 | c.1054T>C (p.Tyr352His) | Pathogenic | Gaucher disease
1677224 | c.1439_1445del (p.Lys480fs) | Pathogenic | Gaucher disease
281586 | c.580A>T (p.Lys194Ter) | Pathogenic | Gaucher disease
3257292 | c.1193G>T (p.Arg398Leu) | Pathogenic | Gaucher disease
3064209 | c.820G>A (p.Glu274Lys) | Pathogenic | Gaucher disease
3251365 | c.680A>T (p.Asn227Ile) | Pathogenic | Gaucher disease
2682534 | c.1193G>A (p.Arg398Gln) | Pathogenic/Likely pathogenic | Gaucher disease
1065557 | c.914C>T (p.Pro305Leu) | Likely pathogenic | Gaucher disease
1184271 | c.1255G>A (p.Asp419Asn) | Likely pathogenic | Gaucher disease
1214070 | c.518C>A (p.Thr173Asn) | Likely pathogenic | Gaucher disease
1319931 | c.1312G>C (p.Asp438His) | Likely pathogenic | Gaucher disease
1675378 | c.761+4A>G | Likely pathogenic | Gaucher disease
1678026 | c.376_377del (p.Asp126fs) | Likely pathogenic | Gaucher disease
1691430 | c.1537G>A (p.Asp513Asn) | Likely pathogenic | Gaucher disease
1698566 | c.1505+2T>A | Likely pathogenic | Gaucher disease
2500802 | c.896T>A (p.Ile299Asn) | Likely pathogenic | Gaucher disease
1799861 | c.364G>C (p.Gly122Arg) | Likely pathogenic | Gaucher disease

SpliceAI Predictions

Total predictions: ~1,337

High-confidence splice-affecting variants (delta score ≥0.8): ~80+

30 variants with delta scores (listed as returned, not ranked; a few fall below the 0.8 threshold):

Position | Variant | Effect Type | Delta Score
1:155235206 | GTT | donor_gain | 0.99
1:155235308 | CTTG | acceptor_gain | 1.00
1:155235189 | CCTCA | acceptor_loss | 0.99
1:155235191 | TCACC | acceptor_loss | 0.99
1:155235192 | CACCG | acceptor_loss | 0.99
1:155235193 | ACC | acceptor_loss | 0.99
1:155235194 | CCGG | acceptor_gain | 0.76
1:155235297 | TG | acceptor_gain | 0.98
1:155235298 | C>A | acceptor_gain | 0.97
1:155234599 | TCCAG | acceptor_gain | 0.91
1:155234600 | C>G | acceptor_gain | 0.91
1:155235098 | GAGCT | acceptor_gain | 0.91
1:155235099 | AGC | acceptor_loss | 0.96
1:155235100 | GCT | acceptor_gain | 0.93
1:155235101 | C>CC | acceptor_gain | 0.99
1:155235102 | T>C | acceptor_loss | 0.96
1:155235109 | C>CT | acceptor_gain | 0.93
1:155235110 | A>T | acceptor_gain | 0.90
1:155235096 | AGGAG | acceptor_gain | 0.88
1:155235069 | CCT | acceptor_gain | 0.87
1:155235187 | GCCCT | acceptor_gain | 0.64
1:155235188 | CCCT | acceptor_loss | 0.98
1:155235304 | TGAA | acceptor_gain | 0.84
1:155235205 | C>CT | acceptor_gain | 0.83
1:155234602 | A>T | acceptor_gain | 0.83
1:155235072 | T>A | acceptor_gain | 0.38
1:155235097 | GGAG | acceptor_gain | 0.98
1:155235098 | GAG | acceptor_gain | 0.96
1:155235099 | AG | acceptor_gain | 0.95
1:155234976 | T>TA | donor_gain | 0.98
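
The delta score ≥0.8 cutoff used above for high-confidence splice-affecting variants can be applied mechanically to rows like these. The sketch below uses a handful of rows from the table; the tuple layout and function name are illustrative assumptions rather than any SpliceAI output format.

```python
# A few rows from the splice prediction table: (position, effect type, delta score).
SPLICE_ROWS = [
    ("1:155235206", "donor_gain", 0.99),
    ("1:155235308", "acceptor_gain", 1.00),
    ("1:155235194", "acceptor_gain", 0.76),
    ("1:155235187", "acceptor_gain", 0.64),
    ("1:155235072", "acceptor_gain", 0.38),
    ("1:155234976", "donor_gain", 0.98),
]

HIGH_CONFIDENCE_CUTOFF = 0.8  # threshold quoted in this section


def high_confidence(rows, cutoff=HIGH_CONFIDENCE_CUTOFF):
    """Keep rows at or above the cutoff, sorted by delta score (highest first)."""
    kept = [row for row in rows if row[2] >= cutoff]
    return sorted(kept, key=lambda row: row[2], reverse=True)


high = high_confidence(SPLICE_ROWS)
for position, effect, score in high:
    print(f"{position}\t{effect}\t{score:.2f}")
```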

AlphaMissense Pathogenicity Predictions

Total missense predictions: ~1,900+

Likely pathogenic variants: ~180+

30 likely-pathogenic variants with am_pathogenicity scores (listed as returned, not ranked by score):

Position | Protein Change | am_pathogenicity | am_class
1:155235007 | W533C | 0.970 | likely_pathogenic
1:155235009 | W533R | 0.982 | likely_pathogenic
1:155235008 | W533S | 0.926 | likely_pathogenic
1:155235008 | W533L | 0.746 | likely_pathogenic
1:155235009 | W533G | 0.798 | likely_pathogenic
1:155235011 | L532P | 0.891 | likely_pathogenic
1:155235015 | Y531H | 0.798 | likely_pathogenic
1:155235015 | Y531D | 0.970 | likely_pathogenic
1:155235015 | Y531N | 0.921 | likely_pathogenic
1:155235034 | Y531S | 0.839 | likely_pathogenic
1:155235017 | T530I | 0.970 | likely_pathogenic
1:155235018 | T530P | 0.936 | likely_pathogenic
1:155235018 | T530A | 0.691 | likely_pathogenic
1:155235020 | H529P | 0.875 | likely_pathogenic
1:155235023 | I528T | 0.922 | likely_pathogenic
1:155235023 | I528S | 0.923 | likely_pathogenic
1:155235023 | I528N | 0.977 | likely_pathogenic
1:155235026 | S527F | 0.961 | likely_pathogenic
1:155235027 | S527P | 0.984 | likely_pathogenic
1:155235026 | S527Y | 0.956 | likely_pathogenic
1:155235067 | D513E | 0.763 | likely_pathogenic
1:155235067 | D513H | 0.876 | likely_pathogenic
1:155235068 | D513V | 0.715 | likely_pathogenic
1:155235068 | D513C | 0.836 | likely_pathogenic
1:155235069 | D513N | 0.722 | likely_pathogenic
1:155235069 | D513Y | 0.711 | likely_pathogenic
1:155235074 | I511S | 0.961 | likely_pathogenic
1:155235074 | I511T | 0.946 | likely_pathogenic
1:155235074 | I511N | 0.977 | likely_pathogenic
1:155235080 | L509P | 0.898 | likely_pathogenic
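
The am_class labels follow from the am_pathogenicity score. The sketch below uses the cutoffs published with AlphaMissense (below 0.34 likely benign, above 0.564 likely pathogenic, ambiguous in between); these thresholds are an assumption brought in from outside this report, and the function name is illustrative. It also ranks a few rows from the table by score.

```python
# Cutoffs as published with AlphaMissense; an assumption here, since this
# report does not state them.
LIKELY_BENIGN_MAX = 0.34
LIKELY_PATHOGENIC_MIN = 0.564


def am_class(score):
    """Map an am_pathogenicity score in [0, 1] to its class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    return "ambiguous"


# A few (protein change, score) rows from the table above.
ROWS = [("W533R", 0.982), ("S527P", 0.984), ("T530A", 0.691), ("D513Y", 0.711)]

ranked = sorted(ROWS, key=lambda row: row[1], reverse=True)
for change, score in ranked:
    print(f"{change}\t{score:.3f}\t{am_class(score)}")
```

Sorting by score in this way is what a strict "top 30" ranking would require.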

Pathways & Gene Ontology

Biological Pathways

Pathway Type | ID | Name
Reactome | R-HSA-390471 | Association of TriC/CCT with target proteins during biosynthesis
Reactome | R-HSA-9840310 | Glycosphingolipid catabolism

Total pathway count: 2 (Reactome)

MSigDB contains 100 gene sets including the Reactome pathway “Sphingolipid metabolism” (M14857) and KEGG equivalent (M15955), plus diverse GO-annotated gene sets covering lipid catabolism, neuronal development, immune response, and autophagy pathways.


Gene Ontology Annotations

Biological Process (51 terms total)

Rank | GO ID | Term
1 | GO:0006680 | glucosylceramide catabolic process
2 | GO:0007040 | lysosome organization
3 | GO:0006914 | autophagy
4 | GO:0016241 | regulation of macroautophagy
5 | GO:0043161 | proteasome-mediated ubiquitin-dependent protein catabolic process
6 | GO:0046512 | sphingosine biosynthetic process
7 | GO:0046513 | ceramide biosynthetic process
8 | GO:0019882 | antigen processing and presentation
9 | GO:0006955 | immune response
10 | GO:0019915 | lipid storage
11 | GO:1905146 | lysosomal protein catabolic process
12 | GO:1901805 | beta-glucoside catabolic process
13 | GO:0000423 | mitophagy
14 | GO:1905037 | autophagosome organization
15 | GO:1905091 | positive regulation of type 2 mitophagy
16 | GO:0008203 | cholesterol metabolic process
17 | GO:0009247 | glycolipid biosynthetic process
18 | GO:0048469 | cell maturation
19 | GO:0033077 | T cell differentiation in thymus
20 | GO:0014004 | microglia differentiation

Molecular Function (7 terms total)

Rank | GO ID | Term
1 | GO:0004348 | glucosylceramidase activity
2 | GO:0008422 | beta-glucosidase activity
3 | GO:0046527 | glucosyltransferase activity
4 | GO:0004336 | galactosylceramidase activity
5 | GO:0050295 | steryl-beta-glucosidase activity
6 | GO:0005102 | signaling receptor binding
7 | GO:0005124 | scavenger receptor binding

Cellular Component (7 terms total)

Rank | GO ID | Term
1 | GO:0005764 | lysosome
2 | GO:0005765 | lysosomal membrane
3 | GO:0043202 | lysosomal lumen
4 | GO:0005783 | endoplasmic reticulum
5 | GO:0005794 | Golgi apparatus
6 | GO:0005802 | trans-Golgi network
7 | GO:0070062 | extracellular exosome

Protein interactions & networks

Overview

  • Total protein-protein interactions (combined databases): ~6,500+ interactions (the three sources itemized below account for ~2,565 of these)
    • STRING database: ~2,346 confident interactions
    • BioGRID: ~158 experimental interactions
    • IntAct: ~61 curated binary interactions

TOP 30 Highest-Confidence Interacting Proteins (STRING Database)

Rank | Protein ID | Protein Name | Interaction Type
1 | Q14108 | Lysosome membrane protein 2 (SCARB2) | Lysosomal localization
2 | P37840 | Alpha-synuclein (SNCA) | Protein aggregation, Parkinson’s
3 | Q16739 | Ceramide glucosyltransferase (UGCG) | Lipid synthesis pathway
4 | Q13231 | Chitotriosidase-1 (CHIT1) | Lysosomal enzyme, GBA-related biomarker
5 | Q5S007 | Leucine-rich repeat serine/threonine-protein kinase 2 (LRRK2) | Parkinson’s disease kinase
6 | Q9HCG7 | Non-lysosomal glucosylceramidase (GBA2) | Functional homolog
7 | O60260 | E3 ubiquitin-protein ligase parkin (PRKN) | Protein degradation, Parkinson’s
8 | Q9NQ11 | Polyamine-transporting ATPase 13A2 (ATP13A2) | Lysosomal transporter, Parkinson’s
9 | Q13505 | Metaxin-1 (MTX1) | Mitochondrial import
10 | P06280 | Alpha-galactosidase A (GLA) | Lysosomal hydrolase
11 | P54803 | Galactocerebrosidase (GALC) | Lysosomal hydrolase
12 | P0C7M3 | Surfactant-associated protein 3 (SP-H) | Lipid metabolism
13 | Q13510 | Acid ceramidase (ASAH1) | Ceramide metabolism
14 | Q9Y3I1 | F-box only protein 7 (FBXO7) | Ubiquitin ligase, Parkinson’s
15 | Q9BXN1 | Asporin (ASPN) | Extracellular matrix
16 | P17405 | Sphingomyelin phosphodiesterase (SMPD1) | Lysosomal hydrolase
17 | Q6GPI1 | Chymotrypsinogen B2 (CTRB2) | Serine protease
18 | P17538 | Chymotrypsinogen B (CTRB1) | Serine protease
19 | O14976 | Cyclin-G-associated kinase (GAK) | Protein kinase
20 | P15309 | Prostatic acid phosphatase (ACPP) | Phosphatase
21 | Q96QK1 | Vacuolar protein sorting-associated protein 35 (VPS35) | Retromer complex
22 | Q9BXM7 | Serine/threonine-protein kinase PINK1 (PINK1) | Mitochondrial quality control, Parkinson’s
23 | O60733 | 85/88 kDa calcium-independent phospholipase A2 (PLA2G6) | Lipid metabolism
24 | Q99497 | Parkinson disease protein 7 (DJ-1) | Neuroprotection, Parkinson’s
25 | P03952 | Plasma kallikrein (KLKB1) | Serine protease
26 | P49746 | Thrombospondin-3 (THBS3) | Extracellular matrix
27 | Q7Z4K8 | Tripartite motif-containing protein 46 (TRIM46) | E3 ubiquitin ligase
28 | P20151 | Kallikrein-2 (KLK2) | Serine protease
29 | P16278 | Beta-galactosidase (GLB1) | Lysosomal hydrolase
30 | P09936 | Ubiquitin carboxyl-terminal hydrolase isozyme L1 (UCHL1) | Protein deubiquitination

Key Protein Networks:

  • Lysosomal enzymes: GLA (α-gal A), GALC, ASAH1, SMPD1 (related lysosomal hydrolases involved in sphingolipid catabolism)
  • Parkinson’s disease association: SNCA (α-synuclein), LRRK2, PRKN (parkin), PINK1, FBXO7, DJ-1 (GBA mutations are linked to Parkinson’s susceptibility)
  • Membrane/trafficking: SCARB2 (lysosomal), ATP13A2, VPS35, MTX1 (lysosomal localization and protein trafficking)
  • Lipid metabolism: UGCG, GBA2, ASAH1, PLA2G6 (ceramide and glycosphingolipid pathways)

TOP 20 Structural/Embedding Similarity Proteins (ESM2)

Rank | Protein ID | Protein Name | Similarity Context
1 | P17405 | Sphingomyelin phosphodiesterase (SMPD1) | Lysosomal hydrolase
2 | O14976 | Cyclin-G-associated kinase (GAK) | Kinase domain
3 | O60733 | PLA2G6 (Calcium-independent phospholipase A2) | Lipid metabolism enzyme
4 | P06280 | Alpha-galactosidase A (GLA) | Glycosidase family
5 | Q9BXM7 | PINK1 (Serine/threonine-protein kinase) | Kinase domain
6 | Q9BXN1 | Asporin (ASPN) | Structural protein
7 | Q13231 | Chitotriosidase-1 (CHIT1) | Glycosidase family
8 | P20151 | Kallikrein-2 (KLK2) | Serine protease
9 | Q96QK1 | VPS35 (Sorting protein) | Cargo receptor
10 | P17439 | Lysosomal acid glucosylceramidase, mouse (Gba1) | Mouse ortholog
11 | P54803 | Galactocerebrosidase (GALC) | Glycosidase family
12 | Q9HCG7 | GBA2 (Non-lysosomal glucosylceramidase) | Functional homolog
13 | P25409 | Beta-glucuronidase | Glycosidase family
14 | Q13510 | Acid ceramidase (ASAH1) | Lipid-metabolizing hydrolase
15 | Q16739 | Ceramide glucosyltransferase (UGCG) | Transferase
16 | Q9NQ11 | ATP13A2 (Polyamine transporter) | Membrane transport
17 | Q91W89 | Cytochrome P450 family protein | Monooxygenase
18 | Q6YGZ1 | Hydrolase family protein | Catalytic domain
19 | P52850 | Acid α-glucosidase (GAA) | Glycosidase family
20 | Q5H879 | Lysosomal hydrolase | Catalytic enzyme

TOP 20 Sequence Homology Proteins (DIAMOND)

Rank | Protein ID | Protein Name | Homology Type
1 | P04062 | GBA (self-reference) | 100% identity
2 | Q9BDT0 | Glucosidase-like protein | High sequence identity
3 | Q70KH2 | Retaining β-glucosidase | β-glucosidase family
4 | Q5R8E3 | Glycoside hydrolase | GH family member
5 | Q2KHZ8 | Lysosomal hydrolase homolog | Ortholog
6 | G5ECR8 | Glycoside hydrolase 1 | GH1 family
7 | O16580 | Fungal β-glucosidase | Evolutionary homolog
8 | O16581 | Microbial hydrolase | Evolutionary homolog
9 | Q9UB00 | Putative glucosidase | Predicted homolog
10 | P17439 | Lysosomal acid glucosylceramidase, mouse (Gba1) | Mouse ortholog
11 | A7MBI7 | Membrane hydrolase | Evolutionary related
12 | D3YWP0 | Glycosidic bond hydrolase | Catalytic domain
13 | F1NZI4 | Retaining hydrolase | Mechanism-related
14 | G5E872 | Acid hydrolase | Lysosomal enzyme
15 | O09175 | Mammalian glucosidase | Mammalian ortholog
16 | P21139 | Tissue-specific hydrolase | Isoform-related
17 | P52850 | Acid α-glucosidase | Closely related
18 | P58242 | Glycan hydrolase | GH-superfamily
19 | Q0VD19 | Lysosomal acid hydrolase | Functional homolog
20 | Q14CH1 | Acid β-glucosidase variant | Variant/isoform

Network Implications:

  • Heterozygous GBA mutations, which cause Gaucher disease when biallelic, significantly increase Parkinson’s disease risk (~2-4x in carriers)
  • Protein-interaction network centered on lysosomal function, sphingolipid catabolism, and mitochondrial quality control
  • Pathogenic mechanisms: impaired lysosomal hydrolysis → α-synuclein accumulation → neurodegeneration
  • Therapeutic targets: SCARB2 (enhances GBA trafficking), GBA2 (substrate alternative), ASAH1 (downstream pathway)

Transcription factor regulatory data

GBA is not a transcription factor. It is a lysosomal hydrolase enzyme (glucosylceramidase beta 1, also known as β-glucocerebrosidase).

Upstream regulators (TFs that regulate GBA):

Transcription Factor | Regulation Type | Evidence Source
ETV4 | Unknown | TRRUST, GEREDB (curated)

Only one upstream regulator (ETV4) was identified in the CollecTRI database, which aggregates curated transcription factor-gene regulatory relationships from sources including TRRUST and GEREDB. The regulatory direction (activation vs. repression) is not specified in the available data.

Drug & pharmacology data

GBA is a highly validated drug target with 100+ targeting molecules in ChEMBL, primarily for Gaucher disease treatment. GBA encodes lysosomal acid glucosylceramidase, and loss-of-function mutations cause lysosomal glycosphingolipid accumulation.

Targeting Molecules (Top 30 by Development Phase)

Total count: 100+ molecules in ChEMBL targeting GBA (CHEMBL2179)

Top approved drugs (Phase 4):

  1. MIGLUSTAT (CHEMBL1029) | Zavesca, Opfolda, Yargesa | Substrate reduction therapy (GCS inhibitor) | Phase 4 | 24 clinical trials
  2. ELIGLUSTAT (CHEMBL2110588) | Cerdelga, GENZ-99067 | Substrate reduction therapy (GCS inhibitor) | Phase 4 | 14 clinical trials
  3. IMIGLUCERASE (CHEMBL1201632) | Cerezyme, Abcertin | Enzyme replacement therapy (recombinant GBA) | Phase 4 | 9 clinical trials
  4. MIGALASTAT (CHEMBL110458) | Galafold, At-1001 | Pharmacological chaperone for α-galactosidase A (GLA); approved for Fabry disease, not Gaucher disease | Phase 4 | 9 clinical trials
  5. GLUCONOLACTONE (CHEMBL1200829) | E-575, Fujiglucon | GBA substrate analog | Phase 4

Additional Phase 4 entries for GBA include the enzyme replacement products taliglucerase alfa (plant cell-derived) and velaglucerase alfa (VPRIV), along with multiple imiglucerase records.

Clinical Trials (Top 20 for GBA-targeting Drugs in Gaucher Disease)

Trial ID | Phase | Status | Intervention
NCT01074944 | Phase 3 | COMPLETED | Eliglustat tartrate (once vs. twice daily dosing, EDGE trial)
NCT02574286 | Phase 4 | COMPLETED | Velaglucerase alfa (VPRIV) on bone-related pathology, treatment-naive
NCT04656600 | Phase 4 | COMPLETED | Imiglucerase in Chinese patients with type III Gaucher disease
NCT04718779 | Phase 4 | COMPLETED | VPRIV in prior substrate reduction therapy patients
NCT01614574 | Phase 3 | COMPLETED | Velaglucerase alfa in Japanese patients
NCT01411228 | Phase 3 | COMPLETED | Taliglucerase alfa extension in pediatric patients
NCT01422187 | Phase 3 | COMPLETED | Taliglucerase alfa extension in adult patients
NCT05529992 | Phase 3 | COMPLETED | Velaglucerase alfa (VPRIV) in Chinese children/adults
NCT01132690 | Phase 4 | COMPLETED | Taliglucerase alfa dose-level study in pediatric patients
NCT00365131 | Phase 4 | COMPLETED | Cerezyme efficacy on skeletal disease in type I
NCT00364858 | Phase 4 | COMPLETED | Cerezyme every 4 vs. 2 weeks dosing
NCT02536755 | Phase 3 | COMPLETED | Eliglustat skeletal response extension study
NCT02520934 | Unspecified | UNKNOWN | Miglustat in type IIIB Gaucher disease
NCT00358150 | Phase 2 | COMPLETED | Eliglustat tartrate in type 1 Gaucher disease
NCT00433147 | Phase 2 | COMPLETED | AT2101 (Afegostat tartrate) in ERT-treated patients
NCT00446550 | Phase 2 | COMPLETED | AT2101 (Afegostat tartrate) in treatment-naive patients
NCT00813865 | Phase 2 | COMPLETED | AT2101 long-term extension in type 1 Gaucher
NCT02107846 | Phase 2 | COMPLETED | PRX-112 oral dose escalation
NCT04145037 | Phase 1/2 | TERMINATED | AVR-RD-02 lentiviral gene therapy
NCT05487599 | Phase 1/2 | RECRUITING | PR001 (LY3884961) for peripheral manifestations (PROCEED)

Pharmacogenomics & Drug-Gene Interactions

Eliglustat (CHEMBL2110588): CYP2D6-metabolized substrate with major pharmacogenomic considerations

  • Dosing depends on CYP2D6 metabolizer status. Per the US prescribing information, extensive and intermediate metabolizers take 84 mg twice daily and poor metabolizers take 84 mg once daily; eliglustat is not recommended for ultra-rapid metabolizers, who may not reach therapeutic concentrations
  • Clinical trial NCT06188325 specifically evaluated pharmacokinetics in CYP2D6 extensive vs. poor metabolizers
  • CYP2D6 inhibitors (e.g., fluoxetine, paroxetine) increase exposure and may require dosing adjustment

Miglustat (CHEMBL1029): No major CYP-based pharmacogenomic interactions reported; renal clearance-based dosing adjustments for renal impairment

Other approved drugs: Imiglucerase has minimal pharmacogenomic interactions. Migalastat use depends on amenable GLA variants in Fabry disease and is not guided by GBA genotype. Substrate reduction therapy efficacy may correlate with residual GBA activity (genotype-dependent), but formal dosing guidelines based on GBA mutations are not standard.

Note: GBA mutations show strong genotype-phenotype correlation for disease severity, but prospective pharmacogenomic biomarkers for predicting drug response independent of baseline disease status are limited. GBA genotyping is used for patient stratification and treatment selection rather than dose optimization.

Expression profiles

Tissue Expression (Bgee Database)

GBA shows a ubiquitous expression pattern across 134 different tissues with an average expression score of 88.83 (on a 0-100 scale) and maximum score of 96.97.

Top 30 Tissues by Expression Score:

Rank | Tissue/Anatomical Entity | Expression Score | Quality
1 | Stromal cell of endometrium | 96.97 | Gold
2 | Islet of Langerhans | 96.42 | Gold
3 | Placenta | 95.44 | Gold
4 | Blood | 95.29 | Gold
5 | Pituitary gland | 95.08 | Gold
6 | Adenohypophysis | 94.70 | Gold
7 | Bone marrow cell | 94.48 | Gold
8 | Right adrenal gland | 94.26 | Gold
9 | Right adrenal gland cortex | 94.05 | Gold
10 | Left adrenal gland | 93.66 | Gold
11 | Granulocyte | 93.56 | Gold
12 | Rectum | 93.36 | Gold
13 | Mucosa of transverse colon | 93.32 | Gold
14 | Left adrenal gland cortex | 93.12 | Gold
15 | Gall bladder | 92.96 | Gold
16 | Adrenal gland | 92.87 | Gold
17 | Bone marrow | 92.81 | Gold
18 | Prefrontal cortex | 92.72 | Gold
19 | Smooth muscle tissue | 92.28 | Gold
20 | Leukocyte | 92.03 | Gold
21 | Superior frontal gyrus | 92.03 | Gold
22 | Pancreas | 91.97 | Gold
23 | Monocyte | 91.86 | Gold
24 | Descending thoracic aorta | 91.78 | Gold
25 | Cortex of kidney | 91.77 | Gold
26 | Vermiform appendix | 91.57 | Gold
27 | Thoracic aorta | 91.25 | Gold
28 | Right lobe of thyroid gland | 91.25 | Gold
29 | Frontal cortex | 91.24 | Gold
30 | Ascending aorta | 91.14 | Gold

Key Tissue-Specific Patterns:

  • Immune cells: High expression in blood, bone marrow, granulocytes, monocytes, and leukocytes (scores 91.86–95.29) — consistent with GBA’s role in lysosomal hydrolysis and immune function
  • Endocrine tissues: Very high expression in endometrial stroma, pancreatic islets, pituitary, and adrenal cortex (scores 94–97) — reflecting active lysosomal metabolism in hormone-producing tissues
  • Intestinal tissues: Elevated expression in rectum and colon mucosa (93.32–93.36) — likely involved in lipid metabolism
  • Vascular tissues: Consistent moderate-to-high expression in aortic regions (91–92)
  • CNS tissues: Expressed across brain regions including prefrontal cortex, superior frontal gyrus, and frontal cortex (91.24–92.72); relevant because GBA mutations cause neuronopathic Gaucher disease
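
The score ranges quoted for each tissue group can be recomputed from the top-30 table. The sketch below does this for three of the groups; the assignment of tissues to groups is an assumption made for illustration, and the names are hypothetical.

```python
# Bgee expression scores copied from the top-30 tissue table.
# The grouping of tissues is an illustrative assumption.
SCORES_BY_GROUP = {
    "Immune": {
        "Blood": 95.29,
        "Bone marrow cell": 94.48,
        "Granulocyte": 93.56,
        "Bone marrow": 92.81,
        "Leukocyte": 92.03,
        "Monocyte": 91.86,
    },
    "CNS": {
        "Prefrontal cortex": 92.72,
        "Superior frontal gyrus": 92.03,
        "Frontal cortex": 91.24,
    },
    "Vascular": {
        "Descending thoracic aorta": 91.78,
        "Thoracic aorta": 91.25,
        "Ascending aorta": 91.14,
    },
}


def score_range(tissue_scores):
    """Return (lowest, highest) expression score within a tissue group."""
    values = tissue_scores.values()
    return (min(values), max(values))


ranges = {group: score_range(scores) for group, scores in SCORES_BY_GROUP.items()}
for group, (low, high) in ranges.items():
    print(f"{group}: {low:.2f} to {high:.2f}")
```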

Single-Cell Expression (Single Cell Expression Atlas - SCXA)

GBA is represented in 2 marker experiments covering 182 distinct cell clusters across SCXA datasets:

  • Maximum mean expression across cell clusters: 329.54 (TPM or normalized units)
  • Average mean expression: 5.77
  • Notable dataset: E-MTAB-6142 - “Transcriptomic characterization of the human cell cycle in individual unsynchronized cells” (96 cells analyzed)

Notable Cell Populations: Expression is most pronounced in myeloid and lymphoid lineages, reflecting GBA’s critical role in immune cell function and macrophage biology (particularly relevant to Gaucher disease, where macrophages accumulate glucocerebroside).

Summary

GBA exhibits a ubiquitous but distinctly elevated expression pattern in immune tissues, endocrine cells, and specific stromal compartments. This distribution aligns with known biology: GBA encodes a lysosomal enzyme critical for sphingolipid catabolism, making its elevated expression in metabolically active immune and endocrine tissues functionally appropriate. The pattern also explains why GBA mutations preferentially affect immune cells and the central nervous system in Gaucher disease pathology.

Disease associations

Mendelian / Monogenic Diseases

Mutations in GBA are associated with the following inherited conditions:

Disease | Disease ID | Inheritance | Evidence
Gaucher disease type I | OMIM:230800 / MONDO:0009265 / Orphanet:77259 | Autosomal recessive | Strong
Gaucher disease type II | OMIM:230900 / MONDO:0009266 / Orphanet:77260 | Autosomal recessive | Strong
Gaucher disease type III | OMIM:231000 / MONDO:0009267 / Orphanet:77261 | Autosomal recessive | Strong
Gaucher disease-ophthalmoplegia-cardiovascular calcification syndrome | OMIM:231005 / MONDO:0009268 / Orphanet:2072 | Autosomal recessive | Strong
Gaucher disease perinatal lethal | OMIM:608013 / MONDO:0011945 / Orphanet:85212 | Autosomal recessive | Definitive
Late-onset Parkinson disease | OMIM:168600 / MONDO:0008199 / Orphanet:411602 | Autosomal dominant | Strong/Limited
Young-onset Parkinson disease | Orphanet:2828 | Variable | Supportive
Gaucher disease (general) | MONDO:0018150 / Orphanet:355 | Autosomal recessive | Definitive
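
Each Disease ID entry above packs several cross-references into one slash-separated string of prefix:accession pairs. As a minimal sketch of how such a string could be split for lookup by source database, the Python below builds a prefix-to-accession mapping. The function name parse_xrefs is hypothetical and is not part of any database API.

```python
def parse_xrefs(cell: str) -> dict[str, str]:
    """Split 'OMIM:230800 / MONDO:0009265' into {'OMIM': '230800', ...}."""
    xrefs = {}
    for token in cell.split("/"):
        token = token.strip()
        if not token:
            continue  # skip empty fragments from stray slashes
        # partition keeps everything after the first colon as the accession
        prefix, _, accession = token.partition(":")
        xrefs[prefix] = accession
    return xrefs


# Example row taken from the table: Gaucher disease type I
ids = parse_xrefs("OMIM:230800 / MONDO:0009265 / Orphanet:77259")
print(ids["MONDO"])  # accession stays a string so leading zeros survive
```

Accessions are kept as strings because MONDO identifiers carry leading zeros that an integer conversion would drop.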

Phenotype Associations

Top HPO terms associated with GBA:

Phenotype | HPO ID
Parkinsonism | HP:0001300
Dementia | HP:0000726
Hepatosplenomegaly | HP:0001433
Ophthalmoplegia | HP:0000602
Supranuclear ophthalmoplegia | HP:0000623
Spasticity | HP:0001257
Global developmental delay | HP:0001263
Ataxia | HP:0001251
Intellectual disability | HP:0001249
Dystonia | HP:0001332
Seizure | HP:0001250
Motor delay | HP:0001270
Hypotonia | HP:0001252
Mental deterioration | HP:0001268
Myoclonus | HP:0001336
Tremor | HP:0001337
Gait disturbance | HP:0001288
Dysarthria | HP:0001260
Encephalopathy | HP:0001298
Failure to thrive | HP:0001508
Hepatic failure | HP:0001399
Cirrhosis | HP:0001394
Portal hypertension | HP:0001409
Ascites | HP:0001541
Death in infancy | HP:0001522
Intrauterine growth retardation | HP:0001511
Ptosis | HP:0000508
Strabismus | HP:0000486
Horizontal nystagmus | HP:0000666
Diplopia | HP:0000651

Complex Disease / GWAS Associations

Top 29 GWAS associations:

Trait / Disease | Locus / region | Study ID | P-value
Parkinson’s disease | GBA1 locus | GCST004902_39 | 3e-35
Parkinson’s disease | GBA1 locus | GCST002544_10 | 1e-29
Parkinson’s disease | GBA1 locus | GCST001126_9 | 5e-21
Parkinson’s disease | GBA1 locus | GCST003984_10 | 5e-17
Intracranial aneurysm | GBA1 locus | GCST007354_2 | 1e-19
Body fat distribution (trunk fat ratio) | PKLR region | GCST007294_125 | 8e-35
Body fat distribution (trunk fat ratio) | PKLR region | GCST007294_84 | 6e-21
Body fat distribution (trunk fat ratio) | PKLR region | GCST007294_156 | 1e-15
Body fat distribution (leg fat ratio) | PKLR region | GCST007295_117 | 1e-28
Body fat distribution (leg fat ratio) | PKLR region | GCST007295_6 | 7e-17
Body fat distribution (leg fat ratio) | PKLR region | GCST007295_87 | 3e-13
Dementia with Lewy bodies | HMGN2P18-KRTCAP2 region | GCST010703_228 | 4e-14
Dementia with Lewy bodies | HMGN2P18-KRTCAP2 region | GCST005276_3 | 7e-10
Dementia with Lewy bodies | ASH1L region | GCST007881_2 | 1e-09
Brain morphology (MOSTest) | HMGN2P18-KRTCAP2 region | GCST010703_228 | 4e-14
Parkinson’s disease | HMGN2P18-KRTCAP2 region | GCST012431_13 | 5e-11
Renal function (BUN) | GBA1LP locus | GCST001610_2 | 2e-12
Heel bone mineral density × serum urate interaction | GBA1 locus | GCST012489_13 | 3e-09
Parkinson’s disease | SLC50A1 region | GCST001430_7 | 5e-08
Parkinson’s disease | KCNN3-PMVK region | GCST002455_3 | 2e-08
Waist circumference (BMI-adjusted) | GBA1 locus | GCST90020029_948 | 3e-08
Inflammatory bowel disease | SCAMP3 region | GCST004131_70 | 6e-08
Crohn’s disease | SCAMP3 region | GCST004132_19 | 2e-07
Cortical thickness (min-P) | LINC01915-LINC01894 region | GCST010696_10 | 2e-10
Cortical surface area (min-P) | LINC02240 region | GCST010697_13 | 3e-10
Subcortical volume (min-P) | FBXL7 region | GCST010698_84 | 9e-10
Brain morphology (min-P) | EZRP1-ALCAM region | GCST010699_25 | 7e-10
Cortical thickness (MOSTest) | NDUFAF2 region | GCST010700_5 | 8e-17
Cortical surface area (MOSTest) | MIR4432HG region | GCST010701_47 | 1e-09
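
The P-values above span many orders of magnitude, and a few sit at or above the conventional genome-wide significance threshold of 5e-8. The Python sketch below shows one way to separate rows on that threshold. It makes two assumptions: the threshold is the conventional one and is not stated in the table, and the five sample rows are read from the table with single-digit P-value mantissas. The function name split_by_threshold is hypothetical.

```python
GENOME_WIDE = 5e-8  # conventional threshold; an assumption, not stated in the table

# (trait, locus / region, P-value) for five rows taken from the table above
associations = [
    ("Parkinson's disease", "GBA1 locus", 3e-35),
    ("Parkinson's disease", "SLC50A1 region", 5e-08),
    ("Waist circumference (BMI-adjusted)", "GBA1 locus", 3e-08),
    ("Inflammatory bowel disease", "SCAMP3 region", 6e-08),
    ("Crohn's disease", "SCAMP3 region", 2e-07),
]


def split_by_threshold(rows, threshold=GENOME_WIDE):
    """Return (passing, failing) lists using a strict p < threshold test."""
    passing = [r for r in rows if r[2] < threshold]
    failing = [r for r in rows if r[2] >= threshold]
    return passing, failing


passing, failing = split_by_threshold(associations)
for trait, locus, p in failing:
    print(f"{trait} ({locus}): p = {p:g} does not pass p < {GENOME_WIDE:g}")
```

The comparison is strict, so a reported value of exactly 5e-08 lands in the failing group. Whether such borderline rows should count as significant depends on the convention of the original study.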

Structured Data Sources

Generated with Claude Haiku 4.5 + BioBTree MCP, drawing on data BioBTree aggregates from 48 biological databases. Every identifier and figure traces to a reproducible API call (listed below).


Datasets: alphafold, alphamissense, antibody, bgee, biogrid_interaction, caenorhabditis_elegans, ccds, chembl_molecule, chembl_target, clinical_trials, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, expressionatlas, gencc, go, gtex_expression, gwas, hgnc, hpa, hpo, intact, interpro, mim, mondo, msigdb, orphanet, ortholog, orthologentrez, panther, pdb, pfam, reactome, refseq, saccharomyces_cerevisiae, scxa, scxa_expression, scxa_gene_experiment, spliceai, string_interaction, supfam, transcript, ts, uniprot
Generated: 2026-05-25 — For the latest data, query BioBTree directly via MCP or API.
View API calls (214)