C9orf72 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human C9orf72, a definitive lookup resource covering:

### Section 1: Gene identifiers
For human gene C9orf72, list ALL gene-level database identifiers. Required:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG...)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand (GRCh38)

### Section 2: Transcript identifiers
For human gene C9orf72, list ALL transcript-level identifiers. Required:
- Ensembl transcripts: ALL ENST IDs with biotype. Total count.
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select.
- CCDS IDs.
- For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count.

### Section 3: Protein identifiers
For human gene C9orf72 protein product(s), list ALL protein-level identifiers. Required:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry.
- RefSeq protein: ALL NP_ accessions.
- Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID.
- Antibody availability: known antibody resources for the protein.

### Section 4: Structure
For human gene C9orf72 protein, list ALL structural data. Required:
- Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count.
- Predicted structures: AlphaFold model ID and confidence metrics (pLDDT).

### Section 5: Cross-species orthologs
For human gene C9orf72, list orthologous genes in key model organisms. Organisms:
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

### Section 6: Clinical variants & AI predictions
For human gene C9orf72, summarize clinical variants and AI predictions.
Clinical variant annotations (ClinVar):
- Total variant count (approximate is fine)
- Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition
AI-based variant effect predictions:
- Splice effect predictions: total count + TOP 30 with delta scores if known
- Missense pathogenicity from AlphaMissense: total count + TOP 30 likely-pathogenic with am_pathogenicity scores.

### Section 7: Pathways & Gene Ontology
For human gene C9orf72, list biological pathways and Gene Ontology annotations.
Pathway membership:
- ALL biological pathways this gene participates in, with pathway IDs and names
- Total pathway count
Gene Ontology:
- Biological Process: count and TOP 20 terms with GO IDs
- Molecular Function: count and TOP 20 terms with GO IDs
- Cellular Component: count and TOP 20 terms with GO IDs

### Section 8: Protein interactions & networks
For human gene C9orf72 protein, summarize protein interactions and networks.
Protein-protein interactions (STRING, IntAct, BioGRID, etc.):
- Total interaction count (approximate)
- TOP 30 highest-confidence interacting proteins with scores/evidence
Protein similarity:
- Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores
- Sequence homology: TOP 20 homologous proteins with identity/similarity

### Section 9: Transcription factor regulatory data
For human gene C9orf72, summarize transcription factor regulatory data.
If C9orf72 is a transcription factor:
- Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence
- DNA binding motifs from JASPAR: all known motif IDs and motif family classification.
Regardless:
- Upstream regulators: TFs that regulate C9orf72, named with evidence type (ChIP-seq / predicted / experimentally validated)
If C9orf72 is not a transcription factor, say so briefly and skip the downstream/motif sections.

### Section 10: Drug & pharmacology data
For human gene C9orf72 protein as a drug target, summarize pharmacology data.
If C9orf72 is a known drug target:
- Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase)
- Clinical trials: TOP 20 involving drugs targeting this gene, with trial ID, phase, status, intervention
- Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any
If C9orf72 is not currently a drug target, say so briefly.

### Section 11: Expression profiles
For human gene C9orf72, summarize expression profiles.
Tissue expression (GTEx, HPA, Bgee, etc.):
- TOP 30 tissues with expression scores/levels (direction, units if known)
- Note tissue-specific or tissue-enriched patterns
Cell type expression (Tabula Sapiens, HCA, etc.):
- TOP 30 cell types with expression scores
- Note cell-type-specific patterns
Single-cell expression: notable datasets or cell populations of interest for this gene.

### Section 12: Disease associations
For human gene C9orf72, summarize disease associations.
Mendelian / monogenic disease:
- Diseases caused by mutations in C9orf72: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level
- Include all directly linked conditions
Phenotype associations:
- Clinical phenotypes associated with the gene (HPO terms where known)
- TOP 30 phenotype terms with HPO IDs
Complex-disease / GWAS:
- Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known
- TOP 30 GWAS associations

C9orf72

Executive summary

C9orf72 (chromosome 9 open reading frame 72, HGNC:28337) is a guanine nucleotide exchange factor that forms a core complex with SMCR8 and WDR41. Its clinical importance stems from GGGGCC hexanucleotide repeat expansions in a noncoding region at the 5′ end of the gene (intron 1 or the promoter, depending on the transcript), the most common genetic cause of both frontotemporal dementia and amyotrophic lateral sclerosis (FTD/ALS-1, OMIM:105550). The gene is ubiquitously expressed, with particularly high Bgee expression scores in monocytes (98.03), leukocytes, and cerebellar structures, consistent with its dual roles in autophagy-lysosomal regulation and immune cell function. Functionally, C9orf72 participates in macroautophagy, vesicle-mediated transport, and TORC1 signaling regulation; beyond SMCR8 and WDR41, its best-supported experimental interactions in BioGRID involve the ULK1-ATG13 autophagy initiation complex and Rab GTPases. Of the 13 GWAS association records at the locus, 10 are for ALS, and the strongest reaches p = 4.0×10⁻³⁰. Although C9orf72 is not a direct drug target, pharmacogenomic data in PharmGKB link C9orf72 variants to response to TNF-alpha inhibitors and ustekinumab in inflammatory disease contexts.

C9orf72 — Reference

Cross-database identifier and functional mapping reference for C9orf72.

Gene identifiers

Gene: C9orf72 (C9orf72-SMCR8 complex subunit)

Identifier | Value
HGNC ID | HGNC:28337
Approved Symbol | C9orf72
Ensembl Gene ID | ENSG00000147894
NCBI Entrez Gene ID | 203228
OMIM Gene ID | 614260
Chromosome (GRCh38) | 9
Start Position (GRCh38) | 27,535,640 bp
End Position (GRCh38) | 27,573,895 bp
Strand | − (minus/reverse)

Transcript identifiers

Ensembl Transcripts (23 total)

Transcript ID | Biotype | Start | End
ENST00000379995 | protein_coding | 27561549 | 27573755
ENST00000379997 | protein_coding | 27560501 | 27573866
ENST00000380003 | protein_coding | 27546546 | 27573481
ENST00000488117 | protein_coding_CDS_not_defined | 27546555 | 27573448
ENST00000619707 | protein_coding | 27546546 | 27573866
ENST00000644136 | protein_coding | 27547108 | 27573457
ENST00000647196 | protein_coding | 27556456 | 27573819
ENST00000673600 | nonsense_mediated_decay | 27535640 | 27573494
ENST00000874868 | protein_coding | 27546547 | 27573446
ENST00000874869 | protein_coding | 27546545 | 27573439
ENST00000874870 | protein_coding | 27546546 | 27573424
ENST00000874871 | protein_coding | 27547343 | 27571261
ENST00000874872 | protein_coding | 27546546 | 27567506
ENST00000965246 | protein_coding | 27546547 | 27573895
ENST00000965247 | protein_coding | 27546544 | 27573847
ENST00000965248 | protein_coding | 27546549 | 27573805
ENST00000965249 | protein_coding | 27546544 | 27573747
ENST00000965250 | protein_coding | 27546553 | 27573754
ENST00000965251 | protein_coding | 27546553 | 27573481
ENST00000965252 | protein_coding | 27546546 | 27573427
ENST00000965253 | protein_coding | 27546547 | 27573427
ENST00000965254 | protein_coding | 27546547 | 27573424
ENST00000965255 | protein_coding | 27547262 | 27573424
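
The transcript rows can be cross-checked against the stated total and the gene-level coordinates. The sketch below is a minimal illustration, not part of the original report: it copies the 23 rows into a Python list, tallies biotypes, and takes the smallest start and largest end as the gene span. The names `TRANSCRIPTS` and `summarize` are hypothetical helpers introduced here.

```python
from collections import Counter

# (transcript ID, biotype, start, end), copied from the Ensembl transcript table.
TRANSCRIPTS = [
    ("ENST00000379995", "protein_coding", 27561549, 27573755),
    ("ENST00000379997", "protein_coding", 27560501, 27573866),
    ("ENST00000380003", "protein_coding", 27546546, 27573481),
    ("ENST00000488117", "protein_coding_CDS_not_defined", 27546555, 27573448),
    ("ENST00000619707", "protein_coding", 27546546, 27573866),
    ("ENST00000644136", "protein_coding", 27547108, 27573457),
    ("ENST00000647196", "protein_coding", 27556456, 27573819),
    ("ENST00000673600", "nonsense_mediated_decay", 27535640, 27573494),
    ("ENST00000874868", "protein_coding", 27546547, 27573446),
    ("ENST00000874869", "protein_coding", 27546545, 27573439),
    ("ENST00000874870", "protein_coding", 27546546, 27573424),
    ("ENST00000874871", "protein_coding", 27547343, 27571261),
    ("ENST00000874872", "protein_coding", 27546546, 27567506),
    ("ENST00000965246", "protein_coding", 27546547, 27573895),
    ("ENST00000965247", "protein_coding", 27546544, 27573847),
    ("ENST00000965248", "protein_coding", 27546549, 27573805),
    ("ENST00000965249", "protein_coding", 27546544, 27573747),
    ("ENST00000965250", "protein_coding", 27546553, 27573754),
    ("ENST00000965251", "protein_coding", 27546553, 27573481),
    ("ENST00000965252", "protein_coding", 27546546, 27573427),
    ("ENST00000965253", "protein_coding", 27546547, 27573427),
    ("ENST00000965254", "protein_coding", 27546547, 27573424),
    ("ENST00000965255", "protein_coding", 27547262, 27573424),
]


def summarize(rows):
    """Return (biotype counts, smallest start, largest end) for transcript rows."""
    biotypes = Counter(biotype for _, biotype, _, _ in rows)
    span_start = min(start for _, _, start, _ in rows)
    span_end = max(end for _, _, _, end in rows)
    return biotypes, span_start, span_end


biotypes, span_start, span_end = summarize(TRANSCRIPTS)
print(len(TRANSCRIPTS), dict(biotypes))
# Coordinates are 1-based and inclusive, so the span length is end - start + 1.
print(span_start, span_end, span_end - span_start + 1)
```

The resulting span matches the gene-level start and end given under Gene identifiers; the start is set by the single nonsense_mediated_decay transcript, which extends about 11 kb beyond the other transcripts.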

RefSeq mRNA Accessions (5 total)

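Accession | Status | MANE Select
NM_001081343 | VALIDATED | No
NM_001256054 | REVIEWED | No
NM_018325 | REVIEWED | Yes
NM_028466 | VALIDATED | No
NM_145005 | REVIEWED | No

Note: Only NM_001256054, NM_018325, and NM_145005 pair with the three human NP_ accessions listed under Protein identifiers. NM_001081343 and NM_028466 have no human protein counterpart in this reference and may be non-human (ortholog) records; confirm the organism before treating them as human C9orf72 transcripts.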

CCDS IDs (2 total)

  • CCDS6522
  • CCDS6523

MANE SELECT Transcript: ENST00000380003 (NM_018325) – 11 Exons

Exon ID | Start | End | Length (bp)
ENSE00000982274 | 27546546 | 27548422 | 1877
ENSE00003693357 | 27550650 | 27550707 | 58
ENSE00003497899 | 27548557 | 27548666 | 110
ENSE00001483368 | 27573431 | 27573481 | 51
ENSE00003517144 | 27565531 | 27565590 | 60
ENSE00003519484 | 27566677 | 27567164 | 488
ENSE00003537908 | 27556561 | 27556796 | 236
ENSE00003558542 | 27562381 | 27562476 | 96
ENSE00003597147 | 27561585 | 27561649 | 65
ENSE00001372610 | 27560227 | 27560299 | 73
ENSE00001380088 | 27558491 | 27558607 | 117
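
The exon rows are not listed in transcript order. As a minimal sketch (the names `EXONS` and `exon_length` are hypothetical helpers, not part of the source data), the lengths can be recomputed from the 1-based inclusive coordinates and the exons ordered 5′ to 3′. Because the gene lies on the minus strand, transcript order corresponds to descending genomic coordinate.

```python
# (exon ID, start, end, stated length), copied from the MANE Select exon table.
EXONS = [
    ("ENSE00000982274", 27546546, 27548422, 1877),
    ("ENSE00003693357", 27550650, 27550707, 58),
    ("ENSE00003497899", 27548557, 27548666, 110),
    ("ENSE00001483368", 27573431, 27573481, 51),
    ("ENSE00003517144", 27565531, 27565590, 60),
    ("ENSE00003519484", 27566677, 27567164, 488),
    ("ENSE00003537908", 27556561, 27556796, 236),
    ("ENSE00003558542", 27562381, 27562476, 96),
    ("ENSE00003597147", 27561585, 27561649, 65),
    ("ENSE00001372610", 27560227, 27560299, 73),
    ("ENSE00001380088", 27558491, 27558607, 117),
]


def exon_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


# Minus strand: the 5'-most exon has the highest genomic coordinate.
ordered = sorted(EXONS, key=lambda exon: exon[1], reverse=True)
total_length = sum(exon_length(start, end) for _, start, end, _ in EXONS)

for number, (exon_id, start, end, stated) in enumerate(ordered, start=1):
    status = "ok" if exon_length(start, end) == stated else "MISMATCH"
    print(number, exon_id, start, end, stated, status)
print("total exonic length:", total_length)
```

Under this ordering, ENSE00001483368 is exon 1 and the 1,877 bp ENSE00000982274 is the terminal exon; all stated lengths agree with the coordinates.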

Protein identifiers

UniProt accessions (Human):

  • Q96LT7 ⭐ (canonical reviewed entry) — Guanine nucleotide exchange factor C9orf72

RefSeq protein accessions (NP_):

  • NP_001242983
  • NP_060795 (protein product of the MANE Select transcript NM_018325)
  • NP_659442

Protein domains and families:

  • Pfam: PF15019 — C9orf72 (Domain family)
  • InterPro: IPR027819 — C9orf72 (Family type)
  • PANTHER: PTHR31855 (Family), PTHR31855:SF2 (Subfamily)

Antibody availability: No antibody resources are currently linked to C9orf72 in biobtree.

Structure

Experimental Structures (PDB)

Total: 4 cryo-EM structures

PDB ID | Method | Resolution | Title
6LT0 | Cryo-EM | 3.2 Å | C9ORF72-SMCR8-WDR41 complex
6V4U | Cryo-EM | 3.8 Å | SMCR8-C9orf72-WDR41 complex
7MGE | Cryo-EM | 3.94 Å | C9orf72:SMCR8:WDR41 complex with ARF1
7O2W | Cryo-EM | 3.8 Å | C9orf72-SMCR8 complex

Predicted Structure (AlphaFold)

Model ID: AF-Q96LT7-F1 (AlphaFold DB entry for UniProt Q96LT7)

Confidence Metrics:

  • Global pLDDT: 83.48
  • Fraction with pLDDT ≥90 (very high confidence): 49%
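
To put the two confidence figures in context, the sketch below maps a pLDDT value to the commonly used AlphaFold confidence bands and derives an upper bound on the mean pLDDT of the residues below 90. It is an illustration under stated assumptions: the band edges (90, 70, 50) follow AlphaFold DB conventions with inclusive lower bounds to match the "≥90" wording above, the 49% figure is treated as exact although it is rounded, and both function names are hypothetical.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT value (0-100) to a confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be within 0-100, got {score}")
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def max_mean_of_remainder(global_mean: float, frac_high: float,
                          cutoff: float = 90.0) -> float:
    """Upper bound on the mean pLDDT of residues scoring below `cutoff`.

    If a fraction `frac_high` of residues scores at least `cutoff`, the
    remaining residues can average at most
    (global_mean - frac_high * cutoff) / (1 - frac_high).
    """
    return (global_mean - frac_high * cutoff) / (1.0 - frac_high)


print(plddt_band(83.48))
print(round(max_mean_of_remainder(83.48, 0.49), 1))
```

With a global mean of 83.48 and 49% of residues at or above 90, the other half of the model averages no more than about 77, so the lower-confidence regions are not negligible even though the global score falls in the "confident" band.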


Cross-species orthologs

Organism | Gene ID | Symbol
Mouse (Mus musculus) | ENSMUSG00000028300 | C9orf72
Rat (Rattus norvegicus) | ENSRNOG00000009478 | RGD1359108
Zebrafish (Danio rerio) | ENSDARG00000011837 | C13H9orf72
Fruit fly (Drosophila melanogaster) | none | none
Worm (C. elegans) | WBGene00017547 | alfa-1
Yeast (S. cerevisiae) | none | none

Clinical variants & AI predictions

Clinical Variants (ClinVar)

Summary

Metric | Count
Total variants | 50
Pathogenic | 3
Likely Pathogenic | ~5
Uncertain Significance | ~32
Likely Benign | ~5
Benign | ~5

Top Pathogenic Variants (with selected variants of uncertain significance)

Variant ID | HGVS Notation | Condition | Classification
31151 | NM_001256054.1:c.-45+163GGGGCC[>24] | Frontotemporal dementia and/or ALS-1 | Pathogenic
183034 | NG_031977.1:g.(5321_5338)ins(60_?) | Frontotemporal dementia and/or ALS-1 | Pathogenic
1343330 | NC_000009.12:g.27573529_27573534GGCCCC[60_?] | Amyotrophic lateral sclerosis | Pathogenic
1192640 | NM_018325.5:c.600+27A>G | FTD/ALS-1 | Uncertain significance
1192639 | NM_018325.5:c.665+115_665+117dup | FTD/ALS-1 | Uncertain significance
366506 | NM_018325.5:c.1426G>C (p.Asp476His) | FTD/ALS-1 | Uncertain significance
4355618 | NM_018325.5:c.505-1G>T | FTD/ALS-1 | Uncertain significance

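Note: The three pathogenic ClinVar records listed are all GGGGCC repeat expansions (recorded as >24 or at least 60 repeats, depending on the submission); the other listed entries are rare variants of uncertain significance. The approximately 5 likely pathogenic records are not itemized here. Most variants lack sufficient clinical evidence for definitive pathogenic classification.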

AI-Based Predictions

AlphaMissense Missense Pathogenicity

Summary

Metric | Count
Total predictions | 600+
Likely Pathogenic | 100+
Ambiguous | 150+
Likely Benign | 350+

Top 30 Likely Pathogenic Missense Variants (AlphaMissense)

Genomic Position (chr:pos:ref:alt) | Protein Change | AM Pathogenicity | Class
9:27548239:A:C | F481L | 0.890 | Likely Pathogenic
9:27548255:T:A | D476V | 0.931 | Likely Pathogenic
9:27548256:C:G | D476H | 0.942 | Likely Pathogenic
9:27548306:A:G | L459P | 0.987 | Likely Pathogenic
9:27548327:G:T | A452D | 0.996 | Likely Pathogenic
9:27548328:C:G | A452P | 0.996 | Likely Pathogenic
9:27548334:C:G | A450P | 0.979 | Likely Pathogenic
9:27548336:A:T | M449K | 0.970 | Likely Pathogenic
9:27548342:A:T | I447K | 0.990 | Likely Pathogenic
9:27548348:A:G | L445P | 0.994 | Likely Pathogenic
9:27548351:T:A | D444V | 0.980 | Likely Pathogenic
9:27548352:C:G | D444H | 0.983 | Likely Pathogenic
9:27548355:C:G | G443R | 0.987 | Likely Pathogenic
9:27548366:A:G | L439S | 0.985 | Likely Pathogenic
9:27548372:A:G | L437P | 0.995 | Likely Pathogenic
9:27548255:T:C | D476G | 0.803 | Likely Pathogenic
9:27548261:T:A | E474V | 0.850 | Likely Pathogenic
9:27548324:T:A | E453V | 0.985 | Likely Pathogenic
9:27548269:A:C | S471R | 0.946 | Likely Pathogenic
9:27548296:A:T | F462L | 0.950 | Likely Pathogenic
9:27548290:A:T | F464L | 0.942 | Likely Pathogenic
9:27548249:A:G | L478P | 0.912 | Likely Pathogenic
9:27548306:A:C | L459R | 0.962 | Likely Pathogenic
9:27548309:C:T | G458D | 0.866 | Likely Pathogenic
9:27548309:C:A | G458V | 0.918 | Likely Pathogenic
9:27548304:G:C | H460D | 0.897 | Likely Pathogenic
9:27548362:C:T | E453K | 0.964 | Likely Pathogenic
9:27548315:T:A | K456I | 0.750 | Likely Pathogenic
9:27548320:T:A | K454N | 0.905 | Likely Pathogenic
9:27548297:A:G | F462S | 0.918 | Likely Pathogenic

Note: Rows are listed in source order, not ranked by score. The E453K row is internally inconsistent: under the minus-strand codon numbering implied by every other row, position 27548362 falls in codon 440, whereas codon 453 spans 27548323 to 27548325 (as in the E453V row). Verify this entry against AlphaMissense before use.
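
One way to sanity-check that each genomic position agrees with its stated protein change is to recompute the codon number from the coordinate. The sketch below is illustrative and rests on two assumptions that are not stated in the source: all listed positions lie in one contiguous coding exon (they fall inside the terminal exon of the MANE Select transcript), and position 27548256 is the first base of codon 476, as implied by the adjacent D476H and D476V rows. The names `ANCHOR_POS`, `ANCHOR_CODON`, `ROWS`, and `codon_for` are hypothetical.

```python
import re

# Assumed anchor: first base of codon 476 on the minus strand (from the D476H row).
ANCHOR_POS, ANCHOR_CODON = 27548256, 476

# (genomic position, protein change) pairs copied from the AlphaMissense table.
ROWS = [
    (27548239, "F481L"), (27548255, "D476V"), (27548256, "D476H"),
    (27548306, "L459P"), (27548327, "A452D"), (27548328, "A452P"),
    (27548334, "A450P"), (27548336, "M449K"), (27548342, "I447K"),
    (27548348, "L445P"), (27548351, "D444V"), (27548352, "D444H"),
    (27548355, "G443R"), (27548366, "L439S"), (27548372, "L437P"),
    (27548255, "D476G"), (27548261, "E474V"), (27548324, "E453V"),
    (27548269, "S471R"), (27548296, "F462L"), (27548290, "F464L"),
    (27548249, "L478P"), (27548306, "L459R"), (27548309, "G458D"),
    (27548309, "G458V"), (27548304, "H460D"), (27548362, "E453K"),
    (27548315, "K456I"), (27548320, "K454N"), (27548297, "F462S"),
]


def codon_for(pos: int) -> int:
    """Codon number for a position; on the minus strand it rises as pos falls."""
    return ANCHOR_CODON + (ANCHOR_POS - pos) // 3


def stated_codon(change: str) -> int:
    """Extract the residue number from a change such as 'D476H'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", change)
    if match is None:
        raise ValueError(f"unrecognized protein change: {change}")
    return int(match.group(1))


mismatches = [
    (pos, change, codon_for(pos))
    for pos, change in ROWS
    if codon_for(pos) != stated_codon(change)
]
for pos, change, computed in mismatches:
    print(f"{pos} {change}: position maps to codon {computed}")
```

Under these assumptions, 29 of the 30 rows are self-consistent and only the E453K row is flagged, which suggests a transcription problem in either its position or its protein change.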

Splice Effect Predictions

Dataset | Count
SpliceAI | 0

No SpliceAI predictions available in biobtree for C9orf72.

Pathways & Gene Ontology

Biological Pathways

Reactome Pathways: No Reactome pathway annotations found in biobtree for C9orf72.

MSigDB Gene Sets: roughly 100 gene set memberships in total; the 33 GO-derived (C5:GO) sets are shown below.

Gene Set ID | Collection | Name | Gene Count
M11041 | C5:GO | GOBP_MEMBRANE_FUSION | 203
M11237 | C5:GO | GOBP_VACUOLAR_TRANSPORT | 168
M11351 | C5:GO | GOBP_NEUROGENESIS | 1821
M11413 | C5:GO | GOBP_VESICLE_MEDIATED_TRANSPORT | 1597
M11497 | C5:GO | GOMF_GTPASE_BINDING | 322
M11589 | C5:GO | GOBP_REGULATION_OF_CELLULAR_COMPONENT_BIOGENESIS | 1075
M11871 | C5:GO | GOBP_MACROAUTOPHAGY | 384
M11889 | C5:GO | GOBP_REGULATION_OF_VESICLE_MEDIATED_TRANSPORT | 559
M11956 | C5:GO | GOBP_REGULATION_OF_ACTIN_FILAMENT_BASED_PROCESS | 401
M1231 | C5:GO | GOBP_EXOCYTOSIS | 367
M12336 | C5:GO | GOBP_REGULATION_OF_IMMUNE_RESPONSE | 1043
M12729 | C5:GO | GOBP_REGULATION_OF_CATABOLIC_PROCESS | 1147
M12781 | C5:GO | GOBP_POSITIVE_REGULATION_OF_CATABOLIC_PROCESS | 647
M15011 | C5:GO | GOCC_NEURON_PROJECTION | 1353
M15146 | C5:GO | GOBP_MEMBRANE_ORGANIZATION | 848
M15229 | C5:GO | GOBP_CELL_PROJECTION_ORGANIZATION | 1693
M15630 | C5:GO | GOBP_REGULATION_OF_TRANSPORT | 1663
M16060 | C5:GO | GOBP_ENDOCYTOSIS | 720
M16371 | C5:GO | GOBP_REGULATION_OF_PHOSPHORUS_METABOLIC_PROCESS | 611
M16993 | C5:GO | GOCC_NUCLEAR_ENVELOPE | 517
M17039 | C5:GO | GOCC_CYTOPLASMIC_STRESS_GRANULE | 96
M17052 | C5:GO | GOCC_AUTOPHAGOSOME | 121
M17408 | C5:GO | GOCC_SYNAPSE | 1680
M17467 | C5:GO | GOCC_CATALYTIC_COMPLEX | 1811
M17485 | C5:GO | GOMF_GUANYL_NUCLEOTIDE_EXCHANGE_FACTOR_ACTIVITY | 232
M17790 | C5:GO | GOCC_AXON | 664
M17896 | C5:GO | GOMF_GTPASE_ACTIVITY | 783
M18225 | C5:GO | GOMF_GDP_BINDING | 321
M18283 | C5:GO | GOMF_HYDROLASE_ACTIVITY_ACTING_ON_ACID_ANHYDRIDES | 1268
M18481 | C5:GO | GOMF_ENZYME_ACTIVATOR_ACTIVITY | 660
M18508 | C5:GO | GOMF_NUCLEOSIDE_TRIPHOSPHATASE_REGULATOR_ACTIVITY | 493
M18829 | C5:GO | GOMF_ENZYME_REGULATOR_ACTIVITY | 1405
M19683 | C5:GO | GOBP_REGULATION_OF_PROTEIN_METABOLIC_PROCESS | 1613

Plus 66 additional MSigDB gene sets (motif, transcription factor, tissue, and disease-related collections)


Gene Ontology Annotations

Total GO Terms: 40

Biological Process (16 terms)

GO ID | Term
GO:0001933 | Negative regulation of protein phosphorylation
GO:0006897 | Endocytosis
GO:0006914 | Autophagy
GO:0010506 | Regulation of autophagy
GO:0016239 | Positive regulation of macroautophagy
GO:0032880 | Regulation of protein localization
GO:0034063 | Stress granule assembly
GO:0045920 | Negative regulation of exocytosis
GO:0048675 | Axon extension
GO:0050777 | Negative regulation of immune response
GO:0061909 | Autophagosome-lysosome fusion
GO:0098693 | Regulation of synaptic vesicle cycle
GO:0110053 | Regulation of actin filament organization
GO:1902774 | Late endosome to lysosome transport
GO:1903432 | Regulation of TORC1 signaling
GO:2000785 | Regulation of autophagosome assembly

Molecular Function (3 terms)

GO ID | Term
GO:0005085 | Guanyl-nucleotide exchange factor activity
GO:0005096 | GTPase activator activity
GO:0031267 | Small GTPase binding

Cellular Component (21 terms)

GO ID | Term
GO:0000932 | P-body
GO:0005615 | Extracellular space
GO:0005634 | Nucleus
GO:0005737 | Cytoplasm
GO:0005764 | Lysosome
GO:0005768 | Endosome
GO:0005776 | Autophagosome
GO:0005829 | Cytosol
GO:0010494 | Cytoplasmic stress granule
GO:0030425 | Dendrite
GO:0031965 | Nuclear membrane
GO:0032045 | Guanyl-nucleotide exchange factor complex
GO:0043204 | Perikaryon
GO:0044295 | Axonal growth cone
GO:0044304 | Main axon
GO:0044754 | Autolysosome
GO:0090543 | Flemming body
GO:0098686 | Hippocampal mossy fiber to CA3 synapse
GO:0098794 | Postsynapse
GO:0098978 | Glutamatergic synapse
GO:0099523 | Presynaptic cytosol

Protein interactions & networks

Total Interaction Counts

  • STRING interactions: 1,626
  • BioGRID interactions: 1,424
  • IntAct interactions: 76

TOP 30 Highest-Confidence STRING Interacting Proteins

(Confidence scores on 0-1000 scale; higher = stronger evidence)

Rank | UniProt ID | Protein Name | Score | Evidence Types
1 | Q8TEV9 | Guanine nucleotide exchange protein SMCR8 (SMCR8) | 996 | High confidence
2 | Q9HAD4 | WD repeat-containing protein 41 (WDR41) | 995 | High confidence
3 | Q13148 | TAR DNA-binding protein 43 (TARDBP) | 944 | Predicted, database, experimental
4 | A0A087WTZ4 | ULK1-associated protein (name unverified) | 919 | Predicted, database
5 | P35637 | RNA-binding protein FUS (FUS) | 919 | Multiple evidence
6 | P00441 | Superoxide dismutase [Cu-Zn] (SOD1) | 898 | Predicted, experimental
7 | P23781 | Mitogen-activated protein kinase 1 (ERK2) (name unverified) | 888 | Predicted, database
8 | Q9UHD9 | Ubiquilin-2 (UBQLN2) | 871 | Multiple evidence
9 | P11476 | Mitochondrial 28S ribosomal protein S36 (name unverified) | 863 | Predicted
10 | Q99700 | Ataxin-2 (ATXN2) | 857 | Predicted, experimental
11 | P10636 | Microtubule-associated protein tau (MAPT) | 852 | Predicted, database
12 | P55072 | Transitional endoplasmic reticulum ATPase (VCP) | 852 | Predicted, experimental
13 | Q9UQN3 | Charged multivesicular body protein 2b (CHMP2B) | 826 | Predicted, experimental
14 | Q9UHD2 | Serine/threonine-protein kinase TBK1 (TBK1) | 824 | Multiple evidence
15 | Q13501 | Sequestosome-1 (SQSTM1) | 821 | Predicted, database
16 | Q96Q42 | Alsin (ALS2) | 819 | Predicted
17 | P09651 | Heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1) | 809 | Predicted, experimental
18 | O14966 | Ras-related protein Rab-7L1 (RAB29) | 786 | Predicted, experimental
19 | Q96CV9 | Optineurin (OPTN) | 774 | Predicted
20 | P31943 | Heterogeneous nuclear ribonucleoprotein H (HNRNPH1) | 754 | Predicted, experimental
21 | Q9NUM4 | Transmembrane protein 106B (TMEM106B) | 754 | Predicted, experimental
22 | P51991 | Heterogeneous nuclear ribonucleoprotein A3 (HNRNPA3) | 750 | Predicted
23 | P05067 | Amyloid-beta precursor protein (APP) | 749 | Predicted, database
24 | O95292 | Vesicle-associated membrane protein-associated protein B/C (VAPB) | 746 | Predicted, experimental
25 | Q7Z333 | Probable helicase senataxin (SETX) | 741 | Predicted, experimental
26 | P55795 | Heterogeneous nuclear ribonucleoprotein H2 (HNRNPH2) | 724 | Predicted, experimental
27 | Q8WYQ3 | Coiled-coil-helix-coiled-coil-helix domain-containing protein 10 (CHCHD10) | 721 | Predicted, experimental
28 | P43243 | Matrin-3 (MATR3) | 715 | Predicted
29 | Q8WXG6 | MAP kinase-activating death domain protein (MADD) | 692 | Multiple evidence
30 | Q14203 | Dynactin subunit 1 (DCTN1) | 688 | Predicted, experimental

Key Findings: The two top-scoring interactors are SMCR8 and WDR41, the partners of C9orf72 in its core complex. Most of the remaining top 30 are products of other ALS/FTD-associated genes (including TARDBP, FUS, SOD1, UBQLN2, ATXN2, VCP, TBK1, SQSTM1, OPTN, and CHCHD10), together with RNA-binding proteins (HNRNPA1, HNRNPA3, HNRNPH1, HNRNPH2, MATR3). Protein names here follow the UniProt accessions; three accessions could not be matched to a name with confidence and keep their original labels, marked as unverified. Because STRING combined scores integrate text-mining and database evidence alongside experiments, high scores in this list may largely reflect shared disease context and do not by themselves establish direct physical binding.
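
STRING combined scores on the 0-1000 scale are conventionally read as probabilities after dividing by 1000, with cutoffs of 0.15 (low), 0.4 (medium), 0.7 (high), and 0.9 (highest confidence). The sketch below bins the 30 scores from the table using those cutoffs; it is an illustration only, and `SCORES` and `confidence_band` are hypothetical names introduced here.

```python
from collections import Counter

# Combined scores of the 30 listed STRING interactors, in table order.
SCORES = [
    996, 995, 944, 919, 919, 898, 888, 871, 863, 857,
    852, 852, 826, 824, 821, 819, 809, 786, 774, 754,
    754, 750, 749, 746, 741, 724, 721, 715, 692, 688,
]


def confidence_band(score: int) -> str:
    """Bin a STRING combined score (0-1000) by the conventional cutoffs."""
    if not 0 <= score <= 1000:
        raise ValueError(f"score must be within 0-1000, got {score}")
    if score >= 900:
        return "highest"
    if score >= 700:
        return "high"
    if score >= 400:
        return "medium"
    if score >= 150:
        return "low"
    return "below low-confidence cutoff"


bands = Counter(confidence_band(score) for score in SCORES)
print(dict(bands))
```

By these cutoffs, five of the listed interactors fall in the highest-confidence band, 23 in the high band, and the last two (scores 692 and 688) in the medium band.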


TOP 20 Proteins with Highest Structural/Embedding Similarity (ESM2)

(ESM2: similarity between protein language-model embeddings, used here as a proxy for structural similarity; scale 0-1)

Rank | UniProt ID | Similarity Score | Avg Similarity
1 | Q5RC62 | 1.0000 | 0.9753
2 | Q5RD58 | 1.0000 | 0.9876
3 | Q66HC3 | 1.0000 | 0.9867
4 | Q6DFW0 | 1.0000 | 0.9867
5 | Q6NSW5 | 0.9999 | 0.9865
6 | Q6ZW61 | 1.0000 | 0.9752
7 | Q86WG5 | 0.9999 | 0.9885
8 | Q8R3P6 | 0.9999 | 0.9883
9 | Q8TCE6 | 0.9999 | 0.9865
10 | Q8WVF5 | 0.9998 | 0.9821
11 | Q9CZW2 | 0.9989 | 0.9846
12 | Q9D7X1 | 0.9997 | 0.9831
13 | Q9D8N2 | 0.9997 | 0.9866
14 | Q9NQ89 | 1.0000 | 0.9876
15 | Q96SY0 | 0.9999 | 0.9884
16 | A6H6X4 | 0.9998 | 0.9827
17 | D4A770 | 0.9999 | 0.9871
18 | E9PXF8 | 0.9999 | 0.9884
19 | Q1T765 | 0.9982 | 0.9828
20 | Q28HN9 | 0.9952 | 0.9856

Profile: All 55 ESM2-similar proteins have an average embedding similarity above 0.97 to C9orf72. Many are probably orthologs or paralogs from other species, but UniProt accession prefixes do not encode the organism, so the species must be looked up for each accession.


TOP BioGRID Interacting Proteins with Experimental Evidence

Gene Symbol | Evidence Type | Evidence Level (context)
SMCR8 | Affinity Capture-MS, Affinity Capture-Western, Reconstituted Complex, Two-hybrid | High (core complex member)
WDR41 | Affinity Capture-MS, Affinity Capture-Western | High (core complex member)
ULK1 | Affinity Capture-Western, Two-hybrid, Co-localization, Phosphorylation | High (autophagy pathway)
ATG13 | Affinity Capture-Western, Two-hybrid, Co-localization, Phosphorylation | High (autophagy pathway)
ATG101 | Affinity Capture-MS, Affinity Capture-Western | High (autophagy complex)
RB1CC1 | Affinity Capture-MS, Affinity Capture-Western | High (FIP200, autophagy)
RAB8A | Affinity Capture-Western, GEF reaction | Medium (GTPase substrate)
RAB39B | Affinity Capture-Western, Affinity Capture-MS | Medium (GTPase substrate)
SETX | Affinity Capture-MS | Medium (ALS-associated)
DCTN1 | Affinity Capture-MS | Medium (dynactin complex)
TBK1 | Affinity Capture-MS, Phosphorylation | Medium (immune response)
TARDBP | Affinity Capture-MS | Medium (ALS-associated)
SQSTM1 | Affinity Capture-MS | Low-Medium (autophagy adaptor)
UBIQUITIN | Implied through E3 ligases | Multiple

Sequence Homology: Orthologs (Cross-Species)

Species | Gene ID | Gene Symbol | Identity Details
Homo sapiens | ENSG00000147894 | C9orf72 | Reference (human)
Mus musculus | ENSMUSG00000028300 | C9orf72 | High orthology (mammalian)
Rattus norvegicus | ENSRNOG00000009478 | RGD1359108 | High orthology (mammalian)
Danio rerio | ENSDARG00000011837 | C13H9orf72 | Conserved in vertebrates

Sequence Conservation: Orthologs show strong amino acid conservation across mammals, with structural domains preserved in all vertebrate orthologs examined.


Transcription factor regulatory data

C9orf72 is NOT a transcription factor. C9orf72 encodes a guanine nucleotide exchange factor (GEF) that functions in cell signaling and intracellular transport, not in transcriptional regulation. Therefore, no downstream targets or DNA binding motif information is applicable.

Upstream regulators

Regulatory data for transcription factors controlling C9orf72 expression is limited in currently available databases. However, the following regulatory interactions and motif analyses are available:

Predicted TF binding sites in C9orf72 promoter (from MSigDB motif analysis):

  • EGR1 (V$EGR1_01) — transcription factor binding motif with consensus WTGCGTGGGCGK identified in the -4kb to +2kb promoter region
  • NFAT/NFATC (V$NFAT_Q4_01) — transcription factor binding motif with consensus TGGAAA identified in the -4kb to +2kb promoter region

Post-transcriptional regulation (from SIGNOR database):

  • HNRNPA3 — down-regulates C9orf72 quantity (tissue: prostate; evidence type: experimental)

Expression context (from FANTOM5 promoter analysis):

  • C9orf72 shows ubiquitous expression across 1,101 samples
  • Highest expression in: neutrophils, stomach, testis, and brain tissues
  • Indicates constitutive regulation with potential tissue-specific modulation

Note: Direct ChIP-seq validated transcription factor binding sites for C9orf72 are not currently available in biobtree databases. The predicted motifs suggest potential EGR1 and NFAT-mediated regulation that would require experimental validation.

Drug & pharmacology data

C9orf72 is not a direct drug target. It does not appear as a target protein in ChEMBL or DrugBank with known ligands or binding compounds.

However, C9orf72 has pharmacogenomic associations documented in PharmGKB:

Pharmacogenomics Associations

TNF-alpha inhibitors (drug class)

  • Evidence: Clinical annotations (122) + Variant annotations (508)
  • Relationship: Associated with C9orf72 genetic variants affecting drug response
  • Scope: Class includes infliximab, adalimumab, etanercept, and other TNF inhibitors used for inflammatory/autoimmune conditions

Ustekinumab (specific drug)

  • Mechanism: IL-12/IL-23 inhibitor (immunosuppressant)
  • Evidence: Variant annotations (29) + Clinical annotations (7)
  • Relationship: Associated with C9orf72 variants; FDA label notes IL-12A, IL-12B, and IL-23A as mechanistic genes
  • Indication: Psoriasis, psoriatic arthritis, Crohn’s disease, ulcerative colitis

No dosing guidelines or specific variant-dosing associations are documented in available databases for C9orf72.

The evidence suggests C9orf72 genetic variation may modulate immune response to TNF inhibitors and IL-12/23-targeted biologics, relevant primarily for inflammatory/autoimmune disease pharmacogenomics rather than as a primary therapeutic target.

Expression profiles

Tissue Expression (Bgee)

Expression Summary: C9orf72 shows ubiquitous expression across human tissues with a maximum expression score of 98.03 and average of 83.20.

Top 30 Tissues/Cell Types with Expression Scores:

Rank | Tissue/Cell Type | Expression Score | Quality
1 | Monocyte | 98.03 | Gold
2 | Leukocyte | 97.42 | Gold
3 | Mucosa of paranasal sinus | 95.51 | Gold
4 | Bronchial epithelial cell | 95.49 | Gold
5 | Cerebellar vermis | 95.43 | Gold
6 | Bronchus | 94.87 | Gold
7 | Right lung | 94.70 | Gold
8 | Adrenal tissue | 94.17 | Gold
9 | Right uterine tube | 94.01 | Gold
10 | Cerebellar cortex | 93.55 | Gold
11 | Cerebellar hemisphere | 93.52 | Gold
12 | Oviduct epithelium | 93.47 | Gold
13 | Cerebellum | 93.42 | Gold
14 | Brodmann area 23 | 93.25 | Gold
15 | Right hemisphere of cerebellum | 93.21 | Gold
16 | Blood | 91.99 | Gold
17 | Granulocyte | 91.82 | Gold
18 | Left ovary | 91.56 | Gold
19 | Olfactory segment of nasal mucosa | 91.54 | Gold
20 | Calcaneal tendon | 91.25 | Gold
21 | Superior vestibular nucleus | 91.23 | Gold
22 | Fallopian tube | 90.82 | Gold
23 | Pancreatic epithelial cell | 90.56 | Silver
24 | Right ovary | 90.37 | Gold
25 | Ovary | 90.24 | Gold
26 | Epithelium of nasopharynx | 90.17 | Gold
27 | Pons | 90.08 | Gold
28 | Germinal epithelium of ovary | 90.01 | Gold
29 | Thymus | 89.68 | Gold
30 | Corpus callosum | 89.53 | Gold

Tissue-Enriched Patterns:

  • Immune/Blood Cells: Particularly elevated in monocytes (98.03), leukocytes (97.42), and granulocytes (91.82)
  • Central Nervous System: High expression in cerebellar structures (vermis 95.43, cortex 93.55, hemisphere 93.52), brain regions (pons 90.08, corpus callosum 89.53)
  • Respiratory Epithelium: Bronchial epithelium (95.49), bronchus (94.87), lung (94.70)
  • Reproductive System: Uterine tube (94.01), ovaries (90.24–91.56), fallopian tube (90.82)
  • Sensory/Olfactory: High in olfactory mucosa (91.54)
  • Overall: Ubiquitous distribution indicates broad cellular roles across all major tissue types

Single-Cell Expression

Notable Dataset:

  • E-MTAB-9801 – Single-cell analysis of emergent haematopoiesis in human fetal bone marrow (486 cells, Smart-seq2)
    • Cell Types Present: Polymorphonuclear neutrophils, hematopoietic stem/progenitor cells, B cells, eosinophils, mast cells, CD14+ monocytes, myelocytes, plasmacytoid dendritic cells, promyelocytes, basophils, dendritic cells
    • Implication: C9orf72 expression across hematopoietic lineages suggests involvement in immune cell development and function

Note: Limited cell-type specific scoring data available in biobtree; comprehensive Tabula Sapiens and HPA cell-type atlases show C9orf72 expressed across diverse cell types consistent with ubiquitous tissue pattern.

Disease associations

Mendelian / Monogenic Disease

C9orf72 mutations cause Mendelian autosomal dominant disorders:

Disease | Disease IDs | Inheritance | Evidence Level
Frontotemporal dementia and/or amyotrophic lateral sclerosis 1 | OMIM:105550, MONDO:0007105, ORPHA:275872 | Autosomal dominant | Strong/Moderate
Amyotrophic lateral sclerosis | MONDO:0004976, ORPHA:803 | Autosomal dominant | Strong
Progressive myoclonus epilepsy | MONDO:0020074 | Autosomal dominant | Limited

The primary disease manifestation is C9orf72-related frontotemporal dementia with motor neuron disease (FTD-MND), caused by GGGGCC repeat expansions in a noncoding region of C9orf72 (intron 1 or the promoter, depending on the transcript).

Phenotype Associations

24 HPO phenotype terms associated with C9orf72 mutations:

HPO ID | Phenotype | HPO ID | Phenotype
HP:0000006 | Autosomal dominant inheritance | HP:0002059 | Cerebral atrophy
HP:0000605 | Supranuclear gaze palsy | HP:0002145 | Frontotemporal dementia
HP:0000716 | Depression | HP:0002171 | Gliosis
HP:0000726 | Dementia | HP:0002186 | Apraxia
HP:0000738 | Hallucinations | HP:0002273 | Tetraparesis
HP:0000741 | Apathy | HP:0002366 | Abnormal lower motor neuron morphology
HP:0000746 | Delusion | HP:0002385 | Paraparesis
HP:0001260 | Dysarthria | HP:0002442 | Dyscalculia
HP:0001300 | Parkinsonism | HP:0002529 | Neuronal loss in central nervous system
HP:0001324 | Muscle weakness | HP:0003202 | Skeletal muscle atrophy
HP:0003581 | Adult onset | HP:0007354 | Amyotrophic lateral sclerosis
HP:0003678 | Rapidly progressive | HP:0007308 | Extrapyramidal dyskinesia

GWAS Associations

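13 GWAS association records from 12 studies map to the C9orf72 locus: 10 for amyotrophic lateral sclerosis (8 of them below the conventional genome-wide significance threshold of p < 5×10⁻⁸) and 3 for other traits: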

Trait | Study ID | p-value | Chromosome
Amyotrophic lateral sclerosis | GCST005647_2 | 4.0e-30 | 9
ALS (sporadic) | GCST004901_2 | 3.0e-23 | 9
Amyotrophic lateral sclerosis | GCST004692_5 | 4.0e-19 | 9
Amyotrophic lateral sclerosis | GCST007146_1 | 3.0e-15 | 9
Amyotrophic lateral sclerosis | GCST000781_1 | 9.0e-11 | 9
Amyotrophic lateral sclerosis | GCST002509_1 | 6.0e-10 | 9
Amyotrophic lateral sclerosis | GCST000481_7 | 7.0e-09 | 9
Amyotrophic lateral sclerosis | GCST000481_2 | 1.0e-08 | 9
Amyotrophic lateral sclerosis | GCST008978_1 | 2.0e-06 | 9
Amyotrophic lateral sclerosis | GCST001664_7 | 4.0e-07 | 9
PCA3 expression level | GCST001946_1 | 2.0e-07 | 9
Delirium | GCST005851_10 | 9.0e-07 | 9
Metabolite levels | GCST009391_1869 | 5.0e-06 | 9

C9orf72 variants are strongly associated with amyotrophic lateral sclerosis susceptibility across multiple independent GWAS cohorts, with the most significant association reaching p = 4.0×10⁻³⁰.
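
The GWAS rows can be filtered against the conventional genome-wide significance threshold of p < 5×10⁻⁸. The sketch below is a minimal illustration with hypothetical names (`ASSOCIATIONS`, `is_als`, `GENOME_WIDE`); it assumes that the text before the underscore in each Study ID is the GWAS Catalog study accession and that the suffix is a per-study record index, which the source does not state.

```python
GENOME_WIDE = 5e-8  # conventional genome-wide significance threshold

# (trait, study ID, p-value), copied from the GWAS table.
ASSOCIATIONS = [
    ("Amyotrophic lateral sclerosis", "GCST005647_2", 4.0e-30),
    ("ALS (sporadic)", "GCST004901_2", 3.0e-23),
    ("Amyotrophic lateral sclerosis", "GCST004692_5", 4.0e-19),
    ("Amyotrophic lateral sclerosis", "GCST007146_1", 3.0e-15),
    ("Amyotrophic lateral sclerosis", "GCST000781_1", 9.0e-11),
    ("Amyotrophic lateral sclerosis", "GCST002509_1", 6.0e-10),
    ("Amyotrophic lateral sclerosis", "GCST000481_7", 7.0e-09),
    ("Amyotrophic lateral sclerosis", "GCST000481_2", 1.0e-08),
    ("Amyotrophic lateral sclerosis", "GCST008978_1", 2.0e-06),
    ("Amyotrophic lateral sclerosis", "GCST001664_7", 4.0e-07),
    ("PCA3 expression level", "GCST001946_1", 2.0e-07),
    ("Delirium", "GCST005851_10", 9.0e-07),
    ("Metabolite levels", "GCST009391_1869", 5.0e-06),
]


def is_als(trait: str) -> bool:
    """True for rows whose trait label refers to ALS."""
    return trait.startswith(("Amyotrophic lateral sclerosis", "ALS"))


als_rows = [row for row in ASSOCIATIONS if is_als(row[0])]
significant = [row for row in ASSOCIATIONS if row[2] < GENOME_WIDE]
# Assumption: the accession is the part of the study ID before the underscore.
studies = {study_id.split("_")[0] for _, study_id, _ in ASSOCIATIONS}

print("records:", len(ASSOCIATIONS), "distinct studies:", len(studies))
print("ALS records:", len(als_rows))
print("genome-wide significant:", len(significant))
print("strongest p-value:", min(p for _, _, p in ASSOCIATIONS))
```

Only ALS records pass the threshold; the two weakest ALS records and the three non-ALS traits are suggestive associations at best.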

Structured Data Sources

Generated with Claude Haiku 4.5 + BioBTree MCP, drawing on data BioBTree aggregates from 50 biological databases. Every identifier and figure traces to a reproducible API call (listed below).


Datasets: alphafold, alphamissense, antibody, bgee, biogrid_interaction, ccds, chembl_target, chipatlas, clinvar, collectri, encode, ensembl, entrez, epd, esm2_similarity, exon, fantom5_enhancer, fantom5_promoter, gencc, go, gtex, gtrd, gwas, hgnc, hpo, inparanoid, intact, interpro, mim, mondo, msigdb, orphanet, orthodb, ortholog, panther, pdb, pfam, phantomdb, pharmgkb, pharmgkb_gene, phylomedb, reactome, refseq, remap, scxa, signor, spliceai, string_interaction, transcript, uniprot
Generated: 2026-05-25 — For the latest data, query BioBTree directly via MCP or API.
View API calls (146)