C9orf72 Gene Complete Identifier and Functional Mapping Reference
Provide a comprehensive cross-database identifier and functional mapping reference for human C9orf72 — a definitive lookup resource covering: ### Section 1: Gene identifiers For human gene C9orf72, list ALL gene-level database identifiers. Required: - HGNC ID and approved symbol - Ensembl gene ID (ENSG...) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand (GRCh38) ### Section 2: Transcript identifiers For human gene C9orf72, list ALL transcript-level identifiers. Required: - Ensembl transcripts: ALL ENST IDs with biotype. Total count. - RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select. - CCDS IDs. - For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count. ### Section 3: Protein identifiers For human gene C9orf72 protein product(s), list ALL protein-level identifiers. Required: - UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry. - RefSeq protein: ALL NP_ accessions. - Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID. - Antibody availability: known antibody resources for the protein. ### Section 4: Structure For human gene C9orf72 protein, list ALL structural data. Required: - Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count. - Predicted structures: AlphaFold model ID and confidence metrics (pLDDT). ### Section 5: Cross-species orthologs For human gene C9orf72, list orthologous genes in key model organisms. Organisms: - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. 
cerevisiae): gene ID, symbol ### Section 6: Clinical variants & AI predictions For human gene C9orf72, summarize clinical variants and AI predictions. Clinical variant annotations (ClinVar): - Total variant count (approximate is fine) - Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign - TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: total count + TOP 30 with delta scores if known - Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores. ### Section 7: Pathways & Gene Ontology For human gene C9orf72, list biological pathways and Gene Ontology annotations. Pathway membership: - ALL biological pathways this gene participates in, with pathway IDs and names - Total pathway count Gene Ontology: - Biological Process: count and TOP 20 terms with GO IDs - Molecular Function: count and TOP 20 terms with GO IDs - Cellular Component: count and TOP 20 terms with GO IDs ### Section 8: Protein interactions & networks For human gene C9orf72 protein, summarize protein interactions and networks. Protein-protein interactions (STRING, IntAct, BioGRID, etc.): - Total interaction count (approximate) - TOP 30 highest-confidence interacting proteins with scores/evidence Protein similarity: - Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores - Sequence homology: TOP 20 homologous proteins with identity/similarity ### Section 9: Transcription factor regulatory data For human gene C9orf72, summarize transcription factor regulatory data. If C9orf72 is a transcription factor: - Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence - DNA binding motifs from JASPAR — all known motif IDs and motif family classification. 
Regardless: - Upstream regulators: TFs that regulate C9orf72 — names with evidence type (ChIP-seq / predicted / experimentally validated) If C9orf72 is not a transcription factor, say so briefly and skip the downstream/motif sections. ### Section 10: Drug & pharmacology data For human gene C9orf72 protein as a drug target, summarize pharmacology data. If C9orf72 is a known drug target: - Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase) - Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention - Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any If C9orf72 is not currently a drug target, say so briefly. ### Section 11: Expression profiles For human gene C9orf72, summarize expression profiles. Tissue expression (GTEx, HPA, Bgee, etc.): - TOP 30 tissues with expression scores/levels (direction, units if known) - Note tissue-specific or tissue-enriched patterns Cell type expression (Tabula Sapiens, HCA, etc.): - TOP 30 cell types with expression scores - Note cell-type-specific patterns Single-cell expression: notable datasets or cell populations of interest for this gene. ### Section 12: Disease associations For human gene C9orf72, summarize disease associations. Mendelian / monogenic disease: - Diseases caused by mutations in C9orf72: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level - Include all directly linked conditions Phenotype associations: - Clinical phenotypes associated with the gene (HPO terms where known) - TOP 30 phenotype terms with HPO IDs Complex-disease / GWAS: - Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known - TOP 30 GWAS associations
Executive summary
C9orf72 (chromosome 9 open reading frame 72, HGNC:28337) encodes a guanine nucleotide exchange factor that forms a core complex with SMCR8 and WDR41. Its clinical importance stems from GGGGCC hexanucleotide repeat expansions in a noncoding region of the gene (the first intron; ClinVar places the repeat at c.-45+163), the most common genetic cause of both frontotemporal dementia and amyotrophic lateral sclerosis (FTD/ALS-1, OMIM:105550). The gene is ubiquitously expressed, with particularly high Bgee scores in monocytes (98.03), leukocytes, and cerebellar structures, consistent with its dual roles in autophagy-lysosomal regulation and immune cell function. Functionally, C9orf72 participates in macroautophagy, vesicle-mediated transport, and TORC1 signaling regulation; beyond its core partners, its best-supported experimental interactions in BioGRID involve the ULK1-ATG13 autophagy initiation complex and Rab GTPases. Of the 13 GWAS associations recorded at the locus, 10 are for ALS, and the strongest reaches p = 4.0×10⁻³⁰. Although C9orf72 is not a direct drug target, PharmGKB links C9orf72 variants to response to TNF-alpha inhibitors and ustekinumab in inflammatory disease contexts.
C9orf72 — Reference
Cross-database identifier and functional mapping reference for C9orf72.
Gene identifiers
Gene: C9orf72 (C9orf72-SMCR8 complex subunit)
| Identifier | Value |
|---|---|
| HGNC ID | HGNC:28337 |
| Approved Symbol | C9orf72 |
| Ensembl Gene ID | ENSG00000147894 |
| NCBI Entrez Gene ID | 203228 |
| OMIM Gene ID | 614260 |
| Genomic Location (GRCh38) | |
| Chromosome | 9 |
| Start Position | 27,535,640 bp |
| End Position | 27,573,895 bp |
| Strand | − (minus/reverse) |
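As a quick sanity check on the identifiers above, the following sketch validates each value against its expected format and derives the locus span from the GRCh38 coordinates. The regular expressions and the helper names (`malformed_ids`, `span_bp`, `locus_string`) are assumptions made for this illustration, not an official validation scheme of any of the databases.

```python
import re

# Illustrative format patterns (assumptions for this sketch).
ID_PATTERNS = {
    "HGNC ID": re.compile(r"HGNC:\d+"),
    "Ensembl Gene ID": re.compile(r"ENSG\d{11}"),
    "NCBI Entrez Gene ID": re.compile(r"\d+"),
    "OMIM Gene ID": re.compile(r"\d{6}"),
}

# Values copied from the gene identifier table.
GENE_IDS = {
    "HGNC ID": "HGNC:28337",
    "Ensembl Gene ID": "ENSG00000147894",
    "NCBI Entrez Gene ID": "203228",
    "OMIM Gene ID": "614260",
}


def malformed_ids(ids):
    """Return the names of identifiers that do not match their pattern."""
    return [name for name, value in ids.items()
            if not ID_PATTERNS[name].fullmatch(value)]


def span_bp(start, end):
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


def locus_string(chrom, start, end):
    """Format a region string such as chr9:27535640-27573895."""
    return f"chr{chrom}:{start}-{end}"


print(malformed_ids(GENE_IDS))
print(locus_string("9", 27535640, 27573895), span_bp(27535640, 27573895))
```

With the table's coordinates, the locus spans 38,256 bp.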
Transcript identifiers
Ensembl Transcripts (23 total)
| Transcript ID | Biotype | Start | End |
|---|---|---|---|
| ENST00000379995 | protein_coding | 27561549 | 27573755 |
| ENST00000379997 | protein_coding | 27560501 | 27573866 |
| ENST00000380003 | protein_coding | 27546546 | 27573481 |
| ENST00000488117 | protein_coding_CDS_not_defined | 27546555 | 27573448 |
| ENST00000619707 | protein_coding | 27546546 | 27573866 |
| ENST00000644136 | protein_coding | 27547108 | 27573457 |
| ENST00000647196 | protein_coding | 27556456 | 27573819 |
| ENST00000673600 | nonsense_mediated_decay | 27535640 | 27573494 |
| ENST00000874868 | protein_coding | 27546547 | 27573446 |
| ENST00000874869 | protein_coding | 27546545 | 27573439 |
| ENST00000874870 | protein_coding | 27546546 | 27573424 |
| ENST00000874871 | protein_coding | 27547343 | 27571261 |
| ENST00000874872 | protein_coding | 27546546 | 27567506 |
| ENST00000965246 | protein_coding | 27546547 | 27573895 |
| ENST00000965247 | protein_coding | 27546544 | 27573847 |
| ENST00000965248 | protein_coding | 27546549 | 27573805 |
| ENST00000965249 | protein_coding | 27546544 | 27573747 |
| ENST00000965250 | protein_coding | 27546553 | 27573754 |
| ENST00000965251 | protein_coding | 27546553 | 27573481 |
| ENST00000965252 | protein_coding | 27546546 | 27573427 |
| ENST00000965253 | protein_coding | 27546547 | 27573427 |
| ENST00000965254 | protein_coding | 27546547 | 27573424 |
| ENST00000965255 | protein_coding | 27547262 | 27573424 |
RefSeq mRNA Accessions (5 total)
| Accession | Status | MANE Select |
|---|---|---|
| NM_001081343 | VALIDATED | No |
| NM_001256054 | REVIEWED | No |
| NM_018325 | REVIEWED | Yes |
| NM_028466 | VALIDATED | No |
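| NM_145005 | REVIEWED | No |
Note: Only NM_001256054, NM_018325, and NM_145005 have matching NP_ accessions in the protein section below. NM_001081343 and NM_028466 have no listed human protein counterpart and should be verified as human records in RefSeq before use.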
CCDS IDs (2 total)
- CCDS6522
- CCDS6523
MANE Select Transcript: ENST00000380003 (NM_018325) – 11 Exons
Exons are listed in transcript order (5′ to 3′). Because the gene lies on the minus strand, genomic coordinates decrease from exon 1 to exon 11.
| Exon | Exon ID | Start | End | Length (bp) |
|---|---|---|---|---|
| 1 | ENSE00001483368 | 27573431 | 27573481 | 51 |
| 2 | ENSE00003519484 | 27566677 | 27567164 | 488 |
| 3 | ENSE00003517144 | 27565531 | 27565590 | 60 |
| 4 | ENSE00003558542 | 27562381 | 27562476 | 96 |
| 5 | ENSE00003597147 | 27561585 | 27561649 | 65 |
| 6 | ENSE00001372610 | 27560227 | 27560299 | 73 |
| 7 | ENSE00001380088 | 27558491 | 27558607 | 117 |
| 8 | ENSE00003537908 | 27556561 | 27556796 | 236 |
| 9 | ENSE00003693357 | 27550650 | 27550707 | 58 |
| 10 | ENSE00003497899 | 27548557 | 27548666 | 110 |
| 11 | ENSE00000982274 | 27546546 | 27548422 | 1877 |
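The exon coordinates can be checked programmatically. The sketch below is a minimal illustration, with the helper names `transcript_order` and `exon_length` being hypothetical. It orders the 11 exons in transcript direction for a minus-strand gene (highest genomic coordinate first) and recomputes each length from the 1-based inclusive coordinates.

```python
# Exon records copied from the MANE Select table: (exon ID, start, end).
MANE_EXONS = [
    ("ENSE00000982274", 27546546, 27548422),
    ("ENSE00003693357", 27550650, 27550707),
    ("ENSE00003497899", 27548557, 27548666),
    ("ENSE00001483368", 27573431, 27573481),
    ("ENSE00003517144", 27565531, 27565590),
    ("ENSE00003519484", 27566677, 27567164),
    ("ENSE00003537908", 27556561, 27556796),
    ("ENSE00003558542", 27562381, 27562476),
    ("ENSE00003597147", 27561585, 27561649),
    ("ENSE00001372610", 27560227, 27560299),
    ("ENSE00001380088", 27558491, 27558607),
]


def transcript_order(exons, strand):
    """Sort exons 5' to 3': ascending start on '+', descending on '-'."""
    return sorted(exons, key=lambda exon: exon[1], reverse=(strand == "-"))


def exon_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


ordered = transcript_order(MANE_EXONS, "-")
for number, (exon_id, start, end) in enumerate(ordered, start=1):
    print(number, exon_id, start, end, exon_length(start, end))

# Total spliced length implied by the listed exons.
total_exonic_bp = sum(exon_length(s, e) for _, s, e in MANE_EXONS)
print(total_exonic_bp)
```

The first exon in transcript order is ENSE00001483368 and the last is ENSE00000982274, which contains the 3′ end of the transcript.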
Protein identifiers
UniProt accessions (Human):
- Q96LT7 ⭐ (canonical reviewed entry) — Guanine nucleotide exchange factor C9orf72
RefSeq protein accessions (NP_):
- NP_001242983
- NP_060795 (protein product of the MANE Select transcript NM_018325)
- NP_659442
Protein domains and families:
- Pfam: PF15019 — C9orf72 (Domain family)
- InterPro: IPR027819 — C9orf72 (Family type)
- PANTHER: PTHR31855 (Family), PTHR31855:SF2 (Subfamily)
Antibody availability: No antibody resources are currently linked to C9orf72 in biobtree.
Structure
Experimental Structures (PDB)
Total: 4 cryo-EM structures
| PDB ID | Method | Resolution | Title |
|---|---|---|---|
| 6LT0 | Cryo-EM | 3.2 Å | C9ORF72-SMCR8-WDR41 complex |
| 6V4U | Cryo-EM | 3.8 Å | SMCR8-C9orf72-WDR41 complex |
| 7MGE | Cryo-EM | 3.94 Å | C9orf72:SMCR8:WDR41 complex with ARF1 |
| 7O2W | Cryo-EM | 3.8 Å | C9orf72-SMCR8 complex |
Predicted Structure (AlphaFold)
Model ID: Q96LT7
Confidence Metrics:
- Global pLDDT: 83.48
- Fraction with pLDDT ≥90 (very high confidence): 49%
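To put the global pLDDT in context, AlphaFold confidence is conventionally read in four bands (very high, confident, low, very low) with cut-offs at 90, 70, and 50. The sketch below applies those bands; the function name `plddt_band` and the treatment of values exactly on a boundary (assigned to the higher band, matching the "≥90" wording above) are assumptions of this example.

```python
def plddt_band(score):
    """Map a pLDDT value (0-100) to its conventional confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


# Global pLDDT reported for the Q96LT7 model.
print(plddt_band(83.48))
```

A global value of 83.48 falls in the "confident" band, while roughly half of the residues individually reach the "very high" band.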
Cross-species orthologs
| Organism | Gene ID | Symbol |
|---|---|---|
| Mouse (Mus musculus) | ENSMUSG00000028300 | C9orf72 |
| Rat (Rattus norvegicus) | ENSRNOG00000009478 | RGD1359108 |
| Zebrafish (Danio rerio) | ENSDARG00000011837 | C13H9orf72 |
| Fruit fly (Drosophila melanogaster) | none | none |
| Worm (C. elegans) | WBGene00017547 | alfa-1 |
| Yeast (S. cerevisiae) | none | none |
Clinical variants & AI predictions
Clinical Variants (ClinVar)
Summary
| Metric | Count |
|---|---|
| Total variants | 50 |
| Pathogenic | 3 |
| Likely Pathogenic | ~5 |
| Uncertain Significance | ~32 |
| Likely Benign | ~5 |
| Benign | ~5 |
Top Pathogenic Variants and Selected Variants of Uncertain Significance
| Variant ID | HGVS Notation | Condition | Classification |
|---|---|---|---|
| 31151 | NM_001256054.1:c.-45+163GGGGCC[>24] | Frontotemporal dementia and/or ALS-1 | Pathogenic |
| 183034 | NG_031977.1:g.(5321_5338)ins(60_?) | Frontotemporal dementia and/or ALS-1 | Pathogenic |
| 1343330 | NC_000009.12:g.27573529_27573534GGCCCC[60_?] | Amyotrophic lateral sclerosis | Pathogenic |
| 1192640 | NM_018325.5:c.600+27A>G | FTD/ALS-1 | Uncertain significance |
| 1192639 | NM_018325.5:c.665+115_665+117dup | FTD/ALS-1 | Uncertain significance |
| 366506 | NM_018325.5:c.1426G>C (p.Asp476His) | FTD/ALS-1 | Uncertain significance |
| 4355618 | NM_018325.5:c.505-1G>T | FTD/ALS-1 | Uncertain significance |
Note: The pathogenic ClinVar records are GGGGCC repeat expansions (the entries above specify more than 24 repeats or at least 60 repeats). Most other records are rare variants of uncertain significance that lack sufficient clinical evidence for a definitive classification.
AI-Based Predictions
AlphaMissense Missense Pathogenicity
Summary
| Metric | Count |
|---|---|
| Total predictions | 600+ |
| Likely Pathogenic | 100+ |
| Ambiguous | 150+ |
| Likely Benign | 350+ |
Top 30 Likely Pathogenic Missense Variants (AlphaMissense)
| Genomic Position | Protein Change | AM Pathogenicity | Class |
|---|---|---|---|
| 9:27548239:A:C | F481L | 0.890 | Likely Pathogenic |
| 9:27548255:T:A | D476V | 0.931 | Likely Pathogenic |
| 9:27548256:C:G | D476H | 0.942 | Likely Pathogenic |
| 9:27548306:A:G | L459P | 0.987 | Likely Pathogenic |
| 9:27548327:G:T | A452D | 0.996 | Likely Pathogenic |
| 9:27548328:C:G | A452P | 0.996 | Likely Pathogenic |
| 9:27548334:C:G | A450P | 0.979 | Likely Pathogenic |
| 9:27548336:A:T | M449K | 0.970 | Likely Pathogenic |
| 9:27548342:A:T | I447K | 0.990 | Likely Pathogenic |
| 9:27548348:A:G | L445P | 0.994 | Likely Pathogenic |
| 9:27548351:T:A | D444V | 0.980 | Likely Pathogenic |
| 9:27548352:C:G | D444H | 0.983 | Likely Pathogenic |
| 9:27548355:C:G | G443R | 0.987 | Likely Pathogenic |
| 9:27548366:A:G | L439S | 0.985 | Likely Pathogenic |
| 9:27548372:A:G | L437P | 0.995 | Likely Pathogenic |
| 9:27548255:T:C | D476G | 0.803 | Likely Pathogenic |
| 9:27548261:T:A | E474V | 0.850 | Likely Pathogenic |
| 9:27548324:T:A | E453V | 0.985 | Likely Pathogenic |
| 9:27548269:A:C | S471R | 0.946 | Likely Pathogenic |
| 9:27548296:A:T | F462L | 0.950 | Likely Pathogenic |
| 9:27548290:A:T | F464L | 0.942 | Likely Pathogenic |
| 9:27548249:A:G | L478P | 0.912 | Likely Pathogenic |
| 9:27548306:A:C | L459R | 0.962 | Likely Pathogenic |
| 9:27548309:C:T | G458D | 0.866 | Likely Pathogenic |
| 9:27548309:C:A | G458V | 0.918 | Likely Pathogenic |
| 9:27548304:G:C | H460D | 0.897 | Likely Pathogenic |
| 9:27548325:C:T | E453K | 0.964 | Likely Pathogenic |
| 9:27548315:T:A | K456I | 0.750 | Likely Pathogenic |
| 9:27548320:T:A | K454N | 0.905 | Likely Pathogenic |
| 9:27548297:A:G | F462S | 0.918 | Likely Pathogenic |
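The class labels in this table follow from the score alone. As a minimal sketch, the code below applies the published AlphaMissense cut-offs (likely benign below 0.34, likely pathogenic above 0.564, ambiguous in between) and ranks a few rows copied from the table by score. The names `am_class` and `SAMPLE_ROWS` are hypothetical, and only four of the 30 rows are included for brevity.

```python
LIKELY_BENIGN_MAX = 0.34       # scores below this are likely benign
LIKELY_PATHOGENIC_MIN = 0.564  # scores above this are likely pathogenic


def am_class(score):
    """Classify an am_pathogenicity score into the three AlphaMissense classes."""
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    return "ambiguous"


# A subset of rows from the table: (protein change, am_pathogenicity).
SAMPLE_ROWS = [
    ("K456I", 0.750),
    ("A452D", 0.996),
    ("D476G", 0.803),
    ("L437P", 0.995),
]

# Rank from most to least pathogenic.
ranked = sorted(SAMPLE_ROWS, key=lambda row: row[1], reverse=True)
for change, score in ranked:
    print(change, score, am_class(score))
```

Sorting this way makes it easy to see that the table above is not ordered by score: K456I (0.750) is the lowest-scoring entry but appears near the end rather than last.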
Splice Effect Predictions
| Dataset | Count |
|---|---|
| SpliceAI | 0 |
No SpliceAI predictions available in biobtree for C9orf72.
Pathways & Gene Ontology
Biological Pathways
Reactome Pathways: No Reactome pathway annotations found in biobtree for C9orf72.
MSigDB Gene Sets: 100+ gene set memberships in total; the 33 GO-derived (C5:GO) sets are shown below.
| Gene Set ID | Collection | Name | Gene Count |
|---|---|---|---|
| M11041 | C5:GO | GOBP_MEMBRANE_FUSION | 203 |
| M11237 | C5:GO | GOBP_VACUOLAR_TRANSPORT | 168 |
| M11351 | C5:GO | GOBP_NEUROGENESIS | 1821 |
| M11413 | C5:GO | GOBP_VESICLE_MEDIATED_TRANSPORT | 1597 |
| M11497 | C5:GO | GOMF_GTPASE_BINDING | 322 |
| M11589 | C5:GO | GOBP_REGULATION_OF_CELLULAR_COMPONENT_BIOGENESIS | 1075 |
| M11871 | C5:GO | GOBP_MACROAUTOPHAGY | 384 |
| M11889 | C5:GO | GOBP_REGULATION_OF_VESICLE_MEDIATED_TRANSPORT | 559 |
| M11956 | C5:GO | GOBP_REGULATION_OF_ACTIN_FILAMENT_BASED_PROCESS | 401 |
| M1231 | C5:GO | GOBP_EXOCYTOSIS | 367 |
| M12336 | C5:GO | GOBP_REGULATION_OF_IMMUNE_RESPONSE | 1043 |
| M12729 | C5:GO | GOBP_REGULATION_OF_CATABOLIC_PROCESS | 1147 |
| M12781 | C5:GO | GOBP_POSITIVE_REGULATION_OF_CATABOLIC_PROCESS | 647 |
| M15011 | C5:GO | GOCC_NEURON_PROJECTION | 1353 |
| M15146 | C5:GO | GOBP_MEMBRANE_ORGANIZATION | 848 |
| M15229 | C5:GO | GOBP_CELL_PROJECTION_ORGANIZATION | 1693 |
| M15630 | C5:GO | GOBP_REGULATION_OF_TRANSPORT | 1663 |
| M16060 | C5:GO | GOBP_ENDOCYTOSIS | 720 |
| M16371 | C5:GO | GOBP_REGULATION_OF_PHOSPHORUS_METABOLIC_PROCESS | 611 |
| M16993 | C5:GO | GOCC_NUCLEAR_ENVELOPE | 517 |
| M17039 | C5:GO | GOCC_CYTOPLASMIC_STRESS_GRANULE | 96 |
| M17052 | C5:GO | GOCC_AUTOPHAGOSOME | 121 |
| M17408 | C5:GO | GOCC_SYNAPSE | 1680 |
| M17467 | C5:GO | GOCC_CATALYTIC_COMPLEX | 1811 |
| M17485 | C5:GO | GOMF_GUANYL_NUCLEOTIDE_EXCHANGE_FACTOR_ACTIVITY | 232 |
| M17790 | C5:GO | GOCC_AXON | 664 |
| M17896 | C5:GO | GOMF_GTPASE_ACTIVITY | 783 |
| M18225 | C5:GO | GOMF_GDP_BINDING | 321 |
| M18283 | C5:GO | GOMF_HYDROLASE_ACTIVITY_ACTING_ON_ACID_ANHYDRIDES | 1268 |
| M18481 | C5:GO | GOMF_ENZYME_ACTIVATOR_ACTIVITY | 660 |
| M18508 | C5:GO | GOMF_NUCLEOSIDE_TRIPHOSPHATASE_REGULATOR_ACTIVITY | 493 |
| M18829 | C5:GO | GOMF_ENZYME_REGULATOR_ACTIVITY | 1405 |
| M19683 | C5:GO | GOBP_REGULATION_OF_PROTEIN_METABOLIC_PROCESS | 1613 |
Plus at least 66 additional MSigDB gene sets (motif, transcription factor, tissue, and disease-related collections)
Gene Ontology Annotations
Total GO Terms: 40
Biological Process (16 terms)
| GO ID | Term |
|---|---|
| GO:0001933 | Negative regulation of protein phosphorylation |
| GO:0006897 | Endocytosis |
| GO:0006914 | Autophagy |
| GO:0010506 | Regulation of autophagy |
| GO:0016239 | Positive regulation of macroautophagy |
| GO:0032880 | Regulation of protein localization |
| GO:0034063 | Stress granule assembly |
| GO:0045920 | Negative regulation of exocytosis |
| GO:0048675 | Axon extension |
| GO:0050777 | Negative regulation of immune response |
| GO:0061909 | Autophagosome-lysosome fusion |
| GO:0098693 | Regulation of synaptic vesicle cycle |
| GO:0110053 | Regulation of actin filament organization |
| GO:1902774 | Late endosome to lysosome transport |
| GO:1903432 | Regulation of TORC1 signaling |
| GO:2000785 | Regulation of autophagosome assembly |
Molecular Function (3 terms)
| GO ID | Term |
|---|---|
| GO:0005085 | Guanyl-nucleotide exchange factor activity |
| GO:0005096 | GTPase activator activity |
| GO:0031267 | Small GTPase binding |
Cellular Component (21 terms)
| GO ID | Term |
|---|---|
| GO:0000932 | P-body |
| GO:0005615 | Extracellular space |
| GO:0005634 | Nucleus |
| GO:0005737 | Cytoplasm |
| GO:0005764 | Lysosome |
| GO:0005768 | Endosome |
| GO:0005776 | Autophagosome |
| GO:0005829 | Cytosol |
| GO:0010494 | Cytoplasmic stress granule |
| GO:0030425 | Dendrite |
| GO:0031965 | Nuclear membrane |
| GO:0032045 | Guanyl-nucleotide exchange factor complex |
| GO:0043204 | Perikaryon |
| GO:0044295 | Axonal growth cone |
| GO:0044304 | Main axon |
| GO:0044754 | Autolysosome |
| GO:0090543 | Flemming body |
| GO:0098686 | Hippocampal mossy fiber to CA3 synapse |
| GO:0098794 | Postsynapse |
| GO:0098978 | Glutamatergic synapse |
| GO:0099523 | Presynaptic cytosol |
Protein interactions & networks
Total Interaction Counts
- STRING interactions: 1,626
- BioGRID interactions: 1,424
- IntAct interactions: 76
TOP 30 Highest-Confidence STRING Interacting Proteins
(Confidence scores on 0-1000 scale; higher = stronger evidence)
| Rank | UniProt ID | Protein Name | Score | Evidence Types |
|---|---|---|---|---|
| 1 | Q8TEV9 | Guanine nucleotide exchange protein SMCR8 (SMCR8) | 996 | High confidence |
| 2 | Q9HAD4 | WD repeat-containing protein 41 (WDR41) | 995 | High confidence |
| 3 | Q13148 | TAR DNA-binding protein 43 (TARDBP) | 944 | Predicted, database, experimental |
| 4 | A0A087WTZ4 | Unreviewed entry; name not verified | 919 | Predicted, database |
| 5 | P35637 | RNA-binding protein FUS (FUS) | 919 | Multiple evidence |
| 6 | P00441 | Superoxide dismutase [Cu-Zn] (SOD1) | 898 | Predicted, experimental |
| 7 | P23781 | Name not verified | 888 | Predicted, database |
| 8 | Q9UHD9 | Ubiquilin-2 (UBQLN2) | 871 | Multiple evidence |
| 9 | P11476 | Name not verified | 863 | Predicted |
| 10 | Q99700 | Ataxin-2 (ATXN2) | 857 | Predicted, experimental |
| 11 | P10636 | Microtubule-associated protein tau (MAPT) | 852 | Predicted, database |
| 12 | P55072 | Transitional endoplasmic reticulum ATPase (VCP) | 852 | Predicted, experimental |
| 13 | Q9UQN3 | Charged multivesicular body protein 2b (CHMP2B) | 826 | Predicted, experimental |
| 14 | Q9UHD2 | Serine/threonine-protein kinase TBK1 (TBK1) | 824 | Multiple evidence |
| 15 | Q13501 | Sequestosome-1 (SQSTM1) | 821 | Predicted, database |
| 16 | Q96Q42 | Alsin (ALS2) | 819 | Predicted |
| 17 | P09651 | Heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1) | 809 | Predicted, experimental |
| 18 | O14966 | Ras-related protein Rab-7L1 (RAB29) | 786 | Predicted, experimental |
| 19 | Q96CV9 | Optineurin (OPTN) | 774 | Predicted |
| 20 | P31943 | Heterogeneous nuclear ribonucleoprotein H (HNRNPH1) | 754 | Predicted, experimental |
| 21 | Q9NUM4 | Transmembrane protein 106B (TMEM106B) | 754 | Predicted, experimental |
| 22 | P51991 | Heterogeneous nuclear ribonucleoprotein A3 (HNRNPA3) | 750 | Predicted |
| 23 | P05067 | Amyloid-beta precursor protein (APP) | 749 | Predicted, database |
| 24 | O95292 | Vesicle-associated membrane protein-associated protein B/C (VAPB) | 746 | Predicted, experimental |
| 25 | Q7Z333 | Probable helicase senataxin (SETX) | 741 | Predicted, experimental |
| 26 | P55795 | Heterogeneous nuclear ribonucleoprotein H2 (HNRNPH2) | 724 | Predicted, experimental |
| 27 | Q8WYQ3 | Coiled-coil-helix-coiled-coil-helix domain-containing protein 10 (CHCHD10) | 721 | Predicted, experimental |
| 28 | P43243 | Matrin-3 (MATR3) | 715 | Predicted |
| 29 | Q8WXG6 | MAP kinase-activating death domain protein (MADD) | 692 | Multiple evidence |
| 30 | Q14203 | Dynactin subunit 1 (DCTN1) | 688 | Predicted, experimental |
Key Findings: The two top-scoring partners are SMCR8 and WDR41, the other subunits of the C9orf72 complex. Most of the remaining high-scoring entries are ALS/FTD-associated proteins, including RNA-binding proteins (TARDBP, FUS, HNRNPA1, HNRNPA3, MATR3), autophagy and proteostasis factors (SQSTM1, OPTN, TBK1, UBQLN2, VCP, CHMP2B), and trafficking or cytoskeletal proteins (RAB29, VAPB, ALS2, DCTN1, MAPT). Protein names were matched to the listed UniProt accessions; three accessions could not be matched and are marked as not verified.
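STRING combined scores on the 0-1000 scale are usually read against four confidence cut-offs (150 low, 400 medium, 700 high, 900 highest). The following sketch bins a few accession and score pairs from the table above; `string_band` and `SAMPLE_SCORES` are hypothetical names introduced for this illustration.

```python
def string_band(score):
    """Map a STRING combined score (0-1000) to its confidence band."""
    if not 0 <= score <= 1000:
        raise ValueError("STRING combined scores range from 0 to 1000")
    if score >= 900:
        return "highest"
    if score >= 700:
        return "high"
    if score >= 400:
        return "medium"
    if score >= 150:
        return "low"
    return "below low-confidence cut-off"


# (UniProt accession, combined score) pairs copied from the table.
SAMPLE_SCORES = [
    ("Q8TEV9", 996),
    ("Q9HAD4", 995),
    ("O14966", 786),
    ("Q8WXG6", 692),
    ("Q14203", 688),
]

for accession, score in SAMPLE_SCORES:
    print(accession, score / 1000, string_band(score))
```

By these cut-offs, ranks 1 to 5 are in the highest band, ranks 6 to 28 are high confidence, and ranks 29 and 30 fall just below the high-confidence threshold.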
TOP 20 Proteins with Highest Structural/Embedding Similarity (ESM2)
(ESM2: protein language-model embedding similarity; scale 0-1. Rows are not sorted by score.)
| # | UniProt ID | Similarity Score | Avg Similarity |
|---|---|---|---|
| 1 | Q5RC62 | 1.0000 | 0.9753 |
| 2 | Q5RD58 | 1.0000 | 0.9876 |
| 3 | Q66HC3 | 1.0000 | 0.9867 |
| 4 | Q6DFW0 | 1.0000 | 0.9867 |
| 5 | Q6NSW5 | 0.9999 | 0.9865 |
| 6 | Q6ZW61 | 1.0000 | 0.9752 |
| 7 | Q86WG5 | 0.9999 | 0.9885 |
| 8 | Q8R3P6 | 0.9999 | 0.9883 |
| 9 | Q8TCE6 | 0.9999 | 0.9865 |
| 10 | Q8WVF5 | 0.9998 | 0.9821 |
| 11 | Q9CZW2 | 0.9989 | 0.9846 |
| 12 | Q9D7X1 | 0.9997 | 0.9831 |
| 13 | Q9D8N2 | 0.9997 | 0.9866 |
| 14 | Q9NQ89 | 1.0000 | 0.9876 |
| 15 | Q96SY0 | 0.9999 | 0.9884 |
| 16 | A6H6X4 | 0.9998 | 0.9827 |
| 17 | D4A770 | 0.9999 | 0.9871 |
| 18 | E9PXF8 | 0.9999 | 0.9884 |
| 19 | Q1T765 | 0.9982 | 0.9828 |
| 20 | Q28HN9 | 0.9952 | 0.9856 |
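Profile: All 55 ESM2-similar proteins returned have an average similarity above 0.97. The rows above are not sorted by score, and UniProt accession prefixes (Q, A, D, E) do not encode the source organism, so whether an entry is an ortholog or paralog must be checked against its UniProt organism annotation. High embedding similarity alone does not establish a shared structural fold.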
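The source does not state how the similarity score is computed. A common choice for comparing protein language-model embeddings is cosine similarity between fixed-length vectors, and the sketch below assumes that choice. The function name `cosine_similarity` and the toy three-dimensional vectors are purely illustrative; real ESM2 embeddings have hundreds to thousands of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Toy vectors (made up for this example, not real embeddings).
query = [0.2, 0.5, 0.1]
close = [0.21, 0.49, 0.12]
far = [0.9, -0.3, 0.0]

print(cosine_similarity(query, close))
print(cosine_similarity(query, far))
```

Under this measure, embeddings that share a large common component score close to 1 against almost everything, which is one possible reason so many unrelated accessions report values near 1.0000 here.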
TOP BioGRID Interacting Proteins with Experimental Evidence
| Gene Symbol | Evidence Type | Evidence Strength (Context) |
|---|---|---|
| SMCR8 | Affinity Capture-MS, Affinity Capture-Western, Reconstituted Complex, Two-hybrid | High (core complex member) |
| WDR41 | Affinity Capture-MS, Affinity Capture-Western | High (core complex member) |
| ULK1 | Affinity Capture-Western, Two-hybrid, Co-localization, Phosphorylation | High (autophagy pathway) |
| ATG13 | Affinity Capture-Western, Two-hybrid, Co-localization, Phosphorylation | High (autophagy pathway) |
| ATG101 | Affinity Capture-MS, Affinity Capture-Western | High (autophagy complex) |
| RB1CC1 | Affinity Capture-MS, Affinity Capture-Western | High (FIP200, autophagy) |
| RAB8A | Affinity Capture-Western, GEF reaction | Medium (GTPase substrate) |
| RAB39B | Affinity Capture-Western, Affinity Capture-MS | Medium (GTPase substrate) |
| SETX | Affinity Capture-MS | Medium (ALS-associated) |
| DCTN1 | Affinity Capture-MS | Medium (dynactin complex) |
| TBK1 | Affinity Capture-MS, Phosphorylation | Medium (immune response) |
| TARDBP | Affinity Capture-MS | Medium (ALS-associated) |
| SQSTM1 | Affinity Capture-MS | Low-Medium (autophagy adaptor) |
| UBIQUITIN | Implied through E3 ligases | Multiple |
Sequence Homology: Orthologs (Cross-Species)
| Species | Gene ID | Gene Symbol | Identity Details |
|---|---|---|---|
| Homo sapiens | ENSG00000147894 | C9orf72 | Reference (human) |
| Mus musculus | ENSMUSG00000028300 | C9orf72 | High orthology (mammalian) |
| Rattus norvegicus | ENSRNOG00000009478 | RGD1359108 | High orthology (mammalian) |
| Danio rerio | ENSDARG00000011837 | C13H9orf72 | Conserved in vertebrates |
Sequence Conservation: Orthologs show strong amino acid conservation across mammals, with structural domains preserved in all vertebrate orthologs examined.
Transcription factor regulatory data
C9orf72 is NOT a transcription factor. C9orf72 encodes a guanine nucleotide exchange factor (GEF) that functions in cell signaling and intracellular transport, not in transcriptional regulation. Therefore, no downstream targets or DNA binding motif information is applicable.
Upstream regulators
Regulatory data for transcription factors controlling C9orf72 expression is limited in currently available databases. However, the following regulatory interactions and motif analyses are available:
Predicted TF binding sites in C9orf72 promoter (from MSigDB motif analysis):
- EGR1 (V$EGR1_01) — transcription factor binding motif with consensus WTGCGTGGGCGK identified in the -4kb to +2kb promoter region
- NFAT/NFATC (V$NFAT_Q4_01) — transcription factor binding motif with consensus TGGAAA identified in the -4kb to +2kb promoter region
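The consensus strings above use IUPAC nucleotide codes (for example W = A or T, K = G or T). The sketch below translates such a consensus into a regular expression and scans a sequence for matches. The helper names and the test sequences are made up for illustration, and the scan covers the forward strand only; a real promoter analysis would also scan the reverse complement and would normally use position weight matrices rather than exact consensus matching.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile(f"(?={consensus_to_regex(consensus)})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


print(consensus_to_regex("WTGCGTGGGCGK"))
print(find_sites("WTGCGTGGGCGK", "GGATGCGTGGGCGTAA"))
print(find_sites("TGGAAA", "CCTGGAAATGGAAA"))
```

Short motifs such as TGGAAA occur frequently by chance, which is one reason the predicted sites above need experimental validation.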
Post-transcriptional regulation (from SIGNOR database):
- HNRNPA3 — down-regulates C9orf72 quantity (tissue: prostate; evidence type: experimental)
Expression context (from FANTOM5 promoter analysis):
- C9orf72 shows ubiquitous expression across 1,101 samples
- Highest expression in: neutrophils, stomach, testis, and brain tissues
- Indicates constitutive regulation with potential tissue-specific modulation
Note: Direct ChIP-seq validated transcription factor binding sites for C9orf72 are not currently available in biobtree databases. The predicted motifs suggest potential EGR1 and NFAT-mediated regulation that would require experimental validation.
Drug & pharmacology data
C9orf72 is not a direct drug target. It does not appear as a target protein in ChEMBL or DrugBank with known ligands or binding compounds.
However, C9orf72 has pharmacogenomic associations documented in PharmGKB:
Pharmacogenomics Associations
TNF-alpha inhibitors (drug class)
- Evidence: Clinical annotations (122) + Variant annotations (508)
- Relationship: Associated with C9orf72 genetic variants affecting drug response
- Scope: Class includes infliximab, adalimumab, etanercept, and other TNF inhibitors used for inflammatory/autoimmune conditions
Ustekinumab (specific drug)
- Mechanism: IL-12/IL-23 inhibitor (immunosuppressant)
- Evidence: Variant annotations (29) + Clinical annotations (7)
- Relationship: Associated with C9orf72 variants; FDA label notes IL12A, IL12B, and IL23A as mechanistic genes
- Indication: Psoriasis, psoriatic arthritis, Crohn’s disease, ulcerative colitis
No dosing guidelines or specific variant-dosing associations are documented in available databases for C9orf72.
The evidence suggests C9orf72 genetic variation may modulate immune response to TNF inhibitors and IL-12/23-targeted biologics, relevant primarily for inflammatory/autoimmune disease pharmacogenomics rather than as a primary therapeutic target.
Expression profiles
Tissue Expression (Bgee)
Expression Summary: C9orf72 shows ubiquitous expression across human tissues with a maximum expression score of 98.03 and average of 83.20.
Top 30 Tissues/Cell Types with Expression Scores:
| Rank | Tissue/Cell Type | Expression Score | Quality |
|---|---|---|---|
| 1 | Monocyte | 98.03 | Gold |
| 2 | Leukocyte | 97.42 | Gold |
| 3 | Mucosa of paranasal sinus | 95.51 | Gold |
| 4 | Bronchial epithelial cell | 95.49 | Gold |
| 5 | Cerebellar vermis | 95.43 | Gold |
| 6 | Bronchus | 94.87 | Gold |
| 7 | Right lung | 94.70 | Gold |
| 8 | Adrenal tissue | 94.17 | Gold |
| 9 | Right uterine tube | 94.01 | Gold |
| 10 | Cerebellar cortex | 93.55 | Gold |
| 11 | Cerebellar hemisphere | 93.52 | Gold |
| 12 | Oviduct epithelium | 93.47 | Gold |
| 13 | Cerebellum | 93.42 | Gold |
| 14 | Brodmann area 23 | 93.25 | Gold |
| 15 | Right hemisphere of cerebellum | 93.21 | Gold |
| 16 | Blood | 91.99 | Gold |
| 17 | Granulocyte | 91.82 | Gold |
| 18 | Left ovary | 91.56 | Gold |
| 19 | Olfactory segment of nasal mucosa | 91.54 | Gold |
| 20 | Calcaneal tendon | 91.25 | Gold |
| 21 | Superior vestibular nucleus | 91.23 | Gold |
| 22 | Fallopian tube | 90.82 | Gold |
| 23 | Pancreatic epithelial cell | 90.56 | Silver |
| 24 | Right ovary | 90.37 | Gold |
| 25 | Ovary | 90.24 | Gold |
| 26 | Epithelium of nasopharynx | 90.17 | Gold |
| 27 | Pons | 90.08 | Gold |
| 28 | Germinal epithelium of ovary | 90.01 | Gold |
| 29 | Thymus | 89.68 | Gold |
| 30 | Corpus callosum | 89.53 | Gold |
Tissue-Enriched Patterns:
- Immune/Blood Cells: Particularly elevated in monocytes (98.03), leukocytes (97.42), and granulocytes (91.82)
- Central Nervous System: High expression in cerebellar structures (vermis 95.43, cortex 93.55, hemisphere 93.52), brain regions (pons 90.08, corpus callosum 89.53)
- Respiratory Epithelium: Bronchial epithelium (95.49), bronchus (94.87), lung (94.70)
- Reproductive System: Uterine tube (94.01), ovaries (90.24–91.56), fallopian tube (90.82)
- Sensory/Olfactory: High in olfactory mucosa (91.54)
- Overall: The ubiquitous distribution is consistent with broad cellular roles across major tissue types
Single-Cell Expression
Notable Dataset:
- E-MTAB-9801 – Single-cell analysis of emergent haematopoiesis in human fetal bone marrow (486 cells, Smart-seq2)
- Cell Types Present: Polymorphonuclear neutrophils, hematopoietic stem/progenitor cells, B cells, eosinophils, mast cells, CD14+ monocytes, myelocytes, plasmacytoid dendritic cells, promyelocytes, basophils, dendritic cells
- Implication: C9orf72 expression across hematopoietic lineages suggests involvement in immune cell development and function
Note: Limited cell-type specific scoring data available in biobtree; comprehensive Tabula Sapiens and HPA cell-type atlases show C9orf72 expressed across diverse cell types consistent with ubiquitous tissue pattern.
Disease associations
Mendelian / Monogenic Disease
C9orf72 mutations cause Mendelian autosomal dominant disorders:
| Disease | Disease IDs | Inheritance | Evidence Level |
|---|---|---|---|
| Frontotemporal dementia and/or amyotrophic lateral sclerosis 1 | OMIM:105550, MONDO:0007105, ORPHA:275872 | Autosomal dominant | Strong/Moderate |
| Amyotrophic lateral sclerosis | MONDO:0004976, ORPHA:803 | Autosomal dominant | Strong |
| Progressive myoclonus epilepsy | MONDO:0020074 | Autosomal dominant | Limited |
The primary disease manifestation is C9orf72 frontotemporal dementia with motor neuron disease (FTD-MND), caused by GGGGCC repeat expansions in a noncoding region of C9orf72 (the first intron, as reflected in the ClinVar notation c.-45+163).
Phenotype Associations
24 HPO phenotype terms associated with C9orf72 mutations:
| HPO ID | Phenotype | HPO ID | Phenotype |
|---|---|---|---|
| HP:0000006 | Autosomal dominant inheritance | HP:0002059 | Cerebral atrophy |
| HP:0000605 | Supranuclear gaze palsy | HP:0002145 | Frontotemporal dementia |
| HP:0000716 | Depression | HP:0002171 | Gliosis |
| HP:0000726 | Dementia | HP:0002186 | Apraxia |
| HP:0000738 | Hallucinations | HP:0002273 | Tetraparesis |
| HP:0000741 | Apathy | HP:0002366 | Abnormal lower motor neuron morphology |
| HP:0000746 | Delusion | HP:0002385 | Paraparesis |
| HP:0001260 | Dysarthria | HP:0002442 | Dyscalculia |
| HP:0001300 | Parkinsonism | HP:0002529 | Neuronal loss in central nervous system |
| HP:0001324 | Muscle weakness | HP:0003202 | Skeletal muscle atrophy |
| HP:0003581 | Adult onset | HP:0007354 | Amyotrophic lateral sclerosis |
| HP:0003678 | Rapidly progressive | HP:0007308 | Extrapyramidal dyskinesia |
GWAS Associations
13 GWAS associations are recorded at the C9orf72 locus (12 distinct study accessions): 10 with amyotrophic lateral sclerosis and 3 with other traits. Rows are sorted by p-value:
| Trait | Study ID | p-value | Chromosome |
|---|---|---|---|
| Amyotrophic lateral sclerosis | GCST005647_2 | 4.0e-30 | 9 |
| ALS (sporadic) | GCST004901_2 | 3.0e-23 | 9 |
| Amyotrophic lateral sclerosis | GCST004692_5 | 4.0e-19 | 9 |
| Amyotrophic lateral sclerosis | GCST007146_1 | 3.0e-15 | 9 |
| Amyotrophic lateral sclerosis | GCST000781_1 | 9.0e-11 | 9 |
| Amyotrophic lateral sclerosis | GCST002509_1 | 6.0e-10 | 9 |
| Amyotrophic lateral sclerosis | GCST000481_7 | 7.0e-09 | 9 |
| Amyotrophic lateral sclerosis | GCST000481_2 | 1.0e-08 | 9 |
| PCA3 expression level | GCST001946_1 | 2.0e-07 | 9 |
| Amyotrophic lateral sclerosis | GCST001664_7 | 4.0e-07 | 9 |
| Delirium | GCST005851_10 | 9.0e-07 | 9 |
| Amyotrophic lateral sclerosis | GCST008978_1 | 2.0e-06 | 9 |
| Metabolite levels | GCST009391_1869 | 5.0e-06 | 9 |
The C9orf72 locus is strongly associated with amyotrophic lateral sclerosis susceptibility across multiple GWAS cohorts, with the most significant association reaching p = 4.0×10⁻³⁰. Eight of the ten ALS associations pass the conventional genome-wide significance threshold (p < 5×10⁻⁸); the three non-ALS associations do not.
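To separate genome-wide significant hits from suggestive ones, the table can be filtered against the conventional threshold of p < 5×10⁻⁸. The sketch below does this for the 13 rows listed; the names `GWAS_ROWS`, `is_als`, and `GENOME_WIDE_P` are hypothetical and exist only in this example.

```python
GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold

# (trait, study ID, p-value) copied from the GWAS table.
GWAS_ROWS = [
    ("Amyotrophic lateral sclerosis", "GCST005647_2", 4.0e-30),
    ("ALS (sporadic)", "GCST004901_2", 3.0e-23),
    ("Amyotrophic lateral sclerosis", "GCST004692_5", 4.0e-19),
    ("Amyotrophic lateral sclerosis", "GCST007146_1", 3.0e-15),
    ("Amyotrophic lateral sclerosis", "GCST000781_1", 9.0e-11),
    ("Amyotrophic lateral sclerosis", "GCST002509_1", 6.0e-10),
    ("Amyotrophic lateral sclerosis", "GCST000481_7", 7.0e-09),
    ("Amyotrophic lateral sclerosis", "GCST000481_2", 1.0e-08),
    ("Amyotrophic lateral sclerosis", "GCST008978_1", 2.0e-06),
    ("Amyotrophic lateral sclerosis", "GCST001664_7", 4.0e-07),
    ("PCA3 expression level", "GCST001946_1", 2.0e-07),
    ("Delirium", "GCST005851_10", 9.0e-07),
    ("Metabolite levels", "GCST009391_1869", 5.0e-06),
]


def is_als(trait):
    """True for the two ALS trait labels used in the table."""
    return trait.startswith("Amyotrophic lateral sclerosis") or trait.startswith("ALS")


als_rows = [row for row in GWAS_ROWS if is_als(row[0])]
significant = [row for row in GWAS_ROWS if row[2] < GENOME_WIDE_P]
distinct_studies = {study.split("_")[0] for _, study, _ in GWAS_ROWS}

print(len(GWAS_ROWS), len(als_rows), len(significant), len(distinct_studies))
for trait, study, p_value in sorted(significant, key=lambda row: row[2]):
    print(f"{study}\t{p_value:.1e}\t{trait}")
```

Every association that passes the threshold is an ALS association; the PCA3 expression, delirium, and metabolite entries are suggestive only.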