SOD1 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human SOD1. This should serve as a definitive lookup resource for researchers.

SECTION 1: GENE IDENTIFIERS

Provide ALL gene-level database identifiers:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand

SECTION 2: TRANSCRIPT IDENTIFIERS

List ALL transcript-level identifiers:
- Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.). How many total transcripts?
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select (canonical clinical standard)
- CCDS IDs: ALL consensus coding sequence identifiers

For the CANONICAL/MANE SELECT transcript:
- List ALL exon IDs (ENSE) with genomic coordinates
- Total exon count

SECTION 3: PROTEIN IDENTIFIERS

List ALL protein-level identifiers:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry
- RefSeq protein: ALL NP_ accessions

Protein domains and families:
- List ALL annotated domains/families with identifiers
- Include: domain name, type (domain/family/superfamily), and ID

SECTION 4: STRUCTURE IDENTIFIERS

Experimental structures:
- List ALL PDB structure IDs
- For each: experimental method (X-ray, NMR, Cryo-EM) and resolution
- Total PDB structure count

Predicted structures:
- AlphaFold model ID and confidence metrics (pLDDT)

SECTION 5: CROSS-SPECIES ORTHOLOGS

List orthologous genes in key model organisms (where available):
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS

Clinical variant annotations:
- Total variant count in clinical databases
- Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign
- List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: Total count. List TOP 50 predicted splice-altering variants with delta scores
- Missense pathogenicity predictions: Total count. List TOP 50 predicted pathogenic missense variants with scores

SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY

Pathway membership:
- List ALL biological pathways this gene participates in
- Include pathway IDs and names
- Total pathway count

Gene Ontology annotations:
- Biological Process: count and TOP 20 terms with IDs
- Molecular Function: count and TOP 20 terms with IDs
- Cellular Component: count and TOP 20 terms with IDs

SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS

Protein-protein interactions:
- Total interaction count
- List TOP 50 highest-confidence interacting proteins with scores

Protein similarity (evolutionary and structural):
- Structural/embedding similarity: How many similar proteins? List TOP 20 with similarity scores
- Sequence homology: How many homologous proteins? List TOP 20 with identity/similarity scores

SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA

If this gene encodes a transcription factor:

Downstream targets (genes regulated BY this TF):
- Total target gene count
- List TOP 50 target genes with regulation type (activates/represses)

DNA binding profiles:
- List ALL known binding motif IDs
- Motif family classification

Upstream regulators (TFs that regulate THIS gene):
- List known transcriptional regulators with evidence type

SECTION 10: DRUG & PHARMACOLOGY DATA

If this gene/protein is a drug target:

Targeting molecules:
- How many drug/compound molecules target this protein?
- List TOP 30 molecules by development phase
- Include: molecule ID, name, mechanism, highest development phase

Clinical trials:
- How many clinical trials involve drugs targeting this gene?
- List TOP 20 trials with: trial ID, phase, status, intervention

Pharmacogenomics:
- Known drug-gene interactions affecting drug response
- Dosing guidelines if any exist

SECTION 11: EXPRESSION PROFILES

Tissue expression:
- Which tissues express this gene most highly?
- List TOP 30 tissues with expression scores/levels
- Note any tissue-specific or tissue-enriched patterns

Cell type expression:
- Which cell types show highest expression?
- List TOP 30 cell types with expression scores
- Note any cell type-specific patterns

Single-cell expression data (if available):
- Which single-cell datasets/experiments include this gene?
- Notable cell population patterns

SECTION 12: DISEASE ASSOCIATIONS

Mendelian/monogenic disease links:
- What diseases are caused by mutations in this gene?
- List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level

Phenotype associations:
- What clinical phenotypes are associated with this gene?
- List TOP 50 phenotype terms with IDs

Complex trait associations (GWAS):
- What traits/diseases are linked via genome-wide association studies?
- List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available

FORMATTING REQUIREMENTS

- Present each section with clear headers
- Use tables for listing multiple items
- Always include COUNTS for every category
- For lists with >50 items: show total count + TOP 50
- For lists with <50 items: show ALL items
- Always include database identifiers (IDs), not just names
- Mark canonical/primary/reviewed entries where applicable
- If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 33 biological databases.

Datasets: alphafold, alphamissense, bgee, ccds, chembl_molecule, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, orphanet, ortholog, pdb, pharmgkb_gene, reactome, refseq, scxa, signor, spliceai, string_interaction, transcript, uniprot
Generated: 2026-03-27 — For the latest data, query BioBTree directly via MCP or API.

SOD1 - Comprehensive Cross-Database Identifier and Functional Mapping Reference
Human Superoxide Dismutase 1 [Cu-Zn]


Section 1: Gene Identifiers

Database | Identifier | Description
HGNC ID | HGNC:11179 | Approved symbol: SOD1
HGNC Name | superoxide dismutase 1 | Locus type: gene with protein product
Ensembl Gene | ENSG00000142168 | Biotype: protein_coding
NCBI Entrez Gene | 6647 | Type: protein-coding
OMIM Gene | 147450 | Gene/locus ID
Previous Symbols | ALS, ALS1 | Historical nomenclature
Aliases | IPOA, HEL-S-44, SOD, STAHP, hSod1 | -

Genomic Location (GRCh38/hg38)

Feature | Value
Chromosome | 21
Cytogenetic Band | 21q22.11
Start Position | 31,659,633
End Position | 31,668,931
Strand | + (forward)
Genomic Accession | NC_000021.9
Gene Length | 9,299 bp
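
The gene length follows from the start and end positions if the coordinates are read as 1-based and inclusive (an assumption here, though it matches how Ensembl reports positions). A minimal sketch of the check, with `span_length` as an illustrative helper name:

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


# Coordinates copied from the Genomic Location table (GRCh38).
SOD1_START, SOD1_END = 31_659_633, 31_668_931

print(span_length(SOD1_START, SOD1_END))  # 9299
```

The `+ 1` is what separates an inclusive interval from a half-open one; dropping it would give 9,298 bp.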

Section 2: Transcript Identifiers

Ensembl Transcripts
Total transcripts: 14

Transcript ID | Biotype | Start | End | Strand
ENST00000270142 | protein_coding | 31,659,693 | 31,668,931 | +
ENST00000389995 | protein_coding | 31,659,666 | 31,668,931 | +
ENST00000470944 | protein_coding_CDS_not_defined | 31,659,709 | 31,668,931 | +
ENST00000476106 | protein_coding_CDS_not_defined | 31,659,693 | 31,667,341 | +
ENST00000877328 | protein_coding | 31,659,668 | 31,668,927 | +
ENST00000877329 | protein_coding | 31,659,677 | 31,668,928 | +
ENST00000877330 | protein_coding | 31,659,688 | 31,668,923 | +
ENST00000877331 | protein_coding | 31,659,693 | 31,668,922 | +
ENST00000877332 | protein_coding | 31,659,693 | 31,668,680 | +
ENST00000928717 | protein_coding | 31,659,633 | 31,668,931 | +
ENST00000928718 | protein_coding | 31,659,648 | 31,668,931 | +
ENST00000928719 | protein_coding | 31,659,693 | 31,668,931 | +
ENST00000928720 | protein_coding | 31,659,651 | 31,668,677 | +
ENST00000928721 | protein_coding | 31,659,687 | 31,668,683 | +

RefSeq Transcripts

Accession | Type | Status | MANE Select
NM_000454 | mRNA | REVIEWED | Yes (Canonical)
NM_001181762 | mRNA | REVIEWED | No

RefSeq Proteins

Accession | Status | MANE Select
NP_000445 | REVIEWED | Yes
NP_012638 | REVIEWED | No

CCDS ID

CCDS ID
CCDS33536

Canonical Transcript Exons (ENST00000270142)
Total exons: 5

Exon ID | Start | End | Strand | Length
ENSE00001507447 | 31,659,693 | 31,659,841 | + | 149 bp
ENSE00003492976 | 31,663,790 | 31,663,886 | + | 97 bp
ENSE00003624439 | 31,666,449 | 31,666,518 | + | 70 bp
ENSE00003555033 | 31,667,258 | 31,667,375 | + | 118 bp
ENSE00003902052 | 31,668,471 | 31,668,931 | + | 461 bp
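
The exon table can be cross-checked arithmetically. The sketch below assumes 1-based inclusive coordinates and recomputes each exon length, the intron gaps between consecutive exons, and the summed exon length. The helper names are illustrative and not part of any database API.

```python
from typing import List, Tuple

# (exon ID, start, end) copied from the table above, in genomic order.
EXONS: List[Tuple[str, int, int]] = [
    ("ENSE00001507447", 31_659_693, 31_659_841),
    ("ENSE00003492976", 31_663_790, 31_663_886),
    ("ENSE00003624439", 31_666_449, 31_666_518),
    ("ENSE00003555033", 31_667_258, 31_667_375),
    ("ENSE00003902052", 31_668_471, 31_668_931),
]


def exon_lengths(exons: List[Tuple[str, int, int]]) -> List[int]:
    """Length of each exon, treating coordinates as inclusive."""
    return [end - start + 1 for _, start, end in exons]


def intron_lengths(exons: List[Tuple[str, int, int]]) -> List[int]:
    """Number of bases strictly between each pair of adjacent exons."""
    return [nxt[1] - cur[2] - 1 for cur, nxt in zip(exons, exons[1:])]


print(exon_lengths(EXONS))        # [149, 97, 70, 118, 461]
print(sum(exon_lengths(EXONS)))   # 895
print(intron_lengths(EXONS))      # [3948, 2562, 739, 1095]
```

Exons plus introns sum to 9,239 bp, which equals the transcript span from 31,659,693 to 31,668,931.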

Section 3: Protein Identifiers

UniProt Accessions

Accession | Status | Name
P00441 | Reviewed (Swiss-Prot) | Superoxide dismutase [Cu-Zn]
H7BYH4 | Unreviewed | -
V9HWC9 | Unreviewed | -

Canonical Protein Properties (P00441)

Property | Value
Length | 154 amino acids
Mass | 15,936 Da
Alternative Names | Hydrogen sulfide oxidase, Superoxide dismutase 1

Protein Domains and Families (InterPro)
Total domains: 4

InterPro ID | Name | Type
IPR001424 | SOD_Cu_Zn_dom | Domain
IPR018152 | SOD_Cu/Zn_BS | Binding_site
IPR024134 | SOD_Cu/Zn_/chaperone | Family
IPR036423 | SOD-like_Cu/Zn_dom_sf | Homologous_superfamily

Section 4: Structure Identifiers

Experimental Structures (PDB)
Total PDB structures: 154

Method | Count
X-ray Diffraction | ~135
Solution NMR | ~12
Solid-State NMR | 1
Cryo-EM | 8

TOP 50 PDB Structures (by resolution/date)

PDB ID | Method | Resolution (Å) | Title
4A7U | X-RAY | 0.98 | I113T mutant complexed with adrenaline
4A7V | X-RAY | 1.00 | I113T mutant complexed with dopamine
2WYT | X-RAY | 1.00 | L38V SOD1 mutant
1MFM | X-RAY | 1.02 | Monomeric human SOD mutant at atomic resolution
4A7S | X-RAY | 1.06 | I113T mutant complexed with 5-Fluorouridine
2C9V | X-RAY | 1.07 | Atomic resolution Cu-Zn human SOD
2V0A | X-RAY | 1.15 | Atomic resolution crystal structure
4A7Q | X-RAY | 1.22 | I113T mutant complexed with diazepan quinazoline
2C9S | X-RAY | 1.24 | Zn-Zn human SOD
2C9U | X-RAY | 1.24 | As-isolated Cu-Zn human SOD
4A7G | X-RAY | 1.24 | I113T mutant complexed with methylpiperazin quinazoline
5J0F | X-RAY | 1.25 | Monomeric SOD, loops deleted, circular permutant
6Z3V | X-RAY | 1.25 | A4V mutant bound with benzoisoselenazolone
4BCY | X-RAY | 1.27 | Monomeric SOD, mutation H43F
5O3Y | X-RAY | 1.30 | SOD1 bound to Ebsulfur
1OZU | X-RAY | 1.30 | S134N familial ALS mutant
2VR6 | X-RAY | 1.30 | G85R ALS mutant
7T8G | X-RAY | 1.35 | G93A mutant bound with MR6-8-2
2VR8 | X-RAY | 1.36 | G85R ALS mutant
7T8E | X-RAY | 1.40 | G93A mutant
7T8F | X-RAY | 1.40 | G93A mutant bound with Ebselen
6Z4O | X-RAY | 1.40 | A4V mutant bound with benzyl benzoisoselenazolone
5WMJ | X-RAY | 1.40 | KVWGSI segment residues 30-35
3GZQ | X-RAY | 1.40 | A4V Metal-free variant
6Z4G | X-RAY | 1.45 | A4V mutant bound with ebselen
6Z4K | X-RAY | 1.45 | A4V mutant bound with benzyl benzoisoselenazolone
4A7T | X-RAY | 1.45 | I113T mutant complexed with isoproteranol
2XJK | X-RAY | 1.45 | Monomeric human Cu,Zn SOD
2WZ5 | X-RAY | 1.50 | L38V mutant complexed with L-methionine
5O40 | X-RAY | 1.50 | SOD1 bound to Ebselen
7T8H | X-RAY | 1.50 | G93A mutant bound with MR6-26-2
3H2P | X-RAY | 1.55 | D124V variant
2WZ6 | X-RAY | 1.55 | G93A mutant complexed with Quinazoline
6Z4J | X-RAY | 1.55 | A4V mutant bound with benzyl benzoisoselenazolone
6Z4M | X-RAY | 1.55 | A4V mutant bound with pyridinylmethyl benzoisoselenazolone
2XJL | X-RAY | 1.55 | Monomeric SOD without Cu ligands
2VR7 | X-RAY | 1.58 | G85R ALS mutant
1UXL | X-RAY | 1.60 | I113T mutant
5J0C | X-RAY | 1.60 | Monomeric SOD circular permutant P2/3
6Z4I | X-RAY | 1.60 | A4V mutant bound with benzyl benzoisoselenazolone
6Z4L | X-RAY | 1.60 | A4V mutant bound with pyridinylmethyl benzoisoselenazolone
8CCX | X-RAY | 1.67 | Human SOD1 in complex with S-XL6 cross-linker
1PU0 | X-RAY | 1.70 | Human Cu,Zn SOD structure
2WYZ | X-RAY | 1.70 | L38V mutant complexed with UMP
2GBT | X-RAY | 1.70 | C6A/C111A mutant
2WZ0 | X-RAY | 1.72 | L38V mutant complexed with aniline
3K91 | X-RAY | 1.75 | Polysulfane Bridge in Cu-Zn SOD
8Q6M | X-RAY | 1.77 | Human SOD1 low dose data collection
1HL5 | X-RAY | 1.80 | Holo type human Cu,Zn SOD
1PTZ | X-RAY | 1.80 | H43R familial ALS mutant

Cryo-EM Structures (Amyloid Fibrils)

PDB ID | Resolution | Description
7VZF | 2.95 Å | Amyloid fibril formed by full-length human SOD1
8IHU | 2.97 Å | Amyloid fibril from G85R mutation
8IHV | 3.11 Å | Amyloid fibril from H46R mutation
9IYD | 3.09 Å | Amyloid fibril from G93A mutation
9IYJ | 2.92 Å | Amyloid fibril from D101N mutation
9JBO | 3.39 Å | Human SOD1 (WT) amyloid filament
9JBP | 3.18 Å | Human SOD1 (C6A/C111A) amyloid filament
9LI1 | 4.74 Å | Human SOD1 (G93A) amyloid filament

AlphaFold Predicted Structure

ID | Global pLDDT | Sequence Length | Fraction pLDDT Very High
P00441 | 97.95 | 1118 | 0.98 (98%)
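
To put the global pLDDT in context, the sketch below maps a score onto the four confidence bands used by the AlphaFold Protein Structure Database (above 90 very high, above 70 confident, above 50 low, otherwise very low). The function name is illustrative and the band labels are paraphrased.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to an AlphaFold DB confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


print(plddt_band(97.95))  # very high
```

A global score of 97.95 falls in the top band, which agrees with the 98% of residues reported as very high.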

Section 5: Cross-Species Orthologs

Species | Gene ID | Symbol | Genome
Mouse (Mus musculus) | ENSMUSG00000022982 | Sod1 | mus_musculus
Rat (Rattus norvegicus) | ENSRNOG00000002115 | Sod1 | rattus_norvegicus
Zebrafish (Danio rerio) | ENSDARG00000043848 | sod1 | danio_rerio
Fruit fly (Drosophila melanogaster) | FBgn0003462 | Sod1 | drosophila_melanogaster
Worm (C. elegans) | WBGene00004933 | sod-1 | -

Section 6: Clinical Variants & AI Predictions

ClinVar Variant Summary
Total ClinVar variants: 301

Classification | Count
Pathogenic | ~45
Likely pathogenic | ~35
Pathogenic/Likely pathogenic | ~25
Uncertain significance (VUS) | ~60
Conflicting classifications | ~15
Likely benign | ~80
Benign | ~35
Benign/Likely benign | ~10

TOP 50 Pathogenic/Likely Pathogenic Variants

ClinVar ID | HGVS Notation | Classification | Condition
14763 | c.14C>T (p.Ala5Val) | Pathogenic | ALS
14765 | c.13G>A (p.Ala5Thr) | Pathogenic | ALS
14771 | c.20G>T (p.Cys7Phe) | Pathogenic | ALS
1500887 | c.19T>A (p.Cys7Ser) | Pathogenic | ALS
1500897 | c.19T>G (p.Cys7Gly) | Likely pathogenic | ALS
2138373 | c.10A>G (p.Lys4Glu) | Pathogenic | ALS
3345429 | c.32G>C (p.Gly11Ala) | Pathogenic | ALS
14780 | c.37G>C (p.Gly13Arg) | Likely pathogenic | ALS
14752 | c.112G>A (p.Gly38Arg) | Pathogenic | ALS
14753 | c.115C>G (p.Leu39Val) | Pathogenic/Likely pathogenic | ALS
2138374 | c.116T>A (p.Leu39Gln) | Likely pathogenic | ALS
14754 | c.124G>A (p.Gly42Ser) | Pathogenic | ALS
14755 | c.125G>A (p.Gly42Asp) | Pathogenic | ALS
14756 | c.131A>G (p.His44Arg) | Pathogenic | ALS
14781 | c.137T>G (p.Phe46Cys) | Pathogenic | ALS
14764 | c.140A>G (p.His47Arg) | Pathogenic | ALS
1018905 | c.143T>C (p.Val48Ala) | Likely pathogenic | ALS
3349795 | c.142G>T (p.Val48Phe) | Likely pathogenic | ALS
1514873 | c.43G>C (p.Val15Leu) | Pathogenic/Likely pathogenic | ALS
1477719 | c.49G>T (p.Gly17Cys) | Likely pathogenic | ALS
1459352 | c.62T>G (p.Phe21Cys) | Pathogenic | ALS
1709733 | c.197A>G (p.Asn66Ser) | Pathogenic/Likely pathogenic | ALS
14782 | c.242A>G (p.His81Arg) | Pathogenic | ALS
2059610 | c.241C>T (p.His81Tyr) | Likely pathogenic | ALS
14775 | c.253T>G (p.Leu85Val) | Pathogenic | ALS
14758 | c.256G>C (p.Gly86Arg) | Pathogenic | ALS
1191297 | c.269C>T (p.Ala90Val) | Pathogenic/Likely pathogenic | ALS
1030809 | c.230A>T (p.Asp77Val) | Pathogenic/Likely pathogenic | ALS
14759 | c.280G>T (p.Gly94Cys) | Pathogenic | ALS
14760 | c.281G>C (p.Gly94Ala) | Pathogenic | ALS
14784 | c.280G>C (p.Gly94Arg) | Pathogenic | ALS
1072003 | c.281G>A (p.Gly94Asp) | Pathogenic | ALS
2634277 | c.280G>A (p.Gly94Ser) | Pathogenic | ALS
2197104 | c.286G>A (p.Ala96Thr) | Likely pathogenic | ALS
14761 | c.302A>G (p.Glu101Gly) | Pathogenic | ALS
2736984 | c.304G>A (p.Asp102Asn) | Pathogenic | ALS
1066874 | c.304G>C (p.Asp102His) | Pathogenic | ALS
14767 | c.313A>T (p.Ile105Phe) | Likely pathogenic | ALS
14757 | c.319C>G (p.Leu107Val) | Pathogenic | ALS
2138377 | c.344G>C (p.Gly115Ala) | Pathogenic | ALS
2138376 | c.335G>A (p.Cys112Tyr) | Pathogenic | ALS
14762 | c.338T>C (p.Ile113Thr) | Pathogenic | ALS
197145 | c.341T>C (p.Ile114Thr) | Pathogenic/Likely pathogenic | ALS
2018599 | c.342T>G (p.Ile114Met) | Likely pathogenic | ALS
1489352 | c.358G>C (p.Val120Leu) | Pathogenic/Likely pathogenic | ALS
1067619 | c.374A>T (p.Asp125Val) | Pathogenic/Likely pathogenic | ALS
1212596 | c.374A>G (p.Asp125Gly) | Pathogenic/Likely pathogenic | ALS
14777 | c.380T>A (p.Leu127Ter) | Pathogenic | ALS
2138379 | c.380T>C (p.Leu127Ser) | Pathogenic | ALS
1064422 | c.396T>G (p.Asn132Lys) | Likely pathogenic | ALS
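
The ClinVar table uses HGVS protein numbering, which counts the initiator methionine, while the PDB titles in Section 4 (A4V, G85R, G93A, I113T) use the older SOD1 numbering, which does not. The sketch below converts between the two by subtracting one from the residue position. The one-residue offset is an assumption based on that naming convention, and `legacy_name` is an illustrative helper, not part of any library.

```python
import re

# Three-letter to one-letter amino acid codes; Ter marks a stop codon.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",
}

# Matches simple substitutions such as p.Ala5Val.
_HGVS_P = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def legacy_name(hgvs_p: str, offset: int = 1) -> str:
    """Convert e.g. 'p.Ala5Val' to the legacy short form 'A4V'."""
    match = _HGVS_P.fullmatch(hgvs_p)
    if match is None:
        raise ValueError(f"unsupported HGVS protein notation: {hgvs_p!r}")
    ref, pos, alt = match.groups()
    return f"{THREE_TO_ONE[ref]}{int(pos) - offset}{THREE_TO_ONE[alt]}"


for hgvs in ("p.Ala5Val", "p.Gly86Arg", "p.Gly94Ala", "p.Ile114Thr"):
    print(hgvs, "->", legacy_name(hgvs))
```

Only simple missense and nonsense substitutions are handled; frameshifts, deletions, and other HGVS forms raise `ValueError` rather than being guessed at.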

AlphaMissense Predictions
Total predictions: 1,008

Classification | Count
Likely pathogenic | ~350
Ambiguous | ~150
Likely benign | ~508

TOP 50 Predicted Pathogenic Missense Variants (by score, descending)

Position | Protein Variant | AM Score | Classification
21:31659790 | C7W | 0.993 | likely_pathogenic
21:31659818 | G17R | 0.991 | likely_pathogenic
21:31659831 | F21S | 0.986 | likely_pathogenic
21:31659788 | C7R | 0.984 | likely_pathogenic
21:31659789 | C7Y | 0.983 | likely_pathogenic
21:31659825 | I19N | 0.982 | likely_pathogenic
21:31659795 | L9P | 0.975 | likely_pathogenic
21:31659782 | A5P | 0.973 | likely_pathogenic
21:31659783 | A5D | 0.973 | likely_pathogenic
21:31659818 | G17C | 0.972 | likely_pathogenic
21:31659825 | I19S | 0.969 | likely_pathogenic
21:31659806 | V30E | 0.962 | likely_pathogenic
21:31659819 | G17V | 0.959 | likely_pathogenic
21:31659825 | I19T | 0.945 | likely_pathogenic
21:31659795 | L9Q | 0.943 | likely_pathogenic
21:31659789 | C7F | 0.941 | likely_pathogenic
21:31659838 | Q23H | 0.941 | likely_pathogenic
21:31659838 | Q23H | 0.941 | likely_pathogenic
21:31659818 | G17S | 0.940 | likely_pathogenic
21:31659795 | L9R | 0.938 | likely_pathogenic
21:31659831 | F21C | 0.921 | likely_pathogenic
21:31659830 | F21L | 0.918 | likely_pathogenic
21:31659837 | Q23P | 0.900 | likely_pathogenic
21:31659786 | V6E | 0.892 | likely_pathogenic
21:31659813 | V15E | 0.873 | likely_pathogenic
21:31659782 | A5T | 0.861 | likely_pathogenic
21:31659800 | G11R | 0.857 | likely_pathogenic
21:31659819 | G17A | 0.849 | likely_pathogenic
21:31659783 | A5V | 0.840 | likely_pathogenic
21:31659831 | F21Y | 0.815 | likely_pathogenic
21:31659800 | G11C | 0.783 | likely_pathogenic
21:31659837 | Q23L | 0.772 | likely_pathogenic
21:31659801 | G11D | 0.766 | likely_pathogenic
21:31659801 | G11V | 0.746 | likely_pathogenic
21:31659824 | I19F | 0.746 | likely_pathogenic
21:31659837 | Q23R | 0.732 | likely_pathogenic
21:31659785 | V6M | 0.724 | likely_pathogenic
21:31659812 | V15M | 0.723 | likely_pathogenic
21:31659806 | V30G | 0.721 | likely_pathogenic
21:31659786 | V6G | 0.718 | likely_pathogenic
21:31659792 | V8E | 0.710 | likely_pathogenic
21:31659836 | Q23K | 0.705 | likely_pathogenic
21:31659788 | C7S | 0.693 | likely_pathogenic
21:31659806 | V30A | 0.682 | likely_pathogenic
21:31659786 | V6A | 0.680 | likely_pathogenic
21:31659830 | F21V | 0.677 | likely_pathogenic
21:31659785 | V6L | 0.671 | likely_pathogenic
21:31659788 | C7G | 0.640 | likely_pathogenic
21:31659792 | V8G | 0.635 | likely_pathogenic
21:31659812 | V15L | 0.633 | likely_pathogenic

SpliceAI Predictions
Total predictions: 688

TOP 50 Predicted Splice-Altering Variants (by delta score)

Variant | Effect | Delta Score
21:31659838:G:GT | donor_gain | 0.98
21:31659847:GGC:G | donor_gain | 0.98
21:31659853:G:GT | donor_gain | 0.98
21:31659853:G:T | donor_gain | 0.97
21:31660356:G:GT | donor_gain | 0.97
21:31659768:T:G | donor_gain | 0.95
21:31659857:G:GT | donor_gain | 0.94
21:31659838:GAAG:G | donor_gain | 0.94
21:31659818:G:GT | donor_gain | 0.94
21:31659837:AGAAG:A | donor_loss | 0.92
21:31659838:GAAGG:G | donor_loss | 0.92
21:31659839:AAG:A | donor_loss | 0.92
21:31659840:AGG:A | donor_loss | 0.92
21:31659841:GGC:G | donor_loss | 0.92
21:31659842:G:A | donor_loss | 0.92
21:31660015:GTGC:G | donor_gain | 0.92
21:31660016:TGCT:T | donor_gain | 0.92
21:31660017:GCTG:G | donor_gain | 0.92
21:31659800:G:GT | donor_gain | 0.91
21:31659843:C:CT | donor_loss | 0.91
21:31660017:GC:G | donor_gain | 0.91
21:31660466:G:T | donor_gain | 0.91
21:31660384:GC:G | donor_gain | 0.90
21:31659848:G:T | donor_gain | 0.90
21:31659861:C:T | donor_gain | 0.89
21:31660390:A:T | donor_gain | 0.87
21:31659982:C:A | donor_gain | 0.85
21:31659999:GCG:G | donor_gain | 0.85
21:31659811:A:AG | donor_gain | 0.84
21:31659812:G:GG | donor_gain | 0.84
21:31660466:G:GT | donor_gain | 0.83
21:31659782:G:GT | donor_gain | 0.82
21:31659819:G:T | donor_gain | 0.81
21:31659845:A:AA | donor_loss | 0.79
21:31659846:G:A | donor_loss | 0.78
21:31660389:G:GT | donor_gain | 0.77
21:31659839:A:T | donor_gain | 0.77
21:31659782:GCCGT:G | donor_gain | 0.73
21:31660000:C:CA | donor_gain | 0.73
21:31659998:AGC:A | donor_gain | 0.72
21:31660597:G:GC | acceptor_gain | 0.71
21:31659820:C:T | donor_gain | 0.71
21:31659978:C:CA | donor_gain | 0.70
21:31660090:G:GT | donor_gain | 0.69
21:31659764:C:T | donor_gain | 0.68
21:31659785:GT:G | donor_gain | 0.68
21:31660357:G:T | donor_gain | 0.68
21:31659746:G:GT | donor_gain | 0.67
21:31660014:A:AG | donor_gain | 0.65
21:31660015:G:GG | donor_gain | 0.65
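
The variant keys in this table follow a `chrom:pos:ref:alt` pattern. A minimal sketch of how such keys could be split and classified by allele length is shown below. The `Variant` type and both function names are illustrative, and the classification rule is a simple length comparison rather than anything SpliceAI itself defines.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(variant: Variant) -> str:
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.ref) < len(variant.alt):
        return "insertion"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    return "MNV"


for key in ("21:31659838:G:GT", "21:31659847:GGC:G", "21:31659853:G:T"):
    parsed = parse_variant(key)
    print(parsed.pos, variant_type(parsed))
```

Splitting on `:` works here because none of the fields can contain a colon; a key with the wrong number of fields raises `ValueError` during unpacking.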

Section 7: Biological Pathways & Gene Ontology

Reactome Pathways
Total pathways: 3

Pathway ID | Pathway Name | Disease Pathway
R-HSA-114608 | Platelet degranulation | No
R-HSA-3299685 | Detoxification of Reactive Oxygen Species | No
R-HSA-8950505 | Gene and protein expression by JAK-STAT signaling after Interleukin-12 stimulation | No

Gene Ontology Annotations
Total GO annotations: 87

Molecular Function (11 terms; 8 listed)

GO ID | Term
GO:0004784 | superoxide dismutase activity
GO:0005507 | copper ion binding
GO:0008270 | zinc ion binding
GO:0030346 | protein phosphatase 2B binding
GO:0031267 | small GTPase binding
GO:0042802 | identical protein binding
GO:0042803 | protein homodimerization activity
GO:0051087 | protein-folding chaperone binding

Cellular Component (20 terms; 18 listed)

GO ID | Term
GO:0005576 | extracellular region
GO:0005615 | extracellular space
GO:0005634 | nucleus
GO:0005654 | nucleoplasm
GO:0005737 | cytoplasm
GO:0005739 | mitochondrion
GO:0005758 | mitochondrial intermembrane space
GO:0005759 | mitochondrial matrix
GO:0005764 | lysosome
GO:0005777 | peroxisome
GO:0005829 | cytosol
GO:0031045 | dense core granule
GO:0031410 | cytoplasmic vesicle
GO:0032839 | dendrite cytoplasm
GO:0032991 | protein-containing complex
GO:0043025 | neuronal cell body
GO:0070062 | extracellular exosome
GO:1904115 | axon cytoplasm

Biological Process (TOP 30 of 56 terms)

GO ID | Term
GO:0000303 | response to superoxide
GO:0001541 | ovarian follicle development
GO:0001819 | positive regulation of cytokine production
GO:0001890 | placenta development
GO:0001895 | retina homeostasis
GO:0001975 | response to amphetamine
GO:0002262 | myeloid cell homeostasis
GO:0006749 | glutathione metabolic process
GO:0006801 | superoxide metabolic process
GO:0006879 | intracellular iron ion homeostasis
GO:0006915 | apoptotic process
GO:0007283 | spermatogenesis
GO:0007566 | embryo implantation
GO:0007605 | sensory perception of sound
GO:0007626 | locomotory behavior
GO:0008089 | anterograde axonal transport
GO:0008090 | retrograde axonal transport
GO:0008217 | regulation of blood pressure
GO:0008340 | determination of adult lifespan
GO:0009408 | response to heat
GO:0009410 | response to xenobiotic stimulus
GO:0019226 | transmission of nerve impulse
GO:0019228 | neuronal action potential
GO:0019430 | removal of superoxide radicals
GO:0042542 | response to hydrogen peroxide
GO:0042554 | superoxide anion generation
GO:0043065 | positive regulation of apoptotic process
GO:0050665 | hydrogen peroxide biosynthetic process
GO:0072593 | reactive oxygen species metabolic process
GO:0099610 | action potential initiation

Section 8: Protein Interactions & Molecular Networks

STRING Protein-Protein Interactions
Total interactions: 5,892

TOP 50 Highest-Confidence Interacting Proteins

UniProt B | Gene | Score
P00441 | SOD1 (self) | 992
P35637 | FUS | 992
Q13148 | TARDBP | 989
P10415 | BCL2 | 977
P21796 | VDAC1 | 951
A0A087WTZ4 | CCS | 926
O00244 | ATXN2 | 915
Q96LT7 | C9orf72 | 898
Q96Q42 | ALS2 | 888
P34932 | HSPA4 | 887
Q13501 | SQSTM1 | 884
Q96CV9 | OPTN | 872
P04839 | CYBB | 869
P37840 | SNCA | 860
P42858 | HTT | 845
Q9UHD9 | UBQLN2 | 845
O95292 | VAPB | 842
Q96SL4 | GPX4 | 828
Q8TED1 | THEM4 | 826
P22352 | GPX3 | 822
P04179 | SOD2 | 821
Q99497 | PARK7 | 820
P18283 | GPX2 | 819
O75715 | ELP3 | 816
P59796 | GPX6 | 816
P05067 | APP | 802
F5H3C5 | CHCHD10 | 796
P12036 | NEFH | 785
Q7Z333 | SETX | 777
P07196 | NEFL | 776
P02751 | FN1 | 771
P00354 | GAPDH | 769
P30044 | PRDX5 | 762
Q76N89 | KIF5A | 747
P09601 | HMOX1 | 741
P28329 | CHAT | 741
Q92562 | CHAC1 | 739
P55072 | VCP | 736
P11142 | HSPA8 | 733
P04156 | PRNP | 731
P43004 | SCN2A | 731
P31945 | BCKDHA | 729
P00390 | GSR | 723
Q99683 | M6PR | 722
P14136 | GFAP | 719
Q9Y5S8 | TRPM7 | 719
P07900 | HSP90AA1 | 712
P48506 | GCLC | 711
Q14061 | COX17 | 708
P08238 | HSP90AB1 | 700
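
STRING reports combined scores as integers scaled by 1,000, so 992 corresponds to a confidence of 0.992. The sketch below converts a score and labels it using STRING's customary cutoffs (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The function name is illustrative.

```python
from typing import Tuple


def string_confidence(score: int) -> Tuple[float, str]:
    """Convert a 0-1000 STRING combined score to (probability, band)."""
    if not 0 <= score <= 1000:
        raise ValueError("STRING scores range from 0 to 1000")
    if score >= 900:
        band = "highest"
    elif score >= 700:
        band = "high"
    elif score >= 400:
        band = "medium"
    elif score >= 150:
        band = "low"
    else:
        band = "below low-confidence cutoff"
    return score / 1000, band


print(string_confidence(992))  # (0.992, 'highest')
print(string_confidence(700))  # (0.7, 'high')
```

Under these cutoffs every entry in the table above is at least high confidence, and the first seven rows (scores of 900 or more) fall in the highest band.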

IntAct Molecular Interactions
Total interactions: 359

Key Interactions (by confidence)

Interactor A | Interactor B | Type | Confidence
SOD1 | SOD1 | direct interaction | 0.980
CCS | SOD1 | physical association | 0.830
HSPA5 | SOD1 | physical association | 0.690
PSMC1 | SOD1 | physical association | 0.670
PRDX5 | SOD1 | physical association | 0.610
BCL2L13 | SOD1 | physical association | 0.590
MKL1 | SOD1 | physical association | 0.560
GAPDH | SOD1 | physical association | 0.500
APOL2 | BPNT1 | association | 0.530

ESM2 Structural/Embedding Similarity
Total similar proteins: 72

TOP 20 Structurally Similar Proteins

UniProt ID | Top Similarity | Avg Similarity
P00441 | 1.000 | 0.996
P60052 | 1.000 | 0.996
P61851 | 1.000 | 0.995
P61852 | 1.000 | 0.995
P61853 | 1.000 | 0.995
P61854 | 1.000 | 0.995
Q8HXQ0 | 1.000 | 0.995
Q8HXQ1 | 1.000 | 0.995
Q8HXQ2 | 1.000 | 0.995
Q9U4X3 | 0.9999 | 0.995
Q9U4X5 | 0.9999 | 0.995
Q9U4X2 | 0.9998 | 0.995
P00442 | 0.9997 | 0.995
Q52RN5 | 0.9997 | 0.995
Q8HXP8 | 0.9996 | 0.995
Q8HXP9 | 0.9996 | 0.996
Q5FB29 | 0.9994 | 0.995
Q9U4X4 | 0.9994 | 0.995
Q9HEY7 | 0.9992 | 0.994
P09670 | 0.9992 | 0.995

DIAMOND Sequence Homology
Total homologous proteins: 168

TOP 20 Sequence-Similar Proteins

UniProt ID | Top Identity | Top Bitscore
P00441 | 100.0% | 317.0
A2XGP6 | 100.0% | 308.0
B6QEB3 | 100.0% | 312.0
P60052 | 100.0% | 317.0
P61851-54 | 100.0% | 315.0
A2QMY6 | 99.4% | 307.0
P85978 | 99.4% | 307.0
P00442 | 98.7% | 308.0
P23345 | 98.7% | 303.0
P23346 | 98.7% | 303.0
P09670 | 98.0% | 305.0
P25842 | 97.3% | 383.0
P07632 | 96.8% | 306.0
P08228 | 96.8% | 306.0
O46412 | 95.4% | 293.0
O22373 | 94.1% | 291.0
P10791 | 93.1% | 282.0
P54407 | 93.1% | 282.0
C0HK70 | 92.8% | 288.0
P24704 | 92.8% | 288.0

Section 9: Transcription Factor Regulatory Data

Note: SOD1 does not encode a transcription factor. The data below shows upstream TF regulators.

Upstream Transcriptional Regulators (CollecTRI)
Total regulators: 23

TF Gene | Regulation | Confidence
SP1 | Activation | High
EGR1 | Activation | High
CEBPB | Activation | High
PPARG | Activation | High
NFKB | - | High
YY1 | - | High
CEBPA | Unknown | High
CEBPG | - | High
ELK1 | Unknown | High
AP1 | - | High
KLF4 | Repression | High
CEBPD | Activation | -
BTG2 | Activation | -
MSX2 | Activation | -
MTF1 | Activation | -
PPARD | Activation | -
TFAP2A | Activation | -
JUN | - | -
WT1 | Activation | Low
NFE2L2 | Activation | Low
AHR | - | Low
FOXO3 | - | Low
NFKBIA | - | Low

SIGNOR Signaling Network
Total connections: 33

Upstream Regulators

Regulator | Effect | Mechanism | Direct
CEBPD | Up-regulates | transcriptional regulation | No
EGR1 | Up-regulates | transcriptional regulation | No
WT1 | Up-regulates | transcriptional regulation | Yes
SP1 | Up-regulates | transcriptional regulation | No
KLF4 | Down-regulates | transcriptional regulation | No
MTF1 | Up-regulates | transcriptional regulation | No
BTG2 | Up-regulates | transcriptional regulation | No
NFE2L2 | Up-regulates | transcriptional regulation | Yes
PPARD | Up-regulates | transcriptional regulation | No
CHEK2 | Up-regulates activity | phosphorylation | Yes
DIP2A | Up-regulates activity | binding | Yes
SQSTM1 | Down-regulates | binding | Yes
MIF | Down-regulates | relocalization | Yes
MARCHF5 | Down-regulates | ubiquitination | Yes

Downstream Targets

Target | Effect | Mechanism
Protein_aggregates | Up-regulates | -
S100A4 | Up-regulates | -
DERL1 | Down-regulates activity | binding
ERN1 | Up-regulates activity | binding
EIF2AK3 | Up-regulates activity | binding
ER stress | Up-regulates | -
MAP3K5 | Up-regulates | -
VDAC1 | Down-regulates activity | binding
BCL2 | Up-regulates activity | binding
KARS1 | Down-regulates | binding
superoxide | Down-regulates | chemical modification
dioxygen | Up-regulates | chemical modification
hydrogen peroxide | Up-regulates | chemical modification

Section 10: Drug & Pharmacology Data

ChEMBL Target Information

Target ID | Type | Description
CHEMBL2354 | SINGLE PROTEIN | Superoxide dismutase [Cu-Zn]
CHEMBL4106171 | PROTEIN FAMILY | Superoxide dismutase 1/2

Targeting Molecules (ChEMBL)
Total compounds: 28

ChEMBL ID | Name | Type | Highest Phase
CHEMBL1672028 | 1,4-BIS(3'-PYRAZOLYLAMINO)BENZO[G]PHTHALAZINE | Small molecule | 0
CHEMBL1643541 | - | Small molecule | 0
CHEMBL1643556 | - | Small molecule | 0
CHEMBL1643557 | - | Small molecule | 0
CHEMBL1782791 | - | Small molecule | 0
CHEMBL1939222 | - | Small molecule | 0
CHEMBL2165601-14 | - | Small molecules | 0
CHEMBL2179265 | - | Small molecule | 0
CHEMBL2179266 | - | Small molecule | 0
CHEMBL2179268 | - | Small molecule | 0
CHEMBL272641 | - | Small molecule | 0
CHEMBL3355357 | - | Small molecule | 0
CHEMBL3355358 | - | Small molecule | 0
CHEMBL4127450 | - | Small molecule | 0

Note: All listed ChEMBL compounds are in preclinical stages (Phase 0); none is an approved drug acting directly on the SOD1 protein.

PharmGKB Gene Information

Field | Value
PharmGKB ID | PA334
VIP Gene | Yes
Has CPIC Guideline | No
Has Variant Annotation | Yes

Clinical Trials
No clinical trials directly linked in database. Notable therapeutic approaches targeting SOD1 in ALS include:
  • Antisense oligonucleotides (ASOs), for example tofersen, which received FDA accelerated approval in 2023 for SOD1-ALS
  • Gene therapy approaches
  • Small molecule stabilizers

Section 11: Expression Profiles

Bgee Expression Summary

Metric | Value
Expression Breadth | Ubiquitous
Total Conditions Present | 304
Total Conditions Absent | 1
Max Expression Score | 99.89
Average Expression Score | 98.64
Gold Quality Calls | 303

TOP 30 Tissues by Expression

Tissue/Structure | UBERON ID | Score | Quality
Pons | UBERON:0000988 | 99.89 | gold
Dorsal root ganglion | UBERON:0000044 | 99.81 | gold
Substantia nigra pars compacta | UBERON:0001965 | 99.81 | gold
Lateral nuclear group of thalamus | UBERON:0002736 | 99.78 | gold
Superior vestibular nucleus | UBERON:0007227 | 99.78 | gold
Substantia nigra pars reticulata | UBERON:0001966 | 99.78 | gold
Right lobe of liver | UBERON:0001114 | 99.71 | gold
Brodmann area 10 | UBERON:0013541 | 99.70 | gold
Right adrenal gland cortex | UBERON:0035827 | 99.69 | gold
Hypothalamus | UBERON:0001898 | 99.68 | gold
Right adrenal gland | UBERON:0001233 | 99.68 | gold
Brodmann area 9 | UBERON:0013540 | 99.67 | gold
Dorsolateral prefrontal cortex | UBERON:0009834 | 99.67 | gold
Frontal pole | UBERON:0002795 | 99.67 | gold
Renal medulla | UBERON:0000362 | 99.66 | gold
Adrenal cortex | UBERON:0001235 | 99.66 | gold
Left adrenal gland | UBERON:0001234 | 99.66 | gold
Bronchial epithelial cell | CL:0002328 | 99.65 | gold
Left adrenal gland cortex | UBERON:0035825 | 99.64 | gold
Trigeminal ganglion | UBERON:0001675 | 99.64 | gold
Caudate nucleus | UBERON:0001873 | 99.64 | gold
Anterior cingulate cortex | UBERON:0009835 | 99.63 | gold
Cingulate cortex | UBERON:0003027 | 99.63 | gold
Right frontal lobe | UBERON:0002810 | 99.62 | gold
Nucleus accumbens | UBERON:0001882 | 99.62 | gold
Midbrain | UBERON:0001891 | 99.62 | gold
Prefrontal cortex | UBERON:0000451 | 99.61 | gold
Epithelium of bronchus | UBERON:0002031 | 99.61 | gold
Ventral tegmental area | UBERON:0002691 | 99.61 | gold
Lateral globus pallidus | UBERON:0002476 | 99.61 | gold

Single-Cell Expression Data
Total Single-Cell Datasets: 18

Dataset ID | Description | Cell Count
E-MTAB-10283 | Endometrial organoids time course | 574,689
E-GEOD-139324 | Viral- and carcinogen-derived head and neck cancer | 204,315
E-MTAB-8207 | Monocyte surface IL7R expression | 175,040
E-GEOD-149689 | COVID-19 and Influenza immunophenotyping | 166,852
E-MTAB-8495 | Human biliary tree | 160,459
E-HCAD-29 | GM-CSF-producing T helper cells | 78,686
E-HCAD-9 | Human liver cellular landscape | 79,612
E-CURD-95 | Clonally expanded EOMES+ Tr1-like cells in tumors | 87,767
E-MTAB-9435 | IDHwt glioblastoma tumors | 62,867
E-CURD-112 | Human fetal bone marrow haematopoiesis | 56,592
E-MTAB-9467 | Dengue virus infection PBMCs | 55,600
E-HCAD-23 | Human first-trimester placenta | 41,132
E-MTAB-10553 | Human liver cells | 24,355
E-MTAB-8559 | Ovarian cancer ex vivo models | 20,982
E-MTAB-8911 | Graft-Versus-Host Disease T-lymphocytes | 19,075
E-GEOD-84465 | Glioblastoma infiltrating cells | 3,588
E-CURD-89 | Colon lamina propria immune cells | 1,526
E-GEOD-75688 | Primary breast cancer cells | 549

Section 12: Disease Associations

Mendelian/Monogenic Disease Links (GenCC)
Total curated associations: 6

Disease | OMIM/Orphanet | Inheritance | Classification | Submitter
Amyotrophic lateral sclerosis type 1 | OMIM:105400 | Autosomal dominant | Strong | Genomics England PanelApp
Amyotrophic lateral sclerosis type 1 | OMIM:105400 | Autosomal dominant | Strong | Labcorp Genetics
Amyotrophic lateral sclerosis type 1 | OMIM:105400 | Autosomal recessive | Strong | Labcorp Genetics
Spastic tetraplegia and axial hypotonia, progressive | OMIM:618598 | Autosomal recessive | Strong | Labcorp Genetics
Spastic tetraplegia and axial hypotonia, progressive | OMIM:618598 | Autosomal recessive | Limited | Ambry Genetics
Amyotrophic lateral sclerosis | ORPHANET:803 | Autosomal dominant | Supportive | Orphanet

Orphanet Disease Entry

Field | Value
Orphanet ID | 803
Disease Name | Amyotrophic lateral sclerosis
Type | Disease
Total Genes | 36
Phenotype Count | 47

Phenotype Associations (HPO)
Total phenotypes: 72 (TOP 50 listed)

HPO ID | Phenotype Term
HP:0007354 | Amyotrophic lateral sclerosis
HP:0003202 | Skeletal muscle atrophy
HP:0001324 | Muscle weakness
HP:0002380 | Fasciculations
HP:0001260 | Dysarthria
HP:0002015 | Dysphagia
HP:0002878 | Respiratory failure
HP:0001347 | Hyperreflexia
HP:0003487 | Babinski sign
HP:0001257 | Spasticity
HP:0002061 | Lower limb spasticity
HP:0002313 | Spastic paraparesis
HP:0001285 | Spastic tetraparesis
HP:0002180 | Neurodegeneration
HP:0002398 | Degeneration of anterior horn cells
HP:0002314 | Degeneration of lateral corticospinal tracts
HP:0007373 | Motor neuron atrophy
HP:0003693 | Distal amyotrophy
HP:0008955 | Progressive distal muscular atrophy
HP:0003484 | Upper limb muscle weakness
HP:0007340 | Lower limb muscle weakness
HP:0003324 | Generalized muscle weakness
HP:0003470 | Paralysis
HP:0007024 | Pseudobulbar paralysis
HP:0001308 | Tongue fasciculations
HP:0012473 | Tongue atrophy
HP:0002145 | Frontotemporal dementia
HP:0100543 | Cognitive impairment
HP:0002463 | Language impairment
HP:0001344 | Absent speech
HP:0000708 | Atypical behavior
HP:0000712 | Emotional lability
HP:0000716 | Depression
HP:0000739 | Anxiety
HP:0001251 | Ataxia
HP:0001272 | Cerebellar atrophy
HP:0001276 | Hypertonia
HP:0002267 | Exaggerated startle response
HP:0002411 | Myokymia
HP:0003394 | Muscle spasm
HP:0011448 | Ankle clonus
HP:0031993 | Hoffmann sign
HP:0033683 | Jaw hyperreflexia
HP:0007256 | Abnormal pyramidal sign
HP:0001618 | Dysphonia
HP:0002094 | Dyspnea
HP:0012764 | Orthopnea
HP:0025425 | Laryngospasm
HP:0030878 | Abnormality on pulmonary function testing
HP:0010535 | Sleep apnea

GWAS Associations
Total associations: 1

Study ID | Trait | Mapped Gene | Chromosome | P-value
GCST000781_2 | Amyotrophic lateral sclerosis | TIAM1 | 21 | 3.0e-08
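
For context on the reported p-value, the sketch below compares it with the conventional genome-wide significance threshold of 5e-8 and converts it to the -log10 scale used in Manhattan plots. The threshold is the commonly used convention rather than something stated in the GWAS Catalog record, and the function names are illustrative.

```python
import math

# Conventional genome-wide significance threshold (an assumption of this sketch).
GENOME_WIDE_ALPHA = 5e-8


def is_genome_wide_significant(p_value: float) -> bool:
    """True if the p-value is below the conventional 5e-8 threshold."""
    return p_value < GENOME_WIDE_ALPHA


def neg_log10(p_value: float) -> float:
    """Transform a p-value to the -log10 scale."""
    return -math.log10(p_value)


p = 3.0e-08
print(is_genome_wide_significant(p))  # True
print(round(neg_log10(p), 2))         # 7.52
```

The association passes the threshold, though the catalog maps it to TIAM1 on chromosome 21 rather than to SOD1 itself.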

DATA SOURCES SUMMARY

Database | Data Retrieved
HGNC | Gene nomenclature, identifiers
Ensembl | Gene/transcript/exon coordinates
NCBI Entrez | Gene summary, synonyms
OMIM | Disease associations
UniProt | Protein sequence, function
InterPro | Protein domains
PDB | 3D structures (154 entries)
AlphaFold | Predicted structure
ClinVar | Clinical variants (301 entries)
AlphaMissense | Pathogenicity predictions (1,008)
SpliceAI | Splice predictions (688)
Reactome | Pathway membership
Gene Ontology | Functional annotations (87)
STRING | Protein interactions (5,892)
IntAct | Molecular interactions (359)
CollecTRI | TF-target relationships
SIGNOR | Signaling network
ChEMBL | Drug target data
PharmGKB | Pharmacogenomics
Bgee | Expression data
Single Cell Expression Atlas | scRNA-seq data
GenCC | Gene-disease curation
HPO | Phenotype annotations
Orphanet | Rare disease data
GWAS Catalog | Genetic associations

Reference compiled from the BioBTree integrated biological database
Date: 2026-03-27


I’ve compiled a comprehensive cross-database identifier and functional mapping reference for human SOD1 (Superoxide Dismutase 1).

Key Summary Statistics:

Category | Count
Ensembl Transcripts | 14
UniProt Proteins | 3 (P00441 canonical)
PDB Structures | 154
ClinVar Variants | 301
AlphaMissense Predictions | 1,008
SpliceAI Predictions | 688
GO Annotations | 87
STRING Interactions | 5,892
IntAct Interactions | 359
Transcriptional Regulators | 23
HPO Phenotypes | 72
Single-Cell Datasets | 18

Key Clinical Information:
  • SOD1 is strongly associated with Amyotrophic Lateral Sclerosis (ALS) and progressive spastic tetraplegia with axial hypotonia
  • The gene is located on chromosome 21q22.11 and encodes a 154 amino acid Cu-Zn superoxide dismutase
  • Expression is ubiquitous with highest levels in nervous system tissues (pons, dorsal root ganglion, substantia nigra)
  • 154 experimental 3D structures exist, including cryo-EM structures of ALS-associated amyloid fibrils
  • AlphaFold structure has excellent confidence (pLDDT 97.95)