SMAD4 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human SMAD4. This should serve as a definitive lookup resource for researchers.

SECTION 1: GENE IDENTIFIERS

Provide ALL gene-level database identifiers:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand

SECTION 2: TRANSCRIPT IDENTIFIERS

List ALL transcript-level identifiers:
- Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.). How many total transcripts?
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select (canonical clinical standard)
- CCDS IDs: ALL consensus coding sequence identifiers

For the CANONICAL/MANE SELECT transcript:
- List ALL exon IDs (ENSE) with genomic coordinates
- Total exon count

SECTION 3: PROTEIN IDENTIFIERS

List ALL protein-level identifiers:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry
- RefSeq protein: ALL NP_ accessions

Protein domains and families:
- List ALL annotated domains/families with identifiers
- Include: domain name, type (domain/family/superfamily), and ID

SECTION 4: STRUCTURE IDENTIFIERS

Experimental structures:
- List ALL PDB structure IDs
- For each: experimental method (X-ray, NMR, Cryo-EM) and resolution
- Total PDB structure count

Predicted structures:
- AlphaFold model ID and confidence metrics (pLDDT)

SECTION 5: CROSS-SPECIES ORTHOLOGS

List orthologous genes in key model organisms (where available):
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS

Clinical variant annotations:
- Total variant count in clinical databases
- Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign
- List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: Total count. List TOP 50 predicted splice-altering variants with delta scores
- Missense pathogenicity predictions: Total count. List TOP 50 predicted pathogenic missense variants with scores

SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY

Pathway membership:
- List ALL biological pathways this gene participates in
- Include pathway IDs and names
- Total pathway count

Gene Ontology annotations:
- Biological Process: count and TOP 20 terms with IDs
- Molecular Function: count and TOP 20 terms with IDs
- Cellular Component: count and TOP 20 terms with IDs

SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS

Protein-protein interactions:
- Total interaction count
- List TOP 50 highest-confidence interacting proteins with scores

Protein similarity (evolutionary and structural):
- Structural/embedding similarity: How many similar proteins? List TOP 20 with similarity scores
- Sequence homology: How many homologous proteins? List TOP 20 with identity/similarity scores

SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA

If this gene encodes a transcription factor:

Downstream targets (genes regulated BY this TF):
- Total target gene count
- List TOP 50 target genes with regulation type (activates/represses)

DNA binding profiles:
- List ALL known binding motif IDs
- Motif family classification

Upstream regulators (TFs that regulate THIS gene):
- List known transcriptional regulators with evidence type

SECTION 10: DRUG & PHARMACOLOGY DATA

If this gene/protein is a drug target:

Targeting molecules:
- How many drug/compound molecules target this protein?
- List TOP 30 molecules by development phase
- Include: molecule ID, name, mechanism, highest development phase

Clinical trials:
- How many clinical trials involve drugs targeting this gene?
- List TOP 20 trials with: trial ID, phase, status, intervention

Pharmacogenomics:
- Known drug-gene interactions affecting drug response
- Dosing guidelines if any exist

SECTION 11: EXPRESSION PROFILES

Tissue expression:
- Which tissues express this gene most highly?
- List TOP 30 tissues with expression scores/levels
- Note any tissue-specific or tissue-enriched patterns

Cell type expression:
- Which cell types show highest expression?
- List TOP 30 cell types with expression scores
- Note any cell type-specific patterns

Single-cell expression data (if available):
- Which single-cell datasets/experiments include this gene?
- Notable cell population patterns

SECTION 12: DISEASE ASSOCIATIONS

Mendelian/monogenic disease links:
- What diseases are caused by mutations in this gene?
- List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level

Phenotype associations:
- What clinical phenotypes are associated with this gene?
- List TOP 50 phenotype terms with IDs

Complex trait associations (GWAS):
- What traits/diseases are linked via genome-wide association studies?
- List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available

FORMATTING REQUIREMENTS

- Present each section with clear headers
- Use tables for listing multiple items
- Always include COUNTS for every category
- For lists with >50 items: show total count + TOP 50
- For lists with <50 items: show ALL items
- Always include database identifiers (IDs), not just names
- Mark canonical/primary/reviewed entries where applicable
- If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 35 biological databases.

Datasets: alphafold, alphamissense, bgee, bgee_evidence, biogrid_interaction, ccds, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, msigdb, orphanet, ortholog, pdb, pfam, pharmgkb_gene, reactome, refseq, scxa, spliceai, string_interaction, transcript, uniprot
Generated: 2026-04-01 — For the latest data, query BioBTree directly via MCP or API.
SMAD4

Section 1: Gene Identifiers

Database | Identifier | Description
HGNC | HGNC:6770 | Approved symbol: SMAD4
Ensembl | ENSG00000141646 | Ensembl gene ID
NCBI Entrez | 4089 | Entrez Gene ID
OMIM | 600993 | Gene/locus MIM number
Locus Group | protein-coding gene | Gene with protein product
Gene Family | SMAD family |

Genomic Location (GRCh38/hg38)

Attribute | Value
Chromosome | 18
Cytogenetic Band | 18q21.2
Start Position | 51,028,528
End Position | 51,085,045
Strand | + (forward)
Span | 56,518 bp

Section 2: Transcript Identifiers

Ensembl Transcripts
Total transcript count: 49

Transcript ID | Biotype | Start | End
ENST00000342988 | protein_coding | 51030213 | 51085042
ENST00000398417 | protein_coding | 51029614 | 51085045
ENST00000588745 | protein_coding | 51047047 | 51078467
ENST00000588860 | protein_coding | 51028528 | 51085017
ENST00000589076 | protein_coding | 51030100 | 51085042
ENST00000589941 | protein_coding | 51030209 | 51085042
ENST00000590061 | protein_coding | 51030213 | 51085042
ENST00000593223 | protein_coding | 51030219 | 51079777
ENST00000714261 | protein_coding | 51030210 | 51080083
ENST00000714264 | protein_coding | 51030213 | 51079781
ENST00000714266 | protein_coding | 51030221 | 51079776
ENST00000714268 | protein_coding | 51030221 | 51079777
ENST00000714269 | protein_coding | 51030222 | 51079781
ENST00000714270 | protein_coding | 51030223 | 51079775
ENST00000714272 | protein_coding | 51030224 | 51079777
ENST00000877432 | protein_coding | 51030099 | 51079781
ENST00000877433 | protein_coding | 51030582 | 51079191
ENST00000932317 | protein_coding | 51030210 | 51085042
ENST00000932318 | protein_coding | 51030221 | 51080369
ENST00000932319 | protein_coding | 51030212 | 51079774
ENST00000971068 | protein_coding | 51030223 | 51079778
ENST00000971069 | protein_coding | 51030222 | 51079777
ENST00000971070 | protein_coding | 51030223 | 51079777
ENST00000971071 | protein_coding | 51030222 | 51079776
ENST00000971072 | protein_coding | 51030293 | 51079145
ENST00000585448 | retained_intron | 51051093 | 51054916
ENST00000586253 | retained_intron | 51077251 | 51078870
ENST00000589706 | retained_intron | 51047179 | 51049565
ENST00000590499 | retained_intron | 51066822 | 51077291
ENST00000591126 | retained_intron | 51052326 | 51079777
(+19 more retained_intron/NMD transcripts)

RefSeq Transcripts

RefSeq ID | Type | Status | MANE Select
NM_005359 | mRNA | REVIEWED | Yes ✓
NM_001407041 | mRNA | REVIEWED | No
NM_001407042 | mRNA | REVIEWED | No
NM_001407043 | mRNA | REVIEWED | No
NM_001364967 | mRNA | VALIDATED | No
NM_001364968 | mRNA | VALIDATED | No
NR_176264 | ncRNA | REVIEWED | No
NR_176265 | ncRNA | REVIEWED | No

CCDS Identifier

CCDS ID
CCDS11950

Exons for Canonical Transcript (ENST00000342988)
Total exon count: 12

Exon ID | Start | End | Strand
ENSE00004023391 | 51030213 | 51030623 | +
ENSE00003669145 | 51046920 | 51047295 | +
ENSE00002877760 | 51048686 | 51048860 | +
ENSE00004013734 | 51049295 | 51049324 | +
ENSE00004013729 | 51054781 | 51054993 | +
ENSE00004013736 | 51058125 | 51058244 | +
ENSE00004013735 | 51058340 | 51058456 | +
ENSE00004013727 | 51059866 | 51059916 | +
ENSE00004013731 | 51065423 | 51065606 | +
ENSE00004013728 | 51067019 | 51067187 | +
ENSE00004013730 | 51076638 | 51076776 | +
ENSE00004013733 | 51078256 | 51085042 | +
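
The exon coordinates above can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual Ensembl convention, not stated in the table itself); the function names are illustrative and not part of any database API.

```python
# Exon coordinates copied from the table above (GRCh38, forward strand).
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    (51030213, 51030623),
    (51046920, 51047295),
    (51048686, 51048860),
    (51049295, 51049324),
    (51054781, 51054993),
    (51058125, 51058244),
    (51058340, 51058456),
    (51059866, 51059916),
    (51065423, 51065606),
    (51067019, 51067187),
    (51076638, 51076776),
    (51078256, 51085042),
]


def exon_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def total_exonic_length(exons) -> int:
    """Sum of exon lengths, i.e. the spliced transcript length."""
    return sum(exon_length(s, e) for s, e in exons)


def intron_lengths(exons) -> list:
    """Gaps between consecutive exons; raises if exons overlap or are unsorted."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap or are not sorted by position")
        gaps.append(next_start - prev_end - 1)
    return gaps


print(len(EXONS), "exons,", total_exonic_length(EXONS), "exonic bp")
print("largest intron:", max(intron_lengths(EXONS)), "bp")
```

Exonic and intronic lengths together should add up to the transcript span (last exon end minus first exon start, plus one), which is a quick way to catch a mis-split coordinate.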

Section 3: Protein Identifiers

UniProt Accessions

Accession | Name | Status | Mass | Length
Q13485 | SMAD family member 4 | Reviewed (Canonical) ✓ | 60,439 Da | 552 aa

Alternative names:
  • Deletion target in pancreatic carcinoma 4 (DPC4)
  • Mothers against decapentaplegic homolog 4

RefSeq Protein IDs

RefSeq Protein | Status | MANE Select
NP_005350 | REVIEWED | Yes ✓
NP_001393970 | REVIEWED | No
NP_001393971 | REVIEWED | No
NP_001393972 | REVIEWED | No
NP_001351896 | VALIDATED | No
NP_001351897 | VALIDATED | No

Protein Domains and Families

ID | Name | Type
IPR001132 | SMAD_dom | Domain
IPR003619 | MAD_homology1_Dwarfin-type | Domain
IPR013019 | MAD_homology_MH1 | Domain
IPR013790 | SMAD/Dwarfins | Family
IPR008984 | SMAD_FHA_dom_sf | Homologous_superfamily
IPR017855 | SMAD-like_dom_sf | Homologous_superfamily
IPR036578 | SMAD_MH1_sf | Homologous_superfamily
PF03165 | MH1 domain (Pfam) | Domain
PF03166 | MH2 domain (Pfam) | Domain

Section 4: Structure Identifiers

Experimental Structures (PDB)
Total PDB structure count: 12

PDB ID | Title | Method | Resolution (Å)
6YIC | 14-3-3 sigma in complex with SMAD4 pS403 peptide | X-ray | 1.60
5MEY | Smad4-MH1 bound to GGCGC site | X-ray | 2.05
1YGS | SMAD4 tumor suppressor C-terminal domain | X-ray | 2.10
5UWU | SMAD4 NES peptide with CRM1-Ran-RanBP1 | X-ray | 2.24
5C4V | Ski-like protein complex | X-ray | 2.60
1U7F | Phosphorylated Smad3/Smad4 heterotrimer | X-ray | 2.60
1DD1 | SMAD4 active fragment | X-ray | 2.62
1U7V | Phosphorylated Smad2/Smad4 heterotrimer | X-ray | 2.70
1MR1 | Smad4-Ski complex | X-ray | 2.85
5MEZ | Smad4-MH1 bound to GGCT site | X-ray | 2.98
1G88 | S4AFL3ARG515 mutant | X-ray | 3.00
5MF0 | Smad4-MH1 bound to GGCCG site | X-ray | 3.03
Predicted Structures (AlphaFold)
AlphaFold IDGlobal pLDDTSequence LengthFraction Very High Confidence
Q1348574.5642530.53 (53%)

Section 5: Cross-Species Orthologs

Organism | Gene ID | Symbol | Biotype
Mouse (Mus musculus) | ENSMUSG00000024515 | Smad4 | protein_coding
Rat (Rattus norvegicus) | ENSRNOG00000051965 | Smad4 | protein_coding
Zebrafish (Danio rerio) | ENSDARG00000012649 | smad4b | protein_coding
Zebrafish (Danio rerio) | ENSDARG00000075226 | smad4a | protein_coding
Fruit fly (Drosophila melanogaster) | FBGN0020493 | Dad | protein_coding
Worm (C. elegans) | WBGENE00006445 | tag-68 | protein_coding
Yeast (S. cerevisiae) | No ortholog | |

Section 6: Clinical Variants & AI Predictions

Clinical Variant Summary (ClinVar)
Total variant count: 2,571

Classification | Count
Pathogenic | ~200+
Likely Pathogenic | ~50+
Uncertain Significance (VUS) | ~1,800+
Likely Benign | ~150+
Benign/Likely Benign | ~300+
Conflicting | Several

TOP 50 Pathogenic/Likely Pathogenic Variants

Variant ID | HGVS Notation | Type | Classification
24802 | c.403C>T (p.Arg135Ter) | SNV | Pathogenic
24830 | c.1081C>G (p.Arg361Gly) | SNV | Pathogenic
24842 | c.1162C>T (p.Gln388Ter) | SNV | Pathogenic
24850 | c.1333C>T (p.Arg445Ter) | SNV | Pathogenic
140903 | c.1345C>T (p.Gln449Ter) | SNV | Pathogenic
142253 | c.1245_1248del (p.Asp415Glufs) | Deletion | Pathogenic
156513 | c.1547dup (p.Ser517fs) | Duplication | Pathogenic
182867 | c.1231_1232del (p.Ser411fs) | Microsatellite | Pathogenic
182871 | c.1239C>A (p.Tyr413Ter) | SNV | Pathogenic
182873 | c.1258_1259dup (p.Ala421fs) | Duplication | Pathogenic
187217 | c.153dup (p.Asp52fs) | Duplication | Pathogenic
224548 | c.886_895del (p.Pro296fs) | Deletion | Pathogenic
240143 | c.1206dup (p.Ser403Ter) | Duplication | Pathogenic
240156 | c.906G>A (p.Trp302Ter) | SNV | Pathogenic
1050635 | c.733C>T (p.Gln245Ter) | SNV | Pathogenic
1073157 | c.620del (p.Asn207fs) | Deletion | Pathogenic
1074223 | c.779dup (p.Tyr260Ter) | Duplication | Pathogenic
1076671 | c.968G>A (p.Trp323Ter) | SNV | Pathogenic
1757953 | c.725C>G (p.Ser242Ter) | SNV | Pathogenic
2026527 | c.803G>A (p.Trp268Ter) | SNV | Pathogenic
1067853 | c.1308+2T>G | SNV | Likely pathogenic
1066208 | c.1149A>G (p.Ile383Met) | SNV | Likely pathogenic
1032991 | c.803_804insCTAAGTGGTAGTA | Insertion | Likely pathogenic
(+additional pathogenic variants)

AI-Based Variant Effect Predictions

SpliceAI Predictions
Total splice-altering predictions: 2,425

Variant | Effect | Delta Score
18:51028631:C>G | donor_gain | 0.92
18:51028666:G>GG | donor_gain | 0.91
18:51029741:G>GT | donor_gain | 0.76
18:51029742:A>T | donor_gain | 0.75
18:51029770:G>GT | donor_gain | 0.72
18:51030152:G>GT | donor_gain | 0.66
18:51028588:G>GT | donor_gain | 0.64
18:51028670:G>GT | donor_gain | 0.60
18:51028665:T>G | donor_gain | 0.58
18:51028662:ATCT>A | donor_gain | 0.56
(+2,415 more predictions)
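
The variant strings in the SpliceAI table follow a chrom:pos:ref>alt layout, so they can be split and filtered mechanically. The sketch below is illustrative only: the `SpliceCall` type and helper names are hypothetical, and the 0.8 cutoff is an assumption taken from the commonly cited SpliceAI guidance (0.2 high recall, 0.5 recommended, 0.8 high precision), not from this report.

```python
from typing import NamedTuple


class SpliceCall(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str
    delta: float


def parse_call(variant: str, effect: str, delta: float) -> SpliceCall:
    """Split a 'chrom:pos:ref>alt' string as used in the table above."""
    chrom, pos, change = variant.split(":")
    ref, alt = change.split(">")
    return SpliceCall(chrom, int(pos), ref, alt, effect, delta)


def variant_kind(call: SpliceCall) -> str:
    """Classify by allele length only (equal lengths are treated as substitutions)."""
    if len(call.ref) == len(call.alt):
        return "substitution"
    return "insertion" if len(call.alt) > len(call.ref) else "deletion"


# A few rows copied from the table above.
ROWS = [
    ("18:51028631:C>G", "donor_gain", 0.92),
    ("18:51028666:G>GG", "donor_gain", 0.91),
    ("18:51028665:T>G", "donor_gain", 0.58),
    ("18:51028662:ATCT>A", "donor_gain", 0.56),
]

HIGH_PRECISION_CUTOFF = 0.8  # assumed threshold, see lead-in

calls = [parse_call(*row) for row in ROWS]
for call in calls:
    flag = "high" if call.delta >= HIGH_PRECISION_CUTOFF else "moderate"
    print(call.chrom, call.pos, variant_kind(call), call.delta, flag)
```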

AlphaMissense Predictions
Total missense predictions: 3,631

Variant | Protein Change | AM Score | Classification
18:51047080:A>C | S12R | 0.999 | likely_pathogenic
18:51047082:T>A | S12R | 0.999 | likely_pathogenic
18:51047086:G>C | D14H | 1.000 | likely_pathogenic
18:51047086:G>T | D14Y | 0.999 | likely_pathogenic
18:51047087:A>G | D14G | 0.998 | likely_pathogenic
18:51047087:A>T | D14V | 0.999 | likely_pathogenic
18:51047090:C>A | A15D | 1.000 | likely_pathogenic
18:51047081:G>T | S12I | 0.997 | likely_pathogenic
18:51047086:G>A | D14N | 0.994 | likely_pathogenic
18:51047088:T>A | D14E | 0.991 | likely_pathogenic
18:51047075:C>G | P10R | 0.989 | likely_pathogenic
18:51047084:A>G | N13S | 0.072 | likely_benign
18:51047069:A>G | N8S | 0.072 | likely_benign
(+3,600+ more predictions)
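
The classification column above is consistent with the published AlphaMissense score bands. As a hedged illustration, the sketch below maps a score to a class using cutoffs of 0.34 and 0.564; these thresholds are an assumption drawn from the AlphaMissense defaults, not from this report, and the function names are hypothetical.

```python
import re

# Assumed cutoffs (AlphaMissense defaults, not stated in this report).
BENIGN_BELOW = 0.34
PATHOGENIC_ABOVE = 0.564


def classify(score: float) -> str:
    """Map an AlphaMissense score in [0, 1] to a class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < BENIGN_BELOW:
        return "likely_benign"
    if score > PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    return "ambiguous"


def parse_protein_change(change: str) -> tuple:
    """Split a one-letter change such as 'S12R' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {change!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


# Rows copied from the table above.
for change, score in [("S12R", 0.999), ("D14N", 0.994), ("N13S", 0.072)]:
    print(parse_protein_change(change), score, classify(score))
```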

Section 7: Biological Pathways & Gene Ontology

Pathway Membership (Reactome)
Total pathway count: 21

Pathway ID | Pathway Name | Disease Pathway
R-HSA-2173789 | TGF-beta receptor signaling activates SMADs | No
R-HSA-2173796 | SMAD2/SMAD3:SMAD4 heterotrimer regulates transcription | No
R-HSA-2173795 | Downregulation of SMAD2/3:SMAD4 transcriptional activity | No
R-HSA-201451 | Signaling by BMP | No
R-HSA-1502540 | Signaling by Activin | No
R-HSA-1181150 | Signaling by NODAL | No
R-HSA-452723 | Transcriptional regulation of pluripotent stem cells | No
R-HSA-9733709 | Cardiogenesis | No
R-HSA-9754189 | Germ layer formation at gastrulation | No
R-HSA-9823730 | Formation of definitive endoderm | No
R-HSA-8941326 | RUNX2 regulates bone development | No
R-HSA-8941855 | RUNX3 regulates CDKN1A transcription | No
R-HSA-8952158 | RUNX3 regulates BCL2L11 (BIM) transcription | No
R-HSA-9615017 | FOXO-mediated transcription (oxidative stress, metabolic, neuronal) | No
R-HSA-9617828 | FOXO-mediated transcription of cell cycle genes | No
R-HSA-5689880 | Ub-specific processing proteases | No
R-HSA-3311021 | SMAD4 MH2 Domain Mutants in Cancer | Yes
R-HSA-3315487 | SMAD2/3 MH2 Domain Mutants in Cancer | Yes
R-HSA-9735871 | SARS-CoV-1 targets host intracellular signalling | Yes
R-HSA-9839394 | TGFBR3 expression | No
R-HSA-9844594 | EBF2 regulation of adipocyte differentiation | No

Gene Ontology Annotations
Total GO annotations: 110+

Biological Process (TOP 20)

GO ID | Term
GO:0007179 | transforming growth factor beta receptor signaling pathway
GO:0030509 | BMP signaling pathway
GO:0032924 | activin receptor signaling pathway
GO:0060395 | SMAD protein signal transduction
GO:0045944 | positive regulation of transcription by RNA polymerase II
GO:0000122 | negative regulation of transcription by RNA polymerase II
GO:0006357 | regulation of transcription by RNA polymerase II
GO:0001837 | epithelial to mesenchymal transition
GO:0010718 | positive regulation of epithelial to mesenchymal transition
GO:0008285 | negative regulation of cell population proliferation
GO:0030308 | negative regulation of cell growth
GO:0001649 | osteoblast differentiation
GO:0014033 | neural crest cell differentiation
GO:0001701 | in utero embryonic development
GO:0001702 | gastrulation with mouth forming second
GO:0003148 | outflow tract septum morphogenesis
GO:0003161 | cardiac conduction system development
GO:0060412 | ventricular septum morphogenesis
GO:0007283 | spermatogenesis
GO:0001541 | ovarian follicle development

Molecular Function (TOP 20)

GO ID | Term
GO:0003700 | DNA-binding transcription factor activity
GO:0000981 | DNA-binding transcription factor activity, RNA polymerase II-specific
GO:0001228 | DNA-binding transcription activator activity, RNA polymerase II-specific
GO:0043565 | sequence-specific DNA binding
GO:0000976 | transcription cis-regulatory region binding
GO:0000978 | RNA polymerase II cis-regulatory region sequence-specific DNA binding
GO:0003682 | chromatin binding
GO:0042802 | identical protein binding
GO:0042803 | protein homodimerization activity
GO:0070411 | I-SMAD binding
GO:0070412 | R-SMAD binding
GO:0001222 | transcription corepressor binding
GO:0001223 | transcription coactivator binding
GO:0061629 | RNA polymerase II-specific DNA-binding transcription factor binding
GO:0005518 | collagen binding
GO:0031005 | filamin binding
GO:0043199 | sulfate binding
GO:0046872 | metal ion binding

Cellular Component (TOP 15)

GO ID | Term
GO:0005634 | nucleus
GO:0005654 | nucleoplasm
GO:0005737 | cytoplasm
GO:0005829 | cytosol
GO:0005667 | transcription regulator complex
GO:0071141 | SMAD protein complex
GO:0071144 | heteromeric SMAD protein complex
GO:0032444 | activin responsive factor complex
GO:0005813 | centrosome
GO:0036064 | ciliary basal body
GO:0000785 | chromatin

Section 8: Protein Interactions & Molecular Networks

Protein-Protein Interactions
Total STRING interactions: 4,822
Total IntAct interactions: 1,627
Total BioGRID interactions: 850+

TOP 50 Highest-Confidence STRING Interactors

UniProt ID | Gene | Confidence Score
P35222 | CTNNB1 (β-catenin) | 994
O75593 | FOXH1 | 990
P12756 | SKI | 990
Q15796 | SMAD2 | 990
P84022 | SMAD3 | 989
Q9UJU2 | LEF1 | 989
O15198 | SMAD9 | 988
P05412 | JUN | 988
Q99717 | SMAD5 | 988
P41235 | HNF4A | 987
P36897 | TGFBR1 | 975
Q16254 | E2F4 | 972
P37173 | TGFBR2 | 961
O43541 | SMAD6 | 958
O95863 | SNAI1 | 957
Q09472 | EP300 | 925
P01137 | TGFB1 | 917
Q9UER7 | DAXX | 915
P36894 | BMPR1A | 912
P36896 | ACVR1B | 911
O43524 | FOXO3 | 909
P01116 | KRAS | 909
P60484 | PTEN | 909
P12644 | BMP4 | 907
Q9UI36 | DACH1 | 902
Q13547 | HDAC1 | 894
Q13873 | BMPR2 | 889
P37023 | ACVRL1 | 884
P12643 | BMP2 | 877
P04637 | TP53 | 873
P08112 | GDF2 | 870
Q15329 | E2F5 | 870
P01106 | MYC | 868
P21359 | NF1 | 865
Q92793 | CREBBP | 863
P42771 | CDKN2A | 851
Q04771 | ACVR1 | 848
Q15672 | TWIST1 | 845
Q15831 | STK11 | 840
O43623 | SNAI2 | 839
P03372 | ESR1 | 833
Q9UPN9 | TRIM33 | 824
P01100 | FOS | 822
P17676 | CEBPB | 818
Q9HAU4 | SMURF2 | 813
P49716 | CEBPD | 811
Q9HCE7 | SMURF1 | 811
P10275 | AR | 809
P46531 | NOTCH1 | 809
P78366 | WWTR1 | 809
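
The confidence scores above are on STRING's 0 to 1000 scale. A minimal sketch of how such scores might be binned is shown below, assuming the conventional STRING cutoffs (400 medium, 700 high, 900 highest); those cutoffs and the helper names are assumptions for illustration, not values taken from this report.

```python
# Assumed STRING confidence bands on the 0-1000 combined-score scale.
BANDS = [(900, "highest"), (700, "high"), (400, "medium"), (150, "low")]


def tier(score: int) -> str:
    """Return the confidence band for a STRING combined score."""
    if not 0 <= score <= 1000:
        raise ValueError("STRING combined scores range from 0 to 1000")
    for cutoff, label in BANDS:
        if score >= cutoff:
            return label
    return "below reporting threshold"


def as_probability(score: int) -> float:
    """Rescale the integer score to the 0-1 range."""
    return score / 1000


# A few rows copied from the table above.
INTERACTORS = {"CTNNB1": 994, "SMAD2": 990, "TGFBR1": 975, "TP53": 873, "NOTCH1": 809}

highest = sorted(g for g, s in INTERACTORS.items() if tier(s) == "highest")
print("highest-confidence partners:", highest)
for gene, score in INTERACTORS.items():
    print(gene, as_probability(score), tier(score))
```

Every row in the TOP 50 list scores at least 809, so under these assumed bands all of them fall in the "high" or "highest" tier.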

Protein Similarity

ESM2 Structural/Embedding Similarity
Total similar proteins: 81

UniProt ID | Similarity Count | Top Similarity | Avg Similarity
O08791 (Smad4 Mouse) | 50 | 1.0000 | 0.9925
Q9H4W6 | 50 | 1.0000 | 0.9925
Q9R1V3 | 50 | 1.0000 | 0.9940
Q15797 (SMAD1) | 50 | 1.0000 | 0.9937
P97471 | 50 | 1.0000 | 0.9912
O70437 (Smad4 Rat) | 50 | 1.0000 | 0.9911
P03234 | 50 | 1.0000 | 0.9902
P70340 (Smad2) | 50 | 0.9999 | 0.9935
P97588 (SMAD6) | 50 | 0.9997 | 0.9941
Q13761 (RUNX1T1) | 50 | 0.9997 | 0.9913

DIAMOND Sequence Similarity
Total homologous proteins: 35

UniProt ID | Gene | Identity (%) | Bitscore
P84022 | SMAD3 | 100.00 | 891
P84024 | SMAD3 | 100.00 | 891
P84025 | SMAD3 | 100.00 | 891
Q99717 | SMAD5 | 100.00 | 909
Q8BUN5 | Smad3 (mouse) | 100.00 | 891
Q5R7C0 | SMAD2 | 99.80 | 946
O70436 | Smad4 (mouse) | 99.80 | 945
Q15796 | SMAD2 | 99.80 | 946
Q15797 | SMAD1 | 99.60 | 930
O70437 | Smad4 (rat) | 99.50 | 1092
P97471 | Smad4 | 99.50 | 1092
Q1HE26 | SMAD4 | 99.10 | 1089
O43541 | SMAD6 | 92.00 | 904
O35182 | Smad6 | 92.00 | 899
O15198 | SMAD9 | 89.10 | 849

Section 9: Transcription Factor Regulatory Data

SMAD4 as Transcription Factor

Downstream Targets (Genes Regulated BY SMAD4)
Total target gene count: 172+

Target Gene | Regulation | Confidence
CDKN1A | Activation | High
ID1 | Activation | High
ID2 | Activation | High
HAMP | Activation | High
BBC3 (PUMA) | Activation | High
MET | Activation | High
BAMBI | Activation | High
FSHB | Activation | High
GADD45B | Activation | High
DLX3 | Activation | High
DSPP | Activation | High
CDX2 | Activation | High
BMP4 | Activation |
BCL2L11 | Activation |
LEP | Activation |
CCL20 | Activation |
CLDN1 | Repression | High
LAMB3 | Repression | High
DAP | Repression | High
AR | Repression |
DACH1 | Repression |
CCND1 | Unknown |
CDH1 | Unknown | High
MMP2 | Unknown |

DNA Binding Profiles (JASPAR)

SMAD4 binds DNA as part of heteromeric complexes with R-SMADs (SMAD2/3 or SMAD1/5/9). The binding motif is the SMAD Binding Element (SBE): 5'-GTCT-3' or variants.

Upstream Regulators (TFs that Regulate SMAD4)

TF | Regulation | Confidence
SMAD2 | Activation |
FOXO1 | Activation | Low
FOXO3 | Activation | Low
HIF1A | Activation |
KAT2B | Activation |
KLF10 | Repression | High
OVOL2 | Repression |
ETS1 | Unknown |
FOS | Unknown |
GLI1 | Unknown |
HDAC4 | Unknown |
JUNB | Unknown |
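
The SBE core motif quoted under DNA Binding Profiles (5'-GTCT-3') can be located in a sequence with a simple two-strand scan. The following is a minimal sketch, not a motif-scoring method: it matches only the exact four-base core and its reverse complement (AGAC), ignores the "variants" mentioned in the text, and uses a made-up example sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sbe(seq: str, motif: str = "GTCT") -> list:
    """Return (0-based offset, strand) for exact motif hits on either strand."""
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Hypothetical sequence containing the palindromic arrangement GTCTAGAC.
example = "TTGTCTAGACAA"
print(find_sbe(example))
```

For the hypothetical sequence above, the scan reports a forward-strand hit at offset 2 and a reverse-strand hit at offset 6.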

Section 10: Drug & Pharmacology Data

ChEMBL Target Information

Target ID | Name | Type
CHEMBL5725109 | Mothers against decapentaplegic homolog 4 | SINGLE PROTEIN

Associated assays: 6

PharmGKB Status

PharmGKB ID | Symbol | VIP Status | CPIC Guideline
PA30527 | SMAD4 | Very Important Pharmacogene (VIP) ✓ | No

Note: SMAD4 is designated as a Very Important Pharmacogene due to its role in TGF-β signaling, which is relevant to cancer drug response and targeted therapies. No specific CPIC dosing guidelines currently exist.

Drug Targeting Information

SMAD4 itself is not a direct drug target (as a transcription factor, it is generally considered "undruggable"). However:
  • The TGF-β/SMAD signaling pathway is therapeutically targeted
  • SMAD4 loss/mutation status affects response to various cancer therapies
  • Several TGF-β receptor inhibitors indirectly modulate SMAD4-dependent signaling

Section 11: Expression Profiles

Tissue Expression (Bgee)
Expression breadth: UBIQUITOUS
Total present calls: 288
Max expression score: 98.52
Average expression score: 88.12

TOP 30 Tissues by Expression Score

Tissue | Expression Score | Quality
Ventricular zone | 98.52 | Gold
Ganglionic eminence | 97.77 | Gold
Calcaneal tendon | 97.63 | Gold
Adrenal tissue | 96.97 | Gold
Colonic epithelium | 96.66 | Gold
Stromal cell of endometrium | 96.05 | Gold
Tendon | 95.93 | Gold
Cortical plate | 95.88 | Gold
Left ovary | 95.79 | Gold
Gall bladder | 95.77 | Gold
Rectum | 95.74 | Gold
Popliteal artery | 95.65 | Gold
Tibial artery | 95.63 | Gold
Embryo | 95.54 | Gold
Body of uterus | 95.38 | Gold
Endocervix | 95.35 | Gold
Islet of Langerhans | 95.31 | Gold
Descending thoracic aorta | 95.23 | Gold
Right ovary | 95.18 | Gold
Mucosa of stomach | 94.97 | Gold
Aorta | 94.95 | Gold
Left lobe of thyroid gland | 94.90 | Gold
Right lobe of thyroid gland | 94.88 | Gold
Parietal pleura | 94.86 | Gold
Right coronary artery | 94.83 | Gold
Thyroid gland | 94.79 | Gold
Endometrium | 94.78 | Gold
Ovary | 94.70 | Gold
Gastrocnemius | 94.66 | Gold
Muscle of leg | 94.64 | Gold

Expression pattern: SMAD4 is ubiquitously expressed across virtually all human tissues with consistently high expression scores.

Cell Type Expression (Bgee)

Cell Type | Expression Score | Quality
Stromal cell of endometrium | 96.05 | Gold
Germinal epithelium of ovary | 92.57 | Gold
Monocyte | 91.68 | Gold

Single-Cell Expression Data

Dataset ID | Description | Species | Cells
E-MTAB-6142 | Transcriptomic characterization of human cell cycle | Homo sapiens | 96

Section 12: Disease Associations

Mendelian/Monogenic Disease Links (GenCC)

Disease | OMIM/Orphanet | Classification | Inheritance | Source
Juvenile polyposis syndrome | OMIM:174900 | Definitive/Strong | Autosomal dominant | G2P, Invitae, PanelApp
Juvenile polyposis/HHT syndrome | OMIM:175050 | Definitive/Strong | Autosomal dominant | G2P, Invitae, PanelApp
Myhre syndrome | OMIM:139210 | Definitive/Strong | Autosomal dominant | G2P, Invitae, PanelApp
Hereditary hemorrhagic telangiectasia | ORPHANET:774 | Supportive | Autosomal dominant | Orphanet
Familial thoracic aortic aneurysm and dissection | ORPHANET:91387 | Supportive | Autosomal dominant | Orphanet
Familial pancreatic carcinoma | ORPHANET:1333 | Associated | | Orphanet
Orphanet Disease Associations
Orphanet IDDisease NameTypeGene CountPhenotype Count
2588Myhre syndromeMalformation syndrome147
329971Generalized juvenile polyposisClinical subtype311
774Hereditary hemorrhagic telangiectasiaDisease436
91387Familial thoracic aortic aneurysm/dissectionDisease2042
1333Familial pancreatic carcinomaDisease924

Phenotype Associations (HPO)
Total HPO terms: 235

HPO ID | Phenotype
HP:0000006 | Autosomal dominant inheritance
HP:0001009 | Telangiectasia
HP:0000214 | Lip telangiectasia
HP:0000227 | Tongue telangiectasia
HP:0000434 | Nasal mucosa telangiectasia
HP:0000524 | Conjunctival telangiectasia
HP:0000421 | Epistaxis
HP:0001342 | Cerebral hemorrhage
HP:0001249 | Intellectual disability
HP:0001263 | Global developmental delay
HP:0000252 | Microcephaly
HP:0001629 | Ventricular septal defect
HP:0001631 | Atrial septal defect
HP:0001635 | Congestive heart failure
HP:0001394 | Cirrhosis
HP:0001409 | Portal hypertension
HP:0000175 | Cleft palate
HP:0001156 | Brachydactyly
HP:0000470 | Short neck
HP:0000278 | Retrognathia
(+215 more phenotypes)

GWAS Associations

Study ID | Trait | P-value | Mapped Region
GCST010043 | Asthma | 2.0e-09 | SMAD4
GCST007800 | Asthma (childhood onset) | 4.0e-13 | SMAD4 - SRSF10P1
GCST007798 | Asthma | 6.0e-08 | SMAD4 - SRSF10P1
GCST002758 | Oligoclonal band status in MS | 8.0e-07 | ELAC1 - SMAD4
GCST001547 | Immune response to anthrax vaccine | 1.0e-06 | SRSF10P1 - MEX3C
GCST009391 | Metabolite levels | 5.0e-06 | SRSF10P1 - MEX3C
GCST007797 | Asthma onset (childhood vs adult) | 5.0e-06 | SMAD4 - SRSF10P1

MSigDB Gene Set Memberships
Total gene sets: 1,152

Key pathway memberships include:
  • KEGG_PATHWAYS_IN_CANCER
  • REACTOME_SIGNALING_BY_TGF_BETA_RECEPTOR_COMPLEX
  • GOBP_SMAD_PROTEIN_SIGNAL_TRANSDUCTION
  • GOBP_TRANSFORMING_GROWTH_FACTOR_BETA_RECEPTOR_SIGNALING_PATHWAY
  • GOBP_EPITHELIAL_TO_MESENCHYMAL_TRANSITION
  • GOBP_CARDIAC_DEVELOPMENT pathways
  • GOBP_IN_UTERO_EMBRYONIC_DEVELOPMENT

SUMMARY

SMAD4 (Mothers against decapentaplegic homolog 4) is a central mediator of TGF-β superfamily signaling, functioning as a common partner (Co-SMAD) that forms complexes with receptor-regulated SMADs to regulate transcription of target genes.

Key Statistics

Category | Count
Ensembl Transcripts | 49
RefSeq mRNA Transcripts | 6
Protein Length | 552 amino acids
PDB Structures | 12
ClinVar Variants | 2,571
Pathogenic Variants | 200+
AlphaMissense Predictions | 3,631
SpliceAI Predictions | 2,425
Reactome Pathways | 21
GO Annotations | 110+
STRING Interactions | 4,822
HPO Phenotypes | 235
GWAS Associations | 7
MSigDB Gene Sets | 1,152

Clinical Significance

SMAD4 mutations cause:
  • Juvenile Polyposis Syndrome (JPS): hamartomatous GI polyps with cancer risk
  • Hereditary Hemorrhagic Telangiectasia (HHT): vascular malformations
  • JP-HHT syndrome: combined phenotype
  • Myhre syndrome: developmental disorder with distinctive features

SMAD4 is a major tumor suppressor frequently inactivated in pancreatic, colorectal, and other cancers.

Reference compiled from biobtree integrated database queries. Data current as of database version.

I've compiled a comprehensive cross-database identifier and functional mapping reference for human SMAD4 covering all 12 sections. Key highlights:

  • Gene IDs: HGNC:6770 | Ensembl: ENSG00000141646 | Entrez: 4089 | OMIM: 600993
  • Location: Chr 18q21.2, 51,028,528-51,085,045 (+ strand)
  • Transcripts: 49 Ensembl transcripts, 6 RefSeq mRNAs (NM_005359 is MANE Select)
  • Protein: Q13485 (UniProt), 552 aa, contains MH1 and MH2 domains
  • Structures: 12 PDB structures (best resolution: 1.60 Å), AlphaFold model with 74.56 pLDDT
  • Variants: 2,571 ClinVar entries (200+ pathogenic), 3,631 AlphaMissense predictions, 2,425 SpliceAI predictions
  • Pathways: Central to TGF-β/BMP/Activin signaling (21 Reactome pathways)
  • Interactions: 4,822 STRING interactions, key partners include SMAD2/3, SKI, CTNNB1, TGF-β receptors
  • As Transcription Factor: Regulates 172+ target genes including CDKN1A, ID1/2, HAMP
  • Expression: Ubiquitous across all tissues (288/295 conditions, avg score 88.12)
  • Diseases: Juvenile polyposis syndrome, Myhre syndrome, HHT, pancreatic cancer (tumor suppressor)