SMAD4 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human SMAD4 — a definitive lookup resource covering: ### Section 1: Gene identifiers For human gene SMAD4, list ALL gene-level database identifiers. Required: - HGNC ID and approved symbol - Ensembl gene ID (ENSG...) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand (GRCh38) ### Section 2: Transcript identifiers For human gene SMAD4, list ALL transcript-level identifiers. Required: - Ensembl transcripts: ALL ENST IDs with biotype. Total count. - RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select. - CCDS IDs. - For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count. ### Section 3: Protein identifiers For human gene SMAD4 protein product(s), list ALL protein-level identifiers. Required: - UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry. - RefSeq protein: ALL NP_ accessions. - Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID. - Antibody availability: known antibody resources for the protein. ### Section 4: Structure For human gene SMAD4 protein, list ALL structural data. Required: - Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count. - Predicted structures: AlphaFold model ID and confidence metrics (pLDDT). ### Section 5: Cross-species orthologs For human gene SMAD4, list orthologous genes in key model organisms. Organisms: - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. 
cerevisiae): gene ID, symbol ### Section 6: Clinical variants & AI predictions For human gene SMAD4, summarize clinical variants and AI predictions. Clinical variant annotations (ClinVar): - Total variant count (approximate is fine) - Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign - TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: total count + TOP 30 with delta scores if known - Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores. ### Section 7: Pathways & Gene Ontology For human gene SMAD4, list biological pathways and Gene Ontology annotations. Pathway membership: - ALL biological pathways this gene participates in, with pathway IDs and names - Total pathway count Gene Ontology: - Biological Process: count and TOP 20 terms with GO IDs - Molecular Function: count and TOP 20 terms with GO IDs - Cellular Component: count and TOP 20 terms with GO IDs ### Section 8: Protein interactions & networks For human gene SMAD4 protein, summarize protein interactions and networks. Protein-protein interactions (STRING, IntAct, BioGRID, etc.): - Total interaction count (approximate) - TOP 30 highest-confidence interacting proteins with scores/evidence Protein similarity: - Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores - Sequence homology: TOP 20 homologous proteins with identity/similarity ### Section 9: Transcription factor regulatory data For human gene SMAD4, summarize transcription factor regulatory data. If SMAD4 is a transcription factor: - Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence - DNA binding motifs from JASPAR — all known motif IDs and motif family classification. 
Regardless: - Upstream regulators: TFs that regulate SMAD4 — names with evidence type (ChIP-seq / predicted / experimentally validated) If SMAD4 is not a transcription factor, say so briefly and skip the downstream/motif sections. ### Section 10: Drug & pharmacology data For human gene SMAD4 protein as a drug target, summarize pharmacology data. If SMAD4 is a known drug target: - Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase) - Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention - Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any If SMAD4 is not currently a drug target, say so briefly. ### Section 11: Expression profiles For human gene SMAD4, summarize expression profiles. Tissue expression (GTEx, HPA, Bgee, etc.): - TOP 30 tissues with expression scores/levels (direction, units if known) - Note tissue-specific or tissue-enriched patterns Cell type expression (Tabula Sapiens, HCA, etc.): - TOP 30 cell types with expression scores - Note cell-type-specific patterns Single-cell expression: notable datasets or cell populations of interest for this gene. ### Section 12: Disease associations For human gene SMAD4, summarize disease associations. Mendelian / monogenic disease: - Diseases caused by mutations in SMAD4: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level - Include all directly linked conditions Phenotype associations: - Clinical phenotypes associated with the gene (HPO terms where known) - TOP 30 phenotype terms with HPO IDs Complex-disease / GWAS: - Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known - TOP 30 GWAS associations

SMAD4

Executive summary

SMAD4 (HGNC:6770, chromosome 18q21) is the central co-SMAD in TGF-β/BMP signaling: it partners with receptor-activated SMADs as the obligate transcriptional mediator controlling cell growth, differentiation, and development. Its clinical importance rests mainly on its role as a tumor suppressor. Germline mutations cause autosomal dominant juvenile polyposis syndrome, juvenile polyposis/hereditary hemorrhagic telangiectasia syndrome, and Myhre syndrome, while somatic loss is associated with pancreatic, colorectal, gastric, and several other carcinomas. Bgee reports expression in all 288 tissue conditions surveyed (average expression score 88.1), consistent with broad developmental roles; its 21 Reactome pathways include cardiogenesis, germ layer formation, and regulation of stem cell pluripotency. AlphaMissense classifies more than 1,300 of its 3,631 scored missense variants as likely pathogenic, and ClinVar records approximately 2,571 variants in total. Although SMAD4 is well characterized structurally (12 PDB entries, all X-ray) and biologically, no approved drugs or clinical-stage therapeutics target it directly.

Gene identifiers

Field | Value
HGNC ID | HGNC:6770
Approved symbol | SMAD4
Ensembl gene ID | ENSG00000141646
NCBI Entrez Gene ID | 4089
OMIM gene/locus ID | 600993
Chromosome | 18
Start position (GRCh38) | 51,028,528
End position (GRCh38) | 51,085,045
Strand | + (forward)

Transcript identifiers

Ensembl Transcripts

Total: 49 ENST IDs

ENST ID | Biotype
ENST00000342988 | protein_coding
ENST00000398417 | protein_coding
ENST00000585448 | retained_intron
ENST00000586253 | retained_intron
ENST00000588745 | protein_coding
ENST00000588860 | protein_coding
ENST00000589076 | protein_coding
ENST00000589706 | retained_intron
ENST00000589941 | protein_coding
ENST00000590061 | protein_coding
ENST00000590499 | retained_intron
ENST00000591126 | retained_intron
ENST00000592186 | nonsense_mediated_decay
ENST00000592911 | protein_coding_CDS_not_defined
ENST00000593223 | protein_coding
ENST00000611848 | nonsense_mediated_decay
ENST00000684953 | retained_intron
ENST00000685090 | retained_intron
ENST00000685232 | retained_intron
ENST00000688307 | retained_intron
ENST00000688574 | retained_intron
ENST00000688903 | retained_intron
ENST00000690892 | retained_intron
ENST00000691124 | retained_intron
ENST00000714260 | nonsense_mediated_decay
ENST00000714261 | protein_coding
ENST00000714262 | nonsense_mediated_decay
ENST00000714263 | nonsense_mediated_decay
ENST00000714264 | protein_coding
ENST00000714265 | nonsense_mediated_decay
ENST00000714266 | protein_coding
ENST00000714267 | nonsense_mediated_decay
ENST00000714268 | protein_coding
ENST00000714269 | protein_coding
ENST00000714270 | protein_coding
ENST00000714271 | nonsense_mediated_decay
ENST00000714272 | protein_coding
ENST00000714273 | nonsense_mediated_decay
ENST00000714274 | nonsense_mediated_decay
ENST00000877432 | protein_coding
ENST00000877433 | protein_coding
ENST00000932317 | protein_coding
ENST00000932318 | protein_coding
ENST00000932319 | protein_coding
ENST00000971068 | protein_coding
ENST00000971069 | protein_coding
ENST00000971070 | protein_coding
ENST00000971071 | protein_coding
ENST00000971072 | protein_coding

RefSeq mRNA Transcripts

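NM Accession | MANE Select
NM_005359 | ✓ Yes
NM_001364967 | No
NM_001364968 | No
NM_001407041 | No
NM_001407042 | No
NM_001407043 | No
NM_019275 | No (appears to be the rat Smad4 mRNA rather than a human accession)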

CCDS ID

CCDS11950

MANE SELECT Transcript Exons

ENST00000342988 (NM_005359) - 12 exons

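Exon ID | Start | End | Strand | Chromosome
ENSE00004023391 | 51030213 | 51030623 | + | 18
ENSE00003669145 | 51046920 | 51047295 | + | 18
ENSE00002877760 | 51048686 | 51048860 | + | 18
ENSE00004013734 | 51049295 | 51049324 | + | 18
ENSE00004013729 | 51054781 | 51054993 | + | 18
ENSE00004013736 | 51058125 | 51058244 | + | 18
ENSE00004013735 | 51058340 | 51058456 | + | 18
ENSE00004013727 | 51059866 | 51059916 | + | 18
ENSE00004013731 | 51065423 | 51065606 | + | 18
ENSE00004013728 | 51067019 | 51067187 | + | 18
ENSE00004013730 | 51076638 | 51076776 | + | 18
ENSE00004013733 | 51078256 | 51085042 | + | 18

Exons are listed in genomic order, which on the forward strand is also transcript order.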
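The exon coordinates can be turned into exon lengths and a total spliced length. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention), so a length is end − start + 1. The values are copied from the table; the names `exons`, `exon_length`, and `ordered` are illustrative only.

```python
# Exon coordinates copied from the MANE Select exon table (GRCh38, chr18, + strand).
# Assumption: coordinates are 1-based and inclusive, as in Ensembl.
exons = [
    ("ENSE00003669145", 51046920, 51047295),
    ("ENSE00002877760", 51048686, 51048860),
    ("ENSE00004013734", 51049295, 51049324),
    ("ENSE00004013729", 51054781, 51054993),
    ("ENSE00004013735", 51058340, 51058456),
    ("ENSE00004013736", 51058125, 51058244),
    ("ENSE00004013731", 51065423, 51065606),
    ("ENSE00004013728", 51067019, 51067187),
    ("ENSE00004013730", 51076638, 51076776),
    ("ENSE00004013727", 51059866, 51059916),
    ("ENSE00004013733", 51078256, 51085042),
    ("ENSE00004023391", 51030213, 51030623),
]


def exon_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


# On the forward strand, ascending genomic start equals transcript order.
ordered = sorted(exons, key=lambda exon: exon[1])
total_length = sum(exon_length(start, end) for _, start, end in ordered)

for rank, (exon_id, start, end) in enumerate(ordered, start=1):
    print(rank, exon_id, start, end, exon_length(start, end))
print("exon count:", len(ordered), "| spliced length (bp):", total_length)
```

Sorting before numbering matters here because the source lists the exons out of genomic order.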

Protein identifiers

UniProt Accessions

Canonical (Reviewed):

  • Q13485 ✓ (SwissProt reviewed)

Unreviewed isoforms/alternative entries:

  • A0A024R274
  • A0A087WUF3
  • A0AAQ5BHQ0
  • A0AAQ5BHQ2
  • A0AAQ5BHS5
  • A0AAQ5BHS9
  • A0AAQ5BHT7
  • A0AAQ5BHU0
  • A0AAQ5BHU1
  • A0AAQ5BHU2
  • A0AAQ5BHU9
  • A0AAQ5BHV1
  • A0AAQ5BHV2
  • A0AAQ5BHV5
  • A0AAQ5BHW2
  • A0AAQ5BHY6
  • A0AAQ5BHZ6
  • K7EIJ2
  • K7EIU8
  • K7EL15
  • K7EL18
  • K7ELK2
  • K7ENG1
  • K7ES96

RefSeq Proteins (NP_/XP_ accessions)

Reviewed/MANE Select:

  • NP_005350 (MANE SELECT, canonical)
  • NP_001393970
  • NP_001393971
  • NP_001393972

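Validated:

  • NP_001351896
  • NP_001351897
  • NP_032566 (appears to be the mouse Smad4 protein rather than a human accession)
  • NP_062148 (appears to be the rat Smad4 protein rather than a human accession)

Predicted:

  • XP_006525764 (accession range suggests a mouse model protein; verify before use)
  • XP_011245157 (accession range suggests a mouse model protein; verify before use)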

Protein Domains and Families

InterPro:

  • IPR001132: SMAD domain (Domain)
  • IPR003619: MAD homology 1, Dwarfin-type (Domain)
  • IPR008984: SMAD/FHA domain superfamily (Homologous superfamily)
  • IPR013019: MAD homology, MH1 (Domain)
  • IPR013790: SMAD/Dwarfins (Family)
  • IPR017855: SMAD-like domain superfamily (Homologous superfamily)
  • IPR036578: SMAD MH1 domain superfamily (Homologous superfamily)

Pfam:

  • PF03165
  • PF03166

SMART:

  • SM00523
  • SM00524

CDD (Conserved Domain Database):

  • CD10492
  • CD10498

PANTHER:

  • PTHR13703
  • PTHR13703:SF63

Antibody Availability

BioBTree returned no antibody resources for Q13485. Consult vendor catalogs directly (e.g., Abcam, Santa Cruz Biotechnology, OriGene, Sigma-Aldrich).

Structure

Experimental Structures

Total: 12 PDB structures (all X-ray diffraction)

PDB ID | Title | Method | Resolution (Å)
1DD1 | CRYSTAL STRUCTURE ANALYSIS OF THE SMAD4 ACTIVE FRAGMENT | X-ray | 2.62
1G88 | S4AFL3ARG515 MUTANT | X-ray | 3.00
1MR1 | Crystal Structure of a Smad4-Ski Complex | X-ray | 2.85
1U7F | Crystal Structure of the phosphorylated Smad3/Smad4 heterotrimeric complex | X-ray | 2.60
1U7V | Crystal Structure of the phosphorylated Smad2/Smad4 heterotrimeric complex | X-ray | 2.70
1YGS | CRYSTAL STRUCTURE OF THE SMAD4 TUMOR SUPPRESSOR C-TERMINAL DOMAIN | X-ray | 2.10
5C4V | Ski-like protein | X-ray | 2.60
5MEY | Crystal structure of Smad4-MH1 bound to the GGCGC site | X-ray | 2.05
5MEZ | Crystal structure of Smad4-MH1 bound to the GGCT site | X-ray | 2.98
5MF0 | Crystal structure of Smad4-MH1 bound to the GGCCG site | X-ray | 3.03
5UWU | Crystal Structure of SMAD4 NES Peptide in complex with CRM1-Ran-RanBP1 | X-ray | 2.24
6YIC | 14-3-3 sigma in complex with SMAD4 pS403 peptide | X-ray | 1.60
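When choosing a structure to work from, it can help to rank the entries by resolution (lower is better). This is a small sketch using the resolutions listed in the table; the dictionary name `resolutions` and the summary variables are illustrative.

```python
from statistics import median

# Resolutions (Å) copied from the PDB table; all entries are X-ray structures.
resolutions = {
    "1DD1": 2.62, "1G88": 3.00, "1MR1": 2.85, "1U7F": 2.60,
    "1U7V": 2.70, "1YGS": 2.10, "5C4V": 2.60, "5MEY": 2.05,
    "5MEZ": 2.98, "5MF0": 3.03, "5UWU": 2.24, "6YIC": 1.60,
}

# Lower resolution values mean finer detail, so "best" is the minimum.
best_id = min(resolutions, key=resolutions.get)
worst_id = max(resolutions, key=resolutions.get)
median_resolution = median(resolutions.values())

for pdb_id, res in sorted(resolutions.items(), key=lambda item: item[1]):
    print(f"{pdb_id}  {res:.2f} Å")
print("best:", best_id, "| worst:", worst_id, "| median:", round(median_resolution, 2))
```

Resolution alone does not say which part of SMAD4 an entry covers. Several of these are peptide or MH1/MH2 fragment complexes, so the titles should be read alongside the ranking.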

Predicted Structures

AlphaFold2 Model

  • Model ID: Q13485
  • pLDDT (global): 74.56
  • Fraction with pLDDT ≥90 (very high confidence): 53.0%
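The two figures above can be read together. The sketch below maps a pLDDT value to the confidence bands commonly used by the AlphaFold database (cut-offs at 90, 70, and 50; the band labels are an assumption, since the source reports only the ≥90 band). It also derives an upper bound on the mean pLDDT of the residues outside that band. The function and variable names are illustrative.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to a confidence band (assumed cut-offs: 90/70/50)."""
    if not 0.0 <= plddt <= 100.0:
        raise ValueError(f"pLDDT must be in [0, 100], got {plddt}")
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


global_plddt = 74.56    # global mean reported for Q13485
frac_very_high = 0.53   # fraction of residues with pLDDT >= 90

band = plddt_band(global_plddt)

# If 53% of residues score at least 90, they contribute at least 0.53 * 90 to the
# mean, so the remaining residues can average at most this value:
rest_upper_bound = (global_plddt - frac_very_high * 90) / (1 - frac_very_high)

print(f"global pLDDT {global_plddt} -> {band}")
print(f"mean pLDDT of remaining residues is at most {rest_upper_bound:.1f}")
```

The bound shows that the global mean hides a split between a well-predicted majority and a much less confident remainder. Per-residue scores would be needed to locate the low-confidence regions.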

Cross-species orthologs

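Organism | Gene ID | Symbol
Mouse (Mus musculus) | ENSMUSG00000024515 | Smad4
Rat (Rattus norvegicus) | ENSRNOG00000051965 | Smad4
Zebrafish (Danio rerio) | ENSDARG00000012649, ENSDARG00000075226 | smad4b, smad4a
Fruit fly (Drosophila melanogaster) | FBgn0020493 | Dad
Worm (C. elegans) | WBGene00000910, WBGene00006445 | daf-14, tag-68
Yeast (S. cerevisiae) | none | -

Note: Dad (Daughters against dpp) is an inhibitory SMAD. The Drosophila co-SMAD usually treated as the SMAD4 ortholog is Medea, and the C. elegans co-SMADs are sma-4 and daf-3, so the fly and worm rows should be checked against the source ortholog databases before use.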

Clinical variants & AI predictions

Clinical Variants (ClinVar)

Summary

  • Total variants: ~2,571
  • Classifications from sample data:
    • Pathogenic: 9
    • Likely Pathogenic: 3
    • Uncertain Significance: 70+
    • Benign/Likely Benign: 15+
    • Conflicting: 1

Pathogenic/Likely Pathogenic Variants (12 retrieved)

The query returned 12 pathogenic or likely pathogenic records, not 30, and no associated condition for any of them.

ClinVar ID | HGVS Notation | Variant Type | Classification | Associated Condition
1070021 | NC_000018.9:g.(?_48604616)_(48604837_?)del | Deletion | Pathogenic | not returned
1070022 | NC_000018.9:g.(?_48593383)_(48604842_?)del | Deletion | Pathogenic | not returned
1073157 | NM_005359.6(SMAD4):c.620del (p.Asn207fs) | Deletion | Pathogenic | not returned
1074223 | NM_005359.6(SMAD4):c.779dup (p.Tyr260Ter) | Duplication | Pathogenic | not returned
1074875 | NM_005359.6(SMAD4):c.223del (p.Gln75fs) | Deletion | Pathogenic | not returned
1075033 | NM_005359.6(SMAD4):c.1494dup (p.Cys499fs) | Duplication | Pathogenic | not returned
1076671 | NM_005359.6(SMAD4):c.968G>A (p.Trp323Ter) | SNV | Pathogenic | not returned
1076948 | NM_005359.6(SMAD4):c.563del (p.Asn188fs) | Deletion | Pathogenic | not returned
1050635 | NM_005359.6(SMAD4):c.733C>T (p.Gln245Ter) | SNV | Pathogenic | not returned
1032991 | NM_005359.6(SMAD4):c.803_804insCTAAGTGGTAGTA (p.Trp268delinsCysTer) | Insertion | Likely Pathogenic | not returned
1066208 | NM_005359.6(SMAD4):c.1149A>G (p.Ile383Met) | SNV | Likely Pathogenic | not returned
1067853 | NM_005359.6(SMAD4):c.1308+2T>G | SNV | Likely Pathogenic | not returned

AI-Based Variant Effect Predictions

AlphaMissense Pathogenicity Scores

Summary

  • Total predictions: 3,631
  • Likely pathogenic: 1,300+ (the 30 variants tabulated below come from the first 100 records returned and all fall within residues 9–22, so they are a sample rather than a gene-wide ranking)

30 Likely-Pathogenic Missense Variants (sorted by score)

Genomic Coordinate | Protein Change | Score | Class
18:51047090:C:A | A15D | 1.000 | likely_pathogenic
18:51047107:C:G | H21D | 1.000 | likely_pathogenic
18:51047098:A:C | S18R | 1.000 | likely_pathogenic
18:51047100:C:A | S18R | 1.000 | likely_pathogenic
18:51047100:C:G | S18R | 1.000 | likely_pathogenic
18:51047102:T:A | I19N | 1.000 | likely_pathogenic
18:51047102:T:G | I19S | 1.000 | likely_pathogenic
18:51047110:A:C | S22R | 1.000 | likely_pathogenic
18:51047086:G:C | D14H | 1.000 | likely_pathogenic
18:51047093:G:A | C16Y | 0.999 | likely_pathogenic
18:51047087:A:T | D14V | 0.999 | likely_pathogenic
18:51047080:A:C | S12R | 0.999 | likely_pathogenic
18:51047087:A:C | D14A | 0.998 | likely_pathogenic
18:51047087:A:G | D14G | 0.998 | likely_pathogenic
18:51047090:C:T | A15V | 0.996 | likely_pathogenic
18:51047086:G:A | D14N | 0.994 | likely_pathogenic
18:51047089:G:A | A15T | 0.992 | likely_pathogenic
18:51047089:G:C | A15P | 0.992 | likely_pathogenic
18:51047092:T:A | C16S | 0.992 | likely_pathogenic
18:51047101:A:C | I19L | 0.990 | likely_pathogenic
18:51047075:C:G | P10R | 0.989 | likely_pathogenic
18:51047081:G:A | S12N | 0.985 | likely_pathogenic
18:51047090:C:G | A15G | 0.984 | likely_pathogenic
18:51047075:C:A | P10Q | 0.982 | likely_pathogenic
18:51047085:T:A | N13K | 0.980 | likely_pathogenic
18:51047085:T:G | N13K | 0.980 | likely_pathogenic
18:51047078:C:A | T11K | 0.951 | likely_pathogenic
18:51047075:C:T | P10L | 0.948 | likely_pathogenic
18:51047074:C:T | P10S | 0.933 | likely_pathogenic
18:51047072:C:A | T9K | 0.786 | likely_pathogenic
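The variant keys in this table follow a `chromosome:position:ref:alt` pattern, and the class column follows from the score. The sketch below parses a key and reproduces the class label. The cut-offs 0.34 and 0.564 are the class boundaries published with AlphaMissense; treat them as an assumption here, since the source reports only the labels. All names (`Variant`, `parse_variant_key`, `am_class`) are illustrative.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_key(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


# Assumed AlphaMissense class cut-offs (not stated in the source report).
LIKELY_BENIGN_MAX = 0.34
LIKELY_PATHOGENIC_MIN = 0.564


def am_class(score: float) -> str:
    """Assign an AlphaMissense-style class from an am_pathogenicity score."""
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    return "ambiguous"


# Three rows copied from the table: (variant key, protein change, score).
rows = [
    ("18:51047072:C:A", "T9K", 0.786),
    ("18:51047090:C:A", "A15D", 1.000),
    ("18:51047074:C:T", "P10S", 0.933),
]

# Rank by score, highest first, and show the parsed position and derived class.
ranked = sorted(rows, key=lambda row: row[2], reverse=True)
for key, change, score in ranked:
    variant = parse_variant_key(key)
    print(variant.pos, f"{variant.ref}>{variant.alt}", change, score, am_class(score))
```

Ranking by score rather than by position is what a "top N" list needs; sorting by position would instead group the variants by residue.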

SpliceAI Predictions

Summary

  • Total splice effect predictions: 2,425
  • Effects: donor gain, donor loss, acceptor gain, acceptor loss

Top 30 Splice Variants by Delta Score

The lower-scoring rows (down to 0.20) are included to fill the list and should not be read as high-confidence predictions.

Genomic Coordinate | Gene | Effect Type | Delta Score
18:51028631:C:G | SMAD4 | donor_gain | 0.9200
18:51028666:G:GG | SMAD4 | donor_gain | 0.9100
18:51029741:G:GT | SMAD4 | donor_gain | 0.7600
18:51029742:A:T | SMAD4 | donor_gain | 0.7500
18:51029770:G:GT | SMAD4 | donor_gain | 0.7200
18:51030152:G:GT | SMAD4 | donor_gain | 0.6600
18:51028588:G:GT | SMAD4 | donor_gain | 0.6400
18:51028670:G:GT | SMAD4 | donor_gain | 0.6000
18:51028665:T:G | SMAD4 | donor_gain | 0.5800
18:51028662:A:T | SMAD4 | donor_gain | 0.5600
18:51030049:C:A | SMAD4 | donor_gain | 0.5400
18:51028671:A:T | SMAD4 | donor_gain | 0.5100
18:51029547:A:AT | SMAD4 | acceptor_gain | 0.4900
18:51029852:TCCTC:T | SMAD4 | donor_gain | 0.4700
18:51028663:TCT:T | SMAD4 | donor_gain | 0.4500
18:51030089:GCGCC:G | SMAD4 | donor_gain | 0.4400
18:51030084:T:G | SMAD4 | donor_gain | 0.4300
18:51030080:GCCCT:G | SMAD4 | donor_gain | 0.4200
18:51030085:G:GG | SMAD4 | donor_gain | 0.4200
18:51029548:G:T | SMAD4 | acceptor_gain | 0.4200
18:51029645:C:T | SMAD4 | donor_gain | 0.3900
18:51030048:T:TA | SMAD4 | donor_gain | 0.3800
18:51028670:G:A | SMAD4 | donor_loss | 0.3700
18:51030128:G:GT | SMAD4 | donor_gain | 0.3600
18:51029555:G:T | SMAD4 | acceptor_gain | 0.3100
18:51030128:G:T | SMAD4 | donor_gain | 0.2700
18:51030144:G:GT | SMAD4 | donor_gain | 0.2700
18:51028619:T:A | SMAD4 | donor_gain | 0.2300
18:51029514:T:TA | SMAD4 | acceptor_gain | 0.2000
18:51029515:A:AA | SMAD4 | acceptor_gain | 0.2000
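The delta scores in this table span a wide range, so it helps to bin them before treating any as high confidence. The sketch below uses the cut-offs commonly quoted for SpliceAI (0.2 high recall, 0.5 recommended, 0.8 high precision). These are an assumption here, since the source does not state which threshold it applied, and the tier names are illustrative.

```python
from collections import Counter

# The 30 delta scores copied from the table.
scores = [
    0.92, 0.91, 0.76, 0.75, 0.72, 0.58, 0.56, 0.54, 0.37, 0.39,
    0.38, 0.51, 0.45, 0.47, 0.44, 0.43, 0.42, 0.42, 0.49, 0.42,
    0.31, 0.36, 0.27, 0.27, 0.66, 0.64, 0.60, 0.20, 0.20, 0.23,
]


def confidence_tier(delta: float) -> str:
    """Bin a SpliceAI delta score using assumed cut-offs of 0.8 / 0.5 / 0.2."""
    if delta >= 0.8:
        return "high_precision"
    if delta >= 0.5:
        return "recommended"
    if delta >= 0.2:
        return "high_recall"
    return "below_threshold"


# Count how many of the listed variants fall in each tier.
tiers = Counter(confidence_tier(score) for score in scores)
for name in ("high_precision", "recommended", "high_recall", "below_threshold"):
    print(f"{name}: {tiers[name]}")
```

Under these assumed cut-offs, fewer than half of the listed variants reach the 0.5 level, which is worth keeping in mind when prioritizing variants for follow-up.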

Pathways & Gene Ontology

Biological Pathways

Reactome Pathways: 21 total

Pathway ID | Pathway Name
R-HSA-2173789 | TGF-beta receptor signaling activates SMADs
R-HSA-2173796 | SMAD2/SMAD3:SMAD4 heterotrimer regulates transcription
R-HSA-2173795 | Downregulation of SMAD2/3:SMAD4 transcriptional activity
R-HSA-201451 | Signaling by BMP
R-HSA-1502540 | Signaling by Activin
R-HSA-1181150 | Signaling by NODAL
R-HSA-452723 | Transcriptional regulation of pluripotent stem cells
R-HSA-3311021 | SMAD4 MH2 Domain Mutants in Cancer
R-HSA-3315487 | SMAD2/3 MH2 Domain Mutants in Cancer
R-HSA-9733709 | Cardiogenesis
R-HSA-9823730 | Formation of definitive endoderm
R-HSA-9754189 | Germ layer formation at gastrulation
R-HSA-8941326 | RUNX2 regulates bone development
R-HSA-8941855 | RUNX3 regulates CDKN1A transcription
R-HSA-8952158 | RUNX3 regulates BCL2L11 (BIM) transcription
R-HSA-9617828 | FOXO-mediated transcription of cell cycle genes
R-HSA-9615017 | FOXO-mediated transcription of oxidative stress, metabolic and neuronal genes
R-HSA-9735871 | SARS-CoV-1 targets host intracellular signalling and regulatory pathways
R-HSA-5689880 | Ub-specific processing proteases
R-HSA-9839394 | TGFBR3 expression
R-HSA-9844594 | Transcriptional regulation of brown and beige adipocyte differentiation by EBF2

MSigDB Gene Sets: 100 total (Top 10 featured gene sets from various collections including C5:GO Biological Process, C2:CP Canonical Pathways, and C2 Curated Gene Sets)

Gene Ontology Annotations

Biological Process: 80 terms total

GO ID | Term
GO:0060395 | SMAD protein signal transduction
GO:0007179 | transforming growth factor beta receptor signaling pathway
GO:0030509 | BMP signaling pathway
GO:0032924 | activin receptor signaling pathway
GO:0060391 | positive regulation of SMAD protein signal transduction
GO:0030511 | positive regulation of transforming growth factor beta receptor signaling pathway
GO:0001837 | epithelial to mesenchymal transition
GO:0010718 | positive regulation of epithelial to mesenchymal transition
GO:0071559 | response to transforming growth factor beta
GO:0071560 | cellular response to transforming growth factor beta stimulus
GO:0071773 | cellular response to BMP stimulus
GO:0006357 | regulation of transcription by RNA polymerase II
GO:0045944 | positive regulation of transcription by RNA polymerase II
GO:0000122 | negative regulation of transcription by RNA polymerase II
GO:0045893 | positive regulation of DNA-templated transcription
GO:0006366 | transcription by RNA polymerase II
GO:0030154 | cell differentiation
GO:0001649 | osteoblast differentiation
GO:0048589 | developmental growth
GO:0035556 | intracellular signal transduction
Molecular Function: 18 terms total

GO ID | Term
GO:0070412 | R-SMAD binding
GO:0070411 | I-SMAD binding
GO:0001228 | DNA-binding transcription activator activity, RNA polymerase II-specific
GO:0000981 | DNA-binding transcription factor activity, RNA polymerase II-specific
GO:0003700 | DNA-binding transcription factor activity
GO:0043565 | sequence-specific DNA binding
GO:0001223 | transcription coactivator binding
GO:0001222 | transcription corepressor binding
GO:0000978 | RNA polymerase II cis-regulatory region sequence-specific DNA binding
GO:0000976 | transcription cis-regulatory region binding
GO:0061629 | RNA polymerase II-specific DNA-binding transcription factor binding
GO:0003682 | chromatin binding
GO:0046872 | metal ion binding
GO:0042803 | protein homodimerization activity
GO:0042802 | identical protein binding
GO:0005518 | collagen binding
GO:0031005 | filamin binding
GO:0043199 | sulfate binding

Cellular Component: 12 terms total

GO ID | Term
GO:0071141 | SMAD protein complex
GO:0071144 | heteromeric SMAD protein complex
GO:0005634 | nucleus
GO:0005654 | nucleoplasm
GO:0005737 | cytoplasm
GO:0005667 | transcription regulator complex
GO:0000785 | chromatin
GO:0005829 | cytosol
GO:0005813 | centrosome
GO:0032444 | activin responsive factor complex
GO:0036064 | ciliary basal body
GO:0097542 | ciliary tip

Protein interactions & networks

Protein-Protein Interactions

Total interaction count:

  • STRING: ~4,822 interactions (predicted and experimental)
  • BioGRID: 850 interactions (experimental)
  • IntAct: 1,627 interactions (experimental)

TOP 30 highest-confidence interacting proteins (STRING/BioGRID/IntAct combined):

Rank | Gene | Protein | UniProt | Source Databases
1 | CTNNB1 | Catenin beta-1 | P35222 | STRING, BioGRID
2 | SMAD2 | SMAD family member 2 | Q15796 | STRING, BioGRID, IntAct
3 | SMAD3 | SMAD family member 3 | P84022 | STRING, BioGRID, IntAct
4 | SMAD5 | SMAD family member 5 | Q99717 | STRING, BioGRID, IntAct
5 | SMAD6 | SMAD family member 6 | O43541 | STRING, BioGRID, IntAct
6 | SMAD7 | SMAD family member 7 | O15105 | STRING, BioGRID, IntAct
7 | SMAD9 | SMAD family member 9 | O15198 | STRING, BioGRID, IntAct
8 | ALK4/ACVR1B | Activin receptor type-1B | P36896 | STRING, BioGRID, IntAct
9 | TGFBR1 | TGF-beta receptor type-1 | P36897 | STRING, BioGRID, IntAct
10 | TGFBR2 | TGF-beta receptor type-2 | P37173 | STRING, BioGRID, IntAct
11 | BMPR1A | BMP receptor type-1A | P36894 | STRING, BioGRID, IntAct
12 | BMPR1B | BMP receptor type-1B | O00238 | STRING, BioGRID, IntAct
13 | BMPR2 | BMP receptor type-2 | Q13873 | STRING, BioGRID, IntAct
14 | ACVR1 | Activin receptor type-1 | Q04771 | STRING, BioGRID, IntAct
15 | ACVR2B | Activin receptor type-2B | Q13705 | STRING, BioGRID
16 | LBFR/FOXH1 | Forkhead box protein H1 | O75593 | STRING, BioGRID
17 | LEF1 | Lymphoid enhancer-binding factor 1 | Q9UJU2 | STRING, BioGRID
18 | FOXO3 | Forkhead box protein O3 | O43524 | STRING, BioGRID
19 | JUN | Transcription factor Jun | P05412 | STRING, BioGRID, IntAct
20 | FOS | Protein c-Fos | P01100 | STRING, BioGRID
21 | TP53 | Cellular tumor antigen p53 | P04637 | STRING, BioGRID
22 | RUNX2 | Runt-related transcription factor 2 | Q13950 | STRING, BioGRID
23 | E2F4 | Transcription factor E2F4 | Q16254 | STRING, BioGRID
24 | E2F5 | Transcription factor E2F5 | Q15329 | STRING, BioGRID
25 | SNAI1 | Zinc finger protein SNAI1 | O95863 | STRING, BioGRID
26 | SNAI2 | Zinc finger protein SNAI2 | O43623 | STRING, BioGRID
27 | ZEB1 | Zinc finger E-box-binding homeobox 1 | P37275 | STRING, BioGRID
28 | TWIST1 | Twist-related protein 1 | Q15672 | STRING, BioGRID
29 | TRIM33 | E3 ubiquitin-protein ligase TRIM33 | Q9UPN9 | STRING, BioGRID, IntAct
30 | BMP2/4/6/7 | Bone morphogenetic proteins | P12643–P22004 | STRING, BioGRID, IntAct

Protein Similarity

Structural/embedding similarity (Foldseek/DIAMOND + ESM2):

Rank | Gene | Protein | UniProt | Similarity Type
1 | SMAD1 | SMAD family member 1 | Q15797 | Structural, Embedding
2 | SMAD2 | SMAD family member 2 | Q15796 | Structural, Embedding
3 | SMAD3 | SMAD family member 3 | P84022 | Structural, Embedding
4 | SMAD5 | SMAD family member 5 | Q99717 | Structural, Embedding
5 | SMAD6 | SMAD family member 6 | O43541 | Structural, Embedding
6 | SMAD7 | SMAD family member 7 | O15105 | Structural, Embedding
7 | SMAD9 | SMAD family member 9 | O15198 | Structural, Embedding
8 | NFIX | Nuclear factor 1 X-type | Q14938 | Embedding
9–44 | Orthologs from non-human species | Various Smad4 homologs | Various | Structural

Sequence homology notes:

  • SMAD4 belongs to the SMAD family of signal transduction proteins, which has 8 human members (SMAD1–7 and SMAD9, the last also known as SMAD8)
  • Highest sequence identity with SMAD2, SMAD3 (TGF-β pathway SMADs)
  • Cross-species orthologs identified in mammals, birds, fish, and invertebrates
  • Core conserved domains: MH1 (DNA-binding), MH2 (protein-protein interaction)

Summary: SMAD4 functions as the central hub of TGF-β/BMP signaling, interacting with R-SMADs (SMAD1, 2, 3, 5, and 9, the last also known as SMAD8) and I-SMADs (SMAD6, 7), as well as transcriptional regulators (RUNX2, E2F4/5, forkhead factors such as FOXH1 and FOXO3, and the Snail-family zinc fingers SNAI1/2). The extensive STRING network (~4,822 interactions) reflects its role as the common mediator in diverse developmental and disease pathways.

Transcription factor regulatory data

SMAD4 is a transcription factor with DNA binding capability.

DNA Binding Motifs (JASPAR)

JASPAR holds one SMAD4 matrix profile, MA1153, in two versions:

  • MA1153.2 (family: SMAD factors; class: SMAD/NF-1 DNA-binding domain factors; SMiLE-seq; version 2)
  • MA1153.1 (family: SMAD factors; class: SMAD/NF-1 DNA-binding domain factors; SMiLE-seq; version 1)

Both versions belong to the JASPAR CORE vertebrates collection and are derived from mouse (Mus musculus) SMiLE-seq data.

Downstream Targets

Total count: 100 genes

30 of the 100 targets, listed alphabetically, with regulation type and evidence confidence:

Target Gene | Regulation | Confidence
ADAM2 | Unknown | High
ALPI | Activation | High
ANKRD1 | Activation | High
APOC3 | Activation | High
BAMBI | Activation | High
BBC3 | Activation | High
CDH2 | Activation | High
CDKN1A | Activation | High
CDX2 | Activation | High
COL1A2 | Unknown | High
COL7A1 | Unknown | High
DLX3 | Activation | High
DMP1 | Activation | High
DSPP | Activation | High
FSCN1 | Activation | High
FSHB | Activation | High
GADD45B | Activation | High
HAMP | Activation | High
IHH | Activation | High
IL10 | Activation | High
ITGB5 | Activation | High
LAMA3 | Activation | High
LAMB3 | Repression | High
LAMC2 | Activation | High
MET | Activation | High
MST1R | Activation | High
MSX2 | Activation | High
MUC4 | Unknown | High
MYC | Unknown | High
MYCN | Activation | High

Upstream Regulators (TFs that regulate SMAD4)

Upstream TF | Regulation (Evidence)
SMAD2 | Activation (curated)
KLF10 | Repression (curated, High confidence)
HIF1A | Activation (curated)
FOXO1 | Activation (curated, Low confidence)
FOXO3 | Activation (curated, Low confidence)
KAT2B | Activation (curated)
TGFB2 | Activation (curated)
ZNF451 | Repression (curated)
ETS1 | Unknown (curated)
FOS | Unknown (curated)
GLI1 | Unknown (curated)
HDAC4 | Unknown (curated)
JUNB | Unknown (curated)
OVOL2 | Repression (curated)
SP1 | Unknown (curated)

Evidence sourced from the CollecTRI transcriptional regulatory network database.

Drug & pharmacology data

SMAD4 is not currently an established drug target with marketed or clinical-stage therapeutics.

Available evidence:

  • ChEMBL target registration: SMAD4 is registered as a ChEMBL target (CHEMBL5725109), but the entry holds only 6 experimental research assays for SMAD4 inhibition, run with compounds of unknown origin, and no molecules with a documented development phase.
  • No clinical candidates: No molecules with developmental stage information (Phase I-IV) targeting SMAD4 identified in ChEMBL, DrugBank, or other major databases.
  • Clinical trials: Zero documented clinical trials for SMAD4-targeting agents.
  • Pharmacogenomics: PharmGKB lists SMAD4 as a VIP gene but shows no CPIC dosing guidelines or pharmacogene interaction data.

Conclusion: While SMAD4 is a well-characterized protein in TGF-β signaling pathways and has research-level assays, it lacks approved drugs, clinical candidates, or pharmacogenomic dosing guidance, indicating it has not yet been validated as a therapeutic target.

Expression profiles

Tissue Expression (Bgee)

SMAD4 shows ubiquitous expression across human tissues with an average expression score of 88.1 (range 0-100). All 288 tissue conditions show presence, indicating broad tissue distribution. Below are the top 30 tissues with highest expression scores:

Tissue | Expression Score | Quality
Ventricular zone | 98.52 | Gold
Ganglionic eminence | 97.77 | Gold
Calcaneal tendon | 97.63 | Gold
Adrenal tissue | 96.97 | Gold
Colonic epithelium | 96.66 | Gold
Stromal cell of endometrium | 96.05 | Gold
Tendon | 95.93 | Gold
Cortical plate | 95.88 | Gold
Left ovary | 95.79 | Gold
Gall bladder | 95.77 | Gold
Rectum | 95.74 | Gold
Popliteal artery | 95.65 | Gold
Tibial artery | 95.63 | Gold
Embryo | 95.54 | Gold
Body of uterus | 95.38 | Gold
Endocervix | 95.35 | Gold
Islet of Langerhans | 95.31 | Gold
Descending thoracic aorta | 95.23 | Gold
Right ovary | 95.18 | Gold
Mucosa of stomach | 94.97 | Gold
Aorta | 94.95 | Gold
Left lobe of thyroid gland | 94.90 | Gold
Right lobe of thyroid gland | 94.88 | Gold
Parietal pleura | 94.86 | Gold
Right coronary artery | 94.83 | Gold
Thyroid gland | 94.79 | Gold
Endometrium | 94.78 | Gold
Ovary | 94.70 | Gold
Gastrocnemius | 94.66 | Gold
Muscle of leg | 94.64 | Gold

Pattern: Prominent expression in developing CNS tissues (ventricular zone, ganglionic eminence, cortical plate), connective tissues (tendons, arteries), endocrine tissues (adrenal, thyroid, ovary, islet), and reproductive tissues (uterus, endometrium).

Single-Cell Expression (Single Cell Expression Atlas)

SMAD4 was identified as a marker gene in 2 experiments from the Single Cell Expression Atlas; the retrieved data detail only one of them:

  • E-MTAB-6142: Transcriptomic characterization of the human cell cycle in individual unsynchronized cells (96 cells)
    • Across 146 cell clusters analyzed
    • Max mean expression: 38.79
    • Average mean expression: 0.87

Expression is distributed across multiple cell populations in this cell cycle dataset, consistent with SMAD4’s role in cell proliferation and differentiation pathways.

Disease associations

Mendelian / Monogenic Diseases

Germline SMAD4 variants are linked to the following conditions, all with autosomal dominant inheritance:

Disease | Disease ID | Inheritance | Evidence Level
Juvenile polyposis syndrome | OMIM:174900, MONDO:0017380, Orphanet:2929 | Autosomal dominant | Definitive/Strong
Juvenile polyposis/hereditary hemorrhagic telangiectasia syndrome | OMIM:175050, MONDO:0008278 | Autosomal dominant | Definitive/Strong
Myhre syndrome | OMIM:139210, MONDO:0007688, Orphanet:2588 | Autosomal dominant | Definitive/Strong
Hereditary hemorrhagic telangiectasia | MONDO:0019180, Orphanet:774 | Autosomal dominant | Supportive
Familial thoracic aortic aneurysm and aortic dissection | MONDO:0019625, Orphanet:91387 | Autosomal dominant | Supportive
Generalized juvenile polyposis/juvenile polyposis coli | MONDO:0008276, Orphanet:329971 | Autosomal dominant | Supportive
Familial pancreatic carcinoma | MONDO:0015278, Orphanet:1333 | Autosomal dominant | Inferred

Cancer associations:

  • Exocrine pancreatic carcinoma (MONDO:0009831)
  • Breast cancer (MONDO:0007254)
  • Colon carcinoma (MONDO:0002032)
  • Gastric cancer (MONDO:0001056)
  • Gallbladder cancer (MONDO:0005411)
  • Lung cancer (MONDO:0008903)
  • Colorectal cancer (MONDO:0005575)
  • Wilms tumor (MONDO:0006058)
  • Ovarian carcinoma (inferred from HPO associations)

Other conditions:

  • Pulmonary arterial hypertension (MONDO:0015924, Orphanet:422)
  • Intellectual disability (MONDO:0001071)
  • Familial renal glucosuria (MONDO:0009297, Orphanet:69076)
  • Thrombocytopenia (MONDO:0002049)
  • Vascular dementia (MONDO:0004648)

Phenotype Associations (Top 30 HPO Terms)

HPO ID | Phenotype
HP:0000006 | Autosomal dominant inheritance
HP:0001009 | Telangiectasia
HP:0000421 | Epistaxis
HP:0002092 | Pulmonary arterial hypertension
HP:0002647 | Aortic dissection
HP:0002616 | Aortic root aneurysm
HP:0002239 | Gastrointestinal hemorrhage
HP:0001892 | Abnormal bleeding
HP:0002629 | Gastrointestinal arteriovenous malformation
HP:0004936 | Venous thrombosis
HP:0002040 | Esophageal varix
HP:0004406 | Spontaneous, recurrent epistaxis
HP:0100579 | Mucosal telangiectasiae
HP:0100761 | Visceral angiomatosis
HP:0100100 | Arteriovenous malformation
HP:0006548 | Pulmonary arteriovenous malformation
HP:0006574 | Hepatic arteriovenous malformation
HP:0006725 | Pancreatic adenocarcinoma
HP:0003003 | Colon cancer
HP:0002894 | Neoplasm of the pancreas
HP:0004783 | Duodenal polyposis
HP:0005227 | Adenomatous colonic polyposis
HP:0200008 | Intestinal polyposis
HP:0030256 | Small intestinal polyposis
HP:0004394 | Multiple gastric polyps
HP:0001738 | Exocrine pancreatic insufficiency
HP:0001394 | Cirrhosis
HP:0001399 | Hepatic failure
HP:0002910 | Elevated circulating hepatic transaminase concentration
HP:0004933 | Ascending aortic dissection

Complex Disease / GWAS Associations (Top 7)

Trait | Mapped Gene(s) | P-value | Chromosome
Asthma (childhood onset) | SMAD4 - SRSF10P1 | 4×10⁻¹³ | 18
Asthma | SMAD4 | 2×10⁻⁹ | 18
Asthma | SMAD4 - SRSF10P1 | 6×10⁻⁸ | 18
Oligoclonal band status in multiple sclerosis | ELAC1 - SMAD4 | 8×10⁻⁷ | 18
Asthma onset (childhood vs adult) | SMAD4 - SRSF10P1 | 5×10⁻⁶ | 18
Immune response to anthrax vaccine | SRSF10P1 - MEX3C | 1×10⁻⁶ | 18
Metabolite levels | SRSF10P1 - MEX3C | 5×10⁻⁶ | 18
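Not every row in this table reaches the conventional genome-wide significance threshold of p < 5×10⁻⁸. The sketch below filters the seven associations against that threshold. The values are copied from the table; the threshold is the standard convention rather than something the source states, and the names `associations` and `significant` are illustrative.

```python
# (trait, mapped gene(s), p-value) copied from the GWAS table.
associations = [
    ("Asthma (childhood onset)", "SMAD4 - SRSF10P1", 4e-13),
    ("Asthma", "SMAD4", 2e-9),
    ("Asthma", "SMAD4 - SRSF10P1", 6e-8),
    ("Oligoclonal band status in multiple sclerosis", "ELAC1 - SMAD4", 8e-7),
    ("Asthma onset (childhood vs adult)", "SMAD4 - SRSF10P1", 5e-6),
    ("Immune response to anthrax vaccine", "SRSF10P1 - MEX3C", 1e-6),
    ("Metabolite levels", "SRSF10P1 - MEX3C", 5e-6),
]

# Conventional genome-wide significance threshold (an assumption, not from the source).
GENOME_WIDE_P = 5e-8

significant = [row for row in associations if row[2] < GENOME_WIDE_P]
suggestive = [row for row in associations if row[2] >= GENOME_WIDE_P]

print("genome-wide significant:")
for trait, genes, p_value in significant:
    print(f"  {trait} | {genes} | p = {p_value:.0e}")
print(f"suggestive only: {len(suggestive)} associations")
```

Under this threshold only the two strongest asthma signals qualify. The remaining rows are better read as suggestive associations in the SMAD4 region.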

Structured Data Sources

Generated with Claude Haiku 4.5 + BioBTree MCP, drawing on data BioBTree aggregates from 55 biological databases. Every identifier and figure traces to a reproducible API call.

Datasets: alphafold, alphamissense, antibody, bgee, bindingdb, biogrid, biogrid_interaction, ccds, cdd, chembl_assay, chembl_molecule, chembl_target, clinical_trials, clinvar, collectri, diamond_similarity, drugbank, ensembl, entrez, esm2_similarity, exon, expressionatlas, gencc, go, gtex, gtopdb_interaction, gtopdb_ligand, gwas, hgnc, hpa, hpo, inparanoid, intact, interpro, jaspar, mim, mondo, msigdb, orphanet, ortholog, panther, pdb, pfam, pharmgkb_gene, pubchem_activity, reactome, refseq, scxa, scxa_expression, smart, spliceai, string_interaction, transcript, uniprot, wormbase
Generated: 2026-05-26 — For the latest data, query BioBTree directly via MCP or API.