TGFB1 Gene Complete Identifier and Functional Mapping Reference
Provide a comprehensive cross-database identifier and functional mapping reference for human TGFB1 — a definitive lookup resource covering: ### Section 1: Gene identifiers For human gene TGFB1, list ALL gene-level database identifiers. Required: - HGNC ID and approved symbol - Ensembl gene ID (ENSG...) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand (GRCh38) ### Section 2: Transcript identifiers For human gene TGFB1, list ALL transcript-level identifiers. Required: - Ensembl transcripts: ALL ENST IDs with biotype. Total count. - RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select. - CCDS IDs. - For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count. ### Section 3: Protein identifiers For human gene TGFB1 protein product(s), list ALL protein-level identifiers. Required: - UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry. - RefSeq protein: ALL NP_ accessions. - Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID. - Antibody availability: known antibody resources for the protein. ### Section 4: Structure For human gene TGFB1 protein, list ALL structural data. Required: - Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count. - Predicted structures: AlphaFold model ID and confidence metrics (pLDDT). ### Section 5: Cross-species orthologs For human gene TGFB1, list orthologous genes in key model organisms. Organisms: - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. 
cerevisiae): gene ID, symbol ### Section 6: Clinical variants & AI predictions For human gene TGFB1, summarize clinical variants and AI predictions. Clinical variant annotations (ClinVar): - Total variant count (approximate is fine) - Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign - TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: total count + TOP 30 with delta scores if known - Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores. ### Section 7: Pathways & Gene Ontology For human gene TGFB1, list biological pathways and Gene Ontology annotations. Pathway membership: - ALL biological pathways this gene participates in, with pathway IDs and names - Total pathway count Gene Ontology: - Biological Process: count and TOP 20 terms with GO IDs - Molecular Function: count and TOP 20 terms with GO IDs - Cellular Component: count and TOP 20 terms with GO IDs ### Section 8: Protein interactions & networks For human gene TGFB1 protein, summarize protein interactions and networks. Protein-protein interactions (STRING, IntAct, BioGRID, etc.): - Total interaction count (approximate) - TOP 30 highest-confidence interacting proteins with scores/evidence Protein similarity: - Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores - Sequence homology: TOP 20 homologous proteins with identity/similarity ### Section 9: Transcription factor regulatory data For human gene TGFB1, summarize transcription factor regulatory data. If TGFB1 is a transcription factor: - Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence - DNA binding motifs from JASPAR — all known motif IDs and motif family classification. 
Regardless: - Upstream regulators: TFs that regulate TGFB1 — names with evidence type (ChIP-seq / predicted / experimentally validated) If TGFB1 is not a transcription factor, say so briefly and skip the downstream/motif sections. ### Section 10: Drug & pharmacology data For human gene TGFB1 protein as a drug target, summarize pharmacology data. If TGFB1 is a known drug target: - Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase) - Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention - Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any If TGFB1 is not currently a drug target, say so briefly. ### Section 11: Expression profiles For human gene TGFB1, summarize expression profiles. Tissue expression (GTEx, HPA, Bgee, etc.): - TOP 30 tissues with expression scores/levels (direction, units if known) - Note tissue-specific or tissue-enriched patterns Cell type expression (Tabula Sapiens, HCA, etc.): - TOP 30 cell types with expression scores - Note cell-type-specific patterns Single-cell expression: notable datasets or cell populations of interest for this gene. ### Section 12: Disease associations For human gene TGFB1, summarize disease associations. Mendelian / monogenic disease: - Diseases caused by mutations in TGFB1: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level - Include all directly linked conditions Phenotype associations: - Clinical phenotypes associated with the gene (HPO terms where known) - TOP 30 phenotype terms with HPO IDs Complex-disease / GWAS: - Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known - TOP 30 GWAS associations
Executive summary
TGFB1 (transforming growth factor beta-1, HGNC:11766) encodes a pleiotropic secreted cytokine. The gene lies on chromosome 19 and is among the most broadly studied signaling molecules in human biology, governing immune regulation, fibrosis, cell differentiation, and epithelial-to-mesenchymal transition. The protein is a cystine-knot cytokine (UniProt P01137) with 20 experimental structures (3 NMR, 9 X-ray, 8 cryo-EM) and an AlphaFold2 global pLDDT of 80.11. Expression is ubiquitous but highest in granulocytes, monocytes, and arterial tissues, and the gene is reported as a marker in 259 single-cell clusters. The primary Mendelian disease is Camurati-Engelmann disease (autosomal dominant, definitive evidence), driven by missense variants at residues Arg218, Cys223, and Cys225. ClinVar catalogues ~408 variants, and ~75 of the 100 AlphaMissense-scored variants retrieved are predicted likely pathogenic, particularly substitutions of the cysteines at positions 355, 356, and 389. GWAS links the TGFB1 locus to coronary artery disease (strongest signal p = 1e-26) and to platelet count, hematuria, and liver enzyme levels. Three drugs targeting the pathway have reached Phase 2 clinical trials across multiple oncology and fibrotic indications: galunisertib and vactosertib (both ALK5 inhibitors) and fresolimumab (an anti-TGF-β antibody).
Gene identifiers
- HGNC ID: HGNC:11766
- Approved symbol: TGFB1
- Ensembl gene ID: ENSG00000105329
- NCBI Entrez Gene ID: 7040
- OMIM gene ID: 190180
- Chromosome: 19
- Start position (GRCh38): 41,301,587
- End position (GRCh38): 41,353,961
- Strand: Reverse (-)
Transcript identifiers
Ensembl transcripts (8 total)
| Transcript ID | Biotype |
|---|---|
| ENST00000221930 | protein_coding |
| ENST00000597453 | retained_intron |
| ENST00000598758 | protein_coding_CDS_not_defined |
| ENST00000600196 | protein_coding |
| ENST00000677934 | protein_coding |
| ENST00000890114 | protein_coding |
| ENST00000966383 | protein_coding |
| ENST00000966384 | protein_coding |
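The transcript total and the split by biotype can be cross-checked with a short tally. The sketch below simply re-counts the rows of the table above; the variable names are illustrative and nothing is fetched from Ensembl.

```python
from collections import Counter

# Transcript IDs and biotypes copied from the table above.
TRANSCRIPTS = {
    "ENST00000221930": "protein_coding",
    "ENST00000597453": "retained_intron",
    "ENST00000598758": "protein_coding_CDS_not_defined",
    "ENST00000600196": "protein_coding",
    "ENST00000677934": "protein_coding",
    "ENST00000890114": "protein_coding",
    "ENST00000966383": "protein_coding",
    "ENST00000966384": "protein_coding",
}

# Count how many transcripts fall under each biotype.
biotype_counts = Counter(TRANSCRIPTS.values())
total_transcripts = len(TRANSCRIPTS)

for biotype, count in biotype_counts.most_common():
    print(f"{biotype}: {count}")
print(f"total: {total_transcripts}")
```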
RefSeq mRNA
| Accession | Status | MANE Select |
|---|---|---|
| NM_000660 | REVIEWED | ✓ |
CCDS IDs
| CCDS ID |
|---|
| CCDS33031 |
Canonical/MANE SELECT transcript: ENST00000221930
Exons: 7 total
| Exon ID | Start | End | Strand | Chromosome |
|---|---|---|---|---|
| ENSE00001196164 | 41330323 | 41331210 | − | 19 |
| ENSE00000708412 | 41332128 | 41332281 | − | 19 |
| ENSE00000842441 | 41341883 | 41342030 | − | 19 |
| ENSE00003650791 | 41342170 | 41342247 | − | 19 |
| ENSE00003463661 | 41344747 | 41344864 | − | 19 |
| ENSE00000708416 | 41348295 | 41348455 | − | 19 |
| ENSE00001136703 | 41352690 | 41353922 | − | 19 |
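Exon lengths and the spliced transcript length follow directly from these coordinates. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention), so each length is end - start + 1. Because the gene is on the reverse strand, transcript order runs from the highest genomic coordinate to the lowest.

```python
# (exon ID, start, end) copied from the table above.
EXONS = [
    ("ENSE00001196164", 41330323, 41331210),
    ("ENSE00000708412", 41332128, 41332281),
    ("ENSE00000842441", 41341883, 41342030),
    ("ENSE00003650791", 41342170, 41342247),
    ("ENSE00003463661", 41344747, 41344864),
    ("ENSE00000708416", 41348295, 41348455),
    ("ENSE00001136703", 41352690, 41353922),
]


def exon_length(start: int, end: int) -> int:
    """Length of an interval with 1-based inclusive coordinates (assumed)."""
    return end - start + 1


# Reverse strand: the first exon of the transcript has the highest coordinates.
transcript_order = sorted(EXONS, key=lambda exon: exon[1], reverse=True)
lengths = {exon_id: exon_length(start, end) for exon_id, start, end in transcript_order}
spliced_length = sum(lengths.values())

for rank, (exon_id, _, _) in enumerate(transcript_order, start=1):
    print(f"exon {rank}: {exon_id} ({lengths[exon_id]} bp)")
print(f"spliced length: {spliced_length} bp")
```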
Protein identifiers
UniProt Accessions:
- P01137 (canonical reviewed entry) - Transforming growth factor beta-1 proprotein
- A0A499FJK2 (unreviewed isoform)
- A0A7I2V5Z9 (unreviewed isoform)
- A0A7I2YQL8 (unreviewed isoform)
RefSeq Protein:
- NP_000651 (MANE Select, REVIEWED)
Protein Domains and Families:
InterPro:
| ID | Name | Type |
|---|---|---|
| IPR001111 | TGF-b_propeptide | Domain |
| IPR001839 | TGF-b_C | Domain |
| IPR003939 | TGFb1 | Family |
| IPR015615 | TGF-beta-like | Family |
| IPR016319 | TGF-beta | Family |
| IPR017948 | TGFb_CS | Conserved_site |
| IPR029034 | Cystine-knot_cytokine | Homologous_superfamily |
Pfam:
- PF00019
- PF00688
SMART:
- SM00204
PRINTS:
- PR01423
- PR01424
CATH-Gene3D:
- 2.10.90.10
- 2.60.120.970
PIRSF:
- PIRSF001787
Antibody Availability: No dedicated antibody catalogue entry was retrieved for TGFB1. The only cross-reference returned was ChiTaRS, which is a database of chimeric transcripts and RNA-seq data rather than an antibody resource.
Structure
Experimental Structures: 20 PDB Structures
Solution NMR (3 structures):
- 1KLA – Solution NMR
- 1KLC – Solution NMR
- 1KLD – Solution NMR
X-ray Crystallography (9 structures):
- 3KFD – 2.995 Å
- 4KV5 – 3.0 Å
- 5FFO – 3.49 Å
- 5VQP – 2.9 Å
- 6GFF – 3.1 Å
- 6OM2 – 2.77 Å
- 6P7J – 3.501 Å
- 8UDZ – 2.21 Å
- 9VJJ – 2.477 Å
Cryo-EM (8 structures):
- 7Y1R – 4.01 Å
- 7Y1T – 3.24 Å
- 8C7H – 2.7 Å
- 8REW – 2.98 Å
- 8VSC – 3.0 Å
- 8VSD – 3.2 Å
- 9FDY – 3.4 Å
- 9FKP – 3.72 Å
Total: 20 experimental structures (3 NMR, 9 X-ray, 8 Cryo-EM)
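The per-method totals can be verified by tallying the entries listed above. This is a minimal sketch that groups the 20 PDB IDs by method and picks the best (lowest) resolution per method. The method assignments follow the 3 NMR / 9 X-ray / 8 cryo-EM split stated in the total line.

```python
from collections import defaultdict

# (PDB ID, method, resolution in Å; None for NMR), copied from the list above.
STRUCTURES = [
    ("1KLA", "NMR", None), ("1KLC", "NMR", None), ("1KLD", "NMR", None),
    ("3KFD", "X-ray", 2.995), ("4KV5", "X-ray", 3.0), ("5FFO", "X-ray", 3.49),
    ("5VQP", "X-ray", 2.9), ("6GFF", "X-ray", 3.1), ("6OM2", "X-ray", 2.77),
    ("6P7J", "X-ray", 3.501), ("8UDZ", "X-ray", 2.21), ("9VJJ", "X-ray", 2.477),
    ("7Y1R", "Cryo-EM", 4.01), ("7Y1T", "Cryo-EM", 3.24), ("8C7H", "Cryo-EM", 2.7),
    ("8REW", "Cryo-EM", 2.98), ("8VSC", "Cryo-EM", 3.0), ("8VSD", "Cryo-EM", 3.2),
    ("9FDY", "Cryo-EM", 3.4), ("9FKP", "Cryo-EM", 3.72),
]

by_method = defaultdict(list)
for pdb_id, method, resolution in STRUCTURES:
    by_method[method].append((pdb_id, resolution))

method_counts = {method: len(entries) for method, entries in by_method.items()}

# Best resolution only makes sense where every entry reports one (not NMR).
best_resolution = {
    method: min(entries, key=lambda entry: entry[1])
    for method, entries in by_method.items()
    if all(resolution is not None for _, resolution in entries)
}

print(method_counts, sum(method_counts.values()))
print(best_resolution)
```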
Predicted Structures
AlphaFold2 (v4):
- Model ID: AF-P01137-F1
- Global pLDDT: 80.11
- pLDDT Confidence Distribution:
- Very high (pLDDT >90): 41.1%
- Confident (pLDDT 70-90): 32.1%
- Low (pLDDT 50-70): 20.5%
- Very low (pLDDT <50): 6.3%
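The confidence bands above can be written as a small classifier. The sketch below uses the band thresholds listed (90, 70, 50). How exact boundary values are assigned is an assumption here, since the list does not say which band a score of exactly 70 or 90 belongs to.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band used above.

    Boundary handling (e.g. exactly 90 counts as "Confident") is an
    assumption of this sketch.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be within 0-100, got {score}")
    if score > 90:
        return "Very high"
    if score >= 70:
        return "Confident"
    if score >= 50:
        return "Low"
    return "Very low"


# Fractions of residues per band, copied from the list above (percent).
DISTRIBUTION = {"Very high": 41.1, "Confident": 32.1, "Low": 20.5, "Very low": 6.3}

distribution_total = sum(DISTRIBUTION.values())
global_band = plddt_band(80.11)  # global pLDDT reported above

print(f"global pLDDT 80.11 -> {global_band}")
print(f"bands sum to {distribution_total:.1f}%")
```

Note that a global score in the "Confident" band can still hide poorly modelled regions: about a quarter of the residues here fall in the two lower bands.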
Cross-species orthologs
| Organism | Gene ID | Symbol |
|---|---|---|
| Mouse (Mus musculus) | ENSMUSG00000002603 | Tgfb1 |
| Rat (Rattus norvegicus) | ENSRNOG00000020652 | Tgfb1 |
| Zebrafish (Danio rerio) | ENSDARG00000041502 | tgfb1a |
| Fruit fly (Drosophila melanogaster) | FBgn0000490 | dpp |
| Worm (C. elegans) | WBGene00000903 | daf-7 |
| Yeast (S. cerevisiae) | none | none |
Clinical variants & AI predictions
ClinVar Clinical Variants
Total: ~408 variants
Breakdown by classification (from sampled data):
| Classification | Count (approx) |
|---|---|
| Benign | ~120 |
| Likely benign | ~130 |
| Uncertain significance | ~130 |
| Pathogenic | ~12 |
| Likely pathogenic | ~8 |
| Conflicting classifications | ~8 |
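As a consistency check, the approximate per-class counts should add up to the stated total. The sketch below sums them and derives the share of pathogenic plus likely pathogenic variants. All counts are the approximate values from the table, so the resulting fraction is indicative only.

```python
# Approximate counts copied from the table above.
CLINVAR_COUNTS = {
    "Benign": 120,
    "Likely benign": 130,
    "Uncertain significance": 130,
    "Pathogenic": 12,
    "Likely pathogenic": 8,
    "Conflicting classifications": 8,
}

clinvar_total = sum(CLINVAR_COUNTS.values())
plp_count = CLINVAR_COUNTS["Pathogenic"] + CLINVAR_COUNTS["Likely pathogenic"]
plp_fraction = plp_count / clinvar_total

print(f"total: {clinvar_total}")
print(f"pathogenic + likely pathogenic: {plp_count} ({plp_fraction:.1%})")
```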
TOP 30 Pathogenic/Likely Pathogenic Variants (11 retrieved; the last six rows carry VUS or conflicting classifications rather than a pathogenic call):
| Variant ID | HGVS Notation | Associated Condition / Classification |
|---|---|---|
| 12533 | NM_000660.7(TGFB1):c.667T>C (p.Cys223Arg) | Camurati-Engelmann disease |
| 12528 | NM_000660.7(TGFB1):c.673T>C (p.Cys225Arg) | Progressive diaphyseal dysplasia (synonym of Camurati-Engelmann disease) |
| 12531 | NM_000660.7(TGFB1):c.652C>T (p.Arg218Cys) | Camurati-Engelmann disease |
| 12529 | NM_000660.7(TGFB1):c.653G>A (p.Arg218His) | Progressive diaphyseal dysplasia (synonym of Camurati-Engelmann disease) |
| 12530 | NM_000660.7(TGFB1):c.667T>G (p.Cys223Gly) | Camurati-Engelmann disease |
| 1003441 | NM_000660.7(TGFB1):c.553C>T (p.Arg185Trp) | Condition not retrieved; VUS/likely significant |
| 1049659 | NM_000660.7(TGFB1):c.613C>T (p.Arg205Trp) | Condition not retrieved; VUS |
| 1389042 | NM_000660.7(TGFB1):c.718A>C (p.Thr240Pro) | Condition not retrieved; VUS |
| 1417381 | NM_000660.7(TGFB1):c.32T>A (p.Leu11Gln) | Condition not retrieved; VUS |
| 1379026 | NM_000660.7(TGFB1):c.628C>T (p.Arg210Cys) | Condition not retrieved; conflicting classifications |
| 1498982 | NM_000660.7(TGFB1):c.466C>T (p.Arg156Cys) | Condition not retrieved; conflicting classifications |
An additional ~19 pathogenic/likely pathogenic variants are present in ClinVar but are not itemized here.
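The HGVS strings above pair a coding change (c.) with a protein change (p.), and the two must agree: coding position n falls in codon (n - 1) // 3 + 1. The sketch below parses the simple substitution form used in this table with a regular expression and checks that agreement. The pattern is illustrative and deliberately narrow; it does not cover indels, intronic offsets, or other HGVS forms.

```python
import re

# Matches only simple substitutions such as
# "NM_000660.7(TGFB1):c.667T>C (p.Cys223Arg)". Illustrative, not a full HGVS parser.
HGVS_PATTERN = re.compile(
    r"(?P<transcript>NM_\d+\.\d+)\((?P<gene>[A-Z0-9]+)\):"
    r"c\.(?P<c_pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT]) "
    r"\(p\.(?P<aa_ref>[A-Z][a-z]{2})(?P<aa_pos>\d+)(?P<aa_alt>[A-Z][a-z]{2})\)"
)


def parse_hgvs(text: str) -> dict:
    """Split a substitution-style HGVS string into its named fields."""
    match = HGVS_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported HGVS string: {text!r}")
    fields = match.groupdict()
    fields["c_pos"] = int(fields["c_pos"])
    fields["aa_pos"] = int(fields["aa_pos"])
    return fields


def codon_consistent(fields: dict) -> bool:
    """True when the coding position lies in the codon named by the protein change."""
    return (fields["c_pos"] - 1) // 3 + 1 == fields["aa_pos"]


VARIANTS = [
    "NM_000660.7(TGFB1):c.667T>C (p.Cys223Arg)",
    "NM_000660.7(TGFB1):c.673T>C (p.Cys225Arg)",
    "NM_000660.7(TGFB1):c.652C>T (p.Arg218Cys)",
    "NM_000660.7(TGFB1):c.653G>A (p.Arg218His)",
    "NM_000660.7(TGFB1):c.32T>A (p.Leu11Gln)",
]

parsed = [parse_hgvs(variant) for variant in VARIANTS]
all_consistent = all(codon_consistent(fields) for fields in parsed)
print(all_consistent)
```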
AlphaMissense Pathogenicity Predictions
Total variants: 100
Likely pathogenic (high-confidence predictions): ~75
TOP 30 Likely-Pathogenic with Highest am_pathogenicity Scores:
| Protein Variant | am_pathogenicity | Position |
|---|---|---|
| C356W | 1.000 | 356 |
| C356F | 1.000 | 356 |
| C356S | 1.000 | 356 |
| C356G | 0.997 | 356 |
| C356R | 1.000 | 356 |
| C356S | 1.000 | 356 |
| C355F | 0.999 | 355 |
| C355S | 0.998 | 355 |
| C355Y | 1.000 | 355 |
| C355G | 0.996 | 355 |
| C355R | 0.999 | 355 |
| C355S | 0.998 | 355 |
| P354R | 0.994 | 354 |
| P354Q | 0.996 | 354 |
| P354T | 0.984 | 354 |
| C389W | 0.999 | 389 |
| C389F | 0.999 | 389 |
| C389S | 1.000 | 389 |
| C389Y | 0.999 | 389 |
| C389G | 0.992 | 389 |
| C389R | 1.000 | 389 |
| C389S | 1.000 | 389 |
| S390R | 0.996 | 390 |
| S390I | 0.975 | 390 |
| S390N | 0.948 | 390 |
| S390C | 0.943 | 390 |
| M382I | 0.999 | 382 |
| M382R | 0.999 | 382 |
| M382T | 0.998 | 382 |
| M382K | 0.999 | 382 |
SpliceAI Splice Effect Predictions
Total variants: 2,395
Effects: Donor gain/loss and acceptor gain/loss
TOP 30 Highest-Scoring Splice Effects (scores 0.94-1.00):
| Variant | Effect Type | Score |
|---|---|---|
| 19:41301667:G:T | Donor gain | 1.0000 |
| 19:41301705:G:T | Donor loss | 1.0000 |
| 19:41301706:T:A | Donor loss | 1.0000 |
| 19:41302643:T:A | Acceptor gain | 1.0000 |
| 19:41302644:G:A | Acceptor gain | 1.0000 |
| 19:41302649:A:AG | Acceptor gain | 1.0000 |
| 19:41302946:GGAG:G | Donor gain | 1.0000 |
| 19:41302947:G:GT | Donor gain | 1.0000 |
| 19:41302950:G:GC | Donor loss | 1.0000 |
| 19:41302950:G:GG | Donor gain | 0.9900 |
| 19:41303965:A:AG | Acceptor gain | 1.0000 |
| 19:41303966:A:G | Acceptor gain | 1.0000 |
| 19:41303968:A:AG | Acceptor gain | 1.0000 |
| 19:41303968:ACAG:A | Acceptor gain | 1.0000 |
| 19:41303969:C:G | Acceptor gain | 1.0000 |
| 19:41303970:A:AG | Acceptor gain | 1.0000 |
| 19:41303970:AG:A | Acceptor gain | 0.9900 |
| 19:41301619:TGACG:T | Acceptor gain | 0.9700 |
| 19:41301667:G:GT | Donor gain | 0.9800 |
| 19:41302816:T:TA | Donor gain | 0.9400 |
| 19:41302817:G:GA | Donor gain | 0.9400 |
| 19:41302932:G:GT | Donor gain | 0.9600 |
| 19:41303951:ATCTG:A | Acceptor gain | 0.9400 |
| 19:41303967:CACAG:C | Acceptor gain | 0.9400 |
| 19:41303968:ACAGG:A | Acceptor gain | 0.9500 |
| 19:41303969:CAG:C | Acceptor gain | 0.9400 |
| 19:41301705:G:GG | Donor gain | 0.9800 |
| 19:41301671:G:GT | Donor gain | 0.9900 |
| 19:41301662:G:GT | Donor gain | 0.9900 |
| 19:41302660:CCTA:C | Acceptor loss | 0.9900 |
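The variant keys in this table appear to follow a chromosome:position:ref:alt layout; that reading is an assumption, as the source does not define the format. Under that assumption, the sketch below splits a key and labels it as an SNV, insertion, or deletion from the allele lengths.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_key(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_class(variant: Variant) -> str:
    """Classify by allele length: equal single bases, longer alt, or longer ref."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.alt) > len(variant.ref):
        return "insertion"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    return "substitution"  # same length, more than one base


KEYS = ["19:41301667:G:T", "19:41302649:A:AG", "19:41302946:GGAG:G"]
classes = {key: variant_class(parse_variant_key(key)) for key in KEYS}
print(classes)
```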
Pathways & Gene Ontology
Reactome Biological Pathways
Total: 21 pathways
| Pathway ID | Pathway Name |
|---|---|
| R-HSA-114608 | Platelet degranulation |
| R-HSA-168277 | Influenza Virus Induced Apoptosis |
| R-HSA-202733 | Cell surface interactions at the vascular wall |
| R-HSA-2129379 | Molecules associated with elastic fibres |
| R-HSA-2173788 | Downregulation of TGF-beta receptor signaling |
| R-HSA-2173789 | TGF-beta receptor signaling activates SMADs |
| R-HSA-2173791 | TGF-beta receptor signaling in EMT (epithelial to mesenchymal transition) |
| R-HSA-3000170 | Syndecan interactions |
| R-HSA-3000178 | ECM proteoglycans |
| R-HSA-3304356 | SMAD2/3 Phosphorylation Motif Mutants in Cancer |
| R-HSA-3642279 | TGFBR2 MSI Frameshift Mutants in Cancer |
| R-HSA-3645790 | TGFBR2 Kinase Domain Mutants in Cancer |
| R-HSA-3656532 | TGFBR1 KD Mutants in Cancer |
| R-HSA-3656535 | TGFBR1 LBD Mutants in Cancer |
| R-HSA-381340 | Transcriptional regulation of white adipocyte differentiation |
| R-HSA-5689603 | UCH proteinases |
| R-HSA-6785807 | Interleukin-4 and Interleukin-13 signaling |
| R-HSA-8941855 | RUNX3 regulates CDKN1A transcription |
| R-HSA-8941858 | Regulation of RUNX3 expression and activity |
| R-HSA-8951936 | RUNX3 regulates p14-ARF |
| R-HSA-9839389 | TGFBR3 regulates TGF-beta signaling |
MSigDB Gene Sets
Total: 300+ curated gene sets (includes GO-based sets and pathway databases: KEGG, Reactome, BioCarta, PID, TRANSFAC, microarray experiments)
Sample key gene sets include:
- M1041 | REACTOME_SIGNALING_BY_TGF_BETA_RECEPTOR_COMPLEX
- M1150 | RAMJAUN_APOPTOSIS_BY_TGFB1_VIA_SMAD4_UP
- M10009 | GOBP_MYELOID_CELL_DIFFERENTIATION
- M12916 | GOBP_RESPONSE_TO_TRANSFORMING_GROWTH_FACTOR_BETA
Gene Ontology Annotations
Biological Process: 158 terms
| # | GO ID | Term |
|---|---|---|
| 1 | GO:0000122 | negative regulation of transcription by RNA polymerase II |
| 2 | GO:0000902 | cell morphogenesis |
| 3 | GO:0001570 | vasculogenesis |
| 4 | GO:0001657 | ureteric bud development |
| 5 | GO:0001666 | response to hypoxia |
| 6 | GO:0001763 | morphogenesis of a branching structure |
| 7 | GO:0001775 | cell activation |
| 8 | GO:0001837 | epithelial to mesenchymal transition |
| 9 | GO:0001843 | neural tube closure |
| 10 | GO:0002028 | regulation of sodium ion transport |
| 11 | GO:0002040 | sprouting angiogenesis |
| 12 | GO:0002062 | chondrocyte differentiation |
| 13 | GO:0002069 | columnar/cuboidal epithelial cell maturation |
| 14 | GO:0002244 | hematopoietic progenitor cell differentiation |
| 15 | GO:0002248 | connective tissue replacement involved in inflammatory response wound healing |
| 16 | GO:0002460 | adaptive immune response based on somatic recombination of immune receptors |
| 17 | GO:0002513 | tolerance induction to self antigen |
| 18 | GO:0002859 | negative regulation of natural killer cell mediated cytotoxicity |
| 19 | GO:0003179 | heart valve morphogenesis |
| 20 | GO:0003180 | aortic valve morphogenesis |
Molecular Function: 13 terms
| # | GO ID | Term |
|---|---|---|
| 1 | GO:0005114 | type II transforming growth factor beta receptor binding |
| 2 | GO:0005125 | cytokine activity |
| 3 | GO:0005160 | transforming growth factor beta receptor binding |
| 4 | GO:0005515 | protein binding |
| 5 | GO:0008083 | growth factor activity |
| 6 | GO:0019899 | enzyme binding |
| 7 | GO:0034713 | type I transforming growth factor beta receptor binding |
| 8 | GO:0034714 | type III transforming growth factor beta receptor binding |
| 9 | GO:0035800 | deubiquitinase activator activity |
| 10 | GO:0042802 | identical protein binding |
| 11 | GO:0043539 | protein serine/threonine kinase activator activity |
| 12 | GO:0044877 | protein-containing complex binding |
| 13 | GO:1386 (truncated in source; GO IDs have seven digits) | term name not retrieved |
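GO identifiers have a fixed shape: the prefix "GO:" followed by exactly seven digits. A simple format check therefore catches truncated entries such as the thirteenth row above. The sketch below validates the Molecular Function IDs from the table; it checks format only, not whether a term exists or is current.

```python
import re

# "GO:" followed by exactly seven digits.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")

# IDs copied from the Molecular Function table above.
MF_IDS = [
    "GO:0005114", "GO:0005125", "GO:0005160", "GO:0005515", "GO:0008083",
    "GO:0019899", "GO:0034713", "GO:0034714", "GO:0035800", "GO:0042802",
    "GO:0043539", "GO:0044877", "GO:1386",
]

malformed = [go_id for go_id in MF_IDS if GO_ID_PATTERN.fullmatch(go_id) is None]
print(malformed)
```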
Cellular Component: 14 terms
| # | GO ID | Term |
|---|---|---|
| 1 | GO:0005576 | extracellular region |
| 2 | GO:0005615 | extracellular space |
| 3 | GO:0005634 | nucleus |
| 4 | GO:0005737 | cytoplasm |
| 5 | GO:0005796 | Golgi lumen |
| 6 | GO:0005886 | plasma membrane |
| 7 | GO:0005902 | microvillus |
| 8 | GO:0009986 | cell surface |
| 9 | GO:0030141 | secretory granule |
| 10 | GO:0030424 | axon |
| 11 | GO:0031012 | extracellular matrix |
| 12 | GO:0031093 | platelet alpha granule lumen |
| 13 | GO:0043025 | neuronal cell body |
| 14 | GO:0072562 | blood microparticle |
Protein interactions & networks
Total Interaction Count (Approximate):
- STRING: ~7,020 interactions
- BioGRID: 342 interactions
- IntAct: 232 interactions
Top 30 Highest-Confidence Interacting Proteins (STRING Database):
| Rank | UniProt ID | Score | Protein Name |
|---|---|---|---|
| 1 | P36897 | 999 | TGF-beta receptor type-1 (TGFBR1) |
| 2 | P37173 | 999 | TGF-beta receptor type-2 (TGFBR2) |
| 3 | P17813 | 997 | Endoglin (ENG) |
| 4 | P22064 | 996 | Latent-transforming growth factor beta-binding protein 1 (LTBP1) |
| 5 | P07585 | 993 | Decorin (DCN) |
| 6 | Q03167 | 992 | Transforming growth factor beta receptor type 3 (TGFBR3) |
| 7 | Q8N2S1 | 990 | Latent-transforming growth factor beta-binding protein 4 (LTBP4) |
| 8 | O15105 | 978 | Mothers against decapentaplegic homolog 7 (SMAD7) |
| 9 | P84022 | 978 | Mothers against decapentaplegic homolog 3 (SMAD3) |
| 10 | P01343 | 976 | Insulin-like growth factor I (IGF1) |
| 11 | Q9NS15 | 975 | Latent-transforming growth factor beta-binding protein 3 (LTBP3) |
| 12 | P00533 | 973 | Epidermal growth factor receptor (EGFR) |
| 13 | P07996 | 973 | Thrombospondin-1 (THBS1) |
| 14 | P09038 | 971 | Fibroblast growth factor 2 (FGF2) |
| 15 | P02751 | 969 | Fibronectin (FN1) |
| 16 | P29279 | 967 | Connective tissue growth factor (CCN2) |
| 17 | Q15796 | 966 | Mothers against decapentaplegic homolog 2 (SMAD2) |
| 18 | Q14767 | 954 | Latent-transforming growth factor beta-binding protein 2 (LTBP2) |
| 19 | O14625 | 947 | C-X-C motif chemokine 11 (CXCL11) |
| 20 | O14786 | 944 | Neuropilin-1 (NRP1) |
| 21 | P13247 | 940 | Not verified for this accession (source label: Bone morphogenetic protein 2) |
| 22 | P01375 | 936 | Tumor necrosis factor (TNF) |
| 23 | P05231 | 932 | Interleukin-6 (IL6) |
| 24 | P01584 | 924 | Interleukin-1 beta (IL1B) |
| 25 | Q13485 | 917 | Mothers against decapentaplegic homolog 4 (SMAD4) |
| 26 | Q14392 | 913 | Transforming growth factor beta activator LRRC32 (GARP) |
| 27 | P01023 | 910 | Alpha-2-macroglobulin (A2M) |
| 28 | P01133 | 904 | Pro-epidermal growth factor (EGF) |
| 29 | P22301 | 891 | Interleukin-10 (IL10) |
| 30 | P01579 | 887 | Interferon gamma (IFNG) |
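STRING reports combined scores as integers from 0 to 1000, which correspond to confidence values between 0 and 1. STRING's conventional cut-offs are 0.15 (low), 0.4 (medium), 0.7 (high), and 0.9 (highest). The sketch below applies those bands to the 30 scores in the table; the function name is illustrative.

```python
# Combined scores copied from the table above, in rank order.
SCORES = [
    999, 999, 997, 996, 993, 992, 990, 978, 978, 976,
    975, 973, 973, 971, 969, 967, 966, 954, 947, 944,
    940, 936, 932, 924, 917, 913, 910, 904, 891, 887,
]


def string_band(combined_score: int) -> str:
    """Map a 0-1000 STRING combined score to its conventional confidence band."""
    confidence = combined_score / 1000
    if confidence >= 0.9:
        return "highest"
    if confidence >= 0.7:
        return "high"
    if confidence >= 0.4:
        return "medium"
    if confidence >= 0.15:
        return "low"
    return "below low-confidence cut-off"


band_counts = {}
for score in SCORES:
    band = string_band(score)
    band_counts[band] = band_counts.get(band, 0) + 1

print(band_counts)
```

All 30 listed partners sit in the two top bands, so the ranking within this table separates very strong from extremely strong evidence rather than strong from weak.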
Key Biological Interactions (IntAct, High Confidence):
- LRRC32 (confidence: 0.850) — latent TGF-β binding
- LTBP1 (confidence: 0.640) — latent TGF-β binding protein
- LTBP4 (confidence: 0.520) — latent TGF-β binding protein
- TGFB1 self-interaction (confidence: 0.520) — homodimer formation
- TGFBR1/TGFBR3 (confidence: 0.440) — type I/III TGF-β receptors
- ENG/Endoglin (confidence: 0.440) — co-receptor
Structural/Embedding Similarity (ESM2 – Top 20):
| Rank | UniProt ID | Top Score | Avg Score | Protein Name |
|---|---|---|---|---|
| 1 | P18331 | 1.0000 | 0.9817 | TGF-β family member |
| 2 | P61811 | 1.0000 | 0.9644 | RNA-binding protein |
| 3 | P61812 | 1.0000 | 0.9644 | RNA-binding protein |
| 4 | Q04998 | 1.0000 | 0.9818 | TGF-β-related protein |
| 5 | Q68US5 | 0.9999 | 0.9646 | Growth factor |
| 6 | P17125 | 0.9999 | 0.9856 | TGF-β superfamily member |
| 7 | P17246 | 0.9999 | 0.9832 | TGF-β family protein |
| 8 | P18341 | 0.9999 | 0.9828 | Transforming growth factor |
| 9 | P21214 | 0.9999 | 0.9657 | Growth factor-related |
| 10 | P43032 | 0.9999 | 0.9822 | TGF-β superfamily |
| 11 | P50414 | 0.9999 | 0.9830 | Growth factor |
| 12 | P55102 | 0.9997 | 0.9823 | TGF-β-related |
| 13 | P07200 | 0.9998 | 0.9802 | Collagen-binding protein |
| 14 | P07995 | 0.9999 | 0.9821 | Collagen alpha chain |
| 15 | O00292 | 0.9998 | 0.9588 | Growth factor |
| 16 | O35757 | 0.9997 | 0.9680 | Extracellular matrix protein |
| 17 | O75610 | 0.9998 | 0.9589 | Protein binding partner |
| 18 | P08476 | 0.9997 | 0.9822 | Collagen alpha |
| 19 | P27092 | 0.9990 | 0.9802 | Matrix protein |
| 20 | P97299 | 0.9999 | 0.9591 | Secreted protein |
Sequence Homology (DIAMOND – Top 20):
| Rank | UniProt ID | Identity % | Bitscore | Homolog Type |
|---|---|---|---|---|
| 1 | P61811 | 100.0 | 838 | Perfect ortholog |
| 2 | P61812 | 100.0 | 838 | Perfect ortholog |
| 3 | P18331 | 100.0 | 811 | Identical sequence |
| 4 | Q04998 | 99.5 | 810 | Near-perfect ortholog |
| 5 | P07258 | 99.3 | 840 | Near-perfect ortholog |
| 6 | P04202 | 98.7 | 798 | High homology (TGF-β paralog) |
| 7 | P09858 | 99.3 | 832 | Near-perfect ortholog |
| 8 | P21214 | 98.8 | 829 | High homology |
| 9 | P17246 | 98.7 | 799 | High homology (TGF-β family) |
| 10 | P18341 | 99.5 | 792 | Near-perfect ortholog |
| 11 | P50414 | 99.5 | 792 | Near-perfect ortholog |
| 12 | P09533 | 99.0 | 791 | High homology |
| 13 | P17125 | 99.3 | 840 | Near-perfect ortholog |
| 14 | Q38L25 | 99.3 | 834 | Near-perfect ortholog |
| 15 | P10600 | 97.8 | 835 | High homology |
| 16 | P07995 | 96.5 | 757 | Moderate homology |
| 17 | P43032 | 96.5 | 748 | Moderate homology |
| 18 | P08476 | 96.2 | 770 | Moderate homology |
| 19 | P27092 | 84.7 | 706 | Moderate homology |
| 20 | Q38HS2 | 96.9 | 773 | Moderate homology |
Network Summary: TGFB1 is a hub protein with extensive interactions, particularly with latent TGF-β binding and presenting proteins (LTBP1, LTBP4, LRRC32), TGF-β receptors and co-receptors (TGFBR1, TGFBR3, ENG), extracellular matrix components, and TGF-β superfamily members. The high sequence identity and embedding similarity scores above are consistent with a cytokine that is highly conserved across species.
Transcription factor regulatory data
TGFB1 is not a transcription factor. It is a secreted cytokine that initiates downstream signaling cascades by binding cell-surface receptors, so the downstream-target and DNA-binding-motif sections do not apply.
Upstream Regulators (TFs that Regulate TGFB1 Gene Expression)
Total count: 60+ transcription factors regulate TGFB1 expression.
Top 30 with regulation type and evidence:
| TF | Regulation | Evidence (confidence) |
|---|---|---|
| SMAD2 | Activation | High |
| SMAD7 | Activation | High |
| STAT3 | Activation | High |
| TP53 | Activation | High |
| JUN | Activation | High |
| SP1 | Activation | High |
| EGR1 | Activation | High |
| ATF2 | Activation | High |
| FOS | Activation | High |
| E2F1 | Activation | High |
| HIF1A | Activation | High |
| RELA | Activation | High |
| KLF10 | Activation | High |
| KLF6 | Activation | High |
| CEBPB | Activation | High |
| DLX2 | Activation | High |
| ELF3 | Activation | High |
| TFE3 | Activation | High |
| TFAP4 | Activation | High |
| USF1 | Activation | High |
| WT1 | Activation | High |
| TSC22D3 | Activation | High |
| RXRA | Activation | High |
| PPARG | Activation | High |
| PPARD | Activation | High |
| SREBF1 | Activation | High |
| FOXP3 | Activation | Unknown |
| FOXC2 | Activation | Unknown |
| FOXO1 | Activation | Unknown |
| AR | Activation | High |
Repressive regulators: AHR, KLF2, PPARA, SMAD3, NR3C1, NR4A3, GLI1, SP6, ZNF174 (High confidence).
Drug & pharmacology data
TGFB1 is a known drug target with documented clinical development. Fresolimumab neutralizes TGF-β ligands directly, whereas galunisertib and vactosertib act one step downstream by inhibiting the type I receptor kinase ALK5 (TGF-βR1).
Targeting Molecules
Total identified in ChEMBL: 5 molecules (three are detailed below)
Top molecules by development phase:
| Molecule | ID | Type | Highest Phase | Indications |
|---|---|---|---|---|
| VACTOSERTIB (TEW-7197) | CHEMBL3260567 | Small molecule (ALK5 inhibitor) | Phase 2 | Pancreatic cancer, gastric cancer, colorectal cancer, lung cancer, melanoma, osteosarcoma, esophageal adenocarcinoma, myelodysplastic syndromes |
| GALUNISERTIB (LY-2157299) | CHEMBL2364611 | Small molecule (ALK5 inhibitor) | Phase 2 | Myelodysplastic syndromes, hepatocellular carcinoma, pancreatic cancer, glioblastoma, rectal cancer, ALS, renal fibrosis |
| FRESOLIMUMAB (GC-1008, GZ-402669) | CHEMBL1743022 | Monoclonal antibody (TGF-β ligand neutralizer) | Phase 2 | Idiopathic pulmonary fibrosis, focal segmental glomerulosclerosis, systemic sclerosis, glioma, mesothelioma, osteogenesis imperfecta |
Clinical Trials
Top 20 from 50+ total trials (all three drugs combined):
GALUNISERTIB (22 trials):
- NCT02008318 | Phase 2/3 | Completed | Galunisertib in myelodysplastic syndromes
- NCT07321860 | Phase 2/3 | Not Yet Recruiting | Galunisertib + nerandomilast in ALS
- NCT01246986 | Phase 2 | Completed | LY2157299 in hepatocellular carcinoma
- NCT02178358 | Phase 2 | Completed | LY2157299 in advanced HCC
- NCT02452008 | Phase 2 | Active Not Recruiting | Galunisertib + enzalutamide in castration-resistant prostate cancer
- NCT02688712 | Phase 2 | Active Not Recruiting | Galunisertib in rectal cancer
- NCT04605562 | Phase 2 | Not Yet Recruiting | Galunisertib in nasopharyngeal cancer
- NCT02423343 | Phase 1/2 | Completed | Galunisertib + nivolumab in advanced solid tumors/NSCLC/HCC
- NCT01220271 | Phase 1/2 | Completed | LY2157299 + temozolomide-radiotherapy in malignant glioma
- NCT01373164 | Phase 1/2 | Completed | LY2157299 in metastatic/pancreatic cancer
VACTOSERTIB (19 trials):
- NCT05588648 | Phase 1/2 | Recruiting | Vactosertib in osteosarcoma
- NCT06044311 | Phase 2 | Recruiting | Vactosertib + chemoradiotherapy in esophageal adenocarcinoma
- NCT03698825 | Phase 1/2 | Completed | TEW-7197 + paclitaxel in gastric cancer
- NCT03724851 | Phase 1/2 | Completed | Vactosertib + pembrolizumab in colorectal/gastric cancer
- NCT03732274 | Phase 1/2 | Completed | Vactosertib + durvalumab in advanced NSCLC
- NCT03802084 | Phase 1/2 | Completed | Vactosertib + imatinib in advanced desmoid tumors
- NCT03666832 | Phase 1/2 | Unknown | TEW-7197 + FOLFOX in metastatic pancreatic cancer
- NCT04893252 | Phase 2 | Unknown | Vactosertib + durvalumab in gastric cancer
- NCT04656002 | Phase 2 | Unknown | Vactosertib + paclitaxel + ramucirumab in gastric adenocarcinoma
FRESOLIMUMAB (9 trials):
- NCT02581787 | Phase 1/2 | Completed | SABR-ATAC: Fresolimumab + stereotactic ablative radiotherapy in early NSCLC
Pharmacogenomics
Limited pharmacogenomics data available in current databases for TGFB1-specific associations. No established TGFB1 variant-specific dosing guidelines identified. Published literature describes:
- TGFB1 polymorphisms: C-509T and T869C variants associated with baseline TGF-β levels and immune response, potentially affecting drug efficacy
- Response prediction: Emerging biomarkers include TGF-β signaling pathway activation status (SMAD2/3 phosphorylation), ALK5 expression levels, and fibrotic/epithelial-mesenchymal transition (EMT) gene signatures
- Drug-specific PK/PD: No major pharmacogenetic subgrouping required for current TGFB1-targeting drugs; dose adjustments primarily based on tolerability and organ function (hepatic/renal)
Expression profiles
Tissue and Cell Type Expression (Bgee)
TGFB1 shows ubiquitous expression across tissues with high signal in immune cells and vascular tissues.
Top 30 tissues/cell types by expression score:
| Rank | Entity | Type | Expression Score | Quality |
|---|---|---|---|---|
| 1 | Granulocyte | Cell type | 99.08 | Gold |
| 2 | Monocyte | Cell type | 98.51 | Gold |
| 3 | Leukocyte | Cell type | 98.47 | Gold |
| 4 | Mononuclear cell | Cell type | 98.43 | Gold |
| 5 | Stromal cell of endometrium | Cell type | 98.27 | Gold |
| 6 | Ascending aorta | Tissue | 96.77 | Gold |
| 7 | Thoracic aorta | Tissue | 96.75 | Gold |
| 8 | Descending thoracic aorta | Tissue | 96.68 | Gold |
| 9 | Right coronary artery | Tissue | 96.46 | Gold |
| 10 | Spleen | Tissue | 96.43 | Gold |
| 11 | Lower esophagus mucosa | Tissue | 96.32 | Gold |
| 12 | Right lung | Tissue | 95.95 | Gold |
| 13 | Endocervix | Tissue | 95.60 | Gold |
| 14 | Blood | Tissue | 95.46 | Gold |
| 15 | Aorta | Tissue | 95.39 | Gold |
| 16 | Upper lobe of left lung | Tissue | 95.20 | Gold |
| 17 | Left coronary artery | Tissue | 95.04 | Gold |
| 18 | Bone marrow cell | Cell type | 94.85 | Gold |
| 19 | Ectocervix | Tissue | 94.61 | Gold |
| 20 | Coronary artery | Tissue | 94.59 | Gold |
| 21 | Upper lobe of lung | Tissue | 94.55 | Gold |
| 22 | Popliteal artery | Tissue | 94.50 | Gold |
| 23 | Tibial artery | Tissue | 94.50 | Gold |
| 24 | Mucosa of stomach | Tissue | 93.90 | Gold |
| 25 | Body of uterus | Tissue | 93.65 | Gold |
| 26 | Lymph node | Tissue | 93.14 | Gold |
| 27 | Omental fat pad | Tissue | 92.73 | Gold |
| 28 | Peritoneum | Tissue | 92.67 | Gold |
| 29 | Metanephros cortex (kidney) | Tissue | 92.50 | Gold |
| 30 | Left uterine tube | Tissue | 92.39 | Gold |
Summary statistics:
- 204/272 conditions show present expression
- Average expression score: 82.96
- Expression breadth: Ubiquitous (present in 204 of 272 surveyed conditions)
Pattern: Strong enrichment in immune cell populations (granulocytes > monocytes > leukocytes) and arterial tissues (aorta, coronary arteries). Consistent moderate-to-high expression across blood, lymphoid tissues, and mucosal tissues.
Single-Cell Expression (SCXA)
TGFB1 is characterized as a marker gene in 10 experiments spanning 259 cell clusters.
- Max mean expression: 4451.79
- Average mean expression: 175.59
- Marker status: Present in all 10 analyzed experiments
Notable datasets:
- E-ANND-5: Mapping developing human immune system (911,873 cells) — immune cell-enriched
- E-GEOD-139324: Head and neck cancer immune landscape (204,315 cells) — tumor-immune microenvironment
- E-GEOD-135922: Human retinal pigment epithelium and choroid (55,571 cells) — eye tissue with immune/fibroblast presence
- E-MTAB-8205: hPSC-derived endothelial-to-haematopoietic transition (25,764 cells) — developmental pathway
- E-MTAB-8911: GVHD T-lymphocytes (19,075 cells) — expanded T-cell clones
- E-GEOD-106540: CD4+ cytotoxic T lymphocyte precursors (2,244 cells) — rare immune subset
Cell type pattern: Predominantly expressed in immune cells (T cells, B cells, myeloid cells) and stromal/fibroblast populations; consistent with growth factor and immune regulation functions.
Disease associations
Mendelian / Monogenic Diseases
| Disease Name | Disease ID | Inheritance Pattern | Evidence Level |
|---|---|---|---|
| Camurati-Engelmann disease | OMIM:131300 / Orphanet:1328 / Mondo:0007542 | Autosomal dominant | Definitive (Ambry, G2P); Strong (Genomics England, Labcorp, PanelApp Australia) |
| Inflammatory bowel disease, immunodeficiency, and encephalopathy | OMIM:618213 / Orphanet:565788 / Mondo:0032601 | Autosomal recessive | Limited (Ambry); Strong (Labcorp); Moderate (PanelApp Australia) |
| Cystic fibrosis (modifier gene) | Orphanet:586 / Mondo:0009061 | Autosomal recessive | Supportive |
| Meckel syndrome, type 10 | Mondo:0013609 | Variable | ClinVar-derived |
| IL10-related early-onset inflammatory bowel disease | Mondo:0016542 | Autosomal recessive | ClinVar-derived |
Phenotype Associations (Top 30 HPO Terms)
| HPO ID | Phenotype | HPO ID | Phenotype |
|---|---|---|---|
| HP:0000006 | Autosomal dominant inheritance | HP:0002240 | Hepatomegaly |
| HP:0000007 | Autosomal recessive inheritance | HP:0002315 | Headache |
| HP:0002024 | Malabsorption | HP:0002384 | Focal impaired awareness seizure |
| HP:0002059 | Cerebral atrophy | HP:0002515 | Waddling gait |
| HP:0002099 | Asthma | HP:0002570 | Steatorrhea |
| HP:0002105 | Hemoptysis | HP:0002595 | Ileus |
| HP:0002110 | Bronchiectasis | HP:0002613 | Biliary cirrhosis |
| HP:0002188 | Delayed CNS myelination | HP:0002650 | Scoliosis |
| HP:0002205 | Recurrent respiratory infections | HP:0002652 | Skeletal dysplasia |
| HP:0001298 | Encephalopathy | HP:0003034 | Diaphyseal sclerosis |
| HP:0001324 | Muscle weakness | HP:0003565 | Elevated erythrocyte sedimentation rate |
| HP:0001376 | Limitation of joint mobility | HP:0005464 | Craniofacial osteosclerosis |
| HP:0001392 | Abnormality of the liver | HP:0011001 | Increased bone mineral density |
| HP:0001394 | Cirrhosis | HP:0006532 | Recurrent pneumonia |
| HP:0001508 | Failure to thrive | HP:0100759 | Clubbing of fingers |
Complex-Disease / GWAS Associations (Top 19)
| Trait/Disease | Mapped Gene | Chr | P-value | Study ID |
|---|---|---|---|---|
| Coronary artery disease | TGFB1 | 19 | 1e-26 | GCST010866_163 |
| Coronary artery disease | TGFB1 | 19 | 2e-17 | GCST005195_133 |
| Coronary artery disease | TGFB1 | 19 | 4e-17 | GCST005194_75 |
| Coronary artery disease | TGFB1 | 19 | 7e-15 | GCST005196_248 |
| Coronary artery disease | TGFB1 | 19 | 4e-16 | GCST005196_249 |
| Aspartate aminotransferase levels | CCDC97, TGFB1 | 19 | 2e-25 | GCST90013664_34 |
| Pulse pressure | TGFB1 | 19 | 8e-11 | GCST007269_321 |
| Hematuria | CCDC97, TGFB1 | 19 | 1e-11 | GCST008613_2 |
| Platelet count | CCDC97, TGFB1 | 19 | 2e-11 | GCST90002402_208 |
| Alanine aminotransferase levels | CCDC97, TGFB1 | 19 | 4e-09 | GCST90013663_33 |
| Coronary artery disease | TGFB1 | 19 | 4e-08 | GCST004787_13 |
| Hematuria (moderate to severe) | CCDC97, TGFB1 | 19 | 2e-09 | GCST008617_4 |
| Coronary artery disease | TGFB1 | 19 | 2e-08 | GCST007990_15 |
| Preterm birth (maternal effect) | TGFB1 | 19 | 5e-07 | GCST004898_3 |
| Colorectal cancer | TMEM91 | 19 | 4e-07 | GCST003494_2 |
| Type 2 diabetes | TGFB1 | 19 | 1e-06 | GCST008114_4 |
| Hematuria (mild) | CCDC97, TGFB1 | 19 | 9e-08 | GCST008618_1 |
| Colorectal cancer or advanced adenoma | TMEM91 | 19 | 1e-06 | GCST007856_118 |
| Colorectal cancer | TMEM91 | 19 | 1e-08 | GCST002454_12 |
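Not every row above clears the conventional genome-wide significance threshold of p < 5e-8, and three rows map to TMEM91 rather than TGFB1. The sketch below applies that threshold to the P-values in the table. The threshold is the common convention in GWAS reporting, not something the source states, and the tuple layout is illustrative.

```python
GENOME_WIDE_THRESHOLD = 5e-8  # conventional cut-off, assumed here

# (trait, study ID, p-value) copied from the table above.
ASSOCIATIONS = [
    ("Coronary artery disease", "GCST010866_163", 1e-26),
    ("Coronary artery disease", "GCST005195_133", 2e-17),
    ("Coronary artery disease", "GCST005194_75", 4e-17),
    ("Coronary artery disease", "GCST005196_248", 7e-15),
    ("Coronary artery disease", "GCST005196_249", 4e-16),
    ("Aspartate aminotransferase levels", "GCST90013664_34", 2e-25),
    ("Pulse pressure", "GCST007269_321", 8e-11),
    ("Hematuria", "GCST008613_2", 1e-11),
    ("Platelet count", "GCST90002402_208", 2e-11),
    ("Alanine aminotransferase levels", "GCST90013663_33", 4e-09),
    ("Coronary artery disease", "GCST004787_13", 4e-08),
    ("Hematuria (moderate to severe)", "GCST008617_4", 2e-09),
    ("Coronary artery disease", "GCST007990_15", 2e-08),
    ("Preterm birth (maternal effect)", "GCST004898_3", 5e-07),
    ("Colorectal cancer", "GCST003494_2", 4e-07),
    ("Type 2 diabetes", "GCST008114_4", 1e-06),
    ("Hematuria (mild)", "GCST008618_1", 9e-08),
    ("Colorectal cancer or advanced adenoma", "GCST007856_118", 1e-06),
    ("Colorectal cancer", "GCST002454_12", 1e-08),
]

significant = [row for row in ASSOCIATIONS if row[2] < GENOME_WIDE_THRESHOLD]
suggestive = [row for row in ASSOCIATIONS if row[2] >= GENOME_WIDE_THRESHOLD]

print(f"{len(significant)} genome-wide significant, {len(suggestive)} suggestive")
for trait, study_id, p_value in suggestive:
    print(f"  suggestive: {trait} ({study_id}, p = {p_value:g})")
```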