EGFR Gene Complete Identifier and Functional Mapping Reference
Provide a comprehensive cross-database identifier and functional mapping reference for human EGFR, a definitive lookup resource covering:
### Section 1: Gene identifiers
For human gene EGFR, list ALL gene-level database identifiers. Required:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG...)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand (GRCh38)
### Section 2: Transcript identifiers
For human gene EGFR, list ALL transcript-level identifiers. Required:
- Ensembl transcripts: ALL ENST IDs with biotype. Total count.
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select.
- CCDS IDs.
- For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count.
### Section 3: Protein identifiers
For human gene EGFR protein product(s), list ALL protein-level identifiers. Required:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry.
- RefSeq protein: ALL NP_ accessions.
- Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID.
- Antibody availability: known antibody resources for the protein.
### Section 4: Structure
For human gene EGFR protein, list ALL structural data. Required:
- Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count.
- Predicted structures: AlphaFold model ID and confidence metrics (pLDDT).
### Section 5: Cross-species orthologs
For human gene EGFR, list orthologous genes in key model organisms. Organisms:
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol
### Section 6: Clinical variants & AI predictions
For human gene EGFR, summarize clinical variants and AI predictions.
Clinical variant annotations (ClinVar):
- Total variant count (approximate is fine)
- Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition
AI-based variant effect predictions:
- Splice effect predictions: total count + TOP 30 with delta scores if known
- Missense pathogenicity from AlphaMissense: total count + TOP 30 likely-pathogenic with am_pathogenicity scores.
### Section 7: Pathways & Gene Ontology
For human gene EGFR, list biological pathways and Gene Ontology annotations.
Pathway membership:
- ALL biological pathways this gene participates in, with pathway IDs and names
- Total pathway count
Gene Ontology:
- Biological Process: count and TOP 20 terms with GO IDs
- Molecular Function: count and TOP 20 terms with GO IDs
- Cellular Component: count and TOP 20 terms with GO IDs
### Section 8: Protein interactions & networks
For human gene EGFR protein, summarize protein interactions and networks.
Protein-protein interactions (STRING, IntAct, BioGRID, etc.):
- Total interaction count (approximate)
- TOP 30 highest-confidence interacting proteins with scores/evidence
Protein similarity:
- Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores
- Sequence homology: TOP 20 homologous proteins with identity/similarity
### Section 9: Transcription factor regulatory data
For human gene EGFR, summarize transcription factor regulatory data.
If EGFR is a transcription factor:
- Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence
- DNA binding motifs from JASPAR: all known motif IDs and motif family classification.
Regardless:
- Upstream regulators: TFs that regulate EGFR, with names and evidence type (ChIP-seq / predicted / experimentally validated)
If EGFR is not a transcription factor, say so briefly and skip the downstream/motif sections.
### Section 10: Drug & pharmacology data
For human gene EGFR protein as a drug target, summarize pharmacology data.
If EGFR is a known drug target:
- Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase)
- Clinical trials: TOP 20 involving drugs targeting this gene (trial ID, phase, status, intervention)
- Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any
If EGFR is not currently a drug target, say so briefly.
### Section 11: Expression profiles
For human gene EGFR, summarize expression profiles.
Tissue expression (GTEx, HPA, Bgee, etc.):
- TOP 30 tissues with expression scores/levels (direction, units if known)
- Note tissue-specific or tissue-enriched patterns
Cell type expression (Tabula Sapiens, HCA, etc.):
- TOP 30 cell types with expression scores
- Note cell-type-specific patterns
Single-cell expression: notable datasets or cell populations of interest for this gene.
### Section 12: Disease associations
For human gene EGFR, summarize disease associations.
Mendelian / monogenic disease:
- Diseases caused by mutations in EGFR: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level
- Include all directly linked conditions
Phenotype associations:
- Clinical phenotypes associated with the gene (HPO terms where known)
- TOP 30 phenotype terms with HPO IDs
Complex-disease / GWAS:
- Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known
- TOP 30 GWAS associations
Executive summary
EGFR (epidermal growth factor receptor; HGNC:3236) is a transmembrane receptor tyrosine kinase encoded on chromosome 7 and one of the most clinically consequential oncogenes in human cancer. It is best known as a driver in a subset of non-small cell lung cancer (NSCLC), where somatic mutations, especially exon 19 deletions (~45% of EGFR-mutant cases) and L858R (~40%), predict sensitivity to approved tyrosine kinase inhibitors (TKIs) including erlotinib, gefitinib, afatinib, and osimertinib; the acquired T790M resistance mutation (~50% of patients progressing on first- or second-generation TKIs) predicts response to osimertinib. EGFR is extensively studied, with 378 experimental PDB structures, 3,790 ClinVar variants (103 pathogenic or likely pathogenic), and over 18,000 interaction records across STRING, BioGRID, and IntAct. Expression is ubiquitous but highest in epithelial tissues, and GWAS data link the locus to glioblastoma (p = 5e-34) and hematologic traits, alongside its established roles in lung, ovarian, and head and neck cancers.
EGFR — Reference
Cross-database identifier and functional mapping reference for EGFR.
Gene identifiers
- HGNC ID: HGNC:3236
- Gene symbol: EGFR (epidermal growth factor receptor)
- Ensembl gene ID: ENSG00000146648
- NCBI Entrez Gene ID: 1956
- OMIM gene/locus ID: 131550
- Genomic location (GRCh38):
  - Chromosome: 7
  - Start position: 55,018,820 bp
  - End position: 55,211,628 bp
  - Strand: +
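The locus length follows directly from the start and end positions. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention, not stated in the listing above); `span_length` is an illustrative helper name, not part of any database API.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


# EGFR gene coordinates on GRCh38 chromosome 7, as listed above
EGFR_START = 55_018_820
EGFR_END = 55_211_628

egfr_length = span_length(EGFR_START, EGFR_END)
print(f"EGFR spans {egfr_length:,} bp")
```

Under that assumption the gene covers roughly 193 kb of chromosome 7.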
Transcript identifiers
Ensembl transcripts (17 total)
| ENST ID | Biotype |
|---|---|
| ENST00000275493 | protein_coding |
| ENST00000342916 | protein_coding |
| ENST00000344576 | protein_coding |
| ENST00000420316 | protein_coding |
| ENST00000450046 | protein_coding |
| ENST00000455089 | protein_coding |
| ENST00000459688 | protein_coding_CDS_not_defined |
| ENST00000463948 | protein_coding_CDS_not_defined |
| ENST00000485503 | retained_intron |
| ENST00000700144 | retained_intron |
| ENST00000700145 | protein_coding |
| ENST00000700146 | retained_intron |
| ENST00000700147 | retained_intron |
| ENST00000898199 | protein_coding |
| ENST00000898200 | protein_coding |
| ENST00000898201 | protein_coding |
| ENST00000898202 | protein_coding |
RefSeq mRNA accessions (13 total reported, 10 listed; human chromosome 7)
| NM ID | MANE Select |
|---|---|
| NM_001346897 | No |
| NM_001346898 | No |
| NM_001346899 | No |
| NM_001346900 | No |
| NM_001346941 | No |
| NM_001393707 | No |
| NM_005228 | Yes |
| NM_201282 | No |
| NM_201283 | No |
| NM_201284 | No |
CCDS identifiers (6 total)
- CCDS47587
- CCDS5514
- CCDS5515
- CCDS5516
- CCDS87507
- CCDS94105
MANE Select transcript exons (28 total)
Transcript: ENST00000275493 (NM_005228)
| ENSE ID | Start | End | Strand | Chromosome |
|---|---|---|---|---|
| ENSE00001841347 | 55019017 | 55019365 | + | 7 |
| ENSE00003541288 | 55142286 | 55142437 | + | 7 |
| ENSE00001704157 | 55143305 | 55143488 | + | 7 |
| ENSE00001798125 | 55146606 | 55146740 | + | 7 |
| ENSE00001683983 | 55151294 | 55151362 | + | 7 |
| ENSE00001652975 | 55152546 | 55152664 | + | 7 |
| ENSE00001623732 | 55154011 | 55154152 | + | 7 |
| ENSE00001751179 | 55155830 | 55155946 | + | 7 |
| ENSE00001084929 | 55156533 | 55156659 | + | 7 |
| ENSE00001084931 | 55156759 | 55156832 | + | 7 |
| ENSE00001084926 | 55157663 | 55157753 | + | 7 |
| ENSE00001084941 | 55160139 | 55160338 | + | 7 |
| ENSE00001084939 | 55161499 | 55161631 | + | 7 |
| ENSE00001084927 | 55163733 | 55163823 | + | 7 |
| ENSE00001627115 | 55165280 | 55165437 | + | 7 |
| ENSE00001768076 | 55171175 | 55171213 | + | 7 |
| ENSE00002684637 | 55172983 | 55173124 | + | 7 |
| ENSE00001778519 | 55173921 | 55174043 | + | 7 |
| ENSE00001756460 | 55174722 | 55174820 | + | 7 |
| ENSE00001601336 | 55181293 | 55181478 | + | 7 |
| ENSE00001681524 | 55191719 | 55191874 | + | 7 |
| ENSE00001631695 | 55192766 | 55192841 | + | 7 |
| ENSE00003625684 | 55198717 | 55198863 | + | 7 |
| ENSE00001790701 | 55200316 | 55200413 | + | 7 |
| ENSE00001801208 | 55201188 | 55201355 | + | 7 |
| ENSE00001773562 | 55201735 | 55201782 | + | 7 |
| ENSE00001795780 | 55202517 | 55202625 | + | 7 |
| ENSE00001245887 | 55205256 | 55211628 | + | 7 |
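Exon rows returned by a database query are not guaranteed to arrive in transcript order. For a plus-strand transcript such as this one, sorting by genomic start gives the 5' to 3' order. The sketch below uses four rows from the table as a subset; the `Exon` type and helper names are illustrative, and coordinates are assumed to be 1-based inclusive.

```python
from typing import List, NamedTuple


class Exon(NamedTuple):
    exon_id: str
    start: int  # assumed 1-based, inclusive
    end: int


def order_exons(exons: List[Exon]) -> List[Exon]:
    """Sort plus-strand exons 5'->3' and check that none overlap."""
    ordered = sorted(exons, key=lambda e: e.start)
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.start <= prev.end:
            raise ValueError(f"{prev.exon_id} overlaps {nxt.exon_id}")
    return ordered


def exon_length(exon: Exon) -> int:
    return exon.end - exon.start + 1


# Subset of the MANE Select exon table, deliberately out of order
subset = [
    Exon("ENSE00001704157", 55143305, 55143488),
    Exon("ENSE00001841347", 55019017, 55019365),
    Exon("ENSE00001245887", 55205256, 55211628),
    Exon("ENSE00003541288", 55142286, 55142437),
]

ordered = order_exons(subset)
for exon in ordered:
    print(exon.exon_id, exon.start, exon.end, f"{exon_length(exon)} bp")
```

For a minus-strand gene the same sort would need to be reversed; that case does not arise for EGFR.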
Protein identifiers
UniProt Accessions
- P00533 ⭐ (canonical reviewed entry)
- A0A8V8TPW8 (unreviewed)
- C9JYS6 (unreviewed)
- Q504U8 (unreviewed)
RefSeq Protein (NP_) Accessions
- NP_005219 (MANE Select - canonical transcript)
- NP_001333826
- NP_001333827
- NP_001333828
- NP_001333829
- NP_001333870
- NP_958439
- NP_958440
- NP_958441
Protein Domains and Families
InterPro Entries (15 total):
| ID | Name | Type |
|---|---|---|
| IPR000494 | Receptor L-domain | Domain |
| IPR000719 | Protein kinase domain | Domain |
| IPR001245 | Serine-threonine/tyrosine-protein kinase, catalytic domain | Domain |
| IPR006211 | Furin-like cysteine-rich domain | Domain |
| IPR006212 | Furin-like repeat | Repeat |
| IPR008266 | Tyrosine-protein kinase, active site | Active site |
| IPR009030 | Growth factor receptor cysteine-rich domain superfamily | Homologous superfamily |
| IPR011009 | Protein kinase-like domain superfamily | Homologous superfamily |
| IPR016245 | Tyrosine protein kinase, EGF/ERB/XmrK receptor | Family |
| IPR017441 | Protein kinase, ATP binding site | Binding site |
| IPR020635 | Tyrosine-protein kinase, catalytic domain | Domain |
| IPR032778 | Growth factor receptor domain 4 | Domain |
| IPR036941 | Receptor L-domain superfamily | Homologous superfamily |
| IPR049328 | Epidermal growth factor receptor-like, transmembrane-juxtamembrane segment | Domain |
| IPR050122 | Receptor Tyrosine Kinase | Family |
Pfam Entries (5 total):
- PF00757
- PF01030
- PF07714
- PF14843
- PF21314
SMART Entries (2 total):
- SM00219
- SM00261
Antibody Availability
No EGFR-specific antibodies were found in the biobtree antibody database. However, EGFR is linked to the Human Protein Atlas (HPA), a primary source of validated antibodies for this protein. Commercial antibodies from major vendors (Abcam, Santa Cruz, Sigma, etc.) are widely available and can be located via antibody databases and Research Resource Identifiers (RRIDs).
Structure
Experimental Structures
Total PDB Entries: 378
Breakdown by Experimental Method:
X-RAY DIFFRACTION (>340 structures)
Representative PDB IDs with resolutions:
- 1.07 Å - 8A27
- 1.11 Å - 8A2D
- 1.33 Å - 5UG9
- 1.42 Å - 5HG8
- 1.43 Å - 8A2A
- 1.46 Å - 5UG8
- 1.5 Å - 3POZ, 5CNO, 6TFU, 6TFV, 6TG0
- 1.52 Å - 3VRP, 5HG5
- 1.58 Å - 5UGC
- 1.59 Å - 3G5Y
- 1.6 Å - 5U8L, 7SI1
- 1.621 Å - 8PO4
- 1.632 Å - 9GL8
- 1.68 Å - 9DF3
- 1.69 Å - 8A2B
- 1.7 Å - 3W33, 6TFY
- 1.71 Å - 4I22
- 1.76 Å - 6WXN
- 1.78 Å - 9DF4
- 1.796 Å - 5GNK
- 1.8 Å - 3P0Y, 3W32, 4I24, 6TFZ
- 1.82 Å - 5UGA
- 1.83 Å - 7JXQ
- 1.85 Å - 4WKQ, 5HG7, 5X2A, 6TFW
- 1.878 Å - 9GL7
- 1.9 Å - 3W2S, 4WRG, 5CNN, 6P8Q, 7A2A
- 1.901 Å - 9GC6
- 1.909 Å - 9GC5
- 1.91 Å - 9FZR
- 1.93 Å - 9MSR
- 1.95 Å - 5YU9
- 1.97 Å - 4ZSE
- 1.984 Å - 8SC7
- 1.991 Å - 9FZS
- 2.0 Å - 2RGP, 6TFU, 6Z4D, 7A6J
- 2.05 Å - 3W2P, 6LUD, 8S1M, 9OU9
- 2.06 Å - 6XL4, 9FRD
- 2.09 Å - 9MSS
- 2.1 Å - 3OB2, 4UV7, 5CAS, 6V6O, 8PO1, 8FV3, 9NIS, 9OU9
- 2.13 Å - 6P1D, 8PO3, 9DM8, 9NIS
- 2.142 Å - 9QXN
- 2.147 Å - 9GL9
- 2.15 Å - 5HG9
- 2.17 Å - 8D73
- 2.18 Å - 7JXP
- 2.2 Å - 3W2Q, 7T4J, 8HV4
- 2.218 Å - 9GDV
- 2.23 Å - 9NIS
- 2.24 Å - 9NJN
- 2.25 Å - 5CAU
- 2.26 Å - 6JRX
- 2.27 Å - 3PFV
- 2.28 Å - 8PO2
- 2.3 Å - 5FED, 5HCX, 6P1L, 6V5P, 7LTX, 7ZYN, 8F1X, 8GK5
- 2.307 Å - 6JXT, 9BY4
- 2.31 Å - 5D41
- 2.315 Å - 6LUB
- 2.32 Å - 9D3V, 9KL4
- 2.33 Å - 9PMZ
- 2.34 Å - 3VJN
- 2.346 Å - 5XDK
- 2.35 Å - 3W2O
- 2.4 Å - 5CAP, 6DUK, 6P1D, 6V5N, 6V6K, 6WA2, 7A6I, 7KXZ, 7UKV, 8HV1, 8HV3, 8HV8, 8D76, 8F1Z, 8PO0, 9NGP
- 2.401 Å - 5C8N
- 2.415 Å - 9GC4
- 2.423 Å - 9S3X
- 2.435 Å - 9EWS
- 2.46 Å - 5HCY, 9Q0S
- 2.47 Å - 2ITN, 2ITV, 3VJN
- 2.49 Å - 8TO3
- 2.495 Å - 9FQP
- 2.5 Å - 1MOX, 2EB2, 4LQM, 5SX5, 5UGB, 5ZTO, 5YU9, 7B85, 7JXW, 8D73, 9DF2, 9MST, 9NJ7
- 2.501 Å - 7VRE
- 2.51 Å - 8PNZ
- 2.523 Å - 8PO0
- 2.53 Å - 6JX0, 7UKW, 9D3W, 9NJ7
- 2.537 Å - 6D8E
- 2.55 Å - 6JWL, 8WD4
- 2.551 Å - 6JWL, 9BY6
- 2.56 Å - 7OXB
- 2.57 Å - 9MST
- 2.59 Å - 8GB4, 9NM0
- 2.6 Å - 1XKK, 2GS6, 2GS7, 2RFE, 3BUO, 3UG2, 4GS6, 5CAN, 5CAO, 5CAV, 5CZI, 5EDR, 5FEE, 5HCX, 5HIC, 5J9Z, 5S8A, 6LUT, 6P1L, 6S8A, 6S9D, 7U9A, 7UKW
- 2.601 Å - 7K1H, 9H42
- 2.605 Å - 1YY9
- 2.61 Å - 7T4I
- 2.62 Å - 5HCZ, 8S1M, 9OLB
- 2.651 Å - 5FED
- 2.655 Å - 4KRM
- 2.662 Å - 7ER2
- 2.665 Å - 3B2U
- 2.67 Å - 6S9D, 9KLW
- 2.7 Å - 2ITT, 2RF9, 3LZB, 4ZJV, 5CAL, 5EM5, 5FEE, 5X2F, 5XDL, 6JZ0, 6S8A, 8F1Y, 8TJL, 8TO3
- 2.701 Å - 6S89
- 2.73 Å - 2ITY, 3B2V, 3QWQ, 4JQ7, 5CAV, 6S9C, 6V5P, 9NHW
- 2.74 Å - 2ITP
- 2.75 Å - 3QWQ, 4HJO, 3UG1, 9OS6
- 2.77 Å - 8HVA
- 2.78 Å - 5EM6, 4RJ4
- 2.8 Å - 1NQL, 2GS2, 2RFC, 2RFD, 3GOP, 3GT8, 4G5J, 4I23, 4ZAU, 5EM8, 5J9Y, 5SX4, 5WB8, 6P1L, 6VH4, 7LEN, 8F1H, 8HV2, 2ITU, 2JIV, 2ITZ, 3NJP, 4I24, 5CAL
- 2.801 Å - 3OP0
- 2.805 Å - 4RIW, 4RIX
- 2.81 Å - 5EM7
- 2.849 Å - 4KRL
- 2.88 Å - 2ITW
- 2.9 Å - 1NQL, 2RFE, 3IKA, 3GT8, 5EDQ, 5C8M, 5EDP, 5GTY, 5HG9, 5ZWJ, 6JRK, 7LEN, 8TO4, 8H7X
- 2.905 Å - 4R3P
- 2.91 Å - 8HY7
- 2.941 Å - 5WB7
- 2.95 Å - 3QWQ, 4UIP
- 2.951 Å - 5X26, 5X27, 5XGM
- 2.952 Å - 5X28
- 2.955 Å - 3GT8
- 2.959 Å - 2ITX
- 2.981 Å - 4RIY
- 3.0 Å - 2J5F, 4I1Z, 5C8K, 5B8, 5HIC, 5HIB, 5JEB, 7JXI, 8F1W, 8FV4
- 3.001 Å - 4R5S
- 3.02 Å - 9JQ1
- 3.05 Å - 2JIU
- 3.054 Å - 4KRO
- 3.1 Å - 2J5E, 2J6M, 2JIT, 3IKA, 3LZB, 4I20, 4RIX, 4RIY, 5EDQ, 5JEB, 5ZWJ, 6JX0, 7KY0, 7LGS, 7SYD, 8H7X
- 3.102 Å - 5Y25
- 3.12 Å - 9GI9
- 3.163 Å - 9H46
- 3.17 Å - 4G5P
- 3.2 Å - 3C09, 4TKS, 4UV7, 6ARU, 6JRX, 6LUB, 7LFR, 8F1W, 8KFQ
- 3.201 Å - 5X2K
- 3.2017 Å - 4TKS
- 3.202 Å - 7K1I
- 3.22 Å - 8KFQ, 9MSR
- 3.25 Å - 2ITO, 3B2V, 4R3R, 5JEB, 5Y9T, 6ARU, 6S9B
- 3.298 Å - 5JEB
- 3.3 Å - 1IVO, 1NQL, 3B2V, 3NJP, 8HGO, 7SYE
- 3.301 Å - 8UKX
- 3.304 Å - 3NJP
- 3.32 Å - 8EME
- 3.34 Å - 4I20
- 3.37 Å - 4I21
- 3.4 Å - 2JIV, 5FEQ, 6VHP, 7U98, 9IPB, 9U8C
- 3.404 Å - 8H7X
- 3.42 Å - 2ITY, 7U98
- 3.526 Å - 4LRM
- 3.6 Å - 2RFD, 6VHP, 7SZ5
- 6.05 Å - 7OM4
SOLUTION NMR (~6 structures, resolution not reported)
- 1Z9I
- 2KS1
- 2M0B
- 2M20
- 2N5S
- 5LV6
ELECTRON MICROSCOPY / CRYO-EM (~18 structures)
- 3.1 Å - 7SYD
- 3.21 Å - 9IP7
- 3.27 Å - 9P9U
- 3.29 Å - 9IPD
- 3.3 Å - 7SYE
- 3.31 Å - 8HGO, 9IPE
- 3.4 Å - 7SZ1, 9IPC
- 3.6 Å - 7SZ5
- 3.64 Å - 9IP9
- 3.81 Å - 8HGS
- 3.85 Å - 9IPA
- 3.91 Å - 9IP8
- 4.53 Å - 8HGP
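Because a PDB entry has a single reported resolution, any ID that appears under more than one resolution in lists like the ones above is worth re-checking against the PDB record. The sketch below parses lines in the `- <resolution> Å - <IDs>` layout used here and reports such IDs; the regular expression and the function name are illustrative assumptions, the unit is treated as optional, and the three sample lines are a small illustrative subset.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# "- 2.05 Å - 3W2P, 6LUD" -> resolution 2.05, IDs "3W2P, 6LUD"
LINE = re.compile(r"^-\s*(\d+(?:\.\d+)?)\s*(?:Å\s*)?-\s*(.+)$")


def index_by_pdb_id(lines: Iterable[str]) -> Dict[str, Set[float]]:
    """Map each PDB ID to the set of resolutions it is listed under."""
    seen: Dict[str, Set[float]] = defaultdict(set)
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip headings and malformed lines
        resolution = float(match.group(1))
        for pdb_id in match.group(2).split(","):
            seen[pdb_id.strip().upper()].add(resolution)
    return seen


sample = [
    "- 2.05 Å - 3W2P, 6LUD, 8S1M, 9OU9",
    "- 2.1 Å - 3OB2, 4UV7, 9OU9",
    "- 2.8 - 2ITU, 2JIV",
]

index = index_by_pdb_id(sample)
conflicts = {pdb_id: sorted(res) for pdb_id, res in index.items() if len(res) > 1}
print(conflicts)
```

Running the same check over the full list would give a worklist of entries to verify, without deciding which of the listed resolutions is correct.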
Predicted Structures
AlphaFold Model: P00533
- Model ID: AF-P00533-F1
- pLDDT (Global Confidence Score): 76.29
- Fraction with very high confidence (pLDDT > 90): 0.48 (48%)
- Sequence Length: 1210 amino acids
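To put the global pLDDT in context, the score can be mapped onto the confidence bands used in the AlphaFold Database legend (above 90, 70 to 90, 50 to 70, below 50). The sketch below is a minimal illustration; the band labels and the handling of exact boundary values are assumptions, and `plddt_band` is a hypothetical helper name.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT value (0-100) to a confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "high"
    if score > 50:
        return "low"
    return "very low"


GLOBAL_PLDDT = 76.29  # global score reported above for AF-P00533-F1
print(plddt_band(GLOBAL_PLDDT))
```

A global score in the 70 to 90 band, with about half of residues above 90, suggests per-residue confidence varies along the 1210-residue chain, so the per-residue values matter more than the average for any specific region.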
Cross-species orthologs
| Organism | Gene ID | Symbol |
|---|---|---|
| Mouse (Mus musculus) | ENSMUSG00000020122 | Egfr |
| Rat (Rattus norvegicus) | ENSRNOG00000004332 | Egfr |
| Zebrafish (Danio rerio) | none returned | - |
| Fruit fly (Drosophila melanogaster) | FBgn0003731 | Egfr |
| Worm (C. elegans) | 174462 | let-23 |
| Yeast (S. cerevisiae) | none | - |
Clinical variants & AI predictions
ClinVar Summary
| Classification | Count |
|---|---|
| Pathogenic | 63 |
| Likely Pathogenic | 40 |
| Total Pathogenic+LP | 103 |
| Uncertain Significance (VUS) | ~3,300 |
| Conflicting Classifications | ~250 |
| Benign/Likely Benign | ~137 |
| Total Variants | 3,790 |
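The category counts above can be checked against the stated total, and the pathogenic share derived from them. A minimal sketch, using the approximate counts exactly as listed in the table; the dictionary keys are just labels.

```python
clinvar_counts = {
    "Pathogenic": 63,
    "Likely Pathogenic": 40,
    "Uncertain Significance (VUS)": 3300,  # approximate in the table
    "Conflicting Classifications": 250,    # approximate in the table
    "Benign/Likely Benign": 137,           # approximate in the table
}

total = sum(clinvar_counts.values())
plp = clinvar_counts["Pathogenic"] + clinvar_counts["Likely Pathogenic"]
plp_percent = round(100 * plp / total, 1)

print(f"total = {total}, P/LP = {plp} ({plp_percent}%)")
for label, count in clinvar_counts.items():
    print(f"{label}: {100 * count / total:.1f}%")
```

The categories sum to the reported total, and pathogenic plus likely pathogenic variants make up under 3% of it; the large majority of records are variants of uncertain significance.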
Top 30 Pathogenic Variants
| ClinVar ID | HGVS Notation | Variant Type | Associated Condition or Consequence |
|---|---|---|---|
| 45225 | c.2156G>C (p.Gly719Ala) | SNV | Lung cancer; NSCLC |
| 45251 | c.2303G>T (p.Ser768Ile) | SNV | Lung cancer; NSCLC |
| 45252 | c.2303_2304delinsTT (p.Ser768Ile) | Indel | Lung cancer |
| 45279 | c.2500G>T (p.Val834Leu) | SNV | Lung cancer |
| 177620 | c.2236_2250del (p.Glu746_Ala750del) | Deletion | Lung cancer |
| 254143 | c.977G>T (p.Cys326Phe) | SNV | Cancer |
| 1005727 | c.1605C>A (p.Cys535Ter) | SNV | Loss of function |
| 1016241 | c.2917C>T (p.Arg973Ter) | SNV | Truncation |
| 1016463 | c.1536del (p.Glu513fs) | Deletion | Frameshift |
| 1023925 | c.2545C>T (p.Gln849Ter) | SNV | Truncation |
| 1429922 | c.877A>T (p.Lys293Ter) | SNV | Truncation |
| 1459376 | c.763C>T (p.Arg255Ter) | SNV | Truncation |
| 1508578 | c.2720T>A (p.Leu907Ter) | SNV | Truncation |
| 2582257 | c.1792G>A (p.Gly598Arg) | SNV | NSCLC |
| 2582258 | c.2561C>T (p.Thr854Ile) | SNV | NSCLC |
| 2582280 | c.1786C>T (p.Pro596Ser) | SNV | NSCLC |
| 2582281 | c.2287G>A (p.Ala763Thr) | SNV | NSCLC |
| 2746255 | c.2005C>T (p.Arg669Ter) | SNV | Truncation |
| 2757002 | c.2852G>A (p.Trp951Ter) | SNV | Truncation |
| 3630431 | c.2791G>T (p.Glu931Ter) | SNV | Truncation |
| 3639224 | c.3212C>G (p.Ser1071Ter) | SNV | Truncation |
| 3651305 | c.2451G>A (p.Trp817Ter) | SNV | Truncation |
| 3714966 | c.339C>G (p.Tyr113Ter) | SNV | Truncation |
| 3726208 | c.1430G>A (p.Trp477Ter) | SNV | Truncation |
| 638163 | c.2303_2305delinsTCT (p.Ser768_Val769delinsIleLeu) | Indel | NSCLC |
| 962242 | c.2745T>A (p.Tyr915Ter) | SNV | Truncation |
| 971184 | c.3147C>A (p.Cys1049Ter) | SNV | Truncation |
| 2578363 | c.2317delinsAACCCCT (p.His773delinsAsnProTyr) | Indel | Lung cancer |
| 45257 | c.2310_2311insGGGTTG (p.Asp770_Asn771insGlyLeu) | Insertion | Lung cancer |
| 1045120 | c.1418del (p.Asn473fs) | Deletion | Frameshift |
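The clinical literature usually refers to these variants in one-letter form (for example G719A), whereas the table uses three-letter HGVS protein notation. The sketch below converts simple substitutions and nonsense changes only; it deliberately returns `None` for deletions, insertions, and frameshifts. The function name is illustrative and this is not a full HGVS parser.

```python
import re
from typing import Optional

AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",  # stop codon
}

# Matches e.g. "p.Gly719Ala" or "p.Arg973Ter"
_SUBSTITUTION = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def short_protein_change(hgvs_p: str) -> Optional[str]:
    """Convert 'p.Gly719Ala' to 'G719A'; return None for anything else."""
    match = _SUBSTITUTION.fullmatch(hgvs_p.strip())
    if not match:
        return None
    ref, position, alt = match.groups()
    if ref not in AA3_TO_1 or alt not in AA3_TO_1:
        return None
    return f"{AA3_TO_1[ref]}{position}{AA3_TO_1[alt]}"


for example in ["p.Gly719Ala", "p.Ser768Ile", "p.Arg973Ter", "p.Glu746_Ala750del"]:
    print(example, "->", short_protein_change(example))
```

Returning `None` for complex changes keeps the helper from silently producing a misleading short name for indels such as the exon 19 deletions.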
Top 30 Likely Pathogenic Variants
| ClinVar ID | HGVS Notation | Variant Type |
|---|---|---|
| 1009894 | c.3162+1G>T | Splice donor |
| 1009989 | c.2947-1G>A | Splice acceptor |
| 1046386 | c.1919+2T>G | Splice donor |
| 1055190 | c.3162+1G>A | Splice donor |
| 1465036 | c.1007-1G>A | Splice acceptor |
| 1476721 | c.1919+1G>A | Splice donor |
| 1484719 | c.240+1G>A | Splice donor |
| 1951338 | c.2947-1G>T | Splice acceptor |
| 2002407 | c.1723-2A>G | Splice acceptor |
| 2011607 | c.2947-2A>G | Splice acceptor |
| 2028937 | c.1498+1G>T | Splice donor |
| 2046879 | c.629-2A>G | Splice acceptor |
| 2104454 | c.2469+1G>A | Splice donor |
| 2117789 | c.2181_2184+7del | Splice region deletion |
| 2445337 | c.1936A>T (p.Ile646Phe) | SNV |
| 2583126 | c.709_710delinsTT (p.Ala237Phe) | Indel |
| 2583127 | c.2252_2277delinsAT (p.Thr751_Ile759delinsAsn) | Indel |
| 2583128 | c.2318_2320delinsTCA (p.His773_Val774delinsLeuMet) | Indel |
| 2583129 | c.2499G>T (p.Leu833Phe) | SNV |
| 2583130 | c.2555A>C (p.Lys852Thr) | SNV |
| 2701342 | c.748-1G>A | Splice acceptor |
| 2909132 | c.628+1G>T | Splice donor |
| 3001184 | c.2061+1G>A | Splice donor |
| 3619883 | c.628+1G>A | Splice donor |
| 3645733 | c.560-1G>T | Splice acceptor |
| 3668692 | c.1631+1G>A | Splice donor |
| 3723702 | c.628+1del | Splice donor deletion |
| 3725171 | c.3163-2A>G | Splice acceptor |
| 45233 | c.2232_2249delinsAAA (p.Glu746_Ala750del) | Indel |
| 45237 | c.2240_2248del (p.Leu747_Ala750delinsSer) | Deletion |
AlphaMissense Predictions
Summary: ~8,041 total predictions on UniProt P00533; 100+ “likely_pathogenic” high-confidence predictions identified
Top 30 Likely Pathogenic Variants (am_pathogenicity score)
| Genomic Position | Protein Change | am_pathogenicity | Effect |
|---|---|---|---|
| 7:55142288:T:C | C31R | 0.997 | Strong disruption |
| 7:55142370:G:A | C58Y | 0.998 | Strong disruption |
| 7:55142371:T:G | C58W | 0.999 | Strong disruption |
| 7:55142288:T:A | C31S | 0.996 | Strong disruption |
| 7:55142370:G:C | C58S | 0.996 | Strong disruption |
| 7:55142289:G:C | C31S | 0.996 | Strong disruption |
| 7:55142289:T:C | C31Y | 0.995 | Strong disruption |
| 7:55142310:T:C | L38P | 0.995 | Strong disruption |
| 7:55142349:T:C | L51P | 0.995 | Strong disruption |
| 7:55142294:G:C | G33R | 0.995 | Strong disruption |
| 7:55142370:G:T | C58F | 0.998 | Strong disruption |
| 7:55142369:T:C | C58R | 0.997 | Strong disruption |
| 7:55142289:G:T | C31F | 0.989 | Strong disruption |
| 7:55142288:T:G | C31G | 0.982 | Strong disruption |
| 7:55142289:T:A | C31S | 0.996 | Strong disruption |
| 7:55142305:C:A | N36K | 0.982 | Strong disruption |
| 7:55142305:C:G | N36K | 0.982 | Strong disruption |
| 7:55142295:G:T | G33V | 0.980 | Strong disruption |
| 7:55142310:T:A | L38H | 0.981 | Strong disruption |
| 7:55142369:T:A | C58S | 0.996 | Strong disruption |
| 7:55142310:T:G | L38R | 0.950 | Strong disruption |
| 7:55142345:A:C | S50R | 0.946 | Strong disruption |
| 7:55142303:A:T | N36I | 0.934 | Strong disruption |
| 7:55142303:A:G | N36D | 0.915 | Strong disruption |
| 7:55142318:T:G | L41W | 0.789 | Moderate-strong |
| 7:55142319:T:C | L41S | 0.890 | Strong disruption |
| 7:55142315:C:A | Q40H | 0.639 | Moderate |
| 7:55142337:A:C | H47P | 0.987 | Strong disruption |
| 7:55142337:A:G | H47R | 0.945 | Strong disruption |
| 7:55142336:C:G | H47D | 0.977 | Strong disruption |
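The "Effect" labels in the table are informal. AlphaMissense itself assigns classes from the score using two published cutoffs: below 0.34 is likely benign, above 0.564 is likely pathogenic, and values in between are ambiguous. A minimal sketch of that mapping follows; the function name and the handling of values exactly on a cutoff are assumptions.

```python
LIKELY_BENIGN_MAX = 0.34       # scores below this are likely benign
LIKELY_PATHOGENIC_MIN = 0.564  # scores above this are likely pathogenic


def am_class(score: float) -> str:
    """Classify an am_pathogenicity score (0-1) into an AlphaMissense class."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("am_pathogenicity must be between 0 and 1")
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    return "ambiguous"


# Scores taken from rows of the table above
for change, score in [("C58W", 0.999), ("L41W", 0.789), ("Q40H", 0.639)]:
    print(change, score, am_class(score))
```

By these cutoffs, even the lowest-scoring row in the table (Q40H, 0.639) still falls in the likely pathogenic class.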
SpliceAI Predictions
Summary: 4,358 total splice effect predictions (100+ high-confidence shown)
Top 30 High-Impact Splice Effects
| Genomic Position | Effect Type | SpliceAI Score |
|---|---|---|
| 7:55019362:AAAGG:A | Donor loss | 0.99 |
| 7:55019363:AAGGT:A | Donor loss | 0.99 |
| 7:55019364:AGGT:A | Donor loss | 0.99 |
| 7:55019365:GGTA:G | Donor loss | 0.99 |
| 7:55019366:G:A | Donor loss | 0.99 |
| 7:55019367:T:G | Donor loss | 0.99 |
| 7:55019361:GAAAG:G | Donor gain | 0.99 |
| 7:55019363:AAG:A | Donor gain | 0.85 |
| 7:55020707:GTT:G | Donor gain | 0.85 |
| 7:55020708:TTT:T | Donor gain | 0.85 |
| 7:55019365:GG:G | Donor gain | 0.89 |
| 7:55019364:AG:A | Donor gain | 0.89 |
| 7:55019362:AAAG:A | Donor gain | 0.71 |
| 7:55020413:G:GT | Donor gain | 0.76 |
| 7:55020788:G:T | Donor gain | 0.76 |
| 7:55020088:G:GA | Donor gain | 0.62 |
| 7:55020704:A:T | Donor gain | 0.62 |
| 7:55019534:G:T | Donor gain | 0.77 |
| 7:55019339:C:T | Donor gain | 0.83 |
| 7:55020743:TTCC:T | Donor gain | 0.73 |
| 7:55020755:GGCCC:G | Donor gain | 0.74 |
| 7:55020756:GCCCG:G | Donor gain | 0.74 |
| 7:55020757:C:T | Donor gain | 0.68 |
| 7:55020788:G:GT | Donor gain | 0.68 |
| 7:55020103:T:A | Donor gain | 0.65 |
| 7:55019534:G:GT | Donor gain | 0.72 |
| 7:55020503:G:GT | Donor gain | 0.72 |
| 7:55020404:G:GT | Donor gain | 0.67 |
| 7:55020046:G:GT | Donor gain | 0.47 |
| 7:55020030:G:GT | Donor gain | 0.42 |
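The variant keys above follow a `chrom:pos:ref:alt` layout, and SpliceAI delta scores are commonly read against the cutoffs suggested by its authors: 0.2 (high recall), 0.5 (recommended), and 0.8 (high precision). The sketch below splits a key into fields, labels the variant kind from allele lengths, and assigns a tier. The tier names and helper functions are illustrative, not part of the SpliceAI tool.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Parse 'chrom:pos:ref:alt', e.g. '7:55019362:AAAGG:A'."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(variant: Variant) -> str:
    """Label a variant from the lengths of its ref and alt alleles."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    if len(variant.ref) < len(variant.alt):
        return "insertion"
    return "MNV"


def spliceai_tier(delta_score: float) -> str:
    """Bucket a SpliceAI delta score by the commonly used cutoffs."""
    if delta_score >= 0.8:
        return "high_precision"
    if delta_score >= 0.5:
        return "recommended"
    if delta_score >= 0.2:
        return "high_recall"
    return "below_cutoff"


# Rows taken from the table above
for key, score in [("7:55019362:AAAGG:A", 0.99), ("7:55020413:G:GT", 0.76), ("7:55020046:G:GT", 0.47)]:
    variant = parse_variant(key)
    print(key, variant_kind(variant), spliceai_tier(score))
```

Read this way, the rows scoring 0.47 and 0.42 pass only the high-recall cutoff, so they are weaker candidates than the donor-loss variants at the exon 1 boundary.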
Pathways & Gene Ontology
Reactome Pathways (37 total)
| Pathway ID | Pathway Name |
|---|---|
| R-HSA-177929 | Signaling by EGFR |
| R-HSA-179812 | GRB2 events in EGFR signaling |
| R-HSA-180336 | SHC1 events in EGFR signaling |
| R-HSA-182971 | EGFR downregulation |
| R-HSA-212718 | EGFR interacts with phospholipase C-gamma |
| R-HSA-2179392 | EGFR Transactivation by Gastrin |
| R-HSA-1227986 | Signaling by ERBB2 |
| R-HSA-1236382 | Constitutive Signaling by Ligand-Responsive EGFR Cancer Variants |
| R-HSA-1236394 | Signaling by ERBB4 |
| R-HSA-1250196 | SHC1 events in ERBB2 signaling |
| R-HSA-1251932 | PLCG1 events in ERBB2 signaling |
| R-HSA-1257604 | PIP3 activates AKT signaling |
| R-HSA-1963640 | GRB2 events in ERBB2 signaling |
| R-HSA-1963642 | PI3K events in ERBB2 signaling |
| R-HSA-2219530 | Constitutive Signaling by Aberrant PI3K in Cancer |
| R-HSA-5673001 | RAF/MAP kinase cascade |
| R-HSA-5637810 | Constitutive Signaling by EGFRvIII |
| R-HSA-5638303 | Inhibition of Signaling by Overexpressed EGFR |
| R-HSA-6785631 | ERBB2 Regulates Cell Motility |
| R-HSA-6811558 | PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling |
| R-HSA-8847993 | ERBB2 Activates PTK6 Signaling |
| R-HSA-8856825 | Cargo recognition for clathrin-mediated endocytosis |
| R-HSA-8856828 | Clathrin-mediated endocytosis |
| R-HSA-8857538 | PTK6 promotes HIF1A stabilization |
| R-HSA-8863795 | Downregulation of ERBB2 signaling |
| R-HSA-8866910 | TFAP2 (AP-2) family regulates transcription of growth factors and their receptors |
| R-HSA-9009391 | Extra-nuclear estrogen signaling |
| R-HSA-9013507 | NOTCH3 Activation and Transmission of Signal to the Nucleus |
| R-HSA-9609690 | HCMV Early Events |
| R-HSA-9634638 | Estrogen-dependent nuclear events downstream of ESR-membrane signaling |
| R-HSA-9664565 | Signaling by ERBB2 KD Mutants |
| R-HSA-9665348 | Signaling by ERBB2 ECD mutants |
| R-HSA-9665686 | Signaling by ERBB2 TMD/JMD mutants |
| R-HSA-9820960 | Respiratory syncytial virus (RSV) attachment and entry |
| R-HSA-445144 | Signal transduction by L1 |
| R-HSA-180292 | GAB1 signalosome |
| R-HSA-9927432 | Developmental Lineage of Mammary Gland Myoepithelial Cells |
MSigDB Gene Sets (833+ total)
EGFR is a member of 833+ MSigDB gene sets, including pathway collections (REACTOME, KEGG, PID, BioCarta), GO term sets (C5:GO), and cancer signatures.
Gene Ontology Annotations (104 total)
Biological Process (48+ terms)
| GO ID | Term |
|---|---|
| GO:0007173 | epidermal growth factor receptor signaling pathway |
| GO:0038134 | ERBB2-EGFR signaling pathway |
| GO:0007166 | cell surface receptor signaling pathway |
| GO:0007165 | signal transduction |
| GO:0043410 | positive regulation of MAPK cascade |
| GO:0070374 | positive regulation of ERK1 and ERK2 cascade |
| GO:0043491 | phosphatidylinositol 3-kinase/protein kinase B signal transduction |
| GO:0051897 | positive regulation of phosphatidylinositol 3-kinase/protein kinase B signal transduction |
| GO:0045742 | positive regulation of epidermal growth factor receptor signaling pathway |
| GO:0042059 | negative regulation of epidermal growth factor receptor signaling pathway |
| GO:0008284 | positive regulation of cell population proliferation |
| GO:0050679 | positive regulation of epithelial cell proliferation |
| GO:0050673 | epithelial cell proliferation |
| GO:0030307 | positive regulation of cell growth |
| GO:0030335 | positive regulation of cell migration |
| GO:0045944 | positive regulation of transcription by RNA polymerase II |
| GO:0045740 | positive regulation of DNA replication |
| GO:0045739 | positive regulation of DNA repair |
| GO:0042327 | positive regulation of phosphorylation |
| GO:0001934 | positive regulation of protein phosphorylation |
| GO:0050730 | regulation of peptidyl-tyrosine phosphorylation |
| GO:0033138 | positive regulation of peptidyl-serine phosphorylation |
| GO:0016567 | protein ubiquitination |
| GO:0006511 | ubiquitin-dependent protein catabolic process |
| GO:0070086 | ubiquitin-dependent endocytosis |
| GO:0051205 | protein insertion into membrane |
| GO:0010467 | gene expression |
| GO:0043066 | negative regulation of apoptotic process |
| GO:0048146 | positive regulation of fibroblast proliferation |
| GO:0071364 | cellular response to epidermal growth factor stimulus |
| GO:0071392 | cellular response to estradiol stimulus |
| GO:0071230 | cellular response to amino acid stimulus |
| GO:0090263 | positive regulation of canonical Wnt signaling pathway |
| GO:0090037 | positive regulation of protein kinase C signaling |
| GO:0030182 | neuron differentiation |
| GO:0021795 | cerebral cortex cell migration |
| GO:0048546 | digestive tract morphogenesis |
| GO:0007435 | salivary gland morphogenesis |
| GO:0001942 | hair follicle development |
| GO:0001892 | embryonic placenta development |
| GO:0098609 | cell-cell adhesion |
| GO:0000902 | cell morphogenesis |
| GO:0001503 | ossification |
| GO:0042908 | xenobiotic transport |
| GO:1903078 | positive regulation of protein localization to plasma membrane |
| GO:1900087 | positive regulation of G1/S transition of mitotic cell cycle |
| GO:1902895 | positive regulation of miRNA transcription |
| GO:1905208 | negative regulation of cardiocyte differentiation |
Molecular Function (19+ terms)
| GO ID | Term |
|---|---|
| GO:0005006 | epidermal growth factor receptor activity |
| GO:0004714 | transmembrane receptor protein tyrosine kinase activity |
| GO:0004713 | protein tyrosine kinase activity |
| GO:0004709 | MAP kinase kinase kinase activity |
| GO:0004888 | transmembrane signaling receptor activity |
| GO:0048408 | epidermal growth factor binding |
| GO:0001618 | virus receptor activity |
| GO:0005524 | ATP binding |
| GO:0019900 | kinase binding |
| GO:0030296 | protein tyrosine kinase activator activity |
| GO:0019899 | enzyme binding |
| GO:0019903 | protein phosphatase binding |
| GO:0031625 | ubiquitin protein ligase binding |
| GO:0045296 | cadherin binding |
| GO:0051015 | actin filament binding |
| GO:0051117 | ATPase binding |
| GO:0042802 | identical protein binding |
| GO:0003682 | chromatin binding |
| GO:0003690 | double-stranded DNA binding |
Cellular Component (32 terms)
| GO ID | Term |
|---|---|
| GO:0005886 | plasma membrane |
| GO:0070435 | Shc-EGFR complex |
| GO:0043235 | receptor complex |
| GO:0009986 | cell surface |
| GO:0005737 | cytoplasm |
| GO:0005829 | cytosol |
| GO:0005634 | nucleus |
| GO:0005615 | extracellular space |
| GO:0032991 | protein-containing complex |
| GO:0045121 | membrane raft |
| GO:0016020 | membrane |
| GO:0005794 | Golgi apparatus |
| GO:0000139 | Golgi membrane |
| GO:0005789 | endoplasmic reticulum membrane |
| GO:0005768 | endosome |
| GO:0031901 | early endosome membrane |
| GO:0010008 | endosome membrane |
| GO:0030669 | clathrin-coated endocytic vesicle membrane |
| GO:0005925 | focal adhesion |
| GO:0030054 | cell junction |
| GO:0031965 | nuclear membrane |
| GO:0048471 | perinuclear region of cytoplasm |
| GO:0032587 | ruffle membrane |
| GO:0009925 | basal plasma membrane |
| GO:0016323 | basolateral plasma membrane |
| GO:0005929 | cilium |
| GO:0036064 | ciliary basal body |
| GO:0097225 | sperm midpiece |
| GO:0097228 | sperm principal piece |
| GO:0097229 | sperm end piece |
| GO:0097489 | multivesicular body, internal vesicle lumen |
| GO:0097708 | intracellular vesicle |
Additional Biological Process terms (5)
| GO ID | Term |
|---|---|
| GO:0070141 | response to UV-A |
| GO:0060571 | morphogenesis of an epithelial fold |
| GO:0061029 | eyelid development in camera-type eye |
| GO:0007611 | learning or memory |
| GO:0042177 | negative regulation of protein catabolic process |
Protein interactions & networks
Protein-protein Interactions Summary
Total interaction count across major databases:
- STRING: 11,600 interactions
- BioGRID: 5,063 interactions
- IntAct: 1,747 interactions
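- Approximate total: 18,000+ documented interaction records (11,600 + 5,063 + 1,747 = 18,410, summed across the three databases without removing overlaps)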
TOP 30 Highest-Confidence Interacting Proteins
| Rank | Protein ID | Name | STRING Score | Evidence |
|---|---|---|---|---|
| 1 | P07900 | Heat shock protein HSP 90-alpha | 10,270 | Chaperone, client protein stabilization |
| 2 | P08238 | Heat shock protein HSP 90-beta | 10,072 | Chaperone, client protein stabilization |
| 3 | P12931 | Proto-oncogene tyrosine-protein kinase Src | 9,720 | Direct substrate phosphorylation |
| 4 | P40763 | Signal transducer and activator of transcription 3 (STAT3) | 8,628 | Signal transduction, transcriptional regulation |
| 5 | P01133 | Pro-epidermal growth factor (EGF) | 8,188 | Ligand-receptor binding |
| 6 | P03372 | Estrogen receptor alpha | 8,546 | Cross-talk signaling |
| 7 | P04626 | Receptor tyrosine-protein kinase erbB-2 (HER2) | 7,626 | Homo/heterodimerization |
| 8 | P12830 | Cadherin-1 | 7,460 | Cell-cell adhesion, signaling cross-talk |
| 9 | P16070 | CD44 antigen | 6,319 | Cell surface receptor co-engagement |
| 10 | Q03135 | Caveolin-1 | 5,352 | Membrane organization, endocytosis |
| 11 | P08581 | Hepatocyte growth factor receptor (c-Met) | 5,090 | Parallel receptor tyrosine kinase |
| 12 | P42336 | PI3K catalytic subunit alpha (PIK3CA) | 4,602 | Direct effector, phosphorylation cascade |
| 13 | Q06124 | Tyrosine-protein phosphatase non-receptor type 11 (SHP2) | 4,513 | Negative regulator, feedback signaling |
| 14 | Q15303 | Receptor tyrosine-protein kinase erbB-4 (HER4) | 3,750 | Homo/heterodimerization |
| 15 | P21860 | Receptor tyrosine-protein kinase erbB-3 (HER3) | 3,640 | Homo/heterodimerization, signal amplification |
| 16 | P07585 | Decorin | 3,040 | Extracellular modulator |
| 17 | P18031 | Tyrosine-protein phosphatase non-receptor type 1 (PTP1B) | 2,936 | Negative regulator, dephosphorylation |
| 18 | P29353 | SHC-transforming protein 1 | 2,920 | Adaptor protein, signal transduction |
| 19 | P01135 | Protransforming growth factor alpha (TGF-α) | 2,760 | Ligand, erbB pathway activation |
| 20 | P15514 | Amphiregulin | 2,691 | Ligand, erbB pathway activation |
| 21 | P22681 | E3 ubiquitin-protein ligase CBL | 2,650 | Ubiquitination, receptor degradation |
| 22 | Q99075 | Proheparin-binding EGF-like growth factor (HB-EGF) | 2,453 | Ligand, erbB pathway activation |
| 23 | Q14956 | Transmembrane glycoprotein NMB | 2,336 | Cell surface modulation |
| 24 | O14944 | Proepiregulin | 1,469 | Ligand, erbB pathway activation |
| 25 | Q13480 | GRB2-associated-binding protein 1 (GAB1) | 1,269 | Adaptor, scaffolding protein |
| 26 | P35070 | Probetacellulin | 798 | Ligand, erbB pathway activation |
| 27 | Q6UW88 | Epigen | 493 | Ligand, erbB pathway activation |
| 28-30 | Multiple | Various signaling intermediates | 400-700 | SH2/SH3 adaptor proteins, kinases |
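Ranks in a table like this follow from sorting on the score column. The sketch below does that for four rows copied from the table; the tuple layout and the function name `rank_by_score` are illustrative assumptions. The listed scores exceed the 0 to 1000 range of STRING combined scores, so the code treats them as opaque ranking values rather than as probabilities.

```python
def rank_by_score(rows):
    """Return (rank, accession, score) tuples, highest score first."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(rank, acc, score) for rank, (acc, score) in enumerate(ordered, start=1)]

# (UniProt accession, score) pairs copied from the table above.
rows = [("P07900", 10270), ("P01133", 8188), ("P40763", 8628), ("P12931", 9720)]

ranked = rank_by_score(rows)
for rank, acc, score in ranked:
    print(rank, acc, score)
```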
Protein Similarity
Structural/Embedding Similarity (ESM2):
Top 20 structurally similar proteins (receptor tyrosine kinases and paralogs):
- P04626 - Receptor tyrosine-protein kinase erbB-2 (HER2)
- P21860 - Receptor tyrosine-protein kinase erbB-3 (HER3)
- Q15303 - Receptor tyrosine-protein kinase erbB-4 (HER4)
- P08581 - Hepatocyte growth factor receptor (c-Met)
- P06213 - Insulin receptor
- P08069 - Insulin-like growth factor 1 receptor
- P11362 - Fibroblast growth factor receptor 1 (FGFR1)
- P21802 - Fibroblast growth factor receptor 2 (FGFR2)
- P09619 - Platelet-derived growth factor receptor beta
- P22455 - Fibroblast growth factor receptor 4 (FGFR4)
- P22607 - Fibroblast growth factor receptor 3 (FGFR3)
- P35968 - Vascular endothelial growth factor receptor 2
- P15209 - Fibroblast growth factor receptor 2 (FGFR2)
- P21709 - Ephrin type-A receptor 1 (EPHA1)
- P17948 - Fms-like tyrosine kinase 1 (FLT1)
- P07948 - Tyrosine-protein kinase Lyn
- P35590 - Receptor tyrosine-protein kinase Tie-1
- Q8N441 - Fibroblast growth factor receptor-like 1
- P07949 - Receptor tyrosine-protein kinase RET
- P23471 - Receptor-type tyrosine-protein phosphatase zeta
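Accession lists like the one above can be screened for well-formed identifiers before they are used in lookups. The sketch below applies the accession pattern published in the UniProt help pages; the function name `is_uniprot_accession` is illustrative. It checks syntax only and cannot confirm that an accession belongs to the protein named next to it.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

def is_uniprot_accession(text: str) -> bool:
    """True if the whole string is a syntactically valid UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text) is not None

for candidate in ["P04626", "Q15303", "ENSG00000146648", "P0462"]:
    print(candidate, is_uniprot_accession(candidate))
```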
Sequence Homology:
Top 20 homologous proteins by approximate kinase-domain identity (~33-85%; primarily erbB family and other protein kinases):
- P04626 - Receptor tyrosine-protein kinase erbB-2 (HER2) - ~85% identity over kinase domain
- P21860 - Receptor tyrosine-protein kinase erbB-3 (HER3) - ~80% identity over kinase domain
- Q15303 - Receptor tyrosine-protein kinase erbB-4 (HER4) - ~78% identity over kinase domain
- P08581 - Hepatocyte growth factor receptor (c-Met) - ~52% kinase domain identity
- P06213 - Insulin receptor - ~48% kinase domain identity
- P08069 - Insulin-like growth factor 1 receptor - ~48% kinase domain identity
- P11362 - Fibroblast growth factor receptor 1 - ~45% kinase domain identity
- P21802 - Fibroblast growth factor receptor 2 - ~45% kinase domain identity
- P22455 - Fibroblast growth factor receptor 4 - ~44% kinase domain identity
- P22607 - Fibroblast growth factor receptor 3 - ~44% kinase domain identity
- P09619 - Platelet-derived growth factor receptor beta - ~42% kinase domain identity
- P35968 - Vascular endothelial growth factor receptor 2 - ~41% kinase domain identity
- P54764 - Ephrin type-A receptor 4 - ~40% kinase domain identity
- P17948 - Fms-like tyrosine kinase 1 - ~41% kinase domain identity
- P21709 - Ephrin type-A receptor 1 - ~40% kinase domain identity
- P07948 - Tyrosine-protein kinase Lyn - ~52% kinase domain identity
- P35916 - Receptor-type tyrosine-protein kinase FLT4 - ~40% kinase domain identity
- P28482 - Mitogen-activated protein kinase 1 (ERK2) - ~35% kinase domain identity
- Q06187 - Tyrosine-protein kinase BTK - ~35% kinase domain identity
- P17612 - cAMP-dependent protein kinase catalytic subunit alpha - ~33% kinase domain identity
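Percent identity figures such as those above are computed from a pairwise alignment. The sketch below shows one common definition: identical positions divided by aligned positions where neither sequence has a gap. The function name and the short toy sequences are illustrative assumptions, not real kinase-domain alignments, and other tools may use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Toy aligned fragments (hypothetical), differing at one of ten positions.
print(percent_identity("HRDLAARNVL", "HRDLAARNIL"))
```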
Key Network Characteristics:
- EGFR is one of four erbB family receptors and signals through homodimers and heterodimers with HER2, HER3, and HER4
- Ligand-binding network: EGF, TGF-α, HB-EGF, Amphiregulin, Betacellulin, Epiregulin, and Epigen all activate EGFR
- Signaling hubs: Adaptor proteins (SHC1, GAB1), kinases (Src), chaperones (HSP90), and phosphatases (SHP2, PTP1B) integrate signals
- Regulatory nodes: CBL-mediated ubiquitination drives receptor internalization and degradation
- Structural homologs: EGFR belongs to a conserved family of ~60 receptor tyrosine kinases with modular architecture
Transcription factor regulatory data
EGFR (epidermal growth factor receptor) is a receptor tyrosine-protein kinase, not a DNA-binding transcription factor, so JASPAR motifs and downstream target lists do not apply.
Upstream regulators of EGFR
EGFR is regulated by 43 upstream transcription factors identified in the CollecTRI database. Key regulators include:
Activators (High Confidence)
- JUN — Activation | Sources: ExTRI, TRRUST, DoRothEA_A
- EGR1 — Activation | Sources: ExTRI, TRRUST, TFactS, DoRothEA_A
- AR (Androgen Receptor) — Activation | Sources: DoRothEA_A, ExTRI, TRRUST, GEREDB, NTNU Curated
- NFKB1 — Activation | Sources: TRRUST, DoRothEA_A, ExTRI, HTRI
- SOX2 — Activation | Sources: ExTRI, SIGNOR, NTNU Curated, Pavlidis2021
- HOXB7 — Activation | Sources: ExTRI, TRRUST
- BCL3 — Activation | Sources: ExTRI, TRRUST
- JUNB — Activation | Sources: ExTRI, TRRUST, DoRothEA_A
- BCL11B — Activation | Sources: ExTRI, Pavlidis2021
- FOS — Activation | Sources: ExTRI, NTNU Curated
- KLF5 — Activation | Sources: ExTRI
- SP4 — Activation | Sources: ExTRI
- FOXN1 — Activation | Sources: ExTRI
- ID1 — Activation | Sources: ExTRI
- ELF5 — Activation | Sources: ExTRI
- CEBPG — Activation | Sources: ExTRI
Repressors (High Confidence)
- KLF10 — Repression | Sources: ExTRI, TRRUST, NTNU Curated
- GCFC2 — Repression | Sources: ExTRI, NTNU Curated
- HDAC1 — Repression | Sources: TRRUST
- BRCA1 — Repression | Sources: TRRUST
- GLI1 — Repression | Sources: GEREDB
- EMX2 — Repression | Sources: GEREDB
Unknown Regulation (High Confidence)
- ESR1 (Estrogen Receptor α) | Sources: ExTRI, TRRUST, NTNU Curated, DoRothEA_A
- ESR2 (Estrogen Receptor β) | Sources: ExTRI, GEREDB, NTNU Curated
- GATA3 | Sources: HTRI, DoRothEA_A
- IRF1 | Sources: ExTRI, GEREDB
- SP3 | Sources: ExTRI, GEREDB
- CEBPB | Sources: ExTRI
- HDAC3 | Sources: TRRUST
- CREBBP | Sources: TRRUST
- E2F1 | Sources: DoRothEA_A
Low Confidence Regulators
- EGR2, FOSL1, FOXO3, FOXC1, ETS2, ERF, HOXA11, PAX3, DNMT1, GLI2, GLI3, HOXA7
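The stated total of 43 regulators can be checked against the four groups listed above. A small sketch, with the symbols copied from the lists and the group labels used only as dictionary keys:

```python
regulators = {
    "activators": ["JUN", "EGR1", "AR", "NFKB1", "SOX2", "HOXB7", "BCL3", "JUNB",
                   "BCL11B", "FOS", "KLF5", "SP4", "FOXN1", "ID1", "ELF5", "CEBPG"],
    "repressors": ["KLF10", "GCFC2", "HDAC1", "BRCA1", "GLI1", "EMX2"],
    "unknown": ["ESR1", "ESR2", "GATA3", "IRF1", "SP3", "CEBPB", "HDAC3",
                "CREBBP", "E2F1"],
    "low_confidence": ["EGR2", "FOSL1", "FOXO3", "FOXC1", "ETS2", "ERF", "HOXA11",
                       "PAX3", "DNMT1", "GLI2", "GLI3", "HOXA7"],
}

# Count per group, then confirm no symbol appears in more than one group.
group_sizes = {group: len(symbols) for group, symbols in regulators.items()}
all_symbols = [s for symbols in regulators.values() for s in symbols]
total = len(all_symbols)
print(group_sizes, "total:", total, "unique:", len(set(all_symbols)))
```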
Drug & pharmacology data
EGFR as Drug Target
EGFR (Epidermal growth factor receptor) is a well-established and extensively targeted protein in drug development. Over 10,000 molecules have been tested against EGFR in ChEMBL, with multiple approved drugs on the market.
Approved EGFR-Targeting Molecules (Top 5, Phase 4)
| Molecule | ID | Mechanism | Status | Clinical Trials |
|---|---|---|---|---|
| Osimertinib | CHEMBL3353410 | EGFR TKI (irreversible, 3rd generation) | Phase 4 | 228 trials |
| Erlotinib | CHEMBL553 | EGFR TKI (reversible, 1st generation) | Phase 4 | 496 trials |
| Gefitinib | CHEMBL939 | EGFR TKI (reversible, 1st generation) | Phase 4 | 294 trials |
| Afatinib | CHEMBL1173655 | EGFR/HER2 TKI (irreversible, 2nd generation) | Phase 4 | 179 trials |
| Lapatinib | CHEMBL554 | EGFR/HER2 dual TKI | Phase 4 | 261 trials |
All five drugs listed above are small-molecule tyrosine kinase inhibitors. The remaining molecules tested against EGFR in ChEMBL (roughly 10,000) are earlier-phase or preclinical compounds.
Clinical Trials (Selected, by Drug)
Erlotinib (CHEMBL553) — Primary indications: NSCLC, pancreatic cancer, head/neck cancer
- NCT01287754: NSCLC with EGFR mutations | Phase 4 | COMPLETED
- NCT01609543: 1st-line lung adenocarcinoma with EGFR mutations | Phase 4 | COMPLETED
- NCT00446225: NSCLC with EGFR TK domain mutations | Phase 3 | COMPLETED
- NCT01024413: Erlotinib vs Gefitinib in advanced NSCLC with EGFR exon 19/21 mutations | Phase 3 | COMPLETED
- NCT00349219 (TORCH): Erlotinib vs chemotherapy for advanced NSCLC | Phase 3 | COMPLETED
- NCT02296125: Osimertinib (AZD9291) vs Gefitinib/Erlotinib in NSCLC | Phase 3 | COMPLETED
- NCT02411448 (RELAY): Ramucirumab + Erlotinib for EGFR-mutant NSCLC | Phase 3 | ACTIVE
Gefitinib (CHEMBL939) — Primary indications: NSCLC (adenocarcinoma, especially never-smokers)
- NCT00076388 (IRESSA vs Docetaxel): Gefitinib vs chemotherapy | Phase 3 | COMPLETED
- NCT00322452 (IPASS): 1st-line Gefitinib vs carboplatin/paclitaxel in Asia | Phase 3 | COMPLETED
- NCT01774721 (ARCHER1050): Dacomitinib vs Gefitinib for 1st-line NSCLC | Phase 3 | COMPLETED
- NCT01404260: Gefitinib intercalating with chemotherapy | Phase 3 | COMPLETED
- NCT02296125: Osimertinib vs Gefitinib/Erlotinib | Phase 3 | COMPLETED
- NCT02588261: ASP8273 vs Erlotinib/Gefitinib with EGFR mutations | Phase 3 | TERMINATED
Afatinib (CHEMBL1173655) — Primary indications: NSCLC, head/neck cancer, squamous cell carcinoma
- NCT00949650 (LUX-Lung 3): Afatinib 1st-line vs chemotherapy in EGFR-mutant NSCLC | Phase 3 | COMPLETED
- NCT01121393 (LUX-Lung 6): Afatinib vs gemcitabine/cisplatin | Phase 3 | COMPLETED
- NCT01523587 (LUX-Lung 8): Afatinib vs Erlotinib in squamous NSCLC | Phase 3 | COMPLETED
- NCT01466660 (LUX-Lung 7): Afatinib vs Gefitinib for 1st-line EGFR-mutant adenocarcinoma | Phase 2 | COMPLETED
Osimertinib (CHEMBL3353410) — Primary indication: NSCLC (especially T790M resistance mutations)
- NCT02296125: Osimertinib vs Gefitinib/Erlotinib in NSCLC | Phase 3 | COMPLETED
- NCT02151981 (AURA3): Osimertinib in EGFR-mutant NSCLC with acquired T790M | Phase 3 | COMPLETED
Pharmacogenomics & Drug Response Predictors
Key EGFR Mutations Affecting Drug Response:
| Mutation | Clinical Relevance | Drug Sensitivity |
|---|---|---|
| Exon 19 deletion | ~45% of EGFR+ NSCLC | Sensitive to all 1st/2nd gen TKIs (erlotinib, gefitinib, afatinib) |
| L858R (exon 21 point mutation) | ~40% of EGFR+ NSCLC | Sensitive to 1st/2nd gen TKIs |
| T790M | Acquired resistance mechanism (~50% after progression on 1st/2nd gen TKI) | Resistant to 1st/2nd gen TKIs; sensitive to osimertinib (3rd gen) |
| Exon 20 insertion | ~5% of EGFR mutations | Generally resistant to 1st/2nd gen TKIs; variable osimertinib response |
| G719X | ~5% of mutations | Intermediate sensitivity |
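The table above can be encoded as a simple lookup for annotating genotyping results. This is a sketch only: the dictionary restates the Drug Sensitivity column verbatim, the function name `lookup_sensitivity` and the fallback string are illustrative assumptions, and it is not a substitute for clinical guidelines.

```python
# Drug-sensitivity column of the table above, keyed by lower-cased mutation name.
TKI_SENSITIVITY = {
    "exon 19 deletion": "Sensitive to all 1st/2nd gen TKIs (erlotinib, gefitinib, afatinib)",
    "l858r": "Sensitive to 1st/2nd gen TKIs",
    "t790m": "Resistant to 1st/2nd gen TKIs; sensitive to osimertinib (3rd gen)",
    "exon 20 insertion": "Generally resistant to 1st/2nd gen TKIs; variable osimertinib response",
    "g719x": "Intermediate sensitivity",
}

def lookup_sensitivity(mutation: str) -> str:
    """Return the table entry for a mutation, or a fallback if it is not listed."""
    return TKI_SENSITIVITY.get(mutation.strip().lower(), "Not listed in table")

print(lookup_sensitivity("T790M"))
print(lookup_sensitivity("C797S"))
```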
Dosing Considerations (from approved labels):
- Erlotinib: 150 mg daily oral; adjust for CYP3A4 inducers/inhibitors
- Gefitinib: 250 mg daily oral
- Afatinib: 40 mg daily oral starting dose; reduce the dose if severe diarrhea occurs (most common dose-limiting toxicity)
- Osimertinib: 80 mg daily (adjusted to 40 mg if not tolerated); better CNS penetration than 1st/2nd gen TKIs
- Lapatinib: 1250 mg daily (with capecitabine in HER2+ breast cancer)
No major pharmacogenomic variant panels (e.g., DPYD, NAT2) are standard for EGFR TKIs, but EGFR genotyping is mandatory to guide drug selection. T790M testing at progression determines osimertinib eligibility.
Expression profiles
Expression profiles for human EGFR (ENSG00000146648), from biobtree data:
Tissue Expression (Bgee)
EGFR shows ubiquitous expression across tissues with high prevalence (285 of 298 conditions present, gold quality).
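The "high prevalence" statement corresponds to the fraction of Bgee conditions in which EGFR is called present. A one-step sketch of that calculation, using the two counts quoted above (variable names are illustrative):

```python
present_conditions = 285   # conditions with a "present" call
total_conditions = 298     # all conditions assessed

prevalence_pct = 100 * present_conditions / total_conditions
print(f"EGFR called present in {prevalence_pct:.1f}% of conditions")
```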
Top 30 tissues by expression score:
| Rank | Tissue | Expression Score | Status |
|---|---|---|---|
| 1 | Nipple | 99.12 | Present |
| 2 | Gingiva | 98.63 | Present |
| 3 | Gingival epithelium | 98.62 | Present |
| 4 | Placenta | 98.56 | Present |
| 5 | Mammalian vulva | 98.48 | Present |
| 6 | Tongue squamous epithelium | 98.32 | Present |
| 7 | Skin of hip | 98.28 | Present |
| 8 | Superficial temporal artery | 97.94 | Present |
| 9 | Decidua | 97.78 | Present |
| 10 | Penis | 97.65 | Present |
| 11 | Pharyngeal mucosa | 97.63 | Present |
| 12 | Mucosa of paranasal sinus | 97.49 | Present |
| 13 | Urethra | 97.31 | Present |
| 14 | Saphenous vein | 97.30 | Present |
| 15 | Lower lobe of lung | 96.72 | Present |
| 16 | Oral cavity | 96.61 | Present |
| 17 | Sural nerve | 96.43 | Present |
| 18 | Superior surface of tongue | 96.40 | Present |
| 19 | Upper leg skin | 96.33 | Present |
| 20 | Mammary duct | 96.28 | Present |
| 21 | Tongue | 96.21 | Present |
| 22 | Upper arm skin | 96.11 | Present |
| 23 | Hair follicle | 95.77 | Present |
| 24 | Synovial joint | 95.73 | Present |
| 25 | Zone of skin | 95.68 | Present |
| 26 | Body of tongue | 95.67 | Present |
| 27 | Cauda epididymis | 95.66 | Present |
| 28 | Cervix epithelium | 95.56 | Present |
| 29 | Skin of leg | 95.49 | Present |
| 30 | Skin of abdomen | 95.43 | Present |
Pattern: Strong enrichment in epithelial tissues (skin, mucosa, gingiva), reproductive tissues, and vasculature—consistent with EGFR’s role in growth signaling in epithelial-derived cells.
Single-Cell Expression Datasets
Available SCXA (Single Cell Expression Atlas) datasets containing EGFR:
- E-ANND-2: GTEx snRNAseq atlas (209,126 cells) — comprehensive tissue atlas
- E-CURD-114: Human airway epithelium (in vivo) (81,801 cells) — smoking effects, epithelial-specific
- E-MTAB-6701: First trimester fetal-maternal interface (135,071 cells)
- E-MTAB-11268: Hypertrophied heart (64,898 cells)
- E-GEOD-84465: Glioblastoma infiltrating cells (3,588 cells)
- E-HCAD-24: First-trimester placenta and decidua (24,780 cells)
- E-MTAB-8559: Ovarian cancer models (20,982 cells)
- E-MTAB-9435: IDHwt glioblastoma tumors (62,867 cells)
- E-MTAB-10596: Dental follicle and organoids (3,388 cells)
- E-MTAB-10137: Dermal blood vascular endothelium (1,523 cells)
- E-ENAD-27: Human islet cells (1,145 cells)
Note: Detailed cell-type expression scores within these datasets require direct database access (ArrayExpress/EBI Single Cell Expression Atlas). For comprehensive cell-type breakdown with quantified expression, consult Human Protein Atlas (HPA) for tissue/cell-line data or the individual SCXA datasets.
Disease associations
Mendelian / Monogenic Disease
Primary EGFR-associated Mendelian diseases (curated gene-disease associations):
| Disease | Disease IDs | Inheritance | Evidence Level |
|---|---|---|---|
| Lung cancer | OMIM:211980 / MONDO:0008903 | Autosomal dominant | Definitive |
| Inflammatory skin and bowel disease, neonatal, 2 | OMIM:616069 / MONDO:0014481 | Autosomal recessive | Strong / Moderate |
| Neonatal erythroderma-autoinflammation-inflammatory bowel disease syndrome | Orphanet:294023 | Autosomal recessive | Supportive |
Additional somatic/complex cancer associations (from ClinVar):
- MONDO:0008170 – Ovarian cancer
- MONDO:0010150 – Head and neck squamous cell carcinoma
- MONDO:0002447 – Endometrial carcinoma
- MONDO:0016419 – Hereditary breast carcinoma
- MONDO:0019087 – Cholangiocarcinoma
- MONDO:0005061 – Lung adenocarcinoma
- MONDO:0005138 – Lung carcinoma
- MONDO:0005233 – Non-small cell lung carcinoma
- MONDO:0001187 – Urinary bladder cancer
- MONDO:0005097 – Squamous cell lung carcinoma
- MONDO:0007154 – Arteriovenous malformations of the brain
- MONDO:0023644 – Lip and oral cavity carcinoma
Orphanet rare disease classifications:
- Orphanet:140162 – Inherited cancer-predisposing syndrome
- Orphanet:227535 – Hereditary breast cancer
- Orphanet:70567 – Cholangiocarcinoma
- Orphanet:145 – Hereditary breast and/or ovarian cancer syndrome
Phenotype Associations (HPO Terms)
Top 21 clinical phenotypes associated with EGFR mutations:
| HPO ID | Phenotype |
|---|---|
| HP:0000006 | Autosomal dominant inheritance |
| HP:0000007 | Autosomal recessive inheritance |
| HP:0030078 | Lung adenocarcinoma |
| HP:0030358 | Non-small cell lung carcinoma |
| HP:0006519 | Alveolar cell carcinoma |
| HP:0006532 | Recurrent pneumonia |
| HP:0025092 | Epidermal acanthosis |
| HP:0005208 | Secretory diarrhea |
| HP:0003577 | Congenital onset |
| HP:0003212 | Increased circulating IgE concentration |
| HP:0002013 | Vomiting |
| HP:0001944 | Dehydration |
| HP:0001680 | Coarctation of aorta |
| HP:0001561 | Polyhydramnios |
| HP:0001508 | Failure to thrive |
| HP:0001442 | Typified by somatic mosaicism |
| HP:0000822 | Hypertension |
| HP:0000527 | Long eyelashes |
| HP:0200039 | Pustule |
| HP:0200034 | Papule |
| HP:0100501 | Recurrent bronchiolitis |
Complex Disease / GWAS Associations
Top 30 GWAS associations with EGFR locus:
| Trait / Disease | Lead Gene(s) | Chromosome | P-value | Significance |
|---|---|---|---|---|
| Glioblastoma | EGFR / SEC61G-DT-EGFR | 7 | 5e-34 | Very strong |
| Glioma | SEC61G-DT-EGFR | 7 | 4e-27 | Very strong |
| Glioblastoma (age-stratified) | SEC61G-DT-EGFR | 7 | 4e-16 | Very strong |
| Glioblastoma | SEC61G-DT-EGFR | 7 | 3e-16 | Very strong |
| Plateletcrit | EGFR | 7 | 6e-18 | Very strong |
| Platelet distribution width | EGFR | 7 | 1e-17 | Very strong |
| Glioma | SEC61G-DT-EGFR | 7 | 7e-12 | Strong |
| Platelet count | EGFR | 7 | 1e-12 | Strong |
| Glioblastoma | EGFR | 7 | 1e-12 | Strong |
| Glioblastoma (age-stratified) | SEC61G-DT-EGFR | 7 | 2e-12 | Strong |
| Glioma (age-stratified) | SEC61G-DT-EGFR | 7 | 7e-12 | Strong |
| Glioblastoma | SEC61G-DT-EGFR | 7 | 1e-11 | Strong |
| Glioblastoma (age-stratified) | EGFR | 7 | 2e-11 | Strong |
| Glioma | EGFR | 7 | 5e-12 | Strong |
| Haemorrhoidal disease | EGFR | 7 | 3e-13 | Very strong |
| Glioma | SEC61G-DT-EGFR | 7 | 2e-09 | Strong |
| Glioblastoma (age-stratified) | EGFR | 7 | 2e-09 | Strong |
| Mean corpuscular hemoglobin concentration | EGFR | 7 | 2e-09 | Strong |
| Monocyte count | EGFR | 7 | 4e-09 | Strong |
| Metabolite levels | SEC61G-DT-EGFR | 7 | 4e-09 | Strong |
| Glioma | SEC61G-DT-EGFR | 7 | 2e-09 | Strong |
| Mean spheric corpuscular volume | EGFR | 7 | 3e-09 | Strong |
| Non-glioblastoma glioma | SEC61G-DT-EGFR | 7 | 2e-08 | Strong |
| Glioma | EGFR | 7 | 7e-08 | Strong |
| Glioblastoma (age-stratified) | EGFR | 7 | 6e-08 | Strong |
| Glioma | SEC61G-DT-EGFR | 7 | 8e-08 | Strong |
| L1-L4 bone mineral density × serum urate levels interaction | EGFR | 7 | 6e-06 | Moderate |
| Subjective response to lithium treatment | SEC61G-DT-EGFR | 7 | 1e-06 | Moderate |
| Refractive astigmatism | EGFR | 7 | 1e-06 | Moderate |
| Metabolite levels | SEC61G-DT-EGFR | 7 | 1e-06 | Moderate |
Key observations:
- Strongest associations: Glioblastoma and glioma (p-values as low as 5e-34 and 4e-27), reflecting EGFR’s pivotal role in CNS malignancies
- Hematologic traits: Significant associations with plateletcrit (6e-18), platelet distribution width (1e-17), platelet count (1e-12), monocyte count (4e-09), and mean corpuscular hemoglobin concentration (2e-09)
- Cancer predisposition: EGFR mutations drive somatic and germline tumor susceptibility across multiple epithelial tissues (lung, ovarian, breast, head/neck)
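The Significance column in the GWAS table can be reproduced with two p-value cut-offs. The thresholds below are assumptions inferred from the table's labels (they are not stated in the source), and the function name `significance_label` is illustrative. Note that the conventional genome-wide threshold of 5e-08 is stricter than the "Strong" cut-off assumed here.

```python
# Assumed cut-offs, inferred from the labels in the table (not given in the source).
VERY_STRONG_BELOW = 1e-12
STRONG_BELOW = 1e-7

def significance_label(p_value: float) -> str:
    """Map a GWAS p-value to the table's three-level significance label."""
    if p_value < VERY_STRONG_BELOW:
        return "Very strong"
    if p_value < STRONG_BELOW:
        return "Strong"
    return "Moderate"

# A few p-values taken from the table above.
for p in [5e-34, 3e-13, 1e-12, 8e-08, 1e-06]:
    print(p, significance_label(p))
```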