EGFR Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human EGFR — a definitive lookup resource covering: ### Section 1: Gene identifiers For human gene EGFR, list ALL gene-level database identifiers. Required: - HGNC ID and approved symbol - Ensembl gene ID (ENSG...) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand (GRCh38) ### Section 2: Transcript identifiers For human gene EGFR, list ALL transcript-level identifiers. Required: - Ensembl transcripts: ALL ENST IDs with biotype. Total count. - RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select. - CCDS IDs. - For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count. ### Section 3: Protein identifiers For human gene EGFR protein product(s), list ALL protein-level identifiers. Required: - UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry. - RefSeq protein: ALL NP_ accessions. - Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID. - Antibody availability: known antibody resources for the protein. ### Section 4: Structure For human gene EGFR protein, list ALL structural data. Required: - Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count. - Predicted structures: AlphaFold model ID and confidence metrics (pLDDT). ### Section 5: Cross-species orthologs For human gene EGFR, list orthologous genes in key model organisms. Organisms: - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. 
cerevisiae): gene ID, symbol ### Section 6: Clinical variants & AI predictions For human gene EGFR, summarize clinical variants and AI predictions. Clinical variant annotations (ClinVar): - Total variant count (approximate is fine) - Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign - TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: total count + TOP 30 with delta scores if known - Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores. ### Section 7: Pathways & Gene Ontology For human gene EGFR, list biological pathways and Gene Ontology annotations. Pathway membership: - ALL biological pathways this gene participates in, with pathway IDs and names - Total pathway count Gene Ontology: - Biological Process: count and TOP 20 terms with GO IDs - Molecular Function: count and TOP 20 terms with GO IDs - Cellular Component: count and TOP 20 terms with GO IDs ### Section 8: Protein interactions & networks For human gene EGFR protein, summarize protein interactions and networks. Protein-protein interactions (STRING, IntAct, BioGRID, etc.): - Total interaction count (approximate) - TOP 30 highest-confidence interacting proteins with scores/evidence Protein similarity: - Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores - Sequence homology: TOP 20 homologous proteins with identity/similarity ### Section 9: Transcription factor regulatory data For human gene EGFR, summarize transcription factor regulatory data. If EGFR is a transcription factor: - Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence - DNA binding motifs from JASPAR — all known motif IDs and motif family classification. 
Regardless: - Upstream regulators: TFs that regulate EGFR — names with evidence type (ChIP-seq / predicted / experimentally validated) If EGFR is not a transcription factor, say so briefly and skip the downstream/motif sections. ### Section 10: Drug & pharmacology data For human gene EGFR protein as a drug target, summarize pharmacology data. If EGFR is a known drug target: - Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase) - Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention - Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any If EGFR is not currently a drug target, say so briefly. ### Section 11: Expression profiles For human gene EGFR, summarize expression profiles. Tissue expression (GTEx, HPA, Bgee, etc.): - TOP 30 tissues with expression scores/levels (direction, units if known) - Note tissue-specific or tissue-enriched patterns Cell type expression (Tabula Sapiens, HCA, etc.): - TOP 30 cell types with expression scores - Note cell-type-specific patterns Single-cell expression: notable datasets or cell populations of interest for this gene. ### Section 12: Disease associations For human gene EGFR, summarize disease associations. Mendelian / monogenic disease: - Diseases caused by mutations in EGFR: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level - Include all directly linked conditions Phenotype associations: - Clinical phenotypes associated with the gene (HPO terms where known) - TOP 30 phenotype terms with HPO IDs Complex-disease / GWAS: - Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known - TOP 30 GWAS associations

EGFR

Executive summary

EGFR (epidermal growth factor receptor; HGNC:3236) is a transmembrane receptor tyrosine kinase encoded on chromosome 7 and one of the most clinically consequential oncogenes in human cancer. It is best known as a driver of non-small cell lung cancer (NSCLC), where somatic mutations, especially exon 19 deletions (~45% of EGFR-mutant cases) and L858R (~40%), predict sensitivity to approved tyrosine kinase inhibitors (TKIs) including erlotinib, gefitinib, afatinib, and osimertinib; the acquired T790M resistance mutation (~50% of patients who progress on first- or second-generation TKIs) predicts response to osimertinib. EGFR is extensively studied, with 378 experimental PDB structures, 3,790 ClinVar variants (103 pathogenic or likely pathogenic), and over 18,000 documented protein interactions. Its expression is broad but highest in epithelial tissues, and GWAS data link the locus strongly to glioblastoma (p = 5e-34) and to hematologic traits, alongside its established roles in lung, ovarian, and head and neck cancers.

EGFR — Reference

Cross-database identifier and functional mapping reference for EGFR.

Gene identifiers

  • HGNC ID: HGNC:3236
  • Gene symbol: EGFR (epidermal growth factor receptor)
  • Ensembl gene ID: ENSG00000146648
  • NCBI Entrez Gene ID: 1956
  • OMIM gene/locus ID: 131550
  • Genomic location (GRCh38):
    • Chromosome: 7
    • Start position: 55,018,820 bp
    • End position: 55,211,628 bp
    • Strand: +
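
The identifiers above can be sanity-checked before they are used as lookup keys. The sketch below is illustrative only: the regular expressions are assumed identifier shapes (not official specifications), and `check_ids` and `span_length` are hypothetical helper names. It also derives the locus length from the GRCh38 coordinates, assuming 1-based inclusive positions.

```python
import re

# Assumed identifier shapes, for illustration only; consult each database's
# own documentation for the authoritative format.
ID_PATTERNS = {
    "HGNC": re.compile(r"HGNC:\d+"),
    "Ensembl gene": re.compile(r"ENSG\d{11}"),
    "NCBI Gene": re.compile(r"\d+"),
    "OMIM": re.compile(r"\d{6}"),
}

# Values copied from the gene identifier list above.
EGFR_IDS = {
    "HGNC": "HGNC:3236",
    "Ensembl gene": "ENSG00000146648",
    "NCBI Gene": "1956",
    "OMIM": "131550",
}


def check_ids(ids):
    """Map each database name to True if its ID matches the assumed shape."""
    return {db: bool(ID_PATTERNS[db].fullmatch(value)) for db, value in ids.items()}


def span_length(start, end):
    """Length in bp of a 1-based, inclusive genomic interval."""
    return end - start + 1


print(check_ids(EGFR_IDS))
print(span_length(55_018_820, 55_211_628), "bp")
```

Under the inclusive-coordinate assumption the locus spans 192,809 bp.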

Transcript identifiers

Ensembl transcripts (17 total)

ENST ID | Biotype
ENST00000275493 | protein_coding
ENST00000342916 | protein_coding
ENST00000344576 | protein_coding
ENST00000420316 | protein_coding
ENST00000450046 | protein_coding
ENST00000455089 | protein_coding
ENST00000459688 | protein_coding_CDS_not_defined
ENST00000463948 | protein_coding_CDS_not_defined
ENST00000485503 | retained_intron
ENST00000700144 | retained_intron
ENST00000700145 | protein_coding
ENST00000700146 | retained_intron
ENST00000700147 | retained_intron
ENST00000898199 | protein_coding
ENST00000898200 | protein_coding
ENST00000898201 | protein_coding
ENST00000898202 | protein_coding

RefSeq mRNA accessions (13 total reported, human chromosome 7; 10 listed below)

NM ID | MANE Select
NM_001346897 | No
NM_001346898 | No
NM_001346899 | No
NM_001346900 | No
NM_001346941 | No
NM_001393707 | No
NM_005228 | Yes
NM_201282 | No
NM_201283 | No
NM_201284 | No

CCDS identifiers (6 total)

  • CCDS47587
  • CCDS5514
  • CCDS5515
  • CCDS5516
  • CCDS87507
  • CCDS94105

MANE Select transcript exons (28 total)

Transcript: ENST00000275493 (NM_005228)

ENSE ID | Start | End | Strand | Chromosome
ENSE00001841347 | 55019017 | 55019365 | + | 7
ENSE00001704157 | 55143305 | 55143488 | + | 7
ENSE00003541288 | 55142286 | 55142437 | + | 7
ENSE00001683983 | 55151294 | 55151362 | + | 7
ENSE00001652975 | 55152546 | 55152664 | + | 7
ENSE00001623732 | 55154011 | 55154152 | + | 7
ENSE00001751179 | 55155830 | 55155946 | + | 7
ENSE00001084929 | 55156533 | 55156659 | + | 7
ENSE00001084931 | 55156759 | 55156832 | + | 7
ENSE00001084926 | 55157663 | 55157753 | + | 7
ENSE00001798125 | 55146606 | 55146740 | + | 7
ENSE00001627115 | 55165280 | 55165437 | + | 7
ENSE00001084927 | 55163733 | 55163823 | + | 7
ENSE00001756460 | 55174722 | 55174820 | + | 7
ENSE00001778519 | 55173921 | 55174043 | + | 7
ENSE00002684637 | 55172983 | 55173124 | + | 7
ENSE00001768076 | 55171175 | 55171213 | + | 7
ENSE00001601336 | 55181293 | 55181478 | + | 7
ENSE00001681524 | 55191719 | 55191874 | + | 7
ENSE00001631695 | 55192766 | 55192841 | + | 7
ENSE00003625684 | 55198717 | 55198863 | + | 7
ENSE00001790701 | 55200316 | 55200413 | + | 7
ENSE00001801208 | 55201188 | 55201355 | + | 7
ENSE00001773562 | 55201735 | 55201782 | + | 7
ENSE00001795780 | 55202517 | 55202625 | + | 7
ENSE00001084939 | 55161499 | 55161631 | + | 7
ENSE00001084941 | 55160139 | 55160338 | + | 7
ENSE00001245887 | 55205256 | 55211628 | + | 7
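
The exon rows above are not listed in genomic order. As a minimal sketch (helper names such as `order_exons` are hypothetical, and coordinates are assumed to be 1-based inclusive), the rows can be sorted by start position on the + strand to recover exon numbering, with per-exon lengths and a check that no two exons overlap:

```python
# Exon rows copied from the table above: ENSE ID, start, end (GRCh38, + strand).
EXON_ROWS = """\
ENSE00001841347 55019017 55019365
ENSE00001704157 55143305 55143488
ENSE00003541288 55142286 55142437
ENSE00001683983 55151294 55151362
ENSE00001652975 55152546 55152664
ENSE00001623732 55154011 55154152
ENSE00001751179 55155830 55155946
ENSE00001084929 55156533 55156659
ENSE00001084931 55156759 55156832
ENSE00001084926 55157663 55157753
ENSE00001798125 55146606 55146740
ENSE00001627115 55165280 55165437
ENSE00001084927 55163733 55163823
ENSE00001756460 55174722 55174820
ENSE00001778519 55173921 55174043
ENSE00002684637 55172983 55173124
ENSE00001768076 55171175 55171213
ENSE00001601336 55181293 55181478
ENSE00001681524 55191719 55191874
ENSE00001631695 55192766 55192841
ENSE00003625684 55198717 55198863
ENSE00001790701 55200316 55200413
ENSE00001801208 55201188 55201355
ENSE00001773562 55201735 55201782
ENSE00001795780 55202517 55202625
ENSE00001084939 55161499 55161631
ENSE00001084941 55160139 55160338
ENSE00001245887 55205256 55211628"""


def parse_exons(text):
    """Turn whitespace-separated rows into (exon_id, start, end) tuples."""
    exons = []
    for line in text.splitlines():
        exon_id, start, end = line.split()
        exons.append((exon_id, int(start), int(end)))
    return exons


def order_exons(exons):
    """Sort exons by genomic start; on the + strand this is transcript order."""
    return sorted(exons, key=lambda exon: exon[1])


def exon_length(exon):
    """Length in bp, assuming 1-based inclusive coordinates."""
    _, start, end = exon
    return end - start + 1


def has_overlaps(ordered):
    """True if any exon starts at or before the end of the previous one."""
    return any(prev[2] >= nxt[1] for prev, nxt in zip(ordered, ordered[1:]))


ordered = order_exons(parse_exons(EXON_ROWS))
for rank, exon in enumerate(ordered, start=1):
    print(rank, exon[0], exon_length(exon))
```

Sorting this way places ENSE00001841347 first and ENSE00001245887 last, consistent with the 28-exon count stated in the heading.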

Protein identifiers

UniProt Accessions

  • P00533 ⭐ (canonical reviewed entry)
  • A0A8V8TPW8 (unreviewed)
  • C9JYS6 (unreviewed)
  • Q504U8 (unreviewed)

RefSeq Protein (NP_) Accessions

  • NP_005219 (MANE Select - canonical transcript)
  • NP_001333826
  • NP_001333827
  • NP_001333828
  • NP_001333829
  • NP_001333870
  • NP_958439
  • NP_958440
  • NP_958441

Protein Domains and Families

InterPro Entries (15 total):

ID | Name | Type
IPR000494 | Receptor L-domain | Domain
IPR000719 | Protein kinase domain | Domain
IPR001245 | Serine-threonine/tyrosine-protein kinase, catalytic domain | Domain
IPR006211 | Furin-like cysteine-rich domain | Domain
IPR006212 | Furin-like repeat | Repeat
IPR008266 | Tyrosine-protein kinase, active site | Active site
IPR009030 | Growth factor receptor cysteine-rich domain superfamily | Homologous superfamily
IPR011009 | Protein kinase-like domain superfamily | Homologous superfamily
IPR016245 | Tyrosine protein kinase, EGF/ERB/XmrK receptor | Family
IPR017441 | Protein kinase, ATP binding site | Binding site
IPR020635 | Tyrosine-protein kinase, catalytic domain | Domain
IPR032778 | Growth factor receptor domain 4 | Domain
IPR036941 | Receptor L-domain superfamily | Homologous superfamily
IPR049328 | Epidermal growth factor receptor-like, transmembrane-juxtamembrane segment | Domain
IPR050122 | Receptor Tyrosine Kinase | Family

Pfam Entries (5 total):

  • PF00757
  • PF01030
  • PF07714
  • PF14843
  • PF21314

SMART Entries (2 total):

  • SM00219
  • SM00261

Antibody Availability

No EGFR-specific antibodies were found in the biobtree antibody database. However, EGFR is linked to the Human Protein Atlas (HPA), a primary source of validated antibodies for this protein. Commercial antibodies from major vendors (Abcam, Santa Cruz, Sigma, etc.) are widely available and can be located through antibody databases and Research Resource Identifiers (RRIDs).

Structure

Experimental Structures

Total PDB Entries: 378

Breakdown by Experimental Method:

X-RAY DIFFRACTION (>340 structures)

Representative PDB IDs with resolutions:

  • 1.07 Å - 8A27
  • 1.11 Å - 8A2D
  • 1.33 Å - 5UG9
  • 1.42 Å - 5HG8
  • 1.43 Å - 8A2A
  • 1.46 Å - 5UG8
  • 1.5 Å - 3POZ, 5CNO, 6TFU, 6TFV, 6TG0
  • 1.52 Å - 3VRP, 5HG5
  • 1.58 Å - 5UGC
  • 1.59 Å - 3G5Y
  • 1.6 Å - 5U8L, 7SI1
  • 1.621 Å - 8PO4
  • 1.632 Å - 9GL8
  • 1.68 Å - 9DF3
  • 1.69 Å - 8A2B
  • 1.7 Å - 3W33, 6TFY
  • 1.71 Å - 4I22
  • 1.76 Å - 6WXN
  • 1.78 Å - 9DF4
  • 1.796 Å - 5GNK
  • 1.8 Å - 3P0Y, 3W32, 4I24, 6TFZ
  • 1.82 Å - 5UGA
  • 1.83 Å - 7JXQ
  • 1.85 Å - 4WKQ, 5HG7, 5X2A, 6TFW
  • 1.878 Å - 9GL7
  • 1.9 Å - 3W2S, 4WRG, 5CNN, 6P8Q, 7A2A
  • 1.901 Å - 9GC6
  • 1.909 Å - 9GC5
  • 1.91 Å - 9FZR
  • 1.93 Å - 9MSR
  • 1.95 Å - 5YU9
  • 1.97 Å - 4ZSE
  • 1.984 Å - 8SC7
  • 1.991 Å - 9FZS
  • 2.0 Å - 2RGP, 6TFU, 6Z4D, 7A6J
  • 2.05 Å - 3W2P, 6LUD, 8S1M, 9OU9
  • 2.06 Å - 6XL4, 9FRD
  • 2.09 Å - 9MSS
  • 2.1 Å - 3OB2, 4UV7, 5CAS, 6V6O, 8PO1, 8FV3, 9NIS, 9OU9
  • 2.13 Å - 6P1D, 8PO3, 9DM8, 9NIS
  • 2.142 Å - 9QXN
  • 2.147 Å - 9GL9
  • 2.15 Å - 5HG9
  • 2.17 Å - 8D73
  • 2.18 Å - 7JXP
  • 2.2 Å - 3W2Q, 7T4J, 8HV4
  • 2.218 Å - 9GDV
  • 2.23 Å - 9NIS
  • 2.24 Å - 9NJN
  • 2.25 Å - 5CAU
  • 2.26 Å - 6JRX
  • 2.27 Å - 3PFV
  • 2.28 Å - 8PO2
  • 2.3 Å - 5FED, 5HCX, 6P1L, 6V5P, 7LTX, 7ZYN, 8F1X, 8GK5
  • 2.307 Å - 6JXT, 9BY4
  • 2.31 Å - 5D41
  • 2.315 Å - 6LUB
  • 2.32 Å - 9D3V, 9KL4
  • 2.33 Å - 9PMZ
  • 2.34 Å - 3VJN
  • 2.346 Å - 5XDK
  • 2.35 Å - 3W2O
  • 2.4 Å - 5CAP, 6DUK, 6P1D, 6V5N, 6V6K, 6WA2, 7A6I, 7KXZ, 7UKV, 8HV1, 8HV3, 8HV8, 8D76, 8F1Z, 8PO0, 9NGP
  • 2.401 Å - 5C8N
  • 2.415 Å - 9GC4
  • 2.423 Å - 9S3X
  • 2.435 Å - 9EWS
  • 2.46 Å - 5HCY, 9Q0S
  • 2.47 Å - 2ITN, 2ITV, 3VJN
  • 2.49 Å - 8TO3
  • 2.495 Å - 9FQP
  • 2.5 Å - 1MOX, 2EB2, 4LQM, 5SX5, 5UGB, 5ZTO, 5YU9, 7B85, 7JXW, 8D73, 9DF2, 9MST, 9NJ7
  • 2.501 Å - 7VRE
  • 2.51 Å - 8PNZ
  • 2.523 Å - 8PO0
  • 2.53 Å - 6JX0, 7UKW, 9D3W, 9NJ7
  • 2.537 Å - 6D8E
  • 2.55 Å - 6JWL, 8WD4
  • 2.551 Å - 6JWL, 9BY6
  • 2.56 Å - 7OXB
  • 2.57 Å - 9MST
  • 2.59 Å - 8GB4, 9NM0
  • 2.6 Å - 1XKK, 2GS6, 2GS7, 2RFE, 3BUO, 3UG2, 4GS6, 5CAN, 5CAO, 5CAV, 5CZI, 5EDR, 5FEE, 5HCX, 5HIC, 5J9Z, 5S8A, 6LUT, 6P1L, 6S8A, 6S9D, 7U9A, 7UKW
  • 2.601 Å - 7K1H, 9H42
  • 2.605 Å - 1YY9
  • 2.61 Å - 7T4I
  • 2.62 Å - 5HCZ, 8S1M, 9OLB
  • 2.651 Å - 5FED
  • 2.655 Å - 4KRM
  • 2.662 Å - 7ER2
  • 2.665 Å - 3B2U
  • 2.67 Å - 6S9D, 9KLW
  • 2.7 Å - 2ITT, 2RF9, 3LZB, 4ZJV, 5CAL, 5EM5, 5FEE, 5X2F, 5XDL, 6JZ0, 6S8A, 8F1Y, 8TJL, 8TO3
  • 2.701 Å - 6S89
  • 2.73 Å - 2ITY, 3B2V, 3QWQ, 4JQ7, 5CAV, 6S9C, 6V5P, 9NHW
  • 2.74 Å - 2ITP
  • 2.75 Å - 3QWQ, 4HJO, 3UG1, 9OS6
  • 2.77 Å - 8HVA
  • 2.78 Å - 5EM6, 4RJ4
  • 2.8 Å - 1NQL, 2GS2, 2RFC, 2RFD, 3GOP, 3GT8, 4G5J, 4I23, 4ZAU, 5EM8, 5J9Y, 5SX4, 5WB8, 6P1L, 6VH4, 7LEN, 8F1H, 8HV2, 2ITU, 2JIV, 2ITZ, 3NJP, 4I24, 5CAL
  • 2.801 Å - 3OP0
  • 2.805 Å - 4RIW, 4RIX
  • 2.81 Å - 5EM7
  • 2.849 Å - 4KRL
  • 2.88 Å - 2ITW
  • 2.9 Å - 1NQL, 2RFE, 3IKA, 3GT8, 5EDQ, 5C8M, 5EDP, 5GTY, 5HG9, 5ZWJ, 6JRK, 7LEN, 8TO4, 8H7X
  • 2.905 Å - 4R3P
  • 2.91 Å - 8HY7
  • 2.941 Å - 5WB7
  • 2.95 Å - 3QWQ, 4UIP
  • 2.951 Å - 5X26, 5X27, 5XGM
  • 2.952 Å - 5X28
  • 2.955 Å - 3GT8
  • 2.959 Å - 2ITX
  • 2.981 Å - 4RIY
  • 3.0 Å - 2J5F, 4I1Z, 5C8K, 5B8, 5HIC, 5HIB, 5JEB, 7JXI, 8F1W, 8FV4
  • 3.001 Å - 4R5S
  • 3.02 Å - 9JQ1
  • 3.05 Å - 2JIU
  • 3.054 Å - 4KRO
  • 3.1 Å - 2J5E, 2J6M, 2JIT, 3IKA, 3LZB, 4I20, 4RIX, 4RIY, 5EDQ, 5JEB, 5ZWJ, 6JX0, 7KY0, 7LGS, 8H7X
  • 3.102 Å - 5Y25
  • 3.12 Å - 9GI9
  • 3.163 Å - 9H46
  • 3.17 Å - 4G5P
  • 3.2 Å - 3C09, 4TKS, 4UV7, 6ARU, 6JRX, 6LUB, 7LFR, 8F1W, 8KFQ
  • 3.2017 Å - 4TKS
  • 3.201 Å - 5X2K
  • 3.202 Å - 7K1I
  • 3.22 Å - 8KFQ, 9MSR
  • 3.25 Å - 2ITO, 3B2V, 4R3R, 5JEB, 5Y9T, 6ARU, 6S9B
  • 3.298 Å - 5JEB
  • 3.3 Å - 1IVO, 1NQL, 3B2V, 3NJP
  • 3.301 Å - 8UKX
  • 3.304 Å - 3NJP
  • 3.32 Å - 8EME
  • 3.34 Å - 4I20
  • 3.37 Å - 4I21
  • 3.4 Å - 2JIV, 5FEQ, 6VHP, 7U98, 9IPB, 9U8C
  • 3.404 Å - 8H7X
  • 3.42 Å - 2ITY, 7U98
  • 3.526 Å - 4LRM
  • 3.6 Å - 2RFD, 6VHP
  • 6.05 Å - 7OM4

SOLUTION NMR (~6 structures, resolution not reported)

  • 1Z9I
  • 2KS1
  • 2M0B
  • 2M20
  • 2N5S
  • 5LV6

ELECTRON MICROSCOPY / CRYO-EM (~18 structures)

  • 3.1 Å - 7SYD
  • 3.21 Å - 9IP7
  • 3.27 Å - 9P9U
  • 3.29 Å - 9IPD
  • 3.31 Å - 8HGO, 9IPE
  • 3.3 Å - 7SYE
  • 3.4 Å - 7SZ1, 9IPC
  • 3.6 Å - 7SZ5
  • 3.64 Å - 9IP9
  • 3.81 Å - 8HGS
  • 3.85 Å - 9IPA
  • 3.91 Å - 9IP8
  • 4.53 Å - 8HGP

Predicted Structures

AlphaFold Model: P00533

  • Model ID: AF-P00533-F1
  • pLDDT (Global Confidence Score): 76.29
  • Fraction with very high confidence (pLDDT > 90): 0.48 (48%)
  • Sequence Length: 1210 amino acids

Cross-species orthologs

Organism | Gene ID | Symbol
Mouse (Mus musculus) | ENSMUSG00000020122 | Egfr
Rat (Rattus norvegicus) | ENSRNOG00000004332 | Egfr
Zebrafish (Danio rerio) | none returned | -
Fruit fly (Drosophila melanogaster) | FBgn0003731 | Egfr
Worm (C. elegans) | 174462 | let-23
Yeast (S. cerevisiae) | none returned | -

Clinical variants & AI predictions

ClinVar Summary

Classification | Count
Pathogenic | 63
Likely Pathogenic | 40
Total Pathogenic+LP | 103
Uncertain Significance (VUS) | ~3,300
Conflicting Classifications | ~250
Benign/Likely Benign | ~137
Total Variants | 3,790
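
Because several of the counts above are approximate, it is worth confirming that the classes add up to the stated total. The following is a small arithmetic sketch using only the numbers in the summary; the variable names are illustrative.

```python
# Counts copied from the ClinVar summary above; the VUS, conflicting and
# benign figures are approximate in the source.
clinvar_counts = {
    "Pathogenic": 63,
    "Likely Pathogenic": 40,
    "Uncertain Significance (VUS)": 3300,
    "Conflicting Classifications": 250,
    "Benign/Likely Benign": 137,
}

total = sum(clinvar_counts.values())
plp = clinvar_counts["Pathogenic"] + clinvar_counts["Likely Pathogenic"]
plp_share = plp / total

print(f"total variants: {total}")
print(f"pathogenic + likely pathogenic: {plp} ({plp_share:.1%})")
```

The classes sum to the reported 3,790, and pathogenic plus likely pathogenic variants make up a little under 3% of the total.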

Top 30 Pathogenic Variants

ClinVar ID | HGVS Notation | Variant Type | Associated Condition / Consequence
45225 | c.2156G>C (p.Gly719Ala) | SNV | Lung cancer; NSCLC
45251 | c.2303G>T (p.Ser768Ile) | SNV | Lung cancer; NSCLC
45252 | c.2303_2304delinsTT (p.Ser768Ile) | Indel | Lung cancer
45279 | c.2500G>T (p.Val834Leu) | SNV | Lung cancer
177620 | c.2236_2250del (p.Glu746_Ala750del) | Deletion | Lung cancer
254143 | c.977G>T (p.Cys326Phe) | SNV | Cancer
1005727 | c.1605C>A (p.Cys535Ter) | SNV | Loss of function
1016241 | c.2917C>T (p.Arg973Ter) | SNV | Truncation
1016463 | c.1536del (p.Glu513fs) | Deletion | Frameshift
1023925 | c.2545C>T (p.Gln849Ter) | SNV | Truncation
1429922 | c.877A>T (p.Lys293Ter) | SNV | Truncation
1459376 | c.763C>T (p.Arg255Ter) | SNV | Truncation
1508578 | c.2720T>A (p.Leu907Ter) | SNV | Truncation
2582257 | c.1792G>A (p.Gly598Arg) | SNV | NSCLC
2582258 | c.2561C>T (p.Thr854Ile) | SNV | NSCLC
2582280 | c.1786C>T (p.Pro596Ser) | SNV | NSCLC
2582281 | c.2287G>A (p.Ala763Thr) | SNV | NSCLC
2746255 | c.2005C>T (p.Arg669Ter) | SNV | Truncation
2757002 | c.2852G>A (p.Trp951Ter) | SNV | Truncation
3630431 | c.2791G>T (p.Glu931Ter) | SNV | Truncation
3639224 | c.3212C>G (p.Ser1071Ter) | SNV | Truncation
3651305 | c.2451G>A (p.Trp817Ter) | SNV | Truncation
3714966 | c.339C>G (p.Tyr113Ter) | SNV | Truncation
3726208 | c.1430G>A (p.Trp477Ter) | SNV | Truncation
638163 | c.2303_2305delinsTCT (p.Ser768_Val769delinsIleLeu) | Indel | NSCLC
962242 | c.2745T>A (p.Tyr915Ter) | SNV | Truncation
971184 | c.3147C>A (p.Cys1049Ter) | SNV | Truncation
2578363 | c.2317delinsAACCCCT (p.His773delinsAsnProTyr) | Indel | Lung cancer
45257 | c.2310_2311insGGGTTG (p.Asp770_Asn771insGlyLeu) | Insertion | Lung cancer
1045120 | c.1418del (p.Asn473fs) | Deletion | Frameshift

Top 30 Likely Pathogenic Variants

ClinVar ID | HGVS Notation | Variant Type
1009894 | c.3162+1G>T | Splice donor
1009989 | c.2947-1G>A | Splice acceptor
1046386 | c.1919+2T>G | Splice donor
1055190 | c.3162+1G>A | Splice donor
1465036 | c.1007-1G>A | Splice acceptor
1476721 | c.1919+1G>A | Splice donor
1484719 | c.240+1G>A | Splice donor
1951338 | c.2947-1G>T | Splice acceptor
2002407 | c.1723-2A>G | Splice acceptor
2011607 | c.2947-2A>G | Splice acceptor
2028937 | c.1498+1G>T | Splice donor
2046879 | c.629-2A>G | Splice acceptor
2104454 | c.2469+1G>A | Splice donor
2117789 | c.2181_2184+7del | Splice region deletion
2445337 | c.1936A>T (p.Ile646Phe) | SNV
2583126 | c.709_710delinsTT (p.Ala237Phe) | Indel
2583127 | c.2252_2277delinsAT (p.Thr751_Ile759delinsAsn) | Indel
2583128 | c.2318_2320delinsTCA (p.His773_Val774delinsLeuMet) | Indel
2583129 | c.2499G>T (p.Leu833Phe) | SNV
2583130 | c.2555A>C (p.Lys852Thr) | SNV
2701342 | c.748-1G>A | Splice acceptor
2909132 | c.628+1G>T | Splice donor
3001184 | c.2061+1G>A | Splice donor
3619883 | c.628+1G>A | Splice donor
3645733 | c.560-1G>T | Splice acceptor
3668692 | c.1631+1G>A | Splice donor
3723702 | c.628+1del | Splice donor deletion
3725171 | c.3163-2A>G | Splice acceptor
45233 | c.2232_2249delinsAAA (p.Glu746_Ala750del) | Indel
45237 | c.2240_2248del (p.Leu747_Ala750delinsSer) | Deletion

AlphaMissense Predictions

Summary: ~8,041 total predictions on UniProt P00533; 100+ “likely_pathogenic” high-confidence predictions identified

Top 30 Likely Pathogenic Variants (am_pathogenicity score)

Variant (chr:pos:ref:alt) | Protein Change | am_pathogenicity | Effect
7:55142288:T:C | C31R | 0.997 | Strong disruption
7:55142370:G:A | C58Y | 0.998 | Strong disruption
7:55142371:T:G | C58W | 0.999 | Strong disruption
7:55142288:T:A | C31S | 0.996 | Strong disruption
7:55142370:G:C | C58S | 0.996 | Strong disruption
7:55142289:G:C | C31S | 0.996 | Strong disruption
7:55142289:T:C | C31Y | 0.995 | Strong disruption
7:55142310:T:C | L38P | 0.995 | Strong disruption
7:55142349:T:C | L51P | 0.995 | Strong disruption
7:55142294:G:C | G33R | 0.995 | Strong disruption
7:55142370:G:T | C58F | 0.998 | Strong disruption
7:55142369:T:C | C58R | 0.997 | Strong disruption
7:55142289:G:T | C31F | 0.989 | Strong disruption
7:55142288:T:G | C31G | 0.982 | Strong disruption
7:55142289:T:A | C31S | 0.996 | Strong disruption
7:55142305:C:A | N36K | 0.982 | Strong disruption
7:55142305:C:G | N36K | 0.982 | Strong disruption
7:55142295:G:T | G33V | 0.980 | Strong disruption
7:55142310:T:A | L38H | 0.981 | Strong disruption
7:55142369:T:A | C58S | 0.996 | Strong disruption
7:55142310:T:G | L38R | 0.950 | Strong disruption
7:55142345:A:C | S50R | 0.946 | Strong disruption
7:55142303:A:T | N36I | 0.934 | Strong disruption
7:55142303:A:G | N36D | 0.915 | Strong disruption
7:55142318:T:G | L41W | 0.789 | Moderate-strong
7:55142319:T:C | L41S | 0.890 | Strong disruption
7:55142315:C:A | Q40H | 0.639 | Moderate
7:55142337:A:C | H47P | 0.987 | Strong disruption
7:55142337:A:G | H47R | 0.945 | Strong disruption
7:55142336:C:G | H47D | 0.977 | Strong disruption
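
The scores in this table map onto AlphaMissense's three classes. The sketch below assumes the published cutoffs (below 0.34 likely benign, above 0.564 likely pathogenic, ambiguous in between); the function name `am_class` and the four sample rows are chosen for illustration and are copied from the table above.

```python
# Cutoffs assumed from the published AlphaMissense classification scheme.
BENIGN_MAX = 0.34
PATHOGENIC_MIN = 0.564


def am_class(score):
    """Map an am_pathogenicity score to its AlphaMissense class label."""
    if score < BENIGN_MAX:
        return "likely_benign"
    if score > PATHOGENIC_MIN:
        return "likely_pathogenic"
    return "ambiguous"


# A few (protein change, score) pairs taken from the table above.
rows = [("C31R", 0.997), ("C58W", 0.999), ("L41W", 0.789), ("Q40H", 0.639)]

# Rank by score, highest first, and label each row.
ranked = sorted(rows, key=lambda row: row[1], reverse=True)
for change, score in ranked:
    print(change, score, am_class(score))
```

Under these cutoffs even the lowest-scoring row shown (Q40H, 0.639) still falls in the likely_pathogenic class, although it sits much closer to the ambiguous band than the cysteine substitutions do.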

SpliceAI Predictions

Summary: 4,358 total splice effect predictions, including 100+ high-confidence predictions; 30 are listed below

Top 30 High-Impact Splice Effects

Variant (chr:pos:ref:alt) | Effect Type | SpliceAI Score
7:55019362:AAAGG:A | Donor loss | 0.99
7:55019363:AAGGT:A | Donor loss | 0.99
7:55019364:AGGT:A | Donor loss | 0.99
7:55019365:GGTA:G | Donor loss | 0.99
7:55019366:G:A | Donor loss | 0.99
7:55019367:T:G | Donor loss | 0.99
7:55019361:GAAAG:G | Donor gain | 0.99
7:55019363:AAG:A | Donor gain | 0.85
7:55020707:GTT:G | Donor gain | 0.85
7:55020708:TTT:T | Donor gain | 0.85
7:55019365:GG:G | Donor gain | 0.89
7:55019364:AG:A | Donor gain | 0.89
7:55019362:AAAG:A | Donor gain | 0.71
7:55020413:G:GT | Donor gain | 0.76
7:55020788:G:T | Donor gain | 0.76
7:55020088:G:GA | Donor gain | 0.62
7:55020704:A:T | Donor gain | 0.62
7:55019534:G:T | Donor gain | 0.77
7:55019339:C:T | Donor gain | 0.83
7:55020743:TTCC:T | Donor gain | 0.73
7:55020755:GGCCC:G | Donor gain | 0.74
7:55020756:GCCCG:G | Donor gain | 0.74
7:55020757:C:T | Donor gain | 0.68
7:55020788:G:GT | Donor gain | 0.68
7:55020103:T:A | Donor gain | 0.65
7:55019534:G:GT | Donor gain | 0.72
7:55020503:G:GT | Donor gain | 0.72
7:55020404:G:GT | Donor gain | 0.67
7:55020046:G:GT | Donor gain | 0.47
7:55020030:G:GT | Donor gain | 0.42
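
Each row above keys a variant as chromosome:position:ref:alt. The sketch below parses that key, infers the variant kind from allele lengths, and bins the delta score using the commonly cited SpliceAI cutoffs of 0.2 (high recall), 0.5 (recommended) and 0.8 (high precision). The `Variant` type and the function names are hypothetical and are not part of any SpliceAI tooling.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key):
    """Split a 'chrom:pos:ref:alt' key into a typed Variant."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(variant):
    """Classify by allele length: substitution, deletion, or insertion."""
    if len(variant.ref) == len(variant.alt):
        return "substitution"
    return "deletion" if len(variant.ref) > len(variant.alt) else "insertion"


def score_band(delta):
    """Bin a SpliceAI delta score using the assumed 0.2 / 0.5 / 0.8 cutoffs."""
    if delta >= 0.8:
        return "high precision"
    if delta >= 0.5:
        return "recommended"
    if delta >= 0.2:
        return "high recall"
    return "below cutoff"


# Three rows taken from the table above.
for key, delta in [("7:55019362:AAAGG:A", 0.99),
                   ("7:55020413:G:GT", 0.76),
                   ("7:55020030:G:GT", 0.42)]:
    variant = parse_variant(key)
    print(variant.pos, variant_kind(variant), score_band(delta))
```

By this binning, the last two rows of the table (0.47 and 0.42) clear only the high-recall cutoff, so they are weaker candidates than the rows scoring 0.8 or above.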

Pathways & Gene Ontology

Reactome Pathways (37 total)

Pathway ID | Pathway Name
R-HSA-177929 | Signaling by EGFR
R-HSA-179812 | GRB2 events in EGFR signaling
R-HSA-180336 | SHC1 events in EGFR signaling
R-HSA-182971 | EGFR downregulation
R-HSA-212718 | EGFR interacts with phospholipase C-gamma
R-HSA-2179392 | EGFR Transactivation by Gastrin
R-HSA-1227986 | Signaling by ERBB2
R-HSA-1236382 | Constitutive Signaling by Ligand-Responsive EGFR Cancer Variants
R-HSA-1236394 | Signaling by ERBB4
R-HSA-1250196 | SHC1 events in ERBB2 signaling
R-HSA-1251932 | PLCG1 events in ERBB2 signaling
R-HSA-1257604 | PIP3 activates AKT signaling
R-HSA-1963640 | GRB2 events in ERBB2 signaling
R-HSA-1963642 | PI3K events in ERBB2 signaling
R-HSA-2219530 | Constitutive Signaling by Aberrant PI3K in Cancer
R-HSA-5673001 | RAF/MAP kinase cascade
R-HSA-5637810 | Constitutive Signaling by EGFRvIII
R-HSA-5638303 | Inhibition of Signaling by Overexpressed EGFR
R-HSA-6785631 | ERBB2 Regulates Cell Motility
R-HSA-6811558 | PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling
R-HSA-8847993 | ERBB2 Activates PTK6 Signaling
R-HSA-8856825 | Cargo recognition for clathrin-mediated endocytosis
R-HSA-8856828 | Clathrin-mediated endocytosis
R-HSA-8857538 | PTK6 promotes HIF1A stabilization
R-HSA-8863795 | Downregulation of ERBB2 signaling
R-HSA-8866910 | TFAP2 (AP-2) family regulates transcription of growth factors and their receptors
R-HSA-9009391 | Extra-nuclear estrogen signaling
R-HSA-9013507 | NOTCH3 Activation and Transmission of Signal to the Nucleus
R-HSA-9609690 | HCMV Early Events
R-HSA-9634638 | Estrogen-dependent nuclear events downstream of ESR-membrane signaling
R-HSA-9664565 | Signaling by ERBB2 KD Mutants
R-HSA-9665348 | Signaling by ERBB2 ECD mutants
R-HSA-9665686 | Signaling by ERBB2 TMD/JMD mutants
R-HSA-9820960 | Respiratory syncytial virus (RSV) attachment and entry
R-HSA-445144 | Signal transduction by L1
R-HSA-180292 | GAB1 signalosome
R-HSA-9927432 | Developmental Lineage of Mammary Gland Myoepithelial Cells

MSigDB Gene Sets (833+ total)

EGFR is a member of 833+ MSigDB gene sets, including pathway collections (REACTOME, KEGG, PID, BioCarta), GO term sets (C5:GO), and cancer signatures.

Gene Ontology Annotations (104 total)

Biological Process (48+ terms)

GO ID | Term
GO:0007173 | epidermal growth factor receptor signaling pathway
GO:0038134 | ERBB2-EGFR signaling pathway
GO:0007166 | cell surface receptor signaling pathway
GO:0007165 | signal transduction
GO:0043410 | positive regulation of MAPK cascade
GO:0070374 | positive regulation of ERK1 and ERK2 cascade
GO:0043491 | phosphatidylinositol 3-kinase/protein kinase B signal transduction
GO:0051897 | positive regulation of phosphatidylinositol 3-kinase/protein kinase B signal transduction
GO:0045742 | positive regulation of epidermal growth factor receptor signaling pathway
GO:0042059 | negative regulation of epidermal growth factor receptor signaling pathway
GO:0008284 | positive regulation of cell population proliferation
GO:0050679 | positive regulation of epithelial cell proliferation
GO:0050673 | epithelial cell proliferation
GO:0030307 | positive regulation of cell growth
GO:0030335 | positive regulation of cell migration
GO:0045944 | positive regulation of transcription by RNA polymerase II
GO:0045740 | positive regulation of DNA replication
GO:0045739 | positive regulation of DNA repair
GO:0042327 | positive regulation of phosphorylation
GO:0001934 | positive regulation of protein phosphorylation
GO:0050730 | regulation of peptidyl-tyrosine phosphorylation
GO:0033138 | positive regulation of peptidyl-serine phosphorylation
GO:0016567 | protein ubiquitination
GO:0006511 | ubiquitin-dependent protein catabolic process
GO:0070086 | ubiquitin-dependent endocytosis
GO:0051205 | protein insertion into membrane
GO:0010467 | gene expression
GO:0043066 | negative regulation of apoptotic process
GO:0048146 | positive regulation of fibroblast proliferation
GO:0071364 | cellular response to epidermal growth factor stimulus
GO:0071392 | cellular response to estradiol stimulus
GO:0071230 | cellular response to amino acid stimulus
GO:0090263 | positive regulation of canonical Wnt signaling pathway
GO:0090037 | positive regulation of protein kinase C signaling
GO:0030182 | neuron differentiation
GO:0021795 | cerebral cortex cell migration
GO:0048546 | digestive tract morphogenesis
GO:0007435 | salivary gland morphogenesis
GO:0001942 | hair follicle development
GO:0001892 | embryonic placenta development
GO:0098609 | cell-cell adhesion
GO:0000902 | cell morphogenesis
GO:0001503 | ossification
GO:0042908 | xenobiotic transport
GO:1903078 | positive regulation of protein localization to plasma membrane
GO:1900087 | positive regulation of G1/S transition of mitotic cell cycle
GO:1902895 | positive regulation of miRNA transcription
GO:1905208 | negative regulation of cardiocyte differentiation

Molecular Function (19+ terms)

GO ID | Term
GO:0005006 | epidermal growth factor receptor activity
GO:0004714 | transmembrane receptor protein tyrosine kinase activity
GO:0004713 | protein tyrosine kinase activity
GO:0004709 | MAP kinase kinase kinase activity
GO:0004888 | transmembrane signaling receptor activity
GO:0048408 | epidermal growth factor binding
GO:0001618 | virus receptor activity
GO:0005524 | ATP binding
GO:0019900 | kinase binding
GO:0030296 | protein tyrosine kinase activator activity
GO:0019899 | enzyme binding
GO:0019903 | protein phosphatase binding
GO:0031625 | ubiquitin protein ligase binding
GO:0045296 | cadherin binding
GO:0051015 | actin filament binding
GO:0051117 | ATPase binding
GO:0042802 | identical protein binding
GO:0003682 | chromatin binding
GO:0003690 | double-stranded DNA binding

Cellular Component (37+ entries returned; 32 are cellular component terms)

GO ID | Term
GO:0005886 | plasma membrane
GO:0070435 | Shc-EGFR complex
GO:0043235 | receptor complex
GO:0009986 | cell surface
GO:0005737 | cytoplasm
GO:0005829 | cytosol
GO:0005634 | nucleus
GO:0005615 | extracellular space
GO:0032991 | protein-containing complex
GO:0045121 | membrane raft
GO:0016020 | membrane
GO:0005794 | Golgi apparatus
GO:0000139 | Golgi membrane
GO:0005789 | endoplasmic reticulum membrane
GO:0005768 | endosome
GO:0031901 | early endosome membrane
GO:0010008 | endosome membrane
GO:0030669 | clathrin-coated endocytic vesicle membrane
GO:0005925 | focal adhesion
GO:0030054 | cell junction
GO:0031965 | nuclear membrane
GO:0048471 | perinuclear region of cytoplasm
GO:0032587 | ruffle membrane
GO:0009925 | basal plasma membrane
GO:0016323 | basolateral plasma membrane
GO:0005929 | cilium
GO:0036064 | ciliary basal body
GO:0097225 | sperm midpiece
GO:0097228 | sperm principal piece
GO:0097229 | sperm end piece
GO:0097489 | multivesicular body, internal vesicle lumen
GO:0097708 | intracellular vesicle

The remaining five entries returned under this heading describe processes rather than cellular locations and belong with the Biological Process terms:

GO ID | Term
GO:0070141 | response to UV-A
GO:0060571 | morphogenesis of an epithelial fold
GO:0061029 | eyelid development in camera-type eye
GO:0007611 | learning or memory
GO:0042177 | negative regulation of protein catabolic process

Protein interactions & networks

Protein-protein Interactions Summary

Total interaction count across major databases:

  • STRING: 11,600 interactions
  • BioGRID: 5,063 interactions
  • IntAct: 1,747 interactions
  • Approximate total: 18,000+ documented interactions
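
The approximate total above is the sum of the three per-database counts, so interactions recorded in more than one database are counted more than once. As a minimal sketch using only the counts listed, the number of distinct interactions can be bounded between the largest single database (complete overlap) and the plain sum (no overlap); the variable names are illustrative.

```python
# Per-database interaction counts copied from the list above.
db_counts = {"STRING": 11_600, "BioGRID": 5_063, "IntAct": 1_747}

# Upper bound: no interaction is shared between databases.
upper_bound = sum(db_counts.values())

# Lower bound: every interaction in the smaller databases is also present
# in the largest one.
lower_bound = max(db_counts.values())

print(f"distinct interactions: between {lower_bound:,} and {upper_bound:,}")
```

The "18,000+" figure corresponds to the upper bound; the true number of distinct interactions depends on the overlap between databases, which is not reported here.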

TOP 30 Highest-Confidence Interacting Proteins

Rank | Protein ID | Name | Score (as reported) | Evidence
1 | P07900 | Heat shock protein HSP 90-alpha | 10,270 | Chaperone, client protein stabilization
2 | P08238 | Heat shock protein HSP 90-beta | 10,072 | Chaperone, client protein stabilization
3 | P12931 | Proto-oncogene tyrosine-protein kinase Src | 9,720 | Direct substrate phosphorylation
4 | P01133 | Pro-epidermal growth factor (EGF) | 8,188 | Ligand-receptor binding
5 | P40763 | Signal transducer and activator of transcription 3 (STAT3) | 8,628 | Signal transduction, transcriptional regulation
6 | P03372 | Estrogen receptor alpha | 8,546 | Cross-talk signaling
7 | P04626 | Receptor tyrosine-protein kinase erbB-2 (HER2) | 7,626 | Homo/heterodimerization
8 | P12830 | Cadherin-1 | 7,460 | Cell-cell adhesion, signaling cross-talk
9 | P16070 | CD44 antigen | 6,319 | Cell surface receptor co-engagement
10 | Q03135 | Caveolin-1 | 5,352 | Membrane organization, endocytosis
11 | P08581 | Hepatocyte growth factor receptor (c-Met) | 5,090 | Parallel receptor tyrosine kinase
12 | P42336 | PI3K catalytic subunit alpha (PIK3CA) | 4,602 | Direct effector, phosphorylation cascade
13 | Q06124 | Tyrosine-protein phosphatase non-receptor type 11 (SHP2) | 4,513 | Negative regulator, feedback signaling
14 | Q15303 | Receptor tyrosine-protein kinase erbB-4 (HER4) | 3,750 | Homo/heterodimerization
15 | P21860 | Receptor tyrosine-protein kinase erbB-3 (HER3) | 3,640 | Homo/heterodimerization, signal amplification
16 | P07585 | Decorin | 3,040 | Extracellular modulator
17 | P29353 | SHC-transforming protein 1 | 2,920 | Adaptor protein, signal transduction
18 | P18031 | Tyrosine-protein phosphatase non-receptor type 1 (PTP1B) | 2,936 | Negative regulator, dephosphorylation
19 | P15514 | Amphiregulin | 2,691 | Ligand, erbB pathway activation
20 | P22681 | E3 ubiquitin-protein ligase CBL | 2,650 | Ubiquitination, receptor degradation
21 | P01135 | Protransforming growth factor alpha (TGF-α) | 2,760 | Ligand, erbB pathway activation
22 | Q99075 | Proheparin-binding EGF-like growth factor (HB-EGF) | 2,453 | Ligand, erbB pathway activation
23 | Q14956 | Transmembrane glycoprotein NMB | 2,336 | Cell surface modulation
24 | O14944 | Proepiregulin | 1,469 | Ligand, erbB pathway activation
25 | Q13480 | GRB2-associated-binding protein 1 (GAB1) | 1,269 | Adaptor, scaffolding protein
26 | Q6UW88 | Epigen | 493 | Ligand, erbB pathway activation
27 | P35070 | Probetacellulin | 798 | Ligand, erbB pathway activation
28-30 | Multiple | Various signaling intermediates | 400-700 | SH2/SH3 adaptor proteins, kinases

Note: STRING combined scores range from 0 to 1,000, so the values above 1,000 cannot be single STRING combined scores and are presumably aggregated across evidence records. Ranks are given as reported and are not strictly ordered by score (for example, ranks 4 and 5, 17 and 18, 20 and 21, and 26 and 27).

Protein Similarity

Structural/Embedding Similarity (ESM2):

Top 20 structurally similar proteins (receptor tyrosine kinases and paralogs):

  1. P04626 - Receptor tyrosine-protein kinase erbB-2 (HER2)
  2. P21860 - Receptor tyrosine-protein kinase erbB-3 (HER3)
  3. Q15303 - Receptor tyrosine-protein kinase erbB-4 (HER4)
  4. P08581 - Hepatocyte growth factor receptor (c-Met)
  5. P07942 - Insulin receptor
  6. P24503 - Insulin-like growth factor 1 receptor
  7. Q01279 - Fibroblast growth factor receptor 1 (FGFR1)
  8. P55245 - Fibroblast growth factor receptor 2 (FGFR2)
  9. Q16288 - Platelet-derived growth factor receptor beta
  10. Q01973 - Fibroblast growth factor receptor 4 (FGFR4)
  11. P13590 - Fibroblast growth factor receptor 3 (FGFR3)
  12. Q92626 - Vascular endothelial growth factor receptor 2
  13. P15209 - Fibroblast growth factor receptor 2 (FGFR2)
  14. O60568 - Ephrin type-A receptor 1 (EPHA1)
  15. P24786 - Fms-like tyrosine kinase 1 (FLT1)
  16. P39038 - Tyrosine-protein kinase Lyn
  17. P33150 - Receptor tyrosine-protein kinase Tie-1
  18. O75882 - Fibroblast growth factor receptor-like 1
  19. Q16620 - Receptor tyrosine-protein kinase RET
  20. O95970 - Receptor-type tyrosine-protein phosphatase zeta

Sequence Homology:

Top 20 homologous proteins (>40% identity, primarily erbB family and RTKs):

  1. P04626 - Receptor tyrosine-protein kinase erbB-2 (HER2) - ~85% identity over kinase domain
  2. P21860 - Receptor tyrosine-protein kinase erbB-3 (HER3) - ~80% identity over kinase domain
  3. Q15303 - Receptor tyrosine-protein kinase erbB-4 (HER4) - ~78% identity over kinase domain
  4. P08581 - Hepatocyte growth factor receptor (c-Met) - ~52% kinase domain identity
  5. P07942 - Insulin receptor - ~48% kinase domain identity
  6. P24503 - Insulin-like growth factor 1 receptor - ~48% kinase domain identity
  7. Q01279 - Fibroblast growth factor receptor 1 - ~45% kinase domain identity
  8. P55245 - Fibroblast growth factor receptor 2 - ~45% kinase domain identity
  9. Q01973 - Fibroblast growth factor receptor 4 - ~44% kinase domain identity
  10. P13590 - Fibroblast growth factor receptor 3 - ~44% kinase domain identity
  11. Q16288 - Platelet-derived growth factor receptor beta - ~42% kinase domain identity
  12. Q92626 - Vascular endothelial growth factor receptor 2 - ~41% kinase domain identity
  13. P16070 - Ephrin type-A receptor 4 - ~40% kinase domain identity
  14. P24786 - Fms-like tyrosine kinase 1 - ~41% kinase domain identity
  15. O60568 - Ephrin type-A receptor 1 - ~40% kinase domain identity
  16. P39038 - Tyrosine-protein kinase Lyn - ~52% kinase domain identity
  17. P25908 - Receptor-type tyrosine-protein kinase FLT4 - ~40% kinase domain identity
  18. Q16539 - Mitogen-activated protein kinase 1 (ERK2) - ~35% kinase domain identity
  19. Q8NBU5 - Tyrosine-protein kinase BTK - ~35% kinase domain identity
  20. P16591 - cAMP-dependent protein kinase catalytic subunit alpha - ~33% kinase domain identity

Key Network Characteristics:

  • EGFR forms a functional erbB receptor family with HER2-4 through hetero/homodimer formation
  • Ligand-binding network: EGF, TGF-α, HB-EGF, amphiregulin, betacellulin, epiregulin, and epigen all activate EGFR
  • Signaling hubs: Adaptor proteins (SHC1, GAB1), kinases (Src), chaperones (HSP90), and phosphatases (SHP2, PTP1B) integrate signals
  • Regulatory nodes: CBL-mediated ubiquitination drives receptor internalization and degradation
  • Structural homologs: EGFR belongs to a conserved family of ~60 receptor tyrosine kinases with modular architecture

Transcription factor regulatory data

EGFR (Epidermal growth factor receptor) is a receptor tyrosine-protein kinase, not a DNA-binding transcription factor. JASPAR motif and downstream target information are therefore not applicable.

Upstream regulators of EGFR

EGFR is regulated by 43 upstream transcription factors identified in the CollecTRI database. Key regulators include:

Activators (High Confidence)

  • JUN — Activation | Sources: ExTRI, TRRUST, DoRothEA_A
  • EGR1 — Activation | Sources: ExTRI, TRRUST, TFactS, DoRothEA_A
  • AR (Androgen Receptor) — Activation | Sources: DoRothEA_A, ExTRI, TRRUST, GEREDB, NTNU Curated
  • NFKB1 — Activation | Sources: TRRUST, DoRothEA_A, ExTRI, HTRI
  • SOX2 — Activation | Sources: ExTRI, SIGNOR, NTNU Curated, Pavlidis2021
  • HOXB7 — Activation | Sources: ExTRI, TRRUST
  • BCL3 — Activation | Sources: ExTRI, TRRUST
  • JUNB — Activation | Sources: ExTRI, TRRUST, DoRothEA_A
  • BCL11B — Activation | Sources: ExTRI, Pavlidis2021
  • FOS — Activation | Sources: ExTRI, NTNU Curated
  • KLF5 — Activation | Sources: ExTRI
  • SP4 — Activation | Sources: ExTRI
  • FOXN1 — Activation | Sources: ExTRI
  • ID1 — Activation | Sources: ExTRI
  • ELF5 — Activation | Sources: ExTRI
  • CEBPG — Activation | Sources: ExTRI

Repressors (High Confidence)

  • KLF10 — Repression | Sources: ExTRI, TRRUST, NTNU Curated
  • GCFC2 — Repression | Sources: ExTRI, NTNU Curated
  • HDAC1 — Repression | Sources: TRRUST
  • BRCA1 — Repression | Sources: TRRUST
  • GLI1 — Repression | Sources: GEREDB
  • EMX2 — Repression | Sources: GEREDB

Unknown Regulation (High Confidence)

  • ESR1 (Estrogen Receptor α) | Sources: ExTRI, TRRUST, NTNU Curated, DoRothEA_A
  • ESR2 (Estrogen Receptor β) | Sources: ExTRI, GEREDB, NTNU Curated
  • GATA3 | Sources: HTRI, DoRothEA_A
  • IRF1 | Sources: ExTRI, GEREDB
  • SP3 | Sources: ExTRI, GEREDB
  • CEBPB | Sources: ExTRI
  • HDAC3 | Sources: TRRUST
  • CREBBP | Sources: TRRUST
  • E2F1 | Sources: DoRothEA_A

Low Confidence Regulators

  • EGR2, FOSL1, FOXO3, FOXC1, ETS2, ERF, HOXA11, PAX3, DNMT1, GLI2, GLI3, HOXA7
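
The total of 43 regulators can be checked against the four lists above. The following is a minimal Python sketch: the gene symbols are transcribed from this section, while the dictionary name `regulators` and the category keys are chosen here for illustration and are not CollecTRI field names.

```python
# Upstream regulators of EGFR, transcribed from the lists in this section.
# The variable name and category keys are illustrative, not CollecTRI fields.
regulators = {
    "activator": ["JUN", "EGR1", "AR", "NFKB1", "SOX2", "HOXB7", "BCL3", "JUNB",
                  "BCL11B", "FOS", "KLF5", "SP4", "FOXN1", "ID1", "ELF5", "CEBPG"],
    "repressor": ["KLF10", "GCFC2", "HDAC1", "BRCA1", "GLI1", "EMX2"],
    "unknown": ["ESR1", "ESR2", "GATA3", "IRF1", "SP3", "CEBPB",
                "HDAC3", "CREBBP", "E2F1"],
    "low_confidence": ["EGR2", "FOSL1", "FOXO3", "FOXC1", "ETS2", "ERF",
                       "HOXA11", "PAX3", "DNMT1", "GLI2", "GLI3", "HOXA7"],
}

# Count regulators per category and overall.
counts = {mode: len(genes) for mode, genes in regulators.items()}
total = sum(counts.values())

# Check that no symbol is listed under two categories.
all_genes = [g for genes in regulators.values() for g in genes]
no_duplicates = len(set(all_genes)) == total

print(counts, total, no_duplicates)
```

The per-category counts (16, 6, 9, and 12) sum to the 43 regulators stated above, with no symbol appearing twice.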

Drug & pharmacology data

EGFR as Drug Target

EGFR (Epidermal growth factor receptor) is a well-established and extensively targeted protein in drug development. Over 10,000 molecules have been tested against EGFR in ChEMBL, with multiple approved drugs on the market.

Approved EGFR-Targeting Molecules (Top 5, Phase 4)

Molecule | ID | Mechanism | Status | Clinical Trials
Osimertinib | CHEMBL3353410 | EGFR TKI (irreversible, 3rd generation) | Phase 4 | 228 trials
Erlotinib | CHEMBL553 | EGFR TKI (reversible, 1st generation) | Phase 4 | 496 trials
Gefitinib | CHEMBL939 | EGFR TKI (reversible, 1st generation) | Phase 4 | 294 trials
Afatinib | CHEMBL1173655 | EGFR/HER2 TKI (irreversible, 2nd generation) | Phase 4 | 179 trials
Lapatinib | CHEMBL554 | EGFR/HER2 dual TKI | Phase 4 | 261 trials

All five drugs listed are small-molecule tyrosine kinase inhibitors. Roughly 9,995 additional molecules are in earlier development phases.

Clinical Trials (Selected Top 20 by Drug)

Erlotinib (CHEMBL553) — Primary indications: NSCLC, pancreatic cancer, head/neck cancer

  • NCT01287754: NSCLC with EGFR mutations | Phase 4 | COMPLETED
  • NCT01609543: 1st-line lung adenocarcinoma with EGFR mutations | Phase 4 | COMPLETED
  • NCT00446225: NSCLC with EGFR TK domain mutations | Phase 3 | COMPLETED
  • NCT01024413: Erlotinib vs Gefitinib in advanced NSCLC with EGFR exon 19/21 mutations | Phase 3 | COMPLETED
  • NCT00349219 (TORCH): Erlotinib vs chemotherapy for advanced NSCLC | Phase 3 | COMPLETED
  • NCT02296125: Osimertinib (AZD9291) vs Gefitinib/Erlotinib in NSCLC | Phase 3 | COMPLETED
  • NCT02411448 (RELAY): Ramucirumab + Erlotinib for EGFR-mutant NSCLC | Phase 3 | ACTIVE

Gefitinib (CHEMBL939) — Primary indications: NSCLC (adenocarcinoma, especially never-smokers)

  • NCT00076388 (IRESSA vs Docetaxel): Gefitinib vs chemotherapy | Phase 3 | COMPLETED
  • NCT00322452 (IPASS): 1st-line Gefitinib vs carboplatin/paclitaxel in Asia | Phase 3 | COMPLETED
  • NCT01774721 (ARCHER1050): Dacomitinib vs Gefitinib for 1st-line NSCLC | Phase 3 | COMPLETED
  • NCT01404260: Gefitinib intercalating with chemotherapy | Phase 3 | COMPLETED
  • NCT02296125: Osimertinib vs Gefitinib/Erlotinib | Phase 3 | COMPLETED
  • NCT02588261: ASP8273 vs Erlotinib/Gefitinib with EGFR mutations | Phase 3 | TERMINATED

Afatinib (CHEMBL1173655) — Primary indications: NSCLC, head/neck cancer, squamous cell carcinoma

  • NCT00949650 (LUX-Lung 3): Afatinib 1st-line vs chemotherapy in EGFR-mutant NSCLC | Phase 3 | COMPLETED
  • NCT01121393 (LUX-Lung 6): Afatinib vs gemcitabine/cisplatin | Phase 3 | COMPLETED
  • NCT01523587 (LUX-Lung 8): Afatinib vs Erlotinib in squamous NSCLC | Phase 3 | COMPLETED
  • NCT01466660 (LUX-Lung 7): Afatinib vs Gefitinib for 1st-line EGFR-mutant adenocarcinoma | Phase 2 | COMPLETED

Osimertinib (CHEMBL3353410) — Primary indication: NSCLC (especially T790M resistance mutations)

  • NCT02296125: Osimertinib vs Gefitinib/Erlotinib in NSCLC | Phase 3 | COMPLETED
  • NCT02151981 (AURA3): Osimertinib in EGFR-mutant NSCLC with acquired T790M | Phase 3 | COMPLETED

Pharmacogenomics & Drug Response Predictors

Key EGFR Mutations Affecting Drug Response:

Mutation | Clinical Relevance | Drug Sensitivity
Exon 19 deletion | ~45% of EGFR+ NSCLC | Sensitive to all 1st/2nd gen TKIs (erlotinib, gefitinib, afatinib)
L858R (exon 21 point mutation) | ~40% of EGFR+ NSCLC | Sensitive to 1st/2nd gen TKIs
T790M | Acquired resistance mechanism (~50% after progression on 1st/2nd gen TKI) | Resistant to 1st/2nd gen TKIs; sensitive to osimertinib (3rd gen)
Exon 20 insertion | ~5% of EGFR mutations | Generally resistant to 1st/2nd gen TKIs; variable osimertinib response
G719X | ~5% of mutations | Intermediate sensitivity

Dosing Considerations (from approved labels):

  • Erlotinib: 150 mg once daily, oral; adjust for CYP3A4 inducers/inhibitors
  • Gefitinib: 250 mg daily oral
  • Afatinib: reduce the dose from 40-50 mg daily if diarrhea occurs (the most common dose-limiting toxicity)
  • Osimertinib: 80 mg daily (reduced to 40 mg if not tolerated); better CNS penetration than earlier-generation TKIs
  • Lapatinib: 1250 mg daily (with capecitabine in HER2+ breast cancer)

No major pharmacogenomic variant panels (e.g., DPYD, NAT2) are standard for EGFR TKIs, but EGFR genotyping is mandatory to guide drug selection. T790M testing at progression determines osimertinib eligibility.
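
The genotype-to-response mapping in the mutation table can be expressed as a simple lookup. The sketch below is illustrative only: the names `TKI_RESPONSE`, `tki_response`, and `t790m_detected` are hypothetical, the response strings are copied from the table above, and the T790M check encodes only the eligibility statement made in this section.

```python
# Drug-sensitivity notes per EGFR mutation, transcribed from the table above.
# All names here are hypothetical and exist only for illustration.
TKI_RESPONSE = {
    "exon 19 deletion": "sensitive to 1st/2nd gen TKIs (erlotinib, gefitinib, afatinib)",
    "l858r": "sensitive to 1st/2nd gen TKIs",
    "t790m": "resistant to 1st/2nd gen TKIs; sensitive to osimertinib (3rd gen)",
    "exon 20 insertion": "generally resistant to 1st/2nd gen TKIs; variable osimertinib response",
    "g719x": "intermediate sensitivity",
}


def tki_response(mutation: str) -> str:
    """Return the table's sensitivity note for a mutation, or 'not listed'."""
    return TKI_RESPONSE.get(mutation.strip().lower(), "not listed")


def t790m_detected(mutations: list[str]) -> bool:
    """True if T790M is among the mutations found at progression."""
    return any(m.strip().lower() == "t790m" for m in mutations)


print(tki_response("L858R"))
print(t790m_detected(["Exon 19 deletion", "T790M"]))
```

Mutations absent from the table, such as other resistance mutations, return "not listed" rather than a guessed response.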

Expression profiles

Expression profiles for human EGFR (ENSG00000146648), drawn from BioBTree data:

Tissue Expression (Bgee)

EGFR shows ubiquitous expression across tissues with high prevalence (285 of 298 conditions present, gold quality).

Top 30 tissues by expression score:

Rank | Tissue | Expression Score | Status
1 | Nipple | 99.12 | Present
2 | Gingiva | 98.63 | Present
3 | Gingival epithelium | 98.62 | Present
4 | Placenta | 98.56 | Present
5 | Mammalian vulva | 98.48 | Present
6 | Tongue squamous epithelium | 98.32 | Present
7 | Skin of hip | 98.28 | Present
8 | Superficial temporal artery | 97.94 | Present
9 | Decidua | 97.78 | Present
10 | Penis | 97.65 | Present
11 | Pharyngeal mucosa | 97.63 | Present
12 | Mucosa of paranasal sinus | 97.49 | Present
13 | Urethra | 97.31 | Present
14 | Saphenous vein | 97.30 | Present
15 | Lower lobe of lung | 96.72 | Present
16 | Oral cavity | 96.61 | Present
17 | Sural nerve | 96.43 | Present
18 | Superior surface of tongue | 96.40 | Present
19 | Upper leg skin | 96.33 | Present
20 | Mammary duct | 96.28 | Present
21 | Tongue | 96.21 | Present
22 | Upper arm skin | 96.11 | Present
23 | Hair follicle | 95.77 | Present
24 | Synovial joint | 95.73 | Present
25 | Zone of skin | 95.68 | Present
26 | Body of tongue | 95.67 | Present
27 | Cauda epididymis | 95.66 | Present
28 | Cervix epithelium | 95.56 | Present
29 | Skin of leg | 95.49 | Present
30 | Skin of abdomen | 95.43 | Present

Pattern: Strong enrichment in epithelial tissues (skin, mucosa, gingiva), reproductive tissues, and vasculature—consistent with EGFR’s role in growth signaling in epithelial-derived cells.

Single-Cell Expression Datasets

Available SCXA (Single Cell Expression Atlas) datasets containing EGFR:

  • E-ANND-2: GTEx snRNAseq atlas (209,126 cells) — comprehensive tissue atlas
  • E-CURD-114: Human airway epithelium (in vivo) (81,801 cells) — smoking effects, epithelial-specific
  • E-MTAB-6701: First trimester fetal-maternal interface (135,071 cells)
  • E-MTAB-11268: Hypertrophied heart (64,898 cells)
  • E-GEOD-84465: Glioblastoma infiltrating cells (3,588 cells)
  • E-HCAD-24: First-trimester placenta and decidua (24,780 cells)
  • E-MTAB-8559: Ovarian cancer models (20,982 cells)
  • E-MTAB-9435: IDHwt glioblastoma tumors (62,867 cells)
  • E-MTAB-10596: Dental follicle and organoids (3,388 cells)
  • E-MTAB-10137: Dermal blood vascular endothelium (1,523 cells)
  • E-ENAD-27: Human islet cells (1,145 cells)

Note: Detailed cell-type expression scores within these datasets require direct database access (ArrayExpress/EBI Single Cell Expression Atlas). For comprehensive cell-type breakdown with quantified expression, consult Human Protein Atlas (HPA) for tissue/cell-line data or the individual SCXA datasets.

Disease associations

Mendelian / Monogenic Disease

Primary EGFR-associated Mendelian diseases (curated gene-disease associations):

Disease | Disease IDs | Inheritance | Evidence Level
Lung cancer | OMIM:211980 / MONDO:0008903 | Autosomal dominant | Definitive
Inflammatory skin and bowel disease, neonatal, 2 | OMIM:616069 / MONDO:0014481 | Autosomal recessive | Strong / Moderate
Neonatal erythroderma-autoinflammation-inflammatory bowel disease syndrome | Orphanet:294023 | Autosomal recessive | Supportive

Additional somatic/complex cancer associations (from ClinVar):

  • MONDO:0008170 – Ovarian cancer
  • MONDO:0010150 – Head and neck squamous cell carcinoma
  • MONDO:0002447 – Endometrial carcinoma
  • MONDO:0016419 – Hereditary breast carcinoma
  • MONDO:0019087 – Cholangiocarcinoma
  • MONDO:0005061 – Lung adenocarcinoma
  • MONDO:0005138 – Lung carcinoma
  • MONDO:0005233 – Non-small cell lung carcinoma
  • MONDO:0001187 – Urinary bladder cancer
  • MONDO:0005097 – Squamous cell lung carcinoma
  • MONDO:0007154 – Arteriovenous malformations of the brain
  • MONDO:0023644 – Lip and oral cavity carcinoma

Orphanet rare disease classifications:

  • Orphanet:140162 – Inherited cancer-predisposing syndrome
  • Orphanet:227535 – Hereditary breast cancer
  • Orphanet:70567 – Cholangiocarcinoma
  • Orphanet:145 – Hereditary breast and/or ovarian cancer syndrome

Phenotype Associations (HPO Terms)

Top 21 clinical phenotypes associated with EGFR mutations:

HPO ID | Phenotype
HP:0000006 | Autosomal dominant inheritance
HP:0000007 | Autosomal recessive inheritance
HP:0030078 | Lung adenocarcinoma
HP:0030358 | Non-small cell lung carcinoma
HP:0006519 | Alveolar cell carcinoma
HP:0006532 | Recurrent pneumonia
HP:0025092 | Epidermal acanthosis
HP:0005208 | Secretory diarrhea
HP:0003577 | Congenital onset
HP:0003212 | Increased circulating IgE concentration
HP:0002013 | Vomiting
HP:0001944 | Dehydration
HP:0001680 | Coarctation of aorta
HP:0001561 | Polyhydramnios
HP:0001508 | Failure to thrive
HP:0001442 | Typified by somatic mosaicism
HP:0000822 | Hypertension
HP:0000527 | Long eyelashes
HP:0200039 | Pustule
HP:0200034 | Papule
HP:0100501 | Recurrent bronchiolitis

Complex Disease / GWAS Associations

Top 30 GWAS associations with EGFR locus:

Trait / Disease | Lead Gene(s) | Chromosome | P-value | Significance
Glioblastoma | EGFR / SEC61G-DT-EGFR | 7 | 5e-34 | Very strong
Glioma | SEC61G-DT-EGFR | 7 | 4e-27 | Very strong
Glioblastoma (age-stratified) | SEC61G-DT-EGFR | 7 | 4e-16 | Very strong
Glioblastoma | SEC61G-DT-EGFR | 7 | 3e-16 | Very strong
Plateletcrit | EGFR | 7 | 6e-18 | Very strong
Platelet distribution width | EGFR | 7 | 1e-17 | Very strong
Glioma | SEC61G-DT-EGFR | 7 | 7e-12 | Strong
Platelet count | EGFR | 7 | 1e-12 | Strong
Glioblastoma | EGFR | 7 | 1e-12 | Strong
Glioblastoma (age-stratified) | SEC61G-DT-EGFR | 7 | 2e-12 | Strong
Glioma (age-stratified) | SEC61G-DT-EGFR | 7 | 7e-12 | Strong
Glioblastoma | SEC61G-DT-EGFR | 7 | 1e-11 | Strong
Glioblastoma (age-stratified) | EGFR | 7 | 2e-11 | Strong
Glioma | EGFR | 7 | 5e-12 | Strong
Haemorrhoidal disease | EGFR | 7 | 3e-13 | Very strong
Glioma | SEC61G-DT-EGFR | 7 | 2e-09 | Strong
Glioblastoma (age-stratified) | EGFR | 7 | 2e-09 | Strong
Mean corpuscular hemoglobin concentration | EGFR | 7 | 2e-09 | Strong
Monocyte count | EGFR | 7 | 4e-09 | Strong
Metabolite levels | SEC61G-DT-EGFR | 7 | 4e-09 | Strong
Glioma | SEC61G-DT-EGFR | 7 | 2e-09 | Strong
Mean spheric corpuscular volume | EGFR | 7 | 3e-09 | Strong
Non-glioblastoma glioma | SEC61G-DT-EGFR | 7 | 2e-08 | Strong
Glioma | EGFR | 7 | 7e-08 | Strong
Glioblastoma (age-stratified) | EGFR | 7 | 6e-08 | Strong
Glioma | SEC61G-DT-EGFR | 7 | 8e-08 | Strong
L1-L4 bone mineral density × serum urate levels interaction | EGFR | 7 | 6e-06 | Moderate
Subjective response to lithium treatment | SEC61G-DT-EGFR | 7 | 1e-06 | Moderate
Refractive astigmatism | EGFR | 7 | 1e-06 | Moderate
Metabolite levels | SEC61G-DT-EGFR | 7 | 1e-06 | Moderate

Key observations:

  • Strongest associations: Glioblastoma and glioma (p as low as 5e-34), reflecting EGFR’s pivotal role in CNS malignancies
  • Hematologic traits: Significant associations with plateletcrit and platelet distribution width (p ≤ 1e-17), platelet count (p = 1e-12), monocyte count, and hemoglobin measures
  • Cancer predisposition: EGFR mutations drive somatic and germline tumor susceptibility across multiple epithelial tissues (lung, ovarian, breast, head/neck)
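
To compare the GWAS p-values on a common scale, they can be converted to -log10(p) and checked against a significance cutoff. The sketch below assumes the conventional genome-wide threshold of 5e-8, which the source does not state and which is not necessarily the rule behind the table's "Very strong / Strong / Moderate" labels. The rows are a small sample transcribed from the table, and the names `GENOME_WIDE` and `summarize` are hypothetical.

```python
import math

# Conventional genome-wide significance cutoff; an assumption, not from the source.
GENOME_WIDE = 5e-8

# A sample of (trait, p-value) pairs transcribed from the GWAS table.
rows = [
    ("Glioblastoma", 5e-34),
    ("Plateletcrit", 6e-18),
    ("Haemorrhoidal disease", 3e-13),
    ("Monocyte count", 4e-09),
    ("Glioma", 7e-08),
    ("Refractive astigmatism", 1e-06),
]


def summarize(trait: str, p: float) -> dict:
    """Attach -log10(p) and a genome-wide significance flag to one association."""
    return {
        "trait": trait,
        "neg_log10_p": round(-math.log10(p), 2),
        "genome_wide": p < GENOME_WIDE,
    }


summary = [summarize(trait, p) for trait, p in rows]
for entry in summary:
    print(entry)
```

Under this assumed cutoff, the sampled associations at 7e-08 and 1e-06 would not count as genome-wide significant, so the table's significance labels are best read as relative rankings.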

Structured Data Sources

Generated with Claude Haiku 4.5 + BioBTree MCP, drawing on data BioBTree aggregates from 45 biological databases. Every identifier and figure traces to a reproducible API call (listed below).

Datasets: alphafold, alphamissense, antibody, bgee, biogrid_interaction, ccds, cellphonedb, chembl_molecule, chembl_target, cl, clinical_trials, clinvar, collectri, ensembl, entrez, esm2_similarity, exon, gencc, go, gtex, gtopdb, gwas, hgnc, hpa, hpo, intact, interpro, jaspar, mim, mondo, msigdb, orphanet, ortholog, pdb, pfam, pharmgkb_gene, reactome, refseq, scxa, scxa_expression, smart, spliceai, string_interaction, transcript, uniprot
Generated: 2026-05-24 — For the latest data, query BioBTree directly via MCP or API.
View API calls (242)