EGFR Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human EGFR. This should serve as a definitive lookup resource for researchers.

SECTION 1: GENE IDENTIFIERS

Provide ALL gene-level database identifiers:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand

SECTION 2: TRANSCRIPT IDENTIFIERS

List ALL transcript-level identifiers:
- Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.)
  How many total transcripts?
- RefSeq transcripts: ALL NM_ mRNA accessions
  Mark which is MANE Select (canonical clinical standard)
- CCDS IDs: ALL consensus coding sequence identifiers

For the CANONICAL/MANE SELECT transcript:
- List ALL exon IDs (ENSE) with genomic coordinates
- Total exon count

SECTION 3: PROTEIN IDENTIFIERS

List ALL protein-level identifiers:
- UniProt accessions: ALL entries (reviewed and unreviewed)
  Mark the canonical reviewed entry
- RefSeq protein: ALL NP_ accessions

Protein domains and families:
- List ALL annotated domains/families with identifiers
- Include: domain name, type (domain/family/superfamily), and ID

SECTION 4: STRUCTURE IDENTIFIERS

Experimental structures:
- List ALL PDB structure IDs
- For each: experimental method (X-ray, NMR, Cryo-EM) and resolution
- Total PDB structure count

Predicted structures:
- AlphaFold model ID and confidence metrics (pLDDT)

SECTION 5: CROSS-SPECIES ORTHOLOGS

List orthologous genes in key model organisms (where available):
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS

Clinical variant annotations:
- Total variant count in clinical databases
- Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign
- List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: Total count
  List TOP 50 predicted splice-altering variants with delta scores
- Missense pathogenicity predictions: Total count
  List TOP 50 predicted pathogenic missense variants with scores

SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY

Pathway membership:
- List ALL biological pathways this gene participates in
- Include pathway IDs and names
- Total pathway count

Gene Ontology annotations:
- Biological Process: count and TOP 20 terms with IDs
- Molecular Function: count and TOP 20 terms with IDs
- Cellular Component: count and TOP 20 terms with IDs

SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS

Protein-protein interactions:
- Total interaction count
- List TOP 50 highest-confidence interacting proteins with scores

Protein similarity (evolutionary and structural):
- Structural/embedding similarity: How many similar proteins?
  List TOP 20 with similarity scores
- Sequence homology: How many homologous proteins?
  List TOP 20 with identity/similarity scores

SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA

If this gene encodes a transcription factor:

Downstream targets (genes regulated BY this TF):
- Total target gene count
- List TOP 50 target genes with regulation type (activates/represses)

DNA binding profiles:
- List ALL known binding motif IDs
- Motif family classification

Upstream regulators (TFs that regulate THIS gene):
- List known transcriptional regulators with evidence type

SECTION 10: DRUG & PHARMACOLOGY DATA

If this gene/protein is a drug target:

Targeting molecules:
- How many drug/compound molecules target this protein?
- List TOP 30 molecules by development phase
- Include: molecule ID, name, mechanism, highest development phase

Clinical trials:
- How many clinical trials involve drugs targeting this gene?
- List TOP 20 trials with: trial ID, phase, status, intervention

Pharmacogenomics:
- Known drug-gene interactions affecting drug response
- Dosing guidelines if any exist

SECTION 11: EXPRESSION PROFILES

Tissue expression:
- Which tissues express this gene most highly?
- List TOP 30 tissues with expression scores/levels
- Note any tissue-specific or tissue-enriched patterns

Cell type expression:
- Which cell types show highest expression?
- List TOP 30 cell types with expression scores
- Note any cell type-specific patterns

Single-cell expression data (if available):
- Which single-cell datasets/experiments include this gene?
- Notable cell population patterns

SECTION 12: DISEASE ASSOCIATIONS

Mendelian/monogenic disease links:
- What diseases are caused by mutations in this gene?
- List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level

Phenotype associations:
- What clinical phenotypes are associated with this gene?
- List TOP 50 phenotype terms with IDs

Complex trait associations (GWAS):
- What traits/diseases are linked via genome-wide association studies?
- List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available

FORMATTING REQUIREMENTS

- Present each section with clear headers
- Use tables for listing multiple items
- Always include COUNTS for every category
- For lists with >50 items: show total count + TOP 50
- For lists with <50 items: show ALL items
- Always include database identifiers (IDs), not just names
- Mark canonical/primary/reviewed entries where applicable
- If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 32 biological databases.

Datasets: alphafold, alphamissense, bgee, ccds, chembl_molecule, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, orphanet, ortholog, pdb, pharmgkb_gene, reactome, refseq, scxa, spliceai, string_interaction, transcript, uniprot
Generated: 2026-03-26 — For the latest data, query BioBTree directly via MCP or API.

COMPREHENSIVE CROSS-DATABASE IDENTIFIER AND FUNCTIONAL MAPPING REFERENCE
Human EGFR (Epidermal Growth Factor Receptor)

Section 1: Gene Identifiers

Primary Gene Identifiers

| Database | Identifier | Notes |
| --- | --- | --- |
| HGNC | HGNC:3236 | Approved symbol: EGFR |
| Ensembl | ENSG00000146648 | |
| NCBI Entrez | 1956 | |
| OMIM | 131550 | Gene/Locus |

Gene Names & Aliases
  • Approved Name: Epidermal growth factor receptor
  • Previous Names: Epidermal growth factor receptor (avian erythroblastic leukemia viral (v-erb-b) oncogene homolog)
  • Aliases: ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS, PIG61, mENA
  • Locus Group: Protein-coding gene
  • Gene Group: Erb-b2 receptor tyrosine kinases

Genomic Location

| Attribute | Value |
| --- | --- |
| Chromosome | 7 |
| Cytogenetic Band | 7p11.2 |
| Start Position | 55,018,820 (GRCh38) |
| End Position | 55,211,628 (GRCh38) |
| Strand | + (Plus/Forward) |
| Genomic Accession | NC_000007.14 |
| Gene Length | 192,809 bp (start and end inclusive) |

Section 2: Transcript Identifiers

Ensembl Transcripts (Total: 17 transcripts)

| Transcript ID | Biotype | Start | End | Notes |
| --- | --- | --- | --- | --- |
| ENST00000275493 | protein_coding | 55,019,017 | 55,211,628 | CANONICAL |
| ENST00000342916 | protein_coding | 55,019,032 | 55,168,635 | |
| ENST00000344576 | protein_coding | 55,019,017 | 55,171,037 | |
| ENST00000420316 | protein_coding | 55,019,034 | 55,156,951 | |
| ENST00000450046 | protein_coding | 55,109,723 | 55,211,536 | |
| ENST00000455089 | protein_coding | 55,019,021 | 55,203,076 | |
| ENST00000459688 | protein_coding_CDS_not_defined | 55,019,131 | 55,046,004 | |
| ENST00000463948 | protein_coding_CDS_not_defined | 55,019,151 | 55,119,401 | |
| ENST00000485503 | retained_intron | 55,192,811 | 55,200,802 | |
| ENST00000700144 | retained_intron | 55,019,088 | 55,157,014 | |
| ENST00000700145 | protein_coding | 55,163,753 | 55,205,865 | |
| ENST00000700146 | retained_intron | 55,198,272 | 55,208,067 | |
| ENST00000700147 | retained_intron | 55,200,573 | 55,206,016 | |
| ENST00000898199 | protein_coding | 55,018,820 | 55,205,898 | |
| ENST00000898200 | protein_coding | 55,019,017 | 55,205,911 | |
| ENST00000898201 | protein_coding | 55,019,017 | 55,205,911 | |
| ENST00000898202 | protein_coding | 55,019,104 | 55,205,911 | |

RefSeq Transcripts (Human, Chromosome 7)

| Accession | Type | Status | MANE Select |
| --- | --- | --- | --- |
| NM_005228 | mRNA | REVIEWED | ✓ YES |
| NM_001346897 | mRNA | REVIEWED | |
| NM_001346898 | mRNA | REVIEWED | |
| NM_001346899 | mRNA | REVIEWED | |
| NM_001346900 | mRNA | REVIEWED | |
| NM_001346941 | mRNA | REVIEWED | |
| NM_201282 | mRNA | REVIEWED | |
| NM_201283 | mRNA | REVIEWED | |
| NM_201284 | mRNA | REVIEWED | |

CCDS Identifiers (Total: 6)

| CCDS ID |
| --- |
| CCDS5514 |
| CCDS5515 |
| CCDS5516 |
| CCDS47587 |
| CCDS87507 |
| CCDS94105 |

Canonical Transcript Exons (ENST00000275493) - Total: 28 exons

| Exon ID | Start | End | Length |
| --- | --- | --- | --- |
| ENSE00001841347 | 55,019,017 | 55,019,365 | 349 bp |
| ENSE00003541288 | 55,142,286 | 55,142,437 | 152 bp |
| ENSE00001704157 | 55,143,305 | 55,143,488 | 184 bp |
| ENSE00001798125 | 55,146,606 | 55,146,740 | 135 bp |
| ENSE00001683983 | 55,151,294 | 55,151,362 | 69 bp |
| ENSE00001652975 | 55,152,546 | 55,152,664 | 119 bp |
| ENSE00001623732 | 55,154,011 | 55,154,152 | 142 bp |
| ENSE00001751179 | 55,155,830 | 55,155,946 | 117 bp |
| ENSE00001084929 | 55,156,533 | 55,156,659 | 127 bp |
| ENSE00001084931 | 55,156,759 | 55,156,832 | 74 bp |
| ENSE00001084926 | 55,157,663 | 55,157,753 | 91 bp |
| ENSE00001084941 | 55,160,139 | 55,160,338 | 200 bp |
| ENSE00001084939 | 55,161,499 | 55,161,631 | 133 bp |
| ENSE00001084927 | 55,163,733 | 55,163,823 | 91 bp |
| ENSE00001627115 | 55,165,280 | 55,165,437 | 158 bp |
| ENSE00001768076 | 55,171,175 | 55,171,213 | 39 bp |
| ENSE00002684637 | 55,172,983 | 55,173,124 | 142 bp |
| ENSE00001778519 | 55,173,921 | 55,174,043 | 123 bp |
| ENSE00001756460 | 55,174,722 | 55,174,820 | 99 bp |
| ENSE00001601336 | 55,181,293 | 55,181,478 | 186 bp |
| ENSE00001681524 | 55,191,719 | 55,191,874 | 156 bp |
| ENSE00001631695 | 55,192,766 | 55,192,841 | 76 bp |
| ENSE00003625684 | 55,198,717 | 55,198,863 | 147 bp |
| ENSE00001790701 | 55,200,316 | 55,200,413 | 98 bp |
| ENSE00001801208 | 55,201,188 | 55,201,355 | 168 bp |
| ENSE00001773562 | 55,201,735 | 55,201,782 | 48 bp |
| ENSE00001795780 | 55,202,517 | 55,202,625 | 109 bp |
| ENSE00001245887 | 55,205,256 | 55,211,628 | 6,373 bp |
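
The exon lengths in this table follow from the coordinates if both endpoints are counted (1-based, inclusive intervals). The sketch below is a minimal illustration of that arithmetic using three rows copied from the table; the helper name `span_length` is hypothetical and not part of any database API.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1

# Three rows copied from the exon table: (exon ID, start, end), GRCh38.
exons = [
    ("ENSE00001841347", 55_019_017, 55_019_365),
    ("ENSE00003541288", 55_142_286, 55_142_437),
    ("ENSE00001245887", 55_205_256, 55_211_628),
]

exon_lengths = {exon_id: span_length(s, e) for exon_id, s, e in exons}

# The same convention applied to the gene-level coordinates from Section 1.
gene_length = span_length(55_018_820, 55_211_628)

for exon_id, length in exon_lengths.items():
    print(f"{exon_id}\t{length} bp")
print(f"gene span\t{gene_length} bp")
```

Under this convention the first and last exons come out at 349 bp and 6,373 bp, matching the Length column.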

Section 3: Protein Identifiers

UniProt Accessions (Total: 4 entries)

| Accession | Status | Name |
| --- | --- | --- |
| P00533 | ✓ Reviewed (Swiss-Prot) | Epidermal growth factor receptor |
| A0A8V8TPW8 | Unreviewed | |
| C9JYS6 | Unreviewed | |
| Q504U8 | Unreviewed | |

Canonical Protein Properties (P00533)

| Property | Value |
| --- | --- |
| Length | 1,210 amino acids |
| Mass | 134,277 Da |
| Alternative Names | Proto-oncogene c-ErbB-1; Receptor tyrosine-protein kinase erbB-1 |

RefSeq Protein Accessions (Human)

| Accession | Status | MANE Select |
| --- | --- | --- |
| NP_005219 | REVIEWED | ✓ YES |
| NP_001333826 | REVIEWED | |
| NP_001333827 | REVIEWED | |
| NP_001333828 | REVIEWED | |
| NP_001333829 | REVIEWED | |
| NP_001333870 | REVIEWED | |
| NP_958439 | REVIEWED | |
| NP_958440 | REVIEWED | |
| NP_958441 | REVIEWED | |

Protein Domains & Families (Total: 15 InterPro entries)

| InterPro ID | Name | Type |
| --- | --- | --- |
| IPR000494 | Rcpt_L-dom | Domain |
| IPR000719 | Prot_kinase_dom | Domain |
| IPR001245 | Ser-Thr/Tyr_kinase_cat_dom | Domain |
| IPR006211 | Furin-like_Cys-rich_dom | Domain |
| IPR006212 | Furin_repeat | Repeat |
| IPR008266 | Tyr_kinase_AS | Active_site |
| IPR009030 | Growth_fac_rcpt_cys_sf | Homologous_superfamily |
| IPR011009 | Kinase-like_dom_sf | Homologous_superfamily |
| IPR016245 | Tyr_kinase_EGF/ERB/XmrK_rcpt | Family |
| IPR017441 | Protein_kinase_ATP_BS | Binding_site |
| IPR020635 | Tyr_kinase_cat_dom | Domain |
| IPR032778 | GF_recep_IV | Domain |
| IPR036941 | Rcpt_L-dom_sf | Homologous_superfamily |
| IPR049328 | TM_ErbB1 | Domain |
| IPR050122 | RTK | Family |

Section 4: Structure Identifiers

Experimental Structures - PDB (Total: 378 structures)

TOP 50 PDB Structures by Resolution

| PDB ID | Method | Resolution | Description |
| --- | --- | --- | --- |
| 3POZ | X-RAY | 1.50 Å | EGFR kinase domain with TAK-285 |
| 3VRP | X-RAY | 1.52 Å | Cbl-c TKB with phospho-EGFR peptide |
| 5HG5 | X-RAY | 1.52 Å | EGFR L858R/T790M/V948R with inhibitor |
| 5CNO | X-RAY | 1.55 Å | EGFR kinase domain mutant V924R |
| 3G5Y | X-RAY | 1.59 Å | Antibodies targeting tumor EGFR |
| 3W33 | X-RAY | 1.70 Å | EGFR kinase domain with compound 19b |
| 4I22 | X-RAY | 1.71 Å | EGFR L858R+T790M with gefitinib |
| 3W32 | X-RAY | 1.80 Å | EGFR kinase domain with compound 20a |
| 3P0Y | X-RAY | 1.80 Å | anti-EGFR/HER3 Fab DL11 complex |
| 4I24 | X-RAY | 1.80 Å | EGFR T790M with dacomitinib |
| 5GNK | X-RAY | 1.80 Å | EGFR T790M with LXX-6-34 |
| 4WKQ | X-RAY | 1.85 Å | EGFR kinase domain with gefitinib |
| 5HG7 | X-RAY | 1.85 Å | EGFR mutant with PF-06459988 |
| 3W2S | X-RAY | 1.90 Å | EGFR kinase domain with compound 4 |
| 5CNN | X-RAY | 1.90 Å | EGFR kinase domain mutant I682Q |
| 4WRG | X-RAY | 1.90 Å | EGFR kinase domain structure |
| 4ZSE | X-RAY | 1.97 Å | EGFR T790M/V948R crystal form II |
| 2RGP | X-RAY | 2.00 Å | EGFR with hydrazone dual inhibitor |
| 3VRR | X-RAY | 2.00 Å | Cbl-c PL mutant with phospho-EGFR |
| 3W2P | X-RAY | 2.05 Å | EGFR T790M/L858R with compound 2 |
| 3W2R | X-RAY | 2.05 Å | EGFR T790M/L858R with compound 4 |
| 4UV7 | X-RAY | 2.10 Å | EGFR extracellular domain with GC1118A |
| 3OB2 | X-RAY | 2.10 Å | c-Cbl TKB with double phosphorylated EGFR |
| 5CAS | X-RAY | 2.10 Å | EGFR TMLR with compound 41a |
| 3W2Q | X-RAY | 2.20 Å | EGFR T790M/L858R with HKI-272 |
| 5CAU | X-RAY | 2.25 Å | EGFR TMLR with compound 41b |
| 3PFV | X-RAY | 2.27 Å | Cbl-b TKB with EGFR pY1069 peptide |
| 3BEL | X-RAY | 2.30 Å | EGFR with oxime inhibitor |
| 5D41 | X-RAY | 2.31 Å | EGFR with mutant selective allosteric inhibitor |
| 3VJN | X-RAY | 2.34 Å | EGFR G719S/T790M with AMPPNP |
| 3W2O | X-RAY | 2.35 Å | EGFR T790M/L858R with TAK-285 |
| 1XKK | X-RAY | 2.40 Å | EGFR kinase with GW572016 (lapatinib) |
| 5CAP | X-RAY | 2.40 Å | EGFR TMLR with compound 30 |
| 5HCY | X-RAY | 2.46 Å | EGFR TMLR with azaindole compound 13 |
| 2ITN | X-RAY | 2.47 Å | EGFR G719S with AMP-PNP |
| 2ITV | X-RAY | 2.47 Å | EGFR L858R with AMP-PNP |
| 3UG2 | X-RAY | 2.50 Å | EGFR G719S/T790M with gefitinib |
| 4RJ8 | X-RAY | 2.50 Å | EGFR T790M/L858R with compound 8 |
| 4LQM | X-RAY | 2.50 Å | EGFR L858R with PD168393 |
| 1MOX | X-RAY | 2.50 Å | EGFR extracellular domain with TGF-alpha |
| 4CAO | X-RAY | 2.60 Å | EGFR TMLR with compound 29 |
| 1M14 | X-RAY | 2.60 Å | Tyrosine Kinase Domain from EGFR |
| 1M17 | X-RAY | 2.60 Å | EGFR kinase with erlotinib |
| 2GS6 | X-RAY | 2.60 Å | Active EGFR with ATP analog-peptide |
| 2GS7 | X-RAY | 2.60 Å | Inactive EGFR with AMP-PNP |
| 3BUO | X-RAY | 2.60 Å | c-Cbl-TKB with EGF receptor |
| 5EDR | X-RAY | 2.60 Å | EGFR T790M/L858R with compound 27 |
| 5HCX | X-RAY | 2.60 Å | EGFR TMLR with azabenzimidazole compound 7 |
| 1YY9 | X-RAY | 2.61 Å | EGFR extracellular with cetuximab Fab |
| 5FED | X-RAY | 2.65 Å | EGFR with covalent aminobenzimidazole inhibitor |

NMR Structures

| PDB ID | Method | Description |
| --- | --- | --- |
| 1Z9I | SOLUTION NMR | Juxtamembrane domain model |
| 2KS1 | SOLUTION NMR | ErbB1/ErbB2 TM heterodimer |
| 2M0B | SOLUTION NMR | Homodimeric TM domain in micelles |
| 2M20 | SOLUTION NMR | TM-JM segment in bicelles |
| 2N5S | SOLUTION NMR | TM and JM domains in DPC micelles |

Predicted Structures - AlphaFold

| Model ID | Global pLDDT | Sequence Length | Fraction Very High Confidence |
| --- | --- | --- | --- |
| AF-P00533 | 76.29 | 9,392 | 0.48 (48%) |
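
To read the global pLDDT value, it can help to map it onto the confidence bands commonly used when displaying AlphaFold models (above 90, 70 to 90, 50 to 70, below 50). The sketch below assumes those cut-offs and band labels; neither is stated in this reference, and `plddt_band` is a hypothetical helper.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to an assumed confidence band label."""
    if not 0.0 <= plddt <= 100.0:
        raise ValueError("pLDDT must be between 0 and 100")
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"

# Global pLDDT reported for AF-P00533 in the table above.
global_plddt = 76.29
print(f"AF-P00533: pLDDT {global_plddt} -> {plddt_band(global_plddt)}")
```

A global score is only an average over residues; the per-residue distribution (here, 48% of residues in the very high band) is usually more informative than the single number.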

Section 5: Cross-Species Orthologs

Orthologous Genes in Model Organisms

| Organism | Gene ID | Symbol | Biotype |
| --- | --- | --- | --- |
| Mouse (Mus musculus) | ENSMUSG00000020122 | Egfr | protein_coding |
| Rat (Rattus norvegicus) | ENSRNOG00000004332 | Egfr | protein_coding |
| Zebrafish (Danio rerio) | ENSDARG00000013847 | egfra | protein_coding |
| Zebrafish (Danio rerio) | ENSDARG00000056909 | - | protein_coding |
| Fruit fly (D. melanogaster) | FBgn0003731 | Egfr | protein_coding |

Additional Entrez Orthologs

| Organism | Entrez ID | RefSeq mRNA | RefSeq Protein |
| --- | --- | --- | --- |
| Mouse | 13649 | NM_007912, NM_207655 | NP_031938, NP_997538 |
| Rat | 24329 | NM_031507 | NP_113695 |
| Drosophila | 37455 | NM_057410, NM_057411 | NP_476758, NP_476759 |

Section 6: Clinical Variants & AI Predictions

ClinVar Summary (Total: 3,790 variants)

Pathogenic Variants (Total: 63)

| ClinVar ID | HGVS Notation | Variant Type | Review Status |
| --- | --- | --- | --- |
| 45225 | c.2156G>C (p.Gly719Ala) | SNV | criteria provided |
| 45251 | c.2303G>T (p.Ser768Ile) | SNV | criteria provided |
| 45279 | c.2500G>T (p.Val834Leu) | SNV | criteria provided |
| 177620 | c.2236_2250del (p.Glu746_Ala750del) | Deletion | criteria provided |
| 45220 | c.2127_2129del (p.Glu709_Thr710delinsAsp) | Deletion | criteria provided |
| 157499 | c.1283G>A (p.Gly428Asp) | SNV | criteria provided |
| 254143 | c.977G>T (p.Cys326Phe) | SNV | no criteria |
| 638163 | c.2303_2305delinsTCT (p.Ser768_Val769delinsIleLeu) | Indel | criteria provided |
| 1005727 | c.1605C>A (p.Cys535Ter) | SNV | criteria provided |
| 1016241 | c.2917C>T (p.Arg973Ter) | SNV | criteria provided |
| 1016463 | c.1536del (p.Glu513fs) | Deletion | criteria provided |
| 1023925 | c.2545C>T (p.Gln849Ter) | SNV | criteria provided |
| 1042058 | g.(?_55086755)_(55274084_?)del | Deletion | criteria provided |
| 1043841 | c.2577del (p.Lys860fs) | Deletion | criteria provided |
| 1045120 | c.1418del (p.Asn473fs) | Deletion | criteria provided |
| 1058215 | c.2650G>T (p.Glu884Ter) | SNV | criteria provided |
| 1425739 | c.2921_2928del (p.Asp974fs) | Deletion | criteria provided |
| 1429922 | c.877A>T (p.Lys293Ter) | SNV | criteria provided |
| 1441678 | c.3061C>T (p.Gln1021Ter) | SNV | criteria provided |
| 1444704 | c.2289del (p.Tyr764fs) | Deletion | criteria provided |
| 1453945 | c.1860_1861delinsAA (p.Cys620_His621delinsTer) | Indel | criteria provided |
| 1457850 | c.113del (p.Leu38fs) | Deletion | criteria provided |
| 1459376 | c.763C>T (p.Arg255Ter) | SNV | criteria provided |
| 1508578 | c.2720T>A (p.Leu907Ter) | SNV | criteria provided |
| 2032844 | c.492_511del (p.Trp164_Asp171delinsTer) | Deletion | criteria provided |
| 2115063 | c.977_978del (p.Lys325_Cys326insTer) | Microsatellite | criteria provided |
| 2129131 | c.2927del (p.Gln976fs) | Deletion | criteria provided |
| 2578363 | c.2317delinsAACCCCT (p.His773delinsAsnProTyr) | Indel | criteria provided |
| 2582257 | c.1792G>A (p.Gly598Arg) | SNV | no criteria |
| 2582258 | c.2561C>T (p.Thr854Ile) | SNV | no criteria |
| 2582280 | c.1786C>T (p.Pro596Ser) | SNV | no criteria |
| 2582281 | c.2287G>A (p.Ala763Thr) | SNV | no criteria |

Likely Pathogenic Variants

| ClinVar ID | HGVS Notation | Variant Type |
| --- | --- | --- |
| 1009894 | c.3162+1G>T | Splice site |
| 1009989 | c.2947-1G>A | Splice site |

AlphaMissense Predictions (Total: 8,041 predictions)

TOP 50 Likely Pathogenic Predictions (by score)

| Variant | Protein Change | AM Pathogenicity | AM Class |
| --- | --- | --- | --- |
| 7:55019324:C:A | A16D | 0.582 | likely_pathogenic |

Note: Most signal peptide region variants show low pathogenicity scores. Higher pathogenicity predictions are concentrated in functional domains.

SpliceAI Predictions (Total: 4,358 predictions)

TOP 50 High-Impact Splice Variants (Delta Score ≥0.7)

| Variant | Effect | Delta Score |
| --- | --- | --- |
| 7:55019361:GAAAG:G | donor_gain | 0.99 |
| 7:55019362:AAAGG:A | donor_loss | 0.99 |
| 7:55019363:AAGGT:A | donor_loss | 0.99 |
| 7:55019364:AGGT:A | donor_loss | 0.99 |
| 7:55019365:GGTA:G | donor_loss | 0.99 |
| 7:55019366:G:A | donor_loss | 0.99 |
| 7:55019367:T:G | donor_loss | 0.99 |
| 7:55019366:G:GG | donor_gain | 0.97 |
| 7:55019364:AG:A | donor_gain | 0.89 |
| 7:55019365:GG:G | donor_gain | 0.89 |
| 7:55020707:GTT:G | donor_gain | 0.85 |
| 7:55020708:TTT:T | donor_gain | 0.85 |
| 7:55019363:AAG:A | donor_gain | 0.85 |
| 7:55019339:C:T | donor_gain | 0.83 |
| 7:55019845:G:T | donor_gain | 0.71 |
| 7:55019362:AAAG:A | donor_gain | 0.71 |
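
The variant keys in the SpliceAI table follow a `chrom:pos:ref:alt` pattern, so they can be split into fields and filtered on the delta score. The sketch below is a minimal illustration using four rows copied from the table; the `SpliceCall` type and `parse_call` helper are hypothetical names, and the 0.7 cut-off is the one given in the table heading.

```python
from typing import NamedTuple


class SpliceCall(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str
    delta: float


def parse_call(variant: str, effect: str, delta: float) -> SpliceCall:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = variant.split(":")
    return SpliceCall(chrom, int(pos), ref, alt, effect, float(delta))


# Rows copied from the SpliceAI table above.
rows = [
    ("7:55019361:GAAAG:G", "donor_gain", 0.99),
    ("7:55019366:G:A", "donor_loss", 0.99),
    ("7:55019845:G:T", "donor_gain", 0.71),
    ("7:55019362:AAAG:A", "donor_gain", 0.71),
]

calls = [parse_call(*row) for row in rows]

# Keep calls at or above the delta-score threshold used in the heading.
high_impact = [c for c in calls if c.delta >= 0.7]

# A reference allele longer than the alternate allele indicates a deletion.
deletions = [c for c in calls if len(c.ref) > len(c.alt)]

for call in high_impact:
    print(call.chrom, call.pos, call.ref, call.alt, call.effect, call.delta)
```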

Section 7: Biological Pathways & Gene Ontology

Reactome Pathways (Total: 37 pathways)

| Pathway ID | Name | Disease Pathway |
| --- | --- | --- |
| R-HSA-177929 | Signaling by EGFR | No |
| R-HSA-1227986 | Signaling by ERBB2 | No |
| R-HSA-1236394 | Signaling by ERBB4 | No |
| R-HSA-182971 | EGFR downregulation | No |
| R-HSA-179812 | GRB2 events in EGFR signaling | No |
| R-HSA-180292 | GAB1 signalosome | No |
| R-HSA-180336 | SHC1 events in EGFR signaling | No |
| R-HSA-212718 | EGFR interacts with phospholipase C-gamma | No |
| R-HSA-1250196 | SHC1 events in ERBB2 signaling | No |
| R-HSA-1251932 | PLCG1 events in ERBB2 signaling | No |
| R-HSA-1257604 | PIP3 activates AKT signaling | No |
| R-HSA-1963640 | GRB2 events in ERBB2 signaling | No |
| R-HSA-1963642 | PI3K events in ERBB2 signaling | No |
| R-HSA-5673001 | RAF/MAP kinase cascade | No |
| R-HSA-6785631 | ERBB2 Regulates Cell Motility | No |
| R-HSA-6811558 | PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling | No |
| R-HSA-8847993 | ERBB2 Activates PTK6 Signaling | No |
| R-HSA-8856825 | Cargo recognition for clathrin-mediated endocytosis | No |
| R-HSA-8856828 | Clathrin-mediated endocytosis | No |
| R-HSA-8857538 | PTK6 promotes HIF1A stabilization | No |
| R-HSA-8863795 | Downregulation of ERBB2 signaling | No |
| R-HSA-9009391 | Extra-nuclear estrogen signaling | No |
| R-HSA-1236382 | Constitutive Signaling by Ligand-Responsive EGFR Cancer Variants | Yes |
| R-HSA-2219530 | Constitutive Signaling by Aberrant PI3K in Cancer | Yes |
| R-HSA-5637810 | Constitutive Signaling by EGFRvIII | Yes |
| R-HSA-5638303 | Inhibition of Signaling by Overexpressed EGFR | Yes |
| R-HSA-9664565 | Signaling by ERBB2 KD Mutants | Yes |
| R-HSA-9665348 | Signaling by ERBB2 ECD mutants | Yes |
| R-HSA-9665686 | Signaling by ERBB2 TMD/JMD mutants | Yes |
| R-HSA-9609690 | HCMV Early Events | Yes |
| R-HSA-9820960 | Respiratory syncytial virus (RSV) attachment and entry | Yes |

Gene Ontology Annotations (Total: 104+ terms)

Molecular Function (TOP 20)

| GO ID | Term |
| --- | --- |
| GO:0005006 | epidermal growth factor receptor activity |
| GO:0004713 | protein tyrosine kinase activity |
| GO:0004714 | transmembrane receptor protein tyrosine kinase activity |
| GO:0004709 | MAP kinase kinase kinase activity |
| GO:0004888 | transmembrane signaling receptor activity |
| GO:0005524 | ATP binding |
| GO:0048408 | epidermal growth factor binding |
| GO:0001618 | virus receptor activity |
| GO:0003682 | chromatin binding |
| GO:0003690 | double-stranded DNA binding |
| GO:0019899 | enzyme binding |
| GO:0019900 | kinase binding |
| GO:0019903 | protein phosphatase binding |
| GO:0030296 | protein tyrosine kinase activator activity |
| GO:0031625 | ubiquitin protein ligase binding |
| GO:0042802 | identical protein binding |
| GO:0045296 | cadherin binding |
| GO:0051015 | actin filament binding |
| GO:0051117 | ATPase binding |

Biological Process (TOP 20)

| GO ID | Term |
| --- | --- |
| GO:0007173 | epidermal growth factor receptor signaling pathway |
| GO:0007165 | signal transduction |
| GO:0007166 | cell surface receptor signaling pathway |
| GO:0008284 | positive regulation of cell population proliferation |
| GO:0030307 | positive regulation of cell growth |
| GO:0030335 | positive regulation of cell migration |
| GO:0042327 | positive regulation of phosphorylation |
| GO:0043066 | negative regulation of apoptotic process |
| GO:0043410 | positive regulation of MAPK cascade |
| GO:0045742 | positive regulation of EGFR signaling pathway |
| GO:0070374 | positive regulation of ERK1 and ERK2 cascade |
| GO:0051897 | positive regulation of PI3K/AKT signal transduction |
| GO:0001934 | positive regulation of protein phosphorylation |
| GO:0045944 | positive regulation of transcription by RNA polymerase II |
| GO:0050679 | positive regulation of epithelial cell proliferation |
| GO:0038134 | ERBB2-EGFR signaling pathway |
| GO:0071364 | cellular response to epidermal growth factor stimulus |

Cellular Component (TOP 20)

| GO ID | Term |
| --- | --- |
| GO:0005886 | plasma membrane |
| GO:0009986 | cell surface |
| GO:0005634 | nucleus |
| GO:0005737 | cytoplasm |
| GO:0005829 | cytosol |
| GO:0005768 | endosome |
| GO:0005794 | Golgi apparatus |
| GO:0000139 | Golgi membrane |
| GO:0005789 | endoplasmic reticulum membrane |
| GO:0005925 | focal adhesion |
| GO:0005929 | cilium |
| GO:0010008 | endosome membrane |
| GO:0016020 | membrane |
| GO:0016323 | basolateral plasma membrane |
| GO:0030054 | cell junction |
| GO:0030669 | clathrin-coated endocytic vesicle membrane |
| GO:0031901 | early endosome membrane |
| GO:0032587 | ruffle membrane |
| GO:0043235 | receptor complex |
| GO:0045121 | membrane raft |

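Section 8: Protein Interactions & Molecular Networks

STRING Interactions (Total: 11,600+)

TOP 50 Highest-Confidence Interacting Proteins

| Partner UniProt | Partner Gene | Score |
| --- | --- | --- |
| P00533 | EGFR (self) | 999 |
| P01133 | EGF | 999 |
| P01135 | TGFA | 998 |
| Q99075 | HBEGF | 998 |
| P29354 | SRC | 997 |
| P98202 | ERBB3 | 997 |
| O14944 | EREG (Epiregulin) | 996 |
| P15514 | AREG (Amphiregulin) | 996 |
| P12830 | CDH1 | 995 |
| P16070 | CD44 | 995 |
| P22681 | CBL | 995 |
| P35070 | BTC (Betacellulin) | 995 |
| P13931 | PLCG1 | 994 |
| P29353 | SHC1 | 994 |
| P07900 | HSP90AA1 | 993 |
| P08238 | HSP90AB1 | 993 |
| Q6UW88 | EPGN | 992 |
| Q06124 | PTPN11 | 991 |
| P12931 | SRC | 988 |
| Q03135 | CAV1 | 987 |
| P07585 | DCN | 986 |
| P40763 | STAT3 | 986 |
| Q13480 | GAB1 | 986 |
| P21860 | ERBB3 | 984 |
| P04626 | ERBB2 | 983 |
| P08581 | MET | 979 |
| P18031 | PTPN1 | 978 |
| P42336 | PIK3CA | 977 |
| Q14956 | GPNMB | 977 |
| Q15303 | ERBB4 | 977 |
| P03372 | ESR1 | 974 |
| P01137 | TGFB1 | 973 |
| P08069 | IGF1R | 973 |
| P09619 | PDGFRB | 971 |
| P17936 | IGFBP3 | 971 |
| Q14451 | GRB7 | 967 |
| O14511 | NRG2 | 959 |
| Q13882 | PTK6 | 948 |
| P14210 | HGF | 947 |
| P05231 | IL6 | 943 |
| Q07889 | SOS1 | 943 |
| Q9UKV8 | AGO2 | 941 |
| P35222 | CTNNB1 | 940 |
| P48509 | NGFR | 940 |
| P13866 | SLC2A1 | 938 |
| P42229 | STAT5A | 933 |
| P01112 | HRAS | 931 |
| P01308 | INS | 930 |
| P60484 | PIK3R1 | 926 |
| Q05397 | FAK1/PTK2 | 926 |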
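
The STRING scores above are integers on a 0 to 1000 scale, which correspond to combined confidence values between 0 and 1. The sketch below rescales a score and assigns a tier, assuming the commonly used cut-offs of 0.4 (medium), 0.7 (high), and 0.9 (highest); those cut-offs and the helper name `string_confidence` are assumptions, not values taken from this reference.

```python
def string_confidence(score: int) -> tuple[float, str]:
    """Rescale a 0-1000 STRING score to 0-1 and assign an assumed tier."""
    if not 0 <= score <= 1000:
        raise ValueError("STRING scores range from 0 to 1000")
    confidence = score / 1000
    if confidence >= 0.9:
        tier = "highest"
    elif confidence >= 0.7:
        tier = "high"
    elif confidence >= 0.4:
        tier = "medium"
    else:
        tier = "low"
    return confidence, tier


# Two rows copied from the table above: (partner gene, score).
partners = [("EGF", 999), ("PIK3R1", 926)]

for gene, score in partners:
    confidence, tier = string_confidence(score)
    print(f"{gene}: {confidence:.3f} ({tier})")
```

Every partner in the TOP 50 table scores at least 926, so all of them fall in the top tier under these assumed cut-offs.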

IntAct Interactions (Total: 1,747 interactions)

TOP Direct Interactions (by confidence)

| Partner | Interaction Type | Confidence |
| --- | --- | --- |
| GRB2 | physical association | 0.980 |
| EGF | direct interaction | 0.970 |
| CBL | physical association | 0.960 |
| ERBB2 | physical association | 0.950 |
| PTPN1 | colocalization | 0.900 |
| CALM1 | physical association | 0.830 |
| TGFA | direct interaction | 0.780 |
| GAPDH | association | 0.790 |
| RUBCN | physical association | 0.650 |
| GOLM1 | physical association | 0.640 |

Protein Similarity

ESM2 Structural/Embedding Similarity (33 similar proteins)

| UniProt | Top Similarity | Avg Similarity |
| --- | --- | --- |
| P21860 (ERBB3) | 1.000 | 0.964 |
| Q5RB22 (Chimpanzee) | 1.000 | 0.964 |
| Q61526 (Mouse Erbb3) | 0.999 | 0.965 |
| Q62799 (Rat Erbb3) | 0.999 | 0.965 |
| P55245 | 0.999 | 0.954 |
| P04626 (ERBB2) | 0.999 | 0.952 |
| P06494 | 0.999 | 0.960 |
| O18735 | 0.999 | 0.954 |

Diamond Sequence Similarity (55 homologs)

| UniProt | Identity | Bitscore |
| --- | --- | --- |
| P00534 | 100.0% | 1210 |
| P00535 | 99.5% | 1211 |
| Q61527 (Mouse Erbb4) | 98.8% | 2628 |
| Q62956 (Rat Erbb4) | 98.8% | 2631 |
| P21860 (ERBB3) | 98.0% | 2625 |
| Q5RB22 | 98.0% | 2627 |
| P48025 | 97.3% | 1268 |
| Q15303 (ERBB4) | 97.3% | 2605 |
| Q61526 (Mouse Erbb3) | 96.7% | 2553 |
| Q62799 (Rat Erbb3) | 96.7% | 2551 |
| O19064 | 95.3% | 2239 |
| P06494 | 94.9% | 2473 |
| P70424 | 94.9% | 2472 |

Section 9: Transcription Factor Regulatory Data

Note: EGFR is NOT a transcription factor, but it is regulated by many TFs and has been shown to have some nuclear signaling roles.

Downstream Targets of EGFR (non-canonical nuclear signaling)

| Target Gene | Regulation |
| --- | --- |
| KRT14 | Activation |
| SOX2 | Activation |

Upstream Regulators (TFs that regulate EGFR) - Total: 100+

Activating Transcription Factors

| TF Gene | Confidence |
| --- | --- |
| AP1 | High |
| AR | High |
| BCL11B | High |
| BCL3 | High |
| EGR1 | High |
| FOS | High |
| HOXB7 | High |
| JUN | High |
| JUNB | High |
| NFKB1 | High |
| NFKB | High |
| SOX2 | High |
| STAT3 | High |
| TP53 | High |
| YBX1 | High |
| YY1 | High |

Repressing Transcription Factors

| TF Gene | Confidence |
| --- | --- |
| BRCA1 | - |
| EMX2 | - |
| GCFC2 | High |
| GLI1 | - |
| HDAC1 | - |
| KLF10 | High |
| LRRFIP1 | High |
| PML | - |
| PPARG | High |
| RARA | - |
| SP1 | High |
| TP63 | High |
| VDR | High |
| WT1 | High |

Section 10: Drug & Pharmacology Data

ChEMBL Target Information

| Target ID | Target Name | Type |
| --- | --- | --- |
| CHEMBL203 | Epidermal growth factor receptor | SINGLE PROTEIN |
| CHEMBL2111431 | EGFR and ERBB2 (HER1 and HER2) | PROTEIN FAMILY |
| CHEMBL2363049 | Epidermal growth factor receptor | PROTEIN FAMILY |

Approved Drugs Targeting EGFR (ChEMBL Phase 4, Total: 70)

Primary EGFR Inhibitors (Targeted Therapies)

| ChEMBL ID | Drug Name | Type | Indication |
| --- | --- | --- | --- |
| CHEMBL553 | ERLOTINIB | Small molecule | NSCLC, Pancreatic cancer |
| CHEMBL1079742 | ERLOTINIB HYDROCHLORIDE | Small molecule | NSCLC |
| CHEMBL939 | GEFITINIB | Small molecule | NSCLC |
| CHEMBL1173655 | AFATINIB | Small molecule | NSCLC |
| CHEMBL2105712 | AFATINIB DIMALEATE | Small molecule | NSCLC |
| CHEMBL554 | LAPATINIB | Small molecule | HER2+ breast cancer |
| CHEMBL1201179 | LAPATINIB DITOSYLATE | Small molecule | HER2+ breast cancer |
| CHEMBL3353410 | OSIMERTINIB | Small molecule | NSCLC (T790M+) |
| CHEMBL2105719 | DACOMITINIB | Small molecule | NSCLC |
| CHEMBL2110732 | DACOMITINIB ANHYDROUS | Small molecule | NSCLC |
| CHEMBL180022 | NERATINIB | Small molecule | HER2+ breast cancer |
| CHEMBL4650319 | MOBOCERTINIB | Small molecule | NSCLC (EGFR exon 20 insertion) |
| CHEMBL4558324 | LAZERTINIB | Small molecule | NSCLC |
| CHEMBL3786343 | OLMUTINIB | Small molecule | NSCLC |
| CHEMBL24828 | VANDETANIB | Small molecule | Medullary thyroid cancer |
| CHEMBL3989868 | TUCATINIB | Small molecule | HER2+ breast cancer |

Multi-Kinase Inhibitors with EGFR Activity

| ChEMBL ID | Drug Name | Type |
| --- | --- | --- |
| CHEMBL535 | SUNITINIB | Small molecule |
| CHEMBL1336 | SORAFENIB | Small molecule |
| CHEMBL1171837 | PONATINIB | Small molecule |
| CHEMBL1289926 | AXITINIB | Small molecule |
| CHEMBL2105717 | CABOZANTINIB | Small molecule |
| CHEMBL601719 | CRIZOTINIB | Small molecule |
| CHEMBL941 | IMATINIB | Small molecule |
| CHEMBL1421 | DASATINIB | Small molecule |
| CHEMBL5416410 | DASATINIB | Small molecule |
| CHEMBL288441 | BOSUTINIB | Small molecule |
| CHEMBL1229517 | VEMURAFENIB | Small molecule |
| CHEMBL1738797 | ALECTINIB | Small molecule |
| CHEMBL2403108 | CERITINIB | Small molecule |
| CHEMBL3286830 | LORLATINIB | Small molecule |
| CHEMBL3545311 | BRIGATINIB | Small molecule |
| CHEMBL3301622 | GILTERITINIB | Small molecule |
| CHEMBL608533 | MIDOSTAURIN | Small molecule |

Other Phase 4 Compounds with EGFR Activity

| ChEMBL ID | Drug Name | Type |
| --- | --- | --- |
| CHEMBL1873475 | IBRUTINIB | Small molecule |
| CHEMBL3707348 | ACALABRUTINIB | Small molecule |
| CHEMBL3936761 | ZANUBRUTINIB | Small molecule |
| CHEMBL1614701 | SELUMETINIB | Small molecule |
| CHEMBL3301610 | ABEMACICLIB | Small molecule |
| CHEMBL98 | VORINOSTAT | Small molecule |

PharmGKB Status

| Attribute | Value |
| --- | --- |
| PharmGKB ID | PA7360 |
| VIP Gene | Yes (Very Important Pharmacogene) |
| Has Variant Annotations | Yes |
| Has CPIC Guideline | No |

Section 11: Expression Profiles

Bgee Expression Summary

| Property | Value |
| --- | --- |
| Expression Pattern | Ubiquitous |
| Total Present Calls | 285 |
| Total Absent Calls | 13 |
| Total Conditions | 298 |
| Max Expression Score | 99.12 |
| Average Expression Score | 83.64 |
| Gold Quality Count | 285 |

TOP 30 Highest-Expressing Tissues

| Tissue (UBERON ID) | Score | Quality |
| --- | --- | --- |
| Nipple (UBERON:0002030) | 99.12 | Gold |
| Gingiva (UBERON:0001828) | 98.63 | Gold |
| Gingival epithelium (UBERON:0001949) | 98.62 | Gold |
| Placenta (UBERON:0001987) | 98.56 | Gold |
| Mammalian vulva (UBERON:0000997) | 98.48 | Gold |
| Tongue squamous epithelium (UBERON:0006919) | 98.32 | Gold |
| Skin of hip (UBERON:0001554) | 98.28 | Gold |
| Superficial temporal artery (UBERON:0001614) | 97.94 | Gold |
| Decidua (UBERON:0002450) | 97.78 | Gold |
| Penis (UBERON:0000989) | 97.65 | Gold |
| Pharyngeal mucosa (UBERON:0000355) | 97.63 | Gold |
| Mucosa of paranasal sinus (UBERON:0005030) | 97.49 | Gold |
| Urethra (UBERON:0000057) | 97.31 | Gold |
| Saphenous vein (UBERON:0007318) | 97.30 | Gold |
| Lower lobe of lung (UBERON:0008949) | 96.72 | Gold |
| Oral cavity (UBERON:0000167) | 96.61 | Gold |
| Sural nerve (UBERON:0015488) | 96.43 | Gold |
| Superior surface of tongue (UBERON:0007371) | 96.40 | Gold |
| Upper leg skin (UBERON:0004262) | 96.33 | Gold |
| Mammary duct (UBERON:0001765) | 96.28 | Gold |
| Tongue (UBERON:0001723) | 96.21 | Gold |
| Upper arm skin (UBERON:0004263) | 96.11 | Gold |
| Hair follicle (UBERON:0002073) | 95.77 | Gold |
| Synovial joint (UBERON:0002217) | 95.73 | Gold |
| Zone of skin (UBERON:0000014) | 95.68 | Gold |
| Body of tongue (UBERON:0011876) | 95.67 | Gold |
| Cauda epididymis (UBERON:0004360) | 95.66 | Gold |
| Cervix epithelium (UBERON:0004801) | 95.56 | Gold |
| Skin of leg (UBERON:0001511) | 95.49 | Gold |
| Skin of abdomen (UBERON:0001416) | 95.43 | Gold |

Expression Pattern Summary: EGFR is highly expressed in epithelial tissues including skin, oral mucosa, respiratory epithelium, and reproductive tissues.

Single-Cell Expression Datasets (Total: 11 datasets)

| Dataset ID | Description | Species | Cell Count |
| --- | --- | --- | --- |
| E-ANND-2 | GTEx: snRNAseq atlas | Homo sapiens | 209,126 |
| E-MTAB-6701 | Human first trimester fetal-maternal interface (10x) | Homo sapiens | 135,071 |
| E-CURD-114 | Human airway epithelium smoking effects | Homo sapiens | 81,801 |
| E-MTAB-11268 | Human hypertrophied heart atlas | Homo sapiens | 64,898 |
| E-MTAB-9435 | IDHwt glioblastoma tumors | Homo sapiens | 62,867 |
| E-HCAD-24 | Human first-trimester placenta and decidua | Homo sapiens | 24,780 |
| E-MTAB-8559 | Ovarian cancer ex vivo models | Homo sapiens | 20,982 |
| E-GEOD-84465 | Glioblastoma migrating front cells | Homo sapiens | 3,588 |
| E-MTAB-10596 | Human dental follicle organoids | Homo sapiens | 3,388 |
| E-MTAB-10137 | Human dermal blood vascular endothelium | Homo sapiens | 1,523 |
| E-ENAD-27 | Human islet cells in type 2 diabetes | Homo sapiens | 1,145 |

Section 12: Disease Associations

Mendelian/Monogenic Disease Links (GenCC)

| Disease | OMIM/MONDO | Inheritance | Classification | Submitter |
| --- | --- | --- | --- | --- |
| Lung cancer | OMIM:211980 | Autosomal dominant | Definitive | Ambry Genetics, G2P |
| Inflammatory skin and bowel disease, neonatal, 2 (NISBD2) | OMIM:616069 | Autosomal recessive | Strong/Moderate | Labcorp, Ambry, G2P |
| Neonatal inflammatory skin and bowel disease | ORPHANET:294023 | Autosomal recessive | Supportive | Orphanet |

Orphanet Disease Associations

| Orphanet ID | Disease Name | Type |
| --- | --- | --- |
| 251576 | Gliosarcoma | Histopathological subtype |
| 251579 | Giant cell glioblastoma | Histopathological subtype |

HPO Phenotype Associations (Total: 21 terms)

| HPO ID | Phenotype |
| --- | --- |
| HP:0000006 | Autosomal dominant inheritance |
| HP:0000007 | Autosomal recessive inheritance |
| HP:0030358 | Non-small cell lung carcinoma |
| HP:0030078 | Lung adenocarcinoma |
| HP:0006519 | Alveolar cell carcinoma |
| HP:0000527 | Long eyelashes |
| HP:0000822 | Hypertension |
| HP:0001442 | Typified by somatic mosaicism |
| HP:0001508 | Failure to thrive |
| HP:0001561 | Polyhydramnios |
| HP:0001680 | Coarctation of aorta |
| HP:0001944 | Dehydration |
| HP:0002013 | Vomiting |
| HP:0003212 | Increased circulating IgE concentration |
| HP:0003577 | Congenital onset |
| HP:0005208 | Secretory diarrhea |
| HP:0006532 | Recurrent pneumonia |
| HP:0025092 | Epidermal acanthosis |
| HP:0100501 | Recurrent bronchiolitis |
| HP:0200034 | Papule |
| HP:0200039 | Pustule |

GWAS Associations (Total: 35 associations)

| Study | Trait | P-value |
| --- | --- | --- |
| GCST004349_12 | Glioblastoma | 5×10⁻³⁴ |
| GCST004347_14 | Glioma | 4×10⁻²⁷ |
| GCST004349_6 | Glioblastoma | 5×10⁻²³ |
| GCST90002400_91 | Plateletcrit | 6×10⁻¹⁸ |
| GCST90002401_81 | Platelet distribution width | 1×10⁻¹⁷ |
| GCST006480_2 | Glioblastoma (age-stratified) | 4×10⁻¹⁶ |
| GCST005932_8 | Glioblastoma | 3×10⁻¹⁶ |
| GCST90014033_78 | Haemorrhoidal disease | 3×10⁻¹³ |
| GCST005932_10 | Glioblastoma | 1×10⁻¹² |
| GCST90002402_65 | Platelet count | 1×10⁻¹² |
| GCST006480_15 | Glioblastoma (age-stratified) | 2×10⁻¹² |
| GCST005931_12 | Glioma | 7×10⁻¹² |
| GCST005931_14 | Glioma | 5×10⁻¹² |
| GCST005932_9 | Glioblastoma | 1×10⁻¹¹ |
| GCST006480_3 | Glioblastoma (age-stratified) | 2×10⁻¹¹ |
| GCST006480_9 | Glioblastoma (age-stratified) | 7×10⁻¹² |
| GCST006480_4 | Glioblastoma (age-stratified) | 2×10⁻¹⁰ |
| GCST005931_13 | Glioma | 2×10⁻⁹ |
| GCST90002391_152 | Mean corpuscular hemoglobin concentration | 2×10⁻⁹ |
| GCST006480_16 | Glioblastoma (age-stratified) | 2×10⁻⁹ |
| GCST90002393_38 | Monocyte count | 4×10⁻⁹ |
| GCST009391_2084 | Metabolite levels | 4×10⁻⁹ |
| GCST90002397_16 | Mean spheric corpuscular volume | 3×10⁻⁹ |
| GCST004348_9 | Non-glioblastoma glioma | 2×10⁻⁸ |
| GCST001058_7 | Glioma | 8×10⁻⁸ |
| GCST001058_1 | Glioma | 7×10⁻⁸ |
| GCST006480_17 | Glioblastoma (age-stratified) | 6×10⁻⁸ |
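
When reading the P-value column, it can be useful to separate associations that pass the conventional genome-wide significance threshold of 5×10⁻⁸ from those that fall just short of it. The threshold is a widely used convention and is an assumption here, not something stated in this reference. The sketch below applies it to four rows copied from the table.

```python
# Conventional genome-wide significance threshold (assumed, not from the source).
GENOME_WIDE_THRESHOLD = 5e-8

# Rows copied from the GWAS table above: (study ID, trait, p-value).
associations = [
    ("GCST004349_12", "Glioblastoma", 5e-34),
    ("GCST004348_9", "Non-glioblastoma glioma", 2e-8),
    ("GCST001058_7", "Glioma", 8e-8),
    ("GCST006480_17", "Glioblastoma (age-stratified)", 6e-8),
]

significant = [a for a in associations if a[2] < GENOME_WIDE_THRESHOLD]
suggestive = [a for a in associations if a[2] >= GENOME_WIDE_THRESHOLD]

print("genome-wide significant:", [study for study, _, _ in significant])
print("below threshold:", [study for study, _, _ in suggestive])
```

Under this threshold, the last three rows of the table (p-values of 8×10⁻⁸, 7×10⁻⁸, and 6×10⁻⁸) would count as suggestive rather than genome-wide significant.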

SUMMARY STATISTICS

| Category | Count |
| --- | --- |
| Ensembl Transcripts | 17 |
| RefSeq mRNA Transcripts | 9+ |
| CCDS IDs | 6 |
| Exons (canonical) | 28 |
| UniProt Entries | 4 |
| Protein Length | 1,210 aa |
| InterPro Domains | 15 |
| PDB Structures | 378 |
| ClinVar Variants | 3,790 |
| Pathogenic Variants | 63 |
| AlphaMissense Predictions | 8,041 |
| SpliceAI Predictions | 4,358 |
| Reactome Pathways | 37 |
| GO Terms | 104+ |
| STRING Interactions | 11,600+ |
| IntAct Interactions | 1,747 |
| Approved Targeting Drugs (ChEMBL Phase 4) | 70 |
| GWAS Associations | 35 |
| Single-Cell Datasets | 11 |
| Orthologs (key species) | 5 |

Reference compiled from the BioBTree database, integrating: HGNC, Ensembl, NCBI Entrez, UniProt, PDB, AlphaFold, ClinVar, AlphaMissense, SpliceAI, Reactome, Gene Ontology, STRING, IntAct, ChEMBL, PharmGKB, Bgee, GWAS Catalog, GenCC, HPO, Orphanet, and Expression Atlas.