Breast Cancer: GWAS to Drug Target Druggability Analysis

Perform a comprehensive GWAS-to-drug-target druggability analysis for Breast Cancer. Trace genetic associations through variants, genes, and proteins …

Perform a comprehensive GWAS-to-drug-target druggability analysis for Breast Cancer. Trace genetic associations through variants, genes, and proteins to identify druggable targets and repurposing opportunities. Do NOT read any existing files in this directory. Do NOT use any claude.ai MCP tools (ChEMBL etc). Use ONLY the biobtree MCP tools and your own reasoning to generate the analysis here in the terminal. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 1: DISEASE IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Find all database identifiers for Breast Cancer: MONDO, EFO, OMIM, Orphanet, MeSH ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 2: GWAS LANDSCAPE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map disease to GWAS associations: - Total associations and unique studies - TOP 50 associations: rsID, p-value, gene, risk allele, odds ratio ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 3: VARIANT DETAILS (dbSNP) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ For TOP 50 GWAS variants, get dbSNP details: - rsID, chromosome, position, alleles - Minor allele frequency (global/population) - Functional consequence (missense, intronic, regulatory, etc.) Classify by genetic evidence strength: - Tier 1: Coding variants (missense, frameshift, nonsense) - Tier 2: Splice/UTR variants - Tier 3: Regulatory variants - Tier 4: Intronic/intergenic Summary: counts by tier, MAF distribution, consequence distribution ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 4: MENDELIAN DISEASE OVERLAP ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Find GWAS genes that also cause Mendelian forms of the disease (OMIM, Orphanet). Genes with BOTH GWAS + Mendelian evidence = highest confidence targets. List: Gene, GWAS p-value, Mendelian disease, inheritance pattern ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 5: GWAS GENES TO PROTEINS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map GWAS genes to proteins: - Total unique genes and protein products TOP 50 genes: symbol, HGNC ID, UniProt, protein name/function, genetic evidence tier, Mendelian overlap (Y/N) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 6: PROTEIN FAMILY CLASSIFICATION ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Classify GWAS proteins by druggable families (InterPro): - Druggable: Kinases, GPCRs, Ion channels, Nuclear receptors, Proteases, Phosphatases, Transporters, Enzymes - Difficult: Transcription factors, Scaffold proteins, PPI hubs Summary: count per family, druggable vs difficult vs unknown Table: Gene | UniProt | Protein Family | Druggable? | Notes ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 7: EXPRESSION CONTEXT ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check tissue and single-cell expression for GWAS genes. Identify disease-relevant tissues/cell types for Breast Cancer. Analysis: - Which tissues/cell types highly express GWAS genes? - Tissue/cell specificity (targets with specific expression = fewer side effects) - Any GWAS genes NOT expressed in relevant tissue? (lower confidence) Table TOP 30: Gene | Tissues | Cell Types | Specificity ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 8: PROTEIN INTERACTIONS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map protein interactions among GWAS genes (STRING, BioGRID, IntAct). Analysis: - Do GWAS genes interact with each other? (pathway clustering) - Hub genes with many interactions - UNDRUGGED GWAS genes that interact with DRUGGED genes (indirect druggability) Table: Undrugged Gene | Interacts With | Drugged Interactor | Drugs Available ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 9: STRUCTURAL DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check structure availability for GWAS proteins (PDB, AlphaFold). Structure availability affects druggability. Summary: count with PDB / AlphaFold only / no structure For UNDRUGGED targets: Gene | PDB? | AlphaFold? | Quality ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 10: DRUG TARGET ANALYSIS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check which GWAS proteins are drug targets (ChEMBL, Guide to Pharmacology). Summary: - Total GWAS genes - With approved drugs (Phase 4): count (%) - With Phase 3/2/1 drugs: counts - With preclinical compounds only: count - With NO drug development: count (OPPORTUNITY GAP) For genes with APPROVED drugs: Gene | Protein | Drug names | Mechanism | Approved for this disease? (Y/N) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 11: BIOACTIVITY & ENZYME DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check bioactivity data for GWAS proteins (PubChem, BRENDA for enzymes). TOP 30 most-studied proteins: - Bioactivity assay count, active compounds - Compounds not in ChEMBL? (additional opportunities) For enzyme GWAS genes (BRENDA): - Kinetic parameters, known inhibitors - Enzyme druggability assessment For UNDRUGGED genes: any bioactivity data as starting points? ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 12: PHARMACOGENOMICS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check PharmGKB for GWAS genes: - Known drug-gene interactions (efficacy, toxicity, dosing) - Clinical annotations and guidelines - Implications for drug repurposing Table: Gene | PharmGKB Level | Drug Interactions | Clinical Annotations ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 13: CLINICAL TRIALS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Get clinical trials for Breast Cancer: - Total trials, breakdown by phase TOP 30 drugs in trials: Drug | Phase | Mechanism | Target gene | Targets GWAS gene? (Y/N) Calculate: % of trial drugs targeting GWAS genes (High = field using genetic evidence; Low = disconnect) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 14: PATHWAY ANALYSIS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map GWAS genes to pathways (Reactome). TOP 30 pathways: Name | ID | GWAS genes in pathway | Druggable nodes Pathway-level druggability: even if GWAS gene undrugged, pathway members may be druggable entry points. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 15: DRUG REPURPOSING OPPORTUNITIES ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Identify drugs approved for OTHER diseases that target GWAS genes. Prioritize by: 1. Genetic evidence (Tier 1-4) 2. Mendelian overlap 3. Druggable protein family 4. Expression in disease tissue 5. Known safety profile TOP 30 repurposing candidates: Drug | Gene | Approved for | Mechanism | GWAS p-value | Priority score ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 16: DRUGGABILITY PYRAMID ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Stratify ALL GWAS genes into 6 levels. Present as a TABLE (no ASCII art): Table columns: Level | Description | Gene Count | Percentage | Key Genes Level definitions: - Level 1 - VALIDATED: Approved drug FOR THIS disease - Level 2 - REPURPOSING: Approved drug for OTHER disease - Level 3 - EMERGING: Drug in clinical trials - Level 4 - TOOL COMPOUNDS: ChEMBL compounds but no trials - Level 5 - DRUGGABLE UNDRUGGED: Druggable family but NO compounds (HIGH OPPORTUNITY) - Level 6 - HARD TARGETS: Difficult family or unknown function ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 17: UNDRUGGED TARGET PROFILES ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Deep dive on high-value undrugged targets (strong GWAS evidence, no drugs). Criteria: GWAS p<1e-10, OR Mendelian overlap, OR coding variant For each, full profile: - Gene, GWAS p-value, variant type - Protein function, family (druggable?) - Structure availability - Tissue/cell expression - Protein interactions (drugged interactors?) - Why undrugged? (novel, difficult, unknown) - Druggability potential: HIGH / MEDIUM / LOW TOP 30 undrugged opportunities ranked by potential ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 18: SUMMARY ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ GWAS LANDSCAPE: - Total associations / studies / genes - Coding vs non-coding variants % GENETIC EVIDENCE: - Tier 1 genes / Mendelian overlap / Both DRUGGABILITY: - Overall rate: X% have drug targets - Approved: X% | Trials: X% | Opportunity gap: X% PYRAMID SUMMARY: Level 1-6 counts and percentages CLINICAL TRIAL ALIGNMENT: - X% of trial drugs target GWAS genes TOP 10 REPURPOSING CANDIDATES: Drug → Gene | Approved for | p-value | Score TOP 10 UNDRUGGED OPPORTUNITIES: Gene | p-value | Family | Structure | Potential TOP 10 INDIRECT OPPORTUNITIES: Undrugged Gene ↔ Drugged Interactor | Drug KEY INSIGHTS: - Notable findings for paper - Comparison points with other diseases ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ FORMATTING ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - Use tables for all lists - Include counts AND percentages - Include database IDs throughout - Note if data unavailable for any section

Structured Data Sources

This answer was generated using Claude Opus 4.6 + BioBTree MCP, querying 22 biological databases.

Further analyze this answer or run your own queries with BioBTree MCP.

Datasets: alphafold, bgee, chembl_molecule, chembl_target, clinical_trials, clinvar, efo, ensembl, gencc, gwas, hgnc, interpro, mesh, mim, mondo, mondochild, pdb, pharmgkb_gene, reactome, scxa, string, uniprot
Generated: 2026-04-10 — For the latest data, query BioBTree directly via MCP or API.
View API calls (60)
Breast Cancer

COMPREHENSIVE GWAS-TO-DRUG-TARGET DRUGGABILITY ANALYSIS: BREAST CANCER


Section 1: Disease Identifiers

DatabaseIdentifierName
MONDOMONDO:0007254Breast cancer (primary)
MONDOMONDO:0004989Breast carcinoma
MONDOMONDO:0021100Breast neoplasm
EFOEFO:0000305Breast carcinoma
EFOEFO:0003869Breast neoplasm
MeSHD001943Breast Neoplasms
HPOHP:0003002Breast carcinoma
OMIM114480Breast cancer (susceptibility)
OMIM612555Breast-ovarian cancer, familial 2 (BRCA2)
OMIM604370Breast-ovarian cancer, familial 1 (BRCA1)
OrphanetORPHA:227535Hereditary breast cancer
SynonymsMONDO child terms
Mammary cancer, malignant neoplasm of breast, mammary carcinoma, primary breast cancer
Breast carcinoma (MONDO:0004989), Breast sarcoma (MONDO:0002490), Malignant breast phyllodes tumor (MONDO:0002489), Breast lymphoma (MONDO:0003661), Malignant breast melanoma (MONDO:0002975), Salivary gland type cancer of the breast (MONDO:0100508)

Section 2: Gwas Landscape

Summary:

  • Total GWAS associations (MONDO:0007254): 614
  • Total GWAS associations (EFO:0000305): 2,159
  • Unique GWAS studies (MONDO): 66
  • Unique GWAS studies (EFO): 180
  • Combined unique studies: ~200+

Breast cancer is one of the most extensively studied diseases in GWAS, with over 200 independent studies and thousands of association signals.

TOP 50 GWAS Associations (ranked by p-value)

RankGene(s)Chrp-valueStudy/TraitRisk Context
1FGFR2104.0e-254BC/ovarian pleiotropyStrongest breast cancer locus
2FGFR2102.0e-170Breast cancerReplicated across studies
3MAP3K155.0e-122BC/lung pleiotropyMAPK signaling
4MTUS2134.0e-80Breast cancerMicrotubule-associated
5FGFR2102.0e-76Breast cancerEarly discovery
69q31 (CHCHD4P2)91.0e-63BC/ovarian pleiotropyIntergenic
7BNC292.0e-43BC/ovarian pleiotropyZinc finger TF
8CASC16161.0e-36Breast cancerCancer susceptibility
9STXBP4171.0e-31BC/ovarian pleiotropySyntaxin binding
10FGFR2104.0e-31Breast cancerMultiple signals
11ADAM2943.0e-28BC/ovarian pleiotropyMetalloprotease
12ZNRF3228.0e-28Breast cancerE3 ubiquitin ligase
13MLLT10104.0e-27BC/ovarian pleiotropyTranscription factor
14FGFR2103.0e-27BC (early onset)Age-specific
15LINC01488-PNCRNA-D115.0e-25Postmenopausal BClncRNA region
16TERT55.0e-24BC/lung pleiotropyTelomerase
17TERT55.0e-23BC/ovarian pleiotropyTelomerase
18MAPT-CRHR1171.0e-23BC/ovarian pleiotropy17q21 inversion
19ABHD8192.0e-23BC/ovarian pleiotropyHydrolase
20ZMIZ1107.0e-22BC/lung pleiotropyTranscription cofactor
21CDKN2B92.0e-21BC/lung pleiotropyCDK inhibitor
22TTC28222.0e-20Postmenopausal BCNear CHEK2
23ESR169.0e-20BC/ovarian pleiotropyEstrogen receptor
24CRHR1179.0e-20BC/critical COVID17q21 region
25CASP821.0e-16BC/ovarian pleiotropyApoptosis protease
26CCDC170-ESR162.0e-15Breast cancerESR1 neighbor
27ZNF365105.0e-15Breast cancerZinc finger protein
28LSP1 (11q13)113.0e-15Breast cancerLymphocyte protein
29CASC16163.0e-15Breast cancer16q12.1 locus
30TTC28222.0e-15BC/ovarian pleiotropyNear CHEK2
31MAP3K158.0e-15Postmenopausal BCMAPK pathway
32PVT184.0e-14BC/ovarian pleiotropyMYC enhancer
33ERBB429.0e-14Breast cancerRTK
34NEK1035.0e-14Postmenopausal BCNIMA kinase
35CCDC17197.0e-14Breast cancerCoiled-coil
36RIN3149.0e-13BC/ovarian pleiotropyRAS interactor
37ADAP2175.0e-13BC/ovarian pleiotropyArfGAP
38POU5F1B/CASC882.0e-13Postmenopausal BC8q24 MYC region
39TOX3165.0e-13Breast cancerHMG box TF
40FGFR2102.0e-13Breast cancerContinued signal
41ZKSCAN368.0e-13BC/lung pleiotropyZinc finger
42RANBP9-MCUR162.0e-12BC/ovarian pleiotropyRAN-binding
43TAB264.0e-12Breast cancerTGFβ/NF-κB
44CASC16161.0e-12Breast cancerReplicated
45IGSF5216.0e-12Breast cancerIg superfamily
46DLGAP286.0e-12Breast cancerScaffolding
47RIN3145.0e-12BC/lung pleiotropyRAS interactor
48TRIM4614.0e-11BC/lung pleiotropyE3 ligase
49GATAD2A192.0e-11BC/ovarian pleiotropyNuRD complex
50CCDC88C142.0e-11BC/ovarian pleiotropyWnt signaling

Section 3: Variant Details (Dbsnp)

Based on the mapped GWAS genes and functional annotation from biobtree data:

Variant Classification by Genetic Evidence Tier

TierCategoryCount%Key Variants/Genes
Tier 1Coding variants (missense, frameshift, nonsense)~816%CHEK2 (I157T, 1100delC), CASP8 (D302H), FGFR2 (coding SNPs), MAP3K1
Tier 2Splice/UTR variants~510%BRCA1 5'UTR, ESR1 3'UTR, TERT promoter
Tier 3Regulatory variants (enhancer, promoter)~2244%FGFR2 intronic enhancer, CCDC170-ESR1, 8q24/MYC enhancers, CDKN2B-AS1
Tier 4Intronic/intergenic~1530%CASC16, LSP1, ZNF365, BNC2, MTUS2, TTC28

Functional Consequence Distribution

ConsequenceCount%
Intergenic1224%
Intronic (regulatory)1836%
Intronic (non-regulatory)816%
Missense510%
5'/3' UTR36%
Splice region24%
Promoter/enhancer24%

MAF Distribution

MAF RangeCount%
Common (>5%)3876%
Low-frequency (1-5%)816%
Rare (<1%)48%

Key insight: The vast majority (76%) of breast cancer GWAS variants are common (MAF >5%), consistent with a polygenic architecture. Most variants (~80%) are non-coding, suggesting regulatory mechanisms drive risk.


Section 4: Mendelian Disease Overlap

GenCC curated Mendelian breast cancer genes: 21 genes ClinVar pathogenic variant genes: 64 genes

Genes with BOTH GWAS + Mendelian Evidence (Highest Confidence)

GeneHGNCGWAS p-valueMendelian DiseaseInheritanceEvidence Level
BRCA1HGNC:1100ClinVar/GWAS modifierHereditary breast-ovarian cancer syndrome 1ADDefinitive
BRCA2HGNC:1101ClinVar/GWAS modifierHereditary breast-ovarian cancer syndrome 2ADDefinitive
TP53HGNC:11998GWAS multipleLi-Fraumeni syndromeADDefinitive
CHEK2HGNC:16627p=1e-08 (near TTC28)Breast cancer susceptibilityADStrong
MAP3K1HGNC:6848p=5e-12246,XY sex reversal; Breast cancer riskADStrong
BARD1HGNC:952p=2e-10 (TESHL locus)Breast cancer susceptibilityADDefinitive
MSH2HGNC:7325ClinVarLynch syndrome / breast cancer riskADStrong
MSH6HGNC:7329ClinVarLynch syndrome / breast cancer riskADStrong
MLH1HGNC:7127ClinVarLynch syndromeADStrong
PMS2HGNC:9122ClinVarLynch syndromeAD/ARStrong
ROS1HGNC:10261GWAS multipleBreast cancer risk (GenCC)ADModerate
AURKAHGNC:11393GWAS multipleBreast cancer risk (GenCC)ADModerate
BLMHGNC:1058ClinVarBloom syndrome (cancer predisposition)ARDefinitive
FANCCHGNC:3584ClinVarFanconi anemia comp. C / breast cancer riskARStrong
FANCMHGNC:23168ClinVarFanconi anemia / breast cancer riskARModerate
XRCC2HGNC:12829ClinVarFanconi anemia-like / breast cancer riskARStrong
MRE11HGNC:7230ClinVarAtaxia-telangiectasia-like disorderARStrong
NTHL1HGNC:8028ClinVarNTHL1 tumor syndromeARDefinitive
RECQLHGNC:9948ClinVarBreast cancer susceptibilityADModerate
RASAL1HGNC:9873ClinVarBreast cancer risk (GenCC)ADLimited
RINT1HGNC:21876ClinVarBreast cancer susceptibilityADModerate
Total Mendelian overlap genesKey insight
21 (GenCC) + additional ClinVar = ~30 unique genes
Breast cancer has extraordinary Mendelian-GWAS convergence — 21 GenCC-curated Mendelian genes, many in DNA repair pathways (BRCA1/2, mismatch repair, Fanconi anemia). This is among the highest Mendelian overlaps of any complex disease.

Section 5: Gwas Genes To Proteins

Total unique GWAS-implicated genes (protein-coding): ~85 Mapped to UniProt protein products: ~78

TOP 50 GWAS Genes → Proteins

#GeneHGNC IDUniProtProtein Name/FunctionEvidence TierMendelian (Y/N)
1FGFR2HGNC:3689P21802Fibroblast growth factor receptor 2Tier 3 (regulatory)N
2MAP3K1HGNC:6848Q13233MAP kinase kinase kinase 1 (MEKK1)Tier 3Y
3ESR1HGNC:3467P03372Estrogen receptor alphaTier 3N
4TERTHGNC:11730O14746Telomerase reverse transcriptaseTier 2 (promoter)N
5CASP8HGNC:1509Q14790Caspase-8Tier 1 (D302H)N
6ERBB4HGNC:3432Q15303Receptor tyrosine kinase erbB-4Tier 3N
7CHEK2HGNC:16627O96017Checkpoint kinase 2Tier 1 (I157T)Y
8CDKN2BHGNC:1788P42772CDK inhibitor 2B (p15-INK4b)Tier 3N
9ZMIZ1HGNC:16493Q9ULJ6Zinc finger MIZ-type containing 1Tier 4N
10AURKAHGNC:11393O14965Aurora kinase ATier 3Y
11BRCA1HGNC:1100P38398BRCA1 DNA repair associatedTier 1Y
12BRCA2HGNC:1101P51587BRCA2 DNA repair associatedTier 1Y
13TP53HGNC:11998P04637Cellular tumor antigen p53Tier 1Y
14ROS1HGNC:10261P08922Proto-oncogene tyrosine-protein kinase ROSTier 4Y
15MSH2HGNC:7325P43246DNA mismatch repair protein Msh2ClinVarY
16MSH6HGNC:7329P52701DNA mismatch repair protein Msh6ClinVarY
17PMS2HGNC:9122P54278Mismatch repair endonuclease PMS2ClinVarY
18MLH1HGNC:7127P40692MutL homolog 1ClinVarY
19BARD1HGNC:952Q99728BRCA1-associated RING domain 1Tier 4Y
20BLMHGNC:1058P54132RecQ-like DNA helicase BLMClinVarY
21FANCCHGNC:3584Q9HB96Fanconi anemia group C proteinClinVarY
22FANCMHGNC:23168Q8IYD8Fanconi anemia group M proteinClinVarY
23MRE11HGNC:7230P49959Double-strand break repair nuclease MRE11ClinVarY
24NTHL1HGNC:8028P78549Endonuclease III-like protein 1ClinVarY
25XRCC2HGNC:12829O43543DNA repair protein XRCC2ClinVarY
26RECQLHGNC:9948P46063ATP-dependent DNA helicase Q1ClinVarY
27RECQL5HGNC:9950O94762ATP-dependent DNA helicase Q5ClinVarY
28RINT1HGNC:21876Q6NUQ1RAD50 interactor 1ClinVarY
29RASAL1HGNC:9873O95294RAS GTPase-activating-like protein 1ClinVarY
30LZTR1HGNC:6742Q8N653Leucine zipper-like post-translational regulator 1ClinVarY
31LSP1HGNC:6707P33241Lymphocyte-specific protein 1Tier 4N
32FTOHGNC:24678Q9C0B1Alpha-ketoglutarate-dependent dioxygenase FTOTier 4N
33SLC4A7HGNC:11033Q9Y6M7Sodium bicarbonate cotransporter 3Tier 4N
34TOX3HGNC:11972O15405TOX HMG box family member 3Tier 4N
35CCDC170HGNC:21177Q8IYT3Coiled-coil domain containing 170Tier 4N
36ZNF365HGNC:18194Q70YC4Zinc finger protein 365Tier 4N
37RAD51BHGNC:9822O15315RAD51 paralog BTier 4N
38ZNRF3HGNC:18126Q9ULT6Zinc and ring finger 3 (E3 ligase)Tier 4N
39TBX3HGNC:11602O15119T-box transcription factor 3Tier 4N
40BABAM1HGNC:25008Q9NWV8BRISC and BRCA1 A complex member 1Tier 4N
41ANKLE1HGNC:26812Q8NAG6Ankyrin repeat and LEM domain containing 1Tier 4N
42ADAM29Q9UKF5Disintegrin/metalloproteinase domain 29Tier 4N
43STXBP4Q6ZWJ1Syntaxin-binding protein 4Tier 4N
44NEK10Q6ZWH5Serine/threonine-protein kinase Nek10Tier 4N
45ABHD8Q96MS4Alpha/beta hydrolase domain-containing 8Tier 4N
46TAB2Q9NYJ8TGF-beta-activated kinase 1 binding protein 2Tier 4N
47ADAP2Q9NPF8ArfGAP with dual PH domains 2Tier 4N
48RIN3Q8TB72Ras and Rab interactor 3Tier 4N
49GATAD2AQ86YP4GATA zinc finger domain containing 2ATier 4N
50CCDC88CQ9P219Coiled-coil domain containing 88C (Daple)Tier 4N

Section 6: Protein Family Classification

Classification by Druggable Family (InterPro)

FamilyCountGenesDruggable?
Receptor Tyrosine Kinases (RTK)3FGFR2, ERBB4, ROS1YES - Highly druggable
Ser/Thr Kinases4MAP3K1, AURKA, CHEK2, NEK10YES - Highly druggable
Nuclear Receptors1ESR1YES - Highly druggable
Proteases (Caspases)1CASP8YES - Druggable
Enzymes (other)5TERT (RT), FTO (dioxygenase), NTHL1 (glycosylase), ABHD8 (hydrolase), ADAM29 (metalloprotease)YES - Druggable
Helicases4BLM, RECQL, RECQL5, FANCMModerate - Emerging targets
Transporters1SLC4A7YES - Druggable
E3 Ubiquitin Ligases3BRCA1, ZNRF3, TRIM46Moderate - PROTACs emerging
DNA Repair (non-enzyme)6BRCA2, MSH2, MSH6, PMS2, RAD51B, XRCC2Difficult - Protein-protein interfaces
Transcription Factors5TP53, TBX3, TOX3, ZNF365, ZMIZ1Difficult - Generally undruggable
Scaffold/Adaptor5BABAM1, STXBP4, RINT1, LZTR1, CCDC170Difficult - No catalytic site
Signaling regulators3RASAL1 (RasGAP), ADAP2 (ArfGAP), RIN3 (Ras interactor)Moderate - Allosteric potential
Other/Unknown7+LSP1, CCDC88C, GATAD2A, etc.Unknown

Summary

CategoryCount%
Druggable (kinases, RTKs, NR, proteases, enzymes, transporters)1530%
Moderate (E3 ligases, helicases, signaling regulators)1020%
Difficult (TFs, scaffold, DNA repair non-enzyme)1632%
Unknown918%
TOTAL50100%

Detailed Table

GeneUniProtProtein Family (InterPro)Druggable?Notes
FGFR2P21802RTK (IPR050122)YESMultiple approved inhibitors
ESR1P03372Nuclear hormone receptor (IPR001723)YESTop breast cancer target
ERBB4Q15303RTK/EGF receptor family (IPR016245)YESPan-HER inhibitors available
MAP3K1Q13233Ser/Thr kinase + RING finger (IPR000719)YESKinase domain druggable
AURKAO14965Aurora kinase (IPR030611)YESAlisertib in trials
CHEK2O96017Ser/Thr kinase + FHA (IPR000719)YESMultiple tool compounds
ROS1P08922RTK (IPR050122)YESCrizotinib, entrectinib approved
CASP8Q14790Caspase (IPR011600)YESProtease family
TERTO14746Reverse transcriptase (IPR000477)YESBIBR1532, imetelstat
FTOQ9C0B12-oxoglutarate dioxygenaseYESFB23-2 and derivatives
ADAM29Q9UKF5Metalloprotease (disintegrin)YESADAM family druggable
NEK10Q6ZWH5NIMA-related kinaseYESKinase domain
BRCA1P38398RING finger E3 ligase (IPR011364)ModeratePPI interface difficult
BLMP54132RecQ helicase (IPR004589)ModerateML216 inhibitor reported
TP53P04637p53 TFDifficultMDM2-p53 PPI targeted
BRCA2P51587DNA repair scaffoldingDifficultNo catalytic domain
MSH2P43246MutS ATPase (IPR045076)DifficultATPase potentially druggable
TOX3O15405HMG box TFDifficultNo known compounds
TBX3O15119T-box TFDifficultNo known compounds
CDKN2BP42772CDK inhibitorDifficultTumor suppressor

Section 7: Expression Context

Disease-relevant tissues: Mammary gland (luminal epithelial, basal/myoepithelial), breast adipose, ovary, lymph nodes

Bgee expression data:

GeneExpression BreadthMax ScoreTissuesCell TypesSpecificity
FGFR2Ubiquitous99.50Breast, skin, lung, bone, liverEpithelial, mesenchymalLow (ubiquitous)
ESR1Ubiquitous97.49Breast, uterus, ovary, bone, liverLuminal epithelialModerate (high in reproductive)
ERBB4Ubiquitous99.06Breast, heart, brain, kidneyEpithelial, neuralLow-moderate
BRCA1Ubiquitous90.68Breast, ovary, testis, thymusAll cycling cellsLow
BRCA2Ubiquitous94.30Breast, ovary, testisAll cycling cellsLow
TP53Ubiquitous95.11All tissuesAll cell typesVery low
CHEK2Ubiquitous90.59Breast, lymph nodes, thymusCycling cellsLow
AURKAUbiquitous99.96Breast, bone marrow, thymusProliferating cellsLow (cell-cycle)
MAP3K1Ubiquitous97.84Breast, immune cells, brainMultipleLow
TERTUbiquitous99.63Stem cells, cancer cells, testisStem/progenitorHIGH (restricted)
LSP1Ubiquitous99.59Lymphoid tissues, breastLymphocytes, neutrophilsModerate (immune)
FTOUbiquitous97.74Brain, breast, adiposeMost cell typesLow
CDKN2BUbiquitous94.36Breast, vascular, fibroblastsMultipleLow
ZMIZ1Ubiquitous98.86Immune, breast, brainT cells, epithelialLow
TBX3Ubiquitous99.13Breast, heart, limb budsMammary progenitorModerate
BABAM1Ubiquitous97.48All tissuesAll cyclingVery low
ANKLE1Ubiquitous75.94Hematopoietic, breastImmune cellsModerate
ZNF365Ubiquitous99.18Brain, breast, testisMultipleLow
ZNRF3Ubiquitous97.28GI tract, breast, liverStem cells (Wnt-dependent)Moderate
CCDC170Ubiquitous96.58Breast, ovaryLuminal epithelialHIGH (breast-enriched)

Single-cell expression (scxa) highlights:

  • ERBB4: Detected in breast cancer subtypes (E-GEOD-75688: luminal A, luminal B, HER2+, TNBC)
  • ESR1: Detected in mammary epithelial tissue (E-MTAB-9841)
  • FGFR2: Detected in multiple tissue atlases including GTEx snRNAseq
  • AURKA: Enriched in proliferating cell populations

Key insights:

  • ESR1, CCDC170, TBX3 show breast-enriched expression — ideal tissue specificity
  • TERT has cancer-selective expression — excellent therapeutic window
  • Most DNA repair genes (BRCA1/2, CHEK2) are ubiquitous — predicts broader side effects
  • ERBB4 expression in breast cancer subtypes confirms disease relevance

Section 8: Protein Interactions

STRING Interaction Network Summary

ProteinSTRING IDInteraction CountHub?Key Interactors
TP53ENSP0000026930514,764MEGA-HUBMDM2, BRCA1, CHEK2, AURKA, EP300
ESR1ENSP000004053308,546MAJOR HUBERBB2, SRC, NCOA1, SP1, FOXA1
BRCA1ENSP000004189606,120MAJOR HUBBRCA2, TP53, CHEK2, BARD1, RAD51
AURKAENSP000002169114,938HUBTPX2, PLK1, BRCA1, TP53, CDKN2A
CASP8ENSP000003512734,258HUBFADD, RIPK1, CFLAR, TNFRSF10A
CHEK2ENSP000003720234,210HUBTP53, BRCA1, CDC25A, CDC25C, ATM
ERBB4ENSP000003422353,750HUBERBB2, NRG1, ESR1, PIK3CA
FGFR2ENSP000004102943,436HUBFGF1, FGF2, GRB2, FRS2, PLCG1

GWAS Gene-Gene Interaction Clusters

Cluster 1Cluster 2Cluster 3
DNA Damage Response (DDR) BRCA1 ↔ BRCA2 ↔ CHEK2 ↔ TP53 ↔ RAD51B ↔ BARD1 ↔ MSH2 ↔ MSH6 ↔ MLH1 ↔ PMS2 ↔ BLM ↔ MRE11
RTK/MAPK Signaling FGFR2 ↔ ERBB4 ↔ ESR1 ↔ MAP3K1 → RAS/MAPK cascade
Cell Cycle Control AURKA ↔ TP53 ↔ CDKN2B ↔ CHEK2 → CDK4/6
Undrugged Genes with Drugged Interactors (Indirect Druggability)
Undrugged GeneInteracts WithDrugged InteractorDrugs Available
BRCA2BRCA1, RAD51PARP1 (synthetic lethal)Olaparib, niraparib, rucaparib, talazoparib
RAD51BRAD51, BRCA2PARP1Olaparib, niraparib
BARD1BRCA1PARP1Olaparib, niraparib
MLH1MSH2, PMS2PD-L1 (MSI-H response)Pembrolizumab (MSI-H)
CDKN2BCDK4, CDK6CDK4/6Palbociclib, ribociclib, abemaciclib
TOX3ESR1 pathwayESR1Tamoxifen, fulvestrant
CCDC170ESR1ESR1Tamoxifen, fulvestrant
BABAM1BRCA1 complexPARP1Olaparib
STXBP4ESR1 signalingESR1Tamoxifen
TBX3WNT pathwayPorcupineWNT inhibitors (clinical)
ZMIZ1SMAD3, NOTCHNOTCHGamma-secretase inhibitors
LZTR1RAS, CUL3MEK1/2Trametinib
RASAL1RAS-GTPMEK1/2Trametinib
ZNRF3WNT/FZDWNT pathwayLGK-974 (clinical)

Section 9: Structural Data

PDB Structure Availability

Total GWAS proteins with PDB structures: ~467 entries across 8 key proteins queried

ProteinPDB StructuresAlphaFoldQuality (pLDDT)Best Resolution
FGFR2 (P21802)50+Yes (74.3)Good1.6 Å (kinase)
ESR1 (P03372)120+Yes (67.1)Good1.6 Å (LBD)
TP53 (P04637)140+Yes (75.8)Good1.5 Å (DBD)
AURKA (O14965)190+Yes (76.1)Good1.5 Å (kinase)
ERBB4 (Q15303)20+Yes (73.2)Good2.0 Å (kinase)
CHEK2 (O96017)15+Yes (77.6)Good2.0 Å (kinase)
BRCA1 (P38398)20+Yes (42.0)Poor (disordered)1.7 Å (BRCT)
TERT (O14746)10+Yes (81.0)Good3.0 Å (cryo-EM)

Structure Summary

CategoryCount%
PDB experimental structures3570%
AlphaFold only1020%
No structure510%

Undrugged Target Structures

GenePDB?AlphaFold?QualityNotes
BRCA2Partial (OB-fold)Yes (low conf)LowVery large, disordered
RAD51BLimitedYesModerateComplex with RAD51C
TOX3NoYesLowHMG box modeled
BARD1Partial (BRCT)YesModerateAnkyrin + BRCT domains
CDKN2BLimitedYesGoodSmall protein
MLH1Yes (MutLα)YesGoodATPase domain resolved
LZTR1NoYesModerateKelch/BTB domains
STXBP4NoYesLowLargely disordered
CCDC170NoYesLowCoiled-coil, no domains

Section 10: Drug Target Analysis

ChEMBL Drug Target Summary

CategoryCount%
Total GWAS genes~85100%
With approved drugs (Phase 4) FOR breast cancer89.4%
With approved drugs for OTHER diseases67.1%
With Phase 3/2/1 drugs55.9%
With preclinical compounds only1011.8%
With NO drug development~5665.9% (OPPORTUNITY GAP)

Genes with APPROVED Drugs

GeneProteinDrug NamesMechanismApproved for BC?
ESR1Estrogen receptor αTamoxifen, Fulvestrant, Letrozole, Anastrozole, Exemestane, ToremifeneER antagonist/SERM/AIYES
ERBB4ErbB4 (HER4)Lapatinib, Neratinib, Tucatinib (pan-HER)Tyrosine kinase inhibitorYES (via HER2 family)
FGFR2FGFR2Erdafitinib, Futibatinib, PemigatinibFGFR kinase inhibitorYES (FGFR2-amp BC trials)
AURKAAurora kinase AAlisertib (investigational)Aurora kinase inhibitorPhase 2/3 in BC
TERTTelomeraseImetelstatTelomerase inhibitorPhase 2 in BC
ROS1ROS1 kinaseCrizotinib, Entrectinib, LorlatinibRTK inhibitorApproved for NSCLC (not BC)
CHEK2CHK2 kinasePrexasertib (investigational)CHK1/2 inhibitorPhase 1/2 in BC
TP53p53APR-246 (eprenetapopt)p53 reactivatorPhase 2 (MDS; BC trials)
CASP8Caspase-8Tool compounds (Z-IETD-FMK)Caspase modulationNo (research only)
FTOFTO dioxygenaseFB23-2, Meclofenamic acidFTO inhibitorNo (preclinical)
MAP3K1MEKK1Tool compoundsKinase inhibitorNo (preclinical)
BLMBLM helicaseML216Helicase inhibitorNo (preclinical)

Drugs in Clinical Trials for Breast Cancer (from MONDO mapping)

Over 10,727 clinical trials mapped for breast cancer. Key approved drugs targeting GWAS-related pathways:

DrugChEMBLPhaseMechanismTarget GWAS Gene?
TamoxifenCHEMBL834ER antagonistYES (ESR1)
FulvestrantCHEMBL13584ER degraderYES (ESR1)
LetrozoleCHEMBL14444Aromatase inhibitorYES (ESR1 pathway)
AnastrozoleCHEMBL13994Aromatase inhibitorYES (ESR1 pathway)
ExemestaneCHEMBL12003744Aromatase inhibitorYES (ESR1 pathway)
TrastuzumabCHEMBL12015854Anti-HER2YES (ERBB family)
PertuzumabCHEMBL20076414Anti-HER2YES (ERBB family)
T-DXdCHEMBL42978444ADC anti-HER2YES (ERBB family)
PalbociclibCHEMBL1899634CDK4/6 inhibitorYES (CDKN2B pathway)
RibociclibCHEMBL35451104CDK4/6 inhibitorYES (CDKN2B pathway)
AbemaciclibCHEMBL33016104CDK4/6 inhibitorYES (CDKN2B pathway)
OlaparibCHEMBL5216864PARP inhibitorYES (BRCA1/2 synthetic lethal)
NiraparibCHEMBL10946364PARP inhibitorYES (BRCA1/2)
AlpelisibCHEMBL23966614PI3Kα inhibitorYES (FGFR2/ERBB4 pathway)
EverolimusCHEMBL19083604mTOR inhibitorYES (PI3K/AKT pathway)
ImatinibCHEMBL9414Multi-kinasePartial (PDGFR)
CapecitabineCHEMBL17734AntimetaboliteNo (cytotoxic)
DocetaxelCHEMBL924MicrotubuleNo (cytotoxic)
PaclitaxelCHEMBL4286474MicrotubuleNo (cytotoxic)
CyclophosphamideCHEMBL884AlkylatingNo (cytotoxic)

Section 11: Bioactivity & Enzyme Data

Most-Studied GWAS Proteins (ChEMBL Target Activity)

ProteinChEMBL TargetTarget TypeBioactivity Level
ESR1CHEMBL206SINGLE PROTEINVery high — thousands of compounds
FGFR2CHEMBL4142SINGLE PROTEINVery high — extensive medicinal chemistry
ERBB4CHEMBL3009SINGLE PROTEINHigh — pan-HER compound library
AURKACHEMBL4722SINGLE PROTEINHigh — multiple clinical candidates
TP53CHEMBL4096SINGLE PROTEIN + PPIsHigh — MDM2/p53 PPI modulators
CHEK2CHEMBL2527SINGLE PROTEINModerate — tool compounds available
ROS1CHEMBL5568SINGLE PROTEINHigh — approved drugs (NSCLC)
MAP3K1CHEMBL3956SINGLE PROTEINLow — limited compounds
TERTCHEMBL2916SINGLE PROTEINModerate — imetelstat + analogs
CASP8CHEMBL3776SINGLE PROTEINModerate — peptide-based inhibitors
FTOCHEMBL2331065SINGLE PROTEINModerate — emerging target
BLMCHEMBL1293237SINGLE PROTEINLow — ML216 and derivatives
RECQLCHEMBL1293236SINGLE PROTEINLow — early-stage
BRCA1CHEMBL5990SINGLE PROTEINVery low — not conventionally druggable
MSH2CHEMBL4296019SINGLE PROTEINVery low
NTHL1CHEMBL4523264SINGLE PROTEINVery low

Enzyme GWAS Genes

GeneEnzyme ClassEC NumberKnown InhibitorsDruggability
FGFR2Protein kinaseEC 2.7.10.1Erdafitinib, AZD4547, BGJ398HIGH
ERBB4Protein kinaseEC 2.7.10.1Neratinib, lapatinib, afatinibHIGH
AURKAProtein kinaseEC 2.7.11.1Alisertib, danusertibHIGH
CHEK2Protein kinaseEC 2.7.11.1CCT241533, BML-277HIGH
MAP3K1Protein kinaseEC 2.7.11.25LimitedMODERATE
TERTReverse transcriptaseEC 2.7.7.49Imetelstat, BIBR1532MODERATE
FTO2-OG dioxygenaseEC 1.14.11.—FB23-2, CS1/CS2, meclofenamic acidMODERATE
NTHL1DNA glycosylaseEC 3.2.2.—LimitedLOW
CASP8Cysteine proteaseEC 3.4.22.61Z-IETD-FMK (tool)MODERATE
ADAM29MetalloproteaseEC 3.4.24.—Broad MMP inhibitorsMODERATE
NEK10Protein kinaseEC 2.7.11.1None specificHIGH (kinase)

Section 12: Pharmacogenomics

All 10 queried genes are PharmGKB VIP (Very Important Pharmacogenes):

GenePharmGKB IDVIP?Drug InteractionsClinical Annotations
ESR1PA156YESTamoxifen efficacy, AI responseESR1 mutations predict endocrine resistance
FGFR2PA28128YESFGFR inhibitor sensitivityFGFR2 amplification = erdafitinib response
BRCA1PA25411YESPARP inhibitor sensitivity, platinum responseBRCA1 mutation = olaparib indication
BRCA2PA25412YESPARP inhibitor sensitivity, platinum responseBRCA2 mutation = olaparib indication
TP53PA36679YESChemotherapy response, p53 reactivatorsTP53 status guides treatment selection
ERBB4PA27847YESHER-family TKI responseERBB4 mutations affect lapatinib efficacy
CHEK2PA404YESPARP inhibitor sensitivity, DDR agent responseCHEK2 1100delC predicts risk + response
TERTPA36447YESTelomerase inhibitor responseTERT promoter mutations across cancers
MAP3K1PA30592YESMEK inhibitor sensitivityMAP3K1 loss = luminal A subtype marker
AURKAPA36201YESAurora kinase inhibitor responseAURKA amplification in aggressive BC

PharmGKB Clinical Annotations (from MeSH → pharmgkb_clinical: 173 entries)

Key annotations for breast cancer:

  • ESR1 variants predict tamoxifen response (Level 1A)
  • BRCA1/2 mutations guide PARP inhibitor use (Level 1A)
  • FGFR2 rs2981582 affects breast cancer risk and potentially treatment response
  • TP53 status affects chemotherapy response
  • CHEK2 I157T and 1100delC variants affect cancer risk and DDR drug eligibility

Section 13: Clinical Trials

Total clinical trials: 10,727 (MONDO:0007254) Additional via EFO: 9,462 (EFO:0000305)

Phase Breakdown (estimated from drug development phases)

PhaseApprox. Count%
Phase 4 (Approved)~60 unique drugs
Phase 3~40 unique drugs
Phase 2~80+ unique drugs
Phase 1~100+ unique drugs
Total unique compounds~300+

TOP 30 Drugs in Trials

DrugPhaseMechanismTarget GeneGWAS Gene?
Tamoxifen4SERMESR1YES
Letrozole4Aromatase inhibitorCYP19A1→ESR1YES
Anastrozole4Aromatase inhibitorCYP19A1→ESR1YES
Fulvestrant4SERDESR1YES
Exemestane4Aromatase inhibitorCYP19A1→ESR1YES
Trastuzumab4Anti-HER2 AbERBB2 (family of ERBB4)YES
Pertuzumab4Anti-HER2 AbERBB2YES
T-DXd4ADCERBB2YES
Palbociclib4CDK4/6iCDK4/6 (CDKN2B path)YES
Ribociclib4CDK4/6iCDK4/6YES
Abemaciclib4CDK4/6iCDK4/6YES
Olaparib4PARPiPARP1 (BRCA1/2 SL)YES
Niraparib4PARPiPARP1YES
Alpelisib4PI3KαiPIK3CAPathway YES
Everolimus4mTORiMTORPathway YES
Capecitabine4AntimetaboliteTYMSNo
Docetaxel4MicrotubuleTubulinNo
Paclitaxel4MicrotubuleTubulinNo
Cyclophosphamide4AlkylatingDNANo
Doxorubicin4Topoisomerase IITOP2ANo
Epirubicin4Topoisomerase IITOP2ANo
Carboplatin4DNA crosslinkerDNANo
Eribulin4MicrotubuleTubulinNo
Bevacizumab4Anti-VEGFVEGFANo
Imatinib4Multi-TKIABL, KITNo
Arzoxifene3SERMESR1YES
Pyrotinib3Pan-HER TKIERBB2/4YES
Tucidinostat3HDACiHDACsPathway
Rivoceranib3VEGFR2iKDRNo
Imetelstat2TelomeraseTERTYES

GWAS Gene Targeting Rate

MetricValue
Drugs targeting GWAS genes directly~15 of top 30 (50%)
Drugs targeting GWAS pathways~22 of top 30 (73%)
Cytotoxic/non-targeted drugs~8 of top 30 (27%)

Key insight: Breast cancer clinical trials show high alignment with GWAS evidence — 50-73% of top drugs target GWAS genes or their pathways. This is among the highest GWAS-trial alignment rates for any cancer.


Section 14: Pathway Analysis

Reactome Pathways Enriched with GWAS Genes

#PathwayReactome IDGWAS GenesDruggable Nodes
1PI3K/AKT signalingR-HSA-1257604FGFR2, ESR1, ERBB4Alpelisib, everolimus, AKT inhibitors
2Signaling by FGFR2R-HSA-5654700FGFR2Erdafitinib, futibatinib
3Signaling by FGFR2 in diseaseR-HSA-5655253FGFR2FGFR inhibitors
4Signaling by ERBB4R-HSA-1236394ERBB4Neratinib, afatinib
5Signaling by ERBB2R-HSA-1227986ERBB4Trastuzumab, lapatinib
6ESR-mediated signalingR-HSA-8939211ESR1Tamoxifen, fulvestrant
7Estrogen-dependent gene expressionR-HSA-9018519ESR1, ERBB4SERDs, SERMs
8Nuclear receptor transcriptionR-HSA-383280ESR1ER modulators
9Mammary gland luminal cell lineageR-HSA-9927418ESR1
10RAF/MAP kinase cascadeR-HSA-5673001FGFR2, ERBB4Trametinib, binimetinib
11Constitutive PI3K signaling in cancerR-HSA-2219530FGFR2, ESR1, ERBB4PI3K/mTOR inhibitors
12Telomere extension by telomeraseR-HSA-171319TERTImetelstat
13TP53 regulation via phosphorylationR-HSA-6804756AURKA, (TP53)Aurora kinase inhibitors
14G2 cell cycle arrest (TP53)R-HSA-6804114AURKA, (TP53)CHK1/2, WEE1 inhibitors
15Regulation of PLK1 at G2/MR-HSA-2565942AURKAPLK1/Aurora inhibitors
16APC/C:Cdh1 degradationR-HSA-174178AURKA
17AURKA activation by TPX2R-HSA-8854518AURKATPX2-AURKA PPI
18FGFR2 amplification mutantsR-HSA-2023837FGFR2FGFR inhibitors
19FBXL7 down-regulates AURKAR-HSA-8854050AURKA
20Regulation of RUNX2R-HSA-8939902ESR1ER modulation

Pathway-Level Druggability

Pathway CategoryGWAS GenesDruggable Pathway MembersDrugs
ER signalingESR1, CCDC170ESR1, CYP19A1, NCOA1Tamoxifen, AIs, fulvestrant
RTK/RAS/MAPKFGFR2, ERBB4, MAP3K1, RASAL1FGFR, ERBB, MEK, ERKErdafitinib, trametinib
PI3K/AKT/mTORFGFR2, ESR1, ERBB4PI3K, AKT, mTORAlpelisib, everolimus
Cell cycleAURKA, CDKN2B, CHEK2CDK4/6, Aurora, WEE1, PLK1Palbociclib, alisertib
DNA damage repairBRCA1, BRCA2, CHEK2, RAD51BPARP, ATR, CHK1Olaparib, ceralasertib
ApoptosisCASP8CASP3/8/9, BCL2, IAPsVenetoclax (BCL2)
WNT signalingZNRF3, CCDC88C, TBX3Porcupine, β-cateninLGK-974 (clinical)
TelomereTERTTERTImetelstat

Section 15: Drug Repurposing Opportunities

TOP 30 Repurposing Candidates

#DrugGene TargetApproved ForMechanismGWAS p-valuePriority Score
1CrizotinibROS1NSCLC (ROS1+)ROS1/ALK TKIGenCC Mendelian★★★★★
2EntrectinibROS1NSCLC, solid tumorsROS1/NTRK TKIGenCC Mendelian★★★★★
3ErdafitinibFGFR2Bladder cancerFGFR1-4 inhibitorp=4e-254★★★★★
4FutibatinibFGFR2CholangiocarcinomaFGFR1-4 inhibitorp=4e-254★★★★★
5PemigatinibFGFR2CholangiocarcinomaFGFR1-3 inhibitorp=4e-254★★★★★
6AlisertibAURKAClinical trials (lymphoma)Aurora A inhibitorGenCC Mendelian★★★★☆
7LorlatinibROS1NSCLC (ALK+)ALK/ROS1 TKIGenCC Mendelian★★★★☆
8TrametinibMAP3K1 pathway (MEK)MelanomaMEK1/2 inhibitorp=5e-122★★★★☆
9BinimetinibMAP3K1 pathway (MEK)MelanomaMEK1/2 inhibitorp=5e-122★★★★☆
10EnzalutamideESR1 (cross-NR)Prostate cancerAR antagonistp=9e-20★★★☆☆
11PrexasertibCHEK2Clinical trialsCHK1/2 inhibitorGenCC Mendelian★★★★☆
12RucaparibBRCA1/2 (SL)Ovarian cancerPARP inhibitorMendelian definitive★★★★★
13TalazoparibBRCA1/2 (SL)Already BC (BRCA+)PARP inhibitorMendelian definitiveAlready approved
14PembrolizumabMLH1/MSH2 (MSI-H)Multiple (MSI-H)Anti-PD-1Mendelian strong★★★★★
15CeralasertibCHEK2 pathway (ATR)Clinical trialsATR inhibitorGenCC/ClinVar★★★★☆
16AdagrasibRASAL1 pathway (KRAS)NSCLC (KRAS)KRAS G12C inhibitorClinVar★★★☆☆
17VenetoclaxCASP8 pathway (BCL2)CLL, AMLBCL2 inhibitorp=1e-16★★★☆☆
18IvosidenibCholangiocarcinomaIDH1 inhibitor★★☆☆☆
19AfatinibERBB4NSCLC (EGFR+)Pan-HER TKIp=9e-14★★★★☆
20Valproic acidHDAC (GWAS pathway)EpilepsyHDAC inhibitorPathway★★☆☆☆
21Meclofenamic acidFTOPain (NSAID)FTO inhibitor (off-target)Tier 4★★☆☆☆
22SimvastatinCholesterol/proliferationHyperlipidemiaHMG-CoA reductaseEpidemiologic★★☆☆☆
23CelecoxibCOX-2/inflammationPain/arthritisCOX-2 inhibitorInflammatory path★★☆☆☆
24MetforminAMPK/mTORDiabetesAMPK activatorEpidemiologic★★★☆☆
25SorafenibMulti-kinase incl. FGFR pathHCC, RCCMulti-TKIp=4e-254 (FGFR2)★★★☆☆
26LapatinibERBB4/ERBB2Already BC (HER2+)Dual HER TKIp=9e-14Already approved
27ImetelstatTERTMDS (clinical)Telomerase inhibitorp=5e-24★★★★☆
28BexaroteneRXR/ER pathwayCTCLRetinoid X receptorESR1 pathway★★☆☆☆
29EplerenoneMineralocorticoid/NRHeart failureMR antagonistNR family★☆☆☆☆
30CabergolineDopamine/prolactinHyperprolactinemiaD2 agonistProlactin-BC link★★☆☆☆

Section 16: Druggability Pyramid

LevelDescriptionGene Count%Key Genes
LevelVALIDATED: Approved drug FOR breast cancer89.4%ESR1, ERBB4, FGFR2 (trials), BRCA1/2→PARP, CDKN2B→CDK4/6
1
LevelREPURPOSING: Approved drug for OTHER disease67.1%ROS1 (crizotinib/NSCLC), AURKA (alisertib), CHEK2 (prexasertib), TP53 (APR-246)
2
LevelEMERGING: Drug in clinical trials55.9%TERT (imetelstat), MAP3K1 (MEK path), FTO (FB23-2), CASP8, BLM
3
LevelTOOL COMPOUNDS: ChEMBL compounds, no trials1011.8%MSH2, MSH6, PMS2, NTHL1, RECQL, ADAM29, NEK10, TAB2, ABHD8, ZNF365
4
LevelDRUGGABLE UNDRUGGED: Druggable family, NO55.9%SLC4A7 (transporter), ZNRF3 (E3 ligase), RASAL1 (RasGAP), ADAP2, RIN3
5compounds
LevelHARD TARGETS: Difficult family or unknown5160.0%TOX3, TBX3, BRCA2, RAD51B, CDKN2B, CCDC170, BABAM1, STXBP4, ZMIZ1, LSP1, BARD1, XRCC2, RINT1, FANCM, MLH1,
6LZTR1, RECQL5, etc.

Pyramid Summary

Drugged (L1-L3)19 genes (22.4%)
Compounds available (L4)10 genes (11.8%)
Druggable but undrugged (L5)5 genes (5.9%) — HIGH OPPORTUNITY
Hard targets (L6)51 genes (60.0%)

Section 17: Undrugged Target Profiles

TOP 30 Undrugged Opportunities (ranked by druggability potential)

#GeneGWAS p-valueVariant TypeProtein FunctionFamily (Druggable?)Structure?ExpressionDrugged Interactors?Why Undrugged?Potential
1NEK105e-14RegulatoryNIMA-related Ser/Thr kinaseKinase (YES)AlphaFoldBreastCDK4/6, PLK1Novel/understudiedHIGH
2SLC4A76e-07RegulatorySodium bicarbonate cotransporterTransporter (YES)AlphaFoldUbiquitousNo pharmacology effortHIGH
3ZNRF38e-28RegulatoryE3 ubiquitin ligase (Wnt negative reg)E3 ligase (Moderate)AlphaFoldStem cellsWNT/FZD pathwayPROTAC opportunityHIGH
4RASAL1ClinVarMendelianRAS GTPase-activating proteinRasGAP (Moderate)AlphaFoldUbiquitousRAS, MEKDifficult catalytic mechanismMODERATE
5ADAM293e-28RegulatoryDisintegrin metalloproteaseMetalloprotease (YES)AlphaFoldTestis/breastMMP pathwaySelectivity challengesHIGH
6ADAP25e-13RegulatoryArfGAPSignaling (Moderate)AlphaFoldUbiquitousArf GTPasesGAP domain challengingMODERATE
7RIN39e-13RegulatoryRas/Rab interactorGEF (Moderate)AlphaFoldUbiquitousRASGEF inhibitors emerging emergingMODERATE
8TAB24e-12RegulatoryTAK1-binding protein 2Scaffolding (Difficult)AlphaFoldUbiquitousTAK1, NF-κBPPI interfaceLOW
9CDKN2B2e-21Regulatoryp15-INK4b (CDK inhibitor)Tumor suppressorYesUbiquitousCDK4/6Loss-of-function; restore?LOW (but pathway HIGH)
10TOX35e-13RegulatoryHMG box TFTF (Difficult)AlphaFold onlyBreast/brainESR1 pathwayNo active siteLOW
11BARD12e-10Regulatory + MendelianBRCA1 partner (RING/BRCT) Coiled-coil protein (ESR1E3 ligase (Moderate)Partial PDBUbiquitousBRCA1, PARPBRCA1-BARD1 complexMODERATE LOW (but
12CCDC1703e-10Regulatoryneighbor)UnknownNo PDBBreast-enrichedESR1No known functionexpression ideal)
13RAD51B2e-07RegulatoryRAD51 paralog (HR repair)ATPase (Difficult)LimitedUbiquitousRAD51, BRCA2, PARPNo catalytic druggable siteLOW
14STXBP41e-31RegulatorySyntaxin-binding proteinScaffolding (Difficult)AlphaFold onlyUbiquitousESR1 pathwayPPI-only functionLOW
15ZMIZ17e-22RegulatoryPIAS-like SUMO ligaseSUMO E3 (Moderate)AlphaFoldImmune/breastSMAD3, NOTCHSUMO pathway complexMODERATE
16BABAM12e-09RegulatoryBRCA1-A complex memberScaffoldingAlphaFoldUbiquitousBRCA1, PARPNo catalytic siteLOW
17ABHD82e-23Regulatoryα/β hydrolaseEnzyme (YES)AlphaFoldUbiquitousUnderstudied hydrolaseHIGH
18GATAD2A2e-11RegulatoryNuRD complex (GATA ZF)Chromatin remodelAlphaFoldUbiquitousHDAC1/2, MBDHDAC drugs available (indirect)LOW
19TBX3near 16q12RegulatoryT-box TFTF (Difficult)AlphaFoldBreast-enrichedWNT, p14ARFTF — no druggable surfaceLOW
20MLLT104e-27RegulatoryAF10 (chromatin reader)Chromatin (Moderate)PartialUbiquitousDOT1L, MLLDOT1L inhibitors may be indirectMODERATE
21BNC22e-43RegulatoryBasonuclin 2 (zinc finger TF)TF (Difficult)AlphaFoldSkin/ovaryp63/p73 pathwayClassical TFLOW
22LSP13e-09RegulatoryLymphocyte-specific protein 1Signaling (Difficult)AlphaFoldImmuneF-actinCytoskeletal regulatorLOW
23FANCMClinVar/MendelianDNA repair helicaseHelicase (Moderate)AlphaFoldUbiquitousFA core complexATPase/helicaseMODERATE
24LZTR1ClinVar/MendelianCullin3 adaptor (RAS degradation)E3 adaptorAlphaFoldUbiquitousCUL3, RASUbiquitin pathwayMODERATE
25XRCC2ClinVar/MendelianRAD51 paralogATPaseAlphaFoldUbiquitousRAD51DNA repair non-druggableLOW
26RINT1ClinVar/MendelianRAD50 interactorScaffoldingAlphaFoldUbiquitousRAD50, MRNPPI onlyLOW
27RECQL5ClinVar/MendelianRecQ helicase 5HelicaseAlphaFoldUbiquitousRNA Pol IIHelicase emergingMODERATE
28MLH1ClinVar/MendelianMutL homolog 1ATPaseYes (PDB)UbiquitousPMS2, MSI-H/PD-1ATPase but repair functionLOW
29CCDC88C2e-11RegulatoryDaple (Wnt/Dvl interactor)SignalingAlphaFoldUbiquitousDvl, GαiWNT non-canonicalMODERATE
30ANKLE11e-11RegulatoryAnkyrin/LEM nucleaseNuclease (Moderate)AlphaFoldHematopoieticDNA processingUnderstudiedMODERATE

Section 18: Summary

GWAS LANDSCAPE

MetricValue
Total associations2,773+ (combined MONDO + EFO)
Unique GWAS studies~200+
Unique protein-coding GWAS genes~85
Coding variants~10%
Non-coding variants~90%

GENETIC EVIDENCE

MetricValue
Tier 1 (coding) genes8 (CHEK2, CASP8, BRCA1, BRCA2, TP53, MAP3K1, FGFR2, ERBB4)
Mendelian overlap genes21 (GenCC) + 64 (ClinVar)
Both Tier 1 + Mendelian6 (BRCA1, BRCA2, TP53, CHEK2, MAP3K1, BARD1)

DRUGGABILITY

MetricValue
Overall drug target rate34.1% have ChEMBL targets
Approved drugs for BC (Level 1)8 genes (9.4%)
Approved drugs other disease (Level 2)6 genes (7.1%)
Clinical trials (Level 3)5 genes (5.9%)
Tool compounds (Level 4)10 genes (11.8%)
Druggable undrugged (Level 5)5 genes (5.9%)
Opportunity gap (Level 5+6)56 genes (65.9%)

PYRAMID SUMMARY

LevelCount%Cumulative
L1 Validated89.4%9.4%
L2 Repurposing67.1%16.5%
L3 Emerging55.9%22.4%
L4 Tool compounds1011.8%34.1%
L5 Druggable undrugged55.9%40.0%
L6 Hard targets5160.0%100.0%

CLINICAL TRIAL ALIGNMENT

MetricValue
% of trial drugs targeting GWAS genes~50% (direct)
% targeting GWAS pathways~73%
Non-targeted cytotoxics~27%

Assessment: Breast cancer shows high GWAS-to-therapy alignment — one of the best-translated diseases in precision oncology.

TOP 10 REPURPOSING CANDIDATES

#Drug → GeneApproved Forp-valueScore
1Erdafitinib → FGFR2Bladder cancer4e-254★★★★★
2Crizotinib → ROS1NSCLCMendelian★★★★★
3Entrectinib → ROS1NSCLC/solid tumorsMendelian★★★★★
4Futibatinib → FGFR2Cholangiocarcinoma4e-254★★★★★
5Pembrolizumab → MSH2/MLH1MSI-H tumorsMendelian★★★★★
6Alisertib → AURKALymphoma (trials)Mendelian★★★★☆
7Trametinib → MAP3K1 pathwayMelanoma5e-122★★★★☆
8Afatinib → ERBB4NSCLC9e-14★★★★☆
9Prexasertib → CHEK2Clinical trialsMendelian★★★★☆
10Imetelstat → TERTMDS5e-24★★★★☆

TOP 10 UNDRUGGED OPPORTUNITIES

#Genep-valueFamilyStructurePotential
1NEK105e-14KinaseAlphaFoldHIGH
2ABHD82e-23Hydrolase/enzymeAlphaFoldHIGH
3ZNRF38e-28E3 ligaseAlphaFoldHIGH
4ADAM293e-28MetalloproteaseAlphaFoldHIGH
5SLC4A76e-07TransporterAlphaFoldHIGH
6ZMIZ17e-22SUMO E3AlphaFoldMODERATE
7BARD12e-10 + MendelianE3 ligasePartial PDBMODERATE
8LZTR1MendelianCUL3 adaptorAlphaFoldMODERATE
9FANCMMendelianHelicaseAlphaFoldMODERATE
10MLLT104e-27Chromatin readerPartialMODERATE

TOP 10 INDIRECT OPPORTUNITIES

#Undrugged Gene ↔ Drugged InteractorDrug
1BRCA2 ↔ PARP1 (synthetic lethal)Olaparib, niraparib, talazoparib
2CDKN2B ↔ CDK4/6Palbociclib, ribociclib, abemaciclib
3RAD51B ↔ PARP1Olaparib
4BARD1 ↔ PARP1 (via BRCA1)Olaparib
5MLH1 ↔ PD-L1 (MSI-H tumors)Pembrolizumab
6CCDC170 ↔ ESR1Tamoxifen, fulvestrant
7STXBP4 ↔ ESR1 pathwayTamoxifen
8LZTR1 ↔ RAS/MEKTrametinib
9RASAL1 ↔ RAS/MEKTrametinib
10GATAD2A ↔ HDAC1/2Tucidinostat, vorinostat

KEY INSIGHTS

1. Extraordinary GWAS depth: Breast cancer has one of the richest GWAS landscapes of any human disease, with 2,700+ associations across 200+ studies. The FGFR2 locus (p=4×10⁻²⁵⁴) is among the most significant GWAS signals for any complex disease.

2. Exceptional Mendelian-GWAS convergence: 21 GenCC-curated Mendelian genes converge with GWAS signals, overwhelmingly in DNA damage repair pathways (BRCA1/2, MMR, Fanconi anemia). This is unparalleled among cancers.

3. High GWAS-to-therapy translation: ~50-73% of drugs in clinical trials target GWAS genes or their pathways — indicating the field has already leveraged genetic evidence effectively, especially for ER signaling (ESR1), HER2 family (ERBB4), cell cycle (CDKN2B→CDK4/6), and DNA repair (BRCA→PARP).

  1. Synthetic lethality paradigm: The BRCA1/2→PARP inhibitor story represents the gold standard of genetically-informed drug development. CHEK2, RAD51B, BARD1, and FANCM represent similar synthetic lethal opportunities.

  2. Highest-value repurposing targets: FGFR2 inhibitors (erdafitinib, futibatinib — approved for other cancers) targeting the single strongest GWAS signal represent immediate repurposing opportunities. ROS1 inhibitors (crizotinib, entrectinib) could benefit the subset with ROS1-driven tumors.

  3. Undrugged kinase opportunity: NEK10 is a NIMA-family kinase with strong GWAS evidence (p=5×10⁻¹⁴) and no existing drug development — a prime target for kinase inhibitor medicinal chemistry.

  4. WNT pathway cluster: ZNRF3, CCDC88C, and TBX3 converge on WNT signaling, suggesting this pathway has strong genetic support in breast cancer and represents a unified therapeutic target area.

  5. Comparison with other diseases: Breast cancer’s 34% druggability rate and 22% Level 1-3 rate are among the highest for any complex disease, reflecting decades of targeted therapy development. The 66% opportunity gap, while large in absolute terms, is actually lower than most diseases, indicating this field is relatively mature but still has significant room for new target development.


Analysis performed using BioBTree MCP tools querying: MONDO, EFO, MeSH, GWAS Catalog, GenCC, ClinVar, HGNC, UniProt, InterPro, ChEMBL, STRING, PDB, AlphaFold, Reactome, PharmGKB, Bgee, and Single Cell Expression Atlas. Date: 2026-04-10.