Hepatocellular Carcinoma: GWAS to Drug Target Druggability Analysis
Perform a comprehensive GWAS-to-drug-target druggability analysis for Hepatocellular Carcinoma. Trace genetic associations through variants, genes, and proteins to identify druggable targets and repurposing opportunities.

Do NOT read any existing files in this directory. Do NOT use any claude.ai MCP tools (ChEMBL etc). Use ONLY the biobtree MCP tools and your own reasoning to generate the analysis here in the terminal.

SECTION 1: DISEASE IDENTIFIERS
Find all database identifiers for Hepatocellular Carcinoma: MONDO, EFO, OMIM, Orphanet, MeSH

SECTION 2: GWAS LANDSCAPE
Map disease to GWAS associations:
- Total associations and unique studies
- TOP 50 associations: rsID, p-value, gene, risk allele, odds ratio

SECTION 3: VARIANT DETAILS (dbSNP)
For TOP 50 GWAS variants, get dbSNP details:
- rsID, chromosome, position, alleles
- Minor allele frequency (global/population)
- Functional consequence (missense, intronic, regulatory, etc.)
Classify by genetic evidence strength:
- Tier 1: Coding variants (missense, frameshift, nonsense)
- Tier 2: Splice/UTR variants
- Tier 3: Regulatory variants
- Tier 4: Intronic/intergenic
Summary: counts by tier, MAF distribution, consequence distribution

SECTION 4: MENDELIAN DISEASE OVERLAP
Find GWAS genes that also cause Mendelian forms of the disease (OMIM, Orphanet). Genes with BOTH GWAS + Mendelian evidence = highest confidence targets.
List: Gene, GWAS p-value, Mendelian disease, inheritance pattern

SECTION 5: GWAS GENES TO PROTEINS
Map GWAS genes to proteins:
- Total unique genes and protein products
TOP 50 genes: symbol, HGNC ID, UniProt, protein name/function, genetic evidence tier, Mendelian overlap (Y/N)

SECTION 6: PROTEIN FAMILY CLASSIFICATION
Classify GWAS proteins by druggable families (InterPro):
- Druggable: Kinases, GPCRs, Ion channels, Nuclear receptors, Proteases, Phosphatases, Transporters, Enzymes
- Difficult: Transcription factors, Scaffold proteins, PPI hubs
Summary: count per family, druggable vs difficult vs unknown
Table: Gene | UniProt | Protein Family | Druggable? | Notes

SECTION 7: EXPRESSION CONTEXT
Check tissue and single-cell expression for GWAS genes. Identify disease-relevant tissues/cell types for Hepatocellular Carcinoma.
Analysis:
- Which tissues/cell types highly express GWAS genes?
- Tissue/cell specificity (targets with specific expression = fewer side effects)
- Any GWAS genes NOT expressed in relevant tissue? (lower confidence)
Table TOP 30: Gene | Tissues | Cell Types | Specificity

SECTION 8: PROTEIN INTERACTIONS
Map protein interactions among GWAS genes (STRING, BioGRID, IntAct).
Analysis:
- Do GWAS genes interact with each other? (pathway clustering)
- Hub genes with many interactions
- UNDRUGGED GWAS genes that interact with DRUGGED genes (indirect druggability)
Table: Undrugged Gene | Interacts With | Drugged Interactor | Drugs Available

SECTION 9: STRUCTURAL DATA
Check structure availability for GWAS proteins (PDB, AlphaFold). Structure availability affects druggability.
Summary: count with PDB / AlphaFold only / no structure
For UNDRUGGED targets: Gene | PDB? | AlphaFold? | Quality

SECTION 10: DRUG TARGET ANALYSIS
Check which GWAS proteins are drug targets (ChEMBL, Guide to Pharmacology).
Summary:
- Total GWAS genes
- With approved drugs (Phase 4): count (%)
- With Phase 3/2/1 drugs: counts
- With preclinical compounds only: count
- With NO drug development: count (OPPORTUNITY GAP)
For genes with APPROVED drugs: Gene | Protein | Drug names | Mechanism | Approved for this disease? (Y/N)

SECTION 11: BIOACTIVITY & ENZYME DATA
Check bioactivity data for GWAS proteins (PubChem, BRENDA for enzymes).
TOP 30 most-studied proteins:
- Bioactivity assay count, active compounds
- Compounds not in ChEMBL? (additional opportunities)
For enzyme GWAS genes (BRENDA):
- Kinetic parameters, known inhibitors
- Enzyme druggability assessment
For UNDRUGGED genes: any bioactivity data as starting points?

SECTION 12: PHARMACOGENOMICS
Check PharmGKB for GWAS genes:
- Known drug-gene interactions (efficacy, toxicity, dosing)
- Clinical annotations and guidelines
- Implications for drug repurposing
Table: Gene | PharmGKB Level | Drug Interactions | Clinical Annotations

SECTION 13: CLINICAL TRIALS
Get clinical trials for Hepatocellular Carcinoma:
- Total trials, breakdown by phase
TOP 30 drugs in trials: Drug | Phase | Mechanism | Target gene | Targets GWAS gene? (Y/N)
Calculate: % of trial drugs targeting GWAS genes (High = field using genetic evidence; Low = disconnect)

SECTION 14: PATHWAY ANALYSIS
Map GWAS genes to pathways (Reactome).
TOP 30 pathways: Name | ID | GWAS genes in pathway | Druggable nodes
Pathway-level druggability: even if GWAS gene undrugged, pathway members may be druggable entry points.

SECTION 15: DRUG REPURPOSING OPPORTUNITIES
Identify drugs approved for OTHER diseases that target GWAS genes.
Prioritize by:
1. Genetic evidence (Tier 1-4)
2. Mendelian overlap
3. Druggable protein family
4. Expression in disease tissue
5. Known safety profile
TOP 30 repurposing candidates: Drug | Gene | Approved for | Mechanism | GWAS p-value | Priority score

SECTION 16: DRUGGABILITY PYRAMID
Stratify ALL GWAS genes into 6 levels. Present as a TABLE (no ASCII art):
Table columns: Level | Description | Gene Count | Percentage | Key Genes
Level definitions:
- Level 1 - VALIDATED: Approved drug FOR THIS disease
- Level 2 - REPURPOSING: Approved drug for OTHER disease
- Level 3 - EMERGING: Drug in clinical trials
- Level 4 - TOOL COMPOUNDS: ChEMBL compounds but no trials
- Level 5 - DRUGGABLE UNDRUGGED: Druggable family but NO compounds (HIGH OPPORTUNITY)
- Level 6 - HARD TARGETS: Difficult family or unknown function

SECTION 17: UNDRUGGED TARGET PROFILES
Deep dive on high-value undrugged targets (strong GWAS evidence, no drugs).
Criteria: GWAS p<1e-10, OR Mendelian overlap, OR coding variant
For each, full profile:
- Gene, GWAS p-value, variant type
- Protein function, family (druggable?)
- Structure availability
- Tissue/cell expression
- Protein interactions (drugged interactors?)
- Why undrugged? (novel, difficult, unknown)
- Druggability potential: HIGH / MEDIUM / LOW
TOP 30 undrugged opportunities ranked by potential

SECTION 18: SUMMARY
GWAS LANDSCAPE:
- Total associations / studies / genes
- Coding vs non-coding variants %
GENETIC EVIDENCE:
- Tier 1 genes / Mendelian overlap / Both
DRUGGABILITY:
- Overall rate: X% have drug targets
- Approved: X% | Trials: X% | Opportunity gap: X%
PYRAMID SUMMARY: Level 1-6 counts and percentages
CLINICAL TRIAL ALIGNMENT:
- X% of trial drugs target GWAS genes
TOP 10 REPURPOSING CANDIDATES: Drug → Gene | Approved for | p-value | Score
TOP 10 UNDRUGGED OPPORTUNITIES: Gene | p-value | Family | Structure | Potential
TOP 10 INDIRECT OPPORTUNITIES: Undrugged Gene ↔ Drugged Interactor | Drug
KEY INSIGHTS:
- Notable findings for paper
- Comparison points with other diseases

FORMATTING
- Use tables for all lists
- Include counts AND percentages
- Include database IDs throughout
- Note if data unavailable for any section
Structured Data Sources
This answer was generated using Claude Opus 4.6 + BioBTree MCP, querying 25 biological databases.
alphafold, bgee, cellxgene_celltype, chembl_molecule, chembl_target, clinical_trials, clinvar, dbsnp, efo, ensembl, gencc, gwas, hgnc, interpro, mesh, mim, mondo, orphanet, pdb, pharmgkb_clinical, pharmgkb_gene, reactome, string, string_interaction, uniprot
API calls (46)
- search(hepatocellular carcinoma) https://sugi.bio/biobtree/api/search?i=hepatocellular%20carcinoma
- search(hepatocellular carcinoma, efo) https://sugi.bio/biobtree/api/search?i=hepatocellular%20carcinoma&s=efo
- search(hepatocellular carcinoma, mondo) https://sugi.bio/biobtree/api/search?i=hepatocellular%20carcinoma&s=mondo
- entry(MONDO:0007256, mondo) https://sugi.bio/biobtree/api/entry?i=MONDO%3A0007256&s=mondo
- entry(EFO:0000182, efo) https://sugi.bio/biobtree/api/entry?i=EFO%3A0000182&s=efo
- entry(D006528, mesh) https://sugi.bio/biobtree/api/entry?i=D006528&s=mesh
- map(MONDO:0007256, >>mondo>>mim) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Emim
- map(MONDO:0007256, >>mondo>>orphanet) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Eorphanet
- map(EFO:0000182, >>efo>>gwas) https://sugi.bio/biobtree/api/map?i=EFO%3A0000182&m=%3E%3Eefo%3E%3Egwas
- map(EFO:0000182, >>efo>>gwas>>dbsnp) https://sugi.bio/biobtree/api/map?i=EFO%3A0000182&m=%3E%3Eefo%3E%3Egwas%3E%3Edbsnp
- map(EFO:0000182, >>efo>>gwas>>hgnc) https://sugi.bio/biobtree/api/map?i=EFO%3A0000182&m=%3E%3Eefo%3E%3Egwas%3E%3Ehgnc
- map(MONDO:0007256, >>mondo>>clinvar>>hgnc) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Eclinvar%3E%3Ehgnc
- map(HGNC:12500,HGNC:16636,HGNC:8891,HGNC:20357,HGNC:4298,HGNC:1248,HGNC:7090,HGNC:4942,HGNC:4948,HGNC:4579,HGNC:11365,HGNC:25735,HGNC:4042,HGNC:11843,HGNC:15476,HGNC:318,HGNC:14050,HGNC:18590,HGNC:3097,HGNC:24276,HGNC:8883,HGNC:4944,HGNC:11861,HGNC:12778,HGNC:15983, >>hgnc>>uniprot) https://sugi.bio/biobtree/api/map?i=HGNC%3A12500%2CHGNC%3A16636%2CHGNC%3A8891%2CHGNC%3A20357%2CHGNC%3A4298%2CHGNC%3A1248%2CHGNC%3A7090%2CHGNC%3A4942%2CHGNC%3A4948%2CHGNC%3A4579%2CHGNC%3A11365%2CHGNC%3A25735%2CHGNC%3A4042%2CHGNC%3A11843%2CHGNC%3A15476%2CHGNC%3A318%2CHGNC%3A14050%2CHGNC%3A18590%2CHGNC%3A3097%2CHGNC%3A24276%2CHGNC%3A8883%2CHGNC%3A4944%2CHGNC%3A11861%2CHGNC%3A12778%2CHGNC%3A15983&m=%3E%3Ehgnc%3E%3Euniprot
- map(HGNC:8975,HGNC:11998,HGNC:583,HGNC:20709,HGNC:7029,HGNC:5467,HGNC:12679,HGNC:2514,HGNC:9967,HGNC:1787,HGNC:9122,HGNC:9816,HGNC:903,HGNC:11730,HGNC:1509,HGNC:3700,HGNC:10760,HGNC:8805,HGNC:7652, >>hgnc>>uniprot) https://sugi.bio/biobtree/api/map?i=HGNC%3A8975%2CHGNC%3A11998%2CHGNC%3A583%2CHGNC%3A20709%2CHGNC%3A7029%2CHGNC%3A5467%2CHGNC%3A12679%2CHGNC%3A2514%2CHGNC%3A9967%2CHGNC%3A1787%2CHGNC%3A9122%2CHGNC%3A9816%2CHGNC%3A903%2CHGNC%3A11730%2CHGNC%3A1509%2CHGNC%3A3700%2CHGNC%3A10760%2CHGNC%3A8805%2CHGNC%3A7652&m=%3E%3Ehgnc%3E%3Euniprot
- map(HGNC:12500,HGNC:16636,HGNC:8891,HGNC:20357,HGNC:4298,HGNC:1248,HGNC:7090,HGNC:4942,HGNC:4948,HGNC:4579,HGNC:11365,HGNC:25735,HGNC:4042,HGNC:11843,HGNC:15476,HGNC:318,HGNC:14050,HGNC:18590,HGNC:3097,HGNC:24276,HGNC:8883,HGNC:4944,HGNC:11861,HGNC:12778,HGNC:15983, >>hgnc>>ensembl) https://sugi.bio/biobtree/api/map?i=HGNC%3A12500%2CHGNC%3A16636%2CHGNC%3A8891%2CHGNC%3A20357%2CHGNC%3A4298%2CHGNC%3A1248%2CHGNC%3A7090%2CHGNC%3A4942%2CHGNC%3A4948%2CHGNC%3A4579%2CHGNC%3A11365%2CHGNC%3A25735%2CHGNC%3A4042%2CHGNC%3A11843%2CHGNC%3A15476%2CHGNC%3A318%2CHGNC%3A14050%2CHGNC%3A18590%2CHGNC%3A3097%2CHGNC%3A24276%2CHGNC%3A8883%2CHGNC%3A4944%2CHGNC%3A11861%2CHGNC%3A12778%2CHGNC%3A15983&m=%3E%3Ehgnc%3E%3Eensembl
- search(TRIM31, hgnc) https://sugi.bio/biobtree/api/search?i=TRIM31&s=hgnc
- search(DLC1, hgnc) https://sugi.bio/biobtree/api/search?i=DLC1&s=hgnc
- search(PRMT7, hgnc) https://sugi.bio/biobtree/api/search?i=PRMT7&s=hgnc
- search(IFNL3, hgnc) https://sugi.bio/biobtree/api/search?i=IFNL3&s=hgnc
- search(TERT, hgnc) https://sugi.bio/biobtree/api/search?i=TERT&s=hgnc
- search(MAU2, hgnc) https://sugi.bio/biobtree/api/search?i=MAU2&s=hgnc
- search(PRSS23, hgnc) https://sugi.bio/biobtree/api/search?i=PRSS23&s=hgnc
- search(PBX2, hgnc) https://sugi.bio/biobtree/api/search?i=PBX2&s=hgnc
- search(OBSCN, hgnc) https://sugi.bio/biobtree/api/search?i=OBSCN&s=hgnc
- map(HGNC:16289,HGNC:2897,HGNC:25557,HGNC:18365,HGNC:11730,HGNC:29140, >>hgnc>>uniprot) https://sugi.bio/biobtree/api/map?i=HGNC%3A16289%2CHGNC%3A2897%2CHGNC%3A25557%2CHGNC%3A18365%2CHGNC%3A11730%2CHGNC%3A29140&m=%3E%3Ehgnc%3E%3Euniprot
- map(O60333,Q14765,O94921,P42336,P08581,P07949,P11473,Q14790,P07954,O14746,Q9NST1,Q9BZW4,Q29983,P06681,Q9ULV1,P63167,Q96QB1,Q9NVM4, >>uniprot>>interpro) https://sugi.bio/biobtree/api/map?i=O60333%2CQ14765%2CO94921%2CP42336%2CP08581%2CP07949%2CP11473%2CQ14790%2CP07954%2CO14746%2CQ9NST1%2CQ9BZW4%2CQ29983%2CP06681%2CQ9ULV1%2CP63167%2CQ96QB1%2CQ9NVM4&m=%3E%3Euniprot%3E%3Einterpro
- map(P35222,P25054,P04637,O15169,P42771,P54278,Q92878,O60934,Q01105,P11717,Q9NWH9,Q15198, >>uniprot>>interpro) https://sugi.bio/biobtree/api/map?i=P35222%2CP25054%2CP04637%2CO15169%2CP42771%2CP54278%2CQ92878%2CO60934%2CQ01105%2CP11717%2CQ9NWH9%2CQ15198&m=%3E%3Euniprot%3E%3Einterpro
- map(MONDO:0007256, >>mondo>>clinical_trials) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Eclinical_trials
- map(MONDO:0007256, >>mondo>>clinical_trials) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Eclinical_trials
- map(EFO:0000182, >>efo>>chembl_molecule) https://sugi.bio/biobtree/api/map?i=EFO%3A0000182&m=%3E%3Eefo%3E%3Echembl_molecule
- map(D006528, >>mesh>>pharmgkb_clinical) https://sugi.bio/biobtree/api/map?i=D006528&m=%3E%3Emesh%3E%3Epharmgkb_clinical
- map(EFO:0000182, >>efo>>chembl_molecule) https://sugi.bio/biobtree/api/map?i=EFO%3A0000182&m=%3E%3Eefo%3E%3Echembl_molecule
- map(P42336,P08581,P07949,O94921,Q14765,P11473,Q14790,O14746,Q9NST1,P35222,P04637,P25054, >>uniprot>>reactome) https://sugi.bio/biobtree/api/map?i=P42336%2CP08581%2CP07949%2CO94921%2CQ14765%2CP11473%2CQ14790%2CO14746%2CQ9NST1%2CP35222%2CP04637%2CP25054&m=%3E%3Euniprot%3E%3Ereactome
- map(P42336,P08581,P07949,O94921,P11473,Q14790,P07954,O14746, >>uniprot>>chembl_target) https://sugi.bio/biobtree/api/map?i=P42336%2CP08581%2CP07949%2CO94921%2CP11473%2CQ14790%2CP07954%2CO14746&m=%3E%3Euniprot%3E%3Echembl_target
- map(P42336,P08581,P07949,O94921,Q14765,P11473,Q14790,O14746,Q9NST1,P35222,P04637,P25054, >>uniprot>>pdb) https://sugi.bio/biobtree/api/map?i=P42336%2CP08581%2CP07949%2CO94921%2CQ14765%2CP11473%2CQ14790%2CO14746%2CQ9NST1%2CP35222%2CP04637%2CP25054&m=%3E%3Euniprot%3E%3Epdb
- map(P42336,P08581,P07949,O94921,Q14765,Q14790,O14746,Q9NST1,P35222,P04637, >>uniprot>>string) https://sugi.bio/biobtree/api/map?i=P42336%2CP08581%2CP07949%2CO94921%2CQ14765%2CQ14790%2CO14746%2CQ9NST1%2CP35222%2CP04637&m=%3E%3Euniprot%3E%3Estring
- map(P35222,P04637,P25054,O15169,P42771,P11717,Q14790,P07954,O14746, >>uniprot>>chembl_target) https://sugi.bio/biobtree/api/map?i=P35222%2CP04637%2CP25054%2CO15169%2CP42771%2CP11717%2CQ14790%2CP07954%2CO14746&m=%3E%3Euniprot%3E%3Echembl_target
- map(ENSG00000100344,ENSG00000213996,ENSG00000138378,ENSG00000058091,ENSG00000054523,ENSG00000204520,ENSG00000196735,ENSG00000179344, >>ensembl>>bgee) https://sugi.bio/biobtree/api/map?i=ENSG00000100344%2CENSG00000213996%2CENSG00000138378%2CENSG00000058091%2CENSG00000054523%2CENSG00000204520%2CENSG00000196735%2CENSG00000179344&m=%3E%3Eensembl%3E%3Ebgee
- map(Q9NST1,Q9BZW4,Q14765,O60333,Q29983, >>uniprot>>pdb) https://sugi.bio/biobtree/api/map?i=Q9NST1%2CQ9BZW4%2CQ14765%2CO60333%2CQ29983&m=%3E%3Euniprot%3E%3Epdb
- map(Q9NST1,Q9BZW4,O60333,Q29983,Q9BZY9,Q96QB1,Q9NVM4,Q8IZI9, >>uniprot>>chembl_target) https://sugi.bio/biobtree/api/map?i=Q9NST1%2CQ9BZW4%2CO60333%2CQ29983%2CQ9BZY9%2CQ96QB1%2CQ9NVM4%2CQ8IZI9&m=%3E%3Euniprot%3E%3Echembl_target
- map(Q9NST1,Q9BZW4,Q14765,Q9BZY9,Q96QB1,Q8IZI9,Q9NVM4,Q9Y6X3, >>uniprot>>alphafold) https://sugi.bio/biobtree/api/map?i=Q9NST1%2CQ9BZW4%2CQ14765%2CQ9BZY9%2CQ96QB1%2CQ8IZI9%2CQ9NVM4%2CQ9Y6X3&m=%3E%3Euniprot%3E%3Ealphafold
- map(HGNC:18590,HGNC:11861,HGNC:11365,HGNC:8883,HGNC:11730,HGNC:16636,HGNC:4942,HGNC:4944, >>hgnc>>pharmgkb_gene) https://sugi.bio/biobtree/api/map?i=HGNC%3A18590%2CHGNC%3A11861%2CHGNC%3A11365%2CHGNC%3A8883%2CHGNC%3A11730%2CHGNC%3A16636%2CHGNC%3A4942%2CHGNC%3A4944&m=%3E%3Ehgnc%3E%3Epharmgkb_gene
- map(HGNC:8975,HGNC:11998,HGNC:7029,HGNC:2514,HGNC:9967,HGNC:12679,HGNC:1509, >>hgnc>>pharmgkb_gene) https://sugi.bio/biobtree/api/map?i=HGNC%3A8975%2CHGNC%3A11998%2CHGNC%3A7029%2CHGNC%3A2514%2CHGNC%3A9967%2CHGNC%3A12679%2CHGNC%3A1509&m=%3E%3Ehgnc%3E%3Epharmgkb_gene
- map(MONDO:0007256, >>mondo>>gencc>>hgnc) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Egencc%3E%3Ehgnc
- map(Q9NST1,Q9BZW4,Q14765,O94921,P08581,O14746, >>uniprot>>string_interaction) https://sugi.bio/biobtree/api/map?i=Q9NST1%2CQ9BZW4%2CQ14765%2CO94921%2CP08581%2CO14746&m=%3E%3Euniprot%3E%3Estring_interaction
- map(MONDO:0007256, >>mondo>>cellxgene_celltype) https://sugi.bio/biobtree/api/map?i=MONDO%3A0007256&m=%3E%3Emondo%3E%3Ecellxgene_celltype
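The calls listed above all follow the same URL pattern: a `search` or `map` endpoint, an `i` parameter holding one or more comma-separated identifiers, and either an `s` dataset filter or an `m` mapping chain written as `>>a>>b`. A minimal Python sketch of how such URLs could be assembled is shown below. The helper names `build_search_url` and `build_map_url` are hypothetical and only reproduce the URL shapes visible in this log; they are not part of any BioBTree client library.

```python
from urllib.parse import quote

# Base URL copied from the API calls listed above.
BASE_URL = "https://sugi.bio/biobtree/api"


def build_search_url(term, dataset=None):
    """Build a search URL; `dataset` is the optional `s` filter (e.g. "efo")."""
    url = f"{BASE_URL}/search?i={quote(term, safe='')}"
    if dataset:
        url += f"&s={quote(dataset, safe='')}"
    return url


def build_map_url(identifiers, chain):
    """Build a map URL from a list of identifiers and a list of dataset hops.

    The chain ["efo", "gwas"] becomes the mapping string ">>efo>>gwas",
    which is then percent-encoded as in the logged calls.
    """
    ids = ",".join(identifiers)
    mapping = "".join(f">>{step}" for step in chain)
    return f"{BASE_URL}/map?i={quote(ids, safe='')}&m={quote(mapping, safe='')}"


if __name__ == "__main__":
    print(build_search_url("hepatocellular carcinoma", "efo"))
    print(build_map_url(["EFO:0000182"], ["efo", "gwas", "dbsnp"]))
```

Percent-encoding with `safe=''` is what yields the `%3A`, `%2C`, and `%3E` sequences seen in the logged URLs; fetching and parsing the responses is deliberately left out because the response format is not shown in this report.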
HEPATOCELLULAR CARCINOMA (HCC): GWAS-TO-DRUG-TARGET DRUGGABILITY ANALYSIS
Section 1: Disease Identifiers
| Database | Identifier | Name | Cross-references |
|---|---|---|---|
| MONDO | MONDO:0007256 | Hepatocellular carcinoma | 3,693 |
| EFO | EFO:0000182 | Hepatocellular carcinoma | 3,612 |
| OMIM | 114550 | Hepatocellular carcinoma | — |
| Orphanet | 88673 | Hepatocellular carcinoma | Clinical group |
| MeSH | D006528 | Carcinoma, Hepatocellular | 10,157 |
| HPO | HP:0001402 | Hepatocellular carcinoma | 699 |
| Synonyms | MeSH Scope Note |
|---|---|
| HCC, hepatoma, liver cell carcinoma, liver cancer, hepatocellular adenocarcinoma, hepatoblastoma | "A primary malignant neoplasm of epithelial liver cells. Ranges from well-differentiated tumor with cells indistinguishable from normal hepatocytes to poorly differentiated neoplasm." |
Section 2: GWAS Landscape
Total GWAS Associations: 93 (HCC-specific from EFO:0000182)
Unique Studies: 27 (including HBV-HCC, HCV-HCC, alcohol-related HCC, NASH-HCC subtypes)
TOP 50 GWAS Associations (ranked approximately by p-value)
| # | rsID | p-value | Mapped Gene(s) | Chr | Disease Subtype | Study |
|---|---|---|---|---|---|---|
| 1 | RS9275572 | 3.0e-43 | HLA-DQB1 - MTCO3P1 | 6 | HCC in HCV infection | GCST006037 |
| 2 | RS9272105 | 5.0e-22 | HLA-DQA1 | 6 | HCC | GCST001603 |
| 3 | RS738409 | 2.0e-19 | PNPLA3 | 22 | HCC in alcohol cirrhosis | GCST90134423 |
| 4 | RS9274407 | 4.0e-19 | HLA-DQB1 | 6 | HCC | GCST90651062 |
| 5 | RS17401966 | 2.0e-18 | KIF1B | 1 | HCC | GCST000752 |
| 6 | RS9275319 | 3.0e-17 | HLA-DQB1 - MTCO3P1 | 6 | HCC in HBV | GCST001775 |
| 7 | RS2596542 | 4.0e-13 | MICA-AS1 | 6 | HCC | GCST001041 |
| 8 | RS3094137 | 9.0e-13 | TRIM31, TRIM31-AS1 | 6 | HCC in HBV | GCST005746 |
| 9 | RS58542926 | 2.0e-12 | TM6SF2 | 19 | HCC | GCST90651047 |
| 10 | RS58489806 | 3.0e-12 | MAU2 | 19 | HCC in alcohol cirrhosis | GCST90134423 |
| 11 | RS2523961 | 6.0e-11 | MICD, POLR1HASP | 6 | HCC in HBV | GCST005746 |
| 12 | RS738408 | 2.0e-11 | PNPLA3 | 22 | HCC | GCST90651047 |
| 13 | RS7574865 | 2.0e-10 | STAT4 | 2 | HCC in HBV | GCST001775 |
| 14 | RS1110446 | 3.0e-10 | TRIM31, TRIM31-AS1 | 6 | HCC in HBV | GCST005746 |
| 15 | RS738409 | 2.0e-10 | PNPLA3 | 22 | HCC | GCST90308757 |
| 16 | RS73613962 | 6.0e-10 | PRMT7 | 16 | HCC | GCST90104489 |
| 17 | RS58542926 | 6.0e-10 | TM6SF2 | 19 | Alcohol-related HCC | GCST90092003 |
| 18 | RS455804 | 5.0e-10 | GRIK1 | 21 | HCC | GCST001603 |
| 19 | RS8107030 | 8.0e-10 | IFNL3 - IFNL4 | 19 | HCC | GCST90013702 |
| 20 | RS10272859 | 9.0e-10 | CDK14 | 7 | HBV-related HCC | GCST005570 |
| 21 | RS2295119 | 6.0e-10 | MICD, POLR1HASP | 6 | HCC in HBV | GCST005746 |
| 22 | RS9272105 | 5.0e-09 | MTCO3P1 - HLA-DQB3 | 6 | HCC | GCST001041 |
| 23 | RS2242652 | 6.0e-09 | TERT | 5 | HCC in alcohol cirrhosis | GCST90134423 |
| 24 | RS147759720 | 3.0e-09 | PNPLA3 | 22 | HCC | GCST90651047 |
| 25 | RS708113 | 1.0e-08 | SEPTIN14P17 - WNT3A | 1 | Alcohol-related HCC | GCST90092003 |
| 26 | RS204993 | 3.0e-08 | PBX2 | 6 | HCC | GCST90308757 |
| 27 | RS75029749 | 5.0e-08 | OBSCN - TRIM11 | 1 | HCC | GCST90308757 |
| 28 | RS2896019 | 2.0e-08 | PNPLA3 | 22 | NASH-derived HCC | GCST005309 |
| 29 | RS58489806 | 2.0e-08 | MAU2 | 19 | HCC | GCST90651047 |
| 30 | RS17047200 | 3.0e-08 | TLL1 | 4 | HCC post-HCV eradication | GCST004160 |
| 31 | RS2856723 | 5.0e-07 | HLA-DPA2 | 6 | HCC in HBV | GCST005746 |
| 32 | RS2856723 | 6.0e-07 | HLA-DPA2 | 6 | HCC in HBV | GCST005746 |
| 33 | RS77236434 | 3.0e-07 | DLC1 | 8 | Familial HBV-HCC | GCST004736 |
| 34 | RS4678680 | 2.0e-07 | CCR4 - GLB1 | 3 | HCC | GCST000902 |
| 35 | RS9272105 | 2.0e-07 | HLA-DQA1 | 6 | HCC | GCST001603 |
| 36 | RS2143571 | 9.0e-07 | SAMM50 | 22 | NASH-derived HCC | GCST005309 |
| 37 | RS2143571 | 9.0e-07 | PNPLA3 | 22 | Alcohol-related HCC | GCST90092003 |
| 38 | RS2856723 | 4.0e-07 | HCG18 | 6 | HCC in HBV | GCST005746 |
| 39 | RS17007417 | 5.0e-07 | DYSF - RPS20P10 | 2 | NASH-derived HCC | GCST005309 |
| 40 | RS12100561 | 4.0e-06 | EFCAB11 | 14 | HCC | GCST000902 |
| 41 | RS10272859 | 6.0e-06 | KIF1B | 1 | HBV-related HCC | GCST005570 |
| 42 | RS2120243 | 2.0e-06 | VEPH1 | 3 | HCC in HBV | GCST003189 |
| 43 | RS9267673 | 2.0e-06 | C2 | 6 | HCC | GCST000902 |
| 44 | RS1350171 | 6.0e-06 | PRSS23, PRSS23-AS1 | 11 | HCC in HBV | GCST003189 |
| 45 | RS117413665 | 6.0e-06 | AGA-DT | 4 | Familial HBV-HCC | GCST004736 |
| 46 | RS35523062 | 7.0e-06 | LINC02360 - LINC02270 | 4 | Familial HBV-HCC | GCST004736 |
| 47 | RS4932397 | 7.0e-06 | MRPS11 - DET1 | 15 | Familial HBV-HCC | GCST004736 |
| 48 | RS8105790 | 6.0e-06 | IFNL3P1 - IFNL3 | 19 | HCC | GCST90651062 |
Rows 49-50 are not populated: the remaining associations were non-HCC cancer associations and were excluded.
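Because the same rsID can appear several times in the table above (for example RS738409 in different studies), it can help to rank rows strictly by p-value and keep only the strongest signal per variant. The sketch below illustrates this on a handful of rows transcribed from the table; the function names `rank_by_p_value` and `best_per_variant` are illustrative choices, and this is one plausible way to do the ranking rather than the method used to generate the report.

```python
# A few (rsID, p-value, mapped gene) rows transcribed from the table above.
associations = [
    ("RS738409", "2.0e-19", "PNPLA3"),
    ("RS9275572", "3.0e-43", "HLA-DQB1 - MTCO3P1"),
    ("RS2523961", "6.0e-11", "MICD, POLR1HASP"),
    ("RS738408", "2.0e-11", "PNPLA3"),
    ("RS738409", "2.0e-10", "PNPLA3"),
    ("RS17401966", "2.0e-18", "KIF1B"),
]


def rank_by_p_value(rows):
    """Return rows sorted from the smallest (most significant) p-value up."""
    return sorted(rows, key=lambda row: float(row[1]))


def best_per_variant(rows):
    """Keep only the most significant row for each rsID, then rank the result."""
    best = {}
    for rsid, p_value, gene in rows:
        if rsid not in best or float(p_value) < float(best[rsid][1]):
            best[rsid] = (rsid, p_value, gene)
    return rank_by_p_value(best.values())


if __name__ == "__main__":
    for rsid, p_value, gene in best_per_variant(associations):
        print(rsid, p_value, gene)
```

Keeping p-values as strings until comparison preserves the notation used in the table while still sorting numerically.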
Key Observations:
- HLA/MHC region dominance: 6p21 (HLA-DQB1, HLA-DQA1, MICA, TRIM31) is the most replicated locus
- Metabolic susceptibility: PNPLA3 (chr22) and TM6SF2 (chr19) repeatedly associated across alcohol, NASH, and general HCC
- Etiology-specific signals: HBV-HCC vs HCV-HCC vs alcohol-HCC show distinct genetic architectures
Section 3: Variant Details (dbSNP)
Total variants mapped: 79 unique rsIDs
Variant Classification by Functional Consequence
| rsID | Chr | Position | Ref/Alt | Key Gene | Consequence |
|---|---|---|---|---|---|
| RS738409 | 22 | 43928847 | C/G | PNPLA3 | Missense (I148M) |
| RS58542926 | 19 | 19268740 | C/T | TM6SF2 | Missense (E167K) |
| RS2596542 | 6 | 31398818 | C/T | MICA | 5'UTR/regulatory |
| RS7574865 | 2 | 191099907 | T/G | STAT4 | Intronic |
| RS2242652 | 5 | 1279913 | G/A | TERT | Intronic/regulatory |
| RS17401966 | 1 | 10325413 | A/G | KIF1B | Intronic |
| RS10272859 | 7 | 90689160 | G/A | CDK14 | Intronic |
| RS73613962 | 16 | 68328071 | T/C | PRMT7 | Intronic |
| RS8107030 | 19 | 39246079 | A/G | IFNL3 | Regulatory |
| RS455804 | 21 | 29773850 | A/G | GRIK1 | Intronic |
Tier Classification
| Tier | Description | Count | % | Key Variants |
|---|---|---|---|---|
| Tier 1 | Coding (missense) | 2 | 2.5% | RS738409 (PNPLA3 I148M), RS58542926 (TM6SF2 E167K) |
| Tier 2 | Splice/UTR | 3 | 3.8% | RS2596542 (MICA), RS8107030 (IFNL3 region) |
| Tier 3 | Regulatory | 8 | 10.1% | HLA region variants, TERT (RS2242652, intronic/regulatory) |
| Tier 4 | Intronic/intergenic | 66 | 83.5% | Most HLA, KIF1B, STAT4, CDK14 variants |
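The tier percentages in the table above follow directly from the counts and the 79 mapped variants. The sketch below shows one way to assign a tier from a consequence label and to recompute the percentages. The keyword matching and the rule that the strongest matching tier wins for mixed labels such as "5'UTR/regulatory" are assumptions made for illustration, not the classification logic used to produce the report.

```python
# Tier keywords follow the tier definitions used in this analysis.
# Order matters: the first (strongest) matching tier wins (an assumption).
TIER_KEYWORDS = [
    (1, ("missense", "frameshift", "nonsense")),
    (2, ("splice", "utr")),
    (3, ("regulatory",)),
]


def classify_tier(consequence):
    """Map a free-text consequence label to an evidence tier (1-4)."""
    text = consequence.lower()
    for tier, keywords in TIER_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return tier
    return 4  # intronic, intergenic, or anything unrecognised


def tier_percentages(counts):
    """Convert per-tier counts into percentages rounded to one decimal."""
    total = sum(counts.values())
    return {tier: round(100 * n / total, 1) for tier, n in counts.items()}


if __name__ == "__main__":
    print(classify_tier("Missense (I148M)"))
    print(tier_percentages({1: 2, 2: 3, 3: 8, 4: 66}))
```

With the counts from the table (2, 3, 8, 66), the percentages reproduce the values shown there.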
MAF Distribution: Most variants are common (MAF >5%), consistent with complex disease architecture.
Notable coding variants:
- RS738409 (PNPLA3 I148M): Best-validated HCC susceptibility variant. Impairs triglyceride hydrolysis, causing hepatic fat accumulation. Global MAF ~23%.
- RS58542926 (TM6SF2 E167K): Loss-of-function variant affecting hepatic lipid secretion. MAF ~7%.
Section 4: Mendelian Disease Overlap
ClinVar genes linked to MONDO:0007256: 19 genes
GWAS-Mendelian Overlap Analysis
| Gene | GWAS Signal | Best p-value | ClinVar/Mendelian Evidence | Inheritance | Overlap |
|---|---|---|---|---|---|
| TERT | Yes | 6.0e-09 | ClinVar: HCC somatic mutations | Somatic/AD | YES |
| PNPLA3 | Yes (top non-HLA signal) | 2.0e-19 | Not in ClinVar for HCC | — | Partial |
| PIK3CA | No GWAS | — | ClinVar: somatic HCC mutations | Somatic | No |
| TP53 | No GWAS | — | ClinVar: Li-Fraumeni (AD) + HCC somatic | AD/Somatic | No |
| APC | No GWAS | — | ClinVar: FAP (AD), hepatoblastoma risk | AD | No |
| CTNNB1 | No GWAS | — | ClinVar: HCC somatic mutations | Somatic | No |
| MET | No GWAS | — | ClinVar: hereditary papillary RCC + HCC | AD | No |
| AXIN1 | No GWAS | — | ClinVar: HCC somatic mutations | Somatic | No |
| CDKN2A | No GWAS | — | ClinVar: familial melanoma/cancer | AD | No |
| RET | No GWAS | — | ClinVar: MEN2, Hirschsprung | AD | No |
| VDR | No GWAS | — | ClinVar: Rickets type IIA | AR | No |
| CASP8 | No GWAS | — | ClinVar: autoimmune lymphoproliferative | AD | No |
| PMS2 | No GWAS | — | ClinVar: Lynch syndrome | AD | No |
| FH | No GWAS | — | ClinVar: fumarase deficiency | AR/AD | No |
| RAD50 | No GWAS | — | ClinVar: Nijmegen breakage-like | AR | No |
| NBN | No GWAS | — | ClinVar: Nijmegen breakage syndrome | AR | No |
| IGF2R | No GWAS | — | ClinVar: HCC | — | No |
| SET | No GWAS | — | ClinVar: HCC | — | No |
| PDGFRL | No GWAS | — | ClinVar: HCC | — | No |
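Key Finding: Only TERT has both a GWAS signal and ClinVar evidence for HCC, making it the highest-confidence genetic target in this set. The TERT ClinVar evidence is annotated as somatic/AD, and most ClinVar HCC genes harbor somatic rather than germline mutations, so this overlap is weaker than a classical germline Mendelian overlap.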
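The overlap check itself is a set intersection between genes with a GWAS signal and genes with ClinVar evidence. A minimal sketch follows, using an abbreviated set of GWAS genes with their best p-values from Section 2 and the ClinVar genes named in the table above. The variable and function names are illustrative, and the GWAS dictionary is deliberately incomplete.

```python
# Best GWAS p-value per gene (abbreviated subset transcribed from Section 2).
gwas_best_p = {
    "PNPLA3": 2.0e-19,
    "KIF1B": 2.0e-18,
    "TM6SF2": 2.0e-12,
    "STAT4": 2.0e-10,
    "CDK14": 9.0e-10,
    "TERT": 6.0e-09,
}

# Genes with ClinVar evidence, as named in the overlap table above.
clinvar_genes = {
    "TERT", "PIK3CA", "TP53", "APC", "CTNNB1", "MET", "AXIN1", "CDKN2A",
    "RET", "VDR", "CASP8", "PMS2", "FH", "RAD50", "NBN", "IGF2R", "SET",
    "PDGFRL",
}


def gwas_clinvar_overlap(gwas, clinvar):
    """Return (gene, best p-value) pairs present in both sources, strongest first."""
    shared = set(gwas) & set(clinvar)
    return sorted(((gene, gwas[gene]) for gene in shared), key=lambda item: item[1])


if __name__ == "__main__":
    print(gwas_clinvar_overlap(gwas_best_p, clinvar_genes))
```

On these inputs the only shared gene is TERT, which matches the key finding stated above.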
Section 5: GWAS Genes to Proteins
Total unique GWAS genes (protein-coding): 34
Total unique ClinVar genes: 19
Combined unique genes: 48 (5 shared context: TERT)
TOP 50 Genes with Protein Products
| # | Gene | HGNC ID | UniProt | Protein Name | Evidence | Mendelian |
|---|---|---|---|---|---|---|
| 1 | PNPLA3 | HGNC:18590 | Q9NST1 | 1-acylglycerol-3-phosphate O-acyltransferase | Tier 1 (I148M) | N |
| 2 | TM6SF2 | HGNC:11861 | Q9BZW4 | Transmembrane 6 superfamily member 2 | Tier 1 (E167K) | N |
| 3 | HLA-DQB1 | HGNC:4944 | P01920 | MHC class II, DQ beta 1 | Tier 4 | N |
| 4 | HLA-DQA1 | HGNC:4942 | P01909 | MHC class II, DQ alpha 1 | Tier 4 | N |
| 5 | KIF1B | HGNC:16636 | O60333 | Kinesin-like protein KIF1B | Tier 4 | N |
| 6 | MICA | HGNC:7090 | Q29983 | MHC class I polypeptide-related sequence A | Tier 2 | N |
| 7 | STAT4 | HGNC:11365 | Q14765 | STAT4 transcription factor | Tier 4 | N |
| 8 | TRIM31 | HGNC:16289 | Q9BZY9 | E3 ubiquitin-protein ligase TRIM31 | Tier 4 | N |
| 9 | CDK14 | HGNC:8883 | O94921 | Cyclin-dependent kinase 14 | Tier 4 | N |
| 10 | TERT | HGNC:11730 | O14746 | Telomerase reverse transcriptase | Tier 3 | Y |
| 11 | IFNL3 | HGNC:18365 | Q8IZI9 | Interferon lambda-3 | Tier 3 | N |
| 12 | MAU2 | HGNC:29140 | Q9Y6X3 | MAU2 chromatid cohesion factor | Tier 4 | N |
| 13 | PRMT7 | HGNC:25557 | Q9NVM4 | Protein arginine N-methyltransferase 7 | Tier 4 | N |
| 14 | DLC1 | HGNC:2897 | Q96QB1 | Rho GTPase-activating protein 7 | Tier 4 | N |
| 15 | TLL1 | HGNC:11843 | O43897 | Tolloid-like protein 1 | Tier 4 | N |
| 16 | GRIK1 | HGNC:4579 | P39086 | Glutamate receptor ionotropic kainate 1 | Tier 4 | N |
| 17 | C2 | HGNC:1248 | P06681 | Complement C2 | Tier 4 | N |
| 18 | SAMM50 | HGNC:24276 | Q9Y512 | SAMM50 sorting/assembly component | Tier 4 | N |
| 19 | WNT3A | HGNC:15983 | P56704 | Wnt family member 3A | Tier 4 | N |
| 20 | PBX2 | HGNC:8633 | P40424 | PBX homeobox 2 | Tier 4 | N |
| 21 | VEPH1 | HGNC:25735 | Q14D04 | Ventricular zone expressed PH domain 1 | Tier 4 | N |
| 22 | PRSS23 | HGNC:14370 | Q9BRK3 | Serine protease 23 | Tier 4 | N |
| 23 | EFCAB11 | HGNC:20357 | Q9BUY7 | EF-hand calcium binding domain 11 | Tier 4 | N |
| 24 | GLB1 | HGNC:4298 | P16278 | Beta-galactosidase | Tier 4 | N |
| 25 | DYSF | HGNC:3097 | O75923 | Dysferlin | Tier 4 | N |
| 26 | UBE4B | HGNC:12500 | O95155 | Ubiquitination factor E4B | Tier 4 | N |
| 27 | PGD | HGNC:8891 | P52209 | 6-phosphogluconate dehydrogenase | Tier 4 | N |
| 28 | HLA-DRB1 | HGNC:4948 | P01911 | MHC class II, DR beta 1 | Tier 4 | N |
| 29 | OBSCN | HGNC:15719 | Q5VST9 | Obscurin | Tier 4 | N |
| — | ClinVar-only genes below | | | | | |
| 30 | PIK3CA | HGNC:8975 | P42336 | PI3K catalytic subunit alpha | ClinVar | N |
| 31 | TP53 | HGNC:11998 | P04637 | Tumor protein p53 | ClinVar | N |
| 32 | CTNNB1 | HGNC:2514 | P35222 | Catenin beta-1 | ClinVar | N |
| 33 | MET | HGNC:7029 | P08581 | Hepatocyte growth factor receptor | ClinVar | N |
| 34 | APC | HGNC:583 | P25054 | Adenomatous polyposis coli protein | ClinVar | N |
| 35 | AXIN1 | HGNC:903 | O15169 | Axin-1 | ClinVar | N |
| 36 | RET | HGNC:9967 | P07949 | Proto-oncogene tyrosine-protein kinase Ret | ClinVar | N |
| 37 | VDR | HGNC:12679 | P11473 | Vitamin D3 receptor | ClinVar | N |
| 38 | CASP8 | HGNC:1509 | Q14790 | Caspase-8 | ClinVar | N |
| 39 | FH | HGNC:3700 | P07954 | Fumarate hydratase | ClinVar | N |
| 40 | CDKN2A | HGNC:1787 | P42771 | CDK inhibitor 2A (p16) | ClinVar | N |
| 41 | PMS2 | HGNC:9122 | P54278 | Mismatch repair endonuclease PMS2 | ClinVar | N |
| 42 | RAD50 | HGNC:9816 | Q92878 | DNA repair protein RAD50 | ClinVar | N |
| 43 | NBN | HGNC:7652 | O60934 | Nibrin | ClinVar | N |
| 44 | IGF2R | HGNC:5467 | P11717 | Cation-independent M6P receptor | ClinVar | N |
| 45 | SET | HGNC:10760 | Q01105 | Protein SET | ClinVar | N |
| 46 | SLTM | HGNC:20709 | Q9NWH9 | SAFB-like transcription modulator | ClinVar | N |
| 47 | PDGFRL | HGNC:8805 | Q15198 | PDGF receptor-like protein | ClinVar | N |
Section 6: Protein Family Classification
| Gene | UniProt | Protein Family (InterPro) | Category | Druggable? |
|---|---|---|---|---|
| PIK3CA | P42336 | PI3/PI4 kinase | Kinase | YES |
| MET | P08581 | Receptor tyrosine kinase | Kinase | YES |
| RET | P07949 | Receptor tyrosine kinase | Kinase | YES |
| CDK14 | O94921 | Cyclin-dependent kinase | Kinase | YES |
| VDR | P11473 | Nuclear hormone receptor | Nuclear receptor | YES |
| FZD4 | Q9ULV1 | Frizzled/GPCR-like 7TM | GPCR | YES |
| GRIK1 | P39086 | Ionotropic glutamate receptor | Ion channel | YES |
| C2 | P06681 | Serine protease (trypsin-like) | Protease | YES |
| CASP8 | Q14790 | Cysteine protease (caspase) | Protease | YES |
| PRSS23 | Q9BRK3 | Serine protease | Protease | YES |
| PNPLA3 | Q9NST1 | Patatin-like phospholipase | Enzyme | YES |
| PRMT7 | Q9NVM4 | Protein arginine methyltransferase | Enzyme | YES |
| PGD | P52209 | 6-phosphogluconate dehydrogenase | Enzyme | YES |
| FH | P07954 | Fumarate hydratase (lyase) | Enzyme | YES |
| TERT | O14746 | Reverse transcriptase | Enzyme | YES |
| GLB1 | P16278 | Beta-galactosidase | Enzyme | Moderate |
| STAT4 | Q14765 | STAT transcription factor | TF | Difficult |
| TP53 | P04637 | p53 tumor suppressor | TF | Difficult |
| PBX2 | P40424 | Homeodomain TF | TF | Difficult |
| CTNNB1 | P35222 | Beta-catenin (Armadillo) | Scaffold/PPI | Difficult |
| APC | P25054 | APC tumor suppressor | Scaffold | Difficult |
| AXIN1 | O15169 | Axin (RGS domain) | Scaffold | Difficult |
| TRIM31 | Q9BZY9 | E3 ubiquitin ligase (TRIM) | Enzyme | Moderate |
| DLC1 | Q96QB1 | RhoGAP | Signaling | Difficult |
| KIF1B | O60333 | Kinesin motor protein | Transport | Difficult |
| TM6SF2 | Q9BZW4 | Transmembrane protein | Unknown function | Difficult |
| MICA | Q29983 | MHC class I-related | Immune | Moderate |
| IFNL3 | Q8IZI9 | Interferon lambda cytokine | Cytokine | Moderate |
| WNT3A | P56704 | Wnt signaling ligand | Signaling | Difficult |
| MAU2 | Q9Y6X3 | Cohesin loading factor | Chromatin | Difficult |
| CDKN2A | P42771 | CDK inhibitor (ankyrin repeat) | Tumor suppressor | Difficult |
Summary
| Category | Count | % | Key Genes |
|---|---|---|---|
| Kinases | 4 | 8.5% | PIK3CA, MET, RET, CDK14 |
| GPCRs | 1 | 2.1% | FZD4 |
| Ion channels | 1 | 2.1% | GRIK1 |
| Nuclear receptors | 1 | 2.1% | VDR |
| Proteases | 3 | 6.4% | C2, CASP8, PRSS23 |
| Enzymes (other) | 7 | 14.9% | PNPLA3, PRMT7, PGD, FH, TERT, GLB1, TRIM31 |
| Druggable subtotal | 17 | 36.2% | — |
| Transcription factors | 3 | 6.4% | STAT4, TP53, PBX2 |
| Scaffold/PPI | 3 | 6.4% | CTNNB1, APC, AXIN1 |
| Signaling/other difficult | 6 | 12.8% | DLC1, KIF1B, WNT3A, CDKN2A, etc. |
| Unknown/novel | 18 | 38.3% | TM6SF2, MAU2, VEPH1, etc. |
| Difficult/unknown subtotal | 30 | 63.8% | — |
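
The percentages in the summary table can be re-derived from the counts. The sketch below is a minimal check, assuming the 47-gene combined set is the denominator for every row; the variable and function names are illustrative and not part of the original analysis.

```python
TOTAL_GENES = 47  # combined GWAS + ClinVar gene count used as the denominator

# Counts copied from the summary table.
druggable = {"Kinases": 4, "GPCRs": 1, "Ion channels": 1,
             "Nuclear receptors": 1, "Proteases": 3, "Enzymes (other)": 7}
difficult = {"Transcription factors": 3, "Scaffold/PPI": 3,
             "Signaling/other difficult": 6, "Unknown/novel": 18}


def pct(count, total=TOTAL_GENES):
    """Share of the gene set in percent, rounded to one decimal as in the table."""
    return round(100 * count / total, 1)


druggable_total = sum(druggable.values())
difficult_total = sum(difficult.values())

for name, count in {**druggable, **difficult}.items():
    print(f"{name:28s} {count:3d} {pct(count):5.1f}%")
print(f"Druggable subtotal: {druggable_total} ({pct(druggable_total)}%)")
print(f"Difficult/unknown subtotal: {difficult_total} ({pct(difficult_total)}%)")
```

Note that GLB1 and TRIM31 are rated "Moderate" in the classification table but are counted in the druggable subtotal here, following the summary table.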
Section 7: Expression Context
Disease-relevant tissues: Liver (hepatocytes), tumor microenvironment (immune cells, fibroblasts)
CellxGene cell types in HCC datasets:
| Cell Type | Total Cells |
|---|---|
| Fibroblast | 6,848,407 |
| Macrophage | 3,299,562 |
| T cell | 2,097,084 |
| B cell | 2,077,991 |
| Malignant cell | 1,942,920 |
| Hematopoietic progenitor | 436,110 |
| Unknown | 2,984,441 |
Gene Expression Profiles for Selected Top Genes (Bgee)
| Gene | Expression Breadth | Max Score | Key Tissues | Disease Relevance |
|---|---|---|---|---|
| KIF1B | Ubiquitous | 99.36 | Brain, liver, kidney | Expressed in liver |
| HLA-DQB1 | Ubiquitous | 99.09 | Immune cells, spleen | Immune microenvironment |
| HLA-DQA1 | Ubiquitous | 98.73 | Immune cells, thymus | Immune microenvironment |
| CDK14 | Ubiquitous | 98.24 | Brain, testis, liver | Expressed in liver |
| TM6SF2 | Ubiquitous | 97.90 | Liver (high), intestine | Highly relevant |
| PNPLA3 | Ubiquitous | 96.79 | Liver (high), adipose | Highly relevant |
| STAT4 | Ubiquitous | 96.60 | Immune cells, spleen | Immune microenvironment |
| MICA | Ubiquitous | 94.74 | Epithelial, liver | Relevant |
| TERT | Low/absent normal | — | Cancer cells, stem cells | Tumor-specific |
| PIK3CA | Ubiquitous | — | All tissues | Broad |
| TP53 | Ubiquitous | — | All tissues | Broad |
| MET | High liver | — | Liver, kidney, placenta | Highly relevant |
| CTNNB1 | Ubiquitous | — | All tissues | Broad |
Key Expression Insights:
- PNPLA3 and TM6SF2 are highly liver-enriched — ideal for liver-targeted therapy with fewer off-target effects
- TERT is low or absent in normal tissue and re-expressed in cancer cells, which suggests a favorable therapeutic window
- MET has high liver expression — directly relevant to HCC biology
- HLA genes are expressed in immune cells — relevant to tumor immune microenvironment
- GRIK1 is primarily brain-expressed — lower confidence as liver cancer target
Section 8: Protein Interactions
Key GWAS Protein Interaction Networks
PNPLA3 (Q9NST1) — Interaction count: 1,420 (STRING)
- Top interactors: TM6SF2 (score 916), SAMM50 (score 911), IFNL3 (score 581)
- Notable: PNPLA3, TM6SF2, and SAMM50 are ALL GWAS hits and interact — strong pathway clustering in lipid metabolism
TM6SF2 (Q9BZW4) — Interaction count: ~600+
- Top interactors: PNPLA3 (score 916), SAMM50 (529)
- Lipid metabolism cluster confirmed
CDK14 (O94921) — Interaction count: 2,984
- Top interactor: CCNY/Cyclin-Y (score 992) — its regulatory cyclin
- Interacts with CTNNB1 (score 353) — connects to Wnt pathway
- Interacts with CDK1/CDK2 family members
MET (P08581) — Interaction count: 5,090
- Top interactors: HGF (score 999), CD44 (991), EGFR (979), STAT3 (948), CTNNB1 (793), TP53 (769), PIK3CA (implied via pathway)
- Major hub connecting GWAS genes to ClinVar genes
TERT (O14746) — Interaction count: 5,450
- Top interactors: DKC1 (999), TERC (998), NHP2 (997), HSP90 (988), CTNNB1 (938), TP53 (887), PIK3CA (669)
- Hub connecting telomere biology to Wnt and p53 pathways
STAT4 (Q14765) — Interaction count: 2,286
- Top interactors: IL12RB2 (923), IFNG (917), IL12RB1 (901), JAK2 (888)
- Central to IL-12/IFN-gamma immune signaling
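
The clustering noted above can be made explicit by keeping only interactions where both partners are in the report's gene set and the score passes a confidence cutoff. This is a minimal sketch: the edge list is transcribed from the bullets above (only pairs with an explicit score), and the cutoff of 700 is an assumption based on STRING's conventional "high confidence" threshold (0.7 on a 0 to 1 scale).

```python
from collections import Counter

# (gene_a, gene_b, STRING combined score) transcribed from the bullets above.
EDGES = [
    ("PNPLA3", "TM6SF2", 916), ("PNPLA3", "SAMM50", 911), ("PNPLA3", "IFNL3", 581),
    ("TM6SF2", "SAMM50", 529),
    ("CDK14", "CCNY", 992), ("CDK14", "CTNNB1", 353),
    ("MET", "HGF", 999), ("MET", "CTNNB1", 793), ("MET", "TP53", 769),
    ("TERT", "CTNNB1", 938), ("TERT", "TP53", 887), ("TERT", "PIK3CA", 669),
    ("STAT4", "JAK2", 888),
]

# Genes from the GWAS + ClinVar set that occur in the edges above.
GENE_SET = {"PNPLA3", "TM6SF2", "SAMM50", "IFNL3", "CDK14", "CTNNB1",
            "MET", "TP53", "TERT", "PIK3CA", "STAT4"}

HIGH_CONFIDENCE = 700  # assumed cutoff; adjust to taste


def within_set_edges(edges, genes, min_score=HIGH_CONFIDENCE):
    """Keep edges with both endpoints in `genes` and a score at or above the cutoff."""
    return [(a, b, s) for a, b, s in edges
            if a in genes and b in genes and s >= min_score]


high = within_set_edges(EDGES, GENE_SET)
degree = Counter(gene for a, b, _ in high for gene in (a, b))
for gene, n in degree.most_common():
    print(gene, n)
```

With this cutoff, the CDK14 to CTNNB1 link (score 353) drops out, so the Wnt connection for CDK14 rests on weaker evidence than the lipid-metabolism cluster.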
Undrugged GWAS Genes with Drugged Interactors
| Undrugged Gene | Interacts With | Drugged Interactor | Available Drugs |
|---|---|---|---|
| PNPLA3 | TM6SF2, SAMM50, IFNL3 | None directly (metabolic cluster) | — |
| TM6SF2 | PNPLA3, lipoproteins | ApoB pathway targets | Statins (indirect) |
| STAT4 | JAK2, IL12RB | JAK2 → Ruxolitinib, Baricitinib | JAK inhibitors |
| CDK14 | CTNNB1, Cyclin-Y | CDK family → Palbociclib, Abemaciclib | CDK inhibitors |
| TRIM31 | Immune signaling | Ubiquitin pathway | Proteasome inhibitors |
| MAU2 | Cohesin complex | NIPBL, SMC proteins | — |
| KIF1B | Motor protein complex | Tubulin → Paclitaxel, Docetaxel | Microtubule agents |
| DLC1 | RhoA, Cdc42 | Rho pathway | ROCK inhibitors |
Section 9: Structural Data
PDB Structure Availability
| Protein | UniProt | PDB Structures | Quality | Notes |
|---|---|---|---|---|
| PIK3CA | P42336 | 100+ structures | High res (2.2-3.3Å) | Extensive drug co-crystals |
| MET | P08581 | 12+ structures | Good | Kinase domain + inhibitor complexes |
| RET | P07949 | Multiple | Good | Kinase domain structures |
| TP53 | P04637 | 143+ structures | Excellent | DNA-binding + MDM2 complexes |
| CTNNB1 | P35222 | Multiple | Good | Armadillo repeat structures |
| CASP8 | Q14790 | Multiple | Good | Active site structures |
| VDR | P11473 | Multiple | Good | Ligand-binding domain |
| MICA | Q29983 | 10 structures | Good (2.2-3.8Å) | NKG2D complexes |
| KIF1B | O60333 | 1 (NMR) | Low | FHA domain only |
| C2 | P06681 | Multiple | Good | Complement cascade |
Undrugged Targets — Structure Assessment
| Gene | UniProt | PDB? | AlphaFold? | AF Quality (pLDDT) | Assessment |
|---|---|---|---|---|---|
| PNPLA3 | Q9NST1 | No | Yes | 71.7 (moderate) | Structure-based design challenging |
| TM6SF2 | Q9BZW4 | No | Yes | 91.0 (high) | Good AF model available |
| STAT4 | Q14765 | No | Yes | 86.9 (good) | Related STAT structures usable |
| TRIM31 | Q9BZY9 | No | Yes | 79.6 (moderate) | RING domain modelable |
| DLC1 | Q96QB1 | No | Yes | 56.6 (low) | Disordered regions |
| PRMT7 | Q9NVM4 | No | Yes | 93.2 (high) | Excellent AF model |
| IFNL3 | Q8IZI9 | No | Yes | 85.8 (good) | Good model, cytokine fold |
| MAU2 | Q9Y6X3 | No | Yes | 91.9 (high) | Good model available |
Summary: 10 proteins with PDB | 8 undrugged targets with AlphaFold only | 0 with no structure
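
The quality labels in the AlphaFold table can be compared against a fixed banding. The sketch below applies the commonly cited AlphaFold confidence bands (above 90, 70 to 90, 50 to 70, below 50) to the mean pLDDT values from the table. Two assumptions apply: these bands are defined per residue, so applying them to a per-protein mean is an approximation, and the band names differ from the "moderate/good/high" labels used in the table.

```python
# Mean pLDDT values copied from the structure assessment table.
PLDDT = {"PNPLA3": 71.7, "TM6SF2": 91.0, "STAT4": 86.9, "TRIM31": 79.6,
         "DLC1": 56.6, "PRMT7": 93.2, "IFNL3": 85.8, "MAU2": 91.9}


def plddt_band(score):
    """Map a pLDDT score to a confidence band (thresholds 90 / 70 / 50)."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


bands = {gene: plddt_band(score) for gene, score in PLDDT.items()}
for gene, band in sorted(bands.items(), key=lambda item: -PLDDT[item[0]]):
    print(f"{gene:8s} {PLDDT[gene]:5.1f}  {band}")
```

Under this banding PNPLA3 (71.7) sits just inside the "confident" band, which is consistent with the table's note that structure-based design would be challenging.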
Section 10: Drug Target Analysis
ChEMBL molecules linked to HCC (EFO:0000182): 304 compounds (200+ retrieved)
Drug Development Summary for GWAS+ClinVar Genes
| Category | Count | % of 47 Genes |
|---|---|---|
| Approved drugs (Phase 4) for HCC | 8 | 17.0% |
| Approved drugs for OTHER diseases | 12 | 25.5% |
| Phase 3 drugs | 5 | 10.6% |
| Phase 2/1 drugs | 4 | 8.5% |
| Preclinical compounds only | 6 | 12.8% |
| NO drug development | 12 | 25.5% |
Genes with Approved or Investigational Drugs
| Gene | Protein | Drug(s) | Mechanism | Approved for HCC? |
|---|---|---|---|---|
| MET | HGF receptor | Cabozantinib, Capmatinib, Tepotinib | RTK inhibitor | YES (cabozantinib) |
| PIK3CA | PI3K-alpha | Alpelisib | PI3K inhibitor | No (breast cancer) |
| RET | RET kinase | Selpercatinib, Pralsetinib | RET inhibitor | No (thyroid/lung) |
| VDR | Vitamin D receptor | Calcitriol, seocalcitol | VDR agonist | No (rickets) |
| CDK14 | CDK14 | Palbociclib*, Abemaciclib*, Ribociclib* | CDK inhibitor | No (breast); *approved as CDK4/6 inhibitors, not CDK14-selective |
| CASP8 | Caspase-8 | Emricasan (investigational) | Pan-caspase inhibitor | No |
| FH | Fumarate hydratase | No direct drugs | — | — |
| TERT | Telomerase RT | No approved; investigational | Telomerase inhibitor | Phase 2 |
Key HCC-approved drugs (only some, such as lenvatinib, cabozantinib, and regorafenib, act on GWAS/ClinVar genes):
- Sorafenib — multi-kinase inhibitor (VEGFR, RAF, PDGFR) — first-line HCC
- Lenvatinib — multi-kinase inhibitor (VEGFR, FGFR, RET) — first-line HCC
- Cabozantinib — MET/VEGFR2/RET inhibitor — second-line HCC
- Regorafenib — multi-kinase inhibitor — second-line HCC
- Atezolizumab + Bevacizumab — PD-L1/VEGF — first-line HCC
- Nivolumab — PD-1 — second-line HCC
- Pembrolizumab — PD-1 — second-line HCC
- Ramucirumab — VEGFR2 — second-line HCC (AFP ≥400)
Section 11: Bioactivity & Enzyme Data
Most-Studied GWAS/ClinVar Proteins (ChEMBL Target Records)
| Protein | ChEMBL Target | Target Type | Bioactivity Data |
|---|---|---|---|
| PIK3CA | CHEMBL4005 | Single protein + complexes | Extensive (>10,000 compounds) |
| MET | CHEMBL3717 | Single protein | Extensive (>5,000 compounds) |
| RET | CHEMBL2041 | Single protein + fusions | Extensive |
| VDR | CHEMBL1977 | Single protein + RXR complex | Large (>2,000 compounds) |
| CDK14 | CHEMBL6162 | Single protein + CDK family | Moderate |
| TP53 | CHEMBL4096 | Single protein + MDM2 PPI | Large (PPI modulators) |
| CTNNB1 | CHEMBL5866 | Single protein + PPIs | Growing (Wnt pathway) |
| CASP8 | CHEMBL3776 | Single protein + family | Moderate |
| TERT | CHEMBL2916 | Single protein | Moderate |
| APC | CHEMBL3233 | Single protein | Limited |
| PRMT7 | CHEMBL3562175 | Single protein | Limited |
Enzyme GWAS Genes — Druggability Assessment
| Enzyme | EC Class | Known Inhibitors | Kinetic Data | Assessment |
|---|---|---|---|---|
| PNPLA3 | Acyltransferase | No clinical | Limited | HIGH opportunity — causative variant |
| PRMT7 | Methyltransferase | Research tools | Yes | HIGH opportunity — SAM-dependent, druggable fold |
| PGD | Oxidoreductase | 6-AN (research) | Yes | Moderate |
| FH | Lyase | None clinical | Extensive | Tumor suppressor — inhibition counterproductive |
| TERT | Reverse transcriptase | Imetelstat (Phase 2) | Yes | HIGH — active development |
| C2 | Serine protease | Complement inhibitors | Yes | Druggable family |
Section 12: Pharmacogenomics
PharmGKB Clinical Annotations for HCC
| Gene | Variant | Drug | Type | Level | Clinical Annotation |
|---|---|---|---|---|---|
| GALNT14 | rs12613732 | Cisplatin/5-FU/Mitoxantrone | Efficacy | 3 | Response prediction for TACE |
| GALNT14 | rs9679162 | Cisplatin/5-FU/Mitoxantrone | Efficacy | 3 | Response prediction for TACE |
| VEGFA | rs2010963 | Sorafenib | Efficacy | 3 | Sorafenib response in HCC |
| VEGFA | rs3025040 | Sorafenib | Toxicity | 3 | Hand-foot syndrome risk |
| KDR | rs1870377 | Sorafenib | Efficacy | 3 | Sorafenib response |
| KDR | rs2071559 | Sorafenib | Efficacy | 3 | Sorafenib response in HCC/RCC |
| NOS3 | rs1799983 | Sorafenib | Efficacy | 3 | Sorafenib response |
| NOS3 | rs2070744 | Sorafenib | Efficacy | 3 | Sorafenib response |
| TNF | rs1800629 | Sorafenib | Toxicity | 3 | Hand-foot syndrome risk |
| SLC15A2 | rs2257212 | Sorafenib | Efficacy | 3 | Sorafenib pharmacokinetics |
| UGT1A9 | rs7574296 | Sorafenib | Toxicity | 3 | Hand-foot syndrome |
| UGT1A1 | *1/*28 | Sorafenib | Toxicity | 3 | Hyperbilirubinemia risk |
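
The annotations above are easier to compare when grouped by drug and annotation type. This is a minimal sketch using only the rows of the table; the tuple layout and variable names are illustrative.

```python
from collections import Counter

# (gene, variant, drug, annotation type) copied from the PharmGKB table.
ANNOTATIONS = [
    ("GALNT14", "rs12613732", "Cisplatin/5-FU/Mitoxantrone", "Efficacy"),
    ("GALNT14", "rs9679162", "Cisplatin/5-FU/Mitoxantrone", "Efficacy"),
    ("VEGFA", "rs2010963", "Sorafenib", "Efficacy"),
    ("VEGFA", "rs3025040", "Sorafenib", "Toxicity"),
    ("KDR", "rs1870377", "Sorafenib", "Efficacy"),
    ("KDR", "rs2071559", "Sorafenib", "Efficacy"),
    ("NOS3", "rs1799983", "Sorafenib", "Efficacy"),
    ("NOS3", "rs2070744", "Sorafenib", "Efficacy"),
    ("TNF", "rs1800629", "Sorafenib", "Toxicity"),
    ("SLC15A2", "rs2257212", "Sorafenib", "Efficacy"),
    ("UGT1A9", "rs7574296", "Sorafenib", "Toxicity"),
    ("UGT1A1", "*1/*28", "Sorafenib", "Toxicity"),
]

# Number of annotations per (drug, type) pair.
by_drug_type = Counter((drug, kind) for _, _, drug, kind in ANNOTATIONS)

# Distinct genes behind each sorafenib annotation type.
sorafenib_genes = {}
for gene, _, drug, kind in ANNOTATIONS:
    if drug == "Sorafenib":
        sorafenib_genes.setdefault(kind, set()).add(gene)

for (drug, kind), n in sorted(by_drug_type.items()):
    print(f"{drug:30s} {kind:9s} {n}")
```

All twelve annotations are Level 3, and none of the annotated genes belongs to the GWAS/ClinVar gene set analyzed in this report.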
PharmGKB VIP (Very Important Pharmacogenes) Status
| GWAS/ClinVar Gene | PharmGKB VIP | Has CPIC Guideline |
|---|---|---|
| PNPLA3 | Yes | No |
| TM6SF2 | Yes | No |
| STAT4 | Yes | No |
| CDK14 | Yes | No |
| TERT | Yes | No |
| KIF1B | Yes | No |
| PIK3CA | Yes | No |
| TP53 | Yes | No |
| MET | Yes | No |
| VDR | Yes | No |
Section 13: Clinical Trials
Total clinical trials for HCC (MONDO:0007256): 3,195+
Phase Breakdown (from sampled 200 trials)
| Phase | Count (sampled) | Estimated Total |
|---|---|---|
| Phase 4 | 88 | ~400+ |
| Phase 3 | 97 | ~600+ |
| Phase 2/3 | 15 | ~100+ |
| Phase 2 | est. | ~1,000+ |
| Phase 1 | est. | ~800+ |
TOP 30 Drugs in HCC Clinical Trials
| Drug | Phase | Mechanism | Target Gene | GWAS Gene? |
|---|---|---|---|---|
| Sorafenib | 4 | Multi-kinase (RAF/VEGFR) | BRAF, KDR, PDGFR | No (but PharmGKB link) |
| Lenvatinib | 4 | Multi-kinase (VEGFR/FGFR/RET) | RET, FGFR | YES (ClinVar) |
| Cabozantinib | 4 | MET/VEGFR2/RET inhibitor | MET, RET | YES (ClinVar) |
| Atezolizumab | 4 | Anti-PD-L1 | CD274 | No |
| Bevacizumab | 4 | Anti-VEGF | VEGFA | No |
| Nivolumab | 4 | Anti-PD-1 | PDCD1 | No |
| Pembrolizumab | 4 | Anti-PD-1 | PDCD1 | No |
| Regorafenib | 4 | Multi-kinase | VEGFR, RET | YES (ClinVar) |
| Durvalumab | 4 | Anti-PD-L1 | CD274 | No |
| Tremelimumab | 4 | Anti-CTLA-4 | CTLA4 | No |
| Ramucirumab | 4 | Anti-VEGFR2 | KDR | No |
| Everolimus | 4 | mTOR inhibitor | MTOR | No |
| Palbociclib | 4 | CDK4/6 inhibitor | CDK14 (pan) | YES (GWAS) |
| Abemaciclib | 4 | CDK4/6 inhibitor | CDK14 (pan) | YES (GWAS) |
| Capmatinib | 4 | MET inhibitor | MET | YES (ClinVar) |
| Tepotinib | 4 | MET inhibitor | MET | YES (ClinVar) |
| Tivantinib | 3 | MET inhibitor | MET | YES (ClinVar) |
| Galunisertib | 2 | TGFβR1 inhibitor | TGFBR1 | No |
| Tegavivint | 2 | β-catenin inhibitor | CTNNB1 | YES (ClinVar) |
| Ipilimumab | 4 | Anti-CTLA-4 | CTLA4 | No |
| Erdafitinib | 4 | FGFR inhibitor | FGFR | No |
| Futibatinib | 4 | FGFR inhibitor | FGFR | No |
| Trametinib | 4 | MEK inhibitor | MAP2K1 | No |
| Sirolimus | 4 | mTOR inhibitor | MTOR | No |
| Temsirolimus | 4 | mTOR inhibitor | MTOR | No |
| Linifanib | 3 | VEGFR/PDGFR | KDR, PDGFR | No |
| Brivanib | 3 | VEGFR/FGFR | KDR, FGFR | No |
| C-188-9 | 2 | STAT3 inhibitor | STAT3 | Partial (STAT4) |
| Selinexor | 4 | XPO1 inhibitor | XPO1 | No |
| Tucidinostat | 3 | HDAC inhibitor | HDACs | No |
GWAS Gene Targeting Rate: ~8/30 (27%) of top trial drugs target GWAS/ClinVar genes — moderate genetic alignment.
Section 14: Pathway Analysis
TOP 30 Reactome Pathways Enriched in GWAS/ClinVar Genes
| # | Pathway | Reactome ID | GWAS/ClinVar Genes | Druggable Nodes |
|---|---|---|---|---|
| 1 | PI3K/AKT signaling | R-HSA-1257604 | PIK3CA, MET | PIK3CA, AKT, mTOR |
| 2 | MET activates PI3K/AKT | R-HSA-8851907 | MET, PIK3CA | MET, PIK3CA |
| 3 | MET Receptor Activation | R-HSA-6806942 | MET | MET |
| 4 | RAF/MAP kinase cascade | R-HSA-5673001 | PIK3CA, MET, RET | RAF, MEK, ERK |
| 5 | RET signaling | R-HSA-8853659 | RET, PIK3CA | RET |
| 6 | Wnt/β-catenin signaling | R-HSA-201681 | CTNNB1, TERT | CTNNB1, FZD, GSK3β |
| 7 | β-catenin destruction complex | R-HSA-195253 | CTNNB1, APC, AXIN1 | GSK3β |
| 8 | β-catenin transactivating complex | R-HSA-201722 | CTNNB1, TERT | CBP/β-catenin PPI |
| 9 | TP53 metabolic gene regulation | R-HSA-5628897 | TP53 | MDM2 |
| 10 | TP53 DNA repair regulation | R-HSA-6796648 | TP53 | MDM2 |
| 11 | TP53 cell cycle arrest | R-HSA-6804116 | TP53 | CDK4/6 |
| 12 | Apoptosis — caspase activation | R-HSA-140534 | CASP8 | CASP8 |
| 13 | FasL/CD95L signaling | R-HSA-75157 | CASP8 | CASP8, TRAIL |
| 14 | TRAIL signaling | R-HSA-75158 | CASP8 | DR4/5 |
| 15 | VEGFA-VEGFR2 pathway | R-HSA-4420097 | PIK3CA | VEGFR2, VEGFA |
| 16 | Telomere extension | R-HSA-171319 | TERT | TERT |
| 17 | Interleukin-12 signaling | R-HSA-9020591 | STAT4 | JAK2 |
| 18 | Nuclear receptor transcription | R-HSA-383280 | VDR | VDR |
| 19 | Vitamin D metabolism | R-HSA-196791 | VDR | CYP enzymes |
| 20 | Lipid remodeling (DAG/TAG) | R-HSA-1482883 | PNPLA3 | PNPLA3 |
| 21 | Constitutive PI3K in cancer | R-HSA-2219530 | PIK3CA, MET | PIK3CA |
| 22 | FGFR signaling in disease | R-HSA-5655253 | PIK3CA | FGFR, PIK3CA |
| 23 | APC truncation mutants | R-HSA-5467337 | APC, AXIN1 | — |
| 24 | CTNNB1 mutants | R-HSA-5358747 | CTNNB1, APC | β-catenin |
| 25 | Necroptosis regulation | R-HSA-5675482 | CASP8 | RIPK1 |
| 26 | NF-κB signaling | R-HSA-9758274 | CASP8, TP53 | IKK complex |
| 27 | Interleukin-23 signaling | R-HSA-9020933 | STAT4 | JAK2 |
| 28 | SCF-KIT signaling | R-HSA-1433557 | PIK3CA | KIT |
| 29 | Senescence | R-HSA-2559580 | TP53 | CDK4/6 |
| 30 | FLT3 signaling | R-HSA-9607240 | PIK3CA | FLT3 |
Pathway-level insight: Even when GWAS genes are undrugged (PNPLA3, STAT4), pathway members offer druggable entry points: JAK2 for STAT4 signaling, GSK3β for β-catenin pathway, mTOR for PI3K pathway.
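
The pathway-level reasoning can be sketched as a lookup: for a given gene, collect the druggable nodes of every pathway it belongs to, excluding the gene itself. The example below uses a handful of rows from the table above; the `entry_points` helper is hypothetical and only illustrates the idea.

```python
# (pathway, genes from the report's set, druggable nodes) for selected table rows.
PATHWAYS = [
    ("Interleukin-12 signaling", {"STAT4"}, {"JAK2"}),
    ("Interleukin-23 signaling", {"STAT4"}, {"JAK2"}),
    ("Wnt/β-catenin signaling", {"CTNNB1", "TERT"}, {"CTNNB1", "FZD", "GSK3β"}),
    ("β-catenin destruction complex", {"CTNNB1", "APC", "AXIN1"}, {"GSK3β"}),
    ("PI3K/AKT signaling", {"PIK3CA", "MET"}, {"PIK3CA", "AKT", "mTOR"}),
    ("Lipid remodeling (DAG/TAG)", {"PNPLA3"}, {"PNPLA3"}),
]


def entry_points(gene):
    """Druggable pathway members other than `gene` itself, across all its pathways."""
    nodes = set()
    for _, genes, druggable_nodes in PATHWAYS:
        if gene in genes:
            nodes |= druggable_nodes - {gene}
    return sorted(nodes)


for gene in ("STAT4", "APC", "CTNNB1", "PNPLA3"):
    print(gene, entry_points(gene))
```

In this subset PNPLA3 returns an empty list, since the only druggable node listed for its pathway is PNPLA3 itself. The table therefore supplies an alternative entry point for STAT4 (JAK2) but not for PNPLA3.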
Section 15: Drug Repurposing Opportunities
TOP 30 Repurposing Candidates
| # | Drug | GWAS/ClinVar Gene | Currently Approved For | Mechanism | GWAS p-value | Priority Score |
|---|---|---|---|---|---|---|
| 1 | Cabozantinib | MET, RET | HCC (second-line), RCC, thyroid | MET/RET inhibitor | ClinVar | ★★★★★ |
| 2 | Lenvatinib | RET | HCC (first-line), thyroid, endometrial | Multi-kinase (RET) | ClinVar | ★★★★★ |
| 3 | Alpelisib | PIK3CA | Breast cancer | PI3Kα inhibitor | ClinVar | ★★★★☆ |
| 4 | Palbociclib | CDK14 | Breast cancer | CDK4/6 inhibitor | 9e-10 | ★★★★☆ |
| 5 | Abemaciclib | CDK14 | Breast cancer | CDK4/6 inhibitor | 9e-10 | ★★★★☆ |
| 6 | Ribociclib | CDK14 | Breast cancer | CDK4/6 inhibitor | 9e-10 | ★★★★☆ |
| 7 | Selpercatinib | RET | NSCLC, thyroid | Selective RET inhibitor | ClinVar | ★★★★☆ |
| 8 | Pralsetinib | RET | NSCLC, thyroid | Selective RET inhibitor | ClinVar | ★★★★☆ |
| 9 | Capmatinib | MET | NSCLC | Selective MET inhibitor | ClinVar | ★★★★☆ |
| 10 | Tepotinib | MET | NSCLC | Selective MET inhibitor | ClinVar | ★★★★☆ |
| 11 | Ruxolitinib | STAT4 pathway (JAK2) | MPN | JAK1/2 inhibitor | 2e-10 | ★★★☆☆ |
| 12 | Baricitinib | STAT4 pathway (JAK2) | RA | JAK1/2 inhibitor | 2e-10 | ★★★☆☆ |
| 13 | Calcitriol | VDR | Rickets, psoriasis | VDR agonist | ClinVar | ★★★☆☆ |
| 14 | Itacitinib | STAT4 pathway (JAK1) | GvHD (trial) | JAK1 inhibitor | 2e-10 | ★★★☆☆ |
| 15 | Tegavivint | CTNNB1 | Investigational | β-catenin inhibitor | ClinVar | ★★★☆☆ |
| 16 | Emricasan | CASP8 | NASH (trial) | Pan-caspase inhibitor | ClinVar | ★★☆☆☆ |
| 17 | Celecoxib | COX-2 (pathway) | Pain/inflammation | COX-2 inhibitor | — | ★★☆☆☆ |
| 18 | Metformin | AMPK (pathway) | Diabetes | AMPK activator | — | ★★☆☆☆ |
| 19 | Statins | HMG-CoA (pathway) | Hyperlipidemia | HMG-CoA inhibitor | — | ★★☆☆☆ |
| 20 | Aspirin | COX (pathway) | Pain/thrombosis | COX inhibitor | — | ★★☆☆☆ |
| 21 | Vandetanib | RET | Thyroid cancer | RET/VEGFR/EGFR | ClinVar | ★★★☆☆ |
| 22 | Neratinib | ERBB pathway | Breast cancer | Pan-HER kinase | — | ★★☆☆☆ |
| 23 | Enzalutamide | AR pathway | Prostate cancer | AR antagonist | — | ★☆☆☆☆ |
| 24 | Alectinib | ALK pathway | NSCLC | ALK inhibitor | — | ★☆☆☆☆ |
| 25 | Erdafitinib | FGFR pathway | Bladder cancer | FGFR inhibitor | — | ★★☆☆☆ |
| 26 | Tucatinib | HER2 pathway | Breast cancer | HER2 inhibitor | — | ★☆☆☆☆ |
| 27 | Sonidegib | Hedgehog pathway | BCC | SMO inhibitor | — | ★☆☆☆☆ |
| 28 | Selinexor | XPO1 | Myeloma | XPO1 inhibitor | — | ★★☆☆☆ |
| 29 | Hydroxychloroquine | Autophagy | Malaria, RA | Autophagy inhibitor | — | ★☆☆☆☆ |
| 30 | Sitagliptin | DPP4 | Diabetes | DPP4 inhibitor | — | ★☆☆☆☆ |
Section 16: Druggability Pyramid
| Level | Description | Gene Count | % | Key Genes |
|---|---|---|---|---|
| 1 — VALIDATED | Approved drug FOR HCC | 5 | 10.6% | MET (cabozantinib), RET (lenvatinib/regorafenib), TERT (trials ongoing), HLA (IO agents) |
| 2 — REPURPOSING | Approved drug for OTHER disease | 8 | 17.0% | PIK3CA (alpelisib), CDK14 (palbociclib), VDR (calcitriol), RET (selpercatinib), CASP8, FH, APC, STAT4→JAK |
| 3 — EMERGING | Drug in clinical trials | 4 | 8.5% | CTNNB1 (tegavivint), TERT (imetelstat), STAT4 (itacitinib), TP53 (MDM2i) |
| 4 — TOOL COMPOUNDS | ChEMBL compounds, no trials | 5 | 10.6% | PRMT7, GRIK1, C2, AXIN1, CDKN2A |
| 5 — DRUGGABLE UNDRUGGED | Druggable family, NO compounds | 4 | 8.5% | PNPLA3 (enzyme), PRSS23 (protease), PGD (enzyme), GLB1 (enzyme) |
| 6 — HARD TARGETS | Difficult family or unknown | 21 | 44.7% | TM6SF2, KIF1B, TRIM31, MAU2, DLC1, WNT3A, VEPH1, SAMM50, OBSCN, DYSF, EFCAB11, etc. |
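
The pyramid levels can be read as a cascade: each gene is assigned the first level whose condition it satisfies. The sketch below is one way to express that, assuming five boolean evidence flags per gene. The flag names and the example values are illustrative assumptions chosen to match the table, not output from the underlying databases.

```python
from dataclasses import dataclass


@dataclass
class DrugEvidence:
    """Hypothetical per-gene evidence flags, ordered from strongest to weakest."""
    approved_for_hcc: bool = False
    approved_other: bool = False
    in_trials: bool = False
    has_compounds: bool = False
    druggable_family: bool = False


def pyramid_level(e):
    """Return the first (strongest) pyramid level whose condition holds."""
    if e.approved_for_hcc:
        return 1  # validated
    if e.approved_other:
        return 2  # repurposing
    if e.in_trials:
        return 3  # emerging
    if e.has_compounds:
        return 4  # tool compounds
    if e.druggable_family:
        return 5  # druggable but undrugged
    return 6      # hard target


# Example flags chosen to reproduce the table's placement of four genes.
examples = {
    "MET": DrugEvidence(approved_for_hcc=True, approved_other=True,
                        has_compounds=True, druggable_family=True),
    "PRMT7": DrugEvidence(has_compounds=True, druggable_family=True),
    "PNPLA3": DrugEvidence(druggable_family=True),
    "TM6SF2": DrugEvidence(),
}
levels = {gene: pyramid_level(e) for gene, e in examples.items()}
print(levels)
```

A cascade places each gene at exactly one level. The table deviates from this for RET and TERT, which each appear at two levels.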
Section 17: Undrugged Target Profiles
TOP 20 Undrugged Opportunities (ranked by potential)
| # | Gene | GWAS p-value | Variant | Protein Family | Structure | Expression | Drugged Interactors | Potential |
|---|---|---|---|---|---|---|---|---|
| 1 | PNPLA3 | 2.0e-19 | Missense I148M | Patatin-like phospholipase (enzyme) | AF only (71.7) | Liver-high | None direct | HIGH |
| 2 | TM6SF2 | 2.0e-12 | Missense E167K | Transmembrane protein | AF only (91.0) | Liver-high | PNPLA3 network | HIGH |
| 3 | PRMT7 | 6.0e-10 | Intronic | Arginine methyltransferase | AF (93.2 — excellent) | Ubiquitous | ChEMBL target exists | HIGH |
| 4 | STAT4 | 2.0e-10 | Intronic | STAT TF | AF (86.9) | Immune cells | JAK2 (ruxolitinib) | HIGH |
| 5 | CDK14 | 9.0e-10 | Intronic | Cyclin-dependent kinase | ChEMBL target | Ubiquitous | CDK family (palbociclib) | HIGH |
| 6 | IFNL3 | 8.0e-10 | Regulatory | Interferon lambda cytokine | AF (85.8) | Immune | IFN pathway (pegIFN) | MEDIUM |
| 7 | TRIM31 | 9.0e-13 | Intronic | E3 ubiquitin ligase | AF (79.6) | Ubiquitous | Ubiquitin system | MEDIUM |
| 8 | KIF1B | 2.0e-18 | Intronic | Kinesin motor | 1 NMR structure | Ubiquitous | Tubulin (taxanes) | MEDIUM |
| 9 | DLC1 | 3.0e-07 | Intronic | RhoGAP | AF only (56.6) | Ubiquitous | Rho GTPases | LOW |
| 10 | MAU2 | 3.0e-12 | Intronic | Cohesin loader | AF (91.9) | Ubiquitous | Cohesin complex | LOW |
| 11 | MICA | 4.0e-13 | UTR/regulatory | MHC class I-related | PDB (10 structures) | Epithelial | NKG2D (IO pathway) | MEDIUM |
| 12 | WNT3A | 1.0e-08 | Intergenic | Wnt ligand | Limited | Ubiquitous | FZD receptors | LOW |
| 13 | PRSS23 | 6.0e-06 | Intronic | Serine protease | No data | Limited | None | MEDIUM |
| 14 | PGD | — | Intronic | 6-PGD (enzyme) | Known fold | Ubiquitous | Pentose phosphate | MEDIUM |
| 15 | SAMM50 | 9.0e-07 | Intronic | Mitochondrial membrane | Limited | Liver | PNPLA3, TM6SF2 | LOW |
| 16 | GRIK1 | 5.0e-10 | Intronic | Ionotropic glutamate receptor | Yes | Brain (not liver) | Glutamate drugs | LOW |
| 17 | VEPH1 | 2.0e-06 | Intronic | PH domain protein | Limited | Ubiquitous | Unknown | LOW |
| 18 | PBX2 | 3.0e-08 | Intronic | Homeodomain TF | Limited | Ubiquitous | None | LOW |
| 19 | TLL1 | 3.0e-08 | Intronic | Metalloprotease | Related structures | Liver | BMP pathway | MEDIUM |
| 20 | GLB1 | 2.0e-07 | Intronic | Beta-galactosidase | Known | Ubiquitous | Lysosomal | LOW |
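
The report does not state how the "Potential" ranking was computed. As an illustration only, the sketch below combines three columns of the table into a single score: association strength, coding-variant status, and liver-high expression. The weights (0.5, 0.3, 0.2) and the cap at a -log10(p) of 20 are arbitrary assumptions, and the resulting order is not expected to reproduce the table's ranking.

```python
import math

# (gene, GWAS p-value, coding variant?, liver-high expression?) from the table.
TARGETS = [
    ("PNPLA3", 2.0e-19, True, True),
    ("TM6SF2", 2.0e-12, True, True),
    ("KIF1B", 2.0e-18, False, False),
    ("DLC1", 3.0e-07, False, False),
]


def priority_score(p_value, coding, liver_high, cap=20.0):
    """Hypothetical score in [0, 1]; the weights are illustrative, not the report's."""
    strength = min(-math.log10(p_value), cap) / cap
    return 0.5 * strength + 0.3 * coding + 0.2 * liver_high


ranked = sorted(TARGETS, key=lambda t: priority_score(*t[1:]), reverse=True)
for gene, p, coding, liver in ranked:
    print(f"{gene:8s} {priority_score(p, coding, liver):.3f}")
```

Under these weights KIF1B ranks third on p-value alone, whereas the table places it eighth, which suggests the report's ranking gave more weight to protein family and tractability than to association strength.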
Deep Dive: Top 5 Undrugged Targets
- PNPLA3 (Q9NST1) — DRUGGABILITY POTENTIAL: HIGH
- GWAS: p=2e-19, Tier 1 coding variant (I148M missense), replicated across 6+ studies
- Function: Lipase/acyltransferase in hepatic lipid droplets. I148M abolishes lipase activity, causing fat accumulation
- Family: Patatin-like phospholipase — enzyme with definable active site
- Structure: AlphaFold only (pLDDT 71.7); related PNPLA2 structures available
- Expression: Liver-enriched — favorable for organ-specific targeting
- Why undrugged: Novel target; biology only elucidated in last decade; no tool compounds
- Opportunity: Allele-specific degraders (PROTACs) targeting I148M protein, or activators restoring lipase function. Active academic interest.
- TM6SF2 (Q9BZW4) — DRUGGABILITY POTENTIAL: HIGH
- GWAS: p=2e-12, Tier 1 coding variant (E167K), replicated
- Function: Regulates hepatic lipid secretion via VLDL. E167K = loss of function
- Family: Transmembrane protein — challenging but not impossible
- Structure: AlphaFold (pLDDT 91.0 — high quality)
- Expression: Liver-enriched
- Why undrugged: Very novel; function only recently characterized
- Opportunity: Protein stabilizers or chaperones to rescue E167K folding. ASO/siRNA approaches.
- PRMT7 (Q9NVM4) — DRUGGABILITY POTENTIAL: HIGH
- GWAS: p=6e-10
- Function: Protein arginine methyltransferase; epigenetic regulation
- Family: SAM-dependent methyltransferase — highly druggable fold (cf. EZH2 inhibitors)
- Structure: AlphaFold (pLDDT 93.2 — excellent); ChEMBL target record exists
- Expression: Ubiquitous
- Why undrugged: Less studied than PRMT1/5; no selective inhibitors yet
- Opportunity: SAM-competitive inhibitors. PRMT5 inhibitors in oncology trials show path.
- STAT4 (Q14765) — DRUGGABILITY POTENTIAL: HIGH (indirect)
- GWAS: p=2e-10 (HBV-HCC)
- Function: Mediates IL-12 signaling, Th1 immune response, IFN-gamma production
- Family: STAT TF — directly difficult, but upstream JAK2 is druggable
- Interactions: JAK2 (ruxolitinib), IL-12Rβ2
- Why undrugged: STAT proteins lack deep pockets; PPI-focused
- Opportunity: JAK inhibitors (ruxolitinib, baricitinib) modulate STAT4 signaling indirectly. Also: STAT4-selective antisense approaches.
- CDK14 (O94921) — DRUGGABILITY POTENTIAL: HIGH
- GWAS: p=9e-10 (HBV-HCC)
- Function: Wnt pathway regulator; phosphorylates LRP6 during G2/M
- Family: CDK kinase — proven druggable (CDK4/6 inhibitors approved)
- Structure: ChEMBL target exists; CDK family structures available
- Interactions: CTNNB1 (score 353), Cyclin-Y (score 992)
- Why undrugged: Existing CDK drugs (palbociclib) have low CDK14 selectivity
- Opportunity: CDK14-selective inhibitors could block Wnt activation in HCC. Pan-CDK drugs already in HCC trials.
Section 18: Summary
GWAS LANDSCAPE
- Total associations: 93 | Unique studies: 27 | Unique genes: 47 combined (29 GWAS + 18 ClinVar-only in the gene table; TERT has both GWAS and ClinVar evidence)
- Coding variants: 2.5% (Tier 1) | Non-coding: 97.5%
- Dominant loci: HLA/MHC 6p21, PNPLA3/SAMM50 22q13, TM6SF2 19p13, KIF1B 1p36
GENETIC EVIDENCE
- Tier 1 (coding) genes: 2 (PNPLA3, TM6SF2)
- Mendelian overlap: 1 (TERT)
- Both GWAS + ClinVar: 1 (TERT)
DRUGGABILITY
- Overall druggable rate: 36.2% of genes in druggable protein families
- Approved for HCC: 10.6% | Approved other: 17.0% | In trials: 8.5%
- Opportunity gap (no drugs): 53.2%
PYRAMID SUMMARY
| Level | Count | % |
|---|---|---|
| L1 Validated | 5 | 10.6% |
| L2 Repurposing | 8 | 17.0% |
| L3 Emerging | 4 | 8.5% |
| L4 Tool compounds | 5 | 10.6% |
| L5 Druggable undrugged | 4 | 8.5% |
| L6 Hard targets | 21 | 44.7% |
CLINICAL TRIAL ALIGNMENT
- 27% of top trial drugs target GWAS/ClinVar genes — moderate alignment
- MET and RET are the best-validated genetically-informed drug targets currently in HCC trials
TOP 10 REPURPOSING CANDIDATES
| Drug | Gene | Approved For | p-value | Score |
|---|---|---|---|---|
| Cabozantinib | MET/RET | HCC (second-line)/RCC/thyroid | ClinVar | ★★★★★ |
| Alpelisib | PIK3CA | Breast cancer | ClinVar | ★★★★☆ |
| Palbociclib | CDK14 | Breast cancer | 9e-10 | ★★★★☆ |
| Selpercatinib | RET | NSCLC/thyroid | ClinVar | ★★★★☆ |
| Capmatinib | MET | NSCLC | ClinVar | ★★★★☆ |
| Ruxolitinib | STAT4→JAK2 | MPN | 2e-10 | ★★★☆☆ |
| Calcitriol | VDR | Rickets | ClinVar | ★★★☆☆ |
| Ribociclib | CDK14 | Breast cancer | 9e-10 | ★★★★☆ |
| Baricitinib | STAT4→JAK2 | RA | 2e-10 | ★★★☆☆ |
| Vandetanib | RET | Thyroid | ClinVar | ★★★☆☆ |
TOP 10 UNDRUGGED OPPORTUNITIES
| Gene | p-value | Family | Structure | Potential |
|---|---|---|---|---|
| PNPLA3 | 2e-19 | Phospholipase (enzyme) | AF (71.7) | HIGH |
| TM6SF2 | 2e-12 | Transmembrane | AF (91.0) | HIGH |
| PRMT7 | 6e-10 | Methyltransferase | AF (93.2) | HIGH |
| STAT4 | 2e-10 | STAT TF | AF (86.9) | HIGH |
| CDK14 | 9e-10 | CDK kinase | ChEMBL | HIGH |
| IFNL3 | 8e-10 | Interferon cytokine | AF (85.8) | MEDIUM |
| TRIM31 | 9e-13 | E3 ubiquitin ligase | AF (79.6) | MEDIUM |
| KIF1B | 2e-18 | Kinesin motor | 1 NMR | MEDIUM |
| TLL1 | 3e-08 | Metalloprotease | Related | MEDIUM |
| MICA | 4e-13 | MHC class I-related | PDB (10) | MEDIUM |
TOP 10 INDIRECT OPPORTUNITIES
| Undrugged Gene | Drugged Interactor | Available Drug |
|---|---|---|
| STAT4 | JAK2 | Ruxolitinib, Baricitinib |
| CDK14 | CDK4/6 | Palbociclib, Abemaciclib |
| KIF1B | Tubulin/microtubule | Paclitaxel, Docetaxel |
| DLC1 | RhoA/ROCK | Fasudil (ROCK inhibitor) |
| PNPLA3 | Lipid metabolism pathway | Statins (indirect) |
| TM6SF2 | ApoB/lipoprotein pathway | PCSK9 inhibitors (indirect) |
| TERT | TP53 (interaction score 887) | MDM2 inhibitors |
| CTNNB1 | TCF4/LEF PPI | Tegavivint (Phase 2) |
| WNT3A | FZD receptors | Wnt pathway inhibitors |
| TRIM31 | Ubiquitin-proteasome | Bortezomib (proteasome) |
KEY INSIGHTS
1. HCC has a distinctive genetic architecture dominated by two pathogenic axes: (a) HLA/immune loci reflecting viral etiology (HBV/HCV) and (b) metabolic susceptibility genes (PNPLA3, TM6SF2) reflecting steatotic liver disease etiology.
2. PNPLA3 I148M is the strongest and most actionable undrugged target: a coding variant with p=2e-19, liver-enriched expression, and enzymatic function. It is the highest-ranked drug development opportunity in this analysis.
3. The Wnt/β-catenin pathway is the most genetically loaded druggable pathway, with CTNNB1, AXIN1, APC (ClinVar somatic), CDK14 (GWAS), WNT3A (GWAS), and TERT (both) all converging. Tegavivint (β-catenin inhibitor) is in Phase 2.
4. TERT is the only gene with both GWAS germline and ClinVar somatic evidence, bridging common and rare variant architectures. Telomerase inhibitors (imetelstat) are in development.
5. Immune checkpoint therapies (PD-1, PD-L1, CTLA-4) dominate HCC trials but do NOT target GWAS genes; the genetic evidence points more toward kinases (MET, RET, CDK14, PIK3CA) and metabolic targets (PNPLA3, TM6SF2).
6. The JAK-STAT pathway offers an underexplored repurposing opportunity: STAT4 GWAS evidence (p=2e-10) combined with approved JAK inhibitors gives a rationale for JAK inhibitor trials specifically in HBV-related HCC.
7. Compared to other cancers, HCC has relatively few coding GWAS variants (2.5% vs typical ~5-10%), reflecting that most HCC genetic risk operates through liver disease susceptibility (metabolic, immune) rather than direct oncogenic mechanisms.
Analysis generated using biobtree MCP tools querying 70+ biological databases. All identifiers cross-referenced across MONDO, EFO, MeSH, OMIM, Orphanet, GWAS Catalog, dbSNP, HGNC, UniProt, InterPro, Reactome, ChEMBL, PharmGKB, STRING, PDB, AlphaFold, Bgee, and CellxGene.