Hepatocellular Carcinoma: GWAS to Drug Target Druggability Analysis

Perform a comprehensive GWAS-to-drug-target druggability analysis for Hepatocellular Carcinoma. Trace genetic associations through variants, genes, …

Perform a comprehensive GWAS-to-drug-target druggability analysis for Hepatocellular Carcinoma. Trace genetic associations through variants, genes, and proteins to identify druggable targets and repurposing opportunities. Do NOT read any existing files in this directory. Do NOT use any claude.ai MCP tools (ChEMBL etc). Use ONLY the biobtree MCP tools and your own reasoning to generate the analysis here in the terminal. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 1: DISEASE IDENTIFIERS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Find all database identifiers for Hepatocellular Carcinoma: MONDO, EFO, OMIM, Orphanet, MeSH ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 2: GWAS LANDSCAPE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map disease to GWAS associations: - Total associations and unique studies - TOP 50 associations: rsID, p-value, gene, risk allele, odds ratio ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 3: VARIANT DETAILS (dbSNP) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ For TOP 50 GWAS variants, get dbSNP details: - rsID, chromosome, position, alleles - Minor allele frequency (global/population) - Functional consequence (missense, intronic, regulatory, etc.) Classify by genetic evidence strength: - Tier 1: Coding variants (missense, frameshift, nonsense) - Tier 2: Splice/UTR variants - Tier 3: Regulatory variants - Tier 4: Intronic/intergenic Summary: counts by tier, MAF distribution, consequence distribution ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 4: MENDELIAN DISEASE OVERLAP ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Find GWAS genes that also cause Mendelian forms of the disease (OMIM, Orphanet). Genes with BOTH GWAS + Mendelian evidence = highest confidence targets. List: Gene, GWAS p-value, Mendelian disease, inheritance pattern ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 5: GWAS GENES TO PROTEINS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map GWAS genes to proteins: - Total unique genes and protein products TOP 50 genes: symbol, HGNC ID, UniProt, protein name/function, genetic evidence tier, Mendelian overlap (Y/N) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 6: PROTEIN FAMILY CLASSIFICATION ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Classify GWAS proteins by druggable families (InterPro): - Druggable: Kinases, GPCRs, Ion channels, Nuclear receptors, Proteases, Phosphatases, Transporters, Enzymes - Difficult: Transcription factors, Scaffold proteins, PPI hubs Summary: count per family, druggable vs difficult vs unknown Table: Gene | UniProt | Protein Family | Druggable? | Notes ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 7: EXPRESSION CONTEXT ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check tissue and single-cell expression for GWAS genes. Identify disease-relevant tissues/cell types for Hepatocellular Carcinoma. Analysis: - Which tissues/cell types highly express GWAS genes? - Tissue/cell specificity (targets with specific expression = fewer side effects) - Any GWAS genes NOT expressed in relevant tissue? (lower confidence) Table TOP 30: Gene | Tissues | Cell Types | Specificity ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 8: PROTEIN INTERACTIONS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map protein interactions among GWAS genes (STRING, BioGRID, IntAct). Analysis: - Do GWAS genes interact with each other? (pathway clustering) - Hub genes with many interactions - UNDRUGGED GWAS genes that interact with DRUGGED genes (indirect druggability) Table: Undrugged Gene | Interacts With | Drugged Interactor | Drugs Available ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 9: STRUCTURAL DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check structure availability for GWAS proteins (PDB, AlphaFold). Structure availability affects druggability. Summary: count with PDB / AlphaFold only / no structure For UNDRUGGED targets: Gene | PDB? | AlphaFold? | Quality ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 10: DRUG TARGET ANALYSIS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check which GWAS proteins are drug targets (ChEMBL, Guide to Pharmacology). Summary: - Total GWAS genes - With approved drugs (Phase 4): count (%) - With Phase 3/2/1 drugs: counts - With preclinical compounds only: count - With NO drug development: count (OPPORTUNITY GAP) For genes with APPROVED drugs: Gene | Protein | Drug names | Mechanism | Approved for this disease? (Y/N) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 11: BIOACTIVITY & ENZYME DATA ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check bioactivity data for GWAS proteins (PubChem, BRENDA for enzymes). TOP 30 most-studied proteins: - Bioactivity assay count, active compounds - Compounds not in ChEMBL? (additional opportunities) For enzyme GWAS genes (BRENDA): - Kinetic parameters, known inhibitors - Enzyme druggability assessment For UNDRUGGED genes: any bioactivity data as starting points? ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 12: PHARMACOGENOMICS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Check PharmGKB for GWAS genes: - Known drug-gene interactions (efficacy, toxicity, dosing) - Clinical annotations and guidelines - Implications for drug repurposing Table: Gene | PharmGKB Level | Drug Interactions | Clinical Annotations ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 13: CLINICAL TRIALS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Get clinical trials for Hepatocellular Carcinoma: - Total trials, breakdown by phase TOP 30 drugs in trials: Drug | Phase | Mechanism | Target gene | Targets GWAS gene? (Y/N) Calculate: % of trial drugs targeting GWAS genes (High = field using genetic evidence; Low = disconnect) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 14: PATHWAY ANALYSIS ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Map GWAS genes to pathways (Reactome). TOP 30 pathways: Name | ID | GWAS genes in pathway | Druggable nodes Pathway-level druggability: even if GWAS gene undrugged, pathway members may be druggable entry points. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 15: DRUG REPURPOSING OPPORTUNITIES ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Identify drugs approved for OTHER diseases that target GWAS genes. Prioritize by: 1. Genetic evidence (Tier 1-4) 2. Mendelian overlap 3. Druggable protein family 4. Expression in disease tissue 5. Known safety profile TOP 30 repurposing candidates: Drug | Gene | Approved for | Mechanism | GWAS p-value | Priority score ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 16: DRUGGABILITY PYRAMID ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Stratify ALL GWAS genes into 6 levels. Present as a TABLE (no ASCII art): Table columns: Level | Description | Gene Count | Percentage | Key Genes Level definitions: - Level 1 - VALIDATED: Approved drug FOR THIS disease - Level 2 - REPURPOSING: Approved drug for OTHER disease - Level 3 - EMERGING: Drug in clinical trials - Level 4 - TOOL COMPOUNDS: ChEMBL compounds but no trials - Level 5 - DRUGGABLE UNDRUGGED: Druggable family but NO compounds (HIGH OPPORTUNITY) - Level 6 - HARD TARGETS: Difficult family or unknown function ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 17: UNDRUGGED TARGET PROFILES ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Deep dive on high-value undrugged targets (strong GWAS evidence, no drugs). Criteria: GWAS p<1e-10, OR Mendelian overlap, OR coding variant For each, full profile: - Gene, GWAS p-value, variant type - Protein function, family (druggable?) - Structure availability - Tissue/cell expression - Protein interactions (drugged interactors?) - Why undrugged? (novel, difficult, unknown) - Druggability potential: HIGH / MEDIUM / LOW TOP 30 undrugged opportunities ranked by potential ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SECTION 18: SUMMARY ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ GWAS LANDSCAPE: - Total associations / studies / genes - Coding vs non-coding variants % GENETIC EVIDENCE: - Tier 1 genes / Mendelian overlap / Both DRUGGABILITY: - Overall rate: X% have drug targets - Approved: X% | Trials: X% | Opportunity gap: X% PYRAMID SUMMARY: Level 1-6 counts and percentages CLINICAL TRIAL ALIGNMENT: - X% of trial drugs target GWAS genes TOP 10 REPURPOSING CANDIDATES: Drug → Gene | Approved for | p-value | Score TOP 10 UNDRUGGED OPPORTUNITIES: Gene | p-value | Family | Structure | Potential TOP 10 INDIRECT OPPORTUNITIES: Undrugged Gene ↔ Drugged Interactor | Drug KEY INSIGHTS: - Notable findings for paper - Comparison points with other diseases ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ FORMATTING ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - Use tables for all lists - Include counts AND percentages - Include database IDs throughout - Note if data unavailable for any section

Structured Data Sources

This answer was generated using Claude Opus 4.6 + BioBTree MCP, querying 25 biological databases.

Further analyze this answer or run your own queries with BioBTree MCP.

Datasets: alphafold, bgee, cellxgene_celltype, chembl_molecule, chembl_target, clinical_trials, clinvar, dbsnp, efo, ensembl, gencc, gwas, hgnc, interpro, mesh, mim, mondo, orphanet, pdb, pharmgkb_clinical, pharmgkb_gene, reactome, string, string_interaction, uniprot
Generated: 2026-04-11 — For the latest data, query BioBTree directly via MCP or API.
View API calls (46)
Hepatocellular Carcinoma

HEPATOCELLULAR CARCINOMA (HCC): GWAS-TO-DRUG-TARGET DRUGGABILITY ANALYSIS


Section 1: Disease Identifiers

DatabaseIdentifierNameCross-references
MONDOMONDO:0007256Hepatocellular carcinoma3,693
EFOEFO:0000182Hepatocellular carcinoma3,612
OMIM114550Hepatocellular carcinoma
Orphanet88673Hepatocellular carcinomaClinical group
MeSHD006528Carcinoma, Hepatocellular10,157
HPOHP:0001402Hepatocellular carcinoma699
SynonymsMeSH Scope Note
HCC, hepatoma, liver cell carcinoma, liver cancer, hepatocellular adenocarcinoma, hepatoblastoma
"A primary malignant neoplasm of epithelial liver cells. Ranges from well-differentiated tumor with cells indistinguishable from normal hepatocytes to poorly differentiated neoplasm."

Section 2: Gwas Landscape

Total GWAS Associations: 93 (HCC-specific from EFO:0000182) Unique Studies: 27 (including HBV-HCC, HCV-HCC, alcohol-related HCC, NASH-HCC subtypes)

TOP 50 GWAS Associations (ranked by p-value)

#rsIDp-valueMapped Gene(s)ChrDisease SubtypeStudy
1RS92755723.0e-43HLA-DQB1 - MTCO3P16HCC in HCV infectionGCST006037
2RS92721055.0e-22HLA-DQA16HCCGCST001603
3RS7384092.0e-19PNPLA322HCC in alcohol cirrhosisGCST90134423
4RS92744074.0e-19HLA-DQB16HCCGCST90651062
5RS174019662.0e-18KIF1B1HCCGCST000752
6RS92753193.0e-17HLA-DQB1 - MTCO3P16HCC in HBVGCST001775
7RS25965424.0e-13MICA-AS16HCCGCST001041
8RS30941379.0e-13TRIM31, TRIM31-AS16HCC in HBVGCST005746
9RS585429262.0e-12TM6SF219HCCGCST90651047
10RS584898063.0e-12MAU219HCC in alcohol cirrhosisGCST90134423
11RS25239616.0e-11MICD, POLR1HASP6HCC in HBVGCST005746
12RS7384082.0e-11PNPLA322HCCGCST90651047
13RS75748652.0e-10STAT42HCC in HBVGCST001775
14RS11104463.0e-10TRIM31, TRIM31-AS16HCC in HBVGCST005746
15RS7384092.0e-10PNPLA322HCCGCST90308757
16RS736139626.0e-10PRMT716HCCGCST90104489
17RS585429266.0e-10TM6SF219Alcohol-related HCCGCST90092003
18RS4558045.0e-10GRIK121HCCGCST001603
19RS81070308.0e-10IFNL3 - IFNL419HCCGCST90013702
20RS102728599.0e-10CDK147HBV-related HCCGCST005570
21RS22951196.0e-10MICD, POLR1HASP6HCC in HBVGCST005746
22RS92721055.0e-09MTCO3P1 - HLA-DQB36HCCGCST001041
23RS22426526.0e-09TERT5HCC in alcohol cirrhosisGCST90134423
24RS1477597203.0e-09PNPLA322HCCGCST90651047
25RS7081131.0e-08SEPTIN14P17 - WNT3A1Alcohol-related HCCGCST90092003
26RS2049933.0e-08PBX26HCCGCST90308757
27RS750297495.0e-08OBSCN - TRIM111HCCGCST90308757
28RS28960192.0e-08PNPLA322NASH-derived HCCGCST005309
29RS584898062.0e-08MAU219HCCGCST90651047
30RS170472003.0e-08TLL14HCC post-HCV eradicationGCST004160
31RS28567235.0e-07HLA-DPA26HCC in HBVGCST005746
32RS28567236.0e-07HLA-DPA26HCC in HBVGCST005746
33RS772364343.0e-07DLC18Familial HBV-HCCGCST004736
34RS46786802.0e-07CCR4 - GLB13HCCGCST000902
35RS92721052.0e-07HLA-DQA16HCCGCST001603
36RS21435719.0e-07SAMM5022NASH-derived HCCGCST005309
37RS21435719.0e-07PNPLA322Alcohol-related HCCGCST90092003
38RS28567234.0e-07HCG186HCC in HBVGCST005746
39RS170074175.0e-07DYSF - RPS20P102NASH-derived HCCGCST005309
40RS121005614.0e-06EFCAB1114HCCGCST000902
41RS102728596.0e-06KIF1B1HBV-related HCCGCST005570
42RS21202432.0e-06VEPH13HCC in HBVGCST003189
43RS92676732.0e-06C26HCCGCST000902
44RS13501716.0e-06PRSS23, PRSS23-AS111HCC in HBVGCST003189
45RS1174136656.0e-06AGA-DT4Familial HBV-HCCGCST004736
46RS355230627.0e-06LINC02360 - LINC022704Familial HBV-HCCGCST004736
47RS49323977.0e-06MRPS11 - DET115Familial HBV-HCCGCST004736
48RS81057906.0e-06IFNL3P1 - IFNL319HCCGCST90651062
49(remaining non-HCC cancer associations excluded)
50

Key Observations:

  • HLA/MHC region dominance: 6p21 (HLA-DQB1, HLA-DQA1, MICA, TRIM31) is the most replicated locus
  • Metabolic susceptibility: PNPLA3 (chr22) and TM6SF2 (chr19) repeatedly associated across alcohol, NASH, and general HCC
  • Etiology-specific signals: HBV-HCC vs HCV-HCC vs alcohol-HCC show distinct genetic architectures

Section 3: Variant Details (Dbsnp)

Total variants mapped: 79 unique rsIDs

Variant Classification by Functional Consequence

rsIDChrPositionRef/AltKey GeneConsequence
RS7384092243928847C/GPNPLA3Missense (I148M)
RS585429261919268740C/TTM6SF2Missense (E167K)
RS2596542631398818C/TMICA5'UTR/regulatory
RS75748652191099907T/GSTAT4Intronic
RS224265251279913G/ATERTIntronic/regulatory
RS17401966110325413A/GKIF1BIntronic
RS10272859790689160G/ACDK14Intronic
RS736139621668328071T/CPRMT7Intronic
RS81070301939246079A/GIFNL3Regulatory
RS4558042129773850A/GGRIK1Intronic

Tier Classification

TierDescriptionCount%Key Variants
Tier 1Coding (missense)22.5%RS738409 (PNPLA3 I148M), RS58542926 (TM6SF2 E167K)
Tier 2Splice/UTR33.8%RS2596542 (MICA), RS8107030 (IFNL3 region)
Tier 3Regulatory810.1%HLA region variants, TERT promoter
Tier 4Intronic/intergenic6683.5%Most HLA, KIF1B, STAT4, CDK14 variants

MAF Distribution: Most variants are common (MAF >5%), consistent with complex disease architecture.

Notable coding variants:

  • RS738409 (PNPLA3 I148M): Best-validated HCC susceptibility variant. Impairs triglyceride hydrolysis, causing hepatic fat accumulation. Global MAF ~23%.
  • RS58542926 (TM6SF2 E167K): Loss-of-function variant affecting hepatic lipid secretion. MAF ~7%.

Section 4: Mendelian Disease Overlap

ClinVar genes linked to MONDO:0007256: 19 genes

GWAS-Mendelian Overlap Analysis

GeneGWAS SignalBest p-valueClinVar/Mendelian EvidenceInheritanceOverlap
TERTYes6.0e-09ClinVar: HCC somatic mutationsSomatic/ADYES
PNPLA3Yes (top)2.0e-19Not in ClinVar for HCCPartial
PIK3CANo GWASClinVar: somatic HCC mutationsSomaticNo
TP53No GWASClinVar: Li-Fraumeni (AD) + HCC somaticAD/SomaticNo
APCNo GWASClinVar: FAP (AD), hepatoblastoma riskADNo
CTNNB1No GWASClinVar: HCC somatic mutationsSomaticNo
METNo GWASClinVar: hereditary papillary RCC + HCCADNo
AXIN1No GWASClinVar: HCC somatic mutationsSomaticNo
CDKN2ANo GWASClinVar: familial melanoma/cancerADNo
RETNo GWASClinVar: MEN2, HirschsprungADNo
VDRNo GWASClinVar: Rickets type IIAARNo
CASP8No GWASClinVar: autoimmune lymphoproliferativeADNo
PMS2No GWASClinVar: Lynch syndromeADNo
FHNo GWASClinVar: fumarase deficiencyAR/ADNo
RAD50No GWASClinVar: Nijmegen breakage-likeARNo
NBNNo GWASClinVar: Nijmegen breakage syndromeARNo
IGF2RNo GWASClinVar: HCCNo
SETNo GWASClinVar: HCCNo
PDGFRLNo GWASClinVar: HCCNo

Key Finding: Only TERT has both GWAS and Mendelian/ClinVar evidence for HCC, making it the highest-confidence genetic target. Most ClinVar HCC genes harbor somatic rather than germline mutations.


Section 5: Gwas Genes To Proteins

Total unique GWAS genes (protein-coding): 34 Total unique ClinVar genes: 19 Combined unique genes: 48 (5 shared context: TERT)

TOP 50 Genes with Protein Products

#GeneHGNC IDUniProtProtein NameEvidenceMendelian
1PNPLA3HGNC:18590Q9NST11-acylglycerol-3-phosphate O-acyltransferaseTier 1 (I148M)N
2TM6SF2HGNC:11861Q9BZW4Transmembrane 6 superfamily member 2Tier 1 (E167K)N
3HLA-DQB1HGNC:4944P01920MHC class II, DQ beta 1Tier 4N
4HLA-DQA1HGNC:4942P01909MHC class II, DQ alpha 1Tier 4N
5KIF1BHGNC:16636O60333Kinesin-like protein KIF1BTier 4N
6MICAHGNC:7090Q29983MHC class I polypeptide-related sequence ATier 2N
7STAT4HGNC:11365Q14765STAT4 transcription factorTier 4N
8TRIM31HGNC:16289Q9BZY9E3 ubiquitin-protein ligase TRIM31Tier 4N
9CDK14HGNC:8883O94921Cyclin-dependent kinase 14Tier 4N
10TERTHGNC:11730O14746Telomerase reverse transcriptaseTier 3Y
11IFNL3HGNC:18365Q8IZI9Interferon lambda-3Tier 3N
12MAU2HGNC:29140Q9Y6X3MAU2 chromatid cohesion factorTier 4N
13PRMT7HGNC:25557Q9NVM4Protein arginine N-methyltransferase 7Tier 4N
14DLC1HGNC:2897Q96QB1Rho GTPase-activating protein 7Tier 4N
15TLL1HGNC:11843O43897Tolloid-like protein 1Tier 4N
16GRIK1HGNC:4579P39086Glutamate receptor ionotropic kainate 1Tier 4N
17C2HGNC:1248P06681Complement C2Tier 4N
18SAMM50HGNC:24276Q9Y512SAMM50 sorting/assembly componentTier 4N
19WNT3AHGNC:15983P56704Wnt family member 3ATier 4N
20PBX2HGNC:8633P40424PBX homeobox 2Tier 4N
21VEPH1HGNC:25735Q14D04Ventricular zone expressed PH domain 1Tier 4N
22PRSS23HGNC:14370Q9BRK3Serine protease 23Tier 4N
23EFCAB11HGNC:20357Q9BUY7EF-hand calcium binding domain 11Tier 4N
24GLB1HGNC:4298P16278Beta-galactosidaseTier 4N
25DYSFHGNC:3097O75923DysferlinTier 4N
26UBE4BHGNC:12500O95155Ubiquitination factor E4BTier 4N
27PGDHGNC:8891P522096-phosphogluconate dehydrogenaseTier 4N
28HLA-DRB1HGNC:4948P01911MHC class II, DR beta 1Tier 4N
29OBSCNHGNC:15719Q5VST9ObscurinTier 4N
ClinVar-only genes below
30PIK3CAHGNC:8975P42336PI3K catalytic subunit alphaClinVarN
31TP53HGNC:11998P04637Tumor protein p53ClinVarN
32CTNNB1HGNC:2514P35222Catenin beta-1ClinVarN
33METHGNC:7029P08581Hepatocyte growth factor receptorClinVarN
34APCHGNC:583P25054Adenomatous polyposis coli proteinClinVarN
35AXIN1HGNC:903O15169Axin-1ClinVarN
36RETHGNC:9967P07949Proto-oncogene tyrosine-protein kinase RetClinVarN
37VDRHGNC:12679P11473Vitamin D3 receptorClinVarN
38CASP8HGNC:1509Q14790Caspase-8ClinVarN
39FHHGNC:3700P07954Fumarate hydrataseClinVarN
40CDKN2AHGNC:1787P42771CDK inhibitor 2A (p16)ClinVarN
41PMS2HGNC:9122P54278Mismatch repair endonuclease PMS2ClinVarN
42RAD50HGNC:9816Q92878DNA repair protein RAD50ClinVarN
43NBNHGNC:7652O60934NibrinClinVarN
44IGF2RHGNC:5467P11717Cation-independent M6P receptorClinVarN
45SETHGNC:10760Q01105Protein SETClinVarN
46SLTMHGNC:20709Q9NWH9SAFB-like transcription modulatorClinVarN
47PDGFRLHGNC:8805Q15198PDGF receptor-like proteinClinVarN

Section 6: Protein Family Classification

GeneUniProtProtein Family (InterPro)CategoryDruggable?
PIK3CAP42336PI3/PI4 kinaseKinaseYES
METP08581Receptor tyrosine kinaseKinaseYES
RETP07949Receptor tyrosine kinaseKinaseYES
CDK14O94921Cyclin-dependent kinaseKinaseYES
VDRP11473Nuclear hormone receptorNuclear receptorYES
FZD4Q9ULV1Frizzled/GPCR-like 7TMGPCRYES
GRIK1P39086Ionotropic glutamate receptorIon channelYES
C2P06681Serine protease (trypsin-like)ProteaseYES
CASP8Q14790Cysteine protease (caspase)ProteaseYES
PRSS23Q9BRK3Serine proteaseProteaseYES
PNPLA3Q9NST1Patatin-like phospholipaseEnzymeYES
PRMT7Q9NVM4Protein arginine methyltransferaseEnzymeYES
PGDP522096-phosphogluconate dehydrogenaseEnzymeYES
FHP07954Fumarate hydratase (lyase)EnzymeYES
TERTO14746Reverse transcriptaseEnzymeYES
GLB1P16278Beta-galactosidaseEnzymeModerate
STAT4Q14765STAT transcription factorTFDifficult
TP53P04637p53 tumor suppressorTFDifficult
PBX2P40424Homeodomain TFTFDifficult
CTNNB1P35222Beta-catenin (Armadillo)Scaffold/PPIDifficult
APCP25054APC tumor suppressorScaffoldDifficult
AXIN1O15169Axin (RGS domain)ScaffoldDifficult
TRIM31Q9BZY9E3 ubiquitin ligase (TRIM)EnzymeModerate
DLC1Q96QB1RhoGAPSignalingDifficult
KIF1BO60333Kinesin motor proteinTransportDifficult
TM6SF2Q9BZW4Transmembrane proteinUnknown functionDifficult
MICAQ29983MHC class I-relatedImmuneModerate
IFNL3Q8IZI9Interferon lambda cytokineCytokineModerate
WNT3AP56704Wnt signaling ligandSignalingDifficult
MAU2Q9Y6X3Cohesin loading factorChromatinDifficult
CDKN2AP42771CDK inhibitor (ankyrin repeat)Tumor suppressorDifficult

Summary

CategoryCount%Key Genes
Kinases48.5%PIK3CA, MET, RET, CDK14
GPCRs12.1%FZD4
Ion channels12.1%GRIK1
Nuclear receptors12.1%VDR
Proteases36.4%C2, CASP8, PRSS23
Enzymes (other)714.9%PNPLA3, PRMT7, PGD, FH, TERT, GLB1, TRIM31
Druggable subtotal1736.2%
Transcription factors36.4%STAT4, TP53, PBX2
Scaffold/PPI36.4%CTNNB1, APC, AXIN1
Signaling/other difficult612.8%DLC1, KIF1B, WNT3A, CDKN2A, etc.
Unknown/novel1838.3%TM6SF2, MAU2, VEPH1, etc.
Difficult/unknown subtotal3063.8%

Section 7: Expression Context

Disease-relevant tissues: Liver (hepatocytes), tumor microenvironment (immune cells, fibroblasts)

CellxGene cell types in HCC datasets:

Cell TypeTotal Cells
Fibroblast6,848,407
Macrophage3,299,562
T cell2,097,084
B cell2,077,991
Malignant cell1,942,920
Hematopoietic progenitor436,110
Unknown2,984,441

TOP 30 Gene Expression Profiles (Bgee)

GeneExpression BreadthMax ScoreKey TissuesDisease Relevance
KIF1BUbiquitous99.36Brain, liver, kidneyExpressed in liver
HLA-DQB1Ubiquitous99.09Immune cells, spleenImmune microenvironment
HLA-DQA1Ubiquitous98.73Immune cells, thymusImmune microenvironment
CDK14Ubiquitous98.24Brain, testis, liverExpressed in liver
TM6SF2Ubiquitous97.90Liver (high), intestineHighly relevant
PNPLA3Ubiquitous96.79Liver (high), adiposeHighly relevant
STAT4Ubiquitous96.60Immune cells, spleenImmune microenvironment
MICAUbiquitous94.74Epithelial, liverRelevant
TERTLow/absent normalCancer cells, stem cellsTumor-specific
PIK3CAUbiquitousAll tissuesBroad
TP53UbiquitousAll tissuesBroad
METHigh liverLiver, kidney, placentaHighly relevant
CTNNB1UbiquitousAll tissuesBroad

Key Expression Insights:

  • PNPLA3 and TM6SF2 are highly liver-enriched — ideal for liver-targeted therapy with fewer off-target effects
  • TERT is tumor-specific (absent in normal hepatocytes) — excellent therapeutic window
  • MET has high liver expression — directly relevant to HCC biology
  • HLA genes are expressed in immune cells — relevant to tumor immune microenvironment
  • GRIK1 is primarily brain-expressed — lower confidence as liver cancer target

Section 8: Protein Interactions

Key GWAS Protein Interaction Networks

PNPLA3 (Q9NST1) — Interaction count: 1,420 (STRING)

  • Top interactors: TM6SF2 (score 916), SAMM50 (score 911), IFNL3 (score 581)
  • Notable: PNPLA3, TM6SF2, and SAMM50 are ALL GWAS hits and interact — strong pathway clustering in lipid metabolism

TM6SF2 (Q9BZW4) — Interaction count: ~600+

  • Top interactors: PNPLA3 (score 916), SAMM50 (529)
  • Lipid metabolism cluster confirmed

CDK14 (O94921) — Interaction count: 2,984

  • Top interactor: CCNY/Cyclin-Y (score 992) — its regulatory cyclin
  • Interacts with CTNNB1 (score 353) — connects to Wnt pathway
  • Interacts with CDK1/CDK2 family members

MET (P08581) — Interaction count: 5,090

  • Top interactors: HGF (score 999), CD44 (991), EGFR (979), STAT3 (948), CTNNB1 (793), TP53 (769), PIK3CA (implied via pathway)
  • Major hub connecting GWAS genes to ClinVar genes

TERT (O14746) — Interaction count: 5,450

  • Top interactors: DKC1 (999), TERC (998), NHP2 (997), HSP90 (988), CTNNB1 (938), TP53 (887), PIK3CA (669)
  • Hub connecting telomere biology to Wnt and p53 pathways

STAT4 (Q14765) — Interaction count: 2,286

  • Top interactors: IL12RB2 (923), IFNG (917), IL12RB1 (901), JAK2 (888)
  • Central to IL-12/IFN-gamma immune signaling

Undrugged GWAS Genes with Drugged Interactors

Undrugged GeneInteracts WithDrugged InteractorAvailable Drugs
PNPLA3TM6SF2, SAMM50, IFNL3None directly (metabolic cluster)
TM6SF2PNPLA3, lipoproteinsApoB pathway targetsStatins (indirect)
STAT4JAK2, IL12RBJAK2 → Ruxolitinib, BaricitinibJAK inhibitors
CDK14CTNNB1, Cyclin-YCDK family → Palbociclib, AbemaciclibCDK inhibitors
TRIM31Immune signalingUbiquitin pathwayProteasome inhibitors
MAU2Cohesin complexNIPBL, SMC proteins
KIF1BMotor protein complexTubulin → Paclitaxel, DocetaxelMicrotubule agents
DLC1RhoA, Cdc42Rho pathwayROCK inhibitors

Section 9: Structural Data

PDB Structure Availability

ProteinUniProtPDB StructuresQualityNotes
PIK3CAP42336100+ structuresHigh res (2.2-3.3Å)Extensive drug co-crystals
METP0858112+ structuresGoodKinase domain + inhibitor complexes
RETP07949MultipleGoodKinase domain structures
TP53P04637143+ structuresExcellentDNA-binding + MDM2 complexes
CTNNB1P35222MultipleGoodArmadillo repeat structures
CASP8Q14790MultipleGoodActive site structures
VDRP11473MultipleGoodLigand-binding domain
MICAQ2998310 structuresGood (2.2-3.8Å)NKG2D complexes
KIF1BO603331 (NMR)LowFHA domain only
C2P06681MultipleGoodComplement cascade

Undrugged Targets — Structure Assessment

GeneUniProtPDB?AlphaFold?AF Quality (pLDDT)Assessment
PNPLA3Q9NST1NoYes71.7 (moderate)Structure-based design challenging
TM6SF2Q9BZW4NoYes91.0 (high)Good AF model available
STAT4Q14765NoYes86.9 (good)Related STAT structures usable
TRIM31Q9BZY9NoYes79.6 (moderate)RING domain modelable
DLC1Q96QB1NoYes56.6 (low)Disordered regions
PRMT7Q9NVM4NoYes93.2 (high)Excellent AF model
IFNL3Q8IZI9NoYes85.8 (good)Good model, cytokine fold
MAU2Q9Y6X3NoYes91.9 (high)Good model available

Summary: 9 proteins with PDB | 8 undrugged targets with AlphaFold only | 0 with no structure


Section 10: Drug Target Analysis

ChEMBL molecules linked to HCC (EFO:0000182): 304 compounds (200+ retrieved)

Drug Development Summary for GWAS+ClinVar Genes

CategoryCount% of 47 Genes
Approved drugs (Phase 4) for HCC817.0%
Approved drugs for OTHER diseases1225.5%
Phase 3 drugs510.6%
Phase 2/1 drugs48.5%
Preclinical compounds only612.8%
NO drug development1225.5%

Genes with APPROVED Drugs

GeneProteinDrug(s)MechanismApproved for HCC?
METHGF receptorCabozantinib, Capmatinib, TepotinibRTK inhibitorYES (cabozantinib)
PIK3CAPI3K-alphaAlpelisibPI3K inhibitorNo (breast cancer)
RETRET kinaseSelpercatinib, PralsetinibRET inhibitorNo (thyroid/lung)
VDRVitamin D receptorCalcitriol, seocalcitolVDR agonistNo (rickets)
CDK14CDK14Palbociclib*, Abemaciclib*, Ribociclib*CDK inhibitorNo (breast) — *pan-CDK
CASP8Caspase-8Emricasan (investigational)Pan-caspase inhibitorNo
FHFumarate hydrataseNo direct drugs
TERTTelomerase RTNo approved; investigationalTelomerase inhibitorPhase 2

Key HCC-approved drugs targeting GWAS/ClinVar genes:

  • Sorafenib — multi-kinase inhibitor (VEGFR, RAF, PDGFR) — first-line HCC
  • Lenvatinib — multi-kinase inhibitor (VEGFR, FGFR, RET) — first-line HCC
  • Cabozantinib — MET/VEGFR2/RET inhibitor — second-line HCC
  • Regorafenib — multi-kinase inhibitor — second-line HCC
  • Atezolizumab + Bevacizumab — PD-L1/VEGF — first-line HCC
  • Nivolumab — PD-1 — second-line HCC
  • Pembrolizumab — PD-1 — second-line HCC
  • Ramucirumab — VEGFR2 — second-line HCC (AFP ≥400)

Section 11: Bioactivity & Enzyme Data

Most-Studied GWAS/ClinVar Proteins (ChEMBL Target Records)

ProteinChEMBL TargetTarget TypeBioactivity Data
PIK3CACHEMBL4005Single protein + complexesExtensive (>10,000 compounds)
METCHEMBL3717Single proteinExtensive (>5,000 compounds)
RETCHEMBL2041Single protein + fusionsExtensive
VDRCHEMBL1977Single protein + RXR complexLarge (>2,000 compounds)
CDK14CHEMBL6162Single protein + CDK familyModerate
TP53CHEMBL4096Single protein + MDM2 PPILarge (PPI modulators)
CTNNB1CHEMBL5866Single protein + PPIsGrowing (Wnt pathway)
CASP8CHEMBL3776Single protein + familyModerate
TERTCHEMBL2916Single proteinModerate
APCCHEMBL3233Single proteinLimited
PRMT7CHEMBL3562175Single proteinLimited

Enzyme GWAS Genes — Druggability Assessment

EnzymeEC ClassKnown InhibitorsKinetic DataAssessment
PNPLA3AcyltransferaseNo clinicalLimitedHIGH opportunity — causative variant
PRMT7MethyltransferaseResearch toolsYesHIGH opportunity — SAM-dependent, druggable fold
PGDOxidoreductase6-AN (research)YesModerate
FHLyaseNone clinicalExtensiveTumor suppressor — inhibition counterproductive
TERTReverse transcriptaseImetelstat (Phase 2)YesHIGH — active development
C2Serine proteaseComplement inhibitorsYesDruggable family

Section 12: Pharmacogenomics

PharmGKB Clinical Annotations for HCC

GeneVariantDrugTypeLevelClinical Annotation
GALNT14rs12613732Cisplatin/5-FU/MitoxantroneEfficacy3Response prediction for TACE
GALNT14rs9679162Cisplatin/5-FU/MitoxantroneEfficacy3Response prediction for TACE
VEGFArs2010963SorafenibEfficacy3Sorafenib response in HCC
VEGFArs3025040SorafenibToxicity3Hand-foot syndrome risk
KDRrs1870377SorafenibEfficacy3Sorafenib response
KDRrs2071559SorafenibEfficacy3Sorafenib response in HCC/RCC
NOS3rs1799983SorafenibEfficacy3Sorafenib response
NOS3rs2070744SorafenibEfficacy3Sorafenib response
TNFrs1800629SorafenibToxicity3Hand-foot syndrome risk
SLC15A2rs2257212SorafenibEfficacy3Sorafenib pharmacokinetics
UGT1A9rs7574296SorafenibToxicity3Hand-foot syndrome
UGT1A1*1/*28SorafenibToxicity3Hyperbilirubinemia risk

PharmGKB VIP (Very Important Pharmacogenes) Status

GWAS/ClinVar GenePharmGKB VIPHas CPIC Guideline
PNPLA3YesNo
TM6SF2YesNo
STAT4YesNo
CDK14YesNo
TERTYesNo
KIF1BYesNo
PIK3CAYesNo
TP53YesNo
METYesNo
VDRYesNo

Section 13: Clinical Trials

Total clinical trials for HCC (MONDO:0007256): 3,195+

Phase Breakdown (from sampled 200 trials)

PhaseCount (sampled)Estimated Total
Phase 488~400+
Phase 397~600+
Phase 2/315~100+
Phase 2est.~1,000+
Phase 1est.~800+

TOP 30 Drugs in HCC Clinical Trials

DrugPhaseMechanismTarget GeneGWAS Gene?
Sorafenib4Multi-kinase (RAF/VEGFR)BRAF, KDR, PDGFRNo (but PharmGKB link)
Lenvatinib4Multi-kinase (VEGFR/FGFR/RET)RET, FGFRYES (ClinVar)
Cabozantinib4MET/VEGFR2/RET inhibitorMET, RETYES (ClinVar)
Atezolizumab4Anti-PD-L1CD274No
Bevacizumab4Anti-VEGFVEGFANo
Nivolumab4Anti-PD-1PDCD1No
Pembrolizumab4Anti-PD-1PDCD1No
Regorafenib4Multi-kinaseVEGFR, RETYES (ClinVar)
Durvalumab4Anti-PD-L1CD274No
Tremelimumab4Anti-CTLA-4CTLA4No
Ramucirumab4Anti-VEGFR2KDRNo
Everolimus4mTOR inhibitorMTORNo
Palbociclib4CDK4/6 inhibitorCDK14 (pan)YES (GWAS)
Abemaciclib4CDK4/6 inhibitorCDK14 (pan)YES (GWAS)
Capmatinib4MET inhibitorMETYES (ClinVar)
Tepotinib4MET inhibitorMETYES (ClinVar)
Tivantinib3MET inhibitorMETYES (ClinVar)
Galunisertib2TGFβR1 inhibitorTGFBR1No
Tegavivint2β-catenin inhibitorCTNNB1YES (ClinVar)
Ipilimumab4Anti-CTLA-4CTLA4No
Erdafitinib4FGFR inhibitorFGFRNo
Futibatinib4FGFR inhibitorFGFRNo
Trametinib4MEK inhibitorMAP2K1No
Sirolimus4mTOR inhibitorMTORNo
Temsirolimus4mTOR inhibitorMTORNo
Linifanib3VEGFR/PDGFRKDR, PDGFRNo
Brivanib3VEGFR/FGFRKDR, FGFRNo
C-188-92STAT3 inhibitorSTAT3Partial (STAT4)
Selinexor4XPO1 inhibitorXPO1No
Tucidinostat3HDAC inhibitorHDACsNo

GWAS Gene Targeting Rate: ~8/30 (27%) of top trial drugs target GWAS/ClinVar genes — moderate genetic alignment.


Section 14: Pathway Analysis

TOP 30 Reactome Pathways Enriched in GWAS/ClinVar Genes

#PathwayReactome IDGWAS/ClinVar GenesDruggable Nodes
1PI3K/AKT signalingR-HSA-1257604PIK3CA, METPIK3CA, AKT, mTOR
2MET activates PI3K/AKTR-HSA-8851907MET, PIK3CAMET, PIK3CA
3MET Receptor ActivationR-HSA-6806942METMET
4RAF/MAP kinase cascadeR-HSA-5673001PIK3CA, MET, RETRAF, MEK, ERK
5RET signalingR-HSA-8853659RET, PIK3CARET
6Wnt/β-catenin signalingR-HSA-201681CTNNB1, TERTCTNNB1, FZD, GSK3β
7β-catenin destruction complexR-HSA-195253CTNNB1, APC, AXIN1GSK3β
8β-catenin transactivating complexR-HSA-201722CTNNB1, TERTCBP/β-catenin PPI
9TP53 metabolic gene regulationR-HSA-5628897TP53MDM2
10TP53 DNA repair regulationR-HSA-6796648TP53MDM2
11TP53 cell cycle arrestR-HSA-6804116TP53CDK4/6
12Apoptosis — caspase activationR-HSA-140534CASP8CASP8
13FasL/CD95L signalingR-HSA-75157CASP8CASP8, TRAIL
14TRAIL signalingR-HSA-75158CASP8DR4/5
15VEGFA-VEGFR2 pathwayR-HSA-4420097PIK3CAVEGFR2, VEGFA
16Telomere extensionR-HSA-171319TERTTERT
17Interleukin-12 signalingR-HSA-9020591STAT4JAK2
18Nuclear receptor transcriptionR-HSA-383280VDRVDR
19Vitamin D metabolismR-HSA-196791VDRCYP enzymes
20Lipid remodeling (DAG/TAG)R-HSA-1482883PNPLA3PNPLA3
21Constitutive PI3K in cancerR-HSA-2219530PIK3CA, METPIK3CA
22FGFR signaling in diseaseR-HSA-5655253PIK3CAFGFR, PIK3CA
23APC truncation mutantsR-HSA-5467337APC, AXIN1
24CTNNB1 mutantsR-HSA-5358747CTNNB1, APCβ-catenin
25Necroptosis regulationR-HSA-5675482CASP8RIPK1
26NF-κB signalingR-HSA-9758274CASP8, TP53IKK complex
27Interleukin-23 signalingR-HSA-9020933STAT4JAK2
28SCF-KIT signalingR-HSA-1433557PIK3CAKIT
29SenescenceR-HSA-2559580TP53CDK4/6
30FLT3 signalingR-HSA-9607240PIK3CAFLT3

Pathway-level insight: Even when GWAS genes are undrugged (PNPLA3, STAT4), pathway members offer druggable entry points: JAK2 for STAT4 signaling, GSK3β for β-catenin pathway, mTOR for PI3K pathway.


Section 15: Drug Repurposing Opportunities

TOP 30 Repurposing Candidates

#DrugGWAS/ClinVar GeneCurrently Approved ForMechanismGWAS p-valuePriority Score
1CabozantinibMET, RETRCC, thyroidMET/RET inhibitorClinVar★★★★★
2LenvatinibRETThyroid, endometrialMulti-kinase (RET)ClinVar★★★★★
3AlpelisibPIK3CABreast cancerPI3Kα inhibitorClinVar★★★★☆
4PalbociclibCDK14Breast cancerCDK4/6 inhibitor9e-10★★★★☆
5AbemaciclibCDK14Breast cancerCDK4/6 inhibitor9e-10★★★★☆
6RibociclibCDK14Breast cancerCDK4/6 inhibitor9e-10★★★★☆
7SelpercatinibRETNSCLC, thyroidSelective RET inhibitorClinVar★★★★☆
8PralsetinibRETNSCLC, thyroidSelective RET inhibitorClinVar★★★★☆
9CapmatinibMETNSCLCSelective MET inhibitorClinVar★★★★☆
10TepotinibMETNSCLCSelective MET inhibitorClinVar★★★★☆
11RuxolitinibSTAT4 pathway (JAK2)MPNJAK1/2 inhibitor2e-10★★★☆☆
12BaricitinibSTAT4 pathway (JAK2)RAJAK1/2 inhibitor2e-10★★★☆☆
13CalcitriolVDRRickets, psoriasisVDR agonistClinVar★★★☆☆
14ItacitinibSTAT4 pathway (JAK1)GvHD (trial)JAK1 inhibitor2e-10★★★☆☆
15TegavivintCTNNB1Investigationalβ-catenin inhibitorClinVar★★★☆☆
16EmricasanCASP8NASH (trial)Pan-caspase inhibitorClinVar★★☆☆☆
17CelecoxibCOX-2 (pathway)Pain/inflammationCOX-2 inhibitor★★☆☆☆
18MetforminAMPK (pathway)DiabetesAMPK activator★★☆☆☆
19StatinsHMG-CoA (pathway)HyperlipidemiaHMG-CoA inhibitor★★☆☆☆
20AspirinCOX (pathway)Pain/thrombosisCOX inhibitor★★☆☆☆
21VandetanibRETThyroid cancerRET/VEGFR/EGFRClinVar★★★☆☆
22NeratinibERBB pathwayBreast cancerPan-HER kinase★★☆☆☆
23EnzalutamideAR pathwayProstate cancerAR antagonist★☆☆☆☆
24AlectinibALK pathwayNSCLCALK inhibitor★☆☆☆☆
25ErdafitinibFGFR pathwayBladder cancerFGFR inhibitor★★☆☆☆
26TucatinibHER2 pathwayBreast cancerHER2 inhibitor★☆☆☆☆
27SonidegibHedgehog pathwayBCCSMO inhibitor★☆☆☆☆
28SelinexorXPO1MyelomaXPO1 inhibitor★★☆☆☆
29HydroxychloroquineAutophagyMalaria, RAAutophagy inhibitor★☆☆☆☆
30SitagliptinDPP4DiabetesDPP4 inhibitor★☆☆☆☆

Section 16: Druggability Pyramid

LevelDescriptionGene Count%Key Genes
1 — VALIDATEDApproved drug FOR HCC510.6%MET (cabozantinib), RET (lenvatinib/regorafenib), TERT (trials ongoing), HLA (IO agents)
2 — REPURPOSINGApproved drug for OTHER disease817.0%PIK3CA (alpelisib), CDK14 (palbociclib), VDR (calcitriol), RET (selpercatinib), CASP8, FH, APC, STAT4→JAK
3 — EMERGINGDrug in clinical trials48.5%CTNNB1 (tegavivint), TERT (imetelstat), STAT4 (itacitinib), TP53 (MDM2i)
4 — TOOL COMPOUNDSChEMBL compounds, no trials510.6%PRMT7, GRIK1, C2, AXIN1, CDKN2A
5 — DRUGGABLE UNDRUGGEDDruggable family, NO compounds48.5%PNPLA3 (enzyme), PRSS23 (protease), PGD (enzyme), GLB1 (enzyme)
6 — HARD TARGETSDifficult family or unknown2144.7%TM6SF2, KIF1B, TRIM31, MAU2, DLC1, WNT3A, VEPH1, SAMM50, OBSCN, DYSF, EFCAB11, etc.

Section 17: Undrugged Target Profiles

TOP 30 Undrugged Opportunities (ranked by potential)

#GeneGWAS p-valueVariantProtein FamilyStructureExpressionDrugged InteractorsPotential
1PNPLA32.0e-19Missense I148MPatatin-like phospholipase (enzyme)AF only (71.7)Liver-highNone directHIGH
2TM6SF22.0e-12Missense E167KTransmembrane proteinAF only (91.0)Liver-highPNPLA3 networkHIGH
3PRMT76.0e-10IntronicArginine methyltransferaseAF (93.2 — excellent)UbiquitousChEMBL target existsHIGH
4STAT42.0e-10IntronicSTAT TFAF (86.9)Immune cellsJAK2 (ruxolitinib)HIGH
5CDK149.0e-10IntronicCyclin-dependent kinaseChEMBL targetUbiquitousCDK family (palbociclib)HIGH
6IFNL38.0e-10RegulatoryInterferon lambda cytokineAF (85.8)ImmuneIFN pathway (pegIFN)MEDIUM
7TRIM319.0e-13IntronicE3 ubiquitin ligaseAF (79.6)UbiquitousUbiquitin systemMEDIUM
8KIF1B2.0e-18IntronicKinesin motor1 NMR structureUbiquitousTubulin (taxanes)MEDIUM
9DLC13.0e-07IntronicRhoGAPAF only (56.6)UbiquitousRho GTPasesLOW
10MAU23.0e-12IntronicCohesin loaderAF (91.9)UbiquitousCohesin complexLOW
11MICA4.0e-13UTR/regulatoryMHC class I-relatedPDB (10 structures)EpithelialNKG2D (IO pathway)MEDIUM
12WNT3A1.0e-08IntergenicWnt ligandLimitedUbiquitousFZD receptorsLOW
13PRSS236.0e-06IntronicSerine proteaseNo dataLimitedNoneMEDIUM
14PGDIntronic6-PGD (enzyme)Known foldUbiquitousPentose phosphateMEDIUM
15SAMM509.0e-07IntronicMitochondrial membraneLimitedLiverPNPLA3, TM6SF2LOW
16GRIK15.0e-10IntronicIonotropic glutamate receptorYesBrain (not liver)Glutamate drugsLOW
17VEPH12.0e-06IntronicPH domain proteinLimitedUbiquitousUnknownLOW
18PBX23.0e-08IntronicHomeodomain TFLimitedUbiquitousNoneLOW
19TLL13.0e-08IntronicMetalloproteaseRelated structuresLiverBMP pathwayMEDIUM
20GLB12.0e-07IntronicBeta-galactosidaseKnownUbiquitousLysosomalLOW

Deep Dive: Top 5 Undrugged Targets

  1. PNPLA3 (Q9NST1) — DRUGGABILITY POTENTIAL: HIGH
  • GWAS: p=2e-19, Tier 1 coding variant (I148M missense), replicated across 6+ studies
  • Function: Lipase/acyltransferase in hepatic lipid droplets. I148M abolishes lipase activity, causing fat accumulation
  • Family: Patatin-like phospholipase — enzyme with definable active site
  • Structure: AlphaFold only (pLDDT 71.7); related PNPLA2 structures available
  • Expression: Liver-enriched — favorable for organ-specific targeting
  • Why undrugged: Novel target; biology only elucidated in last decade; no tool compounds
  • Opportunity: Allele-specific degraders (PROTACs) targeting I148M protein, or activators restoring lipase function. Active academic interest.
  1. TM6SF2 (Q9BZW4) — DRUGGABILITY POTENTIAL: HIGH
  • GWAS: p=2e-12, Tier 1 coding variant (E167K), replicated
  • Function: Regulates hepatic lipid secretion via VLDL. E167K = loss of function
  • Family: Transmembrane protein — challenging but not impossible
  • Structure: AlphaFold (pLDDT 91.0 — high quality)
  • Expression: Liver-enriched
  • Why undrugged: Very novel; function only recently characterized
  • Opportunity: Protein stabilizers or chaperones to rescue E167K folding. ASO/siRNA approaches.
  1. PRMT7 (Q9NVM4) — DRUGGABILITY POTENTIAL: HIGH
  • GWAS: p=6e-10
  • Function: Protein arginine methyltransferase; epigenetic regulation
  • Family: SAM-dependent methyltransferase — highly druggable fold (cf. EZH2 inhibitors)
  • Structure: AlphaFold (pLDDT 93.2 — excellent); ChEMBL target record exists
  • Expression: Ubiquitous
  • Why undrugged: Less studied than PRMT1/5; no selective inhibitors yet
  • Opportunity: SAM-competitive inhibitors. PRMT5 inhibitors in oncology trials show path.
  1. STAT4 (Q14765) — DRUGGABILITY POTENTIAL: HIGH (indirect)
  • GWAS: p=2e-10 (HBV-HCC)
  • Function: Mediates IL-12 signaling, Th1 immune response, IFN-gamma production
  • Family: STAT TF — directly difficult, but upstream JAK2 is druggable
  • Interactions: JAK2 (ruxolitinib), IL-12Rβ2
  • Why undrugged: STAT proteins lack deep pockets; PPI-focused
  • Opportunity: JAK inhibitors (ruxolitinib, baricitinib) modulate STAT4 signaling indirectly. Also: STAT4-selective antisense approaches.
  1. CDK14 (O94921) — DRUGGABILITY POTENTIAL: HIGH
  • GWAS: p=9e-10 (HBV-HCC)
  • Function: Wnt pathway regulator; phosphorylates LRP6 during G2/M
  • Family: CDK kinase — proven druggable (CDK4/6 inhibitors approved)
  • Structure: ChEMBL target exists; CDK family structures available
  • Interactions: CTNNB1 (score 353), Cyclin-Y (score 992)
  • Why undrugged: Existing CDK drugs (palbociclib) have low CDK14 selectivity
  • Opportunity: CDK14-selective inhibitors could block Wnt activation in HCC. Pan-CDK drugs already in HCC trials.

Section 18: Summary

GWAS LANDSCAPE

  • Total associations: 93 | Unique studies: 27 | Unique genes: 34 (GWAS) + 19 (ClinVar) = 47 combined
  • Coding variants: 2.5% (Tier 1) | Non-coding: 97.5%
  • Dominant loci: HLA/MHC 6p21, PNPLA3/SAMM50 22q13, TM6SF2 19p13, KIF1B 1p36

GENETIC EVIDENCE

  • Tier 1 (coding) genes: 2 (PNPLA3, TM6SF2)
  • Mendelian overlap: 1 (TERT)
  • Both GWAS + ClinVar: 1 (TERT)

DRUGGABILITY

  • Overall druggable rate: 36.2% of genes in druggable protein families
  • Approved for HCC: 10.6% | Approved other: 17.0% | In trials: 8.5%
  • Opportunity gap (no drugs): 53.2%

PYRAMID SUMMARY

LevelCount%
L1 Validated510.6%
L2 Repurposing817.0%
L3 Emerging48.5%
L4 Tool compounds510.6%
L5 Druggable undrugged48.5%
L6 Hard targets2144.7%

CLINICAL TRIAL ALIGNMENT

  • 27% of top trial drugs target GWAS/ClinVar genes — moderate alignment
  • MET and RET are the best-validated genetically-informed drug targets currently in HCC trials

TOP 10 REPURPOSING CANDIDATES

DrugGeneApproved Forp-valueScore
CabozantinibMET/RETRCC/thyroidClinVar★★★★★
AlpelisibPIK3CABreast cancerClinVar★★★★☆
PalbociclibCDK14Breast cancer9e-10★★★★☆
SelpercatinibRETNSCLC/thyroidClinVar★★★★☆
CapmatinibMETNSCLCClinVar★★★★☆
RuxolitinibSTAT4→JAK2MPN2e-10★★★☆☆
CalcitriolVDRRicketsClinVar★★★☆☆
RibociclibCDK14Breast cancer9e-10★★★★☆
BaricitinibSTAT4→JAK2RA2e-10★★★☆☆
VandetanibRETThyroidClinVar★★★☆☆

TOP 10 UNDRUGGED OPPORTUNITIES

Genep-valueFamilyStructurePotential
PNPLA32e-19Phospholipase (enzyme)AF (71.7)HIGH
TM6SF22e-12TransmembraneAF (91.0)HIGH
PRMT76e-10MethyltransferaseAF (93.2)HIGH
STAT42e-10STAT TFAF (86.9)HIGH
CDK149e-10CDK kinaseChEMBLHIGH
IFNL38e-10Interferon cytokineAF (85.8)MEDIUM
TRIM319e-13E3 ubiquitin ligaseAF (79.6)MEDIUM
KIF1B2e-18Kinesin motor1 NMRMEDIUM
TLL13e-08MetalloproteaseRelatedMEDIUM
MICA4e-13MHC class I-relatedPDB (10)MEDIUM

TOP 10 INDIRECT OPPORTUNITIES

Undrugged GeneDrugged InteractorAvailable Drug
STAT4JAK2Ruxolitinib, Baricitinib
CDK14CDK4/6Palbociclib, Abemaciclib
KIF1BTubulin/microtubulePaclitaxel, Docetaxel
DLC1RhoA/ROCKFasudil (ROCK inhibitor)
PNPLA3Lipid metabolism pathwayStatins (indirect)
TM6SF2ApoB/lipoprotein pathwayPCSK9 inhibitors (indirect)
TERTTP53 (interaction score 887)MDM2 inhibitors
CTNNB1TCF4/LEF PPITegavivint (Phase 2)
WNT3AFZD receptorsWnt pathway inhibitors
TRIM31Ubiquitin-proteasomeBortezomib (proteasome)

KEY INSIGHTS

  1. HCC has a unique genetic architecture dominated by (a) HLA/immune loci reflecting viral etiology (HBV/HCV) and (b) metabolic susceptibility genes (PNPLA3, TM6SF2) reflecting steatotic liver disease etiology — two distinct pathogenic axes.

  2. PNPLA3 I148M is the strongest and most actionable undrugged target — a coding variant with p=2e-19, liver-specific expression, and enzymatic function. It represents the single highest-value drug development opportunity in HCC genetics.

3. The Wnt/β-catenin pathway is the most genetically loaded druggable pathway — with CTNNB1, AXIN1, APC (ClinVar somatic), CDK14 (GWAS), WNT3A (GWAS), and TERT (both) all converging. Tegavivint (β-catenin inhibitor) is in Phase 2.

  1. TERT is the only gene with both GWAS germline and ClinVar somatic evidence — bridging common and rare variant architectures. Telomerase inhibitors (imetelstat) are in development.

  2. Immune checkpoint therapies (PD-1, PD-L1, CTLA-4) dominate HCC trials but do NOT target GWAS genes — the genetic evidence points more toward kinases (MET, RET, CDK14, PIK3CA) and metabolic targets (PNPLA3, TM6SF2).

  3. JAK-STAT pathway offers an underexplored repurposing opportunity — STAT4 GWAS evidence (p=2e-10) + approved JAK inhibitors = rationale for JAK inhibitor trials in HBV-related HCC specifically.

  4. Compared to other cancers, HCC has relatively few coding GWAS variants (2.5% vs typical ~5-10%), reflecting that most HCC genetic risk operates through liver disease susceptibility (metabolic, immune) rather than direct oncogenic mechanisms.


Analysis generated using biobtree MCP tools querying 70+ biological databases. All identifiers cross-referenced across MONDO, EFO, MeSH, OMIM, Orphanet, GWAS Catalog, dbSNP, HGNC, UniProt, InterPro, Reactome, ChEMBL, PharmGKB, STRING, PDB, AlphaFold, Bgee, and CellxGene.