MLH1 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human MLH1, a definitive lookup resource covering:

### Section 1: Gene identifiers
For human gene MLH1, list ALL gene-level database identifiers. Required:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG...)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand (GRCh38)

### Section 2: Transcript identifiers
For human gene MLH1, list ALL transcript-level identifiers. Required:
- Ensembl transcripts: ALL ENST IDs with biotype. Total count.
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select.
- CCDS IDs.
- For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count.

### Section 3: Protein identifiers
For human gene MLH1 protein product(s), list ALL protein-level identifiers. Required:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry.
- RefSeq protein: ALL NP_ accessions.
- Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID.
- Antibody availability: known antibody resources for the protein.

### Section 4: Structure
For human gene MLH1 protein, list ALL structural data. Required:
- Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count.
- Predicted structures: AlphaFold model ID and confidence metrics (pLDDT).

### Section 5: Cross-species orthologs
For human gene MLH1, list orthologous genes in key model organisms. Organisms:
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

### Section 6: Clinical variants & AI predictions
For human gene MLH1, summarize clinical variants and AI predictions.
Clinical variant annotations (ClinVar):
- Total variant count (approximate is fine)
- Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition
AI-based variant effect predictions:
- Splice effect predictions: total count + TOP 30 with delta scores if known
- Missense pathogenicity from AlphaMissense: total count + TOP 30 likely-pathogenic with am_pathogenicity scores.

### Section 7: Pathways & Gene Ontology
For human gene MLH1, list biological pathways and Gene Ontology annotations.
Pathway membership:
- ALL biological pathways this gene participates in, with pathway IDs and names
- Total pathway count
Gene Ontology:
- Biological Process: count and TOP 20 terms with GO IDs
- Molecular Function: count and TOP 20 terms with GO IDs
- Cellular Component: count and TOP 20 terms with GO IDs

### Section 8: Protein interactions & networks
For human gene MLH1 protein, summarize protein interactions and networks.
Protein-protein interactions (STRING, IntAct, BioGRID, etc.):
- Total interaction count (approximate)
- TOP 30 highest-confidence interacting proteins with scores/evidence
Protein similarity:
- Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores
- Sequence homology: TOP 20 homologous proteins with identity/similarity

### Section 9: Transcription factor regulatory data
For human gene MLH1, summarize transcription factor regulatory data.
If MLH1 is a transcription factor:
- Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence
- DNA binding motifs from JASPAR: all known motif IDs and motif family classification.
Regardless:
- Upstream regulators: TFs that regulate MLH1, names with evidence type (ChIP-seq / predicted / experimentally validated)
If MLH1 is not a transcription factor, say so briefly and skip the downstream/motif sections.

### Section 10: Drug & pharmacology data
For human gene MLH1 protein as a drug target, summarize pharmacology data.
If MLH1 is a known drug target:
- Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase)
- Clinical trials: TOP 20 involving drugs targeting this gene, with trial ID, phase, status, intervention
- Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any
If MLH1 is not currently a drug target, say so briefly.

### Section 11: Expression profiles
For human gene MLH1, summarize expression profiles.
Tissue expression (GTEx, HPA, Bgee, etc.):
- TOP 30 tissues with expression scores/levels (direction, units if known)
- Note tissue-specific or tissue-enriched patterns
Cell type expression (Tabula Sapiens, HCA, etc.):
- TOP 30 cell types with expression scores
- Note cell-type-specific patterns
Single-cell expression: notable datasets or cell populations of interest for this gene.

### Section 12: Disease associations
For human gene MLH1, summarize disease associations.
Mendelian / monogenic disease:
- Diseases caused by mutations in MLH1: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level
- Include all directly linked conditions
Phenotype associations:
- Clinical phenotypes associated with the gene (HPO terms where known)
- TOP 30 phenotype terms with HPO IDs
Complex-disease / GWAS:
- Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known
- TOP 30 GWAS associations

MLH1

Executive summary

MLH1 (MutL Homolog 1, HGNC:7127) encodes a 756 amino acid, 84.6 kDa DNA mismatch repair protein that is a cornerstone of genomic stability, and its loss is one of the most clinically consequential germline events in human cancer predisposition. Pathogenic and likely pathogenic variants — totaling roughly 1,200 out of ~6,260 ClinVar entries — cause Lynch syndrome (autosomal dominant) and mismatch repair cancer syndrome 1 (autosomal recessive), conferring strong inherited risk for colorectal, endometrial, ovarian, and several other cancers. MLH1 is ubiquitously expressed across 296 of 300 surveyed tissues (mean Bgee score 87.57), with particularly high expression in skeletal and cardiac muscle and in primordial germ cells, consistent with its additional role in meiotic recombination. Its highest-confidence protein partners are the core MMR complex members MSH2, MSH3, and MSH6 (STRING scores 999), and TP53 is a notable upstream transcriptional activator. Although MLH1 is a VIP pharmacogene in PharmGKB, it has no approved or investigational small-molecule drugs targeting it directly; its clinical utility remains in germline variant screening rather than pharmacological intervention.

Gene identifiers

Identifier | Value
HGNC ID | HGNC:7127
Approved symbol | MLH1
Ensembl gene ID | ENSG00000076242
NCBI Entrez Gene ID | 4292
OMIM gene ID | 120436
Chromosome | 3
Start position (GRCh38) | 36,993,226
End position (GRCh38) | 37,050,896
Strand | +

Transcript identifiers

Ensembl transcripts (ENST IDs)

Transcript ID | Biotype
ENST00000231790 | protein_coding
ENST00000413212 | nonsense_mediated_decay
ENST00000413740 | protein_coding
ENST00000429117 | protein_coding
ENST00000432299 | nonsense_mediated_decay
ENST00000435176 | protein_coding
ENST00000441265 | protein_coding
ENST00000442249 | retained_intron
ENST00000447829 | nonsense_mediated_decay
ENST00000450420 | protein_coding
ENST00000454028 | nonsense_mediated_decay
ENST00000455445 | protein_coding
ENST00000456676 | protein_coding
ENST00000457004 | nonsense_mediated_decay
ENST00000458009 | nonsense_mediated_decay
ENST00000458205 | protein_coding
ENST00000466900 | protein_coding
ENST00000476172 | retained_intron
ENST00000485889 | protein_coding
ENST00000492474 | protein_coding
ENST00000536378 | protein_coding
ENST00000539477 | protein_coding
ENST00000616768 | protein_coding
ENST00000673673 | protein_coding
ENST00000673686 | retained_intron
ENST00000673713 | retained_intron
ENST00000673715 | protein_coding
ENST00000673741 | retained_intron
ENST00000673889 | retained_intron
ENST00000673897 | nonsense_mediated_decay
ENST00000673899 | protein_coding
ENST00000673947 | nonsense_mediated_decay
ENST00000673972 | nonsense_mediated_decay
ENST00000673990 | protein_coding
ENST00000674019 | protein_coding
ENST00000674107 | retained_intron
ENST00000674111 | nonsense_mediated_decay
ENST00000674125 | retained_intron
ENST00000713802 | protein_coding
ENST00000931189 | protein_coding
ENST00000931190 | protein_coding
ENST00000931191 | protein_coding
ENST00000948704 | protein_coding
ENST00000948705 | protein_coding

Total: 44 Ensembl transcripts

RefSeq mRNA transcripts (NM_ accessions)

RefSeq ID
NM_000249 (MANE Select)
NM_001167617
NM_001167618
NM_001167619
NM_001258271
NM_001258273
NM_001258274
NM_001354615
NM_001354616
NM_001354617
NM_001354618
NM_001354619
NM_001354620
NM_001354621
NM_001354622
NM_001354623
NM_001354624
NM_001354625
NM_001354626
NM_001354627
NM_001354628
NM_001354629
NM_001354630

Total: 23 human MLH1 mRNA transcripts (NM_000249 is MANE Select)

CCDS IDs

  • CCDS2663
  • CCDS54562
  • CCDS54563

MANE SELECT transcript exons (ENST00000231790 / NM_000249)

Exon # | Exon ID | Genomic Coordinates
1 | ENSE00004012715 | 36993518–36993663
2 | ENSE00003496022 | 36996619–36996709
3 | ENSE00003599036 | 37000955–37001053
4 | ENSE00003635406 | 37004401–37004474
5 | ENSE00003533853 | 37006991–37007063
6 | ENSE00003633623 | 37008814–37008905
7 | ENSE00003521968 | 37011820–37011862
8 | ENSE00003516787 | 37012011–37012099
9 | ENSE00003656033 | 37014432–37014544
10 | ENSE00001716871 | 37017506–37017599
11 | ENSE00003785627 | 37020310–37020463
12 | ENSE00003688106 | 37025637–37026007
13 | ENSE00003680564 | 37028784–37028932
14 | ENSE00001730188 | 37040186–37040294
15 | ENSE00001747618 | 37042268–37042331
16 | ENSE00001748897 | 37047519–37047683
17 | ENSE00001593400 | 37048517–37048609
18 | ENSE00001785296 | 37048904–37049017
19 | ENSE00003902581 | 37050486–37050846

Total exons: 19 (Chromosome 3, + strand)
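
As a quick consistency check on the exon table, the following minimal sketch sums the exon lengths from the listed coordinates. It assumes the coordinates are 1-based and inclusive (the usual Ensembl convention), so each length is end - start + 1. The names `MANE_EXONS` and `exon_lengths` are illustrative, not part of any database API.

```python
# Exon (start, end) coordinates on chromosome 3, copied from the table above.
# Assumption: coordinates are 1-based and inclusive.
MANE_EXONS = [
    (36993518, 36993663), (36996619, 36996709), (37000955, 37001053),
    (37004401, 37004474), (37006991, 37007063), (37008814, 37008905),
    (37011820, 37011862), (37012011, 37012099), (37014432, 37014544),
    (37017506, 37017599), (37020310, 37020463), (37025637, 37026007),
    (37028784, 37028932), (37040186, 37040294), (37042268, 37042331),
    (37047519, 37047683), (37048517, 37048609), (37048904, 37049017),
    (37050486, 37050846),
]


def exon_lengths(exons):
    """Return the length in bases of each (start, end) exon."""
    return [end - start + 1 for start, end in exons]


lengths = exon_lengths(MANE_EXONS)
longest = max(lengths)

print(f"exon count: {len(lengths)}")
print(f"total exonic length: {sum(lengths)} nt")
print(f"longest exon: #{lengths.index(longest) + 1} ({longest} nt)")
```

With the table's values the total is 2,494 nt, which covers the 2,271 nt needed to encode 756 residues plus a stop codon; the remainder corresponds to untranslated sequence.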

Protein identifiers

UniProt Accessions

  • P40692 (Reviewed - canonical entry) | DNA mismatch repair protein Mlh1 | 756 aa, 84.6 kDa

RefSeq Protein (NP_) Accessions

  • NP_000240 (MANE Select)
  • NP_001161089
  • NP_001161090
  • NP_001161091
  • NP_001245200
  • NP_001245202
  • NP_001245203
  • NP_001341544
  • NP_001341545
  • NP_001341546
  • NP_001341547
  • NP_001341548
  • NP_001341549

Protein Domains and Families

InterPro entries (8):

ID | Name | Type
IPR002099 | DNA mismatch repair protein MutL/Mlh/PMS | Family
IPR013507 | DNA mismatch repair protein, S5 domain 2-like | Domain
IPR014721 | Small ribosomal subunit protein uS5 domain 2-type fold, subgroup | Homologous_superfamily
IPR014762 | DNA mismatch repair, conserved site | Conserved_site
IPR020568 | Ribosomal protein uS5 domain 2-type superfamily | Homologous_superfamily
IPR032189 | DNA mismatch repair protein Mlh1, C-terminal | Domain
IPR036890 | Histidine kinase/HSP90-like ATPase superfamily | Homologous_superfamily
IPR038973 | DNA mismatch repair protein MutL/Mlh/Pms-like | Family

Pfam entries (3):

  • PF01119
  • PF13589
  • PF16413

Antibody Resources

No antibody resources available in biobtree database.

Structure

Experimental Structures (PDB)

Human MLH1 has 2 crystal structures:

PDB ID | Method | Resolution (Å)
3RBN | X-ray Diffraction | 2.16
4P7A | X-ray Diffraction | 2.30

Additional structures contain MLH1 NLS peptide fragments in complex with mouse importin-alpha: 5U5P (2.171 Å), 6WBA (2.151 Å), 6WBB (2.663 Å), 6WBC (2.15 Å), 7M60 (2.30 Å).

Total human MLH1 structures: 2 (7 PDB entries in all when the 5 importin-alpha complexes carrying MLH1 NLS peptides are included)


Predicted Structures (AlphaFold)

Model ID | Global pLDDT | pLDDT > 90 (%)
AF-P40692-F1 | 77.89 | 51
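
To put the global pLDDT in context, the sketch below maps a score onto the confidence bands commonly shown alongside AlphaFold models (above 90, 70 to 90, 50 to 70, below 50). The band labels and the function name `plddt_band` are illustrative assumptions rather than an official API.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to a coarse confidence band.

    Band cut-offs (90 / 70 / 50) follow the legend commonly used for
    AlphaFold models; the labels are assumptions for illustration.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Global pLDDT reported for AF-P40692-F1 in the table above.
print(plddt_band(77.89))
```

A global score in the 70 to 90 band alongside roughly half of residues above 90 suggests well-modelled folded domains joined by lower-confidence regions, although per-residue scores would be needed to confirm where those regions fall.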

Cross-species orthologs

Organism | Gene ID | Symbol
Mouse (Mus musculus) | ENSMUSG00000032498 | Mlh1
Rat (Rattus norvegicus) | ENSRNOG00000033809 | Mlh1
Zebrafish (Danio rerio) | ENSDARG00000025948 | mlh1
Fruit fly (Drosophila melanogaster) | FBgn0011659 | Mlh1
Worm (C. elegans) | WBGene00003373 | mlh-1
Yeast (S. cerevisiae) | YMR167W | MLH1

Clinical variants & AI predictions

Clinical Variants (ClinVar)

Variant counts by classification:

Classification | Count
Uncertain significance | ~4,600
Pathogenic | ~900
Likely pathogenic | ~300
Benign/Likely benign | ~200
Conflicting classifications | ~260
Total variants | ~6,260
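
The approximate counts can be cross-checked with a few lines of arithmetic. The sketch below uses the rounded figures from the table, so the resulting shares are approximate; the dictionary name is illustrative.

```python
# Approximate ClinVar counts copied from the table above (rounded values).
CLINVAR_COUNTS = {
    "Uncertain significance": 4600,
    "Pathogenic": 900,
    "Likely pathogenic": 300,
    "Benign/Likely benign": 200,
    "Conflicting classifications": 260,
}

total = sum(CLINVAR_COUNTS.values())
plp = CLINVAR_COUNTS["Pathogenic"] + CLINVAR_COUNTS["Likely pathogenic"]
plp_share = plp / total

print(f"total variants: ~{total}")
print(f"pathogenic + likely pathogenic: ~{plp} ({plp_share:.1%})")
for label, count in CLINVAR_COUNTS.items():
    print(f"{label}: {count / total:.1%}")
```

The categories sum to the stated total of about 6,260, and pathogenic plus likely pathogenic variants make up roughly a fifth of all entries.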

Top 30 variants (pathogenic and likely pathogenic first; the last five rows carry uncertain or conflicting classifications):

Variant ID | HGVS | Type | Classification | Associated Condition
1012206 | c.1003del (p.Leu335fs) | Deletion | Pathogenic | Lynch Syndrome 1
1048862 | c.497T>G (p.Leu166Ter) | SNV | Pathogenic | Colorectal cancer, hereditary nonpolyposis
1049302 | c.1275dup (p.Gln426fs) | Duplication | Pathogenic | Colorectal cancer, hereditary nonpolyposis
1049248 | c.1039-2_1409+1del | Deletion | Pathogenic | Hereditary cancer syndrome
1049670 | c.1896+1G>C | SNV | Pathogenic | Lynch syndrome
1049673 | c.1367del (p.Thr455_Ser456insTer) | Deletion | Pathogenic | Colorectal cancer, hereditary nonpolyposis
1049396 | c.1999del (p.Asp667fs) | Deletion | Pathogenic | Lynch syndrome
1049508 | c.631_632del (p.Ser211fs) | Deletion | Pathogenic | Hereditary neoplastic syndrome
1049646 | c.546-2_589-59del | Deletion | Pathogenic | Hereditary cancer-predisposing syndrome
1050017 | c.365del (p.Gly122fs) | Deletion | Pathogenic | Lynch syndrome
1050185 | c.1039-2_1409+150del | Deletion | Pathogenic | Hereditary cancer syndrome
1050489 | c.1770_1772delinsC (p.Leu590fs) | Indel | Pathogenic | Tumor predisposition
1050518 | c.1732-2_1896+1del | Deletion | Pathogenic | Lynch syndrome
1050645 | c.1559-2_1667+1del | Deletion | Pathogenic | Colorectal cancer, hereditary nonpolyposis
1050662 | c.1559-4_1667+63del | Deletion | Pathogenic | Hereditary cancer-predisposing syndrome
1050751 | c.995_996insA (p.Ser332fs) | Insertion | Pathogenic | Lynch syndrome
1023950 | c.1681_1686del (p.Tyr561_Gln562del) | Deletion | Pathogenic | Colorectal cancer
1048888 | c.2131dup (p.Ser711fs) | Duplication | Pathogenic | Hereditary neoplastic syndrome
1048910 | c.131_132delinsTT (p.Ser44Phe) | Indel | Pathogenic | Lynch syndrome
1049107 | c.588+1G>C | SNV | Likely pathogenic | Hereditary cancer syndrome
1048973 | c.2258_2259dup (p.Glu754fs) | Duplication | Likely pathogenic | Tumor predisposition
1006220 | c.193G>A (p.Gly65Ser) | SNV | Likely pathogenic | Lynch syndrome
1049886 | c.-2_116+1del | Deletion | Pathogenic | Colorectal cancer, hereditary nonpolyposis
1050070 | c.-11C>A | SNV | Uncertain/Likely pathogenic | Hereditary cancer syndrome
1049099 | c.2031del (p.Ser677fs) | Deletion | Uncertain/Likely pathogenic | Lynch syndrome
1009544 | c.657_662del (p.Phe220_Gly221del) | Deletion | Uncertain significance | Colorectal cancer
1034510 | c.569_571del (p.Ile190del) | Deletion | Uncertain significance | Hereditary cancer syndrome
1001928 | c.380+6G>T | SNV | Conflicting classifications | Lynch syndrome
1010470 | c.117-3C>T | SNV | Conflicting classifications | Colorectal cancer
1015357 | c.117-6T>A | SNV | Conflicting classifications | Hereditary cancer-predisposing syndrome
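
Several of the listed substitutions fall at or near exon boundaries. The sketch below separates canonical splice-site substitutions (intronic offsets +1/+2 for donors, -1/-2 for acceptors) from nearby intronic changes, using HGVS strings from the table. The regular expression covers only simple intronic substitutions and is an illustrative assumption, not a full HGVS parser.

```python
import re

# Simple intronic substitutions such as c.1896+1G>C or c.117-3C>T.
# Deletions, duplications, and exonic changes are intentionally not matched.
_INTRONIC_SNV = re.compile(r"^c\.-?\d+([+-])(\d+)[ACGT]>[ACGT]$")


def canonical_splice_site(hgvs: str):
    """Return 'donor' or 'acceptor' for a +/-1 or +/-2 substitution, else None."""
    cdna = hgvs.strip().split(" ")[0]  # drop any trailing "(p....)" annotation
    match = _INTRONIC_SNV.match(cdna)
    if match is None:
        return None
    sign, offset = match.group(1), int(match.group(2))
    if offset not in (1, 2):
        return None
    return "donor" if sign == "+" else "acceptor"


for variant in ["c.1896+1G>C", "c.588+1G>C", "c.380+6G>T",
                "c.117-3C>T", "c.497T>G (p.Leu166Ter)"]:
    print(variant, "->", canonical_splice_site(variant))
```

On the table's entries, the two +1 substitutions are flagged as canonical donor changes, while c.380+6G>T, c.117-3C>T, and c.117-6T>A sit outside the canonical dinucleotides, which fits their conflicting classifications.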

AI-based Predictions

SpliceAI Predictions: 2,888 total variants

Top predictions by effect score:

  • Donor gain: c.1896+1G>C (0.99), c.614G>T (0.99), c.3633G>T (0.99)
  • Donor loss: c.1660+5_1667del (0.91), c.1410-2_1558+1del (0.91), c.588+15del (0.91)

AlphaMissense (Likely Pathogenic): ~100+ variants

Variant | Position | Score | Classification
L11P | 3:36993579 | 0.999 | Likely pathogenic
I8N | 3:36993570 | 0.998 | Likely pathogenic
R9P | 3:36993573 | 0.978 | Likely pathogenic
R10W | 3:36993575 | 0.871 | Likely pathogenic
V15E | 3:36993591 | 0.993 | Likely pathogenic
V16E | 3:36993594 | 0.992 | Likely pathogenic
I19N | 3:36993603 | 1.000 | Likely pathogenic
A20P | 3:36993605 | 0.998 | Likely pathogenic
A21P | 3:36993608 | 0.998 | Likely pathogenic
G22R | 3:36993611 | 0.998 | Likely pathogenic
E23K | 3:36993614 | 0.996 | Likely pathogenic
V24D | 3:36993618 | 0.999 | Likely pathogenic
I25N | 3:36993621 | 0.999 | Likely pathogenic
[+86 additional variants] | | 0.615–0.999 | Likely pathogenic
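
For reference, AlphaMissense scores are usually binned with the published cut-offs of 0.34 and 0.564. The sketch below applies those cut-offs to a few scores from the table; the constant and function names are illustrative.

```python
# Published AlphaMissense cut-offs: below 0.34 is likely benign,
# above 0.564 is likely pathogenic, anything in between is ambiguous.
LIKELY_BENIGN_BELOW = 0.34
LIKELY_PATHOGENIC_ABOVE = 0.564


def am_class(score: float) -> str:
    """Bin an am_pathogenicity score (0-1) into its three-way class."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score > LIKELY_PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_BELOW:
        return "likely_benign"
    return "ambiguous"


# A few scores from the table above, including the lowest listed value.
table_scores = {"L11P": 0.999, "R10W": 0.871, "I19N": 1.000, "lowest listed": 0.615}
for variant, score in table_scores.items():
    print(f"{variant}: {score:.3f} -> {am_class(score)}")
```

Every score in the table, down to the 0.615 lower bound of the additional variants, lies above the 0.564 cut-off, consistent with the "Likely pathogenic" label in each row.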

Pathways & Gene Ontology

Biological Pathways

Reactome Pathways (6 total)

ID | Pathway Name
R-HSA-5358565 | Mismatch repair (MMR) directed by MSH2:MSH6 (MutSalpha)
R-HSA-5358606 | Mismatch repair (MMR) directed by MSH2:MSH3 (MutSbeta)
R-HSA-5545483 | Defective Mismatch Repair Associated With MLH1
R-HSA-5632987 | Defective Mismatch Repair Associated With PMS2
R-HSA-6796648 | TP53 Regulates Transcription of DNA Repair Genes
R-HSA-912446 | Meiotic recombination

MSigDB Gene Sets (100 total)

MLH1 is annotated in 100 MSigDB gene sets, including Reactome pathway collections, KEGG pathways, Gene Ontology-based sets, and curated gene signatures. Key pathway-related entries include:

  • REACTOME_MISMATCH_REPAIR, REACTOME_DNA_REPAIR, REACTOME_DISEASES_OF_MISMATCH_REPAIR_MMR, REACTOME_REPRODUCTION, REACTOME_TRANSCRIPTIONAL_REGULATION_BY_TP53
  • KEGG_MISMATCH_REPAIR, KEGG_PATHWAYS_IN_CANCER, KEGG_COLORECTAL_CANCER, KEGG_ENDOMETRIAL_CANCER

Total pathway membership: 6 Reactome + 100 MSigDB gene sets


Gene Ontology Annotations

Biological Process (19 terms)

GO ID | Term
GO:0000289 | Nuclear-transcribed mRNA poly(A) tail shortening
GO:0000712 | Resolution of meiotic recombination intermediates
GO:0006298 | Mismatch repair
GO:0006303 | Double-strand break repair via nonhomologous end joining
GO:0007060 | Male meiosis chromosome segregation
GO:0007129 | Homologous chromosome pairing at meiosis
GO:0007283 | Spermatogenesis
GO:0008630 | Intrinsic apoptotic signaling pathway in response to DNA damage
GO:0009617 | Response to bacterium
GO:0016321 | Female meiosis chromosome segregation
GO:0016446 | Somatic hypermutation of immunoglobulin genes
GO:0043060 | Meiotic metaphase I homologous chromosome alignment
GO:0045141 | Meiotic telomere clustering
GO:0045190 | Isotype switching
GO:0045950 | Negative regulation of mitotic recombination
GO:0048298 | Positive regulation of isotype switching to IgA isotypes
GO:0048304 | Positive regulation of isotype switching to IgG isotypes
GO:0048477 | Oogenesis
GO:0051257 | Meiotic spindle midzone assembly

Molecular Function (7 terms)

GO ID | Term
GO:0003682 | Chromatin binding
GO:0005524 | ATP binding
GO:0008047 | Enzyme activator activity
GO:0016887 | ATP hydrolysis activity
GO:0019899 | Enzyme binding
GO:0032137 | Guanine/thymine mispair binding
GO:0140664 | ATP-dependent DNA damage sensor activity

Cellular Component (9 terms)

GO ID | Term
GO:0000795 | Synaptonemal complex
GO:0001673 | Male germ cell nucleus
GO:0005634 | Nucleus
GO:0005654 | Nucleoplasm
GO:0005694 | Chromosome
GO:0005712 | Chiasma
GO:0005715 | Late recombination nodule
GO:0016020 | Membrane
GO:0032389 | MutLalpha complex

Protein interactions & networks

Protein-Protein Interactions (PPIs)

Total interaction counts:

  • STRING: 3,428 interactions
  • IntAct: 514 interactions
  • BioGRID: 491 interactions
  • DIP: 1 interaction

TOP 30 highest-confidence STRING interactors with scores (0-1000 scale):

Rank | Gene | UniProt | Score | Evidence
1 | MSH3 | P20585 | 999 | Mismatch repair complex
2 | MSH2 | P43246 | 999 | Mismatch repair complex
3 | MSH6 | P52701 | 999 | Mismatch repair complex
4 | EXO1 | Q9UQ84 | 998 | DNA repair pathway
5 | MLH3 | Q9UHC1 | 993 | Mismatch repair
6 | PMS1 | P54277 | 993 | Mismatch repair complex
7 | PMS2 | P54278 | 993 | Mismatch repair complex
8 | ATM | Q13315 | 992 | DNA damage response
9 | BRIP1 | Q9BX63 | 951 | Fanconi anemia pathway
10 | BRCA1 | P38398 | 930 | DNA repair
11 | MBD4 | O95243 | 921 | Base excision repair
12 | NEIL1 | Q96FI4 | 914 | Base excision repair
13 | BRCA2 | P51587 | 905 | DNA repair
14 | BRAF | P15056 | 876 | Signaling pathway
15 | FANCD2 | Q9BXW9 | 874 | Fanconi anemia pathway
16 | KRAS | P01116 | 868 | Signaling
17 | TP53 | P04637 | 868 | Tumor suppressor
18 | MSH4 | O15457 | 856 | Mismatch repair
19 | CDKN2A | P42771 | 856 | Cell cycle control
20 | MGMT | P16455 | 853 | DNA repair
21 | POLE | Q07864 | 852 | DNA polymerase
22 | FAN1 | Q9Y2M0 | 844 | Fanconi anemia pathway
23 | RAD51D | O75771 | 838 | Homologous recombination
24 | EPCAM | P16422 | 837 | Cell adhesion
25 | RFC4 | P35249 | 826 | Replication factor
26 | RFC2 | P35250 | 825 | Replication factor
27 | MSH5 | O43196 | 824 | Mismatch repair
28 | CHK2 | O96017 | 824 | DNA damage checkpoint
29 | PTEN | P60484 | 811 | Phosphatase/tumor suppressor
30 | BLM | P54132 | 808 | DNA helicase
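
STRING combined scores on the 0-1000 scale are often rescaled to 0-1 and read against the conventional 0.4 / 0.7 / 0.9 cut-offs (medium, high, highest confidence). The sketch below applies that reading to a subset of the rows above; the dictionary and function names are illustrative.

```python
# Subset of combined scores (0-1000 scale) copied from the table above.
STRING_SCORES = {
    "MSH3": 999, "MSH2": 999, "MSH6": 999, "EXO1": 998, "ATM": 992,
    "BRCA1": 930, "BRCA2": 905, "BRAF": 876, "TP53": 868, "BLM": 808,
}


def confidence_tier(score: int) -> str:
    """Classify a 0-1000 STRING combined score using the usual cut-offs."""
    if not 0 <= score <= 1000:
        raise ValueError(f"score must be between 0 and 1000, got {score}")
    scaled = score / 1000
    if scaled >= 0.9:
        return "highest"
    if scaled >= 0.7:
        return "high"
    if scaled >= 0.4:
        return "medium"
    return "low"


highest = [gene for gene, score in STRING_SCORES.items()
           if confidence_tier(score) == "highest"]
print("highest-confidence partners:", ", ".join(highest))
print("BRAF tier:", confidence_tier(STRING_SCORES["BRAF"]))
```

Under these cut-offs all 30 listed interactors are at least "high" confidence, and ranks 1 to 13 (scores of 905 and above) fall in the "highest" tier.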

Top IntAct interactions with confidence scores:

  • PMS2: 0.970 (direct interaction)
  • MYOG: 0.890
  • RADX: 0.870
  • ZC3H11A: 0.870
  • KPNA2: 0.830
  • PMS1: 0.830
  • CBY2: 0.790-0.800
  • AP2B1: 0.790
  • MAGEA8: 0.780
  • MLH3: 0.780
  • MSH3: 0.740
  • TMSB4X: 0.730
  • TASOR2: 0.720
  • CCDC33: 0.670
  • SKP2: 0.560
  • HSPA8: 0.500

Protein Similarity

TOP 20 proteins by ESM2 embedding similarity (0-1.0 scale, max similarity = 1.0):

Rank | UniProt | Top Similarity | Avg Similarity | Organism
1 | P43246 | 0.9999 | 0.9384 | Human MSH2
2 | Q13614 | 0.9995 | 0.9504 | Human
3 | Q5XXB5 | 0.9999 | 0.9384 | Ortholog
4 | B2KI88 | 0.9988 | 0.9520 | Cross-species
5 | Q02440 | 0.9990 | 0.9382 | Cross-species
6 | P24860 | 0.9986 | 0.9370 | Cross-species
7 | P37882 | 0.9986 | 0.9359 | Cross-species
8 | O95292 | 0.9986 | 0.9447 | Human
9 | P30277 | 0.9984 | 0.9367 | Cross-species
10 | Q1LZG6 | 0.9984 | 0.9371 | Cross-species
11 | O43502 | 0.9976 | 0.9410 | Human RAD51C
12 | A6QNT8 | 0.9978 | 0.9246 | Cross-species
13 | O95486 | 0.9978 | 0.9213 | Human SEC24A
14 | Q9S9N4 | 0.9977 | 0.9340 | Cross-species
15 | P14635 | 0.9985 | 0.9305 | Human CCNB1
16 | E1C6Q1 | 0.9871 | 0.9454 | Cross-species
17 | Q3U2P1 | 0.9966 | 0.9268 | Cross-species
18 | Q5ZIV1 | 0.9920 | 0.9386 | Cross-species
19 | F4JL28 | 0.9302 | 0.9217 | Cross-species
20 | Q3MHE4 | 0.9987 | 0.9413 | Cross-species

TOP 15 sequence homologs (DIAMOND, % identity / bit score):

Rank | UniProt | Identity (%) | Bit Score | Description
1 | B5RL36 | 99.30 | 1193.0 | MLH1 ortholog
2 | B5RR29 | 99.30 | 1191.0 | MLH1 ortholog
3 | P97679 | 91.30 | 1349.0 | Mouse MLH1
4 | P54278 | 97.40 | 1274.0 | PMS2 (human)
5 | Q9JK91 | 91.30 | 1351.0 | Cross-species MLH1
6 | P40692 | 88.40 | 1305.0 | Human MLH1 (self)
7 | O51229 | 92.30 | 1112.0 | Cross-species
8 | Q0SNV1 | 92.50 | 1112.0 | Cross-species
9 | Q662F3 | 92.50 | 1115.0 | Cross-species
10 | A1QZ05 | 85.60 | 1051.0 | Cross-species
11 | P54280 | 50.30 | 421.0 | PMS1 homolog
12 | Q54KD8 | 40.20 | 569.0 | MLH1 homolog
13 | P38920 | 39.70 | 467.0 | MLH1 variant
14 | Q9ZRV4 | 39.20 | 510.0 | MLH1 homolog
15 | Q9P7W6 | 37.30 | 438.0 | Distant homolog

Transcription factor regulatory data

MLH1 is not a transcription factor. MLH1 encodes the DNA mismatch repair protein MLH1, a core component of the mismatch repair pathway involved in maintaining genomic stability.

Upstream regulators (TFs that regulate MLH1)

19 transcription factors regulate MLH1 (from CollecTRI database):

Regulator | Regulation type | Evidence/Confidence
TP53 | Activation | CollecTRI
BHLHE41 | Repression | High confidence
CEBPZ | Repression | High confidence
DNMT3A | Repression | CollecTRI
GLI1 | Unknown | High confidence
GLI2 | Unknown | High confidence
BHLHE40 | Unknown | CollecTRI
BRIP1 | Unknown | CollecTRI
E2F4 | Unknown | CollecTRI
HIF1A | Unknown | CollecTRI
MAFG | Unknown | CollecTRI
WT1 | Repression | Low confidence
HOXA5 | Unknown | High confidence
MLXIP | Unknown | High confidence
CTNNBL1 | Unknown | Low confidence
DNMT1 | Unknown | Low confidence
ESR2 | Unknown | Low confidence
HOXD1 | Unknown | Low confidence
TP73 | Unknown | Low confidence

Note: TP53 (p53) is the only regulator annotated as an activator, linking MLH1 transcription to a major tumor suppressor pathway. Two DNA methyltransferases also appear in the list: DNMT3A is annotated as a repressor, while the direction for DNMT1 is unknown. Both may be relevant to epigenetic regulation of mismatch repair capacity.

Drug & pharmacology data

MLH1 is NOT currently a known drug target.

There are:

  • Zero ChEMBL molecules targeting MLH1 as a direct target
  • Zero clinical trials with drugs specifically targeting MLH1
  • No approved or investigational drugs with MLH1 as a mechanism of action

Pharmacogenomic status:

  • MLH1 is a VIP (Very Important Pharmacogene) in PharmGKB (PA240)
  • Has variant annotations related to genetic predisposition (Lynch syndrome/HNPCC)
  • No CPIC dosing guidelines exist for MLH1-guided therapy
  • Clinical significance: MLH1 mutations are associated with hereditary nonpolyposis colorectal cancer (HNPCC), but this role is for genetic risk stratification, not as a therapeutic target

Summary: MLH1 is recognized as an important gene for understanding cancer predisposition and genetic disease risk, but it is not targeted by small molecule drugs, biologics, or other therapeutics currently in development. Its clinical relevance is primarily in germline variant screening for Lynch syndrome rather than as a pharmacological target.

Expression profiles

Tissue Expression (Bgee - 296/300 conditions)

Rank | Tissue | Expression Score | Quality
1 | Tibialis anterior | 94.44 | Gold
2 | Skeletal muscle tissue of rectus abdominis | 94.42 | Gold
3 | Deltoid | 94.37 | Gold
4 | Left ventricle myocardium | 94.30 | Gold
5 | Heart left ventricle | 93.97 | Gold
6 | Cardiac ventricle | 93.90 | Gold
7 | Primordial germ cell in gonad | 93.85 | Gold
8 | Vastus lateralis | 93.84 | Gold
9 | Quadriceps femoris | 93.78 | Gold
10 | Apex of heart | 93.60 | Gold
11 | Skeletal muscle tissue | 93.35 | Gold
12 | Heart right ventricle | 93.35 | Gold
13 | Ganglionic eminence | 93.22 | Gold
14 | Muscle organ | 92.99 | Gold
15 | Skeletal muscle organ | 92.99 | Gold
16 | Biceps brachii | 92.97 | Gold
17 | Muscle tissue | 92.79 | Gold
18 | Heart | 92.78 | Gold
19 | Muscle of leg | 92.75 | Gold
20 | Ventricular zone | 92.73 | Gold
21 | Tibia | 92.69 | Gold
22 | Calcaneal tendon | 92.62 | Gold
23 | Gastrocnemius | 92.53 | Gold
24 | Diaphragm | 92.49 | Gold
25 | Pituitary gland | 92.48 | Gold
26 | Triceps brachii | 92.45 | Gold
27 | Adenohypophysis | 92.41 | Gold
28 | Skin of hip | 92.34 | Gold
29 | Right atrium auricular region | 92.32 | Gold
30 | Corpus callosum | 92.23 | Gold

Pattern notes:

  • Ubiquitous expression: detected in 296 of 300 surveyed conditions, with a mean score of 87.57
  • Skeletal muscle enrichment: the top 3 tissues are skeletal muscles (tibialis anterior, rectus abdominis, deltoid)
  • Cardiac expression: heart tissues rank highly (scores 92.32–94.30). Adult cardiac and skeletal muscle are largely post-mitotic, so these ranks should not be read as evidence of replication-coupled mismatch repair
  • Developmental tissues: primordial germ cells (#7, 93.85) and neural progenitor regions (ganglionic eminence, #13, 93.22; ventricular zone, #20, 92.73) show strong expression, consistent with MLH1’s roles in meiosis and in repair during active proliferation

Single-Cell Expression (SCXA - Single Cell Expression Atlas)

Datasets:

  • Primary dataset: E-MTAB-2983 – “Functional germ line stem cells do not exist in adult mammalian ovaries” (38 cells)
    • Reflects strong expression in female germ tissue identified in tissue data
  • Total experiments: 3 datasets with marker status
  • Cell clusters analyzed: 238 cell populations
  • Expression range: Mean expression 0.009–381.53 (maximum in specific cell clusters)

Notable populations: Expression is marked in germ line/gonadal tissues and tissues undergoing high proliferation, consistent with MLH1’s role in DNA mismatch repair and meiosis (required for proper meiotic recombination).

Cell Type Expression

The Bgee annotation references 16 distinct cell type categories (Cell Ontology). While detailed single-cell cluster annotations are limited in biobtree, the top tissues above correspond to:

  • Myocytes (skeletal and cardiac muscle cells) – dominant
  • Germ line cells (oocytes, spermatogonia) – specifically marked
  • Neural progenitor cells (ventricular zone, ganglionic eminence)

In germline and neural progenitor populations, MLH1 expression fits its roles in meiosis and in mismatch repair during DNA synthesis. The equally strong signal in post-mitotic muscle indicates that expression is not restricted to dividing cells.

Disease associations

Mendelian / Monogenic Diseases

Disease | Disease ID | Inheritance | Evidence Level
Lynch syndrome | OMIM:120435, OMIM:609310, MONDO:0005835, Orphanet:144 | Autosomal dominant | Definitive/Strong
Lynch syndrome 1 | OMIM:120435, MONDO:0007356 | Autosomal dominant | Strong/Definitive
Lynch syndrome 2 | OMIM:609310, MONDO:0012249 | Autosomal dominant | Definitive/Strong
Muir-Torre syndrome | OMIM:158320, MONDO:0008018, Orphanet:587 | Autosomal dominant | Definitive/Strong
Mismatch repair cancer syndrome 1 | OMIM:276300, MONDO:0010159, Orphanet:252202 | Autosomal recessive | Definitive/Strong
Constitutional mismatch repair deficiency syndrome | Orphanet:252202 | Autosomal recessive | Supportive
Ovarian cancer | MONDO:0008170 | Autosomal dominant | Strong
Pancreatic cancer | MONDO:0009831 | Autosomal dominant | Moderate
Breast cancer | MONDO:0007254 | Autosomal dominant | Disputed/Strong
Prostate cancer | MONDO:0008315 | Autosomal dominant | Limited
Rhabdomyosarcoma | MONDO:0005212 | Autosomal recessive | Moderate
Colorectal cancer (hereditary nonpolyposis) | MONDO:0018630 | Autosomal dominant | Definitive
Endometrial carcinoma | MONDO:0002447 | Autosomal dominant | Definitive

Additional associated malignancies via ClinVar: colon carcinoma, gastric cancer, lung cancer, bile duct cancer, squamous cell carcinoma


Clinical Phenotypes (HPO Terms) - Top 30

HPO Term | HPO ID
Autosomal dominant inheritance | HP:0000006
Autosomal recessive inheritance | HP:0000007
Breast carcinoma | HP:0003002
Colon cancer | HP:0003003
Endometrial carcinoma | HP:0012114
Ovarian neoplasm | HP:0100615
Neoplasm of the stomach | HP:0006753
Neoplasm of the pancreas | HP:0002894
Rhabdomyosarcoma | HP:0002859
Adenocarcinoma of the colon | HP:0040276
Neoplasm of the rectum | HP:0100743
Neoplasm of the liver | HP:0002896
Hepatocellular carcinoma | HP:0001402
Lymphoma | HP:0002665
Non-Hodgkin lymphoma | HP:0012539
Basal cell carcinoma | HP:0002671
Adenoma sebaceum | HP:0009720
Sebaceous gland carcinoma | HP:0030410
Neoplasm of the thyroid gland | HP:0100031
Laryngeal carcinoma | HP:0012118
Salivary gland neoplasm | HP:0100684
Renal neoplasm | HP:0009726
Urinary tract neoplasm | HP:0010786
Hematological neoplasm | HP:0004377
Medulloblastoma | HP:0002885
Glioblastoma multiforme | HP:0012174
Astrocytoma | HP:0009592
Neoplasm of the skeletal system | HP:0010622
Adenocarcinoma of the small intestine | HP:0040274
Intestinal polyposis | HP:0200008

Complex-Disease / GWAS Associations - Top 7

Trait/Disease | Variant/Gene | Chromosome | P-value
Proximal colorectal cancer | MLH1 | chr3 | 4.0e-18
Platelet distribution width | MLH1 | chr3 | 2.0e-18
Liver enzyme levels (alanine transaminase) | MLH1 | chr3 | 4.0e-15
Alanine aminotransferase levels | MLH1 | chr3 | 2.0e-08
Schizophrenia | TRANK1 | chr3 | 3.0e-11
Autism spectrum disorder or schizophrenia | HSPD1P6 - LINC02033 | chr3 | 1.0e-11
Subjective response to lithium treatment | HSPD1P6 - LINC02033 | chr3 | 8.0e-07

Structured Data Sources

Generated with Claude Haiku 4.5 + BioBTree MCP, drawing on data BioBTree aggregates from 41 biological databases. Every identifier and figure traces to one of the 130 reproducible API calls behind this report.

Datasets: alphafold, alphamissense, antibody, bgee, biogrid_interaction, ccds, chembl_target, cl, clinical_trials, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gtex, gwas, hgnc, hpa, hpo, intact, interpro, mim, mondo, msigdb, orphanet, pdb, pfam, pharmgkb_gene, reactome, refseq, scxa, scxa_expression, spliceai, string_interaction, transcript, uberon, uniprot
Generated: 2026-05-25. For the latest data, query BioBTree directly via MCP or API.