SOHLH2

gene
Also known as FLJ20449, TEB1, bHLHe81, SPATA28

Summary

SOHLH2 (spermatogenesis and oogenesis specific basic helix-loop-helix 2, HGNC:26026) is a protein-coding gene on chromosome 13q13.3, encoding Spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein 2 (Q9NX45). Transcription regulator of both male and female germline differentiation.

This gene encodes one of the testis-specific transcription factors that are essential for spermatogenesis, oogenesis and folliculogenesis. This gene is located on chromosome 13. The proteins encoded by this gene and another testis-specific transcription factor, SOHLH1, can form heterodimers in addition to homodimers. There is a read-through locus (GeneID: 100526761) that shares sequence identity with this gene and the upstream CCDC169 (GeneID: 728591). Alternatively spliced transcript variants encoding different isoforms have been found for this gene.

Source: NCBI Gene 54937 — RefSeq curated summary.

At a glance

  • Gene–disease (curated): inherited primary ovarian failure (Limited, GenCC)
  • GWAS associations: 3
  • MANE Select transcript: NM_017826

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:26026
Approved symbol | SOHLH2
Name | spermatogenesis and oogenesis specific basic helix-loop-helix 2
Location | 13q13.3
Locus type | gene with protein product
Status | Approved
Aliases | FLJ20449, TEB1, bHLHe81, SPATA28
Ensembl gene | ENSG00000120669
Ensembl biotype | protein_coding
OMIM | 616066
Entrez | 54937

Gene structure

Transcript identifiers

Ensembl transcripts: 2 — 2 protein_coding

ENST00000317764, ENST00000379881

RefSeq mRNA: 2 (MANE Select: NM_017826): NM_001282147, NM_017826

CCDS: CCDS61309, CCDS9355

Canonical transcript exons

ENST00000379881 — 11 exons

Exon | Start | End
ENSE00001482811 | 36168217 | 36169054
ENSE00001860914 | 36214479 | 36214556
ENSE00003468475 | 36174476 | 36174567
ENSE00003545937 | 36173692 | 36173810
ENSE00003575063 | 36191795 | 36191894
ENSE00003581191 | 36170531 | 36170787
ENSE00003589590 | 36201879 | 36202093
ENSE00003591714 | 36174722 | 36174869
ENSE00003595139 | 36193621 | 36193725
ENSE00003633069 | 36193806 | 36193867
ENSE00003637676 | 36189946 | 36190056
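
The exon coordinates above can be turned into lengths and put in genomic order with a short script. This is a minimal sketch, assuming the Start/End values are 1-based, inclusive genomic coordinates (the usual Ensembl convention; the page itself does not state it), so each length is end - start + 1. The names `EXONS`, `exon_length` and `sorted_by_position` are illustrative, not part of any database API.

```python
# Exon coordinates for ENST00000379881, copied from the table above.
# Assumption: coordinates are 1-based and inclusive on chromosome 13.
EXONS = [
    ("ENSE00001482811", 36168217, 36169054),
    ("ENSE00001860914", 36214479, 36214556),
    ("ENSE00003468475", 36174476, 36174567),
    ("ENSE00003545937", 36173692, 36173810),
    ("ENSE00003575063", 36191795, 36191894),
    ("ENSE00003581191", 36170531, 36170787),
    ("ENSE00003589590", 36201879, 36202093),
    ("ENSE00003591714", 36174722, 36174869),
    ("ENSE00003595139", 36193621, 36193725),
    ("ENSE00003633069", 36193806, 36193867),
    ("ENSE00003637676", 36189946, 36190056),
]


def exon_length(start: int, end: int) -> int:
    """Length in bp of an interval given as 1-based inclusive coordinates."""
    return end - start + 1


def sorted_by_position(exons):
    """Return the exons ordered by ascending genomic start."""
    return sorted(exons, key=lambda exon: exon[1])


if __name__ == "__main__":
    for exon_id, start, end in sorted_by_position(EXONS):
        print(exon_id, start, end, exon_length(start, end))
    total = sum(exon_length(start, end) for _, start, end in EXONS)
    print("total exonic bp:", total)
```

The total counts every exonic base, including untranslated regions, so it describes the spliced transcript rather than the coding sequence.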

Expression profiles

Bgee: expression breadth ubiquitous, 151 present calls, max score 99.61.

FANTOM5 (CAGE): breadth tissue_specific, TPM avg 0.4755 / max 125.2234, expressed in 66 samples.

FANTOM5 promoters (2 alternative TSS)

Promoter ID | TPM avg | Samples expressed
136775 | 0.4407 | 64
136776 | 0.0348 | 13

Top tissues by expression

274 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
secondary oocyte | CL:0000655 | 99.61 | gold quality
oocyte | CL:0000023 | 99.50 | gold quality
sperm | CL:0000019 | 94.64 | gold quality
male germ cell | CL:0000015 | 91.57 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 85.36 | gold quality
left testis | UBERON:0004533 | 82.04 | gold quality
primordial germ cell in gonad | CL:0000670 ∩ UBERON:0000991 | 81.72 | gold quality
right testis | UBERON:0004534 | 81.68 | gold quality
testis | UBERON:0000473 | 81.56 | gold quality
cortical plate | UBERON:0005343 | 73.19 | gold quality
adult organism | UBERON:0007023 | 71.13 | gold quality
muscle layer of sigmoid colon | UBERON:0035805 | 63.62 | gold quality
placenta | UBERON:0001987 | 60.52 | gold quality
nucleus accumbens | UBERON:0001882 | 60.27 | gold quality
prefrontal cortex | UBERON:0000451 | 59.40 | gold quality
smooth muscle tissue | UBERON:0001135 | 58.85 | gold quality
Brodmann (1909) area 9 | UBERON:0013540 | 58.33 | gold quality
primary visual cortex | UBERON:0002436 | 58.31 | gold quality
anterior cingulate cortex | UBERON:0009835 | 58.31 | gold quality
cingulate cortex | UBERON:0003027 | 58.27 | gold quality
corpus epididymis | UBERON:0004359 | 57.61 | gold quality
hypothalamus | UBERON:0001898 | 57.42 | gold quality
right adrenal gland | UBERON:0001233 | 57.22 | gold quality
seminal vesicle | UBERON:0000998 | 56.75 | silver quality
Brodmann (1909) area 23 | UBERON:0013554 | 56.24 | silver quality
dorsolateral prefrontal cortex | UBERON:0009834 | 56.22 | gold quality
cauda epididymis | UBERON:0004360 | 56.11 | silver quality
neocortex | UBERON:0001950 | 55.87 | gold quality
urinary bladder | UBERON:0001255 | 55.60 | gold quality
right frontal lobe | UBERON:0002810 | 55.45 | gold quality

Single-cell (SCXA)

Detected in 2 experiment(s), a significant marker in 0.

Experiment | Marker? | Max mean expression
E-ENAD-17 | no | 182.20
E-ANND-3 | no | 2.29

Regulation

Is transcription factor: yes

Downstream targets (CollecTRI)

2 targets.

Target | Regulation
CXCL8 | Activation
KIT | Activation

JASPAR motifs

Motif | Name | Family
MA1560.1 | SOHLH2 | PAS domain factors
MA1560.2 | SOHLH2 | PAS domain factors

JASPAR matrix evidence (PMIDs): PMID:26869299

miRNA regulators (miRDB)

46 targeting SOHLH2, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-432-3P | 100.00 | 67.86 | 705
HSA-MIR-1193 | 100.00 | 65.93 | 529
HSA-MIR-574-5P | 100.00 | 66.01 | 989
HSA-MIR-6077 | 99.99 | 68.04 | 2299
HSA-MIR-651-3P | 99.94 | 73.48 | 5177
HSA-MIR-335-3P | 99.93 | 73.36 | 4958
HSA-MIR-8063 | 99.91 | 69.76 | 3146
HSA-MIR-5680 | 99.91 | 69.83 | 3421
HSA-MIR-5582-3P | 99.86 | 72.48 | 4221
HSA-MIR-548AR-3P | 99.85 | 71.26 | 3889
HSA-MIR-1321 | 99.84 | 65.30 | 1811
HSA-MIR-4739 | 99.84 | 65.25 | 1832
HSA-MIR-4756-5P | 99.84 | 64.98 | 1809
HSA-MIR-548AZ-3P | 99.82 | 70.56 | 3549
HSA-MIR-548BC | 99.82 | 70.61 | 3524
HSA-MIR-548E-3P | 99.82 | 70.59 | 3514
HSA-MIR-548F-3P | 99.82 | 70.59 | 3540
HSA-MIR-6817-3P | 99.79 | 68.35 | 2126
HSA-MIR-370-5P | 99.78 | 66.81 | 706
HSA-MIR-3934-3P | 99.76 | 65.51 | 1351
HSA-MIR-4645-3P | 99.76 | 69.33 | 993
HSA-MIR-548A-3P | 99.76 | 70.58 | 3524
HSA-MIR-3158-5P | 99.65 | 67.51 | 1763
HSA-MIR-3679-3P | 99.64 | 69.88 | 1599
HSA-MIR-6513-3P | 99.59 | 69.77 | 1102
HSA-MIR-4276 | 99.56 | 67.66 | 2514
HSA-MIR-7106-5P | 99.53 | 67.47 | 3574
HSA-MIR-12123 | 99.52 | 71.79 | 2990
HSA-MIR-6833-5P | 99.50 | 68.93 | 1161
HSA-MIR-143-3P | 99.49 | 69.05 | 1457
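
Because a lower target_count means a more specific miRNA, it can be useful to re-rank the rows by that column instead of by max score. The following is a small sketch using six rows copied from the table; the tuple layout and the helper `most_specific` are illustrative choices, not a miRDB interface.

```python
# (miRNA, max_score, avg_score, target_count), a subset of the table above.
ROWS = [
    ("HSA-MIR-432-3P", 100.00, 67.86, 705),
    ("HSA-MIR-1193", 100.00, 65.93, 529),
    ("HSA-MIR-574-5P", 100.00, 66.01, 989),
    ("HSA-MIR-6077", 99.99, 68.04, 2299),
    ("HSA-MIR-651-3P", 99.94, 73.48, 5177),
    ("HSA-MIR-370-5P", 99.78, 66.81, 706),
]


def most_specific(rows, n=3):
    """Return the n rows with the fewest total targets.

    Ties on target_count are broken by the higher max score.
    """
    return sorted(rows, key=lambda row: (row[3], -row[1]))[:n]


if __name__ == "__main__":
    for name, max_score, _avg, target_count in most_specific(ROWS):
        print(name, max_score, target_count)
```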

Literature-anchored findings (GeneRIF, showing 12)

  • Our identification of novel variants in the SOHLH2 gene, in women with POF of both Chinese and Serbian origin, strongly suggests an important role for SOHLH2 in human POF etiology. (PMID:24524832)
  • Low Sohlh2 expression is associated with ovarian cancer. (PMID:24858206)
  • The polymorphisms rs1328626 and rs6563386 of the SOHLH2 gene would be the genetic risk factors for nonobstructive azoospermia in the Chinese population. The SNP rs1328641 might influence testes development in the NOA patients. (PMID:25463635)
  • Results identified MMP9 as a novel target for transcriptional inactivation by Sohlh2 and demonstrated that the Sohlh2 downregulation of MMP9 is critical for inhibiting human ovarian cancer cell invasion. (PMID:26153894)
  • the expression of Sohlh genes in human tissues (PMID:26375665)
  • sohlh2 functions as a tumor metastasis suppressor via suppressing IL-8 expression in breast cancer (PMID:27384482)
  • Data found that sohlh2 overexpression inhibited breast cancer cell proliferation in vitro and tumor growth in vivo. In contrast, sohlh2 silencing induced the opposite effects. These functional effects of sohlh2 were exerted through inhibiting Wnt/beta-catenin signaling by the increase of APC expression. These findings provide a novel mechanistic role of sohlh2 in breast tumorigenesis. (PMID:30720232)
  • Sohlh2 alleviates malignancy of EOC cells under hypoxia via inhibiting the HIF1alpha/CA9 signaling pathway. (PMID:31318683)
  • Intronic variation of the SOHLH2 gene confers risk to male reproductive impairment. (PMID:32690270)
  • Germline variants at SOHLH2 influence multiple myeloma risk. (PMID:33875642)
  • SOHLH2 Suppresses Angiogenesis by Downregulating HIF1alpha Expression in Breast Cancer. (PMID:34158392)
  • Sohlh2 Regulates the Stemness and Differentiation of Colon Cancer Stem Cells by Downregulating LncRNA-H19 Transcription. (PMID:36287177)

Cross-species orthologs

2 orthologs

Organism | Symbol | Gene ID
mus_musculus | Sohlh2 | ENSMUSG00000027794
rattus_norvegicus | Sohlh2 | ENSRNOG00000038091

Paralogs (1): SOHLH1 (ENSG00000165643)

Protein

Protein identifiers

Spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein 2 | Q9NX45 (reviewed: Q9NX45)

All UniProt accessions (1): Q9NX45

UniProt curated annotations

Function. Transcription regulator of both male and female germline differentiation. Suppresses genes involved in spermatogonial stem cells maintenance, and induces genes important for spermatogonial differentiation. Coordinates oocyte differentiation without affecting meiosis I.

Subunit / interactions. Forms both hetero- and homodimers with SOHLH1.

Subcellular location. Nucleus. Cytoplasm.

Isoforms (3)

UniProt ID | Names | Canonical?
Q9NX45-1 | 1 | yes
Q9NX45-2 | 2 |
Q9NX45-3 | 3 |

RefSeq proteins (2): NP_001269076, NP_060296* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR011598 | bHLH_dom | Domain
IPR036638 | HLH_DNA-bd_sf | Homologous_superfamily
IPR039583 | TCFL5/SOLH1/2 | Family

Pfam: PF00010

UniProt features (10 total): splice variant 3, sequence conflict 3, sequence variant 2, chain 1, domain 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q9NX45-F1 | 58.36 | 0.11
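
To put the mean pLDDT in context, the value can be mapped onto the confidence bands used by the AlphaFold Protein Structure Database (above 90 very high, 70 to 90 confident, 50 to 70 low, below 50 very low). The sketch below assumes those bands; they are not stated on this page, and the handling of values exactly on a boundary is an arbitrary choice made here. The function name `plddt_band` is illustrative.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT value (0-100) to a confidence band.

    Assumption: bands follow the AlphaFold DB colour legend; values that
    fall exactly on a boundary are assigned to the lower band.
    """
    if not 0.0 <= plddt <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {plddt}")
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


if __name__ == "__main__":
    # Mean pLDDT reported above for AF-Q9NX45-F1.
    print(plddt_band(58.36))
```

A mean in the low band, together with a small very-high fraction, is consistent with a protein that has one well-predicted domain and long regions predicted with little confidence, although a mean alone cannot show where those regions are.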

Function

Pathways and Gene Ontology

Reactome pathways

0 pathways

MSigDB gene sets: 91 (showing top): GSE37336_LY6C_POS_VS_NEG_NAIVE_CD4_TCELL_UP, NKX25_02, GCANCTGNY_MYOD_Q6, GOBP_OOGENESIS, GOBP_MALE_GAMETE_GENERATION, HATADA_METHYLATED_IN_LUNG_CANCER_DN, chr13q13, RODWELL_AGING_KIDNEY_NO_BLOOD_DN, LIAO_METASTASIS, GOBP_CELLULAR_PROCESS_INVOLVED_IN_REPRODUCTION_IN_MULTICELLULAR_ORGANISM, GOBP_OOCYTE_DIFFERENTIATION, NKX22_01, GOBP_FEMALE_GAMETE_GENERATION, GOBP_DEVELOPMENTAL_PROCESS_INVOLVED_IN_REPRODUCTION, FREAC7_01

GO Biological Process (6): regulation of transcription by RNA polymerase II (GO:0006357), spermatogenesis (GO:0007283), oocyte differentiation (GO:0009994), cell differentiation (GO:0030154), regulation of DNA-templated transcription (GO:0006355), oogenesis (GO:0048477)

GO Molecular Function (8): RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), DNA-binding transcription factor activity, RNA polymerase II-specific (GO:0000981), protein homodimerization activity (GO:0042803), protein heterodimerization activity (GO:0046982), sequence-specific double-stranded DNA binding (GO:1990837), DNA binding (GO:0003677), DNA-binding transcription factor activity (GO:0003700), protein dimerization activity (GO:0046983)

GO Cellular Component (3): chromatin (GO:0000785), nucleus (GO:0005634), cytoplasm (GO:0005737)

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
regulation of DNA-templated transcription | 2
developmental process involved in reproduction | 2
RNA polymerase II transcription regulatory region sequence-specific DNA binding | 2
protein dimerization activity | 2
cellular anatomical structure | 2
transcription by RNA polymerase II | 1
male gamete generation | 1
cell differentiation | 1
oogenesis | 1
cellular developmental process | 1
DNA-templated transcription | 1
regulation of gene expression | 1
regulation of RNA biosynthetic process | 1
germ cell development | 1
female gamete generation | 1
cis-regulatory region sequence-specific DNA binding | 1
chromatin | 1
DNA-binding transcription factor activity | 1
regulation of transcription by RNA polymerase II | 1
identical protein binding | 1
double-stranded DNA binding | 1
sequence-specific DNA binding | 1
nucleic acid binding | 1
transcription cis-regulatory region binding | 1
transcription regulator activity | 1
protein binding | 1
chromosome | 1
intracellular membrane-bounded organelle | 1
intracellular anatomical structure | 1

Protein interactions and networks

STRING

584 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
SOHLH2 | SOHLH1 | Q5JUK2 | 919
SOHLH2 | NOBOX | O60393 | 748
SOHLH2 | FIGLA | Q6QHK4 | 734
SOHLH2 | STRA8 | Q7Z7C7 | 684
SOHLH2 | NANOS2 | P60321 | 684
SOHLH2 | LHX8 | Q68G74 | 682
SOHLH2 | DAZL | Q92904 | 657
SOHLH2 | GFRA1 | P56159 | 603
SOHLH2 | NANOS3 | P60323 | 597
SOHLH2 | DMRT1 | Q9Y5R6 | 597
SOHLH2 | ZBTB16 | Q05516 | 594
SOHLH2 | SYCP3 | Q8IZU3 | 565
SOHLH2 | GDF9 | O60383 | 556
SOHLH2 | TERT | O14746 | 550
SOHLH2 | RPA3 | P35244 | 545
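
The scores above are STRING combined scores multiplied by 1000, so dividing by 1000 recovers the usual 0 to 1 scale. As a minimal sketch, the snippet below rescales the listed partners and keeps those at or above 0.7, which STRING labels high confidence (0.4 is medium, 0.9 highest). The dictionary is copied from the table; the helper `high_confidence` is a hypothetical name, not a STRING API call.

```python
# Partner symbol -> STRING combined score (x1000), copied from the table above.
STRING_EDGES = {
    "SOHLH1": 919, "NOBOX": 748, "FIGLA": 734, "STRA8": 684, "NANOS2": 684,
    "LHX8": 682, "DAZL": 657, "GFRA1": 603, "NANOS3": 597, "DMRT1": 597,
    "ZBTB16": 594, "SYCP3": 565, "GDF9": 556, "TERT": 550, "RPA3": 545,
}


def high_confidence(edges, cutoff=0.7):
    """Rescale x1000 scores to 0-1 and keep partners at or above the cutoff."""
    return {
        partner: score / 1000
        for partner, score in edges.items()
        if score / 1000 >= cutoff
    }


if __name__ == "__main__":
    for partner, score in high_confidence(STRING_EDGES).items():
        print(f"SOHLH2 - {partner}: {score:.3f}")
```

At the 0.7 cutoff only SOHLH1, NOBOX and FIGLA remain from the rows shown; the other listed partners fall in the medium-confidence range.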

IntAct

3 interactions, top by confidence:

A | B | Type | Score
ELOA2 | SOHLH2 | psi-mi:“MI:0915”(physical association) | 0.370
MAPT | LANCL1 | psi-mi:“MI:0914”(association) | 0.350

BioGRID (1): SOHLH2 (Two-hybrid)

ESM2 similar proteins: A0JMR6, A1A4L6, A1YG61, A2T737, O70273, O75747, P01105, P10157, P11308, P13474, P14921, P15036, P15037, P15062, P18755, P19102, P26323, P27577, P41156, P41157, P41212, P57782, P81270, P97360, Q08AW4, Q15052, Q32LN0, Q3SZL0, Q3US16, Q58DT0, Q60641, Q6GPJ8, Q6P3D7, Q7ZYI3, Q8BZ05, Q8C7R7, Q8HWS3, Q8N8B7, Q8NDB2, Q8VDK3

Diamond homologs: Q3MHT3, Q6IUP1, Q9D489, Q9NX45

SIGNOR signaling

1 interactions.

A | Effect | B | Mechanism
SOHLH2 | “up-regulates quantity by expression” | KIT | “transcriptional regulation”

Disease & clinical

Clinical variants and AI predictions

ClinVar

0 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 0
Likely benign | 0
Benign | 0

Top pathogenic / likely-pathogenic (0)

SpliceAI

1459 predictions. Top by Δscore:

Variant | Effect | Δscore
13:36170788:C:CC | acceptor_gain | 1.0000
13:36173812:T:C | acceptor_gain | 1.0000
13:36174882:T:TC | acceptor_gain | 1.0000
13:36190052:CAGAG:C | acceptor_gain | 1.0000
13:36191890:CTTAA:C | acceptor_gain | 1.0000
13:36191895:C:CC | acceptor_gain | 1.0000
13:36201873:GTTTA:G | donor_loss | 1.0000
13:36201877:ACC:A | donor_loss | 1.0000
13:36202089:TTTGC:T | acceptor_gain | 1.0000
13:36202090:TTGC:T | acceptor_gain | 1.0000
13:36202091:TGC:T | acceptor_gain | 1.0000
13:36202092:GC:G | acceptor_gain | 1.0000
13:36202093:CC:C | acceptor_gain | 1.0000
13:36202093:CCTGA:C | acceptor_loss | 1.0000
13:36202094:C:CC | acceptor_gain | 1.0000
13:36202095:T:C | acceptor_loss | 1.0000
13:36214473:TCTTA:T | donor_loss | 1.0000
13:36214474:CTTAC:C | donor_loss | 1.0000
13:36214475:TTA:T | donor_loss | 1.0000
13:36214476:TA:T | donor_loss | 1.0000
13:36214478:C:CA | donor_loss | 1.0000
13:36214478:CCTGG:C | donor_gain | 1.0000
13:36170789:T:C | acceptor_gain | 0.9900
13:36170790:T:C | acceptor_gain | 0.9900
13:36173810:CCT:C | acceptor_gain | 0.9900
13:36173812:T:TC | acceptor_gain | 0.9900
13:36173971:T:TA | donor_gain | 0.9900
13:36174449:T:TA | donor_gain | 0.9900
13:36174475:CCG:C | donor_gain | 0.9900
13:36174882:T:C | acceptor_gain | 0.9900
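
The variant identifiers in this table follow a chrom:pos:ref:alt pattern, which makes them easy to split and to sort into substitutions, insertions and deletions. The sketch below assumes that pattern holds for every row; the `Variant` tuple and both helper names are illustrative and do not come from SpliceAI itself.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(text: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' identifier into typed fields."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(variant: Variant) -> str:
    """Classify a variant by comparing the lengths of its ref and alt alleles."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.ref) < len(variant.alt):
        return "insertion"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    return "MNV"  # equal-length, multi-base substitution


if __name__ == "__main__":
    for text in ("13:36170788:C:CC", "13:36190052:CAGAG:C", "13:36173812:T:C"):
        variant = parse_variant(text)
        print(text, variant_kind(variant))
```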

AlphaMissense

2767 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
13:36174869:C:A | R214S | 0.994
13:36174869:C:G | R214S | 0.994
13:36189948:T:A | R213S | 0.991
13:36189948:T:G | R213S | 0.991
13:36174781:C:G | A244P | 0.990
13:36174840:A:G | L224P | 0.990
13:36189946:C:G | R214T | 0.988
13:36174795:G:T | A239D | 0.987
13:36189949:C:G | R213T | 0.987
13:36174800:A:C | D237E | 0.986
13:36174800:A:T | D237E | 0.986
13:36174801:T:C | D237G | 0.985
13:36189946:C:A | R214M | 0.985
13:36174801:T:G | D237A | 0.984
13:36174802:C:G | D237H | 0.984
13:36174837:C:G | R225P | 0.984
13:36174801:T:A | D237V | 0.982
13:36174786:A:T | L242H | 0.978
13:36174786:A:G | L242P | 0.975
13:36189960:C:A | K209N | 0.975
13:36189960:C:G | K209N | 0.975
13:36174861:A:G | I217T | 0.974
13:36174863:T:A | R216S | 0.974
13:36174863:T:G | R216S | 0.974
13:36174535:A:C | F274L | 0.972
13:36174535:A:T | F274L | 0.972
13:36174537:A:G | F274L | 0.972
13:36174793:A:G | S240P | 0.972
13:36189957:T:A | E210D | 0.972
13:36189957:T:G | E210D | 0.972
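
The am_pathogenicity column can be mapped to AlphaMissense classes, and the protein-change column can be split into reference residue, position and substituted residue. This sketch assumes the published AlphaMissense cutoffs (below 0.34 likely benign, above 0.564 likely pathogenic, ambiguous in between), which are not stated on this page; the function names are illustrative.

```python
import re

# One-letter reference residue, position, one-letter substituted residue.
_PROTEIN_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_protein_change(change: str):
    """Split a change such as 'R214S' into ('R', 214, 'S')."""
    match = _PROTEIN_CHANGE.match(change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {change!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def am_class(score: float) -> str:
    """Classify a score using the assumed AlphaMissense cutoffs."""
    if score > 0.564:
        return "likely_pathogenic"
    if score < 0.34:
        return "likely_benign"
    return "ambiguous"


if __name__ == "__main__":
    for change, score in (("R214S", 0.994), ("D237E", 0.986), ("E210D", 0.972)):
        print(parse_protein_change(change), am_class(score))
```

Every row shown in the table scores well above the upper cutoff, which is expected for a list already filtered to the top likely-pathogenic predictions.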

dbSNP variants (sampled 300 via entrez): RS1000031280 (13:36211420 G>A), RS1000074576 (13:36216412 G>T), RS1000112672 (13:36178601 G>A,C), RS1000205408 (13:36183662 T>C), RS1000224269 (13:36191208 TGA>T,TGAGA), RS1000276524 (13:36190946 A>G), RS1000429308 (13:36190934 T>C), RS1000524051 (13:36215614 G>A), RS1000597442 (13:36169868 C>T), RS1000605261 (13:36192258 C>T), RS1000633320 (13:36209510 A>C), RS1000666988 (13:36176499 A>G), RS1000698092 (13:36176976 G>C,T), RS1000829294 (13:36168991 C>T), RS1000840908 (13:36196832 C>T)

Disease associations

OMIM: gene MIM:616066 | disease phenotypes:

GenCC curated gene-disease

Disease | Classification | Inheritance
inherited primary ovarian failure | Limited | Autosomal dominant

Mondo (1): inherited primary ovarian failure (MONDO:0019852)

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

3 associations (top):

Study | Trait | p-value
GCST001519_17 | Economic and political preferences | 6.000000e-06
GCST002726_51 | Glucose homeostasis traits | 4.000000e-07
GCST008399_20 | Cocaine dependence | 2.000000e-06
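
When reading these p-values it helps to compare them with the conventional genome-wide significance threshold of 5e-8. The sketch below applies that threshold to the three rows above; the threshold is a common convention rather than something this page specifies, and the names `ASSOCIATIONS` and `genome_wide_significant` are illustrative.

```python
# Conventional genome-wide significance threshold (an assumption here,
# not a value taken from this page).
GENOME_WIDE = 5e-8

# (study, trait, p-value), copied from the table above.
ASSOCIATIONS = [
    ("GCST001519_17", "Economic and political preferences", 6e-06),
    ("GCST002726_51", "Glucose homeostasis traits", 4e-07),
    ("GCST008399_20", "Cocaine dependence", 2e-06),
]


def genome_wide_significant(rows, alpha=GENOME_WIDE):
    """Return the rows whose p-value is below the chosen threshold."""
    return [row for row in rows if row[2] < alpha]


if __name__ == "__main__":
    hits = genome_wide_significant(ASSOCIATIONS)
    print(f"{len(hits)} of {len(ASSOCIATIONS)} associations pass p < {GENOME_WIDE:g}")
```

None of the three listed associations passes 5e-8, so they are better read as suggestive signals than as established associations.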

EFO canonical traits (2, from GWAS)

EFO ID | Trait name
EFO:0004827 | economic and social preference
EFO:0006832 | disposition index measurement

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

16 total (human), top 16 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Valproic Acid | affects expression, decreases expression, increases methylation | 3
Aflatoxin B1 | affects expression, increases methylation | 2
methylmercuric chloride | increases expression | 1
trichostatin A | increases expression, decreases expression | 1
3,4-dichloroaniline | decreases expression | 1
sodium arsenite | decreases expression | 1
CGP 52608 | affects binding, increases reaction | 1
incobotulinumtoxinA | decreases expression | 1
Rosiglitazone | increases expression | 1
Decitabine | increases expression | 1
Diuron | decreases expression | 1
Silver | decreases expression | 1
Tretinoin | decreases expression | 1
Cyclosporine | increases expression | 1
Antirheumatic Agents | increases expression | 1
Cadmium Chloride | decreases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.