SOX2

gene

Summary

SOX2 (SRY-box transcription factor 2, HGNC:11195) is a protein-coding gene on chromosome 3q26.33 that encodes transcription factor SOX-2 (UniProt P48431). The protein forms a trimeric complex with OCT4 on DNA and controls the expression of a number of genes involved in embryonic development, such as YES1, FGF4, UTF1 and ZFP206. The gene is haploinsufficient (ClinGen: sufficient evidence).

This intronless gene encodes a member of the SRY-related HMG-box (SOX) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate. The product of this gene is required for stem-cell maintenance in the central nervous system, and also regulates gene expression in the stomach. Mutations in this gene have been associated with optic nerve hypoplasia and with syndromic microphthalmia, a severe form of structural eye malformation. This gene lies within an intron of another gene called SOX2 overlapping transcript (SOX2OT).

Source: NCBI Gene 6657 — RefSeq curated summary.

At a glance

  • Gene–disease (curated): anophthalmia/microphthalmia-esophageal atresia syndrome (Definitive, GenCC), plus 2 more curated relationships
  • GWAS associations: 17
  • Clinical variants (ClinVar): 100 total — 22 pathogenic, 5 likely-pathogenic
  • Phenotypes (HPO): 76
  • Dosage sensitivity (ClinGen): haploinsufficiency sufficient evidence, triplosensitivity no evidence
  • Transcription factor: yes — 93 downstream targets (CollecTRI)
  • MANE Select transcript: NM_003106

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:11195
Approved symbol | SOX2
Name | SRY-box transcription factor 2
Location | 3q26.33
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000181449
Ensembl biotype | protein_coding
OMIM | 184429
Entrez | 6657

Gene structure

Transcript identifiers

Ensembl transcripts: 1 — 1 protein_coding

ENST00000325404

RefSeq mRNA: 1 (MANE Select: NM_003106)

CCDS: CCDS3239

Canonical transcript exons

ENST00000325404: 1 exon

Exon | Start | End
ENSE00001228743 | 181711925 | 181714436
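
As a quick check on the coordinates above, the length of the single exon follows directly from its start and end positions. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (the usual Ensembl convention); the function name is illustrative and not part of any database API.

```python
def exon_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive start/end coordinates (assumed)."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


# Coordinates of ENSE00001228743 as listed in the exon table.
length = exon_length(181711925, 181714436)
print(length)
```

Because the gene is intronless, this exon length is also the length of the full canonical transcript under the same assumption.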

Expression profiles

Bgee: expression breadth ubiquitous, 203 present calls, max score 99.60.

FANTOM5 (CAGE): breadth broad, TPM avg 35.4480 / max 7479.2049, expressed in 566 samples.

FANTOM5 promoters (7 alternative TSS)

Promoter ID | TPM avg | Samples expressed
40001 | 26.3903 | 545
40002 | 7.7743 | 473
40003 | 0.6426 | 205
40000 | 0.4455 | 196
40004 | 0.0851 | 44
203040 | 0.0651 | 39
40005 | 0.0450 | 12

Top tissues by expression

281 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
ventricular zone | UBERON:0003053 | 99.60 | gold quality
bronchial epithelial cell | CL:0002328 | 99.43 | gold quality
ganglionic eminence | UBERON:0004023 | 98.55 | gold quality
lateral globus pallidus | UBERON:0002476 | 98.50 | gold quality
epithelium of bronchus | UBERON:0002031 | 97.48 | gold quality
bronchus | UBERON:0002185 | 97.43 | gold quality
substantia nigra pars reticulata | UBERON:0001966 | 97.30 | gold quality
embryo | UBERON:0000922 | 97.28 | gold quality
dorsal motor nucleus of vagus nerve | UBERON:0002870 | 96.95 | gold quality
ventral tegmental area | UBERON:0002691 | 96.91 | gold quality
superior vestibular nucleus | UBERON:0007227 | 96.90 | gold quality
esophagus squamous epithelium | UBERON:0006920 | 96.89 | gold quality
medulla oblongata | UBERON:0001896 | 96.83 | gold quality
oral cavity | UBERON:0000167 | 96.72 | gold quality
subthalamic nucleus | UBERON:0001906 | 96.43 | gold quality
substantia nigra pars compacta | UBERON:0001965 | 96.28 | gold quality
mucosa of paranasal sinus | UBERON:0005030 | 95.84 | gold quality
endothelial cell | CL:0000115 | 95.58 | gold quality
inferior vagus X ganglion | UBERON:0005363 | 95.55 | gold quality
lateral nuclear group of thalamus | UBERON:0002736 | 95.54 | gold quality
nucleus accumbens | UBERON:0001882 | 95.25 | gold quality
epithelium of nasopharynx | UBERON:0001951 | 95.24 | gold quality
caudate nucleus | UBERON:0001873 | 95.19 | gold quality
amygdala | UBERON:0001876 | 95.19 | gold quality
nasal cavity mucosa | UBERON:0001826 | 94.90 | gold quality
olfactory segment of nasal mucosa | UBERON:0005386 | 94.85 | gold quality
entorhinal cortex | UBERON:0002728 | 94.69 | gold quality
temporal lobe | UBERON:0001871 | 94.68 | gold quality
globus pallidus | UBERON:0001875 | 94.39 | gold quality
inferior olivary complex | UBERON:0002127 | 93.62 | gold quality

Single-cell (SCXA)

Detected in 16 experiment(s), a significant marker in 11.

Experiment | Marker? | Max mean expression
E-HCAD-56 | yes | 1189.58
E-MTAB-10485 | yes | 912.71
E-HCAD-5 | yes | 901.78
E-MTAB-8271 | yes | 318.33
E-MTAB-9388 | yes | 35.71
E-GEOD-137537 | yes | 31.86
E-GEOD-84465 | yes | 29.94
E-MTAB-7316 | yes | 28.23
E-GEOD-135922 | yes | 23.02
E-MTAB-8410 | yes | 16.64
E-ANND-3 | yes | 12.11
E-MTAB-8894 | no | 2206.59
E-MTAB-11121 | no | 628.45
E-MTAB-10018 | no | 596.98
E-MTAB-6108 | no | 588.91

Regulation

Is transcription factor: yes

Downstream targets (CollecTRI)

93 targets.

Target | Regulation
ABCC3 | Activation
ABCC6 | Activation
ADAM10 | Activation
AGR2 | Activation
AHR | Activation
ASXL1 | Activation
ATOH1 | Unknown
BDNF | Activation
BIRC5 | Unknown
BMP4 | Repression
CCN2 | -
CCND1 | Unknown
CCNE1 | Activation
CDKN1A | Repression
CDKN1B | Unknown
CDX2 | Unknown
CTBP2 | Activation
CTNNB1 | Repression
DKK1 | Activation
DLGAP1 | Unknown
DPPA4 | Unknown
EGFR | Activation
FABP7 | Activation
FGF3 | Repression
FGF4 | Activation
FOXJ1 | Activation
GADD45B | Activation
GATA6 | Activation
GFAP | Activation
GLI2 | Activation

JASPAR motifs

Motif | Name | Family
MA0143.4 | SOX2 | SOX-related factors
MA0143.5 | SOX2 | SOX-related factors
MA1962.1 | POU2F1::SOX2 | POU domain factors::SOX-related factors

JASPAR matrix evidence (PMIDs): PMID:15863505, PMID:29335749

Upstream regulators (CollecTRI, top): AR, CDX2, CHD8, DACH1, EGFR, EMX2, FEZF1, FOXM1, FOXO1, GLI2, GSK3B, HMGA2, ID4, IRX4, KDM2A, KLF4, LEF1, MSX2, MYC, NEUROD1, NEUROG1, OTX2, PAX6, PITX3, POU2F1, POU3F1, POU5F1, PROX1, RBPJ, SALL4, SIX3, SOX2-OT, SOX2, SOX3, SOX4, SOX9, STAT3, TFAP2A

miRNA regulators (miRDB)

98 targeting SOX2, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-7110-3P | 100.00 | 73.18 | 2486
HSA-MIR-6748-5P | 100.00 | 65.81 | 1057
HSA-MIR-3163 | 100.00 | 77.23 | 8605
HSA-MIR-200B-3P | 100.00 | 73.31 | 2693
HSA-MIR-200C-3P | 100.00 | 73.35 | 2685
HSA-MIR-429 | 100.00 | 73.44 | 2698
HSA-MIR-340-5P | 100.00 | 72.50 | 4437
HSA-MIR-5011-5P | 100.00 | 83.46 | 5820
HSA-MIR-6833-3P | 100.00 | 70.63 | 3197
HSA-MIR-4282 | 99.99 | 75.36 | 6408
HSA-MIR-12136 | 99.98 | 72.81 | 5713
HSA-MIR-548N | 99.98 | 71.94 | 4170
HSA-MIR-607 | 99.97 | 73.62 | 5593
HSA-MIR-302C-5P | 99.97 | 72.56 | 3642
HSA-MIR-507 | 99.97 | 70.11 | 1915
HSA-MIR-548AJ-3P | 99.96 | 73.38 | 5345
HSA-MIR-548X-3P | 99.96 | 73.38 | 5345
HSA-MIR-557 | 99.96 | 70.01 | 1640
HSA-MIR-493-5P | 99.96 | 72.47 | 2382
HSA-LET-7C-3P | 99.95 | 73.42 | 2862
HSA-MIR-3912-5P | 99.95 | 66.11 | 925
HSA-MIR-548J-3P | 99.94 | 72.61 | 4881
HSA-MIR-6845-3P | 99.94 | 66.88 | 1439
HSA-MIR-548AE-3P | 99.93 | 72.66 | 4867
HSA-MIR-548AH-3P | 99.93 | 72.54 | 4872
HSA-MIR-548AM-3P | 99.93 | 72.54 | 4872
HSA-MIR-548AQ-3P | 99.93 | 72.66 | 4867
HSA-MIR-381-3P | 99.93 | 71.87 | 2854
HSA-MIR-552-5P | 99.93 | 68.56 | 1583
HSA-MIR-450B-5P | 99.92 | 71.48 | 3175
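
The note above says a lower target_count means a more specific miRNA. One way to use that, sketched here on a handful of rows as read from the table, is to rank by confidence first and break ties by specificity. The tuple layout and variable names are illustrative assumptions, not a miRDB API.

```python
# (miRNA, max_score, avg_score, target_count): a few rows as read from the table.
rows = [
    ("HSA-MIR-7110-3P", 100.00, 73.18, 2486),
    ("HSA-MIR-6748-5P", 100.00, 65.81, 1057),
    ("HSA-MIR-3163", 100.00, 77.23, 8605),
    ("HSA-MIR-200B-3P", 100.00, 73.31, 2693),
    ("HSA-MIR-3912-5P", 99.95, 66.11, 925),
]

# Highest max score first; among ties, fewest total targets (most specific) first.
ranked = sorted(rows, key=lambda r: (-r[1], r[3]))

for name, max_score, _avg, n_targets in ranked:
    print(f"{name}\t{max_score:.2f}\t{n_targets}")
```

Putting the score first keeps the table's own ordering principle and only uses target_count to separate miRNAs that share the same max score.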

Functional genomics

ClinGen dosage: haploinsufficiency 3 (sufficient evidence), triplosensitivity 0 (no evidence).

Literature-anchored findings (GeneRIF, showing 40)

  • SOX2 has a role in eye development, and its mutation can cause anophthalmia (PMID:12612584)
  • Sox-11 activates transcription more strongly than Sox-2; the transactivation domain of Sox-11 is primarily responsible for this capability (PMID:12637543)
  • Sox2 can dimerize onto DNA in a distinct conformational arrangement. (PMID:12923055)
  • OCT1 and SOX2 have roles in transcriptional activation of the Oct1.Sox2.Hoxb1-DNA ternary transcription factor complex (PMID:14559893)
  • Down-regulation of Sox2 may be an important factor for the development of intestinal metaplasia (PMID:14655050)
  • SOX2 may play a role in differentiation of the human gastric epithelium and may be involved in gastric carcinogenesis (PMID:14719100)
  • Sox2 may play an important role in maintaining a gastric phenotype in stomach cancers as well as in normal tissue. (PMID:15910596)
  • Identification of Sox2 as a novel c-myc BUR-binding protein was achieved through yeast one-hybrid screening and the Sox2/DNA interaction was confirmed by electrophoretic mobility shift assay and immunoprecipitation with Sox2 antibody. (PMID:15920534)
  • SOX2, together with OCT4 and NANOG, co-occupies a substantial portion of its target genes, and these factors collaborate to form regulatory circuitry consisting of autoregulatory and feedforward loops. (PMID:16153702)
  • Bilateral anophthalmia and brain malformations caused by a 20-bp deletion through a slipped-strand mispairing event in the SOX2 gene. (PMID:16283891)
  • Expression of SOX2 is involved in later events of pancreatic carcinogenesis. (PMID:16552336)
  • Increased SOX2 is associated with the pancreatobiliary phenotype of ampulla of Vater carcinoma and involved in later events in carcinogenesis, such as invasion and metastasis (PMID:16596179)
  • This review describes how cross regulation for PAX6, SOX2 and perhaps OTX2 has now been uncovered, pointing to the mechanisms that can fine-tune the expression of three such essential components in eye development. (PMID:16712695)
  • SOX2 is necessary for the normal development and function of the hypothalamo-pituitary and reproductive axes in both humans and mice. (PMID:16932809)
  • These data suggest that SOX2 plays an important role in regulation of pepsinogen A, and ectopic expression of SOX2 may be associated with abnormal differentiation of colorectal cancer cells. (PMID:17136346)
  • SOX2 mutation associated with AEG syndrome. (PMID:17219395)
  • Sox-2 immunoreactivity showed a nuclear labeling pattern and colocalised with GFAP-immunoreactive cells in the human subventricular zone. (PMID:17291498)
  • Sox2 is preferentially expressed in breast tumours with basal-like phenotype. (PMID:17334350)
  • Detection of anti-SOX2 T cells predicts favorable clinical outcome in patients with asymptomatic plasmaproliferative disorders (PMID:17389240)
  • Sox2 was observed in all cases of immature teratomas in primitive neuroepithelial tissues, but was not expressed in mature tissues. (PMID:17464316)
  • RNAi-mediated silencing of SOX2 induced differentiation with mesodermal characteristics in EC cells (PMID:17506876)
  • Results provide further evidence that SOX2 haploinsufficiency is a common cause of severe developmental ocular malformations and that background genetic variation determines the varying phenotypes. (PMID:17522144)
  • Duplications of SOX3 and mutations of both SOX2 and SOX3 have been implicated in the aetiology of variants of septo-optic dysplasia (PMID:17587179)
  • PAX6 and SOX2 are obvious candidates in RE genetic studies because of their biological roles and prior linkage studies. The present findings strongly suggest refractive error is not directly affected in this population by variants in either gene. (PMID:17898260)
  • Expression of both SOX2 and MUC5AC in serrated polyps supports the hypothesis that these polyps may be predominant precursors of mucinous and signet ring cell carcinomas of the colorectum (PMID:18027866)
  • Only SOX2 has been identified as a major causative gene of anophthalmia and microphthalmia. (PMID:18039390)
  • Using ectopic expression of Oct4, Sox2, Klf4 and Myc, we have derived iPS cells from fetal, neonatal and adult human primary cells (PMID:18157115)
  • Sox2 may be a tumor marker of glial lineages (PMID:18162777)
  • Sox2-expressing MSCs showed consistent proliferation and osteogenic capability in culture media containing bFGF (PMID:18187129)
  • Human primordial germ cells are the first primary cell type described to express POU5F1 and NANOG but not SOX2. (PMID:18199879)
  • SOX2 is frequently downregulated in gastric cancers (PMID:18268498)
  • SOX2 plays a critical role in the pituitary, forebrain, and eye during human embryonic development. (PMID:18285410)
  • A high level of SOX2 expression correlates with epigenetic modifications of distant enhancers SRR1 and SRR2 during creation of cell-specific epigenomes. (PMID:18293417)
  • SOX2 enhancers SRR1 and SRR2 are each differentially methylated and acetylated, on a temporal basis, contributing to the generation of neuron- and astrocyte-specific epigenomes from a common progenitor cell. (PMID:18293417)
  • This study identifies DPPA4 and other genes as putative Sox2:Oct-3/4 target genes using a combination of in silico analysis and transcription-based assays. (PMID:18366076)
  • These results support the role of SOX2 in ocular development. Loss of SOX2 function results in severe eye malformation. CHX10 was not implicated with microphthalmia/anophthalmia in our patient cohort. (PMID:18385794)
  • Sox2, Oct4 and Nanog are linked together in a pluripotent regulatory network (PMID:18388306)
  • SMAD 2/3 signaling directly supports NANOG expression, while SMAD 1/5/8 activation moderately represses SOX2. (PMID:18393632)
  • SOX2 and beta-catenin act in synergy in the transcription regulation of CCND1 in breast cancer cells (PMID:18456656)
  • Mutation in SOX2 is associated with typical ocular coloboma and probably other anomalies in this Chinese family. (PMID:18474784)

Cross-species orthologs

7 orthologs

Organism | Symbol | Gene ID
danio_rerio | sox2 | ENSDARG00000070913
mus_musculus | Sox2 | ENSMUSG00000074637
rattus_norvegicus | Sox2 | ENSRNOG00000090426
drosophila_melanogaster | Sox14 | FBGN0005612
drosophila_melanogaster | Sox21a | FBGN0036411
caenorhabditis_elegans | - | WBGENE00004949
caenorhabditis_elegans | - | WBGENE00004950

Paralogs (20): SOX8 (ENSG00000005513), SOX30 (ENSG00000039600), SOX10 (ENSG00000100146), SOX6 (ENSG00000110693), SOX4 (ENSG00000124766), SOX21 (ENSG00000125285), SOX9 (ENSG00000125398), SOX15 (ENSG00000129194), SOX5 (ENSG00000134532), SOX3 (ENSG00000134595), SOX13 (ENSG00000143842), SOX17 (ENSG00000164736), SOX14 (ENSG00000168875), SOX7 (ENSG00000171056), SOX11 (ENSG00000176887), SOX12 (ENSG00000177732), CFAP65 (ENSG00000181378), SOX1 (ENSG00000182968), SRY (ENSG00000184895), SOX18 (ENSG00000203883)

Protein

Protein identifiers

Transcription factor SOX-2: P48431 (reviewed: P48431)

All UniProt accessions (2): P48431, A0A0U3FYV6

UniProt curated annotations

Function. Transcription factor that forms a trimeric complex with OCT4 on DNA and controls the expression of a number of genes involved in embryonic development such as YES1, FGF4, UTF1 and ZFP206. Binds to the proximal enhancer region of NANOG. Critical for early embryogenesis and for embryonic stem cell pluripotency. Downstream SRRT target that mediates the promotion of neural stem cell self-renewal. Keeps neural cells undifferentiated by counteracting the activity of proneural proteins and suppresses neuronal differentiation. May function as a switch in neuronal development.

Subunit / interactions. Interacts with ZSCAN10. Interacts with SOX3 and FGFR1. Interacts with GLIS1. Interacts with POU5F1; binds synergistically with POU5F1 to DNA. Interacts with DDX56. Interacts with L3MBTL3 and DCAF5; the interaction requires methylation at Lys-42 and is necessary to target SOX2 for ubiquitination by the CRL4-DCAF5 E3 ubiquitin ligase complex. Interacts with RCOR1/CoREST. Interacts with PHF20L1; the interaction requires methylation at Lys-42 and Lys-117 and protects SOX2 from degradation. Interacts with TRIM26; this interaction prevents ubiquitination by WWP2.

Subcellular location. Nucleus speckle. Cytoplasm. Nucleus.

Post-translational modifications. Sumoylation inhibits binding on DNA and negatively regulates the FGF4 transactivation. Methylation at Lys-42 and Lys-117 is necessary for the regulation of SOX2 proteasomal degradation. Ubiquitinated by WWP2, leading to proteasomal degradation.

Disease relevance. Microphthalmia, syndromic, 3 (MCOPS3) [MIM:206900]: a disease characterized by the rare association of malformations including uni- or bilateral anophthalmia or microphthalmia, and esophageal atresia with tracheoesophageal fistula. Microphthalmia is a disorder of eye formation, ranging from small size of a single eye to complete bilateral absence of ocular tissues (anophthalmia). In many cases, microphthalmia/anophthalmia occurs in association with syndromes that include non-ocular abnormalities. The disease is caused by variants affecting the gene represented in this entry.

Domain organisation. The 9aaTAD motif is a transactivation domain present in a large number of yeast and animal transcription factors.

RefSeq proteins (1): NP_003097* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR009071 | HMG_box_dom | Domain
IPR022097 | SOX_fam | Family
IPR036910 | HMG_box_dom_sf | Homologous_superfamily
IPR050140 | SRY-related_HMG-box_TF-like | Family

Pfam: PF00505, PF12336

UniProt features (23 total): mutagenesis site 5, helix 4, sequence variant 3, region of interest 3, modified residue 3, chain 1, DNA-binding region 1, cross-link 1, short sequence motif 1, compositionally biased region 1

Structure

Experimental structures (PDB)

13 structures.

PDB | Method | Resolution (Å)
6WX8 | X-RAY DIFFRACTION | 2.3
6WX7 | X-RAY DIFFRACTION | 2.7
6WX9 | X-RAY DIFFRACTION | 2.8
6T90 | ELECTRON MICROSCOPY | 3.05
6YOV | ELECTRON MICROSCOPY | 3.42
9RL4 | ELECTRON MICROSCOPY | 3.5
9RN2 | ELECTRON MICROSCOPY | 4.1
9RMC | ELECTRON MICROSCOPY | 4.2
6T7B | ELECTRON MICROSCOPY | 5.1
9RN1 | ELECTRON MICROSCOPY | 5.9
1O4X | SOLUTION NMR | -
2LE4 | SOLUTION NMR | -
9QPF | SOLUTION NMR | -

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-P48431-F1 | 61.91 | 0.27
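
To put the mean pLDDT in context, the sketch below maps a score to the confidence bands commonly used by the AlphaFold Protein Structure Database (above 90 very high, 70 to 90 confident, 50 to 70 low, below 50 very low). The thresholds are stated here as an assumption rather than taken from this page, and the function name is illustrative.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to a confidence band (assumed AlphaFold DB cutoffs)."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


mean_plddt = 61.91  # model-wide mean for AF-P48431-F1, from the table
print(plddt_band(mean_plddt))
```

A model-wide mean hides per-residue variation: the same table reports that a fraction of 0.27 of residues fall in the very-high band, so parts of the model are far more reliable than the mean suggests.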

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Post-translational modifications (4): 245, 42, 117, 251

Mutagenesis-validated functional residues (5):

Position | Phenotype
42–43 | in mt1; reduced nuclear import; when associated with 56-a–a-58. in mt1.2; reduced nuclear import; when associated with
42 | loss of interaction with l3mbtl3. loss of ubiquitination by the crl4-dcaf5 e3 ubiquitin ligase complex. loss of interact
56–58 | in mt1; reduced nuclear import; when associated with 56-a–a-58. in mt1.2; reduced nuclear import; when associated with
113–115 | in mt2; reduced nuclear import. in mt1.2; reduced nuclear import; when associated with 42-a-a-43 and 56-a–a-58.
117 | loss of interaction with phf20l1; when associated with r-42.

Function

Pathways and Gene Ontology

Reactome pathways

25 pathways

ID | Pathway
R-HSA-2892245 | POU5F1 (OCT4), SOX2, NANOG repress genes related to differentiation
R-HSA-2892247 | POU5F1 (OCT4), SOX2, NANOG activate genes related to proliferation
R-HSA-3769402 | Deactivation of the beta-catenin transactivating complex
R-HSA-452723 | Transcriptional regulation of pluripotent stem cells
R-HSA-6785807 | Interleukin-4 and Interleukin-13 signaling
R-HSA-8986944 | Transcriptional Regulation by MECP2
R-HSA-9754189 | Germ layer formation at gastrulation
R-HSA-9823739 | Formation of the anterior neural plate
R-HSA-9832991 | Formation of the posterior neural plate
R-HSA-9834899 | Specification of the neural plate border
R-HSA-9856649 | Transcriptional and post-translational regulation of MITF-M expression and activity
R-HSA-9926550 | Regulation of MITF-M-dependent genes involved in extracellular matrix, focal adhesion and epithelial-to-mesenchymal transition
R-HSA-1266738 | Developmental Biology
R-HSA-1280215 | Cytokine Signaling in Immune system
R-HSA-162582 | Signal Transduction
R-HSA-168256 | Immune System
R-HSA-195721 | Signaling by WNT
R-HSA-201681 | TCF dependent signaling in response to WNT
R-HSA-212436 | Generic Transcription Pathway
R-HSA-449147 | Signaling by Interleukins
R-HSA-73857 | RNA Polymerase II Transcription
R-HSA-74160 | Gene expression (Transcription)
R-HSA-9730414 | MITF-M-regulated melanocyte development
R-HSA-9758941 | Gastrulation
R-HSA-9856651 | MITF-M-dependent gene expression

MSigDB gene sets: 480 (showing top): GSE18804_SPLEEN_MACROPHAGE_VS_COLON_TUMORAL_MACROPHAGE_UP, RRAGTTGT_UNKNOWN, GOBP_SOMATIC_STEM_CELL_POPULATION_MAINTENANCE, GOBP_NEGATIVE_REGULATION_OF_NEURON_DIFFERENTIATION, YAGI_AML_WITH_INV_16_TRANSLOCATION, REACTOME_CYTOKINE_SIGNALING_IN_IMMUNE_SYSTEM, GOBP_FORMATION_OF_PRIMARY_GERM_LAYER, GCANCTGNY_MYOD_Q6, GOZGIT_ESR1_TARGETS_DN, GOBP_CELL_CYCLE_PHASE_TRANSITION, GOBP_POSITIVE_REGULATION_OF_MAPK_CASCADE, GOBP_OSTEOBLAST_DIFFERENTIATION, GOBP_PITUITARY_GLAND_DEVELOPMENT, GOBP_NEUROGENESIS, CERVERA_SDHB_TARGETS_1_DN

GO Biological Process (31): negative regulation of transcription by RNA polymerase II (GO:0000122), osteoblast differentiation (GO:0001649), eye development (GO:0001654), endodermal cell fate specification (GO:0001714), chromatin organization (GO:0006325), regulation of DNA-templated transcription (GO:0006355), brain development (GO:0007420), response to wounding (GO:0009611), regulation of gene expression (GO:0010468), glial cell fate commitment (GO:0021781), pituitary gland development (GO:0021983), adenohypophysis development (GO:0021984), positive regulation of cell-cell adhesion (GO:0022409), neuron differentiation (GO:0030182), forebrain development (GO:0030900), somatic stem cell population maintenance (GO:0035019), tissue regeneration (GO:0042246), positive regulation of MAPK cascade (GO:0043410), response to ethanol (GO:0045471), positive regulation of cell differentiation (GO:0045597), negative regulation of neuron differentiation (GO:0045665), positive regulation of DNA-templated transcription (GO:0045893), positive regulation of transcription by RNA polymerase II (GO:0045944), inner ear development (GO:0048839), response to growth factor (GO:0070848), cellular response to hypoxia (GO:0071456), negative regulation of canonical Wnt signaling pathway (GO:0090090), response to oxygen-glucose deprivation (GO:0090649), neuronal stem cell population maintenance (GO:0097150), negative regulation of cell cycle G1/S phase transition (GO:1902807), regulation of myofibroblast cell apoptotic process (GO:1904520)

GO Molecular Function (10): transcription cis-regulatory region binding (GO:0000976), RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), DNA-binding transcription factor activity, RNA polymerase II-specific (GO:0000981), DNA-binding transcription activator activity, RNA polymerase II-specific (GO:0001228), DNA binding (GO:0003677), DNA-binding transcription factor activity (GO:0003700), miRNA binding (GO:0035198), sequence-specific DNA binding (GO:0043565), nitric-oxide synthase binding (GO:0050998), protein binding (GO:0005515)

GO Cellular Component (7): chromatin (GO:0000785), nucleus (GO:0005634), nucleoplasm (GO:0005654), transcription regulator complex (GO:0005667), cytoplasm (GO:0005737), cytosol (GO:0005829), nuclear speck (GO:0016607)

Reactome top-level categories

Rollup of top-13 pathways:

Category | Pathways
Gastrulation | 4
Transcriptional regulation of pluripotent stem cells | 2
TCF dependent signaling in response to WNT | 1
Developmental Biology | 1
Signaling by Interleukins | 1
Generic Transcription Pathway | 1
MITF-M-regulated melanocyte development | 1
MITF-M-dependent gene expression | 1
Immune System | 1
Signal Transduction | 1
Signaling by WNT | 1
RNA Polymerase II Transcription | 1
Cytokine Signaling in Immune system | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
cellular anatomical structure | 4
cell differentiation | 3
RNA polymerase II transcription regulatory region sequence-specific DNA binding | 3
regulation of transcription by RNA polymerase II | 2
anatomical structure development | 2
transcription by RNA polymerase II | 1
negative regulation of DNA-templated transcription | 1
ossification | 1
sensory organ development | 1
visual system development | 1
cell fate specification | 1
endodermal cell fate commitment | 1
cellular component organization | 1
DNA-templated transcription | 1
regulation of gene expression | 1
regulation of RNA biosynthetic process | 1
central nervous system development | 1
animal organ development | 1
head development | 1
response to stress | 1
gene expression | 1
regulation of macromolecule biosynthetic process | 1
glial cell differentiation | 1
cell fate commitment | 1
diencephalon development | 1
endocrine system development | 1
gland development | 1
pituitary gland development | 1
regulation of cell-cell adhesion | 1
positive regulation of cell adhesion | 1
cell-cell adhesion | 1
generation of neurons | 1
brain development | 1
stem cell population maintenance | 1
regeneration | 1
developmental growth | 1
MAPK cascade | 1
regulation of MAPK cascade | 1
positive regulation of intracellular signal transduction | 1
response to alcohol | 1

Protein interactions and networks

STRING

7892 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
SOX2 | NANOG | Q9H9S0 | 999
SOX2 | POU5F1 | P31359 | 999
SOX2 | KLF4 | P78338 | 995
SOX2 | PAX6 | P26367 | 965
SOX2 | LIN28A | Q9H9Z2 | 962
SOX2 | CTNNB1 | P35222 | 957
SOX2 | MYC | P01106 | 933
SOX2 | CDK1 | P06493 | 922
SOX2 | TP63 | Q9H3D4 | 918
SOX2 | HDAC1 | Q13547 | 907
SOX2 | SALL4 | Q9UJQ4 | 901
SOX2 | OTX2 | P32243 | 897
SOX2 | TP53 | P04637 | 883
SOX2 | NES | P48681 | 883
SOX2 | POU3F2 | P20265 | 876
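
Because the scores above are scaled by 1000, they need dividing before they can be compared with a confidence cutoff on the 0 to 1 scale. The sketch below does that for the listed partners and keeps those at or above 0.9, a cutoff assumed here to correspond to STRING's "highest confidence" tier; the variable names are illustrative.

```python
# STRING combined scores for SOX2 partners as listed in the table (scaled by 1000).
scores = {
    "NANOG": 999, "POU5F1": 999, "KLF4": 995, "PAX6": 965, "LIN28A": 962,
    "CTNNB1": 957, "MYC": 933, "CDK1": 922, "TP63": 918, "HDAC1": 907,
    "SALL4": 901, "OTX2": 897, "TP53": 883, "NES": 883, "POU3F2": 876,
}

HIGHEST_CONFIDENCE = 0.9  # assumed cutoff on the 0-1 scale

# Rescale to 0-1, then filter by the cutoff.
confidence = {partner: raw / 1000 for partner, raw in scores.items()}
highest = sorted(p for p, c in confidence.items() if c >= HIGHEST_CONFIDENCE)

print(len(highest), highest)
```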

IntAct

81 interactions, top by confidence:

A | B | Type | Score
SUMO1 | SOX2 | MI:0915 (physical association) | 0.670
SOX2 | SUMO1 | MI:0915 (physical association) | 0.670
VRK1 | SOX2 | MI:0403 (colocalization) | 0.610
SOX2 | VRK1 | MI:0403 (colocalization) | 0.610
SOX2 | VRK1 | MI:0915 (physical association) | 0.610
VRK1 | SOX2 | MI:0915 (physical association) | 0.610
SOX2 | POU5F1 | MI:0914 (association) | 0.560
POU5F1 | SOX2 | MI:0915 (physical association) | 0.560
KDM6A | SOX2 | MI:0915 (physical association) | 0.540
SOX2 | PDLIM1 | MI:0914 (association) | 0.530
SOX2 | MYH10 | MI:0914 (association) | 0.530
H2AC4 | H2BC12 | MI:0915 (physical association) | 0.530
MBD3 | KLF4 | MI:0914 (association) | 0.500
MBD3 | SOX2 | MI:0915 (physical association) | 0.500
NFIA | SOX2 | MI:0915 (physical association) | 0.470
NFIB | SOX2 | MI:0915 (physical association) | 0.470

BioGRID (1672): DMAP1 (Affinity Capture-MS), SOX2 (Affinity Capture-Western), TBX2 (Affinity Capture-Western), TBX2 (Affinity Capture-Luminescence), SUMO1 (Two-hybrid), SOX2 (Affinity Capture-Western), POU5F1 (Affinity Capture-Western), CTNNB1 (Affinity Capture-Western), ANXA1 (Affinity Capture-MS), PROSC (Affinity Capture-MS), FAHD2A (Affinity Capture-MS), OXSM (Affinity Capture-MS), SOX14 (Affinity Capture-MS), SOX21 (Affinity Capture-MS), RBP1 (Affinity Capture-MS)

ESM2 similar proteins: A2TED3, O00570, O57401, O95409, P06602, P07548, P09085, P14734, P16241, P20264, P22544, P23441, P23757, P31361, P32027, P32182, P32242, P35583, P39768, P40764, P41225, P43241, P43698, P43699, P48430, P48431, P48432, P50220, P53783, P53784, P54231, P54269, P56224, P80205, Q04649, Q07687, Q24255, Q24533, Q2PG84, Q2Z1R2

Diamond homologs: A2TED3, A4QNG3, B0ZTE1, B0ZTE2, O00570, O42569, O57401, O60248, O95416, P36389, P36390, P36393, P36395, P36396, P41225, P43267, P47792, P48046, P48430, P48431, P48432, P48433, P51501, P53783, P53784, P54231, P55863, P61259, Q04892, Q05066, Q20201, Q21305, Q24533, Q28447, Q28778, Q28783, Q28798, Q2PG84, Q2Z1R2, Q32PP9

SIGNOR signaling

28 interactions.

A | Effect | B | Mechanism
EGFR | up-regulates quantity by expression | SOX2 | transcriptional regulation
SOX2 | up-regulates quantity by expression | EGFR | transcriptional regulation
SOX2 | up-regulates quantity by expression | NR2E1 | transcriptional regulation
CTNNB1 | down-regulates activity | SOX2 | binding
SOX2 | up-regulates | Pluripotency | -
POU5F1 | up-regulates quantity by expression | SOX2 | transcriptional regulation
ID4 | up-regulates quantity by expression | SOX2 | transcriptional regulation
SOX2 | up-regulates quantity by expression | ABCC3 | transcriptional regulation
SOX2 | up-regulates quantity by expression | ABCC6 | transcriptional regulation
KDM5B | down-regulates quantity by repression | SOX2 | transcriptional regulation
AKT1 | up-regulates quantity by stabilization | SOX2 | phosphorylation
UBR5 | down-regulates quantity by destabilization | SOX2 | polyubiquitination
WWP2 | down-regulates quantity | SOX2 | ubiquitination
EGFR | up-regulates quantity by stabilization | SOX2 | phosphorylation
MAPK3 | down-regulates activity | SOX2 | phosphorylation
CDK2 | up-regulates activity | SOX2 | phosphorylation
VRK1 | up-regulates activity | SOX2 | phosphorylation
PAK6 | up-regulates quantity | SOX2 | phosphorylation
SOX2 | form complex | SOX2/POU5F1 | binding
ERK1/2 | up-regulates quantity by expression | SOX2 | transcriptional regulation
CTNNB1 | up-regulates activity | SOX2 | binding
Av/b1 integrin | up-regulates quantity by expression | SOX2 | -
A6/b1 integrin | up-regulates quantity by expression | SOX2 | -
CHD8 | down-regulates quantity | SOX2 | transcriptional regulation
SOX2/POU5F1 | up-regulates quantity by expression | SOX2 | transcriptional regulation

Enriched among interaction partners

Reactome pathways and GO biological processes over-represented among this gene’s 39 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.

GO biological processes:

GO term | Partners | Fold | FDR
chromatin remodeling | 6 | 11.5× | 4e-03
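
The enrichment described above (hypergeometric test against a genome-wide background, followed by BH-FDR) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pipeline behind the table: the background size of 20000 genes and the gene-set size of 250 are hypothetical values chosen only to make the example run, and all function names are illustrative.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K belong to the gene set."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def fold_enrichment(k: int, N: int, K: int, n: int) -> float:
    """Observed fraction of partners in the set divided by the background fraction."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvals: list[float]) -> list[float]:
    """BH-adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# 39 partners and 6 hits come from the text; N and K are hypothetical.
N, K, n, k = 20000, 250, 39, 6
p = hypergeom_sf(k, N, K, n)
print(f"fold = {fold_enrichment(k, N, K, n):.2f}, p = {p:.2e}")
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

In a real analysis the BH step runs over the p-values of every tested gene set at once, which is why a single term's FDR cannot be recomputed from its own row alone.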

Disease & clinical

Clinical variants and AI predictions

ClinVar

100 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 22
Likely pathogenic | 5
Uncertain significance | 35
Likely benign | 15
Benign | 12
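
Since the per-class counts are floors, they need not add up to the stated total. A small arithmetic sketch, using only the numbers shown above, makes the gap explicit; the variable names are illustrative.

```python
# Per-class floor counts and the stated total, as listed above.
counts = {
    "Pathogenic": 22,
    "Likely pathogenic": 5,
    "Uncertain significance": 35,
    "Likely benign": 15,
    "Benign": 12,
}
total = 100

listed = sum(counts.values())
unaccounted = total - listed  # variants not attributed to any listed class
p_or_lp = counts["Pathogenic"] + counts["Likely pathogenic"]

print(listed, unaccounted, p_or_lp)
```

The source does not say whether the remainder belongs to classes that are not listed or reflects the pagination cap, so the gap should be read as unknown rather than assigned to any class.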

Top pathogenic / likely-pathogenic (27)

Variant ID | HGVS | Classification
1275797 | NM_003106.4(SOX2):c.621del (p.Gln206_Tyr207insTer) | Pathogenic
12814 | NM_003106.4(SOX2):c.529C>T (p.Gln177Ter) | Pathogenic
1387489 | NM_003106.4(SOX2):c.497G>A (p.Trp166Ter) | Pathogenic
1695088 | NM_003106.4(SOX2):c.905del (p.Pro302fs) | Pathogenic
1700169 | NM_003106.4(SOX2):c.329A>C (p.Tyr110Ser) | Pathogenic
2823248 | NM_003106.4(SOX2):c.25del (p.Glu8_Leu9insTer) | Pathogenic
2844603 | NM_003106.4(SOX2):c.142_144del (p.Phe48del) | Pathogenic
29890 | NM_003106.4(SOX2):c.58_59dup (p.Gly21fs) | Pathogenic
3246890 | NC_000003.11:g.(?180702428)(181592675_?)del | Pathogenic
39805 | NM_003106.4(SOX2):c.837del (p.Gly280fs) | Pathogenic
4721391 | NM_003106.4(SOX2):c.90_96del (p.Gly31fs) | Pathogenic
577696 | NM_003106.4(SOX2):c.337C>T (p.Arg113Trp) | Pathogenic
859786 | NM_003106.4(SOX2):c.3dup (p.Tyr2fs) | Pathogenic
986760 | GRCh37/hg19 3q26.33(chr3:180913778-181432287)x1 | Pathogenic
986763 | NM_003106.4(SOX2):c.131C>G (p.Pro44Arg) | Pathogenic
986764 | NM_003106.4(SOX2):c.310G>T (p.Glu104Ter) | Pathogenic
986766 | NM_003106.4(SOX2):c.828del (p.Met276fs) | Pathogenic
986769 | NM_003106.4(SOX2):c.-13_43del (p.Met1fs) | Pathogenic
986770 | NM_003106.4(SOX2):c.166C>G (p.Arg56Gly) | Pathogenic
986775 | NM_003106.4(SOX2):c.542del (p.Pro181fs) | Pathogenic
986776 | NM_003106.4(SOX2):c.538_542dup (p.Gln182fs) | Pathogenic
986777 | NM_003106.4(SOX2):c.59del (p.Gly20fs) | Pathogenic
3062095 | NM_003106.4(SOX2):c.142T>G (p.Phe48Val) | Likely pathogenic
3235168 | NM_003106.4(SOX2):c.775_778del (p.Ser259fs) | Likely pathogenic
3337616 | NM_003106.4(SOX2):c.940del (p.Leu314fs) | Likely pathogenic
3338557 | NM_003106.4(SOX2):c.758del (p.Pro253fs) | Likely pathogenic
3340875 | NM_003106.4(SOX2):c.871A>T (p.Arg291Ter) | Likely pathogenic

SpliceAI

19 predictions. Top by Δscore:

Variant | Effect | Δscore
3:181712727:GA:G | donor_gain | 0.5100
3:181712728:A:G | donor_gain | 0.4300
3:181712151:G:T | donor_gain | 0.3900
3:181712084:GCT:G | donor_gain | 0.3400
3:181712151:G:GT | donor_gain | 0.3000
3:181712961:GAC:G | donor_gain | 0.2700
3:181712963:C:G | donor_gain | 0.2700
3:181712964:G:GG | donor_gain | 0.2600
3:181712886:G:GT | donor_gain | 0.2400
3:181713088:TGGTC:T | donor_gain | 0.2400
3:181712726:GGA:G | donor_gain | 0.2300
3:181712063:G:GT | donor_gain | 0.2200
3:181713103:C:T | donor_gain | 0.2200
3:181713120:G:GA | acceptor_gain | 0.2200
3:181713200:CGCCG:C | acceptor_gain | 0.2200
3:181712977:A:AG | donor_gain | 0.2100
3:181712978:G:GG | donor_gain | 0.2100
3:181713062:C:T | donor_gain | 0.2100
3:181712720:G:GT | donor_gain | 0.2000
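
The variant keys above follow a chrom:pos:ref:alt pattern. The sketch below parses such a key, classifies it by allele length, and checks whether the position falls inside the canonical exon listed under "Canonical transcript exons". It assumes that the variant and exon coordinates use the same genome assembly, which this page does not state; the class and function names are illustrative.

```python
from typing import NamedTuple

# Exon ENSE00001228743 bounds from the exon table (assumed same assembly as the variants).
EXON_START, EXON_END = 181711925, 181714436


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Parse a 'chrom:pos:ref:alt' key into its four fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(v: Variant) -> str:
    """Classify by comparing reference and alternate allele lengths."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) > len(v.alt):
        return "deletion"
    if len(v.ref) < len(v.alt):
        return "insertion"
    return "MNV"


def in_exon(v: Variant) -> bool:
    return EXON_START <= v.pos <= EXON_END


for key in ["3:181712727:GA:G", "3:181712728:A:G", "3:181712151:G:GT"]:
    v = parse_variant(key)
    print(key, variant_type(v), in_exon(v))
```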

AlphaMissense

2100 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
3:181712478:C:A | R40S | 1.000
3:181712482:T:A | V41D | 1.000
3:181712482:T:C | V41A | 1.000
3:181712484:A:G | K42E | 1.000
3:181712485:A:T | K42M | 1.000
3:181712486:G:C | K42N | 1.000
3:181712486:G:T | K42N | 1.000
3:181712487:C:G | R43G | 1.000
3:181712487:C:T | R43W | 1.000
3:181712488:G:A | R43Q | 1.000
3:181712488:G:T | R43L | 1.000
3:181712490:C:A | P44T | 1.000
3:181712490:C:G | P44A | 1.000
3:181712490:C:T | P44S | 1.000
3:181712491:C:A | P44H | 1.000
3:181712491:C:G | P44R | 1.000
3:181712491:C:T | P44L | 1.000
3:181712494:T:A | M45K | 1.000
3:181712494:T:C | M45T | 1.000
3:181712494:T:G | M45R | 1.000
3:181712495:G:A | M45I | 1.000
3:181712495:G:C | M45I | 1.000
3:181712495:G:T | M45I | 1.000
3:181712496:A:C | N46H | 1.000
3:181712496:A:G | N46D | 1.000
3:181712496:A:T | N46Y | 1.000
3:181712497:A:C | N46T | 1.000
3:181712497:A:G | N46S | 1.000
3:181712497:A:T | N46I | 1.000
3:181712498:T:A | N46K | 1.000

dbSNP variants (sampled 300 via Entrez): rs1000118579 (3:181710048 G>A), rs1000575571 (3:181710267 T>C), rs1000792606 (3:181710231 G>T), rs1000846691 (3:181710593 C>T), rs1001129423 (3:181711692 T>A), rs1001181848 (3:181711893 C>T), rs1001764076 (3:181710612 C>T), rs1003028996 (3:181711760 G>A,C), rs1003412207 (3:181713453 C>A), rs1004356947 (3:181709971 G>A,T), rs1004537039 (3:181714927 G>T), rs1004634634 (3:181710906 T>C,G), rs1004916616 (3:181714475 A>T), rs1006308193 (3:181711928 G>A), rs1006788647 (3:181712231 G>A,T)

Disease associations

OMIM: gene MIM:184429 | disease phenotypes: MIM:206900

GenCC curated gene-disease

Disease | Classification | Inheritance
anophthalmia/microphthalmia-esophageal atresia syndrome | Definitive | Autosomal dominant
isolated anophthalmia-microphthalmia syndrome | Supportive | Autosomal dominant
septooptic dysplasia | Supportive | Autosomal dominant

Mondo (3): anophthalmia/microphthalmia-esophageal atresia syndrome (MONDO:0008799), isolated anophthalmia-microphthalmia syndrome (MONDO:0016764), septooptic dysplasia (MONDO:0008428)

Orphanet (1): Anophthalmia/microphthalmia-esophageal atresia syndrome (Orphanet:77298)

HPO phenotypes

76 total (30 of 76 shown, HPO-id order):

HPO | Term
HP:0000006 | Autosomal dominant inheritance
HP:0000028 | Cryptorchidism
HP:0000044 | Hypogonadotropic hypogonadism
HP:0000047 | Hypospadias
HP:0000054 | Micropenis
HP:0000175 | Cleft palate
HP:0000238 | Hydrocephalus
HP:0000252 | Microcephaly
HP:0000365 | Hearing impairment
HP:0000407 | Sensorineural hearing impairment
HP:0000458 | Anosmia
HP:0000486 | Strabismus
HP:0000501 | Glaucoma
HP:0000505 | Visual impairment
HP:0000518 | Cataract
HP:0000528 | Anophthalmia
HP:0000568 | Microphthalmia
HP:0000572 | Visual loss
HP:0000589 | Coloboma
HP:0000609 | Optic nerve hypoplasia
HP:0000610 | Abnormal choroid morphology
HP:0000612 | Iris coloboma
HP:0000639 | Nystagmus
HP:0000647 | Sclerocornea
HP:0000717 | Autism
HP:0000864 | Abnormality of the hypothalamus-pituitary axis
HP:0000873 | Diabetes insipidus
HP:0000878 | 11 pairs of ribs
HP:0000902 | Rib fusion
HP:0000921 | Missing ribs

GWAS associations

17 associations (top):

Study | Trait | p-value
GCST003542_44 | Night sleep phenotypes | 5.000000e-06
GCST003996_27 | Monobrow | 2.000000e-31
GCST006095_4 | Excessive hairiness | 8.000000e-09
GCST006461_24 | Self-reported risk-taking behaviour | 7.000000e-13
GCST006706_3 | Eyebrow thickness | 1.000000e-19
GCST007325_119 | General risk tolerance (MTAG) | 3.000000e-11
GCST007325_96 | General risk tolerance (MTAG) | 1.000000e-10
GCST007576_386 | Chronotype | 2.000000e-13
GCST007576_387 | Chronotype | 7.000000e-10
GCST007576_78 | Chronotype | 2.000000e-13
GCST009963_4 | Cataracts (operation) | 2.000000e-14
GCST010988_125 | Adult body size | 1.000000e-09
GCST012013_21 | Cataracts | 5.000000e-09
GCST012013_8 | Cataracts | 5.000000e-30
GCST90006927_1 | Toxoplasma gondii sag1 antibody levels | 4.000000e-08
GCST90013423_1 | Age-related nuclear cataracts | 2.000000e-19
GCST90014268_13 | Cataracts | 2.000000e-32
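
The p-values above are given as scientific-notation strings. As a rough sketch, they can be parsed and compared against the conventional genome-wide significance threshold of 5e-8. That threshold is an assumption brought in for illustration; this page does not state which cutoff was used to select associations. The function name is_genome_wide_significant is hypothetical, and the three rows are copied from the table.

```python
# Conventional genome-wide significance threshold (an assumption here;
# the source page does not say which cutoff it applies).
GENOME_WIDE_SIGNIFICANCE = 5e-8

# (study, trait, p-value text) rows copied from the table above.
ROWS = [
    ("GCST003542_44", "Night sleep phenotypes", "5.000000e-06"),
    ("GCST90006927_1", "Toxoplasma gondii sag1 antibody levels", "4.000000e-08"),
    ("GCST90014268_13", "Cataracts", "2.000000e-32"),
]


def is_genome_wide_significant(p_text: str,
                               threshold: float = GENOME_WIDE_SIGNIFICANCE) -> bool:
    """Parse a scientific-notation p-value and test it against the threshold."""
    return float(p_text) < threshold


if __name__ == "__main__":
    for study, trait, p_text in ROWS:
        flag = "passes" if is_genome_wide_significant(p_text) else "does not pass"
        print(f"{study} ({trait}): p={float(p_text):.0e} {flag} 5e-8")
```

Under this cutoff the night sleep phenotypes association (p = 5e-06) would be considered suggestive rather than genome-wide significant, while the other sampled rows fall below the threshold.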

EFO canonical traits (4, from GWAS)

EFO ID | Trait name
EFO:0007906 | synophrys measurement
EFO:0008579 | risk-taking behaviour
EFO:0008328 | chronotype measurement
EFO:0009353 | Anti-Toxoplasma gondii IgG measurement

MeSH disease descriptors (1)

Descriptor | Name | Tree numbers
D025962 | Septo-Optic Dysplasia | C10.292.562.700.375.875; C10.500.034.937; C10.500.760.500; C11.590.436.400.875; C16.131.666.034.937; C16.131.666.763.500

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

136 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
sodium arsenite | increases reaction, affects reaction, decreases reaction, affects expression, decreases expression (+1 more) | 10
Valproic Acid | affects cotreatment, decreases expression, decreases methylation | 8
Arsenic Trioxide | decreases expression, increases reaction, affects reaction, decreases response to substance, decreases reaction (+2 more) | 7
Estradiol | affects cotreatment, decreases expression, affects binding, increases expression, increases activity | 7
Tretinoin | affects cotreatment, decreases expression, decreases reaction, increases reaction, affects expression | 7
bisphenol A | increases reaction, decreases expression, increases expression, affects binding | 5
(+)-JQ1 compound | affects binding, decreases reaction, decreases expression, increases expression | 5
4-(5-benzo(1,3)dioxol-5-yl-4-pyridin-2-yl-1H-imidazol-2-yl)benzamide | increases expression, affects cotreatment, decreases expression, decreases reaction | 4
Arsenic | increases expression, increases abundance, affects expression, affects cotreatment, decreases expression | 4
Cisplatin | increases expression, affects reaction, affects expression, affects response to substance, decreases response to substance (+1 more) | 4
trichostatin A | affects cotreatment, decreases expression | 3
cobaltous chloride | increases expression, increases response to substance, decreases reaction, decreases expression | 3
methylmercuric chloride | decreases expression, increases expression | 2
decabromobiphenyl ether | decreases expression, increases expression | 2
diallyl trisulfide | decreases expression, decreases reaction | 2
mercuric bromide | decreases expression, affects cotreatment | 2
Chir 99021 | decreases expression, decreases reaction, affects binding, affects cotreatment | 2
bisphenol S | affects cotreatment, decreases expression, increases expression | 2
XAV939 | affects binding, affects cotreatment, decreases expression, decreases reaction, increases expression | 2
LDN 193189 | affects cotreatment, increases expression | 2
Resveratrol | decreases expression | 2
Vorinostat | affects cotreatment, decreases expression | 2
Panobinostat | affects cotreatment, decreases expression | 2
Air Pollutants | increases abundance, increases expression, decreases expression | 2
Ascorbic Acid | affects binding, affects cotreatment, decreases expression, increases expression | 2
Oxygen | increases expression, decreases expression | 2
Phenylmercuric Acetate | affects cotreatment, decreases expression | 2
Silicon Dioxide | decreases expression, decreases reaction | 2
p-Chloromercuribenzoic Acid | affects cotreatment, decreases expression | 2
Particulate Matter | increases abundance, increases expression | 2

Cellosaurus cell lines

15 cell lines: 9 transformed cell lines, 3 embryonic stem cell lines, 3 cancer cell lines

First 10 cell lines (id-ordered, not curated):

Cellosaurus | Name | Category | Sex
CVCL_A6M6 | SEES3-1V human SOX2, clone1 | Embryonic stem cell | Male
CVCL_A6M7 | SEES3-1V human SOX2, clone2 | Embryonic stem cell | Male
CVCL_A6M8 | SEES3-1V human SOX2, clone3 | Embryonic stem cell | Male
CVCL_B7TV | e-hChon-2 | Transformed cell line | Female
CVCL_B7TW | e-hChon-3 | Transformed cell line | Female
CVCL_B7U6 | e-hMEC-1 | Transformed cell line |
CVCL_B7UD | e-hStr-1 | Transformed cell line | Female
CVCL_B7UF | e-hStr-3 | Transformed cell line | Female
CVCL_B7UG | e-hStr-4 | Transformed cell line | Female
CVCL_B7UH | e-hStr-5 | Transformed cell line | Female

Clinical trials (associated diseases)

4 trials via MONDO — disease-level, not drug-specific.

Trial | Phase | Status | Title
NCT00140413 | PHASE4 | COMPLETED | Endocrine Dysfunction and Growth Hormone Deficiency in Children With Optic Nerve Hypoplasia
NCT06760546 | PHASE3 | RECRUITING | A Trial of Setmelanotide in Patients With Congenital Hypothalamic Obesity (Sub-study of NCT05774756)
NCT05717855 | Not specified | COMPLETED | Screening of Septo-optic Dysplasia During a Fetal Examination at 16-20 Weeks of Gestation
NCT06262152 | Not specified | UNKNOWN | Sleep Profile of Patients With Septo-optic Dysplasia