GSX1

gene
Also known as Gsh-1

Summary

GSX1 (GS homeobox 1, HGNC:20374) is a protein-coding gene on chromosome 13q12.2, encoding GS homeobox 1 (Q9H4S2). The protein is a probable transcription factor that binds to the DNA sequence 5'-GC[TA][AC]ATTA[GA]-3'.

Enables sequence-specific double-stranded DNA binding activity. Acts upstream of or within positive regulation of transcription by RNA polymerase II. Predicted to be located in chromatin. Predicted to be active in nucleus.

Source: NCBI Gene 219409 — RefSeq curated summary.

At a glance

  • GWAS associations: 6
  • Clinical variants (ClinVar): 49 total
  • MANE Select transcript: NM_145657

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:20374
Approved symbol | GSX1
Name | GS homeobox 1
Location | 13q12.2
Locus type | gene with protein product
Status | Approved
Aliases | Gsh-1
Ensembl gene | ENSG00000169840
Ensembl biotype | protein_coding
OMIM | 616542
Entrez | 219409

Gene structure

Transcript identifiers

Ensembl transcripts: 1 — 1 protein_coding

ENST00000302945

RefSeq mRNA: 1 (NM_145657, MANE Select)

CCDS: CCDS9326

Canonical transcript exons

ENST00000302945 — 2 exons

Exon | Start | End
ENSE00001157670 | 27793566 | 27794768
ENSE00001157677 | 27792483 | 27793102
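
As a quick check on the coordinates above, exon lengths and the length of the single intron can be derived directly from the start and end positions. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention, not stated on this page); the names `EXONS`, `exon_length` and `intron_lengths` are illustrative only.

```python
# Exon coordinates for ENST00000302945 as listed above (chromosome 13).
# Assumption: positions are 1-based and inclusive.
EXONS = {
    "ENSE00001157677": (27792483, 27793102),
    "ENSE00001157670": (27793566, 27794768),
}


def exon_length(start, end):
    """Length in bp of an exon with inclusive start/end coordinates."""
    return end - start + 1


def intron_lengths(exons):
    """Gap sizes (bp) between consecutive exons, ordered by coordinate."""
    ordered = sorted(exons.values())
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])]


lengths = {exon_id: exon_length(*span) for exon_id, span in EXONS.items()}
print(lengths)
print(intron_lengths(EXONS))
```

Under these assumptions the two exons are 620 bp and 1203 bp long, separated by a 463 bp intron.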

Expression profiles

Bgee: expression breadth broad, 18 present calls, max score 62.27.

FANTOM5 (CAGE): breadth tissue_specific, TPM avg 0.1134 / max 12.6374, expressed in 49 samples.

FANTOM5 promoters (1 alternative TSS)

Promoter ID | TPM avg | Samples expressed
134537 | 0.1134 | 49

Top tissues by expression

234 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
superficial temporal artery | UBERON:0001614 | 62.27 | gold quality
buccal mucosa cell | CL:0002336 | 60.84 | gold quality
epithelium of nasopharynx | UBERON:0001951 | 60.55 | gold quality
amniotic fluid | UBERON:0000173 | 59.59 | gold quality
gingival epithelium | UBERON:0001949 | 57.32 | gold quality
germinal epithelium of ovary | UBERON:0001304 | 55.59 | gold quality
gingiva | UBERON:0001828 | 55.05 | gold quality
mucosa of paranasal sinus | UBERON:0005030 | 54.85 | gold quality
Brodmann (1909) area 23 | UBERON:0013554 | 53.77 | gold quality
hypothalamus | UBERON:0001898 | 52.50 | gold quality
middle temporal gyrus | UBERON:0002771 | 52.17 | gold quality
trabecular bone tissue | UBERON:0002483 | 50.66 | gold quality
corpus epididymis | UBERON:0004359 | 50.57 | gold quality
cauda epididymis | UBERON:0004360 | 50.13 | gold quality
caput epididymis | UBERON:0004358 | 50.08 | gold quality
inferior vagus X ganglion | UBERON:0005363 | 48.02 | gold quality
heart right ventricle | UBERON:0002080 | 47.93 | gold quality
ventral tegmental area | UBERON:0002691 | 47.74 | gold quality
skin of hip | UBERON:0001554 | 47.59 | silver quality
vastus lateralis | UBERON:0001379 | 47.03 | gold quality
quadriceps femoris | UBERON:0001377 | 46.87 | gold quality
subthalamic nucleus | UBERON:0001906 | 45.90 | gold quality
cardia of stomach | UBERON:0001162 | 45.56 | gold quality
saphenous vein | UBERON:0007318 | 44.47 | gold quality
medulla oblongata | UBERON:0001896 | 44.31 | gold quality
dorsal plus ventral thalamus | UBERON:0001897 | 43.95 | gold quality
pigmented layer of retina | UBERON:0001782 | 43.75 | gold quality
trigeminal ganglion | UBERON:0001675 | 43.64 | gold quality
trachea | UBERON:0003126 | 43.54 | gold quality
medial globus pallidus | UBERON:0002477 | 43.43 | gold quality

Single-cell (SCXA)

Detected in 1 experiment(s), a significant marker in 0.

Experiment | Marker? | Max mean expression
E-ANND-3 | no | 0.46

Regulation

Is transcription factor: yes

Downstream targets (CollecTRI)

4 targets.

Target | Regulation
GCLC |
GHRH | Activation
GSX1 |
INS |

Upstream regulators (CollecTRI, top): GSX1

Literature-anchored findings (GeneRIF, showing 1)

  • The study ruled out microdeletions in the critical region as a common cause of Moebius syndrome and excluded the GSH1 gene (PMID:19460469)

Cross-species orthologs

5 orthologs

Organism | Symbol | Gene ID
danio_rerio | gsx1 | ENSDARG00000035735
mus_musculus | Gsx1 | ENSMUSG00000053129
rattus_norvegicus | Gsx1 | ENSRNOG00000000952
caenorhabditis_elegans | | WBGENE00012584
caenorhabditis_elegans | | WBGENE00044032

Paralogs (1): NOTO (ENSG00000214513)

Protein

Protein identifiers

GS homeobox 1: Q9H4S2 (reviewed: Q9H4S2)

Alternative names: Homeobox protein GSH-1

All UniProt accessions (1): Q9H4S2

UniProt curated annotations

Function. Probable transcription factor that binds to the DNA sequence 5’-GC[TA][AC]ATTA[GA]-3’. Activates the transcription of the GHRH gene. Plays an important role in pituitary development.

Subcellular location. Nucleus.

Similarity. Belongs to the Antp homeobox family.
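
The binding motif quoted in the Function annotation, 5'-GC[TA][AC]ATTA[GA]-3', maps directly onto a regular expression. The following is a minimal sketch of scanning a DNA string for candidate sites on both strands; the function names and the test sequences are hypothetical, and a sequence match alone says nothing about actual binding in vivo.

```python
import re

# Motif from the annotation, wrapped in a lookahead so that
# overlapping matches are also reported.
MOTIF = re.compile(r"(?=(GC[TA][AC]ATTA[GA]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Return (0-based start on the given strand, strand, matched site)."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(revcomp(seq)):
        site = m.group(1)
        # Convert the reverse-strand offset back to forward coordinates.
        start = len(seq) - m.start() - len(site)
        hits.append((start, "-", site))
    return sorted(hits)


# Hypothetical sequences, for illustration only.
print(find_sites("AAGCTAATTAGTT"))
print(find_sites("CTAATTAGC"))
```

The first call reports one forward-strand site starting at offset 2; the second reports the same site on the reverse strand.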

RefSeq proteins (1): NP_663632* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR001356 | HD | Domain
IPR009057 | Homeodomain-like_sf | Homologous_superfamily
IPR017970 | Homeobox_CS | Conserved_site
IPR020479 | HD_metazoa | Domain
IPR042191 | GSH1/2 | Family

Pfam: PF00046

UniProt features (7 total): compositionally biased region 3, region of interest 2, chain 1, DNA-binding region 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q9H4S2-F1 | 65.45 | 0.23
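
To put the mean pLDDT in context, the sketch below bins a score into the confidence bands commonly used for AlphaFold models (above 90, 70 to 90, 50 to 70, below 50). The band labels and the function name `plddt_band` are assumptions made for illustration and do not come from this page.

```python
def plddt_band(score):
    """Map a pLDDT value (0-100) to a conventional confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Mean pLDDT reported above for AF-Q9H4S2-F1.
print(plddt_band(65.45))
```

With these cutoffs the model-wide mean of 65.45 falls in the low band, while the "Fraction very-high" column indicates that about 23% of residues score above 90.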

Function

Pathways and Gene Ontology

Reactome pathways

0 pathways

MSigDB gene sets: 109 (showing top): GOBP_SPINAL_CORD_DEVELOPMENT, AHRARNT_01, RRAGTTGT_UNKNOWN, BENPORATH_ES_WITH_H3K27ME3, GOBP_PITUITARY_GLAND_DEVELOPMENT, GOBP_NEUROGENESIS, GGGTGGRR_PAX4_03, GOBP_CELL_DIFFERENTIATION_IN_SPINAL_CORD, GOBP_FOREBRAIN_DEVELOPMENT, MARTORIATI_MDM4_TARGETS_NEUROEPITHELIUM_DN, GOBP_HYPOTHALAMUS_DEVELOPMENT, NKX62_Q2, FREAC3_01, NF1_Q6_01, OCT1_03

GO Biological Process (10): transcription by RNA polymerase II (GO:0006366), central nervous system development (GO:0007417), brain development (GO:0007420), spinal cord association neuron differentiation (GO:0021527), hypothalamus development (GO:0021854), adenohypophysis development (GO:0021984), neuron differentiation (GO:0030182), positive regulation of transcription by RNA polymerase II (GO:0045944), neuron fate commitment (GO:0048663), regulation of DNA-templated transcription (GO:0006355)

GO Molecular Function (8): RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), DNA-binding transcription factor activity, RNA polymerase II-specific (GO:0000981), DNA-binding transcription activator activity, RNA polymerase II-specific (GO:0001228), sequence-specific DNA binding (GO:0043565), sequence-specific double-stranded DNA binding (GO:1990837), DNA binding (GO:0003677), DNA-binding transcription factor activity (GO:0003700), protein binding (GO:0005515)

GO Cellular Component (2): chromatin (GO:0000785), nucleus (GO:0005634)

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
RNA polymerase II transcription regulatory region sequence-specific DNA binding | 3
DNA-templated transcription | 2
anatomical structure development | 2
regulation of transcription by RNA polymerase II | 2
nervous system development | 1
system development | 1
central nervous system development | 1
animal organ development | 1
head development | 1
cell differentiation in spinal cord | 1
dorsal spinal cord development | 1
central nervous system neuron differentiation | 1
diencephalon development | 1
limbic system development | 1
pituitary gland development | 1
cell differentiation | 1
generation of neurons | 1
transcription by RNA polymerase II | 1
positive regulation of DNA-templated transcription | 1
neuron differentiation | 1
cell fate commitment | 1
regulation of gene expression | 1
regulation of RNA biosynthetic process | 1
cis-regulatory region sequence-specific DNA binding | 1
chromatin | 1
DNA-binding transcription factor activity | 1
DNA-binding transcription factor activity, RNA polymerase II-specific | 1
DNA-binding transcription activator activity | 1
positive regulation of transcription by RNA polymerase II | 1
DNA binding | 1
double-stranded DNA binding | 1
sequence-specific DNA binding | 1
nucleic acid binding | 1
transcription cis-regulatory region binding | 1
regulation of DNA-templated transcription | 1
transcription regulator activity | 1
binding | 1
chromosome | 1
cellular anatomical structure | 1
intracellular membrane-bounded organelle | 1

Protein interactions and networks

STRING

816 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
GSX1 | ASCL1 | P50553 | 634
GSX1 | LHX6 | Q9UPM6 | 579
GSX1 | TBR1 | Q16650 | 542
GSX1 | LHX8 | Q68G74 | 510
GSX1 | SP8 | Q8IXZ3 | 483
GSX1 | PTF1A | Q7RTS3 | 456
GSX1 | NEUROG2 | Q9H2A3 | 456
GSX1 | OLIG2 | Q13516 | 452
GSX1 | CUX1 | P39880 | 425
GSX1 | SP9 | P0CG40 | 409
GSX1 | NEUROG1 | Q92886 | 391
GSX1 | LHX5 | Q9H2C1 | 380
GSX1 | ZNF503 | Q96F45 | 379
GSX1 | TSHZ1 | Q6ZSZ6 | 367
GSX1 | GAD1 | Q99259 | 365
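
Because the STRING scores above are combined confidences multiplied by 1000, they can be rescaled to the 0 to 1 range and grouped by the cutoffs STRING conventionally uses (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The sketch below does this for the listed partners; the names `SCORES` and `confidence_band` are illustrative.

```python
# Partner -> combined score (x1000), copied from the table above.
SCORES = {
    "ASCL1": 634, "LHX6": 579, "TBR1": 542, "LHX8": 510, "SP8": 483,
    "PTF1A": 456, "NEUROG2": 456, "OLIG2": 452, "CUX1": 425, "SP9": 409,
    "NEUROG1": 391, "LHX5": 380, "ZNF503": 379, "TSHZ1": 367, "GAD1": 365,
}


def confidence_band(score_x1000):
    """Bin a STRING score (x1000) using the conventional cutoffs."""
    if score_x1000 >= 900:
        return "highest"
    if score_x1000 >= 700:
        return "high"
    if score_x1000 >= 400:
        return "medium"
    if score_x1000 >= 150:
        return "low"
    return "below cutoff"


# Group partners by band and show each score on the 0-1 scale.
bands = {}
for partner, score in SCORES.items():
    bands.setdefault(confidence_band(score), []).append(partner)

for band, partners in bands.items():
    print(band, [(p, SCORES[p] / 1000) for p in partners])
```

None of the listed partners reaches the high-confidence cutoff: ten fall in the medium band and five in the low band.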

IntAct

5 interactions, top by confidence:

A | B | Type | Score
KRTAP19-7 | GSX1 | psi-mi:"MI:0915"(physical association) | 0.560
GSX1 | YKT6 | psi-mi:"MI:0914"(association) | 0.350
GSX1 | KRTAP19-7 | psi-mi:"MI:0915"(physical association) | 0.000

BioGRID (46): KRTAP19-7 (Two-hybrid), PSMG4 (Affinity Capture-MS), CYB5R3 (Affinity Capture-MS), HIBCH (Affinity Capture-MS), PDCD10 (Affinity Capture-MS), STXBP2 (Affinity Capture-MS), DHRS4 (Affinity Capture-MS), BCS1L (Affinity Capture-MS), PPID (Affinity Capture-MS), AK4 (Affinity Capture-MS), NDUFAF7 (Affinity Capture-MS), EEF1A2 (Affinity Capture-MS), NAA50 (Affinity Capture-MS), ZMPSTE24 (Affinity Capture-MS), EFHD2 (Affinity Capture-MS)

ESM2 similar proteins: A0A8V0YY16, A0JPN1, A7MB54, A8MTJ6, O35762, O42115, O57601, O88181, O95096, P09065, P23683, P28356, P31311, P31315, P32443, P39020, P42581, P42586, P43697, P48031, P49640, P50222, P50476, P52951, P52954, P52955, P78426, P81067, P81068, P97334, Q14549, Q14774, Q1KKY1, Q1XID0, Q2NKI2, Q2VL76, Q2VL80, Q4V5A3, Q5SQQ9, Q60554

Diamond homologs: A1YER7, A1YF08, A1YFD8, A1YFY3, A1YG85, A1YGA4, A2D4P8, A2D5I1, A2D5K9, A2D5Y4, A2T6X6, A2T756, A2T779, A2T7T2, F1Q4R9, M0R6D8, O13074, O42230, O42365, O42367, O42506, O57374, P06798, P07548, P09016, P09017, P09019, P09020, P09021, P09067, P09070, P09074, P0C1T1, P10178, P10284, P10628, P14652, P14837, P14838, P14840

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

49 variants total. Per-class counts are lower bounds (the true count is ≥ the value shown, because of a pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 46
Likely benign | 1
Benign | 2

Top pathogenic / likely-pathogenic (0)

SpliceAI

245 predictions. Top by Δscore:

Variant | Effect | Δscore
13:27793174:G:GT | donor_gain | 1.0000
13:27793558:A:AG | acceptor_gain | 1.0000
13:27793559:C:G | acceptor_gain | 1.0000
13:27793562:CCAG:C | acceptor_loss | 1.0000
13:27793563:CA:C | acceptor_loss | 1.0000
13:27793564:A:AG | acceptor_gain | 1.0000
13:27793564:AGACA:A | acceptor_loss | 1.0000
13:27793565:G:GT | acceptor_gain | 1.0000
13:27793565:GA:G | acceptor_gain | 1.0000
13:27793565:GAC:G | acceptor_gain | 1.0000
13:27793565:GACA:G | acceptor_gain | 1.0000
13:27793098:TGTGG:T | donor_loss | 0.9900
13:27793099:GTGG:G | donor_gain | 0.9900
13:27793101:GG:G | donor_gain | 0.9900
13:27793102:GG:G | donor_gain | 0.9900
13:27793102:GGTAA:G | donor_loss | 0.9900
13:27793103:G:GC | donor_loss | 0.9900
13:27793104:T:TC | donor_loss | 0.9900
13:27793554:T:TA | acceptor_gain | 0.9900
13:27793103:G:GG | donor_gain | 0.9800
13:27793105:AA:A | donor_loss | 0.9800
13:27793563:CAGA:C | acceptor_gain | 0.9800
13:27793564:AGAC:A | acceptor_gain | 0.9800
13:27793080:G:GT | donor_gain | 0.9700
13:27793171:G:GT | donor_gain | 0.9700
13:27793562:CCAGA:C | acceptor_gain | 0.9700
13:27793565:GACAG:G | acceptor_gain | 0.9700
13:27793679:C:G | acceptor_gain | 0.9700
13:27793153:G:GT | donor_gain | 0.9600
13:27793181:G:T | donor_gain | 0.9600
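
The variant identifiers above follow a chrom:pos:ref:alt pattern. A minimal sketch of parsing such a string and classifying it as an SNV, insertion or deletion from the allele lengths is shown below; the `Variant` class and `parse_variant` helper are hypothetical names, not part of SpliceAI or any other tool.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def kind(self):
        """Classify the variant from the ref/alt allele lengths."""
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        if len(self.ref) < len(self.alt):
            return "insertion"
        if len(self.ref) > len(self.alt):
            return "deletion"
        return "complex"

    @property
    def size(self):
        """Net change in length (bp) between alt and ref alleles."""
        return abs(len(self.alt) - len(self.ref))


def parse_variant(text):
    """Parse 'chrom:pos:ref:alt' into a Variant."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


for raw in ("13:27793174:G:GT", "13:27793559:C:G", "13:27793562:CCAG:C"):
    v = parse_variant(raw)
    print(raw, v.kind, v.size)
```

Applied to three entries from the table, this labels them as a 1 bp insertion, an SNV and a 3 bp deletion respectively.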

AlphaMissense

1686 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
13:27792703:T:C | F5L | 1.000
13:27792705:C:A | F5L | 1.000
13:27792705:C:G | F5L | 1.000
13:27793595:A:G | K148E | 1.000
13:27793597:G:C | K148N | 1.000
13:27793597:G:T | K148N | 1.000
13:27793599:G:T | R149M | 1.000
13:27793600:G:C | R149S | 1.000
13:27793600:G:T | R149S | 1.000
13:27793604:C:A | R151S | 1.000
13:27793604:C:G | R151G | 1.000
13:27793604:C:T | R151C | 1.000
13:27793605:G:A | R151H | 1.000
13:27793611:C:A | A153D | 1.000
13:27793613:T:A | F154I | 1.000
13:27793613:T:C | F154L | 1.000
13:27793613:T:G | F154V | 1.000
13:27793614:T:C | F154S | 1.000
13:27793614:T:G | F154C | 1.000
13:27793615:C:A | F154L | 1.000
13:27793615:C:G | F154L | 1.000
13:27793617:C:T | T155I | 1.000
13:27793626:A:G | Q158R | 1.000
13:27793627:G:C | Q158H | 1.000
13:27793627:G:T | Q158H | 1.000
13:27793629:T:A | L159Q | 1.000
13:27793629:T:C | L159P | 1.000
13:27793632:T:C | L160P | 1.000
13:27793638:T:A | L162Q | 1.000
13:27793638:T:C | L162P | 1.000

dbSNP variants (sampled 300 via entrez): RS1000477680 (13:27792568 G>A), RS1001072649 (13:27791172 C>T), RS1001491358 (13:27791394 C>T), RS1001819248 (13:27795213 A>G), RS1001925354 (13:27795196 C>A), RS1003041464 (13:27794164 C>G,T), RS1003756564 (13:27792290 C>G,T), RS1003830011 (13:27792515 T>A,C), RS1006147791 (13:27790530 A>T), RS1006208102 (13:27791425 A>G), RS1006990354 (13:27794806 A>C), RS1007996001 (13:27793336 G>C,T), RS1008224776 (13:27794372 C>A,G), RS1008698565 (13:27792859 G>A,C,T), RS1008773115 (13:27791950 C>T)

Disease associations

OMIM: gene MIM:616542 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

6 associations (top):

Study | Trait | p-value
GCST000253_2 | Attention deficit hyperactivity disorder and conduct disorder | 9.000000e-06
GCST001996_3 | Adverse response to chemotherapy (neutropenia/leucopenia) (epirubicin) | 5.000000e-06
GCST002553_1 | Pancreatic cancer | 2.000000e-09
GCST005951_4 | Body mass index | 2.000000e-08
GCST006431_6 | Plasma parathyroid hormone levels | 2.000000e-06
GCST90011898_20 | Alanine aminotransferase levels | 4.000000e-12
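
The p-values above span several orders of magnitude. As a rough guide, the sketch below separates associations that pass the conventional genome-wide significance threshold of 5e-8 from those that do not. The threshold is a widely used convention rather than something stated on this page, and the variable names are illustrative.

```python
# (study, trait, p-value) copied from the table above.
GWAS = [
    ("GCST000253_2", "Attention deficit hyperactivity disorder and conduct disorder", 9e-06),
    ("GCST001996_3", "Adverse response to chemotherapy (neutropenia/leucopenia) (epirubicin)", 5e-06),
    ("GCST002553_1", "Pancreatic cancer", 2e-09),
    ("GCST005951_4", "Body mass index", 2e-08),
    ("GCST006431_6", "Plasma parathyroid hormone levels", 2e-06),
    ("GCST90011898_20", "Alanine aminotransferase levels", 4e-12),
]

# Assumption: conventional genome-wide significance cutoff.
THRESHOLD = 5e-8

significant = [trait for _, trait, p in GWAS if p < THRESHOLD]
suggestive = [trait for _, trait, p in GWAS if p >= THRESHOLD]

print("genome-wide significant:", significant)
print("below threshold:", suggestive)
```

Under this cutoff, three of the six associations (pancreatic cancer, body mass index and alanine aminotransferase levels) are genome-wide significant; the remaining three are suggestive only.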

EFO canonical traits (1, from GWAS)

EFO ID | Trait name
EFO:0004340 | body mass index

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

6 total (human), top 6 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
entinostat | decreases expression | 1
Resveratrol | affects cotreatment, decreases expression | 1
Acetaminophen | decreases expression | 1
Benzo(a)pyrene | affects methylation, increases methylation | 1
Plant Extracts | affects cotreatment, decreases expression | 1
Progesterone | increases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.

  • Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): conduct disorder