THOC2
Also known as THO2, dJ506G2.1
Summary
THOC2 (THO complex subunit 2, HGNC:19073) is a protein-coding gene on chromosome Xq25, encoding THO complex subunit 2 (Q8NI27). The protein is a component of the THO subcomplex of the TREX complex, which is thought to couple mRNA transcription, processing and nuclear export, and which specifically associates with spliced mRNA and not with unspliced pre-mRNA. It is a common-essential gene (DepMap: required in 99.1% of cancer cell lines).
The TREX multiprotein complex binds specifically to spliced mRNAs to facilitate mRNA export. The protein encoded by this gene is a member of the THO complex, a subset of the TREX complex. The encoded protein interacts with the THOC1 protein.
Source: NCBI Gene 57187 — RefSeq curated summary.
At a glance
- Gene–disease (curated): X-linked complex neurodevelopmental disorder (Definitive, ClinGen) — +1 more curated relationship
- GWAS associations: 1
- Clinical variants (ClinVar): 587 total — 8 pathogenic, 13 likely-pathogenic
- Phenotypes (HPO): 83
- Druggable target: yes
- Cancer dependency (DepMap): dependent in 99.1% of screened cell lines (common-essential)
- MANE Select transcript: NM_001081550
Identifiers
Gene identifiers
| Field | Value |
|---|---|
| HGNC ID | HGNC:19073 |
| Approved symbol | THOC2 |
| Name | THO complex subunit 2 |
| Location | Xq25 |
| Locus type | gene with protein product |
| Status | Approved |
| Aliases | THO2, dJ506G2.1 |
| Ensembl gene | ENSG00000125676 |
| Ensembl biotype | protein_coding |
| OMIM | 300395 |
| Entrez | 57187 |
Gene structure
Transcript identifiers
Ensembl transcripts: 21 — 10 protein_coding, 6 retained_intron, 3 protein_coding_CDS_not_defined, 2 nonsense_mediated_decay
ENST00000245838, ENST00000355725, ENST00000416618, ENST00000419789, ENST00000432353, ENST00000433883, ENST00000438358, ENST00000441692, ENST00000448128, ENST00000455053, ENST00000459945, ENST00000464161, ENST00000464604, ENST00000464982, ENST00000464992, ENST00000491737, ENST00000492203, ENST00000496830, ENST00000497887, ENST00000618150, ENST00000931863
RefSeq mRNA: 1 — MANE Select: NM_001081550
CCDS: CCDS43988
Canonical transcript exons
ENST00000245838 — 39 exons
| Exon | Start | End |
|---|---|---|
| ENSE00001537798 | 123600569 | 123601338 |
| ENSE00002700240 | 123732952 | 123733052 |
| ENSE00003516348 | 123671669 | 123671761 |
| ENSE00003520667 | 123703454 | 123703505 |
| ENSE00003528531 | 123613639 | 123613708 |
| ENSE00003528571 | 123611440 | 123611516 |
| ENSE00003544996 | 123610918 | 123610963 |
| ENSE00003556009 | 123696021 | 123696154 |
| ENSE00003614298 | 123613399 | 123613556 |
| ENSE00003631304 | 123614052 | 123614189 |
| ENSE00003644252 | 123622758 | 123622860 |
| ENSE00003655894 | 123686548 | 123686714 |
| ENSE00003659811 | 123619401 | 123619436 |
| ENSE00003674538 | 123624060 | 123624191 |
| ENSE00003888891 | 123638934 | 123639027 |
| ENSE00003888921 | 123665642 | 123665837 |
| ENSE00003889126 | 123625912 | 123626069 |
| ENSE00003889566 | 123644779 | 123644909 |
| ENSE00003889679 | 123638043 | 123638123 |
| ENSE00003890321 | 123712850 | 123712908 |
| ENSE00003890379 | 123623105 | 123623283 |
| ENSE00003890581 | 123632861 | 123633040 |
| ENSE00003890683 | 123697681 | 123697751 |
| ENSE00003891306 | 123627693 | 123627968 |
| ENSE00003891615 | 123667106 | 123667278 |
| ENSE00003892015 | 123631688 | 123631852 |
| ENSE00003892209 | 123668159 | 123668314 |
| ENSE00003892304 | 123645334 | 123645375 |
| ENSE00003892371 | 123636079 | 123636175 |
| ENSE00003893204 | 123621157 | 123621587 |
| ENSE00003893645 | 123624541 | 123624669 |
| ENSE00003893834 | 123633953 | 123634070 |
| ENSE00003894240 | 123623787 | 123623971 |
| ENSE00003894520 | 123696721 | 123696842 |
| ENSE00003894623 | 123640538 | 123640622 |
| ENSE00003894692 | 123626521 | 123626662 |
| ENSE00003894918 | 123644575 | 123644676 |
| ENSE00003895400 | 123620907 | 123620965 |
| ENSE00003896240 | 123706858 | 123706949 |
Expression profiles
Bgee: expression breadth ubiquitous, 297 present calls, max score 99.20.
FANTOM5 (CAGE): breadth ubiquitous, TPM avg 33.8049 / max 818.1560, expressed in 1805 samples.
FANTOM5 promoters (2 alternative TSS)
| Promoter ID | TPM avg | Samples expressed |
|---|---|---|
| 200388 | 32.1036 | 1803 |
| 200386 | 1.7013 | 771 |
Top tissues by expression
Top 30 of 299, by Bgee expression score (0-100, higher = more expressed):
| Tissue | Anatomy ID | Expression score | Quality |
|---|---|---|---|
| secondary oocyte | CL:0000655 | 99.20 | gold quality |
| calcaneal tendon | UBERON:0003701 | 98.40 | gold quality |
| oocyte | CL:0000023 | 98.23 | gold quality |
| sural nerve | UBERON:0015488 | 97.92 | gold quality |
| ventricular zone | UBERON:0003053 | 96.56 | gold quality |
| colonic epithelium | UBERON:0000397 | 96.42 | gold quality |
| adrenal tissue | UBERON:0018303 | 96.03 | gold quality |
| right lung | UBERON:0002167 | 95.35 | gold quality |
| lower esophagus muscularis layer | UBERON:0035833 | 95.31 | gold quality |
| lower esophagus | UBERON:0013473 | 95.30 | gold quality |
| visceral pleura | UBERON:0002401 | 95.10 | gold quality |
| tendon | UBERON:0000043 | 94.99 | gold quality |
| left ovary | UBERON:0002119 | 94.74 | gold quality |
| mucosa of stomach | UBERON:0001199 | 94.72 | gold quality |
| body of pancreas | UBERON:0001150 | 94.66 | gold quality |
| right ovary | UBERON:0002118 | 94.66 | gold quality |
| ganglionic eminence | UBERON:0004023 | 94.66 | gold quality |
| monocyte | CL:0000576 | 94.60 | gold quality |
| mononuclear cell | CL:0000842 | 94.55 | gold quality |
| tonsil | UBERON:0002372 | 94.55 | gold quality |
| esophagogastric junction muscularis propria | UBERON:0035841 | 94.49 | gold quality |
| pylorus | UBERON:0001166 | 94.46 | gold quality |
| body of uterus | UBERON:0009853 | 94.37 | gold quality |
| muscle layer of sigmoid colon | UBERON:0035805 | 94.37 | gold quality |
| skin of abdomen | UBERON:0001416 | 94.30 | gold quality |
| leukocyte | CL:0000738 | 94.27 | gold quality |
| esophagus | UBERON:0001043 | 94.25 | gold quality |
| upper lobe of left lung | UBERON:0008952 | 94.24 | gold quality |
| left lobe of thyroid gland | UBERON:0001120 | 94.15 | gold quality |
| small intestine Peyer’s patch | UBERON:0003454 | 94.13 | gold quality |
Single-cell (SCXA)
Detected in 1 experiment; a significant marker in 1.
| Experiment | Marker? | Max mean expression |
|---|---|---|
| E-ANND-3 | yes | 12.96 |
Regulation
Is transcription factor: no
miRNA regulators (miRDB)
110 targeting THOC2, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):
| miRNA | Max score | Avg score | miRNA target_count |
|---|---|---|---|
| HSA-MIR-4673 | 100.00 | 66.64 | 1490 |
| HSA-MIR-5692A | 100.00 | 74.40 | 6850 |
| HSA-MIR-3163 | 100.00 | 77.23 | 8605 |
| HSA-MIR-4500 | 99.99 | 72.72 | 2367 |
| HSA-MIR-3662 | 99.99 | 73.82 | 5684 |
| HSA-MIR-4645-5P | 99.98 | 65.81 | 1284 |
| HSA-LET-7A-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7B-5P | 99.98 | 72.31 | 1790 |
| HSA-LET-7C-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7E-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7F-5P | 99.98 | 72.56 | 1784 |
| HSA-LET-7G-5P | 99.98 | 72.37 | 1784 |
| HSA-LET-7I-5P | 99.98 | 72.37 | 1788 |
| HSA-MIR-98-5P | 99.98 | 72.33 | 1787 |
| HSA-MIR-103A-3P | 99.98 | 69.14 | 1595 |
| HSA-MIR-107 | 99.98 | 69.14 | 1595 |
| HSA-MIR-3692-3P | 99.98 | 70.27 | 2139 |
| HSA-MIR-4803 | 99.98 | 71.99 | 3117 |
| HSA-MIR-520D-5P | 99.98 | 73.34 | 4883 |
| HSA-MIR-524-5P | 99.98 | 73.43 | 4882 |
| HSA-MIR-12136 | 99.98 | 72.81 | 5713 |
| HSA-MIR-3688-3P | 99.97 | 72.02 | 2834 |
| HSA-MIR-607 | 99.97 | 73.62 | 5593 |
| HSA-MIR-1229-3P | 99.97 | 66.49 | 906 |
| HSA-MIR-7152-3P | 99.97 | 67.47 | 849 |
| HSA-LET-7D-5P | 99.96 | 71.76 | 1632 |
| HSA-MIR-4458 | 99.96 | 71.64 | 1650 |
| HSA-MIR-548AJ-3P | 99.96 | 73.38 | 5345 |
| HSA-MIR-548X-3P | 99.96 | 73.38 | 5345 |
| HSA-MIR-493-5P | 99.96 | 72.47 | 2382 |
Functional genomics
DepMap (CRISPR cell-line fitness): dependent in 99.1% of screened cell lines, common-essential.
Literature-anchored findings (GeneRIF, showing 4)
- recruitment of the human TREX complex to spliced mRNA is not directly coupled to transcription, but is instead coupled to transcription indirectly through splicing (PMID:15998806)
- THOC2 mutations implicate mRNA-export pathway in X-linked intellectual disability. (PMID:26166480)
- Study presents detailed clinical assessment and functional studies on a de novo variant in a female with an epileptic encephalopathy and discuss an additional four families with rare variants in THOC2 with supportive evidence for pathogenicity. (PMID:29851191)
- THOC2 reduction repressed melanoma cell proliferation and invasion, and induced cell apoptosis. (PMID:31680623)
Cross-species orthologs
7 orthologs
| Organism | Symbol | Gene ID |
|---|---|---|
| danio_rerio | thoc2 | ENSDARG00000037503 |
| mus_musculus | Thoc2 | ENSMUSG00000037475 |
| mus_musculus | Thoc2l | ENSMUSG00000097392 |
| rattus_norvegicus | Thoc2 | ENSRNOG00000007315 |
| rattus_norvegicus | Thoc2l | ENSRNOG00000042340 |
| drosophila_melanogaster | tho2 | FBGN0031390 |
| caenorhabditis_elegans | | WBGENE00015813 |
Protein
Protein identifiers
THO complex subunit 2 — Q8NI27 (reviewed: Q8NI27)
Alternative names: hTREX120
All UniProt accessions (9): Q8NI27, A0A0C4DG98, B7ZB98, B7ZBA0, F2Z2V2, H0Y594, H0Y7U4, H0Y815, H7C477
UniProt curated annotations
Function. Component of the THO subcomplex of the TREX complex which is thought to couple mRNA transcription, processing and nuclear export, and which specifically associates with spliced mRNA and not with unspliced pre-mRNA. Required for efficient export of polyadenylated RNA and spliced mRNA. The THOC1-THOC2-THOC3 core complex alone is sufficient to bind export factor NXF1-NXT1 and promote ATPase activity of DDX39B; in the complex THOC2 is the only component that directly interacts with DDX39B. TREX is recruited to spliced mRNAs by a transcription-independent mechanism, binds to mRNA upstream of the exon-junction complex (EJC) and is recruited in a splicing- and cap-dependent manner to a region near the 5’ end of the mRNA where it functions in mRNA export to the cytoplasm via the TAP/NXF1 pathway. Required for NXF1 localization to the nuclear rim. THOC2 (and probably the THO complex) is involved in releasing mRNA from nuclear speckle domains. (Microbial infection) The TREX complex is essential for the export of Kaposi’s sarcoma-associated herpesvirus (KSHV) intronless mRNAs and infectious virus production.
Subunit / interactions. Component of the THO subcomplex, which is composed of THOC1, THOC2, THOC3, THOC5, THOC6 and THOC7. The THO subcomplex interacts with DDX39B to form the THO-DDX39B complex which multimerizes into a 28-subunit tetrameric assembly. Component of the transcription/export (TREX) complex at least composed of ALYREF/THOC4, DDX39B, SARNP/CIP29, CHTOP and the THO subcomplex; in the complex interacts with THOC1, THOC3, THOC5, THOC7 and DDX39B. TREX seems to have a dynamic structure involving ATP-dependent remodeling. Interacts with POLDIP3 and ZC3H11A.
Subcellular location. Nucleus. Nucleus speckle. Cytoplasm.
Tissue specificity. Expressed in the hippocampus and the cerebral cortex.
Disease relevance. Intellectual developmental disorder, X-linked, syndromic, Kumar type (MRXSK) [MIM:300957]: A form of intellectual disability, a disorder characterized by significantly below average general intellectual functioning associated with impairments in adaptive behavior and manifested during the developmental period. Intellectual deficiency is the only primary symptom of non-syndromic X-linked forms, while syndromic forms present with associated physical, neurological and/or psychiatric manifestations. MRXSK patients manifest variable degrees of intellectual disability. Commonly observed features included speech delay, elevated BMI, short stature, seizure disorders, gait disturbance, and tremors. The disease is caused by variants affecting the gene represented in this entry.
Arthrogryposis multiplex congenita 7, X-linked (AMC7) [MIM:301127]: A form of arthrogryposis multiplex congenita, a developmental condition characterized by multiple joint contractures resulting from reduced or absent fetal movements. AMC7 is an X-linked recessive, severe form with onset in utero. Affected fetuses may also have subcutaneous edema and dysmorphic facial features. The disease is caused by variants affecting the gene represented in this entry.
Similarity. Belongs to the THOC2 family.
Isoforms (2)
| UniProt ID | Names | Canonical? |
|---|---|---|
| Q8NI27-1 | 1 | yes |
| Q8NI27-2 | 2 | no |
RefSeq proteins (1): NP_001075019* (*=MANE)
Domains & families (InterPro)
| ID | Name | Type |
|---|---|---|
| IPR021418 | THO_THOC2_C | Domain |
| IPR021726 | THO_THOC2_N | Domain |
| IPR032302 | THOC2_N | Domain |
| IPR040007 | Tho2 | Family |
Pfam: PF11262, PF11732, PF16134
UniProt features (121 total): helix 56, sequence variant 19, compositionally biased region 9, modified residue 9, region of interest 7, strand 7, sequence conflict 4, turn 4, mutagenesis site 2, chain 1, splice variant 1, coiled-coil region 1, short sequence motif 1
Structure
Experimental structures (PDB)
4 structures.
| PDB | Method | Resolution (Å) |
|---|---|---|
| 7APK | ELECTRON MICROSCOPY | 3.3 |
| 7ZNL | ELECTRON MICROSCOPY | 3.45 |
| 7ZNK | ELECTRON MICROSCOPY | 3.9 |
| 8R7L | ELECTRON MICROSCOPY | 4.12 |
Predicted structure (AlphaFold)
| Model | pLDDT | Fraction very-high |
|---|---|---|
| AF-Q8NI27-F1 | 72.83 | 0.33 |
Functional residue map
Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.
Post-translational modifications (9): 1222, 1385, 1390, 1393, 1417, 1443, 1450, 1486, 1516
Mutagenesis-validated functional residues (2):
| Position | Phenotype |
|---|---|
| 551–558 | Impairs interaction with DDX39B. Abolishes interaction with DDX39B when associated with 589-A–A-592. |
| 589–592 | Impairs interaction with DDX39B. Abolishes interaction with DDX39B when associated with 551-A–S-558. |
Function
Pathways and Gene Ontology
Reactome pathways
8 pathways
| ID | Pathway |
|---|---|
| R-HSA-159236 | Transport of Mature mRNA derived from an Intron-Containing Transcript |
| R-HSA-72187 | mRNA 3’-end processing |
| R-HSA-73856 | RNA Polymerase II Transcription Termination |
| R-HSA-72202 | Transport of Mature Transcript to Cytoplasm |
| R-HSA-72203 | Processing of Capped Intron-Containing Pre-mRNA |
| R-HSA-73857 | RNA Polymerase II Transcription |
| R-HSA-74160 | Gene expression (Transcription) |
| R-HSA-8953854 | Metabolism of RNA |
MSigDB gene sets: 381 (showing top 15):
RODRIGUES_THYROID_CARCINOMA_ANAPLASTIC_UP, GOBP_EMBRYO_DEVELOPMENT_ENDING_IN_BIRTH_OR_EGG_HATCHING, BROWNE_HCMV_INFECTION_6HR_DN, YAGI_AML_WITH_INV_16_TRANSLOCATION, GRAESSMANN_APOPTOSIS_BY_DOXORUBICIN_DN, GOBP_NEUROGENESIS, MORF_HDAC2, GOBP_REGULATION_OF_NUCLEOBASE_CONTAINING_COMPOUND_TRANSPORT, GOBP_NEGATIVE_REGULATION_OF_CELLULAR_COMPONENT_ORGANIZATION, GOBP_NUCLEAR_TRANSPORT, MODULE_331, GOBP_STEM_CELL_DIVISION, GOBP_REGULATION_OF_CELL_PROJECTION_ORGANIZATION, GOBP_IN_UTERO_EMBRYONIC_DEVELOPMENT, GOBP_REGULATION_OF_NUCLEOCYTOPLASMIC_TRANSPORT
GO Biological Process (13): cell morphogenesis (GO:0000902), blastocyst development (GO:0001824), mRNA processing (GO:0006397), mRNA export from nucleus (GO:0006406), RNA splicing (GO:0008380), regulation of gene expression (GO:0010468), regulation of mRNA export from nucleus (GO:0010793), negative regulation of neuron projection development (GO:0010977), poly(A)+ mRNA export from nucleus (GO:0016973), stem cell division (GO:0017145), neuron development (GO:0048666), generation of neurons (GO:0048699), mRNA transport (GO:0051028)
GO Molecular Function (3): mRNA binding (GO:0003729), RNA binding (GO:0003723), protein binding (GO:0005515)
GO Cellular Component (8): transcription export complex (GO:0000346), THO complex (GO:0000347), THO complex part of transcription export complex (GO:0000445), nucleus (GO:0005634), nucleoplasm (GO:0005654), cytoplasm (GO:0005737), nuclear speck (GO:0016607), chromosome, telomeric region (GO:0000781)
Reactome top-level categories
Rollup of top-5 pathways:
| Category | Pathways |
|---|---|
| Processing of Capped Intron-Containing Pre-mRNA | 2 |
| Transport of Mature Transcript to Cytoplasm | 1 |
| RNA Polymerase II Transcription | 1 |
| Metabolism of RNA | 1 |
| Gene expression (Transcription) | 1 |
GO top-level categories
Rollup of top GO terms by namespace:
| Category | Terms |
|---|---|
| RNA processing | 2 |
| gene expression | 2 |
| mRNA export from nucleus | 2 |
| nuclear protein-containing complex | 2 |
| cellular anatomical structure | 2 |
| anatomical structure morphogenesis | 1 |
| in utero embryonic development | 1 |
| anatomical structure development | 1 |
| mRNA metabolic process | 1 |
| RNA export from nucleus | 1 |
| mRNA transport | 1 |
| regulation of macromolecule biosynthetic process | 1 |
| regulation of RNA export from nucleus | 1 |
| regulation of ribonucleoprotein complex localization | 1 |
| regulation of neuron projection development | 1 |
| neuron projection development | 1 |
| negative regulation of cell projection organization | 1 |
| cell division | 1 |
| neuron differentiation | 1 |
| cell development | 1 |
| neurogenesis | 1 |
| RNA transport | 1 |
| RNA binding | 1 |
| nucleic acid binding | 1 |
| binding | 1 |
| transcription export complex | 1 |
| THO complex | 1 |
| intracellular membrane-bounded organelle | 1 |
| nuclear lumen | 1 |
| intracellular anatomical structure | 1 |
| nuclear ribonucleoprotein granule | 1 |
| chromosomal region | 1 |
Protein interactions and networks
STRING
2800 interactions, top by confidence (×1000):
| Protein A | Protein B | Partner UniProt | Score |
|---|---|---|---|
| THOC2 | THOC1 | Q96FV9 | 999 |
| THOC2 | THOC3 | Q96J01 | 999 |
| THOC2 | DDX39B | Q13838 | 998 |
| THOC2 | THOC5 | Q13769 | 997 |
| THOC2 | THOC7 | Q6I9Y2 | 997 |
| THOC2 | THOC6 | Q86W42 | 997 |
| THOC2 | GLI2 | P10070 | 992 |
| THOC2 | ALYREF | Q86V81 | 992 |
| THOC2 | CYLD | Q9NQC7 | 992 |
| THOC2 | RAE1 | P78406 | 860 |
| THOC2 | NXF1 | Q9UBU9 | 827 |
| THOC2 | SARNP | P82979 | 763 |
| THOC2 | FYTTD1 | Q96QD9 | 733 |
| THOC2 | ZC3H18 | Q86VM9 | 713 |
| THOC2 | NCBP3 | Q53F19 | 688 |
IntAct
108 interactions, top by confidence:
| A | B | Type | Score |
|---|---|---|---|
| THOC1 | THOC5 | psi-mi:“MI:0914”(association) | 0.930 |
| MED4 | MED19 | psi-mi:“MI:0914”(association) | 0.900 |
| MED20 | MED19 | psi-mi:“MI:0914”(association) | 0.840 |
| DDX39B | THOC5 | psi-mi:“MI:0915”(physical association) | 0.800 |
| DDX39B | ALYREF | psi-mi:“MI:0914”(association) | 0.770 |
| THOC2 | THOC5 | psi-mi:“MI:0914”(association) | 0.730 |
| CFTR | ESYT2 | psi-mi:“MI:2364”(proximity) | 0.710 |
| ALYREF | THOC5 | psi-mi:“MI:0914”(association) | 0.710 |
| THOC5 | ALYREF | psi-mi:“MI:0914”(association) | 0.710 |
| MOB1A | LATS1 | psi-mi:“MI:0914”(association) | 0.670 |
| THOC1 | EIF4A3 | psi-mi:“MI:0914”(association) | 0.660 |
| CHTOP | THOC5 | psi-mi:“MI:0914”(association) | 0.660 |
| THOC1 | DDX39A | psi-mi:“MI:0914”(association) | 0.640 |
| KPNB1 | POM121C | psi-mi:“MI:0914”(association) | 0.530 |
| THOC3 | CLUH | psi-mi:“MI:0914”(association) | 0.530 |
| NCBP3 | SAP18 | psi-mi:“MI:0914”(association) | 0.530 |
| THOC5 | EIF4A3 | psi-mi:“MI:0914”(association) | 0.530 |
| VCAM1 | PSMD11 | psi-mi:“MI:0914”(association) | 0.530 |
| EIF4A3 | | psi-mi:“MI:0915”(physical association) | 0.490 |
| THOC2 | CRK | psi-mi:“MI:0915”(physical association) | 0.400 |
| HLCS | THOC2 | psi-mi:“MI:0915”(physical association) | 0.400 |
BioGRID (343): THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS), PDCD4 (Co-fractionation), THOC1 (Co-fractionation), THOC2 (Co-fractionation), THOC2 (Co-fractionation), THOC6 (Co-fractionation), THOC7 (Co-fractionation), USP7 (Co-fractionation), THOC2 (Affinity Capture-MS), THOC2 (Affinity Capture-MS)
ESM2 similar proteins: A0A3Q1LSX9, A2APV2, A2AT37, A2VD00, A4II09, B0KWH8, B1AZI6, B1MTK1, B2KI97, B3MS75, B3NU52, B4GW22, B4I0W6, B4JM29, B4L2J8, B4NC41, B4Q034, C0H906, C1FXW9, F4IUX6, Q09161, Q16UN6, Q1LUC1, Q1LXC9, Q29G82, Q2L4X1, Q3UYV9, Q4R6R4, Q56A27, Q5R7L4, Q5ZJZ6, Q5ZL42, Q5ZLT7, Q5ZMW3, Q6DDM4, Q6DIE2, Q6GQ80, Q6GQD0, Q6P2Z0, Q6P7P5
Diamond homologs: B0KWH8, B1AZI6, B1MTK1, B2KI97, C1FXW9, F4IAT2, Q09779, Q8NI27
SIGNOR signaling
1 interaction.
| A | Effect | B | Mechanism |
|---|---|---|---|
| THOC2 | form complex | TREX complex | binding |
Enriched among interaction partners
Reactome pathways and GO biological processes over-represented among this gene’s 109 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.
Reactome pathways:
| Pathway | Partners | Fold | FDR |
|---|---|---|---|
| mRNA 3’-end processing | 12 | 32.4× | 3e-13 |
| Transport of Mature Transcript to Cytoplasm | 6 | 31.3× | 2e-06 |
| Transport of Mature mRNA derived from an Intron-Containing Transcript | 13 | 27.1× | 3e-13 |
| RNA Polymerase II Transcription Termination | 7 | 21.1× | 2e-06 |
| Processing of Capped Intron-Containing Pre-mRNA | 14 | 15.8× | 3e-11 |
| mRNA Splicing | 9 | 13.5× | 2e-06 |
| mRNA Splicing - Major Pathway | 14 | 10.5× | 6e-09 |
| Metabolism of RNA | 14 | 8.0× | 2e-07 |
GO biological processes:
| GO term | Partners | Fold | FDR |
|---|---|---|---|
| RNA export from nucleus | 5 | 50.9× | 7e-06 |
| negative regulation of mRNA splicing, via spliceosome | 5 | 41.6× | 1e-05 |
| mRNA export from nucleus | 12 | 38.6× | 1e-13 |
| positive regulation of transcription elongation by RNA polymerase II | 6 | 19.6× | 6e-05 |
| RNA splicing | 14 | 13.4× | 8e-10 |
| mRNA processing | 12 | 10.3× | 5e-07 |
| mRNA splicing, via spliceosome | 10 | 10.0× | 1e-05 |
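The statistics in the two tables above (hypergeometric test, fold enrichment, BH-FDR) can be illustrated with a minimal Python sketch. It is not the pipeline that produced these numbers: the background size, gene-set size, and the function names `hypergeom_sf`, `fold_enrichment`, and `benjamini_hochberg` are assumptions chosen for illustration, and only the partner count of 109 comes from this page.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper tail P(X >= k): n partners drawn from N background genes,
    K of which belong to the gene set."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def fold_enrichment(k, N, K, n):
    """Observed fraction of partners in the set over the background fraction."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: 20000 background genes and a 60-gene set are assumed;
# 109 partners is the count stated on this page, 12 overlapping partners is illustrative.
N, K, n, k = 20000, 60, 109, 12
print("fold:", round(fold_enrichment(k, N, K, n), 1))
print("p-value:", hypergeom_sf(k, N, K, n))
```

In practice the adjusted values would be computed over all gene sets tested together, which is why `benjamini_hochberg` takes the full list of p-values rather than a single one.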
Disease & clinical
Clinical variants and AI predictions
ClinVar
587 variants total. Per-class counts are floors (≥ shown; pagination cap):
| Classification | Count (floor) |
|---|---|
| Pathogenic | 8 |
| Likely pathogenic | 13 |
| Uncertain significance | 200 |
| Likely benign | 31 |
| Benign | 21 |
Top pathogenic / likely-pathogenic (21)
| Variant ID | HGVS | Classification |
|---|---|---|
| 208523 | NM_001081550.2(THOC2):c.1313T>C (p.Leu438Pro) | Pathogenic |
| 208524 | NM_001081550.2(THOC2):c.937C>T (p.Leu313Phe) | Pathogenic |
| 208525 | NM_001081550.2(THOC2):c.3034T>C (p.Ser1012Pro) | Pathogenic |
| 208526 | NM_001081550.2(THOC2):c.2399T>C (p.Ile800Thr) | Pathogenic |
| 3362897 | THOC2, 2.4-KB DEL, EX37-38DEL | Pathogenic |
| 3362900 | THOC2, 4-BP DEL, 2482GTCA | Pathogenic |
| 488431 | NM_001081550.2(THOC2):c.2138G>A (p.Gly713Asp) | Pathogenic |
| 488434 | NM_001081550.2(THOC2):c.4450-2A>G | Pathogenic |
| 1527937 | NM_001081550.2(THOC2):c.1844G>A (p.Cys615Tyr) | Likely pathogenic |
| 1700225 | NM_001081550.2(THOC2):c.34T>C (p.Trp12Arg) | Likely pathogenic |
| 1700228 | NM_001081550.2(THOC2):c.2695T>C (p.Tyr899His) | Likely pathogenic |
| 1801970 | NM_001081550.2(THOC2):c.2788C>T (p.Arg930Cys) | Likely pathogenic |
| 2314312 | NM_001081550.2(THOC2):c.2818G>A (p.Glu940Lys) | Likely pathogenic |
| 3342610 | NM_001081550.2(THOC2):c.2482-1_2484del | Likely pathogenic |
| 3349329 | NM_001081550.2(THOC2):c.3215C>T (p.Thr1072Ile) | Likely pathogenic |
| 488433 | NM_001081550.2(THOC2):c.3503+4A>C | Likely pathogenic |
| 488437 | NM_001081550.2(THOC2):c.3323C>T (p.Ser1108Leu) | Likely pathogenic |
| 804356 | NM_001081550.2(THOC2):c.2642A>G (p.Tyr881Cys) | Likely pathogenic |
| 807709 | NM_001081550.2(THOC2):c.149A>C (p.Tyr50Ser) | Likely pathogenic |
| 982367 | NM_001081550.2(THOC2):c.1942G>T (p.Ala648Ser) | Likely pathogenic |
| 984647 | NM_001081550.2(THOC2):c.3305A>G (p.Tyr1102Cys) | Likely pathogenic |
SpliceAI
6313 predictions. Top by Δscore:
| Variant | Effect | Δscore |
|---|---|---|
| X:123611452:T:TA | donor_gain | 1.0000 |
| X:123612217:ATAG:A | donor_gain | 1.0000 |
| X:123612217:ATAGC:A | donor_gain | 1.0000 |
| X:123613552:CCCCA:C | acceptor_gain | 1.0000 |
| X:123613553:CCCA:C | acceptor_gain | 1.0000 |
| X:123613553:CCCAC:C | acceptor_gain | 1.0000 |
| X:123613554:CCA:C | acceptor_gain | 1.0000 |
| X:123613554:CCAC:C | acceptor_gain | 1.0000 |
| X:123613555:CAC:C | acceptor_gain | 1.0000 |
| X:123613557:C:CC | acceptor_gain | 1.0000 |
| X:123613637:A:AC | donor_gain | 1.0000 |
| X:123613638:C:CC | donor_gain | 1.0000 |
| X:123614050:AC:A | donor_gain | 1.0000 |
| X:123614051:CC:C | donor_gain | 1.0000 |
| X:123614066:T:A | donor_gain | 1.0000 |
| X:123614190:C:CC | acceptor_gain | 1.0000 |
| X:123615309:A:AC | donor_gain | 1.0000 |
| X:123615310:T:C | donor_gain | 1.0000 |
| X:123620966:C:CC | acceptor_gain | 1.0000 |
| X:123621589:T:C | acceptor_gain | 1.0000 |
| X:123621589:T:TC | acceptor_gain | 1.0000 |
| X:123622752:TCCTA:T | donor_loss | 1.0000 |
| X:123622753:CCTAC:C | donor_loss | 1.0000 |
| X:123622754:CTA:C | donor_loss | 1.0000 |
| X:123622755:TAC:T | donor_loss | 1.0000 |
| X:123622756:ACC:A | donor_loss | 1.0000 |
| X:123622757:CCTGT:C | donor_loss | 1.0000 |
| X:123622856:TTTAT:T | acceptor_gain | 1.0000 |
| X:123622857:TTAT:T | acceptor_gain | 1.0000 |
| X:123622858:TAT:T | acceptor_gain | 1.0000 |
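The variant keys in the table above appear to follow a `chromosome:position:ref:alt` layout. Assuming that reading is correct (the page does not state it explicitly), a small Python sketch can split a key and label it as an SNV, insertion, or deletion. The names `Variant`, `parse_variant`, and `variant_kind` are hypothetical helpers, not part of SpliceAI.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(v: Variant) -> str:
    """Classify by comparing reference and alternate allele lengths."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) < len(v.alt):
        return "insertion"
    if len(v.ref) > len(v.alt):
        return "deletion"
    return "MNV"  # same length, more than one base


# Keys copied from the table above.
for key in ("X:123611452:T:TA", "X:123613552:CCCCA:C", "X:123614066:T:A"):
    print(key, variant_kind(parse_variant(key)))
```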
AlphaMissense
10588 scored. Top likely-pathogenic:
| Variant | Protein change | am_pathogenicity |
|---|---|---|
| X:123623228:G:C | H1187D | 1.000 |
| X:123623229:A:C | F1186L | 1.000 |
| X:123623229:A:T | F1186L | 1.000 |
| X:123623230:A:G | F1186S | 1.000 |
| X:123623231:A:G | F1186L | 1.000 |
| X:123623275:C:T | G1171E | 1.000 |
| X:123623793:G:T | A1166D | 1.000 |
| X:123623794:C:G | A1166P | 1.000 |
| X:123623868:C:T | G1141D | 1.000 |
| X:123623886:G:T | P1135Q | 1.000 |
| X:123623887:G:A | P1135S | 1.000 |
| X:123623910:A:G | L1127P | 1.000 |
| X:123623955:A:G | L1112P | 1.000 |
| X:123624067:A:G | L1104P | 1.000 |
| X:123624080:A:G | W1100R | 1.000 |
| X:123624080:A:T | W1100R | 1.000 |
| X:123624081:T:A | K1099N | 1.000 |
| X:123624081:T:G | K1099N | 1.000 |
| X:123624082:T:A | K1099I | 1.000 |
| X:123624083:T:C | K1099E | 1.000 |
| X:123624086:G:C | H1098D | 1.000 |
| X:123624091:A:T | V1096D | 1.000 |
| X:123624097:C:G | R1094P | 1.000 |
| X:123624100:A:G | F1093S | 1.000 |
| X:123624115:A:G | L1088S | 1.000 |
| X:123624168:G:C | F1070L | 1.000 |
| X:123624168:G:T | F1070L | 1.000 |
| X:123624170:A:G | F1070L | 1.000 |
| X:123624172:C:T | G1069E | 1.000 |
| X:123624173:C:G | G1069R | 1.000 |
dbSNP variants (sampled 300 via entrez): RS1000011782 (X:123656098 A>G), RS1000018433 (X:123728270 C>T), RS1000047919 (X:123654055 C>T), RS1000049051 (X:123720935 A>G), RS1000053335 (X:123640748 A>G), RS1000071868 (X:123658657 G>A), RS1000124839 (X:123639925 C>T), RS1000130298 (X:123718925 C>T), RS1000156599 (X:123682492 G>A), RS1000192990 (X:123733488 TTCTCCC>T,TTCTCCCTCTCCC), RS1000198446 (X:123688567 C>A), RS1000228637 (X:123719284 A>C), RS1000229470 (X:123614397 C>A), RS1000291732 (X:123663625 T>C,G), RS1000318879 (X:123646098 G>A)
Disease associations
OMIM: gene MIM:300395 | disease phenotypes: MIM:300957, MIM:620977, MIM:209850, MIM:301127
GenCC curated gene-disease
| Disease | Classification | Inheritance |
|---|---|---|
| X-linked intellectual disability-short stature-overweight syndrome | Strong | X-linked |
ClinGen Gene-Disease Validity (1)
Expert-panel classifications — Definitive > Strong > Moderate > Limited > Disputed > Refuted.
| Disease | Classification | Inheritance |
|---|---|---|
| X-linked complex neurodevelopmental disorder | Definitive | XL |
Mondo (6): intellectual disability (MONDO:0001071), X-linked intellectual disability-short stature-overweight syndrome (MONDO:0010496), neurodevelopmental disorder (MONDO:0700092), immunodeficiency 127 (MONDO:0975832), autism (MONDO:0005260), arthrogryposis multiplex congenita 7, X-linked (MONDO:0975826)
Orphanet (2): X-linked intellectual disability-short stature-overweight syndrome (Orphanet:457240), NON RARE IN EUROPE: Unexplained intellectual disability (Orphanet:319658)
HPO phenotypes
83 total (30 of 83 shown, HPO-id order):
| HPO | Term |
|---|---|
| HP:0000028 | Cryptorchidism |
| HP:0000054 | Micropenis |
| HP:0000202 | Orofacial cleft |
| HP:0000218 | High palate |
| HP:0000252 | Microcephaly |
| HP:0000256 | Macrocephaly |
| HP:0000316 | Hypertelorism |
| HP:0000337 | Broad forehead |
| HP:0000347 | Micrognathia |
| HP:0000348 | High forehead |
| HP:0000400 | Macrotia |
| HP:0000407 | Sensorineural hearing impairment |
| HP:0000463 | Anteverted nares |
| HP:0000486 | Strabismus |
| HP:0000505 | Visual impairment |
| HP:0000577 | Exotropia |
| HP:0000639 | Nystagmus |
| HP:0000708 | Atypical behavior |
| HP:0000716 | Depression |
| HP:0000729 | Autistic behavior |
| HP:0000733 | Motor stereotypy |
| HP:0000739 | Anxiety |
| HP:0000742 | Self-mutilation |
| HP:0000750 | Delayed speech and language development |
| HP:0000824 | Decreased response to growth hormone stimulation test |
| HP:0000954 | Single transverse palmar crease |
| HP:0001188 | Hand clenching |
| HP:0001249 | Intellectual disability |
| HP:0001250 | Seizure |
| HP:0001252 | Hypotonia |
GWAS associations
1 association (top):
| Study | Trait | p-value |
|---|---|---|
| GCST90002381_503 | Eosinophil count | 2.000000e-10 |
EFO canonical traits (1, from GWAS)
| EFO ID | Trait name |
|---|---|
| EFO:0004842 | eosinophil count |
MeSH disease descriptors (3)
| Descriptor | Name | Tree numbers |
|---|---|---|
| D001321 | Autistic Disorder | F03.625.164.113.500 |
| D008607 | Intellectual Disability | C10.597.606.360; C23.888.592.604.646; F01.700.687; F03.625.539 |
| D065886 | Neurodevelopmental Disorders | F03.625 |
Drugs & pharmacology
Drug and pharmacology data
Is drug target: yes
ChEMBL targets (1): CHEMBL6066378 (SINGLE PROTEIN)
PharmGKB: 1 entry (VIP=true, CPIC=false)
ChEMBL bioactivities
4 potent at pChembl≥5 of 4 total, top 4 by pChembl (potency: 10 = 0.1 nM, 6 = 1 µM).
| pChembl | Type | Value | Unit | Molecule |
|---|---|---|---|---|
| 5.69 | Kd | 2050 | nM | CHEMBL5653589 |
| 5.69 | ED50 | 2050 | nM | CHEMBL5653589 |
| 5.08 | Kd | 8402 | nM | CHEMBL3752910 |
| 5.08 | ED50 | 8402 | nM | CHEMBL3752910 |
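The potency scale quoted above (10 = 0.1 nM, 6 = 1 µM) is consistent with pChEMBL being the negative base-10 logarithm of the molar concentration. As a minimal sketch under that assumption, the Python snippet below recomputes the pChembl column from the nM values in the table; `pchembl_from_nm` is a hypothetical helper name, not a ChEMBL API.

```python
import math


def pchembl_from_nm(value_nm: float) -> float:
    """Return -log10 of a concentration in mol/L, given the value in nM."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    # 1 nM = 1e-9 M, so -log10(value_nm * 1e-9) = 9 - log10(value_nm)
    return 9 - math.log10(value_nm)


# nM values taken from the bioactivity table above
for value_nm in (2050, 8402):
    print(value_nm, "nM ->", round(pchembl_from_nm(value_nm), 2))
```

Rounded to two decimals, the two table values give 5.69 and 5.08, matching the pChembl column.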
PubChem BioAssay actives
2 with measured affinity, of 4 total; 2 most potent distinct compounds. Largely complementary to BindingDB; screening values are coarse (µM, 4 dp), so sub-nM hits tie at the floor.
| Compound | Assay | Type | Value | Unit |
|---|---|---|---|---|
| 4-methyl-3-[(2-methyl-6-pyridin-3-ylpyrazolo[3,4-d]pyrimidin-4-yl)amino]-N-[3-(trifluoromethyl)phenyl]benzamide | 2149582: Binding affinity to human THOC2 incubated for 45 mins by Kinobead based pull down assay | Kd | 2.0497 | µM |
| 4-methyl-3-[(1-methyl-6-pyridin-3-ylpyrazolo[3,4-d]pyrimidin-4-yl)amino]-N-[3-(trifluoromethyl)phenyl]benzamide | 2149582: Binding affinity to human THOC2 incubated for 45 mins by Kinobead based pull down assay | Kd | 8.4024 | µM |
CTD chemical–gene interactions
52 total (human), top 30 by PubMed support.
| Chemical | Actions (top 5) | PubMed papers |
|---|---|---|
| bisphenol A | decreases expression | 2 |
| sodium arsenite | decreases expression, increases expression | 2 |
| Valproic Acid | decreases expression, increases methylation | 2 |
| aristolochic acid I | decreases expression | 1 |
| FR900359 | affects phosphorylation | 1 |
| bisphenol F | affects cotreatment, increases expression | 1 |
| TAK-243 | decreases sumoylation | 1 |
| methylmercuric chloride | decreases expression | 1 |
| triphenyl phosphate | affects expression | 1 |
| alpha-pinene | affects cotreatment, increases oxidation, increases abundance | 1 |
| titanium dioxide | increases methylation | 1 |
| decabromobiphenyl ether | increases expression | 1 |
| beta-lapachone | decreases expression | 1 |
| methylparaben | increases expression | 1 |
| tetrabromobisphenol A | decreases expression | 1 |
| potassium chromate(VI) | affects cotreatment, decreases expression | 1 |
| aflatoxin B2 | increases methylation | 1 |
| coumarin | decreases phosphorylation | 1 |
| methacrylaldehyde | increases abundance, affects cotreatment, increases oxidation | 1 |
| beta-methylcholine | affects expression | 1 |
| epigallocatechin gallate | affects cotreatment, decreases expression | 1 |
| di-n-butylphosphoric acid | affects expression | 1 |
| perfluorooctane sulfonic acid | decreases expression | 1 |
| CGP 52608 | affects binding, increases reaction | 1 |
| 2,2',4,4'-tetrabromodiphenyl ether | decreases expression | 1 |
| pentabrominated diphenyl ether 100 | decreases expression | 1 |
| hexabrominated diphenyl ether 153 | decreases expression | 1 |
| jinfukang | decreases expression | 1 |
| Resveratrol | affects cotreatment, increases expression | 1 |
| Sunitinib | increases expression | 1 |
ChEMBL screening assays
1 unique, capped per target: 1 binding
Representative assays (with source publication via chembl_document):
| Assay ID | Type | Description | Source paper |
|---|---|---|---|
| CHEMBL5652624 | Binding | Binding affinity to human THOC2 incubated for 45 mins by Kinobead based pull down assay | NVP-BHG712: Effects of Regioisomers on the Affinity and Selectivity toward the EPHrin Family. — ChemMedChem |
Clinical trials (associated diseases)
298 trials via MONDO — disease-level, not drug-specific.
| Trial | Phase | Status | Title |
|---|---|---|---|
| NCT05657860 | PHASE4 | COMPLETED | Guanfacine Extended Release for the Reduction of Aggression and Self-injurious Behavior Associated With Prader-Willi Syndrome |
| NCT05744479 | PHASE4 | RECRUITING | Metformin for Antipsychotic-induced Weight Gain in Adults With Intellectual Disability |
| NCT06107829 | PHASE4 | WITHDRAWN | Valbenazine Treatment of Tardive Dyskinesia in Adults With Intellectual/Developmental Disabilities |
| NCT06997198 | PHASE4 | NOT_YET_RECRUITING | Deutetrabenazine Treatment for Tardive Dyskinesia in Intellectual/Developmental Disabilities |
| NCT04586348 | PHASE4 | UNKNOWN | Prenatal Iodine Supplementation and Early Childhood Neurodevelopment |
| NCT04873115 | PHASE4 | UNKNOWN | Double-blind, Placebo-controlled, Randomized Clinical Trial Comparing the Efficacy and Safety of Sialanar Plus orAl rehabiLitation Against Placebo Plus Oral Rehabilitation for chIldren and Adolescents With seVere Sialorrhoea and Neurodisabilties, |
| NCT02270736 | PHASE3 | COMPLETED | Clinical Study to Investigate the Efficacy and Safety of NT 201 Compared to Placebo in the Treatment of Chronic Troublesome Drooling Associated With Neurological Disorders and/or Intellectual Disability |
| NCT02559102 | PHASE3 | COMPLETED | Dexmedetomidine Sedation Versus General Anaesthesia for Inguinal Hernia Surgery in Infants |
| NCT02757079 | PHASE3 | COMPLETED | Study of the Efficacy and Safety of NPC-15 for Sleep Disorders of Children With Neurodevelopmental Disorders |
| NCT06915480 | PHASE3 | RECRUITING | Reducing Missed Appointments |
| NCT07377032 | PHASE3 | RECRUITING | TAP-GRIN: Interventional Study on Patients With GRIN-related Neurodevelopmental Disorders |
| NCT02304302 | PHASE2 | COMPLETED | Down Syndrome Memantine Follow-up Study |
| NCT03862950 | PHASE2 | COMPLETED | A Trial of Metformin in Individuals With Fragile X Syndrome (Met) |
| NCT04529226 | PHASE2 | UNKNOWN | Study to Compare Clozapine vs Treatment as Usual in People With Intellectual Disability & Treatment-resistant Psychosis |
| NCT04821856 | PHASE2 | COMPLETED | Evaluation of the Effectiveness of Cannabidiol in Treating Severe Behavioural Problems in Children and Adolescents With Intellectual Disability |
| NCT02909959 | PHASE2 | COMPLETED | Sulforaphane for the Treatment of Young Men With Autism Spectrum Disorder |
| NCT06081348 | PHASE2 | RECRUITING | Sertraline vs. Placebo in the Treatment of Anxiety in Children and AdoLescents With NeurodevelopMental Disorders |
| NCT06352372 | PHASE2 | COMPLETED | Safety and Efficacy of tPBM for Epileptiform Activity in Autism |
| NCT05273320 | PHASE1 | COMPLETED | Clinical Trial of Nabilone for Aggression in Adults With Intellectual and Developmental Disabilities |
| NCT05301361 | PHASE1 | ENROLLING_BY_INVITATION | Sensitivity of the NIH Toolbox to Stimulant Treatment in Intellectual Disabilities |
| NCT06016764 | PHASE1 | COMPLETED | Use of MRI and cTBS for Catatonia in Autism |
| NCT06586827 | PHASE1 | COMPLETED | Impact of Competency-Based Training and Technical Assistance Employment Outcomes of Individuals With ID/DD |
| NCT07531940 | PHASE1 | NOT_YET_RECRUITING | Escalating Doses of Memantine in Down Syndrome (MEDS-123) |
| NCT00503191 | PHASE1 | COMPLETED | NeuroModulation Technique Treatment of Autism |
| NCT04475848 | PHASE1 | COMPLETED | A Study to Investigate the Safety, Tolerability, Pharmacokinetics, Pharmacodynamics and Food Effect of RO6953958 in Healthy Participants |
| NCT06300398 | PHASE1 | COMPLETED | IAMA-6 Oral Dose Study in Healthy Adults |
| NCT03479476 | PHASE2/PHASE3 | COMPLETED | A Trial of Metformin in Individuals With Fragile X Syndrome |
| NCT02616796 | PHASE1/PHASE2 | COMPLETED | Effects of Social Gaze Training on Brain and Behavior in Fragile X Syndrome |
| NCT06860672 | EARLY_PHASE1 | RECRUITING | Clinical Trial of the Dual Vector Base Editor for the Treatment of the CHD3-R1025W Mutation |
| NCT00597948 | Not specified | COMPLETED | Healthy Lifestyles for People With Intellectual Disabilities |
| NCT01087320 | Not specified | RECRUITING | Genome Medical Sequencing for Gene Discovery |
| NCT01652963 | Not specified | UNKNOWN | Picture-based Computerised Assessment and Training of Cognitive Behaviour Therapy Skills |
| NCT01695395 | Not specified | COMPLETED | Mental Health Care Provision for Adults With Intellectual Disability and a Mental Disorder |
| NCT01867554 | Not specified | COMPLETED | Research and Characterization of New Genes Involved in Intellectual Disability |
| NCT01915381 | Not specified | COMPLETED | Improving Adherence Healthy Lifestyle With a Smartphone Application Based on Adults With Intellectual Disabilities |
| NCT01988623 | Not specified | COMPLETED | Pivotal Response Treatment for Individuals With Intellectual Disabilities |
| NCT02099773 | Not specified | COMPLETED | Support Staff-client Interactions With Augmentative and Alternative Communication |
| NCT02136849 | Not specified | COMPLETED | Inter-regional Project of the Great Western Exploration Approach for Exome Molecular Causes Severe Intellectual Disability Isolated or Syndromic |
| NCT02225041 | Not specified | COMPLETED | Sedation Strategy and Cognitive Outcome After Critical Illness in Early Childhood |
| NCT02414438 | Not specified | COMPLETED | Establishing the Clinical Utility of First StepDx PLUS and NextStepDx PLUS Study |
Related Atlas pages
- Associated diseases: X-linked intellectual disability-short stature-overweight syndrome, X-linked complex neurodevelopmental disorder
- Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): arthrogryposis multiplex congenita 7, X-linked, immunodeficiency 127, X-linked intellectual disability-short stature-overweight syndrome