CPSF1
gene
Also known as CPSF160
Summary
CPSF1 (cleavage and polyadenylation specific factor 1, HGNC:2324) is a protein-coding gene on chromosome 8q24.3 that encodes cleavage and polyadenylation specificity factor subunit 1 (UniProt Q10570). The protein is a component of the cleavage and polyadenylation specificity factor (CPSF) complex, which plays a key role in pre-mRNA 3’-end formation by recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. CPSF1 is a common-essential gene (DepMap: required in 98.8% of cancer cell lines).
Cleavage and polyadenylation specificity factor (CPSF) is a multisubunit complex that plays a central role in 3-prime processing of pre-mRNAs. CPSF recognizes the AAUAAA signal in the pre-mRNA and interacts with other proteins to facilitate both RNA cleavage and poly(A) synthesis. CPSF1 is the largest subunit of the CPSF complex (Murthy and Manley, 1995 [PubMed 7590244]).
Source: NCBI Gene 29894 — RefSeq curated summary.
At a glance
- Gene–disease (curated): myopia 27 (Strong, GenCC)
- GWAS associations: 1
- Clinical variants (ClinVar): 318 total — 5 pathogenic, 3 likely-pathogenic
- Phenotypes (HPO): 3
- Druggable target: yes (1 molecule with ChEMBL bioactivity)
- Cancer dependency (DepMap): dependent in 98.8% of screened cell lines (common-essential)
- MANE Select transcript: NM_013291
Identifiers
Gene identifiers
| Field | Value |
|---|---|
| HGNC ID | HGNC:2324 |
| Approved symbol | CPSF1 |
| Name | cleavage and polyadenylation specific factor 1 |
| Location | 8q24.3 |
| Locus type | gene with protein product |
| Status | Approved |
| Aliases | CPSF160 |
| Ensembl gene | ENSG00000071894 |
| Ensembl biotype | protein_coding |
| OMIM | 606027 |
| Entrez | 29894 |
Gene structure
Transcript identifiers
Ensembl transcripts: 31 — 21 protein_coding, 9 retained_intron, 1 nonsense_mediated_decay
ENST00000526271, ENST00000527827, ENST00000527916, ENST00000529288, ENST00000531042, ENST00000531480, ENST00000531727, ENST00000532560, ENST00000532725, ENST00000532935, ENST00000533492, ENST00000616140, ENST00000620219, ENST00000622776, ENST00000886809, ENST00000886810, ENST00000886811, ENST00000886812, ENST00000886813, ENST00000886814, ENST00000886815, ENST00000886816, ENST00000886817, ENST00000886818, ENST00000886819, ENST00000913993, ENST00000913994, ENST00000913995, ENST00000913996, ENST00000913997, ENST00000913998
RefSeq mRNA: 1 — MANE Select: NM_013291
NM_013291
CCDS: CCDS34966
Canonical transcript exons
ENST00000616140 — 38 exons
| Exon | Start | End |
|---|---|---|
| ENSE00003482940 | 144401211 | 144401291 |
| ENSE00003614369 | 144401430 | 144401563 |
| ENSE00003712245 | 144398779 | 144398868 |
| ENSE00003713910 | 144400354 | 144400493 |
| ENSE00003716011 | 144400671 | 144400817 |
| ENSE00003717346 | 144394234 | 144394300 |
| ENSE00003718378 | 144400924 | 144401075 |
| ENSE00003718996 | 144409015 | 144409172 |
| ENSE00003720349 | 144398958 | 144399038 |
| ENSE00003725856 | 144409289 | 144409335 |
| ENSE00003727674 | 144399992 | 144400085 |
| ENSE00003738057 | 144394113 | 144394160 |
| ENSE00003744072 | 144393452 | 144393590 |
| ENSE00003744148 | 144393231 | 144393365 |
| ENSE00003748479 | 144393667 | 144393796 |
| ENSE00003788623 | 144393883 | 144394038 |
| ENSE00003889080 | 144395098 | 144395182 |
| ENSE00003889229 | 144398525 | 144398638 |
| ENSE00003889326 | 144399781 | 144399868 |
| ENSE00003889354 | 144397954 | 144398132 |
| ENSE00003889632 | 144395265 | 144395355 |
| ENSE00003889855 | 144396348 | 144396500 |
| ENSE00003890292 | 144397743 | 144397879 |
| ENSE00003890322 | 144394882 | 144395023 |
| ENSE00003890453 | 144401646 | 144401673 |
| ENSE00003890597 | 144397207 | 144397413 |
| ENSE00003890813 | 144396598 | 144396741 |
| ENSE00003890831 | 144399276 | 144399373 |
| ENSE00003891118 | 144396840 | 144396929 |
| ENSE00003891484 | 144399128 | 144399202 |
| ENSE00003892421 | 144395435 | 144395551 |
| ENSE00003893798 | 144399452 | 144399503 |
| ENSE00003893840 | 144394379 | 144394555 |
| ENSE00003894660 | 144398302 | 144398443 |
| ENSE00003894668 | 144399588 | 144399710 |
| ENSE00003895112 | 144397487 | 144397661 |
| ENSE00003895529 | 144400166 | 144400276 |
| ENSE00003895575 | 144394644 | 144394796 |
Expression profiles
Bgee: expression breadth ubiquitous, 134 present calls, max score 98.97.
FANTOM5 (CAGE): breadth ubiquitous, TPM avg 30.1606 / max 162.1670, expressed in 1805 samples.
FANTOM5 promoters (7 alternative TSS)
| Promoter ID | TPM avg | Samples expressed |
|---|---|---|
| 95664 | 24.1346 | 1803 |
| 95665 | 5.0429 | 1540 |
| 95663 | 0.7342 | 448 |
| 95666 | 0.2036 | 102 |
| 95660 | 0.0217 | 7 |
| 95662 | 0.0203 | 9 |
| 95661 | 0.0034 | 3 |
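The promoter table can be cross-checked against the gene-level FANTOM5 figure with a little arithmetic. The sketch below assumes that the gene-level TPM average is approximately the sum of the per-promoter averages; the page does not state this, although the numbers agree to within rounding. Variable names are illustrative only.

```python
# Per-promoter FANTOM5 TPM averages, copied from the table above.
promoter_tpm = {
    95664: 24.1346,
    95665: 5.0429,
    95663: 0.7342,
    95666: 0.2036,
    95660: 0.0217,
    95662: 0.0203,
    95661: 0.0034,
}
GENE_TPM_AVG = 30.1606  # gene-level FANTOM5 average reported above

# Assumption: promoter-level averages add up to the gene-level average.
total = sum(promoter_tpm.values())
shares = {pid: tpm / total for pid, tpm in promoter_tpm.items()}
dominant = max(shares, key=shares.get)

print(f"sum of promoter averages: {total:.4f} (gene-level: {GENE_TPM_AVG})")
print(f"dominant promoter {dominant}: {shares[dominant]:.1%} of summed signal")
```

Under that assumption, promoter 95664 carries roughly four fifths of the summed CAGE signal, with 95665 contributing most of the remainder.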
Top tissues by expression
134 total, by Bgee expression score (0-100, higher = more expressed):
| Tissue | Anatomy ID | Expression score | Quality |
|---|---|---|---|
| right testis | UBERON:0004534 | 98.97 | gold quality |
| left testis | UBERON:0004533 | 98.90 | gold quality |
| right hemisphere of cerebellum | UBERON:0014890 | 98.51 | gold quality |
| right uterine tube | UBERON:0001302 | 98.49 | gold quality |
| cerebellar hemisphere | UBERON:0002245 | 98.38 | gold quality |
| cerebellum | UBERON:0002037 | 98.36 | gold quality |
| cerebellar cortex | UBERON:0002129 | 98.36 | gold quality |
| granulocyte | CL:0000094 | 98.35 | gold quality |
| pituitary gland | UBERON:0000007 | 98.07 | gold quality |
| lower esophagus mucosa | UBERON:0035834 | 98.00 | gold quality |
| adenohypophysis | UBERON:0002196 | 97.99 | gold quality |
| duodenum | UBERON:0002114 | 97.92 | gold quality |
| testis | UBERON:0000473 | 97.82 | gold quality |
| mucosa of transverse colon | UBERON:0004991 | 97.79 | gold quality |
| spleen | UBERON:0002106 | 97.77 | gold quality |
| small intestine Peyer’s patch | UBERON:0003454 | 97.75 | gold quality |
| small intestine | UBERON:0002108 | 97.51 | gold quality |
| right ovary | UBERON:0002118 | 97.44 | gold quality |
| body of pancreas | UBERON:0001150 | 97.35 | gold quality |
| right lobe of thyroid gland | UBERON:0001119 | 97.31 | gold quality |
| metanephros cortex | UBERON:0010533 | 97.31 | gold quality |
| body of stomach | UBERON:0001161 | 97.29 | gold quality |
| prostate gland | UBERON:0002367 | 97.26 | gold quality |
| transverse colon | UBERON:0001157 | 97.11 | gold quality |
| primary visual cortex | UBERON:0002436 | 97.08 | gold quality |
| left lobe of thyroid gland | UBERON:0001120 | 97.00 | gold quality |
| left ovary | UBERON:0002119 | 96.98 | gold quality |
| apex of heart | UBERON:0002098 | 96.91 | gold quality |
| body of uterus | UBERON:0009853 | 96.88 | gold quality |
| tibial nerve | UBERON:0001323 | 96.78 | gold quality |
Single-cell (SCXA)
Detected in 2 experiments; a significant marker in 1.
| Experiment | Marker? | Max mean expression |
|---|---|---|
| E-ANND-3 | yes | 4.85 |
| E-GEOD-110499 | no | 51.92 |
Regulation
Is transcription factor: no
Functional genomics
DepMap (CRISPR cell-line fitness): dependent in 98.8% of screened cell lines, common-essential.
Literature-anchored findings (GeneRIF, showing 6)
- CPSF1 was identified among protein-binding candidates. A consensus polyadenylation signal AAUAAA is present in intron 6 of IL7 receptor directly downstream from the 5’ splice site. Mutations to this site and CPSF1 knockdown increased exon 6 inclusion. (PMID:23151878)
- The authors define the molecular architecture of the core human CPSF complex comprising CPSF160, WDR33, CPSF30 and Fip1 and identify specific domains involved in inter-subunit interactions. Together, these results shed light on the function of CPSF in mediating polyA signal-dependent RNA cleavage and polyadenylation. (PMID:29274231)
- Knockdown of CPSF inhibits the proliferation, migration and invasion of ovarian carcinoma cells. (PMID:29358555)
- Cryo-electron microscopic analysis of a core CPSF module bound to the polyadenylation signal hexamer motif. (PMID:29358758)
- Heterozygous loss-of-function mutations in CPSF1 are associated with early-onset high myopia, and CPSF1 may play an important role in the development of retinal ganglion cell axon projection. (PMID:30689892)
- Epidermal progenitors suppress GRHL3-mediated differentiation through intronic polyadenylation promoted by CPSF-HNRNPA3 collaboration. (PMID:33469008)
Cross-species orthologs
5 orthologs
| Organism | Symbol | Gene ID |
|---|---|---|
| danio_rerio | cpsf1 | ENSDARG00000034178 |
| mus_musculus | Cpsf1 | ENSMUSG00000034022 |
| rattus_norvegicus | Cpsf1 | ENSRNOG00000030705 |
| drosophila_melanogaster | Cpsf160 | FBGN0024698 |
| caenorhabditis_elegans | | WBGENE00022301 |
Paralogs (2): DDB1 (ENSG00000167986), SF3B3 (ENSG00000189091)
Protein
Protein identifiers
Cleavage and polyadenylation specificity factor subunit 1 — Q10570 (reviewed: Q10570)
Alternative names: Cleavage and polyadenylation specificity factor 160 kDa subunit
All UniProt accessions (4): Q10570, A0A087WTV4, A0A087X101, E9PIM1
UniProt curated annotations
Function. Component of the cleavage and polyadenylation specificity factor (CPSF) complex that plays a key role in pre-mRNA 3’-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. This subunit is involved in the RNA recognition step of the polyadenylation reaction. May play a role in eye morphogenesis and the development of retinal ganglion cell projections to the midbrain.
Subunit / interactions. Component of the cleavage and polyadenylation specificity factor (CPSF) complex, composed of CPSF1, CPSF2, CPSF3, CPSF4 and FIP1L1. Found in a complex with CPSF1, FIP1L1 and PAPOLA. Interacts with FIP1L1, TENT2/GLD2 and SRRM1. Interacts with TUT1; the interaction is direct and mediates the recruitment of the CPSF complex on the 3’UTR of selected pre-mRNAs.
Subcellular location. Nucleus. Nucleoplasm.
Tissue specificity. Widely expressed, with high expression in the retina.
Post-translational modifications. The N-terminus is blocked.
Disease relevance. Myopia 27, autosomal dominant (MYP27) [MIM:618827]: a form of myopia, a refractive error of the eye in which parallel rays from a distant object come to focus in front of the retina, so that vision is better for near objects than for far ones. MYP27 patients are affected by early-onset high myopia with increased axial lengths. Fundus changes include optic nerve head crescent and tigroid appearance of the posterior retina. The disease is caused by variants affecting the gene represented in this entry.
Similarity. Belongs to the CPSF1 family.
RefSeq proteins (1): NP_037423* (*=MANE)
Domains & families (InterPro)
| ID | Name | Type |
|---|---|---|
| IPR004871 | RSE1/DDB1/CPSF1_C | Domain |
| IPR015943 | WD40/YVTN_repeat-like_dom_sf | Homologous_superfamily |
| IPR018846 | Beta-prop_RSE1/DDB1/CPSF1_1st | Domain |
| IPR050358 | RSE1/DDB1/CFT1 | Family |
| IPR058543 | Beta-prop_RSE1/DDB1/CPSF1_2nd | Domain |
Pfam: PF03178, PF10433, PF23726
UniProt features (146 total): strand 103, turn 14, helix 14, region of interest 4, sequence variant 3, sequence conflict 2, compositionally biased region 2, modified residue 2, chain 1, short sequence motif 1
Structure
Experimental structures (PDB)
13 structures.
| PDB | Method | Resolution (Å) |
|---|---|---|
| 9T3X | ELECTRON MICROSCOPY | 2.1 |
| 6F9N | X-RAY DIFFRACTION | 2.5 |
| 8E3I | ELECTRON MICROSCOPY | 2.53 |
| 9OXE | ELECTRON MICROSCOPY | 2.53 |
| 8E3Q | ELECTRON MICROSCOPY | 2.68 |
| 8R8R | ELECTRON MICROSCOPY | 2.79 |
| 6URG | ELECTRON MICROSCOPY | 3 |
| 6FUW | ELECTRON MICROSCOPY | 3.07 |
| 9OXS | ELECTRON MICROSCOPY | 3.07 |
| 6BLY | ELECTRON MICROSCOPY | 3.36 |
| 6DNH | ELECTRON MICROSCOPY | 3.4 |
| 6URO | ELECTRON MICROSCOPY | 3.6 |
| 6BM0 | ELECTRON MICROSCOPY | 3.8 |
Predicted structure (AlphaFold)
| Model | pLDDT | Fraction very-high |
|---|---|---|
| AF-Q10570-F1 | 82.97 | 0.62 |
Functional residue map
Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.
Post-translational modifications (2): 756, 766
Function
Pathways and Gene Ontology
Reactome pathways
8 pathways
| ID | Pathway |
|---|---|
| R-HSA-159231 | Transport of Mature mRNA Derived from an Intronless Transcript |
| R-HSA-6784531 | tRNA processing in the nucleus |
| R-HSA-72187 | mRNA 3’-end processing |
| R-HSA-73856 | RNA Polymerase II Transcription Termination |
| R-HSA-77595 | Processing of Intronless Pre-mRNAs |
| R-HSA-9770562 | mRNA Polyadenylation |
| R-HSA-9918481 | Dengue Virus-Host Interactions |
| R-HSA-72203 | Processing of Capped Intron-Containing Pre-mRNA |
MSigDB gene sets: 133 (showing top 15):
WANG_CLIM2_TARGETS_UP, ACEVEDO_NORMAL_TISSUE_ADJACENT_TO_LIVER_TUMOR_DN, BROWNE_HCMV_INFECTION_48HR_DN, ONKEN_UVEAL_MELANOMA_UP, BLALOCK_ALZHEIMERS_DISEASE_UP, REACTOME_MRNA_3_END_PROCESSING, REACTOME_PROCESSING_OF_CAPPED_INTRON_CONTAINING_PRE_MRNA, GOBP_MRNA_3_END_PROCESSING, BROWNE_HCMV_INFECTION_14HR_DN, LIAO_METASTASIS, DANG_BOUND_BY_MYC, DEURIG_T_CELL_PROLYMPHOCYTIC_LEUKEMIA_UP, SHEN_SMARCA2_TARGETS_DN, REACTOME_METABOLISM_OF_RNA, PARENT_MTOR_SIGNALING_UP
GO Biological Process (3): mRNA processing (GO:0006397), co-transcriptional RNA 3’-end processing, cleavage and polyadenylation pathway (GO:0180012), RNA 3’-end processing (GO:0031123)
GO Molecular Function (5): enzyme binding (GO:0019899), mRNA 3’-UTR AU-rich region binding (GO:0035925), nucleic acid binding (GO:0003676), RNA binding (GO:0003723), protein binding (GO:0005515)
GO Cellular Component (3): nucleus (GO:0005634), nucleoplasm (GO:0005654), mRNA cleavage and polyadenylation specificity factor complex (GO:0005847)
Reactome top-level categories
Rollup of top-8 pathways:
| Category | Pathways |
|---|---|
| Transport of Mature mRNAs Derived from Intronless Transcripts | 1 |
| tRNA processing | 1 |
| Processing of Capped Intron-Containing Pre-mRNA | 1 |
| RNA Polymerase II Transcription | 1 |
| Processing of Capped Intronless Pre-mRNA | 1 |
| mRNA 3’-end processing | 1 |
| Dengue Virus Infection | 1 |
| Metabolism of RNA | 1 |
GO top-level categories
Rollup of top GO terms by namespace:
| Category | Terms |
|---|---|
| RNA processing | 2 |
| binding | 2 |
| mRNA metabolic process | 1 |
| RNA 3’-end processing | 1 |
| protein binding | 1 |
| mRNA 3’-UTR binding | 1 |
| nucleic acid binding | 1 |
| intracellular membrane-bounded organelle | 1 |
| nuclear lumen | 1 |
| cellular anatomical structure | 1 |
| mRNA cleavage factor complex | 1 |
Protein interactions and networks
STRING
2282 interactions, top by confidence (×1000):
| Protein A | Protein B | Partner UniProt | Score |
|---|---|---|---|
| CPSF1 | CPSF4 | O95639 | 999 |
| CPSF1 | CPSF2 | Q9P2I0 | 999 |
| CPSF1 | CPSF3 | Q9UKF6 | 998 |
| CPSF1 | CSTF3 | Q12996 | 998 |
| CPSF1 | WDR33 | Q9C0J8 | 998 |
| CPSF1 | FIP1L1 | Q6UN15 | 993 |
| CPSF1 | PAPOLA | P51003 | 968 |
| CPSF1 | SYMPK | Q92797 | 961 |
| CPSF1 | CSTF1 | Q05048 | 919 |
| CPSF1 | CSTF2 | P33240 | 919 |
| CPSF1 | CPSF7 | Q8N684 | 894 |
| CPSF1 | NUDT21 | O43809 | 859 |
| CPSF1 | PAPOLG | Q9BWT3 | 828 |
| CPSF1 | PAPOLB | Q9NRJ5 | 818 |
| CPSF1 | CPSF6 | Q16630 | 803 |
IntAct
132 interactions, top by confidence:
| A | B | Type | Score |
|---|---|---|---|
| CPSF1 | CPSF4 | physical association (MI:0915) | 0.670 |
| CPSF3 | CPSF4 | association (MI:0914) | 0.640 |
| CDC73 | CSTF2 | association (MI:0914) | 0.580 |
| CDC73 | CSTF2 | physical association (MI:0915) | 0.580 |
| CPSF1 | REL | physical association (MI:0915) | 0.560 |
| NS1 | PIK3R2 | association (MI:0914) | 0.560 |
| CPSF1 | DDIT4L | physical association (MI:0915) | 0.560 |
| INCA1 | CPSF1 | physical association (MI:0915) | 0.560 |
| CPSF1 | CPSF4 | physical association (MI:0915) | 0.560 |
| SYMPK | CPSF4 | association (MI:0914) | 0.530 |
| NCBP3 | SAP18 | association (MI:0914) | 0.530 |
| WDR33 | CPSF4 | association (MI:0914) | 0.530 |
| NS1 | PIK3R2 | association (MI:0914) | 0.530 |
| CDC73 | CPSF4 | association (MI:0914) | 0.500 |
| CPSF1 | NPM1 | physical association (MI:0915) | 0.500 |
| CPSF6 | DDX39A | association (MI:0914) | 0.480 |
| DDX21 | MED19 | proximity (MI:2364) | 0.480 |
| TUT1 | CPSF1 | association (MI:0914) | 0.460 |
| RPP38 | CPSF1 | physical association (MI:0915) | 0.400 |
| CPSF1 | WWOX | physical association (MI:0915) | 0.400 |
| ECE1 | CPSF1 | physical association (MI:0915) | 0.370 |
BioGRID (357): CPSF1 (Affinity Capture-Western), CPSF1 (Two-hybrid), CPSF1 (Affinity Capture-RNA), CPSF1 (Affinity Capture-RNA), CPSF1 (Affinity Capture-MS), CPSF1 (Affinity Capture-MS), CPSF1 (Affinity Capture-MS), CPSF1 (Affinity Capture-MS), CPSF1 (Affinity Capture-MS), CPSF1 (Co-fractionation), CPSF1 (Co-fractionation), CPSF2 (Co-fractionation), CPSF3 (Co-fractionation), CPSF4 (Co-fractionation), CSTF2 (Co-fractionation)
ESM2 similar proteins: A0A0R4IC37, A1A4K3, A2CEI4, B1WC10, E9PY46, F1QEB7, F4IDS7, O08658, O13046, O75694, O75717, O95876, P33194, P37199, P59328, Q08D69, Q10569, Q10570, Q16531, Q32NR9, Q3U1J4, Q4ADV7, Q566H4, Q5DQR4, Q5R649, Q5U1Z0, Q5ZLG9, Q6P6Z0, Q6PGF3, Q6PJI9, Q7XWP1, Q802U2, Q805F9, Q8BMG7, Q8C0M0, Q8C456, Q8CEC0, Q8CJF7, Q8K1X1, Q8NFP9
Diamond homologs: A0A0R4IC37, A8XPU7, Q10569, Q10570, Q7XWP1, Q9EPU4, Q9FGR0, Q9N4C2, Q9V726, Q4WLI5, Q5B1X8
SIGNOR signaling
1 interaction.
| A | Effect | B | Mechanism |
|---|---|---|---|
| CPSF1 | form complex | CPSF complex | binding |
Enriched among interaction partners
Reactome pathways and GO biological processes over-represented among this gene’s 120 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.
Reactome pathways:
| Pathway | Partners | Fold | FDR |
|---|---|---|---|
| Processing of Intronless Pre-mRNAs | 8 | 53.7× | 1e-10 |
| RNA Polymerase II Transcription Termination | 10 | 25.8× | 4e-10 |
| mRNA 3’-end processing | 11 | 25.5× | 6e-11 |
| mRNA Polyadenylation | 20 | 20.7× | 1e-18 |
| Transport of Mature mRNA Derived from an Intronless Transcript | 6 | 19.2× | 4e-05 |
| Processing of Capped Intron-Containing Pre-mRNA | 13 | 12.6× | 2e-09 |
| mRNA Splicing | 8 | 10.3× | 5e-05 |
| Dengue Virus-Host Interactions | 18 | 9.7× | 6e-11 |
GO biological processes:
| GO term | Partners | Fold | FDR |
|---|---|---|---|
| mRNA 3’-end processing | 7 | 36.4× | 5e-07 |
| regulation of alternative mRNA splicing, via spliceosome | 5 | 11.3× | 8e-03 |
| mRNA splicing, via spliceosome | 11 | 9.3× | 7e-06 |
| mRNA processing | 11 | 8.0× | 2e-05 |
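The enrichment procedure described above (hypergeometric test against a genome-wide background, Benjamini-Hochberg FDR, gene-set size window of 15 to 500, ranking by fold) can be sketched with the standard library alone. This is a minimal illustration, not the pipeline that produced the tables: the background size of 20000 and the three gene sets are hypothetical placeholders, and only the partner count of 120 comes from the text.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of at least k gene-set members among N partners,
    given a background of M genes of which n belong to the gene set."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def fold_enrichment(k, M, n, N):
    """Observed fraction of partners in the set over the background fraction."""
    return (k / N) / (n / M)


def bh_fdr(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: background size and gene sets are NOT from the page.
BACKGROUND = 20000   # assumed genome-wide background
N_PARTNERS = 120     # IntAct physical interaction partners, as stated above
gene_sets = {        # name: (set size n, partners in set k), all hypothetical
    "set_A": (30, 8),
    "set_B": (200, 10),
    "set_C": (10, 3),  # dropped below: outside the 15-500 size window
}

kept = {name: nk for name, nk in gene_sets.items() if 15 <= nk[0] <= 500}
pvals = [hypergeom_sf(k, BACKGROUND, n, N_PARTNERS) for n, k in kept.values()]
fdrs = bh_fdr(pvals)
rows = sorted(
    ((name, fold_enrichment(k, BACKGROUND, n, N_PARTNERS), fdr)
     for (name, (n, k)), fdr in zip(kept.items(), fdrs)),
    key=lambda row: row[1],
    reverse=True,  # rank by fold, highest first
)
for name, fold, fdr in rows:
    print(f"{name}: fold={fold:.1f}x FDR={fdr:.1e}")
```

Ranking by fold rather than by FDR favours small, tightly matched gene sets, which is consistent with the caveat above that the output is best read as themes rather than proof.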
Disease & clinical
Clinical variants and AI predictions
ClinVar
318 variants total. Per-class counts are floors (≥ shown; pagination cap):
| Classification | Count (floor) |
|---|---|
| Pathogenic | 5 |
| Likely pathogenic | 3 |
| Uncertain significance | 233 |
| Likely benign | 18 |
| Benign | 1 |
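Because the per-class counts are floors, they need not add up to the ClinVar total. The short sketch below does the arithmetic with the numbers from the table. How to interpret the remainder is an assumption: it may reflect classes not listed here or records beyond the pagination cap.

```python
# Counts copied from the ClinVar table above; per-class values are floors.
clinvar_total = 318
class_floors = {
    "Pathogenic": 5,
    "Likely pathogenic": 3,
    "Uncertain significance": 233,
    "Likely benign": 18,
    "Benign": 1,
}

floor_sum = sum(class_floors.values())
unaccounted = clinvar_total - floor_sum  # not attributed to any listed class
p_lp = class_floors["Pathogenic"] + class_floors["Likely pathogenic"]

print(f"listed classes cover at least {floor_sum} of {clinvar_total} variants")
print(f"unattributed remainder: {unaccounted}")
print(f"pathogenic + likely pathogenic (floor): {p_lp}")
```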
Top pathogenic / likely-pathogenic (8)
| Variant ID | HGVS | Classification |
|---|---|---|
| 1074846 | NM_013291.3(CPSF1):c.2524dup (p.Glu842fs) | Pathogenic |
| 829510 | NM_013291.3(CPSF1):c.4146-2A>G | Pathogenic |
| 829512 | NM_013291.3(CPSF1):c.3823G>T (p.Asp1275Tyr) | Pathogenic |
| 829514 | NM_013291.3(CPSF1):c.1858C>T (p.Gln620Ter) | Pathogenic |
| 837526 | NM_013291.3(CPSF1):c.1603_1604del (p.Met535fs) | Pathogenic |
| 3068158 | NM_013291.3(CPSF1):c.539+1G>A | Likely pathogenic |
| 3235178 | NM_013291.3(CPSF1):c.1258_1259del (p.Lys420fs) | Likely pathogenic |
| 4293872 | NM_013291.3(CPSF1):c.2156del (p.Asp719fs) | Likely pathogenic |
SpliceAI
5358 predictions. Top by Δscore:
| Variant | Effect | Δscore |
|---|---|---|
| 8:144393450:A:AC | donor_gain | 1.0000 |
| 8:144393451:C:CC | donor_gain | 1.0000 |
| 8:144393451:CT:C | donor_gain | 1.0000 |
| 8:144393451:CTA:C | donor_gain | 1.0000 |
| 8:144393451:CTAT:C | donor_gain | 1.0000 |
| 8:144393451:CTATG:C | donor_gain | 1.0000 |
| 8:144393587:CATC:C | acceptor_gain | 1.0000 |
| 8:144393589:TC:T | acceptor_gain | 1.0000 |
| 8:144393590:CC:C | acceptor_gain | 1.0000 |
| 8:144393590:CCTGG:C | acceptor_loss | 1.0000 |
| 8:144393591:C:CC | acceptor_gain | 1.0000 |
| 8:144393591:CT:C | acceptor_loss | 1.0000 |
| 8:144393592:T:A | acceptor_loss | 1.0000 |
| 8:144393661:GCTC:G | donor_loss | 1.0000 |
| 8:144393662:CTCA:C | donor_loss | 1.0000 |
| 8:144393663:TCA:T | donor_loss | 1.0000 |
| 8:144393664:CACCG:C | donor_loss | 1.0000 |
| 8:144393666:C:CT | donor_loss | 1.0000 |
| 8:144393792:GGTGG:G | acceptor_gain | 1.0000 |
| 8:144393793:GTGG:G | acceptor_gain | 1.0000 |
| 8:144393794:TGG:T | acceptor_gain | 1.0000 |
| 8:144393795:GG:G | acceptor_gain | 1.0000 |
| 8:144393796:GC:G | acceptor_loss | 1.0000 |
| 8:144393797:C:CA | acceptor_loss | 1.0000 |
| 8:144393797:C:CC | acceptor_gain | 1.0000 |
| 8:144393801:C:CT | acceptor_gain | 1.0000 |
| 8:144393802:A:T | acceptor_gain | 1.0000 |
| 8:144393806:G:C | acceptor_gain | 1.0000 |
| 8:144393806:G:GC | acceptor_gain | 1.0000 |
| 8:144394039:C:CC | acceptor_gain | 1.0000 |
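The variant identifiers in this table follow a chrom:pos:ref:alt pattern. A minimal parsing sketch is shown below; the `Variant` class, the `kind` labels and `parse_variant` are illustrative names, not part of SpliceAI or of this page's tooling.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def kind(self) -> str:
        """Coarse variant class inferred from allele lengths only."""
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        if len(self.ref) < len(self.alt):
            return "insertion"
        if len(self.ref) > len(self.alt):
            return "deletion"
        return "MNV"  # equal-length, multi-base substitution


def parse_variant(text: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' string into typed fields."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


# A few identifiers copied from the table above.
for raw in ("8:144393450:A:AC", "8:144393451:CTA:C", "8:144393592:T:A"):
    v = parse_variant(raw)
    print(raw, "->", v.kind, "at position", v.pos)
```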
AlphaMissense
9421 scored. Top likely-pathogenic:
| Variant | Protein change | am_pathogenicity |
|---|---|---|
| 8:144393468:C:T | G1423D | 1.000 |
| 8:144393480:G:T | A1419D | 1.000 |
| 8:144393715:A:G | L1366P | 1.000 |
| 8:144393727:A:G | L1362P | 1.000 |
| 8:144393736:A:G | L1359P | 1.000 |
| 8:144394011:A:G | L1296P | 1.000 |
| 8:144394028:A:C | S1290R | 1.000 |
| 8:144394028:A:T | S1290R | 1.000 |
| 8:144394030:T:G | S1290R | 1.000 |
| 8:144394422:A:G | L1234P | 1.000 |
| 8:144394446:T:A | D1226V | 1.000 |
| 8:144394503:T:A | D1207V | 1.000 |
| 8:144394503:T:C | D1207G | 1.000 |
| 8:144394504:C:G | D1207H | 1.000 |
| 8:144394508:G:C | F1205L | 1.000 |
| 8:144394508:G:T | F1205L | 1.000 |
| 8:144394509:A:C | F1205C | 1.000 |
| 8:144394509:A:G | F1205S | 1.000 |
| 8:144394510:A:G | F1205L | 1.000 |
| 8:144394510:A:T | F1205I | 1.000 |
| 8:144394651:C:T | G1187D | 1.000 |
| 8:144394886:C:T | G1137E | 1.000 |
| 8:144394887:C:G | G1137R | 1.000 |
| 8:144394887:C:T | G1137R | 1.000 |
| 8:144394926:C:A | G1124W | 1.000 |
| 8:144395120:A:G | W1084R | 1.000 |
| 8:144395120:A:T | W1084R | 1.000 |
| 8:144395534:A:C | S999R | 1.000 |
| 8:144395534:A:T | S999R | 1.000 |
| 8:144395536:T:G | S999R | 1.000 |
dbSNP variants (sampled 300 via entrez): RS1000004491 (8:144410450 T>A), RS1000079849 (8:144400198 C>G,T), RS1000327769 (8:144402263 G>A), RS1001277393 (8:144405090 T>G), RS1001364441 (8:144393113 T>C), RS1001460151 (8:144400042 G>A), RS1002563980 (8:144408891 C>A), RS1002674087 (8:144406210 G>T), RS1002854810 (8:144400890 C>G,T), RS1003237699 (8:144411153 G>T), RS1003544905 (8:144401132 T>C), RS1003673112 (8:144404831 C>A,T), RS1003835135 (8:144409872 A>G), RS1004352058 (8:144407424 G>A), RS1004581928 (8:144401726 T>C)
Disease associations
OMIM: gene MIM:606027 | disease phenotypes: MIM:618827
GenCC curated gene-disease
| Disease | Classification | Inheritance |
|---|---|---|
| myopia 27 | Strong | Autosomal dominant |
Mondo (1): myopia 27 (MONDO:0032941)
Orphanet (0):
HPO phenotypes
3 total (3 of 3 shown, HPO-id order):
| HPO | Term |
|---|---|
| HP:0000006 | Autosomal dominant inheritance |
| HP:0007800 | Increased axial length of the globe |
| HP:0011003 | High myopia |
GWAS associations
1 association (top):
| Study | Trait | p-value |
|---|---|---|
| GCST002598_30 | Educational attainment | 9e-06 |
EFO canonical traits (1, from GWAS)
| EFO ID | Trait name |
|---|---|
| EFO:0004784 | self reported educational attainment |
Drugs & pharmacology
Drug and pharmacology data
Is drug target: yes
ChEMBL targets (1): CHEMBL5724781 (SINGLE PROTEIN)
Molecules with ChEMBL bioactivity
1 molecule (phase ≥1), by development phase (incl. off-target/promiscuous compounds). Patent mentions across the top 20 by phase: 1,538 (via chembl_molecule»patent_compound; counts attach to the compound, not the gene–compound relationship, so off-target/promiscuous molecules can dominate).
| Molecule | Name | Phase | Patents |
|---|---|---|---|
| CHEMBL1232461 | MOLIBRESIB | 2 | 1,538 |
PharmGKB: 1 entry (VIP=true, CPIC=false)
ChEMBL bioactivities
All 3 recorded bioactivities are potent (pChembl ≥ 5); listed by pChembl (potency scale: 10 = 0.1 nM, 6 = 1 µM).
| pChembl | Type | Value | Unit | Molecule |
|---|---|---|---|---|
| 8.01 | Kd | 9.747 | nM | CHEMBL3752910 |
| 7.93 | ED50 | 11.83 | nM | CHEMBL3752910 |
| 5.81 | IC50 | 1540 | nM | MOLIBRESIB |
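The pChembl column is the negative base-10 logarithm of the measured value in molar units, which is why 0.1 nM maps to 10 and 1 µM maps to 6. The sketch below reproduces the three table values from their nM measurements; the function name is illustrative.

```python
import math


def pchembl_from_nm(value_nm: float) -> float:
    """-log10 of a concentration in molar units, given the value in nM."""
    # 1 nM = 1e-9 M, so -log10(value_nm * 1e-9) = 9 - log10(value_nm).
    return 9.0 - math.log10(value_nm)


# (type, value in nM) pairs copied from the table above.
measurements = [("Kd", 9.747), ("ED50", 11.83), ("IC50", 1540.0)]
for kind, value_nm in measurements:
    print(f"{kind}: {value_nm} nM -> pChembl {pchembl_from_nm(value_nm):.2f}")
```

The same conversion recovers the scale anchors quoted above: 0.1 nM gives 10 and 1000 nM (1 µM) gives 6.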
PubChem BioAssay actives
2 with measured affinity, of 8 total; 2 most potent distinct compounds. Largely complementary to BindingDB; screening values are coarse (µM, 4 dp), so sub-nM hits tie at the floor.
| Compound | Assay | Type | Value | Unit |
|---|---|---|---|---|
| 4-methyl-3-[(1-methyl-6-pyridin-3-ylpyrazolo[3,4-d]pyrimidin-4-yl)amino]-N-[3-(trifluoromethyl)phenyl]benzamide | 2149844: Binding affinity to human CPSF1 incubated for 45 mins by Kinobead based pull down assay | Kd | 0.0097 | µM |
| 2-[(4S)-6-(4-chlorophenyl)-8-methoxy-1-methyl-4H-[1,2,4]triazolo[4,3-a][1,4]benzodiazepin-4-yl]-N-ethylacetamide | 2178616: Inhibition of CPSF1 (unknown origin) incubated for 1 hr by colloidal coomassie staining based LC-MS/MS analysis | IC50 | 1.5400 | µM |
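The note about coarse screening values can be made concrete: storing a value in µM at four decimal places gives a resolution of 0.1 nM, so sufficiently small values collapse to the same number. The sketch below assumes simple rounding, which may differ from how the source actually truncates; the two sub-nM inputs are hypothetical.

```python
def to_screening_um(value_nm: float, decimals: int = 4) -> float:
    """Convert nM to µM and round to the stated number of decimal places."""
    return round(value_nm / 1000.0, decimals)


# 9.747 nM and 1540 nM are the ChEMBL values listed earlier on this page;
# 0.04 nM and 0.02 nM are hypothetical sub-nM hits used only for illustration.
for value_nm in (9.747, 1540.0, 0.04, 0.02):
    print(f"{value_nm} nM -> {to_screening_um(value_nm):.4f} µM")
```

Under this assumption the 9.747 nM Kd becomes the 0.0097 µM shown in the table, while the two hypothetical sub-nM values both round to 0.0000 µM and can no longer be ranked against each other.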
CTD chemical–gene interactions
31 total (human), top 30 by PubMed support.
| Chemical | Actions (top 5) | PubMed papers |
|---|---|---|
| sodium arsenite | decreases expression, affects cotreatment, increases abundance, increases expression | 4 |
| FR900359 | affects phosphorylation | 1 |
| bisphenol A | decreases expression | 1 |
| potassium perchlorate | decreases expression | 1 |
| arsenite | affects binding, decreases reaction | 1 |
| cobaltous chloride | increases expression | 1 |
| pyrrolidine dithiocarbamic acid | affects cotreatment, increases expression | 1 |
| bathocuproine sulfonate | affects cotreatment, increases expression | 1 |
| beta-methylcholine | affects expression | 1 |
| ICG 001 | increases expression | 1 |
| MT19c compound | decreases expression | 1 |
| 4-(4-((5-(4,5-dimethyl-2-nitrophenyl)-2-furanyl)methylene)-4,5-dihydro-3-methyl-5-oxo-1H-pyrazol-1-yl)benzoic acid | increases expression | 1 |
| Arsenic | affects cotreatment, decreases expression, increases abundance, increases expression | 1 |
| Caffeine | affects phosphorylation | 1 |
| Cannabidiol | affects cotreatment, decreases expression | 1 |
| Cisplatin | affects cotreatment, increases expression | 1 |
| Cuprizone | affects cotreatment, decreases expression | 1 |
| Ivermectin | decreases expression | 1 |
| Lead | affects expression | 1 |
| Methotrexate | affects response to substance | 1 |
| Piroxicam | affects cotreatment, increases expression | 1 |
| Ribonucleotides | affects binding | 1 |
| Rotenone | increases expression | 1 |
| Selenium | increases expression | 1 |
| Smoke | decreases expression | 1 |
| Tobacco Smoke Pollution | decreases expression | 1 |
| Valproic Acid | increases methylation | 1 |
| Vitamin E | increases expression | 1 |
| beta-Naphthoflavone | decreases expression | 1 |
| Acrylamide | decreases expression | 1 |
ChEMBL screening assays
7 unique, capped per target: 7 binding
Representative assays (with source publication via chembl_document):
| Assay ID | Type | Description | Source paper |
|---|---|---|---|
| CHEMBL5652886 | Binding | Binding affinity to human CPSF1 incubated for 45 mins by Kinobead based pull down assay | NVP-BHG712: Effects of Regioisomers on the Affinity and Selectivity toward the EPHrin Family. — ChemMedChem |
Clinical trials (associated diseases)
0 trials via MONDO — disease-level, not drug-specific.