CPSF4
gene
Also known as NAR, CPSF30
Summary
CPSF4 (cleavage and polyadenylation specific factor 4, HGNC:2327) is a protein-coding gene on chromosome 7q22.1, encoding Cleavage and polyadenylation specificity factor subunit 4 (O95639). Component of the cleavage and polyadenylation specificity factor (CPSF) complex that plays a key role in pre-mRNA 3’-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. It is a common-essential gene (DepMap: required in 99.8% of cancer cell lines).
Inhibition of the nuclear export of poly(A)-containing mRNAs caused by the influenza A virus NS1 protein requires its effector domain. The NS1 effector domain functionally interacts with the cellular 30 kDa subunit of cleavage and polyadenylation specific factor 4, an essential component of the 3’ end processing machinery of cellular pre-mRNAs. In influenza virus-infected cells, the NS1 protein is physically associated with cleavage and polyadenylation specific factor 4, 30kD subunit. Binding of the NS1 protein to the 30 kDa protein in vitro prevents CPSF binding to the RNA substrate and inhibits 3’ end cleavage and polyadenylation of host pre-mRNAs. Thus the NS1 protein selectively inhibits the nuclear export of cellular, and not viral, mRNAs. Multiple alternatively spliced transcript variants that encode different isoforms have been described for this gene.
Source: NCBI Gene 10898 — RefSeq curated summary.
At a glance
- GWAS associations: 8
- Clinical variants (ClinVar): 3 total
- Druggable target: yes
- Cancer dependency (DepMap): dependent in 99.8% of screened cell lines (common-essential)
- MANE Select transcript: NM_006693
Identifiers
Gene identifiers
| Field | Value |
|---|---|
| HGNC ID | HGNC:2327 |
| Approved symbol | CPSF4 |
| Name | cleavage and polyadenylation specific factor 4 |
| Location | 7q22.1 |
| Locus type | gene with protein product |
| Status | Approved |
| Aliases | NAR, CPSF30 |
| Ensembl gene | ENSG00000160917 |
| Ensembl biotype | protein_coding |
| OMIM | 603052 |
| Entrez | 10898 |
Gene structure
Transcript identifiers
Ensembl transcripts: 26 — 20 protein_coding, 4 retained_intron, 1 nonsense_mediated_decay, 1 protein_coding_CDS_not_defined
ENST00000292476, ENST00000412686, ENST00000430038, ENST00000436336, ENST00000440514, ENST00000441580, ENST00000451876, ENST00000452047, ENST00000465132, ENST00000469897, ENST00000471455, ENST00000482251, ENST00000484112, ENST00000887788, ENST00000887790, ENST00000887791, ENST00000887794, ENST00000887795, ENST00000926770, ENST00000926771, ENST00000967753, ENST00000967754, ENST00000967755, ENST00000967756, ENST00000967757, ENST00000967758
RefSeq mRNA: 5 — MANE Select: NM_006693
NM_001081559, NM_001318160, NM_001318161, NM_001318162, NM_006693
CCDS: CCDS47652, CCDS5664, CCDS83205
Canonical transcript exons
ENST00000292476 — 8 exons
| Exon | Start | End |
|---|---|---|
| ENSE00001055437 | 99453966 | 99454136 |
| ENSE00001842942 | 99438943 | 99439185 |
| ENSE00003487558 | 99452368 | 99452440 |
| ENSE00003512504 | 99450276 | 99450371 |
| ENSE00003549494 | 99448121 | 99448273 |
| ENSE00003596374 | 99444789 | 99444839 |
| ENSE00003784899 | 99450702 | 99450795 |
| ENSE00003848989 | 99456432 | 99457373 |
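The exon table above is listed in exon-ID order rather than genomic order. As a minimal sketch, the rows can be sorted by start coordinate and given lengths. The sketch assumes the coordinates are 1-based and inclusive, which the table does not state, and the helper names are illustrative only.

```python
# Exon coordinates copied from the table above (ENST00000292476).
# Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
EXONS = [
    ("ENSE00001055437", 99453966, 99454136),
    ("ENSE00001842942", 99438943, 99439185),
    ("ENSE00003487558", 99452368, 99452440),
    ("ENSE00003512504", 99450276, 99450371),
    ("ENSE00003549494", 99448121, 99448273),
    ("ENSE00003596374", 99444789, 99444839),
    ("ENSE00003784899", 99450702, 99450795),
    ("ENSE00003848989", 99456432, 99457373),
]


def exons_by_position(exons):
    """Return the exons sorted by genomic start coordinate."""
    return sorted(exons, key=lambda exon: exon[1])


def exon_length(start, end):
    """Length in bp of an inclusive interval (hypothetical helper)."""
    return end - start + 1


ordered = exons_by_position(EXONS)
lengths = {exon_id: exon_length(start, end) for exon_id, start, end in ordered}
total_exonic_bp = sum(lengths.values())

for exon_id, start, end in ordered:
    print(f"{exon_id}\t{start}\t{end}\t{lengths[exon_id]} bp")
print("total exonic length:", total_exonic_bp, "bp")
```

Sorting by position makes it easier to compare exon boundaries with the SpliceAI and AlphaMissense coordinates reported further down the page.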
Expression profiles
Bgee: expression breadth ubiquitous, 280 present calls, max score 98.81.
FANTOM5 (CAGE): breadth ubiquitous, TPM avg 9.8118 / max 132.3603, expressed in 1774 samples.
FANTOM5 promoters (2 alternative TSS)
| Promoter ID | TPM avg | Samples expressed |
|---|---|---|
| 79874 | 6.8315 | 1712 |
| 79875 | 2.9803 | 1054 |
Top tissues by expression
293 total, by Bgee expression score (0-100, higher = more expressed):
| Tissue | Anatomy ID | Expression score | Quality |
|---|---|---|---|
| secondary oocyte | CL:0000655 | 98.81 | gold quality |
| oocyte | CL:0000023 | 96.79 | gold quality |
| right hemisphere of cerebellum | UBERON:0014890 | 96.50 | gold quality |
| mucosa of stomach | UBERON:0001199 | 96.39 | gold quality |
| cerebellar hemisphere | UBERON:0002245 | 96.11 | gold quality |
| cerebellar cortex | UBERON:0002129 | 96.01 | gold quality |
| cerebellum | UBERON:0002037 | 94.52 | gold quality |
| right frontal lobe | UBERON:0002810 | 94.21 | gold quality |
| left testis | UBERON:0004533 | 93.80 | gold quality |
| right testis | UBERON:0004534 | 93.70 | gold quality |
| nerve | UBERON:0001021 | 93.65 | gold quality |
| tibial nerve | UBERON:0001323 | 93.65 | gold quality |
| endothelial cell | CL:0000115 | 93.23 | gold quality |
| body of pancreas | UBERON:0001150 | 93.11 | gold quality |
| ectocervix | UBERON:0012249 | 92.93 | gold quality |
| granulocyte | CL:0000094 | 92.76 | gold quality |
| type B pancreatic cell | CL:0000169 | 92.62 | gold quality |
| left uterine tube | UBERON:0001303 | 92.62 | gold quality |
| skin of leg | UBERON:0001511 | 92.53 | gold quality |
| body of uterus | UBERON:0009853 | 92.48 | gold quality |
| skin of abdomen | UBERON:0001416 | 92.46 | gold quality |
| testis | UBERON:0000473 | 92.29 | gold quality |
| lower esophagus mucosa | UBERON:0035834 | 92.27 | gold quality |
| gastrocnemius | UBERON:0001388 | 92.23 | gold quality |
| adenohypophysis | UBERON:0002196 | 92.21 | gold quality |
| Brodmann (1909) area 9 | UBERON:0013540 | 92.20 | gold quality |
| right ovary | UBERON:0002118 | 91.86 | gold quality |
| cingulate cortex | UBERON:0003027 | 91.84 | gold quality |
| amygdala | UBERON:0001876 | 91.80 | gold quality |
| subcutaneous adipose tissue | UBERON:0002190 | 91.80 | gold quality |
Single-cell (SCXA)
Detected in 1 experiment(s), a significant marker in 1.
| Experiment | Marker? | Max mean expression |
|---|---|---|
| E-ANND-3 | yes | 3.68 |
Regulation
Is transcription factor: no
miRNA regulators (miRDB)
57 targeting CPSF4, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):
| miRNA | Max score | Avg score | miRNA target_count |
|---|---|---|---|
| HSA-MIR-4500 | 99.99 | 72.72 | 2367 |
| HSA-LET-7A-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7B-5P | 99.98 | 72.31 | 1790 |
| HSA-LET-7C-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7E-5P | 99.98 | 72.29 | 1790 |
| HSA-LET-7F-5P | 99.98 | 72.56 | 1784 |
| HSA-LET-7G-5P | 99.98 | 72.37 | 1784 |
| HSA-LET-7I-5P | 99.98 | 72.37 | 1788 |
| HSA-MIR-98-5P | 99.98 | 72.33 | 1787 |
| HSA-MIR-607 | 99.97 | 73.62 | 5593 |
| HSA-LET-7D-5P | 99.96 | 71.76 | 1632 |
| HSA-MIR-4458 | 99.96 | 71.64 | 1650 |
| HSA-MIR-1468-3P | 99.96 | 72.74 | 3797 |
| HSA-MIR-23A-3P | 99.95 | 74.24 | 3163 |
| HSA-MIR-23B-3P | 99.95 | 74.24 | 3163 |
| HSA-MIR-23C | 99.95 | 73.92 | 3192 |
| HSA-MIR-6721-5P | 99.93 | 68.92 | 2981 |
| HSA-MIR-3671 | 99.90 | 73.04 | 3897 |
| HSA-MIR-4496 | 99.88 | 68.89 | 2236 |
| HSA-MIR-7978 | 99.86 | 66.90 | 856 |
| HSA-MIR-4698 | 99.84 | 71.41 | 4303 |
| HSA-MIR-3934-3P | 99.76 | 65.51 | 1351 |
| HSA-MIR-8084 | 99.73 | 69.57 | 1760 |
| HSA-MIR-4677-5P | 99.70 | 70.09 | 1940 |
| HSA-MIR-4530 | 99.69 | 66.47 | 1509 |
| HSA-MIR-3934-5P | 99.67 | 64.04 | 846 |
| HSA-MIR-6132 | 99.60 | 65.83 | 1554 |
| HSA-MIR-6836-5P | 99.60 | 65.62 | 1538 |
| HSA-MIR-6832-3P | 99.52 | 70.44 | 1726 |
| HSA-MIR-5582-5P | 99.27 | 71.42 | 1879 |
Functional genomics
DepMap (CRISPR cell-line fitness): dependent in 99.8% of screened cell lines, common-essential.
Literature-anchored findings (GeneRIF, showing 15)
- Data show that the NS1A protein of the pathogenic H5N1 influenza A/Hong Kong/483/97 (HK97) virus isolated from humans has an intrinsic defect in CPSF30 binding. (PMID:17522219)
- CPSF4 plays a critical role in regulating lung cancer cell proliferation and survival and may be a potential prognostic biomarker and therapeutic target for lung adenocarcinoma. (PMID:24358221)
- Data show that 4 drug-like compounds from traditional Chinese Medicine (TCM) database were selected as potential inhibitors for the cleavage and polyadenylation specific factor CPSF30-binding site of influenza A virus of non-structural protein 1 (NS1A). (PMID:24562912)
- CPSF4 upregulates human TERT promoter activity, human TERT expression and telomerase activity. (PMID:24618080)
- The authors unexpectedly found that CPSF subunits CPSF30 and Wdr33 directly contact AAUAAA. (PMID:25301780)
- These findings support a role for iron in some zinc-finger proteins. Using electrophoretic mobility shift assays and fluorescence anisotropy, we report that CPSF30 selectively recognizes the AU-rich hexamer (AAUAAA) sequence present in pre-mRNA, providing the first molecular-based evidence to our knowledge for CPSF30/RNA binding. (PMID:27071088)
- Cleavage and polyadenylation specific factor 4 targets NF-kappaB/cyclooxygenase-2 signaling to promote the growth and survival of non-small cell lung carcinoma cells. (PMID:27450326)
- Residues F103 and M106 within the NS1-CPSF4 binding region are important for viral replication. (PMID:28554059)
- The authors define the molecular architecture of the core human CPSF complex comprising CPSF160, WDR33, CPSF30 and Fip1 and identify specific domains involved in inter-subunit interactions. Together, these results shed light on the function of CPSF in mediating polyA signal-dependent RNA cleavage and polyadenylation. (PMID:29274231)
- MiR-4458 inhibits breast cancer cell growth, migration, and invasiveness by targeting CPSF4; results suggest that the miR-4458-CPSF4-COX-2-hTERT axis might serve as a potential target for the treatment of breast cancer patients (PMID:30970220)
- Overproduced CPSF4 Promotes Cell Proliferation and Invasion via PI3K-AKT Signaling Pathway in Oral Squamous Cell Carcinoma. (PMID:33535057)
- CPSF4 promotes triple negative breast cancer metastasis by upregulating MDM4. (PMID:34006850)
- CPSF4 regulates circRNA formation and microRNA mediated gene silencing in hepatocellular carcinoma. (PMID:34103682)
- CPSF4 promotes tumor-initiating phenotype by enhancing VEGF/NRP2/TAZ signaling in lung cancer. (PMID:36567417)
- Cleavage and Polyadenylation-Specific Factor 4 (CPSF4) Expression Is Associated with Enhanced Prostate Cancer Cell Migration and Cell Cycle Dysregulation, In Vitro. (PMID:37629142)
Cross-species orthologs
6 orthologs
| Organism | Symbol | Gene ID |
|---|---|---|
| danio_rerio | cpsf4 | ENSDARG00000020217 |
| mus_musculus | Cpsf4 | ENSMUSG00000029625 |
| rattus_norvegicus | Cpsf4 | ENSRNOG00000000985 |
| rattus_norvegicus | Cpsf4l2 | ENSRNOG00000025217 |
| rattus_norvegicus | Cpsf4-ps1 | ENSRNOG00000062241 |
| drosophila_melanogaster | Clp | FBGN0015621 |
Paralogs (2): ZC3H3 (ENSG00000014164), CPSF4L (ENSG00000187959)
Protein
Protein identifiers
Cleavage and polyadenylation specificity factor subunit 4 — O95639 (reviewed: O95639)
Alternative names: Cleavage and polyadenylation specificity factor 30 kDa subunit, NS1 effector domain-binding protein 1, No arches homolog
All UniProt accessions (7): O95639, B7Z7B0, C9JEV9, C9K0K2, F8WEL7, H7C016, H7C419
UniProt curated annotations
Function. Component of the cleavage and polyadenylation specificity factor (CPSF) complex that plays a key role in pre-mRNA 3’-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. CPSF4 binds RNA polymers with a preference for poly(U).
Subunit / interactions. Component of the cleavage and polyadenylation specificity factor (CPSF) complex, composed of CPSF1, CPSF2, CPSF3, CPSF4 and FIP1L1. Interacts with FIP1L1. (Microbial infection) Interacts with influenza A virus NS1; this interaction blocks processing of pre-mRNAs, thereby preventing nuclear export of host cell mRNAs.
Subcellular location. Nucleus.
Miscellaneous. May be due to a competing acceptor splice site.
Similarity. Belongs to the CPSF4/YTH1 family.
Isoforms (3)
| UniProt ID | Names | Canonical? |
|---|---|---|
| O95639-1 | 1 | yes |
| O95639-2 | 2 | |
| O95639-3 | 3 | |
RefSeq proteins (5): NP_001075028, NP_001305089, NP_001305090, NP_001305091, NP_006684* (*=MANE)
Domains & families (InterPro)
| ID | Name | Type |
|---|---|---|
| IPR000571 | Znf_CCCH | Domain |
| IPR001878 | Znf_CCHC | Domain |
| IPR036855 | Znf_CCCH_sf | Homologous_superfamily |
| IPR036875 | Znf_CCHC_sf | Homologous_superfamily |
| IPR041686 | Znf-CCCH_3 | Domain |
| IPR045348 | CPSF4/Yth1 | Family |
Pfam: PF00098, PF00642, PF15663
UniProt features (32 total): helix 11, zinc finger region 6, strand 5, modified residue 4, splice variant 2, turn 2, chain 1, region of interest 1
Structure
Experimental structures (PDB)
13 structures.
| PDB | Method | Resolution (Å) |
|---|---|---|
| 7K95 | X-RAY DIFFRACTION | 1.9 |
| 2RHK | X-RAY DIFFRACTION | 1.95 |
| 7ZYH | X-RAY DIFFRACTION | 2.2 |
| 8E3I | ELECTRON MICROSCOPY | 2.53 |
| 9OXE | ELECTRON MICROSCOPY | 2.53 |
| 8E3Q | ELECTRON MICROSCOPY | 2.68 |
| 8R8R | ELECTRON MICROSCOPY | 2.79 |
| 6URG | ELECTRON MICROSCOPY | 3 |
| 6FUW | ELECTRON MICROSCOPY | 3.07 |
| 9OXS | ELECTRON MICROSCOPY | 3.07 |
| 6DNH | ELECTRON MICROSCOPY | 3.4 |
| 6URO | ELECTRON MICROSCOPY | 3.6 |
| 2D9N | SOLUTION NMR | |
Predicted structure (AlphaFold)
| Model | pLDDT | Fraction very-high |
|---|---|---|
| AF-O95639-F1 | 76.60 | 0.38 |
Functional residue map
Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.
Post-translational modifications (4): 212, 267, 200, 202
Function
Pathways and Gene Ontology
Reactome pathways
9 pathways
| ID | Pathway |
|---|---|
| R-HSA-159231 | Transport of Mature mRNA Derived from an Intronless Transcript |
| R-HSA-168315 | Inhibition of Host mRNA Processing and RNA Silencing |
| R-HSA-6784531 | tRNA processing in the nucleus |
| R-HSA-72187 | mRNA 3’-end processing |
| R-HSA-73856 | RNA Polymerase II Transcription Termination |
| R-HSA-77595 | Processing of Intronless Pre-mRNAs |
| R-HSA-9770562 | mRNA Polyadenylation |
| R-HSA-9918481 | Dengue Virus-Host Interactions |
| R-HSA-72203 | Processing of Capped Intron-Containing Pre-mRNA |
MSigDB gene sets: 191 (showing top):
MORF_MSH3, MORF_BRCA1, MORF_ATRX, AP2_Q3, PUJANA_CHEK2_PCC_NETWORK, MARTINEZ_RB1_TARGETS_DN, MORF_PPP5C, REACTOME_MRNA_3_END_PROCESSING, MORF_FANCG, REACTOME_PROCESSING_OF_CAPPED_INTRON_CONTAINING_PRE_MRNA, GOBP_MRNA_3_END_PROCESSING, MORF_RAP1A, DODD_NASOPHARYNGEAL_CARCINOMA_UP, PUJANA_BRCA_CENTERED_NETWORK, MORF_MT4
GO Biological Process (2): mRNA processing (GO:0006397), mRNA 3’-end processing (GO:0031124)
GO Molecular Function (6): RNA binding (GO:0003723), zinc ion binding (GO:0008270), sequence-specific double-stranded DNA binding (GO:1990837), nucleic acid binding (GO:0003676), protein binding (GO:0005515), metal ion binding (GO:0046872)
GO Cellular Component (3): nucleoplasm (GO:0005654), mRNA cleavage and polyadenylation specificity factor complex (GO:0005847), nucleus (GO:0005634)
Reactome top-level categories
Rollup of top-9 pathways:
| Category | Pathways |
|---|---|
| Transport of Mature mRNAs Derived from Intronless Transcripts | 1 |
| NS1 Mediated Effects on Host Pathways | 1 |
| tRNA processing | 1 |
| Processing of Capped Intron-Containing Pre-mRNA | 1 |
| RNA Polymerase II Transcription | 1 |
| Processing of Capped Intronless Pre-mRNA | 1 |
| mRNA 3’-end processing | 1 |
| Dengue Virus Infection | 1 |
| Metabolism of RNA | 1 |
GO top-level categories
Rollup of top GO terms by namespace:
| Category | Terms |
|---|---|
| binding | 2 |
| RNA processing | 1 |
| mRNA metabolic process | 1 |
| mRNA processing | 1 |
| RNA 3’-end processing | 1 |
| nucleic acid binding | 1 |
| transition metal ion binding | 1 |
| double-stranded DNA binding | 1 |
| sequence-specific DNA binding | 1 |
| cation binding | 1 |
| nuclear lumen | 1 |
| cellular anatomical structure | 1 |
| mRNA cleavage factor complex | 1 |
| intracellular membrane-bounded organelle | 1 |
Protein interactions and networks
STRING
1512 interactions, top by confidence (×1000):
| Protein A | Protein B | Partner UniProt | Score |
|---|---|---|---|
| CPSF4 | CPSF1 | Q10570 | 999 |
| CPSF4 | CPSF2 | Q9P2I0 | 999 |
| CPSF4 | CPSF3 | Q9UKF6 | 997 |
| CPSF4 | WDR33 | Q9C0J8 | 997 |
| CPSF4 | FIP1L1 | Q6UN15 | 971 |
| CPSF4 | CSTF3 | Q12996 | 903 |
| CPSF4 | NOC2L | Q9Y3T9 | 882 |
| CPSF4 | CSTF2 | P33240 | 878 |
| CPSF4 | CSTF1 | Q05048 | 874 |
| CPSF4 | SYMPK | Q92797 | 856 |
| CPSF4 | PCF11 | O94913 | 834 |
| CPSF4 | PABPN1 | Q86U42 | 799 |
| CPSF4 | CPSF7 | Q8N684 | 796 |
| CPSF4 | NUDT21 | O43809 | 787 |
| CPSF4 | PAPOLA | P51003 | 777 |
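The STRING scores above are reported ×1000. A minimal sketch of rescaling them to the 0 to 1 range and filtering by a confidence cutoff follows. The 0.7 ("high") and 0.9 ("highest") cutoffs are the conventional STRING confidence bands and are an assumption here, not something this page states. The function names are illustrative.

```python
# Partner -> STRING combined score (x1000), copied from the table above.
STRING_SCORES = {
    "CPSF1": 999, "CPSF2": 999, "CPSF3": 997, "WDR33": 997, "FIP1L1": 971,
    "CSTF3": 903, "NOC2L": 882, "CSTF2": 878, "CSTF1": 874, "SYMPK": 856,
    "PCF11": 834, "PABPN1": 799, "CPSF7": 796, "NUDT21": 787, "PAPOLA": 777,
}

# Assumed cutoffs (conventional STRING bands, not taken from this page).
HIGH_CONFIDENCE = 0.7
HIGHEST_CONFIDENCE = 0.9


def to_confidence(score_x1000):
    """Rescale a x1000 STRING score to the 0..1 range."""
    return score_x1000 / 1000


def partners_at_or_above(scores, cutoff):
    """Return partners whose rescaled score meets the cutoff, best first."""
    kept = [(name, to_confidence(s)) for name, s in scores.items()
            if to_confidence(s) >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)


highest = partners_at_or_above(STRING_SCORES, HIGHEST_CONFIDENCE)
high = partners_at_or_above(STRING_SCORES, HIGH_CONFIDENCE)
print("highest-confidence partners:", [name for name, _ in highest])
print("high-confidence partners:", len(high))
```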
IntAct
100 interactions, top by confidence:
| A | B | Type | Score |
|---|---|---|---|
| EAF1 | ELL2 | psi-mi:“MI:0914”(association) | 0.840 |
| CPSF4 | FIP1L1 | psi-mi:“MI:0914”(association) | 0.660 |
| CPSF4 | FIP1L1 | psi-mi:“MI:0915”(physical association) | 0.660 |
| CPSF3 | CPSF4 | psi-mi:“MI:0914”(association) | 0.640 |
| MEOX2 | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.560 |
| CPSF4 | | psi-mi:“MI:0915”(physical association) | 0.560 |
| CPSF4 | MEOX2 | psi-mi:“MI:0915”(physical association) | 0.560 |
| CPSF4 | | psi-mi:“MI:0915”(physical association) | 0.560 |
| CPSF1 | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.560 |
| SYMPK | CPSF4 | psi-mi:“MI:0914”(association) | 0.530 |
| WDR33 | CPSF4 | psi-mi:“MI:0914”(association) | 0.530 |
| ZC3H18 | AQR | psi-mi:“MI:0914”(association) | 0.530 |
| NS1 | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.500 |
| CDC73 | CPSF4 | psi-mi:“MI:0914”(association) | 0.500 |
| CPSF4 | CDC73 | psi-mi:“MI:0915”(physical association) | 0.500 |
| CPSF6 | DDX39A | psi-mi:“MI:0914”(association) | 0.480 |
| DDX21 | MED19 | psi-mi:“MI:2364”(proximity) | 0.480 |
| NPM1 | CPSF4 | psi-mi:“MI:0914”(association) | 0.480 |
| NS1 | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.400 |
| CPSF4 | NS | psi-mi:“MI:0915”(physical association) | 0.370 |
| NS1 | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.370 |
| CPSF4 | NS1 | psi-mi:“MI:0915”(physical association) | 0.370 |
| NS | CPSF4 | psi-mi:“MI:0915”(physical association) | 0.370 |
BioGRID (169): CPSF4 (Affinity Capture-Western), CPSF4 (Two-hybrid), FIP1L1 (Two-hybrid), CPSF4 (Affinity Capture-MS), CPSF4 (Affinity Capture-MS), FIP1L1 (Two-hybrid), CPSF2 (Co-fractionation), CPSF4 (Co-fractionation), CPSF4 (Co-fractionation), CPSF4 (Co-fractionation), WDR33 (Co-fractionation), FIP1L1 (Affinity Capture-Western), CPSF4 (Reconstituted Complex), CPSF4 (Affinity Capture-MS), CPSF4 (Affinity Capture-MS)
ESM2 similar proteins: A0A1L1SUL6, A4QJJ7, A4QK16, A6H5F6, A6MM34, A6NH52, A6NMK7, A7M939, A8SEA1, A8W3M5, B0Z4S1, B0Z505, B0Z589, B0Z5H3, B1AR13, B1NWE8, B2LMJ1, O19048, O19137, O95639, P0C7P0, P0DI19, P46292, P60335, P83870, P83871, P98138, Q09G48, Q0G9M1, Q0WMV8, Q15365, Q1KXW1, Q332Y0, Q5E9A3, Q5FVR7, Q5R654, Q63789, Q66KE3, Q68S08, Q6DJP7
Diamond homologs: A6NMK7, A9LNK9, O19137, O95639, Q4P384, Q4WKD9, Q59T36, Q5BGN2, Q5FVR7, Q66KE3, Q6DJP7, Q8BQZ5, Q9VPT8, Q06102, Q2URI6, Q4IPA4, Q6BTT1, Q6C922, Q6CKU1, Q6FTL0, Q758T3, Q7SGR2, Q9UTD1, P0CS64, P0CS65, Q0DA50, Q13064, Q9FE91
SIGNOR signaling
1 interaction.
| A | Effect | B | Mechanism |
|---|---|---|---|
| CPSF4 | “form complex” | “CPSF complex” | binding |
Enriched among interaction partners
Reactome pathways and GO biological processes over-represented among this gene’s 97 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.
Reactome pathways:
| Pathway | Partners | Fold | FDR |
|---|---|---|---|
| Processing of Intronless Pre-mRNAs | 9 | 71.4× | 1e-13 |
| mRNA 3’-end processing | 16 | 43.8× | 1e-20 |
| RNA Polymerase II Transcription Termination | 14 | 42.7× | 7e-18 |
| Transport of Mature Transcript to Cytoplasm | 6 | 31.7× | 1e-06 |
| mRNA Polyadenylation | 20 | 24.4× | 1e-20 |
| Transport of Mature mRNA Derived from an Intronless Transcript | 6 | 22.7× | 7e-06 |
| mRNA Splicing | 10 | 15.2× | 4e-08 |
| Transport of Mature mRNA derived from an Intron-Containing Transcript | 7 | 14.8× | 1e-05 |
GO biological processes:
| GO term | Partners | Fold | FDR |
|---|---|---|---|
| mRNA 3’-end processing | 7 | 45.2× | 4e-08 |
| regulation of alternative mRNA splicing, via spliceosome | 6 | 16.8× | 1e-04 |
| mRNA transport | 5 | 15.1× | 1e-03 |
| mRNA processing | 14 | 12.7× | 2e-09 |
| mRNA splicing, via spliceosome | 11 | 11.6× | 4e-07 |
| RNA splicing | 10 | 10.1× | 6e-06 |
| chromatin remodeling | 7 | 5.9× | 9e-03 |
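The enrichment tables above report a fold value and a BH-FDR from a hypergeometric test. The sketch below shows one common way to compute those three quantities with the Python standard library. It is not the pipeline that produced the tables: the background size and gene-set size in the demo are hypothetical, since the page does not give them.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k set members among n partners,
    when K of the N background genes belong to the gene set."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def fold_enrichment(k, N, K, n):
    """Observed fraction of partners in the set over the background fraction."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical inputs: 20,000 background genes, a 50-gene pathway,
# 97 interaction partners, 9 of which fall in the pathway.
N, K, n, k = 20_000, 50, 97, 9
p = hypergeom_sf(k, N, K, n)
print("fold:", round(fold_enrichment(k, N, K, n), 1), "p-value:", p)
print("BH-adjusted:", benjamini_hochberg([p, 0.02, 0.5]))
```

Exact integer binomials keep the tail probability stable for the small counts seen here; a statistics library would be a reasonable substitute for larger screens.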
Disease & clinical
Clinical variants and AI predictions
ClinVar
3 variants total. Per-class counts are floors (≥ shown; pagination cap):
| Classification | Count (floor) |
|---|---|
| Pathogenic | 0 |
| Likely pathogenic | 0 |
| Uncertain significance | 1 |
| Likely benign | 0 |
| Benign | 1 |
Top pathogenic / likely-pathogenic (0)
SpliceAI
2297 predictions. Top by Δscore:
| Variant | Effect | Δscore |
|---|---|---|
| 7:99439186:G:GG | donor_gain | 1.0000 |
| 7:99444787:A:AG | acceptor_gain | 1.0000 |
| 7:99444788:G:GG | acceptor_gain | 1.0000 |
| 7:99444788:GA:G | acceptor_gain | 1.0000 |
| 7:99444840:GT:G | donor_loss | 1.0000 |
| 7:99444841:T:G | donor_loss | 1.0000 |
| 7:99448115:C:A | acceptor_gain | 1.0000 |
| 7:99448119:AGGG:A | acceptor_gain | 1.0000 |
| 7:99448120:GGGG:G | acceptor_gain | 1.0000 |
| 7:99448269:GTTCG:G | donor_gain | 1.0000 |
| 7:99450367:GCACG:G | donor_gain | 1.0000 |
| 7:99450370:CGGT:C | donor_loss | 1.0000 |
| 7:99450373:T:G | donor_loss | 1.0000 |
| 7:99450793:GCA:G | donor_gain | 1.0000 |
| 7:99450796:G:GG | donor_gain | 1.0000 |
| 7:99452343:A:AG | acceptor_gain | 1.0000 |
| 7:99452344:T:G | acceptor_gain | 1.0000 |
| 7:99452351:T:A | acceptor_gain | 1.0000 |
| 7:99452352:G:A | acceptor_gain | 1.0000 |
| 7:99452358:T:A | acceptor_gain | 1.0000 |
| 7:99454136:GGT:G | donor_loss | 1.0000 |
| 7:99454137:GTG:G | donor_loss | 1.0000 |
| 7:99439138:TG:T | donor_gain | 0.9900 |
| 7:99439139:GG:G | donor_gain | 0.9900 |
| 7:99439181:GGACA:G | donor_gain | 0.9900 |
| 7:99439182:GACA:G | donor_gain | 0.9900 |
| 7:99439182:GACAG:G | donor_gain | 0.9900 |
| 7:99439183:A:T | donor_gain | 0.9900 |
| 7:99444784:TACAG:T | acceptor_gain | 0.9900 |
| 7:99444785:A:AG | acceptor_gain | 0.9900 |
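The variant identifiers in the table above follow a `chrom:pos:ref:alt` pattern. As a minimal sketch, they can be split into fields and labelled as SNV, insertion, or deletion by comparing allele lengths. The class and function names are illustrative, and the labelling rule is a simple length comparison rather than a full variant normaliser.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(text):
    """Split a 'chrom:pos:ref:alt' identifier into typed fields."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_class(variant):
    """Label a variant by comparing reference and alternate allele lengths."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.ref) < len(variant.alt):
        return "insertion"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    return "MNV"  # same length, more than one base


# A few identifiers copied from the table above.
examples = ["7:99439186:G:GG", "7:99444840:GT:G", "7:99444841:T:G"]
for text in examples:
    v = parse_variant(text)
    print(text, "->", variant_class(v), "at position", v.pos)
```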
AlphaMissense
1793 scored. Top likely-pathogenic:
| Variant | Protein change | am_pathogenicity |
|---|---|---|
| 7:99439134:G:C | A18P | 1.000 |
| 7:99439152:G:A | G24R | 1.000 |
| 7:99439152:G:C | G24R | 1.000 |
| 7:99439153:G:A | G24E | 1.000 |
| 7:99439170:T:C | F30L | 1.000 |
| 7:99439171:T:C | F30S | 1.000 |
| 7:99439171:T:G | F30C | 1.000 |
| 7:99439172:C:A | F30L | 1.000 |
| 7:99439172:C:G | F30L | 1.000 |
| 7:99439180:T:C | M33T | 1.000 |
| 7:99439180:T:G | M33R | 1.000 |
| 7:99439181:G:A | M33I | 1.000 |
| 7:99439181:G:C | M33I | 1.000 |
| 7:99439181:G:T | M33I | 1.000 |
| 7:99439182:G:C | D34H | 1.000 |
| 7:99439183:A:C | D34A | 1.000 |
| 7:99439183:A:T | D34V | 1.000 |
| 7:99444790:G:C | K35N | 1.000 |
| 7:99444790:G:T | K35N | 1.000 |
| 7:99444804:T:A | V40D | 1.000 |
| 7:99444806:T:A | C41S | 1.000 |
| 7:99444806:T:C | C41R | 1.000 |
| 7:99444807:G:A | C41Y | 1.000 |
| 7:99444807:G:C | C41S | 1.000 |
| 7:99444807:G:T | C41F | 1.000 |
| 7:99444808:T:G | C41W | 1.000 |
| 7:99444815:T:C | F44L | 1.000 |
| 7:99444817:T:A | F44L | 1.000 |
| 7:99444817:T:G | F44L | 1.000 |
| 7:99444830:T:A | C49S | 1.000 |
dbSNP variants (sampled 300 via entrez): RS1000236639 (7:99440373 CAG>C), RS1000357355 (7:99445716 T>C), RS1000711537 (7:99438279 C>T), RS1000858065 (7:99444334 G>A,T), RS1000889385 (7:99444538 A>G,T), RS1001152144 (7:99449166 G>A), RS1001183304 (7:99449494 G>C), RS1001187819 (7:99438586 C>A,T), RS1001271470 (7:99450644 C>T), RS1001305109 (7:99456671 A>G), RS1001792642 (7:99439965 C>T), RS1001840351 (7:99452694 C>A), RS1002117584 (7:99439598 T>A,G), RS1002180223 (7:99437931 C>T), RS1002211235 (7:99438228 C>T)
Disease associations
OMIM: gene MIM:603052 | disease phenotypes:
GenCC curated gene-disease
Mondo (0):
Orphanet (0):
HPO phenotypes
0 total (0 of 0 shown, HPO-id order):
GWAS associations
8 associations (top):
| Study | Trait | p-value |
|---|---|---|
| GCST004730_2 | Facial emotion recognition (sad faces) | 3.000000e-06 |
| GCST005144_1 | Tacrolimus trough concentration in kidney transplant patients | 2.000000e-17 |
| GCST005335_2 | Diffuse cutaneous systemic sclerosis | 2.000000e-06 |
| GCST006249_21 | Serum metabolite levels | 4.000000e-11 |
| GCST006249_45 | Serum metabolite levels | 6.000000e-32 |
| GCST006249_47 | Serum metabolite levels | 1.000000e-45 |
| GCST90002389_160 | Lymphocyte percentage of white cells | 4.000000e-16 |
| GCST90002399_184 | Neutrophil percentage of white cells | 2.000000e-15 |
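The p-values in the table above span many orders of magnitude. A short sketch of separating associations that pass the conventional genome-wide significance cutoff of 5e-8 from those that do not is shown here. The cutoff is a widely used convention and an assumption in this context, not a threshold the page itself applies.

```python
# (study, trait, p-value) rows copied from the table above.
GWAS_ROWS = [
    ("GCST004730_2", "Facial emotion recognition (sad faces)", 3e-06),
    ("GCST005144_1", "Tacrolimus trough concentration in kidney transplant patients", 2e-17),
    ("GCST005335_2", "Diffuse cutaneous systemic sclerosis", 2e-06),
    ("GCST006249_21", "Serum metabolite levels", 4e-11),
    ("GCST006249_45", "Serum metabolite levels", 6e-32),
    ("GCST006249_47", "Serum metabolite levels", 1e-45),
    ("GCST90002389_160", "Lymphocyte percentage of white cells", 4e-16),
    ("GCST90002399_184", "Neutrophil percentage of white cells", 2e-15),
]

# Assumed cutoff: the conventional genome-wide significance level.
GENOME_WIDE_P = 5e-8

significant = [row for row in GWAS_ROWS if row[2] < GENOME_WIDE_P]
suggestive = [row for row in GWAS_ROWS if row[2] >= GENOME_WIDE_P]

print("genome-wide significant:")
for study, trait, p in sorted(significant, key=lambda row: row[2]):
    print(f"  {study}\t{trait}\t{p:.0e}")
print("below the cutoff:", [study for study, _, _ in suggestive])
```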
EFO canonical traits (4, from GWAS)
| EFO ID | Trait name |
|---|---|
| EFO:0008329 | facial emotion recognition measurement |
| EFO:0008458 | tacrolimus measurement |
| EFO:0007993 | lymphocyte percentage of leukocytes |
| EFO:0007990 | neutrophil percentage of leukocytes |
Drugs & pharmacology
Drug and pharmacology data
Is drug target: yes
ChEMBL targets (1): CHEMBL6066178 (SINGLE PROTEIN)
PharmGKB: 1 entry (VIP=true, CPIC=false)
CTD chemical–gene interactions
25 total (human), top 25 by PubMed support.
| Chemical | Actions (top 5) | PubMed papers |
|---|---|---|
| Cadmium Chloride | decreases expression, increases abundance, increases expression | 2 |
| aristolochic acid I | decreases expression | 1 |
| triphenyl phosphate | affects expression | 1 |
| beta-methylcholine | affects expression | 1 |
| K 7174 | decreases expression | 1 |
| 4-chloro-N-((4-(1,1-dimethylethyl)phenyl)methyl)-3-ethyl-1-methyl-1H-pyrazole-5-carboxamide | decreases expression | 1 |
| erucylphospho-N,N,N-trimethylpropylammonium | decreases expression | 1 |
| pyrimidifen | decreases expression | 1 |
| thifluzamide | decreases expression | 1 |
| jinfukang | affects cotreatment, increases expression | 1 |
| Sunitinib | increases expression | 1 |
| Cadmium | increases abundance, increases expression | 1 |
| Caffeine | decreases phosphorylation | 1 |
| Cisplatin | affects cotreatment, increases expression | 1 |
| Ethyl Methanesulfonate | increases expression | 1 |
| Ivermectin | decreases expression | 1 |
| Lead | affects expression | 1 |
| Methyl Methanesulfonate | increases expression | 1 |
| Ribonucleotides | affects binding | 1 |
| Rotenone | decreases expression | 1 |
| Smoke | decreases expression | 1 |
| Tretinoin | decreases expression | 1 |
| Valproic Acid | increases expression | 1 |
| Aflatoxin B1 | increases expression | 1 |
| Copper Sulfate | decreases expression | 1 |
ChEMBL screening assays
6 unique, capped per target: 6 binding
Representative assays (with source publication via chembl_document):
| Assay ID | Type | Description | Source paper |
|---|---|---|---|
| CHEMBL5697614 | Binding | Inhibition of CPSF4 (unknown origin) assessed as fold change at 10 uM incubated for 1 hr by colloidal coomassie staining based LC-MS/MS analysis | Inhibition of BET recruitment to chromatin as an effective treatment for MLL-fusion leukaemia. — Nature |
Clinical trials (associated diseases)
0 trials via MONDO — disease-level, not drug-specific.
Related Atlas pages
- Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): diffuse scleroderma