CSTF1

gene

Summary

CSTF1 (cleavage stimulation factor subunit 1, HGNC:2483) is a protein-coding gene on chromosome 20q13.2-q13.31 that encodes cleavage stimulation factor subunit 1 (UniProt Q05048), one of the multiple factors required for polyadenylation and 3’-end cleavage of mammalian pre-mRNAs. It is a common-essential gene (DepMap: required in 96.4% of cancer cell lines).

This gene encodes one of three subunits that combine to form cleavage stimulation factor (CSTF). CSTF is involved in the polyadenylation and 3’-end cleavage of pre-mRNAs. Like mammalian G protein beta subunits, this protein contains transducin-like repeats. Several transcript variants with different 5’ UTRs, but encoding the same protein, have been found for this gene.

Source: NCBI Gene 1477 — RefSeq curated summary.

At a glance

  • GWAS associations: 1
  • Clinical variants (ClinVar): 34 total
  • Cancer dependency (DepMap): dependent in 96.4% of screened cell lines (common-essential)
  • MANE Select transcript: NM_001324

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:2483
Approved symbol | CSTF1
Name | cleavage stimulation factor subunit 1
Location | 20q13.2-q13.31
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000101138
Ensembl biotype | protein_coding
OMIM | 600369
Entrez | 1477

Gene structure

Transcript identifiers

Ensembl transcripts: 16 — 12 protein_coding, 2 nonsense_mediated_decay, 1 retained_intron, 1 protein_coding_CDS_not_defined

ENST00000217109, ENST00000415828, ENST00000428552, ENST00000452950, ENST00000490539, ENST00000493039, ENST00000498689, ENST00000613138, ENST00000892721, ENST00000892722, ENST00000892723, ENST00000892724, ENST00000892725, ENST00000892726, ENST00000919355, ENST00000919356

RefSeq mRNA (3): NM_001033521, NM_001033522, NM_001324 (MANE Select: NM_001324)

CCDS: CCDS13452

Canonical transcript exons

ENST00000217109 — 6 exons

Exon | Start | End
ENSE00000845702 | 56395521 | 56395721
ENSE00001022804 | 56392645 | 56392713
ENSE00003630346 | 56397207 | 56397484
ENSE00003657539 | 56403468 | 56406362
ENSE00003673962 | 56397644 | 56397841
ENSE00003784583 | 56398967 | 56399357
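
As a rough illustration of how the exon coordinates above can be used, the following sketch computes each exon's length and lists the exons by genomic position. It assumes the Start and End values are 1-based, inclusive coordinates (the usual Ensembl convention). The strand and the transcript order of the exons are not stated on this page, so the sort is by position only. The `Exon` type and `exon_length` helper are illustrative names, not part of any database API.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    exon_id: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


# Coordinates copied from the exon table for ENST00000217109.
EXONS = [
    Exon("ENSE00000845702", 56395521, 56395721),
    Exon("ENSE00001022804", 56392645, 56392713),
    Exon("ENSE00003630346", 56397207, 56397484),
    Exon("ENSE00003657539", 56403468, 56406362),
    Exon("ENSE00003673962", 56397644, 56397841),
    Exon("ENSE00003784583", 56398967, 56399357),
]


def exon_length(exon: Exon) -> int:
    """Length in bp under the inclusive-coordinate assumption."""
    return exon.end - exon.start + 1


# Order exons by genomic start position (not necessarily transcript order).
by_position = sorted(EXONS, key=lambda e: e.start)
for exon in by_position:
    print(exon.exon_id, exon.start, exon.end, exon_length(exon))

# Summed exon length of the listed exons (UTRs included).
total_exonic_bp = sum(exon_length(e) for e in EXONS)
print("total exonic bp:", total_exonic_bp)
```

Under these assumptions the largest exon is ENSE00003657539, which is also the one with the highest genomic coordinates.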

Expression profiles

Bgee: expression breadth ubiquitous, 248 present calls, max score 92.96.

FANTOM5 (CAGE): breadth ubiquitous, TPM avg 28.4097 / max 296.0451, expressed in 1801 samples.

FANTOM5 promoters (4 alternative TSS)

Promoter ID | TPM avg | Samples expressed
185415 | 18.4599 | 1777
185412 | 7.7635 | 1657
185414 | 1.4900 | 755
185413 | 0.6964 | 389

Top tissues by expression

284 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 92.96 | gold quality
adrenal tissue | UBERON:0018303 | 91.56 | gold quality
secondary oocyte | CL:0000655 | 91.34 | gold quality
calcaneal tendon | UBERON:0003701 | 90.06 | gold quality
primordial germ cell in gonad | CL:0000670 ∩ UBERON:0000991 | 89.93 | gold quality
islet of Langerhans | UBERON:0000006 | 89.58 | gold quality
stromal cell of endometrium | CL:0002255 | 89.51 | gold quality
ventricular zone | UBERON:0003053 | 87.46 | gold quality
rectum | UBERON:0001052 | 86.60 | gold quality
granulocyte | CL:0000094 | 86.49 | gold quality
body of pancreas | UBERON:0001150 | 86.36 | gold quality
ganglionic eminence | UBERON:0004023 | 86.26 | gold quality
left ovary | UBERON:0002119 | 86.08 | gold quality
pancreas | UBERON:0001264 | 86.05 | gold quality
oocyte | CL:0000023 | 86.00 | gold quality
right adrenal gland | UBERON:0001233 | 85.96 | gold quality
right adrenal gland cortex | UBERON:0035827 | 85.91 | gold quality
muscle of leg | UBERON:0001383 | 85.75 | gold quality
right ovary | UBERON:0002118 | 85.70 | gold quality
gall bladder | UBERON:0002110 | 85.68 | gold quality
right lobe of liver | UBERON:0001114 | 85.67 | gold quality
gastrocnemius | UBERON:0001388 | 85.63 | gold quality
sperm | CL:0000019 | 85.38 | gold quality
testis | UBERON:0000473 | 85.35 | gold quality
left adrenal gland | UBERON:0001234 | 85.30 | gold quality
left adrenal gland cortex | UBERON:0035825 | 85.25 | gold quality
left testis | UBERON:0004533 | 85.14 | gold quality
hindlimb stylopod muscle | UBERON:0004252 | 84.87 | gold quality
right testis | UBERON:0004534 | 84.62 | gold quality
left uterine tube | UBERON:0001303 | 84.60 | gold quality

Single-cell (SCXA)

Detected in 2 experiment(s), a significant marker in 1.

Experiment | Marker? | Max mean expression
E-MTAB-7249 | no | 434.86
E-ANND-3 | no | 0.00

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

53 targeting CSTF1, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-LET-7A-3P | 100.00 | 74.03 | 3932
HSA-LET-7B-3P | 100.00 | 74.08 | 3913
HSA-LET-7F-1-3P | 100.00 | 74.02 | 3928
HSA-MIR-98-3P | 100.00 | 74.08 | 3907
HSA-MIR-5011-5P | 100.00 | 83.46 | 5820
HSA-MIR-1277-5P | 100.00 | 73.95 | 5056
HSA-MIR-6870-5P | 99.99 | 68.55 | 2115
HSA-MIR-4775 | 99.98 | 75.00 | 6394
HSA-MIR-4723-5P | 99.97 | 68.70 | 2034
HSA-MIR-5698 | 99.97 | 68.49 | 2029
HSA-MIR-7111-5P | 99.97 | 68.48 | 2062
HSA-MIR-590-3P | 99.96 | 74.34 | 6478
HSA-MIR-3140-3P | 99.88 | 68.47 | 2069
HSA-MIR-4779 | 99.86 | 66.50 | 1583
HSA-MIR-548AG | 99.77 | 69.25 | 1492
HSA-MIR-548AI | 99.69 | 69.24 | 1494
HSA-MIR-548BA | 99.69 | 69.14 | 1514
HSA-MIR-570-5P | 99.69 | 69.24 | 1494
HSA-MIR-3202 | 99.66 | 67.70 | 2737
HSA-MIR-5197-5P | 99.64 | 69.08 | 1494
HSA-MIR-6126 | 99.62 | 68.09 | 996
HSA-MIR-4276 | 99.56 | 67.66 | 2514
HSA-MIR-451B | 99.55 | 68.28 | 1380
HSA-MIR-4761-5P | 99.51 | 66.69 | 804
HSA-MIR-486-5P | 99.51 | 70.39 | 707
HSA-MIR-582-5P | 99.47 | 70.79 | 2635
HSA-MIR-548B-3P | 99.38 | 67.26 | 1000
HSA-MIR-3678-3P | 99.31 | 67.10 | 1432
HSA-MIR-2115-3P | 99.31 | 69.68 | 2026
HSA-MIR-1273H-3P | 99.29 | 67.55 | 980
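
The note above says a lower target_count means a more specific miRNA. One possible way to use that, sketched here on a handful of rows copied from the table, is to rank by max score first and break ties by target_count. The tie-break rule and the `MirnaHit` type are assumptions made for illustration; the page itself only states that the list is ordered by miRDB confidence.

```python
from typing import NamedTuple


class MirnaHit(NamedTuple):
    mirna: str
    max_score: float
    avg_score: float
    target_count: int  # genes targeted in total; lower = more specific


# A few rows copied from the miRDB table above.
HITS = [
    MirnaHit("HSA-LET-7A-3P", 100.00, 74.03, 3932),
    MirnaHit("HSA-MIR-98-3P", 100.00, 74.08, 3907),
    MirnaHit("HSA-MIR-5011-5P", 100.00, 83.46, 5820),
    MirnaHit("HSA-MIR-4761-5P", 99.51, 66.69, 804),
    MirnaHit("HSA-MIR-486-5P", 99.51, 70.39, 707),
]

# Highest confidence first; among equal scores, the more specific miRNA first.
ranked = sorted(HITS, key=lambda h: (-h.max_score, h.target_count))
for hit in ranked:
    print(hit.mirna, hit.max_score, hit.target_count)
```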

Functional genomics

DepMap (CRISPR cell-line fitness): dependent in 96.4% of screened cell lines, common-essential.

Literature-anchored findings (GeneRIF, showing 2)

  • A deletion variant of BARD1, a BRCA1 partner, occurs in cells and interacts with and colocalizes with CstF-50. (PMID:15878232)
  • Characterizes the BARD1 structural biochemistry responsible for CstF-50 binding. (PMID:18842000)

Cross-species orthologs

5 orthologs

Organism | Symbol | Gene ID
danio_rerio | cstf1 | ENSDARG00000044820
mus_musculus | Cstf1 | ENSMUSG00000027498
rattus_norvegicus | Cstf1 | ENSRNOG00000004775
drosophila_melanogaster | CstF50 | FBGN0039867
caenorhabditis_elegans | | WBGENE00000773

Paralogs (2): IK (ENSG00000113141), WDR55 (ENSG00000120314)

Protein

Protein identifiers

Cleavage stimulation factor subunit 1: Q05048 (reviewed: Q05048)

Alternative names: CF-1 50 kDa subunit, Cleavage stimulation factor 50 kDa subunit

All UniProt accessions (4): Q05048, A0A087WV92, A0A0A0MSZ9, A3KFI9

UniProt curated annotations

Function. One of the multiple factors required for polyadenylation and 3’-end cleavage of mammalian pre-mRNAs. May be responsible for the interaction of CSTF with other factors to form a stable complex on the pre-mRNA.

Subunit / interactions. Homodimer. The CSTF complex is composed of CSTF1 (50 kDa subunit), CSTF2 (64 kDa subunit) and CSTF3 (77 kDa subunit). Interacts (via WD repeats) directly with CSTF3. Interacts (via WD repeat 6) with BARD1. Interacts with ERCC6.

Subcellular location. Nucleus.

Post-translational modifications. The N-terminus is blocked.

Domain organisation. N-terminus mediates homodimerization.

RefSeq proteins (3): NP_001028693, NP_001028694, NP_001315* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR001680 | WD40_rpt | Repeat
IPR015943 | WD40/YVTN_repeat-like_dom_sf | Homologous_superfamily
IPR019775 | WD40_repeat_CS | Conserved_site
IPR032028 | CSTF1_dimer | Domain
IPR036322 | WD40_repeat_dom_sf | Homologous_superfamily
IPR038184 | CSTF1_dimer_sf | Homologous_superfamily
IPR044633 | CstF1-like | Family

Pfam: PF00400, PF16699

UniProt features (44 total): strand 27, repeat 6, turn 6, helix 3, chain 1, region of interest 1

Structure

Experimental structures (PDB)

1 structure.

PDB | Method | Resolution (Å)
6B3X | X-RAY DIFFRACTION | 2.3

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q05048-F1 | 90.65 | 0.77

Function

Pathways and Gene Ontology

Reactome pathways

6 pathways

ID | Pathway
R-HSA-72187 | mRNA 3’-end processing
R-HSA-73856 | RNA Polymerase II Transcription Termination
R-HSA-77595 | Processing of Intronless Pre-mRNAs
R-HSA-9770562 | mRNA Polyadenylation
R-HSA-9918481 | Dengue Virus-Host Interactions
R-HSA-72203 | Processing of Capped Intron-Containing Pre-mRNA

MSigDB gene sets: 161 (showing top): GSE45365_CD8A_DC_VS_CD11B_DC_IFNAR_KO_UP, MULLIGHAN_NPM1_SIGNATURE_3_UP, BUYTAERT_PHOTODYNAMIC_THERAPY_STRESS_DN, TGCGCANK_UNKNOWN, GOBP_REGULATION_OF_MRNA_3_END_PROCESSING, REACTOME_MRNA_3_END_PROCESSING, REACTOME_PROCESSING_OF_CAPPED_INTRON_CONTAINING_PRE_MRNA, GOBP_MRNA_3_END_PROCESSING, BOYLAN_MULTIPLE_MYELOMA_C_CLUSTER_UP, ZHOU_INFLAMMATORY_RESPONSE_LIVE_DN, DODD_NASOPHARYNGEAL_CARCINOMA_UP, MYB_Q3, YNGTTNNNATT_UNKNOWN, GRADE_COLON_AND_RECTAL_CANCER_UP, REACTOME_METABOLISM_OF_RNA

GO Biological Process (2): mRNA 3’-end processing (GO:0031124), mRNA processing (GO:0006397)

GO Molecular Function (2): RNA binding (GO:0003723), protein binding (GO:0005515)

GO Cellular Component (3): nucleoplasm (GO:0005654), mRNA cleavage stimulating factor complex (GO:0005848), nucleus (GO:0005634)

Reactome top-level categories

Rollup of top-6 pathways:

Category | Pathways
Processing of Capped Intron-Containing Pre-mRNA | 1
RNA Polymerase II Transcription | 1
Processing of Capped Intronless Pre-mRNA | 1
mRNA 3’-end processing | 1
Dengue Virus Infection | 1
Metabolism of RNA | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
mRNA processing | 1
RNA 3’-end processing | 1
RNA processing | 1
mRNA metabolic process | 1
nucleic acid binding | 1
binding | 1
nuclear lumen | 1
cellular anatomical structure | 1
mRNA cleavage factor complex | 1
intracellular membrane-bounded organelle | 1

Protein interactions and networks

STRING

1212 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
CSTF1 | CSTF2 | P33240 | 999
CSTF1 | CSTF3 | Q12996 | 999
CSTF1 | BARD1 | Q99728 | 987
CSTF1 | BRCA1 | P38398 | 948
CSTF1 | CPSF2 | Q9P2I0 | 920
CSTF1 | CPSF1 | Q10570 | 919
CSTF1 | CPSF3 | Q9UKF6 | 875
CSTF1 | CPSF4 | O95639 | 874
CSTF1 | PCF11 | O94913 | 873
CSTF1 | SYMPK | Q92797 | 871
CSTF1 | CPSF7 | Q8N684 | 804
CSTF1 | PAPOLA | P51003 | 786
CSTF1 | PAPOLG | Q9BWT3 | 783
CSTF1 | PAPOLB | Q9NRJ5 | 780
CSTF1 | NUDT21 | O43809 | 776

IntAct

63 interactions, top by confidence:

A | B | Type | Score
TOLLIP | TOM1L1 | MI:0914 (association) | 0.800
TOLLIP | CSTF1 | MI:0915 (physical association) | 0.760
SNRPB | PRMT5 | MI:0914 (association) | 0.670
CSTF1 | BARD1 | MI:0915 (physical association) | 0.580
BARD1 | CSTF1 | MI:0915 (physical association) | 0.580
SNRPN | PRMT5 | MI:0914 (association) | 0.530
RACGAP1 | CHEK1 | MI:0914 (association) | 0.530
BMP1 | TLL1 | MI:0914 (association) | 0.530
CSTF2T | CRTAP | MI:0914 (association) | 0.530
TOLLIP | IRAK2 | MI:0914 (association) | 0.500
CFTR | CNOT1 | MI:0914 (association) | 0.480
CPSF6 | DDX39A | MI:0914 (association) | 0.480
POLR1A | CSTF1 | MI:0915 (physical association) | 0.400
CSTF1 | PCNA | MI:0915 (physical association) | 0.400
GCM2 | CSTF1 | MI:0915 (physical association) | 0.400
CPSF3 | P4HA2 | MI:0914 (association) | 0.350
Cct4 | ARHGAP32 | MI:0914 (association) | 0.350
Cct2 | OSBPL9 | MI:0914 (association) | 0.350
Cct8 | DTL | MI:0914 (association) | 0.350
FIP1L1 | WWP2 | MI:0914 (association) | 0.350
NS1 | ESYT2 | MI:0914 (association) | 0.350
ESR1 | ESYT2 | MI:0914 (association) | 0.350
LRRK2 | | MI:0914 (association) | 0.350
GTF2E2 | UBA6 | MI:0914 (association) | 0.350

BioGRID (156): CSTF1 (Affinity Capture-MS), CSTF1 (Affinity Capture-MS), CSTF1 (Affinity Capture-MS), CSTF1 (Co-fractionation), CSTF1 (Co-fractionation), CSTF1 (Co-fractionation), CSTF1 (Co-fractionation), CSTF1 (Co-fractionation), CSTF1 (Co-fractionation), CSTF3 (Co-fractionation), DCUN1D1 (Co-fractionation), DLD (Co-fractionation), DNAAF5 (Co-fractionation), HSPBP1 (Co-fractionation), MAPK3 (Co-fractionation)

ESM2 similar proteins: A0JN27, A6H7F7, B2RYU6, B5FXJ6, B5FYY5, B5X7X4, B5XGE7, O43504, P55168, P61201, P61202, P61203, P79101, Q05048, Q13888, Q28F72, Q2TBL9, Q2TBV5, Q2YDH6, Q3T132, Q4KLA0, Q4R9A8, Q4VC33, Q5BJQ6, Q5F398, Q5M8X5, Q5R532, Q5R8K2, Q5R9J9, Q5RKJ1, Q63ZJ2, Q6DEG4, Q6DF40, Q6GR10, Q6IQT4, Q6IR75, Q6P1K8, Q7L5Y9, Q7SXR3, Q7ZXB7

Diamond homologs: A1CF18, A1CUD6, A1DP19, A2QP30, A4R3M4, A7EKM8, A8NEG8, A8XZJ9, A9V790, B0XM00, B2AEZ5, B2B766, B2VWG7, B2VZH2, B3MEY6, B3MHX6, B3NLK7, B3NPW0, B4GAJ1, B4GIU9, B4HN85, B4HSL3, B4JWA1, B4KQU8, B4KT48, B4LQ21, B4MY65, B4MYI5, B4P528, B4P6P9, B4QHG6, B6GZD3, B6HP56, B6QC06, B6QC56, B7FNU7, B8M0Q1, B8N9H4, B8P4B0, B8PD53

SIGNOR signaling

1 interaction.

A | Effect | B | Mechanism
CSTF1 | form complex | CSTF complex | binding

Enriched among interaction partners

Reactome pathways and GO biological processes over-represented among this gene’s 78 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.

Reactome pathways:

Pathway | Partners | Fold | FDR
Processing of Intronless Pre-mRNAs | 6 | 52.7× | 4e-07
RNA Polymerase II Transcription Termination | 7 | 23.6× | 2e-06
mRNA 3’-end processing | 7 | 21.2× | 3e-06
mRNA Polyadenylation | 10 | 13.5× | 5e-07
Dengue Virus-Host Interactions | 10 | 7.0× | 1e-04
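
The over-representation test described above (hypergeometric test against a background, BH-FDR, gene-set size filter of 15 to 500, ranking by fold) can be sketched roughly as follows using only the standard library. This is a minimal illustration, not the pipeline that produced the table: the background size and the gene-set counts are hypothetical toy values, and all function names are made up for this example. Only the partner count of 78 comes from the text.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k): at least k gene-set members among n partners drawn from N genes."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def fold_enrichment(k: int, N: int, K: int, n: int) -> float:
    """Observed fraction of partners in the set divided by the background fraction."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """BH-adjusted p-values, returned in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


BACKGROUND = 20000  # hypothetical genome-wide background size
PARTNERS = 78       # interaction partners, as stated in the text
# Hypothetical gene sets: name -> (set size K, partners in set k).
gene_sets = {"set_a": (100, 6), "set_b": (300, 2), "tiny_set": (10, 2)}

# Apply the 15-500 gene-set size filter before testing.
eligible = {name: ks for name, ks in gene_sets.items() if 15 <= ks[0] <= 500}
names = list(eligible)
pvals = [hypergeom_sf(eligible[s][1], BACKGROUND, eligible[s][0], PARTNERS) for s in names]
fdrs = benjamini_hochberg(pvals)

# Rank by fold enrichment, highest first.
results = sorted(
    ((s, fold_enrichment(eligible[s][1], BACKGROUND, eligible[s][0], PARTNERS), q)
     for s, q in zip(names, fdrs)),
    key=lambda row: row[1],
    reverse=True,
)
for name, fold, q in results:
    print(f"{name}\t{fold:.1f}x\tFDR={q:.2g}")
```

Exact integer binomials keep the tail probability stable for small sets. A production pipeline would more likely rely on a statistics library.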

Disease & clinical

Clinical variants and AI predictions

ClinVar

34 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 26
Likely benign | 0
Benign | 0

Top pathogenic / likely-pathogenic (0)

SpliceAI

947 predictions. Top by Δscore:

Variant | Effect | Δscore
20:56392512:G:GT | donor_gain | 1.0000
20:56392535:GCCA:G | donor_gain | 1.0000
20:56392539:G:GG | donor_gain | 1.0000
20:56395519:A:AG | acceptor_gain | 1.0000
20:56395519:AGCT:A | acceptor_gain | 1.0000
20:56395520:G:GA | acceptor_gain | 1.0000
20:56395520:GCT:G | acceptor_gain | 1.0000
20:56395520:GCTG:G | acceptor_gain | 1.0000
20:56395675:GTCT:G | donor_gain | 1.0000
20:56397323:GCTTC:G | donor_gain | 1.0000
20:56397638:GTCTA:G | acceptor_loss | 1.0000
20:56397639:TCTA:T | acceptor_loss | 1.0000
20:56397640:CTA:C | acceptor_loss | 1.0000
20:56397641:TAGGT:T | acceptor_loss | 1.0000
20:56397642:AGGTC:A | acceptor_loss | 1.0000
20:56397643:G:T | acceptor_loss | 1.0000
20:56397769:TTC:T | donor_gain | 1.0000
20:56397776:G:T | donor_gain | 1.0000
20:56392534:TGCCA:T | donor_gain | 0.9900
20:56392535:GCCAG:G | donor_gain | 0.9900
20:56392543:G:GG | donor_gain | 0.9900
20:56395516:TGCAG:T | acceptor_gain | 0.9900
20:56395517:GCAGC:G | acceptor_gain | 0.9900
20:56395518:CA:C | acceptor_loss | 0.9900
20:56395518:CAGCT:C | acceptor_gain | 0.9900
20:56395519:AG:A | acceptor_loss | 0.9900
20:56395519:AGC:A | acceptor_gain | 0.9900
20:56395519:AGCTG:A | acceptor_gain | 0.9900
20:56395520:G:A | acceptor_loss | 0.9900
20:56395520:G:T | acceptor_gain | 0.9900
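
The variant keys in the SpliceAI table appear to follow a chrom:pos:ref:alt layout. The sketch below parses such a key and labels it as an SNV, insertion or deletion from the allele lengths. The genome build is not stated on this page, and the `Variant` type and helper names are illustrative only.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(v: Variant) -> str:
    """Classify by allele length only; no normalisation is attempted."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) < len(v.alt):
        return "insertion"
    if len(v.ref) > len(v.alt):
        return "deletion"
    return "MNV"  # same length, more than one base


# Keys copied from the table above.
for key in ("20:56392512:G:GT", "20:56395520:GCTG:G", "20:56397643:G:T"):
    variant = parse_variant(key)
    print(key, variant_type(variant))
```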

AlphaMissense

2833 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
20:56395617:T:C | L22P | 1.000
20:56395644:C:A | A31D | 1.000
20:56397344:T:C | Y103H | 1.000
20:56397344:T:G | Y103D | 1.000
20:56397348:T:A | V104D | 1.000
20:56397351:C:T | T105I | 1.000
20:56397356:C:A | H107N | 1.000
20:56397356:C:G | H107D | 1.000
20:56397356:C:T | H107Y | 1.000
20:56397357:A:G | H107R | 1.000
20:56397358:T:A | H107Q | 1.000
20:56397358:T:G | H107Q | 1.000
20:56397359:A:G | K108E | 1.000
20:56397360:A:T | K108I | 1.000
20:56397361:A:C | K108N | 1.000
20:56397361:A:T | K108N | 1.000
20:56397368:T:C | C111R | 1.000
20:56397369:G:A | C111Y | 1.000
20:56397369:G:T | C111F | 1.000
20:56397370:C:G | C111W | 1.000
20:56397378:C:A | A114D | 1.000
20:56397395:G:A | G120R | 1.000
20:56397395:G:C | G120R | 1.000
20:56397396:G:A | G120E | 1.000
20:56397396:G:T | G120V | 1.000
20:56397405:T:A | I123K | 1.000
20:56397407:G:C | A124P | 1.000
20:56397408:C:A | A124D | 1.000
20:56397411:C:T | T125I | 1.000
20:56397413:G:A | G126R | 1.000
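
To make the protein-change notation and the score column concrete, the sketch below parses a change such as H107Y into reference residue, position and alternate residue, and maps a score to a class label. The cut-offs used (below 0.34 likely benign, above 0.564 likely pathogenic, ambiguous in between) are the commonly cited AlphaMissense defaults; the page does not state them, so they are an assumption here and are exposed as parameters. Function names are illustrative.

```python
import re

# One-letter reference residue, 1-based position, one-letter alternate residue.
_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_protein_change(change: str) -> tuple[str, int, str]:
    """Parse e.g. 'H107Y' into ('H', 107, 'Y')."""
    match = _CHANGE.match(change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {change!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def am_class(score: float, benign_max: float = 0.34, pathogenic_min: float = 0.564) -> str:
    """Map a pathogenicity score to a class using the assumed cut-offs."""
    if score < benign_max:
        return "likely_benign"
    if score > pathogenic_min:
        return "likely_pathogenic"
    return "ambiguous"


# Rows copied from the table above: (protein change, am_pathogenicity).
rows = [("L22P", 1.000), ("H107Y", 1.000), ("C111W", 1.000)]
for change, score in rows:
    ref, pos, alt = parse_protein_change(change)
    print(pos, ref, alt, am_class(score))
```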

dbSNP variants (sampled 300 via entrez): RS1000195438 (20:56400646 T>C), RS1000209810 (20:56393513 A>G), RS1000331001 (20:56405926 T>A), RS1000407747 (20:56396810 G>A), RS1000638371 (20:56391902 C>T), RS1000827368 (20:56401743 A>G), RS1000880466 (20:56400288 G>A,C), RS1000903261 (20:56402183 T>C), RS1000983748 (20:56395224 T>C), RS1001839929 (20:56399547 T>C), RS1002024153 (20:56404225 G>A), RS1002167883 (20:56404896 A>C), RS1002170745 (20:56403275 C>G,T), RS1002274141 (20:56396442 A>G), RS1002608048 (20:56398243 C>A,G,T)

Disease associations

OMIM: gene MIM:600369 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

1 association (top):

Study | Trait | p-value
GCST003901_14 | Cognitive decline (age-related) | 3.000000e-06

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

39 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
3-((6-(2-methoxyphenyl)pyrimidin-4-yl)amino)phenyl)methane sulfonamide | decreases expression | 1
bisphenol F | affects cotreatment, increases expression | 1
dicrotophos | decreases expression | 1
alpha-pinene | affects cotreatment, increases oxidation, increases abundance | 1
bisphenol A | affects cotreatment, increases expression | 1
salinomycin | decreases expression | 1
arsenite | affects binding, increases reaction | 1
sodium arsenite | decreases expression | 1
methacrylaldehyde | affects cotreatment, increases oxidation, increases abundance | 1
di-n-butylphosphoric acid | affects expression | 1
CGP 52608 | affects binding, increases reaction | 1
jinfukang | decreases expression | 1
Acrolein | affects cotreatment, increases oxidation, increases abundance | 1
Air Pollutants | affects cotreatment, increases abundance, increases oxidation | 1
Caffeine | decreases phosphorylation | 1
Cannabidiol | affects cotreatment, decreases expression | 1
Clozapine | decreases expression | 1
Coumestrol | increases expression | 1
Cuprizone | affects cotreatment, decreases expression | 1
Dexamethasone | affects cotreatment, increases expression | 1
Doxorubicin | decreases expression | 1
Enzyme Inhibitors | decreases activity, increases O-linked glycosylation | 1
Formaldehyde | decreases expression | 1
Indomethacin | affects cotreatment, increases expression | 1
Ivermectin | decreases expression | 1
Ozone | affects cotreatment, increases oxidation, increases abundance | 1
Potassium Dichromate | increases expression | 1
Quartz | increases expression | 1
Ribonucleotides | affects binding | 1
Silicon Dioxide | increases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.
