PGBD1

gene

Also known as HUCEP-4, dJ874C20.4, SCAND4

Summary

PGBD1 (piggyBac transposable element derived 1, HGNC:19398) is a protein-coding gene on chromosome 6p22.1 that encodes PiggyBac transposable element-derived protein 1 (UniProt Q96JS3), a protein derived from the transposase of PiggyBac DNA transposons.

The piggyBac family of proteins, found in diverse animals, are transposases related to the transposase of the canonical piggyBac transposon from the moth, Trichoplusia ni. This family also includes genes in several genomes, including human, that appear to have been derived from the piggyBac transposons. This gene belongs to the subfamily of piggyBac transposable element derived (PGBD) genes. The PGBD proteins appear to be novel, with no obvious relationship to other transposases or other known protein families. This gene product is specifically expressed in the brain; however, its exact function is not known. Alternative splicing results in multiple transcript variants encoding the same protein.

Source: NCBI Gene 84547 — RefSeq curated summary.

At a glance

  • GWAS associations: 27
  • Clinical variants (ClinVar): 118 total
  • MANE Select transcript: NM_032507

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:19398
Approved symbol | PGBD1
Name | piggyBac transposable element derived 1
Location | 6p22.1
Locus type | gene with protein product
Status | Approved
Aliases | HUCEP-4, dJ874C20.4, SCAND4
Ensembl gene | ENSG00000137338
Ensembl biotype | protein_coding
OMIM | 621244
Entrez | 84547

Gene structure

Transcript identifiers

Ensembl transcripts: 6 — 6 protein_coding

ENST00000259883, ENST00000682144, ENST00000918204, ENST00000918205, ENST00000918206, ENST00000918207

RefSeq mRNA: 3 (MANE Select: NM_032507): NM_001184743, NM_001386059, NM_032507

CCDS: CCDS4648

Canonical transcript exons

ENST00000682144 — 7 exons

Exon | Start | End
ENSE00000928926 | 28285551 | 28285707
ENSE00000928927 | 28287080 | 28287168
ENSE00000928928 | 28296816 | 28296945
ENSE00000928929 | 28297895 | 28297991
ENSE00001212131 | 28281572 | 28281918
ENSE00001548550 | 28300724 | 28302549
ENSE00003917026 | 28283776 | 28284209
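
As a rough illustration of how the exon table can be used, the sketch below computes each exon's length and lists the exons in genomic order. It assumes the coordinates are 1-based and inclusive (the usual Ensembl convention) and that each table row splits into an exon ID followed by two 8-digit coordinates; the names EXONS and exon_length are illustrative only.

```python
# Exon rows read from the table above: (exon ID, start, end).
# Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
EXONS = [
    ("ENSE00000928926", 28285551, 28285707),
    ("ENSE00000928927", 28287080, 28287168),
    ("ENSE00000928928", 28296816, 28296945),
    ("ENSE00000928929", 28297895, 28297991),
    ("ENSE00001212131", 28281572, 28281918),
    ("ENSE00001548550", 28300724, 28302549),
    ("ENSE00003917026", 28283776, 28284209),
]


def exon_length(start: int, end: int) -> int:
    """Length in bases of a 1-based, inclusive interval."""
    return end - start + 1


# Sort by genomic start so the exons appear in chromosomal order.
exons_sorted = sorted(EXONS, key=lambda row: row[1])
lengths = {exon_id: exon_length(start, end) for exon_id, start, end in EXONS}
total_exonic = sum(lengths.values())

for exon_id, start, end in exons_sorted:
    print(f"{exon_id}\t{start}\t{end}\t{lengths[exon_id]}")
print("total exonic bases:", total_exonic)
```

The table lists exons by ID rather than by position, so sorting by start coordinate is what recovers their order along chromosome 6.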

Expression profiles

Bgee: expression breadth ubiquitous, 177 present calls, max score 87.71.

FANTOM5 (CAGE): breadth ubiquitous, TPM avg 7.8017 / max 75.3514, expressed in 1567 samples.

FANTOM5 promoters (2 alternative TSS)

Promoter ID | TPM avg | Samples expressed
66627 | 6.9983 | 1550
66626 | 0.8034 | 505

Top tissues by expression

248 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
primordial germ cell in gonad | CL:0000670 ∩ UBERON:0000991 | 87.71 | gold quality
cortical plate | UBERON:0005343 | 81.87 | gold quality
ganglionic eminence | UBERON:0004023 | 81.46 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 80.41 | gold quality
ventricular zone | UBERON:0003053 | 79.54 | gold quality
cerebellar hemisphere | UBERON:0002245 | 78.85 | gold quality
cerebellar cortex | UBERON:0002129 | 78.65 | gold quality
right hemisphere of cerebellum | UBERON:0014890 | 78.33 | gold quality
cerebellum | UBERON:0002037 | 77.22 | gold quality
stromal cell of endometrium | CL:0002255 | 77.09 | gold quality
Brodmann (1909) area 9 | UBERON:0013540 | 76.93 | gold quality
smooth muscle tissue | UBERON:0001135 | 76.92 | gold quality
muscle layer of sigmoid colon | UBERON:0035805 | 76.53 | gold quality
right frontal lobe | UBERON:0002810 | 76.08 | gold quality
lower esophagus muscularis layer | UBERON:0035833 | 75.03 | gold quality
lower esophagus | UBERON:0013473 | 74.98 | gold quality
apex of heart | UBERON:0002098 | 74.87 | gold quality
esophagogastric junction muscularis propria | UBERON:0035841 | 74.78 | gold quality
anterior cingulate cortex | UBERON:0009835 | 74.54 | gold quality
mucosa of stomach | UBERON:0001199 | 74.34 | gold quality
descending thoracic aorta | UBERON:0002345 | 74.24 | gold quality
primary visual cortex | UBERON:0002436 | 74.24 | gold quality
prefrontal cortex | UBERON:0000451 | 74.18 | gold quality
thoracic aorta | UBERON:0001515 | 74.16 | gold quality
ascending aorta | UBERON:0001496 | 74.07 | gold quality
C1 segment of cervical spinal cord | UBERON:0006469 | 73.80 | gold quality
body of uterus | UBERON:0009853 | 73.67 | gold quality
dorsolateral prefrontal cortex | UBERON:0009834 | 73.63 | gold quality
neocortex | UBERON:0001950 | 73.58 | gold quality
left coronary artery | UBERON:0001626 | 73.39 | gold quality

Single-cell (SCXA)

Detected in 1 experiment(s), a significant marker in 1.

Experiment | Marker? | Max mean expression
E-ANND-3 | yes | 3.07

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

31 targeting PGBD1, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-4795-3P | 100.00 | 74.62 | 4024
HSA-MIR-126-5P | 100.00 | 72.71 | 3180
HSA-MIR-548N | 99.98 | 71.94 | 4170
HSA-MIR-302C-5P | 99.97 | 72.56 | 3642
HSA-MIR-3912-5P | 99.95 | 66.11 | 925
HSA-MIR-3143 | 99.93 | 71.96 | 3104
HSA-MIR-6780A-5P | 99.88 | 66.69 | 2776
HSA-MIR-5582-3P | 99.86 | 72.48 | 4221
HSA-MIR-4728-5P | 99.85 | 69.39 | 4718
HSA-MIR-944 | 99.82 | 70.85 | 3042
HSA-MIR-6785-5P | 99.82 | 68.68 | 4428
HSA-MIR-323A-3P | 99.79 | 70.30 | 1739
HSA-MIR-1273H-5P | 99.77 | 66.32 | 2471
HSA-MIR-149-3P | 99.72 | 68.22 | 3963
HSA-MIR-4699-3P | 99.71 | 70.15 | 3142
HSA-MIR-30B-3P | 99.70 | 65.76 | 2325
HSA-MIR-3689A-3P | 99.70 | 65.73 | 2306
HSA-MIR-3689B-3P | 99.70 | 65.71 | 2311
HSA-MIR-3689C | 99.70 | 65.71 | 2311
HSA-MIR-6779-5P | 99.70 | 65.76 | 2363
HSA-MIR-6883-5P | 99.69 | 68.05 | 3785
HSA-MIR-510-3P | 99.54 | 70.06 | 2965
HSA-MIR-7159-3P | 99.51 | 70.17 | 1920
HSA-MIR-29A-5P | 99.08 | 68.59 | 1813
HSA-MIR-548L | 99.06 | 70.90 | 2560
HSA-MIR-10B-3P | 99.04 | 66.98 | 988
HSA-MIR-654-3P | 98.38 | 67.61 | 905
HSA-MIR-3074-3P | 97.83 | 67.26 | 922
HSA-MIR-4288 | 97.11 | 67.23 | 1636
HSA-MIR-6730-3P | 97.03 | 67.54 | 889
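
The note above says that a lower target_count means a more specific miRNA. One way to act on that, sketched here under assumptions, is to re-rank the regulators by target_count instead of by max score. Only six rows from the table are copied in, and the names MIRNAS and by_specificity are illustrative.

```python
# A subset of rows from the miRDB table above: (miRNA, max score, total target_count).
MIRNAS = [
    ("HSA-MIR-4795-3P", 100.00, 4024),
    ("HSA-MIR-126-5P", 100.00, 3180),
    ("HSA-MIR-3912-5P", 99.95, 925),
    ("HSA-MIR-10B-3P", 99.04, 988),
    ("HSA-MIR-654-3P", 98.38, 905),
    ("HSA-MIR-6730-3P", 97.03, 889),
]

# Fewer total targets first (more specific); ties broken by higher max score.
by_specificity = sorted(MIRNAS, key=lambda row: (row[2], -row[1]))

for name, max_score, target_count in by_specificity:
    print(f"{name}\ttargets={target_count}\tmax_score={max_score:.2f}")
```

In this subset the two regulators with a max score of 100.00 fall to the bottom once specificity is the sort key, which shows how the two rankings can disagree.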

Literature-anchored findings (GeneRIF, showing 5)

  • Genome-wide association study identifies a susceptibility locus for schizophrenia in Han Chinese at 11p11.2. (PMID:22037552)
  • The genes HIST1H2BJ, PRSS16, and PGBD1 were not associated with Japanese patients with schizophrenia. (PMID:22488895)
  • The high level of transfection achieved with PB may have significant advantages in basic scientific research for dental tissue engineering applications, such as functional studies of genes and proteins. (PMID:26208039)
  • Expression levels in the transposon-based cells were two- to five-fold higher than in those created by the conventional method, except for the IRES-mediated ones, in which the observed difference increased to more than 100-fold. (PMID:28662065)
  • A Novel Gene Controls a New Structure: PiggyBac Transposable Element-Derived 1, Unique to Mammals, Controls Mammal-Specific Neuronal Paraspeckles. (PMID:36205081)

Cross-species orthologs

1 ortholog

Organism | Symbol | Gene ID
mus_musculus | Pgbd1 | ENSMUSG00000055313

Paralogs (30): ZNF263 (ENSG00000006194), ZNF213 (ENSG00000085644), ZNF500 (ENSG00000103199), ZKSCAN1 (ENSG00000106261), ZNF205 (ENSG00000122386), ZSCAN9 (ENSG00000137185), ZNF215 (ENSG00000149054), ZSCAN12 (ENSG00000158691), ZNF394 (ENSG00000160908), ZNF75A (ENSG00000162086), ZSCAN21 (ENSG00000166529), ZNF232 (ENSG00000167840), ZNF24 (ENSG00000172466), ZNF449 (ENSG00000173275), ZSCAN4 (ENSG00000180532), ZSCAN22 (ENSG00000182318), ZNF75D (ENSG00000186376), ZNF396 (ENSG00000186496), ZNF397 (ENSG00000186812), ZSCAN30 (ENSG00000186814), ZKSCAN4 (ENSG00000187626), ZSCAN23 (ENSG00000187987), ZKSCAN3 (ENSG00000189298), ZSCAN16 (ENSG00000196812), ZSCAN25 (ENSG00000197037), ZSCAN26 (ENSG00000197062), ZNF165 (ENSG00000197279), ZKSCAN8 (ENSG00000198315), ZSCAN31 (ENSG00000235109), ZNF853 (ENSG00000236609)

Protein

Protein identifiers

PiggyBac transposable element-derived protein 1 | Q96JS3 (reviewed: Q96JS3)

Alternative names: Cerebral protein 4

All UniProt accessions (1): Q96JS3

UniProt curated annotations

Function. Transposase-derived from PiggyBac DNA transposons. Although it has been fully domesticated and lacks transposase activity, PGBD1 has acquired DNA-binding capability. It preferentially binds in and around genes involved in neuronal development, leading to their transcriptional pausing. Notably, PGBD1 suppresses paraspeckle assembly in neuronal cells.

Subcellular location. Nucleus.

Miscellaneous. PGBD1 is a nonmonotreme mammal-specific horizontally transferred gene.

RefSeq proteins (3): NP_001171672, NP_001372988, NP_115896* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR001190 | SRCR | Domain
IPR003309 | SCAN_dom | Domain
IPR029526 | PGBD | Domain
IPR038269 | SCAN_sf | Homologous_superfamily
IPR052638 | PiggyBac_TE-derived | Family

Pfam: PF02023, PF13843

UniProt features (16 total): sequence variant 8, sequence conflict 2, region of interest 2, chain 1, domain 1, modified residue 1, cross-link 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q96JS3-F1 | 66.92 | 0.37

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Post-translational modifications (2): 360, 218

Function

Pathways and Gene Ontology

Reactome pathways

0 pathways

MSigDB gene sets: 45 (showing top): GOBP_NEUROGENESIS, JAATINEN_HEMATOPOIETIC_STEM_CELL_UP, VECCHI_GASTRIC_CANCER_ADVANCED_VS_EARLY_UP, ZHENG_BOUND_BY_FOXP3, GOMF_SEQUENCE_SPECIFIC_DNA_BINDING, JOHNSTONE_PARVB_TARGETS_2_DN, GOMF_TRANSCRIPTION_REGULATOR_ACTIVITY, FEV_TARGET_GENES, SNIP1_TARGET_GENES, MIR4795_3P, MIR548N, MIR944, MIR6780A_5P, MIR29A_5P, MIR3074_3P

GO Biological Process (2): regulation of transcription by RNA polymerase II (GO:0006357), neurogenesis (GO:0022008)

GO Molecular Function (5): DNA binding (GO:0003677), DNA-binding transcription factor activity (GO:0003700), identical protein binding (GO:0042802), sequence-specific DNA binding (GO:0043565), protein binding (GO:0005515)

GO Cellular Component (2): nucleus (GO:0005634), membrane (GO:0016020)

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
regulation of DNA-templated transcription | 2
transcription by RNA polymerase II | 1
nervous system development | 1
cell differentiation | 1
nucleic acid binding | 1
transcription cis-regulatory region binding | 1
transcription regulator activity | 1
protein binding | 1
DNA binding | 1
binding | 1
intracellular membrane-bounded organelle | 1
cellular anatomical structure | 1

Protein interactions and networks

STRING

662 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
PGBD1 | NKAPL | Q5M9Q1 | 596
PGBD1 | TNK1 | Q13470 | 573
PGBD1 | PGBD5 | Q8N414 | 534
PGBD1 | THAP9 | Q9H5L6 | 530
PGBD1 | PRSS16 | Q9NQE7 | 526
PGBD1 | POGK | Q9P215 | 490
PGBD1 | GALP | Q9UBC7 | 484
PGBD1 | GIN1 | Q9NXP7 | 477
PGBD1 | ZBED8 | Q8IZ13 | 466
PGBD1 | SIPA1L3 | O60292 | 464
PGBD1 | ZNF862 | O60290 | 450
PGBD1 | GAB2 | Q9UQC2 | 447
PGBD1 | DKKL1 | Q9UK85 | 434
PGBD1 | APOE | P02649 | 418
PGBD1 | NAIF1 | Q69YI7 | 415
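
Because the scores above are reported ×1000, a small sketch may help when interpreting them. It converts each score to a 0 to 1 confidence and bins it using the cutoffs STRING conventionally describes (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The function names and the STRING_SCORES dictionary are illustrative, not part of any STRING client.

```python
# Partner -> STRING combined score (x1000), copied from the table above.
STRING_SCORES = {
    "NKAPL": 596, "TNK1": 573, "PGBD5": 534, "THAP9": 530, "PRSS16": 526,
    "POGK": 490, "GALP": 484, "GIN1": 477, "ZBED8": 466, "SIPA1L3": 464,
    "ZNF862": 450, "GAB2": 447, "DKKL1": 434, "APOE": 418, "NAIF1": 415,
}


def to_confidence(score_x1000: int) -> float:
    """Rescale a x1000 score to the 0-1 range."""
    return score_x1000 / 1000


def confidence_band(confidence: float) -> str:
    """Bin a 0-1 confidence using the conventional STRING cutoffs."""
    if confidence >= 0.9:
        return "highest"
    if confidence >= 0.7:
        return "high"
    if confidence >= 0.4:
        return "medium"
    if confidence >= 0.15:
        return "low"
    return "below low cutoff"


bands = {
    partner: confidence_band(to_confidence(score))
    for partner, score in STRING_SCORES.items()
}
for partner, band in bands.items():
    print(f"PGBD1-{partner}: {to_confidence(STRING_SCORES[partner]):.3f} ({band})")
```

Under these cutoffs, every listed partner falls in the medium band; none reaches the 0.7 high-confidence cutoff.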

IntAct

105 interactions, top by confidence:

A | B | Type | Score
PGBD1 | SCAND1 | psi-mi:"MI:0915"(physical association) | 0.940
SCAND1 | PGBD1 | psi-mi:"MI:0915"(physical association) | 0.940
PGBD1 | SCAND1 | psi-mi:"MI:0914"(association) | 0.940
PGBD1 | ZNF24 | psi-mi:"MI:0915"(physical association) | 0.900
ZNF24 | PGBD1 | psi-mi:"MI:0915"(physical association) | 0.900
PGBD1 | ZNF24 | psi-mi:"MI:0914"(association) | 0.900
PGBD1 | ZNF446 | psi-mi:"MI:0915"(physical association) | 0.800
ZNF446 | PGBD1 | psi-mi:"MI:0915"(physical association) | 0.800
PGBD1 | ZSCAN22 | psi-mi:"MI:0915"(physical association) | 0.790
ZSCAN22 | PGBD1 | psi-mi:"MI:0915"(physical association) | 0.790

BioGRID (70): PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), PGBD1 (Two-hybrid), ZSCAN22 (Two-hybrid), PGBD1 (Affinity Capture-MS), PGBD1 (Affinity Capture-MS), PGBD1 (Affinity Capture-MS), PGBD1 (Affinity Capture-MS), PGBD1 (Affinity Capture-MS), MEOX2 (Two-hybrid), NR4A1 (Two-hybrid)

ESM2 similar proteins: A3KMX0, A4IFA3, A4IGY9, A4Z943, A4Z944, B8QB46, D2EAC2, E1C2V1, O43422, O60290, P10911, P35125, P86452, Q13075, Q2NKX8, Q3UPF5, Q49AG3, Q5FWF4, Q5SVZ6, Q5T890, Q5TKR9, Q6DJS0, Q6EKJ0, Q6R2W3, Q6YI93, Q7Z2W4, Q80WE4, Q86UP8, Q86VD1, Q8BZ21, Q8N8K9, Q8QMP8, Q8TDB6, Q8WML3, Q92794, Q96JM7, Q96JS3, Q99388, Q99NI3, Q9CUX1

Diamond homologs: A1YEP8, A1YEQ3, A1YEV9, A1YFW2, A1YFW6, A1YG26, A1YG48, A1YG60, A1YGJ4, A1YGK6, A2T6E3, A2T6V8, A2T6W2, A2T712, A2T736, A2T7D2, A2T7D7, A2T7F2, A2T7F4, A2T7L7, A2T812, A6QNZ0, A6QPT6, B2KFW1, O14709, O14771, O14978, O15535, O43309, O60304, O95125, P10073, P17022, P17028, P17029, P17040, P28698, P49910, P51815, P59923

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

118 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 97
Likely benign | 8
Benign | 2

Top pathogenic / likely-pathogenic (0)

SpliceAI

1064 predictions. Top by Δscore:

Variant | Effect | Δscore
6:28284187:G:GT | donor_gain | 1.0000
6:28296942:A:G | donor_gain | 1.0000
6:28296944:GA:G | donor_gain | 1.0000
6:28296946:G:GG | donor_gain | 1.0000
6:28297893:A:AG | acceptor_gain | 1.0000
6:28297894:G:GG | acceptor_gain | 1.0000
6:28300722:A:AG | acceptor_gain | 1.0000
6:28300722:A:AT | acceptor_loss | 1.0000
6:28300723:G:GG | acceptor_gain | 1.0000
6:28300723:G:GT | acceptor_loss | 1.0000
6:28300723:GA:G | acceptor_gain | 1.0000
6:28300723:GAGA:G | acceptor_gain | 1.0000
6:28281711:C:G | donor_gain | 0.9900
6:28283774:A:AG | acceptor_gain | 0.9900
6:28283775:G:GG | acceptor_gain | 0.9900
6:28284187:G:T | donor_gain | 0.9900
6:28284206:ACAG:A | donor_loss | 0.9900
6:28284208:AG:A | donor_loss | 0.9900
6:28284209:GG:G | donor_loss | 0.9900
6:28284211:T:A | donor_loss | 0.9900
6:28284214:G:T | donor_gain | 0.9900
6:28285698:GAA:G | donor_gain | 0.9900
6:28285701:G:GG | donor_gain | 0.9900
6:28285703:GAGTG:G | donor_gain | 0.9900
6:28285705:GTG:G | donor_gain | 0.9900
6:28287168:GGT:G | donor_loss | 0.9900
6:28287169:G:GA | donor_loss | 0.9900
6:28287170:TAA:T | donor_loss | 0.9900
6:28296925:T:G | donor_gain | 0.9900
6:28296941:GA:G | donor_gain | 0.9900
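
The variant keys in the table appear to follow a chrom:pos:ref:alt layout. Assuming that layout holds, the sketch below parses a key and labels it as an SNV, insertion, or deletion from the allele lengths. The Variant type and both function names are hypothetical helpers, not part of SpliceAI.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(variant: Variant) -> str:
    """Classify by comparing reference and alternate allele lengths."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.alt) > len(variant.ref):
        return "insertion"
    if len(variant.alt) < len(variant.ref):
        return "deletion"
    return "MNV"  # same length, more than one base


# Three keys taken from the table above.
for key in ("6:28296942:A:G", "6:28284187:G:GT", "6:28300723:GAGA:G"):
    parsed = parse_variant(key)
    print(key, "->", variant_type(parsed))
```

Grouping the parsed positions by nearest exon boundary would be a natural next step, since most top-scoring predictions sit within a few bases of the exon starts and ends listed under Canonical transcript exons.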

AlphaMissense

0 scored; no likely-pathogenic variants listed.

dbSNP variants (sampled 300 via entrez): RS1000113926 (6:28299409 G>T), RS1000448868 (6:28286209 T>C), RS1000472666 (6:28279749 T>G), RS1000600442 (6:28293919 A>C,G), RS1000741916 (6:28294330 G>T), RS1000819624 (6:28285312 T>C), RS1001124101 (6:28286495 A>G), RS1001769562 (6:28293458 T>A), RS1002149807 (6:28281535 C>G,T), RS1002176363 (6:28301586 G>A,T), RS1002278051 (6:28295677 C>A,T), RS1002388370 (6:28288399 A>G), RS1002673161 (6:28290003 T>C), RS1002714179 (6:28290377 T>C), RS1002828133 (6:28288772 G>A)

Disease associations

OMIM: gene MIM:621244 | disease phenotypes: none listed

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

27 associations (top):

Study | Trait | p-value
GCST003997_12 | Myopia | 1.000000e-12
GCST004521_112 | Autism spectrum disorder or schizophrenia | 3.000000e-26
GCST004521_115 | Autism spectrum disorder or schizophrenia | 3.000000e-16
GCST004521_166 | Autism spectrum disorder or schizophrenia | 4.000000e-24
GCST004521_212 | Autism spectrum disorder or schizophrenia | 5.000000e-14
GCST004521_23 | Autism spectrum disorder or schizophrenia | 2.000000e-11
GCST004521_43 | Autism spectrum disorder or schizophrenia | 2.000000e-27
GCST004521_6 | Autism spectrum disorder or schizophrenia | 2.000000e-15
GCST004521_7 | Autism spectrum disorder or schizophrenia | 2.000000e-15
GCST004521_73 | Autism spectrum disorder or schizophrenia | 8.000000e-11
GCST004521_77 | Autism spectrum disorder or schizophrenia | 1.000000e-19
GCST004748_92 | Lung cancer | 2.000000e-12
GCST004750_90 | Squamous cell lung carcinoma | 4.000000e-11
GCST007269_245 | Pulse pressure | 6.000000e-15
GCST008921_4 | Asthma and major depressive disorder | 2.000000e-11
GCST008921_6 | Asthma and major depressive disorder | 1.000000e-09
GCST010002_50 | Refractive error | 4.000000e-34
GCST010083_222 | Hemoglobin levels | 1.000000e-44
GCST010142_16 | Fish- and plant-related diet | 2.000000e-10
GCST010142_19 | Fish- and plant-related diet | 4.000000e-10
GCST010142_34 | Fish- and plant-related diet | 7.000000e-09
GCST010142_35 | Fish- and plant-related diet | 8.000000e-09
GCST010142_42 | Fish- and plant-related diet | 1.000000e-08
GCST010142_7 | Fish- and plant-related diet | 3.000000e-12
GCST010702_75 | Subcortical volume (MOSTest) | 3.000000e-11
GCST010703_272 | Brain morphology (MOSTest) | 7.000000e-16
GCST010979_11 | Kawasaki disease | 7.000000e-10
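
To put the p-values in context, the sketch below converts a few of them to the -log10 scale and compares them with the commonly used genome-wide significance threshold of 5e-8. Only six rows from the table are copied in, the threshold is a convention rather than something stated by this page, and the names ASSOCIATIONS and neg_log10 are illustrative.

```python
import math

# Conventional genome-wide significance threshold (an assumption, not from this page).
GENOME_WIDE_ALPHA = 5e-8

# A subset of rows from the GWAS table above: (study, trait, p-value).
ASSOCIATIONS = [
    ("GCST003997_12", "Myopia", 1e-12),
    ("GCST004521_43", "Autism spectrum disorder or schizophrenia", 2e-27),
    ("GCST010002_50", "Refractive error", 4e-34),
    ("GCST010083_222", "Hemoglobin levels", 1e-44),
    ("GCST010142_42", "Fish- and plant-related diet", 1e-08),
    ("GCST010979_11", "Kawasaki disease", 7e-10),
]


def neg_log10(p_value: float) -> float:
    """Return -log10(p), the scale used on Manhattan plots."""
    return -math.log10(p_value)


# Smallest p-value first, i.e. strongest association first.
ranked = sorted(ASSOCIATIONS, key=lambda row: row[2])
significant = [row for row in ASSOCIATIONS if row[2] < GENOME_WIDE_ALPHA]

for study, trait, p_value in ranked:
    print(f"{study}\t{trait}\t-log10(p)={neg_log10(p_value):.1f}")
print(f"{len(significant)} of {len(ASSOCIATIONS)} pass p < {GENOME_WIDE_ALPHA}")
```

Even the weakest row in the full table (1.000000e-08) sits below 5e-8, so every listed association passes this conventional threshold.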

EFO canonical traits (4, from GWAS)

EFO ID | Trait name
EFO:0005763 | pulse pressure measurement
EFO:0004509 | hemoglobin measurement
EFO:0008111 | diet measurement
EFO:0004346 | neuroimaging measurement

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

31 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Benzo(a)pyrene | increases expression | 2
FR900359 | decreases phosphorylation | 1
urushiol | decreases expression | 1
trichostatin A | affects expression | 1
sodium arsenite | affects cotreatment, decreases expression, increases abundance | 1
butyraldehyde | decreases expression | 1
manganese chloride | affects cotreatment, decreases expression, increases abundance | 1
Temozolomide | increases expression | 1
Leflunomide | decreases expression | 1
Acetaminophen | increases expression | 1
Air Pollutants | decreases expression, increases abundance | 1
Arsenic | affects cotreatment, decreases expression, increases abundance | 1
Atrazine | decreases expression | 1
Caffeine | increases phosphorylation | 1
Doxorubicin | affects expression | 1
Estradiol | affects cotreatment, increases expression | 1
Ethyl Methanesulfonate | increases expression | 1
Formaldehyde | increases expression | 1
Manganese | affects cotreatment, decreases expression, increases abundance | 1
Methyl Methanesulfonate | increases expression | 1
Quercetin | increases expression | 1
Tobacco Smoke Pollution | decreases expression | 1
Tretinoin | decreases expression | 1
7,8-Dihydro-7,8-dihydroxybenzo(a)pyrene 9,10-oxide | decreases expression | 1
Cyclosporine | increases expression | 1
Aflatoxin B1 | decreases methylation | 1
Cadmium Chloride | decreases expression | 1
Okadaic Acid | increases expression | 1
Copper Sulfate | decreases expression | 1
Acrylamide | increases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.

  • Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): Kawasaki disease