PGA4

gene

Summary

PGA4 (pepsinogen A4, HGNC:8886) is a protein-coding gene at 11q12.2 that encodes Pepsin A-4 (UniProt P0DJD7). The enzyme shows particularly broad specificity: bonds involving phenylalanine and leucine are preferred, but many others are also cleaved to some extent.

This gene encodes a protein precursor of the digestive enzyme pepsin, a member of the peptidase A1 family of endopeptidases. The encoded precursor is secreted by gastric chief cells and undergoes autocatalytic cleavage in acidic conditions to form the active enzyme, which functions in the digestion of dietary proteins. This gene is found in a cluster of related genes on chromosome 11, each of which encodes one of multiple pepsinogens. Pepsinogen levels in serum may serve as a biomarker for atrophic gastritis and gastric cancer.

Source: NCBI Gene 643847 — RefSeq curated summary.

At a glance

  • Clinical variants (ClinVar): 9 total
  • MANE Select transcript: NM_001079808

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:8886
Approved symbol | PGA4
Name | pepsinogen A4
Location | 11q12.2
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000229183
Ensembl biotype | protein_coding
OMIM | 169720
Entrez | 643847

Gene structure

Transcript identifiers

Ensembl transcripts: 8 — 4 retained_intron, 3 protein_coding, 1 TEC

ENST00000378149, ENST00000537932, ENST00000539581, ENST00000541743, ENST00000543193, ENST00000544899, ENST00000545726, ENST00000624544

RefSeq mRNA (3): NM_001079808 (MANE Select), NM_001422100, NM_001422101

CCDS: CCDS31575

Canonical transcript exons

ENST00000378149 — 9 exons

Exon | Start | End
ENSE00002455546 | 61222347 | 61222456
ENSE00002462544 | 61227120 | 61227319
ENSE00002508847 | 61231382 | 61231694
ENSE00002520217 | 61226326 | 61226444
ENSE00003563152 | 61228671 | 61228787
ENSE00003624060 | 61222943 | 61223105
ENSE00003671567 | 61229919 | 61230063
ENSE00003682565 | 61230166 | 61230264
ENSE00003692619 | 61225339 | 61225456
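
The exon table is listed by exon ID rather than by genomic position. A minimal sketch of how the rows could be put in coordinate order and their lengths summed is given here. It assumes the coordinates are 1-based and inclusive (so length = end - start + 1), which the page does not state; `EXONS`, `exon_length`, and `summarize` are hypothetical names used only for illustration.

```python
# Exon rows copied from the table above: (Ensembl exon ID, start, end).
# Assumption: coordinates are 1-based and inclusive; the page does not say so.
EXONS = [
    ("ENSE00002455546", 61222347, 61222456),
    ("ENSE00002462544", 61227120, 61227319),
    ("ENSE00002508847", 61231382, 61231694),
    ("ENSE00002520217", 61226326, 61226444),
    ("ENSE00003563152", 61228671, 61228787),
    ("ENSE00003624060", 61222943, 61223105),
    ("ENSE00003671567", 61229919, 61230063),
    ("ENSE00003682565", 61230166, 61230264),
    ("ENSE00003692619", 61225339, 61225456),
]


def exon_length(start, end):
    """Length of an interval under the 1-based inclusive assumption."""
    return end - start + 1


def summarize(exons):
    """Return exons sorted by genomic start, their lengths, and the summed length."""
    ordered = sorted(exons, key=lambda row: row[1])
    lengths = [(exon_id, exon_length(start, end)) for exon_id, start, end in ordered]
    total = sum(length for _, length in lengths)
    return ordered, lengths, total


ordered, lengths, total = summarize(EXONS)
for exon_id, length in lengths:
    print(exon_id, length)
print("summed exon length:", total)
```

The summed length covers all exonic bases of ENST00000378149, untranslated regions included, so it is not the coding-sequence length.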

Expression profiles

Bgee: expression breadth ubiquitous, 128 present calls, max score 98.91.

Top tissues by expression

133 total; top 30 shown, ranked by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
right uterine tube | UBERON:0001302 | 98.91 | gold quality
body of stomach | UBERON:0001161 | 95.88 | gold quality
stomach | UBERON:0000945 | 94.33 | gold quality
lower esophagus mucosa | UBERON:0035834 | 92.71 | gold quality
fundus of stomach | UBERON:0001160 | 92.39 | gold quality
right coronary artery | UBERON:0001625 | 90.82 | gold quality
mucosa of stomach | UBERON:0001199 | 88.92 | gold quality
endocervix | UBERON:0000458 | 85.85 | gold quality
ectocervix | UBERON:0012249 | 85.41 | gold quality
left uterine tube | UBERON:0001303 | 83.23 | gold quality
right lung | UBERON:0002167 | 82.07 | gold quality
metanephros cortex | UBERON:0010533 | 81.94 | gold quality
body of pancreas | UBERON:0001150 | 81.35 | gold quality
right ovary | UBERON:0002118 | 80.60 | gold quality
uterine cervix | UBERON:0000002 | 80.17 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 79.02 | gold quality
fallopian tube | UBERON:0003889 | 78.93 | gold quality
right lobe of liver | UBERON:0001114 | 78.07 | gold quality
spleen | UBERON:0002106 | 77.88 | gold quality
descending thoracic aorta | UBERON:0002345 | 77.84 | gold quality
right adrenal gland | UBERON:0001233 | 77.34 | gold quality
left ovary | UBERON:0002119 | 77.00 | gold quality
transverse colon | UBERON:0001157 | 76.78 | gold quality
subcutaneous adipose tissue | UBERON:0002190 | 76.51 | gold quality
mucosa of transverse colon | UBERON:0004991 | 76.28 | gold quality
left adrenal gland | UBERON:0001234 | 76.26 | gold quality
esophagus mucosa | UBERON:0002469 | 76.25 | gold quality
left adrenal gland cortex | UBERON:0035825 | 76.16 | gold quality
adipose tissue | UBERON:0001013 | 75.97 | gold quality
omental fat pad | UBERON:0010414 | 75.61 | gold quality

Single-cell (SCXA)

Detected in 1 experiment; not a significant marker in any.

Experiment | Marker? | Max mean expression
E-ANND-3 | no | 0.97

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

5 miRNAs target PGA4, ranked by miRDB confidence (max_score). target_count is the total number of genes the miRNA targets; lower means more specific:

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-6793-5P | 99.97 | 65.95 | 758
HSA-MIR-6783-3P | 99.89 | 67.92 | 2059
HSA-MIR-4307 | 99.82 | 70.45 | 3374
HSA-MIR-4768-3P | 98.16 | 66.02 | 2330
HSA-MIR-194-3P | 97.36 | 65.96 | 1027

Cross-species orthologs

14 orthologs

Organism | Symbol | Gene ID
drosophila_melanogaster | Bace | FBGN0032049
drosophila_melanogaster | CG6508 | FBGN0032303
drosophila_melanogaster | CG17134 | FBGN0032304
drosophila_melanogaster | CG33128 | FBGN0053128
caenorhabditis_elegans | | WBGENE00000214
caenorhabditis_elegans | | WBGENE00000218
caenorhabditis_elegans | | WBGENE00012681
caenorhabditis_elegans | | WBGENE00012682
caenorhabditis_elegans | | WBGENE00012683
caenorhabditis_elegans | | WBGENE00013973
caenorhabditis_elegans | | WBGENE00017678
caenorhabditis_elegans | | WBGENE00019104
caenorhabditis_elegans | | WBGENE00019105
caenorhabditis_elegans | | WBGENE00077655

Paralogs (9): PGC (ENSG00000096088), CTSD (ENSG00000117984), NAPSA (ENSG00000131400), REN (ENSG00000143839), BACE2 (ENSG00000182240), BACE1 (ENSG00000186318), CTSE (ENSG00000196188), PGA3 (ENSG00000229859), PGA5 (ENSG00000256713)

Protein

Protein identifiers

Pepsin A-4 | P0DJD7 (reviewed: P0DJD7)

Alternative names: Pepsinogen-4

All UniProt accessions (4): P0DJD7, A0A1S5UZ02, F5GXL4, F5H0H6

UniProt curated annotations

Function. Shows particularly broad specificity; although bonds involving phenylalanine and leucine are preferred, many others are also cleaved to some extent.

Subcellular location. Secreted.

Similarity. Belongs to the peptidase A1 family.

RefSeq proteins (3): NP_001073276, NP_001409029, NP_001409030 (=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR001461 | Aspartic_peptidase_A1 | Family
IPR001969 | Aspartic_peptidase_AS | Active_site
IPR012848 | Aspartic_peptidase_N | Domain
IPR021109 | Peptidase_aspartic_dom_sf | Homologous_superfamily
IPR033121 | PEPTIDASE_A1 | Domain
IPR034162 | Pepsin_A | Domain

Pfam: PF00026, PF07966

UniProt features (53 total): strand 26, helix 11, turn 4, disulfide bond 3, sequence conflict 2, active site 2, signal peptide 1, propeptide 1, chain 1, domain 1, modified residue 1

Structure

Experimental structures (PDB)

5 structures.

PDB | Method | Resolution (Å)
1QRP | X-RAY DIFFRACTION | 1.96
1PSO | X-RAY DIFFRACTION | 2
1PSN | X-RAY DIFFRACTION | 2.2
1FLH | X-RAY DIFFRACTION | 2.45
3UTL | X-RAY DIFFRACTION | 2.61

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-P0DJD7-F1 | 91.15 | 0.74

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Catalytic / active sites (2): 94; 277

Post-translational modifications (1): 130

Disulfide bonds (3): 107–112, 268–272, 311–344

Function

Pathways and Gene Ontology

Reactome pathways

1 pathway

ID | Pathway
R-HSA-5683826 | Surfactant metabolism

MSigDB gene sets: 19 (showing top): GOBP_DIGESTION, NIKOLSKY_BREAST_CANCER_11Q12_Q14_AMPLICON, GOCC_MULTIVESICULAR_BODY, GOCC_ENDOSOME_LUMEN, GOBP_PROTEOLYSIS, GOMF_PEPTIDASE_ACTIVITY, GOMF_ASPARTIC_TYPE_PEPTIDASE_ACTIVITY, GOCC_LATE_ENDOSOME_LUMEN, GOCC_MULTIVESICULAR_BODY_LUMEN, REACTOME_SURFACTANT_METABOLISM, MIR4768_3P, MIR194_3P, MIR6793_5P, WP_GASTRIC_ACID_PRODUCTION, FOURATI_BLOOD_TWINRIX_AGE_25_83YO_RESPONDERS_VS_POOR_RESPONDERS_0DY_DN

GO Biological Process (2): proteolysis (GO:0006508), digestion (GO:0007586)

GO Molecular Function (4): aspartic-type endopeptidase activity (GO:0004190), protein binding (GO:0005515), peptidase activity (GO:0008233), hydrolase activity (GO:0016787)

GO Cellular Component (3): extracellular exosome (GO:0070062), multivesicular body lumen (GO:0097486), extracellular region (GO:0005576)

Reactome top-level categories

Rollup of top-1 pathways:

Category | Pathways
Metabolism of proteins | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
protein metabolic process | 1
multicellular organismal process | 1
endopeptidase activity | 1
aspartic-type peptidase activity | 1
binding | 1
hydrolase activity | 1
catalytic activity, acting on a protein | 1
catalytic activity | 1
extracellular vesicle | 1
multivesicular body | 1
late endosome lumen | 1
cellular anatomical structure | 1

Protein interactions and networks

STRING

0 interactions.

IntAct

29 interactions, top by confidence:

A | B | Type | Score
PGA4 | SPAG4 | psi-mi:"MI:0915"(physical association) | 0.560
CD74 | PGA4 | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | STOM | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | TMEM106C | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | HHLA2 | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | FIMP | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | AMIGO1 | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | FNDC9 | psi-mi:"MI:0915"(physical association) | 0.560
PGA4 | PVR | psi-mi:"MI:0915"(physical association) | 0.560
NTNG1 | AMY1A | psi-mi:"MI:0914"(association) | 0.350
SPAG4 | PGA4 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | SPAG4 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | CD74 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | STOM | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | TMEM106C | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | HHLA2 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | FIMP | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | AMIGO1 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | FNDC9 | psi-mi:"MI:0915"(physical association) | 0.000
PGA4 | PVR | psi-mi:"MI:0915"(physical association) | 0.000

BioGRID (11 interactions, each listed under PGA4): Two-hybrid (9), Affinity Capture-MS (2)

ESM2 similar proteins: O93428, P00791, P00792, P00793, P00794, P00795, P03954, P03955, P04073, P0DJD7, P0DJD8, P0DJD9, P11489, P14091, P16228, P16476, P18276, P20142, P25796, P27677, P27678, P27821, P27822, P27823, P28712, P28713, P43159, P56272, P70269, P81497, P81498, Q03168, Q28389, Q64411, Q689Z7, Q800A0, Q805F2, Q805F3, Q8SQ41, Q9D7R7

Diamond homologs: A0A146F0J0, A1CBR4, A1DDK1, A2R3L3, B0Y1V8, B6HL60, B8MF81, B8NLY9, C5FBS2, C5FW52, C5FZ57, C5PEI9, D4AIC4, D4ANC3, D4AT39, D4D7C5, D4DE18, D4DGR1, E5A7T3, E5R1B9, O01530, O13340, O60020, O65390, O76856, O93885, P00791, P00792, P00798, P03954, P03955, P04073, P06026, P0CU33, P0DJD7, P0DJD8, P0DJD9, P10602, P11489, P11838

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

9 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 8
Likely benign | 1
Benign | 0

Top pathogenic / likely-pathogenic (0)

SpliceAI

848 predictions. Top by Δscore:

Variant | Effect | Δscore
11:61222457:G:GG | donor_gain | 1.0000
11:61222939:ACAG:A | acceptor_gain | 1.0000
11:61222941:A:AG | acceptor_gain | 1.0000
11:61222941:AG:A | acceptor_gain | 1.0000
11:61222941:AGG:A | acceptor_gain | 1.0000
11:61222942:G:GA | acceptor_gain | 1.0000
11:61222942:GG:G | acceptor_gain | 1.0000
11:61222942:GGG:G | acceptor_gain | 1.0000
11:61222942:GGGT:G | acceptor_gain | 1.0000
11:61222942:GGGTC:G | acceptor_gain | 1.0000
11:61223039:A:T | donor_gain | 1.0000
11:61223101:TGGAT:T | donor_gain | 1.0000
11:61223102:GGAT:G | donor_gain | 1.0000
11:61223102:GGATG:G | donor_gain | 1.0000
11:61223103:GAT:G | donor_gain | 1.0000
11:61223103:GATG:G | donor_gain | 1.0000
11:61223104:AT:A | donor_gain | 1.0000
11:61223104:ATGTG:A | donor_loss | 1.0000
11:61223105:TGT:T | donor_loss | 1.0000
11:61223106:G:GG | donor_gain | 1.0000
11:61223106:GT:G | donor_loss | 1.0000
11:61223107:T:A | donor_loss | 1.0000
11:61226321:T:TA | acceptor_gain | 1.0000
11:61226321:TGCA:T | acceptor_loss | 1.0000
11:61226322:GCAG:G | acceptor_loss | 1.0000
11:61226323:CAGC:C | acceptor_loss | 1.0000
11:61226324:A:AG | acceptor_gain | 1.0000
11:61226324:AGC:A | acceptor_loss | 1.0000
11:61226325:G:GC | acceptor_gain | 1.0000
11:61226325:GC:G | acceptor_gain | 1.0000
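
Most of the top-scoring SpliceAI variants sit within a few bases of an exon boundary in the canonical transcript. A rough sketch of how that distance could be computed from the `chrom:pos:ref:alt` strings and the exon coordinates listed under "Canonical transcript exons" is given here. The helper names (`parse_variant`, `nearest_boundary`) and the `EXON_BOUNDS` list are hypothetical. Offsets are plain genomic differences, since the page does not state the transcript strand.

```python
# (start, end) pairs copied from the canonical transcript exon table.
EXON_BOUNDS = [
    (61222347, 61222456), (61222943, 61223105), (61225339, 61225456),
    (61226326, 61226444), (61227120, 61227319), (61228671, 61228787),
    (61229919, 61230063), (61230166, 61230264), (61231382, 61231694),
]


def parse_variant(variant):
    """Split 'chrom:pos:ref:alt' into its four fields, with pos as an int."""
    chrom, pos, ref, alt = variant.split(":")
    return chrom, int(pos), ref, alt


def nearest_boundary(pos, exons):
    """Return (offset, boundary) for the exon start or end closest to pos.

    offset = pos - boundary in genomic coordinates, so the sign carries no
    5'/3' meaning unless the strand is known.
    """
    best = None
    for start, end in exons:
        for boundary in (start, end):
            offset = pos - boundary
            if best is None or abs(offset) < abs(best[0]):
                best = (offset, boundary)
    return best


for variant in ["11:61222457:G:GG", "11:61222939:ACAG:A",
                "11:61223107:T:A", "11:61226321:T:TA"]:
    _, pos, _, _ = parse_variant(variant)
    offset, boundary = nearest_boundary(pos, EXON_BOUNDS)
    print(f"{variant}: {offset:+d} from boundary {boundary}")
```

For indels the listed position is the anchor base of the variant record, so the reported offset is approximate.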

AlphaMissense

2544 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
11:61229974:G:C | D277H | 0.999
11:61229975:A:C | D277A | 0.999
11:61229975:A:T | D277V | 0.999
11:61229976:C:A | D277E | 0.999
11:61229976:C:G | D277E | 0.999
11:61231445:T:A | W361R | 0.999
11:61231445:T:C | W361R | 0.999
11:61231455:G:T | G364V | 0.999
11:61225400:A:C | D94A | 0.998
11:61225400:A:T | D94V | 0.998
11:61225401:C:A | D94E | 0.998
11:61225401:C:G | D94E | 0.998
11:61225403:C:T | T95I | 0.998
11:61225420:T:A | W101R | 0.998
11:61225420:T:C | W101R | 0.998
11:61225439:G:A | C107Y | 0.998
11:61225440:C:G | C107W | 0.998
11:61227213:G:A | G184R | 0.998
11:61227213:G:C | G184R | 0.998
11:61227213:G:T | G184W | 0.998
11:61228768:T:A | W252R | 0.998
11:61228768:T:C | W252R | 0.998
11:61228770:G:C | W252C | 0.998
11:61228770:G:T | W252C | 0.998
11:61229975:A:G | D277G | 0.998
11:61229981:G:T | G279V | 0.998
11:61231454:G:C | G364R | 0.998
11:61231454:G:T | G364C | 0.998
11:61225367:G:A | G83E | 0.997
11:61225399:G:C | D94H | 0.997
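
Several of the top AlphaMissense substitutions fall on positions listed in the functional residue map (active sites 94 and 277, disulfide cysteine 107). A minimal sketch of that cross-check is given here. It assumes the AlphaMissense protein positions use the same numbering as the UniProt features for P0DJD7, which the page does not confirm; `parse_change` and `annotate` are hypothetical helper names.

```python
import re

# Positions copied from the functional residue map on this page.
CATALYTIC = {94, 277}
DISULFIDE = {107, 112, 268, 272, 311, 344}  # both ends of each listed bond
MODIFIED = {130}


def parse_change(change):
    """Parse a substitution such as 'D277H' into (ref_aa, position, alt_aa)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"not a simple substitution: {change!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def annotate(change):
    """List the curated feature classes that include the substituted position."""
    _, position, _ = parse_change(change)
    labels = []
    if position in CATALYTIC:
        labels.append("active site")
    if position in DISULFIDE:
        labels.append("disulfide bond")
    if position in MODIFIED:
        labels.append("modified residue")
    return labels


for change in ["D277H", "D94A", "C107Y", "W361R", "G364V", "T95I"]:
    labels = annotate(change)
    print(change, ", ".join(labels) if labels else "no curated feature at this position")
```

A position with no curated feature is not evidence of tolerance; the lookup only reflects the residues listed on this page.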

dbSNP variants (sampled 300 via entrez): RS1003927024 (11:61221636 T>A), RS1014015090 (11:61230954 C>G), RS1015851097 (11:61230061 G>A), RS1033619332 (11:61230519 CT>C), RS1039301952 (11:61221250 G>A), RS1050756118 (11:61221406 G>A), RS1157001068 (11:61230081 T>C,G), RS1158348237 (11:61221422 T>TCCC), RS1159196638 (11:61230759 A>G), RS1159313250 (11:61221813 C>A,T), RS1159616288 (11:61221055 G>A), RS1159681231 (11:61231335 C>G,T), RS11607442 (11:61220859 G>T), RS1161035083 (11:61232043 C>T), RS1162278516 (11:61230965 G>A)

Disease associations

OMIM: gene MIM:169720 | disease phenotypes: none listed

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total.

GWAS associations

0 associations.

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

Binding affinities (BindingDB)

5 measured of 5 human assays (5 total across all organisms); most potent 5 below. Values come from heterogeneous assays and are not directly comparable.

Ligand | Measure | Value
N-[(1Z)-amino[(3-hydroxypropyl)amino]methylidene]-2-[2-(2-chlorophenyl)-5-(4-propoxyphenyl)-1H-pyrrol-1-yl]acetamide | IC50 | 110 nM
2-[2-(2-chlorophenyl)-5-[4-(4-acetylphenoxy)phenyl]-1H-pyrrol-1-yl]-N-(diaminomethylidene)acetamide | IC50 | 160 nM
2-[2-(adamantan-1-yl)-5-phenyl-1H-pyrrol-1-yl]-N-[(1Z)-amino[(3-hydroxypropyl)amino]methylidene]acetamide | IC50 | 240 nM
2-[2-(adamantan-1-yl)-5-phenyl-1H-pyrrol-1-yl]-N-(diaminomethylidene)acetamide | IC50 | 600 nM
(N-(diaminomethylene)-2,4-diphenyl-1H-pyrrole-1-acetamide) | IC50 | 3700 nM
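
IC50 values in nM are often put on a log scale as pIC50 = -log10(IC50 in mol/L), which for a value in nM equals 9 - log10(value). A small sketch of that conversion applied to the five values above is given here; `pic50_from_nm` is a hypothetical helper name. The conversion does not remove the caveat that the assays are heterogeneous.

```python
import math


def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 (-log10 of the molar IC50)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x).
    return 9.0 - math.log10(ic50_nm)


# IC50 values (nM) copied from the BindingDB table above, most potent first.
IC50_NM = [110, 160, 240, 600, 3700]

for value in IC50_NM:
    print(f"{value} nM -> pIC50 {pic50_from_nm(value):.2f}")
```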

CTD chemical–gene interactions

4 total (human), top 4 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
sodium arsenite | decreases expression | 1
Lead | affects expression | 1
Tretinoin | increases expression | 1
Valproic Acid | increases methylation | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.
