CLGN

gene
Summary

CLGN (calmegin, HGNC:2060) is a protein-coding gene on chromosome 4q31.1 that encodes calmegin (UniProt O14967). The protein functions during spermatogenesis as a chaperone for a range of client proteins that are important for sperm adhesion onto the egg zona pellucida and for subsequent penetration of the zona pellucida.

Calmegin is a testis-specific endoplasmic reticulum chaperone protein. CLGN may play a role in spermatogenesis and infertility.

Source: NCBI Gene 1047 — RefSeq curated summary.

At a glance

  • GWAS associations: 8
  • Clinical variants (ClinVar): 104 total — 6 pathogenic, 1 likely-pathogenic
  • Druggable target: yes
  • MANE Select transcript: NM_004362

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:2060
Approved symbol | CLGN
Name | calmegin
Location | 4q31.1
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000153132
Ensembl biotype | protein_coding
OMIM | 601858
Entrez | 1047

Gene structure

Transcript identifiers

Ensembl transcripts: 10 — 10 protein_coding

ENST00000325617, ENST00000414773, ENST00000509477, ENST00000897460, ENST00000931054, ENST00000931055, ENST00000963782, ENST00000963783, ENST00000963784, ENST00000963785

RefSeq mRNA (2): NM_001130675, NM_004362 (MANE Select: NM_004362)

CCDS: CCDS3751

Canonical transcript exons

ENST00000325617 — 15 exons

Exon | Start | End
ENSE00001008977 | 140392219 | 140392378
ENSE00001008979 | 140396092 | 140396205
ENSE00001008981 | 140398851 | 140399040
ENSE00001008982 | 140410553 | 140410626
ENSE00001008984 | 140400357 | 140400549
ENSE00001008985 | 140412935 | 140413087
ENSE00001008987 | 140392586 | 140392711
ENSE00001008988 | 140405942 | 140406083
ENSE00001008990 | 140395819 | 140395969
ENSE00001008991 | 140401985 | 140402066
ENSE00001008993 | 140409837 | 140409895
ENSE00001008994 | 140393826 | 140394041
ENSE00001008995 | 140390628 | 140390728
ENSE00002059656 | 140427537 | 140427648
ENSE00003843574 | 140388453 | 140389304
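
The exon rows above are not listed in genomic order. As a minimal sketch, assuming the Start and End columns are 1-based inclusive genomic coordinates (the usual Ensembl convention; the page states neither the assembly nor the strand), the following Python sorts the exons by start position and computes their lengths. The names `EXONS`, `exon_length`, `exons_by_start`, and `total_exonic_bp` are illustrative and not part of any source API.

```python
# Exon rows copied from the table above: (exon ID, start, end).
EXONS = [
    ("ENSE00001008977", 140392219, 140392378),
    ("ENSE00001008979", 140396092, 140396205),
    ("ENSE00001008981", 140398851, 140399040),
    ("ENSE00001008982", 140410553, 140410626),
    ("ENSE00001008984", 140400357, 140400549),
    ("ENSE00001008985", 140412935, 140413087),
    ("ENSE00001008987", 140392586, 140392711),
    ("ENSE00001008988", 140405942, 140406083),
    ("ENSE00001008990", 140395819, 140395969),
    ("ENSE00001008991", 140401985, 140402066),
    ("ENSE00001008993", 140409837, 140409895),
    ("ENSE00001008994", 140393826, 140394041),
    ("ENSE00001008995", 140390628, 140390728),
    ("ENSE00002059656", 140427537, 140427648),
    ("ENSE00003843574", 140388453, 140389304),
]


def exon_length(start, end):
    """Length in bp, assuming 1-based inclusive coordinates."""
    return end - start + 1


def exons_by_start(exons):
    """Return the exons sorted by ascending genomic start."""
    return sorted(exons, key=lambda exon: exon[1])


def total_exonic_bp(exons):
    """Sum of exon lengths (UTRs included, introns excluded)."""
    return sum(exon_length(start, end) for _, start, end in exons)


for exon_id, start, end in exons_by_start(EXONS):
    print(exon_id, start, end, exon_length(start, end))
```

Whether the exon with the lowest start coordinate is the first or the last exon of the transcript depends on the strand, which this section does not give.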

Expression profiles

Bgee: expression breadth ubiquitous, 207 present calls, max score 97.25.

FANTOM5 (CAGE): breadth ubiquitous, TPM avg 8.9418 / max 634.6791, expressed in 995 samples.

FANTOM5 promoters (4 alternative TSS)

Promoter ID | TPM avg | Samples expressed
54122 | 7.6455 | 964
54121 | 1.0065 | 381
54124 | 0.2247 | 78
54123 | 0.0651 | 13

Top tissues by expression

275 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
heart right ventricle | UBERON:0002080 | 97.25 | gold quality
left ventricle myocardium | UBERON:0006566 | 95.59 | gold quality
myocardium | UBERON:0002349 | 95.45 | gold quality
left testis | UBERON:0004533 | 94.79 | gold quality
right testis | UBERON:0004534 | 94.50 | gold quality
cardiac muscle of right atrium | UBERON:0003379 | 94.34 | gold quality
testis | UBERON:0000473 | 93.03 | gold quality
cardiac ventricle | UBERON:0002082 | 92.11 | gold quality
heart left ventricle | UBERON:0002084 | 92.07 | gold quality
cardiac atrium | UBERON:0002081 | 91.92 | gold quality
right atrium auricular region | UBERON:0006631 | 91.74 | gold quality
endothelial cell | CL:0000115 | 89.58 | gold quality
islet of Langerhans | UBERON:0000006 | 89.17 | gold quality
apex of heart | UBERON:0002098 | 88.20 | gold quality
heart | UBERON:0000948 | 87.60 | gold quality
pigmented layer of retina | UBERON:0001782 | 86.38 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 84.19 | gold quality
ventricular zone | UBERON:0003053 | 83.34 | gold quality
right uterine tube | UBERON:0001302 | 83.02 | gold quality
pituitary gland | UBERON:0000007 | 82.36 | gold quality
adenohypophysis | UBERON:0002196 | 82.06 | gold quality
ganglionic eminence | UBERON:0004023 | 80.14 | gold quality
choroid plexus epithelium | UBERON:0003911 | 79.59 | gold quality
bronchial epithelial cell | CL:0002328 | 79.22 | gold quality
prostate gland | UBERON:0002367 | 77.98 | gold quality
primordial germ cell in gonad | CL:0000670 ∩ UBERON:0000991 | 77.61 | gold quality
cerebellar cortex | UBERON:0002129 | 77.60 | gold quality
cerebellar hemisphere | UBERON:0002245 | 77.55 | gold quality
embryo | UBERON:0000922 | 77.23 | gold quality
adult organism | UBERON:0007023 | 76.40 | gold quality

Single-cell (SCXA)

Detected in 3 experiments; a significant marker in 1.

Experiment | Marker? | Max mean expression
E-MTAB-10287 | yes | 25.16
E-MTAB-6058 | no | 390.67
E-ANND-3 | no | 3.31

Regulation

Is transcription factor: no

Upstream regulators (CollecTRI, top): TCFL5

miRNA regulators (miRDB)

48 targeting CLGN, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-548AJ-3P | 99.96 | 73.38 | 5345
HSA-MIR-548X-3P | 99.96 | 73.38 | 5345
HSA-MIR-548J-3P | 99.94 | 72.61 | 4881
HSA-MIR-548AE-3P | 99.93 | 72.66 | 4867
HSA-MIR-548AH-3P | 99.93 | 72.54 | 4872
HSA-MIR-548AM-3P | 99.93 | 72.54 | 4872
HSA-MIR-548AQ-3P | 99.93 | 72.66 | 4867
HSA-MIR-153-5P | 99.89 | 73.86 | 6317
HSA-MIR-221-3P | 99.86 | 71.56 | 1329
HSA-MIR-222-3P | 99.86 | 71.35 | 1337
HSA-MIR-4503 | 99.85 | 71.45 | 1869
HSA-MIR-4799-5P | 99.82 | 70.60 | 2663
HSA-MIR-4495 | 99.82 | 72.08 | 3080
HSA-MIR-2052 | 99.79 | 69.37 | 2031
HSA-MIR-4766-5P | 99.75 | 69.23 | 2662
HSA-MIR-3059-5P | 99.70 | 69.93 | 2491
HSA-MIR-6892-3P | 99.68 | 66.40 | 1178
HSA-MIR-409-3P | 99.50 | 66.33 | 1192
HSA-MIR-12117 | 99.50 | 67.57 | 868
HSA-MIR-190B-3P | 99.33 | 68.29 | 1382
HSA-MIR-1273H-3P | 99.29 | 67.55 | 980
HSA-MIR-548AS-3P | 99.12 | 69.12 | 2294
HSA-MIR-29A-5P | 99.08 | 68.59 | 1813
HSA-MIR-143-5P | 98.98 | 68.87 | 946
HSA-MIR-3908 | 98.75 | 67.31 | 1160
HSA-MIR-11399 | 98.71 | 65.69 | 869
HSA-MIR-3944-5P | 98.50 | 67.55 | 997
HSA-MIR-4766-3P | 98.48 | 67.94 | 1347
HSA-MIR-6780A-3P | 98.42 | 67.49 | 1518
HSA-MIR-654-3P | 98.38 | 67.61 | 905

Literature-anchored findings (GeneRIF, showing 3)

  • calmegin expression is regulated by histone deacetylase and CpG methyltransferase in a coordinative way (PMID:16264275)
  • Although their similarity in carbohydrate binding specificities is high, there seems to be some differences in the mode of substrate recognition between calmegin and calnexin. (PMID:24769397)
  • In gene-based analysis CLGN1 represented one of a cluster of genes, which interacted with sodium to influence Blood Pressure. (PMID:27271309)

Cross-species orthologs

6 orthologs

Organism | Symbol | Gene ID
danio_rerio | clgn | ENSDARG00000009315
mus_musculus | Clgn | ENSMUSG00000002190
rattus_norvegicus | Clgn | ENSRNOG00000003755
drosophila_melanogaster | Cnx99A | FBGN0015622
drosophila_melanogaster | CG1924 | FBGN0030377
caenorhabditis_elegans | | WBGENE00000567

Paralogs (3): CANX (ENSG00000127022), CALR (ENSG00000179218), CALR3 (ENSG00000269058)

Protein

Protein identifiers

Calmegin: O14967 (reviewed: O14967)

All UniProt accessions (3): O14967, A0A140VKG2, D6RAZ4

UniProt curated annotations

Function. Functions during spermatogenesis as a chaperone for a range of client proteins that are important for sperm adhesion onto the egg zona pellucida and for subsequent penetration of the zona pellucida. Required for normal sperm migration from the uterus into the oviduct. Required for normal male fertility. Binds calcium ions.

Subunit / interactions. Interacts with PPIB. Interacts with ADAM2. Interacts with PDILT.

Subcellular location. Endoplasmic reticulum membrane.

Tissue specificity. Detected in testis (at protein level).

Similarity. Belongs to the calreticulin family.

Isoforms (2)

UniProt ID | Names | Canonical?
O14967-1 | 1 | yes
O14967-2 | 2 |

RefSeq proteins (2): NP_001124147, NP_004353* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR001580 | Calret/calnex | Family
IPR009033 | Calreticulin/calnexin_P_dom_sf | Homologous_superfamily
IPR013320 | ConA-like_dom_sf | Homologous_superfamily
IPR018124 | Calret/calnex_CS | Conserved_site

Pfam: PF00262

UniProt features (37 total): repeat 8, modified residue 8, compositionally biased region 5, region of interest 3, sequence variant 3, topological domain 2, disulfide bond 2, splice variant 2, signal peptide 1, chain 1, transmembrane region 1, sequence conflict 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-O14967-F1 | 76.36 | 0.54

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Post-translational modifications (8): 128, 560, 576, 579, 581, 591, 594, 601

Disulfide bonds (2): 151–185, 351–355

Function

Pathways and Gene Ontology

Reactome pathways

0 pathways

MSigDB gene sets: 150 (showing top): GOBP_SINGLE_FERTILIZATION, GOBP_RESPONSE_TO_NITROGEN_COMPOUND, MODULE_169, GOBP_RESPONSE_TO_ENDOPLASMIC_RETICULUM_STRESS, GOBP_MACROMOLECULE_CATABOLIC_PROCESS, PATIL_LIVER_CANCER, GOBP_PROTEIN_MATURATION, SHETH_LIVER_CANCER_VS_TXNIP_LOSS_PAM5, GAZDA_DIAMOND_BLACKFAN_ANEMIA_PROGENITOR_DN, GOBP_SPERM_EGG_RECOGNITION, RIGGI_EWING_SARCOMA_PROGENITOR_DN, MODULE_99, GOBP_PROTEIN_FOLDING, SANSOM_APC_TARGETS_DN, GOBP_PROTEASOMAL_PROTEIN_CATABOLIC_PROCESS

GO Biological Process (5): protein folding (GO:0006457), single fertilization (GO:0007338), binding of sperm to zona pellucida (GO:0007339), ERAD pathway (GO:0036503), protein-containing complex assembly (GO:0065003)

GO Molecular Function (4): calcium ion binding (GO:0005509), protein folding chaperone (GO:0044183), obsolete unfolded protein binding (GO:0051082), protein binding (GO:0005515)

GO Cellular Component (4): nuclear envelope (GO:0005635), endoplasmic reticulum (GO:0005783), endoplasmic reticulum membrane (GO:0005789), membrane (GO:0016020)

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
endomembrane system | 2
cellular process | 1
protein maturation | 1
fertilization | 1
sperm-egg recognition | 1
proteasomal protein catabolic process | 1
response to endoplasmic reticulum stress | 1
response to chemical | 1
cellular component assembly | 1
protein-containing complex organization | 1
metal ion binding | 1
molecular_function | 1
protein folding | 1
binding | 1
nucleus | 1
organelle envelope | 1
cytoplasm | 1
intracellular membrane-bounded organelle | 1
organelle membrane | 1
nuclear outer membrane-endoplasmic reticulum membrane network | 1
endoplasmic reticulum subcompartment | 1
cellular anatomical structure | 1

Protein interactions and networks

STRING

1713 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
CLGN | ADAM2 | P78326 | 962
CLGN | PDILT | Q8N807 | 659
CLGN | IZUMO1 | Q8IYV9 | 559
CLGN | PPIB | P23284 | 548
CLGN | DEFB126 | Q9BYW3 | 486
CLGN | TEX101 | Q9BY14 | 485
CLGN | RNASE10 | Q5GAN6 | 477
CLGN | UGGT1 | Q9NYU2 | 475
CLGN | DCST2 | Q5T1A1 | 472
CLGN | LY6K | Q17RY6 | 451
CLGN | TEKT3 | Q9BXF9 | 447
CLGN | ENKUR | Q8TC29 | 443
CLGN | PRSS37 | A4D1T9 | 436
CLGN | PDIA3 | P30101 | 434
CLGN | PGAP1 | Q75T13 | 411

IntAct

234 interactions, top by confidence:

A | B | Type | Score
TCTN2 | CLGN | psi-mi:“MI:0914”(association) | 0.780
CFTR | ESYT2 | psi-mi:“MI:2364”(proximity) | 0.710
B3GNT3 | PGRMC1 | psi-mi:“MI:0914”(association) | 0.670
SLC12A2 | CLGN | psi-mi:“MI:0914”(association) | 0.640
SLC39A5 | FAM171A2 | psi-mi:“MI:0914”(association) | 0.640
CANX | PGRMC1 | psi-mi:“MI:0914”(association) | 0.570
INSR | PIK3R2 | psi-mi:“MI:2364”(proximity) | 0.570
SLC12A4 | CLGN | psi-mi:“MI:0914”(association) | 0.530
TOR1A | CLGN | psi-mi:“MI:0914”(association) | 0.530
TOR1B | CLGN | psi-mi:“MI:0914”(association) | 0.530
IGSF8 | CLGN | psi-mi:“MI:0914”(association) | 0.530
SMPD1 | CLGN | psi-mi:“MI:0914”(association) | 0.530
ATP6AP2 | CLGN | psi-mi:“MI:0914”(association) | 0.530
PIEZO1 | CLGN | psi-mi:“MI:0914”(association) | 0.530
POGLUT1 | CLGN | psi-mi:“MI:0914”(association) | 0.530
TMEM237 | CLGN | psi-mi:“MI:0914”(association) | 0.530
TMTC4 | CLGN | psi-mi:“MI:0914”(association) | 0.530
SCARB2 | CLGN | psi-mi:“MI:0914”(association) | 0.530
GPC3 | CLGN | psi-mi:“MI:0914”(association) | 0.530
SLC22A5 | CLGN | psi-mi:“MI:0914”(association) | 0.530
MBTPS1 | CLGN | psi-mi:“MI:0914”(association) | 0.530
GALNS | CLGN | psi-mi:“MI:0914”(association) | 0.530
CLGN | NPC1 | psi-mi:“MI:0914”(association) | 0.530
OGFOD3 | CLGN | psi-mi:“MI:0914”(association) | 0.530
NCEH1 | CLGN | psi-mi:“MI:0914”(association) | 0.530

BioGRID (350): CLGN (Affinity Capture-MS), CLGN (Affinity Capture-MS), CLGN (Affinity Capture-MS), CLGN (Affinity Capture-MS), CLGN (Affinity Capture-MS), ABCC1 (Co-fractionation), CALU (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation), CLGN (Co-fractionation)

ESM2 similar proteins: A0A0D1C6P2, A8XA40, D4AVD4, E2RA18, J9VLH0, O04151, O04153, O14967, O18750, O81919, O82709, P08110, P11012, P14625, P24643, P27798, P27824, P29402, P29413, P34652, P35564, P35565, P36581, P41148, P52194, P83003, P93508, Q06814, Q23858, Q29092, Q2TBR8, Q38798, Q38858, Q39817, Q39994, Q3SYT6, Q40401, Q4R520, Q5R440, Q5R6F7

Diamond homologs: A0A0D1C6P2, A8XA40, D4AVD4, E2RA18, J9VLH0, O04151, O04153, O14967, O81919, O82709, P11012, P14211, P15253, P18418, P24643, P27797, P27798, P27824, P27825, P28491, P29402, P29413, P34652, P35564, P35565, P36581, P52193, P52194, P83003, P93508, Q06814, Q23858, Q2HWU3, Q2TBR8, Q38798, Q38858, Q39817, Q39994, Q3SYT6, Q40401

SIGNOR signaling

0 interactions.

Enriched among interaction partners

Reactome pathways and GO biological processes over-represented among this gene’s 221 IntAct physical interaction partners (hypergeometric vs the genome-wide background, BH-FDR, gene-set size 15–500, ranked by fold). A functional readout of the neighbourhood — distinct from this gene’s own memberships above, and biased toward well-studied / hub proteins, so read it as themes rather than proof.

Reactome pathways:

Pathway | Partners | Fold | FDR
Glycosphingolipid metabolism | 7 | 13.8× | 1e-04
HS-GAG biosynthesis | 5 | 11.3× | 6e-03
Glycosaminoglycan metabolism | 7 | 10.1× | 6e-04
Sphingolipid metabolism | 7 | 7.7× | 3e-03
R-HSA-425393 | 9 | 7.6× | 4e-04
Metabolism of carbohydrates and carbohydrate derivatives | 9 | 7.1× | 6e-04
SLC-mediated transmembrane transport | 16 | 6.2× | 4e-06
Transport of small molecules | 21 | 3.5× | 1e-04
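
The enrichment statistics described for this table (hypergeometric test against a background, fold enrichment, BH-FDR) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline that produced the table. The function names are illustrative, and the background size and pathway size in the example call are hypothetical placeholders, since the page reports only the partner counts and the 221-partner set.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric.

    N: background size, K: pathway size, n: partner-set size,
    k: partners observed in the pathway.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def fold_enrichment(k, N, K, n):
    """Observed fraction of partners in the pathway over the expected fraction."""
    return (k / n) / (K / N)


def bh_fdr(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: 7 of 221 partners fall in a pathway of 40 genes,
# against an assumed background of 20000 genes.
p = hypergeom_sf(7, 20000, 40, 221)
fold = fold_enrichment(7, 20000, 40, 221)
print(f"p = {p:.3g}, fold = {fold:.1f}x")
```

In practice the adjustment would be applied once across the p-values of all tested gene sets, not per pathway.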

Disease & clinical

Clinical variants and AI predictions

ClinVar

104 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 6
Likely pathogenic | 1
Uncertain significance | 84
Likely benign | 4
Benign | 1

Top pathogenic / likely-pathogenic (7)

Variant ID | HGVS | Classification
1340411 | GRCh37/hg19 4q28.3-31.21(chr4:136529470-141564812)x1 | Pathogenic
3062804 | GRCh37/hg19 4q27-35.2(chr4:123399154-190957473)x3 | Pathogenic
3391884 | GRCh37/hg19 4q28.3-31.21(chr4:137222514-142805472)x1 | Pathogenic
3391885 | GRCh37/hg19 4q28.3-31.21(chr4:137942896-145815498)x1 | Pathogenic
4682609 | GRCh37/hg19 4q28.3-31.21(chr4:134952803-144712496)x1 | Pathogenic
814609 | GRCh37/hg19 4q28.3-31.21(chr4:137901978-141527647)x1 | Pathogenic
599571 | NM_004362.3(CLGN):c.959A>G (p.Lys320Arg) | Likely pathogenic

SpliceAI

2482 predictions. Top by Δscore:

Variant | Effect | Δscore
4:140390649:A:AC | donor_gain | 1.0000
4:140390650:T:C | donor_gain | 1.0000
4:140390665:T:A | donor_gain | 1.0000
4:140390726:CTT:C | acceptor_gain | 1.0000
4:140390729:C:CC | acceptor_gain | 1.0000
4:140390743:T:C | acceptor_gain | 1.0000
4:140390743:T:TC | acceptor_gain | 1.0000
4:140390746:T:C | acceptor_gain | 1.0000
4:140390746:T:TC | acceptor_gain | 1.0000
4:140392214:CTTAC:C | donor_loss | 1.0000
4:140392215:TTAC:T | donor_loss | 1.0000
4:140392217:A:T | donor_loss | 1.0000
4:140392218:C:A | donor_loss | 1.0000
4:140393890:T:TA | donor_gain | 1.0000
4:140394040:CC:C | acceptor_gain | 1.0000
4:140394041:CC:C | acceptor_gain | 1.0000
4:140394042:C:CA | acceptor_loss | 1.0000
4:140394043:T:G | acceptor_loss | 1.0000
4:140395813:TGTTA:T | donor_loss | 1.0000
4:140395814:GTTAC:G | donor_loss | 1.0000
4:140395815:TTAC:T | donor_loss | 1.0000
4:140395816:TA:T | donor_loss | 1.0000
4:140395817:A:AT | donor_loss | 1.0000
4:140395818:C:G | donor_loss | 1.0000
4:140395965:CATTC:C | acceptor_gain | 1.0000
4:140395967:TTC:T | acceptor_gain | 1.0000
4:140395971:T:C | acceptor_gain | 1.0000
4:140395971:T:TC | acceptor_gain | 1.0000
4:140395976:A:AC | acceptor_gain | 1.0000
4:140395976:A:C | acceptor_gain | 1.0000
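
The variant keys in the table above appear to follow a chromosome:position:reference:alternate layout. That reading is an assumption, since the page does not define the format. Under that assumption, a short Python sketch can split a key and label it as an SNV, insertion, or deletion by comparing allele lengths. `Variant`, `parse_variant`, and `variant_type` are illustrative names, not part of any tool referenced here.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key):
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into its fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(variant):
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(variant.ref) == 1 and len(variant.alt) == 1:
        return "SNV"
    if len(variant.ref) < len(variant.alt):
        return "insertion"
    if len(variant.ref) > len(variant.alt):
        return "deletion"
    return "substitution"  # equal-length multi-base change


# Keys taken from the table above.
for key in ("4:140390649:A:AC", "4:140390650:T:C", "4:140392214:CTTAC:C"):
    v = parse_variant(key)
    print(key, v.pos, variant_type(v))
```

The same parser applies to the AlphaMissense variant column, where every listed key is a single-base change.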

AlphaMissense

4098 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
4:140394035:A:G | W386R | 0.993
4:140394035:A:T | W386R | 0.993
4:140398901:C:A | W278C | 0.991
4:140398901:C:G | W278C | 0.991
4:140398903:A:G | W278R | 0.991
4:140398903:A:T | W278R | 0.991
4:140402015:T:A | K157N | 0.991
4:140402015:T:G | K157N | 0.991
4:140394033:C:A | W386C | 0.990
4:140394033:C:G | W386C | 0.990
4:140395854:A:G | W372R | 0.990
4:140395854:A:T | W372R | 0.990
4:140395894:C:A | W358C | 0.990
4:140395894:C:G | W358C | 0.990
4:140396150:A:G | W314R | 0.990
4:140396150:A:T | W314R | 0.990
4:140406078:A:G | W95R | 0.990
4:140406078:A:T | W95R | 0.990
4:140395916:C:G | C351S | 0.989
4:140395917:A:T | C351S | 0.989
4:140396148:C:A | W314C | 0.989
4:140396148:C:G | W314C | 0.989
4:140400359:A:G | L231P | 0.989
4:140406000:C:G | A121P | 0.989
4:140395896:A:G | W358R | 0.988
4:140395896:A:T | W358R | 0.988
4:140395904:C:G | C355S | 0.988
4:140395905:A:T | C355S | 0.988
4:140393947:A:G | L415P | 0.987
4:140400511:A:C | F180L | 0.987

dbSNP variants (sampled 300 via entrez): RS1000180732 (4:140413150 T>C), RS1000270046 (4:140405372 G>T), RS1000359321 (4:140424198 T>G), RS1000374276 (4:140423780 T>C), RS1000374823 (4:140413363 C>G,T), RS1000594790 (4:140392730 T>C), RS1000633192 (4:140408239 C>T), RS1000652122 (4:140425774 G>A), RS1000709153 (4:140406759 G>A), RS1000794498 (4:140395126 A>G), RS1000878125 (4:140425589 A>G), RS1000929731 (4:140419032 G>C), RS1001107324 (4:140401596 A>C), RS10011171 (4:140420114 C>A,T), RS10012138 (4:140421979 A>C,G)

Disease associations

OMIM: gene MIM:601858 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

8 associations (top):

Study | Trait | p-value
GCST000083_5 | Select biomarker traits | 1.000000e-06
GCST006111_1 | Diastolic blood pressure x sodium interaction (2df test) | 3.000000e-22
GCST006113_1 | Systolic blood pressure x sodium interaction (2df test) | 4.000000e-12
GCST006114_1 | Mean arterial pressure x sodium interaction (2df test) | 3.000000e-15
GCST010725_4 | Malaria | 4.000000e-10
GCST010725_84 | Malaria | 7.000000e-11
GCST010725_89 | Malaria | 7.000000e-11
GCST011742_48 | Triglyceride levels in HIV infection | 4.000000e-06

EFO canonical traits (6, from GWAS)

EFO ID | Trait name
EFO:0004570 | bilirubin measurement
EFO:0006336 | diastolic blood pressure
EFO:0009282 | sodium measurement
EFO:0006335 | systolic blood pressure
EFO:0006340 | mean arterial pressure
EFO:0004530 | triglyceride measurement

Drugs & pharmacology

Drug and pharmacology data

Is drug target: yes

ChEMBL targets (1): CHEMBL4295653 (SINGLE PROTEIN)

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

88 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Valproic Acid | affects expression, increases expression | 4
Cyclosporine | increases expression | 4
bisphenol A | affects expression, increases expression | 3
trichostatin A | affects cotreatment, decreases expression, increases expression | 3
sodium arsenite | affects cotreatment, decreases expression, increases abundance, increases expression | 2
perfluorooctane sulfonic acid | decreases expression, increases expression | 2
(+)-JQ1 compound | increases expression | 2
Vorinostat | decreases expression, increases expression, affects cotreatment | 2
Panobinostat | affects cotreatment, decreases expression | 2
Air Pollutants | increases abundance, increases expression, decreases expression | 2
Phenylmercuric Acetate | affects cotreatment, decreases expression | 2
Tretinoin | decreases expression | 2
8-Bromo Cyclic Adenosine Monophosphate | increases expression | 2
Particulate Matter | increases abundance, increases expression, decreases expression | 2
dicrotophos | decreases expression | 1
methylmercuric chloride | decreases expression | 1
3,4-dichloroaniline | increases expression | 1
sulforaphane | increases expression | 1
tris(1,3-dichloro-2-propyl)phosphate | increases expression | 1
butyraldehyde | increases expression | 1
perfluorooctanoic acid | increases expression | 1
manganese chloride | decreases expression, increases abundance, affects cotreatment | 1
potassium chromate(VI) | increases expression | 1
bicalutamide | increases expression | 1
2,3-dimethoxy-1,4-naphthoquinone | decreases expression | 1
perfluoro-n-nonanoic acid | increases expression | 1
2-palmitoylglycerol | increases expression | 1
entinostat | increases expression | 1
K 7174 | increases expression | 1
4-(5-benzo(1,3)dioxol-5-yl-4-pyridin-2-yl-1H-imidazol-2-yl)benzamide | affects cotreatment, decreases expression | 1

ChEMBL screening assays

1 unique, capped per target: 1 binding

Representative assays (with source publication via chembl_document):

Assay ID | Type | Description | Source paper
CHEMBL4118299 | Binding | Inhibition of AurA in human SKCO1 cells at 1 uM after 4 hrs using biotin labeled GKFGNVYLAR probe by mass-spectrometric analysis relative to control | Studies of TAK1-centered polypharmacology with novel covalent TAK1 inhibitors (Bioorg Med Chem)

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.
