COL21A1

gene
Summary

COL21A1 (collagen type XXI alpha 1 chain, HGNC:17025) is a protein-coding gene on chromosome 6p12.1, encoding Collagen alpha-1(XXI) chain (Q96P44).

This gene encodes the alpha chain of type XXI collagen, a member of the FACIT (fibril-associated collagens with interrupted helices) collagen family. Type XXI collagen is localized to tissues containing type I collagen and maintains the integrity of the extracellular matrix. Alternative splicing results in multiple transcript variants.

Source: NCBI Gene 81578 — RefSeq curated summary.

At a glance

  • GWAS associations: 19
  • Clinical variants (ClinVar): 168 total
  • MANE Select transcript: NM_030820

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:17025
Approved symbol | COL21A1
Name | collagen type XXI alpha 1 chain
Location | 6p12.1
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000124749
Ensembl biotype | protein_coding
OMIM | 610002
Entrez | 81578

Gene structure

Transcript identifiers

Ensembl transcripts: 11 — 4 protein_coding, 4 protein_coding_CDS_not_defined, 2 retained_intron, 1 nonsense_mediated_decay

ENST00000244728, ENST00000370817, ENST00000370819, ENST00000456983, ENST00000461489, ENST00000467045, ENST00000467216, ENST00000469682, ENST00000482933, ENST00000484987, ENST00000488912

RefSeq mRNA: 5 (MANE Select: NM_030820): NM_001318751, NM_001318752, NM_001318753, NM_001318754, NM_030820

CCDS: CCDS55025, CCDS83099

Canonical transcript exons

ENST00000244728 — 30 exons

Exon | Start | End
ENSE00001015717 | 56170649 | 56170865
ENSE00001015720 | 56168124 | 56168297
ENSE00001015725 | 56170960 | 56171128
ENSE00001015727 | 56179578 | 56180129
ENSE00001186177 | 56126096 | 56126149
ENSE00001186180 | 56141785 | 56141838
ENSE00001186183 | 56141930 | 56141983
ENSE00001186185 | 56156887 | 56156949
ENSE00001186188 | 56164423 | 56164506
ENSE00001252208 | 56182531 | 56182656
ENSE00001453702 | 56125567 | 56125620
ENSE00001453704 | 56166906 | 56166983
ENSE00001883254 | 56247387 | 56247580
ENSE00002182546 | 56101472 | 56101525
ENSE00002183966 | 56124239 | 56124292
ENSE00002201601 | 56124062 | 56124115
ENSE00002523383 | 56164814 | 56164822
ENSE00003490886 | 56060741 | 56060795
ENSE00003498814 | 56077529 | 56077573
ENSE00003515612 | 56061649 | 56061681
ENSE00003527705 | 56074232 | 56074285
ENSE00003544866 | 56056590 | 56057844
ENSE00003547541 | 56069046 | 56069117
ENSE00003555927 | 56075479 | 56075532
ENSE00003569606 | 56060891 | 56061037
ENSE00003603143 | 56060018 | 56060218
ENSE00003634065 | 56070745 | 56070798
ENSE00003659162 | 56059165 | 56059242
ENSE00003663651 | 56064578 | 56064622
ENSE00003681764 | 56067295 | 56067330
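
The exon rows above are listed by exon ID rather than by position. As a minimal sketch, the Python below orders a few of them along the chromosome and computes their lengths. It assumes the Start and End columns are 1-based inclusive coordinates (the usual Ensembl convention, not stated on this page); the helper name `exon_length` is illustrative only.

```python
# Three rows copied from the exon table above: (exon ID, start, end).
exons = [
    ("ENSE00001015717", 56170649, 56170865),
    ("ENSE00001015720", 56168124, 56168297),
    ("ENSE00001883254", 56247387, 56247580),
]


def exon_length(start: int, end: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Sort by genomic start so the rows read in ascending chromosome order.
by_position = sorted(exons, key=lambda row: row[1])

# Map each exon ID to its length in base pairs.
lengths = {exon_id: exon_length(start, end) for exon_id, start, end in by_position}

for exon_id, start, end in by_position:
    print(exon_id, start, end, lengths[exon_id])
```

The same two steps apply unchanged to all 30 rows.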

Expression profiles

Bgee: expression breadth ubiquitous, 239 present calls, max score 96.82.

FANTOM5 (CAGE): breadth broad, TPM avg 3.9160 / max 336.6201, expressed in 645 samples.

FANTOM5 promoters (9 alternative TSS)

Promoter ID | TPM avg | Samples expressed
74061 | 2.8197 | 546
74059 | 0.4451 | 233
74060 | 0.3128 | 164
74056 | 0.1902 | 35
74063 | 0.0625 | 11
74055 | 0.0460 | 9
74064 | 0.0163 | 8
74062 | 0.0136 | 9
74065 | 0.0099 | 5

Top tissues by expression

277 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
blood vessel layer | UBERON:0004797 | 96.82 | gold quality
bronchial epithelial cell | CL:0002328 | 95.76 | gold quality
sural nerve | UBERON:0015488 | 95.74 | gold quality
descending thoracic aorta | UBERON:0002345 | 95.36 | gold quality
ascending aorta | UBERON:0001496 | 94.98 | gold quality
thoracic aorta | UBERON:0001515 | 94.98 | gold quality
left ventricle myocardium | UBERON:0006566 | 93.43 | gold quality
right coronary artery | UBERON:0001625 | 93.11 | gold quality
skin of hip | UBERON:0001554 | 93.07 | gold quality
epithelium of bronchus | UBERON:0002031 | 93.04 | gold quality
periodontal ligament | UBERON:0008266 | 92.98 | gold quality
bronchus | UBERON:0002185 | 92.16 | gold quality
myocardium | UBERON:0002349 | 91.75 | gold quality
seminal vesicle | UBERON:0000998 | 90.42 | gold quality
left coronary artery | UBERON:0001626 | 89.95 | gold quality
heart left ventricle | UBERON:0002084 | 89.79 | gold quality
cardiac ventricle | UBERON:0002082 | 89.76 | gold quality
coronary artery | UBERON:0001621 | 89.59 | gold quality
placenta | UBERON:0001987 | 89.58 | gold quality
tibial nerve | UBERON:0001323 | 89.47 | gold quality
upper leg skin | UBERON:0004262 | 89.40 | gold quality
heart | UBERON:0000948 | 89.18 | gold quality
endocervix | UBERON:0000458 | 89.14 | gold quality
heart right ventricle | UBERON:0002080 | 89.11 | gold quality
aorta | UBERON:0000947 | 89.06 | gold quality
apex of heart | UBERON:0002098 | 88.90 | gold quality
calcaneal tendon | UBERON:0003701 | 88.87 | gold quality
cardiac muscle of right atrium | UBERON:0003379 | 88.69 | gold quality
right uterine tube | UBERON:0001302 | 88.54 | gold quality
mucosa of paranasal sinus | UBERON:0005030 | 87.88 | gold quality

Single-cell (SCXA)

Detected in 5 experiments; a significant marker in 4 of them.

Experiment | Marker? | Max mean expression
E-GEOD-124472 | yes | 745.85
E-HCAD-10 | yes | 59.42
E-MTAB-6701 | yes | 53.06
E-ANND-3 | yes | 8.93
E-MTAB-9543 | no | 1.22

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

87 targeting COL21A1, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-29A-3P | 100.00 | 73.11 | 1835
HSA-MIR-29B-3P | 100.00 | 73.18 | 1833
HSA-MIR-29C-3P | 100.00 | 73.15 | 1833
HSA-MIR-3646 | 100.00 | 73.56 | 5283
HSA-MIR-3163 | 100.00 | 77.23 | 8605
HSA-MIR-4789-3P | 99.99 | 70.75 | 2484
HSA-MIR-4803 | 99.98 | 71.99 | 3117
HSA-MIR-27A-3P | 99.98 | 72.13 | 2955
HSA-MIR-27B-3P | 99.98 | 72.13 | 2955
HSA-MIR-9985 | 99.98 | 72.11 | 2939
HSA-MIR-524-5P | 99.98 | 73.43 | 4882
HSA-MIR-520D-5P | 99.98 | 73.34 | 4883
HSA-MIR-3688-3P | 99.97 | 72.02 | 2834
HSA-MIR-302C-5P | 99.97 | 72.56 | 3642
HSA-MIR-5688 | 99.96 | 73.23 | 4504
HSA-MIR-495-3P | 99.96 | 72.81 | 4197
HSA-MIR-128-3P | 99.95 | 71.17 | 2484
HSA-MIR-216A-3P | 99.95 | 71.19 | 2505
HSA-MIR-767-5P | 99.95 | 70.85 | 993
HSA-LET-7C-3P | 99.95 | 73.42 | 2862
HSA-MIR-6835-3P | 99.93 | 70.49 | 2904
HSA-MIR-3143 | 99.93 | 71.96 | 3104
HSA-MIR-205-3P | 99.92 | 69.92 | 3165
HSA-MIR-10523-5P | 99.91 | 69.22 | 2038
HSA-MIR-498-3P | 99.91 | 71.27 | 1114
HSA-MIR-3529-3P | 99.90 | 73.55 | 3045
HSA-MIR-548E-5P | 99.89 | 72.73 | 4486
HSA-MIR-5682 | 99.89 | 72.56 | 1005
HSA-MIR-3681-3P | 99.88 | 70.46 | 2254
HSA-MIR-4496 | 99.88 | 68.89 | 2236
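
The note above says a lower target_count means a more specific miRNA. One way to use both columns together is sketched below in Python: rank by max score, break ties by target count, and separately keep only the miRNAs under a specificity cutoff. The cutoff of 2000 targets is a hypothetical value chosen for illustration, not a miRDB recommendation.

```python
# Rows copied from the miRDB table above: (miRNA, max score, total target count).
mirnas = [
    ("HSA-MIR-29A-3P", 100.00, 1835),
    ("HSA-MIR-3163", 100.00, 8605),
    ("HSA-MIR-767-5P", 99.95, 993),
]

# Hypothetical cutoff: call a miRNA "specific" if it targets fewer than 2000 genes.
MAX_TARGETS = 2000

# Keep only the miRNAs below the cutoff, preserving table order.
specific = [name for name, _, targets in mirnas if targets < MAX_TARGETS]

# Rank by confidence (descending), breaking ties by target count (ascending).
ranked = sorted(mirnas, key=lambda row: (-row[1], row[2]))

print(specific)
print([name for name, _, _ in ranked])
```

With these three rows, HSA-MIR-3163 ties for the top max score but drops out of the specific set because it targets 8605 genes.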

Literature-anchored findings (GeneRIF, showing 4)

  • Extracellular matrix proteins expression profiling in chemoresistant variants of the A2780 ovarian cancer cell line. (PMID:24804215)
  • Low copy number of Col21A1 is associated with nonsyndromic cleft lip and/or palate. (PMID:30924295)
  • Validating candidate biomarkers for different stages of non-alcoholic fatty liver disease. (PMID:32898995)
  • Prognostic significance of abnormal matrix collagen remodeling in colorectal cancer based on histologic and bioinformatics analysis. (PMID:32945508)

Cross-species orthologs

0 orthologs

Paralogs (37): COL9A2 (ENSG00000049089), COL23A1 (ENSG00000050767), COL11A1 (ENSG00000060718), COL17A1 (ENSG00000065618), COL5A3 (ENSG00000080573), COL4A4 (ENSG00000081052), COL16A1 (ENSG00000084636), COL9A3 (ENSG00000092758), COL20A1 (ENSG00000101203), COL1A1 (ENSG00000108821), COL9A1 (ENSG00000112280), COL7A1 (ENSG00000114270), COL5A1 (ENSG00000130635), COL4A2 (ENSG00000134871), COL2A1 (ENSG00000139219), COL6A1 (ENSG00000142156), COL6A2 (ENSG00000142173), EDA (ENSG00000158813), COL26A1 (ENSG00000160963), COL1A2 (ENSG00000164692), COL3A1 (ENSG00000168542), COL4A3 (ENSG00000169031), COL22A1 (ENSG00000169436), COL24A1 (ENSG00000171502), COL18A1 (ENSG00000182871), EMID1 (ENSG00000186998), COL4A1 (ENSG00000187498), COL4A5 (ENSG00000188153), COL25A1 (ENSG00000188517), COL27A1 (ENSG00000196739), COL13A1 (ENSG00000197467), COL4A6 (ENSG00000197565), COL11A2 (ENSG00000204248), COL5A2 (ENSG00000204262), COL15A1 (ENSG00000204291), COLQ (ENSG00000206561), COL28A1 (ENSG00000215018)

Protein

Protein identifiers

Collagen alpha-1(XXI) chain | Q96P44 (reviewed: Q96P44)

All UniProt accessions (5): A0A158RFW1, A6PVD9, Q96P44, H0Y4C9, H0YDH6

UniProt curated annotations

Subcellular location. Secreted. Extracellular space. Extracellular matrix. Cytoplasm.

Tissue specificity. Highly expressed in lymph node, jejunum, pancreas, stomach, trachea, testis, uterus and placenta; moderately expressed in brain, colon, lung, prostate, spinal cord, salivary gland and vascular smooth-muscle cells; and very weakly expressed in heart, liver, kidney, bone marrow, spleen, thymus, skeletal muscle, adrenal gland and peripheral leukocytes. Expression in heart was higher in the right ventricle and atrium than in the left ventricle and atrium.

Induction. Stimulated by PDGF/platelet-derived growth factor.

Similarity. Belongs to the fibril-associated collagens with interrupted helices (FACIT) family.

Isoforms (3)

UniProt ID | Names | Canonical?
Q96P44-1 | 1 | yes
Q96P44-2 | 2 |
Q96P44-3 | 3 |

RefSeq proteins (3): NP_001305680, NP_001305681, NP_110447* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR002035 | VWF_A | Domain
IPR008160 | Collagen | Repeat
IPR013320 | ConA-like_dom_sf | Homologous_superfamily
IPR036465 | vWFA_dom_sf | Homologous_superfamily
IPR048287 | TSPN-like_N | Domain
IPR050938 | Collagen_Structural_Proteins | Family

Pfam: PF00092, PF01391

UniProt features (37 total): domain 9, compositionally biased region 9, sequence variant 7, splice variant 4, sequence conflict 3, region of interest 2, signal peptide 1, chain 1, glycosylation site 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q96P44-F1 | 68.41 | 0.34
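
To read the pLDDT value in context, the Python sketch below maps a score to the confidence bands conventionally used by the AlphaFold Database (above 90 very high, 70 to 90 confident, 50 to 70 low, below 50 very low). These band edges are an assumption brought in from that convention, not something this page states, and the function name `plddt_band` is illustrative.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to the conventional AlphaFold DB band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


# Model-level pLDDT reported above for AF-Q96P44-F1.
model_plddt = 68.41
band = plddt_band(model_plddt)
print(band)
```

Under these bands the model-level score of 68.41 falls in the low band, even though the table reports a very-high fraction of 0.34.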

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Glycosylation sites (1): 62

Function

Pathways and Gene Ontology

Reactome pathways

2 pathways

ID | Pathway
R-HSA-1650814 | Collagen biosynthesis and modifying enzymes
R-HSA-8948216 | Collagen chain trimerization

MSigDB gene sets: 95 (showing top): WALLACE_PROSTATE_CANCER_RACE_UP, BROWNE_HCMV_INFECTION_8HR_UP, GOCC_COLLAGEN_TRIMER, LUCAS_HNF4A_TARGETS_UP, chr6p12, BLALOCK_ALZHEIMERS_DISEASE_UP, GOMF_EXTRACELLULAR_MATRIX_STRUCTURAL_CONSTITUENT, GOZGIT_ESR1_TARGETS_UP, SCHLOSSER_SERUM_RESPONSE_DN, RIGGI_EWING_SARCOMA_PROGENITOR_UP, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GOMF_STRUCTURAL_MOLECULE_ACTIVITY, FIGUEROA_AML_METHYLATION_CLUSTER_2_UP, FIGUEROA_AML_METHYLATION_CLUSTER_3_UP, FIGUEROA_AML_METHYLATION_CLUSTER_7_UP

GO Biological Process (0):

GO Molecular Function (1): extracellular matrix structural constituent conferring tensile strength (GO:0030020)

GO Cellular Component (6): extracellular region (GO:0005576), collagen trimer (GO:0005581), endoplasmic reticulum lumen (GO:0005788), cytosol (GO:0005829), extracellular matrix (GO:0031012), cytoplasm (GO:0005737)

Reactome top-level categories

Rollup of top-2 pathways:

Category | Pathways
Collagen formation | 1
Collagen biosynthesis and modifying enzymes | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
cellular anatomical structure | 3
extracellular matrix structural constituent | 1
protein-containing complex | 1
endoplasmic reticulum | 1
intracellular organelle lumen | 1
cytoplasm | 1
external encapsulating structure | 1
intracellular anatomical structure | 1

Protein interactions and networks

STRING

1074 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
COL21A1 | CNIH1 | O95406 | 599
COL21A1 | CNIH3 | Q8TBE1 | 545
COL21A1 | TFEB | P19484 | 482
COL21A1 | SEC16B | Q96JE7 | 467
COL21A1 | HSPG2 | P98160 | 458
COL21A1 | STK17A | Q9UEE5 | 447
COL21A1 | COL26A1 | Q96A83 | 442
COL21A1 | KHDRBS2 | Q5VWX1 | 409
COL21A1 | ADAMTS14 | Q8WXS8 | 405
COL21A1 | OR7A10 | O76100 | 380
COL21A1 | COL24A1 | Q17RW2 | 380
COL21A1 | CRTAP | O75718 | 375
COL21A1 | P3H1 | Q32P28 | 367
COL21A1 | VWA1 | Q6PCB0 | 359
COL21A1 | ITGBL1 | O95965 | 354
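
Because the scores above are reported ×1000, dividing by 1000 recovers the 0 to 1 confidence value. The Python sketch below does that and filters for partners at or above 400, which corresponds to STRING's conventional medium-confidence cutoff of 0.400; that cutoff is an assumption taken from STRING's usual presets, not something stated on this page.

```python
# Rows copied from the STRING table above: (partner, score ×1000).
rows = [
    ("CNIH1", 599),
    ("KHDRBS2", 409),
    ("OR7A10", 380),
]

# Assumed cutoff: STRING's conventional medium-confidence level, 0.400 (×1000).
MEDIUM_CONFIDENCE = 400


def as_fraction(score: int) -> float:
    """Convert a ×1000 integer score to a 0-1 confidence value."""
    return score / 1000


# Compare on the integer scale to avoid floating-point edge cases.
medium_or_better = [partner for partner, score in rows if score >= MEDIUM_CONFIDENCE]

for partner, score in rows:
    print(partner, as_fraction(score))
```

None of the listed partners reaches 700, the level STRING conventionally labels high confidence.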

IntAct

5 interactions, top by confidence:

A | B | Type | Score
COL21A1 | PLOD3 | psi-mi:"MI:0914"(association) | 0.530
COL21A1 | CFTR | psi-mi:"MI:0915"(physical association) | 0.370
COL21A1 | PLOD2 | psi-mi:"MI:0914"(association) | 0.350
C1QTNF1 | PLOD2 | psi-mi:"MI:0914"(association) | 0.350

BioGRID (10): COL21A1 (Affinity Capture-MS), COLGALT2 (Affinity Capture-MS), COL14A1 (Affinity Capture-MS), PLOD1 (Affinity Capture-MS), COL4A2 (Affinity Capture-MS), PLOD2 (Affinity Capture-MS), PLOD3 (Affinity Capture-MS), COL21A1 (PCA), COL21A1 (Affinity Capture-MS), COL21A1 (Affinity Capture-RNA)

ESM2 similar proteins: A0A060WQA3, A0MSJ1, A5PN28, A6NHN0, A8WR59, C0HLN2, C7DZK3, O35167, O35348, O76368, O88207, P08122, P08572, P12106, P12107, P12108, P13942, P20849, P20850, P20908, P20909, P25067, P25940, P53420, P83371, P98085, Q01955, Q03637, Q05722, Q07092, Q07643, Q0VF58, Q14031, Q14055, Q14993, Q17RW2, Q28083, Q30D77, Q32S24, Q4ZJM7

Diamond homologs: A2AX52, A6H584, A6NMZ7, A6QLN9, A8TX70, E7FF10, O00339, O08746, O42401, O75578, O89029, P05099, P05555, P11215, P12111, P15989, P20701, P20702, P34576, P51942, P61625, Q02388, Q13349, Q21281, Q21540, Q28902, Q3V0T4, Q63870, Q642A6, Q6PCB0, Q6UXI7, Q8C6K9, Q8NFW1, Q8R2Z5, Q90615, Q91145, Q923P0, Q95LI2, Q96P44, Q9P218

SIGNOR signaling

3 interactions.

A | Effect | B | Mechanism
COL21A1 | up-regulates activity | DDR1 | binding
COL21A1 | up-regulates activity | DDR2 | binding
COL21A1 | up-regulates activity | A2/b1 integrin | binding

Disease & clinical

Clinical variants and AI predictions

ClinVar

168 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 126
Likely benign | 9
Benign | 2
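
Since the per-class counts are floors, their sum is only a lower bound on the classified total. The short Python sketch below makes the arithmetic explicit. What the remainder consists of (other ClinVar classes, or counts truncated by the pagination cap) is not stated on this page, so the variable name `unaccounted` is deliberately neutral.

```python
# Figures copied from the ClinVar section above.
total_variants = 168
class_floors = {
    "Pathogenic": 0,
    "Likely pathogenic": 0,
    "Uncertain significance": 126,
    "Likely benign": 9,
    "Benign": 2,
}

# Lower bound on variants that fall into the five listed classes.
classified_floor = sum(class_floors.values())

# Variants not attributable to a listed class from these floors alone.
unaccounted = total_variants - classified_floor

print(classified_floor, unaccounted)
```

The listed floors sum to 137, leaving 31 of the 168 variants that cannot be assigned to a class from this table.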

Top pathogenic / likely-pathogenic (0)

SpliceAI

6378 predictions. Top by Δscore:

Variant | Effect | Δscore
6:56060152:CCA:C | acceptor_gain | 1.0000
6:56060153:CA:C | acceptor_gain | 1.0000
6:56060154:A:C | acceptor_gain | 1.0000
6:56060890:CGGG:C | donor_gain | 1.0000
6:56061681:CCTT:C | acceptor_gain | 1.0000
6:56062761:A:AC | donor_gain | 1.0000
6:56062762:C:CC | donor_gain | 1.0000
6:56069117:CCTA:C | acceptor_gain | 1.0000
6:56074284:CC:C | acceptor_gain | 1.0000
6:56074285:CC:C | acceptor_gain | 1.0000
6:56075477:A:AC | donor_gain | 1.0000
6:56075478:C:CC | donor_gain | 1.0000
6:56075478:CAGG:C | donor_gain | 1.0000
6:56075485:C:A | donor_gain | 1.0000
6:56077525:TTA:T | donor_loss | 1.0000
6:56077527:A:AC | donor_gain | 1.0000
6:56077527:AC:A | donor_gain | 1.0000
6:56077528:C:CA | donor_gain | 1.0000
6:56077528:CC:C | donor_gain | 1.0000
6:56077528:CCT:C | donor_gain | 1.0000
6:56077528:CCTT:C | donor_gain | 1.0000
6:56077570:TTCC:T | acceptor_gain | 1.0000
6:56077571:TCC:T | acceptor_gain | 1.0000
6:56077572:CC:C | acceptor_gain | 1.0000
6:56077572:CCC:C | acceptor_gain | 1.0000
6:56077573:CC:C | acceptor_gain | 1.0000
6:56077574:C:CC | acceptor_gain | 1.0000
6:56077574:C:T | acceptor_gain | 1.0000
6:56077574:CTAAA:C | acceptor_loss | 1.0000
6:56077575:T:A | acceptor_loss | 1.0000
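
The variant keys above appear to follow a chrom:pos:ref:alt layout. Assuming that reading, which is inferred from the rows rather than documented on this page, the Python sketch below splits a key into its fields and labels it as an SNV, insertion, or deletion by comparing allele lengths. The names `Variant`, `parse_variant`, and `variant_kind` are illustrative.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split an assumed 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(v: Variant) -> str:
    """Classify by allele length: equal lengths are substitutions."""
    if len(v.ref) == len(v.alt):
        return "SNV" if len(v.ref) == 1 else "MNV"
    return "deletion" if len(v.ref) > len(v.alt) else "insertion"


# Three keys copied from the SpliceAI table above.
for key in ("6:56060152:CCA:C", "6:56062761:A:AC", "6:56075485:C:A"):
    print(key, variant_kind(parse_variant(key)))
```

Most of the top-scoring rows are short insertions or deletions rather than single-base substitutions.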

AlphaMissense

6107 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
6:56179670:G:T | A183D | 0.999
6:56179982:A:T | V79D | 0.999
6:56180043:A:G | W59R | 0.999
6:56180043:A:T | W59R | 0.999
6:56179583:C:G | C212S | 0.998
6:56179584:A:T | C212S | 0.998
6:56179718:G:T | A167D | 0.998
6:56179787:T:A | D144V | 0.998
6:56179803:C:G | A139P | 0.998
6:56180084:G:A | S45F | 0.998
6:56180085:A:G | S45P | 0.998
6:56180090:T:A | D43V | 0.998
6:56180091:C:G | D43H | 0.998
6:56166955:C:T | C410Y | 0.997
6:56179583:C:T | C212Y | 0.997
6:56179682:A:G | L179P | 0.997
6:56179778:G:A | S147F | 0.997
6:56179793:A:G | L142P | 0.997
6:56179802:G:T | A139E | 0.997
6:56179988:C:T | G77E | 0.997
6:56179989:C:G | G77R | 0.997
6:56179989:C:T | G77R | 0.997
6:56180041:C:A | W59C | 0.997
6:56180041:C:G | W59C | 0.997
6:56166964:C:G | R407P | 0.996
6:56170822:A:G | S285P | 0.996
6:56171119:C:G | C217S | 0.996
6:56171120:A:G | C217R | 0.996
6:56171120:A:T | C217S | 0.996
6:56179584:A:G | C212R | 0.996

dbSNP variants (sampled 300 via entrez): RS1000000722 (6:56159366 T>G), RS1000021275 (6:56201542 C>A,T), RS1000024804 (6:56069162 A>C,G), RS1000031319 (6:56373133 G>A), RS1000033085 (6:56158933 T>C), RS1000033859 (6:56095172 T>C), RS1000047398 (6:56221578 A>G), RS1000049151 (6:56136176 C>T), RS1000079701 (6:56252833 C>T), RS1000083856 (6:56290044 T>G), RS1000101154 (6:56168915 T>C), RS1000107778 (6:56211901 C>G), RS1000118222 (6:56264990 C>G), RS1000126049 (6:56352747 A>G), RS1000132962 (6:56379744 A>G)

Disease associations

OMIM: gene MIM:610002 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

19 associations (top):

Study | Trait | p-value
GCST002084_8 | Allergic sensitization | 7.000000e-06
GCST002211_2 | Psychosis (atypical) | 2.000000e-07
GCST003518_100 | Daytime sleep phenotypes | 5.000000e-06
GCST003805_2 | Diastolic blood pressure response to hydrochlorothiazide in hypertension | 9.000000e-06
GCST003854_31 | Gut microbiota (functional units) | 2.000000e-08
GCST003998_25 | Joint mobility (Beighton score) | 9.000000e-07
GCST004746_12 | Small cell lung carcinoma | 8.000000e-06
GCST006022_8 | Pulse pressure | 1.000000e-09
GCST006230_10 | Pulse pressure | 5.000000e-08
GCST006979_292 | Heel bone mineral density | 2.000000e-17
GCST007096_251 | Pulse pressure | 4.000000e-19
GCST007205_9 | Schizophrenia | 5.000000e-06
GCST007267_217 | Systolic blood pressure | 3.000000e-09
GCST007268_27 | Diastolic blood pressure | 2.000000e-11
GCST007269_250 | Pulse pressure | 4.000000e-26
GCST007270_1 | Systolic blood pressure | 8.000000e-11
GCST007272_15 | Pulse pressure | 6.000000e-36
GCST007272_5 | Pulse pressure | 2.000000e-14
GCST010989_106 | Body size at age 10 | 6.000000e-09
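
The associations above span a wide range of p-values. As a minimal sketch, the Python below keeps only rows under the conventional genome-wide significance threshold of 5e-8. That threshold is a common GWAS convention brought in here as an assumption; this page does not apply or state any cutoff.

```python
# Rows copied from the GWAS table above: (study, trait, p-value as printed).
rows = [
    ("GCST007272_15", "Pulse pressure", "6.000000e-36"),
    ("GCST003854_31", "Gut microbiota (functional units)", "2.000000e-08"),
    ("GCST007205_9", "Schizophrenia", "5.000000e-06"),
]

# Assumed threshold: the conventional genome-wide significance level.
GENOME_WIDE = 5e-8

# float() parses the scientific-notation strings exactly as printed in the table.
significant = [(study, trait) for study, trait, p in rows if float(p) < GENOME_WIDE]

for study, trait in significant:
    print(study, trait)
```

Using a strict less-than comparison, a row reported at exactly 5.000000e-08 (such as GCST006230_10) would not pass; whether to include that boundary is a design choice.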

EFO canonical traits (10, from GWAS)

EFO ID | Trait name
EFO:0005298 | allergic sensitization measurement
EFO:0007828 | daytime rest measurement
EFO:0006945 | diastolic blood pressure change measurement
EFO:0007874 | gut microbiome measurement
EFO:0007905 | joint hypermobility measurement
EFO:0005763 | pulse pressure measurement
EFO:0009270 | heel bone mineral density
EFO:0006335 | systolic blood pressure
EFO:0006336 | diastolic blood pressure
EFO:0009819 | comparative body size at age 10, self-reported

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

49 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Valproic Acid | affects cotreatment, increases expression | 7
trichostatin A | increases expression, affects cotreatment | 3
entinostat | increases expression, affects cotreatment | 2
Nickel | decreases expression | 2
Phenylmercuric Acetate | affects cotreatment, increases expression | 2
Silicon Dioxide | decreases expression, increases expression | 2
Cadmium Chloride | increases abundance, increases expression | 2
methylmercuric chloride | decreases expression | 1
bisphenol A | affects methylation | 1
sodium arsenite | increases expression | 1
butyraldehyde | decreases expression | 1
ochratoxin A | increases expression | 1
2,3-bis(3’-hydroxybenzyl)butyrolactone | affects cotreatment, increases expression | 1
S-(1,2-dichlorovinyl)cysteine | decreases expression, decreases reaction | 1
CGP 52608 | affects binding, increases reaction | 1
4-(5-benzo(1,3)dioxol-5-yl-4-pyridin-2-yl-1H-imidazol-2-yl)benzamide | affects cotreatment, increases expression | 1
belinostat | increases expression | 1
abrine | decreases expression | 1
dorsomorphin | affects cotreatment, increases expression | 1
bisphenol S | affects cotreatment, increases expression | 1
incobotulinumtoxinA | increases expression | 1
Resveratrol | increases expression, affects cotreatment | 1
Arsenic Trioxide | affects cotreatment, decreases expression | 1
Vorinostat | increases expression | 1
Panobinostat | affects cotreatment, increases expression | 1
Acetaminophen | decreases expression | 1
Air Pollutants | decreases expression, increases abundance | 1
Arsenic | affects methylation | 1
Benzo(a)pyrene | affects methylation, increases methylation | 1
Cadmium | increases abundance, increases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.