COL28A1

gene

Summary

COL28A1 (collagen type XXVIII alpha 1 chain, HGNC:22442) is a protein-coding gene on chromosome 7p21.3 that encodes the collagen alpha-1(XXVIII) chain (UniProt Q2UY09). The protein may act as a cell-binding protein.

COL28A1 belongs to a class of collagens containing von Willebrand factor (VWF; MIM 613160) type A (VWFA) domains (Veit et al., 2006 [PubMed 16330543]).

Source: NCBI Gene 340267 — RefSeq curated summary.

At a glance

  • GWAS associations: 6
  • Clinical variants (ClinVar): 226 total
  • Druggable target: yes
  • MANE Select transcript: NM_001037763

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:22442
Approved symbol | COL28A1
Name | collagen type XXVIII alpha 1 chain
Location | 7p21.3
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000215018
Ensembl biotype | protein_coding
OMIM | 609996
Entrez | 340267

Gene structure

Transcript identifiers

Ensembl transcripts: 6 — 3 protein_coding, 2 nonsense_mediated_decay, 1 retained_intron

ENST00000399429, ENST00000430711, ENST00000435823, ENST00000444268, ENST00000453441, ENST00000465339

RefSeq mRNA: 1 (MANE Select: NM_001037763)

CCDS: CCDS43553

Canonical transcript exons

ENST00000399429 — 35 exons

Exon | Start | End
ENSE00001538220 | 7360390 | 7360528
ENSE00001538221 | 7370725 | 7370882
ENSE00001538223 | 7372998 | 7373546
ENSE00001538224 | 7375461 | 7375497
ENSE00001538228 | 7380660 | 7380695
ENSE00001538232 | 7380782 | 7380862
ENSE00001538243 | 7381544 | 7381612
ENSE00001538247 | 7417859 | 7417927
ENSE00001613460 | 7521905 | 7521961
ENSE00001676918 | 7517796 | 7517837
ENSE00001701085 | 7520062 | 7520115
ENSE00001757604 | 7532752 | 7532912
ENSE00001761944 | 7531348 | 7531904
ENSE00001788660 | 7524229 | 7524249
ENSE00001804117 | 7515814 | 7515840
ENSE00001817124 | 7357875 | 7358805
ENSE00001853410 | 7535750 | 7535873
ENSE00003461695 | 7474601 | 7474669
ENSE00003464806 | 7443585 | 7443653
ENSE00003473615 | 7477112 | 7477180
ENSE00003478325 | 7444418 | 7444489
ENSE00003536848 | 7419885 | 7419953
ENSE00003551502 | 7511091 | 7511135
ENSE00003571717 | 7440790 | 7440861
ENSE00003578523 | 7456044 | 7456112
ENSE00003587205 | 7452319 | 7452387
ENSE00003600737 | 7489389 | 7489457
ENSE00003620227 | 7506014 | 7506067
ENSE00003637911 | 7432473 | 7432541
ENSE00003649441 | 7436395 | 7436463
ENSE00003649603 | 7507117 | 7507161
ENSE00003662007 | 7432632 | 7432700
ENSE00003667154 | 7453440 | 7453508
ENSE00003669866 | 7437394 | 7437462
ENSE00003670445 | 7490578 | 7490646
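
Exon lengths follow directly from the Start and End columns. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention, which this page does not state) and uses the first three listed exons as sample input. The helper name exon_length is illustrative only.

```python
# Assumption: Start/End are 1-based, inclusive genomic coordinates
# (Ensembl convention); the page itself does not state this.
exons = [
    ("ENSE00001538220", 7360390, 7360528),
    ("ENSE00001538221", 7370725, 7370882),
    ("ENSE00001538223", 7372998, 7373546),
]


def exon_length(start: int, end: int) -> int:
    """Length in base pairs of an inclusive [start, end] interval."""
    return end - start + 1


# Map each exon ID to its length in base pairs.
lengths = {exon_id: exon_length(start, end) for exon_id, start, end in exons}

for exon_id, bp in lengths.items():
    print(exon_id, bp)
```

If the coordinates were instead 0-based and half-open, the "+ 1" would need to be dropped.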

Expression profiles

Bgee: expression breadth ubiquitous, 182 present calls, max score 98.03.

FANTOM5 (CAGE): breadth tissue_specific, TPM avg 0.3279 / max 54.2742, expressed in 86 samples.

FANTOM5 promoters (2 alternative TSS)

Promoter ID | TPM avg | Samples expressed
82677 | 0.1900 | 74
82676 | 0.1379 | 55

Top tissues by expression

241 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
sural nerve | UBERON:0015488 | 98.03 | gold quality
trigeminal ganglion | UBERON:0001675 | 95.13 | gold quality
tibial nerve | UBERON:0001323 | 94.07 | gold quality
oviduct epithelium | UBERON:0004804 | 93.72 | gold quality
dorsal root ganglion | UBERON:0000044 | 93.31 | gold quality
right uterine tube | UBERON:0001302 | 90.89 | gold quality
bronchial epithelial cell | CL:0002328 | 87.69 | gold quality
bronchus | UBERON:0002185 | 86.85 | gold quality
olfactory segment of nasal mucosa | UBERON:0005386 | 86.77 | gold quality
colonic epithelium | UBERON:0000397 | 86.19 | gold quality
parotid gland | UBERON:0001831 | 84.24 | gold quality
muscle layer of sigmoid colon | UBERON:0035805 | 84.21 | gold quality
adenohypophysis | UBERON:0002196 | 83.35 | gold quality
lower esophagus muscularis layer | UBERON:0035833 | 82.15 | gold quality
lower esophagus | UBERON:0013473 | 82.13 | gold quality
mucosa of paranasal sinus | UBERON:0005030 | 81.45 | gold quality
esophagogastric junction muscularis propria | UBERON:0035841 | 80.95 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 80.85 | silver quality
body of stomach | UBERON:0001161 | 80.63 | gold quality
body of pancreas | UBERON:0001150 | 80.48 | gold quality
fallopian tube | UBERON:0003889 | 80.45 | gold quality
pituitary gland | UBERON:0000007 | 78.79 | gold quality
stomach | UBERON:0000945 | 78.27 | gold quality
epithelium of nasopharynx | UBERON:0001951 | 78.02 | gold quality
metanephros cortex | UBERON:0010533 | 77.92 | gold quality
fundus of stomach | UBERON:0001160 | 77.64 | gold quality
gastrocnemius | UBERON:0001388 | 77.33 | gold quality
muscle of leg | UBERON:0001383 | 76.89 | gold quality
trachea | UBERON:0003126 | 76.19 | gold quality
apex of heart | UBERON:0002098 | 75.71 | gold quality

Single-cell (SCXA)

Detected in 3 experiment(s), a significant marker in 2.

Experiment | Marker? | Max mean expression
E-HCAD-11 | yes | 21.48
E-ANND-3 | yes | 7.75
E-MTAB-9543 | no | 1.22

Regulation

Is transcription factor: no

Literature-anchored findings (GeneRIF, showing 2)

  • Collagen XXVIII is a novel von Willebrand factor A domain-containing protein with many imperfections in the collagenous domain (PMID:16330543)
  • COL28 promotes proliferation, migration, and EMT of renal tubular epithelial cells. (PMID:36883360)

Cross-species orthologs

2 orthologs

Organism | Symbol | Gene ID
mus_musculus | Col28a1 | ENSMUSG00000068794
rattus_norvegicus | Col28a1 | ENSRNOG00000033618

Paralogs (37): COL9A2 (ENSG00000049089), COL23A1 (ENSG00000050767), COL11A1 (ENSG00000060718), COL17A1 (ENSG00000065618), COL5A3 (ENSG00000080573), COL4A4 (ENSG00000081052), COL16A1 (ENSG00000084636), COL9A3 (ENSG00000092758), COL20A1 (ENSG00000101203), COL1A1 (ENSG00000108821), COL9A1 (ENSG00000112280), COL7A1 (ENSG00000114270), COL21A1 (ENSG00000124749), COL5A1 (ENSG00000130635), COL4A2 (ENSG00000134871), COL2A1 (ENSG00000139219), COL6A1 (ENSG00000142156), COL6A2 (ENSG00000142173), EDA (ENSG00000158813), COL26A1 (ENSG00000160963), COL1A2 (ENSG00000164692), COL3A1 (ENSG00000168542), COL4A3 (ENSG00000169031), COL22A1 (ENSG00000169436), COL24A1 (ENSG00000171502), COL18A1 (ENSG00000182871), EMID1 (ENSG00000186998), COL4A1 (ENSG00000187498), COL4A5 (ENSG00000188153), COL25A1 (ENSG00000188517), COL27A1 (ENSG00000196739), COL13A1 (ENSG00000197467), COL4A6 (ENSG00000197565), COL11A2 (ENSG00000204248), COL5A2 (ENSG00000204262), COL15A1 (ENSG00000204291), COLQ (ENSG00000206561)

Protein

Protein identifiers

Collagen alpha-1(XXVIII) chain: Q2UY09 (reviewed: Q2UY09)

All UniProt accessions (5): A0A0C4DG66, A0A0C4DG72, H7BZU0, H7C3P2, Q2UY09

UniProt curated annotations

Function. May act as a cell-binding protein.

Subunit / interactions. Trimer or homomer. Secreted as a 135 kDa monomer under reducing conditions and as a homotrimer under non-reducing conditions.

Subcellular location. Secreted. Extracellular space. Extracellular matrix. Basement membrane.

Similarity. Belongs to the VWA-containing collagen family.

Isoforms (3)

UniProt ID | Names | Canonical?
Q2UY09-1 | 1 | yes
Q2UY09-2 | 2 | no
Q2UY09-3 | 3 | no

RefSeq proteins (1): NP_001032852* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR002035 | VWF_A | Domain
IPR002223 | Kunitz_BPTI | Domain
IPR008160 | Collagen | Repeat
IPR020901 | Prtase_inh_Kunz-CS | Conserved_site
IPR036465 | vWFA_dom_sf | Homologous_superfamily
IPR036880 | Kunitz_BPTI_sf | Homologous_superfamily
IPR050149 | Collagen_superfamily | Family

Pfam: PF00014, PF00092, PF01391

UniProt features (32 total): domain 9, sequence variant 7, compositionally biased region 5, splice variant 4, disulfide bond 3, region of interest 2, signal peptide 1, chain 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q2UY09-F1 | 61.81 | 0.23
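
To read the mean pLDDT of 61.81, it can help to map it onto the confidence bands that AlphaFold uses for per-residue scores. The sketch below is illustrative: the cutoffs (90, 70, 50) are the commonly published AlphaFold bands, applying them to a model-wide mean is a simplification, and plddt_band is a hypothetical helper name.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT value (0-100) to the standard AlphaFold confidence band."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


mean_plddt = 61.81          # value reported for AF-Q2UY09-F1
fraction_very_high = 0.23   # reported fraction of residues in the top band

print(plddt_band(mean_plddt))
print(f"{fraction_very_high:.0%} of residues are reported as very high confidence")
```

A low mean alongside a non-trivial very-high fraction suggests that confidence is concentrated in a subset of residues, although the page does not say which ones.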

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Disulfide bonds (3): 1072–1122, 1081–1105, 1097–1118

Function

Pathways and Gene Ontology

Reactome pathways

2 pathways

ID | Pathway
R-HSA-1650814 | Collagen biosynthesis and modifying enzymes
R-HSA-8948216 | Collagen chain trimerization

MSigDB gene sets: 58 (showing top): GOBP_COLLAGEN_FIBRIL_ORGANIZATION, GOBP_SKELETAL_SYSTEM_DEVELOPMENT, GOCC_COLLAGEN_TRIMER, GOBP_ANIMAL_ORGAN_MORPHOGENESIS, GOMF_EXTRACELLULAR_MATRIX_STRUCTURAL_CONSTITUENT, CREIGHTON_ENDOCRINE_THERAPY_RESISTANCE_5, GOCC_BASEMENT_MEMBRANE, GOBP_SKELETAL_SYSTEM_MORPHOGENESIS, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GOMF_PEPTIDASE_REGULATOR_ACTIVITY, GOMF_SERINE_TYPE_ENDOPEPTIDASE_INHIBITOR_ACTIVITY, GOMF_ENZYME_INHIBITOR_ACTIVITY, GOMF_ENZYME_REGULATOR_ACTIVITY, DODD_NASOPHARYNGEAL_CARCINOMA_DN, GOMF_STRUCTURAL_MOLECULE_ACTIVITY

GO Biological Process (1): cell adhesion (GO:0007155)

GO Molecular Function (3): serine-type endopeptidase inhibitor activity (GO:0004867), extracellular matrix structural constituent conferring tensile strength (GO:0030020), peptidase inhibitor activity (GO:0030414)

GO Cellular Component (6): extracellular region (GO:0005576), endoplasmic reticulum lumen (GO:0005788), extracellular matrix (GO:0031012), collagen type XXVIII trimer (GO:1990326), collagen trimer (GO:0005581), basement membrane (GO:0005604)

Reactome top-level categories

Rollup of top-2 pathways:

Category | Pathways
Collagen formation | 1
Collagen biosynthesis and modifying enzymes | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
cellular process | 1
serine-type endopeptidase activity | 1
endopeptidase inhibitor activity | 1
extracellular matrix structural constituent | 1
enzyme inhibitor activity | 1
peptidase activity | 1
peptidase regulator activity | 1
cellular anatomical structure | 1
endoplasmic reticulum | 1
intracellular organelle lumen | 1
external encapsulating structure | 1
collagenous component of basement membrane | 1
von-Willebrand-factor-A-domain-rich collagen trimer | 1
extracellular protein-containing complex | 1
protein-containing complex | 1
extracellular matrix | 1

Protein interactions and networks

STRING

970 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
COL28A1 | VWF | P04275 | 566
COL28A1 | FAM180A | Q6UWF9 | 510
COL28A1 | ITGAV | P06756 | 404
COL28A1 | ANGPTL7 | O43827 | 403
COL28A1 | ELAPOR2 | A8MWY0 | 389
COL28A1 | FAT3 | Q8TDW7 | 384
COL28A1 | CNTN6 | Q9UQ52 | 367
COL28A1 | CNIH1 | O95406 | 353
COL28A1 | CNIH3 | Q8TBE1 | 353
COL28A1 | P3H1 | Q32P28 | 350
COL28A1 | ADAMTS14 | Q8WXS8 | 350
COL28A1 | CRTAP | O75718 | 349
COL28A1 | ANGPTL5 | Q86XS5 | 348
COL28A1 | PPP4R4 | Q6NUP7 | 347
COL28A1 | TPST1 | O60507 | 336
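
Because the STRING scores are given on a ×1000 scale, dividing by 1000 recovers the usual 0 to 1 combined score. The sketch below labels the first five partners using the cutoffs STRING commonly documents (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). Applying those cutoffs here is an assumption, and confidence_label is a hypothetical helper.

```python
# Combined scores for the first five STRING partners listed above (x1000 scale).
scores = {"VWF": 566, "FAM180A": 510, "ITGAV": 404, "ANGPTL7": 403, "ELAPOR2": 389}


def confidence_label(score_x1000: int) -> str:
    """Label a STRING combined score using the commonly cited cutoffs."""
    if score_x1000 >= 900:
        return "highest"
    if score_x1000 >= 700:
        return "high"
    if score_x1000 >= 400:
        return "medium"
    if score_x1000 >= 150:
        return "low"
    return "below reporting threshold"


labels = {partner: confidence_label(score) for partner, score in scores.items()}

for partner, score in scores.items():
    print(f"{partner}: {score / 1000:.3f} ({labels[partner]})")
```

Under these cutoffs none of the listed partners reaches the high-confidence band, which is worth keeping in mind when reading the table as evidence of interaction.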

IntAct

0 interactions.

BioGRID (4): COL28A1 (Affinity Capture-MS), COL28A1 (Affinity Capture-MS), COL28A1 (Cross-Linking-MS (XL-MS)), COL28A1 (Cross-Linking-MS (XL-MS))

ESM2 similar proteins: A0MSJ1, A5PN28, A8WR59, B8V7R6, C0HLN2, O76368, O88207, P02462, P08122, P08125, P08572, P12106, P12107, P12108, P13942, P20849, P20850, P20908, P20909, P23206, P25067, P25318, P25940, P32017, P70560, P83371, P98085, Q03692, Q05306, Q05722, Q07092, Q07643, Q0VF58, Q14050, Q14055, Q14993, Q28083, Q28668, Q2UY09, Q2UY11

Diamond homologs: A0A1D5NSM8, A2AVA0, D3YXF5, O02839, O19124, O35764, O43405, O62685, O62837, O70340, O76536, O95502, O96530, P00751, P04003, P04186, P06205, P06206, P06207, P06681, P07629, P08174, P08607, P0C6B8, P13944, P14151, P14650, P15529, P17690, P18337, P26022, P32018, P33703, P35419, P42201, P47970, P47971, P47972, P48199, P48759

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

226 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 179
Likely benign | 14
Benign | 4
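
Since the per-class counts are floors, they need not add up to the 226-variant total. A minimal sketch of that arithmetic, using only the numbers shown above (the variable names are illustrative):

```python
total_variants = 226

# Per-class floor counts as listed in the ClinVar table above.
class_floors = {
    "Pathogenic": 0,
    "Likely pathogenic": 0,
    "Uncertain significance": 179,
    "Likely benign": 14,
    "Benign": 4,
}

floor_sum = sum(class_floors.values())
# Variants not accounted for by the floors; they may belong to any listed
# class or to a classification that the table does not show.
unaccounted = total_variants - floor_sum

print(f"sum of floors: {floor_sum}")
print(f"not attributed by the floors: {unaccounted}")
```

The gap of 29 variants is an upper bound on how far any single class count could be understated.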

Top pathogenic / likely-pathogenic (0)

SpliceAI

5652 predictions. Top by Δscore:

Variant | Effect | Δscore
7:7358806:C:CC | acceptor_gain | 1.0000
7:7360524:CTCAA:C | acceptor_gain | 1.0000
7:7370717:TTAC:T | donor_loss | 1.0000
7:7370718:TACT:T | donor_loss | 1.0000
7:7370719:A:AC | donor_gain | 1.0000
7:7370719:ACTT:A | donor_loss | 1.0000
7:7370720:C:CC | donor_gain | 1.0000
7:7370720:CTTA:C | donor_gain | 1.0000
7:7370721:TTA:T | donor_loss | 1.0000
7:7370723:A:AC | donor_gain | 1.0000
7:7370723:ACTG:A | donor_loss | 1.0000
7:7370724:C:CA | donor_gain | 1.0000
7:7370724:CT:C | donor_gain | 1.0000
7:7370724:CTG:C | donor_gain | 1.0000
7:7370724:CTGA:C | donor_gain | 1.0000
7:7370814:C:CT | acceptor_gain | 1.0000
7:7370815:A:T | acceptor_gain | 1.0000
7:7370884:T:C | acceptor_gain | 1.0000
7:7370884:T:TC | acceptor_gain | 1.0000
7:7370886:T:C | acceptor_gain | 1.0000
7:7370886:T:TC | acceptor_gain | 1.0000
7:7370897:G:GC | acceptor_gain | 1.0000
7:7380862:CCTT:C | acceptor_gain | 1.0000
7:7435395:GGTTA:G | donor_loss | 1.0000
7:7435396:GTTA:G | donor_loss | 1.0000
7:7435397:TTACC:T | donor_loss | 1.0000
7:7435398:TAC:T | donor_loss | 1.0000
7:7435399:ACCTT:A | donor_loss | 1.0000
7:7435400:CCTT:C | donor_loss | 1.0000
7:7439083:ATGT:A | donor_gain | 1.0000
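
The variant keys above appear to follow a chromosome:position:ref:alt layout. Assuming that VCF-style reading is correct (the page does not define the format), the sketch below splits a key into its fields and classifies it by allele length. Variant, parse_variant and variant_type are hypothetical names.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(v: Variant) -> str:
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) < len(v.alt):
        return "insertion"
    if len(v.ref) > len(v.alt):
        return "deletion"
    return "MNV"  # equal lengths longer than one base


for key in ("7:7358806:C:CC", "7:7360524:CTCAA:C", "7:7370815:A:T"):
    print(key, variant_type(parse_variant(key)))
```

Most of the top-scoring entries in the table are short insertions or deletions near splice sites rather than single-base substitutions.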

AlphaMissense

7236 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
7:7358658:C:G | C1118S | 0.998
7:7358659:A:T | C1118S | 0.998
7:7358659:A:G | C1118R | 0.995
7:7358749:A:G | W1088R | 0.995
7:7358749:A:T | W1088R | 0.995
7:7373491:G:C | S805R | 0.995
7:7373491:G:T | S805R | 0.995
7:7373493:T:G | S805R | 0.995
7:7358657:A:C | C1118W | 0.994
7:7358721:C:G | C1097S | 0.994
7:7358722:A:T | C1097S | 0.994
7:7358747:C:A | W1088C | 0.994
7:7358747:C:G | W1088C | 0.994
7:7373048:G:T | A953D | 0.994
7:7373490:A:G | S806P | 0.994
7:7373510:A:G | L799P | 0.994
7:7358646:C:G | C1122S | 0.993
7:7358647:A:T | C1122S | 0.993
7:7358722:A:G | C1097R | 0.993
7:7358675:G:C | F1112L | 0.992
7:7358675:G:T | F1112L | 0.992
7:7358677:A:G | F1112L | 0.992
7:7373267:G:T | A880D | 0.992
7:7358721:C:T | C1097Y | 0.991
7:7358796:C:G | C1072S | 0.991
7:7358797:A:T | C1072S | 0.991
7:7358645:G:C | C1122W | 0.990
7:7358647:A:G | C1122R | 0.990
7:7358711:A:C | F1100L | 0.990
7:7358711:A:T | F1100L | 0.990
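
Several of the top-scoring substitutions fall on cysteines that the Functional residue map lists as disulfide-bonded (1072–1122, 1081–1105, 1097–1118). The sketch below cross-references a sample of the protein changes above against those positions. It assumes the changes use the one-letter "ref, position, alt" form, and residue_position is a hypothetical helper.

```python
import re

# Disulfide bonds as listed under "Functional residue map".
disulfide_bonds = [(1072, 1122), (1081, 1105), (1097, 1118)]
bonded_cysteines = {pos for bond in disulfide_bonds for pos in bond}

# A sample of distinct protein changes from the AlphaMissense table above.
protein_changes = [
    "C1118S", "W1088R", "S805R", "C1097S",
    "A953D", "C1122S", "F1112L", "C1072S",
]


def residue_position(change: str) -> int:
    """Extract the residue number from a change such as 'C1118S'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {change!r}")
    return int(match.group(1))


# Keep changes that replace a cysteine at a disulfide-bonded position.
hits = [
    change for change in protein_changes
    if change.startswith("C") and residue_position(change) in bonded_cysteines
]
print(hits)
```

In this sample, four of the six bonded cysteines (1072, 1097, 1118, 1122) are hit, while 1081 and 1105 do not appear among the changes checked here.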

dbSNP variants (sampled 300 via entrez): RS1000011177 (7:7388905 C>G), RS1000016604 (7:7405633 A>T), RS1000025654 (7:7440403 C>T), RS1000061357 (7:7440180 A>C,G), RS1000068846 (7:7391090 T>C), RS1000082975 (7:7471550 C>T), RS1000096263 (7:7513851 A>G), RS1000103145 (7:7493909 A>C), RS1000117190 (7:7457534 T>C), RS1000120689 (7:7543704 A>T), RS1000128771 (7:7403104 A>G), RS1000145803 (7:7536096 C>T), RS1000167012 (7:7350483 T>C), RS1000180824 (7:7466125 G>A,C), RS1000192987 (7:7459696 C>G,T)

Disease associations

OMIM: gene MIM:609996 | disease phenotypes: none listed

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total.

GWAS associations

6 associations (top):

Study | Trait | p-value
GCST001533_10 | Immune response to smallpox (secreted IL-1beta) | 6.000000e-08
GCST003518_32 | Daytime sleep phenotypes | 2.000000e-06
GCST003542_173 | Night sleep phenotypes | 8.000000e-06
GCST006147_2 | Frontotemporal dementia (age at onset) | 8.000000e-07
GCST006149_1 | Frontotemporal dementia with GRN mutation (age at onset) | 5.000000e-06
GCST007672_3 | 3-month functional outcome in ischaemic stroke (modified Rankin score) | 8.000000e-06

EFO canonical traits (5, from GWAS)

EFO ID | Trait name
EFO:0004645 | response to vaccine
EFO:0004873 | cytokine measurement
EFO:0007828 | daytime rest measurement
EFO:0004847 | age at onset
EFO:0009603 | stroke outcome severity measurement

Drugs & pharmacology

Drug and pharmacology data

Is drug target: yes

ChEMBL targets (1): CHEMBL2364188 (PROTEIN COMPLEX GROUP)

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

12 total (human), top 12 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
sodium arsenite | decreases expression, increases expression | 2
Acetaminophen | decreases expression, increases expression | 2
Nickel | decreases expression | 2
propionaldehyde | increases expression | 1
bisphenol A | increases expression | 1
CGP 52608 | affects binding, increases reaction | 1
Estradiol | decreases expression | 1
Methamphetamine | affects response to substance | 1
Paraquat | increases expression, increases reaction | 1
Tobacco Smoke Pollution | decreases expression | 1
Aflatoxin B1 | decreases expression | 1
Okadaic Acid | decreases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.

  • Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): frontotemporal dementia