EMID1

gene
Also known as: EMU1, hEmu1, EMI5

Summary

EMID1 (EMI domain containing 1, HGNC:18036) is a protein-coding gene on chromosome 22q12.2, encoding EMI domain-containing protein 1 (Q96A84).

Predicted to be located in several cellular components, including Golgi apparatus; endoplasmic reticulum; and extracellular matrix. Predicted to be part of collagen trimer.

Source: NCBI Gene 129080 — RefSeq curated summary.

At a glance

  • GWAS associations: 24
  • Clinical variants (ClinVar): 99 total
  • MANE Select transcript: NM_133455

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:18036
Approved symbol | EMID1
Name | EMI domain containing 1
Location | 22q12.2
Locus type | gene with protein product
Status | Approved
Aliases | EMU1, hEmu1, EMI5
Ensembl gene | ENSG00000186998
Ensembl biotype | protein_coding
OMIM | 608926
Entrez | 129080

Gene structure

Transcript identifiers

Ensembl transcripts: 22 — 17 protein_coding, 3 retained_intron, 1 nonsense_mediated_decay, 1 protein_coding_CDS_not_defined

ENST00000334018, ENST00000404755, ENST00000404820, ENST00000429226, ENST00000430127, ENST00000433143, ENST00000435427, ENST00000473933, ENST00000484039, ENST00000487477, ENST00000488820, ENST00000891447, ENST00000891448, ENST00000891449, ENST00000891450, ENST00000935681, ENST00000935682, ENST00000935683, ENST00000961472, ENST00000961473, ENST00000961474, ENST00000961475

RefSeq mRNA: 3 (MANE Select: NM_133455): NM_001267895, NM_001410828, NM_133455

CCDS: CCDS33630, CCDS93143

Canonical transcript exons

ENST00000334018 — 15 exons

Exon | Start | End
ENSE00001611508 | 29226490 | 29226551
ENSE00001670727 | 29215527 | 29215630
ENSE00001695101 | 29214926 | 29215039
ENSE00001754790 | 29258817 | 29259597
ENSE00001913306 | 29205896 | 29206139
ENSE00003505907 | 29231020 | 29231140
ENSE00003509919 | 29232256 | 29232402
ENSE00003514554 | 29233614 | 29233666
ENSE00003560263 | 29233379 | 29233468
ENSE00003562725 | 29231593 | 29231682
ENSE00003581986 | 29225133 | 29225216
ENSE00003645065 | 29234137 | 29234199
ENSE00003662632 | 29234305 | 29234349
ENSE00003683197 | 29254203 | 29254287
ENSE00003786137 | 29243445 | 29243489
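
The exon rows are listed in exon-ID order, not genomic order. The following is a minimal Python sketch of how they could be sorted by position and measured. It assumes the coordinates are 1-based and inclusive (the usual Ensembl convention, not stated on this page), and the helper names are hypothetical.

```python
# Exon coordinates copied from the canonical transcript table (ENST00000334018).
EXONS = [
    ("ENSE00001611508", 29226490, 29226551),
    ("ENSE00001670727", 29215527, 29215630),
    ("ENSE00001695101", 29214926, 29215039),
    ("ENSE00001754790", 29258817, 29259597),
    ("ENSE00001913306", 29205896, 29206139),
    ("ENSE00003505907", 29231020, 29231140),
    ("ENSE00003509919", 29232256, 29232402),
    ("ENSE00003514554", 29233614, 29233666),
    ("ENSE00003560263", 29233379, 29233468),
    ("ENSE00003562725", 29231593, 29231682),
    ("ENSE00003581986", 29225133, 29225216),
    ("ENSE00003645065", 29234137, 29234199),
    ("ENSE00003662632", 29234305, 29234349),
    ("ENSE00003683197", 29254203, 29254287),
    ("ENSE00003786137", 29243445, 29243489),
]


def exon_length(start: int, end: int) -> int:
    """Exon length in bp, assuming 1-based inclusive coordinates."""
    return end - start + 1


def sorted_by_position(exons):
    """Return the exons ordered by genomic start coordinate."""
    return sorted(exons, key=lambda exon: exon[1])


ordered = sorted_by_position(EXONS)
total_bp = sum(exon_length(start, end) for _, start, end in EXONS)

for exon_id, start, end in ordered:
    print(f"{exon_id}\t{start}\t{end}\t{exon_length(start, end)} bp")
print(f"Total exonic length: {total_bp} bp")
```

Sorting by start coordinate gives the order along the chromosome; whether that matches transcript order depends on the strand, which this page does not state.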

Expression profiles

Bgee: expression breadth ubiquitous, 192 present calls, max score 92.96.

FANTOM5 (CAGE): breadth broad, TPM avg 4.3481 / max 403.3496, expressed in 740 samples.

FANTOM5 promoters (2 alternative TSS)

Promoter ID | TPM avg | Samples expressed
191595 | 4.1542 | 736
191596 | 0.1939 | 110

Top tissues by expression

272 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
ventricular zone | UBERON:0003053 | 92.96 | gold quality
spleen | UBERON:0002106 | 92.34 | gold quality
right adrenal gland cortex | UBERON:0035827 | 90.09 | gold quality
right adrenal gland | UBERON:0001233 | 89.85 | gold quality
cortical plate | UBERON:0005343 | 89.60 | gold quality
left adrenal gland cortex | UBERON:0035825 | 89.31 | gold quality
left adrenal gland | UBERON:0001234 | 89.21 | gold quality
amygdala | UBERON:0001876 | 88.81 | gold quality
adrenal cortex | UBERON:0001235 | 88.53 | gold quality
cingulate cortex | UBERON:0003027 | 88.36 | gold quality
anterior cingulate cortex | UBERON:0009835 | 88.26 | gold quality
ganglionic eminence | UBERON:0004023 | 88.02 | gold quality
adrenal gland | UBERON:0002369 | 87.32 | gold quality
right frontal lobe | UBERON:0002810 | 87.30 | gold quality
temporal lobe | UBERON:0001871 | 86.37 | gold quality
nucleus accumbens | UBERON:0001882 | 86.37 | gold quality
caudate nucleus | UBERON:0001873 | 86.33 | gold quality
prefrontal cortex | UBERON:0000451 | 86.05 | gold quality
neocortex | UBERON:0001950 | 85.94 | gold quality
frontal cortex | UBERON:0001870 | 85.32 | gold quality
dorsolateral prefrontal cortex | UBERON:0009834 | 85.22 | gold quality
cerebral cortex | UBERON:0000956 | 85.05 | gold quality
telencephalon | UBERON:0001893 | 84.92 | gold quality
putamen | UBERON:0001874 | 84.89 | gold quality
Brodmann (1909) area 9 | UBERON:0013540 | 84.59 | gold quality
Ammon’s horn | UBERON:0001954 | 84.17 | gold quality
entorhinal cortex | UBERON:0002728 | 83.88 | gold quality
forebrain | UBERON:0001890 | 82.71 | gold quality
Brodmann (1909) area 46 | UBERON:0006483 | 82.57 | silver quality
embryo | UBERON:0000922 | 82.34 | gold quality

Single-cell (SCXA)

Detected in 3 experiment(s), a significant marker in 2.

Experiment | Marker? | Max mean expression
E-HCAD-10 | yes | 18.70
E-ANND-3 | yes | 4.85
E-MTAB-8060 | no | 226.76

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

14 miRNAs target EMID1; all are shown (the listing is capped at the top 30), ranked by miRDB confidence (max_score). target_count is the total number of genes the miRNA targets; lower means more specific:

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-4779 | 99.86 | 66.50 | 1583
HSA-MIR-4728-5P | 99.85 | 69.39 | 4718
HSA-MIR-6785-5P | 99.82 | 68.68 | 4428
HSA-MIR-149-3P | 99.72 | 68.22 | 3963
HSA-MIR-6883-5P | 99.69 | 68.05 | 3785
HSA-MIR-5004-5P | 99.68 | 66.63 | 1294
HSA-MIR-4649-3P | 99.56 | 66.90 | 1783
HSA-MIR-4667-3P | 99.26 | 65.45 | 1608
HSA-MIR-6773-3P | 98.17 | 65.51 | 1213
HSA-MIR-6847-5P | 97.93 | 66.74 | 1808
HSA-MIR-4638-3P | 97.90 | 65.75 | 905
HSA-MIR-3085-5P | 97.72 | 65.43 | 544
HSA-MIR-6793-3P | 97.66 | 65.78 | 1084
HSA-MIR-5195-5P | 90.84 | 65.09 | 287

Cross-species orthologs

2 orthologs

Organism | Symbol | Gene ID
mus_musculus | Emid1 | ENSMUSG00000034164
rattus_norvegicus | Emid1 | ENSRNOG00000033575

Paralogs (37): COL9A2 (ENSG00000049089), COL23A1 (ENSG00000050767), COL11A1 (ENSG00000060718), COL17A1 (ENSG00000065618), COL5A3 (ENSG00000080573), COL4A4 (ENSG00000081052), COL16A1 (ENSG00000084636), COL9A3 (ENSG00000092758), COL20A1 (ENSG00000101203), COL1A1 (ENSG00000108821), COL9A1 (ENSG00000112280), COL7A1 (ENSG00000114270), COL21A1 (ENSG00000124749), COL5A1 (ENSG00000130635), COL4A2 (ENSG00000134871), COL2A1 (ENSG00000139219), COL6A1 (ENSG00000142156), COL6A2 (ENSG00000142173), EDA (ENSG00000158813), COL26A1 (ENSG00000160963), COL1A2 (ENSG00000164692), COL3A1 (ENSG00000168542), COL4A3 (ENSG00000169031), COL22A1 (ENSG00000169436), COL24A1 (ENSG00000171502), COL18A1 (ENSG00000182871), COL4A1 (ENSG00000187498), COL4A5 (ENSG00000188153), COL25A1 (ENSG00000188517), COL27A1 (ENSG00000196739), COL13A1 (ENSG00000197467), COL4A6 (ENSG00000197565), COL11A2 (ENSG00000204248), COL5A2 (ENSG00000204262), COL15A1 (ENSG00000204291), COLQ (ENSG00000206561), COL28A1 (ENSG00000215018)

Protein

Protein identifiers

EMI domain-containing protein 1, Q96A84 (reviewed: Q96A84)

Alternative names: Emilin and multimerin domain-containing protein 1

All UniProt accessions (7): B0QYK2, B0QYK3, B0QYK4, B0QYK5, F8WDX7, H0Y6W4, Q96A84

UniProt curated annotations

Subunit / interactions. Homo- or heteromers.

Subcellular location. Secreted. Extracellular space. Extracellular matrix.

Post-translational modifications. O-fucosylated at Thr-42 within the EMI domain by POFUT3 and POFUT4.

Miscellaneous. May be due to a competing acceptor splice site.

Isoforms (3)

UniProt ID | Names | Canonical?
Q96A84-1 | 1 | yes
Q96A84-2 | 2 | no
Q96A84-3 | 3 | no

RefSeq proteins (3): NP_001254824, NP_001397757, NP_597712* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR008160 | Collagen | Repeat
IPR011489 | EMI_domain | Domain
IPR050392 | Collagen/C1q_domain | Family

Pfam: PF01391, PF07546

UniProt features (23 total): compositionally biased region 8, glycosylation site 3, disulfide bond 3, domain 2, splice variant 2, region of interest 2, signal peptide 1, chain 1, sequence variant 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q96A84-F1 | 51.77 | 0.00
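
To put the mean pLDDT in context, the sketch below maps a score to the confidence bands used by the AlphaFold Protein Structure Database (above 90 very high, 70 to 90 high, 50 to 70 low, below 50 very low). The function name and the handling of exact boundary values are assumptions made for illustration.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to an AlphaFold DB confidence band.

    Boundary handling (which band an exact 50, 70, or 90 falls into)
    is an assumption of this sketch.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90.0:
        return "very high"
    if score > 70.0:
        return "high"
    if score > 50.0:
        return "low"
    return "very low"


# Mean pLDDT reported for AF-Q96A84-F1 on this page.
print(plddt_band(51.77))
```

Under these bands the model's mean score of 51.77 sits in the low band, which is consistent with the reported very-high fraction of 0.00.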

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Disulfide bonds (3): 37–96, 62–68, 95–104

Glycosylation sites (3): 42, 51, 136

Function

Pathways and Gene Ontology

Reactome pathways

1 pathway

ID | Pathway
R-HSA-5173105 | O-linked glycosylation

MSigDB gene sets: 120 (showing top): GOCC_COLLAGEN_TRIMER, GGGTGGRR_PAX4_03, NKX62_Q2, MODULE_379, BLALOCK_ALZHEIMERS_DISEASE_UP, TGGNNNNNNKCCAR_UNKNOWN, HAND1E47_01, ZHANG_BREAST_CANCER_PROGENITORS_UP, MODULE_242, YYCATTCAWW_UNKNOWN, YAGI_AML_WITH_11Q23_REARRANGED, MODULE_104, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GAUSSMANN_MLL_AF4_FUSION_TARGETS_G_UP, MODULE_13

GO Biological Process (0):

GO Molecular Function (1): protein binding (GO:0005515)

GO Cellular Component (6): collagen trimer (GO:0005581), endoplasmic reticulum lumen (GO:0005788), Golgi apparatus (GO:0005794), extracellular matrix (GO:0031012), extracellular region (GO:0005576), endoplasmic reticulum (GO:0005783)

Reactome top-level categories

Rollup of top-1 pathways:

Category | Pathways
Post-translational protein modification | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
cytoplasm | 2
endomembrane system | 2
intracellular membrane-bounded organelle | 2
binding | 1
protein-containing complex | 1
endoplasmic reticulum | 1
intracellular organelle lumen | 1
external encapsulating structure | 1
cellular anatomical structure | 1

Protein interactions and networks

STRING

654 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
EMID1 | SERPINH1 | P29043 | 833
EMID1 | THOC2 | Q8NI27 | 565
EMID1 | CD164L2 | Q6UWJ8 | 447
EMID1 | MRPL45 | Q9BRJ2 | 441
EMID1 | MMRN2 | Q9H8L6 | 441
EMID1 | EMILIN3 | Q9NT22 | 411
EMID1 | CCER2 | I3L3R5 | 405
EMID1 | OR51D1 | Q8NGF3 | 380
EMID1 | ZMAT5 | Q9UDW3 | 380
EMID1 | VPREB1 | P12018 | 374
EMID1 | TEX38 | Q6PEX7 | 368
EMID1 | SERTAD3 | Q9UJW9 | 367
EMID1 | POFUT2 | Q9Y2G5 | 363
EMID1 | MYPOP | Q86VE0 | 349
EMID1 | UBE2QL1 | A1L167 | 338
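
Because the STRING scores are scaled by 1000, they can be read against STRING's conventional confidence cutoffs (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). A minimal sketch follows; the tier function is hypothetical, and the cutoffs are the STRING defaults, not something stated on this page.

```python
# Partner scores (x1000) copied from the STRING table.
STRING_PARTNERS = {
    "SERPINH1": 833, "THOC2": 565, "CD164L2": 447, "MRPL45": 441,
    "MMRN2": 441, "EMILIN3": 411, "CCER2": 405, "OR51D1": 380,
    "ZMAT5": 380, "VPREB1": 374, "TEX38": 368, "SERTAD3": 367,
    "POFUT2": 363, "MYPOP": 349, "UBE2QL1": 338,
}


def confidence_tier(score_x1000: int) -> str:
    """Classify a x1000-scaled STRING score with the conventional cutoffs."""
    if score_x1000 >= 900:
        return "highest"
    if score_x1000 >= 700:
        return "high"
    if score_x1000 >= 400:
        return "medium"
    if score_x1000 >= 150:
        return "low"
    return "below low"


# Keep only partners at medium confidence or better (score >= 0.4).
medium_or_better = [
    name for name, score in STRING_PARTNERS.items() if score >= 400
]

for name, score in STRING_PARTNERS.items():
    print(f"{name}\t{score / 1000:.3f}\t{confidence_tier(score)}")
```

With these cutoffs only SERPINH1 reaches the high tier; the partners from OR51D1 downward fall in the low tier.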

IntAct

10 interactions, top by confidence:

A | B | Type | Score
AKTIP | EMID1 | MI:0915 (physical association) | 0.370
EMID1 | CDC5L | MI:0915 (physical association) | 0.370
KRT40 | ANKRD36 | MI:0914 (association) | 0.350
EMID1 | CRTAP | MI:0914 (association) | 0.350
EMID1 | POTEF | MI:0914 (association) | 0.350
KRT40 | NEURL1B | MI:0914 (association) | 0.350
NPTXR | ACACB | MI:0914 (association) | 0.350
EMID1 | NDUFS4 | MI:0914 (association) | 0.350
EMID1 |  | MI:0915 (physical association) | 0.000

BioGRID (40): KLHL25 (Affinity Capture-MS), UBR4 (Affinity Capture-MS), COLGALT2 (Affinity Capture-MS), YTHDF1 (Affinity Capture-MS), EMID1 (Synthetic Lethality), KLHL25 (Affinity Capture-MS), YTHDF1 (Affinity Capture-MS), CRTAP (Affinity Capture-MS), EMID1 (Affinity Capture-RNA), YTHDF1 (Affinity Capture-MS), P3H1 (Affinity Capture-MS), EMID1 (Affinity Capture-MS), CRTAP (Affinity Capture-MS), KLHL25 (Affinity Capture-MS), SDF2L1 (Affinity Capture-MS)

ESM2 similar proteins: A1XQX1, A6NFA1, B1ATG9, C0HL12, O09010, O12971, O12972, O15399, O54693, O75973, O88992, P48807, P52795, P52796, P98172, Q03391, Q13563, Q20FD0, Q28143, Q3UHD1, Q4GZT3, Q4ZJM9, Q5FVH0, Q5QQ50, Q5QQ51, Q5VWW1, Q62645, Q6NW40, Q6PFE7, Q6ZRP7, Q7TQ33, Q7Z5L3, Q86V40, Q86Z23, Q8CFR0, Q8IYR6, Q8IZP7, Q8K479, Q924T4, Q92838

Diamond homologs: A6H6E2, F1QC17, P59900, Q13201, Q8K482, Q91VF6, Q96A83, Q96A84, Q99K41, Q9BXX0, Q9H8L6, Q9NT22, Q9Y6C2, B2RPV6, Q5RJ80, Q8BVD7, Q91VF5, Q9BXJ2, Q9W332, Q05A80, Q6IMN6, Q8R066, Q9BXJ3, P83425, Q4ZJM7, Q5FVH0, Q8K479, Q9BXJ0, P27658, Q7Z5L3, Q8CFR0, A6NHN0, Q0II24

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

99 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 71
Likely benign | 2
Benign | 1

Top pathogenic / likely-pathogenic (0)

SpliceAI

3155 predictions. Top by Δscore:

Variant | Effect | Δscore
22:29231016:TCA:T | acceptor_loss | 1.0000
22:29231018:A:AG | acceptor_gain | 1.0000
22:29231018:AGAT:A | acceptor_gain | 1.0000
22:29231018:AGATG:A | acceptor_loss | 1.0000
22:29231019:G:GC | acceptor_gain | 1.0000
22:29231019:GA:G | acceptor_gain | 1.0000
22:29231019:GAT:G | acceptor_gain | 1.0000
22:29231019:GATG:G | acceptor_gain | 1.0000
22:29231137:CAGGG:C | donor_loss | 1.0000
22:29231138:AGGGT:A | donor_loss | 1.0000
22:29231139:GG:G | donor_gain | 1.0000
22:29231140:GG:G | donor_gain | 1.0000
22:29231140:GGT:G | donor_loss | 1.0000
22:29231141:G:GG | donor_gain | 1.0000
22:29231141:GTG:G | donor_loss | 1.0000
22:29231380:G:GT | donor_gain | 1.0000
22:29231591:A:AG | acceptor_gain | 1.0000
22:29231592:G:GG | acceptor_gain | 1.0000
22:29233467:GG:G | donor_gain | 1.0000
22:29233468:GG:G | donor_gain | 1.0000
22:29233468:GGTA:G | donor_loss | 1.0000
22:29233469:G:GG | donor_gain | 1.0000
22:29233469:G:T | donor_loss | 1.0000
22:29233470:TAA:T | donor_loss | 1.0000
22:29233474:T:G | donor_gain | 1.0000
22:29233667:G:GG | donor_gain | 1.0000
22:29234196:GAGA:G | donor_gain | 1.0000
22:29234198:GA:G | donor_gain | 1.0000
22:29234200:G:GG | donor_gain | 1.0000
22:29234301:A:AG | acceptor_gain | 1.0000
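
The variant keys appear to follow a chrom:pos:ref:alt layout. The sketch below, under that assumption, parses a key, classifies the change by allele length, and measures the distance to the edges of exon ENSE00003505907 (29231020 to 29231140 in the canonical transcript exon table). All names here are hypothetical helpers, not part of SpliceAI.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(variant: Variant) -> str:
    """Classify by comparing reference and alternate allele lengths."""
    if len(variant.ref) == len(variant.alt):
        return "SNV" if len(variant.ref) == 1 else "MNV"
    return "deletion" if len(variant.ref) > len(variant.alt) else "insertion"


# Start and end of exon ENSE00003505907, taken from the exon table.
EXON_EDGES = (29231020, 29231140)


def distance_to_nearest_edge(variant: Variant, edges=EXON_EDGES) -> int:
    """Distance in bp from the variant's anchor position to the closest edge."""
    return min(abs(variant.pos - edge) for edge in edges)


for key in ("22:29231016:TCA:T", "22:29231141:G:GG", "22:29233469:G:T"):
    variant = parse_variant(key)
    print(key, variant_kind(variant), distance_to_nearest_edge(variant))
```

Several of the top-scoring variants sit within a few bases of those two exon edges, as expected for acceptor and donor effects; the third example key lies near a different exon and is included only to show the SNV case.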

AlphaMissense

2800 scored. Top likely-pathogenic:

dbSNP variants (sampled 300 via entrez): RS1000000982 (22:29235993 T>C), RS1000050567 (22:29252404 T>C,G), RS1000054484 (22:29244627 A>G), RS1000109706 (22:29244870 CT>C), RS1000110682 (22:29242977 T>C), RS1000290174 (22:29224399 T>G), RS1000297295 (22:29215911 A>G), RS1000304987 (22:29221209 C>T), RS1000384261 (22:29229902 T>C), RS1000425545 (22:29224260 G>A), RS1000469011 (22:29215744 G>A,C), RS1000530894 (22:29247527 T>C), RS1000558132 (22:29240994 G>C), RS1000560722 (22:29204617 T>C), RS1000584818 (22:29219753 G>C)

Disease associations

OMIM: gene MIM:608926 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

24 associations (top):

Study | Trait | p-value
GCST001937_28 | Breast cancer | 3.000000e-09
GCST002553_2 | Pancreatic cancer | 1.000000e-08
GCST004611_168 | High light scatter reticulocyte count | 7.000000e-13
GCST004612_189 | High light scatter reticulocyte percentage of red cells | 3.000000e-12
GCST004619_18 | Reticulocyte fraction of red cells | 1.000000e-10
GCST004622_95 | Reticulocyte count | 5.000000e-11
GCST004625_196 | Monocyte count | 5.000000e-10
GCST005580_24 | Intraocular pressure | 3.000000e-14
GCST006394_62 | Intraocular pressure | 5.000000e-17
GCST006412_131 | Intraocular pressure | 6.000000e-16
GCST006479_20 | Diverticular disease | 8.000000e-06
GCST009725_15 | Intraocular pressure | 2.000000e-17
GCST011742_70 | Triglyceride levels in HIV infection | 6.000000e-06
GCST90002385_575 | High light scatter reticulocyte count | 5.000000e-26
GCST90002386_530 | High light scatter reticulocyte percentage of red cells | 1.000000e-24
GCST90002388_266 | Lymphocyte count | 2.000000e-17
GCST90002393_588 | Monocyte count | 2.000000e-22
GCST90002397_595 | Mean spheric corpuscular volume | 2.000000e-10
GCST90002398_26 | Neutrophil count | 2.000000e-12
GCST90002400_504 | Plateletcrit | 2.000000e-15
GCST90002402_643 | Platelet count | 1.000000e-17
GCST90002405_412 | Reticulocyte count | 3.000000e-24
GCST90002406_565 | Reticulocyte fraction of red cells | 9.000000e-23
GCST90002407_195 | White blood cell count | 6.000000e-19
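
The listed p-values span many orders of magnitude. A minimal sketch of how they might be screened is shown below, assuming the conventional genome-wide significance threshold of 5e-8 (an assumption; this page does not state a cutoff) and using a subset of the rows for brevity.

```python
import math

# Conventional genome-wide significance threshold; an assumption of this
# sketch, not a value given on this page.
GENOME_WIDE_ALPHA = 5e-8

# A subset of (study, trait, p-value) rows copied from the GWAS table.
ASSOCIATIONS = [
    ("GCST001937_28", "Breast cancer", 3e-09),
    ("GCST002553_2", "Pancreatic cancer", 1e-08),
    ("GCST006479_20", "Diverticular disease", 8e-06),
    ("GCST009725_15", "Intraocular pressure", 2e-17),
    ("GCST011742_70", "Triglyceride levels in HIV infection", 6e-06),
    ("GCST90002385_575", "High light scatter reticulocyte count", 5e-26),
]


def is_genome_wide_significant(p_value: float,
                               alpha: float = GENOME_WIDE_ALPHA) -> bool:
    """True when the p-value is below the chosen threshold."""
    return p_value < alpha


def neg_log10(p_value: float) -> float:
    """-log10(p), the scale typically used on Manhattan plots."""
    return -math.log10(p_value)


below_threshold = [
    trait for _, trait, p in ASSOCIATIONS if not is_genome_wide_significant(p)
]

for study, trait, p in ASSOCIATIONS:
    print(f"{study}\t{trait}\t-log10(p) = {neg_log10(p):.2f}")
print("Not genome-wide significant:", below_threshold)
```

Applied to the full table, the same cutoff leaves two of the 24 associations (diverticular disease and triglyceride levels in HIV infection) short of genome-wide significance.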

EFO canonical traits (9, from GWAS)

EFO ID | Trait name
EFO:0007986 | reticulocyte count
EFO:0005091 | monocyte count
EFO:0004695 | intraocular pressure measurement
EFO:0009959 | diverticular disease
EFO:0004530 | triglyceride measurement
EFO:0004587 | lymphocyte count
EFO:0004833 | neutrophil count
EFO:0007985 | platelet crit
EFO:0004309 | platelet count

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

18 total (human), top 18 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Benzo(a)pyrene | affects methylation, increases methylation | 2
methylmercuric chloride | decreases expression | 1
aflatoxin B2 | increases methylation | 1
CGP 52608 | affects binding, increases reaction | 1
abrine | decreases expression | 1
2,3,5-trichloro-6-phenyl-(1,4)benzoquinone | decreases expression | 1
Sunitinib | decreases expression | 1
Ethanol | increases expression | 1
Silicon Dioxide | decreases expression | 1
Tobacco Smoke Pollution | decreases expression | 1
Tretinoin | decreases expression | 1
Triclosan | increases expression | 1
Urethane | decreases expression | 1
Valproic Acid | decreases expression, increases methylation | 1
Zinc | increases expression | 1
Antirheumatic Agents | decreases expression | 1
Zinc Sulfate | increases expression | 1
Acrylamide | increases expression | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.

  • Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): exocrine pancreatic carcinoma