EMID1
Also known as: EMU1, hEmu1, EMI5
Summary
EMID1 (EMI domain containing 1, HGNC:18036) is a protein-coding gene on chromosome 22q12.2, encoding EMI domain-containing protein 1 (Q96A84).
Predicted to be located in several cellular components, including Golgi apparatus; endoplasmic reticulum; and extracellular matrix. Predicted to be part of collagen trimer.
Source: NCBI Gene 129080 — RefSeq curated summary.
At a glance
- GWAS associations: 24
- Clinical variants (ClinVar): 99 total
- MANE Select transcript: NM_133455
Identifiers
Gene identifiers
| Field | Value |
|---|---|
| HGNC ID | HGNC:18036 |
| Approved symbol | EMID1 |
| Name | EMI domain containing 1 |
| Location | 22q12.2 |
| Locus type | gene with protein product |
| Status | Approved |
| Aliases | EMU1, hEmu1, EMI5 |
| Ensembl gene | ENSG00000186998 |
| Ensembl biotype | protein_coding |
| OMIM | 608926 |
| Entrez | 129080 |
Gene structure
Transcript identifiers
Ensembl transcripts: 22 — 17 protein_coding, 3 retained_intron, 1 nonsense_mediated_decay, 1 protein_coding_CDS_not_defined
ENST00000334018, ENST00000404755, ENST00000404820, ENST00000429226, ENST00000430127, ENST00000433143, ENST00000435427, ENST00000473933, ENST00000484039, ENST00000487477, ENST00000488820, ENST00000891447, ENST00000891448, ENST00000891449, ENST00000891450, ENST00000935681, ENST00000935682, ENST00000935683, ENST00000961472, ENST00000961473, ENST00000961474, ENST00000961475
RefSeq mRNA: 3 — MANE Select: NM_133455
NM_001267895, NM_001410828, NM_133455
CCDS: CCDS33630, CCDS93143
Canonical transcript exons
ENST00000334018 — 15 exons
| Exon | Start | End |
|---|---|---|
| ENSE00001611508 | 29226490 | 29226551 |
| ENSE00001670727 | 29215527 | 29215630 |
| ENSE00001695101 | 29214926 | 29215039 |
| ENSE00001754790 | 29258817 | 29259597 |
| ENSE00001913306 | 29205896 | 29206139 |
| ENSE00003505907 | 29231020 | 29231140 |
| ENSE00003509919 | 29232256 | 29232402 |
| ENSE00003514554 | 29233614 | 29233666 |
| ENSE00003560263 | 29233379 | 29233468 |
| ENSE00003562725 | 29231593 | 29231682 |
| ENSE00003581986 | 29225133 | 29225216 |
| ENSE00003645065 | 29234137 | 29234199 |
| ENSE00003662632 | 29234305 | 29234349 |
| ENSE00003683197 | 29254203 | 29254287 |
| ENSE00003786137 | 29243445 | 29243489 |
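The exon table is ordered by exon ID rather than by position. A minimal sketch of putting it in genomic order and deriving exon lengths follows. It assumes the Start and End columns are 1-based inclusive coordinates, which the page does not state; `EXONS`, `exon_length`, and the other names are illustrative only.

```python
# Exon coordinates copied from the table above: (exon ID, start, end).
EXONS = [
    ("ENSE00001611508", 29226490, 29226551),
    ("ENSE00001670727", 29215527, 29215630),
    ("ENSE00001695101", 29214926, 29215039),
    ("ENSE00001754790", 29258817, 29259597),
    ("ENSE00001913306", 29205896, 29206139),
    ("ENSE00003505907", 29231020, 29231140),
    ("ENSE00003509919", 29232256, 29232402),
    ("ENSE00003514554", 29233614, 29233666),
    ("ENSE00003560263", 29233379, 29233468),
    ("ENSE00003562725", 29231593, 29231682),
    ("ENSE00003581986", 29225133, 29225216),
    ("ENSE00003645065", 29234137, 29234199),
    ("ENSE00003662632", 29234305, 29234349),
    ("ENSE00003683197", 29254203, 29254287),
    ("ENSE00003786137", 29243445, 29243489),
]


def exon_length(start: int, end: int) -> int:
    """Length in bp, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


# Order exons by genomic start instead of by exon ID.
exons_by_position = sorted(EXONS, key=lambda exon: exon[1])

# Sum of all exon lengths (UTRs included; this is not the CDS length).
total_exonic_bp = sum(exon_length(start, end) for _, start, end in EXONS)

for exon_id, start, end in exons_by_position:
    print(exon_id, start, end, exon_length(start, end))
print("total exonic bp:", total_exonic_bp)
```

Sorting by start gives genomic order only; the transcript strand is not given on this page, so the order of exons within the transcript is not implied.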
Expression profiles
Bgee: expression breadth ubiquitous, 192 present calls, max score 92.96.
FANTOM5 (CAGE): breadth broad, TPM avg 4.3481 / max 403.3496, expressed in 740 samples.
FANTOM5 promoters (2 alternative TSS)
| Promoter ID | TPM avg | Samples expressed |
|---|---|---|
| 191595 | 4.1542 | 736 |
| 191596 | 0.1939 | 110 |
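In the figures above, the two promoter-level TPM averages add up to the gene-level FANTOM5 average (4.3481). The sketch below checks that arithmetic and computes the share carried by the dominant promoter. Treating the gene-level value as a sum of promoter values is an observation from these numbers, not a documented rule; the variable names are illustrative.

```python
# Values copied from the FANTOM5 lines and promoter table above.
PROMOTER_TPM_AVG = {"191595": 4.1542, "191596": 0.1939}
GENE_TPM_AVG = 4.3481

# Sum of promoter-level averages, to compare with the gene-level figure.
promoter_total = sum(PROMOTER_TPM_AVG.values())

# Promoter with the highest average TPM and its share of the total.
dominant_id = max(PROMOTER_TPM_AVG, key=PROMOTER_TPM_AVG.get)
dominant_share = PROMOTER_TPM_AVG[dominant_id] / promoter_total

print("promoter total:", round(promoter_total, 4))
print("matches gene-level average:", round(promoter_total, 4) == GENE_TPM_AVG)
print(f"dominant promoter {dominant_id}: {dominant_share:.1%} of signal")
```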
Top tissues by expression
272 total, by Bgee expression score (0-100, higher = more expressed):
| Tissue | Anatomy ID | Expression score | Quality |
|---|---|---|---|
| ventricular zone | UBERON:0003053 | 92.96 | gold quality |
| spleen | UBERON:0002106 | 92.34 | gold quality |
| right adrenal gland cortex | UBERON:0035827 | 90.09 | gold quality |
| right adrenal gland | UBERON:0001233 | 89.85 | gold quality |
| cortical plate | UBERON:0005343 | 89.60 | gold quality |
| left adrenal gland cortex | UBERON:0035825 | 89.31 | gold quality |
| left adrenal gland | UBERON:0001234 | 89.21 | gold quality |
| amygdala | UBERON:0001876 | 88.81 | gold quality |
| adrenal cortex | UBERON:0001235 | 88.53 | gold quality |
| cingulate cortex | UBERON:0003027 | 88.36 | gold quality |
| anterior cingulate cortex | UBERON:0009835 | 88.26 | gold quality |
| ganglionic eminence | UBERON:0004023 | 88.02 | gold quality |
| adrenal gland | UBERON:0002369 | 87.32 | gold quality |
| right frontal lobe | UBERON:0002810 | 87.30 | gold quality |
| temporal lobe | UBERON:0001871 | 86.37 | gold quality |
| nucleus accumbens | UBERON:0001882 | 86.37 | gold quality |
| caudate nucleus | UBERON:0001873 | 86.33 | gold quality |
| prefrontal cortex | UBERON:0000451 | 86.05 | gold quality |
| neocortex | UBERON:0001950 | 85.94 | gold quality |
| frontal cortex | UBERON:0001870 | 85.32 | gold quality |
| dorsolateral prefrontal cortex | UBERON:0009834 | 85.22 | gold quality |
| cerebral cortex | UBERON:0000956 | 85.05 | gold quality |
| telencephalon | UBERON:0001893 | 84.92 | gold quality |
| putamen | UBERON:0001874 | 84.89 | gold quality |
| Brodmann (1909) area 9 | UBERON:0013540 | 84.59 | gold quality |
| Ammon’s horn | UBERON:0001954 | 84.17 | gold quality |
| entorhinal cortex | UBERON:0002728 | 83.88 | gold quality |
| forebrain | UBERON:0001890 | 82.71 | gold quality |
| Brodmann (1909) area 46 | UBERON:0006483 | 82.57 | silver quality |
| embryo | UBERON:0000922 | 82.34 | gold quality |
Single-cell (SCXA)
Detected in 3 experiments; a significant marker in 2.
| Experiment | Marker? | Max mean expression |
|---|---|---|
| E-HCAD-10 | yes | 18.70 |
| E-ANND-3 | yes | 4.85 |
| E-MTAB-8060 | no | 226.76 |
Regulation
Is transcription factor: no
miRNA regulators (miRDB)
14 miRNAs target EMID1; all are listed, ranked by miRDB confidence (max_score). target_count is the total number of genes the miRNA targets; lower means more specific:
| miRNA | Max score | Avg score | miRNA target_count |
|---|---|---|---|
| HSA-MIR-4779 | 99.86 | 66.50 | 1583 |
| HSA-MIR-4728-5P | 99.85 | 69.39 | 4718 |
| HSA-MIR-6785-5P | 99.82 | 68.68 | 4428 |
| HSA-MIR-149-3P | 99.72 | 68.22 | 3963 |
| HSA-MIR-6883-5P | 99.69 | 68.05 | 3785 |
| HSA-MIR-5004-5P | 99.68 | 66.63 | 1294 |
| HSA-MIR-4649-3P | 99.56 | 66.90 | 1783 |
| HSA-MIR-4667-3P | 99.26 | 65.45 | 1608 |
| HSA-MIR-6773-3P | 98.17 | 65.51 | 1213 |
| HSA-MIR-6847-5P | 97.93 | 66.74 | 1808 |
| HSA-MIR-4638-3P | 97.90 | 65.75 | 905 |
| HSA-MIR-3085-5P | 97.72 | 65.43 | 544 |
| HSA-MIR-6793-3P | 97.66 | 65.78 | 1084 |
| HSA-MIR-5195-5P | 90.84 | 65.09 | 287 |
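Because a lower target_count means a more specific miRNA, the table can be re-ranked by specificity instead of by score. The sketch below does that and also picks the most specific miRNA among high scorers. The cutoff of 99 is a hypothetical choice for illustration, not a miRDB recommendation, and all names are illustrative.

```python
# Rows copied from the table above: (miRNA, max score, target_count).
MIRNAS = [
    ("HSA-MIR-4779", 99.86, 1583),
    ("HSA-MIR-4728-5P", 99.85, 4718),
    ("HSA-MIR-6785-5P", 99.82, 4428),
    ("HSA-MIR-149-3P", 99.72, 3963),
    ("HSA-MIR-6883-5P", 99.69, 3785),
    ("HSA-MIR-5004-5P", 99.68, 1294),
    ("HSA-MIR-4649-3P", 99.56, 1783),
    ("HSA-MIR-4667-3P", 99.26, 1608),
    ("HSA-MIR-6773-3P", 98.17, 1213),
    ("HSA-MIR-6847-5P", 97.93, 1808),
    ("HSA-MIR-4638-3P", 97.90, 905),
    ("HSA-MIR-3085-5P", 97.72, 544),
    ("HSA-MIR-6793-3P", 97.66, 1084),
    ("HSA-MIR-5195-5P", 90.84, 287),
]

# Most specific first: fewest total target genes.
by_specificity = sorted(MIRNAS, key=lambda row: row[2])
most_specific_three = [name for name, _, _ in by_specificity[:3]]

# Hypothetical cutoff: keep only very high-confidence predictions,
# then take the one with the fewest targets.
SCORE_CUTOFF = 99.0
high_confidence = [row for row in MIRNAS if row[1] >= SCORE_CUTOFF]
best_tradeoff = min(high_confidence, key=lambda row: row[2])[0]

print(most_specific_three)
print(best_tradeoff)
```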
Cross-species orthologs
2 orthologs
| Organism | Symbol | Gene ID |
|---|---|---|
| mus_musculus | Emid1 | ENSMUSG00000034164 |
| rattus_norvegicus | Emid1 | ENSRNOG00000033575 |
Paralogs (37): COL9A2 (ENSG00000049089), COL23A1 (ENSG00000050767), COL11A1 (ENSG00000060718), COL17A1 (ENSG00000065618), COL5A3 (ENSG00000080573), COL4A4 (ENSG00000081052), COL16A1 (ENSG00000084636), COL9A3 (ENSG00000092758), COL20A1 (ENSG00000101203), COL1A1 (ENSG00000108821), COL9A1 (ENSG00000112280), COL7A1 (ENSG00000114270), COL21A1 (ENSG00000124749), COL5A1 (ENSG00000130635), COL4A2 (ENSG00000134871), COL2A1 (ENSG00000139219), COL6A1 (ENSG00000142156), COL6A2 (ENSG00000142173), EDA (ENSG00000158813), COL26A1 (ENSG00000160963), COL1A2 (ENSG00000164692), COL3A1 (ENSG00000168542), COL4A3 (ENSG00000169031), COL22A1 (ENSG00000169436), COL24A1 (ENSG00000171502), COL18A1 (ENSG00000182871), COL4A1 (ENSG00000187498), COL4A5 (ENSG00000188153), COL25A1 (ENSG00000188517), COL27A1 (ENSG00000196739), COL13A1 (ENSG00000197467), COL4A6 (ENSG00000197565), COL11A2 (ENSG00000204248), COL5A2 (ENSG00000204262), COL15A1 (ENSG00000204291), COLQ (ENSG00000206561), COL28A1 (ENSG00000215018)
Protein
Protein identifiers
EMI domain-containing protein 1 — Q96A84 (reviewed: Q96A84)
Alternative names: Emilin and multimerin domain-containing protein 1
All UniProt accessions (7): B0QYK2, B0QYK3, B0QYK4, B0QYK5, F8WDX7, H0Y6W4, Q96A84
UniProt curated annotations
Subunit / interactions. Homo- or heteromers.
Subcellular location. Secreted. Extracellular space. Extracellular matrix.
Post-translational modifications. O-fucosylated at Thr-42 within the EMI domain by POFUT3 and POFUT4.
Miscellaneous. May be due to a competing acceptor splice site.
Isoforms (3)
| UniProt ID | Names | Canonical? |
|---|---|---|
| Q96A84-1 | 1 | yes |
| Q96A84-2 | 2 | |
| Q96A84-3 | 3 | |
RefSeq proteins (3): NP_001254824, NP_001397757, NP_597712* (*=MANE)
Domains & families (InterPro)
| ID | Name | Type |
|---|---|---|
| IPR008160 | Collagen | Repeat |
| IPR011489 | EMI_domain | Domain |
| IPR050392 | Collagen/C1q_domain | Family |
Pfam: PF01391, PF07546
UniProt features (23 total): compositionally biased region 8, glycosylation site 3, disulfide bond 3, domain 2, splice variant 2, region of interest 2, signal peptide 1, chain 1, sequence variant 1
Structure
Experimental structures (PDB)
0 structures.
Predicted structure (AlphaFold)
| Model | pLDDT | Fraction very-high |
|---|---|---|
| AF-Q96A84-F1 | 51.77 | 0.00 |
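The mean pLDDT can be read against the confidence bands commonly used by the AlphaFold Database (above 90 very high, 70 to 90 high, 50 to 70 low, below 50 very low). The sketch below applies those bands to the value in the table; the function name is illustrative, and a single mean says nothing about which regions of the model are reliable.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT value (0-100) to the usual AlphaFold confidence band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "high"
    if plddt > 50:
        return "low"
    return "very low"


# Mean pLDDT for AF-Q96A84-F1, copied from the table above.
mean_plddt = 51.77
band = plddt_band(mean_plddt)
print(f"AF-Q96A84-F1 mean pLDDT {mean_plddt} -> {band} confidence")
```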
Functional residue map
Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.
Disulfide bonds (3): 37–96, 62–68, 95–104
Glycosylation sites (3): 42, 51, 136
Function
Pathways and Gene Ontology
Reactome pathways
1 pathway
| ID | Pathway |
|---|---|
| R-HSA-5173105 | O-linked glycosylation |
MSigDB gene sets: 120 (top 15 shown):
GOCC_COLLAGEN_TRIMER, GGGTGGRR_PAX4_03, NKX62_Q2, MODULE_379, BLALOCK_ALZHEIMERS_DISEASE_UP, TGGNNNNNNKCCAR_UNKNOWN, HAND1E47_01, ZHANG_BREAST_CANCER_PROGENITORS_UP, MODULE_242, YYCATTCAWW_UNKNOWN, YAGI_AML_WITH_11Q23_REARRANGED, MODULE_104, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GAUSSMANN_MLL_AF4_FUSION_TARGETS_G_UP, MODULE_13
GO Biological Process (0):
GO Molecular Function (1): protein binding (GO:0005515)
GO Cellular Component (6): collagen trimer (GO:0005581), endoplasmic reticulum lumen (GO:0005788), Golgi apparatus (GO:0005794), extracellular matrix (GO:0031012), extracellular region (GO:0005576), endoplasmic reticulum (GO:0005783)
Reactome top-level categories
Rollup of top-1 pathways:
| Category | Pathways |
|---|---|
| Post-translational protein modification | 1 |
GO top-level categories
Rollup of top GO terms by namespace:
| Category | Terms |
|---|---|
| cytoplasm | 2 |
| endomembrane system | 2 |
| intracellular membrane-bounded organelle | 2 |
| binding | 1 |
| protein-containing complex | 1 |
| endoplasmic reticulum | 1 |
| intracellular organelle lumen | 1 |
| external encapsulating structure | 1 |
| cellular anatomical structure | 1 |
Protein interactions and networks
STRING
654 interactions, top by confidence (×1000):
| Protein A | Protein B | Partner UniProt | Score |
|---|---|---|---|
| EMID1 | SERPINH1 | P29043 | 833 |
| EMID1 | THOC2 | Q8NI27 | 565 |
| EMID1 | CD164L2 | Q6UWJ8 | 447 |
| EMID1 | MRPL45 | Q9BRJ2 | 441 |
| EMID1 | MMRN2 | Q9H8L6 | 441 |
| EMID1 | EMILIN3 | Q9NT22 | 411 |
| EMID1 | CCER2 | I3L3R5 | 405 |
| EMID1 | OR51D1 | Q8NGF3 | 380 |
| EMID1 | ZMAT5 | Q9UDW3 | 380 |
| EMID1 | VPREB1 | P12018 | 374 |
| EMID1 | TEX38 | Q6PEX7 | 368 |
| EMID1 | SERTAD3 | Q9UJW9 | 367 |
| EMID1 | POFUT2 | Q9Y2G5 | 363 |
| EMID1 | MYPOP | Q86VE0 | 349 |
| EMID1 | UBE2QL1 | A1L167 | 338 |
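STRING scores in the table are combined scores multiplied by 1000. A minimal sketch of converting them back to the 0 to 1 scale and applying STRING's usual confidence cutoffs (0.4 medium, 0.7 high, 0.9 highest) follows; the function and variable names are illustrative.

```python
# Partner -> combined score (x1000), copied from the table above.
STRING_SCORES = {
    "SERPINH1": 833, "THOC2": 565, "CD164L2": 447, "MRPL45": 441,
    "MMRN2": 441, "EMILIN3": 411, "CCER2": 405, "OR51D1": 380,
    "ZMAT5": 380, "VPREB1": 374, "TEX38": 368, "SERTAD3": 367,
    "POFUT2": 363, "MYPOP": 349, "UBE2QL1": 338,
}


def confidence_tier(score_x1000: int) -> str:
    """Classify a STRING combined score using the conventional cutoffs."""
    score = score_x1000 / 1000  # back to the 0-1 scale
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


tiers = {partner: confidence_tier(s) for partner, s in STRING_SCORES.items()}
medium_or_better = [p for p, tier in tiers.items() if tier != "low"]

for partner, tier in tiers.items():
    print(partner, STRING_SCORES[partner] / 1000, tier)
```

Under these cutoffs only SERPINH1 reaches high confidence among the partners listed.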
IntAct
10 interactions, top by confidence:
| A | B | Type | Score |
|---|---|---|---|
| AKTIP | EMID1 | physical association (MI:0915) | 0.370 |
| EMID1 | CDC5L | physical association (MI:0915) | 0.370 |
| KRT40 | ANKRD36 | association (MI:0914) | 0.350 |
| EMID1 | CRTAP | association (MI:0914) | 0.350 |
| EMID1 | POTEF | association (MI:0914) | 0.350 |
| KRT40 | NEURL1B | association (MI:0914) | 0.350 |
| NPTXR | ACACB | association (MI:0914) | 0.350 |
| EMID1 | NDUFS4 | association (MI:0914) | 0.350 |
| EMID1 | | physical association (MI:0915) | 0.000 |
BioGRID (40): KLHL25 (Affinity Capture-MS), UBR4 (Affinity Capture-MS), COLGALT2 (Affinity Capture-MS), YTHDF1 (Affinity Capture-MS), EMID1 (Synthetic Lethality), KLHL25 (Affinity Capture-MS), YTHDF1 (Affinity Capture-MS), CRTAP (Affinity Capture-MS), EMID1 (Affinity Capture-RNA), YTHDF1 (Affinity Capture-MS), P3H1 (Affinity Capture-MS), EMID1 (Affinity Capture-MS), CRTAP (Affinity Capture-MS), KLHL25 (Affinity Capture-MS), SDF2L1 (Affinity Capture-MS)
ESM2 similar proteins: A1XQX1, A6NFA1, B1ATG9, C0HL12, O09010, O12971, O12972, O15399, O54693, O75973, O88992, P48807, P52795, P52796, P98172, Q03391, Q13563, Q20FD0, Q28143, Q3UHD1, Q4GZT3, Q4ZJM9, Q5FVH0, Q5QQ50, Q5QQ51, Q5VWW1, Q62645, Q6NW40, Q6PFE7, Q6ZRP7, Q7TQ33, Q7Z5L3, Q86V40, Q86Z23, Q8CFR0, Q8IYR6, Q8IZP7, Q8K479, Q924T4, Q92838
Diamond homologs: A6H6E2, F1QC17, P59900, Q13201, Q8K482, Q91VF6, Q96A83, Q96A84, Q99K41, Q9BXX0, Q9H8L6, Q9NT22, Q9Y6C2, B2RPV6, Q5RJ80, Q8BVD7, Q91VF5, Q9BXJ2, Q9W332, Q05A80, Q6IMN6, Q8R066, Q9BXJ3, P83425, Q4ZJM7, Q5FVH0, Q8K479, Q9BXJ0, P27658, Q7Z5L3, Q8CFR0, A6NHN0, Q0II24
SIGNOR signaling
0 interactions.
Disease & clinical
Clinical variants and AI predictions
ClinVar
99 variants total. Per-class counts are floors (≥ shown; pagination cap):
| Classification | Count (floor) |
|---|---|
| Pathogenic | 0 |
| Likely pathogenic | 0 |
| Uncertain significance | 71 |
| Likely benign | 2 |
| Benign | 1 |
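Since the per-class counts are floors, they need not add up to the 99-variant total. The sketch below makes the gap explicit. Whether the remainder sits in classes not listed here or beyond the pagination cap is not stated on this page, so the code only reports the difference; names are illustrative.

```python
# Values copied from the ClinVar summary and table above.
CLINVAR_TOTAL = 99
CLASS_FLOORS = {
    "Pathogenic": 0,
    "Likely pathogenic": 0,
    "Uncertain significance": 71,
    "Likely benign": 2,
    "Benign": 1,
}

# Variants accounted for by the listed floors.
floor_sum = sum(CLASS_FLOORS.values())

# Variants in the total that the listed floors do not cover.
not_covered = CLINVAR_TOTAL - floor_sum

print(f"covered by listed floors: {floor_sum} of {CLINVAR_TOTAL}")
print(f"not covered: {not_covered}")
```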
Top pathogenic / likely-pathogenic (0)
SpliceAI
3155 predictions. Top by Δscore:
| Variant | Effect | Δscore |
|---|---|---|
| 22:29231016:TCA:T | acceptor_loss | 1.0000 |
| 22:29231018:A:AG | acceptor_gain | 1.0000 |
| 22:29231018:AGAT:A | acceptor_gain | 1.0000 |
| 22:29231018:AGATG:A | acceptor_loss | 1.0000 |
| 22:29231019:G:GC | acceptor_gain | 1.0000 |
| 22:29231019:GA:G | acceptor_gain | 1.0000 |
| 22:29231019:GAT:G | acceptor_gain | 1.0000 |
| 22:29231019:GATG:G | acceptor_gain | 1.0000 |
| 22:29231137:CAGGG:C | donor_loss | 1.0000 |
| 22:29231138:AGGGT:A | donor_loss | 1.0000 |
| 22:29231139:GG:G | donor_gain | 1.0000 |
| 22:29231140:GG:G | donor_gain | 1.0000 |
| 22:29231140:GGT:G | donor_loss | 1.0000 |
| 22:29231141:G:GG | donor_gain | 1.0000 |
| 22:29231141:GTG:G | donor_loss | 1.0000 |
| 22:29231380:G:GT | donor_gain | 1.0000 |
| 22:29231591:A:AG | acceptor_gain | 1.0000 |
| 22:29231592:G:GG | acceptor_gain | 1.0000 |
| 22:29233467:GG:G | donor_gain | 1.0000 |
| 22:29233468:GG:G | donor_gain | 1.0000 |
| 22:29233468:GGTA:G | donor_loss | 1.0000 |
| 22:29233469:G:GG | donor_gain | 1.0000 |
| 22:29233469:G:T | donor_loss | 1.0000 |
| 22:29233470:TAA:T | donor_loss | 1.0000 |
| 22:29233474:T:G | donor_gain | 1.0000 |
| 22:29233667:G:GG | donor_gain | 1.0000 |
| 22:29234196:GAGA:G | donor_gain | 1.0000 |
| 22:29234198:GA:G | donor_gain | 1.0000 |
| 22:29234200:G:GG | donor_gain | 1.0000 |
| 22:29234301:A:AG | acceptor_gain | 1.0000 |
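The Variant column appears to follow a `chrom:pos:ref:alt` layout. Assuming that reading is right, the sketch below splits a variant string into fields and labels it as an SNV, insertion, or deletion by comparing allele lengths. `Variant` and `parse_variant` are hypothetical helpers, not part of SpliceAI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def kind(self) -> str:
        """Classify the variant by comparing ref and alt allele lengths."""
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        if len(self.ref) > len(self.alt):
            return "deletion"
        if len(self.ref) < len(self.alt):
            return "insertion"
        return "substitution"  # equal-length multi-base change


def parse_variant(text: str) -> Variant:
    """Parse 'chrom:pos:ref:alt' (ASSUMED layout of the Variant column)."""
    chrom, pos, ref, alt = text.split(":")
    return Variant(chrom, int(pos), ref, alt)


# Three rows from the table above, one per variant kind.
examples = ["22:29231016:TCA:T", "22:29231018:A:AG", "22:29233469:G:T"]
for text in examples:
    variant = parse_variant(text)
    print(text, variant.pos, variant.kind)
```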
AlphaMissense
2800 scored. Top likely-pathogenic:
dbSNP variants (sampled 300 via entrez): RS1000000982 (22:29235993 T>C), RS1000050567 (22:29252404 T>C,G), RS1000054484 (22:29244627 A>G), RS1000109706 (22:29244870 CT>C), RS1000110682 (22:29242977 T>C), RS1000290174 (22:29224399 T>G), RS1000297295 (22:29215911 A>G), RS1000304987 (22:29221209 C>T), RS1000384261 (22:29229902 T>C), RS1000425545 (22:29224260 G>A), RS1000469011 (22:29215744 G>A,C), RS1000530894 (22:29247527 T>C), RS1000558132 (22:29240994 G>C), RS1000560722 (22:29204617 T>C), RS1000584818 (22:29219753 G>C)
Disease associations
OMIM: gene MIM:608926 | disease phenotypes:
GenCC curated gene-disease
Mondo (0):
Orphanet (0):
HPO phenotypes
0 total (0 of 0 shown, HPO-id order):
GWAS associations
24 associations (top):
| Study | Trait | p-value |
|---|---|---|
| GCST001937_28 | Breast cancer | 3.000000e-09 |
| GCST002553_2 | Pancreatic cancer | 1.000000e-08 |
| GCST004611_168 | High light scatter reticulocyte count | 7.000000e-13 |
| GCST004612_189 | High light scatter reticulocyte percentage of red cells | 3.000000e-12 |
| GCST004619_18 | Reticulocyte fraction of red cells | 1.000000e-10 |
| GCST004622_95 | Reticulocyte count | 5.000000e-11 |
| GCST004625_196 | Monocyte count | 5.000000e-10 |
| GCST005580_24 | Intraocular pressure | 3.000000e-14 |
| GCST006394_62 | Intraocular pressure | 5.000000e-17 |
| GCST006412_131 | Intraocular pressure | 6.000000e-16 |
| GCST006479_20 | Diverticular disease | 8.000000e-06 |
| GCST009725_15 | Intraocular pressure | 2.000000e-17 |
| GCST011742_70 | Triglyceride levels in HIV infection | 6.000000e-06 |
| GCST90002385_575 | High light scatter reticulocyte count | 5.000000e-26 |
| GCST90002386_530 | High light scatter reticulocyte percentage of red cells | 1.000000e-24 |
| GCST90002388_266 | Lymphocyte count | 2.000000e-17 |
| GCST90002393_588 | Monocyte count | 2.000000e-22 |
| GCST90002397_595 | Mean spheric corpuscular volume | 2.000000e-10 |
| GCST90002398_26 | Neutrophil count | 2.000000e-12 |
| GCST90002400_504 | Plateletcrit | 2.000000e-15 |
| GCST90002402_643 | Platelet count | 1.000000e-17 |
| GCST90002405_412 | Reticulocyte count | 3.000000e-24 |
| GCST90002406_565 | Reticulocyte fraction of red cells | 9.000000e-23 |
| GCST90002407_195 | White blood cell count | 6.000000e-19 |
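Not every listed association clears the conventional genome-wide significance threshold of p < 5e-8. The sketch below applies that threshold to the table and finds the strongest signal. The threshold is the usual convention, not something this page specifies, and the variable names are illustrative.

```python
# Rows copied from the table above: (study, trait, p-value).
GWAS = [
    ("GCST001937_28", "Breast cancer", 3e-09),
    ("GCST002553_2", "Pancreatic cancer", 1e-08),
    ("GCST004611_168", "High light scatter reticulocyte count", 7e-13),
    ("GCST004612_189", "High light scatter reticulocyte percentage of red cells", 3e-12),
    ("GCST004619_18", "Reticulocyte fraction of red cells", 1e-10),
    ("GCST004622_95", "Reticulocyte count", 5e-11),
    ("GCST004625_196", "Monocyte count", 5e-10),
    ("GCST005580_24", "Intraocular pressure", 3e-14),
    ("GCST006394_62", "Intraocular pressure", 5e-17),
    ("GCST006412_131", "Intraocular pressure", 6e-16),
    ("GCST006479_20", "Diverticular disease", 8e-06),
    ("GCST009725_15", "Intraocular pressure", 2e-17),
    ("GCST011742_70", "Triglyceride levels in HIV infection", 6e-06),
    ("GCST90002385_575", "High light scatter reticulocyte count", 5e-26),
    ("GCST90002386_530", "High light scatter reticulocyte percentage of red cells", 1e-24),
    ("GCST90002388_266", "Lymphocyte count", 2e-17),
    ("GCST90002393_588", "Monocyte count", 2e-22),
    ("GCST90002397_595", "Mean spheric corpuscular volume", 2e-10),
    ("GCST90002398_26", "Neutrophil count", 2e-12),
    ("GCST90002400_504", "Plateletcrit", 2e-15),
    ("GCST90002402_643", "Platelet count", 1e-17),
    ("GCST90002405_412", "Reticulocyte count", 3e-24),
    ("GCST90002406_565", "Reticulocyte fraction of red cells", 9e-23),
    ("GCST90002407_195", "White blood cell count", 6e-19),
]

GENOME_WIDE = 5e-8  # conventional threshold, an assumption here

significant = [row for row in GWAS if row[2] < GENOME_WIDE]
below_threshold = [trait for _, trait, p in GWAS if p >= GENOME_WIDE]
strongest = min(GWAS, key=lambda row: row[2])

print(len(significant), "of", len(GWAS), "pass p < 5e-8")
print("not genome-wide significant:", below_threshold)
print("strongest:", strongest)
```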
EFO canonical traits (9, from GWAS)
| EFO ID | Trait name |
|---|---|
| EFO:0007986 | reticulocyte count |
| EFO:0005091 | monocyte count |
| EFO:0004695 | intraocular pressure measurement |
| EFO:0009959 | diverticular disease |
| EFO:0004530 | triglyceride measurement |
| EFO:0004587 | lymphocyte count |
| EFO:0004833 | neutrophil count |
| EFO:0007985 | platelet crit |
| EFO:0004309 | platelet count |
Drugs & pharmacology
Drug and pharmacology data
Is drug target: no
PharmGKB: 1 entry (VIP=true, CPIC=false)
CTD chemical–gene interactions
18 total (human), top 18 by PubMed support.
| Chemical | Actions (top 5) | PubMed papers |
|---|---|---|
| Benzo(a)pyrene | affects methylation, increases methylation | 2 |
| methylmercuric chloride | decreases expression | 1 |
| aflatoxin B2 | increases methylation | 1 |
| CGP 52608 | affects binding, increases reaction | 1 |
| abrine | decreases expression | 1 |
| 2,3,5-trichloro-6-phenyl-(1,4)benzoquinone | decreases expression | 1 |
| Sunitinib | decreases expression | 1 |
| Ethanol | increases expression | 1 |
| Silicon Dioxide | decreases expression | 1 |
| Tobacco Smoke Pollution | decreases expression | 1 |
| Tretinoin | decreases expression | 1 |
| Triclosan | increases expression | 1 |
| Urethane | decreases expression | 1 |
| Valproic Acid | decreases expression, increases methylation | 1 |
| Zinc | increases expression | 1 |
| Antirheumatic Agents | decreases expression | 1 |
| Zinc Sulfate | increases expression | 1 |
| Acrylamide | increases expression | 1 |
Clinical trials (associated diseases)
0 trials via MONDO — disease-level, not drug-specific.
Related Atlas pages
- Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): exocrine pancreatic carcinoma