INS Gene Complete Identifier and Functional Mapping Reference
Provide a comprehensive cross-database identifier and functional mapping reference for human INS — a definitive lookup resource covering: ### Section 1: Gene identifiers For human gene INS, list ALL gene-level database identifiers. Required: - HGNC ID and approved symbol - Ensembl gene ID (ENSG...) - NCBI Entrez Gene ID - OMIM gene/locus ID - Genomic location: chromosome, start position, end position, strand (GRCh38) ### Section 2: Transcript identifiers For human gene INS, list ALL transcript-level identifiers. Required: - Ensembl transcripts: ALL ENST IDs with biotype. Total count. - RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select. - CCDS IDs. - For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count. ### Section 3: Protein identifiers For human gene INS protein product(s), list ALL protein-level identifiers. Required: - UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry. - RefSeq protein: ALL NP_ accessions. - Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID. - Antibody availability: known antibody resources for the protein. ### Section 4: Structure For human gene INS protein, list ALL structural data. Required: - Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count. - Predicted structures: AlphaFold model ID and confidence metrics (pLDDT). ### Section 5: Cross-species orthologs For human gene INS, list orthologous genes in key model organisms. Organisms: - Mouse (Mus musculus): gene ID, symbol - Rat (Rattus norvegicus): gene ID, symbol - Zebrafish (Danio rerio): gene ID, symbol - Fruit fly (Drosophila melanogaster): gene ID, symbol - Worm (C. elegans): gene ID, symbol - Yeast (S. 
cerevisiae): gene ID, symbol ### Section 6: Clinical variants & AI predictions For human gene INS, summarize clinical variants and AI predictions. Clinical variant annotations (ClinVar): - Total variant count (approximate is fine) - Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign - TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition AI-based variant effect predictions: - Splice effect predictions: total count + TOP 30 with delta scores if known - Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores. ### Section 7: Pathways & Gene Ontology For human gene INS, list biological pathways and Gene Ontology annotations. Pathway membership: - ALL biological pathways this gene participates in, with pathway IDs and names - Total pathway count Gene Ontology: - Biological Process: count and TOP 20 terms with GO IDs - Molecular Function: count and TOP 20 terms with GO IDs - Cellular Component: count and TOP 20 terms with GO IDs ### Section 8: Protein interactions & networks For human gene INS protein, summarize protein interactions and networks. Protein-protein interactions (STRING, IntAct, BioGRID, etc.): - Total interaction count (approximate) - TOP 30 highest-confidence interacting proteins with scores/evidence Protein similarity: - Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores - Sequence homology: TOP 20 homologous proteins with identity/similarity ### Section 9: Transcription factor regulatory data For human gene INS, summarize transcription factor regulatory data. If INS is a transcription factor: - Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence - DNA binding motifs from JASPAR — all known motif IDs and motif family classification. 
Regardless: - Upstream regulators: TFs that regulate INS — names with evidence type (ChIP-seq / predicted / experimentally validated) If INS is not a transcription factor, say so briefly and skip the downstream/motif sections. ### Section 10: Drug & pharmacology data For human gene INS protein as a drug target, summarize pharmacology data. If INS is a known drug target: - Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase) - Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention - Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any If INS is not currently a drug target, say so briefly. ### Section 11: Expression profiles For human gene INS, summarize expression profiles. Tissue expression (GTEx, HPA, Bgee, etc.): - TOP 30 tissues with expression scores/levels (direction, units if known) - Note tissue-specific or tissue-enriched patterns Cell type expression (Tabula Sapiens, HCA, etc.): - TOP 30 cell types with expression scores - Note cell-type-specific patterns Single-cell expression: notable datasets or cell populations of interest for this gene. ### Section 12: Disease associations For human gene INS, summarize disease associations. Mendelian / monogenic disease: - Diseases caused by mutations in INS: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level - Include all directly linked conditions Phenotype associations: - Clinical phenotypes associated with the gene (HPO terms where known) - TOP 30 phenotype terms with HPO IDs Complex-disease / GWAS: - Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known - TOP 30 GWAS associations
Executive summary
INS (HGNC:6081) encodes human insulin, a pancreatic peptide hormone synthesized as a 110-amino acid precursor and the central regulator of glucose homeostasis. Expressed almost exclusively in pancreatic beta cells and islets of Langerhans (maximum Bgee expression score 100.00), insulin acts through the insulin receptor (INSR) to drive glucose uptake, glycogen synthesis, and lipid metabolism, and is annotated to 14 Reactome pathways. The gene harbors ~170 ClinVar variants, with pathogenic and likely pathogenic mutations causing a spectrum of monogenic diabetes syndromes including permanent neonatal diabetes mellitus, maturity-onset diabetes of the young type 10 (MODY10), and hyperproinsulinemia; GWAS links the locus to type 1 diabetes at p-values as low as 1e-196. Structurally, insulin is among the best-characterized human proteins, with 380 PDB entries spanning X-ray, NMR, and cryo-EM. Therapeutically, the gene product itself is the drug: five approved insulin-based biologics (including insulin glargine and insulin aspart) are together linked to over 1,700 clinical trial records.
Gene identifiers
| Identifier | Value |
|---|---|
| HGNC ID | HGNC:6081 |
| Approved symbol | INS |
| Ensembl gene ID | ENSG00000254647 |
| NCBI Entrez Gene ID | 3630 |
| OMIM gene ID | 176730 |
| Chromosome | 11 |
| Start position (GRCh38) | 2,159,779 |
| End position (GRCh38) | 2,161,221 |
| Strand | − (minus) |
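As a quick consistency check on the table above, the following sketch validates each identifier against a commonly seen accession shape and derives the locus span from the GRCh38 coordinates. The regular expressions and the names `ID_PATTERNS`, `is_valid`, and `gene_span` are illustrative assumptions, not part of any database API.

```python
import re

# Illustrative accession shapes (assumed conventions, not an official specification).
ID_PATTERNS = {
    "HGNC ID": r"HGNC:\d+",
    "Ensembl gene ID": r"ENSG\d{11}",
    "NCBI Entrez Gene ID": r"\d+",
    "OMIM gene ID": r"\d{6}",
}

# Values copied from the gene identifier table.
GENE_IDS = {
    "HGNC ID": "HGNC:6081",
    "Ensembl gene ID": "ENSG00000254647",
    "NCBI Entrez Gene ID": "3630",
    "OMIM gene ID": "176730",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the whole value matches the assumed pattern for its kind."""
    return re.fullmatch(ID_PATTERNS[kind], value) is not None


def gene_span(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    return end - start + 1


all_valid = all(is_valid(kind, value) for kind, value in GENE_IDS.items())
span_bp = gene_span(2_159_779, 2_161_221)
print(all_valid, span_bp)  # True 1443
```

With these coordinates the locus spans 1,443 bp on the minus strand of chromosome 11.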
Transcript identifiers
Ensembl Transcripts
| Transcript ID | Biotype |
|---|---|
| ENST00000250971 | protein_coding |
| ENST00000381330 | protein_coding |
| ENST00000397262 | protein_coding |
| ENST00000421783 | non_stop_decay |
| ENST00000512523 | protein_coding |
Total: 5 transcripts
RefSeq mRNA Accessions
| Accession | MANE Select |
|---|---|
| NM_000207 | ✓ Yes |
| NM_001185097 | No |
| NM_001185098 | No |
| NM_001291897 | No |
Total: 4 mRNA accessions
CCDS IDs
- CCDS7729
MANE SELECT / Canonical Transcript (ENST00000381330)
Exons: 3 total
| Exon ID | Start | End | Chromosome | Strand |
|---|---|---|---|---|
| ENSE00001938789 | 2161168 | 2161209 | 11 | − |
| ENSE00003494357 | 2160785 | 2160988 | 11 | − |
| ENSE00003901829 | 2159779 | 2159997 | 11 | − |
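The exon coordinates above can be turned into exon and intron lengths with simple interval arithmetic. This is a minimal sketch that assumes 1-based inclusive coordinates and that the rows are listed in transcript order (high to low on the minus strand); the variable names are illustrative.

```python
# Exons of ENST00000381330 as (exon ID, start, end), copied from the table.
# Assumed: 1-based inclusive coordinates, rows in 5'->3' transcript order.
EXONS = [
    ("ENSE00001938789", 2161168, 2161209),
    ("ENSE00003494357", 2160785, 2160988),
    ("ENSE00003901829", 2159779, 2159997),
]

# Inclusive interval length for each exon.
exon_lengths = [end - start + 1 for _, start, end in EXONS]

# On the minus strand the next exon lies at lower coordinates, so the intron
# runs from (next exon end + 1) to (current exon start - 1).
intron_lengths = [
    EXONS[i][1] - EXONS[i + 1][2] - 1 for i in range(len(EXONS) - 1)
]

# Genomic extent of the transcript, from the lowest start to the highest end.
transcript_span = EXONS[0][2] - EXONS[-1][1] + 1

print(exon_lengths, intron_lengths, transcript_span)
# [42, 204, 219] [179, 787] 1431
```

The exon and intron lengths sum to the transcript span (465 + 966 = 1,431 bp), which is a useful sanity check on the coordinates.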
Protein identifiers
UniProt Accessions
- P01308 (reviewed, canonical) — Insulin, human, 110 aa, 11,981 Da
RefSeq Protein (NP_ accessions)
- NP_000198 — canonical/MANE Select, human chromosome 11
- NP_001172026 — human chromosome 11
- NP_001172027 — human chromosome 11
- NP_001278826 — human chromosome 11
- NP_571131 — non-human cross-reference (maps to another species' insulin on chromosome 5); not a product of human INS
Protein Domains and Families
InterPro:
- IPR004825 — Insulin (Family)
- IPR016179 — Insulin-like (Domain)
- IPR022352 — Insulin/IGF/relaxin (Family)
- IPR022353 — Insulin, conserved site (Conserved_site)
- IPR036438 — Insulin-like superfamily (Homologous_superfamily)
Pfam:
- PF00049 — Insulin/IGF/relaxin (appears in 245 UniProt entries)
SMART:
- SM00078 — Insulin domain signature
Antibody Resources
No antibody annotations were found in biobtree via the >>uniprot>>antibody mapping. Antibodies against insulin, a small processed peptide hormone (110-aa precursor), are widely available from commercial and research reagent suppliers, but those catalogs are not indexed in the biobtree databases queried.
Structure
Experimental Structures: PDB
Total: 380 PDB entries (including all variants and complexes)
By Experimental Method:
X-ray Diffraction (~280 structures)
- Range: 0.92 Å (3W7Y, 2Zn human insulin at 100K) to 4.301 Å
- High-resolution structures (≤1.5 Å): 1MSO, 1G7A, 3W7Y, 1EV3, 5HQI, 5ENA, 5UDP, 3EXX, 3FQ9, 4P65, 7QGF, 7RKD, 6VER, 8OKY, 8PI4, 9M4X, 9M4Y, 9M50, and many others
- Medium-resolution structures (1.5-2.0 Å): 1ZEH, 1XDA, 3I3Z, 6O17, many insulin variants and complexes
- Lower-resolution structures (>2.0 Å): Complex structures with insulin receptor ectodomain, IDE, antibody complexes, and mutants
Solution NMR (~70 structures)
- Dimeric and monomeric forms
- Multiple conformations and pH conditions
- Insulin mutants and analogs
- Representative entries: 1A7F, 1AI0, 1AIY, 1EFE, 1HIS, 1HIQ, 1HIT, 1HUI, and many others
Electron Microscopy (Cryo-EM; ~20 structures)
- Full-length insulin receptor complexes: 6CE7 (7.4 Å), 6CE9 (4.3 Å), 6CEB (4.7 Å), 6SOF (4.3 Å)
- Insulin degrading enzyme (IDE) complexes: 6B3Q (3.7 Å), 6B70 (3.7 Å), 6BFC (3.7 Å)
- IGF-1R/insulin complex: 6JK8 (5.0 Å), 7V3P (3.6 Å)
- Insulin receptor bound states: 7BW7 (4.1 Å), 7BW8 (3.8 Å), 7SL1-7SL7, 8EYX-8EZ0, 8GUY (4.18 Å)
- Insulin fibrils: 8SBD (3.2 Å - amyloid-like fibril structure)
Powder Diffraction (3 structures)
- 1FU2, 1FUB (first protein structures determined from X-ray powder diffraction)
- 7QAC (T2 structure of polycrystalline cubic human insulin, 2.29 Å)
Solid-State NMR (1 structure)
- 8RVT (Full-length human insulin fibrils)
Predicted Structure: AlphaFold
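Model ID: AF-P01308-F1 (UniProt P01308)
Confidence (pLDDT): 53.19 (global score; low confidence overall)
Sequence Length: 110 amino acids (full preproinsulin precursor). The query returned a length of 839 amino acids, which does not match P01308 and is most likely a retrieval error.
Note: The low pLDDT is consistent with the flexible signal peptide and C-peptide segments of the precursor and with insulin's conformational heterogeneity (monomeric, dimeric, and hexameric states), both of which make confident structure prediction difficult.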
Cross-species orthologs
| Organism | Gene ID | Gene Symbol |
|---|---|---|
| Mouse (Mus musculus) | 16333, 16334 | Ins1, Ins2 |
| Rat (Rattus norvegicus) | 24505, 24506 | Ins1, Ins2 |
| Zebrafish (Danio rerio) | 30262, 566735 | ins, insb |
| Fruit fly (Drosophila melanogaster) | 39149, 39150 | Ilp1, Ilp2 |
| Worm (C. elegans) | none | — |
| Yeast (S. cerevisiae) | none | — |
Human INS (insulin) has clear orthologs in mammals and some in fish and insects, but C. elegans lacks a true insulin ortholog (though it has distantly-related insulin-like peptides), and S. cerevisiae lacks insulin entirely.
Clinical variants & AI predictions
ClinVar Summary
| Classification | Count |
|---|---|
| Benign | ~50 |
| Likely Benign | ~30 |
| Uncertain Significance (VUS) | ~45 |
| Likely Pathogenic | ~28 |
| Pathogenic | ~17 |
| Total | ~170 |
Top 30 Pathogenic/Likely Pathogenic Variants
| ClinVar ID | HGVS Notation | Classification |
|---|---|---|
| 1455986 | NM_000207.3(INS):c.1A>G (p.Met1Val) | Pathogenic/Likely pathogenic |
| 1457228 | NC_000011.9:g.(?\_2181023)\_(2193087\_?)del | Pathogenic |
| 1459937 | NC_000011.9:g.(?\_2181023)\_(2182533\_?)del | Pathogenic |
| 13378 | NM_000207.3(INS):c.143T>C (p.Phe48Ser) | Pathogenic |
| 13382 | NM_000207.3(INS):c.266G>T (p.Arg89Leu) | Pathogenic |
| 13383 | NM_000207.3(INS):c.266G>C (p.Arg89Pro) | Pathogenic |
| 13389 | NM_000207.3(INS):c.143T>G (p.Phe48Cys) | Likely pathogenic/Likely risk allele |
| 13392 | NM_000207.3(INS):c.163C>T (p.Arg55Cys) | Pathogenic/Likely pathogenic/Likely risk allele |
| 21122 | NM_000207.3(INS):c.94G>A (p.Gly32Ser) | Pathogenic/Likely pathogenic |
| 211186 | NM_000207.3(INS):c.188-31G>A | Pathogenic/Likely risk allele |
| 253331 | NM_000207.3(INS):c.125T>C (p.Val42Ala) | Pathogenic/Likely pathogenic |
| 431442 | NM_000207.3(INS):c.-152C>A | Pathogenic/Likely pathogenic |
| 431443 | NM_000207.3(INS):c.-152C>G | Pathogenic |
| 1162205 | NM_000207.3(INS):c.115C>T (p.Leu39Phe) | Likely pathogenic |
| 1336487 | NM_000207.3(INS):c.289A>C (p.Thr97Pro) | Likely pathogenic |
| 1338622 | NM_000207.3(INS):c.95G>T (p.Gly32Val) | Likely pathogenic |
| 1338640 | NM_000207.3(INS):c.103C>G (p.Leu35Val) | Likely pathogenic |
| 1526009 | NM_000207.3(INS):c.293G>T (p.Ser98Ile) | Likely pathogenic |
| 1526010 | NM_000207.3(INS):c.322T>G (p.Tyr108Asp) | Likely pathogenic |
| 1526012 | NM_000207.3(INS):c.101A>C (p.His34Pro) | Likely pathogenic |
| 1526013 | NM_000207.3(INS):c.103C>A (p.Leu35Met) | Likely pathogenic |
| 1801850 | NM_000207.3(INS):c.155C>G (p.Pro52Arg) | Likely pathogenic |
| 2630345 | NM_000207.3(INS):c.136C>T (p.Arg46Ter) | Likely pathogenic |
| 2631502 | NM_000207.3(INS):c.284G>A (p.Cys95Tyr) | Likely pathogenic |
| 3393374 | NM_000207.3(INS):c.283T>C (p.Cys95Arg) | Likely pathogenic |
| 36401 | NM_000207.3(INS):c.71C>T (p.Ala24Val) | Likely pathogenic |
| 3773933 | NM_000207.3(INS):c.129C>G (p.Cys43Trp) | Likely pathogenic |
| 65581 | NM_000207.3(INS):c.*59A>G | Likely pathogenic |
| 916729 | NM_000207.3(INS):c.174del (p.Glu59fs) | Likely pathogenic |
| 931331 | NM_001042376.3(INS-IGF2):c.155C>T (p.Pro52Leu) | Pathogenic |
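To cross-reference the HGVS entries above with one-letter protein variants, the protein-level change can be extracted with a regular expression. This is a minimal sketch: the pattern covers only the simple substitution, nonsense, and frameshift forms that appear in this table, and `parse_protein_change` and `short_form` are hypothetical helper names rather than functions from an HGVS library.

```python
import re
from typing import Optional, Tuple

# Standard three-letter to one-letter amino acid codes; "Ter" is the HGVS stop codon.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",
}

# Matches "(p.Arg89Pro)", "(p.Arg46Ter)" and "(p.Glu59fs)"; nothing more complex.
PROTEIN_CHANGE = re.compile(r"\(p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|fs)\)")


def parse_protein_change(hgvs: str) -> Optional[Tuple[str, int, str]]:
    """Return (reference residue, position, alternate) or None if absent."""
    match = PROTEIN_CHANGE.search(hgvs)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def short_form(hgvs: str) -> Optional[str]:
    """Convert e.g. 'p.Arg89Pro' to 'R89P'; non-coding variants give None."""
    parsed = parse_protein_change(hgvs)
    if parsed is None:
        return None
    ref, pos, alt = parsed
    alt_code = "fs" if alt == "fs" else AA3_TO_1[alt]
    return f"{AA3_TO_1[ref]}{pos}{alt_code}"


print(short_form("NM_000207.3(INS):c.266G>C (p.Arg89Pro)"))  # R89P
print(short_form("NM_000207.3(INS):c.-152C>A"))              # None
```

The one-letter form allows a direct lookup against protein-level tables: R89P, for instance, is also listed among the AlphaMissense predictions.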
AlphaMissense Pathogenicity Predictions
Missense likely-pathogenic variants: ~266 total
| Protein Variant | am_pathogenicity | Protein Variant | am_pathogenicity |
|---|---|---|---|
| C109W | 0.999 | C100W | 0.998 |
| C109F | 0.999 | C100R | 0.998 |
| C109Y | 0.999 | C100S | 0.998 |
| Y108C | 0.999 | Y103C | 0.920 |
| C109G | 0.983 | L105P | 0.997 |
| C109R | 0.996 | L105R | 0.983 |
| C109S | 0.998 | L102H | 0.961 |
| C96W | 0.997 | L102P | 0.940 |
| C96Y | 0.998 | Q104P | 0.957 |
| C96R | 0.995 | S101F | 0.979 |
| N110K | 0.970 | E106V | 0.972 |
| N110I | 0.950 | Y108D | 0.992 |
| N110Y | 0.835 | Y108H | 0.994 |
| C95W | 0.998 | C96F | 0.996 |
| C95R | 0.997 | C95F | 0.997 |
| Y108S | 0.985 | Y108N | 0.988 |
| E106D | 0.965 | E106K | 0.910 |
| L105Q | 0.991 | C100F | 0.997 |
| L105M | 0.797 | L105V | 0.814 |
| S101C | 0.960 | S101Y | 0.933 |
| C96S | 0.997 | E106A | 0.836 |
| C95Y | 0.997 | Q104H | 0.605 |
| S101P | 0.882 | N110D | 0.595 |
| S101A | 0.708 | R89P | 0.861 |
| L102R | 0.877 | G90C | 0.990 |
| C100G | 0.991 | C43Y | 0.964 |
| C95S | 0.998 | L39P | 0.962 |
| C95G | 0.976 | F48L | 0.990 |
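The scores above can be mapped to AlphaMissense classes with two cutoffs. This sketch assumes the published thresholds (below 0.34 likely benign, above 0.564 likely pathogenic, ambiguous in between); the function name `am_class` and the small sample taken from the table are illustrative.

```python
# Assumed AlphaMissense cutoffs: < 0.34 likely benign, > 0.564 likely pathogenic.
BENIGN_MAX = 0.34
PATHOGENIC_MIN = 0.564


def am_class(score: float) -> str:
    """Classify an am_pathogenicity score into one of three classes."""
    if score < BENIGN_MAX:
        return "likely_benign"
    if score > PATHOGENIC_MIN:
        return "likely_pathogenic"
    return "ambiguous"


# A few rows from the table, including the two lowest-scoring entries.
SAMPLE = {"C109W": 0.999, "L105M": 0.797, "Q104H": 0.605, "N110D": 0.595}

classes = {variant: am_class(score) for variant, score in SAMPLE.items()}
print(classes)
```

Under these cutoffs even the lowest-scoring entries listed (Q104H at 0.605 and N110D at 0.595) still fall in the likely-pathogenic class, although only just above the boundary.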
Splice Effect Predictions
SpliceAI predictions: 405 variants identified. Per-variant delta scores were not available in the dataset queried, so a ranked top 30 cannot be given.
Pathways & Gene Ontology
Reactome Pathways (14)
| Pathway ID | Pathway Name |
|---|---|
| R-HSA-210745 | Regulation of gene expression in beta cells |
| R-HSA-264876 | Insulin processing |
| R-HSA-422085 | Synthesis, secretion, and deacylation of Ghrelin |
| R-HSA-422356 | Regulation of insulin secretion |
| R-HSA-6807878 | COPI-mediated anterograde transport |
| R-HSA-6811558 | PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling |
| R-HSA-74713 | IRS activation |
| R-HSA-74749 | Signal attenuation |
| R-HSA-74751 | Insulin receptor signalling cascade |
| R-HSA-74752 | Signaling by Insulin receptor |
| R-HSA-77387 | Insulin receptor recycling |
| R-HSA-9615017 | FOXO-mediated transcription of oxidative stress, metabolic and neuronal genes |
| R-HSA-9768919 | NPAS4 regulates expression of target genes |
| R-HSA-977225 | Amyloid fiber formation |
MSigDB Gene Sets (100 total)
Includes Reactome pathways, KEGG pathways, and GO biological process/function sets.
Gene Ontology
Biological Process (58 terms) — Top 20:
| GO ID | Term |
|---|---|
| GO:0006006 | Glucose metabolic process |
| GO:0008286 | Insulin receptor signaling pathway |
| GO:0042593 | Glucose homeostasis |
| GO:0046628 | Positive regulation of insulin receptor signaling pathway |
| GO:0045821 | Positive regulation of glycolytic process |
| GO:0045725 | Positive regulation of glycogen biosynthetic process |
| GO:0045721 | Negative regulation of gluconeogenesis |
| GO:0046326 | Positive regulation of D-glucose import across plasma membrane |
| GO:0043410 | Positive regulation of MAPK cascade |
| GO:0045840 | Positive regulation of mitotic nuclear division |
| GO:0045597 | Positive regulation of cell differentiation |
| GO:0030307 | Positive regulation of cell growth |
| GO:0008284 | Positive regulation of cell population proliferation |
| GO:0030335 | Positive regulation of cell migration |
| GO:0001819 | Positive regulation of cytokine production |
| GO:0046889 | Positive regulation of lipid biosynthetic process |
| GO:0045922 | Negative regulation of fatty acid metabolic process |
| GO:0055089 | Fatty acid homeostasis |
| GO:0051897 | Positive regulation of phosphatidylinositol 3-kinase/protein kinase B signal transduction |
| GO:0010628 | Positive regulation of gene expression |
Molecular Function (6 terms):
| GO ID | Term |
|---|---|
| GO:0005179 | Hormone activity |
| GO:0005158 | Insulin receptor binding |
| GO:0005159 | Insulin-like growth factor receptor binding |
| GO:0048018 | Receptor ligand activity |
| GO:0042802 | Identical protein binding |
| GO:0002020 | Protease binding |
Cellular Component (9 terms):
| GO ID | Term |
|---|---|
| GO:0005576 | Extracellular region |
| GO:0005615 | Extracellular space |
| GO:0005788 | Endoplasmic reticulum lumen |
| GO:0005796 | Golgi lumen |
| GO:0000139 | Golgi membrane |
| GO:0034774 | Secretory granule lumen |
| GO:0033116 | Endoplasmic reticulum-Golgi intermediate compartment membrane |
| GO:0031904 | Endosome lumen |
| GO:0030133 | Transport vesicle |
Protein interactions & networks
Protein-Protein Interactions
Total Interaction Count (approximate):
- STRING: 11,095 interactions
- BioGRID: 518 interactions
- IntAct: 57 curated interactions
- Combined estimated total: ~11,670 interactions (simple sum of the three sources; overlapping pairs are not removed)
TOP 30 highest-confidence interacting proteins (STRING database):
| Rank | UniProt | Protein Name | STRING Score |
|---|---|---|---|
| 1 | P01343 | Insulin-like growth factor 1 (IGF1) | 999 |
| 2 | P02768 | Albumin (ALB) | 999 |
| 3 | P06213 | Insulin receptor (INSR) | 999 |
| 4 | P08069 | Insulin-like growth factor 1 receptor (IGF1R) | 999 |
| 5 | P35568 | Insulin receptor substrate 1 (IRS1) | 999 |
| 6 | P01275 | Pro-glucagon (GCG) | 997 |
| 7 | P17936 | Insulin-like growth factor-binding protein 3 (IGFBP3) | 995 |
| 8 | P08833 | Insulin-like growth factor-binding protein 1 (IGFBP1) | 994 |
| 9 | P11717 | Cation-independent mannose-6-phosphate receptor (IGF2R) | 990 |
| 10 | Q12988 | Heat shock protein beta-3 (HSPB3) | 990 |
| 11 | Q16270 | Insulin-like growth factor-binding protein 7 (IGFBP7) | 990 |
| 12 | P04629 | High affinity nerve growth factor receptor (NTRK1) | 989 |
| 13 | P41159 | Leptin (LEP) | 986 |
| 14 | Q9Y4H2 | Insulin receptor substrate 2 (IRS2) | 986 |
| 15 | P02144 | Myoglobin (MB) | 985 |
| 16 | P01344 | Insulin-like growth factor 2 (IGF2) | 982 |
| 17 | P18065 | Insulin-like growth factor-binding protein 2 (IGFBP2) | 982 |
| 18 | P14672 | Glucose transporter 4 (SLC2A4/GLUT4) | 978 |
| 19 | P10997 | Islet amyloid polypeptide (IAPP) | 970 |
| 20 | P14735 | Insulin-degrading enzyme (IDE) | 967 |
| 21 | Q15848 | Adiponectin (ADIPOQ) | 961 |
| 22 | P99999 | Cytochrome c (CYCS) | 953 |
| 23 | P01133 | Pro-epidermal growth factor (EGF) | 953 |
| 24 | P31749 | RAC-alpha serine/threonine-protein kinase (AKT1) | 953 |
| 25 | P61278 | Somatostatin (SST) | 949 |
| 26 | P01189 | Pro-opiomelanocortin (POMC) | 947 |
| 27 | P35557 | Hexokinase-4 (GCK) | 945 |
| 28 | P37231 | Peroxisome proliferator-activated receptor gamma (PPARG) | 944 |
| 29 | P01236 | Prolactin (PRL) | 941 |
| 30 | P11168 | Glucose transporter 2 (SLC2A2/GLUT2) | 941 |
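The scores in this table appear to be STRING combined scores on the 0-1000 integer scale. Assuming that, the sketch below rescales them to 0-1 and assigns the conventional STRING confidence bands (0.15 low, 0.4 medium, 0.7 high, 0.9 highest); the function name and the sample of partners are illustrative.

```python
def string_confidence(score: int) -> str:
    """Map a 0-1000 STRING combined score to a conventional confidence band."""
    fraction = score / 1000  # rescale to the 0-1 range
    if fraction >= 0.9:
        return "highest"
    if fraction >= 0.7:
        return "high"
    if fraction >= 0.4:
        return "medium"
    if fraction >= 0.15:
        return "low"
    return "below cutoff"


# A few partners copied from the table (gene symbol -> combined score).
PARTNERS = {"INSR": 999, "IDE": 967, "GCK": 945, "SLC2A2": 941}

bands = {gene: string_confidence(score) for gene, score in PARTNERS.items()}
print(bands)
```

Every partner in the top 30 scores at least 941, so all of them fall in the highest-confidence band under this scheme.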
Key IntAct high-confidence interactions (confidence score ≥0.6):
- INSR (Insulin Receptor) - score 0.700 - direct interaction
- IDE (Insulin-Degrading Enzyme) - score 0.620 - direct interaction & protein cleavage
- INS homodimers/oligomers - score 0.970 - direct interactions (multiple entries)
Protein Structural & Sequence Similarity
Structural Similarity (ESM2 Embedding - TOP 20 highest similarity):
| Rank | UniProt | Protein Name | ESM2 Similarity | Avg Similarity |
|---|---|---|---|---|
| 1 | P01308 | Insulin (self) | 1.0000 | 0.9679 |
| 2 | Q6YK33 | Insulin homolog (mouse ortholog) | 1.0000 | 0.9679 |
| 3 | P30406 | Insulin-like peptide | 0.9998 | 0.9647 |
| 4 | P30407 | Insulin-like peptide | 0.9998 | 0.9658 |
| 5 | Q8HXV2 | Insulin (species variant) | 0.9996 | 0.9680 |
| 6 | P01313 | Insulin (species variant) | 0.9980 | 0.9795 |
| 7 | P01311 | Insulin (species variant) | 0.9945 | 0.9756 |
| 8 | P51463 | Insulin-related peptide | 0.9961 | 0.9700 |
| 9 | O55232 | Insulin (rodent ortholog) | 0.9986 | 0.9362 |
| 10 | O55241 | Insulin (rodent ortholog) | 0.9986 | 0.9371 |
| 11 | P01322 | Insulin (species variant) | 0.9989 | 0.9787 |
| 12 | P01323 | Insulin (species variant) | 0.9991 | 0.9797 |
| 13 | P01326 | Insulin (species variant) | 0.9991 | 0.9792 |
| 14 | P67970 | Insulin (species variant) | 0.9961 | 0.9686 |
| 15 | P67972 | Insulin (species variant) | 0.9970 | 0.9773 |
| 16 | Q62587 | Insulin (rodent ortholog) | 0.9979 | 0.9793 |
| 17 | P69045 | Insulin (species variant) | 0.9988 | 0.9753 |
| 18 | P81025 | Insulin (species variant) | 0.9969 | 0.9743 |
| 19 | P01321 | Insulin (species variant) | 0.9973 | 0.9721 |
| 20 | P06306 | Insulin (species variant) | 0.9969 | 0.9726 |
Sequence Homology (DIAMOND - TOP 20 highest bitscore):
| Rank | UniProt | Identity (%) | Bitscore |
|---|---|---|---|
| 1 | P17085 | 92.0 | 347.0 |
| 2 | P10764 | 96.1 | 340.0 |
| 3 | P23695 | 92.2 | 315.0 |
| 4 | P33712 | 97.5 | 306.0 |
| 5 | P18254 | 100.0 | 268.0 |
| 6 | P51462 | 100.0 | 258.0 |
| 7 | P16501 | 93.2 | 254.0 |
| 8 | P41694 | 93.8 | 251.0 |
| 9 | P30407 | 99.1 | 227.0 |
| 10 | P30406 | 99.1 | 226.0 |
| 11 | P01308 | 100.0 | 226.0 |
| 12 | P30410 | 98.2 | 224.0 |
| 13 | P01322 | 93.6 | 215.0 |
| 14 | P01335 | 90.7 | 211.0 |
| 15 | P01315 | 91.3 | 189.0 |
| 16 | P04667 | 78.2 | 174.0 |
| 17 | P01310 | 91.3 | 158.0 |
| 18 | P01333 | 84.8 | 149.0 |
| 19 | P01314 | 100.0 | 112.0 |
| 20 | P67971 | 93.3 | 109.0 |
Transcription factor regulatory data
INS is not a transcription factor. It encodes insulin, a 110-amino acid peptide hormone (11.98 kDa), not a DNA-binding transcription factor.
Upstream regulators
INS is regulated by 139 transcription factors. Top regulators identified in CollecTRI:
| TF | Regulation | Confidence |
|---|---|---|
| PDX1 | Activation | High |
| HNF1A | Activation | High |
| HNF1B | Activation | High |
| ISL1 | Activation | High |
| NEUROG3 | Activation | High |
| CDX2 | Activation | High |
| MAFB | Activation | High |
| MAFA | Unknown | High |
| MAF | Activation | High |
| KLF11 | Activation | High |
| ESR1 | Activation | High |
| NR1H4 | Activation | High |
| ATF2 | Activation | High |
| FOXA2 | Unknown | High |
| HNF4A | Unknown | High |
| STAT5B | Unknown | High |
| CREB1 | Unknown | High |
| CREM | Unknown | High |
| GLIS3 | Unknown | High |
| NKX6-1 | Unknown | High |
| NEUROD1 | Unknown | High |
| FOXO1 | Unknown | High |
| SOX9 | Unknown | High |
| SOX6 | Unknown | High |
| SREBF1 | Unknown | High |
| SRF | Unknown | High |
| TCF3 | Unknown | High |
| JUN | Repression | High |
| AP1 | Repression | High |
| ATF6 | Repression | High |
| PAX6 | Repression | High |
| PAX4 | Repression | High |
| NR0B2 | Repression | High |
| PPARG | Repression | High |
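Evidence type: CollecTRI, a curated collection of signed TF-target interactions compiled from literature-based and other regulatory resources. The per-interaction evidence class (ChIP-seq, predicted, or experimentally validated) is not broken out here.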
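The direction of regulation in the table above can be tallied in a few lines. This sketch simply transcribes the 34 listed regulators; the grouping into three lists is for illustration only.

```python
from collections import Counter

# Regulators transcribed from the upstream regulator table, grouped by direction.
ACTIVATORS = [
    "PDX1", "HNF1A", "HNF1B", "ISL1", "NEUROG3", "CDX2",
    "MAFB", "MAF", "KLF11", "ESR1", "NR1H4", "ATF2",
]
UNKNOWN = [
    "MAFA", "FOXA2", "HNF4A", "STAT5B", "CREB1", "CREM", "GLIS3", "NKX6-1",
    "NEUROD1", "FOXO1", "SOX9", "SOX6", "SREBF1", "SRF", "TCF3",
]
REPRESSORS = ["JUN", "AP1", "ATF6", "PAX6", "PAX4", "NR0B2", "PPARG"]

# Build a TF -> direction mapping, then count each direction.
regulation = {tf: "Activation" for tf in ACTIVATORS}
regulation.update({tf: "Unknown" for tf in UNKNOWN})
regulation.update({tf: "Repression" for tf in REPRESSORS})

tally = Counter(regulation.values())
print(len(regulation), dict(tally))
# 34 {'Activation': 12, 'Unknown': 15, 'Repression': 7}
```

Of the 34 regulators listed (out of 139 reported), 12 are annotated as activators, 7 as repressors, and 15 have no assigned direction.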
Drug & pharmacology data
INS as a drug target: INS is not a traditional drug target, meaning a protein that small molecules or biologics bind to and modulate. Instead, the INS gene product (insulin) is itself administered as a therapeutic protein, which is a distinct pharmacological role.
Insulin-based therapeutics in ChEMBL
Total count: 5+ insulin-based molecules identified
Top molecules by development phase (all Phase 4/approved):
| Molecule ID | Name | Type | Phase | Clinical Trials |
|---|---|---|---|---|
| CHEMBL1201631 | Insulin human | Protein | 4 | 603 |
| CHEMBL1201497 | Insulin glargine | Protein analog | 4 | 510 |
| CHEMBL1201496 | Insulin aspart | Protein analog | 4 | 294 |
| CHEMBL2104391 | Insulin detemir | Protein analog | 4 | 185 |
| CHEMBL1201538 | Insulin lispro | Protein analog | 4 | 154 |
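The per-molecule trial counts can be summed to reproduce the "over 1,700" figure quoted in the executive summary. This is a simple sketch; it assumes the counts are independent, whereas a trial that involves two insulin products would be counted twice, so the total is best read as an upper bound.

```python
# Clinical trial counts per molecule, copied from the table above.
TRIALS = {
    "Insulin human": 603,
    "Insulin glargine": 510,
    "Insulin aspart": 294,
    "Insulin detemir": 185,
    "Insulin lispro": 154,
}

total_trials = sum(TRIALS.values())

# Share of the summed total contributed by each molecule, in percent.
shares = {name: round(100 * n / total_trials, 1) for name, n in TRIALS.items()}

print(total_trials)             # 1746
print(shares["Insulin human"])  # 34.5
```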
Clinical trials (top 20 involving insulin therapeutics)
- NCT03152890 — Insulin Therapy for Postreperfusion Hyperglycemia — Phase 4 — RECRUITING
- NCT03143816 — Comparing Prandial Insulin Aspart vs. Technosphere Insulin in Type 1 Diabetes — Phase 4 — COMPLETED
- NCT03112538 — Improving Glycaemic Control in Malaysian Type 2 Diabetes Patients With Insulin Pump — Phase 4 — UNKNOWN
- NCT03029702 — Metabolic Analysis for Treatment Choice in Gestational Diabetes — Phase 4 — COMPLETED
- NCT03013985 — Glargine U300 Hospital Trial — Phase 4 — COMPLETED
- NCT03011008 — Liraglutide as Additional Treatment to Insulin in Autoimmune Diabetes — Phase 4 — UNKNOWN
- NCT02941367 — Lyxumia (Lixisenatide) and Sulfonylurea as Add-on to Basal Insulin — Phase 4 — COMPLETED
- NCT02914886 — Beneficial Effect of Insulin Glulisine in Lipoatrophy and Type 1 Diabetes — Phase 4 — COMPLETED
- NCT02906709 — Omarigliptin Add-on to Insulin in Type 2 Diabetes (Japanese cohort) — Phase 4 — COMPLETED
- NCT02885909 — Inpatient Blood Glucose Control Trial — Phase 4 — UNKNOWN
- NCT02847091 — Ipragliflozin in Type 2 Diabetes With Insulin Therapy — Phase 4 — COMPLETED
- NCT02811484 — Exenatide-LAR and Dapagliflozin in Obese Insulin-Treated Type 2 Diabetes — Phase 4 — WITHDRAWN
- NCT02806999 — Therapeutic Effects of Insulin and Berberine on Stress Hyperglycemia — Phase 4 — UNKNOWN
- NCT02765347 — Effects of Metformin on Glycemic Control in Adolescents With Type 1 Diabetes — Phase 4 — COMPLETED
- NCT02758522 — NPH and Regular Insulin in Inpatient Hyperglycemia: 3 Basal-bolus Regimens — Phase 4 — COMPLETED
- NCT02706119 — Insulin in Total Parenteral Nutrition — Phase 4 — COMPLETED
- NCT02680054 — Additional Insulin for Fat and Protein in Children With Diabetes — Phase 4 — TERMINATED
- NCT02622113 — Long-Term Safety of Canagliflozin (TA-7284) With Insulin in Type 2 Diabetes — Phase 4 — COMPLETED
- NCT02621489 — Effects on Re-endothelialisation With Bydureon Treatment in Type 2 Diabetes — Phase 4 — COMPLETED
- NCT02590016 — Glucose Control During Labour in Gestational Diabetes With Insulin — Phase 4 — UNKNOWN
Pharmacogenomics
Limited drug-gene interaction data exists for INS in current databases. No pharmacogenomic biomarkers or dosing guidelines tied to INS genetic variants are documented in PharmGKB. Variation in insulin response is attributed mainly to non-genetic factors (diet, exercise, body composition, disease stage) rather than to common INS polymorphisms. Rare INS mutations cause monogenic diabetes (neonatal diabetes and maturity-onset diabetes of the young, MODY), but these are disease-causing variants rather than pharmacogenetic markers of insulin response.
Expression profiles
Tissue Expression (Bgee)
Bgee reports expression calls for INS across 276 conditions, with a maximum expression score of 100.00. Expression is highest in pancreatic tissues and beta cells; lower scores are reported in other endocrine and some non-pancreatic tissues, several of which carry Absent calls.
| Rank | Tissue/Condition | Expression Status | Score | Quality |
|---|---|---|---|---|
| 1 | Type B pancreatic cell | Present | 100.00 | Gold |
| 2 | Islet of Langerhans | Present | 99.96 | Gold |
| 3 | Body of pancreas | Present | 99.78 | Gold |
| 4 | Pancreas | Present | 99.05 | Gold |
| 5 | Epithelial cell of pancreas | Present | 80.80 | Gold |
| 6 | Right lobe of liver | Present | 64.33 | Gold |
| 7 | Triceps brachii | Absent | 64.27 | Gold |
| 8 | Gluteal muscle | Absent | 64.12 | Gold |
| 9 | Right adrenal gland | Present | 63.29 | Gold |
| 10 | Left adrenal gland | Present | 62.29 | Gold |
| 11 | Olfactory bulb | Absent | 62.27 | Gold |
| 12 | Left adrenal gland cortex | Present | 61.07 | Gold |
| 13 | Tongue squamous epithelium | Absent | 61.07 | Gold |
| 14 | Right adrenal gland cortex | Present | 60.43 | Gold |
| 15 | Adrenal cortex | Present | 59.95 | Gold |
| 16 | Ectocervix | Present | 59.63 | Gold |
| 17 | Descending thoracic aorta | Present | 59.02 | Gold |
| 18 | Vastus lateralis | Absent | 58.39 | Gold |
| 19 | Adrenal gland | Present | 58.33 | Gold |
| 20 | Oocyte | Absent | 58.03 | Gold |
| 21 | Left uterine tube | Present | 57.44 | Gold |
| 22 | Quadriceps femoris | Absent | 57.41 | Gold |
| 23 | Right coronary artery | Present | 55.87 | Gold |
| 24 | Lower esophagus mucosa | Present | 55.80 | Gold |
| 25 | Myocardium | Absent | 55.26 | Gold |
| 26 | Endocervix | Present | 54.73 | Gold |
| 27 | Diaphragm | Absent | 54.62 | Gold |
| 28 | Fundus of stomach | Present | 54.26 | Gold |
| 29 | Lateral nuclear group of thalamus | Absent | 54.02 | Gold |
| 30 | Substantia nigra | Present | 53.69 | Gold |
Key patterns: Highest expression in pancreatic beta cells and islets (scores ~100), followed by other pancreatic tissues (~80-99). Adrenal gland shows secondary expression (~58-63). Skeletal muscle, brain regions, and myocardium show absence calls despite detected signal.
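Because the expression score and the Present/Absent call are separate columns, ranking by score alone can surface tissues where INS is called absent. The sketch below filters the first eight rows of the table by call status; the `BgeeCall` tuple and the score cutoff of 60 are illustrative assumptions.

```python
from typing import NamedTuple


class BgeeCall(NamedTuple):
    tissue: str
    status: str  # "Present" or "Absent", as given in the table
    score: float


# First eight rows of the Bgee table.
CALLS = [
    BgeeCall("Type B pancreatic cell", "Present", 100.00),
    BgeeCall("Islet of Langerhans", "Present", 99.96),
    BgeeCall("Body of pancreas", "Present", 99.78),
    BgeeCall("Pancreas", "Present", 99.05),
    BgeeCall("Epithelial cell of pancreas", "Present", 80.80),
    BgeeCall("Right lobe of liver", "Present", 64.33),
    BgeeCall("Triceps brachii", "Absent", 64.27),
    BgeeCall("Gluteal muscle", "Absent", 64.12),
]

# Tissues with a positive expression call.
present = [c.tissue for c in CALLS if c.status == "Present"]

# Tissues that rank highly by score but are nonetheless called absent.
absent_high = [c.tissue for c in CALLS if c.status == "Absent" and c.score > 60]

print(len(present), absent_high)
# 6 ['Triceps brachii', 'Gluteal muscle']
```

Filtering on the call before ranking keeps skeletal muscle entries out of a "top expressing tissues" list even though their scores sit close to that of liver.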
Single-Cell Expression Datasets (scxa)
INS is characterized in the following single-cell transcriptomics studies:
| Dataset ID | Study | Species | Cell Count |
|---|---|---|---|
| E-ENAD-27 | Single cell transcriptomics defines human islet cell signatures and reveals cell-type-specific expression changes in type 2 diabetes | Homo sapiens | 1,145 |
| E-GEOD-81547 | Single cell transcriptome analysis of human pancreas | Homo sapiens | 2,544 |
| E-GEOD-81608 | Single cell RNA-seq of human islet cells from non-diabetic and type II diabetes organ donors | Homo sapiens | 1,600 |
| E-GEOD-83139 | Single cell RNA-seq of human pancreatic endocrine cells from juvenile, adult control, and type 2 diabetic donors | Homo sapiens | 635 |
| E-HCAD-31 | Massively parallel single-cell RNA-seq analysis of pancreatic islet cells from healthy and type II diabetic donors | Homo sapiens | 38,217 |
| E-MTAB-10137 | Unraveling transcriptomic heterogeneity in human dermal blood vascular endothelium at single-cell resolution | Homo sapiens | 1,523 |
| E-MTAB-5061 | Single-cell RNA-seq analysis of human pancreas from healthy individuals and type 2 diabetes patients | Homo sapiens | 3,386 |
Notable patterns: INS expression is predominantly detected in pancreatic islet beta cells. The largest dataset (E-HCAD-31, 38,217 cells) provides comprehensive cell-type signatures. Multiple studies compare healthy vs. type 2 diabetes donors, highlighting altered INS expression in diabetes. Unexpected detection in dermal vascular endothelium (E-MTAB-10137) suggests potential extra-pancreatic expression in endothelial populations.
Disease associations
Mendelian / Monogenic Disease
| Disease | Disease IDs | Inheritance | Evidence Level |
|---|---|---|---|
| Diabetes mellitus, permanent neonatal 4 | OMIM:618858, MONDO:0030089, MONDO:0100164 | Autosomal dominant, Autosomal recessive | Strong (Genomics England PanelApp, Labcorp), Moderate (Ambry Genetics) |
| Transient neonatal diabetes mellitus | MONDO:0020525 | Autosomal dominant, Autosomal recessive | Strong (Genomics England PanelApp) |
| Permanent neonatal diabetes mellitus | Orphanet:99885, MONDO:0100164 | Autosomal dominant | Strong (Genomics England PanelApp, Labcorp), Supportive (Orphanet) |
| Type 1 diabetes mellitus 2 | OMIM:125852, MONDO:0007454 | Autosomal dominant | Strong (Genomics England PanelApp) |
| Maturity-onset diabetes of the young type 10 | OMIM:613370, MONDO:0013240, Orphanet:552 | Autosomal dominant | Strong (Genomics England PanelApp, Labcorp), Supportive (Orphanet) |
| Hyperproinsulinemia | OMIM:616214, MONDO:0014535 | Autosomal dominant | Strong (Genomics England PanelApp), Limited (Labcorp) |
| Maturity-onset diabetes of the young | Orphanet:552 | Autosomal dominant | Supportive (Orphanet) |
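Because the Disease IDs column mixes several namespaces (OMIM, MONDO, Orphanet) in one cell, it can help to split each cell by prefix before cross-referencing. The sketch below assumes every entry has the form `PREFIX:ID` separated by commas, as in the table; `group_by_namespace` is a hypothetical helper name, not a function from any ontology library.

```python
from collections import defaultdict

# Disease ID cells copied from a few rows of the table above.
disease_ids = {
    "Diabetes mellitus, permanent neonatal 4": "OMIM:618858, MONDO:0030089, MONDO:0100164",
    "Maturity-onset diabetes of the young type 10": "OMIM:613370, MONDO:0013240, Orphanet:552",
    "Hyperproinsulinemia": "OMIM:616214, MONDO:0014535",
}


def group_by_namespace(cell: str) -> dict[str, list[str]]:
    """Split a comma-separated cell of PREFIX:ID entries into {prefix: [ids]}."""
    grouped = defaultdict(list)
    for curie in cell.split(","):
        # Split on the first colon only, so the local ID is kept intact.
        prefix, local_id = curie.strip().split(":", 1)
        grouped[prefix].append(local_id)
    return dict(grouped)


for disease, cell in disease_ids.items():
    print(disease, group_by_namespace(cell))
```

Keeping the local IDs as strings preserves leading zeros in MONDO identifiers such as `0030089`.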
Phenotype Associations (Top 30 HPO Terms)
- HP:0000819 – Diabetes mellitus
- HP:0000857 – Neonatal insulin-dependent diabetes mellitus
- HP:0000842 – Hyperinsulinemia
- HP:0000831 – Insulin-resistant diabetes mellitus
- HP:0000825 – Hyperinsulinemic hypoglycemia
- HP:0004904 – Maturity-onset diabetes of the young
- HP:0100651 – Type I diabetes mellitus
- HP:0001998 – Neonatal hypoglycemia
- HP:0003074 – Hyperglycemia
- HP:0001952 – Glucose intolerance
- HP:0040217 – Elevated hemoglobin A1c
- HP:0030794 – Abnormal circulating C-peptide concentration
- HP:0040214 – Abnormal circulating insulin concentration
- HP:0030795 – Reduced C-peptide level
- HP:0040216 – Hypoinsulinemia
- HP:0008255 – Transient neonatal diabetes mellitus
- HP:0001953 – Diabetic ketoacidosis
- HP:0002919 – Ketonuria
- HP:0003076 – Glycosuria
- HP:0001944 – Dehydration
- HP:0001508 – Failure to thrive
- HP:0001518 – Small for gestational age
- HP:0001520 – Large for gestational age
- HP:0001511 – Intrauterine growth retardation
- HP:0025708 – Early young adult onset
- HP:0004924 – Abnormal oral glucose tolerance
- HP:0006274 – Reduced pancreatic beta cells
- HP:0001513 – Obesity
- HP:0025502 – Overweight
- HP:0001824 – Weight loss
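Complex Disease / GWAS Associations (Top 30)
Gene pairs written as "A - B" denote a variant mapped to the intergenic region between the two genes, following the GWAS Catalog convention. Rows are not sorted by p-value.
| Trait/Disease | Mapped Gene(s) | p-value | GWAS Catalog Study Accession |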
|---|---|---|---|
| Type 1 diabetes | INS-IGF2, INS | 1e-196 | GCST001191 |
| Type 1 diabetes | INS-IGF2, INS | 1e-160 | GCST010681 |
| Type 1 diabetes | INS-IGF2, INS | 1e-100 | GCST005536 |
| Type 1 diabetes | INS-IGF2, INS | 1e-18 | GCST007246 |
| Type 1 diabetes | INS - TH | 1e-13 | GCST009916 |
| Type 1 diabetes | INS-IGF2, IGF2, IGF2-AS | 8e-11 | GCST003097 |
| Type 1 diabetes | INS-IGF2, IGF2 | 4e-09 | GCST000054 |
| Type 1 diabetes | MIR4686 - ASCL2 | 2e-31 | GCST90000529 |
| Type 1 diabetes | IGF2, INS-IGF2, IGF2-AS | 1e-09 | GCST90000529 |
| Type 1 diabetes | H19 - IGF2 | 1e-09 | GCST90000529 |
| Type 1 diabetes in high risk HLA genotype individuals | INS - TH | 3e-07 | GCST006196 |
| Type 1 diabetes | INS-IGF2, IGF2 | 2e-07 | GCST000038 |
| Severe autoimmune type 2 diabetes | INS-IGF2, INS | 3e-07 | GCST90026412 |
| Type 1 diabetes autoantibodies in high risk HLA genotype individuals | INS - TH | 1e-07 | GCST006197 |
| Type 1 diabetes | INS | 1e-06 | GCST008377 |
| Type 2 diabetes | MIR4686 - ASCL2 | 1e-16 | GCST010118 |
| Type 2 diabetes | MIR4686 - ASCL2 | 4e-26 | GCST009379 |
| Type 2 diabetes | MIR4686 - ASCL2 | 3e-13 | GCST007847 |
| Type 2 diabetes | MIR4686 - ASCL2 | 2e-07 | GCST008114 |
| Type 2 diabetes | INS - TH | 1e-06 | GCST009379 |
| Type 2 diabetes | TH | 2e-08 | GCST008464 |
| Type 2 diabetes | H19 - IGF2 | 2e-08 | GCST009379 |
| Type 2 diabetes | IGF2 | 4e-08 | GCST009379 |
| Type 2 diabetes | FAM99B, LINC02708 | 2e-06 | GCST009379 |
| Prostate cancer | MIR4686 - ASCL2 | 3e-33 | GCST000488 |
| Birth weight | H19 - IGF2 | 7e-10 | GCST005146 |
| Celiac disease | H19 - IGF2 | 7e-06 | GCST002112 |
| Latent autoimmune diabetes vs. type 2 diabetes | INS-IGF2, INS | 1e-18 | GCST007246 |
| Type 1 diabetes autoantibodies (time to event) | INS - TH | 6e-06 | GCST006197 |
| Pediatric autoimmune diseases | INS-IGF2, IGF2, IGF2-AS | 8e-11 | GCST003097 |
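The table above mixes genome-wide significant and suggestive associations and is not ordered by significance. A minimal sketch of filtering and ranking the rows follows. It assumes the conventional genome-wide cutoff of 5e-8, which is not stated in the source, and uses only a handful of rows copied from the table; `rank_associations` is a hypothetical helper name.

```python
GENOME_WIDE_THRESHOLD = 5e-8  # conventional cutoff; an assumption, not taken from the table

# (trait, mapped genes, p-value string, study accession) copied from selected rows above.
rows = [
    ("Type 1 diabetes", "INS-IGF2, INS", "1e-196", "GCST001191"),
    ("Type 1 diabetes", "INS - TH", "1e-13", "GCST009916"),
    ("Prostate cancer", "MIR4686 - ASCL2", "3e-33", "GCST000488"),
    ("Type 2 diabetes", "IGF2", "4e-08", "GCST009379"),
    ("Type 1 diabetes", "INS", "1e-06", "GCST008377"),
    ("Celiac disease", "H19 - IGF2", "7e-06", "GCST002112"),
]


def rank_associations(rows, threshold=GENOME_WIDE_THRESHOLD):
    """Keep rows with p < threshold and sort them from most to least significant."""
    parsed = [(trait, genes, float(p), study) for trait, genes, p, study in rows]
    significant = [r for r in parsed if r[2] < threshold]
    return sorted(significant, key=lambda r: r[2])


ranked = rank_associations(rows)
for trait, genes, p, study in ranked:
    print(f"{p:.0e}\t{trait}\t{genes}\t{study}")
```

Applied to all 30 rows of the table, this cutoff would retain 20 associations and drop 10 suggestive ones (those with p-values between 1e-07 and 7e-06).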