ARSH
Summary
ARSH (arylsulfatase family member H, HGNC:32488) is a protein-coding gene on chromosome Xp22.33, encoding Arylsulfatase H (Q5FYA8).
Sulfatases, such as ARSH, hydrolyze sulfate esters from sulfated steroids, carbohydrates, proteoglycans, and glycolipids. They are involved in hormone biosynthesis, modulation of cell signaling, and degradation of macromolecules (Sardiello et al., 2005 [PubMed 16174644]).
Source: NCBI Gene 347527 — RefSeq curated summary.
At a glance
- Clinical variants (ClinVar): 122 total
- MANE Select transcript: NM_001011719
Identifiers
Gene identifiers
| Field | Value |
|---|---|
| HGNC ID | HGNC:32488 |
| Approved symbol | ARSH |
| Name | arylsulfatase family member H |
| Location | Xp22.33 |
| Locus type | gene with protein product |
| Status | Approved |
| Ensembl gene | ENSG00000205667 |
| Ensembl biotype | protein_coding |
| OMIM | 300586 |
| Entrez | 347527 |
Gene structure
Transcript identifiers
Ensembl transcripts: 1 (1 protein_coding): ENST00000381130
RefSeq mRNA: 1 (MANE Select): NM_001011719
CCDS: CCDS35198
Canonical transcript exons
ENST00000381130 — 9 exons
| Exon | Start | End |
|---|---|---|
| ENSE00001487595 | 3033018 | 3034111 |
| ENSE00001487597 | 3029247 | 3029368 |
| ENSE00001487599 | 3027313 | 3027475 |
| ENSE00001487600 | 3024021 | 3024155 |
| ENSE00001487602 | 3018534 | 3018670 |
| ENSE00001487603 | 3014970 | 3015393 |
| ENSE00001487606 | 3013047 | 3013172 |
| ENSE00001487608 | 3010030 | 3010151 |
| ENSE00001487610 | 3006546 | 3006704 |
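The exon table can be used to compute exon lengths and to check whether a genomic position falls inside a listed exon. The following is a minimal sketch, assuming the Start and End columns are 1-based inclusive coordinates on chromosome X (the page does not state the convention or the genome build); `exon_length` and `find_exon` are hypothetical helper names.

```python
# Exon coordinates copied from the table above as (exon ID, Start, End).
# Assumption: coordinates are 1-based and inclusive, so length = End - Start + 1.
EXONS = [
    ("ENSE00001487595", 3033018, 3034111),
    ("ENSE00001487597", 3029247, 3029368),
    ("ENSE00001487599", 3027313, 3027475),
    ("ENSE00001487600", 3024021, 3024155),
    ("ENSE00001487602", 3018534, 3018670),
    ("ENSE00001487603", 3014970, 3015393),
    ("ENSE00001487606", 3013047, 3013172),
    ("ENSE00001487608", 3010030, 3010151),
    ("ENSE00001487610", 3006546, 3006704),
]


def exon_length(start: int, end: int) -> int:
    """Length in bp of an exon with inclusive start and end coordinates."""
    return end - start + 1


def find_exon(pos: int):
    """Return the ID of the exon containing pos, or None if pos is not exonic."""
    for exon_id, start, end in EXONS:
        if start <= pos <= end:
            return exon_id
    return None


lengths = {exon_id: exon_length(start, end) for exon_id, start, end in EXONS}
total_exonic_bp = sum(lengths.values())

for exon_id, bp in lengths.items():
    print(f"{exon_id}\t{bp} bp")
print(f"Total exonic length: {total_exonic_bp} bp")
```

Under the inclusive-coordinate assumption, the nine exons sum to 2482 bp. A call such as `find_exon(3014975)` returns `ENSE00001487603`, which is a quick way to place a variant from the tables further down this page within the transcript.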
Expression profiles
Bgee: expression breadth broad, 25 present calls, max score 75.21.
Top tissues by expression
124 tissues total; the top 30 are shown, ranked by Bgee expression score (0-100, higher = more expressed):
| Tissue | Anatomy ID | Expression score | Quality |
|---|---|---|---|
| male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 75.21 | gold quality |
| gall bladder | UBERON:0002110 | 47.79 | gold quality |
| lower esophagus mucosa | UBERON:0035834 | 47.09 | silver quality |
| ganglionic eminence | UBERON:0004023 | 47.02 | gold quality |
| islet of Langerhans | UBERON:0000006 | 47.01 | gold quality |
| esophagus mucosa | UBERON:0002469 | 46.24 | gold quality |
| olfactory segment of nasal mucosa | UBERON:0005386 | 41.86 | gold quality |
| endometrium | UBERON:0001295 | 40.95 | silver quality |
| tonsil | UBERON:0002372 | 40.20 | gold quality |
| ventricular zone | UBERON:0003053 | 39.95 | gold quality |
| pancreas | UBERON:0001264 | 39.39 | gold quality |
| saliva-secreting gland | UBERON:0001044 | 38.07 | gold quality |
| minor salivary gland | UBERON:0001830 | 37.21 | silver quality |
| colonic epithelium | UBERON:0000397 | 37.20 | gold quality |
| urinary bladder | UBERON:0001255 | 36.84 | silver quality |
| cortical plate | UBERON:0005343 | 36.47 | gold quality |
| mucosa of transverse colon | UBERON:0004991 | 36.42 | gold quality |
| bone marrow cell | CL:0002092 | 36.16 | gold quality |
| sural nerve | UBERON:0015488 | 35.55 | gold quality |
| placenta | UBERON:0001987 | 35.17 | silver quality |
| body of pancreas | UBERON:0001150 | 34.82 | gold quality |
| skin of abdomen | UBERON:0001416 | 34.67 | gold quality |
| vagina | UBERON:0000996 | 34.39 | gold quality |
| esophagus | UBERON:0001043 | 33.94 | gold quality |
| bone marrow | UBERON:0002371 | 33.89 | gold quality |
| zone of skin | UBERON:0000014 | 33.68 | gold quality |
| skeletal muscle tissue | UBERON:0001134 | 33.38 | gold quality |
| muscle tissue | UBERON:0002385 | 33.23 | gold quality |
| thoracic mammary gland | UBERON:0005200 | 32.77 | silver quality |
| skin of leg | UBERON:0001511 | 32.71 | gold quality |
Single-cell (SCXA)
Detected in 1 experiment(s), a significant marker in 1.
| Experiment | Marker? | Max mean expression |
|---|---|---|
| E-ANND-3 | yes | 3.78 |
Regulation
Is transcription factor: no
Cross-species orthologs
11 orthologs
| Organism | Symbol | Gene ID |
|---|---|---|
| danio_rerio | sulf2a | ENSDARG00000018423 |
| danio_rerio | arsib | ENSDARG00000117075 |
| drosophila_melanogaster | CG18278 | FBGN0033836 |
| drosophila_melanogaster | CG7408 | FBGN0036765 |
| drosophila_melanogaster | CG7402 | FBGN0036768 |
| drosophila_melanogaster | Sulf1 | FBGN0040271 |
| drosophila_melanogaster | CG32191 | FBGN0052191 |
| drosophila_melanogaster | CG30059 | FBGN0260475 |
| caenorhabditis_elegans | | WBGENE00006308 |
| caenorhabditis_elegans | | WBGENE00006309 |
| caenorhabditis_elegans | | WBGENE00006310 |
Paralogs (16): ARSD (ENSG00000006756), IDS (ENSG00000010404), ARSF (ENSG00000062096), ARSA (ENSG00000100299), STS (ENSG00000101846), ARSB (ENSG00000113273), GNS (ENSG00000135677), SULF1 (ENSG00000137573), GALNS (ENSG00000141012), ARSG (ENSG00000141337), ARSL (ENSG00000157399), ARSK (ENSG00000164291), ARSJ (ENSG00000180801), SGSH (ENSG00000181523), ARSI (ENSG00000183876), SULF2 (ENSG00000196562)
Protein
Protein identifiers
Arylsulfatase H — Q5FYA8 (reviewed: Q5FYA8)
All UniProt accessions (1): Q5FYA8
UniProt curated annotations
Subcellular location. Membrane.
Post-translational modifications. The conversion to 3-oxoalanine (also known as C-formylglycine, FGly), of a serine or cysteine residue in prokaryotes and of a cysteine residue in eukaryotes, is critical for catalytic activity.
Cofactor. Binds 1 Ca(2+) ion per subunit.
Similarity. Belongs to the sulfatase family.
RefSeq proteins (1): NP_001011719* (*=MANE)
Domains & families (InterPro)
| ID | Name | Type |
|---|---|---|
| IPR000917 | Sulfatase_N | Domain |
| IPR017850 | Alkaline_phosphatase_core_sf | Homologous_superfamily |
| IPR024607 | Sulfatase_CS | Conserved_site |
| IPR050738 | Sulfatase | Family |
Pfam: PF00884, PF14707
UniProt features (14 total): binding site 8, transmembrane region 2, active site 2, chain 1, modified residue 1
Structure
Experimental structures (PDB)
0 structures.
Predicted structure (AlphaFold)
| Model | pLDDT | Fraction very-high |
|---|---|---|
| AF-Q5FYA8-F1 | 94.15 | 0.87 |
Functional residue map
Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.
Catalytic / active sites (2): 55 (nucleophile); 117
Ligand- & substrate-binding residues (8): 323; 324; 348; 15; 16; 55 (via 3-oxoalanine); 115; 271
Post-translational modifications (1): 55
Function
Pathways and Gene Ontology
Reactome pathways
9 pathways
| ID | Pathway |
|---|---|
| R-HSA-1663150 | The activation of arylsulfatases |
| R-HSA-9840310 | Glycosphingolipid catabolism |
| R-HSA-1430728 | Metabolism |
| R-HSA-163841 | Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation |
| R-HSA-1660662 | Glycosphingolipid metabolism |
| R-HSA-392499 | Metabolism of proteins |
| R-HSA-428157 | Sphingolipid metabolism |
| R-HSA-556833 | Metabolism of lipids |
| R-HSA-597592 | Post-translational protein modification |
MSigDB gene sets: 15 (showing top):
REACTOME_SPHINGOLIPID_METABOLISM, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GOMF_HYDROLASE_ACTIVITY_ACTING_ON_ESTER_BONDS, GOMF_SULFURIC_ESTER_HYDROLASE_ACTIVITY, GOMF_ARYLSULFATASE_ACTIVITY, REACTOME_POST_TRANSLATIONAL_PROTEIN_MODIFICATION, chrXp22, PEDRIOLI_MIR31_TARGETS_DN, ZWANG_TRANSIENTLY_UP_BY_2ND_EGF_PULSE_ONLY, REACTOME_GAMMA_CARBOXYLATION_HYPUSINYLATION_HYDROXYLATION_AND_ARYLSULFATASE_ACTIVATION, REACTOME_METABOLISM_OF_LIPIDS, DESCARTES_MAIN_FETAL_BIPOLAR_CELLS, REACTOME_GLYCOSPHINGOLIPID_CATABOLISM, REACTOME_GLYCOSPHINGOLIPID_METABOLISM, REACTOME_THE_ACTIVATION_OF_ARYLSULFATASES
GO Biological Process (0):
GO Molecular Function (3): arylsulfatase activity (GO:0004065), metal ion binding (GO:0046872), hydrolase activity (GO:0016787)
GO Cellular Component (2): endoplasmic reticulum lumen (GO:0005788), membrane (GO:0016020)
Reactome top-level categories
Rollup of top-7 pathways:
| Category | Pathways |
|---|---|
| Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation | 1 |
| Glycosphingolipid metabolism | 1 |
| Post-translational protein modification | 1 |
| Sphingolipid metabolism | 1 |
| Metabolism of lipids | 1 |
| Metabolism | 1 |
| Metabolism of proteins | 1 |
GO top-level categories
Rollup of top GO terms by namespace:
| Category | Terms |
|---|---|
| sulfuric ester hydrolase activity | 1 |
| cation binding | 1 |
| catalytic activity | 1 |
| endoplasmic reticulum | 1 |
| intracellular organelle lumen | 1 |
| cellular anatomical structure | 1 |
Protein interactions and networks
STRING
544 interactions, top by confidence (×1000):
| Protein A | Protein B | Partner UniProt | Score |
|---|---|---|---|
| ARSH | AS3MT | Q9HBK9 | 613 |
| ARSH | GET3 | O43681 | 573 |
| ARSH | ARSK | Q6UWY0 | 532 |
| ARSH | CUTC | Q9NTM9 | 510 |
| ARSH | SLURP1 | P55000 | 418 |
| ARSH | COPB1 | P53618 | 397 |
| ARSH | PRR32 | B1ATL7 | 395 |
| ARSH | ARSI | Q5FYB1 | 383 |
| ARSH | COPG1 | Q9Y678 | 376 |
| ARSH | ARSJ | Q5FYB0 | 363 |
| ARSH | GNS | P15586 | 347 |
| ARSH | CTPS2 | Q9NRF8 | 315 |
| ARSH | APEH | P13798 | 313 |
| ARSH | SRRT | Q9BXP5 | 303 |
| ARSH | AQP9 | O43315 | 302 |
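The scores above are STRING combined scores multiplied by 1000. As a rough guide to reading them, the sketch below bins each partner into a confidence band. The cutoffs (150, 400, 700, 900) are the commonly used STRING thresholds for low, medium, high, and highest confidence; they are an assumption here and are not stated on this page. `confidence_band` is a hypothetical helper name.

```python
from collections import Counter

# Partner -> combined score (x1000), copied from the table above.
STRING_SCORES = {
    "AS3MT": 613, "GET3": 573, "ARSK": 532, "CUTC": 510, "SLURP1": 418,
    "COPB1": 397, "PRR32": 395, "ARSI": 383, "COPG1": 376, "ARSJ": 363,
    "GNS": 347, "CTPS2": 315, "APEH": 313, "SRRT": 303, "AQP9": 302,
}


def confidence_band(score_x1000: int) -> str:
    """Map a x1000 STRING score to a band (assumed conventional cutoffs)."""
    if score_x1000 >= 900:
        return "highest"
    if score_x1000 >= 700:
        return "high"
    if score_x1000 >= 400:
        return "medium"
    if score_x1000 >= 150:
        return "low"
    return "below low cutoff"


bands = {partner: confidence_band(score) for partner, score in STRING_SCORES.items()}
band_counts = Counter(bands.values())

for partner, score in STRING_SCORES.items():
    print(f"ARSH - {partner}: {score / 1000:.3f} ({bands[partner]})")
print(dict(band_counts))
```

With these assumed cutoffs, none of the 15 listed partners reaches the high band: five fall in the medium band and ten in the low band, so the listed associations are best read as tentative.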
IntAct
2 interactions, top by confidence:
| A | B | Type | Score |
|---|---|---|---|
| ARSH | HERC4 | psi-mi:"MI:0914"(association) | 0.350 |
BioGRID (3): ITCH (Affinity Capture-MS), GPHN (Affinity Capture-MS), HERC4 (Affinity Capture-MS)
ESM2 similar proteins: A0A2D0TC04, A1A4K5, A2VDP5, A8K7I4, J3SBP3, J3SEZ3, O14638, P06802, P0DQQ4, P15396, P18563, P18564, P22413, P54793, P97259, P97675, Q08834, Q09328, Q13822, Q14CN2, Q1RPR6, Q29444, Q2TU62, Q32KH8, Q3SZI1, Q4FZV0, Q5FYA8, Q5GF25, Q5R5M5, Q64610, Q6AYF4, Q6DDW2, Q6DYE8, Q6NXH2, Q6PT52, Q6Q473, Q863C4, Q8BTJ4, Q8K1B9, Q8K2I4
Diamond homologs: P08842, P14000, P15289, P15589, P20713, P34059, P50427, P50428, P50473, P51689, P51690, P54793, Q08DD1, Q32KH5, Q32KH8, Q32KH9, Q32KJ6, Q32KJ9, Q3TYD4, Q571E4, Q5FYA8, Q5FYB0, Q60HH5, Q8BM89, Q8WNQ7, Q9X759, T2KMG4, T2KN90, P22304, P31447, Q08890, Q32KI9, Q32KJ8, Q5FYB1, Q96EG1, P33727, P50430, Q32KH7, Q8A2F6, Q8A2H2
SIGNOR signaling
0 interactions.
Disease & clinical
Clinical variants and AI predictions
ClinVar
122 variants total. Per-class counts are lower bounds: each true count is at least the value shown, because retrieval is capped by pagination.
| Classification | Count (floor) |
|---|---|
| Pathogenic | 0 |
| Likely pathogenic | 0 |
| Uncertain significance | 68 |
| Likely benign | 8 |
| Benign | 12 |
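Because the per-class counts above are lower bounds (floors), they need not add up to the stated total. The sketch below does the arithmetic; it only compares the numbers on this page and makes no claim about how the remaining variants are classified (they may belong to the listed classes or to classes not shown here).

```python
# Values copied from the ClinVar summary above.
TOTAL_VARIANTS = 122
CLASS_FLOORS = {
    "Pathogenic": 0,
    "Likely pathogenic": 0,
    "Uncertain significance": 68,
    "Likely benign": 8,
    "Benign": 12,
}

# Sum of the per-class lower bounds.
floor_sum = sum(CLASS_FLOORS.values())

# Variants counted in the total but not attributed to any class by the floors.
unattributed = TOTAL_VARIANTS - floor_sum

print(f"Sum of per-class floors: {floor_sum}")
print(f"Not attributed to a listed class: {unattributed}")
```

The floors sum to 88, leaving 34 of the 122 variants unattributed in this table.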
Top pathogenic / likely-pathogenic (0)
SpliceAI
1606 predictions. Top by Δscore:
| Variant | Effect | Δscore |
|---|---|---|
| X:3006700:GTGAG:G | donor_gain | 1.0000 |
| X:3010020:T:TA | acceptor_gain | 1.0000 |
| X:3010021:G:A | acceptor_gain | 1.0000 |
| X:3010025:TGCA:T | acceptor_loss | 1.0000 |
| X:3010026:GCAG:G | acceptor_loss | 1.0000 |
| X:3010028:A:AG | acceptor_gain | 1.0000 |
| X:3010029:G:GA | acceptor_gain | 1.0000 |
| X:3010029:GC:G | acceptor_gain | 1.0000 |
| X:3010029:GCA:G | acceptor_gain | 1.0000 |
| X:3010029:GCAC:G | acceptor_gain | 1.0000 |
| X:3010029:GCACA:G | acceptor_gain | 1.0000 |
| X:3010149:CAGG:C | donor_loss | 1.0000 |
| X:3010150:AGGTG:A | donor_loss | 1.0000 |
| X:3010151:GGTG:G | donor_loss | 1.0000 |
| X:3010152:G:GA | donor_loss | 1.0000 |
| X:3013045:AG:A | acceptor_gain | 1.0000 |
| X:3013045:AGG:A | acceptor_gain | 1.0000 |
| X:3013046:GG:G | acceptor_gain | 1.0000 |
| X:3013046:GGG:G | acceptor_gain | 1.0000 |
| X:3013046:GGGAT:G | acceptor_gain | 1.0000 |
| X:3024015:CTCCA:C | acceptor_loss | 1.0000 |
| X:3024017:CCAG:C | acceptor_loss | 1.0000 |
| X:3024020:GGTA:G | acceptor_gain | 1.0000 |
| X:3024020:GGTAA:G | acceptor_gain | 1.0000 |
| X:3024128:C:G | donor_gain | 1.0000 |
| X:3024152:AAAGG:A | donor_loss | 1.0000 |
| X:3024154:AG:A | donor_loss | 1.0000 |
| X:3024155:GG:G | donor_loss | 1.0000 |
| X:3024156:G:T | donor_loss | 1.0000 |
| X:3024157:T:G | donor_loss | 1.0000 |
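The variant keys in the table appear to follow a VCF-like `chrom:pos:ref:alt` layout. The sketch below parses a key and labels it as an SNV, insertion, or deletion by comparing allele lengths. The layout is inferred from the table, not documented on this page, and `parse_variant` and `variant_type` are hypothetical helper names.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key (assumed layout) into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(v: Variant) -> str:
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) > len(v.alt):
        return "deletion"
    if len(v.ref) < len(v.alt):
        return "insertion"
    return "MNV"  # same length, more than one base


# Three keys taken from the table above.
for key in ["X:3006700:GTGAG:G", "X:3010020:T:TA", "X:3010021:G:A"]:
    v = parse_variant(key)
    print(key, "->", variant_type(v), f"at {v.chrom}:{v.pos}")
```

Many of the top-scoring entries are short insertions or deletions within a few bases of the exon boundaries listed in the exon table, which is consistent with their predicted splice-site effects.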
AlphaMissense
3685 scored. Top likely-pathogenic:
| Variant | Protein change | am_pathogenicity |
|---|---|---|
| X:3014975:T:A | W116R | 0.991 |
| X:3014975:T:C | W116R | 0.991 |
| X:3014977:G:C | W116C | 0.985 |
| X:3014977:G:T | W116C | 0.985 |
| X:3015038:T:C | F137L | 0.985 |
| X:3015040:T:A | F137L | 0.985 |
| X:3015040:T:G | F137L | 0.985 |
| X:3006641:T:A | V10D | 0.984 |
| X:3027414:A:C | S380R | 0.982 |
| X:3027416:C:A | S380R | 0.982 |
| X:3027416:C:G | S380R | 0.982 |
| X:3033299:T:A | W535R | 0.982 |
| X:3033299:T:C | W535R | 0.982 |
| X:3010109:A:C | S58R | 0.978 |
| X:3010111:T:A | S58R | 0.978 |
| X:3010111:T:G | S58R | 0.978 |
| X:3018658:G:C | D297H | 0.978 |
| X:3027349:G:C | R358P | 0.976 |
| X:3027361:T:A | I362K | 0.975 |
| X:3027423:G:C | D383H | 0.975 |
| X:3033034:A:C | K446N | 0.974 |
| X:3033034:A:T | K446N | 0.974 |
| X:3015039:T:C | F137S | 0.973 |
| X:3015047:T:C | F140L | 0.972 |
| X:3015049:T:A | F140L | 0.972 |
| X:3015049:T:G | F140L | 0.972 |
| X:3010134:G:C | R66P | 0.971 |
| X:3015039:T:G | F137C | 0.969 |
| X:3015260:T:A | W211R | 0.969 |
| X:3015260:T:C | W211R | 0.969 |
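To relate these protein changes to the curated residues in the functional residue map, one option is to parse the residue number from each change and report the nearest curated position along the sequence. This is a minimal sketch under that assumption: distance in sequence is not distance in the folded structure, and `parse_change` and `nearest_curated` are hypothetical helper names.

```python
import re

# Curated positions copied from the functional residue map on this page.
CATALYTIC = {55, 117}
BINDING = {15, 16, 55, 115, 271, 323, 324, 348}
CURATED = CATALYTIC | BINDING

# Matches single-letter missense notation such as "W116R".
PROTEIN_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_change(change: str):
    """Return (reference residue, position, alternate residue) for e.g. 'W116R'."""
    match = PROTEIN_CHANGE.match(change)
    if match is None:
        raise ValueError(f"Unrecognized protein change: {change!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def nearest_curated(pos: int) -> int:
    """Closest curated residue in sequence; ties go to the lower position."""
    return min(CURATED, key=lambda residue: (abs(residue - pos), residue))


# A few changes taken from the table above.
for change in ["W116R", "S58R", "F137L", "W535R"]:
    _, pos, _ = parse_change(change)
    near = nearest_curated(pos)
    print(f"{change}: nearest curated residue {near} (|delta| = {abs(near - pos)})")
```

For example, W116 lies between binding residue 115 and active-site residue 117, and S58 is three positions from the modified residue 55, whereas W535 is far from every curated position in sequence.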
dbSNP variants (sampled 300 via entrez): RS1000257165 (X:3020958 A>G), RS1000263790 (X:3009893 T>G), RS1000347558 (X:3032857 A>G), RS1000380164 (X:3033484 T>C), RS1000416330 (X:3021205 T>C), RS1000519144 (X:3023271 T>G), RS1000628176 (X:3031196 G>A), RS1000838701 (X:3007671 T>G), RS1000891136 (X:3007507 A>G), RS1001046604 (X:3014655 A>T), RS1001189288 (X:3018954 T>C,G), RS1001299970 (X:3027891 C>A), RS1001500552 (X:3004832 C>A), RS1001504296 (X:3011595 T>C), RS1001712760 (X:3010809 A>G)
Disease associations
OMIM: gene MIM:300586 | disease phenotypes:
GenCC curated gene-disease
Mondo (0):
Orphanet (0):
HPO phenotypes
0 total (0 of 0 shown, HPO-id order):
GWAS associations
0 associations (top):
Drugs & pharmacology
Drug and pharmacology data
Is drug target: no
PharmGKB: 1 entry (VIP=true, CPIC=false)
CTD chemical–gene interactions
7 total (human), top 7 by PubMed support.
| Chemical | Actions (top 5) | PubMed papers |
|---|---|---|
| Benzo(a)pyrene | affects methylation, decreases methylation | 2 |
| propionaldehyde | decreases expression | 1 |
| benzo(e)pyrene | increases methylation | 1 |
| 2-palmitoylglycerol | increases expression | 1 |
| bisphenol S | decreases methylation | 1 |
| Methapyrilene | increases methylation | 1 |
| Aflatoxin B1 | decreases methylation | 1 |
Clinical trials (associated diseases)
0 trials via MONDO — disease-level, not drug-specific.