ARSH

gene

Summary

ARSH (arylsulfatase family member H, HGNC:32488) is a protein-coding gene on chromosome Xp22.33, encoding Arylsulfatase H (Q5FYA8).

Sulfatases, such as ARSH, hydrolyze sulfate esters from sulfated steroids, carbohydrates, proteoglycans, and glycolipids. They are involved in hormone biosynthesis, modulation of cell signaling, and degradation of macromolecules (Sardiello et al., 2005 [PubMed 16174644]).

Source: NCBI Gene 347527 — RefSeq curated summary.

At a glance

  • Clinical variants (ClinVar): 122 total
  • MANE Select transcript: NM_001011719

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:32488
Approved symbol | ARSH
Name | arylsulfatase family member H
Location | Xp22.33
Locus type | gene with protein product
Status | Approved
Ensembl gene | ENSG00000205667
Ensembl biotype | protein_coding
OMIM | 300586
Entrez | 347527

Gene structure

Transcript identifiers

Ensembl transcripts: 1 — 1 protein_coding

ENST00000381130

RefSeq mRNA: 1 (MANE Select: NM_001011719)

CCDS: CCDS35198

Canonical transcript exons

ENST00000381130 — 9 exons

Exon | Start | End
ENSE00001487595 | 3033018 | 3034111
ENSE00001487597 | 3029247 | 3029368
ENSE00001487599 | 3027313 | 3027475
ENSE00001487600 | 3024021 | 3024155
ENSE00001487602 | 3018534 | 3018670
ENSE00001487603 | 3014970 | 3015393
ENSE00001487606 | 3013047 | 3013172
ENSE00001487608 | 3010030 | 3010151
ENSE00001487610 | 3006546 | 3006704
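
The exon coordinates above can be turned into exon lengths and a total exonic length with a few lines of Python. This is a minimal sketch, assuming the start and end values are 1-based inclusive genomic coordinates (the usual Ensembl convention, not stated on this page); the names `EXONS`, `exon_length`, and `total_exonic_length` are illustrative only.

```python
# Exon coordinates copied from the table above: (exon ID, start, end).
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    ("ENSE00001487595", 3033018, 3034111),
    ("ENSE00001487597", 3029247, 3029368),
    ("ENSE00001487599", 3027313, 3027475),
    ("ENSE00001487600", 3024021, 3024155),
    ("ENSE00001487602", 3018534, 3018670),
    ("ENSE00001487603", 3014970, 3015393),
    ("ENSE00001487606", 3013047, 3013172),
    ("ENSE00001487608", 3010030, 3010151),
    ("ENSE00001487610", 3006546, 3006704),
]


def exon_length(start, end):
    """Length in bases of an inclusive interval."""
    return end - start + 1


def total_exonic_length(exons):
    """Sum of exon lengths (UTRs included, introns excluded)."""
    return sum(exon_length(start, end) for _, start, end in exons)


if __name__ == "__main__":
    # Print exons in ascending genomic order with their lengths.
    for exon_id, start, end in sorted(EXONS, key=lambda e: e[1]):
        print(exon_id, start, end, exon_length(start, end))
    print("total exonic bases:", total_exonic_length(EXONS))
```

Under the inclusive-coordinate assumption the nine exons sum to 2482 bases; with half-open coordinates each length would be one base shorter.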

Expression profiles

Bgee: expression breadth broad, 25 present calls, max score 75.21.

Top tissues by expression

124 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 75.21 | gold quality
gall bladder | UBERON:0002110 | 47.79 | gold quality
lower esophagus mucosa | UBERON:0035834 | 47.09 | silver quality
ganglionic eminence | UBERON:0004023 | 47.02 | gold quality
islet of Langerhans | UBERON:0000006 | 47.01 | gold quality
esophagus mucosa | UBERON:0002469 | 46.24 | gold quality
olfactory segment of nasal mucosa | UBERON:0005386 | 41.86 | gold quality
endometrium | UBERON:0001295 | 40.95 | silver quality
tonsil | UBERON:0002372 | 40.20 | gold quality
ventricular zone | UBERON:0003053 | 39.95 | gold quality
pancreas | UBERON:0001264 | 39.39 | gold quality
saliva-secreting gland | UBERON:0001044 | 38.07 | gold quality
minor salivary gland | UBERON:0001830 | 37.21 | silver quality
colonic epithelium | UBERON:0000397 | 37.20 | gold quality
urinary bladder | UBERON:0001255 | 36.84 | silver quality
cortical plate | UBERON:0005343 | 36.47 | gold quality
mucosa of transverse colon | UBERON:0004991 | 36.42 | gold quality
bone marrow cell | CL:0002092 | 36.16 | gold quality
sural nerve | UBERON:0015488 | 35.55 | gold quality
placenta | UBERON:0001987 | 35.17 | silver quality
body of pancreas | UBERON:0001150 | 34.82 | gold quality
skin of abdomen | UBERON:0001416 | 34.67 | gold quality
vagina | UBERON:0000996 | 34.39 | gold quality
esophagus | UBERON:0001043 | 33.94 | gold quality
bone marrow | UBERON:0002371 | 33.89 | gold quality
zone of skin | UBERON:0000014 | 33.68 | gold quality
skeletal muscle tissue | UBERON:0001134 | 33.38 | gold quality
muscle tissue | UBERON:0002385 | 33.23 | gold quality
thoracic mammary gland | UBERON:0005200 | 32.77 | silver quality
skin of leg | UBERON:0001511 | 32.71 | gold quality

Single-cell (SCXA)

Detected in 1 experiment(s), a significant marker in 1.

Experiment | Marker? | Max mean expression
E-ANND-3 | yes | 3.78

Regulation

Is transcription factor: no

Cross-species orthologs

11 orthologs

Organism | Symbol | Gene ID
danio_rerio | sulf2a | ENSDARG00000018423
danio_rerio | arsib | ENSDARG00000117075
drosophila_melanogaster | CG18278 | FBGN0033836
drosophila_melanogaster | CG7408 | FBGN0036765
drosophila_melanogaster | CG7402 | FBGN0036768
drosophila_melanogaster | Sulf1 | FBGN0040271
drosophila_melanogaster | CG32191 | FBGN0052191
drosophila_melanogaster | CG30059 | FBGN0260475
caenorhabditis_elegans |  | WBGENE00006308
caenorhabditis_elegans |  | WBGENE00006309
caenorhabditis_elegans |  | WBGENE00006310

Paralogs (16): ARSD (ENSG00000006756), IDS (ENSG00000010404), ARSF (ENSG00000062096), ARSA (ENSG00000100299), STS (ENSG00000101846), ARSB (ENSG00000113273), GNS (ENSG00000135677), SULF1 (ENSG00000137573), GALNS (ENSG00000141012), ARSG (ENSG00000141337), ARSL (ENSG00000157399), ARSK (ENSG00000164291), ARSJ (ENSG00000180801), SGSH (ENSG00000181523), ARSI (ENSG00000183876), SULF2 (ENSG00000196562)

Protein

Protein identifiers

Arylsulfatase H | Q5FYA8 (reviewed: Q5FYA8)

All UniProt accessions (1): Q5FYA8

UniProt curated annotations

Subcellular location. Membrane.

Post-translational modifications. The conversion to 3-oxoalanine (also known as C-formylglycine, FGly), of a serine or cysteine residue in prokaryotes and of a cysteine residue in eukaryotes, is critical for catalytic activity.

Cofactor. Binds 1 Ca(2+) ion per subunit.

Similarity. Belongs to the sulfatase family.

RefSeq proteins (1): NP_001011719* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR000917 | Sulfatase_N | Domain
IPR017850 | Alkaline_phosphatase_core_sf | Homologous_superfamily
IPR024607 | Sulfatase_CS | Conserved_site
IPR050738 | Sulfatase | Family

Pfam: PF00884, PF14707

UniProt features (14 total): binding site 8, transmembrane region 2, active site 2, chain 1, modified residue 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q5FYA8-F1 | 94.15 | 0.87

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Catalytic / active sites (2): 55 (nucleophile); 117

Ligand- & substrate-binding residues (8): 323; 324; 348; 15; 16; 55 (via 3-oxoalanine); 115; 271

Post-translational modifications (1): 55

Function

Pathways and Gene Ontology

Reactome pathways

9 pathways

ID | Pathway
R-HSA-1663150 | The activation of arylsulfatases
R-HSA-9840310 | Glycosphingolipid catabolism
R-HSA-1430728 | Metabolism
R-HSA-163841 | Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation
R-HSA-1660662 | Glycosphingolipid metabolism
R-HSA-392499 | Metabolism of proteins
R-HSA-428157 | Sphingolipid metabolism
R-HSA-556833 | Metabolism of lipids
R-HSA-597592 | Post-translational protein modification

MSigDB gene sets: 15 (showing top): REACTOME_SPHINGOLIPID_METABOLISM, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, GOMF_HYDROLASE_ACTIVITY_ACTING_ON_ESTER_BONDS, GOMF_SULFURIC_ESTER_HYDROLASE_ACTIVITY, GOMF_ARYLSULFATASE_ACTIVITY, REACTOME_POST_TRANSLATIONAL_PROTEIN_MODIFICATION, chrXp22, PEDRIOLI_MIR31_TARGETS_DN, ZWANG_TRANSIENTLY_UP_BY_2ND_EGF_PULSE_ONLY, REACTOME_GAMMA_CARBOXYLATION_HYPUSINYLATION_HYDROXYLATION_AND_ARYLSULFATASE_ACTIVATION, REACTOME_METABOLISM_OF_LIPIDS, DESCARTES_MAIN_FETAL_BIPOLAR_CELLS, REACTOME_GLYCOSPHINGOLIPID_CATABOLISM, REACTOME_GLYCOSPHINGOLIPID_METABOLISM, REACTOME_THE_ACTIVATION_OF_ARYLSULFATASES

GO Biological Process (0):

GO Molecular Function (3): arylsulfatase activity (GO:0004065), metal ion binding (GO:0046872), hydrolase activity (GO:0016787)

GO Cellular Component (2): endoplasmic reticulum lumen (GO:0005788), membrane (GO:0016020)

Reactome top-level categories

Rollup of top-7 pathways:

Category | Pathways
Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation | 1
Glycosphingolipid metabolism | 1
Post-translational protein modification | 1
Sphingolipid metabolism | 1
Metabolism of lipids | 1
Metabolism | 1
Metabolism of proteins | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
sulfuric ester hydrolase activity | 1
cation binding | 1
catalytic activity | 1
endoplasmic reticulum | 1
intracellular organelle lumen | 1
cellular anatomical structure | 1

Protein interactions and networks

STRING

544 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
ARSH | AS3MT | Q9HBK9 | 613
ARSH | GET3 | O43681 | 573
ARSH | ARSK | Q6UWY0 | 532
ARSH | CUTC | Q9NTM9 | 510
ARSH | SLURP1 | P55000 | 418
ARSH | COPB1 | P53618 | 397
ARSH | PRR32 | B1ATL7 | 395
ARSH | ARSI | Q5FYB1 | 383
ARSH | COPG1 | Q9Y678 | 376
ARSH | ARSJ | Q5FYB0 | 363
ARSH | GNS | P15586 | 347
ARSH | CTPS2 | Q9NRF8 | 315
ARSH | APEH | P13798 | 313
ARSH | SRRT | Q9BXP5 | 303
ARSH | AQP9 | O43315 | 302
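
Because the scores above are reported ×1000, it can help to rescale them to the 0 to 1 range and filter by a confidence cutoff. The sketch below assumes the cutoffs commonly used with STRING combined scores (0.4 for medium and 0.7 for high confidence); the dictionary and function names are illustrative and not part of any STRING client.

```python
# Partner scores copied from the STRING table above (combined score x1000).
STRING_PARTNERS = {
    "AS3MT": 613, "GET3": 573, "ARSK": 532, "CUTC": 510, "SLURP1": 418,
    "COPB1": 397, "PRR32": 395, "ARSI": 383, "COPG1": 376, "ARSJ": 363,
    "GNS": 347, "CTPS2": 315, "APEH": 313, "SRRT": 303, "AQP9": 302,
}


def to_unit_score(score_x1000):
    """Rescale a x1000 score to the 0-1 range."""
    return score_x1000 / 1000


def partners_at_or_above(partners, threshold):
    """Partner symbols whose score meets the 0-1 threshold, best first."""
    cutoff = round(threshold * 1000)  # compare as integers to avoid float noise
    kept = [name for name, score in partners.items() if score >= cutoff]
    return sorted(kept, key=lambda name: -partners[name])


if __name__ == "__main__":
    print("medium confidence (>= 0.4):", partners_at_or_above(STRING_PARTNERS, 0.4))
    print("high confidence (>= 0.7):", partners_at_or_above(STRING_PARTNERS, 0.7))
```

With these assumed cutoffs, five of the listed partners reach medium confidence and none reaches high confidence.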

IntAct

2 interactions, top by confidence:

A | B | Type | Score
ARSH | HERC4 | psi-mi:"MI:0914"(association) | 0.350

BioGRID (3): ITCH (Affinity Capture-MS), GPHN (Affinity Capture-MS), HERC4 (Affinity Capture-MS)

ESM2 similar proteins: A0A2D0TC04, A1A4K5, A2VDP5, A8K7I4, J3SBP3, J3SEZ3, O14638, P06802, P0DQQ4, P15396, P18563, P18564, P22413, P54793, P97259, P97675, Q08834, Q09328, Q13822, Q14CN2, Q1RPR6, Q29444, Q2TU62, Q32KH8, Q3SZI1, Q4FZV0, Q5FYA8, Q5GF25, Q5R5M5, Q64610, Q6AYF4, Q6DDW2, Q6DYE8, Q6NXH2, Q6PT52, Q6Q473, Q863C4, Q8BTJ4, Q8K1B9, Q8K2I4

Diamond homologs: P08842, P14000, P15289, P15589, P20713, P34059, P50427, P50428, P50473, P51689, P51690, P54793, Q08DD1, Q32KH5, Q32KH8, Q32KH9, Q32KJ6, Q32KJ9, Q3TYD4, Q571E4, Q5FYA8, Q5FYB0, Q60HH5, Q8BM89, Q8WNQ7, Q9X759, T2KMG4, T2KN90, P22304, P31447, Q08890, Q32KI9, Q32KJ8, Q5FYB1, Q96EG1, P33727, P50430, Q32KH7, Q8A2F6, Q8A2H2

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

122 variants total. Per-class counts are lower bounds (the true count is at least the value shown, because retrieval is capped by pagination):

Classification | Count (floor)
Pathogenic | 0
Likely pathogenic | 0
Uncertain significance | 68
Likely benign | 8
Benign | 12
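
Since the per-class counts are floors, they need not add up to the stated total. A small sketch of that arithmetic, using only the numbers shown above (the variable and function names are illustrative):

```python
# Per-class floor counts and overall total copied from the ClinVar summary above.
CLINVAR_TOTAL = 122
CLINVAR_FLOORS = {
    "Pathogenic": 0,
    "Likely pathogenic": 0,
    "Uncertain significance": 68,
    "Likely benign": 8,
    "Benign": 12,
}


def unassigned_count(total, floors):
    """Variants in the total that are not covered by the per-class floors."""
    return total - sum(floors.values())


if __name__ == "__main__":
    print("sum of floors:", sum(CLINVAR_FLOORS.values()))
    print("not covered by floors:", unassigned_count(CLINVAR_TOTAL, CLINVAR_FLOORS))
```

The floors sum to 88, leaving 34 of the 122 variants unaccounted for; these may belong to the listed classes beyond the pagination cap or to classes not shown here.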

Top pathogenic / likely-pathogenic (0)

SpliceAI

1606 predictions. Top by Δscore:

Variant | Effect | Δscore
X:3006700:GTGAG:G | donor_gain | 1.0000
X:3010020:T:TA | acceptor_gain | 1.0000
X:3010021:G:A | acceptor_gain | 1.0000
X:3010025:TGCA:T | acceptor_loss | 1.0000
X:3010026:GCAG:G | acceptor_loss | 1.0000
X:3010028:A:AG | acceptor_gain | 1.0000
X:3010029:G:GA | acceptor_gain | 1.0000
X:3010029:GC:G | acceptor_gain | 1.0000
X:3010029:GCA:G | acceptor_gain | 1.0000
X:3010029:GCAC:G | acceptor_gain | 1.0000
X:3010029:GCACA:G | acceptor_gain | 1.0000
X:3010149:CAGG:C | donor_loss | 1.0000
X:3010150:AGGTG:A | donor_loss | 1.0000
X:3010151:GGTG:G | donor_loss | 1.0000
X:3010152:G:GA | donor_loss | 1.0000
X:3013045:AG:A | acceptor_gain | 1.0000
X:3013045:AGG:A | acceptor_gain | 1.0000
X:3013046:GG:G | acceptor_gain | 1.0000
X:3013046:GGG:G | acceptor_gain | 1.0000
X:3013046:GGGAT:G | acceptor_gain | 1.0000
X:3024015:CTCCA:C | acceptor_loss | 1.0000
X:3024017:CCAG:C | acceptor_loss | 1.0000
X:3024020:GGTA:G | acceptor_gain | 1.0000
X:3024020:GGTAA:G | acceptor_gain | 1.0000
X:3024128:C:G | donor_gain | 1.0000
X:3024152:AAAGG:A | donor_loss | 1.0000
X:3024154:AG:A | donor_loss | 1.0000
X:3024155:GG:G | donor_loss | 1.0000
X:3024156:G:T | donor_loss | 1.0000
X:3024157:T:G | donor_loss | 1.0000
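
The variant identifiers in this table appear to follow a chromosome:position:ref:alt layout. The sketch below parses that layout and labels each variant by allele length; the layout is inferred from the strings themselves rather than documented on this page, and the function names are hypothetical.

```python
def parse_variant(variant):
    """Split 'chrom:pos:ref:alt' into (chrom, pos, ref, alt) with an integer position."""
    chrom, pos, ref, alt = variant.split(":")
    return chrom, int(pos), ref, alt


def variant_type(ref, alt):
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) > len(alt):
        return "deletion"
    if len(ref) < len(alt):
        return "insertion"
    return "substitution"  # equal-length, multi-base change


if __name__ == "__main__":
    # Three identifiers taken from the SpliceAI table above.
    for variant in ["X:3006700:GTGAG:G", "X:3010020:T:TA", "X:3010021:G:A"]:
        chrom, pos, ref, alt = parse_variant(variant)
        print(variant, "->", variant_type(ref, alt))
```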

AlphaMissense

3685 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
X:3014975:T:A | W116R | 0.991
X:3014975:T:C | W116R | 0.991
X:3014977:G:C | W116C | 0.985
X:3014977:G:T | W116C | 0.985
X:3015038:T:C | F137L | 0.985
X:3015040:T:A | F137L | 0.985
X:3015040:T:G | F137L | 0.985
X:3006641:T:A | V10D | 0.984
X:3027414:A:C | S380R | 0.982
X:3027416:C:A | S380R | 0.982
X:3027416:C:G | S380R | 0.982
X:3033299:T:A | W535R | 0.982
X:3033299:T:C | W535R | 0.982
X:3010109:A:C | S58R | 0.978
X:3010111:T:A | S58R | 0.978
X:3010111:T:G | S58R | 0.978
X:3018658:G:C | D297H | 0.978
X:3027349:G:C | R358P | 0.976
X:3027361:T:A | I362K | 0.975
X:3027423:G:C | D383H | 0.975
X:3033034:A:C | K446N | 0.974
X:3033034:A:T | K446N | 0.974
X:3015039:T:C | F137S | 0.973
X:3015047:T:C | F140L | 0.972
X:3015049:T:A | F140L | 0.972
X:3015049:T:G | F140L | 0.972
X:3010134:G:C | R66P | 0.971
X:3015039:T:G | F137C | 0.969
X:3015260:T:A | W211R | 0.969
X:3015260:T:C | W211R | 0.969
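
As a consistency check, the genomic positions of these missense variants can be mapped onto the canonical transcript exons listed under Gene structure. This is a rough sketch, assuming both tables use the same genome assembly and 1-based inclusive exon coordinates (neither is stated on this page); `exon_for_position` is an illustrative helper, not a library call.

```python
# Exon coordinates copied from the "Canonical transcript exons" table.
# Assumption: 1-based inclusive, same assembly as the variant positions.
EXONS = [
    ("ENSE00001487595", 3033018, 3034111),
    ("ENSE00001487597", 3029247, 3029368),
    ("ENSE00001487599", 3027313, 3027475),
    ("ENSE00001487600", 3024021, 3024155),
    ("ENSE00001487602", 3018534, 3018670),
    ("ENSE00001487603", 3014970, 3015393),
    ("ENSE00001487606", 3013047, 3013172),
    ("ENSE00001487608", 3010030, 3010151),
    ("ENSE00001487610", 3006546, 3006704),
]


def exon_for_position(pos, exons=EXONS):
    """Return the ID of the exon containing pos, or None if pos is not exonic."""
    for exon_id, start, end in exons:
        if start <= pos <= end:
            return exon_id
    return None


if __name__ == "__main__":
    # A few positions taken from the AlphaMissense table above.
    for label, pos in [("W116R", 3014975), ("V10D", 3006641), ("W535R", 3033299)]:
        print(label, pos, exon_for_position(pos))
```

Under these assumptions the sampled positions all fall inside listed exons, which is what one would expect for missense changes.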

dbSNP variants (sampled 300 via entrez): RS1000257165 (X:3020958 A>G), RS1000263790 (X:3009893 T>G), RS1000347558 (X:3032857 A>G), RS1000380164 (X:3033484 T>C), RS1000416330 (X:3021205 T>C), RS1000519144 (X:3023271 T>G), RS1000628176 (X:3031196 G>A), RS1000838701 (X:3007671 T>G), RS1000891136 (X:3007507 A>G), RS1001046604 (X:3014655 A>T), RS1001189288 (X:3018954 T>C,G), RS1001299970 (X:3027891 C>A), RS1001500552 (X:3004832 C>A), RS1001504296 (X:3011595 T>C), RS1001712760 (X:3010809 A>G)

Disease associations

OMIM: gene MIM:300586 | disease phenotypes:

GenCC curated gene-disease

Mondo (0):

Orphanet (0):

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

0 associations (top):

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

7 total (human), top 7 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
Benzo(a)pyrene | affects methylation, decreases methylation | 2
propionaldehyde | decreases expression | 1
benzo(e)pyrene | increases methylation | 1
2-palmitoylglycerol | increases expression | 1
bisphenol S | decreases methylation | 1
Methapyrilene | increases methylation | 1
Aflatoxin B1 | decreases methylation | 1

Clinical trials (associated diseases)

0 trials via MONDO — disease-level, not drug-specific.
