ARSJ

gene

Also known as FLJ23548

Summary

ARSJ (arylsulfatase family member J, HGNC:26286) is a protein-coding gene on chromosome 4q26, encoding Arylsulfatase J (Q5FYB0).

Sulfatases (EC 3.1.5.6), such as ARSJ, hydrolyze sulfate esters from sulfated steroids, carbohydrates, proteoglycans, and glycolipids. They are involved in hormone biosynthesis, modulation of cell signaling, and degradation of macromolecules (Sardiello et al., 2005 [PubMed 16174644]).

Source: NCBI Gene 79642 — RefSeq curated summary.

At a glance

  • GWAS associations: 4
  • Clinical variants (ClinVar): 68 total — 1 pathogenic
  • MANE Select transcript: NM_024590

Identifiers

Gene identifiers

Field | Value
HGNC ID | HGNC:26286
Approved symbol | ARSJ
Name | arylsulfatase family member J
Location | 4q26
Locus type | gene with protein product
Status | Approved
Aliases | FLJ23548
Ensembl gene | ENSG00000180801
Ensembl biotype | protein_coding
OMIM | 610010
Entrez | 79642

Gene structure

Transcript identifiers

Ensembl transcripts: 4 — 2 protein_coding_CDS_not_defined, 1 protein_coding, 1 nonsense_mediated_decay

ENST00000315366, ENST00000503013, ENST00000509829, ENST00000636527

RefSeq mRNA: 2 (MANE Select: NM_024590): NM_001354210, NM_024590

CCDS: CCDS43264

Canonical transcript exons

ENST00000315366 — 2 exons

Exon | Start | End
ENSE00001368167 | 113900284 | 113903675
ENSE00002075002 | 113978437 | 113979647
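
The exon table can be turned into lengths with simple interval arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the usual Ensembl convention, but not stated on this page), so each length is end - start + 1. The variable names are illustrative only.

```python
# Exon coordinates copied from the canonical transcript table above.
# Assumption: coordinates are 1-based and inclusive on both ends.
exons = {
    "ENSE00001368167": (113900284, 113903675),
    "ENSE00002075002": (113978437, 113979647),
}


def exon_length(start: int, end: int) -> int:
    """Length in bp of the closed interval [start, end]."""
    return end - start + 1


lengths = {exon_id: exon_length(*span) for exon_id, span in exons.items()}
total_exonic = sum(lengths.values())

# Bases strictly between the two exons, i.e. the single intron of this transcript.
intron_length = exons["ENSE00002075002"][0] - exons["ENSE00001368167"][1] - 1

for exon_id, n in lengths.items():
    print(exon_id, n)
print("total exonic bp:", total_exonic)
print("intron bp:", intron_length)
```

If the coordinates turn out to be half-open instead, the "+ 1" and "- 1" terms would need to be dropped.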

Expression profiles

Bgee: expression breadth ubiquitous, 186 present calls, max score 90.10.

FANTOM5 (CAGE): breadth ubiquitous, TPM avg 7.2710 / max 189.7082, expressed in 1160 samples.

FANTOM5 promoters (9 alternative TSS)

Promoter ID | TPM avg | Samples expressed
53698 | 4.8766 | 1025
53697 | 1.0587 | 648
53699 | 0.5831 | 344
53695 | 0.2584 | 121
53693 | 0.1878 | 81
53700 | 0.1271 | 36
53696 | 0.0894 | 39
53694 | 0.0516 | 14
53701 | 0.0382 | 5

Top tissues by expression

281 total, by Bgee expression score (0-100, higher = more expressed):

Tissue | Anatomy ID | Expression score | Quality
stromal cell of endometrium | CL:0002255 | 90.10 | gold quality
cartilage tissue | UBERON:0002418 | 87.83 | gold quality
calcaneal tendon | UBERON:0003701 | 80.37 | gold quality
male germ line stem cell (sensu Vertebrata) in testis | CL:0000089 ∩ UBERON:0000473 | 80.01 | gold quality
primordial germ cell in gonad | CL:0000670 ∩ UBERON:0000991 | 79.93 | gold quality
ascending aorta | UBERON:0001496 | 79.32 | gold quality
thoracic aorta | UBERON:0001515 | 79.29 | gold quality
aorta | UBERON:0000947 | 78.45 | gold quality
right coronary artery | UBERON:0001625 | 78.23 | gold quality
popliteal artery | UBERON:0002250 | 78.07 | gold quality
tibial artery | UBERON:0007610 | 78.04 | gold quality
descending thoracic aorta | UBERON:0002345 | 77.93 | gold quality
left coronary artery | UBERON:0001626 | 76.81 | gold quality
rectum | UBERON:0001052 | 75.69 | gold quality
lower esophagus muscularis layer | UBERON:0035833 | 75.67 | gold quality
lower esophagus | UBERON:0013473 | 75.64 | gold quality
gall bladder | UBERON:0002110 | 75.13 | gold quality
islet of Langerhans | UBERON:0000006 | 75.01 | gold quality
coronary artery | UBERON:0001621 | 74.95 | gold quality
upper lobe of left lung | UBERON:0008952 | 74.60 | gold quality
esophagogastric junction muscularis propria | UBERON:0035841 | 74.11 | gold quality
upper lobe of lung | UBERON:0008948 | 73.76 | gold quality
metanephros cortex | UBERON:0010533 | 72.74 | gold quality
minor salivary gland | UBERON:0001830 | 72.64 | gold quality
esophagus | UBERON:0001043 | 72.23 | gold quality
epithelial cell of pancreas | CL:0000083 | 71.81 | silver quality
endometrium | UBERON:0001295 | 71.71 | gold quality
muscle layer of sigmoid colon | UBERON:0035805 | 71.52 | gold quality
olfactory segment of nasal mucosa | UBERON:0005386 | 70.61 | gold quality
prostate gland | UBERON:0002367 | 70.58 | gold quality

Single-cell (SCXA)

Detected in 2 experiments; a significant marker in 1.

Experiment | Marker? | Max mean expression
E-ANND-3 | yes | 6.50
E-MTAB-6678 | no | 3.89

Regulation

Is transcription factor: no

miRNA regulators (miRDB)

136 targeting ARSJ, top 30 by miRDB confidence (max_score; target_count = how many genes the miRNA targets in total — lower means more specific):

miRNA | Max score | Avg score | miRNA target_count
HSA-MIR-5011-5P | 100.00 | 83.46 | 5820
HSA-MIR-3163 | 100.00 | 77.23 | 8605
HSA-MIR-190A-3P | 100.00 | 80.35 | 5520
HSA-MIR-4262 | 100.00 | 73.26 | 3931
HSA-MIR-513A-5P | 100.00 | 69.77 | 2465
HSA-MIR-4795-3P | 100.00 | 74.62 | 4024
HSA-MIR-340-5P | 100.00 | 72.50 | 4437
HSA-MIR-7110-3P | 100.00 | 73.18 | 2486
HSA-MIR-1277-5P | 100.00 | 73.95 | 5056
HSA-MIR-181A-5P | 99.99 | 72.96 | 2995
HSA-MIR-181B-5P | 99.99 | 72.97 | 2996
HSA-MIR-181C-5P | 99.99 | 72.95 | 2996
HSA-MIR-181D-5P | 99.99 | 73.04 | 2997
HSA-MIR-548C-3P | 99.99 | 74.01 | 7587
HSA-MIR-196A-1-3P | 99.99 | 72.15 | 2772
HSA-MIR-4282 | 99.99 | 75.36 | 6408
HSA-MIR-3662 | 99.99 | 73.82 | 5684
HSA-MIR-5696 | 99.98 | 72.36 | 4487
HSA-MIR-499A-5P | 99.98 | 70.79 | 1323
HSA-MIR-27A-3P | 99.98 | 72.13 | 2955
HSA-MIR-27B-3P | 99.98 | 72.13 | 2955
HSA-MIR-9985 | 99.98 | 72.11 | 2939
HSA-MIR-1250-3P | 99.96 | 70.04 | 4038
HSA-MIR-559 | 99.95 | 72.28 | 3609
HSA-MIR-548AB | 99.95 | 71.31 | 3488
HSA-MIR-548A-5P | 99.94 | 71.27 | 3482
HSA-MIR-548AD-5P | 99.94 | 71.23 | 3502
HSA-MIR-548AE-5P | 99.94 | 71.23 | 3502
HSA-MIR-548AK | 99.94 | 71.24 | 3488
HSA-MIR-548AM-5P | 99.94 | 71.24 | 3488
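
Because a lower target_count means a more specific miRNA, one way to read this table is to rank by confidence first and break ties by specificity. The following is a minimal sketch over a handful of rows copied from the table; the tuple layout and the ranking rule are illustrative choices, not part of miRDB.

```python
# (miRNA, max_score, avg_score, target_count) for a few rows of the table above.
rows = [
    ("HSA-MIR-5011-5P", 100.00, 83.46, 5820),
    ("HSA-MIR-3163", 100.00, 77.23, 8605),
    ("HSA-MIR-513A-5P", 100.00, 69.77, 2465),
    ("HSA-MIR-7110-3P", 100.00, 73.18, 2486),
    ("HSA-MIR-499A-5P", 99.98, 70.79, 1323),
]

# Highest max_score first; among equal scores, fewest total targets first.
ranked = sorted(rows, key=lambda r: (-r[1], r[3]))

# Independently of score, the miRNA with the fewest targets overall.
most_specific = min(rows, key=lambda r: r[3])

for name, max_score, _avg, target_count in ranked:
    print(f"{name}\t{max_score:.2f}\t{target_count}")
print("most specific:", most_specific[0])
```

Ranking on score alone would leave the nine miRNAs with a max score of 100.00 in arbitrary order, which is why the tie-break on target_count is useful here.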

Literature-anchored findings (GeneRIF, showing 1)

  • The identification, molecular cloning and initial characterization of three new members of this human gene family is reported. (PMID:16500042)

Cross-species orthologs

11 orthologs

Organism | Symbol | Gene ID
danio_rerio | arsj | ENSDARG00000077590
mus_musculus | Arsj | ENSMUSG00000046561
rattus_norvegicus | Arsj | ENSRNOG00000000001
drosophila_melanogaster | CG18278 | FBGN0033836
drosophila_melanogaster | CG7408 | FBGN0036765
drosophila_melanogaster | CG7402 | FBGN0036768
drosophila_melanogaster | Sulf1 | FBGN0040271
drosophila_melanogaster | CG32191 | FBGN0052191
drosophila_melanogaster | CG30059 | FBGN0260475
caenorhabditis_elegans |  | WBGENE00006308
caenorhabditis_elegans |  | WBGENE00006309

Paralogs (16): ARSD (ENSG00000006756), IDS (ENSG00000010404), ARSF (ENSG00000062096), ARSA (ENSG00000100299), STS (ENSG00000101846), ARSB (ENSG00000113273), GNS (ENSG00000135677), SULF1 (ENSG00000137573), GALNS (ENSG00000141012), ARSG (ENSG00000141337), ARSL (ENSG00000157399), ARSK (ENSG00000164291), SGSH (ENSG00000181523), ARSI (ENSG00000183876), SULF2 (ENSG00000196562), ARSH (ENSG00000205667)

Protein

Protein identifiers

Arylsulfatase J | Q5FYB0 (reviewed: Q5FYB0)

All UniProt accessions (2): D6RGC1, Q5FYB0

UniProt curated annotations

Subcellular location. Secreted.

Post-translational modifications. The conversion to 3-oxoalanine (also known as C-formylglycine, FGly), of a serine or cysteine residue in prokaryotes and of a cysteine residue in eukaryotes, is critical for catalytic activity.

Cofactor. Binds 1 Ca(2+) ion per subunit.

Similarity. Belongs to the sulfatase family.

RefSeq proteins (2): NP_001341139, NP_078866* (*=MANE)

Domains & families (InterPro)

ID | Name | Type
IPR000917 | Sulfatase_N | Domain
IPR017850 | Alkaline_phosphatase_core_sf | Homologous_superfamily
IPR024607 | Sulfatase_CS | Conserved_site
IPR047115 | ARSB | Family

Pfam: PF00884

UniProt features (28 total): binding site 8, glycosylation site 6, sequence conflict 5, compositionally biased region 2, active site 2, signal peptide 1, chain 1, modified residue 1, region of interest 1, sequence variant 1

Structure

Experimental structures (PDB)

0 structures.

Predicted structure (AlphaFold)

Model | pLDDT | Fraction very-high
AF-Q5FYB0-F1 | 87.34 | 0.74
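
To put the pLDDT value in context, the sketch below maps a score onto the confidence bands commonly used by the AlphaFold Protein Structure Database (above 90 very high, 70 to 90 confident, 50 to 70 low, below 50 very low). It assumes the 87.34 reported here is a per-model mean; the function name is hypothetical.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the AlphaFold DB confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Assumption: 87.34 is the mean pLDDT over all residues of AF-Q5FYB0-F1.
mean_plddt = 87.34
band = plddt_band(mean_plddt)
print(mean_plddt, band)
```

A mean in the "confident" band alongside a very-high fraction of 0.74 suggests that most residues score above 90 while a minority of lower-confidence residues pull the mean down.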

Functional residue map

Curated UniProt residues grouped by drug-discovery relevance — catalytic, ligand-binding, modification, and mutation-validated positions. Source: UniProtKB sequence features.

Catalytic / active sites (2): 122 (nucleophile); 178

Ligand- & substrate-binding residues (8): 84; 85; 122 (via 3-oxoalanine); 176; 269; 327; 328; 345

Post-translational modifications (1): 122

Glycosylation sites (6): 157, 306, 318, 431, 497, 527

Function

Pathways and Gene Ontology

Reactome pathways

9 pathways

ID | Pathway
R-HSA-1663150 | The activation of arylsulfatases
R-HSA-9840310 | Glycosphingolipid catabolism
R-HSA-1430728 | Metabolism
R-HSA-163841 | Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation
R-HSA-1660662 | Glycosphingolipid metabolism
R-HSA-392499 | Metabolism of proteins
R-HSA-428157 | Sphingolipid metabolism
R-HSA-556833 | Metabolism of lipids
R-HSA-597592 | Post-translational protein modification

MSigDB gene sets: 101 (showing top): BUYTAERT_PHOTODYNAMIC_THERAPY_STRESS_DN, GAUSSMANN_MLL_AF4_FUSION_TARGETS_A_UP, AACTTT_UNKNOWN, REACTOME_SPHINGOLIPID_METABOLISM, ACEVEDO_METHYLATED_IN_LIVER_CANCER_DN, GOCC_ENDOPLASMIC_RETICULUM_LUMEN, NUYTTEN_NIPP1_TARGETS_DN, GOMF_HYDROLASE_ACTIVITY_ACTING_ON_ESTER_BONDS, GOMF_SULFURIC_ESTER_HYDROLASE_ACTIVITY, GOMF_ARYLSULFATASE_ACTIVITY, MEISSNER_NPC_HCP_WITH_H3_UNMETHYLATED, REN_ALVEOLAR_RHABDOMYOSARCOMA_DN, REACTOME_POST_TRANSLATIONAL_PROTEIN_MODIFICATION, VERHAAK_GLIOBLASTOMA_MESENCHYMAL, PEDERSEN_METASTASIS_BY_ERBB2_ISOFORM_4

GO Biological Process (0):

GO Molecular Function (4): arylsulfatase activity (GO:0004065), sulfuric ester hydrolase activity (GO:0008484), metal ion binding (GO:0046872), hydrolase activity (GO:0016787)

GO Cellular Component (4): extracellular region (GO:0005576), endoplasmic reticulum lumen (GO:0005788), actin cytoskeleton (GO:0015629), membrane (GO:0016020)

Reactome top-level categories

Rollup of top-7 pathways:

Category | Pathways
Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation | 1
Glycosphingolipid metabolism | 1
Post-translational protein modification | 1
Sphingolipid metabolism | 1
Metabolism of lipids | 1
Metabolism | 1
Metabolism of proteins | 1

GO top-level categories

Rollup of top GO terms by namespace:

Category | Terms
cellular anatomical structure | 2
sulfuric ester hydrolase activity | 1
hydrolase activity, acting on ester bonds | 1
cation binding | 1
catalytic activity | 1
endoplasmic reticulum | 1
intracellular organelle lumen | 1
cytoskeleton | 1

Protein interactions and networks

STRING

956 interactions, top by confidence (×1000):

Protein A | Protein B | Partner UniProt | Score
ARSJ | ARSK | Q6UWY0 | 696
ARSJ | ENPP1 | P22413 | 624
ARSJ | ABCC6 | P78420 | 519
ARSJ | PALMD | Q9NP74 | 454
ARSJ | COL20A1 | Q9P218 | 448
ARSJ | RIC3 | Q7Z5B4 | 443
ARSJ | IFT38 | Q96AJ1 | 442
ARSJ | KLHL31 | Q9H511 | 438
ARSJ | ACAD10 | Q6JQN1 | 432
ARSJ | TDRD7 | Q8NHU6 | 417
ARSJ | UBE2H | P37286 | 410
ARSJ | AUTS2 | Q8WXX7 | 394
ARSJ | MKNK1 | Q9BUB5 | 393
ARSJ | ARSH | Q5FYA8 | 363
ARSJ | ZCCHC24 | Q8N2G6 | 358
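
Since the scores are reported ×1000, they can be compared directly against STRING's conventional confidence cutoffs (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The sketch below is one way to band the partners listed above; the dictionary and function names are illustrative, and the cutoffs are kept as integers to avoid floating-point edge cases.

```python
# Partner -> STRING combined score (x1000), copied from the table above.
STRING_SCORES = {
    "ARSK": 696, "ENPP1": 624, "ABCC6": 519, "PALMD": 454, "COL20A1": 448,
    "RIC3": 443, "IFT38": 442, "KLHL31": 438, "ACAD10": 432, "TDRD7": 417,
    "UBE2H": 410, "AUTS2": 394, "MKNK1": 393, "ARSH": 363, "ZCCHC24": 358,
}


def confidence_band(score_x1000: int) -> str:
    """Band a STRING score using the conventional cutoffs, scaled by 1000."""
    if score_x1000 >= 900:
        return "highest"
    if score_x1000 >= 700:
        return "high"
    if score_x1000 >= 400:
        return "medium"
    if score_x1000 >= 150:
        return "low"
    return "below low cutoff"


bands = {partner: confidence_band(s) for partner, s in STRING_SCORES.items()}
medium_or_better = [p for p, b in bands.items() if b in ("medium", "high", "highest")]

for partner, score in STRING_SCORES.items():
    print(f"{partner}\t{score / 1000:.3f}\t{bands[partner]}")
print("medium or better:", len(medium_or_better))
```

Under these cutoffs none of the listed partners reaches the high-confidence band; the top hit, ARSK at 0.696, falls just short of 0.7.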

IntAct

2 interactions, top by confidence:

A | B | Type | Score
JUN | TPM3 | psi-mi:"MI:0914"(association) | 0.350

BioGRID (6): ARSJ (Two-hybrid), ARSJ (Two-hybrid), ARSJ (Two-hybrid), ZMIZ2 (Two-hybrid), ARSJ (Affinity Capture-RNA), ARSJ (Protein-peptide)

ESM2 similar proteins: A1A5C7, A5D7H1, A6H7A0, A6NJW4, A6QLN9, A8MUP2, A8MXK1, B0BMW8, B0BNL6, O35393, O62657, O75078, P52875, P55244, P56880, P57791, Q08334, Q0V881, Q15768, Q16557, Q2M1K6, Q3SZQ2, Q3UHH2, Q4V899, Q5E9H2, Q5FYB0, Q5M7U7, Q5R6I6, Q5RCI5, Q5SQ64, Q642A6, Q6PCB0, Q7TPB4, Q8BM89, Q8BZH0, Q8N431, Q8N5I2, Q8R2R5, Q8R2Z5, Q8VE98

Diamond homologs: P08842, P14000, P15289, P15589, P20713, P34059, P50427, P50428, P50473, P51689, P51690, P54793, Q08DD1, Q32KH5, Q32KH8, Q32KH9, Q32KJ6, Q32KJ9, Q3TYD4, Q571E4, Q5FYA8, Q5FYB0, Q60HH5, Q8BM89, Q8WNQ7, Q9X759, T2KMG4, T2KN90, P15586, P50426, Q32KJ8, Q5FYB1, Q8A2F6, Q8A2H2, Q8BFR4, Q8IWU6, Q8K007, Q8VI60, Q90XB6, P15848

SIGNOR signaling

0 interactions.

Disease & clinical

Clinical variants and AI predictions

ClinVar

68 variants total. Per-class counts are floors (≥ shown; pagination cap):

Classification | Count (floor)
Pathogenic | 1
Likely pathogenic | 0
Uncertain significance | 65
Likely benign | 1
Benign | 0
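
Because the per-class counts are floors, they need not add up to the stated total. A minimal sketch of that bookkeeping follows; treating the difference as "unassigned" variants is an interpretation, not something the source states.

```python
# Per-class floor counts copied from the table above.
floor_counts = {
    "Pathogenic": 1,
    "Likely pathogenic": 0,
    "Uncertain significance": 65,
    "Likely benign": 1,
    "Benign": 0,
}
total_variants = 68  # "68 variants total" as stated above

floor_sum = sum(floor_counts.values())
# Variants not covered by any floor: they may belong to a listed class whose
# true count exceeds its floor, or to a class not shown (e.g. conflicting).
unassigned = total_variants - floor_sum

print("sum of floors:", floor_sum)
print("unassigned:", unassigned)
```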

Top pathogenic / likely-pathogenic (1)

Variant ID | HGVS | Classification
426095 | t(4;14)(q26;q12) | Pathogenic

SpliceAI

1227 predictions. Top by Δscore:

Variant | Effect | Δscore
4:113903675:CC:C | acceptor_loss | 1.0000
4:113903675:CCT:C | acceptor_gain | 1.0000
4:113903677:T:C | acceptor_gain | 1.0000
4:113903677:T:TC | acceptor_gain | 1.0000
4:113903672:ATAC:A | acceptor_gain | 0.9900
4:113903673:TAC:T | acceptor_gain | 0.9900
4:113903674:AC:A | acceptor_gain | 0.9900
4:113903676:C:CC | acceptor_gain | 0.9900
4:113903676:C:T | acceptor_gain | 0.9900
4:113917199:A:AC | donor_gain | 0.9900
4:113917200:C:CC | donor_gain | 0.9900
4:113917216:AGTG:A | donor_gain | 0.9900
4:113917234:T:A | donor_gain | 0.9900
4:113977434:TGGC:T | donor_gain | 0.9900
4:113903671:GATAC:G | acceptor_gain | 0.9800
4:113953396:CA:C | acceptor_gain | 0.9800
4:113978439:T:A | donor_gain | 0.9700
4:113979447:GACAC:G | donor_loss | 0.9700
4:113979448:ACACT:A | donor_loss | 0.9700
4:113979449:CACT:C | donor_loss | 0.9700
4:113979450:ACTC:A | donor_loss | 0.9700
4:113979451:CTCAC:C | donor_loss | 0.9700
4:113979452:T:TC | donor_loss | 0.9700
4:113979453:CACC:C | donor_loss | 0.9700
4:113979454:ACC:A | donor_loss | 0.9700
4:113979455:CCAGG:C | donor_gain | 0.9700
4:113978435:ACTTT:A | donor_gain | 0.9600
4:113978436:CTTTC:C | donor_gain | 0.9600
4:113979454:A:AC | donor_gain | 0.9600
4:113979455:C:CC | donor_gain | 0.9600
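
The variant keys follow a chrom:pos:ref:alt pattern, so they can be parsed and related to the exon boundaries of the canonical transcript (ENST00000315366) listed earlier on this page. The sketch below is illustrative only: the helper names are hypothetical, and the distance is measured from the variant's anchor position rather than from every affected base.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_type(v: Variant) -> str:
    """Classify by allele lengths (simple heuristic)."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) > len(v.alt):
        return "deletion"
    if len(v.ref) < len(v.alt):
        return "insertion"
    return "substitution"


# Exon start/end coordinates from the canonical transcript exon table.
EXON_BOUNDARIES = [113900284, 113903675, 113978437, 113979647]


def distance_to_nearest_boundary(v: Variant) -> int:
    """Distance in bp from the anchor position to the closest exon boundary."""
    return min(abs(v.pos - b) for b in EXON_BOUNDARIES)


keys = ["4:113903675:CC:C", "4:113903677:T:C", "4:113917199:A:AC", "4:113978439:T:A"]
summary = {}
for key in keys:
    v = parse_variant(key)
    summary[key] = (variant_type(v), distance_to_nearest_boundary(v))
    print(key, *summary[key])
```

Several of the top-scoring acceptor predictions sit within a few bases of the 113903675 boundary, while variants such as 4:113917199:A:AC lie deep in the intron.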

AlphaMissense

3905 scored. Top likely-pathogenic:

Variant | Protein change | am_pathogenicity
4:113903039:T:A | K345N | 1.000
4:113903039:T:G | K345N | 1.000
4:113903040:T:A | K345I | 1.000
4:113903094:T:A | D327V | 1.000
4:113903545:A:G | W177R | 1.000
4:113903545:A:T | W177R | 1.000
4:113903547:T:A | K176I | 1.000
4:113978581:T:A | D85V | 1.000
4:113978584:T:A | D84V | 1.000
4:113902423:A:G | W551R | 0.999
4:113902423:A:T | W551R | 0.999
4:113902590:A:G | L495P | 0.999
4:113902678:A:G | W466R | 0.999
4:113902678:A:T | W466R | 0.999
4:113902936:A:G | W380R | 0.999
4:113902936:A:T | W380R | 0.999
4:113902939:C:G | D379H | 0.999
4:113903020:C:A | G352W | 0.999
4:113903022:C:A | G351V | 0.999
4:113903022:C:T | G351E | 0.999
4:113903023:C:G | G351R | 0.999
4:113903023:C:T | G351R | 0.999
4:113903041:T:C | K345E | 0.999
4:113903041:T:G | K345Q | 0.999
4:113903046:C:A | G343V | 0.999
4:113903048:T:A | R342S | 0.999
4:113903048:T:G | R342S | 0.999
4:113903049:C:G | R342T | 0.999
4:113903060:G:C | N338K | 0.999
4:113903060:G:T | N338K | 0.999
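
Many of the top-scoring substitutions fall on positions that the Functional residue map lists as catalytic or ligand-binding. The sketch below cross-references a few protein changes from the table against those residue sets; the parsing helper and label strings are illustrative assumptions, and the overlap is descriptive rather than evidence of mechanism.

```python
import re

# Residue sets copied from the "Functional residue map" section of this page.
ACTIVE_SITES = {122, 178}
BINDING_RESIDUES = {84, 85, 122, 176, 269, 327, 328, 345}

# A few protein changes from the AlphaMissense table above.
changes = ["K345N", "K345I", "D327V", "W177R", "K176I", "D85V", "D84V", "W551R"]


def parse_change(change: str) -> tuple[str, int, str]:
    """Split e.g. 'K345N' into (ref residue, position, alt residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognised protein change: {change}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def annotate(change: str) -> list[str]:
    """Return the functional labels that apply to the changed position."""
    _, pos, _ = parse_change(change)
    labels = []
    if pos in ACTIVE_SITES:
        labels.append("active site")
    if pos in BINDING_RESIDUES:
        labels.append("binding")
    return labels


hits = {change: annotate(change) for change in changes}
n_functional = sum(1 for labels in hits.values() if labels)

for change, labels in hits.items():
    print(change, ", ".join(labels) or "no curated annotation")
print("changes at curated functional residues:", n_functional)
```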

dbSNP variants (sampled 300 via entrez): RS1000002387 (4:113922737 C>T), RS1000118461 (4:113922485 A>G), RS1000125629 (4:113970005 C>T), RS10001850 (4:113912494 T>A,C), RS1000215898 (4:113945079 G>A), RS1000217209 (4:113962469 G>A), RS1000232700 (4:113981036 C>T), RS1000266558 (4:113931902 A>G), RS1000284084 (4:113938842 G>C), RS1000336802 (4:113949961 G>A,T), RS1000378910 (4:113903875 A>C), RS1000439830 (4:113926490 G>C), RS1000442203 (4:113944904 G>A,T), RS1000458380 (4:113915949 A>G), RS1000550309 (4:113946233 A>C)

Disease associations

OMIM: gene MIM:610010 | disease phenotypes: MIM:613454

GenCC curated gene-disease

Mondo (1): FOXG1 disorder (MONDO:0100040)

Orphanet (3): Atypical Rett syndrome (Orphanet:3095), FOXG1 syndrome (Orphanet:561854), FOXG1 syndrome due to intragenic alteration (Orphanet:598164)

HPO phenotypes

0 total (0 of 0 shown, HPO-id order):

GWAS associations

4 associations (top):

Study | Trait | p-value
GCST002249_11 | Blood pressure measurement (high sodium intervention) | 3.000000e-07
GCST002252_13 | Blood pressure measurement (high sodium and potassium intervention) | 2.000000e-07
GCST002252_9 | Blood pressure measurement (high sodium and potassium intervention) | 3.000000e-06
GCST007628_3 | Impulsivity (motor) | 9.000000e-07
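
For orientation, the p-values above can be compared with the thresholds conventionally used in GWAS: 5e-8 for genome-wide significance and 1e-5 for suggestive association. These cutoffs are a common convention and an assumption here; the source does not state which threshold was applied.

```python
# (study, p-value) pairs copied from the table above.
associations = [
    ("GCST002249_11", 3.0e-07),
    ("GCST002252_13", 2.0e-07),
    ("GCST002252_9", 3.0e-06),
    ("GCST007628_3", 9.0e-07),
]

GENOME_WIDE = 5e-8  # conventional genome-wide significance threshold
SUGGESTIVE = 1e-5   # commonly used "suggestive" threshold

genome_wide_hits = [s for s, p in associations if p < GENOME_WIDE]
suggestive_hits = [s for s, p in associations if GENOME_WIDE <= p < SUGGESTIVE]

print("genome-wide significant:", genome_wide_hits)
print("suggestive only:", suggestive_hits)
```

Under these conventions all four associations fall in the suggestive range and none reaches genome-wide significance.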

EFO canonical traits (5, from GWAS)

EFO ID | Trait name
EFO:0005401 | response to high sodium diet
EFO:0006335 | systolic blood pressure
EFO:0005403 | response to dietary potassium supplementation
EFO:0006340 | mean arterial pressure
EFO:0006946 | behavioural disinhibition measurement

Drugs & pharmacology

Drug and pharmacology data

Is drug target: no

PharmGKB: 1 entry (VIP=true, CPIC=false)

CTD chemical–gene interactions

32 total (human), top 30 by PubMed support.

Chemical | Actions (top 5) | PubMed papers
sodium arsenite | decreases expression, increases abundance | 2
Arsenic | increases methylation, decreases expression, increases abundance | 2
Benzo(a)pyrene | increases methylation, affects expression | 2
Valproic Acid | decreases expression, increases expression | 2
Aflatoxin B1 | decreases methylation, increases expression | 2
aristolochic acid I | decreases expression | 1
methylmercuric chloride | increases expression | 1
bisphenol A | decreases expression | 1
trichostatin A | increases expression | 1
sulforaphane | increases expression | 1
cobaltous chloride | decreases expression | 1
potassium chromate(VI) | affects cotreatment, decreases expression | 1
S-(1,2-dichlorovinyl)cysteine | decreases reaction, increases expression | 1
epigallocatechin gallate | affects cotreatment, decreases expression | 1
perfluorooctane sulfonic acid | decreases expression | 1
jinfukang | affects cotreatment, decreases expression | 1
incobotulinumtoxinA | decreases expression | 1
Sunitinib | decreases expression | 1
Air Pollutants | decreases expression, increases abundance | 1
Atrazine | decreases expression | 1
Cisplatin | affects cotreatment, decreases expression | 1
Doxorubicin | decreases expression | 1
Estradiol | increases expression | 1
Lipopolysaccharides | increases expression, decreases reaction | 1
Rifampin | decreases expression | 1
Dronabinol | decreases expression | 1
Tobacco Smoke Pollution | decreases expression | 1
Tretinoin | decreases expression | 1
Triclosan | decreases expression | 1
Vanadates | increases expression | 1

Clinical trials (associated diseases)

4 trials via MONDO — disease-level, not drug-specific.

Trial | Phase | Status | Title
NCT07293546 | PHASE1/PHASE2 | NOT_YET_RECRUITING | Phase 1/2 Study of FRF-001, an AAV-9 Gene Therapy, in Patients With FOXG1 Syndrome (FS)
NCT02705677 | Not specified | COMPLETED | Biobanking of Rett Syndrome and Related Disorders
NCT02738281 | Not specified | COMPLETED | Natural History of Rett Syndrome & Related Disorders
NCT06938542 | Not specified | ENROLLING_BY_INVITATION | Palliative Care Needs of Children With Rare Diseases and Their Families
  • Disease cohort memberships (association, not causation — diseases whose associated-gene cohort lists this gene; a subset are also under Associated diseases): FOXG1 disorder