Question

Provide a comprehensive cross-database identifier and functional mapping reference for human ARID1A — a definitive lookup resource covering:

### Section 1: Gene identifiers
For human gene ARID1A, list ALL gene-level database identifiers.

Required:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG...)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand (GRCh38)

### Section 2: Transcript identifiers
For human gene ARID1A, list ALL transcript-level identifiers.

Required:
- Ensembl transcripts: ALL ENST IDs with biotype. Total count.
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select.
- CCDS IDs.
- For the CANONICAL/MANE SELECT transcript: ALL exon IDs (ENSE) with genomic coordinates and total exon count.

### Section 3: Protein identifiers
For human gene ARID1A protein product(s), list ALL protein-level identifiers.

Required:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry.
- RefSeq protein: ALL NP_ accessions.
- Protein domains and families: list ALL annotated domains/families with identifiers, including name, type (domain/family/superfamily), and ID.
- Antibody availability: known antibody resources for the protein.

### Section 4: Structure
For human gene ARID1A protein, list ALL structural data.

Required:
- Experimental structures: ALL PDB IDs. For each: experimental method (X-ray/NMR/Cryo-EM) and resolution. Total count.
- Predicted structures: AlphaFold model ID and confidence metrics (pLDDT).

### Section 5: Cross-species orthologs
For human gene ARID1A, list orthologous genes in key model organisms.

Organisms:
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

### Section 6: Clinical variants & AI predictions
For human gene ARID1A, summarize clinical variants and AI predictions.

Clinical variant annotations (ClinVar):
- Total variant count (approximate is fine)
- Breakdown by classification: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- TOP 30 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: total count + TOP 30 with delta scores if known
- Missense pathogenicity from AlphaMissense — total count + TOP 30 likely-pathogenic with am_pathogenicity scores.

### Section 7: Pathways & Gene Ontology
For human gene ARID1A, list biological pathways and Gene Ontology annotations.

Pathway membership:
- ALL biological pathways this gene participates in, with pathway IDs and names
- Total pathway count

Gene Ontology:
- Biological Process: count and TOP 20 terms with GO IDs
- Molecular Function: count and TOP 20 terms with GO IDs
- Cellular Component: count and TOP 20 terms with GO IDs

### Section 8: Protein interactions & networks
For human gene ARID1A protein, summarize protein interactions and networks.

Protein-protein interactions (STRING, IntAct, BioGRID, etc.):
- Total interaction count (approximate)
- TOP 30 highest-confidence interacting proteins with scores/evidence

Protein similarity:
- Structural/embedding similarity (e.g. Foldseek, ESM): TOP 20 similar proteins with scores
- Sequence homology: TOP 20 homologous proteins with identity/similarity

### Section 9: Transcription factor regulatory data
For human gene ARID1A, summarize transcription factor regulatory data.

If ARID1A is a transcription factor:
- Downstream targets: total count + TOP 30 with regulation type (activates/represses) and evidence
- DNA binding motifs from JASPAR — all known motif IDs and motif family classification.

Regardless:
- Upstream regulators: TFs that regulate ARID1A — names with evidence type (ChIP-seq / predicted / experimentally validated)

If ARID1A is not a transcription factor, say so briefly and skip the downstream/motif sections.

### Section 10: Drug & pharmacology data
For human gene ARID1A protein as a drug target, summarize pharmacology data.

If ARID1A is a known drug target:
- Targeting molecules: total count in ChEMBL/DrugBank + TOP 30 by development phase (molecule ID, name, mechanism, highest phase)
- Clinical trials: TOP 20 involving drugs targeting this gene — trial ID, phase, status, intervention
- Pharmacogenomics: known drug-gene interactions affecting drug response + dosing guidelines if any

If ARID1A is not currently a drug target, say so briefly.

### Section 11: Expression profiles
For human gene ARID1A, summarize expression profiles.

Tissue expression (GTEx, HPA, Bgee, etc.):
- TOP 30 tissues with expression scores/levels (direction, units if known)
- Note tissue-specific or tissue-enriched patterns

Cell type expression (Tabula Sapiens, HCA, etc.):
- TOP 30 cell types with expression scores
- Note cell-type-specific patterns

Single-cell expression: notable datasets or cell populations of interest for this gene.

### Section 12: Disease associations
For human gene ARID1A, summarize disease associations.

Mendelian / monogenic disease:
- Diseases caused by mutations in ARID1A: disease name, disease ID (OMIM/Orphanet/Mondo), inheritance pattern, evidence level
- Include all directly linked conditions

Phenotype associations:
- Clinical phenotypes associated with the gene (HPO terms where known)
- TOP 30 phenotype terms with HPO IDs

Complex-disease / GWAS:
- Traits and diseases significantly associated via GWAS: trait name, variant, effect size, study where known
- TOP 30 GWAS associations

Accepted Answer

Executive summary

ARID1A (AT-rich interactive domain-containing protein 1A, chromosome 1p36.11) is a core structural subunit of the BAF (SWI/SNF) chromatin-remodeling complex and a major tumor suppressor. Its HGNC ID is HGNC:11110, and the gene encodes a 2,285-amino-acid, ~242 kDa protein (UniProt O14497). Its primary biological role is chromatin remodeling and transcription coactivation rather than sequence-specific DNA binding, with ~3,461 documented protein interactions spanning SWI/SNF subunits (SMARCA4, …

Gene identifiers

Identifier	Value
HGNC ID	HGNC:11110
Approved symbol	ARID1A
Ensembl gene ID	ENSG00000117713
NCBI Entrez gene ID	8289
OMIM gene ID	603024
Genomic location (GRCh38)
Chromosome	1
Start position	26,693,236 bp
End position	26,782,104 bp
Strand	+ (forward)
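
As a quick sanity check on the coordinates above, the gene span can be computed directly. This is a minimal sketch that assumes the start and end positions are 1-based and inclusive (the usual Ensembl convention); the helper names `span_length` and `region_string` are illustrative and not part of any library.

```python
# Coordinates copied from the gene identifier table (GRCh38).
# Assumption: positions are 1-based and inclusive.
CHROM = "1"
START = 26_693_236
END = 26_782_104


def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def region_string(chrom: str, start: int, end: int) -> str:
    """Format an interval as 'chrom:start-end', the form most genome browsers accept."""
    return f"{chrom}:{start}-{end}"


gene_length = span_length(START, END)
region = region_string(CHROM, START, END)
print(region, gene_length, "bp")
```

The resulting region string can be pasted into a genome browser search box to confirm that the locus resolves to ARID1A.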

Ensembl Transcripts (17 total)

Ensembl ID	Biotype
ENST00000324856	protein_coding
ENST00000374152	protein_coding
ENST00000430291	retained_intron
ENST00000430799	protein_coding
ENST00000457599	protein_coding
ENST00000466382	nonsense_mediated_decay
ENST00000524572	protein_coding
ENST00000532781	nonsense_mediated_decay
ENST00000636072	retained_intron
ENST00000636110	retained_intron
ENST00000636219	protein_coding
ENST00000636422	retained_intron
ENST00000636794	nonsense_mediated_decay
ENST00000636958	protein_coding_CDS_not_defined
ENST00000637465	protein_coding
ENST00000637788	retained_intron
ENST00000850904	protein_coding
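
The total and per-biotype counts can be tallied from the rows above. The sketch below simply copies the table into a dictionary and counts values; the variable names are illustrative.

```python
from collections import Counter

# Transcript ID -> biotype, copied from the Ensembl transcript table.
TRANSCRIPTS = {
    "ENST00000324856": "protein_coding",
    "ENST00000374152": "protein_coding",
    "ENST00000430291": "retained_intron",
    "ENST00000430799": "protein_coding",
    "ENST00000457599": "protein_coding",
    "ENST00000466382": "nonsense_mediated_decay",
    "ENST00000524572": "protein_coding",
    "ENST00000532781": "nonsense_mediated_decay",
    "ENST00000636072": "retained_intron",
    "ENST00000636110": "retained_intron",
    "ENST00000636219": "protein_coding",
    "ENST00000636422": "retained_intron",
    "ENST00000636794": "nonsense_mediated_decay",
    "ENST00000636958": "protein_coding_CDS_not_defined",
    "ENST00000637465": "protein_coding",
    "ENST00000637788": "retained_intron",
    "ENST00000850904": "protein_coding",
}

total_transcripts = len(TRANSCRIPTS)
biotype_counts = Counter(TRANSCRIPTS.values())

print("Total transcripts:", total_transcripts)
for biotype, count in biotype_counts.most_common():
    print(f"  {biotype}: {count}")
```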

RefSeq mRNA Transcripts

NM ID	MANE Select
NM_006015	✓
NM_001080819
NM_001341479
NM_001363070
NM_001401271
NM_001401273
NM_001401275
NM_001401276
NM_001401278
NM_001401279
NM_118259
NM_139135

MANE Select Transcript Exons (20 total)

Exon ID	Start	End	Strand	Chromosome
ENSE00001907429	26696015	26697540	+	1
ENSE00003471930	26729651	26729863	+	1
ENSE00000902180	26731152	26731604	+	1
ENSE00001227857	26732676	26732792	+	1
ENSE00001349762	26760856	26761096	+	1
ENSE00001349761	26761384	26761473	+	1
ENSE00001349760	26762152	26762319	+	1
ENSE00001157462	26762973	26763285	+	1
ENSE00000766221	26766221	26766366	+	1
ENSE00001349753	26766457	26766566	+	1
ENSE00001349752	26767790	26767999	+	1
ENSE00000872621	26771119	26771326	+	1
ENSE00001227772	26772500	26772632	+	1
ENSE00001227767	26772812	26772987	+	1
ENSE00003672172	26773346	26773496	+	1
ENSE00003589420	26773580	26773717	+	1
ENSE00003460238	26773802	26773898	+	1
ENSE00001349739	26774329	26775220	+	1
ENSE00003552035	26775577	26775707	+	1
ENSE00001883917	26779023	26782104	+	1
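
Because the gene lies on the forward strand, transcript order corresponds to ascending genomic start. The following sketch, which assumes 1-based inclusive coordinates, sorts the exon rows, computes each exon length, and checks that no two exons overlap. Variable names are illustrative.

```python
# (exon ID, start, end) copied from the exon table; GRCh38, chromosome 1, + strand.
# Row order does not matter here because the list is sorted below.
EXONS = [
    ("ENSE00001907429", 26696015, 26697540),
    ("ENSE00003471930", 26729651, 26729863),
    ("ENSE00000902180", 26731152, 26731604),
    ("ENSE00001227857", 26732676, 26732792),
    ("ENSE00001349760", 26762152, 26762319),
    ("ENSE00001349761", 26761384, 26761473),
    ("ENSE00001349762", 26760856, 26761096),
    ("ENSE00001157462", 26762973, 26763285),
    ("ENSE00003672172", 26773346, 26773496),
    ("ENSE00003589420", 26773580, 26773717),
    ("ENSE00003460238", 26773802, 26773898),
    ("ENSE00000872621", 26771119, 26771326),
    ("ENSE00001227767", 26772812, 26772987),
    ("ENSE00001227772", 26772500, 26772632),
    ("ENSE00001349739", 26774329, 26775220),
    ("ENSE00003552035", 26775577, 26775707),
    ("ENSE00000766221", 26766221, 26766366),
    ("ENSE00001349753", 26766457, 26766566),
    ("ENSE00001349752", 26767790, 26767999),
    ("ENSE00001883917", 26779023, 26782104),
]

# On the forward strand, transcript order is ascending genomic start.
ordered = sorted(EXONS, key=lambda exon: exon[1])

# Exon length for 1-based, inclusive coordinates.
lengths = {exon_id: end - start + 1 for exon_id, start, end in ordered}
total_exonic_bp = sum(lengths.values())

# After sorting, each exon should start after the previous one ends.
overlaps = [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]

print(len(ordered), "exons;", total_exonic_bp, "exonic bp; overlaps:", overlaps)
```

The summed exon length gives the approximate size of the spliced transcript, including untranslated regions.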

Protein Domains and Families (InterPro)

ID	Name	Type
IPR001606	ARID_dom	Domain
IPR030094	ARID1A_ARID_BRIGHT_DNA-bd	Domain
IPR033388	BAF250_C	Domain
IPR021906	BAF250/Osa	Family
IPR011989	ARM-like	Homologous_superfamily
IPR016024	ARM-type_fold	Homologous_superfamily
IPR036431	ARID_dom_sf	Homologous_superfamily

Experimental Structures (PDB) (7 total)

PDB ID	Method	Resolution	Title
1RYU	NMR	N/A (solution NMR)	Solution Structure of the SWI1 ARID
6LTH	Cryo-EM	3.0 Å	Structure of human BAF Base module
6LTJ	Cryo-EM	3.7 Å	Structure of nucleosome-bound human BAF complex
9RL4	Cryo-EM	3.5 Å	Structure of BAF in complex with OCT4-SOX2-bound nucleosome - SHL-6
9RMC	Cryo-EM	4.2 Å	Structure of BAF in complex with OCT4-SOX2-bound nucleosome - SHL+6 class 1
9RN1	Cryo-EM	5.9 Å	Structure of BAF-nucleosome complex with OCT4-SOX2 at SHL+6 in ADP-bound state, BAF47 bound to ATPase lobe 2
9RN2	Cryo-EM	4.1 Å	Structure of BAF in complex with OCT4-SOX2-bound nucleosome - SHL+6 class 2

Cross-species orthologs

Organism	Gene ID	Symbol
Mouse (Mus musculus)	ENSMUSG00000007880	Arid1a
Rat (Rattus norvegicus)	ENSRNOG00000006137	Arid1a
Zebrafish (Danio rerio)	ENSDARG00000101710	arid1aa
Fruit fly (Drosophila melanogaster)	FBgn0261885	osa
Worm (C. elegans)	WBGene00002717	let-526
Yeast (S. cerevisiae)	none	none

ClinVar Summary

Classification	Count
Pathogenic	~50
Likely Pathogenic	~120
Uncertain Significance	~600
Likely Benign	~380
Benign	~420
Benign/Likely Benign	~150
Conflicting	~60
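
The approximate total variant count follows from summing the per-class counts above. The sketch below does that arithmetic and prints each class as a share of the total; since the inputs are approximate, the outputs are approximate as well.

```python
# Approximate ClinVar counts copied from the classification table.
CLINVAR_COUNTS = {
    "Pathogenic": 50,
    "Likely Pathogenic": 120,
    "Uncertain Significance": 600,
    "Likely Benign": 380,
    "Benign": 420,
    "Benign/Likely Benign": 150,
    "Conflicting": 60,
}

total_variants = sum(CLINVAR_COUNTS.values())
pathogenic_or_likely = CLINVAR_COUNTS["Pathogenic"] + CLINVAR_COUNTS["Likely Pathogenic"]

for label, count in CLINVAR_COUNTS.items():
    # Left-align the label, right-align the count and its share of the total.
    print(f"{label:<24}{count:>6}{count / total_variants:>8.1%}")
print(f"Approximate total: ~{total_variants}; P/LP combined: ~{pathogenic_or_likely}")
```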

Pathogenic/Likely Pathogenic Variants (ClinVar)

Variant ID	HGVS Notation	Classification	Associated Condition
1065491	c.175G>T (p.Glu59Ter)	Pathogenic	ARID1A-related disorder
1177329	c.166C>T (p.Gln56Ter)	Pathogenic	ARID1A-related disorder
1177330	c.1708_1766del (p.Pro570fs)	Pathogenic	ARID1A-related disorder
1177343	c.2914del (p.Asp972fs)	Pathogenic	ARID1A-related disorder
1182296	c.3230C>A (p.Ala1077Glu)	Pathogenic/Likely pathogenic	ARID1A-related disorder
1120179	c.5940_6000del (p.Val1982fs)	Pathogenic	ARID1A-related disorder
1323396	c.1850C>A (p.Ser617Ter)	Pathogenic	ARID1A-related disorder
1323404	c.2122C>T (p.Gln708Ter)	Pathogenic	ARID1A-related disorder
1028997	c.5963T>C (p.Ile1988Thr)	Likely pathogenic	ARID1A-related disorder
1172645	c.4049del (p.Ser1350fs)	Likely pathogenic	ARID1A-related disorder
1176188	c.3169T>C (p.Ser1057Pro)	Likely pathogenic	ARID1A-related disorder
1177344	c.3146T>G (p.Leu1049Arg)	Likely pathogenic	ARID1A-related disorder
1298412	c.791C>A (p.Ser264Ter)	Likely pathogenic	ARID1A-related disorder
1307168	c.4101G>A (p.Gln1367=)	Likely pathogenic	ARID1A-related disorder
1314744	c.2341A>G (p.Ile781Val)	Likely pathogenic	ARID1A-related disorder
1320102	c.595C>T (p.Gln199Ter)	Likely pathogenic	ARID1A-related disorder

TOP 30 Likely-Pathogenic Missense Variants (AlphaMissense)

Protein Variant	am_pathogenicity Score	Position
A2D	0.969	1:26696408
A2V	0.940	1:26696408
K25N	0.949	1:26696478
K26I	0.868	1:26696480
D75H	0.954	1:26696626
D75V	0.932	1:26696627
S79R	0.924	1:26696638
E72K	0.911	1:26696617
A3V	0.908	1:26696411
E78K	0.908	1:26696635
A3E	0.888	1:26696411
K71N	0.880	1:26696616
D75A	0.880	1:26696627
A6D	0.879	1:26696420
K25T	0.859	1:26696477
S11R	0.868	1:26696434
S12R	0.885	1:26696437
A3T	0.821	1:26696410
G14R	0.822	1:26696443
G76R	0.806	1:26696629
D75N	0.797	1:26696626
N80K	0.782	1:26696643
A2P	0.779	1:26696407
G76E	0.781	1:26696630
A3P	0.810	1:26696410
K26E	0.717	1:26696479
A9D	0.706	1:26696429
A8D	0.672	1:26696426
D75E	0.600	1:26696628
A10D	0.598	1:26696432
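
The rows above are not listed in strict score order, so a short script can rank them and confirm the class of each. This sketch uses the class cutoffs published with AlphaMissense (likely benign below 0.34, likely pathogenic above 0.564, ambiguous in between); the function and variable names are illustrative.

```python
# AlphaMissense class cutoffs as published with the model.
LIKELY_BENIGN_MAX = 0.34
LIKELY_PATHOGENIC_MIN = 0.564


def classify(score: float) -> str:
    """Map an am_pathogenicity score to its AlphaMissense class."""
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    return "ambiguous"


# Protein variant and score, copied from the table in its original row order.
TABLE = """
A2D 0.969
A2V 0.940
K25N 0.949
K26I 0.868
D75H 0.954
D75V 0.932
S79R 0.924
E72K 0.911
A3V 0.908
E78K 0.908
A3E 0.888
K71N 0.880
D75A 0.880
A6D 0.879
K25T 0.859
S11R 0.868
S12R 0.885
A3T 0.821
G14R 0.822
G76R 0.806
D75N 0.797
N80K 0.782
A2P 0.779
G76E 0.781
A3P 0.810
K26E 0.717
A9D 0.706
A8D 0.672
D75E 0.600
A10D 0.598
"""

variants = [(name, float(score)) for name, score in
            (line.split() for line in TABLE.strip().splitlines())]

# Rank by descending score; Python's sort is stable, so ties keep table order.
ranked = sorted(variants, key=lambda item: item[1], reverse=True)

for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank:>2}  {name:<6}{score:.3f}  {classify(score)}")
```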

TOP 30 High-Impact Splice Variants (SpliceAI)

Position	Variant	Effect	Delta Score
1:26697536	CTCAG>C	donor_loss	0.99
1:26697537	TCAG>T	donor_loss	0.99
1:26697538	CAGG>C	donor_loss	0.99
1:26697539	AG>A	donor_loss	0.99
1:26697540	GG>G	donor_loss	0.99
1:26697541	G>GA	donor_loss	0.99
1:26697542	T>A	donor_loss	0.99
1:26697549	C>G	donor_gain	0.99
1:26696224	GAGCC>G	donor_gain	0.86
1:26697036	G>GG	donor_gain	0.80
1:26697035	A>AG	donor_gain	0.79
1:26697541	G>GG	donor_gain	0.79
1:26698162	C>G	donor_gain	0.80
1:26696102	C>T	donor_gain	0.73
1:26698175	GTA>G	donor_gain	0.70
1:26697497	G>GT	donor_gain	0.62
1:26697111	T>TA	donor_gain	0.61
1:26698173	GAGTA>G	donor_gain	0.59
1:26697548	GC>G	donor_gain	0.59
1:26698145	G>GT	donor_gain	0.52
1:26697112	G>GA	donor_gain	0.68
1:26696573	A>T	donor_gain	0.67
1:26698162	C>CG	donor_gain	0.21
1:26696163	C>G	donor_gain	0.63
1:26697693	G>T	donor_gain	0.62
1:26697381	C>T	donor_gain	0.26
1:26698178	G>GG	donor_gain	0.68
1:26696470	G>A	donor_gain	0.91
1:26697656	G>T	donor_gain	0.26
1:26697367	G>GT	donor_gain	0.25
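
Delta scores in this table range from 0.21 to 0.99, so it can help to bin them. The sketch below assumes the cutoffs suggested by the SpliceAI authors (0.2 high recall, 0.5 recommended, 0.8 high precision); the bucket labels and names are illustrative.

```python
from collections import Counter

# Position, variant, effect, delta score: copied from the splice table.
TABLE = """
1:26697536 CTCAG>C donor_loss 0.99
1:26697537 TCAG>T donor_loss 0.99
1:26697538 CAGG>C donor_loss 0.99
1:26697539 AG>A donor_loss 0.99
1:26697540 GG>G donor_loss 0.99
1:26697541 G>GA donor_loss 0.99
1:26697542 T>A donor_loss 0.99
1:26697549 C>G donor_gain 0.99
1:26696224 GAGCC>G donor_gain 0.86
1:26697036 G>GG donor_gain 0.80
1:26697035 A>AG donor_gain 0.79
1:26697541 G>GG donor_gain 0.79
1:26698162 C>G donor_gain 0.80
1:26696102 C>T donor_gain 0.73
1:26698175 GTA>G donor_gain 0.70
1:26697497 G>GT donor_gain 0.62
1:26697111 T>TA donor_gain 0.61
1:26698173 GAGTA>G donor_gain 0.59
1:26697548 GC>G donor_gain 0.59
1:26698145 G>GT donor_gain 0.52
1:26697112 G>GA donor_gain 0.68
1:26696573 A>T donor_gain 0.67
1:26698162 C>CG donor_gain 0.21
1:26696163 C>G donor_gain 0.63
1:26697693 G>T donor_gain 0.62
1:26697381 C>T donor_gain 0.26
1:26698178 G>GG donor_gain 0.68
1:26696470 G>A donor_gain 0.91
1:26697656 G>T donor_gain 0.26
1:26697367 G>GT donor_gain 0.25
"""


def bucket(delta: float) -> str:
    """Assign a delta score to the strictest cutoff it meets."""
    if delta >= 0.8:
        return "high_precision"
    if delta >= 0.5:
        return "recommended"
    if delta >= 0.2:
        return "high_recall"
    return "below_cutoffs"


rows = [line.split() for line in TABLE.strip().splitlines()]
bucket_counts = Counter(bucket(float(delta)) for _, _, _, delta in rows)
effect_counts = Counter(effect for _, _, effect, _ in rows)

print(dict(bucket_counts))
print(dict(effect_counts))
```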

Reactome Pathways

ID	Pathway Name
R-HSA-3214858	RMTs methylate histone arginines
R-HSA-8939243	RUNX1 interacts with co-factors whose precise effect on RUNX1 targets is not known
R-HSA-9764790	Positive Regulation of CDH1 Gene Transcription
R-HSA-9824585	Regulation of MITF-M-dependent genes involved in pigmentation
R-HSA-9845323	Regulation of endogenous retroelements by Piwi-interacting RNAs (piRNAs)
R-HSA-9933937	Formation of the canonical BAF (cBAF) complex
R-HSA-9933946	Formation of the embryonic stem cell BAF (esBAF) complex
R-HSA-9934037	Formation of neuronal progenitor and neuronal BAF (npBAF and nBAF)

Gene Ontology: Biological Process

GO ID	Term
GO:0006325	Chromatin organization
GO:0006337	Nucleosome disassembly
GO:0006338	Chromatin remodeling
GO:0006357	Regulation of transcription by RNA polymerase II
GO:0007399	Nervous system development
GO:0030071	Regulation of mitotic metaphase/anaphase transition
GO:0045582	Positive regulation of T cell differentiation
GO:0045597	Positive regulation of cell differentiation
GO:0045663	Positive regulation of myoblast differentiation
GO:0045815	Transcription initiation-coupled chromatin remodeling
GO:0045893	Positive regulation of DNA-templated transcription
GO:0070316	Regulation of G0 to G1 transition
GO:1902459	Positive regulation of stem cell population maintenance
GO:2000045	Regulation of G1/S transition of mitotic cell cycle
GO:2000781	Positive regulation of double-strand break repair
GO:2000819	Regulation of nucleotide-excision repair

Gene Ontology: Molecular Function

GO ID	Term
GO:0003677	DNA binding
GO:0003713	Transcription coactivator activity
GO:0005515	Protein binding
GO:0016922	Nuclear receptor binding
GO:0031491	Nucleosome binding

Gene Ontology: Cellular Component

GO ID	Term
GO:0000785	Chromatin
GO:0005634	Nucleus
GO:0005654	Nucleoplasm
GO:0016514	SWI/SNF complex
GO:0035060	Brahma complex
GO:0071564	npBAF complex
GO:0071565	nBAF complex
GO:0140092	bBAF complex

Protein-Protein Interactions

Rank	Protein	Interaction Count	Size (aa)	Function
1	SMARCA4 (BRG1)	5,148	1,647	SWI/SNF ATPase catalytic subunit
2	CREBBP (CBP)	5,086	2,442	Histone acetyltransferase
3	TERT	5,450	1,132	Telomerase reverse transcriptase
4	ATRX	5,224	2,492	Chromatin remodeler, H3.3 chaperone
5	ACTL6A (BAF53A)	4,850	429	SWI/SNF actin-like component
6	ACTL6B (BAF53B)	4,366	426	Neural-specific SWI/SNF subunit
7	SMARCA2 (BRM)	3,380	1,590	SWI/SNF ATPase variant
8	SMARCB1 (BAF47)	3,478	394	SWI/SNF core subunit
9	PPP2R1A	3,046	589	Protein phosphatase 2A regulatory subunit
10	KMT2C (MLL3)	2,988	4,911	Histone H3K4 methyltransferase
11	PBRM1 (BAF180)	2,972	1,634	SWI/SNF-B (PBAF) subunit
12	SMARCC1	2,936	1,105	SWI/SNF core component
13	SMARCC2	2,760	1,245	SWI/SNF core component
14	SMARCA1 (SNF2L1)	4,094	1,070	NURF/CERF ATPase
15	SMARCA5 (ISWI)	5,382	1,052	Chromatin remodeler
16	BRD4	5,234	1,362	Acetyl-histone reader, elongation factor
17	EP300 (p300)	6,450	2,414	Histone acetyltransferase
18	KMT2D (MLL2)	2,772	5,537	H3K4 methyltransferase
19	EZH2	7,058	751	Polycomb PRC2 methyltransferase
20	TP53 (p53)	14,764	393	Tumor suppressor
21	PTEN	9,614	403	Phosphatase tumor suppressor
22	BRCA1	6,120	1,884	DNA repair E3 ubiquitin ligase
23	BRCA2	3,778	3,418	DNA repair recombination protein
24	ATM	6,446	3,056	DNA damage sensor kinase
25	PIK3CA	4,602	1,068	PI3K catalytic subunit α
26	KRAS	10,098	189	Ras GTPase
27	NRAS	6,520	189	Ras GTPase
28	BRAF	6,138	767	RAF serine/threonine kinase
29	AKT1	14,324	480	Serine/threonine kinase
30	EGFR	11,600	1,210	Epidermal growth factor receptor

Tissue expression (Bgee)

Rank	Tissue/Cell Type	Expression Score	Quality
1	Bone marrow cell	96.39	Gold
2	Ventricular zone	96.34	Gold
3	Embryo	96.24	Gold
4	Colonic epithelium	96.01	Gold
5	Ileal mucosa	95.83	Gold
6	Cortical plate	95.76	Gold
7	Ganglionic eminence	95.46	Gold
8	Caput epididymis	95.05	Gold
9	Corpus epididymis	94.87	Gold
10	Sural nerve	94.87	Gold
11	Trabecular bone tissue	94.78	Gold
12	Adult organism	94.36	Gold
13	Nipple	94.30	Gold
14	Pigmented layer of retina	94.26	Gold
15	Pylorus	94.10	Gold
16	Nasal cavity epithelium	94.03	Gold
17	Tonsil	93.83	Gold
18	Lower lobe of lung	93.69	Gold
19	Cauda epididymis	93.35	Gold
20	Mammary duct	93.31	Gold
21	Oocyte	93.28	Gold
22	Cardia of stomach	93.16	Gold
23	Mammalian vulva	93.15	Gold
24	Epithelium of mammary gland	92.99	Gold
25	Seminal vesicle	92.86	Gold
26	Upper leg skin	92.75	Gold
27	Tibialis anterior	92.67	Gold
28	Thymus	92.63	Gold
29	Superficial temporal artery	92.61	Gold
30	Leukocyte	92.37	Gold

Single-cell and cell type expression (SCXA)

Cell Type	Cluster	Expression Score	Log Fold Change
Multi-lymphoid progenitor	1	High	+2.48 to +4.49
Granulocyte macrophage progenitor	2	8.41	+2.57
Lymphoid-primed multipotent progenitor	3	Variable	Variable

Mendelian/Monogenic Diseases

Disease	Disease ID	Inheritance	Evidence Level
Coffin-Siris syndrome 2 (formerly intellectual disability, autosomal dominant 14)	OMIM:614607 / MONDO:0013819	Autosomal dominant	Definitive (GenCC), Strong (clinical)
Coffin-Siris syndrome 1	ORPHANET:1465 / MONDO:0007617	Autosomal dominant	Supportive (GenCC)

Phenotype Associations (HPO, Top 30)

Phenotype	HPO ID	Category
Intellectual disability	HP:0001249	Neurological
Global developmental delay	HP:0001263	Developmental
Seizure	HP:0001250	Neurological
Microcephaly	HP:0000252	Craniofacial
Cerebellar hypoplasia	HP:0001321	CNS
Agenesis of corpus callosum	HP:0001274	CNS
Autistic behavior	HP:0000729	Behavioral
Delayed speech and language development	HP:0000750	Developmental
Broad philtrum	HP:0000289	Facial
Coarse facial features	HP:0000280	Facial
Wide mouth	HP:0000154	Facial
Short stature	HP:0004322	Growth
Growth delay	HP:0001510	Growth
Intrauterine growth retardation	HP:0001511	Growth
Feeding difficulties	HP:0011968	GI/Developmental
Hypotonia	HP:0001252	Motor
Short 5th finger	HP:0009237	Skeletal
Brachydactyly	HP:0001156	Skeletal
Ventricular septal defect	HP:0001629	Cardiac
Atrial septal defect	HP:0001631	Cardiac
Patent ductus arteriosus	HP:0001643	Cardiac
Cleft palate	HP:0000175	Orofacial
Macroglossia	HP:0000158	Oral
Strabismus	HP:0000486	Ocular
Ptosis	HP:0000508	Ocular
Hearing impairment	HP:0000365	Auditory
Hepatoblastoma	HP:0002884	Neoplastic
Recurrent infections	HP:0002719	Immunological
Joint hypermobility	HP:0001382	Skeletal
Scoliosis	HP:0002650	Skeletal

GWAS Associations (Top 30)

Trait	P-value	Category
HDL cholesterol levels × alcohol consumption (drinkers vs non-drinkers interaction)	7e-156	Lipid metabolism
HDL cholesterol levels × alcohol consumption (drinkers vs non-drinkers interaction)	4e-154	Lipid metabolism
HDL cholesterol levels	4e-38	Lipid metabolism
Apolipoprotein B levels	1e-38	Lipid metabolism
Triglyceride levels	4e-35	Lipid metabolism
Apolipoprotein A1 levels	2e-28	Lipid metabolism
Heel bone mineral density	2e-25	Skeletal
Platelet count	4e-24	Hematologic
LDL cholesterol levels × alcohol consumption (regular vs non-regular drinkers)	2e-14	Lipid metabolism
Plateletcrit	7e-15	Hematologic
Total cholesterol levels	2e-08	Lipid metabolism
Alanine aminotransferase levels	3e-08	Liver function
Monocyte percentage of white cells	6e-12	Hematologic
Metabolic syndrome	2e-12	Metabolic
Granulocyte percentage of myeloid white cells	2e-12	Hematologic
Neutrophil count	2e-09	Hematologic
Heel bone mineral density	1e-09	Skeletal
Liver volume	7e-10	Hepatic
HDL cholesterol levels in current drinkers	4e-08	Lipid metabolism
Heel bone mineral density	4e-11	Skeletal
HDL cholesterol levels × alcohol consumption (regular vs non-regular drinkers interaction)	2e-18	Lipid metabolism
LDL cholesterol levels × alcohol consumption (drinkers vs non-drinkers)	4e-13	Lipid metabolism
White matter hyperintensities in ischemic stroke	2e-06	Neurological
Rosacea symptom severity	1e-06	Dermatologic
LDL cholesterol levels × alcohol consumption (regular vs non-regular drinkers)	4e-15	Lipid metabolism
HDL cholesterol levels (drinkers vs non-drinkers)	4e-19	Lipid metabolism
Triglyceride levels × alcohol consumption (regular vs non-regular drinkers)	2e-06	Lipid metabolism
Triglyceride levels × alcohol consumption (drinkers vs non-drinkers)	1e-06	Lipid metabolism
HDL cholesterol levels	5e-07	Lipid metabolism
HDL cholesterol levels	2e-07	Lipid metabolism
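
Not every row above reaches the conventional genome-wide significance threshold. The sketch below assumes the commonly used cutoff of p < 5e-8 (the table itself does not state which threshold was applied) and reports which table rows pass; rows are referenced by their position in the table.

```python
import math

# Conventional genome-wide significance cutoff (an assumption; not stated in the table).
GENOME_WIDE = 5e-8

# P-values copied from the GWAS table, in row order.
P_VALUES = [
    "7e-156", "4e-154", "4e-38", "1e-38", "4e-35", "2e-28", "2e-25", "4e-24",
    "2e-14", "7e-15", "2e-08", "3e-08", "6e-12", "2e-12", "2e-12", "2e-09",
    "1e-09", "7e-10", "4e-08", "4e-11", "2e-18", "4e-13", "2e-06", "1e-06",
    "4e-15", "4e-19", "2e-06", "1e-06", "5e-07", "2e-07",
]

rows = [(index, float(p)) for index, p in enumerate(P_VALUES, start=1)]

significant_rows = [index for index, p in rows if p < GENOME_WIDE]
suggestive_rows = [index for index, p in rows if p >= GENOME_WIDE]

# -log10(p) is the scale usually plotted on a Manhattan plot.
strongest_row, strongest_p = min(rows, key=lambda row: row[1])
strongest_neg_log10 = -math.log10(strongest_p)

print("Genome-wide significant rows:", significant_rows)
print("Rows above the cutoff:", suggestive_rows)
print(f"Strongest signal: row {strongest_row}, -log10(p) = {strongest_neg_log10:.1f}")
```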

Executive summary

ARID1A — Reference

Gene identifiers

Transcript identifiers

Ensembl Transcripts

RefSeq mRNA Transcripts

CCDS IDs

MANE Select Transcript Exons

Protein identifiers

UniProt Accessions

RefSeq Protein Accessions (NP_)

Protein Domains and Families (InterPro)

Pfam Domains

Antibody Availability

Structure

Experimental Structures (PDB)

Predicted Structures (AlphaFold)

Cross-species orthologs

Clinical variants & AI predictions

ClinVar Summary

TOP 30 Pathogenic/Likely Pathogenic Variants (ClinVar)

AlphaMissense Pathogenicity Predictions

TOP 30 Likely-Pathogenic Missense Variants (AlphaMissense)

Splice Effect Predictions (SpliceAI)

TOP 30 High-Impact Splice Variants

Pathways & Gene Ontology

Reactome Pathways

MSigDB Gene Sets

Gene Ontology Annotations

Biological Process

Molecular Function

Cellular Component

Protein interactions & networks

Protein-Protein Interactions

Protein Similarity

Transcription factor regulatory data

No JASPAR DNA Binding Motifs

Downstream Targets (Limited)

Upstream Regulators

Drug & pharmacology data

Expression profiles

Tissue expression (Bgee)

Single-cell and cell type expression (SCXA)

Disease associations

Mendelian/Monogenic Diseases

Phenotype Associations (HPO, Top 30)

GWAS Associations (Top 30)