BRCA1 Gene Complete Identifier and Functional Mapping Reference

Provide a comprehensive cross-database identifier and functional mapping reference for human BRCA1. This should serve as a definitive lookup resource for researchers.

SECTION 1: GENE IDENTIFIERS

Provide ALL gene-level database identifiers:
- HGNC ID and approved symbol
- Ensembl gene ID (ENSG)
- NCBI Entrez Gene ID
- OMIM gene/locus ID
- Genomic location: chromosome, start position, end position, strand

SECTION 2: TRANSCRIPT IDENTIFIERS

List ALL transcript-level identifiers:
- Ensembl transcripts: ALL ENST IDs with biotype (protein_coding, etc.). How many total transcripts?
- RefSeq transcripts: ALL NM_ mRNA accessions. Mark which is MANE Select (canonical clinical standard)
- CCDS IDs: ALL consensus coding sequence identifiers

For the CANONICAL/MANE SELECT transcript:
- List ALL exon IDs (ENSE) with genomic coordinates
- Total exon count

SECTION 3: PROTEIN IDENTIFIERS

List ALL protein-level identifiers:
- UniProt accessions: ALL entries (reviewed and unreviewed). Mark the canonical reviewed entry
- RefSeq protein: ALL NP_ accessions

Protein domains and families:
- List ALL annotated domains/families with identifiers
- Include: domain name, type (domain/family/superfamily), and ID

SECTION 4: STRUCTURE IDENTIFIERS

Experimental structures:
- List ALL PDB structure IDs
- For each: experimental method (X-ray, NMR, Cryo-EM) and resolution
- Total PDB structure count

Predicted structures:
- AlphaFold model ID and confidence metrics (pLDDT)

SECTION 5: CROSS-SPECIES ORTHOLOGS

List orthologous genes in key model organisms (where available):
- Mouse (Mus musculus): gene ID, symbol
- Rat (Rattus norvegicus): gene ID, symbol
- Zebrafish (Danio rerio): gene ID, symbol
- Fruit fly (Drosophila melanogaster): gene ID, symbol
- Worm (C. elegans): gene ID, symbol
- Yeast (S. cerevisiae): gene ID, symbol

SECTION 6: CLINICAL VARIANTS & AI PREDICTIONS

Clinical variant annotations:
- Total variant count in clinical databases
- Breakdown by classification: Pathogenic, Likely Pathogenic, Uncertain Significance (VUS), Likely Benign, Benign
- List TOP 50 pathogenic/likely pathogenic variants with: variant ID, HGVS notation, associated condition

AI-based variant effect predictions:
- Splice effect predictions: Total count. List TOP 50 predicted splice-altering variants with delta scores
- Missense pathogenicity predictions: Total count. List TOP 50 predicted pathogenic missense variants with scores

SECTION 7: BIOLOGICAL PATHWAYS & GENE ONTOLOGY

Pathway membership:
- List ALL biological pathways this gene participates in
- Include pathway IDs and names
- Total pathway count

Gene Ontology annotations:
- Biological Process: count and TOP 20 terms with IDs
- Molecular Function: count and TOP 20 terms with IDs
- Cellular Component: count and TOP 20 terms with IDs

SECTION 8: PROTEIN INTERACTIONS & MOLECULAR NETWORKS

Protein-protein interactions:
- Total interaction count
- List TOP 50 highest-confidence interacting proteins with scores

Protein similarity (evolutionary and structural):
- Structural/embedding similarity: How many similar proteins? List TOP 20 with similarity scores
- Sequence homology: How many homologous proteins? List TOP 20 with identity/similarity scores

SECTION 9: TRANSCRIPTION FACTOR REGULATORY DATA

If this gene encodes a transcription factor:

Downstream targets (genes regulated BY this TF):
- Total target gene count
- List TOP 50 target genes with regulation type (activates/represses)

DNA binding profiles:
- List ALL known binding motif IDs
- Motif family classification

Upstream regulators (TFs that regulate THIS gene):
- List known transcriptional regulators with evidence type

SECTION 10: DRUG & PHARMACOLOGY DATA

If this gene/protein is a drug target:

Targeting molecules:
- How many drug/compound molecules target this protein?
- List TOP 30 molecules by development phase
- Include: molecule ID, name, mechanism, highest development phase

Clinical trials:
- How many clinical trials involve drugs targeting this gene?
- List TOP 20 trials with: trial ID, phase, status, intervention

Pharmacogenomics:
- Known drug-gene interactions affecting drug response
- Dosing guidelines if any exist

SECTION 11: EXPRESSION PROFILES

Tissue expression:
- Which tissues express this gene most highly?
- List TOP 30 tissues with expression scores/levels
- Note any tissue-specific or tissue-enriched patterns

Cell type expression:
- Which cell types show highest expression?
- List TOP 30 cell types with expression scores
- Note any cell type-specific patterns

Single-cell expression data (if available):
- Which single-cell datasets/experiments include this gene?
- Notable cell population patterns

SECTION 12: DISEASE ASSOCIATIONS

Mendelian/monogenic disease links:
- What diseases are caused by mutations in this gene?
- List ALL associated diseases with: disease name, disease ID, inheritance pattern, evidence level

Phenotype associations:
- What clinical phenotypes are associated with this gene?
- List TOP 50 phenotype terms with IDs

Complex trait associations (GWAS):
- What traits/diseases are linked via genome-wide association studies?
- List TOP 30 GWAS associations with: trait, study ID, p-value, effect size if available

FORMATTING REQUIREMENTS

- Present each section with clear headers
- Use tables for listing multiple items
- Always include COUNTS for every category
- For lists with >50 items: show total count + TOP 50
- For lists with <50 items: show ALL items
- Always include database identifiers (IDs), not just names
- Mark canonical/primary/reviewed entries where applicable
- If a section has no data available, state "No data available" rather than omitting the section

Structured Data Sources

This answer was generated using Claude Opus 4.5 + BioBTree MCP, querying 36 biological databases.

Datasets: alphafold, alphamissense, bgee, biogrid_interaction, ccds, chembl_molecule, chembl_target, clinvar, collectri, diamond_similarity, ensembl, entrez, esm2_similarity, exon, gencc, go, gwas, hgnc, hpo, intact, interpro, mim, orphanet, ortholog, pdb, pharmgkb, pharmgkb_gene, reactome, refseq, scxa, signor, spliceai, string_interaction, transcript, ufeature, uniprot
Generated: 2026-03-26 — For the latest data, query BioBTree directly via MCP or API.

HUMAN BRCA1 COMPREHENSIVE CROSS-DATABASE IDENTIFIER AND FUNCTIONAL MAPPING REFERENCE

Section 1: Gene Identifiers

Database | Identifier | Notes
HGNC ID | HGNC:1100 | Approved
Approved Symbol | BRCA1
Approved Name | BRCA1 DNA repair associated
Previous Names | breast cancer 1, early onset; breast cancer 1
Aliases | RNF53, BRCC1, PPP1R53, FANCS
Ensembl Gene ID | ENSG00000012048
NCBI Entrez Gene ID | 672
OMIM Gene ID | 113705 | Gene/locus
Locus Group | protein-coding gene
Locus Type | gene with protein product

Genomic Location (GRCh38/hg38)
Parameter | Value
Chromosome | 17
Cytogenetic Band | 17q21.31
Start Position | 43,044,292
End Position | 43,170,245
Strand | Minus (-)
Gene Span | 125,954 bp
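
The gene span follows from the start and end positions if the coordinates are read as a closed, 1-based interval, which is an assumption about how the source database reports them. A minimal sketch of that arithmetic (the function name `gene_span` is illustrative, not part of any database API):

```python
def gene_span(start: int, end: int) -> int:
    """Length in bp of a closed, 1-based interval [start, end]."""
    if end < start:
        raise ValueError("end must be >= start")
    # Both endpoints are counted, hence the +1.
    return end - start + 1


# Coordinates from the GRCh38 table above.
BRCA1_START = 43_044_292
BRCA1_END = 43_170_245

span = gene_span(BRCA1_START, BRCA1_END)
print(f"{span:,} bp")
```

Under a half-open (0-based) convention the same coordinates would give a span one base shorter, so it is worth checking the convention before comparing spans across databases.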

Section 2: Transcript Identifiers

Ensembl Transcripts
Total Count: 47 transcripts

Transcript ID | Biotype | Start | End | Notes
ENST00000357654 | protein_coding | 43044295 | 43125364 | Canonical
ENST00000352993 | protein_coding | 43044295 | 43125370
ENST00000461574 | protein_coding | 43044304 | 43125370
ENST00000468300 | protein_coding | 43044805 | 43125451
ENST00000470026 | protein_coding | 43044304 | 43125343
ENST00000471181 | protein_coding | 43044295 | 43125483
ENST00000473961 | protein_coding | 43044295 | 43125364
ENST00000476777 | protein_coding | 43044298 | 43125360
ENST00000477152 | protein_coding | 43044295 | 43125364
ENST00000478531 | protein_coding | 43044295 | 43125359
ENST00000484087 | protein_coding | 43045434 | 43125343
ENST00000489037 | protein_coding | 43044295 | 43125321
ENST00000491747 | protein_coding | 43045678 | 43125356
ENST00000493795 | protein_coding | 43045629 | 43125402
ENST00000493919 | protein_coding | 43044295 | 43125402
ENST00000494123 | protein_coding | 43044295 | 43125450
ENST00000497488 | protein_coding | 43044295 | 43125300
ENST00000586385 | protein_coding | 43045563 | 43125329
ENST00000591534 | protein_coding | 43045563 | 43125329
ENST00000591849 | protein_coding | 43045563 | 43125329
ENST00000618469 | protein_coding | 43044304 | 43125364
ENST00000634433 | protein_coding | 43044295 | 43170245
ENST00000644379 | protein_coding | 43045086 | 43125364
ENST00000644555 | protein_coding | 43044304 | 43125370
ENST00000652672 | protein_coding | 43044295 | 43125483
ENST00000700182 | protein_coding | 43070344 | 43125343
ENST00000713676 | protein_coding | 43044304 | 43125343
ENST00000899954 | protein_coding | 43044805 | 43125483
ENST00000921914 | protein_coding | 43044292 | 43125476
ENST00000921915 | protein_coding | 43044295 | 43125367
ENST00000921916 | protein_coding | 43044304 | 43125360
ENST00000945268 | protein_coding | 43044299 | 43125410
ENST00000945269 | protein_coding | 43044824 | 43125487
ENST00000354071 | retained_intron | 43091098 | 43125315
ENST00000472490 | retained_intron | 43067274 | 43071066
ENST00000700081 | retained_intron | 43044302 | 43050409
ENST00000700082 | retained_intron | 43044981 | 43048473
ENST00000700083 | retained_intron | 43104225 | 43106139
ENST00000700184 | retained_intron | 43099643 | 43124339
ENST00000461221 | nonsense_mediated_decay | 43045678 | 43125288
ENST00000461798 | nonsense_mediated_decay | 43099831 | 43125370
ENST00000492859 | nonsense_mediated_decay | 43094112 | 43125300
ENST00000642945 | nonsense_mediated_decay | 43094482 | 43125343
ENST00000700183 | nonsense_mediated_decay | 43094033 | 43125343
ENST00000621897 | protein_coding_CDS_not_defined | 43076488 | 43082760
ENST00000700185 | protein_coding_CDS_not_defined | 43114391 | 43125370
ENST00000700186 | protein_coding_CDS_not_defined | 43122343 | 43125370

RefSeq Transcripts (mRNA)
Total Count: 100+ transcripts

Accession | Status | MANE Select
NM_007294 | REVIEWED | Yes
NM_001407571 | REVIEWED
NM_001407581 | REVIEWED
NM_001407582 | REVIEWED
NM_001407583 | REVIEWED
NM_001407585 | REVIEWED
NM_001407587 | REVIEWED
NM_001407590 | REVIEWED
NM_001407591 | REVIEWED
NM_001407593 | REVIEWED
... and 90+ additional transcripts

CCDS IDs
Total Count: 5

CCDS ID
CCDS11453
CCDS11454
CCDS11455
CCDS11456
CCDS11459

Canonical Transcript Exons (ENST00000357654)
Total Exon Count: 23

Exon ID | Start | End | Strand
ENSE00001852567 | 43125271 | 43125364 | -
ENSE00003559512 | 43124017 | 43124115 | -
ENSE00003510592 | 43115726 | 43115779 | -
ENSE00003541068 | 43106456 | 43106533 | -
ENSE00003531836 | 43104868 | 43104956 | -
ENSE00001917948 | 43104122 | 43104261 | -
ENSE00004011563 | 43099775 | 43099880 | -
ENSE00004011550 | 43097244 | 43097289 | -
ENSE00004011559 | 43095846 | 43095922 | -
ENSE00004011566 | 43091435 | 43094860 | -
ENSE00004011558 | 43090944 | 43091032 | -
ENSE00004011560 | 43082404 | 43082575 | -
ENSE00004011561 | 43076488 | 43076614 | -
ENSE00004011562 | 43074331 | 43074521 | -
ENSE00004011554 | 43070928 | 43071238 | -
ENSE00004011556 | 43067608 | 43067695 | -
ENSE00004011565 | 43063874 | 43063951 | -
ENSE00004011567 | 43063333 | 43063373 | -
ENSE00004011551 | 43057052 | 43057135 | -
ENSE00004011553 | 43051063 | 43051117 | -
ENSE00004011568 | 43049121 | 43049194 | -
ENSE00004011552 | 43047643 | 43047703 | -
ENSE00004011564 | 43044295 | 43045802 | -
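
Because BRCA1 lies on the minus strand, the first exon of the transcript has the highest genomic coordinates. The sketch below orders a handful of the exons listed above in transcription order and reports their lengths. The helper names `transcription_order` and `exon_length` are hypothetical, and closed 1-based coordinates are assumed.

```python
from typing import List, Tuple

Exon = Tuple[str, int, int]  # (exon ID, start, end), 1-based inclusive


def transcription_order(exons: List[Exon], strand: str) -> List[Exon]:
    """Sort exons 5'->3': ascending start on '+', descending start on '-'."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return sorted(exons, key=lambda e: e[1], reverse=(strand == "-"))


def exon_length(exon: Exon) -> int:
    """Exon length in bp, counting both endpoints."""
    _, start, end = exon
    return end - start + 1


# A subset of the ENST00000357654 exons, deliberately shuffled.
subset = [
    ("ENSE00004011564", 43044295, 43045802),
    ("ENSE00003541068", 43106456, 43106533),
    ("ENSE00001852567", 43125271, 43125364),
    ("ENSE00003531836", 43104868, 43104956),
    ("ENSE00003510592", 43115726, 43115779),
    ("ENSE00003559512", 43124017, 43124115),
]

ordered = transcription_order(subset, "-")
for rank, exon in enumerate(ordered, start=1):
    print(rank, exon[0], exon_length(exon), "bp")
```

The printed rank is relative to the subset only; ranking against the full transcript requires all 23 exons.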

Section 3: Protein Identifiers

UniProt Accessions
Total Count: 22 entries

Accession | Status | Notes
P38398 | Reviewed (Swiss-Prot) | CANONICAL
A0A0U1RRA9 | Unreviewed
A0A2R8Y6Y9 | Unreviewed
A0A2R8Y7V5 | Unreviewed
A0A494C182 | Unreviewed
A0A8V8TPY7 | Unreviewed
A0A9Y1QPT7 | Unreviewed
A0A9Y1QQK3 | Unreviewed
C6YB45 | Unreviewed
C9IZW4 | Unreviewed
E7ENB7 | Unreviewed
E7EQW4 | Unreviewed
E7EUM2 | Unreviewed
E7EWN5 | Unreviewed
E9PC22 | Unreviewed
G1UI37 | Unreviewed
H0Y850 | Unreviewed
H0Y8B8 | Unreviewed
H0Y8D8 | Unreviewed
K7EJW3 | Unreviewed
K7EPC7 | Unreviewed
Q3B891 | Unreviewed

Canonical Protein (P38398) Properties
Property | Value
Protein Name | Breast cancer type 1 susceptibility protein
Alternative Names | RING finger protein 53; RING-type E3 ubiquitin transferase BRCA1
Length | 1,863 amino acids
Molecular Mass | 207,721 Da

RefSeq Protein Accessions
Total Count: 100+ proteins

Accession | Status | Notes
NP_009225 | REVIEWED | Corresponds to MANE Select
NP_001394500 | REVIEWED
NP_001394510 | REVIEWED
NP_001394511 | REVIEWED
... and 96+ additional proteins

Protein Domains and Families (InterPro)
Total Count: 9

InterPro ID | Short Name | Type
IPR001357 | BRCT_dom | Domain
IPR001841 | Znf_RING | Domain
IPR011364 | BRCA1 | Family
IPR013083 | Znf_RING/FYVE/PHD | Homologous_superfamily
IPR017907 | Znf_RING_CS | Conserved_site
IPR018957 | Znf_C3HC4_RING-type | Domain
IPR025994 | BRCA1_serine_dom | Domain
IPR031099 | BRCA1-associated | Family
IPR036420 | BRCT_dom_sf | Homologous_superfamily

Section 4: Structure Identifiers

Experimental Structures (PDB)
Total Count: 33 structures

PDB ID | Title | Method | Resolution
1JM7 | BRCA1/BARD1 RING-domain heterodimer | NMR | -
1JNX | BRCT repeat region | X-ray | 2.5 Å
1N5O | BRCA1-BRCT missense mutation | X-ray | 2.8 Å
1OQA | BRCT-c domain | NMR | -
1T15 | BRCT + BACH1 phosphopeptide | X-ray | 1.85 Å
1T29 | BRCT + phospho-BACH1 | X-ray | 2.3 Å
1T2U | BRCT V1809F variant | X-ray | 2.8 Å
1T2V | BRCT + phosphopeptide | X-ray | 3.3 Å
1Y98 | BRCT + CtIP phosphopeptide | X-ray | 2.5 Å
2ING | BRCT M1775K variant | X-ray | 3.6 Å
3COJ | BRCT + ACC1 phosphopeptide | X-ray | 3.21 Å
3K0H | BRCT + minimal tetrapeptide | X-ray | 2.7 Å
3K0K | BRCT + tetrapeptide | X-ray | 2.7 Å
3K15 | BRCT D1840T + tetrapeptide | X-ray | 2.8 Å
3K16 | BRCT D1840T + tetrapeptide | X-ray | 3.0 Å
3PXA | BRCT G1656D variant | X-ray | 2.55 Å
3PXB | BRCT T1700A variant | X-ray | 2.5 Å
3PXC | BRCT R1699Q variant | X-ray | 2.8 Å
3PXD | BRCT R1835P variant | X-ray | 2.8 Å
3PXE | BRCT E1836K variant | X-ray | 2.85 Å
4IFI | BRCT + BAAT peptide | X-ray | 2.2 Å
4IGK | BRCT + ATRIP peptide | X-ray | 1.75 Å
4JLU | BRCT + phospho-Abraxas | X-ray | 3.5 Å
4OFB | BRCT + nonphosphopeptide inhibitor | X-ray | 3.05 Å
4U4A | BRCT + singly phospho-Abraxas | X-ray | 3.51 Å
4Y18 | BRCT + double phospho-Abraxas | X-ray | 3.5 Å
4Y2G | BRCT + single phospho-Abraxas | X-ray | 2.5 Å
6G2I | ACC + BRCT filament | Cryo-EM | 5.9 Å
7JZV | BRCA1-UbcH5c/BARD1 + nucleosome | Cryo-EM | 3.9 Å
7LYB | Nucleosome + BRCA1-BARD1-UbcH5c | Cryo-EM | 3.28 Å
8GRQ | BRCA1/BARD1 + ubiquitinated nucleosome | Cryo-EM | 3.87 Å
8RS8 | BRCT + RIF1 phosphopeptide | X-ray | 1.31 Å
9QPX | BRCT + RNA Pol II CTD peptide | X-ray | 3.0 Å

Predicted Structures (AlphaFold)
AlphaFold ID | Global pLDDT | Sequence Length | Very High Confidence (%)
AF-P38398-F1 | 41.99 | 14,550 | 11%

Section 5: Cross-Species Orthologs

Organism | Gene ID | Symbol | Biotype
Mouse (Mus musculus) | ENSMUSG00000017146 | Brca1 | protein_coding
Rat (Rattus norvegicus) | ENSRNOG00000020701 | Brca1 | protein_coding
Worm (C. elegans) | WBGene00000264 | brc-1 | -
Worm (C. elegans) | WBGene00018327 | - | -
Zebrafish (Danio rerio) | No direct ortholog | - | -
Fruit fly (Drosophila melanogaster) | No direct ortholog | - | -
Yeast (S. cerevisiae) | No ortholog | - | -

Section 6: Clinical Variants & AI Predictions

ClinVar Summary
Total Variant Count: 15,445

Classification | Approximate Count
Pathogenic | ~2,500+
Likely Pathogenic | ~500+
Uncertain Significance (VUS) | ~8,000+
Likely Benign | ~1,500+
Benign | ~2,000+
Conflicting | ~1,000+

TOP 50 Pathogenic/Likely Pathogenic Variants (ClinVar) (43 listed)
Variant ID | HGVS | Type
1012155 | c.1728del (p.Glu577fs) | Deletion
1012156 | c.3982del (p.Ser1328fs) | Deletion
1049053 | c.5296del (p.Ile1766fs) | Deletion
1049103 | c.5141_5144del (p.Val1714fs) | Deletion
1049239 | c.1100del (p.Thr367fs) | Deletion
1049275 | c.5280del (p.Phe1761fs) | Deletion
1049277 | c.3914dup (p.Asp1305fs) | Duplication
1049514 | c.2827A>T (p.Lys943Ter) | Nonsense
1049719 | c.851del (p.Gln284fs) | Deletion
1049776 | c.1603G>T (p.Gly535Ter) | Nonsense
1049828 | c.1848_1861del (p.Thr617fs) | Deletion
1049895 | c.2365del (p.Ser789fs) | Deletion
1050145 | c.66_67delinsTT (p.Leu22_Glu23delinsPheTer) | Indel
1050371 | c.709G>T (p.Glu237Ter) | Nonsense
1050671 | c.3476del (p.Ile1159fs) | Deletion
1050776 | c.3225_3226del (p.Asn1075fs) | Deletion
1068565 | c.2479_2480del (p.Glu827fs) | Deletion
1069423 | c.1997_1998del (p.Leu666fs) | Deletion
1070111 | c.3975del (p.Arg1325fs) | Deletion
1070603 | c.2163del (p.Phe721fs) | Deletion
1070907 | c.5202dup (p.Glu1735Ter) | Duplication
1070948 | c.4799del (p.Ala1599_Leu1600insTer) | Deletion
1071122 | c.3257_3258insTTGC (p.Leu1086fs) | Insertion
1071268 | c.2145_2152del (p.Thr715_Ser716insTer) | Deletion
1071282 | c.3381T>A (p.Tyr1127Ter) | Nonsense
1071926 | c.4583_4589dup (p.Lys1530delinsAsnHisTer) | Duplication
1071944 | c.4463_4467dup (p.Glu1490fs) | Duplication
1072656 | c.2899dup (p.Thr967fs) | Duplication
1072718 | c.292_293insT (p.Gly98fs) | Insertion
1072912 | c.4862_4863insG (p.Asp1621fs) | Insertion
1073201 | c.133_134+1del | Splice
1073249 | c.4513del (p.Asp1505fs) | Deletion
1073509 | c.671-8_671-2del | Splice
1073669 | c.3679del (p.Gln1227fs) | Deletion
1073703 | c.1099del (p.Thr367fs) | Deletion
1074343 | c.1423_1508del (p.Ser475fs) | Deletion
1074407 | c.2635del (p.Glu879fs) | Deletion
1074646 | c.1709del (p.Pro570fs) | Deletion
1074684 | c.1009G>T (p.Glu337Ter) | Nonsense
1074746 | c.2735_2738del (p.Lys912fs) | Deletion
1074750 | c.1682del (p.Ser561fs) | Deletion
1074789 | c.4544_4545dup (p.Ser1516fs) | Duplication
1075200 | c.959_981del (p.Arg320fs) | Deletion
... and many more large deletions/duplications
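
The Type column above can largely be recovered from the HGVS string itself. The following is a rough sketch of such a heuristic; it is not a full HGVS parser and not the method the source database uses, and `classify_variant` is a hypothetical name. It treats any intronic offset (for example `134+1`) as a splice-region change and any substitution whose protein consequence ends in `Ter` as nonsense.

```python
import re


def classify_variant(hgvs: str) -> str:
    """Assign a coarse type label to a string such as 'c.2827A>T (p.Lys943Ter)'."""
    # Split the coding-DNA change from the optional protein consequence.
    cdna, _, protein = hgvs.partition(" ")
    # Digit, then + or -, then digit marks an intronic offset.
    if re.search(r"\d[+-]\d", cdna):
        return "Splice"
    # Order matters: 'delins' must be tested before 'ins' and 'del'.
    if "delins" in cdna:
        return "Indel"
    if "ins" in cdna:
        return "Insertion"
    if "dup" in cdna:
        return "Duplication"
    if "del" in cdna:
        return "Deletion"
    if re.search(r"[ACGT]>[ACGT]$", cdna):
        return "Nonsense" if protein.endswith("Ter)") else "Substitution"
    return "Other"


examples = [
    "c.1728del (p.Glu577fs)",
    "c.3914dup (p.Asp1305fs)",
    "c.2827A>T (p.Lys943Ter)",
    "c.66_67delinsTT (p.Leu22_Glu23delinsPheTer)",
    "c.3257_3258insTTGC (p.Leu1086fs)",
    "c.133_134+1del",
]
for hgvs in examples:
    print(f"{hgvs:50s} {classify_variant(hgvs)}")
```

For production use, a dedicated HGVS library is a safer choice than string matching, since the heuristic ignores inversions, repeats, and 5'/3' UTR notation.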

SpliceAI Predictions
Total Count: 3,719

TOP 50 Splice-Altering Variants (Δ Score ≥ 0.9) (26 listed)
Variant | Effect | Δ Score
17:43045800:TTG:T | acceptor_gain | 1.00
17:43045801:TGCTG:T | acceptor_loss | 1.00
17:43045798:AATTG:A | acceptor_gain | 0.99
17:43045799:ATTG:A | acceptor_gain | 0.99
17:43045801:TG:T | acceptor_gain | 0.99
17:43045803:C:CC | acceptor_gain | 0.99
17:43045810:C:CT | acceptor_gain | 0.98
17:43045811:A:T | acceptor_gain | 0.98
17:43045797:CAAT:C | acceptor_gain | 0.95
17:43047636:ACCTT:A | donor_loss | 0.95
17:43047637:CCTTA:C | donor_loss | 0.95
17:43047638:CTTAC:C | donor_loss | 0.95
17:43047639:TTA:T | donor_loss | 0.95
17:43047640:TACCA:T | donor_loss | 0.95
17:43047641:ACC:A | donor_loss | 0.95
17:43045799:ATTGC:A | acceptor_gain | 0.97
17:43045800:TTGC:T | acceptor_gain | 0.97
17:43045801:TGCT:T | acceptor_gain | 0.97
17:43045802:GCTG:G | acceptor_gain | 0.97
17:43045804:T:G | acceptor_gain | 0.97
17:43045803:CTGG:C | acceptor_gain | 0.96
17:43047635:CACCT:C | donor_loss | 0.92
17:43045800:T:C | acceptor_gain | 0.92
17:43045796:CCAAT:C | acceptor_gain | 0.91
17:43047641:A:AC | donor_gain | 0.90
17:43047642:C:CC | donor_gain | 0.90
... and many more
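
The variant keys in this table follow a `chromosome:position:ref:alt` layout. A small sketch of how such a key could be split and labelled by variant class is shown below; the `Variant` type and both helper names are illustrative only.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(key: str) -> Variant:
    """Split a 'chrom:pos:ref:alt' key into typed fields."""
    chrom, pos, ref, alt = key.split(":")
    return Variant(chrom, int(pos), ref, alt)


def variant_kind(v: Variant) -> str:
    """Label a variant by comparing reference and alternate allele lengths."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    if len(v.ref) > len(v.alt):
        return f"deletion ({len(v.ref) - len(v.alt)} bp)"
    if len(v.ref) < len(v.alt):
        return f"insertion ({len(v.alt) - len(v.ref)} bp)"
    return "MNV"  # equal-length, multi-base replacement


for key in ("17:43045800:TTG:T", "17:43045803:C:CC", "17:43045811:A:T"):
    v = parse_variant(key)
    print(key, "->", v.pos, variant_kind(v))
```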

AlphaMissense Predictions
Total Count: 12,463

TOP 50 Predicted Pathogenic Missense Variants (scores 0.832 to 0.998)
Variant | Protein Change | AM Score | Classification
17:43045713:A:C | Y1853D | 0.940 | likely_pathogenic
17:43045747:A:C | S1841R | 0.996 | likely_pathogenic
17:43045747:A:T | S1841R | 0.996 | likely_pathogenic
17:43045749:T:G | S1841R | 0.996 | likely_pathogenic
17:43047684:A:T | V1809D | 0.993 | likely_pathogenic
17:43045759:C:A | W1837C | 0.990 | likely_pathogenic
17:43045759:C:G | W1837C | 0.990 | likely_pathogenic
17:43045751:T:A | D1840V | 0.982 | likely_pathogenic
17:43045761:A:G | W1837R | 0.998 | likely_pathogenic
17:43045761:A:T | W1837R | 0.998 | likely_pathogenic
17:43045760:C:G | W1837S | 0.980 | likely_pathogenic
17:43045752:C:G | D1840H | 0.978 | likely_pathogenic
17:43045772:A:T | V1833E | 0.983 | likely_pathogenic
17:43045757:A:T | V1838E | 0.971 | likely_pathogenic
17:43045766:C:G | R1835P | 0.969 | likely_pathogenic
17:43045751:T:G | D1840A | 0.966 | likely_pathogenic
17:43045743:C:G | A1843P | 0.964 | likely_pathogenic
17:43045742:G:T | A1843E | 0.961 | likely_pathogenic
17:43045751:T:C | D1840G | 0.958 | likely_pathogenic
17:43045745:A:T | V1842E | 0.956 | likely_pathogenic
17:43047687:A:T | V1808E | 0.955 | likely_pathogenic
17:43045750:G:C | D1840E | 0.954 | likely_pathogenic
17:43045750:G:T | D1840E | 0.954 | likely_pathogenic
17:43045748:C:A | S1841I | 0.954 | likely_pathogenic
17:43045752:C:A | D1840Y | 0.948 | likely_pathogenic
17:43045754:A:G | L1839S | 0.941 | likely_pathogenic
17:43045737:A:C | Y1845D | 0.918 | likely_pathogenic
17:43045760:C:A | W1837L | 0.914 | likely_pathogenic
17:43047678:T:G | Q1811P | 0.914 | likely_pathogenic
17:43045763:T:A | E1836V | 0.910 | likely_pathogenic
17:43045761:A:C | W1837G | 0.909 | likely_pathogenic
17:43045748:C:T | S1841N | 0.907 | likely_pathogenic
17:43047677:C:A | Q1811H | 0.903 | likely_pathogenic
17:43047677:C:G | Q1811H | 0.903 | likely_pathogenic
17:43045739:A:G | L1844P | 0.896 | likely_pathogenic
17:43045764:C:T | E1836K | 0.894 | likely_pathogenic
17:43045752:C:T | D1840N | 0.881 | likely_pathogenic
17:43047670:C:G | A1814P | 0.870 | likely_pathogenic
17:43045713:A:G | Y1853H | 0.867 | likely_pathogenic
17:43045709:A:G | L1854P | 0.867 | likely_pathogenic
17:43047667:A:G | W1815R | 0.865 | likely_pathogenic
17:43047667:A:T | W1815R | 0.865 | likely_pathogenic
17:43045745:A:C | V1842G | 0.866 | likely_pathogenic
17:43045713:A:T | Y1853N | 0.858 | likely_pathogenic
17:43045754:A:C | L1839W | 0.850 | likely_pathogenic
17:43047681:A:T | V1810E | 0.848 | likely_pathogenic
17:43047678:T:C | Q1811R | 0.843 | likely_pathogenic
17:43045767:G:C | R1835G | 0.837 | likely_pathogenic
17:43047685:C:A | V1809F | 0.834 | likely_pathogenic
17:43045712:T:G | Y1853S | 0.832 | likely_pathogenic
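
Every row above carries the `likely_pathogenic` label even though some scores fall below 0.9. That is consistent with the three-way cutoffs commonly quoted for AlphaMissense: below 0.34 is likely benign, above 0.564 is likely pathogenic, and anything between is ambiguous. Those cutoffs are an assumption here, not something the source page states, and `am_class` is a hypothetical helper name.

```python
# Assumed AlphaMissense cutoffs; check them against the release you use.
LIKELY_BENIGN_MAX = 0.34
LIKELY_PATHOGENIC_MIN = 0.564


def am_class(score: float) -> str:
    """Map a pathogenicity score in [0, 1] to a three-way class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    return "ambiguous"


# Lowest and highest scores from the table, plus two illustrative values.
for score in (0.832, 0.998, 0.50, 0.10):
    print(score, am_class(score))
```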

Section 7: Biological Pathways & Gene Ontology

Reactome Pathways
Total Count: 27 pathways (7 disease pathways)

Pathway ID | Name | Disease?
R-HSA-5685942 | HDR through Homologous Recombination (HRR) | No
R-HSA-5685938 | HDR through Single Strand Annealing (SSA) | No
R-HSA-5693607 | Processing of DNA double-strand break ends | No
R-HSA-5693565 | Recruitment and ATM-mediated phosphorylation at DSBs | No
R-HSA-5693579 | Homologous DNA Pairing and Strand Exchange | No
R-HSA-5693571 | Nonhomologous End-Joining (NHEJ) | No
R-HSA-5693616 | Presynaptic phase of homologous DNA pairing | No
R-HSA-5693554 | Resolution of D-loop Structures through SDSA | No
R-HSA-5693568 | Resolution of D-loop through Holliday Junction | No
R-HSA-69473 | G2/M DNA damage checkpoint | No
R-HSA-912446 | Meiotic recombination | No
R-HSA-1221632 | Meiotic synapsis | No
R-HSA-3108214 | SUMOylation of DNA damage response proteins | No
R-HSA-5689901 | Metalloprotease DUBs | No
R-HSA-6796648 | TP53 Regulates Transcription of DNA Repair Genes | No
R-HSA-6804756 | Regulation of TP53 Activity through Phosphorylation | No
R-HSA-8951664 | Neddylation | No
R-HSA-8953750 | Transcriptional Regulation by E2F6 | No
R-HSA-9755511 | KEAP1-NFE2L2 pathway | No
R-HSA-9825895 | Regulation of MITF-M-dependent genes | No
R-HSA-9663199 | Defective DNA DSB response due to BRCA1 loss | Yes
R-HSA-9699150 | Defective DNA DSB response due to BARD1 loss | Yes
R-HSA-9701192 | Defective HRR due to BRCA1 loss of function | Yes
R-HSA-9704331 | Defective HRR due to PALB2 loss of BRCA1 binding | Yes
R-HSA-9704646 | Defective HRR due to PALB2 loss of BRCA2/RAD51 binding | Yes
R-HSA-9709570 | Impaired BRCA2 binding to RAD51 | Yes
R-HSA-9709603 | Impaired BRCA2 binding to PALB2 | Yes

Gene Ontology Annotations
Total Count: 77 GO terms

Biological Process (37 terms)
GO ID | Term
GO:0000724 | double-strand break repair via homologous recombination
GO:0006281 | DNA repair
GO:0006282 | regulation of DNA repair
GO:0006301 | DNA damage tolerance
GO:0006302 | double-strand break repair
GO:0006338 | chromatin remodeling
GO:0006357 | regulation of transcription by RNA polymerase II
GO:0006633 | fatty acid biosynthetic process
GO:0006974 | DNA damage response
GO:0007059 | chromosome segregation
GO:0007095 | mitotic G2 DNA damage checkpoint signaling
GO:0007098 | centrosome cycle
GO:0008630 | intrinsic apoptotic signaling pathway in response to DNA damage
GO:0010212 | response to ionizing radiation
GO:0010575 | positive regulation of VEGF production
GO:0010628 | positive regulation of gene expression
GO:0016567 | protein ubiquitination
GO:0030308 | negative regulation of cell growth
GO:0033147 | negative regulation of intracellular estrogen receptor signaling
GO:0035825 | homologous recombination
GO:0043009 | chordate embryonic development
GO:0044027 | negative regulation via chromosomal CpG island methylation
GO:0044818 | mitotic G2/M transition checkpoint
GO:0045717 | negative regulation of fatty acid biosynthetic process
GO:0045739 | positive regulation of DNA repair
GO:0045766 | positive regulation of angiogenesis
GO:0045786 | negative regulation of cell cycle
GO:0045892 | negative regulation of DNA-templated transcription
GO:0045893 | positive regulation of DNA-templated transcription
GO:0045944 | positive regulation of transcription by RNA polymerase II
GO:0046600 | negative regulation of centriole replication
GO:0051726 | regulation of cell cycle
GO:0051865 | protein autoubiquitination
GO:0060816 | random inactivation of X chromosome
GO:0071356 | cellular response to tumor necrosis factor
GO:0071479 | cellular response to ionizing radiation
GO:0071681 | cellular response to indole-3-methanol

Molecular Function (17 terms)
GO ID | Term
GO:0000976 | transcription cis-regulatory region binding
GO:0002039 | p53 binding
GO:0003677 | DNA binding
GO:0003684 | damaged DNA binding
GO:0003713 | transcription coactivator activity
GO:0003723 | RNA binding
GO:0004842 | ubiquitin-protein transferase activity
GO:0008270 | zinc ion binding
GO:0015631 | tubulin binding
GO:0019899 | enzyme binding
GO:0031625 | ubiquitin protein ligase binding
GO:0042802 | identical protein binding
GO:0061649 | ubiquitin-modified histone reader activity
GO:0070063 | RNA polymerase binding
GO:0140863 | histone H2AK127 ubiquitin ligase activity
GO:0140864 | histone H2AK129 ubiquitin ligase activity
GO:0085020 | protein K6-linked ubiquitination

Cellular Component (23 terms; 19 listed)
GO ID | Term
GO:0000151 | ubiquitin ligase complex
GO:0000152 | nuclear ubiquitin ligase complex
GO:0000800 | lateral element
GO:0000931 | gamma-tubulin ring complex
GO:0001673 | male germ cell nucleus
GO:0001741 | XY body
GO:0005634 | nucleus
GO:0005654 | nucleoplasm
GO:0005694 | chromosome
GO:0005737 | cytoplasm
GO:0005886 | plasma membrane
GO:0016604 | nuclear body
GO:0031436 | BRCA1-BARD1 complex
GO:0032991 | protein-containing complex
GO:0070531 | BRCA1-A complex
GO:0070532 | BRCA1-B complex
GO:0070533 | BRCA1-C complex
GO:1990391 | DNA repair complex
GO:1990904 | ribonucleoprotein complex

Section 8: Protein Interactions & Molecular Networks

STRING Interactions
Total Count: 6,120+ interactions

TOP 50 Highest-Confidence Interactors (Score ≥ 900)
Partner UniProt | Gene | Score
P38398 | BRCA1 (self) | 999
P51587 | BRCA2 | 999
Q06609 | RAD51 | 999
Q12888 | TP53BP1 | 999
Q13315 | ATM | 999
Q86YC2 | PALB2 | 999
P04637 | TP53 | 999
Q99728 | BARD1 | 993
Q99708 | RBBP8 | 998
Q9BXW9 | FANCD2 | 998
O96017 | CHEK2 | 997
Q6UWZ7 | ABRAXAS1 | 997
Q96RL1 | UIMC1/RAP80 | 997
Q9BX63 | BRIP1 | 997
Q14676 | MDC1 | 996
P01106 | MYC | 995
P03372 | ESR1 | 995
P46736 | BRCC3 (BRCC36) | 995
Q7Z569 | BRAP | 995
Q9NXR7 | BABAM2 | 995
Q9GZX5 | ZNF350 | 994
Q92560 | BAP1 | 987
P16104 | H2AFX | 983
O15360 | FANCA | 978
P52701 | MSH6 | 978
P49959 | MRE11 | 974
Q08211 | DHX9 | 967
O14757 | CHEK1 | 965
Q7Z333 | SETX | 962
P43351 | RAD52 | 961
P51532 | SMARCA4 | 957
O43502 | RAD51C | 954
Q9NVI1 | FANCI | 952
Q05048 | CSTF1 | 948
P43246 | MSH2 | 947
P42224 | STAT1 | 943
Q9NRR4 | DROSHA | 940
P09874 | PARP1 | 936
O76064 | RNF8 | 931
P40692 | MLH1 | 930
P06401 | PGR | 929
Q09472 | EP300 | 928
Q8IYW5 | RNF168 | 921
P04626 | ERBB2 | 919
Q13547 | HDAC1 | 918
O75330 | HMMR | 912
A0A087WY85 | RAD51B | 912
P30304 | CDC25A | 911
O43542 | XRCC3 | 909
O75771 | RAD51D | 904

IntAct Interactions
Total Count: 342+ curated interactions

Key High-Confidence Direct Interactions
Partner Gene | Interaction Type | Confidence
BRIP1 (BACH1) | direct interaction | 0.98
BARD1 | physical association | 0.96
RBBP8 (CtIP) | physical association | 0.93
PALB2 | physical association | 0.91
ABRAXAS1 | direct interaction | 0.86
ESR1 | direct interaction | 0.81
UIMC1 (RAP80) | direct interaction | 0.78
BRCA2 | association | 0.73
H2AFX | colocalization | 0.71
TP53BP1 | physical association | 0.70
RPA1 | physical association | 0.65
MDC1 | physical association | 0.63
BRCC3 | association | 0.61
BRAT1 | physical association | 0.60
SUMO1 | physical association | 0.60
PPP1CB | physical association | 0.58
TOP2A | physical association | 0.52
ACACA | physical association | 0.52

BioGRID Interactions
Total Count: 2,598+ interactions

Protein Similarity

ESM2 Structural/Embedding Similarity
Total Similar Proteins: 52

UniProt ID | Top Similarity | Avg Similarity
Q6J6I8 | 1.0000 | 0.9929
Q9GKK8 | 1.0000 | 0.9931
G7NY55 | 0.9999 | 0.9933
F6ULY3 | 0.9999 | 0.9932
G3S077 | 0.9998 | 0.9935
Q8N9V7 | 0.9998 | 0.9931
P97929 | 0.9995 | 0.9927
O35923 | 0.9995 | 0.9921
P51587 (BRCA2) | 0.9994 | 0.9914
Q00756 | 0.9993 | 0.9876
F7DF15 | 0.9993 | 0.9933
D3ZUC6 | 0.9992 | 0.9927
E5FYH1 | 0.9992 | 0.9927
Q95153 | 0.9990 | 0.9932
G7H7V7 | 0.9988 | 0.9932
E5FYH0 | 0.9985 | 0.9930
Q66JQ7 | 0.9985 | 0.9852
Q99MR9 | 0.9984 | 0.9877
Q6NZG4 | 0.9984 | 0.9876
Q8IXT1 | 0.9984 | 0.9869

DIAMOND Sequence Similarity
Total Homologous Proteins: 96

UniProt ID | Identity (%) | Bitscore
Q6J6I8 | 98.8 | 3520
Q9GKK8 | 98.8 | 3515
P38398 (self) | 98.5 | 3486
Q6J6J0 | 97.5 | 3448
Q6J6I9 | 93.6 | 3310
O54952 | 81.0 | 2758
P48754 | 81.0 | 2752
Q95153 | 74.4 | 2524
Q864U1 | 71.4 | 2368
Q14527 | 91.1 | 1806
Q95216 | 91.1 | 1804
Q6PCN7 | 83.6 | 1652
Q8HXH0 | 92.6 | 1357
Q496Y0 | 92.6 | 1356
D3YY23 | 87.8 | 1325
Q17RB8 | 87.8 | 1326
Q9D4H7 | 79.5 | 1149
P43254 | 76.1 | 1017
Q3MV19 | 87.7 | 1015
Q6DJN2 | 87.7 | 1016

Section 9: Transcription Factor Regulatory Data

BRCA1 as a Transcription Factor/Co-regulator
Total Regulatory Connections: 140

Downstream Targets (Genes Regulated BY BRCA1)
Count: 60+ targets

Target Gene | Regulation | Confidence
AREG | Repression | -
ASPM | Activation | -
ATM | Activation | -
BARD1 | Repression | High
BRCA1 | Repression (autoregulation) | High
BRCA2 | Unknown | -
CASR | Unknown | -
CCNB1 | Unknown | -
CCND1 | Unknown | -
CDH3 | Repression | -
CDKN1A (p21) | Activation | -
CDKN1B (p27) | Activation | -
CTSD | Repression | -
CXCL1 | Repression | -
CYP19A1 | Activation | High
CYP1A1 | Unknown | High
CYP1B1 | Unknown | High
DDB2 | Activation | -
DDIT3 | Activation | -
E2F6 | Repression | -
EGFR | Repression | -
EGR1 | Unknown | -
EP300 | Activation | -
ESR1 | Activation | -
FOS | Unknown | -
FOXC1 | Repression | -
FOXC2 | Repression | -
FST | Activation | -
GADD45A | Activation | -
HIF1A | Unknown | -
HMGA2 | Repression | -
HNRNPA2B1 | Repression | -
HSPA5 | Repression | -
IFNA1 | Activation | -
IGF1 | Repression | -
IRF7 | Activation | -
IRF9 | Activation | -
MAD2L1 | Activation | -
MDM2 | Activation | -
MYC | Unknown | -
NOS3 | Activation | -
NTHL1 | Activation | -
RNASEL | Activation | -
S100A2 | Activation | -
SIRT1 | Activation | -
STAT1 | Activation | -
STAT2 | Activation | -
TFF1 | Repression | -
VEGFA | Activation | -
XPC | Unknown | -

Upstream Regulators (TFs that Regulate BRCA1)
Count: 80+ regulators

Regulator | Regulation | Confidence
AHR | Unknown | High
AP1 | Activation | Low
APEX1 | - | High
ARNT | Unknown | -
BHLHE41 | Repression | Low
CHD8 | Unknown | -
CREB1 | Unknown | High
CREBBP | Unknown | -
CTBP1 | Repression | -
CTBP2 | Repression | High
CTCF | Unknown | High
DLX4 | Repression | High
DNMT1 | - | High
E2F1 | Repression | High
E2F2 | - | High
E2F3 | - | High
E2F4 | Activation | High
E2F5 | - | High
E2F6 | Unknown | High
EGR1 | Unknown | High
EP300 | Activation | -
ESR1 | Unknown | High
ETS2 | Repression | High
EZH2 | Unknown | Low
FOS | Activation | Low
FOSL2 | Activation | -
FOXA1 | Unknown | High
FOXP3 | Repression | High
GABPA | Activation | High
HIF1A | Repression | Low
HMGA1 | Repression | High

SIGNOR Signaling Interactions
Total Count: 83 interactions

Key upstream kinases/regulators:
- ATM: phosphorylates BRCA1 (multiple sites)
- ATR: phosphorylates BRCA1
- CHEK2: phosphorylates BRCA1 at S988
- CDK1: phosphorylates BRCA1
- CDK2: phosphorylates BRCA1
- AKT1: phosphorylates BRCA1
- AURKA: phosphorylates BRCA1
- PPP1CA: dephosphorylates BRCA1

Section 10: Drug & Pharmacology Data

ChEMBL Target
Property | Value
ChEMBL Target ID | CHEMBL5990
Target Type | SINGLE PROTEIN
Total Activities | 358
Total Molecules | 310

Molecules Targeting BRCA1
Total Count: 310 molecules

Notable Compounds
ChEMBL ID | Name | Type | Phase
CHEMBL140 | CURCUMIN | Small molecule | 3
CHEMBL102714 | SB-216763 | Small molecule | 0
CHEMBL129177 | HARMALOL | Small molecule | 0
+ 307 additional compounds

PharmGKB Gene Status
Property | Value
PharmGKB ID | PA25411
VIP Gene | Yes
Has Variant Annotation | Yes
Has CPIC Guideline | No

Drug-Gene Associations (PharmGKB)
Drug | PharmGKB ID | Clinical Annotations
Olaparib | PA164920420 | 0
Niraparib | PA166131610 | 0
Rucaparib | PA166163418 | 0
Talazoparib | PA166182740 | 0
Sacituzumab govitecan | PA166225061 | 1

Note: BRCA1/2 mutations are biomarkers for PARP inhibitor therapy (olaparib, niraparib, rucaparib, talazoparib) in breast and ovarian cancers.
Expression Profiles**
Bgee Expression Summary
PropertyValue
Expression BreadthUbiquitous
Total Present Calls208
Total Absent Calls58
Total Conditions266
Max Expression Score90.68
Average Expression Score68.16
Gold Quality Count225

TOP 30 Tissues by Expression Score

| Tissue/Anatomical Entity | Expression | Score | Quality |
|---|---|---|---|
| Ventricular zone | present | 90.68 | gold |
| Male germ line stem cell in testis | present | 89.54 | gold |
| Primordial germ cell in gonad | present | 86.44 | gold |
| Ganglionic eminence | present | 86.30 | gold |
| Embryo | present | 85.26 | gold |
| Secondary oocyte | present | 84.10 | gold |
| Bone marrow cell | present | 82.00 | gold |
| Oocyte | present | 81.87 | gold |
| Endometrium epithelium | present | 81.67 | gold |
| Sural nerve | present | 81.55 | gold |
| Trabecular bone tissue | present | 80.40 | gold |
| Stromal cell of endometrium | present | 79.88 | gold |
| Colonic epithelium | present | 79.56 | gold |
| Bone marrow | present | 79.55 | gold |
| Testis | present | 78.58 | gold |
| Right testis | present | 78.11 | gold |
| Rectum | present | 78.05 | gold |
| Left testis | present | 77.69 | gold |
| Right lobe of thyroid gland | present | 77.35 | gold |
| Vermiform appendix | present | 77.10 | gold |
| Tonsil | present | 76.23 | gold |
| Left lobe of thyroid gland | present | 75.58 | gold |
| Hindlimb stylopod muscle | present | 74.97 | gold |
| Thyroid gland | present | 74.62 | gold |
| Mucosa of transverse colon | present | 74.61 | gold |
| Adrenal tissue | present | 74.41 | gold |
| Choroid plexus epithelium | present | 74.34 | silver |
| Calcaneal tendon | present | 73.96 | gold |
| Lymph node | present | 73.91 | gold |

Single-Cell Expression Data

| Dataset ID | Description | Species | Cells |
|---|---|---|---|
| E-GEOD-99795 | LNCaP prostate carcinoma cells ± androgen | Homo sapiens | 144 |

Section 12: Disease Associations

OMIM Disease Associations

| OMIM ID | Disease |
|---|---|
| 113705 | BRCA1 gene locus |
| 604370 | Breast-ovarian cancer, familial, susceptibility to, 1 |
| 617883 | Fanconi anemia, complementation group S |
| 614320 | Pancreatic cancer, susceptibility to, 4 |
| 167000 | Ovarian cancer |
| 114480 | Breast cancer |

GenCC Gene-Disease Classifications

| Disease | Classification | Inheritance | Submitter |
|---|---|---|---|
| Breast-ovarian cancer, familial, susceptibility to, 1 (OMIM:604370) | Definitive | Autosomal dominant | Ambry Genetics |
| Breast-ovarian cancer, familial, susceptibility to, 1 (OMIM:604370) | Strong | Autosomal dominant | Genomics England PanelApp |
| Breast-ovarian cancer, familial, susceptibility to, 1 (OMIM:604370) | Strong | Autosomal dominant | Labcorp Genetics |
| Fanconi anemia, complementation group S (OMIM:617883) | Strong | Autosomal recessive | G2P |
| Fanconi anemia, complementation group S (OMIM:617883) | Moderate | Autosomal recessive | Ambry Genetics |
| Fanconi anemia, complementation group S (OMIM:617883) | Limited | Autosomal recessive | Labcorp Genetics |
| Pancreatic cancer, susceptibility to, 4 (OMIM:614320) | Moderate | Autosomal dominant | Genomics England PanelApp |
| Hereditary breast ovarian cancer syndrome (ORPHA:145) | Supportive | Autosomal dominant | Orphanet |
| Fanconi anemia (ORPHA:84) | Supportive | Autosomal recessive | Orphanet |
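
Because several submitters classify the same gene-disease pair at different strengths, it can help to reduce the list to the strongest classification per disease. The following is a minimal sketch, not a GenCC tool: the rank ordering in `CLASSIFICATION_RANK` is an assumption made for illustration, and the function and variable names are hypothetical.

```python
# Assumed ordering from strongest to weakest; this ranking is an
# illustrative assumption and should be checked against GenCC's own terms.
CLASSIFICATION_RANK = {
    "Definitive": 0,
    "Strong": 1,
    "Moderate": 2,
    "Supportive": 3,
    "Limited": 4,
}

# (disease ID, classification, submitter) copied from the table above.
gencc_records = [
    ("OMIM:604370", "Definitive", "Ambry Genetics"),
    ("OMIM:604370", "Strong", "Genomics England PanelApp"),
    ("OMIM:604370", "Strong", "Labcorp Genetics"),
    ("OMIM:617883", "Strong", "G2P"),
    ("OMIM:617883", "Moderate", "Ambry Genetics"),
    ("OMIM:617883", "Limited", "Labcorp Genetics"),
    ("OMIM:614320", "Moderate", "Genomics England PanelApp"),
    ("ORPHA:145", "Supportive", "Orphanet"),
    ("ORPHA:84", "Supportive", "Orphanet"),
]


def strongest_per_disease(records):
    """Return {disease_id: (classification, submitter)} keeping the best rank."""
    best = {}
    for disease_id, classification, submitter in records:
        current = best.get(disease_id)
        if current is None or (
            CLASSIFICATION_RANK[classification] < CLASSIFICATION_RANK[current[0]]
        ):
            best[disease_id] = (classification, submitter)
    return best


strongest = strongest_per_disease(gencc_records)
for disease_id, (classification, submitter) in strongest.items():
    print(f"{disease_id}: {classification} ({submitter})")
```

With the rows above, this reports Definitive for OMIM:604370, Strong for OMIM:617883, Moderate for OMIM:614320, and Supportive for both Orphanet entries.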

Orphanet Disease Associations
Total Count: 9 diseases

| Orphanet ID | Disease | Type | Gene Count |
|---|---|---|---|
| ORPHA:145 | Hereditary breast and/or ovarian cancer syndrome | Disease | 15 |
| ORPHA:84 | Fanconi anemia | Malformation syndrome | 23 |
| ORPHA:227535 | Hereditary breast cancer | Disease | 10 |
| ORPHA:1333 | Familial pancreatic carcinoma | Disease | 9 |
| ORPHA:1331 | Familial prostate cancer | Disease | 15 |
| ORPHA:168829 | Primary peritoneal carcinoma | Disease | 1 |
| ORPHA:667662 | Breast implant-associated anaplastic large cell lymphoma | Disease | 7 |
| ORPHA:694963 | Inflammatory breast cancer | Disease | 2 |
| ORPHA:70567 | Cholangiocarcinoma | Disease | 4 |

HPO Phenotype Associations
Total Count: 166 phenotypes

TOP 50 Clinical Phenotypes

| HPO ID | Phenotype |
|---|---|
| HP:0000006 | Autosomal dominant inheritance |
| HP:0000007 | Autosomal recessive inheritance |
| HP:0001249 | Intellectual disability |
| HP:0001263 | Global developmental delay |
| HP:0001510 | Growth delay |
| HP:0001511 | Intrauterine growth retardation |
| HP:0001508 | Failure to thrive |
| HP:0000252 | Microcephaly |
| HP:0000365 | Hearing impairment |
| HP:0000518 | Cataract |
| HP:0001873 | Thrombocytopenia |
| HP:0001903 | Anemia |
| HP:0001882 | Decreased total leukocyte count |
| HP:0000027 | Azoospermia |
| HP:0000028 | Cryptorchidism |
| HP:0000135 | Hypogonadism |
| HP:0000130 | Abnormality of the uterus |
| HP:0000813 | Bicornuate uterus |
| HP:0000175 | Cleft palate |
| HP:0000218 | High palate |
| HP:0000347 | Micrognathia |
| HP:0000316 | Hypertelorism |
| HP:0000486 | Strabismus |
| HP:0000568 | Microphthalmia |
| HP:0001631 | Atrial septal defect |
| HP:0001643 | Patent ductus arteriosus |
| HP:0001639 | Hypertrophic cardiomyopathy |
| HP:0002023 | Anal atresia |
| HP:0000453 | Choanal atresia |
| HP:0001172 | Abnormal thumb morphology |
| HP:0001199 | Triphalangeal thumb |
| HP:0001392 | Abnormality of the liver |
| HP:0001738 | Exocrine pancreatic insufficiency |
| HP:0000079 | Abnormality of the urinary system |
| HP:0000083 | Renal insufficiency |
| HP:0001537 | Umbilical hernia |
| HP:0002007 | Frontal bossing |
| HP:0000340 | Sloping forehead |
| HP:0000280 | Coarse facial features |
| HP:0000989 | Pruritus |
| HP:0001000 | Abnormality of skin pigmentation |
| HP:0001053 | Hypopigmented skin patches |
| HP:0001945 | Fever |
| HP:0001824 | Weight loss |
| HP:0002039 | Anorexia |
| HP:0002017 | Nausea and vomiting |
| HP:0002027 | Abdominal pain |
| HP:0002019 | Constipation |
| HP:0002254 | Intermittent diarrhea |
| HP:0001251 | Ataxia |

GWAS Associations
Total Count: 7

| Study ID | Trait | P-value |
|---|---|---|
| GCST90011899_184 | Aspartate aminotransferase levels | 1.0e-12 |
| GCST005312_44 | Menopause (age at onset) | 8.0e-11 |
| GCST90002394_251 | Monocyte percentage of white cells | 1.0e-10 |
| GCST005863_3 | Menopause (age at onset) | 1.0e-08 |
| GCST009823_7 | Gynecologic disease (multivariate) | 1.0e-08 |
| GCST009829_3 | Ovarian cancer (MTAG) | 5.0e-08 |
| GCST009830_31 | Ovarian cancer | 3.0e-08 |
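
When reading the GWAS table, it is common to compare each P-value against the conventional genome-wide significance threshold of 5e-8. The sketch below applies a strict `p < 5e-8` cutoff to the seven rows above. The threshold and the strict inequality are assumptions for illustration (the source does not state a cutoff), and the function name is hypothetical.

```python
# Conventional genome-wide significance threshold; an assumption here,
# since the source table does not state which cutoff was applied.
GENOME_WIDE_THRESHOLD = 5e-8

# (study ID, trait, p-value) copied from the GWAS table above.
gwas_hits = [
    ("GCST90011899_184", "Aspartate aminotransferase levels", 1.0e-12),
    ("GCST005312_44", "Menopause (age at onset)", 8.0e-11),
    ("GCST90002394_251", "Monocyte percentage of white cells", 1.0e-10),
    ("GCST005863_3", "Menopause (age at onset)", 1.0e-08),
    ("GCST009823_7", "Gynecologic disease (multivariate)", 1.0e-08),
    ("GCST009829_3", "Ovarian cancer (MTAG)", 5.0e-08),
    ("GCST009830_31", "Ovarian cancer", 3.0e-08),
]


def split_by_threshold(hits, threshold=GENOME_WIDE_THRESHOLD):
    """Split hits into (strictly below threshold, at or above threshold)."""
    passing = sorted((h for h in hits if h[2] < threshold), key=lambda h: h[2])
    borderline = [h for h in hits if h[2] >= threshold]
    return passing, borderline


passing, borderline = split_by_threshold(gwas_hits)
print(f"{len(passing)} associations below {GENOME_WIDE_THRESHOLD:g}")
for study_id, trait, p_value in borderline:
    print(f"At threshold: {study_id} ({trait}), p = {p_value:g}")
```

Under a strict cutoff, six associations pass and the MTAG ovarian cancer entry (GCST009829_3) sits exactly at the threshold; with `<=` instead, all seven would pass.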

SUMMARY STATISTICS

| Category | Count |
|---|---|
| Total Ensembl Transcripts | 47 |
| Total RefSeq Transcripts (mRNA) | 100+ |
| Total CCDS IDs | 5 |
| Total Exons (canonical) | 23 |
| Total UniProt Entries | 22 |
| Total InterPro Domains | 9 |
| Total PDB Structures | 33 |
| Total ClinVar Variants | 15,445 |
| Total AlphaMissense Predictions | 12,463 |
| Total SpliceAI Predictions | 3,719 |
| Total STRING Interactions | 6,120+ |
| Total IntAct Interactions | 342+ |
| Total BioGRID Interactions | 2,598+ |
| Total Reactome Pathways | 27 |
| Total GO Terms | 77 |
| Total GWAS Associations | 7 |
| Total HPO Phenotypes | 166 |
| Total Orphanet Diseases | 9 |
| Total SIGNOR Interactions | 83 |
| Total CollecTRI Regulatory Connections | 140 |
| Total ChEMBL Molecules | 310 |

Reference compiled from: HGNC, Ensembl, NCBI, UniProt, PDB, AlphaFold, ClinVar, AlphaMissense, SpliceAI, STRING, IntAct, BioGRID, Reactome, Gene Ontology, GWAS Catalog, HPO, OMIM, Orphanet, GenCC, PharmGKB, ChEMBL, Bgee, SIGNOR, CollecTRI, and Single Cell Expression Atlas.