PubChem BioAssay: Link chemical research to GenBank and beyond

Yanli Wang
251st American Chemical Society National Meeting
San Diego, California
March 13th-17th, 2016
PubChem BioAssay …
• Public Data Repository at NCBI
• Open Research Database
• Small Molecule Bioactivity Data
• RNAi Screening
• Connect to PubChem Substance
• Integrated with Other Biomedical Resources
PubChem BioAssay
… support multiple research areas
• Chemical Biology
• Chemogenomics
• Medicinal Chemistry
• Drug Discovery
• Functional Genomics
PubChem BioAssay
… data standard
Meta data
• Protocol
• Target
• Cell line
• Comment / Categorized comment
• Grant number
• Embargo date
• Cross reference: publication, taxonomy, related assays, gene, nucleotide etc.
Test results
• Sample ID / SID
• Bioactivity outcome / score / potency / dose response
• Phenotype annotation
• Bioactivity readout
• Cross reference
• Target
• Replicate
• Attributes
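The two halves of the data standard (meta data and test results) can be pictured as a pair of record types. The following is only a minimal sketch: the class names, field names, and the "active" outcome label are hypothetical choices made for illustration and are not PubChem's actual deposition schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResultRow:
    """One tested sample (hypothetical layout)."""
    sid: int                              # Sample ID / SID
    outcome: str                          # bioactivity outcome, e.g. "active"
    score: Optional[int] = None           # bioactivity score
    potency_um: Optional[float] = None    # potency in micromolar
    readouts: dict = field(default_factory=dict)  # named bioactivity readouts


@dataclass
class AssayRecord:
    """Assay-level meta data plus its test results (hypothetical layout)."""
    aid: int                              # assay identifier
    protocol: str
    target: Optional[str] = None
    cell_line: Optional[str] = None
    comments: list = field(default_factory=list)
    grant_number: Optional[str] = None
    embargo_date: Optional[str] = None
    cross_references: dict = field(default_factory=dict)  # publication, taxonomy, ...
    results: list = field(default_factory=list)           # list of ResultRow

    def active_sids(self):
        """Return the SIDs whose outcome is labeled "active"."""
        return [row.sid for row in self.results if row.outcome == "active"]


# Made-up values, for illustration only.
record = AssayRecord(aid=1, protocol="demo protocol")
record.results.append(ResultRow(sid=10, outcome="active", potency_um=0.5))
record.results.append(ResultRow(sid=11, outcome="inactive"))
print(record.active_sids())
```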
PubChem BioAssay
… data content
Statistics
• 1,000,000 records
• 3,000,000 tested substances
• 220,000,000 bioactivity outcomes
• 1,000,000,000 data points
• 200 chemical probes
Data type
• HTS experiment
• Literature curation
• Bioactivity
• Toxicity
• Selectivity Profiling
Links to many other databases
PubMed
Protein
50,000
OMIM
BioSystems
10,000
a pathway db
drug annotation
Gene
MeSH
Literature classification
50,000
Nucleotide
Depositor website
GEO
Taxonomy
3000
CDD (conserved protein family domains)
Structure (a mirror of the Protein Data Bank, PDB)
Link Research Data to Molecular Target …
BioAssay targets
→ all test results
→ specific to test reagent
→ specific readout
Chemical Probe
• F2RL3 antagonists, IC50: 0.139 uM (CID: 2333)
• CHRM5 antagonists, IC50: 0.44 uM (CID: 42519285)
• mGlu5 positive allosteric potentiator, EC50: 2.411 uM (CID: 1318633)
• STAR inhibitor, IC50: 2.12 uM (CID: 45100448)
• EGFR inhibitor, IC50: 0.7079 uM (CID: 2303746)
• mGluR3 modulator, IC50: 2.611 uM (CID: 60210836)
• Thyroid Hormone Receptor / Steroid Receptor Coregulator 2 interaction inhibitor, Potency: 1.4 uM (CID: 5184800)
• MRGPRX1 allosteric activator, EC50: 0.19 uM (CID: 71598556)
… 200 more
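To compare the potencies listed for these probes on one scale, a micromolar value can be converted to its negative base-10 logarithm in molar units (often written pIC50 or pEC50). The sketch below does only that arithmetic on the values shown on the slide. The helper name `p_value` is hypothetical, and IC50, EC50, and "Potency" values come from different assay types, so the ranking is illustrative rather than a rigorous comparison.

```python
import math

# (label, potency in micromolar, CID), copied from the slide
PROBES = [
    ("F2RL3 antagonist", 0.139, 2333),
    ("CHRM5 antagonist", 0.44, 42519285),
    ("mGlu5 positive allosteric potentiator", 2.411, 1318633),
    ("STAR inhibitor", 2.12, 45100448),
    ("EGFR inhibitor", 0.7079, 2303746),
    ("mGluR3 modulator", 2.611, 60210836),
    ("Thyroid Hormone Receptor / Steroid Receptor Coregulator 2 "
     "interaction inhibitor", 1.4, 5184800),
    ("MRGPRX1 allosteric activator", 0.19, 71598556),
]


def p_value(potency_um: float) -> float:
    """Return -log10(potency in molar); 1 uM maps to 6.0."""
    if potency_um <= 0:
        raise ValueError("potency must be positive")
    return 6.0 - math.log10(potency_um)


# Most potent first (smallest micromolar value).
ranked = sorted(PROBES, key=lambda probe: probe[1])
for label, potency_um, cid in ranked:
    print(f"CID {cid}: {potency_um} uM -> p = {p_value(potency_um):.2f} ({label})")
```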
Protein BioAssay Target …
Protein class for BioAssay targets
Enzyme (4399)
Membrane receptor (705)
Ion channel (409)
Transcription factor (224)
Transporter (180)
Epigenetic regulator (166)
Secreted protein (63)
Structural protein (53)
Auxiliary transport protein (32)
Surface antigen (20)
Adhesion (15)
Other cytosolic protein (256)
Other membrane protein (18)
Other nuclear protein (18)
Unclassified protein (1170)
Protein class for MLP probe targets
Enzyme (27)
Membrane receptor (25)
Transporter (6)
Transcription factor (5)
Ion channel (3)
Epigenetic regulator (1)
Other cytosolic protein (1)
Unclassified protein (19)
Biological Pathways for Protein Target …
BioSystems name | KEGG id (conserved pathway) | Count of genes
Neuroactive ligand-receptor interaction | ko04080 | 623
Calcium signaling pathway | ko04020 | 329
cAMP signaling pathway | ko04024 | 299
PI3K-Akt signaling pathway | ko04151 | 279
MAPK signaling pathway | ko04010 | 252
Ribosome | ko03010 | 220
Proteoglycans in cancer | ko05205 | 194
cGMP-PKG signaling pathway | ko04022 | 190
Focal adhesion | ko04510 | 189
Rap1 signaling pathway | ko04015 | 186
Oxytocin signaling pathway | ko04921 | 181
Retrograde endocannabinoid signaling | ko04723 | 168
Inflammatory mediator regulation of TRP channels | ko04750 | 168
HTLV-I infection | ko05166 | 165
Vascular smooth muscle contraction | ko04270 | 165
Chemokine signaling pathway | ko04062 | 163
Alzheimer's disease | ko05010 | 161
Epstein-Barr virus infection | ko05169 | 155
Adrenergic signaling in cardiomyocytes | ko04261 | 155
Dopaminergic synapse | ko04728 | 153
Organisms …
Organism | Assay Count
Rattus norvegicus | 391714
Homo sapiens | 260507
Mus musculus | 118398
Staphylococcus aureus | 17767
Canis lupus familiaris | 16884
Escherichia coli | 13513
Cavia porcellus | 12341
Human immunodeficiency virus 1 | 9075
Pseudomonas aeruginosa | 6654
Oryctolagus cuniculus | 6574
Candida albicans | 6206
Bos taurus | 5277
Plasmodium falciparum | 5037
Macaca mulatta | 4412
Streptococcus pneumoniae | 3604
Mycobacterium tuberculosis | 3562
Macaca fascicularis | 3244
Klebsiella pneumoniae | 3031
Saccharomyces cerevisiae | 2900
Cricetulus griseus | 2889
Gene Target and its relevance to disease …
Linking gene targets to MedGen
[Bar chart; vertical axis from 0 to 3000]
BioAssay Descriptions & Data …
https://pubchem.ncbi.nlm.nih.gov/bioassay/1202
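The same record can also be requested programmatically. The sketch below assumes PubChem's PUG REST service and its `assay/aid/<AID>/description/JSON` URL pattern. The function names are hypothetical, and the layout of the returned JSON is deliberately not assumed here.

```python
import json
import urllib.request

# Base URL of PubChem's PUG REST service (assumed to be current).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_description_url(aid: int) -> str:
    """Build the PUG REST URL for the description of one assay (AID)."""
    return f"{PUG_REST}/assay/aid/{int(aid)}/description/JSON"


def fetch_json(url: str, timeout: float = 30.0):
    """Download a URL and decode the body as JSON (performs a network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Only the URL is built here; nothing is downloaded until fetch_json is called.
print(assay_description_url(1202))
```

Calling `fetch_json(assay_description_url(1202))` would request the description of the assay shown at the address above.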
An RNAi BioAssay Record…
http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=720703
Gene target
Kinase selectivity profiling assay…
BioAssay Search …
- classification tool for research data
https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi
BioAssay Target Search …
Link BioAssay data to Entrez Gene …
• Identify drugs & chemical modulators
• Verify gene functions with RNAi data
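Going from a gene to the assays that target it can be sketched as a URL lookup. This example assumes the PUG REST pattern `assay/target/geneid/<GeneID>/aids/TXT`, which returns one AID per line. The function names are hypothetical, and GeneID 1956 (human EGFR) is used only as an example input.

```python
# Base URL of PubChem's PUG REST service (assumed to be current).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def aids_for_gene_url(gene_id: int) -> str:
    """Build the PUG REST URL listing assays (AIDs) that target a GeneID."""
    return f"{PUG_REST}/assay/target/geneid/{int(gene_id)}/aids/TXT"


def parse_aid_list(text: str) -> list:
    """Parse a plain-text response with one AID per line into integers."""
    return [int(token) for token in text.split()]


# Example: the Entrez Gene ID for human EGFR.
print(aids_for_gene_url(1956))

# Parsing a made-up response body (not real PubChem output).
print(parse_aid_list("101\n202\n303\n"))
```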
Summary
• Repository of chemistry & functional genomics research data
• Cross-link chemical biology data to genomic resources, providing access to chemical tools
• Identify gene functions
• Predict target and off-targets
• Evaluate selectivity, promiscuity, toxicity
• Construct drug target network
• Drug repositioning
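One simple way to look at promiscuity from bioactivity outcomes is a per-compound hit rate: the fraction of assays a compound was tested in where it was reported active. This definition, the function name, and the tuple layout below are assumptions made for illustration, and the data are made up.

```python
from collections import defaultdict


def hit_rates(outcomes):
    """Map each CID to (assays where active) / (assays where tested).

    `outcomes` is an iterable of (cid, aid, outcome) tuples, where outcome
    is a string such as "active" or "inactive" (hypothetical layout).
    """
    tested = defaultdict(set)
    active = defaultdict(set)
    for cid, aid, outcome in outcomes:
        tested[cid].add(aid)
        if outcome == "active":
            active[cid].add(aid)
    return {cid: len(active[cid]) / len(tested[cid]) for cid in tested}


# Made-up outcomes: compound 1 is active in one of two assays, compound 2 in none.
example = [
    (1, 100, "active"),
    (1, 200, "inactive"),
    (2, 100, "inactive"),
]
print(hit_rates(example))
```

A compound with a high hit rate across unrelated targets would be a candidate for closer scrutiny before being treated as a selective tool.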
PubChem … Open & Public Resource
http://pubchem.ncbi.nlm.nih.gov
Send questions to:
[email protected]
[email protected]
[email protected]
Acknowledgement
Steve Bryant
Evan Bolton
Ben Shoemaker
Jie Chen
Paul Thiessen
Tiejun Cheng
Jiyao Wang
Gang Fu
Bo Yu
Volker Haehnke
Jian Zhang
Lewis Geer
Renata Geer
Asta Gindulyte
Lianyi Han
Jane He
Siqian He
Sunghwan Kim