Database - BD2KCCC
Hypothesis Fusion to Improve the Odds of Successful Drug Repurposing
Alexander Tropsha, Charles Schmitt, Eugene Muratov, UNC-Chapel Hill
Weifan Zheng, NCCU
Nabarun Dasgupta, Epidemico
Information resources for bioactive chemicals are abundant and growing
Integrated Chemical Bioactivity Data, drawn from:
• The FDA Safety Information and Adverse Event Reporting Program
• FDA Orange Book of Approved Drug Products
• Over 18 million citations from MEDLINE and other life science journals for biomedical articles back to the 1950s
• FDA approved labels for marketed drugs
• eMC provides electronic Summaries of Product Characteristics
• Compound assay data for proteins and cytotoxicity
• FDA data for five liver enzyme endpoints for ...
• The DrugBank database combines detailed drug data (8200+ drug entries) with comprehensive drug target information
• European Public Assessment Reports (EPAR): reflects the scientific conclusion reached by the Committee for Medicinal Products for Human Use (CHMP) at the end of the centralised evaluation process
• Cytochrome P450 Drug Interaction Table: drug interactions with cytochrome P450 isoforms
• “Potential Safety Issue” data
• “Drug Interactions” table
• FDA New Drug Application documents
Modified from a slide provided by Julie Barnes, Biowisdom
Data Science and Data Cycle
Diagram labels:
• Unstructured text: Facebook, Twitter, other social media
• Electronic databases: lab collections, literature data
• Text mining
• Disease
• Effect
• Data collection, curation, integration, and structuring (ontology)
• Structured data repository
• Data analysis and modeling
• Predictive data models & tools
• Decision support
• Experimental design
• Experimental validation
Data reproducibility and data curation are critical, otherwise: BD2K = Bogus Data to Knonsense
Data set curation workflows: Trust but Verify!
Fourches D. et al. J. Chem. Inf. Model., 2010, 50, 1189
Fourches D. et al. Nat. Chem. Biol. 2015, 11, 535
Disease-Target Association
Diagram labels:
• Text/database mining (PubMed/ChemoText, CTD, HMDB): disease-related genes or proteins
• Disease-related proteins: target-related ligands (functional data, binding data), QSAR, predictive models, database mining, structural hypothesis (“putative drug candidates”)
• Disease gene signatures: cmap, network mining, new hypothesis about connectivity between chemicals and diseases
• Hypotheses fusion: new testable hypotheses with higher confidence
Hajjo et al, Chemocentric Informatics Approach to Drug Discovery. J Med Chem. 2012, 55(12):5704-19
QSAR modeling and Virtual Screening: Hit identification in external libraries
• CHEMICAL STRUCTURES → CHEMICAL DESCRIPTORS
• CHEMICAL DESCRIPTORS + PROPERTY/ACTIVITY → PREDICTIVE QSAR MODELS (“QSAR MAGIC”)
• CHEMICAL DATABASE (~10^6 – 10^9 molecules) → VIRTUAL SCREENING
• VIRTUAL SCREENING → HITS (confirmed actives) and INACTIVES (confirmed inactives)
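The flow on this slide (descriptors plus activity labels train a model, and the model then screens a chemical database) can be sketched in a few lines. This is a minimal illustration only: the two-dimensional descriptors, compound names, and the plain Euclidean k-nearest-neighbour vote are all assumptions made for the example, not the descriptors or the kNN implementation used in the presented work.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training compounds nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def virtual_screen(train, library, k=3):
    """Return the names of library compounds predicted active, sorted."""
    return sorted(
        name for name, descriptors in library.items()
        if knn_predict(train, descriptors, k) == "active"
    )


# Hypothetical training set: (descriptor vector, activity label)
training_set = [
    ((0.9, 0.8), "active"),
    ((1.0, 0.7), "active"),
    ((0.8, 0.9), "active"),
    ((0.1, 0.2), "inactive"),
    ((0.2, 0.1), "inactive"),
    ((0.0, 0.3), "inactive"),
]

# Hypothetical external library to screen
library = {
    "cpd-001": (0.85, 0.75),
    "cpd-002": (0.15, 0.25),
    "cpd-003": (0.95, 0.90),
}

hits = virtual_screen(training_set, library)
print(hits)
```

In practice the predicted hits are only candidates; as the slide indicates, they become "confirmed actives" or "confirmed inactives" after experimental testing.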
5-HT6 receptor QSAR models & QSAR-based VS
Dataset: 196 compounds (102 actives, Ki < 10 µM; 94 inactives, Ki ≥ 10 µM). Source: PDSP Ki-DB
Virtual screening: 59 K compounds → 5-HT6 predictor → 300 VS hits (“Actives”)
Model statistics (bar chart): CCRevs on a 0.0 to 1.0 axis for kNN-Dragon Model, kNN-Dragon Random, CBA-SG Model, and CBA-SG Random
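The 10 µM Ki cutoff and the CCR statistic on this slide can be made concrete with a short sketch. It assumes that CCR is the correct classification rate computed as the mean of sensitivity and specificity; the Ki values and predictions below are invented for illustration and are not taken from PDSP Ki-DB.

```python
KI_CUTOFF_UM = 10.0  # cutoff from the slide: Ki < 10 µM is active


def label_by_ki(ki_um):
    """Binarize a Ki value (in µM) using the slide's cutoff."""
    return "active" if ki_um < KI_CUTOFF_UM else "inactive"


def ccr(true_labels, predicted_labels):
    """Correct classification rate: mean of sensitivity and specificity."""
    pairs = list(zip(true_labels, predicted_labels))
    n_active = sum(1 for t, _ in pairs if t == "active")
    n_inactive = sum(1 for t, _ in pairs if t == "inactive")
    true_pos = sum(1 for t, p in pairs if t == "active" and p == "active")
    true_neg = sum(1 for t, p in pairs if t == "inactive" and p == "inactive")
    return 0.5 * (true_pos / n_active + true_neg / n_inactive)


# Hypothetical Ki values (µM) and hypothetical model predictions
ki_values_um = [0.05, 0.75, 9.9, 10.0, 25.0, 120.0]
true_labels = [label_by_ki(v) for v in ki_values_um]
predicted = ["active", "active", "inactive", "inactive", "inactive", "active"]

score = ccr(true_labels, predicted)
print(round(score, 3))
```

Because it averages the per-class rates, CCR is not inflated when one class dominates, which is why a model is usefully compared against a random-label baseline on the same scale.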
The connectivity map
Input: signature (biological state 1 vs. control)
Database: cmap
Output: compounds ranked from high correlation, through null, to low correlation
Step 1: upload signature
Step 2: query the cmap
Step 3: list of correlated compounds
Lamb, J. et al. Science, 313, 1929-1935 (2006)
Lamb, J. Nat. Rev. Cancer 7, 54-60 (2007)
Querying the cmap
Alzheimer’s disease gene signatures (S1, S2) → upload signature → query the cmap → list of compounds, each with a SCORE between 1.00 and -1.00 (0.00 in the middle)
S1: Hata, R. et al., Biochem. Biophys. Res. Commun 284, 310 (2001).
S2: Ricciarelli, R. et al., IUBMB Life 56, 349 (2004).
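The three query steps (upload a signature, query the database, get back a scored list) can be illustrated with a deliberately simplified score. This sketch is not the rank-based statistic of Lamb et al.; it assumes each compound profile is a mapping from gene to expression change, scores a profile as the mean change of the signature's up genes minus the mean change of its down genes, and rescales by the largest absolute value so results fall between -1 and 1. All gene and compound names are hypothetical.

```python
def raw_score(profile, up_genes, down_genes):
    """Mean change of up-genes minus mean change of down-genes."""
    up = [profile[g] for g in up_genes if g in profile]
    down = [profile[g] for g in down_genes if g in profile]
    if not up or not down:
        return 0.0
    return sum(up) / len(up) - sum(down) / len(down)


def query_profiles(profiles, up_genes, down_genes):
    """Score every profile, rescale to [-1, 1], and rank high to low."""
    raw = {name: raw_score(p, up_genes, down_genes) for name, p in profiles.items()}
    scale = max(abs(v) for v in raw.values()) or 1.0
    scored = [(name, value / scale) for name, value in raw.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Step 1: a hypothetical disease signature
up_genes = ["G1", "G2"]
down_genes = ["G3", "G4"]

# Hypothetical reference profiles (gene -> expression change)
profiles = {
    "compound-A": {"G1": 2.0, "G2": 1.0, "G3": -1.0, "G4": -2.0},
    "compound-B": {"G1": -1.0, "G2": -1.0, "G3": 1.0, "G4": 1.0},
    "compound-C": {"G1": 0.5, "G2": 0.5, "G3": 0.5, "G4": 0.5},
}

# Steps 2 and 3: query and list the compounds by score
ranked = query_profiles(profiles, up_genes, down_genes)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

A compound near the top mimics the signature, one near zero is unrelated, and one near the bottom reverses it; for repurposing, the reversing end of the list is often the interesting one.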
Chemocentric Informatics: CONSENSUS HYPOTHESES
• WDI DATABASE (59 K compounds) → QSAR FILTER → 300 5-HT6 active hits
• cmap DATABASE (6.1 K individual instances) → cmap FILTER → 881 instances with S1; 861 instances with S2
• 97 common hits with S1
• 106 common hits with S2
• 73 common hits with S1 & S2
• Further selection → 34 higher confidence hits
• Hit classes shown: antipsychotics, antidepressants, calcium channel blockers, Selective Estrogen Receptor Modulators (SERMs)
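The consensus step on this slide amounts to intersecting hit lists from independent filters. A minimal sketch, assuming each filter yields a set of compound names; the names and the function `fuse_hypotheses` are hypothetical and stand in for the 300 QSAR hits and the cmap hits for signatures S1 and S2.

```python
def fuse_hypotheses(qsar_hits, cmap_hits_s1, cmap_hits_s2):
    """Intersect QSAR hits with the cmap hits for each signature."""
    common_s1 = qsar_hits & cmap_hits_s1
    common_s2 = qsar_hits & cmap_hits_s2
    return {
        "S1": common_s1,
        "S2": common_s2,
        "S1&S2": common_s1 & common_s2,
    }


# Hypothetical hit lists
qsar_hits = {"drug-a", "drug-b", "drug-c", "drug-d"}
cmap_hits_s1 = {"drug-a", "drug-b", "drug-x"}
cmap_hits_s2 = {"drug-b", "drug-c", "drug-y"}

consensus = fuse_hypotheses(qsar_hits, cmap_hits_s1, cmap_hits_s2)
for label, hits in consensus.items():
    print(label, sorted(hits))
```

A compound that survives both a structure-based filter and an expression-based filter is supported by two unrelated lines of evidence, which is the sense in which the fused hypotheses carry higher confidence.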
Exploring PubMed as one of the largest Chemical Biology Databases: the ChemoText Project
• Subject Chemical: 134,184 distinct
• Proteins: 61,329 distinct
• Drug Effects: 7,761 distinct
• Diseases: 4,865 distinct
• Relationship counts shown in the diagram: 9,088,747; 13,157,701; 20,466,335; 5,395,144; 9,360,330
Baker, N., Hemminger, B. J Biomed Inform. 2010 Aug;43(4):510-9
http://chemotext.mml.unc.edu/
•2008 Medline baseline: 16,880,015 records
•6,635,344 records had subject chemicals
Swanson’s ABC approach to drug discovery via text mining*
• A, Chemicals: Magnesium
• B, Intermediate Terms: vasodilation, spreading cortical depression, platelet aggregation
• C, Disease: Migraine
• A to B and B to C: relationships established through co-occurrence of terms
• A to C: deduced relationship
*Swanson DR. Medical literature as a potential source of new knowledge. Bull Med Libr Assoc 1990;78(1):29–37
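The ABC idea can be sketched as a co-occurrence search: find intermediate terms B that appear with A in some records and with C in others, while A and C never appear together. The records below are invented term sets built from the slide's magnesium and migraine example; they are not actual MEDLINE annotations.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(records):
    """Map each term to the set of terms it shares at least one record with."""
    links = defaultdict(set)
    for terms in records:
        for first, second in combinations(sorted(terms), 2):
            links[first].add(second)
            links[second].add(first)
    return links


def abc_intermediates(links, a_term, c_term):
    """Return (directly_linked, sorted B terms linking A and C)."""
    a_links = links.get(a_term, set())
    c_links = links.get(c_term, set())
    directly_linked = c_term in a_links
    shared = (a_links & c_links) - {a_term, c_term}
    return directly_linked, sorted(shared)


# Hypothetical records, each a set of annotated terms
records = [
    {"magnesium", "vasodilation"},
    {"magnesium", "platelet aggregation"},
    {"magnesium", "spreading cortical depression"},
    {"migraine", "vasodilation"},
    {"migraine", "platelet aggregation"},
    {"migraine", "spreading cortical depression"},
    {"migraine", "aura"},
]

links = build_cooccurrence(records)
direct, intermediates = abc_intermediates(links, "magnesium", "migraine")
print(direct, intermediates)
```

The deduced A to C relationship is a hypothesis to be tested, not a finding; the number and quality of the B terms only indicate how much indirect support it has.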
ABC Method as applied to discern chemical-target-disease associations (using Chemotext)
• A: Chemical
• B: Protein
• C: Disease
http://chemotext.mml.unc.edu/
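With proteins as the B terms, the same scheme can be run in the other direction: start from a disease and rank chemicals by how many disease-linked proteins they share, skipping chemicals already linked to that disease. This is a sketch under assumptions: the pair lists, the names, and the count-based ranking are hypothetical and do not reflect the actual ChemoText schema or scoring.

```python
from collections import Counter


def rank_chemicals(chem_protein, protein_disease, chem_disease, disease):
    """Rank chemicals not yet linked to `disease` by shared protein count."""
    proteins = {p for p, d in set(protein_disease) if d == disease}
    known = {c for c, d in set(chem_disease) if d == disease}
    counts = Counter(
        chem for chem, protein in set(chem_protein)
        if protein in proteins and chem not in known
    )
    # Highest count first; ties broken alphabetically for a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical relationship pairs
chem_protein = [
    ("chem-1", "protein-X"),
    ("chem-1", "protein-Y"),
    ("chem-2", "protein-X"),
    ("chem-3", "protein-Z"),
    ("chem-4", "protein-Y"),
]
protein_disease = [
    ("protein-X", "disease-D"),
    ("protein-Y", "disease-D"),
    ("protein-Z", "disease-E"),
]
chem_disease = [("chem-4", "disease-D")]

candidates = rank_chemicals(chem_protein, protein_disease, chem_disease, "disease-D")
print(candidates)
```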
Raloxifene identified as a 5-HT6 receptor ligand and potential treatment for Alzheimer’s disease
• Raloxifene binds to the 5-HT6 receptor with a Ki = 750 nM.*
• Figure: competition binding at 5-HT6 receptors for raloxifene (yellow triangle) and chlorpromazine (square) versus [3H]LSD. Tested by our collaborators at PDSP. Structures shown: chlorpromazine, raloxifene.
• Raloxifene given at a dose of 120 mg/day led to reduced risk of cognitive impairment in postmenopausal women. Yaffe, K. et al., Am J Psychiatry, 2005, 162, 683–690.
• Adjunctive raloxifene treatment improves attention and memory in men and women with schizophrenia. Weickert TW, et al. Mol Psychiatry. 2015, 20, 685-94
*Hajjo et al, Chemocentric Informatics Approach to Drug Discovery. J Med Chem. 2012, 55(12):5704-19
NIH 1U01CA207160-01. Drug repurposing: From Man to Molecules to Man
Aim 1: From Man
Aim 2: To Molecule
Aim 3: To Man
Diagram labels:
• Data sources: social media, on-line databases, electronic medical records, etc.
• Chemotext
• Curated Database of Assertions
• Cancer-Related Assertions
• Curated Cancer-Related Bioassay Database
• Drug-Target-Disease Database
• Disease Effect Analysis
• Virtual screening platform
• Hypothesis generation, hypothesis enrichment, hypothesis confirmation
• Primary hits
• Candidates for repurposing
• Experimental validation in-vitro and in-vivo