Medicinal
Informatics
Automated Design of ligands with
targeted polypharmacology
Jérémy Besnard PhD
University of Dundee
ELRIG Drug Discovery '13 Manchester
3rd September 2013
Background
• Increasing cost of R&D
• High failure rate for compounds in Phase II
and III
Phase II failures, 2008–2010: the 108 failures are divided by reason for
failure where reported (87 drugs). The total success rate was 18% between
2008 and 2009 (Arrowsmith, Nature Reviews Drug Discovery, 2011)
Possible solution
• Improve efficacy and safety by better
understanding polypharmacological profile
of a compound
Two proteins are deemed to interact in chemical space (joined by an edge) if
both bind one or more compounds. Paolini et al., Nature Biotechnology, 2006
Designing Ligands
• Challenging to test one compound versus
multiple targets: costs, which panel to use,
more complicated SAR, increasing difficulty
of multiobjective optimisation
• Computational methods can provide
– Design ideas
– Prediction of activities
– When possible ADME predictions
Design ideas – De Novo Drug Design
• Compound structures are generated by an
algorithm
• Predefined rules to create/modify structures
• User defined filters
– Molecular property space (MW, LogP)
– Primary activities to improve
– Side activities to avoid
Can we design against a
polypharmacological profile?
• Drug Design is a multi-dimensional optimisation
problem
• Polypharmacology profile design increases the number
of dimensions but not the type of the problem
– Multiple biological activities
– ADMET properties
– Drug-like properties
• Automating drug design is the strategy we have taken
to deal with the design decision complexity of multitarget optimisation
Drug Optimisation Road
[Figure: path from Lead to Clinical Candidate through biologically active chemical space, showing the synthesised compounds and each decision to synthesise]
• Decisions:
– Exploration
– Improvement
• Guides
– Structure
– Previous SAR
– Med Chem knowledge
Algorithm
• Define objectives
• Generate virtual compounds
– From the current compounds, using Med Chem design rules
• Assess molecules
– Predict properties: phys-chem, activities (primary and anti-target), novelty
– Machine learning on background knowledge
• Multiobjective prioritization
– Top cpds + random set go into the next run (X runs)
• Final population
– Analyse
– Synthesis of optimal molecules
– Test in bio-assays
– Results expand the knowledge-base
Patent WO2011061548A2
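The loop on the Algorithm slide (generate virtual compounds, assess them, keep the top compounds plus a random set, repeat for X runs) can be sketched roughly as follows. This is a toy illustration only, not the patented implementation: compounds are plain strings, and `design_loop`, `toy_rules` and `toy_score` are hypothetical names invented here to stand in for the design rules and the predicted-property scoring.

```python
import random

def design_loop(seeds, transformations, score,
                n_generations=5, n_top=10, n_random=5, rng=None):
    """Iteratively expand and prune a population of virtual compounds."""
    rng = rng or random.Random(0)

    def rank(items):
        # Ties are broken on the compound itself so the order is reproducible.
        return sorted(items, key=lambda c: (score(c), c), reverse=True)

    population = set(seeds)
    for _ in range(n_generations):  # the "X run" loop
        # Generate virtual compounds: apply every design rule to every member.
        virtual = {t(c) for c in population for t in transformations}
        ranked = rank(population | virtual)
        # Keep the top-ranked compounds plus a random set to preserve diversity.
        top, rest = ranked[:n_top], ranked[n_top:]
        population = set(top) | set(rng.sample(rest, min(n_random, len(rest))))
    return rank(population)  # final population, best first

# Hypothetical stand-ins: string edits as "design rules", a made-up score.
toy_rules = [
    lambda s: s + "N",       # add a fragment
    lambda s: s + "C",       # add a different fragment
    lambda s: s[:-1] or s,   # remove the last fragment (never empty the string)
]

def toy_score(s):
    return s.count("N") - 0.1 * len(s)

final = design_loop(["C"], toy_rules, toy_score, n_generations=4)
best = final[0]
```

Retaining a random set alongside the top-ranked compounds is what lets the search escape a local optimum; in this sketch that choice is controlled by the assumed `n_random` parameter.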
Background knowledge
• ChEMBL
– 30 years of publications
– Total 40,000 papers
– Total ~3M endpoints
– Total ~660,000 cpds
Transformations
• Set of ~700
– Tactics to design analogs
– Not synthetic reactions
– Derived from literature
• Semi-automatic
– Try to find new transformations
Model
• Categorical model
– Active if activity < 10 μM
• Use 2D structural information
[Figure: 2D structure encoded as a binary fingerprint, e.g. 1 0 0 1 1 0 0 0 1 0]
Bayesian
• Bad feature: 360 times in training set, never in an active molecule: weight = -1.91
• Good feature: 23 times in training set, 15 times in an active molecule: weight = 2.46
• Moderately good feature: 389 times in training set, 7 times in an active molecule: weight = 0.10
• Moderately bad feature: 4 times in training set, never in an active molecule: weight = -0.06
“A molecule”
Score = 2.46 + 0.10 - 1.91 - 0.06 = 0.59
• A high score means high confidence of activity
• A low (negative) score means high confidence of inactivity
• Score ~ 0: either good and bad features cancel out, or the features are unknown
W. Van Hoorn, Scitegic User Group Meeting, Feb 2006, La Jolla
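The four weights on the Bayesian slide are consistent with a Laplacian-corrected estimate of the form ln((A + 1) / (T·P + 1)), where T is the number of training molecules containing the feature, A the number of those that are active, and P the baseline fraction of actives. The sketch below assumes P = 0.016. That value is not given in the talk; it was chosen here because it reproduces the slide's numbers. The function name `laplacian_weight` is likewise illustrative.

```python
import math

def laplacian_weight(n_total, n_active, base_rate):
    """Laplacian-corrected log weight for one fingerprint feature."""
    return math.log((n_active + 1) / (n_total * base_rate + 1))

# Assumption: baseline active rate inferred to match the slide, not stated in it.
BASE_RATE = 0.016

# (times in training set, times in an active molecule), taken from the slide.
features = {
    "bad": (360, 0),
    "good": (23, 15),
    "moderately good": (389, 7),
    "moderately bad": (4, 0),
}

weights = {
    name: laplacian_weight(total, active, BASE_RATE)
    for name, (total, active) in features.items()
}

# A molecule containing all four features is scored by summing their weights.
score = sum(weights.values())

for name, w in weights.items():
    print(f"{name:16s} {w:+.2f}")
print(f"score            {score:+.2f}")
```

Rounded to two decimals, the weights come out as -1.91, 2.46, 0.10 and -0.06, and the summed score as 0.59. The correction explains why a feature seen only 4 times and never in an active barely moves the score, while one seen 360 times and never in an active counts strongly against activity.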
Prioritization
• Objectives
– Activity
– CNS score or QED
– Anti Target
• Example
– Receptor 1 and 2 activity
– Good CNS score
– No α1 (a, b and d) activity
-> n dimensions
[Figure: achievement objective plotted in the space of Objective 1 and Objective 2]
QED: see Bickerton et al., Quantifying the chemical beauty of drugs. Nature Chemistry, 4, 2012
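One simple way to rank compounds across n objectives is to scale each objective to an achievement between 0 and 1 and measure the distance to the ideal point (1, 1, ..., 1). The sketch below assumes that scheme purely for illustration; the talk does not spell out the exact prioritization formula, and the compound names and achievement values are invented.

```python
import math

def distance_to_ideal(achievements):
    """Euclidean distance from the ideal point (1, 1, ..., 1)."""
    return math.sqrt(sum((1.0 - a) ** 2 for a in achievements))

def prioritize(compounds):
    """Return compound names ordered from closest to farthest from the ideal."""
    return sorted(compounds, key=lambda name: distance_to_ideal(compounds[name]))

# Hypothetical achievements, one value per objective:
# (receptor 1 activity, receptor 2 activity, CNS score, absence of α1 activity)
candidates = {
    "cpd_A": (0.9, 0.8, 0.7, 0.9),  # balanced across all objectives
    "cpd_B": (1.0, 1.0, 0.2, 0.1),  # potent but poor CNS score and α1 liability
    "cpd_C": (0.5, 0.5, 0.5, 0.5),  # mediocre everywhere
}

ranking = prioritize(candidates)
print(ranking)
```

With these made-up numbers the balanced compound ranks first and the potent but unselective one last, which illustrates why adding objectives increases the number of dimensions without changing the type of problem.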
Experimental Validation
• Does it actually work?
• Evolution of a drug (SOSA)
– Look at possible side activity of
drugs
• Donepezil: acetylcholinesterase inhibitor used for Alzheimer's disease
• Potential activity at the dopamine D4 receptor
• Confirmed experimentally at 600 nM: design ligands with donepezil as a hit to improve D4 activity
• Dopamine D2 receptor also studied (lower prediction, not active)
Wermuth, C. G. Selective optimization of side activities: the SOSA approach.
Drug discovery today, 11(3-4), 2006
What are Dopamine D2 and D4
receptors?
• Belong to the GPCR family
• Mainly present in the CNS
• Involved in cognition, memory, learning…
• Targets for several neuropsychiatric disorders such as Parkinson’s disease, schizophrenia, attention-deficit hyperactivity disorder, bipolar disorder…
• Data (4,400 activities for D2 and 1,500 for
D4) and screening facilities available
Two studies
• Two receptors as objectives
– D2: will lead to work on selectivity toward
multiple receptors
– D4: will lead to work on selectivity and
novelty
D2 as objective
• 1st series of compounds with high D2
prediction
Results
CNS penetration for
compound 3: brain/blood
ratio = 0.5
Next objectives: reduce antitarget activity
• Polypharmacology primary activity
– Combination profile of multiple GPCRs: 5-HT1A, D2, D3, D4
• Selectivity over alpha 1 anti-targets
– Alpha 1a, 1b and 1d
– Inhibitors induce vasodilatation
• Novelty: remove known scaffolds
• Good phys-chem properties: need to cross the blood-brain barrier
• Run multiple calculations and inspect the results for synthetically attractive compounds
Optimisation results for the 5-HT1A/D2/D3/D4/α1 selectivity/CNS objectives
Highest ranked compound
Path
Results
Selectivity
• Need to include selectivity in the algorithm:
– Alpha adrenoreceptor 1 inhibitors versus
other targets
[Figure: avg_selectivity, the ratio Ki D2 receptor / Ki α receptor, on a log scale from 0.001 to 100, for compounds GFR-VII-266 to GFR-VII-332, including an HCl salt]
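The selectivity measure named on the Selectivity slide is the ratio Ki D2 receptor / Ki α receptor, so values below 1 favour D2 over the α1 anti-target. A minimal sketch follows, assuming the average over the α1a, α1b and α1d subtypes is taken as a geometric mean, which suits a log-scaled axis. The averaging method and all Ki values below are assumptions for illustration, not data from the talk.

```python
import math

def selectivity_ratio(ki_target_nM, ki_antitarget_nM):
    """Ki(target) / Ki(anti-target); below 1 means the target is bound more tightly."""
    return ki_target_nM / ki_antitarget_nM

def average_selectivity(ki_target_nM, ki_antitargets_nM):
    """Geometric mean of the ratio over several anti-targets (assumed averaging)."""
    logs = [math.log10(selectivity_ratio(ki_target_nM, ki))
            for ki in ki_antitargets_nM]
    return 10 ** (sum(logs) / len(logs))

# Hypothetical Ki values in nM, not measurements from the study.
ki_d2 = 10.0
ki_alpha1 = {"alpha1a": 1000.0, "alpha1b": 100.0, "alpha1d": 10000.0}

avg = average_selectivity(ki_d2, ki_alpha1.values())
print(f"average selectivity ratio: {avg:.3g}")
```

In this made-up case the three ratios are 0.01, 0.1 and 0.001, giving a geometric mean of 0.01, that is a 100-fold preference for D2 on average.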
D4 objective
• Improve D4 activity
• Good ADME score
Bayesian = 25, D4 Ki = 614 nM
Bayesian = 105, D4 Ki = 9 nM
Screening data
[Table: Bayesian model predictions and Ki binding assays (nM)]
Experimental Data
• Compound 13 is selective for the D4 receptor, with pKi = 8
• It crosses the BBB (brain/blood ratio of 7.5)
• In vivo experiments, compared against D4-KO mice, showed that the compound acts on target
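Ki and pKi are related by pKi = -log10(Ki in mol/L). The short sketch below converts the two D4 Ki values quoted on the D4 objective slide. The helper name `pki_from_ki_nM` is illustrative, and the slides do not explicitly state which compound each Ki value belongs to.

```python
import math

def pki_from_ki_nM(ki_nM):
    """Convert a Ki given in nanomolar to pKi = -log10(Ki in mol/L)."""
    return -math.log10(ki_nM * 1e-9)

for ki in (614.0, 9.0):
    print(f"Ki = {ki:g} nM  ->  pKi = {pki_from_ki_nM(ki):.1f}")
```

A Ki of 9 nM corresponds to a pKi of about 8.0, in line with the pKi = 8 quoted for compound 13, while 614 nM corresponds to about 6.2.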
Morpholino series
• However, compound 13 is commercially available and thus not novel
• New objective: starting from 13, keep activity, filter out non-novel chemotypes, keep D4 selectivity over other targets, stay CNS penetrant
• Example of a top-ranked compound
Morpholino series
• 24 analogues were synthesised around 2
scaffolds
Matrix of results
Lead Series Criteria Met
• Ki < 100 nM
• Highly novel chemotype at the level of the carbon framework
• Chemotype is D4 selective
• CNS penetrant
• Patent filing (WO2012160392)
Further characterization
• Functional data
– Compounds are antagonists or inverse agonists
• hERG (K+ ion channel): inhibition can cause sudden death
– 27s: EC50 = 3 μM
• Blood-brain barrier
– 27s: in vivo brain/blood ratio of 2.0
• Stability
– Compound itself: oxidation possible (indoline → indole)
– Metabolic stability: high clearance, needs improvement (Cli = 25 mL/min/g)
Compound 27s can be classified as a lead for a D4-selective inverse agonist. The series also shows potential for dual 5-HT1A/D4 ligands.
How to improve the algorithm
• Better model: better predictions can help reduce false positives and detect other potential activities
• Different methods
– Predictions: other machine learning, 2D/3D
similarity (USR-USRCAT), docking
– Idea generator: real synthetic reactions, group
replacements (MMPs)
• More knowledge on the method itself
– Where it works
– When to stop
Hussain. Computationally Efficient Algorithm to Identify Matched Molecular Pairs (MMPs) in Large Data Sets. J. Chem. Inf. Model., 4, 2010
Ballester. Ultrafast shape recognition to search compound databases for similar molecular shapes. Journal of computational chemistry, 28(10), 2007
Schreyer. USRCAT: real-time ultrafast shape recognition with pharmacophoric constraints. Journal of Cheminformatics, 4(27), 2012
Conclusion
• We have designed an algorithm to generate and predict compounds against a polypharmacological profile
• The algorithm can adapt to the situation: improve activity, selectivity, novelty
• We have shown proof of concept that we can automatically invent patentable compounds
• Results were experimentally validated and generated a lead compound; this study has been published (Besnard et al., Nature, 492(7428), 2012)
• The technology has been licensed to Ex Scientia Ltd (spin-off, http://www.exscientia.co.uk/ )
• In its first year, Ex Scientia has had further successes applying the algorithm to the design of ligands for various other gene families, including ion channels, GPCRs and enzymes (“stay tuned”)
Acknowledgments
• Pr. Andrew Hopkins
• Richard Bickerton
• ALH group
• Pr. Ian Gilbert
• Gian Filippo Ruda
• Karen Abecassis
• Kevin Read and DMPK group
• Barton group
• Brenk group
• Pr. Bryan Roth (UNC-CH - NIH)
• Vincent Setola
• Roth lab
• Pr. William Wetsel (Duke University Medical School)
• Wetsel group
• CLS IT support (Jon)
• Accelrys support