- Lorentz Center

Forming focused libraries and discovering active
molecules with Iterative Stochastic Elimination
Amiram Goldblum, Anwar Rayan and David Marcus
Dept. of Medicinal Chemistry
School of Pharmacy
Ein Kerem Campus
http://www.md.huji.ac.il/models
Optimizing Drug Design
Leiden 20-23 July 2009
Iterative Stochastic Elimination (ISE)
Our Generic tool for optimizing highly complex combinatorial problems
Problem type: Systems with many variables, each variable having
many discrete values, the variables interacting with each other,
and each state of the system can be evaluated and given a score
(transportation, communication, electronic devices, life sciences)
Method: ISE finds optimal system states (global and local
minima/optima) by iteratively eliminating values of variables that
contribute to worst results. Elimination is based on careful
statistics of randomly picked states of the system
Why: ISE has been compared to Genetic Algorithms, Monte Carlo,
Simulated annealing, Support Vector Machines and other
optimization methods – on specific problems and found to do as
well or better
Iterative Stochastic Elimination publications
1. Glick, M. & Goldblum, A. A novel energy-based stochastic method for positioning polar
protons in protein structures from X-rays. Proteins-Structure Function and Genetics 38, 273-287 (2000).
2. Glick, M., Rayan, A. & Goldblum, A. A stochastic algorithm for global optimization and for
best populations: A test case of side chains in proteins. Proceedings of the National
Academy of Sciences of the United States of America 99, 703-708 (2002).
3. Noy, E., Gorelik, B., Rayan, A. & Goldblum, A. Stochastic path to form ensembles and to
quantify flexibility in proteins. Abstracts of Papers of the American Chemical Society 225,
U781-U781 (2003).
4. Rayan, A., Barasch, D., Brinker, G., Cycowitz, A., Geva-Dotan, I., Scaiewicz, A. & Goldblum,
A. New stochastic algorithm to determine drug-likeness. Abstracts of Papers of the
American Chemical Society 226, U297-U297 (2003).
5. Rayan, A., Scaiewicz, A., Geva-Dotan, I., Barasch, D. & Goldblum, A. Screening molecules
for their drug-like index. Abstracts of Papers of the American Chemical Society 228, U358-U358 (2004).
6. Rayan, A., Senderowitz, H. & Goldblum, A. Exploring the conformational space of cyclic
peptides by a stochastic search method. Journal of Molecular Graphics & Modelling 22, 319-333 (2004).
7. Rayan, A., Noy, E., Chema, D., Levitzki, A. & Goldblum, A. Stochastic algorithm for kinase
homology model construction. Current Medicinal Chemistry 11, 675-692 (2004).
8. Rayan, A., Scaiewicz, A., Geva-Dotan, I., Marcus, D., Barasch, D. & Goldblum, A. (2007).
Determining the drug-like character of molecules and prioritizing them by a drug-like index,
ACS presentations 2005-8.
9. Noy, E., Tabakman, T. & Goldblum, A. Constructing ensembles of flexible fragments by ISE
is relevant to protein-protein interfaces. Proteins 68, 702-711 (2007).
10. Gorelik, B & Goldblum, A. High Quality binding modes in docking ligands to proteins.
Proteins (2008), 71, 1373-1386
General Model System
• Variables
• Values
• Interactions
[Figure: variables A to E, each shown with its discrete values (for example A1, A2, A6, A7; B4 to B8; C4 to C7)]
The number of combinations:
7(A) × 8(B) × 7(C) × n(D) × m(E) × … = a very large number
• An exhaustive calculation is not possible
(1) Randomly pick one value for each of the variables (for example A7, B4, C5)
This determines a single “conformation” or “configuration” of the system
(2) Employ the “cost function” to score the current configuration
(3) Repeat steps (1) and (2) for n conformations (n ~ 10^3 to 10^6), and calculate the total value of each:
sample 2: 2nd value
...
sample n: nth value
(4) Construct a histogram of the distribution of values for all sampled conformations
[Figure: histogram of the distribution (0% to 10%) against function value (1 to 30), with the low-values and high-values regions marked]
(5) Examine the frequency of each variable value in the worst results and compare it to the expected frequency
Zoom on the high-values region:
conformation 220: A3 B4 C6 D7 E8 F1
conformation 314: A3 B8 C6 D2 E8 F2
conformation 715: A3 B4 C6 D6 E2 F9
A3 and C6 recur in all three conformations
(6) Evict values that contribute above expectation to the worst scores and less than expected to the best
[Figure: the same three conformations (220, 314, 715), with A3 and C6 set apart from the remaining values]
The total number of combinations is reduced
(7) Repeat the process iteratively until all remaining combinations
can be evaluated exhaustively and sorted. We obtain a population
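Steps (1) to (7) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the group's implementation: the function name `ise_minimize`, the 10% tails, the simple "above/below the uniform expectation" eviction rule, and the additive toy cost function are all hypothetical choices made for the example.

```python
import itertools
import math
import random


def ise_minimize(variables, cost, n_samples=2000, tail=0.1,
                 exhaustive_limit=200, max_iter=20, seed=0):
    """Toy ISE loop: sample, score, evict over-represented bad values, repeat."""
    rng = random.Random(seed)
    values = {name: list(vals) for name, vals in variables.items()}
    names = list(values)

    def n_combinations():
        return math.prod(len(values[name]) for name in names)

    for _ in range(max_iter):
        if n_combinations() <= exhaustive_limit:
            break
        # Steps 1-3: random states, each scored by the cost function.
        states = [{name: rng.choice(values[name]) for name in names}
                  for _ in range(n_samples)]
        states.sort(key=cost)                  # lowest (best) cost first
        k = max(1, int(tail * n_samples))
        best, worst = states[:k], states[-k:]
        # Steps 5-6: compare each value's count in the two tails with the
        # count expected from uniform sampling, then evict.
        evicted = False
        for name in names:
            expected = k / len(values[name])
            for v in list(values[name]):
                in_worst = sum(s[name] == v for s in worst)
                in_best = sum(s[name] == v for s in best)
                if (in_worst > expected and in_best < expected
                        and len(values[name]) > 1):
                    values[name].remove(v)
                    evicted = True
        if not evicted:
            break                              # nothing left to eliminate
    # Step 7: exhaustive evaluation of the surviving combinations, sorted.
    # (Only feasible here because the toy problem is small.)
    survivors = [dict(zip(names, combo))
                 for combo in itertools.product(*(values[n] for n in names))]
    return sorted(((cost(s), s) for s in survivors), key=lambda pair: pair[0])


# Toy problem: 5 variables with 6 values each (7776 combinations);
# the cost is the sum of the chosen values, so the optimum is all zeros.
toy_variables = {name: range(6) for name in "ABCDE"}
ranked = ise_minimize(toy_variables, cost=lambda state: sum(state.values()))
print(ranked[0])
```

The function returns the whole sorted population of surviving states rather than a single optimum, which mirrors the idea of obtaining a population of good solutions.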
Acetylcholinesterase inhibitors with ISE
Inhibition measured by Marta Rosin (Novartis’ Exelon), Hebrew University School of Pharmacy
[Flow diagram: > 2 million molecules; molecular chemical properties; ISE “engine”; target specificity; ISE docking and scoring; 9 molecules, 5 measured, 3 active]
Bcr-Abl dimerization inhibition by peptides (64 aa)
Synthesized and measured by Martin Ruthardt, Goethe Univ. Frankfurt
[Flow diagram: properties of amino acids; ~10^80 sequences; ISE “engine”; ISE protein design; target specificity; 10 peptides, 6 active]
Distinguishing between actives and inactives, on a specific target
Classification – Drugs vs. Non-drugs, Selectives vs. non Selectives
Huge combinatorial problem with more than 10^100 options
Optimization problem: find differences in molecular
properties to distinguish between actives and inactives
Learning from known data
“Actives”: Molecules with activity < 100 nM
“Selectives”: Molecules with selectivity > 3:1
“Inactives”: MDDR (randomly picked), or less active molecules
Properties (“descriptors”, our variables) are produced by
computer programs (MOE):
Molecular weight, number of H-bond donors & acceptors,
partial charges, topological, polar surface, Van der Waals,
Molar refraction etc…
Optimization of property ranges by ISE
to distinguish between the two databases
Each property is separated into two “sub properties”
[Figure: distribution of Mol Weight values (percentage, 0 to 60, against MW, 0 to 1500), with a lower range of ~80 values at intervals of 10, an upper range of ~70 values, and one randomly picked range marked]
Overall there are 80 × 70 = 5.6 × 10^3 combinations for ranges of this variable
Using properties to optimize the difference
between actives (selectives) and inactives
• If we construct a RANGE for each property
• Then we test each of the molecules in the actives and each in the inactives
A FILTER:
2 < HD ≤ 6
-2 < logP ≤ 3
150 < M.W. ≤ 775
• Determine if TP, TN, FP, FN (P, N, Pf, Nf)
• Compute the fraction of each category in the full DB
• Use the Matthews Correlation to score
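A filter of this kind can be sketched as a dictionary of property ranges applied to each molecule. The sketch below is illustrative only: the helper names `passes` and `confusion_counts` and the four example molecules are hypothetical, and the ranges are the ones shown on this slide.

```python
def passes(molecule, ranges):
    """True when every listed property lies in its (low, high] range."""
    return all(low < molecule[prop] <= high
               for prop, (low, high) in ranges.items())


def confusion_counts(actives, inactives, ranges):
    """Return (P, N, Pf, Nf) for one filter applied to both databases."""
    p = sum(passes(m, ranges) for m in actives)       # actives that pass
    nf = len(actives) - p                             # actives wrongly rejected
    pf = sum(passes(m, ranges) for m in inactives)    # inactives wrongly passed
    n = len(inactives) - pf                           # inactives rejected
    return p, n, pf, nf


# Ranges from the slide; the molecules are made-up examples.
filter_ranges = {"HD": (2, 6), "logP": (-2, 3), "MW": (150, 775)}
actives = [{"HD": 3, "logP": 1.2, "MW": 410},
           {"HD": 7, "logP": 0.5, "MW": 300}]
inactives = [{"HD": 1, "logP": 4.0, "MW": 120},
             {"HD": 4, "logP": 2.0, "MW": 500}]
counts = confusion_counts(actives, inactives, filter_ranges)
print(counts)
```

Dividing each count by the size of its database gives the fractions that are then scored.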
Scoring by the Matthews Correlation
Each given range is for ACTIVES, and actives can only be P or Nf
Databases: actives (P or Nf), inactives (N or Pf)
$$\mathrm{MCC} = \frac{P \times N - \mathrm{Pf} \times \mathrm{Nf}}{\sqrt{(N + \mathrm{Nf})(N + \mathrm{Pf})(P + \mathrm{Nf})(P + \mathrm{Pf})}}$$
For a fully correct prediction MCC = 1
For a completely erroneous prediction MCC = -1
For a random prediction MCC ~ 0.00
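As a quick check of the three limiting cases, the Matthews correlation can be computed directly from the four counts. This is a minimal sketch of the standard formula; returning 0 when the denominator vanishes is a common convention adopted here as an assumption, not something stated on the slide.

```python
import math


def mcc(p, n, pf, nf):
    """Matthews correlation from true/false positives and negatives."""
    denominator = math.sqrt((n + nf) * (n + pf) * (p + nf) * (p + pf))
    if denominator == 0:
        return 0.0  # assumed convention for an undefined score
    return (p * n - pf * nf) / denominator


print(mcc(50, 50, 0, 0))     # every molecule classified correctly
print(mcc(0, 0, 50, 50))     # every molecule classified wrongly
print(mcc(25, 25, 25, 25))   # no better than chance
```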
Applying ISE to discriminate between actives and
inactives by optimizing descriptor ranges
Construct filter i: pick randomly a value for each of the variables, i.e., low range MW, high range MW etc.
Pass all actives and inactives of the training set through filter i to obtain P, N, Pf, Nf
Get the MCC value for filter i:
$$\mathrm{MCC} = \frac{P \times N - \mathrm{Pf} \times \mathrm{Nf}}{\sqrt{(N + \mathrm{Nf})(N + \mathrm{Pf})(P + \mathrm{Nf})(P + \mathrm{Pf})}}$$
Repeat until i = 10^6
Then: Histogram, Elimination, Iteration, Exhaustive, Test
Results of exhaustive step, before clustering
Best filters (one per row):
MCC  | MW    | ClogP  | HDon | Hacc | %actives (P) | %inactives (N)
0.49 | 282 < | -6 <   | 0 <  | 2    | 82           | 67
0.49 | 292 < | -2.5 < | 0 <  | 2    | 78           | 71
0.49 | 292 < | < 9.5  | 0 <  | 2    | 80           | 69
0.49 | 301 < | -6 <   | 0 <  | 2    | 77           | 72
0.48 | 282 < | -6 <   | 0 <  | 1    | 85           | 63
Employing the “best sets of filters”
to construct a Molecular Bioactivity Index
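$$\mathrm{MBI} = \frac{1}{n} \sum_{i=1}^{n} \left( \delta_{\mathrm{active}} \frac{P_i}{\mathrm{Pf}_i} - \delta_{\mathrm{inactive}} \frac{N_i}{\mathrm{Nf}_i} \right)$$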
With good data, the range of MBI is large and we get a good
“resolution”
We have shown that we can use MBI to “fish” a few active
molecules out of a “sea” of inactive ones
http://www.md.huji.ac.il/models (look for “test MBI”)
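One possible reading of the index can be sketched as follows. It rests on an assumption that the slide does not spell out: a molecule that passes filter i is credited with that filter's P/Pf ratio, a molecule that fails it is debited with N/Nf, and the sum is averaged over the n filters. The function name and the two sets of filter statistics are hypothetical, and Pf and Nf are assumed to be non-zero.

```python
def bioactivity_index(passed, filter_stats):
    """Average filter contribution for one molecule.

    passed[i] is True when the molecule passes filter i.
    filter_stats[i] is (P, N, Pf, Nf) for filter i on the training set.
    Assumes Pf and Nf are non-zero for every filter.
    """
    total = 0.0
    for ok, (p, n, pf, nf) in zip(passed, filter_stats):
        # Assumed interpretation: reward a pass by P/Pf, penalize a fail by N/Nf.
        total += p / pf if ok else -(n / nf)
    return total / len(filter_stats)


# Hypothetical statistics for two filters.
stats = [(80, 70, 30, 20), (90, 60, 40, 10)]
passes_all = bioactivity_index([True, True], stats)
fails_all = bioactivity_index([False, False], stats)
print(passes_all, fails_all)
```

Under this reading, filters that separate the two databases well have large ratios, which is what spreads molecules over a wide range of index values.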
Employing the “best sets of filters”
to construct a Drug Likeness Index (DLI)
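$$\mathrm{DLI} = \frac{1}{n} \sum_{i=1}^{n} \left( \delta_{\mathrm{active}} \frac{P_i}{\mathrm{Pf}_i} - \delta_{\mathrm{inactive}} \frac{N_i}{\mathrm{Nf}_i} \right)$$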
Drug likeness is different from Lipinski’s rule of five (ROF)!
MBI and DLI can make a difference in:
• High Throughput Screening
• Combinatorial Synthesis
• Hit to lead development
• Lead optimization
• Construction of Focused libraries
• Molecular scaffold optimization
• Selectivity optimization
Timeline for discovery, single processor
One target (enzyme, cells, organs…)
1. Model building: 2-3 days
2. ZINC scan: a few hours
3. Diversity, similarity, eliminate known actives: a few hours
4. SciFinder manual search: 4-5 days
5. Purchase/synthesize molecules
6. In vitro tests: 1-2 months
Input: VEGFR-2 (KDR) active inhibitors < 100 nM
549 actives divided randomly into 412 training and 137 test set
Inactives are from MDDR
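A random division of this kind can be sketched as below. The molecule identifiers, the function name `split_actives` and the fixed seed are placeholders for the example; only the sizes (549, 412, 137) come from the slide.

```python
import random


def split_actives(molecules, n_train, seed=0):
    """Shuffle a copy of the list and cut it into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(molecules)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


known_actives = [f"mol_{i}" for i in range(549)]   # placeholder identifiers
training_set, test_set = split_actives(known_actives, 412)
print(len(training_set), len(test_set))
```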
Output: example of a filter with 6 descriptors
One of the best (high MCC); there are others with higher MCC but many descriptors
Number of descriptors: 6
MCC of test set: 0.79
TP 98.9
TN 78.6
Descriptor | Range
Bcut_SMR_3 | 0.0 – 3.06
SMR_VSA4   | 0.1 – 100.6
Vsa_pol    | 0.1 – 102.4
Reactive   | 0.0 – 0.999
balabanJ   | 0.0 – 1.902
Q_RPC-     | 0.0 – 0.267
A 6-property filter
Bcut_SMR_3: Molar refraction
SMR_VSA4: VdW surface area
Vsa_pol: Approx. VdW polar surface
Reactive: Reactive fragments
balabanJ: Topological variable
Q_RPC-: Relative negative partial charge
Enrichment in the training set of VEGFR2
MBI MODEL for VEGFR
[Figure: % true positives above threshold (green), % true negatives below threshold (red) and enrichment factor (blue) plotted against the MBI threshold (-18 to 22); percentage axis 0 to 100, enrichment axis 0 to 500]
Initial focused library from ZINC (2.1 million)
ZINC library screening gave 7826 molecules with top MBI
Similarity of highest MBI to training set
Similarity of focused library from ZINC against known VEGFR
active compounds
[Figure: histogram of the number of molecules (0 to 3000) against the Tanimoto index (0 to 1)]
Tanimoto index (bin center) | Number of molecules
0.025          | 0
0.075          | 37
0.125          | 858
0.175          | 2678
0.225          | 2655
0.275          | 1071
0.325          | 344
0.375          | 112
0.425          | 54
0.475          | 10
0.525          | 2
0.575          | 4
0.625          | 1
0.675 to 0.975 | 0
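For reference, the Tanimoto index between two fingerprints is the number of shared features divided by the number of features present in either. The sketch below represents a fingerprint as a set of "on" bit positions; that representation, the function names and the example bit sets are assumptions, since the slides do not say which fingerprint was used.

```python
def tanimoto(bits_a, bits_b):
    """Shared on-bits divided by on-bits present in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity(candidate, known_actives):
    """Highest Tanimoto index of a candidate against a set of known actives."""
    return max(tanimoto(candidate, reference) for reference in known_actives)


# Made-up fingerprints: two shared bits out of six distinct bits.
similarity = tanimoto({1, 2, 3, 4}, {3, 4, 5, 6})
nearest = max_similarity({1, 2, 3, 4}, [{3, 4, 5, 6}, {1, 2, 3, 9}])
print(similarity, nearest)
```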
BBB results
[Figure: BBB Index (-1 to 1) for molecules with positive and negative BBB pass]
ER-MBI “moving ensemble” (normalized MBI values)
[Figure: logRBA against ER-MBI for the High, Moderate and Low classes]
ER-MBI Combined high/low MBI
[Figure: logRBA (-5 to 3) against ER-MBI (-1 to 1) for the Low, Moderate and High classes; R² = 0.75]
Molecular bioactivity index
Molecular Bioactivity Index (MBI):
Fishing actives from a “bath” of “non-actives”
Mix 10 actives into 100,000 molecules:
9 are found in the best 100 (enrichment of 900)
5 are found in the best 10 (enrichment of 5000)
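The two enrichment figures follow from the usual definition: the hit rate in the top-ranked subset divided by the hit rate in the whole library. A small sketch (the function name is hypothetical):

```python
def enrichment_factor(hits_in_top, top_size, total_actives, library_size):
    """Hit rate in the top-ranked subset relative to the whole library."""
    return (hits_in_top * library_size) / (top_size * total_actives)


# 10 actives mixed into 100,000 molecules, as on the slide.
top_100 = enrichment_factor(9, 100, 10, 100_000)
top_10 = enrichment_factor(5, 10, 10, 100_000)
print(top_100, top_10)
```

With 9 of the 10 actives in the best 100, the hit rate is 9% against a background of 0.01%, giving 900; with 5 in the best 10 it is 50% against 0.01%, giving 5000.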
Polypharmacology – with our indexing method
• We use several MBI (or MBI and DLI) to map activity into multiple targets.
This may be used to extract potential new poly-active compounds or
selective compounds, depending on the behavior of the relevant disease
[Figure: map of MBI target1 against MBI target2, with regions labeled Target1 selective, Target2 selective, Multi target and Non-actives]
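The four regions of such a map can be sketched as a simple two-threshold classification. The thresholds of 0.0 and the function name are placeholders for illustration; the slides do not give the cut-offs actually used.

```python
def classify(mbi_target1, mbi_target2, threshold1=0.0, threshold2=0.0):
    """Assign a molecule to one of the four regions of the two-target map."""
    hit1 = mbi_target1 > threshold1
    hit2 = mbi_target2 > threshold2
    if hit1 and hit2:
        return "multi-target"
    if hit1:
        return "target1 selective"
    if hit2:
        return "target2 selective"
    return "non-active"


# Made-up index values for four molecules.
labels = [classify(5.0, 4.0), classify(5.0, -3.0),
          classify(-2.0, 6.0), classify(-2.0, -3.0)]
print(labels)
```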
Docking & Scoring
Do the molecules bind?
How strong is the binding affinity? (Score)
What does the complex look like? (Binding mode)
Requirement: 3D structure of the target (X-ray, NMR, homology model)
ISE-dock
• A new docking program from our lab that uses the ISE algorithm to produce
large sets of optimal results for docking ligands to their targets
ISE-dock
• Better than AutoDock – the most cited
docking program
• Much better in the main docking criteria
than two other popular programs, Glide
and GOLD
• Produces large near optimal docking
populations to study the nature of binding
and to predict alternative binding modes
• Accounts for ligand and protein flexibility
• Correlation between ISE-dock populations
and experimental multiple binding modes
Anti Alzheimer current main drug strategy
[Reaction scheme: AChE hydrolyzes acetylcholine to choline and acetic acid]
MBI MODEL for AChE inhibition
[Figure: % true positives above threshold (green), % false positives above threshold (red) and enrichment factor (blue) plotted against the MBI threshold (-10 to 40); percentage axis 0 to 100, enrichment axis 0 to 2000]
Based on ~450 active molecules with IC50 < 10 micromolar
~8000 randomly picked molecules from ZINC assumed to be inactives
Docking with ISE-dock/Autodock
We used the crystal structure of mouse AChE (1q84) for
docking.
Compounds in protonated state were docked to AChE by
AutoDock3.0 and ISE-Dock.
751 out of 755 compounds were docked in the active site
by both methods
ISE-dock results
10 different conformations of one ligand in AChE. Each color represents a different pose.
Fig 2 – AChE with ACh. The red color represents the negatively charged gorge, due to many side-chain aromatic rings.
10 compounds from docking results
(financial limitation)
The 10 compounds were picked by direct examination of each of
these molecules in the active site, paying utmost attention to its
conformation, H-bonds and other interactions.
Experimental Results
9 out of the 10 compounds were purchased
8 out of the 9 compounds reached our lab in sufficient quantity
5 out of the 8 compounds are soluble
3 out of the 5 compounds are active (IC50 = 3.25, 3.5, 3.75 µM)
Similarity to known active compounds is less than 0.35
The molecules are novel AChE inhibitors (not a single paper on any)
Conclusions
• ISE is useful for solving extremely complex optimization problems
• Provides large sets of graded results
• Achieves high enrichments of “actives” vs. “inactives” by MBI, DLI, MSI etc.
• Useful for developing multi-targeted drugs
• Discovers new binders for known drug targets
• Produces diverse sets of solutions
Molecular Modeling Group Partners
http://www.md.huji.ac.il/models
http://www.cancergrid.eu
Prof. Andrej Bohac: Comenius U, Bratislava, VEGFR2 (Angiokem)
DAC company: Milan, HDAC and HSP90 inhibition
Prof. Mart Saarma: U. Helsinki, RET kinase inhibition
Prof. Martin Ruthardt: U. Frankfurt, Bcr-Abl inhibition by peptides
Prof. Yousef Najajreh: Al Quds University, Bcr-Abl inhibitor synthesis
Prof. Yossi Schlessinger: Yale, FGFR inhibitors
Prof. David Varon: Hadassah, Jerusalem, ADAMTS-13 inhibition
Prof. Angelo Carotti: School of Pharmacy, Univ. of Bari, MMP inhibitors
Prof. Marta Rosin: HUJI, AChE inhibitors
Molecular Modeling Group, HUJI
http://www.md.huji.ac.il/models