DRUG_DESIGN_ik_2011

Chemoinformatics in Drug Design
Irene Kouskoumvekaki,
Associate Professor,
Computational Chemical Biology,
CBS, DTU-Systems Biology
Biological Sequence Analysis, May 6, 2011
Computational Chemical Biology group
Tudor Oprea, Guest Professor
Olivier Taboureau, Associate Professor
Irene Kouskoumvekaki, Associate Professor
Sonny Kim Nielsen, PhD student
Kasper Jensen, PhD student
Ulrik Plesner, Master's student
CBS, Department of Systems Biology
Definition: Chemoinformatics
Gathering and systematic use of chemical information, and application of this information to predict the behavior of unknown compounds in silico.
data → prediction
Definition: A drug candidate...
... is a (ligand) compound that binds to a biological target (protein, enzyme, receptor, ...) and in this way either initiates a process (agonist) or inhibits it (antagonist).
The structure/conformation of the ligand is complementary to the space defined by the protein's active site.
The binding is caused by favorable interactions between the ligand and the side chains of the amino acids in the active site (electrostatic interactions, hydrogen bonds, hydrophobic contacts, ...).
Drug Discovery
[Figure: Disease, Biological Target, Drug candidate; study types shown: in vitro / in silico studies, animal studies, clinical studies]
The Drug Discovery Process
Pipeline: Genome → Gene → Protein → HTS → Hit → Lead → Candidate Drug
Disciplines along the pipeline: Genomics, Bioinformatics, Structural Bioinformatics, Chemoinformatics, Structure-based Drug Design, ADMET Modelling
The Drug Discovery Process
We know the structure of the biological target.
We identify/predict the binding pocket.
MKTAALAPLFFLPSALATTVYLA
GDSTMAKNGGGSGTNGWGEYL
ASYLSATVVNDAVAGRSAR…(etc)
Challenge: To design an organic molecule that binds strongly enough to the biological target to modulate its activity.
New drug candidate
Example: Alzheimer's disease
What is it?
Alzheimer's is a disease that causes failure of brain functions and dementia. It starts with poor memory and an inability to function in common everyday activities.
How do you get it?
Alzheimer's disease is the result of malfunctioning neurons in different parts of the brain. This, in turn, is due to an imbalance in the concentration of neurotransmitters.
Example: Alzheimer's disease
How can we treat it?
Acetylcholine neurotransmitter
Drug against Alzheimer's
Old School Drug discovery process
Screening collection (10^6 compounds) → HTS → Actives (10^3 actives) → Follow-up → Hits (1-10 hits)
High rate of false positives!
Hits → Hit-to-lead → Lead series (0-3 lead series) → Lead-to-drug → Drug candidate (0-1) → Clinical trials
Failures
Drug discovery in the 21st Century
in vitro: Diverse set of molecules tested in the lab.
in silico + in vitro: Computational methods to select subsets (to be tested in the lab) based on prediction of drug-likeness, solubility, binding, pharmacokinetics, toxicity, side effects, ...
The Lipinski ‘rule of five’ for druglikeness prediction
• Octanol-water partition coefficient (logP) ≤ 5
• Molecular weight ≤ 500
• Number of hydrogen bond acceptors (HBA) ≤ 10
• Number of hydrogen bond donors (HBD) ≤ 5
If two or more of these rules are violated, the compound might have problems with oral bioavailability.
(Lipinski et al., Adv. Drug Delivery Rev., 23, 1997, 3.)
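The four criteria can be applied as a simple filter. The sketch below is illustrative only: it assumes the descriptors have already been computed by some other tool, the `Descriptors` container and function names are hypothetical, and the example values are made up rather than taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    logp: float        # octanol-water partition coefficient
    mol_weight: float  # molecular weight in g/mol
    hba: int           # number of hydrogen bond acceptors
    hbd: int           # number of hydrogen bond donors


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the rule-of-five criteria that the compound violates."""
    checks = {
        "logP <= 5": d.logp <= 5,
        "MW <= 500": d.mol_weight <= 500,
        "HBA <= 10": d.hba <= 10,
        "HBD <= 5": d.hbd <= 5,
    }
    return [rule for rule, passed in checks.items() if not passed]


def may_have_oral_bioavailability_problems(d: Descriptors) -> bool:
    """Flag a compound when two or more criteria are violated."""
    return len(lipinski_violations(d)) >= 2


# Illustrative, made-up descriptor values for a small and a large compound.
small = Descriptors(logp=1.2, mol_weight=180.2, hba=4, hbd=1)
large = Descriptors(logp=6.3, mol_weight=712.9, hba=9, hbd=3)

print(lipinski_violations(small), may_have_oral_bioavailability_problems(small))
print(lipinski_violations(large), may_have_oral_bioavailability_problems(large))
```

Returning the list of violated criteria, rather than only a yes/no flag, makes it easier to see why a compound was filtered out.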
Major Aspects of Chemoinformatics
Experimental data → Model generation → Prediction for unknown compounds
Major Aspects of Chemoinformatics
• Information Acquisition and Management: Methods for collecting data (mainly experimental). Development of databases for storage and retrieval of information.
• Information Use: Data analysis, correlation and model building.
• Information Application: Prediction of molecular properties relevant to chemical and biochemical sciences.
Information Acquisition and Management
Small molecule databases
Growth in PubChem Substances & Compounds
Recent count: Substance: 72,156,631; Compound: 28,807,320; Rule of 5: 20,692,980
[Chart: number of PubChem Substance and Compound records (axis 0 to 20,000,000), May 2005 to September 2007]
Searching in PubChem
Structural representation of molecules
Beyond the Lipinski Rule of 5...
• Chemometrics: The application of mathematical or statistical methods to chemical data (simple, linear methods), e.g. Principal Component Analysis
• Machine Learning: The design and development of algorithms and techniques that allow computers to learn (complex, non-linear algorithms), e.g. Artificial Neural Networks, K-means clustering
Prediction of Solubility, ADME & Toxicity
[Figure: Solid drug → (dissolution) → Drug in solution → (membrane transfer) → Absorbed drug → (liver extraction) → Systemic circulation; the properties predicted along the way are Solubility, Absorption and Metabolism]
Prediction of biological
activity/selectivity
Prediction models at CBS
Virtual screening
• Computational techniques for a rapid assessment of large libraries of chemical structures in order to guide the selection of likely drug candidates.
• Exploit knowledge of the active ligand molecule or the protein target.
Virtual Screening Flavors
[Figure: virtual screening approaches: 1D filters (e.g. Lipinski's Rule of Five), LIGAND-BASED, TARGET-BASED]
Molecular similarity in Chemical Space
• Similar Property Principle: Molecules having similar structures and properties are expected to exhibit similar biological activity. (Not always true!)
• Thus, molecules that are located close together in chemical space are often considered to be functionally related.
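One way to picture "close together in chemical space" is to treat each molecule as a point whose coordinates are descriptor values and to look up the nearest neighbours of a query. The sketch below assumes a hypothetical four-descriptor space (logP, molecular weight / 100, HBD, HBA) with made-up values and plain Euclidean distance; real applications would choose and scale descriptors with more care.

```python
import math

# Hypothetical library: name -> (logP, molecular weight / 100, HBD, HBA).
# All values are invented for illustration.
library = {
    "cmpd_A": (1.2, 1.80, 1, 4),
    "cmpd_B": (1.4, 1.94, 1, 4),
    "cmpd_C": (4.8, 4.50, 0, 7),
    "cmpd_D": (0.9, 1.51, 2, 3),
}


def nearest_neighbours(query, compounds, k=2):
    """Return the names of the k compounds closest to the query point."""
    ranked = sorted(compounds, key=lambda name: math.dist(query, compounds[name]))
    return ranked[:k]


query = (1.3, 1.85, 1, 4)
print(nearest_neighbours(query, library))
```

Under the similar property principle, the compounds returned first would be the ones most likely (but not guaranteed) to share the query's activity.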
Ligand-based VS: Fingerprints
– widely used similarity search tool
– consist of descriptors encoded as bit strings
– bit strings of query and database are compared using a similarity metric such as the Tanimoto coefficient
MACCS fingerprints: 166 structural keys that answer questions of the type:
• Is there a ring of size 4?
• Is at least one F, Br, Cl, or I present?
where the answer is either TRUE (1) or FALSE (0)
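The idea of structural keys can be sketched with a handful of yes/no questions. This is a toy illustration, not the MACCS key set: the five keys, the `KEYS` table and the dictionary-based molecule description are all assumptions made for the example, whereas real keys are evaluated on the full molecular graph.

```python
# Toy structural keys in the spirit of MACCS. A molecule is described here by
# a hypothetical dict holding its element symbols and the sizes of its rings.
HALOGENS = {"F", "Cl", "Br", "I"}

KEYS = [
    ("ring of size 4", lambda m: 4 in m["ring_sizes"]),
    ("ring of size 6", lambda m: 6 in m["ring_sizes"]),
    ("at least one F, Br, Cl, or I", lambda m: bool(HALOGENS & set(m["elements"]))),
    ("at least one N", lambda m: "N" in m["elements"]),
    ("at least one O", lambda m: "O" in m["elements"]),
]


def fingerprint(molecule: dict) -> list[int]:
    """Answer every key with TRUE (1) or FALSE (0) and return the bit string."""
    return [1 if test(molecule) else 0 for _, test in KEYS]


# Chlorobenzene, C6H5Cl: one six-membered ring; carbon, hydrogen and chlorine.
chlorobenzene = {"elements": ["C"] * 6 + ["H"] * 5 + ["Cl"], "ring_sizes": [6]}
print(fingerprint(chlorobenzene))  # [0, 1, 1, 0, 0]
```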
Tanimoto Similarity
$T_c = \frac{c}{a + b - c} = \frac{9}{10 + 9 - 9} = 0.9$
or 90% similarity
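The worked example can be reproduced on bit strings. The sketch reads a and b as the numbers of bits set in the two fingerprints and c as the number of bits set in both, which is the usual convention but is not spelled out on the slide. The 12-bit fingerprints are invented to match the counts above, and returning 0.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a: list[int], fp_b: list[int]) -> float:
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit lists."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = sum(fp_a)                                # bits set in the first fingerprint
    b = sum(fp_b)                                # bits set in the second fingerprint
    c = sum(x & y for x, y in zip(fp_a, fp_b))   # bits set in both
    if a + b - c == 0:
        return 0.0  # assumed convention when neither fingerprint has a bit set
    return c / (a + b - c)


# Made-up fingerprints: 10 bits set in A, 9 bits set in B, 9 bits in common.
fp_a = [1] * 10 + [0] * 2
fp_b = [1] * 9 + [0] * 3
print(tanimoto(fp_a, fp_b))  # 0.9
```

In a similarity search, the same function would be evaluated between the query fingerprint and every database fingerprint, and the database sorted by the resulting coefficient.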
Ligand-based VS: Pharmacophore
Structure-based Virtual Screening: Docking
Binding pocket of target
Library of small compounds
Given a protein and a database of ligands, docking
scores determine which ligands are most likely to bind.
Energy of binding
Binding pocket of target; library of small compounds
Example binding energies: -1 kcal/mol, -10 kcal/mol, +10 kcal/mol, +1 kcal/mol
ΔG = ΔH - TΔS
Contributions: vdW, H-bond, desolvation energy, electrostatic energy, torsional free energy
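To give the example energies a physical meaning, the sketch below evaluates ΔG = ΔH - TΔS and converts a binding free energy into a dissociation constant through the standard relation ΔG = RT ln(Kd) (1 M standard state). That relation is not on the slide and is brought in here as an assumption; the ΔH and ΔS values are invented for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def gibbs_free_energy(delta_h: float, delta_s: float, temperature: float = 298.15) -> float:
    """ΔG = ΔH - TΔS, with ΔH in kcal/mol and ΔS in kcal/(mol*K)."""
    return delta_h - temperature * delta_s


def dissociation_constant(delta_g: float, temperature: float = 298.15) -> float:
    """Kd in mol/L from ΔG = RT ln(Kd); a more negative ΔG gives a smaller Kd."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Made-up enthalpy and entropy of binding.
print(gibbs_free_energy(delta_h=-12.0, delta_s=-0.01, temperature=300.0))

# The four energies shown on the slide.
for delta_g in (-1.0, -10.0, 1.0, 10.0):
    print(f"{delta_g:+.0f} kcal/mol -> Kd = {dissociation_constant(delta_g):.2e} M")
```

Only the negative values correspond to favorable binding; under this relation, -10 kcal/mol falls in the tens-of-nanomolar range, while -1 kcal/mol is far too weak to be useful.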
“Docking” and “Scoring”
• Docking involves the prediction of the binding mode of individual molecules.
– Goal: new ligand orientation closest in geometry to the observed X-ray structure. (Conformations of ligands in complexes often have very similar geometries to minimum-energy conformations of the isolated ligand.)
• Scoring ranks the ligands using some function related to the free energy of association of the two partners, looking at attractive and repulsive regions and taking into account steric and hydrogen bonding interactions.
– Goal: new ligand score closest in value to the docking score of the X-ray structure.
Docking algorithms
• Most exhaustive algorithms:
– Accurate prediction of a binding pose
• Most efficient algorithms:
– Docking of small ligand databases in reasonable time
• Rapid algorithms:
– Virtual high-throughput screening of millions of compounds
Scoring functions
• Molecular mechanics force field-based
Score is estimated by summing the strength of intermolecular van der Waals and electrostatic interactions between all atoms of the ligand-target complex.
– CHARMM, AMBER
• Empirical-based
Based on summing various types of interactions between the two binding partners (hydrogen bonds, hydrophobic, ...).
– ChemScore, GlideScore, AutoDock
• Knowledge-based
Based on statistical observations of intermolecular close contacts from large 3D databases, which are used to derive potentials of mean force.
– PMF, DrugScore
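The force field-based flavor can be sketched as a double loop over ligand and target atoms that adds a van der Waals term and an electrostatic term for each pair. This is a minimal sketch and not the CHARMM or AMBER functional form: it assumes a 12-6 Lennard-Jones potential with one shared, made-up epsilon/sigma pair for every atom type, and a Coulomb term with an arbitrary constant dielectric.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    x: float       # coordinates in Å
    y: float
    z: float
    charge: float  # partial charge in units of the elementary charge


COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2), converts q1*q2/r to kcal/mol


def pair_energy(a: Atom, b: Atom, epsilon: float = 0.1, sigma: float = 3.4,
                dielectric: float = 4.0) -> float:
    """Lennard-Jones plus Coulomb energy (kcal/mol) for one atom pair."""
    r = math.dist(a[:3], b[:3])
    lennard_jones = 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * a.charge * b.charge / (dielectric * r)
    return lennard_jones + coulomb


def force_field_score(ligand: list[Atom], target: list[Atom]) -> float:
    """Sum the pair energies over all ligand-target atom pairs."""
    return sum(pair_energy(l, t) for l in ligand for t in target)


# Made-up coordinates and charges, for illustration only.
ligand = [Atom(0.0, 0.0, 0.0, -0.4)]
target = [Atom(3.8, 0.0, 0.0, 0.4), Atom(0.0, 4.0, 0.0, 0.0)]
print(force_field_score(ligand, target))
```

A lower (more negative) score indicates a more favorable predicted interaction; production force fields use per-atom-type parameters and additional terms.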
Combination of pharmacophore, docking and molecular dynamics (MD) screens
Ligand-based VS
✓ good enrichment of candidate molecules from the screening of large databases with less computational effort
× too coarse to pick up subtle differences induced by small structural variations in the ligands
Structure-based VS
✓ better fit for analyzing smaller sets of compounds, especially in retrospective analysis
✓ include all possible interactions, thus allowing the detection of unexpected binding modes
✓ many options for model refinement
× changing parameters for docking algorithms and scores is demanding
Mutants are being developed:
• pharmacophore methods with information about the target's binding site
• docking programs that incorporate pharmacophore constraints
http://www.vcclab.org/lab/edragon/
Public Web Chemoinformatics Tools
http://pasilla.health.unm.edu/
ChemSpider
www.chemspider.com
Open Babel
http://openbabel.org/wiki/Main_page
D. Vidal et al., Ligand-based Approaches to In Silico Pharmacology, in: Chemoinformatics and Computational Chemical Biology, Ed. J. Bajorath, Springer, 2011
Questions?