A Statistical Analysis of the Linear Interaction Energy Method
Flexible-Protein Docking
Dr Jonathan Essex
School of Chemistry
University of Southampton
Southampton
Programme
• Existing small-molecule docking
– Typical approximations, and outcomes
• Evidence for receptor flexibility, and
consequences
• Methods for accommodating protein flexibility
in docking:
– The ensemble approach
– The induced fit approach
Existing small-molecule docking
• Taylor, R.D. et al. J. Comput. Aided Mol. Des.
16, 151-166 (2002)
• Many docking algorithms (some 127
references in this 2002 review!)
• Most docking algorithms:
– Rigid receptor hypothesis
• Limited receptor flexibility in, for example, GOLD – polar
hydrogens
Existing small-molecule docking
• Most docking algorithms:
– Range of ligand sampling methods
• Pattern matching, GA, MD, MC…
– Treatment of intermolecular forces:
• Simplified scoring functions: empirical, knowledge-based
and molecular mechanics
• Very simple treatment of solvation and entropy, or
completely ignored!
Existing small-molecule docking
• And how well do they work?
– Jones, G. et al. J. Mol. Biol. 267, 727-748 (1997)
– In re-docking studies, achieved a 71 % success
rate
• This is probably typical of most of these
methods
• So what’s missing?
The scoring function
• Existing functions inadequate
– Too simplified, for reasons of computational
expediency
– Solvation and entropy often inadequately treated
• Possible solutions?
– More physics
The rigid receptor hypothesis
• Murray, C.W. et al. J. Comput. Aided Mol.
Des. 13, 547-562 (1999)
– Docking to thrombin, thermolysin, and
neuraminidase
– PRO_LEADS – Tabu search
– In self docking, ligand conformation correctly
identified as the lowest energy structure – 76 %
– For cross-docking – 49 % successful
– Some of the associated protein movements very
small
The rigid receptor hypothesis
• Erickson, J.A. et al. J. Med. Chem. 47, 45-55
(2004)
– Docking of trypsin, thrombin and HIV1-p
– Self-docking, docking to a single structure that is
closest to the average, and docking to apo
structures
– Docking accuracy declines on docking to the
average structure, and is very poor for docking to
apo
– Decline in accuracy correlated with degree of
protein movement
The rigid receptor hypothesis
• Erickson, J.A. et al. J. Med. Chem. 47, 45-55
(2004)
Protein    RMSD / Å (co-complexes)   RMSD / Å (apo)   % self   % average   % apo
trypsin    0.15                      1.6              67       60          37
thrombin   0.31                      1.0              36       27          9
HIV1-p     0.73                      2.0              50       35          4
Models of Protein-Ligand Binding
• Goh, C.-S. et al.
Curr. Opin. Struct.
Biol. 14, 104-109
(2004)
• Review of receptor
flexibility for protein-protein interactions
Models of Protein-Ligand Binding
• This paper classifies protein-protein binding
in terms of these models
• Induced fit assumed if there is no
experimental evidence for a pre-existing
equilibrium of multiple conformations
• Note that strictly this is an artificial distinction
– Statistical mechanics – all states are accessible
with a non-zero probability
– For induced fit, probability of observing bound
conformation without the ligand may be very small
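The statistical-mechanics point above can be illustrated with a two-state Boltzmann model. This is only a minimal sketch: the function name, the free-energy values and the temperature are illustrative assumptions, not figures from the talk.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def bound_like_population(delta_g, temperature=298.15):
    """Fraction of apo protein found in the bound-like conformation.

    delta_g is the free energy of the bound-like state relative to the
    dominant apo state, in kcal/mol (a hypothetical input).
    """
    boltz = math.exp(-delta_g / (R_KCAL * temperature))
    return boltz / (1.0 + boltz)


# The population is never zero, but it falls off quickly with delta_g.
for dg in (0.0, 1.0, 3.0, 6.0):
    print(f"dG = {dg:3.1f} kcal/mol -> population {bound_like_population(dg):.2e}")
```

In this picture "population shift" and "induced fit" differ only in how small the pre-existing population is, which is why the distinction is described as artificial.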
Protein flexibility in drug design
• Teague, S.J. Nat. Rev.
Drug Discov. 2, 527-541
(2003)
• Effect of ligand
binding on free
energy
Protein flexibility in drug design
• Multiple
conformations of a
few residues
– Acetylcholinesterase
• Phe330 flexible –
acts as a
swinging gate
Protein flexibility in drug design
• Movement of a large number of residues
– Acetylcholinesterase (again!)
Protein flexibility in drug design
• Table 1 in Teague paper
lists pharmaceutically
relevant flexible targets
(some 30 systems!)
• Consequences of
protein flexibility for
ligand design
– One site, several ligand
binding modes possible
Protein flexibility in drug design
• Consequences
– Allosteric inhibition
– Binding often remote from active site – NNRTIs
• Proteins in metabolism and transport
– Promiscuous
• Bind many compounds, in many orientations
• E.g. P450cam substrates, camphor versus thiocamphor
(two orientations, different to camphor!)
Experimental evidence for population
shift
• Binding kinetics
– Binding to low-population conformation should
yield slow kinetics – ΔG barrier
– Observed for p38 MAP kinase - mobile loop
• Rates of association vary between 8.5 × 10⁵ and 4.3 ×
10⁷ M⁻¹ s⁻¹, depending on whether conformational change
involved
– Slow kinetics can make experimental comparison
between assays difficult
– Slow kinetics can improve ADME properties!
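As a rough worked example, the spread in association rates quoted for p38 MAP kinase can be converted into an effective difference in barrier height. The sketch assumes simple transition-state (Arrhenius-like) behaviour with equal prefactors and a temperature of 298.15 K; neither assumption comes from the slides.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def barrier_difference(k_fast, k_slow, temperature=298.15):
    """Extra free-energy barrier (kcal/mol) implied by a slower rate.

    Assumes both rates share the same prefactor, so that
    k_fast / k_slow = exp(ddG / RT).
    """
    return R_KCAL * temperature * math.log(k_fast / k_slow)


ddg = barrier_difference(4.3e7, 8.5e5)
print(f"Implied extra barrier: {ddg:.2f} kcal/mol")
```

Under these assumptions a roughly fifty-fold change in rate corresponds to a barrier difference of a little over 2 kcal/mol.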
Experimental evidence for population
shift
Nitrogen Regulatory Protein C (NtrC) plays a central role in the
bacterial metabolism of nitrogen
N-terminal
receiver
domain
DNA
binding
domain
Central catalytic
domain
Protein conformational change
Changing nitrogen levels promote the activity of NtrB kinase
Phosphate
Asp54
NtrB kinase phosphorylates NtrC at aspartate 54
in the receiver domain
Protein conformational change
Phosphorylation promotes conformational change in the
receiver domain
Phosphate
Asp54
Protein conformational change
• NtrC – active and
inactive conformations
apparent
• P-NtrC – protein shifted
towards activated
conformation
• Volkman, B.F. et al.
Science 291, 2429-33
(2001)
Summary
• Protein flexibility important in ligand design
• Two basic mechanisms
– Selection of a binding conformation from a pre-existing ensemble – population shift
– Induced fit – binding to a previously unknown
conformation
– Thermodynamically, these mechanisms are
identical
• Evidence for population shift from binding
kinetics, and protein NMR
Docking methods for
incorporating receptor flexibility
• Ensemble docking
– Docking to individual protein structures, or parts of
protein structures – “ensemble docking”
– Docking to a single average structure – “soft
docking”
• Induced fit modelling
• Carlson, H.A. Curr. Opin. Chem. Biol. 6, 447-452 (2002)
Ensemble docking
• Generate an ensemble of structures, and
dock to them
• Experimentally derived structures
– NMR or X-ray structures
• Computationally derived structures
– Molecular dynamics
– Simulated annealing
– Normal mode propagation
FlexE
• Claussen, H. et al. J. Mol. Biol. 308, 377-395
(2001)
• Extension of the FlexX algorithm:
– Preferred conformations for ligands identified
– Simplified scoring function adopted – based on
hydrogen bonds, ionic interactions etc.
– Break ligand into base fragments by severing
acyclic single bonds
FlexE
• Extension of the FlexX algorithm:
– Base fragments placed in active site by
superposing interaction centres
– Incrementally reconstruct ligand onto base
fragments
– Test each partial solution and continue with the
best for further reconstruction
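The incremental build-up described above (extend each partial solution, score it, continue with the best) can be sketched as a small beam search. This is a toy illustration, not the FlexX implementation: the torsion-angle states, the scoring function and the beam width are all hypothetical.

```python
def incremental_construction(base_placements, fragments, score, beam_width=3):
    """Grow partial solutions one fragment at a time, keeping only the best.

    base_placements: starting partial solutions (tuples of states)
    fragments: one list of candidate states per fragment to be added
    score: callable mapping a partial solution to a number (lower is better)
    """
    beam = sorted(base_placements, key=score)[:beam_width]
    for candidates in fragments:
        # Extend every surviving partial solution with every candidate state.
        extended = [partial + (c,) for partial in beam for c in candidates]
        # Test each partial solution and continue only with the best ones.
        beam = sorted(extended, key=score)[:beam_width]
    return beam[0]


def toy_score(partial):
    """Hypothetical score: prefer angles near 180, penalise equal neighbours."""
    strain = sum(abs(angle - 180) for angle in partial)
    clash = sum(50 for a, b in zip(partial, partial[1:]) if a == b)
    return strain + clash


best = incremental_construction([(60,), (180,)], [[60, 180, 300]] * 3, toy_score)
print(best, toy_score(best))
```

Pruning keeps the cost linear in the number of fragments, at the price that a solution discarded early can never be recovered.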
FlexE
• United protein description
– Use a set of protein structures representing
flexibility, mutations, or alternative protein models
– Assumes that overall shape of the protein and
active site is maintained across the series
– FlexE selects the combination of partial protein
structures that best suit the ligand
– Flexibility given by FlexE is therefore defined by
the protein input structures
FlexE
• United protein description
– Similar parts of the protein structures are merged
– Dissimilar parts of the protein are treated as
separate alternatives
FlexE
• United protein description
– Some combinations of the structural features are
incompatible and not considered
– As the ligand is constructed, the optimum protein
structure is identified
– Combination strategy for the protein may result in
a structure not present in the original data set
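The idea of recombining alternative protein parts while excluding incompatible pairs can be sketched as follows. This is a deliberately naive enumeration under assumed inputs: FlexE itself selects the protein structure during ligand construction rather than enumerating every combination, and the region names, labels and scores here are hypothetical.

```python
from itertools import product


def best_protein_combination(alternatives, incompatible, ligand_score):
    """Return (score, choice) for the best compatible combination of parts.

    alternatives: dict mapping region name -> list of conformer labels
    incompatible: set of frozensets of (region, label) pairs that clash
    ligand_score: callable taking a dict region -> label (lower is better)
    """
    regions = sorted(alternatives)
    best = None
    for labels in product(*(alternatives[r] for r in regions)):
        choice = dict(zip(regions, labels))
        items = set(choice.items())
        if any(pair <= items for pair in incompatible):
            continue  # skip combinations containing clashing parts
        s = ligand_score(choice)
        if best is None or s < best[0]:
            best = (s, choice)
    return best


# Hypothetical per-part contributions to the ligand score.
part_scores = {
    ("loop", "A"): -1.0, ("loop", "B"): -3.0,
    ("Phe", "in"): -2.0, ("Phe", "out"): -0.5,
}
result = best_protein_combination(
    {"loop": ["A", "B"], "Phe": ["in", "out"]},
    {frozenset({("loop", "B"), ("Phe", "in")})},
    lambda choice: sum(part_scores[item] for item in choice.items()),
)
print(result)
```

The winning combination (loop B with Phe out) need not correspond to any single input structure, which mirrors the last point above.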
FlexE
• Evaluation
– 10 proteins, 105 crystal structures
– RMSD < 2.0 Å, within the top ten solutions, 67 % success
– Cross-docking with FlexX gave 63 %
– FlexE faster than cross-docking with FlexX
• Aldose reductase - very flexible active site
– FlexE docking successful (3 ligands)
– Using only one rigid protein structure would not have
worked
Ensemble docking
• Advantages:
– Well-defined computational problem
– Computational cost generally scales linearly with
number of structures (though recombining parts risks a
combinatorial explosion)
– Can use either experimental information, or
structures derived from computation
• Disadvantages:
– What happens if the appropriate bound receptor
conformation is not present in the ensemble?
Soft-Docking
• Knegtel, R.M.A. et al. J. Mol. Biol. 266, 424-440 (1997)
• Build interaction grids within DOCK that
incorporate the effect of more than one
protein structure
• Effectively soften and average the different
structures
Soft-Receptor Modelling
• Österberg, F. et al. Proteins 46, 34-40 (2002)
• Similar approach applied to Autodock grids
– Energy-weighted grid
– Boltzmann-type weighting applied to reduce the
influence of repulsive terms
• Combined grids performed very well – HIV
protease
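A Boltzmann-type weighting of several grids can be sketched point by point as below. The exact scheme used in the cited work is not given on the slide, so the weighting formula, the value of kT and the toy energies are assumptions made for illustration.

```python
import math


def boltzmann_weighted_grid(grids, kt=0.593):
    """Combine per-structure energy grids with Boltzmann-type weights.

    grids: list of equal-length lists of energies (kcal/mol), one per structure
    kt: thermal energy in kcal/mol (about 0.593 near room temperature)
    """
    combined = []
    for energies in zip(*grids):
        e_min = min(energies)  # shift energies for numerical stability
        weights = [math.exp(-(e - e_min) / kt) for e in energies]
        total = sum(weights)
        combined.append(sum(w * e for w, e in zip(weights, energies)) / total)
    return combined


def mean_grid(grids):
    """Plain arithmetic average, for comparison."""
    return [sum(energies) / len(energies) for energies in zip(*grids)]


# Two toy grids: each structure has a steric clash at a different point.
grids = [[-1.0, 50.0, -0.5], [-1.0, -2.0, 30.0]]
print("mean     :", mean_grid(grids))
print("weighted :", boltzmann_weighted_grid(grids))
```

Where one structure clashes and another does not, the weighted grid follows the favourable structure, whereas the plain mean stays strongly repulsive.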
Soft-Receptor Modelling
• Advantages
– Low computational cost – use of single averaged protein
model
– Can use experimental or simulation derived structures
• Disadvantages
– Cope with large-scale motion?
– How reliable is this “averaged” representation?
– Mutually exclusive binding regions could be
simultaneously exploited
– Active sites enlarged
Induced-Fit Docking Methods
• Allow protein conformational change at the
same time as the docking proceeds
• Taking some of these algorithms, in no
particular order…
Induced-Fit Docking Methods
• Molecular dynamics methods:
– Mangoni, R. et al. Proteins 35, 153-162 (1999)
– Separate thermal baths used for protein and
ligand to facilitate sampling
• Multicanonical molecular dynamics:
– Nakajima, N. et al. Chem. Phys. Lett. 278, 297-301 (1997)
– Bias normal molecular dynamics to yield a flat
energy distribution
Induced-Fit Docking Methods
• Monte Carlo methods
– Apostolakis, J. et al. J. Comput. Chem. 19, 21-37
(1998)
• Hybrid Monte Carlo and minimisation method. Poisson-Boltzmann continuum solvation used
– ICM, Abagyan, R. et al. J. Comput. Chem. 15,
488-506 (1994)
• Conventional MC, plus side-chain moves from a rotamer
library
• Minimisation again required
• VS - J. Mol. Biol. 337, 209-225 (2004)
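The combination of conventional Monte Carlo with side-chain moves drawn from a rotamer library can be sketched with a Metropolis loop. This is a toy model only: the energy function, rotamer count and temperature are hypothetical, and the minimisation step mentioned above is left out.

```python
import math
import random


def metropolis_accept(delta_e, kt, rng):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kt)


def monte_carlo_rotamers(energy, n_rotamers, n_sidechains, steps, kt=0.6, seed=0):
    """Sample side-chain rotamer states and return the best state seen."""
    rng = random.Random(seed)
    state = [0] * n_sidechains
    e = energy(state)
    best_state, best_e = list(state), e
    for _ in range(steps):
        # Rotamer move: reassign one randomly chosen side chain.
        trial = list(state)
        trial[rng.randrange(n_sidechains)] = rng.randrange(n_rotamers)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, kt, rng):
            state, e = trial, e_trial
            if e < best_e:
                best_state, best_e = list(state), e
    return best_state, best_e


def toy_energy(state):
    """Hypothetical energy with its minimum when every rotamer index is 2."""
    return sum((r - 2) ** 2 for r in state)


best_state, best_e = monte_carlo_rotamers(toy_energy, n_rotamers=4,
                                          n_sidechains=3, steps=2000)
print(best_state, best_e)
```

Discrete rotamer jumps let a side chain cross barriers that small continuous moves would rarely overcome, which is the motivation for using a library.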
Induced-Fit Docking Methods
• FDS Taylor, R. et al. J. Comput. Chem. 24,
1637-1656 (2003)
• Flexible ligand/flexible protein docking
– large side chain motions, rotamer library
• Solvation included “on the fly”
– continuum solvation model – GB/SA
• Soft-core potential energy function
– anneal the potential to improve sampling
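Annealing a soft-core potential can be illustrated with a Lennard-Jones term whose repulsive core is capped by a coupling parameter. The functional form used in FDS is not given on the slide; the form below is a commonly used soft-core expression, and the parameter values and schedule are assumptions.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """Standard 12-6 Lennard-Jones energy (diverges as r approaches 0)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def soft_core_lj(r, lam, epsilon=0.2, sigma=3.4, alpha=0.5):
    """Soft-core variant: lam = 1 recovers standard LJ, lam < 1 caps the core."""
    denom = alpha * (1.0 - lam) + (r / sigma) ** 6
    return 4.0 * epsilon * lam * (1.0 / denom ** 2 - 1.0 / denom)


# Hypothetical annealing schedule: harden the potential step by step.
for lam in (0.2, 0.5, 0.8, 1.0):
    print(f"lambda = {lam:.1f}: U(2.5 A) = {soft_core_lj(2.5, lam):8.3f} kcal/mol")
```

Early in the schedule overlapping atoms cost only a finite energy, so ligand and side chains can pass through one another; by the end the full repulsion is restored.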
Arabinose Binding Protein
• Rigid protein docking
• Low energy structures are
essentially identical to the X-ray structure
• Dock starting from
experimental result, does not
return to it
Arabinose Binding Protein
• Flexible protein docking
• Experimental structure found
• A number of other structures
are isoenergetic
• Cannot uniquely identify the
experimental structure
Arabinose Binding Protein
• Flexible protein docking
• Most successful structure
with experiment
(transparent)
• Most successful structure,
experiment, and
isoenergetic mode
Monte Carlo Docking
• 15 complexes studied
• Rigid receptor
– 13/15 identified X-ray binding mode
– 8/15 were the unique, lowest energy structures
– 3/15 were part of a cluster of low-energy binding modes
• Flexible receptor
– 11/15 identified X-ray binding mode
– 3/15 were the unique, lowest energy structure
– 6/15 were part of a cluster of low-energy binding modes
FAB Fragment
• Two isoenergetic binding modes
Closest seed
Isoenergetic seed
Conclusion
• Rigid protein docking as successful as other
methods, but much more expensive
• Flexible protein docking does find X-ray
structures, but does not uniquely identify
them
– Refine scoring function?
• Using this methodology, need to consider a
number of structures
• Further validation required
Summary
• Two main approaches for modelling receptor
flexibility
– Use of multiple structures (experimental or
theoretical) either independently, or averaged in
some way – ensemble approach
– Allow the receptor to adopt conformations under
the influence of the ligand – induced fit approach
Summary
• Ensemble is the more widely employed – less
expensive, but limited somewhat by the
composition of the ensemble
• Induced fit should overcome this
disadvantage of ensemble methods
• Induced fit methods can have significant
sampling problems
– not computationally limited
– search space large, and increasing as extra
degrees of freedom added
Flexible protein docking –
a case study
• Wei, B.Q. et al. J. Mol. Biol. 337, 1161-1182
(2004)
• Use experimental structures
• Like FlexE, flexible regions move
independently, and are able to recombine
• Modified version of DOCK used
Flexible protein docking –
a case study
• Receptor decomposed
into three parts
– Green – rigid
– Blue and red – two
flexible parts
• Ligand scored against
each component
• Best-fit protein
conformation
assembled from these
components
Flexible protein docking –
a case study
• Scoring function
– Electrostatic (potential from PB), van der Waals
– Solvation (scaled AMSOL result according to
buried surface area)
• Large ligands favoured for large cavities
– Penalty for forming the larger cavity introduced
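Scoring the ligand against each receptor component separately, with a penalty for opening the larger cavity, can be sketched as below. All names and numbers are hypothetical; the sketch only shows how a best-fit receptor could be assembled from independently scored parts and is not the modified DOCK code.

```python
def assemble_best_receptor(rigid_score, flexible_scores, cavity_penalty):
    """Assemble the best-fit receptor from independently scored components.

    rigid_score: ligand score against the rigid part
    flexible_scores: dict part -> dict conformation -> ligand score
    cavity_penalty: dict (part, conformation) -> cost of that conformation
    Lower totals are better.
    """
    total = rigid_score
    chosen = {}
    for part, scores in flexible_scores.items():
        # Pick the conformation with the best penalised score for this part.
        conf, penalised = min(
            ((c, s + cavity_penalty.get((part, c), 0.0)) for c, s in scores.items()),
            key=lambda item: item[1],
        )
        chosen[part] = conf
        total += penalised
    return total, chosen


result = assemble_best_receptor(
    rigid_score=-10.0,
    flexible_scores={
        "blue": {"closed": -3.0, "open": -5.0},
        "red": {"closed": -2.0, "open": -2.5},
    },
    cavity_penalty={("blue", "open"): 1.0, ("red", "open"): 1.0},
)
print(result)
```

Because the flexible parts are scored independently, the cost grows with the sum of their conformations rather than the product, and the penalty stops large ligands from always selecting the largest cavity.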
Flexible protein docking –
a case study
• In screening, enrichment improved compared
to docking against individual conformations
• ACD screened against L99A M102Q mutant
of T4L
– 18 compounds that were predicted to bind and
change cavity conformation, tested
– 14 found to bind
– X-ray structures obtained on 7
Flexible protein docking –
a case study
• Predicted ligand geometries reproduced (<
0.7 Å)
• In five structures, part of observed cavity
changes reproduced
• In two structures, receptor conformations not
part of original data set, and therefore not
reproduced!
Flexible protein docking –
a case study
• New ligands found by flexible receptor
docking
• Receptor conformational energy needs to be
considered
Conclusion
• Rigid receptor approximation not universal
• Two main approaches to modelling receptor
flexibility
– Ensemble
– Induced fit
• Further validation of these methods needed
Acknowledgements
• Flexible Docking
– Richard Taylor, Phil Jewsbury, Astra Zeneca
• Practical
– Donna Goreham, Sebastien Foucher