FLEX* - REVIEW

Agenda
 Introduction
 Main concepts
 FlexE
 FlexS
 Evaluation and Discussion
Introduction
 Methods for the prediction of binding
properties of molecules to proteins.
 Classification by the amount of
information available about the target
protein
The general schema
 Modeling: ligand conformational flexibility, receptor-ligand interactions, scoring function
 Algorithm: base selection, base placement, incremental construction
The Ligand conformational
flexibility
 Approximated by a discrete set of
conformations.
 Rotatable single bond - modeled by a discrete set of preferred torsion angles from the MIMUMBA DB.
 Ring system - A set of ring conformations
is computed with the program CORINA.
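As a minimal sketch of this discrete model, the conformations of a chain can be enumerated as the Cartesian product of the torsion angle sets of its rotatable bonds. The angle values and the function name `enumerate_conformations` are illustrative assumptions; they are not taken from the MIMUMBA database.

```python
from itertools import product

def enumerate_conformations(torsion_sets):
    """Return every combination of one torsion angle per rotatable bond.

    torsion_sets: list of lists; entry b holds the allowed angles (degrees)
    for rotatable bond b. The values are placeholders, not MIMUMBA data.
    """
    return [tuple(combo) for combo in product(*torsion_sets)]

# Hypothetical ligand with two rotatable bonds.
torsion_sets = [[-60, 60, 180], [0, 180]]
conformations = enumerate_conformations(torsion_sets)
print(len(conformations))  # 3 * 2 combinations
```

The count grows multiplicatively with every additional bond, which is one reason to build the ligand fragment by fragment instead of enumerating it in full.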
The model of receptor-ligand
interactions
 Modeled by a few special types of interactions:
 hydrogen bonds
 metal-acceptor bonds
 hydrophobic contacts
The model of protein-ligand
interactions – Cont.
 To each interaction group, we assign:
 Interaction types
 Interaction geometry (center + surface)
 Two groups interact if:
 The centers of the groups lie approximately on the surface of the counter group.
 The interaction types are compatible.
 The intermolecular interactions can be classified by the strength of their geometric constraints.
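The interaction condition can be sketched as follows, assuming for simplicity that every interaction surface is a full sphere around the interaction center. The type labels, the compatibility table, the radii and the tolerance are all hypothetical; the real surfaces may be more restricted shapes.

```python
import math
from dataclasses import dataclass

@dataclass
class InteractionGroup:
    kind: str        # e.g. "h_donor", "h_acceptor" (labels are assumptions)
    center: tuple    # interaction center (x, y, z) in angstroms
    radius: float    # radius of the spherical interaction surface

# Hypothetical compatibility table between interaction types.
COMPATIBLE = {("h_donor", "h_acceptor"), ("h_acceptor", "h_donor"),
              ("hydrophobic", "hydrophobic")}

def on_surface(point, group, tol):
    """True if point lies within tol of the sphere around group.center."""
    return abs(math.dist(point, group.center) - group.radius) <= tol

def interacts(a, b, tol=0.3):
    """Types must be compatible and each center must lie near the other's surface."""
    return ((a.kind, b.kind) in COMPATIBLE
            and on_surface(a.center, b, tol)
            and on_surface(b.center, a, tol))

donor = InteractionGroup("h_donor", (0.0, 0.0, 0.0), 1.9)
acceptor = InteractionGroup("h_acceptor", (1.9, 0.0, 0.0), 1.9)
far_acceptor = InteractionGroup("h_acceptor", (4.0, 0.0, 0.0), 1.9)
```

A smaller tolerance corresponds to a stronger geometric constraint on the interaction.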
Scoring function
 Estimates the free energy of binding of the complex
 match score
 contact score
 The function is additive in the ligand atoms.
Overall docking algorithm
1. Ligand fragmentation
2. Select and place a set of base fragments
3. Construct the ligand by linking the remaining fragments.
Ligand fragmentation
 The ligand is decomposed into components by cutting at each acyclic bond.
 A fragmentation is a partition of the components of the molecule such that every part, called a fragment, is connected in the component tree.
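A minimal sketch of the decomposition step, assuming the ligand is given as a heavy-atom graph (atom indices plus bond pairs). An acyclic bond is treated here as a bond whose removal disconnects the graph. Cutting literally every such bond also splits off terminal atoms, so a practical implementation would likely restrict the cuts further; the slide does not give those details.

```python
def acyclic_bonds(bonds):
    """Return the bonds that are not part of any ring (graph bridges)."""
    def reachable(start, skipped):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for a, b in bonds:
                if (a, b) == skipped:
                    continue
                v = b if a == u else a if b == u else None
                if v is not None and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen
    # A bond is acyclic if its end atoms fall apart once it is removed.
    return [(a, b) for a, b in bonds if b not in reachable(a, (a, b))]

def components(n_atoms, bonds):
    """Cut every acyclic bond and return the remaining connected pieces."""
    cut = set(acyclic_bonds(bonds))
    parent = list(range(n_atoms))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in bonds:
        if (a, b) not in cut:
            parent[find(a)] = find(b)
    groups = {}
    for atom in range(n_atoms):
        groups.setdefault(find(atom), []).append(atom)
    return sorted(groups.values())

# Two six-membered rings (atoms 0-5 and 6-11) joined by the bond (0, 6).
ring_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
ring_b = [(6, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 6)]
bonds = ring_a + ring_b + [(0, 6)]
parts = components(12, bonds)
```

Each resulting atom group is one node of the component tree, and the cut bonds are its edges.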
Ligand fragmentation – Cont.
 Good results are produced if the added
fragments are small
 Every fragment, except for the base fragment, consists of only one component.
Selecting a base fragment
 The problem: find a fragment which leads to a low-energy docking solution.
 Good base fragment properties:
 Placeability
 Specificity
Selecting a base fragment – Cont.
 We look for fragments maximizing the
function:
Rules for selecting a set of
fragments
 No base fragment is fully contained in
another base fragment
 Each component occurs in at most two
base fragments
 Each component in a base fragment
must be either necessary for the
connectivity of the fragment or it must
have interaction centers.
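The first two rules are simple set conditions and can be sketched directly. Base fragments are represented here as sets of component ids, which is an assumed encoding. The third rule needs the component tree and the interaction centers and is left out of this sketch.

```python
from collections import Counter

def valid_base_set(base_fragments):
    """Check the first two selection rules on a list of component-id sets."""
    frags = [frozenset(f) for f in base_fragments]
    # Rule 1: no base fragment is fully contained in another one.
    for i, f in enumerate(frags):
        for j, g in enumerate(frags):
            if i != j and f <= g:
                return False
    # Rule 2: each component occurs in at most two base fragments.
    usage = Counter(c for f in frags for c in f)
    return all(count <= 2 for count in usage.values())

ok = valid_base_set([{1, 2}, {2, 3}])
nested = valid_base_set([{1, 2}, {1, 2, 3}])
overused = valid_base_set([{1, 2}, {1, 3}, {1, 4}])
```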
The base placement algorithm
 Goal: find positions of the base fragment in the active site such that a sufficient number of favorable interactions between the fragment and the protein can occur simultaneously.
 Solution: pose clustering.
The base placement algorithm – Cont.
 Preparation: store all triangles of interaction points (IPs) of the protein in a hash table.
 Find all compatible triangles of fragment IPs.
 Cluster the legal transformations.
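The hashing step can be sketched as below, assuming triangles are keyed by their sorted, binned side lengths. The bin width and all function names are illustrative assumptions. The sketch ignores interaction types, does not compute the superposing transformation, and omits the final clustering of transformations.

```python
import math
from collections import defaultdict
from itertools import combinations

def triangle_key(p, q, r, bin_width):
    """Sorted, binned side lengths: invariant under rotation and translation."""
    sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
    return tuple(round(s / bin_width) for s in sides)

def build_table(protein_points, bin_width=0.5):
    """Hash every triangle of protein interaction points by its key."""
    table = defaultdict(list)
    for idx in combinations(range(len(protein_points)), 3):
        p, q, r = (protein_points[i] for i in idx)
        table[triangle_key(p, q, r, bin_width)].append(idx)
    return table

def compatible_triangles(table, fragment_triangle, bin_width=0.5):
    """Look up protein triangles with approximately the same side lengths."""
    return table.get(triangle_key(*fragment_triangle, bin_width), [])

# Hypothetical interaction points; the first three form a 3-4-5 triangle.
protein_points = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (10, 10, 10)]
table = build_table(protein_points)
fragment_triangle = [(1, 1, 1), (4, 1, 1), (1, 5, 1)]
matches = compatible_triangles(table, fragment_triangle)
```

Side lengths that fall close to a bin boundary can be missed by an exact key lookup, so a more careful version would also probe the neighboring bins.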
The incremental construction
algorithm
 Input: solution set - a set of partial placements with the ligand constructed up to and including fragment i-1
 Output: a set of partial placements with the ligand constructed up to and including fragment i
The complex construction
algorithm – cont.
 Adding the next fragment in all the possible
conformations
 Reject extended placements that have strong
overlap with the receptor or internal overlap
with the ligand.
 Searching for new interactions
 Optimizing the positions of the partial ligand
 Selecting a new solution set
 Clustering the solution set
Optimizing the positions of the
partial ligand
 The placement is optimized when:
 New interactions are found.
 The placement contains slightly overlapping atoms between the receptor and the ligand.
$\sum_i w_i (l_i - r_i)^2$
Selecting a new solution set
 Select the k best-scoring solutions
 Problem: the scoring values cannot be
compared directly when different
fragments are involved.
 Solution: estimate the score of the
whole ligand, given a partial placement.
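One way to make partial placements comparable is sketched below: add a per-fragment estimate for everything not yet placed, then keep the k best totals. The estimates, the dictionary layout and the convention that lower scores are better (as for an energy) are assumptions of this sketch, not details from the slides.

```python
import heapq

def estimated_total(partial_score, placed, fragment_estimates):
    """Partial score plus an estimate for every fragment not yet placed."""
    remaining = sum(est for frag, est in fragment_estimates.items()
                    if frag not in placed)
    return partial_score + remaining

def select_solution_set(placements, fragment_estimates, k):
    """Keep the k placements with the lowest estimated total score."""
    return heapq.nsmallest(
        k, placements,
        key=lambda p: estimated_total(p["score"], p["placed"], fragment_estimates))

# Hypothetical per-fragment score estimates.
fragment_estimates = {"A": -5.0, "B": -3.0, "C": -2.0}
placements = [
    {"name": "p1", "score": -6.0, "placed": {"A"}},
    {"name": "p2", "score": -8.0, "placed": {"A", "B"}},
    {"name": "p3", "score": -4.0, "placed": {"A"}},
]
selected = [p["name"] for p in select_solution_set(placements, fragment_estimates, 2)]
```

Here p2 has the better raw score only because it already contains fragment B; with the estimate added, p1 ranks ahead of it.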
Clustering partial solutions
 If neither placement contains the other, the distance is infinite.
 Otherwise, the distance is defined to be
the RMSD of the intersecting atoms.
 A cluster is reduced to a single
placement.
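The distance and the reduction of clusters can be sketched as follows. Placements are modeled as dictionaries from atom id to coordinates, and the greedy scheme that keeps the first (best-scoring) member of each cluster is an assumption; the slides only state that a cluster is reduced to a single placement.

```python
import math

def placement_distance(a, b):
    """a, b: dicts mapping atom id -> (x, y, z) for the atoms placed so far."""
    if not (a.keys() <= b.keys() or b.keys() <= a.keys()):
        return math.inf  # neither placement contains the other
    shared = a.keys() & b.keys()
    if not shared:
        return math.inf
    sq = sum(math.dist(a[i], b[i]) ** 2 for i in shared)
    return math.sqrt(sq / len(shared))  # RMSD over the intersecting atoms

def cluster(placements, threshold):
    """Greedy reduction; placements must be sorted best score first."""
    representatives = []
    for p in placements:
        if all(placement_distance(p, r) > threshold for r in representatives):
            representatives.append(p)
    return representatives

p1 = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
p2 = {1: (0.0, 0.0, 0.1), 2: (1.0, 0.0, 0.1), 3: (2.0, 0.0, 0.0)}
p3 = {1: (0.0, 0.0, 0.0), 4: (5.0, 5.0, 5.0)}
kept = cluster([p1, p2, p3], threshold=0.5)
```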
Protein flexibility - motivation
 Induced fit – side chain or even backbone
adjustments upon docking of different ligands
to the same protein.
 Even small conformational changes are critical for docking applications, e.g. if a rotatable bond prevents a ligand from binding in the correct position.
Protein flexibility
 Main idea: describe the protein structure variations with a set of protein structures representing the flexibility, mutations or alternative models of a protein.
 The variability considered by FlexE is defined by the differences within the given input structures.
United protein description
 A data structure that manages the protein structure variations.
 Contains an ensemble of up to 30 possible conformations of the protein.
 Most of them are low energy
conformations of the same protein.
United protein description construction
 Superposition
 Clustering
Notation
 Component: all the atoms which belong to the same amino acid or to a mutation of that amino acid. Contains a backbone part and a side-chain part.
 Part: a set of instances.
 Instance: one of the alternative conformations.
United protein description clustering
 The superimposed structures are
combined by clustering each part
separately
 Complete-linkage hierarchical clustering
 The clustered instances can be
recombined to form new valid protein
structures.
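A minimal sketch of complete-linkage clustering for the instances of one part, assuming a precomputed symmetric distance matrix (for example pairwise RMSD values) and a merge threshold. Both the matrix values and the threshold are hypothetical.

```python
def complete_linkage(dist, threshold):
    """Agglomerative clustering; dist is a symmetric matrix (list of lists).

    Two clusters are merged only while the largest pairwise distance
    between their members stays at or below threshold.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(c1, c2):
        return max(dist[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        d, a, b = min(pairs)
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical distances between four conformations of one side chain.
dist = [[0.0, 0.2, 1.5, 1.6],
        [0.2, 0.0, 1.4, 1.7],
        [1.5, 1.4, 0.0, 0.3],
        [1.6, 1.7, 0.3, 0.0]]
instances = complete_linkage(dist, threshold=0.5)
```

Each resulting cluster would then be represented by a single instance of the part.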
Incompatibility
 Two instances of the united protein description are incompatible if they cannot be realized simultaneously.
 Logical: two instances are alternatives to each other
 Geometric: two logically compatible instances overlap
 Structural: two instances of the same chain are unconnected
Incompatibility graph
V  ins tances
E
e v andv incom patiable 
ij
i
j
Incompatibility graph – Cont.
 The incompatibility is internally represented as a graph, using the instances as nodes and connecting each pair of incompatible nodes by an edge.
 Valid protein structures correspond to independent sets in the graph.
Selection of instances
 The ligand is placed fragment by
fragment into the active site by the
incremental construction algorithm.
 After each construction step, all
possible interactions are determined.
 Apply the scoring function for each
instance.
 We choose the independent set (IS) with the highest score.
Select the optimal IS
 The IS can be assembled from the ISs of the connected components.
 Apply a modified version of the Bron-Kerbosch algorithm.
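The selection can be sketched with the basic Bron-Kerbosch algorithm (without pivoting), run on the complement of the incompatibility graph so that every maximal clique found is a maximal independent set of the incompatibility graph. This is not the modified version mentioned above, and it does not split the graph into connected components. The instance names and scores are hypothetical, and scores are assumed to be non-negative with higher meaning better.

```python
def best_independent_set(scores, incompatible):
    """scores: dict instance -> score; incompatible: set of frozenset pairs."""
    nodes = list(scores)
    # Compatibility graph = complement of the incompatibility graph.
    compat = {v: {u for u in nodes
                  if u != v and frozenset((u, v)) not in incompatible}
              for v in nodes}
    best = (float("-inf"), set())

    def expand(r, p, x):
        nonlocal best
        if not p and not x:  # r is a maximal clique of the compatibility graph
            total = sum(scores[v] for v in r)
            if total > best[0]:
                best = (total, set(r))
            return
        for v in list(p):
            expand(r | {v}, p & compat[v], x & compat[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return best

# Two side chains with two alternative instances each; a1 and b2 overlap.
scores = {"a1": 3.0, "a2": 1.0, "b1": 2.0, "b2": 4.5}
incompatible = {frozenset(("a1", "a2")), frozenset(("b1", "b2")),
                frozenset(("a1", "b2"))}
best_score, best_set = best_independent_set(scores, incompatible)
```

Restricting the search to maximal sets is enough under the non-negative score assumption, because adding a compatible instance can never lower the total.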
Evaluation
 FlexE was evaluated with ten protein structure ensembles containing 105 crystal structures from the PDB.
 The structures within each ensemble have:
 highly similar backbone traces
 different conformations for several side chains.
Evaluation – Cont.
 FlexE finds a ligand position with an RMSD below 2 Å in 67% of the cases.
 Average CPU time for the incremental
construction algorithm is 5.5 minutes.
Discussion
 The ensemble approach is able to cope with several side-chain conformations and even movements of loops.
 Motions of larger backbone segments or even domain movements are not covered by this approach.
FlexS - motivation
 In drug design, structural information about a particular receptor is often not available.
 A considerable number of different ligands are known, together with their binding affinities towards the receptor.
FlexS - overview
 A method for structurally superimposing pairs of ligands, approximating their putative binding site geometry.
 Main applications:
 Ligand superpositioning
 Virtual screening
Implementation in FlexS
 RigFit – fast rigid-body placement using
Fourier space methods.
 Incremental construction
 Systematic parameter study
Two Base Placement Methods
 Target: Place a rigid molecule fragment
onto the reference ligand
 Combinatorial placement procedure
 Numerical placement procedure
RigFit
 Optimizes the common volume of two molecules, expressed by Gaussian functions associated with different physicochemical properties.
 Solves the combinatorial placement
problem.
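The quantity being optimized can be illustrated with the closed-form overlap integral of two spherical Gaussians. This sketch evaluates the overlap directly in real space for a single property; it is not the Fourier-space optimization used by RigFit, and the exponents and coordinates are placeholder values.

```python
import math

def gaussian_overlap(center_a, alpha, center_b, beta):
    """Overlap integral over all space of exp(-alpha*|r - A|^2) and
    exp(-beta*|r - B|^2), using the Gaussian product theorem."""
    d2 = math.dist(center_a, center_b) ** 2
    s = alpha + beta
    return (math.pi / s) ** 1.5 * math.exp(-alpha * beta / s * d2)

def molecule_overlap(mol_a, mol_b):
    """Sum pairwise overlaps; each molecule is a list of (center, exponent)."""
    return sum(gaussian_overlap(ca, a, cb, b)
               for ca, a in mol_a for cb, b in mol_b)

# Hypothetical two-atom molecules; test_shifted is test_ligand moved by 3 angstroms.
reference = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0)]
test_ligand = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0)]
test_shifted = [((3.0, 0.0, 0.0), 1.0), ((4.5, 0.0, 0.0), 1.0)]
aligned = molecule_overlap(reference, test_ligand)
shifted = molecule_overlap(reference, test_shifted)
```

A rigid-body placement would search over rotations and translations of the test molecule for the pose with the largest overlap.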
Variable Sequence Construction
 The sequence in
which fragments are
added is selected
dynamically
depending on the
actual placement.
 Effective in cases
where the flexible
test ligand partially
extends beyond the
reference ligand.
Dynamic selection of the next fragment
 Each partial placement is associated with a list of candidate fragments.
 Evaluation of the next fragment considers:
 The amount of expected overlap with the reference
 The number of potential interactions in the candidate fragment
 The size of the substructure tree rooted at the candidate fragment
Dynamic selection of the next fragment – Cont.
 Nbus: number of buildup states.
 Deviation from the original sequence only if a better sequence is found.
 If FlexS exceeds the upper limit on Nbus, it returns to the original sequence.
Evaluation
 The performance of the algorithm
depends on the size of the
superimposed ligands.
 In the reproduction of 284 alignments, 60% are reproduced with RMSD below A.
Questions?
Thank you!
Scoring & Selection strategy
 Total score of the partial ligand
FlexS Flow
Test ligand
fragmentation
Placement of the anchor molecule
Reference ligand
The physicochemical model
 The conformational space of the ligand
 The model of protein-ligand interactions
 Scoring function
United protein description superposition
 Assumption: highly similar backbone
traces -> superposition by fitting the
backbone atoms of the particular
structures.
 This procedure emphasizes the differences and improves the fitting in conserved regions of the structures.
Surface and interaction
geometries