FLEX* - REVIEW
Agenda
Introduction
Main concepts
FlexE
FlexS
Evaluation and Discussion
Introduction
Methods for the prediction of binding
properties of molecules to proteins.
Classification by the amount of
information available about the target
protein
The general schema
Ligand conformational flexibility
Modeling
Receptor-ligand interactions
Scoring function
Base selection
Algorithm
Base placement
Incremental construction
The Ligand conformational
flexibility
Approximated by a discrete set of
conformations.
Rotatable single bond - modeled by a
discrete set of preferred torsion angles
from the MIMUMBA DB.
Ring system - A set of ring conformations
is computed with the program CORINA.
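A minimal sketch of how a discrete conformation set follows from per-bond torsion sets. The bond names and angle values are hypothetical placeholders, not entries taken from MIMUMBA.

```python
from itertools import product

# Hypothetical preferred torsion angles (degrees) per rotatable bond.
# Real values would come from a torsion library such as MIMUMBA.
preferred_torsions = {
    "bond_a": [60, 180, 300],
    "bond_b": [0, 180],
}

def enumerate_conformations(torsions):
    """Yield one {bond: angle} dict per combination of preferred angles."""
    bonds = sorted(torsions)
    for angles in product(*(torsions[b] for b in bonds)):
        yield dict(zip(bonds, angles))

conformations = list(enumerate_conformations(preferred_torsions))
print(len(conformations))  # 3 * 2 combinations
```

The number of conformations grows multiplicatively with the number of rotatable bonds, which is one reason to build the ligand fragment by fragment instead of enumerating everything up front.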
The model of receptor-ligand
interactions
Modeled by a few special types of
interactions
hydrogen bonds
metal-acceptor bonds
hydrophobic contacts
The model of protein-ligand
interactions – Cont.
To each interaction group, we assign:
Interaction type
Interaction geometry ( center + surface)
Two groups interact if :
The centers of the groups lie approximately on the
surface of the counter group.
The interaction types are compatible
The intermolecular interactions can be classified by
the strength of their geometric constraints
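The two interaction conditions can be sketched as a simple test. This is an illustration under assumptions: the interaction surface is taken to be a sphere of fixed radius around the center, and the compatibility table, the tolerance and all names are hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical table of compatible interaction-type pairs.
COMPATIBLE = {
    ("h_donor", "h_acceptor"), ("h_acceptor", "h_donor"),
    ("metal", "metal_acceptor"), ("metal_acceptor", "metal"),
    ("hydrophobic", "hydrophobic"),
}

@dataclass
class InteractionGroup:
    kind: str
    center: tuple   # (x, y, z)
    radius: float   # assumed spherical interaction surface

def interacts(a, b, tol=0.4):
    """True if the types are compatible and each center lies
    approximately (within tol) on the surface of the counter group."""
    if (a.kind, b.kind) not in COMPATIBLE:
        return False
    d = math.dist(a.center, b.center)
    return abs(d - a.radius) <= tol and abs(d - b.radius) <= tol

donor = InteractionGroup("h_donor", (0.0, 0.0, 0.0), 1.9)
acceptor = InteractionGroup("h_acceptor", (1.9, 0.0, 0.0), 1.9)
far_acceptor = InteractionGroup("h_acceptor", (5.0, 0.0, 0.0), 1.9)
print(interacts(donor, acceptor), interacts(donor, far_acceptor))
```

A tighter tolerance would model interactions with strong geometric constraints, a looser one those with weak constraints.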
Scoring function
Estimates the free binding energy in the
complex
match score
contact score
The function is additive in the ligand atoms.
Overall docking algorithm
1. Ligand fragmentation
2. Select & Place a set of base
fragments
3. Construct the ligand by linking the
remaining fragments.
Ligand fragmentation
The ligand is decomposed into
components by cutting at each acyclic
bond.
Fragmentation is a partition of the
components of the molecule, such that
every part, called fragment, is
connected in the component tree.
Ligand fragmentation
Good results are produced if the added
fragments are small
Every fragment, except for the base fragment,
consists of only one component.
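The decomposition into components can be sketched on a plain bond graph. This is a simplification: the example uses a heavy-atom graph with made-up atom labels, and it cuts at every acyclic bond exactly as stated above, so terminal atoms end up as components of their own.

```python
from collections import defaultdict

def reachable(adj, start, goal, skipped_bond):
    """Depth-first search from start to goal that ignores one bond."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        if u == goal:
            return True
        for v in adj[u]:
            if {u, v} != skipped_bond and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def split_into_components(atoms, bonds):
    adj = defaultdict(set)
    for u, v in bonds:
        adj[u].add(v)
        adj[v].add(u)
    # A bond is cyclic if its endpoints stay connected without it;
    # only cyclic bonds are kept, all acyclic bonds are cut.
    ring_adj = defaultdict(set)
    for u, v in bonds:
        if reachable(adj, u, v, {u, v}):
            ring_adj[u].add(v)
            ring_adj[v].add(u)
    components, assigned = [], set()
    for atom in atoms:
        if atom in assigned:
            continue
        comp, stack = set(), [atom]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(ring_adj[u] - comp)
        assigned |= comp
        components.append(sorted(comp))
    return components

# Hypothetical ligand: a three-membered ring with a two-atom chain.
atoms = ["C1", "C2", "C3", "C4", "O5"]
bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "C1"), ("C3", "C4"), ("C4", "O5")]
components = split_into_components(atoms, bonds)
print(components)
```

The cut bonds define the edges of the component tree; a fragment is then any set of components that is connected in that tree.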
Selecting a base fragment
The problem: Find a fragment which
leads to a low-energy docking solution.
Good base fragment properties:
Placeability
Specificity
Selecting a base fragment –Cont.
We look for fragments maximizing the
function:
Rules for selecting a set of
fragments
No base fragment is fully contained in
another base fragment
Each component occurs in at most two
base fragments
Each component in a base fragment
must be either necessary for the
connectivity of the fragment or it must
have interaction centers.
The base placement algorithm
Goal: find positions of the base
fragment in the active site such that
a sufficient number of favorable
interactions between the fragment and
the protein can occur simultaneously.
Solution: pose clustering.
The base placement algorithm
– Cont.
Preparation: Store all triangles of
interaction points (IP) of the protein in a
hash table.
Find all compatible triangles of
fragment IPs.
Clustering of the legal transformations
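The preparation and lookup steps can be sketched as follows. The key layout, the bin width and the type names are assumptions for illustration; a real implementation would also query neighboring bins and compute and cluster the transformation that superposes each matched pair of triangles.

```python
import math
from collections import defaultdict
from itertools import combinations, permutations

# Hypothetical mapping from a fragment IP type to the protein IP type it needs.
COUNTER = {"donor": "acceptor", "acceptor": "donor", "hydrophobic": "hydrophobic"}

def triangle_key(points, bin_width=1.0):
    """Key = the three IP types plus the three binned side lengths."""
    (ka, a), (kb, b), (kc, c) = points
    sides = (math.dist(a, b), math.dist(b, c), math.dist(c, a))
    return (ka, kb, kc) + tuple(int(s // bin_width) for s in sides)

def build_table(protein_ips):
    """Store every ordered triangle of protein IPs in a hash table."""
    table = defaultdict(list)
    for tri in combinations(protein_ips, 3):
        for order in permutations(tri):
            table[triangle_key(order)].append(order)
    return table

def compatible_triangles(table, fragment_triangle):
    """Look up protein triangles with counter types and similar side lengths."""
    query = [(COUNTER[kind], xyz) for kind, xyz in fragment_triangle]
    return table.get(triangle_key(query), [])

protein_ips = [
    ("acceptor", (0.0, 0.0, 0.0)),
    ("donor", (3.2, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.3, 0.0)),
    ("donor", (6.0, 6.0, 6.0)),
]
fragment_triangle = [
    ("donor", (10.0, 10.0, 10.0)),
    ("acceptor", (13.2, 10.0, 10.0)),
    ("hydrophobic", (10.0, 14.3, 10.0)),
]
table = build_table(protein_ips)
hits = compatible_triangles(table, fragment_triangle)
print(len(hits))
```

Each hit corresponds to one candidate transformation of the fragment into the active site; pose clustering then groups similar transformations.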
The incremental construction
algorithm
Input: solution set - set of partial
placements with the ligand
constructed up to and including
fragment i-1
Output: set of partial placements with
the ligand constructed
up to and including fragment i
The complex construction
algorithm – cont.
Adding the next fragment in all the possible
conformations
Reject extended placements that have strong
overlap with the receptor or internal overlap
with the ligand.
Searching for new interactions
Optimizing the positions of the partial ligand
Selecting a new solution set
Clustering the solution set
Optimizing the positions of the
partial ligand
The placement is optimized when:
New interactions are found.
The placement contains slightly
overlapping atoms between the receptor
and the ligand.
$\sum_i w_i \, (l_i - r_i)^2$
Selecting a new solution set
Select the k best-scoring solutions
Problem: the scoring values cannot be
compared directly when different
fragments are involved.
Solution: estimate the score of the
whole ligand, given a partial placement.
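A sketch of selecting the k best partial placements by an estimated whole-ligand score. The estimation rule used here (partial score plus a best-case contribution for every fragment not yet placed) is an assumption for illustration, as are all names and numbers; lower scores are taken to be better, as for an energy.

```python
import heapq

def estimated_total(placement, best_case):
    """Partial score plus an optimistic estimate for the unplaced fragments."""
    remaining = sum(score for frag, score in best_case.items()
                    if frag not in placement["placed"])
    return placement["score"] + remaining

def select_k_best(placements, best_case, k):
    """Keep the k placements with the lowest estimated whole-ligand score."""
    return heapq.nsmallest(k, placements,
                           key=lambda p: estimated_total(p, best_case))

# Hypothetical best-case score contribution of each fragment.
best_case = {"f1": -5.0, "f2": -3.0, "f3": -4.0}
placements = [
    {"name": "A", "score": -6.0, "placed": {"f1"}},
    {"name": "B", "score": -8.0, "placed": {"f1", "f2"}},
    {"name": "C", "score": -2.0, "placed": {"f1"}},
]
selected = [p["name"] for p in select_k_best(placements, best_case, k=2)]
print(selected)
```

By raw score B would rank first, but after estimating the remaining fragments A ranks ahead of it, which is the effect the estimate is meant to capture.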
Clustering partial solutions
If neither placement contains the other, the
distance is infinity
Otherwise, the distance is defined to be
the RMSD of the intersecting atoms.
A cluster is reduced to a single
placement.
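The distance between two partial placements can be sketched directly from this definition. Placements are represented here as hypothetical dictionaries mapping atom ids to coordinates.

```python
import math

def placement_distance(p, q):
    """Infinity unless one placement contains the other;
    otherwise the RMSD over the atoms both placements share."""
    atoms_p, atoms_q = set(p), set(q)
    if not (atoms_p <= atoms_q or atoms_q <= atoms_p):
        return math.inf
    shared = atoms_p & atoms_q
    if not shared:
        return math.inf
    squared = sum(math.dist(p[i], q[i]) ** 2 for i in shared)
    return math.sqrt(squared / len(shared))

small = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
large = {1: (0.0, 0.0, 1.0), 2: (1.0, 0.0, 1.0), 3: (2.0, 0.0, 0.0)}
other = {1: (0.0, 0.0, 0.0), 4: (3.0, 0.0, 0.0)}
print(placement_distance(small, large), placement_distance(small, other))
```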
Protein flexibility - motivation
Induced fit – side chain or even backbone
adjustments upon docking of different ligands
to the same protein.
Even small conformational changes are
critical for docking applications, e.g. if a
rotatable bond prevents a ligand from binding in
the correct position.
Protein flexibility
Main idea: describe the protein structure
variations with a set of protein structures
representing the flexibility, mutation or
alternative models of a protein.
The variability considered by FlexE is defined by
the differences within the given input structures.
United protein description
Data structure that
administers the protein
structure variations.
Contains an ensemble of up to 30 possible
conformations of the protein.
Most of them are low-energy
conformations of the same protein.
United protein description construction
Superposition
Clustering
Notation
Component : all the
atoms which belong to the
same amino acid or
mutation of the amino
acid. Contains a backbone
part and a side chain part
Part : set of instances
Instance : one of the
alternative conformations.
United protein description clustering
The superimposed structures are
combined by clustering each part
separately
Complete linkage hierarchical clustering
The clustered instances can be
recombined to form new valid protein
structures.
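A minimal sketch of complete-linkage hierarchical clustering with a distance cutoff. The instances are stood in for by single numbers and the distance by their absolute difference; in practice the distance would be a structural measure between alternative conformations of a part, and the cutoff value is hypothetical.

```python
def complete_linkage(items, dist, cutoff):
    """Merge the two closest clusters until the closest pair exceeds cutoff.
    Cluster distance = largest pairwise distance between their members."""
    clusters = [[x] for x in items]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break
        merged = clusters.pop(j)   # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

# Hypothetical one-dimensional descriptors of five instances of one part.
instances = [0.0, 0.2, 0.3, 2.0, 2.1]
clusters = complete_linkage(instances, lambda a, b: abs(a - b), cutoff=0.5)
print(sorted(sorted(c) for c in clusters))
```

With complete linkage, no two members of a cluster are further apart than the cutoff, so each cluster can be reduced to one representative instance.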
Incompatibility
Two instances of the united
protein description are
incompatible if they cannot
be realized simultaneously.
Logical: two instances are
alternative to each other
Geometric: two logically
compatible instances overlap
Structural: two instances of
the same chain are
unconnected
Incompatibility graph
$V = \{\text{instances}\}$
$E = \{\, e_{ij} \mid v_i \text{ and } v_j \text{ are incompatible} \,\}$
Incompatibility graph
The incompatibility is
internally represented
as a graph, using the
instances as nodes and
connecting each pair of
incompatible nodes by
an edge.
Valid protein structures
correspond to
independent sets in the
graph.
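The independent-set condition can be checked directly on the edge set. The instance names and edges below are hypothetical.

```python
from itertools import combinations

# Hypothetical incompatibility edges: A1/A2 are alternatives of the same
# part (logical), A2 and B1 overlap in space (geometric).
incompatible_edges = {frozenset(e) for e in [("A1", "A2"), ("A2", "B1")]}

def is_valid_structure(selection, edges):
    """A selection of instances is valid if no two of them share an edge,
    i.e. the selection is an independent set in the incompatibility graph."""
    return all(frozenset(pair) not in edges
               for pair in combinations(selection, 2))

print(is_valid_structure(["A1", "B1"], incompatible_edges))
print(is_valid_structure(["A2", "B1"], incompatible_edges))
```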
Selection of instances
The ligand is placed fragment by
fragment into the active site by the
incremental construction algorithm.
After each construction step, all
possible interactions are determined.
Apply the scoring function for each
instance.
We choose the independent set (IS) with the highest score.
Select the optimal IS
The IS can be assembled from IS of the
connected components.
Apply a modified version of the Bron-Kerbosch algorithm.
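A sketch of the idea using the plain Bron-Kerbosch recursion, not the modified version used by FlexE. Maximal independent sets of the incompatibility graph are enumerated as maximal cliques of the complementary "compatible" graph, and the best-scoring one is kept. Instance names and scores are hypothetical, and scores are assumed to be non-negative so that the best set is always a maximal one.

```python
def maximal_independent_sets(nodes, incompatible):
    """Enumerate maximal independent sets via Bron-Kerbosch on the
    complement graph (no pivoting)."""
    compat = {v: {u for u in nodes
                  if u != v and frozenset((u, v)) not in incompatible}
              for v in nodes}
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & compat[v], excluded & compat[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(nodes), set())
    return found

nodes = ["A1", "A2", "B1", "B2"]
incompatible = {frozenset(e) for e in
                [("A1", "A2"), ("B1", "B2"), ("A2", "B1")]}
scores = {"A1": 1.0, "A2": 3.0, "B1": 2.5, "B2": 1.0}  # higher is better

candidates = maximal_independent_sets(nodes, incompatible)
best = max(candidates, key=lambda s: sum(scores[v] for v in s))
print(sorted(best))
```

Picking the best instance of each part independently would give A2 and B1, which are incompatible; the search over independent sets returns A2 with B2 instead.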
Evaluation
FlexE was evaluated with ten protein
structure ensembles containing 105
crystal structures from the PDB.
The structures within each ensemble have:
Highly similar backbone traces
Different conformations for several side
chains.
Evaluation – Cont.
FlexE finds a ligand position with RMSD
below 2 Å in 67% of the cases.
Average CPU time for the incremental
construction algorithm is 5.5 minutes.
Discussion
The ensemble approach is able to cope
with several side-chain conformations
and even movements of loops.
Motions of larger backbone segments or
even domain movements are not
covered by this approach.
FlexS - motivation
In drug design, often enough, no
structural information about a particular
receptor is available.
A considerable number of different
ligands are known, together with their
binding affinities towards the receptor.
FlexS - overview
A method for structurally superposing
pairs of ligands, approximating their
putative binding site geometry.
Main Applications
Ligand superpositioning
Virtual Screening
Implementation in FlexS
RigFit – fast rigid-body placement using
Fourier space methods.
Incremental construction
Systematic parameter study
Two Base Placement Methods
Target: Place a rigid molecule fragment
onto the reference ligand
Combinatorial placement procedure
Numerical placement procedure
RigFit
Optimizes the common volume of two
molecules expressed by various
Gaussian functions associated to
different physicochemical properties.
Solves the combinatorial placement
problem.
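A sketch of the quantity being optimized: the overlap of two molecules described by atom-centered Gaussians. It evaluates the overlap in real space with the closed-form integral of a product of two Gaussians, whereas RigFit works in Fourier space; the exponents and coordinates are hypothetical, and only a single property (volume) is shown.

```python
import math

def gaussian_overlap(a, center_a, b, center_b):
    """Integral over 3D space of exp(-a|r-A|^2) * exp(-b|r-B|^2)."""
    d2 = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
    return (math.pi / (a + b)) ** 1.5 * math.exp(-a * b / (a + b) * d2)

def molecule_overlap(mol_a, mol_b):
    """Sum of all pairwise atom overlaps; an atom is (exponent, (x, y, z))."""
    return sum(gaussian_overlap(a, ca, b, cb)
               for a, ca in mol_a for b, cb in mol_b)

reference = [(1.0, (0.0, 0.0, 0.0)), (1.0, (1.5, 0.0, 0.0))]
aligned = [(1.0, (0.0, 0.0, 0.0)), (1.0, (1.5, 0.0, 0.0))]
shifted = [(1.0, (0.0, 2.0, 0.0)), (1.0, (1.5, 2.0, 0.0))]
print(molecule_overlap(reference, aligned) > molecule_overlap(reference, shifted))
```

A rigid-body placement then searches over rotations and translations of the test fragment for the pose with maximal overlap.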
Variable Sequence Construction
The sequence in
which fragments are
added is selected
dynamically
depending on the
actual placement.
Effective in cases
where the flexible
test ligand partially
extends beyond the
reference ligand.
Dynamic selection of the next
fragment
Each partial placement
is associated with a list
of candidate fragments.
Evaluation of the next
fragment considers:
The amount of expected
overlap with the
reference
The number of potential
interactions in the
candidate fragment
The size of the
substructure tree rooted
at the candidate
fragment.
Dynamic selection of the next
fragment – Cont.
Nbus –number of buildup states.
Deviation from the original sequence
only if a better sequence is found
If FlexS exceeds the upper limit on Nbus, it
returns to the original sequence
Evaluation
The performance of the algorithm
depends on the size of the
superimposed ligands.
In the reproduction of 284 alignments, 60%
are reproduced with RMSD below A.
Questions?
Thank you!
Scoring & Selection strategy
Total score of the partial ligand
FlexS Flow
Test ligand
fragmentation
Placement of the anchor molecule
Reference ligand
The physicochemical model
The conformational space of the ligand
The model of protein-ligand interactions
Scoring function
United protein description superposition
Assumption: highly similar backbone
traces -> superposition by fitting the
backbone atoms of the particular
structures.
This procedure emphasizes the
differences and improves the fitting in
conserved regions of structures. [why
???]
Surface and interaction
geometries