
Computing Protein Structures from Electron Density Maps: The Missing Loop Problem
I. Lotan, H. van den Bedem, A. Deacon and J.C. Latombe
Protein Structure: Experimental Techniques
Nuclear Magnetic Resonance (NMR) spectroscopy – limited to short sequences.
X-ray crystallography
X-ray Crystallography
Crystallize protein samples
Collect X-ray diffraction images
Calculate electronic charge – a 3-D Electron Density Map (EDM)
Electron Density Map
3-D “image” of atomic structure
– High value (electron density) at atom centers
– Density falls off exponentially away from center
– Limited resolution, sampled on 3D grid
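As a rough illustration of the three properties above, the following sketch builds a toy density map: each atom contributes a peak at its center that decays with distance, and the result is sampled on a regular 3-D grid. The Gaussian falloff, the `width` value, the electron counts, and the function names are all assumptions chosen for the example; they are not taken from the slides.

```python
import math

def density_at(point, atoms, width=0.7):
    """Sum one decaying peak per atom (assumed Gaussian falloff, width in Angstrom)."""
    total = 0.0
    for (x, y, z, electrons) in atoms:
        r2 = (point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2
        total += electrons * math.exp(-r2 / (2.0 * width ** 2))
    return total

def sample_grid(atoms, origin, spacing, shape):
    """Return {(i, j, k): density} sampled on a regular 3-D grid."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                grid[(i, j, k)] = density_at(p, atoms)
    return grid

# Two hypothetical atoms: (x, y, z, electron count).
atoms = [(0.0, 0.0, 0.0, 6.0), (1.5, 0.0, 0.0, 7.0)]
edm = sample_grid(atoms, origin=(-1.0, -1.0, -1.0), spacing=0.5, shape=(8, 5, 5))
```

Grid index (2, 2, 2) coincides with the first atom's center and (5, 2, 2) with the second, so those samples carry the highest values; a coarser `spacing` mimics lower resolution.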
The End Goal: Build Protein Model from EDM
Completeness of automatically generated models varies with experimental data quality:
High resolution: 90% completeness.
Low resolution: 2/3 completeness.
Completing the missing fragments manually is time-consuming.
Experimental Data Quality Varies
Recovering the phase of the diffracted beam introduces error.
Resolution at which the data were collected (high-resolution images cannot be obtained for all proteins).
Not all replicas of the protein in the protein crystal are identical.
Mobility of molecule fragments.
Temperature-dependent atomic vibration.
Existing Techniques
Existing software relies on:
– Pattern recognition techniques
– Unambiguous density
– Elementary stereochemical constraints
Model Refinement
Standard Maximum Likelihood (ML) algorithms exploit experimental and model phase information to build new refined models.
Iterating model building and refinement steps improves completeness and quality of models.
The problem: missing fragments (usually loops).
The solution: filling the gaps at an early stage.
Goal: Propose Candidates for Missing Fragments
Input:
– EDM
– Known structure
– Anchor residues
– The amino acid sequence
Output: a proposed structure that falls within the radius of convergence of existing refinement tools (1-1.5Å)
Model
Standard Phi-Psi model.
Compute the backbone; ignore side chains except Cβ and O atoms.
Loop closure: mobile anchor vs. stationary anchor.
Closure is measured as the RMSD of the mobile anchor atoms from the stationary anchor atoms.
(Figure: loop with the mobile anchor and the stationary anchor labeled)
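The closure measure described above can be sketched as a plain RMSD over paired anchor atoms. The function name and the three-atom anchor in the usage lines are illustrative assumptions; the slides do not say which atoms make up an anchor.

```python
import math

def closure_rmsd(mobile, stationary):
    """RMSD between paired mobile-anchor and stationary-anchor atom positions."""
    if not mobile or len(mobile) != len(stationary):
        raise ValueError("anchors must be non-empty and have the same atom count")
    squared = 0.0
    for p, q in zip(mobile, stationary):
        squared += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / len(mobile))

# Hypothetical anchors of three atoms each; the mobile copy is shifted 1 Angstrom in x.
stationary_anchor = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
mobile_anchor = [(x + 1.0, y, z) for (x, y, z) in stationary_anchor]
gap = closure_rmsd(mobile_anchor, stationary_anchor)
```

A loop counts as closed once this value drops below the chosen tolerance.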
IK + EDM → Loop Structure
Two-stage algorithm:
1. Guided by the EDM, sample closing conformations.
2. Refine top-ranking conformations, using local optimization, while maintaining loop closure.
Conformation ranking – density fit and conformational likelihood.
Stage 1: Generating Loop Candidates
Employ the cyclic coordinate descent (CCD) method to obtain closing conformations, up to a tolerance distance d_close.
Starting conformations are obtained by a random procedure, biased by PDB-derived distributions.
Best-scoring conformations (95th percentile) are submitted to stage 2.
Cyclic Coordinate Descent (CCD)
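CCD closes a chain by adjusting one degree of freedom at a time, each time choosing the value that brings the chain end closest to its target. The sketch below is a deliberately simplified planar (2-D) analogue with unit-length links and a single target point; a protein loop would use Φ/Ψ dihedrals in 3-D and several anchor atoms. All names and parameter values are assumptions for illustration.

```python
import math

def joint_positions(angles, link=1.0):
    """Positions of every joint and of the chain end, for relative joint angles."""
    x = y = heading = 0.0
    points = [(x, y)]
    for a in angles:
        heading += a
        x += link * math.cos(heading)
        y += link * math.sin(heading)
        points.append((x, y))
    return points

def end_distance(angles, target):
    ex, ey = joint_positions(angles)[-1]
    return math.hypot(ex - target[0], ey - target[1])

def ccd_close(angles, target, d_close=1e-3, max_sweeps=500):
    """Sweep joints from the chain end to the base until the end is within d_close."""
    angles = list(angles)
    for _ in range(max_sweeps):
        for i in reversed(range(len(angles))):
            points = joint_positions(angles)
            px, py = points[i]          # pivot of joint i
            ex, ey = points[-1]         # current chain end
            to_end = (ex - px, ey - py)
            to_target = (target[0] - px, target[1] - py)
            cross = to_end[0] * to_target[1] - to_end[1] * to_target[0]
            dot = to_end[0] * to_target[0] + to_end[1] * to_target[1]
            # Rotating joint i by this angle points the chain end at the target.
            angles[i] += math.atan2(cross, dot)
        if end_distance(angles, target) <= d_close:
            break
    return angles, end_distance(angles, target)

closed_angles, remaining = ccd_close([0.3] * 5, target=(2.0, 2.5))
```

Each single-joint update is the exact minimizer of the end-to-target distance for that joint alone, which is what makes the method a coordinate descent.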
Adding the Electron Density Constraints
We would like to guide the loop closing to fit the EDM.
– For residue i, CCD proposes distance-minimizing dihedral angles (Φ,Ψ)_i^p.
– Find a pair (Φ,Ψ)_i in a square neighborhood of (Φ,Ψ)_i^p that maximizes the local fit to the EDM. The neighborhood’s size is reduced linearly with CCD iterations to allow closure.
(Formula annotations: the atoms that are changed by angle pair i and not by i+1; the center of atom A_j)
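The neighborhood search described on this slide can be sketched as a small grid search around the CCD proposal, with a half-width that shrinks linearly to zero. The initial half-width of 30 degrees, the grid resolution, and the `local_fit` callable (a stand-in for the density fit of the atoms moved by angle pair i) are assumptions made for the example.

```python
def neighborhood_half_width(iteration, max_iterations, initial=30.0):
    """Half-width in degrees, reduced linearly to zero by the last CCD iteration."""
    remaining = 1.0 - iteration / float(max_iterations)
    return max(0.0, initial * remaining)

def best_pair_near(proposal, half_width, local_fit, steps=5):
    """Return the (phi, psi) in the square around the proposal with the best local fit."""
    phi_p, psi_p = proposal
    best, best_score = proposal, local_fit(phi_p, psi_p)
    if half_width == 0.0:
        return best                      # no freedom left: keep the closing proposal
    for a in range(-steps, steps + 1):
        for b in range(-steps, steps + 1):
            phi = phi_p + half_width * a / steps
            psi = psi_p + half_width * b / steps
            score = local_fit(phi, psi)
            if score > best_score:
                best, best_score = (phi, psi), score
    return best

# Hypothetical fit function peaking at (-60, -42) degrees.
def toy_fit(phi, psi):
    return -((phi + 60.0) ** 2 + (psi + 42.0) ** 2)

early = best_pair_near((-70.0, -30.0), neighborhood_half_width(0, 10), toy_fit)
late = best_pair_near((-70.0, -30.0), neighborhood_half_width(10, 10), toy_fit)
```

Early iterations let the density pull the angles away from the proposal; by the last iteration the proposal is used unchanged, so closure is not compromised.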
Stage 2: Refining Loop Candidates
Improve the model's fit to the experimental data (this time for the model as a whole, as opposed to the local fit in stage 1).
Maintain the loop-closure constraint during the optimization process.
Target Function
For conformation q, the target function T(q) is the sum of the squared differences between the observed density and the calculated density at each grid point in some volume V around the loop.
(Formula annotations: grid points g_i in the volume V; scaling factors; observed density; calculated density, the sum of contributions of atoms within a cutoff distance from g_i)
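A minimal sketch of such a target function follows. The slide names scaling factors but their exact placement is not recoverable here, so the form `scale * observed + offset - calculated` is an assumption, as are the Gaussian atom model, the cutoff value, and every function and parameter name.

```python
import math

def calculated_density(grid_point, atoms, cutoff=2.0, width=0.7):
    """Sum contributions of atoms within `cutoff` of the grid point (assumed Gaussian)."""
    total = 0.0
    for center, electrons in atoms:
        r = math.sqrt(sum((g - c) ** 2 for g, c in zip(grid_point, center)))
        if r <= cutoff:
            total += electrons * math.exp(-r * r / (2.0 * width * width))
    return total

def target_function(grid_points, observed, atoms, scale=1.0, offset=0.0):
    """Sum over grid points of (scaled observed density - calculated density)^2."""
    total = 0.0
    for g, rho_obs in zip(grid_points, observed):
        diff = scale * rho_obs + offset - calculated_density(g, atoms)
        total += diff * diff
    return total

# Hypothetical volume of three grid points around a single atom.
grid_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
loop_atoms = [((0.0, 0.0, 0.0), 6.0)]
perfect_obs = [calculated_density(g, loop_atoms) for g in grid_points]
t_perfect = target_function(grid_points, perfect_obs, loop_atoms)
t_empty = target_function(grid_points, [0.0, 0.0, 0.0], loop_atoms)
```

In a real refinement the atom positions would be recomputed from the conformation q before each evaluation; here they are passed in directly to keep the example short.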
Optimization with Closure Constraints
Generic approach: optimize an objective function (T(q)) while performing a given task (loop closure) by taking advantage of manipulator redundancy (DoFs).
f(q): forward kinematics equation.
J(q): 6-by-n Jacobian, relating a change in q to the change at the end of the chain.
J^+(q): an approximation of J^-1(q)
N(q): orthonormal basis for the null space (n-6 dimensions)
y = ∂T(q)/∂q: gradient vector of the objective function T(q)
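To illustrate how redundancy is used, the sketch below projects the gradient y onto the null space of J and steps against it, so that to first order the chain end does not move. Instead of building N(q) explicitly, it removes the row-space component of y by Gram-Schmidt, which gives the same result as N Nᵀ y. The small Jacobian, the step size, and the function names are assumptions for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def row_space_basis(jacobian, eps=1e-12):
    """Orthonormal basis of the row space of J (Gram-Schmidt, dependent rows dropped)."""
    basis = []
    for row in jacobian:
        r = list(row)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = math.sqrt(dot(r, r))
        if norm > eps:
            basis.append([ri / norm for ri in r])
    return basis

def project_to_null_space(jacobian, y):
    """Remove from y every component that would move the chain end."""
    v = list(y)
    for b in row_space_basis(jacobian):
        c = dot(v, b)
        v = [vi - c * bi for vi, bi in zip(v, b)]
    return v

def null_space_step(jacobian, gradient, step=0.1):
    """Descent step on T(q) restricted to the null space of J."""
    return [-step * p for p in project_to_null_space(jacobian, gradient)]

# Hypothetical 2-by-4 Jacobian (a real loop would use 6 rows) and gradient.
J = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 0.0]]
y = [1.0, 2.0, 3.0, 4.0]
dq = null_space_step(J, y)
```

Because the constraint is only preserved to first order, closure still has to be checked after each step.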
Minimization Procedure: Monte Carlo and Simulated Annealing
Choose a random sub-chain with at least 8 DoFs.
Propose a random move with magnitude proportional to the current temperature:
– High temperature: use exact IK solver (Dill)
– Low temperature: pick a random direction in the null space
Minimize the resulting conformation (gradient descent).
Accept using the Metropolis criterion:
P(accept q_new) = e^[(T(q_prev) - T(q_new)) / temp]
Use simulated annealing: at each step decrease the pseudo-temperature.
At each step verify that the closure constraint is satisfied within tolerance.
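The acceptance rule and cooling schedule above can be sketched as follows. This toy version omits the sub-chain selection, the exact IK solver, the gradient minimization, and the closure check; the geometric cooling factor, the Gaussian proposal, and the quadratic stand-in for T(q) are assumptions made for illustration.

```python
import math
import random

def metropolis_accept(t_prev, t_new, temp, rng):
    """Always accept improvements; accept worse moves with probability e^[(Tprev-Tnew)/temp]."""
    if t_new <= t_prev:
        return True
    return rng.random() < math.exp((t_prev - t_new) / temp)

def anneal(q0, target, propose, temp0=1.0, cooling=0.95, steps=500, seed=0):
    """Monte Carlo search with a pseudo-temperature that decreases at every step."""
    rng = random.Random(seed)
    q, t = list(q0), target(q0)
    best_q, best_t = list(q), t
    temp = temp0
    for _ in range(steps):
        candidate = propose(q, temp, rng)
        t_candidate = target(candidate)
        if metropolis_accept(t, t_candidate, temp, rng):
            q, t = candidate, t_candidate
            if t < best_t:
                best_q, best_t = list(q), t
        temp *= cooling                  # assumed geometric cooling schedule
    return best_q, best_t

def toy_target(q):
    """Stand-in for T(q): a simple quadratic bowl."""
    return sum(x * x for x in q)

def toy_propose(q, temp, rng):
    """Random move whose magnitude is proportional to the current temperature."""
    return [x + rng.gauss(0.0, temp) for x in q]

start = [2.0, -1.5, 1.0]
best_q, best_t = anneal(start, toy_target, toy_propose)
```

Large moves are likely to be accepted while the temperature is high, and the search narrows to small, mostly downhill moves as it cools.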
Results – High Resolution Data
Applying RESOLVE to the high-resolution data yielded an initial model that was 88% complete.
Applying the algorithm to a gap of 12 residues:
Magenta – the structure from the PDB.
Cyan – best-scoring structure, RMSD = 0.25Å.
The lowest RMSD for a 7-residue gap at the end of stage 1 is 0.35Å.
Results – Low Resolution Data
Applying RESOLVE yielded a model with 61% completeness.
Applying the algorithm to a gap of 12 residues:
Magenta – the highest-scoring structure, RMSD = 0.6Å.
Yellow – starting conformation (end of stage 1), RMSD = 2.1Å (the lowest).