Homology Modelling and Methods for Fold Recognition

Download Report

Transcript Homology Modelling and Methods for Fold Recognition

Protein Structure Prediction:
Homology Modeling
&
Threading/Fold Recognition
D. Mohanty
NII, New Delhi
Experimental Methods for Structure Determination
Computational Approaches
for Protein Structure Prediction
•Methods based on laws of physical chemistry
Ab initio folding using Molecular Mechanics
Forcefield
•Knowledge-based Methods
Homology Modelling
Fold Recognition or Threading
Interactions between atoms in a protein
Schematic depiction of the free energy surface of a protein
Computational tools for exploring
energy surface & locating minimas
Energy Minimization
Molecular Dynamics
Monte Carlo Simulations
Structure Prediction Flowchart
http://www.bmm.icnet.uk/people/rob/CCP11BBS/flowchart2.html
Homology Modelling
Homology (or Comparative) modelling involves,
building a 3D model for a protein of unknown structure
(the target) on the basis of sequence similarity to proteins
of known structure (the templates).
Necessary requirements for homology modeling:
•Sequence similarity between the target and the template
must be detectable.
•Substantially correct alignment between the target
sequence and template must be calculated.
Homology or comparative modelling is
Possible because:
•The 3D structures of the proteins in a family are more
conserved than their sequences. Therefore, if similarity
between two proteins is detectable at the sequence level,
structural similarity can usually be assumed.
•Small changes in protein sequence usually results in
small changes in 3D structure.
But large changes in protein sequence can also result in
small changes in its 3D structure i.e. Proteins with
non-detectable sequence similarity can have similar
structures.
Steps in Comparative
Protein Structure
Modelling
Target
Template
Target
Template
Simple sequence-sequence alignment using BLAST does
not give alignment over the entire length.
Sidechain Modelling
Rotamer Library
Loop Modelling
Model Validation
•Ramachandran Plot for backbone dihedrals
•Packing & Accessibility of amino acids
Threading or Fold Recognition
•Proteins often adopt similar folds despite no
significant sequence or functional similarity.
•For many proteins there will be suitable template
structures in PDB.
•Unfortunately, lack of sequence similarity will
mean that many of these are undetected by
sequence-only comparison done in homology
modelling.
Goal of Fold Recognition or Threading
•Fold recognition methods attempt to detect the fold that
is compatible with a particular query sequence.
•Unlike sequence-only comparison, these methods take
advantage of the extra information made available by
3D structure.
•In effect, fold prediction methods turn the protein
folding problem on its head: rather than predicting how
a sequence will fold, they predict how well a fold will
fit a sequence.
47%
17%
5%
There are many examples of proteins exhibiting high
structural similarity but less than 15% sequence identity.
Classical sequence alignment fails to detect homology
below 25-30% sequence identity.
One needs sequence comparison methods which take into
account structural environment of amino acids.
Alternate approach is Threading or Fold Recognition,
where sequence is compared directly to structure.
Compatibility of a sequence with a given fold
A practical approach for fold recognition
•Although fold prediction methods are not 100% accurate, the
methods are still very useful.
•Run many different methods on many sequences from your
homologous protein family. After all these runs, one can build up a
consensus picture of the likely fold.
•Remember that a correct fold may not be at the top of the list, but
it is likely to be in the top 10 scoring folds.
•Think about the function of your protein, and look into the function
of the predicted folds.
•Don’t trust the alignments, rather use them as starting points.
Applications of comparative modeling. The potential uses of a
comparative model depend on its accuracy. This in turn depends
significantly on the sequence identity between the target and the
template structure on which the model was based.