Multiple angles - RMIT University


A New Approach to Structural Prediction of Proteins
Heiko Schröder
Bertil Schmidt
Jiujiang Zhu
School of Computer Engineering
Nanyang Technological University
Singapore
Contents
Protein Structure
Protein Structure Prediction
Approach based on Local Protein
Structure
Refinements
Conclusions and Future Work
Protein Structure
Proteins are large molecules composed of
smaller molecules called amino acids
There are 20 kinds of amino acids found in
natural proteins
All share a common structure
[Figure: common structure of an amino acid, showing the amine group, the carboxyl group, the side chain (R) and the alpha carbon (with attached hydrogen)]
Protein Structure
From Primary to
Tertiary Structure
A protein’s 3D shape is
determined by its primary amino
acid sequence (Anfinsen, 1963)
Predicting tertiary structure from
amino acid sequence is an
unsolved problem
Difficult to model the energies that
stabilize a protein molecule
Conformational search space is
enormous
Prediction Methods
[Figure: a target amino acid sequence (YLAADTYK) is aligned against the template amino acid sequences of a fold library (FISSETCN, MEPSSYV, TGLIRKN); each target/template alignment receives a score (7, 21 and 2 in the example)]
Given an amino acid sequence:
search a set of known folds by aligning the sequence with a
representative template of each fold
predict the fold whose template yields the best-scoring alignment
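The search described above can be sketched as follows. This is a minimal illustration, not the scoring used by any particular fold recognition tool: the function names `alignment_score` and `best_fold`, the toy match/mismatch/gap values and the fold names are assumptions made for the example.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score with a toy scoring scheme."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_fold(target, fold_library):
    """Return (fold_name, score) for the best-scoring template in the library."""
    scores = {name: alignment_score(target, seq)
              for name, seq in fold_library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical library: fold names are invented for illustration.
library = {"fold_a": "FISSETCN", "fold_b": "MEPSSYV", "fold_c": "TGLIRKN"}
name, score = best_fold("YLAADTYK", library)
```

Real fold recognition methods use substitution matrices and structure-aware scores rather than the identity scoring assumed here.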
Prediction Methods
This method is very effective when target and
template have >30% sequence identity
Approximately 1/3 of protein sequences can
be assigned folds and modeled this way
Our aim is to help determine tertiary
structures in cases where no matching sequence
can be found
Local structure and
prediction
What is local structure?
describes environment of an amino acid
an amino acid’s relationship to neighbors
we use this information to predict structure
from primary sequence
Dihedral Angles
The 6 atoms in each peptide unit lie in the same
plane
φ and ψ are free to rotate
The structure of a protein is almost totally
determined if all angles φ and ψ are known
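A dihedral angle is defined by four consecutive atoms. For residue i, φ is conventionally taken over C(i-1), N(i), Cα(i), C(i) and ψ over N(i), Cα(i), C(i), N(i+1). The sketch below computes such an angle from four 3D points using the standard atan2 formula; the function name `dihedral` and the tuple-based coordinate format are assumptions for this example.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees, in (-180, 180], defined by four points."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


# cis arrangement gives 0 degrees, trans gives 180 degrees
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```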
Idea of our Approach
[Figure: backbone atoms (N, Cα, C) with the dihedral angles φ and ψ, and the side chains; angles are marked as stiff or free]
→ local predictability → database of sub-chain structures
→ reduction of the number of degrees of freedom by 10, which
significantly reduces the computation time in combination
with a global optimization algorithm (e.g. GA or SA)
Classification of
Dihedral Angles
[Figure: processing pipeline: selected PDB structures → dihedral angle extraction → histogram for each amino acid pair → classification as stiff, multiple or flexible]
Classification of
Dihedral Angles
[Figure: frequency histograms of the dihedral angles for the amino acid pairs ALA-ALA, GLY-ILE and LEU-ARG, illustrating the classes stiff, multiple and flexible]
Classification of
Dihedral Angles
Stiff angles: determine the mean value
Multiple angles: determine a sequence of mean
values, one for each peak, in decreasing order of
peak height
Flexible angles: determine the mean value and mark
the angle as flexible
Prediction based on
Classification
Given a sequence of amino acids, find the
subsequences in which all angles are of
type stiff
predict the structure of these subsequences
using the mean values of the
corresponding histograms
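Finding the stiff subsequences amounts to scanning the chain for maximal runs of residue pairs classified as stiff. The sketch below assumes a lookup table `pair_class` that maps a residue pair to its class; the function name and the toy table entries are invented for the example and are not taken from real data.

```python
def stiff_segments(sequence, pair_class, min_pairs=1):
    """Return (start, end) residue ranges, end exclusive, in which every
    consecutive residue pair is classified as 'stiff'."""
    segments = []
    start = None
    for i in range(len(sequence) - 1):
        pair = (sequence[i], sequence[i + 1])
        if pair_class.get(pair) == "stiff":
            if start is None:
                start = i
        elif start is not None:
            segments.append((start, i + 1))
            start = None
    if start is not None:
        segments.append((start, len(sequence)))
    # a segment of k residues contains k - 1 pairs
    return [s for s in segments if s[1] - s[0] - 1 >= min_pairs]


# Toy classification table (hypothetical values)
toy_class = {("ALA", "ALA"): "stiff", ("GLY", "ILE"): "flexible"}
chain = ["ALA", "ALA", "ALA", "GLY", "ILE", "ALA", "ALA"]
segments = stiff_segments(chain, toy_class)
```

Pairs missing from the table are treated as not stiff, which keeps the prediction conservative.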
Prediction based on
Classification
[Figure: part of a protein predicted with this
method (backbone of a helix; original structure
on the left, predicted structure on the right)]
Successfully predicted certain stiff structures
of subsequences up to a length of 15
Refinement of the
method
For multiple angles:
consider sequences of length 3 or 4:
extract sequences (C,A,B,D) and determine the
histogram of the angles φ and ψ related to the peptide
chain between A and B
if the histogram of φ or ψ for the amino acids (A,B) is multiple,
check whether the corresponding angle for (C,A,B,D) is stiff
with longer subsequences, the number of occurrences of these
sequences drops dramatically
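The effect of the longer context can be made concrete by grouping the same angle samples once by pair (A,B) and once by quadruple (C,A,B,D). In this sketch the input format is an assumption: each chain is given as a list of residue names plus one angle value per consecutive pair, and the function name `group_by_context` is hypothetical.

```python
from collections import defaultdict


def group_by_context(chains):
    """chains: iterable of (residues, pair_angles), where pair_angles[i]
    belongs to the pair (residues[i], residues[i + 1])."""
    by_pair = defaultdict(list)
    by_quad = defaultdict(list)
    for residues, pair_angles in chains:
        for i, angle in enumerate(pair_angles):
            a, b = residues[i], residues[i + 1]
            by_pair[(a, b)].append(angle)
            # the quadruple needs one neighbour on each side of the pair
            if i >= 1 and i + 2 < len(residues):
                c, d = residues[i - 1], residues[i + 2]
                by_quad[(c, a, b, d)].append(angle)
    return by_pair, by_quad


# Toy data (invented values)
toy_chains = [
    (["GLY", "LEU", "ARG", "ALA"], [10.0, -60.0, 20.0]),
    (["SER", "LEU", "ARG", "VAL"], [5.0, 60.0, 0.0]),
]
by_pair, by_quad = group_by_context(toy_chains)
```

In the toy data the pair (LEU, ARG) collects two samples with different angles, while each quadruple collects only one, which illustrates both the disambiguation and the loss of statistics.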
Refinement of the
method
For multiple angles:
if an amino acid sequence has only a small
number of multiple edges, it is possible to
try all combinations of possible peaks
many combinations lead to collisions in part
of the protein, and thus can be eliminated
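Trying all combinations of peaks is an exhaustive enumeration whose size is the product of the number of peaks per multiple angle, which is why it is only feasible for a small number of such angles. A minimal sketch follows; the collision test is passed in as a function because the slides do not specify how collisions are detected, and the toy predicate used here is purely illustrative.

```python
from itertools import product


def enumerate_conformations(candidates, has_collision):
    """candidates: one list of candidate angle values per position
    (a single value for stiff angles, one value per peak for multiple angles).
    Yields every combination for which has_collision(combo) is False."""
    for combo in product(*candidates):
        if not has_collision(combo):
            yield combo


# Toy example: two multiple angles with two peaks each, one stiff angle.
toy_candidates = [[-60, 60], [120], [-45, 45]]


def toy_collision(combo):
    # Hypothetical stand-in for a geometric clash check on 3D coordinates
    return combo[0] == 60 and combo[2] == 45


survivors = list(enumerate_conformations(toy_candidates, toy_collision))
```

A real collision check would build atom coordinates from the angles and test for overlapping atoms.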
Conclusion and
Future Work
Presented a method to predict stiff structures of
subsequences up to a certain length
Presented a refinement of the method to handle
multiple angles
How to handle flexible angles?
Using the local prediction as an input for a global
optimization method, e.g. based on simulated
annealing
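One way such a combination could look is sketched below: angles predicted locally are kept fixed and only the remaining (flexible) angles are varied by simulated annealing. This is an illustration under stated assumptions, not the authors' method: the energy function, the cooling schedule and all parameter values are hypothetical placeholders.

```python
import math
import random


def simulated_annealing(angles, free_idx, energy, steps=2000, t0=1.0,
                        cooling=0.995, step_deg=10.0, seed=0):
    """Minimise energy(angles) by perturbing only the angles in free_idx."""
    rng = random.Random(seed)
    current = list(angles)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    t = t0
    for _ in range(steps):
        i = rng.choice(free_idx)
        cand = list(current)
        # random move, wrapped back into [-180, 180)
        cand[i] = (cand[i] + rng.uniform(-step_deg, step_deg) + 180.0) % 360.0 - 180.0
        e_new = energy(cand)
        # Metropolis criterion: always accept improvements,
        # accept worse states with a temperature-dependent probability
        if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / t):
            current, e_cur = cand, e_new
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        t *= cooling
    return best, e_best


# Toy energy with its minimum at 30 degrees for the single free angle
def toy_energy(a):
    return (a[1] - 30.0) ** 2


start = [-60.0, 0.0]
best, e_best = simulated_annealing(start, [1], toy_energy)
```

Fixing the stiff angles shrinks the search space, which is the reduction in degrees of freedom the approach aims for.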