Protein structure classification genTHREADER – 3D


Protein Structure Prediction
• PDB: Protein Data Bank
• SCOP: Protein structure classification
• CATH: Protein structure classification
• genTHREADER: 3D structure prediction
• Swiss-Model: 3D structure prediction
• ModBase: A database of 3D structure predictions
How are structures solved experimentally?
• X-ray crystallography: Diffraction patterns are recorded
from X-ray beams hitting a crystallized array of molecules.
• NMR: Nuclear magnetic resonance; magnetic nuclei absorb and
re-emit electromagnetic radiation at frequencies that depend on
their properties.
• Cryo-EM: In cryo-electron microscopy, molecules are frozen
in a thin layer of ice. The microscope records many low-resolution
projections of the molecule, which are then combined computationally.
Many other methods exist.
Embedding: from distances to shape
            Haifa    Tel Aviv   Jerusalem   Eilat    Ashdod
Haifa       0        -          -           -        -
Tel Aviv    85.71    0          -           -        -
Jerusalem   117.3    53.34      0           -        -
Eilat       361.5    278.8      247.5       0        -
Ashdod      117.8    32.36      54.98       250.1    0
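The idea that pairwise distances fix a shape can be tried on the city table above. The sketch below is a simple 2D trilateration, not a full multidimensional scaling method: it assumes the first two cities are distinct, places the first at the origin and the second on the x-axis, and positions each remaining city from its distances to those two. The function names `full_matrix` and `embed_2d` are made up for this illustration, and the result is determined only up to rotation and reflection.

```python
import math

CITIES = ["Haifa", "Tel Aviv", "Jerusalem", "Eilat", "Ashdod"]

# Lower triangle of the distance table, one row per city.
LOWER = [
    [0.0],
    [85.71, 0.0],
    [117.3, 53.34, 0.0],
    [361.5, 278.8, 247.5, 0.0],
    [117.8, 32.36, 54.98, 250.1, 0.0],
]


def full_matrix(lower):
    """Expand a lower-triangular table into a symmetric square matrix."""
    n = len(lower)
    dist = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            dist[i][j] = dist[j][i] = value
    return dist


def embed_2d(dist):
    """Place points in the plane using distances to points 0 and 1.

    Assumes dist[0][1] > 0. The sign of y is ambiguous for every point;
    the first point off the x-axis is given y > 0, and later points take
    the sign that better matches their distance to that anchor.
    """
    n = len(dist)
    coords = [(0.0, 0.0)] * n
    if n < 2:
        return coords
    d01 = dist[0][1]
    coords[1] = (d01, 0.0)
    anchor = None
    for k in range(2, n):
        d0k, d1k = dist[0][k], dist[1][k]
        x = (d0k ** 2 - d1k ** 2 + d01 ** 2) / (2 * d01)
        y = math.sqrt(max(d0k ** 2 - x ** 2, 0.0))  # clamp noise below zero
        if anchor is not None:
            ax, ay = coords[anchor]
            err_pos = abs(math.hypot(x - ax, y - ay) - dist[anchor][k])
            err_neg = abs(math.hypot(x - ax, -y - ay) - dist[anchor][k])
            if err_neg < err_pos:
                y = -y
        elif y > 1e-9:
            anchor = k
        coords[k] = (x, y)
    return coords


coords = embed_2d(full_matrix(LOWER))
for name, (x, y) in zip(CITIES, coords):
    print(f"{name:10s} x={x:8.2f} y={y:8.2f}")
```

Because measured distances are never perfectly consistent, the recovered map reproduces the table only approximately. Protein structure determination from distance restraints (as in NMR) faces the same problem in three dimensions.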
PDB: Curation of solved structures
[Screenshot of a PDB entry page. Labeled parts: accession number, structural classification, Java-based visualization tools, PDB file.]
PDB provides the atomic coordinates of the structure, which can be viewed with different visualization tools.
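To make "atomic coordinates" concrete, here is a minimal sketch of reading ATOM records from PDB-format text. The column positions follow the fixed-width PDB format (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54). The helper `format_atom`, the function names, and the coordinates are invented for illustration and do not come from a real entry.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal ATOM line (toy helper with made-up data)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atom(line):
    """Extract the main fields of one ATOM record by column position."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def read_atoms(lines):
    """Parse every ATOM record and skip all other record types."""
    return [parse_atom(line) for line in lines if line.startswith("ATOM")]


# Toy input: a header line plus two atoms with invented coordinates.
records = [
    "HEADER    TOY EXAMPLE",
    format_atom(1, " N  ", "MET", "A", 1, 1.0, 2.0, 3.0),
    format_atom(2, " CA ", "MET", "A", 1, 2.5, 2.0, -3.25),
]

for atom in read_atoms(records):
    print(atom["res_name"], atom["res_seq"], atom["name"], atom["xyz"])
```

For real work a dedicated parser is a better choice, since actual PDB files contain alternate locations, multiple models, and HETATM records that this sketch ignores.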
SCOP: Structural
Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html
Based on known protein structures
• Manually created by visual inspection
• Hierarchical database structure:
– Class, Fold, Superfamily, Family, Protein and Species
[Two screenshots of SCOP hierarchy pages. Labeled parts: node, parents of node, children of node.]
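The node, parent, and children layout of the SCOP pages maps naturally onto a tree. The sketch below is one possible way to model the six levels in code; the class `Node`, its methods, and all entry names are hypothetical placeholders rather than real SCOP records.

```python
LEVELS = ["Class", "Fold", "Superfamily", "Family", "Protein", "Species"]


class Node:
    """One entry in the hierarchy, linked to its parent and children."""

    def __init__(self, level, name, parent=None):
        self.level = level
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        """Create a child one level below this node."""
        next_level = LEVELS[LEVELS.index(self.level) + 1]
        child = Node(next_level, name, parent=self)
        self.children.append(child)
        return child

    def lineage(self):
        """Return (level, name) pairs from the root down to this node."""
        path = []
        node = self
        while node is not None:
            path.append((node.level, node.name))
            node = node.parent
        return list(reversed(path))


# Placeholder names only; a real tree would be filled from SCOP data.
root = Node("Class", "example class")
fold = root.add_child("example fold")
superfamily = fold.add_child("example superfamily")
family = superfamily.add_child("example family")
protein = family.add_child("example protein")
species = protein.add_child("example species")

for level, name in species.lineage():
    print(f"{level:12s} {name}")
```

Walking `parent` links gives the "parents of node" view, and the `children` list gives the "children of node" view.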
CATH:
Protein Structure Classification
by Class, Architecture, Topology and Homology
http://www.cathdb.info/
• Class: The secondary structure composition: mainly-alpha,
mainly-beta and alpha-beta.
• Architecture: The overall shape of the domain structure.
Orientations of the secondary structures, e.g. barrel or 3-layer sandwich.
• Topology: Structures are grouped into fold groups at this
level depending on both the overall shape and connectivity of
the secondary structures.
• Homologous Superfamily: Evolutionarily conserved structures
Prediction:
Comparative Modeling
• Various methods:
– Homology modeling
– Protein threading
– Side-chain geometry prediction
• Accuracy of the comparative model
is related to the sequence
identity on which it is based
– >50% sequence identity = high accuracy
– 30%-50% sequence identity = about 90% of the structure modeled
– <30% sequence identity = low accuracy (many errors)
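The three identity bands above amount to a simple lookup. The sketch below encodes them as a function; the name `expected_accuracy` is made up, and placing exactly 50% and exactly 30% in the middle band is an assumption, since the slide does not specify the edges.

```python
def expected_accuracy(identity_percent):
    """Map percent sequence identity to the accuracy band from the slide."""
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent > 50:
        return "high accuracy"
    if identity_percent >= 30:
        return "about 90% modeled"
    return "low accuracy (many errors)"


for identity in (72.0, 41.5, 18.0):
    print(f"{identity:5.1f}% identity: {expected_accuracy(identity)}")
```

These bands are rules of thumb. Actual model quality also depends on alignment quality and on which regions of the protein differ from the template.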
SWISS-MODEL
An automated protein homology modeling server.
http://swissmodel.expasy.org/
• The SWISS-MODEL algorithm can be divided into
three steps:
1. Search for suitable templates: the server finds
all similarities of a query sequence to sequences
of known structure. It uses the BLASTP2 program
with the ExNRL-3D database (a derivative of PDB
database, specified for SWISS-MODEL). You get
these partial results as a SwissModel TraceLog
file.
2. Check sequence identity with target: all templates
with sequence identity above 25% are selected.
3. Create the model using the ProModII program. You
get this as a SwissModel-Model file.
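Step 2 above, keeping templates above 25% identity, can be sketched as follows. This is not the SWISS-MODEL implementation: the function names, the template identifiers, and the toy alignments are all invented, and percent identity is assumed here to be the number of identical residues divided by the number of aligned columns without gaps (other definitions exist).

```python
def percent_identity(aligned_query, aligned_template):
    """Percent of identical residues over aligned, gap-free columns."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_templates(alignments, threshold=25.0):
    """Keep templates whose identity to the query exceeds the threshold."""
    selected = []
    for template_id, (query, template) in alignments.items():
        identity = percent_identity(query, template)
        if identity > threshold:
            selected.append((template_id, identity))
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Toy pairwise alignments: (aligned query, aligned template).
alignments = {
    "tmplA": ("MKTAYIAK", "MKTAYIAK"),
    "tmplB": ("MKTAYIAK", "MKSAYLAR"),
    "tmplC": ("MKTAYIAK", "QRSV-LEK"),
}

for template_id, identity in select_templates(alignments):
    print(f"{template_id}: {identity:.1f}% identity")
```

In the real pipeline the alignments would come from the template search in step 1 rather than being typed in by hand.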
[Screenshots of SWISS-MODEL results. Labeled parts: get the PDB file by e-mail and load it into Jmol; homology modeling; single structure query; Swiss-Model file; structures used for the homology model.]
ModBase
A Homology Model Database
http://modbase.compbio.ucsf.edu/modbase-cgi/index.cgi
GenTHREADER
An automated protein threading server.
http://bioinf.cs.ucl.ac.uk/psipred/
[Screenshot of the submission form. Labeled parts: type of analysis (PSIPRED, MEMSAT, GenTHREADER), input sequence.]
[Screenshot of GenTHREADER output.]
The output sequences show only some sequence homology
but a high level of secondary structure conservation.
Ab initio modeling
• Based on physical (chemical)
properties of amino acids
• Rosetta@home
– Leading contender in the field
• Foldit
– Crowd-sourcing software
– Designed as a game where the goal is to
optimize a structure
– Dozens of published papers referencing it
Exercise
In this exercise we will analyze two structures of the protein lysozyme. The sequences of these
proteins differ slightly.
1. Download PyMOL (after registering): http://pymol.org/educational/
2. Load the two structures: 1LYD.pdb, 1L35.pdb
3. Use the Cartoon option to visualize the structures.
4. Align the structures using the command: align /1lyd,/1l35
Analyze the differences between the structures. What is the RMSD (root-mean-square deviation, a measure
of the distance between the structures)?
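PyMOL reports the RMSD itself after superimposing the structures. To show what the number means, the sketch below computes it directly: the square root of the mean squared distance between matched atoms. It assumes the two coordinate lists are already superimposed and paired atom by atom; the function name `rmsd` and the toy coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy data: the second set is the first shifted by (3, 4, 0),
# so every atom is displaced by exactly 5 units.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0), (3.0, 5.0, 0.0)]

print(f"RMSD = {rmsd(set_a, set_b):.3f}")
```

An RMSD of 0 means the matched atoms coincide exactly. Larger values indicate larger average displacement, and the value depends on which atoms are included in the comparison.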
Results
[PyMOL screenshot. Menu actions shown: Show > Cartoon, Hide > Lines.]