Transcript Topic 9
DSSP and STRIDE
Chapter 19, Gu and Bourne “Structural Bioinformatics”
Why Secondary Structure Assignments?
A key step in protein classification
--class, fold…..
It has functional implications
Useful in protein structure comparison and protein structure prediction
-- Some protein structure alignment programs use SSE (secondary structure element)
-- In protein threading, secondary structures are used to define “cores”, more later…
It is an intuitive means of visualizing and understanding protein structures
[Figure: lysozyme structure]
CAF Andersen, and B Rost (2002) “Secondary structure assignment”
How do we extract 2° structure info?
Secondary Structure Annotations from PDB File
Types of helices and sheets:
Helix class:
   1  Right-handed alpha (default)
   2  Right-handed omega
   3  Right-handed pi
   4  Right-handed gamma
   5  Right-handed 3₁₀
   ………………
Sheets: sense of each strand with respect to the previous strand in the sheet:
   0  first strand
   1  parallel
  -1  anti-parallel
They are assigned by crystallographers, but how? Will come back to this later……
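The helix class code listed above can be read directly from a PDB file. The sketch below is a minimal example, assuming the fixed-column layout of the PDB HELIX record (chain in column 20, start and end residue numbers in columns 22-25 and 34-37, helix class in columns 39-40). The function name `parse_helix_record` and the example line are illustrative, not taken from the lecture.

```python
# Helix class codes as listed on the slide (classes above 5 are not shown there).
HELIX_CLASSES = {
    1: "Right-handed alpha",
    2: "Right-handed omega",
    3: "Right-handed pi",
    4: "Right-handed gamma",
    5: "Right-handed 3-10",
}


def parse_helix_record(line):
    """Extract chain, residue range and helix class from one HELIX record.

    Assumes the fixed-column PDB layout; Python slices are 0-based,
    so PDB columns 39-40 become line[38:40].
    """
    if not line.startswith("HELIX"):
        raise ValueError("not a HELIX record")
    class_field = line[38:40].strip()
    # Right-handed alpha (class 1) is the default when the field is blank.
    helix_class = int(class_field) if class_field else 1
    return {
        "chain": line[19],
        "start": int(line[21:25]),
        "end": int(line[33:37]),
        "helix_class": helix_class,
        "class_name": HELIX_CLASSES.get(helix_class, "other"),
    }


# Illustrative record: an alpha helix from Gly 86 to Gly 94 of chain A.
example = "HELIX    1  HA GLY A   86  GLY A   94  1"
record = parse_helix_record(example)
print(record)
```

The strand sense in SHEET records (0, 1, -1) could be read the same way from its own fixed columns.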
Secondary Structure Assignments
What are the two main structural properties when we talk about secondary structures?
1. Hydrogen bond patterns
“Knowledge-based protein secondary structure assignment”.
Frishman D, Argos P. (1995). Proteins 23(4):566-79
2. Backbone geometry (main-chain dihedral angles)
Hydrogen Bond Identification
1. A simple way: angle-distance hydrogen bond assignment:
A hydrogen bond is assigned when: 1. θ > 120°
AND
2. r_HO < 2.5 Å
Baker, E. N. & Hubbard, R. E. (1984).
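The angle-distance rule can be sketched in a few lines. This is a minimal example, assuming θ is the N-H···O angle measured at the hydrogen and that coordinates are given in Å; the function names `angle_deg` and `is_hbond_angle_distance` are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hbond_angle_distance(n, h, o, min_angle=120.0, max_dist=2.5):
    """Apply the two slide criteria: theta > 120 degrees AND r_HO < 2.5 A."""
    theta = angle_deg(n, h, o)   # N-H...O angle at the hydrogen (assumed)
    r_ho = math.dist(h, o)       # H...O distance in angstroms
    return theta > min_angle and r_ho < max_dist


# Linear N-H...O with r_HO = 2.0 A: accepted.
linear = is_hbond_angle_distance((0, 0, 0), (1, 0, 0), (3, 0, 0))
# O placed at a right angle to the N-H bond: rejected by the angle test.
bent = is_hbond_angle_distance((0, 0, 0), (1, 0, 0), (1, 2, 0))
print(linear, bent)
```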
A better H-bond potential
[Figure: hydrogen-bond potential energy curve with repulsive and attractive regions]
Dahiyat BI, Gordon DB, and Mayo SL (1997). Automated design of the surface positions of
protein helices. Protein Science. 6:1333-1337.
But when used in practice, it isn’t without problems
Allosteric response is both conserved and variable across three CheY orthologs.
Mottonen JM, Jacobs DJ, Livesay DR (2010). Biophysical Journal, 99:2245-2254.
Hydrogen Bond Identification-DSSP
2. Hydrogen bond identification is based on Coulomb energy
DSSP: Definition (Dictionary) of Secondary Structure of Proteins
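E = f q_1 q_2 \left( \frac{1}{r_{NO}} + \frac{1}{r_{HC}} - \frac{1}{r_{HO}} - \frac{1}{r_{NC}} \right)
Where: f = 332 kcal/mol (with distances in Å)
q_1 = 0.42, q_2 = 0.2 (in units of the electron charge)
**Hydrogen issue: the formula needs H positions, which most X-ray structures lack, so they have to be placed computationally
A hydrogen bond is identified if this energy E is less than -0.5 kcal/mol
Kabsch W, Sander C (1983). Dictionary of protein secondary structure: pattern recognition of
hydrogen-bonded and geometrical features. Biopolymers. 1983 Dec;22(12):2577-637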
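A minimal sketch of the DSSP energy test, assuming coordinates in Å for the donor N and H and the acceptor C and O. The function names and the test geometry are illustrative only.

```python
import math

F = 332.0      # kcal/mol, with distances in angstroms
Q1 = 0.42      # partial charge on C=O, in electron charges
Q2 = 0.20      # partial charge on N-H, in electron charges
CUTOFF = -0.5  # kcal/mol


def dssp_hbond_energy(n, h, c, o):
    """Coulomb energy between a donor N-H group and an acceptor C=O group."""
    r_no = math.dist(n, o)
    r_hc = math.dist(h, c)
    r_ho = math.dist(h, o)
    r_nc = math.dist(n, c)
    return F * Q1 * Q2 * (1.0 / r_no + 1.0 / r_hc - 1.0 / r_ho - 1.0 / r_nc)


def is_dssp_hbond(n, h, c, o):
    """A hydrogen bond is identified if E is less than -0.5 kcal/mol."""
    return dssp_hbond_energy(n, h, c, o) < CUTOFF


# Illustrative linear geometry: N-H = 1.0 A, H...O = 2.0 A, C=O = 1.23 A.
energy = dssp_hbond_energy((0, 0, 0), (1, 0, 0), (4.23, 0, 0), (3, 0, 0))
print(round(energy, 2), energy < CUTOFF)
```

Because the energy falls off smoothly with distance, this test has no hard geometric cutoff, unlike the angle-distance rule.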
Secondary Structure Assignments-DSSP
H: (α-helix), G: (3₁₀ helix), I: (π-helix), E: (β-strand),
B: (bridge), T: (turn), S: (bend), C (space): (coil)
H: a helix starts where two consecutive residues each have an i to i+4 hydrogen bond, and
ends likewise with two consecutive i-4 to i hydrogen bonds.
** Similarly for G (i to i+3) and I (i to i+5) assignments.
** The helix definition does not assign the edge residues that carry the initial and
final hydrogen bonds of the helix.
T: a single helix-type hydrogen bond (an isolated turn).
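The helix and turn rules above can be sketched as follows. This is a simplified example, assuming the hydrogen bonds have already been identified and are given as pairs (i, j), meaning C=O of residue i is bonded to N-H of residue j. The function name `assign_helix` is hypothetical, and the priority rules that DSSP applies between overlapping H, G and I helices are left out.

```python
def assign_helix(n_res, hbonds, n=4, code="H"):
    """Assign helix and turn codes from i -> i+n hydrogen bonds.

    hbonds: set of (i, j) pairs, C=O of residue i bonded to N-H of residue j.
    n = 4 gives H, n = 3 gives G, n = 5 gives I (pass the matching code).
    """
    # An n-turn exists at residue i if there is an H-bond from i to i+n.
    turn = [(i, i + n) in hbonds for i in range(n_res)]
    ss = ["-"] * n_res

    # Two consecutive n-turns at i-1 and i define a minimal helix
    # covering residues i .. i+n-1 (the edge residues are not assigned).
    for i in range(1, n_res):
        if turn[i - 1] and turn[i]:
            for k in range(i, min(i + n, n_res)):
                ss[k] = code

    # A single n-turn marks the residues it brackets as T, unless already helix.
    for i in range(n_res):
        if turn[i]:
            for k in range(i + 1, min(i + n, n_res)):
                if ss[k] == "-":
                    ss[k] = "T"
    return "".join(ss)


helix = assign_helix(12, {(1, 5), (2, 6), (3, 7)})
single_turn = assign_helix(12, {(1, 5)})
print(helix)
print(single_turn)
```

In the first example residues 1 and 7, which carry the initial and final hydrogen bonds, stay unassigned.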
Secondary Structure Assignments-DSSP
[Figure: DSSP hydrogen-bond patterns for the helices (H, G, I) and the turn (T)]
Secondary Structure Assignments-DSSP
Beta Structure Definitions:
• Kabsch and Sander define all beta structure in terms of 'bridges', which
are either parallel or antiparallel.
• Where two or more bridges of the same type are consecutive, the
structure is termed a ladder.
• Finally, overlapping ladders are amalgamated into sheets. Additional
complications arise because ladders may have discontinuities in them,
and ladders may consist of just a single bridge.
• These aspects of protein structure make the coding of beta structure
less straightforward than for helices.
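The bridge and ladder rules can be sketched as below. This is a simplified example, assuming the bridges have already been found from the hydrogen-bond patterns and are given as (i, j, kind) with kind 'P' for parallel or 'A' for antiparallel. It ignores discontinuities (bulges) and the merging of ladders into sheets; the function name `label_bridges` is hypothetical.

```python
def label_bridges(n_res, bridges):
    """Label residues E (in a ladder of two or more bridges) or B (single bridge)."""
    bset = set(bridges)
    ss = ["-"] * n_res
    for i, j, kind in bset:
        # Consecutive bridges: the partner index moves in the same direction
        # for parallel bridges and in the opposite direction for antiparallel.
        step = 1 if kind == "P" else -1
        in_ladder = ((i + 1, j + step, kind) in bset
                     or (i - 1, j - step, kind) in bset)
        code = "E" if in_ladder else "B"
        for r in (i, j):
            if ss[r] != "E":  # E takes priority over B
                ss[r] = code
    return "".join(ss)


hairpin = label_bridges(12, [(1, 10, "A"), (2, 9, "A"), (3, 8, "A")])
isolated = label_bridges(10, [(2, 7, "P")])
print(hairpin)
print(isolated)
```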
Secondary Structure Assignments-DSSP
H: (α-helix), G: (3₁₀ helix), I: (π-helix), E: (β-strand),
B: (bridge), T: (turn), S: (bend), C (space): (coil)
E: sheet, which is composed of overlapping ladders
B: ladders of length 1 (isolated bridges)
S: indicates a bend in the chain
DSSP Sample Output
Note: lowercase letters in the amino acid column mark Cys residues in SS-bridges
N-H-->O etc. hydrogen bonds:
e.g. -3,-1.4 means: if this residue is residue i, then N-H of i is H-bonded to
C=O of i-3 with an electrostatic H-bond energy of -1.4 kcal/mol.
There are two columns for each type of H-bond, to allow for bifurcated H-bonds.
[Figure: bifurcated hydrogen bonds]
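Reading one of these fields is simple. The sketch below assumes the field has already been cut out of the DSSP line as a string such as "-3,-1.4"; the helper names are hypothetical.

```python
def parse_hbond_field(field):
    """Split a DSSP H-bond field 'offset,energy' into (int, float)."""
    offset, energy = field.split(",")
    return int(offset), float(energy)


def hbond_partner(i, field, cutoff=-0.5):
    """Return the partner residue index, or None if the energy is too weak.

    The -0.5 kcal/mol cutoff is the DSSP hydrogen-bond threshold.
    """
    offset, energy = parse_hbond_field(field)
    return i + offset if energy < cutoff else None


offset, energy = parse_hbond_field("-3,-1.4")
partner = hbond_partner(20, "-3,-1.4")   # N-H of residue 20 to C=O of residue 17
weak = hbond_partner(20, "2,-0.1")       # above the cutoff, not counted
print(offset, energy, partner, weak)
```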
**TCO: cosine of the angle between C=O of residue I and C=O of residue I-1.
α-helices: near +1; β-sheets: near -1. (Not used for structure definition.)
**KAPPA: virtual bond angle (bend angle) defined by the three Cα atoms of residues I-2,
I, I+2. Used to define bend (structure code 'S').
**ALPHA: virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues
I-1, I, I+1, I+2. Used to define chirality (structure code '+' or '-').
**PHI, PSI: main-chain dihedral angles
**X, Y, Z coordinates of Cα
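A minimal sketch of the KAPPA calculation, assuming κ is the angle between the direction Cα(I-2) to Cα(I) and the direction Cα(I) to Cα(I+2), so that a straight chain gives 0°. The 70° bend threshold is the value from Kabsch and Sander (1983); it does not appear on the slide, and the function names are hypothetical.

```python
import math


def kappa_deg(ca_im2, ca_i, ca_ip2):
    """Bend angle at residue I from the C-alpha atoms of I-2, I and I+2."""
    v1 = [b - a for a, b in zip(ca_im2, ca_i)]   # CA(I-2) -> CA(I)
    v2 = [b - a for a, b in zip(ca_i, ca_ip2)]   # CA(I)   -> CA(I+2)
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_bend(ca_im2, ca_i, ca_ip2, threshold=70.0):
    """Structure code 'S' when kappa exceeds the threshold (assumed 70 degrees)."""
    return kappa_deg(ca_im2, ca_i, ca_ip2) > threshold


straight = kappa_deg((0, 0, 0), (1, 0, 0), (2, 0, 0))
right_angle = kappa_deg((0, 0, 0), (1, 0, 0), (1, 1, 0))
print(straight, right_angle, is_bend((0, 0, 0), (1, 0, 0), (1, 1, 0)))
```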
Kappa and Alpha-DSSP
[Figure: definition of the bend angle κ and the virtual torsion angle α]
SS Assignment of PDB Files using DSSP
[Figure: structure of 1ATP labeled by DSSP assignment: α, 3₁₀, β, turn]
An aside: ACC from DSSP
ACC = water-exposed surface (Å²). But what is the problem with doing this?
Relative solvent accessibility (range = 0-1):
R_{acc} = \frac{A_{res}^{protein}}{A_{res}^{ref}}
Reference = the tripeptide G-X-G, e.g. X = Lys
http://www.cmbi.ru.nl/hsspsoap/
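The normalization can be sketched as below. This is a minimal example; the reference area passed in is an illustrative number, not a published G-X-G value, and clipping the ratio at 1 is a design choice made here to keep the result in the 0-1 range.

```python
def relative_accessibility(acc, ref_acc):
    """Relative accessibility: ACC in the protein over ACC in the G-X-G reference.

    acc:     accessible surface of the residue in the protein (A^2), from DSSP ACC
    ref_acc: accessible surface of the same residue type in G-X-G (A^2)
    """
    if ref_acc <= 0:
        raise ValueError("reference accessibility must be positive")
    # Clip at 1.0 so that unusually exposed residues stay within the 0-1 range.
    return min(acc / ref_acc, 1.0)


# Illustrative numbers only (200.0 is NOT a published reference value).
buried = relative_accessibility(50.0, 200.0)
exposed = relative_accessibility(250.0, 200.0)
print(buried, exposed)
```

Dividing by a per-residue reference makes values comparable between residue types of different size, which raw ACC is not.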
STRIDE: secondary STRuctural IDEntification
STRIDE uses two criteria:
1. Hydrogen Bond Energy
2. dihedral angle probabilities
Knowledge-based protein secondary structure assignment.
Frishman D, Argos P. (1995). Proteins 23(4):566-79
Hydrogen Bond Identification-STRIDE
Empirical hydrogen bond calculation:
E_{hb} = E_{distance} \times E_{directional} = E_r \times E_t \times E_p

Distance-dependent term (r is the N-O distance):
E_r = \frac{C}{r^8} - \frac{D}{r^6}

Two angular-dependent terms:
E_p = \cos^2 t_p
E_t = \begin{cases}
  (0.9 + 0.1 \sin 2t_i) \cos t_o, & 0 < t_i \le 90^\circ \\
  K_1 (K_2 - \cos^2 t_i)^3 \cos t_o, & 90^\circ < t_i \le 110^\circ \\
  0, & 110^\circ < t_i
\end{cases}
where K_1 = 0.9 / \cos^6 110^\circ, K_2 = \cos^2 110^\circ
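A sketch of the STRIDE hydrogen-bond energy terms, assuming angles are given in degrees. The constants C and D are not given on the slide; here they are derived from an optimal energy E_m = -2.8 kcal/mol at r_m = 3.0 Å as C = -3 E_m r_m^8 and D = -4 E_m r_m^6, which should be treated as an assumption to check against Frishman and Argos (1995). The exponent 3 in the middle branch of E_t is what makes the three branches join continuously at 90° and 110°.

```python
import math

E_M = -2.8                       # kcal/mol, optimal H-bond energy (assumed)
R_M = 3.0                        # angstroms, optimal N-O distance (assumed)
C = -3.0 * E_M * R_M ** 8        # assumed form of the r^-8 coefficient
D = -4.0 * E_M * R_M ** 6        # assumed form of the r^-6 coefficient

COS110 = math.cos(math.radians(110.0))
K1 = 0.9 / COS110 ** 6
K2 = COS110 ** 2


def e_r(r):
    """Distance-dependent term; r is the N-O distance in angstroms."""
    return C / r ** 8 - D / r ** 6


def e_p(t_p_deg):
    """First angular term: cos^2(t_p)."""
    return math.cos(math.radians(t_p_deg)) ** 2


def e_t(t_i_deg, t_o_deg):
    """Second angular term, piecewise in t_i (t_i = 0 is treated as branch 1)."""
    cos_to = math.cos(math.radians(t_o_deg))
    t_i = math.radians(t_i_deg)
    if t_i_deg <= 90.0:
        return (0.9 + 0.1 * math.sin(2.0 * t_i)) * cos_to
    if t_i_deg <= 110.0:
        return K1 * (K2 - math.cos(t_i) ** 2) ** 3 * cos_to
    return 0.0


def e_hb(r, t_p_deg, t_i_deg, t_o_deg):
    """Total empirical energy: E_r * E_t * E_p."""
    return e_r(r) * e_t(t_i_deg, t_o_deg) * e_p(t_p_deg)


print(e_r(3.0), e_t(90.0, 0.0), e_t(110.0, 0.0))
```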
Dihedral Angle Probabilities-STRIDE
Torsion-angle propensities for alpha-helix and beta-sheet
Secondary Structure Assignments-STRIDE
Recognition of α-helices:
similar to DSSP: two consecutive hydrogen bonds between k and k+4, each satisfying
E_{hb}^{k,k+4} \left( 1 + W_1 + W_2 \frac{P_k + P_{k+4}}{2} \right) < T_1
For edge residues: the propensities P_k and P_{k+5} are tested against the thresholds T_2 and T_3.
Five parameters

Recognition of β-sheets:
similar to DSSP: two consecutive hydrogen bonds (hb1 and hb2), each satisfying
Antiparallel:
E_{hb1} (1 + W_1 + W_2 \, CONF) < T_{Antiparallel}
E_{hb2} (1 + W_1 + W_2 \, CONF) < T_{Antiparallel}
Parallel:
E_{hb1} (1 + W_1 + W_2 \, CONF) < T_{Parallel}
E_{hb2} (1 + W_1 + W_2 \, CONF) < T_{Parallel}
where CONF = (P_{Int1} + P_{Int2}) / 2, or P_{Int}
Four parameters

The parameters were optimized on a dataset with the authors’ assignments.
STRIDE Sample Output
http://webclu.bio.wzw.tum.de/cgi-bin/stride/stridecgi.py
DSSP vs STRIDE
[Figure: pie chart comparing DSSP and STRIDE on 226 chains, judged against the authors’ three-state assignments (helix, extended, coil)]
STRIDE better: 58%
DSSP better: 31%
Same assignment: 11%
** <14% difference for individual proteins
Knowledge-based protein secondary structure assignment.
Frishman D, Argos P. (1995). Proteins 23(4):566-79
DSSP vs STRIDE
Although DSSP is the older method and remains the most commonly used, the
original STRIDE paper reported that STRIDE gives a more satisfactory structural assignment
in at least 70% of cases. In particular, STRIDE was observed to correct for the
propensity of DSSP to assign shorter secondary structures than would be assigned by
an expert crystallographer, usually because of the minor local variations in structure that
are most common near the termini of secondary structure elements.
Knowledge-based protein secondary structure assignment. Frishman D, Argos P. (1995). Proteins 23(4):566-579.
Using a sliding-window method to smooth variations in assignment of single terminal
residues, current implementations of STRIDE and DSSP are reported to agree in up to
95.4% of cases.
Protein secondary structure assignment revisited: a detailed analysis of different assignment methods.
Martin J, et al (2005). BMC Structural Biology 5:17.
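Agreement figures like the one above come from per-residue comparison of two assignment strings. The sketch below shows one way to compute it, assuming the common reduction H, G, I to helix, E, B to extended, and everything else to coil. This reduction is an assumption, and the sliding-window smoothing mentioned above is not implemented.

```python
# Reduction of the eight DSSP-style codes to three states (assumed mapping).
THREE_STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def to_three_state(ss):
    """Map each code to H (helix), E (extended) or C (coil)."""
    return "".join(THREE_STATE.get(code, "C") for code in ss)


def percent_agreement(ss_a, ss_b):
    """Percentage of residues with the same three-state assignment."""
    if len(ss_a) != len(ss_b):
        raise ValueError("assignments must cover the same residues")
    if not ss_a:
        raise ValueError("empty assignment")
    a3, b3 = to_three_state(ss_a), to_three_state(ss_b)
    same = sum(x == y for x, y in zip(a3, b3))
    return 100.0 * same / len(a3)


# Illustrative strings, not real program output.
dssp_like = "CHHHHTCEEE"
stride_like = "CHHHHHCEEB"
agreement = percent_agreement(dssp_like, stride_like)
print(agreement)
```

The two strings differ only at the helix terminus, where one method extends the helix by a residue, which is the kind of disagreement described above.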
Both STRIDE and DSSP, among other common secondary structure assignment
methods, are believed to underpredict π-helices.
Occurrence, conformational features and amino acid propensities for the pi-helix. Fodje MN, Al-Karadaghi S
(2002). Protein Engineering 15(5):353-358.
Comparison of Methods for Secondary Structure Assignment
Other Programs:
DEFINE: uses a distance criterion between Cα atoms, which varies
slightly for each secondary structure type; allows modifications for
curvature
DSSP is widely used and a generally accepted method
Fourrier et al. BMC Bioinformatics 2004 5:58
Secondary Structure Annotations from PDB File
Crystallographers’ assignments are made by a variety of methods:
-- simple angle-distance hydrogen bonding pattern
-- more complex distance and geometric criteria
-- hydrogen bond pattern + main-chain dihedral angles
-- main-chain dihedral angles only
-- DSSP algorithm
-- a combination of several methods
-- visual inspection