A Quaternion-Based Definition of Protein
Use of quaternions in biomolecular
structure analysis
Robert M. Hanson, Daniel Kohler, and Steven Braun
Department of Chemistry, St. Olaf College
Northfield, MN 55057
August 19, 2009
238th ACS National Meeting
Washington, DC
Protein Secondary Structure
• My research interest is in describing, visualizing, and
quantifying protein and nucleic acid secondary structure,
particularly in relation to substrate binding.
The Jmol Molecular Visualization Project
• As the current principal developer and project manager
of the Jmol molecular visualization project, I get requests
periodically for new visualization ideas.
• Andy Hanson, Indiana University
Outline
• Reference Frames
• Quaternions
• Local Helical Axes
• Quaternion-Based “Straightness”
Visualization Can Drive Research
• The main point:
– Sometimes a good visualization can lead to
interesting findings in basic research that
otherwise simply would not be considered.
Reference Frames
• The basic idea is that each amino acid residue can be
assigned a “frame” that describes its position and
orientation in space.
Reference Frames
• The frame has both translational and rotational aspects.
Quaternion Frames
• A quaternion is a set of four numbers.
• Unit quaternions can describe rotations.
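As a minimal sketch of the statement above, the following shows one common way a unit quaternion encodes a rotation. The (w, x, y, z) component order and the helper names are assumptions made for illustration; they are not taken from the slides or from Jmol.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    half = angle / 2.0
    s = math.sin(half) / norm  # dividing by norm normalizes the axis
    return (math.cos(half), ax * s, ay * s, az * s)

def quat_multiply(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conjugate(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * v * conj(q)."""
    p = (0.0, v[0], v[1], v[2])
    _, x, y, z = quat_multiply(quat_multiply(q, p), quat_conjugate(q))
    return (x, y, z)

# A 90-degree turn about z carries the x axis onto the y axis.
q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
rotated = rotate(q, (1.0, 0.0, 0.0))
```

The four numbers are the cosine of half the rotation angle plus the rotation axis scaled by the sine of half the angle, which is why the quaternion has unit length.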
Quaternion Frames
• The choice of frame is (seemingly) arbitrary.
[Figure: frames labeled “P”, “C”, and “N”.]
Local Helical Axes
• The quaternion difference describes how one gets from
one frame to the next. Its rotation axis defines the local helical axis.
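The rotational part of this idea can be sketched as follows. This is an illustration under stated assumptions, not the Jmol implementation: the difference is taken as dq = q_next * conj(q_prev), the component order is (w, x, y, z), and the two frames are hypothetical. Locating the axis in space would also need the translation between residues, which is omitted here.

```python
import math

def quat_multiply(a, b):
    """Hamilton product a * b for quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_difference(q_prev, q_next):
    """Relative rotation dq such that dq * q_prev == q_next (unit quaternions)."""
    return quat_multiply(q_next, quat_conjugate(q_prev))

def axis_angle(dq):
    """Return (unit axis, angle in radians) of the rotation encoded by dq."""
    w, x, y, z = dq
    if w < 0:  # q and -q encode the same rotation; keep the angle within [0, pi]
        w, x, y, z = -w, -x, -y, -z
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-12:
        return (0.0, 0.0, 1.0), 0.0  # no rotation: axis undefined, placeholder returned
    return (x / s, y / s, z / s), 2.0 * math.atan2(s, w)

# Two hypothetical residue frames related by a 100-degree turn about z,
# roughly the per-residue turn of an ideal alpha helix.
half = math.radians(100.0) / 2.0
q1 = (1.0, 0.0, 0.0, 0.0)
q2 = (math.cos(half), 0.0, 0.0, math.sin(half))
axis, angle = axis_angle(quat_difference(q1, q2))
```

The returned axis gives the direction of the local helical axis and the angle gives the twist per residue.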
Local Helical Axes
• Strings of local helical axes identify actual “helices.”
Local Helical Axes
• Sheet strands are technically helical as well.
Local Helical Axes
Quaternion Difference Map
Straightness
• The quaternion differences can be used to
unambiguously define how “straight” a helix is.
Quaternion-Based Straightness
• The dot product of two vectors expresses how well they
are aligned. This suggests a definition of “straightness”
based on quaternion dot products.
$$ s(i) = 1 - \frac{\arccos |dq_{i-1} \cdot dq_i|}{\pi/2} $$
Quaternion-Based Straightness
• The “arccos” business here just allows us to turn the dot
product into a distance measure – on the four-dimensional hypersphere!
$$ s(i) = 1 - \frac{\arccos |dq_{i-1} \cdot dq_i|}{\pi/2} $$
Quaternion-Based Straightness
• In fact, in quaternion algebra, the distance between two
quaternions can be expressed in terms of the quaternion
second derivative:
$$ s(i) = 1 - \frac{\arccos |dq_{i-1} \cdot dq_i|}{\pi/2} $$
$$ s(i) = 1 - \frac{|\theta_2 / 2|}{\pi/2} $$
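The step from the dot product to an angle can be sketched as below. The notation is an assumption made for this sketch: \theta_2 is taken to be the rotation angle of the second difference dq_i \overline{dq_{i-1}}, restricted to [-\pi, \pi], and the overline denotes quaternion conjugation.

```latex
% For unit quaternions a = (a_0, \mathbf{a}) and b = (b_0, \mathbf{b}),
% the four-dimensional dot product is the real part of a \bar{b}:
\[
  a \cdot b = a_0 b_0 + \mathbf{a} \cdot \mathbf{b} = \operatorname{Re}\left( a \, \bar{b} \right).
\]
% With a = dq_i and b = dq_{i-1}, the product dq_i \overline{dq_{i-1}} is a unit
% quaternion for a rotation by \theta_2, so its real part is \cos(\theta_2 / 2):
\[
  \left| dq_{i-1} \cdot dq_i \right| = \left| \cos\frac{\theta_2}{2} \right|.
\]
% For \theta_2 \in [-\pi, \pi] the cosine is non-negative, hence
\[
  \arccos \left| dq_{i-1} \cdot dq_i \right| = \left| \frac{\theta_2}{2} \right|.
\]
```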
Quaternion-Based Straightness
• So our definition of straightness is just a simple
quaternion measure:
$$ s(i) = 1 - \frac{|\theta_2|}{\pi} $$
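A minimal numerical sketch of this measure follows, assuming that dq_i is formed as q_{i+1} * conj(q_i) and that quaternions are stored as (w, x, y, z). The helper names and the test frames are hypothetical and are not taken from Jmol.

```python
import math

def quat_multiply(a, b):
    """Hamilton product a * b for quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def straightness(q_prev, q_curr, q_next):
    """Straightness at the middle frame of three consecutive unit-quaternion frames."""
    dq_prev = quat_multiply(q_curr, quat_conjugate(q_prev))
    dq_curr = quat_multiply(q_next, quat_conjugate(q_curr))
    dot = sum(a * b for a, b in zip(dq_prev, dq_curr))
    dot = min(1.0, abs(dot))  # guard against rounding slightly above 1
    return 1.0 - math.acos(dot) / (math.pi / 2.0)

def z_turn(degrees):
    """Frame rotated by `degrees` about the z axis (illustrative helper)."""
    half = math.radians(degrees) / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))

# Constant 100-degree steps about one axis: a perfectly regular helix.
regular = straightness(z_turn(0.0), z_turn(100.0), z_turn(200.0))

# Same first step, but the second step is only 40 degrees.
uneven = straightness(z_turn(0.0), z_turn(100.0), z_turn(140.0))
```

In the uneven case the two steps differ by 60 degrees about the same axis, so the value is 1 - 60/180 = 2/3 by either form of the definition, while the regular case gives 1.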
Quaternion-Based Straightness
• select *; color straightness
Quaternion-Based Straightness
• select not helix and not sheet and straightness > 0.85;
color straightness
Quaternion-Based P Straightness
• We have found several interesting aspects of
straightness. Among them are two relationships
to well-known “Ramachandran angles.”
For P-straightness:
where
[Figure 5. Correlation of quaternion- and Ramachandran-based P-straightness for protein 2CQO. R² = 0.9997.]
Quaternion-Based C Straightness
• We have found several interesting aspects of
straightness. Among them are two relationships to well-known “Ramachandran angles.”
For C-straightness:
$$ s(i) = 1 - \frac{|\theta_2|}{\pi} $$
and
2 (i 1 i i i 1 )[ , ]
[Figure 7. Correlation between quaternion- and Ramachandran-based C-straightness for protein 2CQO. R² ≈ 1.]
Quaternion-Based Straightness
For the entire PDB database, straightness correlates well with
DSSP-calculated secondary structure.
                       Total average         Total average
                       C-straightness        P-straightness
Helix residues         0.8526, σ = 0.2234    0.8660, σ = 0.1742
Sheet residues         0.7697, σ = 0.2210    0.7326, σ = 0.2181
Unstructured residues  0.3874, σ = 0.4310    0.3564, σ = 0.4136
[Table 1. Summarizes overall average C-straightness
and P-straightness measures for all within(helix),
within(sheet), and (protein and not helix and not
sheet) residues in the Protein Data Bank.]
Quaternion-Based Straightness
Anomalies – very high straightness for “unstructured” groups
PDB ID   C-straightness   P-straightness   Description
2HI5     0.9528           0.9210           Aberrant bonds between carbonyl oxygen and peptide nitrogen atoms
1NH4     0.9517           0.9440           Aberrant bonds between carbonyl oxygen atoms
1KIL     0.9142           0.9102           Helix designation missing
3FX0     0.9037           0.8086           Problem with helix connection designations
3HEZ     0.8444           Not calculable   Disconnected helix fragments
[Table 2. Some structures where overall average straightness is high but
labels in the PDB file result in the misassignment of secondary structure.
In this way, straightness can check for errors in PDB files.]
Twenty Common Amino Acids
Amino acid   Total average    Amino acid   Total average
             C-straightness                C-straightness
ILE          0.7325           CYS          0.6779
LEU          0.7257           TYR          0.6727
VAL          0.7215           LYS          0.6695
ALA          0.7192           THR          0.6500
MET          0.7149           HIS          0.6492
GLU          0.7000           SER          0.6321
GLN          0.6967           ASP          0.6270
TRP          0.6860           ASN          0.6161
ARG          0.6839           PRO          0.5444
PHE          0.6802           GLY          0.5315
Visualization Can Drive Research
• The bottom line:
– Sometimes a good visualization can lead to
interesting findings in basic research that
otherwise simply would not be considered.
– Quaternion-based straightness offers a simple
quantitative measure of biomolecular
structure.
Visualization Can Drive Research
• Future directions:
– Natural extension to nucleic acids
– Define “motifs” based on quaternions
– Extension to molecular dynamics calculations
and ligand binding
Acknowledgments
• Andrew Hanson, Indiana University
• Howard Hughes Medical Institute
• Jmol user community
http://Jmol.sourceforge.net