Transcript Chapter 9a

Chapter 9 Superposition and dynamic
programming
Most methods for comparing structures use some sorts of
superposition and dynamic programming
We have two structures A and B with elements in a given order
A = A1 A2 ... Am and B = B1 B2 ... Bn
An equivalence for (A, B) is a set of pairs of elements
(Ai1, Bj1) (Ai2, Bj2) ... (Air, Bjr)
If i1<i2<...<ir and
j1<j2<...<jr then the set of pairs is an alignment
We shall look at methods for finding ”good”
- equivalences
- alignments
Chapter 9 Superposition and Dynamic Programming
1
Superposition
• Determine how ”good” an equivalence is:
• Put the structures on top of each other, and see how good the
equivalent pairs fit each other in the space, when translation and
rotation are allowed
• If the geometry of the structures are not changed, this is called rigid
body superposition
• Let B be fixed in space
• Transform A with a transformation T over B
• We can measure how good T is by giving a score to the resulting
superposition by measuring the root mean square deviation (RMSD)
between the equivalent pairs of (transformed ) A and B
• Low RMSD values are best, zero indicates exact equality between
the (sub)structures
Chapter 9 Superposition and Dynamic Programming
2
Root Mean Square Deviation
Two different measures are mainly used for scoring the
transformation (superposition)
• Coordinate RMSD
• Distance RMSD
Chapter 9 Superposition and Dynamic Programming
3
Coordinate RMSD
• Let (a1,b1) ,..., (ar,br) be the coordinate set (three values each) of the
equivalent elements of the equivalence E
• The problem is then to find a transformation T for A which minimizes
the coordinate root mean square deviation, that is
RMSD C ( E )  min
T
r
1
r
w
i 1
 w (Ta
i 1
i
2

b
)
i
i
i
where wi are weights corresponding to each pair (ai,bi), and often
set to one.
• A transformation can be performed by
– A translation (three distances)
– A rotation (three angles, around the x-, y-, z-axis)
– The rotation can also be performed around one axis, the direction of
which has to be calculated for each rotation
Chapter 9 Superposition and Dynamic Programming
4
Coordinate RMSD, cont’
•
A transformation for the minimum RMSD can be found by
1. Shift the centroids (geometrical centres) of each structure to the origin
of a common coordinate system
2. Find the rotation of A that minimizes the RMSDC
•
A rotation around the origin can be described by an orthogonal
matrix R3,3 (3D space) with determinant equal to 1
– There exist equations describing the connections between the angles
(3) and the values of the matrix (9)
•
A matrix is orthogonal if
– The scalar product of any two different columns is 0
– The scalar product of any column with itself is 1
•
The deteminant of a 3 x 3 matrix R with elements {rij} is calculated
as
r11r22r33 + r12r23r31 + r13r21r32
- r11r23r32 - r12r21r33 - r13r22r31
Chapter 9 Superposition and Dynamic Programming
5
Coordinate RMSD, cont’
•
The orthogonal requirement is for assuring that the distances
between the points are not changed (ridig body)
•
The determinant requirement is for not ”reflecting” (”mirrorring”) the
structure
Chapter 9 Superposition and Dynamic Programming
6
Example
• A point (1, -1, 1) are to be rotated by an orthogonal matrix with
deteminant equal to one
c1
 12
{ 23
0
c2
c3
 43
 14
3
4
3
2
3
4
1
2
1
}{ 1}
1
• Show that the matrix is orthogonal
• Show that the determinant is one
• Find the new coordinates
Chapter 9 Superposition and Dynamic Programming
7
Coordinate RMSD, cont’
• The formula can therefore be described by a rotation matrix R
and a translation vector t, and we search for a pair (R,t) which
minimizes the expression (assuming wi=1)
r
2
(
R
a

t

b
)
 i
i
i 1
• Since t is found by moving to common centroids, the problem can
be formulated as finding the orthogonal matrix, with determinant
one, that minimizes the function
r
 ( Ra
i 1
i
 bi )2
where (a1,b1) ,..., (ar,br) are now the coordinates after the
structures are moved to common origin
• Algorithms exist for finding such a matrix
Chapter 9 Superposition and Dynamic Programming
8
Distance RMSD
• The distance score method measures how equal corresponding
pairwise distances in the two structures are
• It alleviates the need for finding a translation and rotation of one of
the structures, and is therefore faster
RMSD D ( E )  1r
r
r
A
B 2
(
d

d
 ij ij )
i 1 j 1
where dijA is the spatial distance between the elements of A in pairs i and
j of the equivalence
• However, it has a (serious) weakness, it is invariant under reflection
• The two measures are experimentally shown to have a close to linear
relation
Chapter 9 Superposition and Dynamic Programming
9
Using RMSD as scoring of structure similarities
• The problem of pairwise structure comparison is often to find
equivalences with low RMSD values
• Several quite different equivalences with similar scores might be
found, which one is ”correct” is often not easy to determine
• However, always consider the number of elements in the
equivalences
– for random comparisons the expected RMSD values seems to be
proportional to the square root of the number of equivalent residues
• Different measures can then be used for evaluating how well two
structures can be superposed
Chapter 9 Superposition and Dynamic Programming
10
Using RMSD as scoring of structure similarities
Chapter 9 Superposition and Dynamic Programming
11
Comparison methods using superpositon and dynamic
programming
• Dynamic programming cannot be directly used for structure
comparison
• Two different approaches are used for combining superposition and
dynamic programming
– Alternating superposition and dynamic programming
– ”Contemporary” superposition and dynamic programming
Chapter 9 Superposition and Dynamic Programming
12
Alternating superposition and dynamic
programming
p:=0, Eo := an initial equivalence (sometimes subalignment required)
iter
find the transformation Tp for min RMSD, using the r pairs in Ep
use Tp on the whole structure A, giving A*
calculate a scoring matrix Rij as function of the distance between
residue i in A* and residue j in B
find the best alignment P with use of R (use dynamic
programming)
p:=p+1
Ep:= the r pairs in P with shortest distances in R
until Ep=Ep-1 or p<pmax
P now contains a found equivalence (or alignment)
Chapter 9 Superposition and Dynamic Programming
13
Alternating superposition and dynamic
programming, cont’
The scoring matrix can have more components, in addition to the
distance
• Sequence component, how similar are the amino acids?
• How similar are the local environments?
Number of pairs in each equivalence
• Instead of a fix number for r, r can vary with the number of
”overlapping” pairs
Chapter 9 Superposition and Dynamic Programming
14