Find the optimal alignment?
Optimal Alignment
• Find the highest number of atoms aligned with
the lowest RMSD (Root Mean Squared
Deviation)
• Find a balance between local regions with very
good alignments and overall alignment
Geometric Matching task =
Geometric Pattern Discovery
Structure Comparison Requirements
1. Which atom in structure A corresponds to
which atom in structure B?
Answer: Sequence alignments
THESESENTENCESALIGN--NICELY
||| ||| || || ||||| ||||||
THE-SEQ-EN-CE-ALIGNEDNICELY
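A minimal sketch of how a gapped sequence alignment can be turned into the residue correspondence asked for above. The function name `aligned_pairs` and the toy sequences are illustrative assumptions, not part of the slides.

```python
def aligned_pairs(row_a: str, row_b: str, gap: str = "-") -> list[tuple[int, int]]:
    """Return (i, j) residue-index pairs for columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = []
    i = j = 0  # running indices into the ungapped sequences A and B
    for ca, cb in zip(row_a, row_b):
        if ca != gap and cb != gap:
            pairs.append((i, j))
        if ca != gap:
            i += 1
        if cb != gap:
            j += 1
    return pairs


# Toy alignment (hypothetical sequences)
print(aligned_pairs("ACD-FG", "A-DEFG"))  # [(0, 0), (2, 1), (3, 3), (4, 4)]
```

Each returned pair says that residue i of structure A should be compared with residue j of structure B; columns containing a gap contribute no pair.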
Structure Comparison Requirements
2. What are the locations of atoms in the structures?
Answer: PDB-files (Dihedral angles, bond lengths …)
Chain 1AI9:A, bond lengths (Å)

bond            total #   average   stddev   min    at        max    at
C-N             180       1.32      0.019    1.27   VAL 6     1.38   ASN 123
C-N (PRO)       11        1.33      0.019    1.29   PRO 68    1.36   PRO 160
C-O             192       1.25      0.022    1.19   ASN 124   1.33   GLN 165
CA-C            184       1.52      0.022    1.47   LEU 121   1.58   ILE 8
CA-C (GLY)      8         1.54      0.016    1.52   GLY 20    1.57   GLY 55
CA-CB           133       1.53      0.032    1.4    GLU 174   1.62   ASP 105
CA-CB (ALA)     7         1.53      0.019    1.5    ALA 93    1.56   ALA 16
CA-CB (I,T,V)   44        1.56      0.026    1.5    VAL 6     1.61   THR 147
N-CA            173       1.47      0.023    1.42   ASP 71    1.54   TRP 189
N-CA (GLY)      8         1.47      0.013    1.45   GLY 20    1.49   GLY 180
N-CA (PRO)      11        1.47      0.02     1.44   PRO 15    1.5    PRO 152

Source: http://www.rcsb.org/pdb
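Bond lengths like those listed for chain 1AI9:A can be derived from the atomic coordinates in a PDB file. The sketch below reads the fixed-column ATOM records (atom name in columns 13-16, chain in column 22, residue number in columns 23-26, x/y/z in columns 31-54) and measures two bonds. The three records are made-up illustrative coordinates, not taken from 1AI9, and the helper names are assumptions.

```python
import math

# Made-up ATOM records for illustration only (not from 1AI9).
SAMPLE = """\
ATOM      1  N   GLY A   1       0.000   0.000   0.000
ATOM      2  CA  GLY A   1       1.458   0.000   0.000
ATOM      3  C   GLY A   1       2.009   1.420   0.000
"""


def parse_atoms(text):
    """Map (chain, residue number, atom name) -> (x, y, z) from ATOM records."""
    atoms = {}
    for line in text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        chain = line[21]
        resseq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms[(chain, resseq, name)] = xyz
    return atoms


def bond_length(atoms, chain, resseq, a, b):
    """Distance in angstroms between atoms a and b of one residue."""
    return math.dist(atoms[(chain, resseq, a)], atoms[(chain, resseq, b)])


atoms = parse_atoms(SAMPLE)
print(round(bond_length(atoms, "A", 1, "N", "CA"), 3))  # 1.458
print(round(bond_length(atoms, "A", 1, "CA", "C"), 3))  # 1.523
```

Collecting such distances over every residue of a chain, grouped by bond type, would give the totals, averages and extremes of the kind tabulated above.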
Structure Comparison Requirements
3. Methods to superimpose structures
Answer: Translation and Rotation
Translation (shift by d along x):
(x1, y1, z1) → (x1 + d, y1, z1)
(x2, y2, z2) → (x2 + d, y2, z2)
(x3, y3, z3) → (x3 + d, y3, z3)
[Figure: a structure before and after translation and rotation]
Transformations
Translation
$x' = x + t$
Translation and Rotation
Rigid Motion (Euclidean Trans.)
$x' = Rx + t$
Translation, Rotation + Scaling
$x' = s(Rx + t)$
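A small sketch of the three transformations applied to a single 3-D point. The rotation is taken about the z axis purely for illustration; `rotation_z` and `transform` are hypothetical helper names.

```python
import math


def rotation_z(theta):
    """3x3 matrix for a rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def transform(x, R=IDENTITY, t=(0.0, 0.0, 0.0), s=1.0):
    """Return s * (R x + t); the defaults reduce this to the identity map."""
    rx = [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]
    return [s * (rx[i] + t[i]) for i in range(3)]


x = [1.0, 0.0, 0.0]
translated = transform(x, t=(0.0, 0.0, 1.0))                      # x + t
rigid = transform(x, R=rotation_z(math.pi / 2), t=(0.0, 0.0, 1.0))  # Rx + t
scaled = transform(x, R=rotation_z(math.pi / 2), t=(0.0, 0.0, 1.0), s=2.0)
print(translated, rigid, scaled)
```

A rigid motion preserves all inter-point distances, which is why it is the transformation class used for superimposing structures; scaling does not.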
Inexact Alignment.
Simple case – two closely related proteins with the same number of
amino acids.
Question: how to measure
an alignment error?
Distance Functions
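Two point sets: $A = \{a_i\},\ i = 1 \dots n$ and $B = \{b_j\},\ j = 1 \dots m$
• Pairwise Correspondence:
$(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
(1) Exact Matching: $\|a_{k_i} - b_{t_i}\| = 0$ for all $i$
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$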
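The three distance functions can be sketched as follows for a given list of corresponding point pairs. The function names, the tolerance in `exact_match`, and the toy coordinates are illustrative assumptions.

```python
import math


def exact_match(pairs, tol=0.0):
    """True if every corresponding pair coincides (within tol)."""
    return all(math.dist(a, b) <= tol for a, b in pairs)


def bottleneck(pairs):
    """Largest distance over all corresponding pairs."""
    return max(math.dist(a, b) for a, b in pairs)


def rmsd(pairs):
    """Root mean square distance over all corresponding pairs."""
    return math.sqrt(sum(math.dist(a, b) ** 2 for a, b in pairs) / len(pairs))


# Toy correspondence with pair distances 0, 3 and 4
pairs = [((0, 0, 0), (0, 0, 0)),
         ((1, 0, 0), (1, 0, 3)),
         ((0, 1, 0), (0, 5, 0))]
print(exact_match(pairs))     # False
print(bottleneck(pairs))      # 4.0
print(round(rmsd(pairs), 3))  # 2.887
```

The bottleneck measure is dominated by the single worst pair, whereas RMSD averages over all pairs, so one outlier affects it less.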
Superposition - best least squares
(RMSD – Root Mean Square Deviation)
Given two sets of 3-D points:
$P = \{p_i\},\ Q = \{q_i\},\ i = 1, \dots, n$;
$\mathrm{rmsd}(P, Q) = \sqrt{\sum_i |p_i - q_i|^2 / n}$
Find a 3-D rigid transformation $T^*$ such that:
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i |T(p_i) - q_i|^2 / n}$
A closed form solution exists for this task.
It can be computed in O(n) time.
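The slides do not say which closed-form solution is meant. One well-known option is Horn's unit-quaternion method, sketched below under that assumption. The optimal rotation is the top eigenvector of a 4x4 matrix built from sums over the n point pairs (the O(n) part). To stay dependency-free, this sketch finds that eigenvector by power iteration, which is a simplification; a library eigen-solver or the SVD-based Kabsch variant would normally be used. All function names are hypothetical.

```python
import math


def centroid(pts):
    return [sum(p[k] for p in pts) / len(pts) for k in range(3)]


def best_rotation(P, Q, iters=200):
    """Rotation R minimizing sum |R p_i - q_i|^2 for centered point lists P, Q."""
    # 3x3 correlation sums S[a][b] = sum_i p_i[a] * q_i[b]: one pass over the points.
    S = [[sum(p[a] * q[b] for p, q in zip(P, Q)) for b in range(3)] for a in range(3)]
    (xx, xy, xz), (yx, yy, yz), (zx, zy, zz) = S
    # Horn's symmetric 4x4 matrix; its top eigenvector is the optimal unit quaternion.
    N = [[xx + yy + zz, yz - zy, zx - xz, xy - yx],
         [yz - zy, xx - yy - zz, xy + yx, zx + xz],
         [zx - xz, xy + yx, -xx + yy - zz, yz + zy],
         [xy - yx, zx + xz, yz + zy, -xx - yy + zz]]
    # Shift by the Frobenius norm so no eigenvalue is negative, then power-iterate.
    shift = math.sqrt(sum(v * v for row in N for v in row))
    q = [0.5, 0.5, 0.5, 0.5]  # generic start; could fail if orthogonal to the answer
    for _ in range(iters):
        q = [sum(N[i][j] * q[j] for j in range(4)) + shift * q[i] for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    w, x, y, z = q
    return [[w*w + x*x - y*y - z*z, 2*(x*y - w*z), 2*(x*z + w*y)],
            [2*(x*y + w*z), w*w - x*x + y*y - z*z, 2*(y*z - w*x)],
            [2*(x*z - w*y), 2*(y*z + w*x), w*w - x*x - y*y + z*z]]


def superpose(P, Q):
    """Return (R, t, rmsd) such that R p + t is the least-squares fit of P onto Q."""
    cp, cq = centroid(P), centroid(Q)
    Pc = [[p[k] - cp[k] for k in range(3)] for p in P]
    Qc = [[q[k] - cq[k] for k in range(3)] for q in Q]
    R = best_rotation(Pc, Qc)
    t = [cq[i] - sum(R[i][k] * cp[k] for k in range(3)) for i in range(3)]
    moved = [[sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)] for p in P]
    err = math.sqrt(sum(math.dist(m, q) ** 2 for m, q in zip(moved, Q)) / len(P))
    return R, t, err


# Q is P rotated by 90 degrees about z and shifted by (5, -2, 1)
P = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
Q = [(5, -2, 1), (5, -1, 1), (3, -2, 1), (5, -2, 4)]
R, t, err = superpose(P, Q)
print([round(v, 6) for v in t], err < 1e-6)
```

Because the two toy sets differ only by a rigid motion, the residual RMSD is numerically zero; for real structures the returned value is the minimal RMSD for the given correspondence.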
RMSD
Unit of RMSD: e.g. ångströms (Å)
- identical structures => RMSD = 0
- similar structures => RMSD is small (1 – 3 Å)
- distant structures => RMSD > 3 Å
Pitfalls of RMSD
• all atoms are treated equally
(e.g. residues on the surface have a higher degree
of freedom than those in the core)
• best alignment does not always mean
minimal RMSD
• significance of RMSD is size dependent
Correspondence is Unknown
Given two configurations of points in
three-dimensional space,
find those rotations and translations of one of the
point sets which produce “large” superimpositions
of corresponding 3-D points.
Structure Alignment (Straightforward Algorithm)
• For each pair of triplets, one from each
molecule, that define ‘almost’ congruent
triangles, compute the rigid transformation
that superimposes them.
• Count the number of point pairs that are
‘almost’ superimposed, and sort the
hypotheses by this number.
A 3-D reference frame can be uniquely
defined by the ordered vertices (p1, p2, p3) of a non-degenerate triangle.
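A brute-force sketch of the straightforward algorithm, using triangle-defined reference frames. Expressing both point sets in the local frame of their own triangle is equivalent to applying the rigid transformation that superimposes the two triangles. The tolerance `eps`, the greedy pair counting and all names are illustrative assumptions, and the toy sizes are tiny because the number of triplet pairs grows very quickly.

```python
import math
from itertools import combinations, permutations


def sub(a, b):
    return [a[k] - b[k] for k in range(3)]


def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def local_coords(points, tri):
    """Coordinates of points in the frame of ordered triangle tri (None if degenerate)."""
    p1, p2, p3 = tri
    u = sub(p2, p1)
    n = cross(u, sub(p3, p1))
    if dot(n, n) < 1e-12:  # collinear vertices define no frame
        return None
    ex = [v / math.sqrt(dot(u, u)) for v in u]
    ez = [v / math.sqrt(dot(n, n)) for v in n]
    ey = cross(ez, ex)  # completes a right-handed orthonormal frame
    return [[dot(sub(p, p1), e) for e in (ex, ey, ez)] for p in points]


def sides(tri):
    a, b, c = tri
    return (math.dist(a, b), math.dist(b, c), math.dist(c, a))


def count_matches(la, lb, eps):
    """Greedy count of point pairs closer than eps once both sets share a frame."""
    used, count = set(), 0
    for a in la:
        for j, b in enumerate(lb):
            if j not in used and math.dist(a, b) < eps:
                used.add(j)
                count += 1
                break
    return count


def hypotheses(A, B, eps=0.1):
    """Return [(match count, A triplet indices, B triplet indices)], best first."""
    found = []
    for ta in combinations(range(len(A)), 3):
        tri_a = [A[i] for i in ta]
        la = local_coords(A, tri_a)
        if la is None:
            continue
        for tb in permutations(range(len(B)), 3):
            tri_b = [B[j] for j in tb]
            if any(abs(x - y) > eps for x, y in zip(sides(tri_a), sides(tri_b))):
                continue  # triangles are not 'almost' congruent
            lb = local_coords(B, tri_b)
            if lb is None:
                continue
            found.append((count_matches(la, lb, eps), ta, tb))
    found.sort(key=lambda h: h[0], reverse=True)
    return found


A = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3), (1, 1, 1)]
# B: A rotated 90 degrees about z and shifted by (5, -2, 1), reordered, plus one extra point
B = [(4, -1, 2), (5, -2, 4), (3, -2, 1), (5, -1, 1), (5, -2, 1), (9, 9, 9)]
print(hypotheses(A, B)[0][0])  # 5
```

Since both frames are right-handed, a mirror image of a structure is not reported as a match.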
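Improvement: BLAST idea. Detect short
similar fragments, then extend them as far as
possible.
[Figure: fragment $a_{i-1}, a_i, a_{i+1}$ of one structure paired with fragment $b_{j-1}, b_j, b_{j+1}$ of the other; axis labels k, k+l-1, t, t+l-1]
Extend while: $\mathrm{rmsd}(F_{ij}(k)) < \varepsilon$.
Complexity: $O(n^2)$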
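A sketch of the seed-and-extend idea. To keep it short and free of a superposition step, the fragment error here is a distance RMSD (comparison of intra-fragment distances), which is a stand-in for the superposition RMSD named on the slide and cannot distinguish mirror images. The seed length, threshold and function names are assumptions. The O(n²) figure corresponds to the number of (i, j) seed pairs; this naive version recomputes the error at every extension step and so does more work than that.

```python
import math


def drmsd(fa, fb):
    """Distance RMSD between two equal-length fragments (no superposition needed)."""
    k = len(fa)
    diffs = [(math.dist(fa[u], fa[v]) - math.dist(fb[u], fb[v])) ** 2
             for u in range(k) for v in range(u + 1, k)]
    return math.sqrt(sum(diffs) / len(diffs))


def extend(A, B, i, j, seed=3, eps=0.5):
    """Length of the fragment pair grown from (i, j); 0 if the seed itself fails."""
    k = seed
    if i + k > len(A) or j + k > len(B) or drmsd(A[i:i + k], B[j:j + k]) >= eps:
        return 0
    # Add one residue at a time while the longer fragment pair stays below eps.
    while (i + k < len(A) and j + k < len(B)
           and drmsd(A[i:i + k + 1], B[j:j + k + 1]) < eps):
        k += 1
    return k


def longest_fragment_pair(A, B, seed=3, eps=0.5):
    """Try every seed position pair; return (length, i, j) of the longest extension."""
    best = (0, 0, 0)
    for i in range(len(A) - seed + 1):
        for j in range(len(B) - seed + 1):
            k = extend(A, B, i, j, seed, eps)
            if k > best[0]:
                best = (k, i, j)
    return best


A = [(0, 0, 0), (1.5, 0, 0), (1.5, 2, 0), (1.5, 2, 2.5), (3, 2.5, 2.5), (3, 4, 4)]
# B: first four points of A shifted by 10 along x, followed by an unrelated point
B = [(10, 0, 0), (11.5, 0, 0), (11.5, 2, 0), (11.5, 2, 2.5), (20, 20, 20)]
print(longest_fragment_pair(A, B))  # (4, 0, 0)
```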
Protein zinc finger (4znf)
Superimposed 3znf and 4znf
30 CA atoms RMS = 0.70Å
248 atoms RMS = 1.42Å
Lys30
Superimposed 3znf and 4znf backbones
30 CA atoms RMS = 0.70Å
248 atoms RMS = 1.42Å
Lys30