Transcript LCS_Score1

Longest Common
Subsequence (LCS) - Scoring
Dr. Nancy Warter-Perez
June 25, 2003
Hydrophobicity Sliding Window
Program Using Functions (1)
#include <iostream>
#include <string>
using namespace std;
double hydro[25] = {1.8,0,2.5,-3.5,-3.5,2.8,-0.4,-3.2,4.5,0,-3.9,3.8,1.9,-3.5,0,
-1.6,-3.5,-4.5,-0.8,-0.7,0,4.2,-0.9,0,-1.3};
void compute_hydro(string seq, int ws);
void main () {
string seq;
int ws;
cout << "This program will compute the hydrophobicity of an sequence of
amino acids.\n”;
cout << "Please enter the sequence: "<< flush;
cin >> seq;
cout << "Please enter the window size: "<< flush;
cin >> ws;
compute_hydro(seq, ws);
}
June 25, 2003
LCS Scoring
2
Hydrophobicity Sliding Window
Program Using Functions (2)
void compute_hydro(string seq, int ws) {
int i; double sum = 0;
cout << "\n\nThe hydrophocity values are:" << endl;
for(i = 0; i <= seq.size(); i++)
if((seq.data()[i] >= 'a') && (seq.data()[i] <= 'z'))
seq.at(i) = seq.data()[i] - 32;
for(i = 0; i < ws; i++) {
sum += hydro[seq.data()[i] - 'A'];
}
for(i = 1; i <= seq.size() - ws; i++) {
cout << "Hydrophocity value:\t" << sum/ws << endl;
sum = sum - hydro[seq.data()[i-1] - 'A'] + hydro[seq.data()[i+ws-1] - 'A'];
}
cout << "Hydrophocity value:\t" << sum/ws << endl;
}
June 25, 2003
LCS Scoring
3
Reference

Computational Molecular Biology – An Algorithmic
Approach, Pavel Pevzner
June 25, 2003
LCS Scoring
4
Longest Common
Subsequence (LCS) Problem


Can have insertion and deletions but no
substitutions (no mismatches)
Ex: V: ATCTGAT
W: TGCATA
LCS: TCTA
June 25, 2003
LCS Scoring
5
LCS Problem (cont.)

Similarity score
si-1,j
si,j = max {
si,j-1
si-1,j-1 + 1, if vi = wj

On board example: Pevzner Fig 6.1
June 25, 2003
LCS Scoring
6
Indels – insertions and
deletions (e.g., gaps)

alignment of V and W



V = rows of similarity matrix (vertical axis)
W = columns of similarity matrix (horizontal axis)
Space (gap) in W
 (UP)


 (LEFT)
Space (gap) in V


insertion
deletion
Match (no mismatch in LCS)
June 25, 2003
LCS Scoring
(DIAG)
7
LCS(V,W) Algorithm
for i = 1 to n
si,0 = 0
for j = 1 to n
s0,j = 0
for i = 1 to n
for j = 1 to m
if vi = wj
si,j = si-1,j-1 + 1; bi,j = DIAG
else if si-1,j >= si,j-1
si,j = si-1,j; bi,j = UP
else
si,j = si,j-1; bi,j = LEFT
June 25, 2003
LCS Scoring
8
Programming Workshop 5

Implement the LCS scoring algorithm as a
function



Inputs: 2 strings to score
Outputs: Scoring matrix and traceback matrix
(these can be global variables)
Write a main functions to



prompt the user for 2 sequences
call the scoring function
print the 2 matrices
June 25, 2003
LCS Scoring
9