Transcript CHAPTER 3

CHAPTER 2
The Basic Concepts of Algorithms
The Minimal Spanning Tree Problem



Spanning tree
Prim’s Algorithm
Kruskal’s Algorithm
The Longest Common Subsequence
Problem


S1 : ABDDRGTY
S2 : CDEDRGRT
DRT
DRGT
DDRGT
The Dynamic Programming
Strategy

If we go through a vertex X, we should find a
shortest route from X to T.
Application of the Dynamic Programming
Strategy to Solve the LCS problem


S1  a1a2 ...am
S 2  a1a2 ...an
LCS(i,j)=LCS(i-1,j-1)+1 if ai  b j
ai  b j
max{LCS(i-1,j),LCS(i,j-1)}
if
a b
LCS(0,0)=LCS(1,0)=LCS(0,1)=0
i
i
The Time-Complexity of
Algorithms

n2
Find a LCS : best case: O(
)
2
n
average Case: O(
)
worst case: O( n 2 )
Quicksort : best case: O( n log n )
average Case: O( n log n )
worst case: O( n 2 )
The 2-Dimensional Maxima Finding
Problem and the Divide and Conquer
Strategy

A point( x1 , y1) is said to dominate another point( x2 , y2) if
x1  x2 , y1  y2 .If a point is not dominated by any
point ,is called a maxima.
The Selection Problem and the
Prune and Search Strategy

Give a set S of n numbers,there is a number p
which divides S into three subsets S1,S2and S3.
case1:the size of S is greater than k.Kth smallest of S
must be located in S1 ,prune away S2 and S3.
case2:the condition of Case1 is not valid.But the size of
S1 and S2 is greater than k.the kth smallest number of S
must be equal to p.
case3:none of the conditions of case1 and case2 is
valid.In this case,the kth smallest number of S must be
located in S3 and we can prune away S1 and S2
The NP-Complete Problems


NP problems are all decision problems
Nearly all of the decision problems are
NP problems.
CHAPTER 3
String Matching
3.1 Basic Terminologies of Strings





|S|=sting S 的長度
“AGCTTGATT”
Substring, S '  Si ,i |S '|1
“GCTT” “TTG”
Subsequences,
“ACT” “GTT”
Prefix of string S '  S1,|S '| “AGCT” “AG”
Suffix of string S '  S|S ||S '|1,|S '| “GATT” “TT”
The KMP Algorithm

Knuth, Morris, Pratt in 1977
Case1:the fist symbol of P dose not appear again in P.

Case2:the first symbol of P appears again in P.


Case3:not only the first symbol of P appears in P,prefixes
of P appear in P twice.

If a mismatch occurs when Ti is compared with Pj ,then
align Ti with Pf ( j 1) 1 if j!=1 and align Ti 1 with P1 if
j=1.
The Boyer-Moore Algorithm



Boyer,Moore in 1977,more efficient in practice than KMP
Bad Character Rule:Align Ts  j 1
with ,where j’ is the Pj '
rightmost position of Ts  j 1
in P.
Good Suffix rule 1:Align Ts  j 1 with Pj '  m  j , where j’ is the largest
position such that Pj 1,m is a suffix of P1, j ' and Pj '  m  j  Pj


Good suffix Rule 2:Align Ts  m  j ' with P1 ,where j’ is the largest position
such that P1, j ' is a suffix of Pj 1,m
P
Function G: (a) g1 ( j ) be the largest k such that j 1,m is a suffix of P1, k
and Pj '  m  j  Pj ,0 if there is no such k. (b) g 2 ( j ) be the largest k such
that P1, k is a suffix of Pj 1,m ; 0 if there is no such k. (c).
G(j)={m - max{ g1 ( j ) , g 2 ( j ) }}

Function f’:

The Boyer-Moore algorithm for exact for exact matching.
s=14 > n – m + 1 = 24 – 12 + 1 = 13
Suffix Trees and suffix Arrays

Suffixes
suffix tree
Approximate String Matching


Case 1: T j  Pj E(i,j)=E(i-1,j-1)
Csse 2: T j  Pj E(i,j)={E(i-1,j),E(i,j-1)}+1