ICS 353
Design and Analysis of Algorithms
Spring Semester 2006 - 2007 (062)
King Fahd University of Petroleum & Minerals
Information & Computer Science Department
1
Basic Concepts in Algorithmic
Analysis
• Topics
– Introduction
– Time Complexity
– Space Complexity
– Optimal Algorithms
– How to estimate the running time of an algorithm
– Worst Case Analysis and Average Case Analysis
– Amortized Analysis
– Input Size and Problem Instance
• Reading Assignment
– All Chapter 1 from the textbook
• In particular, sections 1-3,6,8-14 will be discussed in class.
2
What is an algorithm?
• An algorithm is defined as a finite set of
steps, each of which may require one or
more operations and if carried out on a set
of inputs, will produce one or more
outputs after a finite amount of time.
• Examples of Algorithms
• Examples of computations that are not
algorithms
3
Properties of Algorithms
• Definiteness: It must be clear what should be done.
• Effectiveness: Each step must be such that it can, at
least in principle, be carried out by a person using
pencil and paper in a finite amount of time. E.g.
integer arithmetic.
• An algorithm produces one or more outputs and may
have zero or more externally supplied inputs.
• Finiteness: Algorithms should terminate after a finite
number of operations.
4
Our Objective
• Find the most efficient algorithm for solving
a particular problem.
• In order to achieve the objective, we need to
determine:
– How can we find such an algorithm?
– What does it mean to be an efficient algorithm?
– How can one tell that it is more efficient than
other algorithms?
5
In the First Chapter
• We will answer the following two questions
– What does it mean to be an efficient algorithm?
– How can one tell that it is more efficient than
other algorithms?
based on some easy-to-understand searching
and sorting algorithms that we may have seen
earlier.
6
Searching Problem
• Assume A is an array with n elements A[1], A[2],
… A[n]. For a given element x, we must
determine whether there is an index j; 1 ≤ j ≤ n,
such that x = A[j]
• Two algorithms, among others, address this
problem
– Linear Search
– Binary Search
7
Linear Search Algorithm
Algorithm: LINEARSEARCH
Input: array A[1..n] of n elements and an element x.
Output: j if x = A[j], 1 ≤ j ≤ n, and 0 otherwise.
1. j ← 1
2. while (j < n) and (x ≠ A[j])
3.     j ← j + 1
4. end while
5. if x = A[j] then return j else return 0
8
Analyzing Linear Search
• One way to measure efficiency is to count how many
statements get executed before the algorithm terminates
• One should keep an eye, though, on statements that are
executed “repeatedly”.
• What will be the number of “element” comparisons if x
– First appears in the first element of A
– First appears in the middle element of A
– First appears in the last element of A
– Doesn’t appear in A.
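The four cases above can be explored with a small sketch of LINEARSEARCH. The function name `linear_search` and the sample array are illustrative assumptions, and the counter reports how many elements of A are examined, treating the loop test and the final check on the same element as one comparison (an assumption about how the slide counts).

```python
def linear_search(a, x):
    """Return (j, examined): the 1-based index of the first occurrence of x
    in a (0 if absent) and the number of elements of a that were examined."""
    n = len(a)
    if n == 0:                        # the slide assumes n >= 1; guard added here
        return 0, 0
    j = 1
    while j < n and x != a[j - 1]:    # a[j - 1] plays the role of the slide's A[j]
        j += 1
    found = j if x == a[j - 1] else 0
    return found, j                   # elements A[1..j] were examined


a = [7, 3, 9, 1, 5]                   # hypothetical sample input
print(linear_search(a, 7))            # x is the first element
print(linear_search(a, 9))            # x is the middle element
print(linear_search(a, 5))            # x is the last element
print(linear_search(a, 4))            # x does not appear
```

Comparing the second component of each result against n = 5 suggests how the cost depends on where x sits.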
9
Binary Search
• We can do “better” than linear search if we knew
that the elements of A are sorted, say in nondecreasing order.
• The idea is that you can compare x to the middle
element of A, say A[middle].
– If x < A[middle] then you know that x cannot be an
element from A[middle+1], A[middle+2], …A[n].
Why?
– If x > A[middle] then you know that x cannot be an
element from A[1], A[2], …A[middle-1]. Why?
10
Binary Search Algorithm
Algorithm: BINARYSEARCH
Input: An array A[1..n] of n elements sorted in
nondecreasing order and an element x.
Output: j if x = A[j], 1 ≤ j ≤ n, and 0 otherwise.
1. low ← 1; high ← n; j ← 0
2. while (low ≤ high) and (j = 0)
3.     mid ← ⌊(low + high)/2⌋
4.     if x = A[mid] then j ← mid
5.     else if x < A[mid] then high ← mid - 1
6.     else low ← mid + 1
7. end while
8. return j
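A minimal Python sketch of BINARYSEARCH follows. It keeps the slide's 1-based `low`, `high`, and `mid` and also counts loop iterations, on the assumption that each iteration corresponds to one three-way element comparison. The function name and the sample array are illustrative.

```python
def binary_search(a, x):
    """a must be sorted in nondecreasing order.
    Return (j, iterations): the 1-based index of x in a (0 if absent)
    and the number of while-loop iterations performed."""
    low, high, j = 1, len(a), 0
    iterations = 0
    while low <= high and j == 0:
        iterations += 1
        mid = (low + high) // 2       # floor of (low + high) / 2
        if x == a[mid - 1]:           # a[mid - 1] is the slide's A[mid]
            j = mid
        elif x < a[mid - 1]:
            high = mid - 1            # x can only be in A[low..mid-1]
        else:
            low = mid + 1             # x can only be in A[mid+1..high]
    return j, iterations


a = [1, 4, 5, 7, 8, 9, 10, 12, 15, 22, 23, 27, 32, 35]   # hypothetical input
print(binary_search(a, 22))   # (10, 4)
print(binary_search(a, 2))    # (0, 4)
```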
11
Worst Case Analysis of Binary Search
• What to do: Find the maximum number of element
comparisons
• How to do:
– The number of “element” comparisons is equal to the number
of iterations of the while loop in steps 2-7. HOW?
– How many elements of the input do we have in the
• First iteration
• Second iteration
• Third iteration
• …
• ith iteration
– The last iteration occurs when the size of input we have =
12
Theorem
• The number of comparisons performed by
Algorithm BINARYSEARCH on a sorted
array of size n is at most ⌊log n⌋ + 1
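One way to see where the bound comes from is sketched below, assuming logarithms are base 2 and that x is not found before the last iteration. The iteration index i is the only symbol introduced here.

```latex
% At most this many elements remain at the start of each iteration:
%   iteration 1: n,   iteration 2: \lfloor n/2 \rfloor,   ...,
%   iteration i: \lfloor n/2^{i-1} \rfloor .
% The last iteration is the one in which a single element remains:
\begin{align*}
\left\lfloor \frac{n}{2^{i-1}} \right\rfloor = 1
  &\iff 1 \le \frac{n}{2^{i-1}} < 2 \\
  &\iff i - 1 \le \log n < i \\
  &\iff i = \lfloor \log n \rfloor + 1 .
\end{align*}
```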
13
Insertion Sort
Algorithm: INSERTIONSORT
Input: An array A[1..n] of n elements.
Output: A[1..n] sorted in nondecreasing order.
1. for i ← 2 to n
2.     x ← A[i]
3.     j ← i - 1
4.     while (j > 0) and (A[j] > x)
5.         A[j + 1] ← A[j]
6.         j ← j - 1
7.     end while
8.     A[j + 1] ← x
9. end for
14
Insertion Sort Example
5 2 9 8 4
15
Insertion Sort Example
(each row shows the array just before x is inserted)
x = 2:   5 2 9 8 4
x = 9:   2 5 9 8 4
x = 8:   2 5 9 8 4
x = 4:   2 5 8 9 4
Result:  2 4 5 8 9
16
Analyzing Insertion Sort
• The minimum number of element
comparisons is ………
which occurs when ………
• The maximum number of element
comparisons is ………
which occurs when ………
• The number of element assignments is ………
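The questions above can be checked experimentally with a sketch of INSERTIONSORT that counts element comparisons. The name `insertion_sort` and the returned counter are illustrative assumptions. The comparison A[j] > x is counted only when it is actually evaluated, that is, when j > 0 in the slide's 1-based indexing.

```python
def insertion_sort(a):
    """Sort a copy of a in nondecreasing order.
    Return (sorted_list, comparisons), where comparisons counts how many
    times an array element is compared with x."""
    a = list(a)                       # work on a copy, leave the input intact
    comparisons = 0
    for i in range(1, len(a)):        # the slide's i = 2..n, 0-based here
        x = a[i]
        j = i - 1
        while j >= 0:                 # the slide's test (j > 0), 0-based here
            comparisons += 1          # one evaluation of A[j] > x
            if a[j] <= x:
                break
            a[j + 1] = a[j]           # shift the larger element right
            j -= 1
        a[j + 1] = x
    return a, comparisons


print(insertion_sort([5, 2, 9, 8, 4]))    # ([2, 4, 5, 8, 9], 8)
print(insertion_sort([1, 2, 3, 4, 5])[1]) # already sorted input
print(insertion_sort([5, 4, 3, 2, 1])[1]) # reverse-sorted input
```

Running it on an already sorted array and on a reverse-sorted array of the same size is a quick way to test a conjectured minimum and maximum.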
17
Time Complexity
• One way of measuring the performance of
an algorithm is how fast it executes. The
question is how to measure this “time”?
– Is having a digital stop watch suitable?
18
Order of Growth
• As measured time is subject to many factors, we
look for a more “objective” measure, i.e. the number
of operations
• Since counting the exact number of operations is
cumbersome, sometimes impossible, we can
focus our attention on asymptotic analysis, where
constants and lower-order terms are ignored.
– E.g. n^3, 1000n^3, and 10n^3 + 10000n^2 + 5n - 1 are all “the same”
– The reason we can do this is that we are always interested in
comparing different algorithms for arbitrarily large
inputs.
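A short numerical sketch of why these three functions count as “the same”: dividing each by n^3 gives a ratio that settles to a constant as n grows, so they differ only by constant factors and lower-order terms. The function names `f1`, `f2`, `f3` are illustrative.

```python
def f1(n):
    return n**3

def f2(n):
    return 1000 * n**3

def f3(n):
    return 10 * n**3 + 10000 * n**2 + 5 * n - 1

# The ratios approach the constants 1000 and 10 as n grows.
for n in (10, 1000, 100000, 10000000):
    print(n, f2(n) / f1(n), f3(n) / f1(n))
```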
19
Example
Growth rate for some functions
20
Example
Growth rate for the same functions,
showing larger input sizes
21
Running Times for Different Sizes of
Inputs of Different Functions
22
Asymptotic Analysis: Big-oh (O())
• Definition: For T(n) a non-negatively valued
function, T(n) is in the set O(f(n)) if there exist
two positive constants c and n0 such that T(n) ≤
cf(n) for all n > n0.
• Usage: The algorithm is in O(n^2) in [best, average,
worst] case.
• Meaning: For all data sets big enough (i.e., n > n0),
the algorithm always executes in less than or equal
to cf(n) steps in [best, average, worst] case.
23
Big O()
• O() notation indicates an upper bound.
• Usually, we look for the tightest upper
bound:
– while T(n) = 3n^2 is in O(n^3), we prefer
O(n^2).
24
Big O() Examples
• Example 1: Find c and n0 to show that
T(n) = (n+2)/2 is in O(n)
• Example 2: Find c and n0 to show that
T(n) = c1 n^2 + c2 n is in O(n^2)
• Example 3: T(n) = c. We say this is in O(1).
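One possible choice of constants for Examples 1 and 2 is sketched below. Many other choices work equally well, and c1 and c2 are assumed to be positive here.

```latex
\begin{align*}
\frac{n+2}{2} &\le n
  && \text{for all } n \ge 2, \text{ so } c = 1,\ n_0 = 1 \text{ works;} \\
c_1 n^2 + c_2 n &\le c_1 n^2 + c_2 n^2 = (c_1 + c_2)\, n^2
  && \text{for all } n \ge 1, \text{ so } c = c_1 + c_2,\ n_0 = 1 \text{ works.}
\end{align*}
```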
25
Asymptotic Analysis: Big-Omega (Ω())
• Definition: For T(n) a non-negatively valued
function, T(n) is in the set Ω(g(n)) if there exist
two positive constants c and n0 such that T(n) ≥
cg(n) for all n > n0.
• Meaning: For all data sets big enough (i.e., n > n0),
the algorithm always executes in more than or
equal to cg(n) steps.
• Ω() notation indicates a lower bound.
26
Ω() Example
• Find c and n0 to show that T(n) = c1 n^2 + c2 n
is in Ω(n^2).
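A sketch of one valid choice, assuming c1 > 0 and c2 ≥ 0 (other constants also work):

```latex
\[
c_1 n^2 + c_2 n \;\ge\; c_1 n^2
\quad \text{for all } n \ge 1,
\qquad \text{so } c = c_1 \text{ and } n_0 = 1 \text{ satisfy the definition.}
\]
```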
27
Asymptotic Analysis: Big Theta (Θ())
• When O() and Ω() meet, we indicate this by
using Θ() (big-Theta) notation.
• Definition: An algorithm is said to be Θ(h(n))
if it is in O(h(n)) and it is in Ω(h(n)).
28
Example
• Show that log(n!) is in Θ(n log n).
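One standard argument is sketched here, assuming base-2 logarithms. The upper bound replaces every term by the largest one, and the lower bound keeps only the larger half of the terms.

```latex
\begin{align*}
\log n! = \sum_{i=1}^{n} \log i
  &\le \sum_{i=1}^{n} \log n = n \log n ,
  && \text{so } \log n! \in O(n \log n); \\
\log n! \ge \sum_{i=\lceil n/2 \rceil}^{n} \log i
  &\ge \frac{n}{2} \log \frac{n}{2}
   = \frac{n}{2} \log n - \frac{n}{2}
   \ge \frac{n}{4} \log n \quad (n \ge 4),
  && \text{so } \log n! \in \Omega(n \log n).
\end{align*}
```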
29
Complexity Classes and small-oh (o())
• Using () notation, one can divide the functions into
different equivalence classes, where f(n) and g(n)
belong to the same equivalence class if f(n) = (g(n))
• To show that two functions belong to different
equivalence classes, the small-oh notation has been
introduced
• Definition: Let f(n) and g(n) be two functions from
the set of natural numbers to the set of non-negative
real numbers. f(n) is said to be in o(g(n)) if for every
constant c > 0, there is a positive integer n0 such that
f(n) < cg(n) for all n  n0.
30
Simplifying Rules
• If f(n) is in O(g(n)) and g(n) is in O(h(n)), then
f(n) is in O(h(n))
• If f(n) is in O(kg(n)) for any constant k > 0, then
f(n) is in ………
• If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)),
then (f1 + f2)(n) is in ………
• If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n))
then f1(n)f2(n) is in ………
• You can safely “globally” replace O with Ω or Θ in the
above, where the above rules will still hold.
31
Very Useful Simplifying Rule
• Let f(n) and g(n) be two functions from the set
of natural numbers to the set of non-negative real
numbers such that:
0 ≤ L = lim_{n→∞} f(n)/g(n) ≤ ∞
Then
if L < ∞ then f(n) is in ………
if L > 0 then f(n) is in ………
if 0 < L < ∞ then f(n) is in ………
if L = 0 then f(n) is in ………
32
Space Complexity
• Space complexity refers to the number of memory
cells needed to carry out the computational steps
required in an algorithm excluding memory cells
needed to hold the input.
• Compare additional space needed to carry out
SELECTIONSORT to that of BOTTOMUPSORT
if we have an array with 2 million elements!
33
Examples
• What is the space complexity for
– Linear search
– Binary search
– Selection sort
– Insertion sort
– Merge (that merges two sorted lists)
– Bottom up merge sort
34
Optimal Algorithms
• If one can show that there is no algorithm
that solves a certain problem in
asymptotically less time than that of a certain
algorithm A, we call A an optimal
algorithm.
35
Optimal Algorithms: Example
A decision tree for sorting three elements
Figure 12.1 page 338 from the textbook
36
Optimal Algorithms: Example
• Consider the sorting problem of n distinct elements
using element comparison-based sorting
– Using the decision tree model, the number of possible
solutions (leaf nodes) in the binary tree is equal to
................
– You have learnt earlier that a binary tree with n nodes has
height of at least ⌊log n⌋ (Observation 3.3 page 111 of
the textbook)
– Hence, the length of the longest path in a decision tree for
sorting n distinct elements is at least .............
– Therefore,
• Insertion sort is ………
• Selection sort is ………
• Merge sort is ………
• Quick sort is ………
37
Estimating the Running Time of an Algorithm
• As mentioned earlier, we need to focus on
counting those operations which represent, in
general, the behavior of the algorithm
• This is achieved by
– Counting the frequency of basic operations.
• A basic operation is an elementary operation whose
frequency is the highest, to within a constant factor,
among all elementary operations
– Recurrence Relations
38
Counting the Frequency of Basic
Operations
• Sometimes, it is easier to compute the frequency
of an operation that is a good representative of
the overall time complexity of the algorithm
– For example, Algorithm MERGE.
• Counting the number of iterations
– The number of iterations in a while loop and/or a for
loop is a good indication of the total number of
operations
39
Example 1
sum = 0;
for (j=1; j<=n; j++)
    for (i=1; i<=j; i++)
        sum++;
for (k=0; k<n; k++)
    A[k] = k;
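A direct Python transcription of Example 1, offered as a sketch for checking an operation count empirically. The function name `example1` and the separate `assignments` counter are illustrative additions.

```python
def example1(n):
    """Return (total, assignments): the final value of sum after the nested
    loops and the number of A[k] = k assignments in the second loop."""
    total = 0
    for j in range(1, n + 1):
        for i in range(1, j + 1):     # the inner loop runs j times
            total += 1
    a = [0] * n
    assignments = 0
    for k in range(n):
        a[k] = k
        assignments += 1
    return total, assignments


print(example1(10))   # (55, 10)
```

The first count equals 1 + 2 + … + n = n(n+1)/2, which dominates the n assignments of the second loop.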
40
Example 2
for j := 1 to n do
sum[j] := 0;
for i := 1 to j2 do
sum[j] := sum[j] + i;
end for;
end for;
return sum[1..n];
41
Example 3
count := 0;
for i := 1 to n do
    m := ⌊n/i⌋
    for j := 1 to m do
        count := count + 1;
    end for;
end for;
return count;
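A Python sketch of Example 3, assuming m is the floor of n/i. The function name `example3` is illustrative.

```python
def example3(n):
    """Return count, which equals the sum over i = 1..n of floor(n / i)."""
    count = 0
    for i in range(1, n + 1):
        m = n // i                    # floor(n / i)
        for j in range(1, m + 1):
            count += 1
    return count


print(example3(10))   # 27
```

Since n/i − 1 < ⌊n/i⌋ ≤ n/i, the count lies between n·H_n − n and n·H_n, where H_n is the nth harmonic number, so it grows like n log n.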
42
Example 4
count := 0;
while n >= 1 do
    for j := 1 to n do
        execute_algorithm_x;
        count := count + 1;
    end for
    n := n / 2;
end while
return count;
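A Python sketch of Example 4 follows. It assumes that n / 2 means integer division (without that assumption the loop would not stop at a whole number of steps), and the call to execute_algorithm_x is represented only by the counter. The name `example4` is illustrative.

```python
def example4(n):
    """Return count, the number of times the inner-loop body runs."""
    count = 0
    while n >= 1:
        for j in range(1, n + 1):
            count += 1                # stands in for execute_algorithm_x
        n //= 2                       # assumption: integer division
    return count


print(example4(8))    # 8 + 4 + 2 + 1 = 15
print(example4(10))   # 10 + 5 + 2 + 1 = 18
```

Under this assumption the count is n + ⌊n/2⌋ + ⌊n/4⌋ + … + 1, which is at least n and less than 2n.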
43
Examples 5 & 6
sum1 = 0;
for (k=1; k<=n; k*=2)
    for (j=1; j<=n; j++)
        sum1++;
sum2 = 0;
for (k=1; k<=n; k*=2)
    for (j=1; j<=k; j++)
        sum2++;
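The two loops look alike but count very differently, which the following Python sketch makes visible. The names `example5` and `example6` are illustrative.

```python
def example5(n):
    """Outer loop doubles k; inner loop always runs n times."""
    sum1 = 0
    k = 1
    while k <= n:
        for j in range(1, n + 1):
            sum1 += 1
        k *= 2
    return sum1


def example6(n):
    """Outer loop doubles k; inner loop runs only k times."""
    sum2 = 0
    k = 1
    while k <= n:
        for j in range(1, k + 1):
            sum2 += 1
        k *= 2
    return sum2


print(example5(16), example6(16))   # 80 31
```

The outer loop runs ⌊log n⌋ + 1 times in both cases, so sum1 = n(⌊log n⌋ + 1), while sum2 = 1 + 2 + 4 + … + 2^⌊log n⌋ = 2^(⌊log n⌋+1) − 1, which is less than 2n.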
44
Example 7
count := 0;
for i := 1 to n do
    j := 2;
    while j <= n do
        j := j^2;
        count := count + 1;
    end while
end for;
return count;
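A Python sketch of Example 7, reading the update as j := j squared (an assumption, since the superscript is not visible in the slide text). The name `example7` is illustrative.

```python
def example7(n):
    """Return count, the total number of while-loop iterations."""
    count = 0
    for i in range(1, n + 1):
        j = 2
        while j <= n:
            j = j * j                 # j takes the values 2, 4, 16, 256, ...
            count += 1
    return count


print(example7(16))   # 48: three squarings for each of the 16 values of i
```

After t squarings j equals 2^(2^t), so the inner loop runs about log log n times for each i.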
45
Recurrence Relations
• The number of operations can be
represented as a recurrence relation.
• There are very well known techniques,
other than expanding the recurrence
relation, which we will study in order to
solve these recurrences
46
Example
• Recursive Merge Sort
MergeSort(A,p,r)
    if p < r then
        q := ⌊(p+r)/2⌋;
        MergeSort(A,p,q);
        MergeSort(A,q+1,r);
        Merge(A,p,q,r);
    end if;
– What is the call to sort an array with n elements?
– Let us assume that the overall cost of sorting n elements is T(n),
assuming that n is a power of two.
• If n = 1, do we know T(n)?
• What is the cost of MergeSort(A,p,q)?
• What is the cost of MergeSort(A,q+1,r)?
• What is the cost of Merge(A,p,q,r)?
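A runnable sketch of the recursive procedure is given below, using 0-based inclusive indices. The slide does not define Merge, so the `merge` helper here is a hypothetical implementation that uses temporary copies of the two halves.

```python
def merge(a, p, q, r):
    """Hypothetical Merge: combine the sorted runs a[p..q] and a[q+1..r]
    (inclusive bounds) into one sorted run a[p..r]."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    k = p
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    a[k:r + 1] = left[i:] + right[j:]   # copy whichever run has leftovers


def merge_sort(a, p, r):
    """Sort a[p..r] in place, following the slide's MergeSort(A,p,r)."""
    if p < r:
        q = (p + r) // 2              # floor of (p + r) / 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)


data = [5, 2, 9, 8, 4, 7, 1, 3]       # hypothetical input, n = 8
merge_sort(data, 0, len(data) - 1)
print(data)                           # [1, 2, 3, 4, 5, 7, 8, 9]
```

Each call on more than one element makes two recursive calls on halves of the range and then one call to `merge`, which is the structure the questions above ask about.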
47
Worst Case Analysis
• In worst case analysis of time complexity
we select the maximum cost among all
possible inputs of size n.
– One can do that for the() notation as well as
the O() notation.
– However, it is better use it with the ()
notation.
• Why?
48
Average Case Analysis
• The probabilities of all inputs are an important
piece of prior knowledge in order to compute
the number of operations on average
• Usually, average case analysis is lengthy and
complicated, even with simplifying
assumptions.
49
Computing the Average Running Time
• The running time in this case is taken to be
the average time over all inputs of size n.
– Assume we have k inputs, where each input
costs C_i operations, and each input can occur
with probability P_i, 1 ≤ i ≤ k, the average
running time is given by
Σ_{i=1}^{k} P_i C_i
50
Average Case Analysis of Linear Search
• Assume that key x is equally likely to appear
in any position in the array (1, 2, …, n) or not to
appear in the array at all
– This means that we have a total of ……… different
inputs, each with probability ………
– What is the number of comparisons for each input?
– Therefore, the average running time of linear
search = ………
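The weighted-average formula can be applied to this setting with a short sketch. It assumes n + 1 equally likely inputs (x at position 1, 2, …, n, or x absent) and a cost of i comparisons when x is at position i and n comparisons when x is absent. The names `average_cost` and `average_linear_search` are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def average_cost(probabilities, costs):
    """Weighted average: the sum of P_i * C_i over all inputs."""
    return sum(p * c for p, c in zip(probabilities, costs))


def average_linear_search(n):
    """Average number of comparisons under the assumptions stated above."""
    p = Fraction(1, n + 1)                    # each of the n + 1 inputs
    costs = list(range(1, n + 1)) + [n]       # found at 1..n, or absent
    return average_cost([p] * (n + 1), costs)


print(average_linear_search(4))   # 14/5
```

Evaluating it for several values of n is one way to check a closed-form answer for the blanks above.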
51
Average Case Analysis of Insertion Sort
• Assume that array A contains the numbers from
1..n (i.e., the elements are distinct)
• Assume that all n! permutations of the input are
equally likely.
• What is the number of comparisons for inserting
A[i] in its proper position in A[1..i]? What about
on average?
• Therefore, the total number of comparisons on
average is
52