CSC401: Analysis of Algorithms

Download Report

Transcript CSC401: Analysis of Algorithms

CSC401 – Analysis of Algorithms
Lecture Notes 2
Asymptotic Analysis
Objectives:
Mathematics foundation for algorithm analysis
Amortization analysis techniques
Case studies
Asymptotic notions: Big-Oh, Big-Omega,
Little-oh, little-omega, Big-Theta
Summations
Summation:
b
 f (i)  f (a)  f (a  1)  f (a  2)    f (b)
i a
1  a n1
a  1 a  a  a 
Geometric summation: 
1 a
i 0

1
i
2
n
a  1 a  a  a  

1 a
i 0
n
i
2
n
n
Natural summation:  i  1  2  3    n 
i 0
n(n  1)
2

1
1 1
1
Harmonic number:   1         ln n
2 3
n
i 0 i
Split summation:
n
k
n
i 0
i 1
i  k 1
 f (i)   f (i)   f (i)
2
Logarithms and Exponents
Logarithms
– properties of logarithms:
logb(xy) = logbx + logby
logb (x/y) = logbx - logby
logbxa = alogbx
logba = logxa/logxb
Exponents
– properties of exponentials:
a(b+c) = aba c
abc = (ab)c
ab /ac = a(b-c)
b = a logab
bc = a c*logab
Floor and ceiling functions
–
–
x  the largest integer less than or equal to x
x  the smallest integer greater than or equal to x
3
Proof Techniques
By Example -- counterexample
Contrapositive
– Principle: To justify “if p is true, then q is true”, show “if
q is not true, then p is not true”.
– Example: To justify “if ab is odd, a is odd or b is even”,
assume “’a is odd or b is odd’ is not true”, then “a is
even and b is odd”, then “ab is even”, then “’ab is odd’
is not true”
Contradiction
– Principle: To justify “p is true”, show “if p is not true,
then there exists a contradiction”.
– Example: To justify “if ab is odd, then a is odd or b is
even”, let “ab is odd”, and assume “’a is odd or b is
even’ is not true”, then “a is even and b is odd”, thus
“ab is even” which is a contradiction to “ab is odd”.
4
Proof by Induction
Induction -- To justify S(n) for n>=n0
– Principle
Base cases: Justify S(n) is true for n0<=n<=n1
Assumption: Assume S(n) is true for n=N>=n1;
or
Assume S(n) is true for n1<=n<=N
Induction: Justify S(n) is true for n=N+1
– Example:
Question: Fibonacci sequence is defined as F(1)=1,
F(2)=1, and F(n)=F(n-1)+F(n-2) for n>2. Justify
F(n)<2^n for all n>=1
Proof by induction where n0=1. Let n1=2.
– Base cases: For n0<=n<=n1, n=1 or n=2. If n=1,
F(1)=1<2=2^1. If n=2, F(2)=2<4=2^2. It holds.
– Assumption: Assume F(n)<2^n for n=N>=2.
– Induction: For n=N+1, F(N+1)=F(N)+F(N1)<2^N+2^(N-1)<2 2^N=2^(N+1). It holds.
5
Proof by Loop Invariants
Principle:
– To prove a statement S about a loop is correct, define S in
terms of a series of smaller statements S0, S1, …, Sk,
where
The initial claim S0 is true before the loop begins
If Si-1 is true before iteration i begins, then show that
Si is true after iteration i is over
The final statement Sk is true
Example: Consider the algorithm arrayMax
– Statement S: max is the maximum number when finished
– A series of smaller statements:
Si: max is the maximum
Algorithm arrayMax(A, n)
in the first i+1 elements
max  A[0]
of the array
for i  1 to n  1 do
S0 is true before the loop
if A[i]  max then
If Si is true, then easy to
max  A[i]
return max
show Si+1 is true
6
S=Sn-1 is also true
Basic Probability
Sample space: the set of all possible outcomes from
some experiment
Probability space: a sample space S together with a
probability function that maps subsets of S to real numbers
between 0 and 1
Event: Each subset of A of S called an event
Properties of the probability function Pr
–
–
–
–
Pr(Ø)=0
Pr(S)=1
0<=Pr(A)<=1 for any subset A of S
If A and B are subsets of S and AB=, then
Pr(AB)=Pr(A)+Pr(B)
Independence
– A and B are independent if Pr(AB)=Pr(A)Pr(B)
– A1, A2, …, An are mutually independent if Pr(A1A2…
An)=Pr(A1)Pr(A2)…Pr(An)
7
Basic Probability
Conditional probability
– The conditional probability that A occurs, given B, is
defined as: Pr(A|B)=Pr(AB)/Pr(B), assuming Pr(B)>0
Random variables
– Intuitively, Variables whose values depend on the
outcomes of some experiment
– Formally, a function X that maps outcomes from some
sample space S to real numbers
– Indicator random variable: a variable that maps
outcomes to either 0 or 1
Expectation:a random variable has random values
– Intuitively, average value of a random variable
– Formally, expected value of a random variable X is
defined as E(X)=xxPr(X=x)
– Properties:
Linearity: E(X+Y) = E(X) + E(Y)
Independence: If X and Y are independent, that is,
Pr(X=x|Y=y)=Pr(X=x), then E(XY)=E(X)E(Y)
8
Computing Prefix Averages
We further illustrate
asymptotic analysis with
two algorithms for prefix
averages
The i-th prefix average of
an array X is average of
the first (i  1) elements of
X:
A[i]  (X[0]  X[1]  … 
X[i])/(i+1)
Computing the array A of
prefix averages of
another array X has
applications to financial
analysis
35
30
X
A
25
20
15
10
5
0
1 2 3 4 5 6 7
9
Prefix Averages (Quadratic)
The following algorithm computes prefix
averages in quadratic time by applying the
definition
Algorithm prefixAverages1(X, n)
Input array X of n integers
Output array A of prefix averages of X #operations
A  new array of n integers
n
for i  0 to n  1 do
n
s  X[0]
n
for j  1 to i do
1  2  … (n  1)
s  s  X[j]
1  2  … (n  1)
A[i]  s / (i  1)
n
return A
1
10
Arithmetic Progression
The running time of
prefixAverages1 is
O(1  2  … n)
The sum of the first n
integers is n(n  1) / 2
– There is a simple visual
proof of this fact
Thus, algorithm
prefixAverages1 runs in
O(n2) time
7
6
5
4
3
2
1
0
1
2
3
4
5
6
11
Prefix Averages (Linear)
The following algorithm computes prefix
averages in linear time by keeping a running
sum
Algorithm prefixAverages2(X, n)
Input array X of n integers
Output array A of prefix averages of X
A  new array of n integers
s0
for i  0 to n  1 do
s  s  X[i]
A[i]  s / (i  1)
return A
#operations
n
1
n
n
n
1
Algorithm prefixAverages2 runs in O(n) time
12
Amortization
Amortization
– Typical data structure supports a wide variety of
operations for accessing and updating the elements
– Each operation takes a varying amount of running time
– Rather than focusing on each operation
– Consider the interactions between all the operations by
studying the running time of a series of these operations
– Average the operations’ running time
Amortized running time
– The amortized running time of an operation within a
series of operations is defined as the worst-case running
time of the series of operations divided by the number of
operations
– Some operations may have much higher actual running
time than its amortized running time, while some others
13
have much lower
The Clearable Table Data Structure
The clearable table
– An ADT
Storing a table of elements
Being accessing by their index in the table
– Two methods:
add(e) -- add an element e to the next available cell
in the table
clear() -- empty the table by removing all elements
Consider a series of operations (add and
clear) performed on a clearable table S
– Each add takes O(1)
– Each clear takes O(n)
– Thus, a series of operations takes O(n2),
because it may consist of only clears
14
Amortization Analysis
Theorem:
– A series of n operations on an initially empty clearable
table implemented with an array takes O(n) time
Proof:
– Let M0, M1, …, Mn-1 be the series of operations performed
on S, where k operations are clear
– Let Mi0, Mi1, …, Mik-1 be the k clear operations within the
series, and others be the (n-k) add operations
– Define i-1=-1,Mij takes ij-ij-1, because at most ij-ij-1-1
elements are added by add operations between Mij-1 and
Mij
– The total time of the series is:
(n-k) + ∑k-1j=0(ij-ij-1) = n-k + (ik-1 - i-1) <= 2n-k
Total time is O(n)
Amortized time is O(1)
15
Accounting Method
The method
– Use a scheme of credits and debits: each
operation pays an amount of cyber-dollar
– Some operations overpay --> credits
– Some operations underpay --> debits
– Keep the balance at any time at least 0
Example: the clearable table
– Each operation pays two cyber-dollars
– add always overpays one dollar -- one credit
– clear may underpay a variety of dollars
the underpaid amount equals the number of add
operations since last clear - 2
– Thus, the balance is at least 0
– So the total cost is 2n -- may have credits
16
Potential Functions
Based on energy model
– associate a value with the structure, Ø,
representing the current energy state
– each operation contributes to Ø a certain
amount of energy t’ and consumes a varying
amount of energy t
– Ø0 -- the initial energy, Øi -- the energy after
the i-th operation
– for the i-th operation
ti -- the actual running time
t’i -- the amortized running time
ti = t’i + Øi-1 - Øi
– Overall: T=∑(ti), T’=∑(t’i)
– The total actual time T = T’ + Ø0 - Øn
– As long as Ø0 <= Øn, T<=T’
17
Potential Functions
Example -- The clearable table
– the current energy state Ø is defined as the
number of elements in the table, thus Ø>=0
– each operation contributes to Ø t’=2
– Ø0 = 0
– Øi = Øi-1 + 1, if the i-th operation is add
add: t = t’+ Øi-1 - Øi = 2 - 1 = 1
– Øi = 0, if the i-th operation is clear
clear: t =t’+ Øi-1 = 2 + Øi-1
– Overall: T=∑ (t), T’=∑(t’)
– The total actual time T = T’ + Ø0 - Øn
– Because Ø0 = 0 <= Øn, T<=T’ =2n
18
Extendable Array
ADT: Extendable array is an array with extendable
size. One of its methods is add to add an element to
the array. If the array is not full, the element is
added to the first available cell. When the array is
full, the add method performs the following
–
–
–
–
Allocate a new array with double size
Copy elements from the old array to the new array
Add the element to the first available cell in the new array
Replace the old array with the new array
Question: What is the amortization time of add?
– Two situations of add:
add -- O(1)
Extend and add -- (n)
– Amortization analysis:
Each add deposits 3 dollars and spends 1 dollar
Each extension spends k dollars from the size k to 2k
(copy)
Amortization time of add is O(1)
19
Relatives of Big-Oh
big-Omega
– f(n) is (g(n)) if there is a constant c > 0
and an integer constant n0  1 such that
f(n)  c•g(n) for n  n0
big-Theta
– f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0
and an integer constant n0  1 such that c’•g(n)  f(n) 
c’’•g(n) for n  n0
little-oh
– f(n) is o(g(n)) if, for any constant c > 0, there is an
integer constant n0  0 such that f(n)  c•g(n) for n  n0
little-omega
– f(n) is (g(n)) if, for any constant c > 0, there is an
integer constant n0  0 such that f(n)  c•g(n) for n  n0
20
Intuition for Asymptotic Notation
Big-Oh
– f(n) is O(g(n)) if f(n) is asymptotically less than or
equal to g(n)
big-Omega
– f(n) is (g(n)) if f(n) is asymptotically greater
than or equal to g(n)
big-Theta
– f(n) is (g(n)) if f(n) is asymptotically equal to
g(n)
little-oh
– f(n) is o(g(n)) if f(n) is asymptotically strictly less
than g(n)
little-omega
– f(n) is (g(n)) if is asymptotically strictly greater
than g(n)
21
Example Uses of the Relatives of Big-Oh



5n2 is (n2)
f(n) is (g(n)) if there is a constant c > 0 and an
integer constant n0  1 such that f(n)  c•g(n) for
n  n0 let c = 5 and n0 = 1
5n2 is (n)
f(n) is (g(n)) if there is a constant c > 0 and an
integer constant n0  1 such that f(n)  c•g(n) for
n  n0 let c = 1 and n0 = 1
5n2 is (n)
f(n) is (g(n)) if, for any constant c > 0, there is
an integer constant n0  0 such that f(n)  c•g(n)
for n  n0 need 5n02  c•n0  given c, the n0 that
satisfies this is n0  c/5  0
22