Transcript Document

Chapter 2
Complexity Analysis
Data Structures and Algorithms in Java
Objectives
Discuss the following topics:
• Computational and Asymptotic Complexity
• Big-O Notation
• Properties of Big-O Notation
• Ω and Θ Notations
• Examples of Complexities
• Finding Asymptotic Complexity: Examples
• Amortized Complexity
• The Best, Average, and Worst Cases
• NP-Completeness
Data Structures and Algorithms in Java
2
Computational and
Asymptotic Complexity
• Computational complexity measures the
degree of difficulty of an algorithm
• Indicates how much effort is needed to apply an
algorithm or how costly it is
• To evaluate an algorithm’s efficiency, use logical
units that express a relationship such as:
– The size n of a file or an array
– The amount of time t required to process
the data
Data Structures and Algorithms in Java
3
Computational and
Asymptotic Complexity (continued)
• This measure of efficiency is called asymptotic
complexity
• It is used when disregarding certain terms of a
function
– To express the efficiency of an algorithm
– When calculating a function is difficult or
impossible and only approximations can be
found
f (n) = n2 + 100n + log10n + 1,000
Data Structures and Algorithms in Java
4
Computational and
Asymptotic Complexity (continued)
Figure 2-1 The growth rate of all terms of function f (n) = n2 + 100n +
log10n + 1,000
Data Structures and Algorithms in Java
5
Big-O Notation
• Introduced in 1894, the big-O notation specifies
asymptotic complexity, which estimates the rate
of function growth
• Definition 1: f (n) is O(g(n)) if there exist
positive numbers c and N such that f (n) ≤ cg(n)
for all n ≥ N
Figure 2-2 Different values of c and N for function f (n) = 2n2 + 3n + 1 = O(n2)
calculated according to the definition of big-O
Data Structures and Algorithms in Java
6
Big-O Notation (continued)
Figure 2-3 Comparison of functions for different values of c and N
from Figure 2-2
Data Structures and Algorithms in Java
7
Properties of Big-O Notation
• Fact 1 (transitivity)
If f (n) is O(g(n)) and g(n) is O(h(n)), then f(n) is
O(h(n))
• Fact 2
If f (n) is O(h(n)) and g(n) is O(h(n)), then f(n) +
g(n) is O(h(n))
• Fact 3
The function ank is O(nk)
Data Structures and Algorithms in Java
8
Properties of Big-O Notation
(continued)
• Fact 4
The function nk is O(nk+j) for any positive j
• Fact 5
If f(n) = cg(n), then f(n) is O(g(n))
• Fact 6
The function loga n is O(logb n) for any positive
numbers a and b ≠ 1
• Fact 7
loga n is O(lg n) for any positive a ≠ 1, where lg n
= log2 n
Data Structures and Algorithms in Java
9
Ω and Θ Notations
• Big-O notation refers to the upper bounds of
functions
• There is a symmetrical definition for a lower
bound in the definition of big-Ω
• Definition 2: The function f(n) is Ω(g(n)) if there
exist positive numbers c and N such that f(n) ≥
cg(n) for all n ≥ N
Data Structures and Algorithms in Java
10
Ω and Θ Notations (continued)
• The difference between this definition and the
definition of big-O notation is the direction of the
inequality
• One definition can be turned into the other by
replacing “≥” with “≤”
• There is an interconnection between these two
notations expressed by the equivalence
f (n) is Ω(g(n)) iff g(n) is O(f (n)) (prove?)
Data Structures and Algorithms in Java
11
Ω and Θ Notations (continued)
• Definition 3: f(n) is Θ(g(n)) if there exist positive
numbers c1, c2, and N such that c1g(n) ≤ f(n) ≤
c2g(n) for all n ≥ N
• When applying any of these notations (big-O,Ω,
and Θ), remember they are approximations that
hide some detail that in many cases may be
considered important
Data Structures and Algorithms in Java
12
Examples of Complexities
• Algorithms can be classified by their time or
space complexities
• An algorithm is called constant if its execution
time remains the same for any number of
elements
• It is called quadratic if its execution time is
O(n2)
Data Structures and Algorithms in Java
13
Examples of Complexities
(continued)
Figure 2-4 Classes of algorithms and their execution times on a computer
executing 1 million operations per second (1 sec = 106 μsec = 103 msec)
Data Structures and Algorithms in Java
14
Examples of Complexities
(continued)
Figure 2-4 Classes of algorithms and their execution times on a computer
executing 1 million operations per second (1 sec = 106 μsec = 103 msec)
(continued)
Data Structures and Algorithms in Java
15
Examples of Complexities
(continued)
Figure 2-5 Typical functions applied in big-O estimates
Data Structures and Algorithms in Java
16
Finding Asymptotic Complexity:
Examples
• Asymptotic bounds are used to estimate the efficiency
of algorithms by assessing the amount of time and
memory needed to accomplish the task for which the
algorithms were designed
for (i = sum = 0; i < n; i++)
sum += a[i]
• Initialize two variables
• Execute two assignments
– Update sum
– Update i
Total 2+2n assignments for the complete execution
Asymptotic complexity is O(n)
Data Structures and Algorithms in Java
17
Finding Asymptotic Complexity:
Examples
•
Printing sums of all the sub-arrays that begins
with position 0
for (i = 0; i < n; i++) {
for (j = 1, sum = a[0]; j <= i; j++)
sum += a[j];
System.out.println ("sum for subarray 0 through
"+i+" is"
+ sum);
}
n 1
1+3n+
2i
=1+3n+2(1+2+….n-1)
i 1
=1+3n+n(n-1)=O(n)+ O(n2)= O(n2)
Data Structures and Algorithms in Java
18
Examples Continued
•
Printing sums of numbers in the last five cells of
the sub-arrays starting in position 0
for (i = 4; i < n; i++) {
for (j = i-3, sum = a[i-4]; j <= i; j++)
sum += a[j];
System.out.println
("sum for subarray "+(i - 4)+" through "+i+"
is"+ sum);
}
• n-4 times for outer loop
• For each i, inner loop executes only
four times
1+8.(n-4)= O(n)
Data Structures and Algorithms in Java
19
Finding Asymptotic Complexity:
Examples
•
•
Finding the length of the longest sub-array
with the numbers in increasing order
For example [1 2 5 ] in [1 8 1 2 5 0 11 12]
for (i = 0, length = 1; i < n-1; i++) {
for (i1 = i2 = k = i; k < n-1 && a[k] < a[k+1];
k++, i2++);
if (length < i2 - i1 + 1)
length = i2 - i1 + 1;
System.out.println ("the length of the longest
ordered subarray is" + length);
}
Data Structures and Algorithms in Java
20
• If all numbers in the array are in decreasing order, the
outer loop is executed n-1 times
• But in each iteration, the inner loop executes just one
time. The algorithm is O(n)
• If the numbers are in increasing order, the outer loop is
executed n- 1 times and the inner loop is executed n-1-i
times for each i in {0,…, n-2}. The algorithm is O(n2)
Data Structures and Algorithms in Java
21
Finding Asymptotic Complexity:
Examples
int binarySearch(int[] arr, int key) {
int lo = 0, mid, hi = arr.length-1;
while (lo <= hi) {
mid = (lo + hi)/2;
if (key < arr[mid])
hi = mid - 1;
else if (arr[mid] < key)
lo = mid + 1;
else return mid; // success
}
return -1;
// failure
}
• O(lg n)
Data Structures and Algorithms in Java
22
The Best, Average,
and Worst Cases
• The worst case is when an algorithm requires a
maximum number of steps
• The best case is when the number of steps is
the smallest
• The average case falls between these extremes
Cavg = Σip(inputi)steps(inputi)
Data Structures and Algorithms in Java
23
• The average complexity is established by considering possible
inputs to an algorithm,
• determining the number of steps performed by the algorithm for
each input,
• adding the number of steps for all the inputs, and dividing by the
number of inputs
• This definition assumes that the probability of occurrence of each
input is the same. It is not the case always.
• The average complexity is defined as the average over the number
of steps executed when processing each input weighted by the
probability of occurrence of this input
Data Structures and Algorithms in Java
24
Consider searching sequentially an unordered array
to find a number
• The best case is when the number is found in
the first cell
• The worst case is when the number is in the last
cell or not in the array at all
• The average case?
Data Structures and Algorithms in Java
25
•
•
•
•
•
•
Assuming the probability distribution is uniform
The probability equals to 1/n for each position
To find a number in one try is 1/n
To find a number in two tries is 1/n
etc…
The average steps to find a number is
1+2+…+n n+1
=
n
2
Data Structures and Algorithms in Java
26
• If the probabilities differ, the average case gives a different
outcome
• If the probability of finding a number in the first cell is ½ , the
probability in the second cell is ¼ and the probability is the
same for remaining cells
1-½-¼
n-2
=
1
4(n - 2)
• the average steps
1 2 3  ...n
n(n  1)  6
n3
 
 1
 1
2 4 4(n  2)
8(n  2)
8
Data Structures and Algorithms in Java
27
Summation Formulas
Let N > 0, let A, B, and C be constants, and let f and g be any functions. Then:
N
N
k 1
k 1
N
N
N
k 1
k 1
k 1
 Cf (k )  C  f (k )
 ( f (k )  g (k ))   f (k )   g (k )
S1: factor out constant
S2: separate summed terms
 C  NC
N ( N  1)
k


2
k 1
k2 
S3: sum of constant
S4: sum of k
S5: sum of k squared
N
N
k 1
N
2  2
k
N 1
k 0
S6: sum of 2^k
1
N
k 1
N ( N  1)( 2 N  1)
6
N
k 1
N
k
2

(
N

1
)
2
1

k 1
S7: sum of k2^(k-1)
Data Structures and Algorithms in Java
28
Logarithms
Let b be a real number, b > 0 and b  1. Then, for any real number x > 0, the logarithm
of x to base b is the power to which b must be raised to yield x. That is:
log b ( x)  y if and only if b y  x
For example:
log 2 (64)  6 because 26  64
log 2 (1 / 8)  3 because 23  1 / 8
log 2 (1)  0 because 20  1
If the base is omitted, the standard convention in mathematics is that log base 10 is
intended; in computer science the standard convention is that log base 2 is intended.
Data Structures and Algorithms in Java
29
Logarithms
Let a and b be real numbers, both positive and neither equal to 1. Let x > 0 and y > 0 be
real numbers.
L1:
log b (1)  0
L2:
log b (b)  1
L3:
log b ( x)  0 for all 0  x  1
L4:
log b ( x)  0 for all x  1
L5:
log b (b y )  y
L6:
b logb ( x )  x
Data Structures and Algorithms in Java
L7:
log b ( xy)  log b ( x)  log b ( y)
L8:
 x
log b    log b ( x)  log b ( y )
 y
L9:
log b ( x y )  y log b ( x)
L10:
log a ( x)
log b ( x) 
log a (b)
30
Limit of a Function
Definition:
Let f(x) be a function with domain (a, b) and let a < c < b. The limit of f(x) as x
approaches c is L if, for every positive real number e, there is a positive real number d
such that whenever |x-c| < d then |f(x) – L| < e.
The definition being cumbersome, the following theorems on limits are useful. We
assume f(x) is a function with domain as described above and that K is a constant.
C1:
lim K  K
x c
C2:
lim x  c
x c
C3:
lim x r  c r for all r  0
xc
Data Structures and Algorithms in Java
31
Limit of a Function
Here assume f(x) and g(x) are functions with domain as described above and that K is a
constant, and that both the following limits exist (and are finite):
lim g ( x)  B
lim f ( x)  A
x c
x c
Then:
C4:
lim Kf ( x)  K lim f ( x)
x c
C5:
C6:
C7:
x c
lim  f ( x)  g ( x)   lim f ( x)  lim g ( x)
x c
x c
x c
lim  f ( x) * g ( x)   lim f ( x) * lim g ( x)
x c
x c
x c
lim  f ( x) / g ( x)   lim f ( x) / lim g ( x) provided B  0
x c
x c
Data Structures and Algorithms in Java
x c
32
Limit as x Approaches Infinity
Definition:
Let f(x) be a function with domain [0, ). The limit of f(x) as x approaches  is L if, for
every positive real number e, there is a positive real number N such that whenever x > N
then |f(x) – L| < e.
The definition being cumbersome, the following theorems on limits are useful. We
assume f(x) is a function with domain [0, ) and that K is a constant.
C8:
lim K  K
x 
C9:
1
0
x  x
lim
C10:
1
lim r  0 for all r  0
x  x
Data Structures and Algorithms in Java
33
Limit of a Rational Function
Given a rational function the last two rules are sufficient if a little algebra is employed:
5 10
 2
7 x 2  5 x  10
x x
lim

lim
x  3 x 2  2 x  5
x 
2 5
3  2
x x
5
10
lim 7  lim  lim 2
x 
x  x
x  x

2
5
lim 3  lim  lim 2
x 
x  x
x  x
700

3 0 0
7

3
7
Data Structures and Algorithms in Java
Divide by highest power of
x from the denominator.
Take limits term by term.
Apply theorem C3.
34
Infinite Limits
In some cases, the limit may be infinite. Mathematically, this means that the limit does
not exist.
C11:
lim x r   for all r  0
C12:
lim log b x   
C13:
x
 
lim e x  
x
x 
7 x  5 x  10
 lim
x 
x 
2x  5
2
Example:
lim
7x  5 
10
x
5
2
x
10
lim 7 x  lim 5  lim
x 
x 
x  x


5
lim 2  lim
x 
x  x
Data Structures and Algorithms in Java
35
l'Hôpital's Rule
In some cases, the reduction trick shown for rational functions does not apply:
7 x  5 log( x)  10
lim
 ??
x 
2x  5
In such cases, l'Hôpital's Rule is often useful. If f(x) and g(x) are differentiable
functions such that
lim f ( x)  lim g ( x)  
x c
x c
This also applies if
the limit is 0.
then:
f ( x)
f ( x)
lim
 lim
x c g ( x )
x c g ( x)
Data Structures and Algorithms in Java
36
l'Hôpital's Rule Examples
Applying l'Hôpital's Rule:
5
7
7 x  5 log( x)  10
7
x
lim
 lim

x 
x


2x  5
2
2
Another example:
x 3  10
3x 2
6x
6
lim
 lim x  lim x  lim x  0
x
x 
x  e
x  e
x  e
e
Recall that:


D e f ( x )  e f ( x ) D f ( x)
Data Structures and Algorithms in Java
37
Mathematical Induction
Mathematical induction is a technique for proving that a statement is true for all
integers in the range from N0 to , where N0 is typically 0 or 1.
First (or Weak) Principle of Mathematical Induction
Let P(N) be a proposition regarding the integer N, and let S be the set of all
integers k for which P(k) is true. If
1)
N0 is in S, and
2)
whenever N is in S then N+1 is also in S,
then S contains all integers in the range [N0, ).
To apply the PMI, we must first establish that a specific integer, N0, is in S
(establishing the basis) and then we must establish that if a arbitrary integer, N  N0, is
in S then its successor, N+1, is also in S.
Data Structures and Algorithms in Java
38
Induction Example
Theorem: For all integers n  1, n2+n is a multiple of 2.
proof: Let S be the set of all integers for which n2+n is a multiple of 2.
If n = 1, then n2+n = 2, which is obviously a multiple of 2. This establishes the
basis, that 1 is in S.
Now suppose that some integer k  1 is an element of S. Then k2+k is a multiple of
2. We need to show that k+1 is an element of S; in other words, we must show that
(k+1)2+(k+1) is a multiple of 2. Performing simple algebra:
(k+1)2+(k+1) = (k2 + 2k + 1) + (k + 1) = k2 + 3k + 2
Now we know k2+k is a multiple of 2, and the expression above can be grouped to
show:
(k+1)2+(k+1) = (k2 + k) + (2k + 2) = (k2 + k) + 2(k + 1)
The last expression is the sum of two multiples of 2, so it's also a multiple of 2.
Therefore, k+1 is an element of S.
Therefore, by PMI, S contains all integers [1, ).
Data Structures and Algorithms in Java
QED
39
Inadequacy of the First Form of Induction
Theorem: Every integer greater than 3 can be written as a sum of 2's and 5's.
(That is, if N > 3, then there are nonnegative integers x and y such that N = 2x + 5y.)
This is not (easily) provable using the First Principle of Induction. The problem is that
the way to write N+1 in terms of 2's and 5's has little to do with the way N is written in
terms of 2's and 5's. For example, if we know that
N = 2x + 5y
we can say that
N + 1 = 2x + 5y + 1 = 2x + 5(y – 1) + 5 + 1 = 2(x + 3) + 5(y – 1)
but we have no reason to believe that y – 1 is nonnegative. (Suppose for example that
N is 9.)
Data Structures and Algorithms in Java
40
"Strong" Form of Induction
There is a second statement of induction, sometimes called the "strong" form, that is
adequate to prove the result on the preceding slide:
Second (or Strong) Principle of Mathematical Induction
Let P(N) be a proposition regarding the integer N, and let S be the set of all
integers k for which P(k) is true. If
1)
N0 is in S, and
2)
whenever N0 through N are in S then N+1 is also in S,
then S contains all integers in the range [N0, ).
Interestingly, the "strong" form of induction is logically equivalent to the "weak" form
stated earlier; so in principle, anything that can be proved using the "strong" form can
also be proved using the "weak" form.
Data Structures and Algorithms in Java
41
Using the Second Form of Induction
Theorem: Every integer greater than 3 can be written as a sum of 2's and 5's.
proof: Let S be the set of all integers n > 3 for which n = 2x + 5y for some
nonnegative integers x and y.
If n = 4, then n = 2*2 + 5*0. If n = 5, then n = 2*0 + 5*1. This establishes the
basis, that 4 and 5 are in S.
Now suppose that all integers from 4 through k are elements of S, where k 
5. We need to show that k+1 is an element of S; in other words, we must show
that k+1 = 2r + 5s for some nonnegative integers r and s.
Now k+1  6, so k-1  4. Therefore by our assumption, k-1 = 2x + 5y for
some nonnegative integers x and y. Then, simple algebra yields that:
k+1 = k-1 + 2 = 2x + 5y + 2 = 2(x+1) + 5y,
whence k+1 is an element of S.
Therefore, by the Second PMI, S contains all integers [4, ).
QED
Data Structures and Algorithms in Java
42
Amortized Complexity
• Amortized analysis:
– Analyzes sequences of operations
– Can be used to find the average complexity of a
worst case sequence of operations
• By analyzing sequences of operations rather
than isolated operations, amortized analysis
takes into account interdependence between
operations and their results
Data Structures and Algorithms in Java
43
Amortized Complexity (continued)
Worst case:
C(op1, op2, op3, . . .) = Cworst(op1) + Cworst(op2) + Cworst(op3) + . . .
Average case:
C(op1, op2, op3, . . .) = Cavg(op1) + Cavg(op2) + Cavg(op3) + . . .
Amortized:
C(op1, op2, op3, . . .) = C(op1) + C(op2) + C(op3) + . . .
Where C can be worst, average, or best case complexity
Data Structures and Algorithms in Java
44
Amortized Complexity (continued)
Figure 2-6 Estimating the amortized cost
Data Structures and Algorithms in Java
45
NP-Completeness
• A deterministic algorithm is a uniquely defined
(determined) sequence of steps for a particular
input
– There is only one way to determine the next step
that the algorithm can make
• A nondeterministic algorithm is an algorithm
that can use a special operation that makes a
guess when a decision is to be made
Data Structures and Algorithms in Java
46
NP-Completeness (continued)
• A nondeterministic algorithm is considered
polynomial: its running time in the worst case is
O(nk) for some k
• Problems that can be solved with such
algorithms are called tractable and the
algorithms are considered efficient
• A problem is called NP-complete if it is NP (it
can be solved efficiently by a nondeterministic
polynomial algorithm) and every NP problem can
be polynomially reduced to this problem
Data Structures and Algorithms in Java
47
NP-Completeness (continued)
• The satisfiability problem concerns Boolean
expressions in conjunctive normal form (CNF)
Data Structures and Algorithms in Java
48
Summary
• Computational complexity measures the degree
of difficulty of an algorithm.
• This measure of efficiency is called asymptotic
complexity.
• To evaluate an algorithm’s efficiency, use logical
units that express a relationship.
• This measure of efficiency is called asymptotic
complexity.
Data Structures and Algorithms in Java
49
Summary (continued)
• Introduced in 1894, the big-O notation specifies
asymptotic complexity, which estimates the rate
of function growth.
• An algorithm is called constant if its execution
time remains the same for any number of
elements.
• It is called quadratic if its execution time is O(n2).
• Amortized analysis analyzes sequences of
operations.
Data Structures and Algorithms in Java
50
Summary (continued)
• A deterministic algorithm is a uniquely defined
(determined) sequence of steps for a particular
input.
• A nondeterministic algorithm is an algorithm that
can use a special operation that makes a guess
when a decision is to be made.
• A nondeterministic algorithm is considered
polynomial.
Data Structures and Algorithms in Java
51