00-overview_math_algorithm_analysis

Download Report

Transcript 00-overview_math_algorithm_analysis

TCSS 342, Winter 2005
Lecture Notes
Course Overview,
Review of Math Concepts,
Algorithm Analysis and Big-Oh Notation
Weiss book, Chapter 5, pp. 147-181
1
Course objectives


(broad) prepare you to be a good software
engineer
(specific) learn basic data structures and
algorithms


data structures – how data is organized
algorithms – unambiguous sequence of steps to
compute something
2
Software design goals

What are some goals one should have for good
software?
3
Course content

data structures
algorithms

data structures + algorithms = programs



algorithm analysis – determining how long an
algorithm will take to solve a problem
Who cares? Aren't computers fast enough and getting
faster?
4
An example
Given an array of 1,000,000 integers,
find the maximum integer in the array.
…
0
1
2
999,999
Now suppose we are asked to find the kth
largest element (The Selection Problem)
5
Candidate solutions

candidate solution 1



sort the entire array (from small to large), using
Java's Arrays.sort()
pick out the (1,000,000 – k)th element
candidate solution 2


sort the first k elements
for each of the remaining 1,000,000 – k elements,


keep the k largest in an array
pick out the smallest of the k survivors
6
Is either solution good?



Is there a better solution?
What makes a solution "better" than another?
Is it entirely based on runtime?
How would you go about determining which
solution is better?
 could code them, test them
 could somehow make predictions and
analysis of each solution, without coding
7
Why algorithm analysis?



as computers get faster and problem sizes get
bigger, analysis will become more important
The difference between good and bad
algorithms will get bigger
being able to analyze algorithms will help us
identify good ones without having to program
them and test them first
8
Why data structures?



when programming, you are an engineer
engineers have a bag of tools and tricks – and
the knowledge of which tool is the right one for
a given problem
Examples: arrays, lists, stacks, queues, trees,
hash tables, heaps, graphs
9
Development practices


modular (flexible) code
appropriate commenting of code





each method needs a comment explaining its
parameters and its behavior
writing code to match a rigid specification
being able to choose the right data structures
to solve a variety of programming problems
using an integrated development environment
(IDE)
incremental development
10
Math background: exponents

Exponents


XY = "X to the Yth power";
X multiplied by itself Y times
Some useful identities





XA XB = XA+B
XA / XB = XA-B
(XA)B = XAB
XN+XN = 2XN
2N+2N = 2N+1
11
Logarithms

Logarithms




definition: XA = B if and only if logX B = A
intuition: logX B means "the power X must be raised
to, to get B"
a logarithm with no base implies base 2
log B means log2 B
Examples


log2 16 = 4
log10 1000 = 3
(because 24 = 16)
(because 103 = 1000)
12
Logarithms, continued

log AB = log A + log B




Proof: (let's write it together!)
log A/B = log A – log B
log (AB) = B log A
log C B
log A B 
log C A

A, B, C  0, A  1
example:
log432 = (log2 32) / (log2 4)
=5/2
13
Arithmetic series

Series
k
 Expr
i j

for some expression Expr (possibly containing i ),
means the sum of all values of Expr with each value
of i between j and k inclusive
4

Example:

 2i  1
i 0
= (2(0) + 1) + (2(1) + 1) + (2(2) + 1)
+ (2(3) + 1) + (2(4) + 1)
=1+3+5+7+9
= 25
14
Series identities

sum from 1 through N inclusive
N ( N  1)
i

2
i 1
N

is there an intuition for this identity?

sum of all numbers from 1 to N
1 + 2 + 3 + ... + (N-2) + (N-1) + N

how many terms are in this sum? Can we
rearrange them?
15
Series

sum of powers of 2
N
2
i
2
N 1
1
i 0



0 + 1 + 2 + 4 + 8 + 16 + 32 = 64 - 1
think about binary representation of numbers...
sum of powers of any number a
("Geometric progression" for a>0, a ≠1)
N 1
N
1 a
i
a 

1 a
i 0

16
Algorithm performance



How to determine how much time an algorithm A uses
to solve problem X ?
 Depends on input; use input size "N" as parameter
 Determine function f(N) representing cost
empirical analysis: code it and use timer
running on many inputs
algorithm analysis: Analyze steps of
algorithm, estimating amount of work each
step takes
17
RAM model


Typically use a simple model for basic operation
costs
RAM (Random Access Machine) model




RAM model has all the basic operations:
+, -, *, / , =, comparisons
fixed sized integers (e.g., 32-bit)
infinite memory
All basic operations take exactly one time unit (one
CPU instruction) to execute
18
Critique of the model

Strengths:




simple
easier to prove things about the model than the
real machine
can estimate algorithm behavior on any
hardware/software
Weaknesses:


not all operations take the same amount of time in
a real machine
does not account for page faults, disk accesses,
limited memory, floating point math, etc
19
Why use models?
model

real world
Idea: useful statements using the model translate into
useful statements about real computers
20
Relative rates of growth



most algorithms' runtime can be expressed as a
function of the input size N
rate of growth: measure of how quickly the
graph of a function rises
goal: distinguish between fast- and slowgrowing functions


we only care about very large input sizes
(for small sizes, most any algorithm is fast enough)
this helps us discover which algorithms will run
more quickly or slowly, for large input sizes
21
Growth rate example
Consider these graphs of functions.
Perhaps each one represents an algorithm:
n3 + 2n2
100n2 + 1000
• Which grows
faster?
22
Growth rate example
• How about now?
23
Big-Oh notation



Defn:
T(N) = O(f(N))
if there exist positive constants c , n0 such that:
T(N)  c · f(N) for all N  n0
idea: We are concerned with how the function
grows when N is large. We are not picky about
constant factors: coarse distinctions among
functions
Lingo: "T(N) grows no faster than f(N)."
24
Examples

n = O(2n)
?

2n = O(n)
?

n = O(n2)
?

n2 = O(n)
?

n = O(1)

100 = O(n) ?

10 log n = O(n)

214n + 34 = O(2n2 + 8n)
?
?
?
25
Preferred big-Oh usage

pick tightest bound. If f(N) = 5N, then:
f(N)
f(N)
f(N)
f(N)

=
=
=
=
O(N5)
O(N3)
O(N)
O(N log N)
 preferred
ignore constant factors and low order terms




T(N) = O(N), not T(N) = O(5N)
T(N) = O(N3), not T(N) = O(N3 + N2 + N log N)
Bad style: f(N)  O(g(N))
Wrong: f(N)  O(g(N))
26
Big-Oh of selected functions
27
Ten-fold processor speedup
28
Big omega, theta




Defn: T(N) = (g(N)) if there are positive
constants c and n0 such that T(N)  c g(N) for
all N  n0
Lingo: "T(N) grows no slower than g(N)."
Defn: T(N) = (h(N)) if and only if T(N) =
O(h(N)) and T(N) = (h(N)).
Big-Oh, Omega, and Theta establish a relative
order among all functions of N
29
Intuition, little-Oh
notation
O (Big-Oh)
 (Big-Omega)
 (Theta)
o (little-Oh)

intuition


=
<
Defn: T(N) = o(p(N))
if T(N) = O(p(N)) and T(N)  (p(N))
30
More about asymptotics


Fact: If f(N) = O(g(N)), then g(N) = (f(N)).
Proof: Suppose f(N) = O(g(N)).
Then there exist constants c and n0 such that
f(N)  c g(N) for all N  n0
Then g(N)  (1/c) f(N) for all N  n0,
and so g(N) = (f(N))
31
More terminology

T(N) = O(f(N))



T(N) = (g(N))



f(N) is an upper bound on T(N)
T(N) grows no faster than f(N)
g(N) is a lower bound on T(N)
T(N) grows at least as fast as g(N)
T(N) = o(h(N))

T(N) grows strictly slower than h(N)
32
Facts about big-Oh

If T1(N) = O(f(N)) and T2(N) = O(g(N)), then



If T(N) is a polynomial of degree k, then:
T(N) = (Nk)


T1(N) + T2(N) = O(f(N) + g(N))
T1(N) * T2(N) = O(f(N) * g(N))
example: 17n3 + 2n2 + 4n + 1 = (n3)
logk N = O(N), for any constant k
33
Techniques


Algebra
ex. f(N) = N / log N
g(N) = log N
same as asking which grows faster, N or log
Evaluate:
f (N )
lim
N  g ( N )
limit is
0
c0

no limit
2
N
Big-Oh relation
f(N) = o(g(N))
f(N) = (g(N))
g(N) = o(f(N))
no relation
34
Techniques, cont'd

L'Hôpital's rule:
If lim f ( N )   and lim g ( N )   , then
N 
N 
f (N )
f '(N )
lim
 lim
N  g ( N )
N  g ' ( N )
example: f(N) = N, g(N) = log N
Use L'Hôpital's rule
f'(N) = 1, g'(N) = 1/N
 g(N) = o(f(N))
35
Program loop runtimes
for (int i = 0; i < n; i += c)
statement(s);

Adding to the loop counter means that the loop runtime grows
linearly when compared to its maximum value n.
for (int i = 0; i < n; i *= c)
statement(s);

// O(n)
// O(log n)
Multiplying the loop counter means that the maximum value n
must grow exponentially to linearly increase the loop runtime;
therefore, it is logarithmic.
for (int i = 0; i < n * n; i += c)
statement(s);

// O(n2)
The loop maximum is n2, so the runtime is quadratic.
36
More loop runtimes
for (int i = 0; i < n; i += c)
for (int j = 0; j < n; i += c)
statement;

Nesting loops multiplies their runtimes.
for (int i = 0; i < n; i += c)
statement;
for (int i = 0; i < n; i += c)
for (int j = 0; j < n; i *= c)
statement;

// O(n2)
// O(n log n)
Loops in sequence add together their runtimes, which means
the loop set with the larger runtime dominates.
37
Maximum subsequence sum
The maximum contiguous subsequence sum
problem:
Given a sequence of integers A0, A1, ..., An - 1,
j
find the maximum value of
A
k i
k
for any integers 0  (i, j) < n.
(This sum is zero if all numbers in the sequence
are negative.)
38
First algorithm (brute force)
try all possible combinations of subsequences
// implement together
function maxSubsequence(array[]):
max sum = 0
for each starting index i,
for each ending index j,
add up the sum from Ai to Aj
if this sum is bigger than max,
max sum = this sum
return max sum

What is the runtime (Big-Oh) of this algorithm? How
could it be improved?
39
Second algorithm (improved)


still try all possible combinations, but don't
redundantly add the sums
j
j 1
key observation:
A A 
A

k i


k
j

k i
k
in other words, we don't need to throw away partial sums
can we use this information to remove one of the
loops from our algorithm?
// implement together
function maxSubsequence2(array[]):

What is the runtime (Big-Oh) of this new algorithm?
Can it still be improved further?
40
Third algorithm (improved!)


must avoid trying all possible combinations; to do this,
we must find a way to broadly eliminate many
potential combinations from consideration
Claim #1: A subsequence with a negative sum cannot
be the start of the maximum-sum subsequence.
41
Third algorithm, continued

Claim #1, more formally:
If Ai, j is a subsequence such that
j
A
k i
k
0
,
then there is no q such that Ai,q is the
maximum-sum subsequence.

Proof: (do it together in class)

Can this help us produce a better algorithm?
42
Third algorithm, continued

Claim #2: when examining subsequences left - to right, for some starting index i, if Ai,j becomes the first
subsequence starting with i ,
j
such that
A
k i
k
0
Then no part of Ai,j can be part of the maximum-sum
subsequence.

(Why is this a stronger claim than Claim #1?)

Proof: (do it together in class)
43
Third algorithm, continued

These figures show the possible contents of Ai,j
44
Third algorithm, continued

Can we eliminate another loop from our
algorithm?
// implement together
function maxSubsequence3(array[]):


What is its runtime (Big-Oh)?
Is there an even better algorithm than this third
algorithm? Can you make a strong argument
about why or why not?
45
Kinds of runtime analysis

Express the running time as f(N), where N is
the size of the input



worst case: your enemy gets to pick the input
average case: need to assume a probability
distribution on the inputs
amortized: your enemy gets to pick the
inputs/operations, but you only have to guarantee
speed over a large number of operations
46
References

Weiss book, Chapter 5, pp. 147-181
47