Transcript Lecture2PL

Introduction to Algorithm
Analysis Concepts
15-211
Fundamental Data Structures and
Algorithms
Peter Lee
January 15, 2004
Plan
 Today
Introduction to some basic concepts in
the design of data structures
 Reading:
For today: Chapter 5 and 7.1-7.3
For next time: Chapter 18 and 19
Homework 1 is available!
 See the Blackboard
 Due Monday, Jan.19, 11:59pm
A First Data Structure
Lists of integers
 Let’s start with a very simple data
structure
 Lists of integers, with operations
such as:
create a new empty list
return the length of the list
add an integer to the end of the list
…
Implementing lists
 How shall we implement this?
 What design process could we use?
 One answer:
Think mathematically
Think inductively
Induction
 Recall proofs by induction:
 If trying to prove that a property
P(n) holds for all natural numbers
0, 1, 2, …, then
Prove the base case of P(0)
For n>0, assume P(n-1), show that
P(n) holds
Inductive definitions
 A great deal of computer science
can be defined inductively
 For example, we can define the
factorial function as follows:
fact(0) = 1 (base case)
fact(n) = n * fact(n-1), for n>0 (inductive case)
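The inductive definition of factorial carries over almost line for line into a recursive Java method. A minimal sketch: the class name FactDemo is made up for illustration and is not from the lecture, and the long result overflows for n greater than 20.

```java
class FactDemo {
    // Direct transcription of the inductive definition of factorial.
    static long fact(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        if (n == 0) {
            return 1;              // base case: fact(0) = 1
        }
        return n * fact(n - 1);    // inductive case, for n > 0
    }

    public static void main(String[] args) {
        System.out.println(fact(5));   // 5 * 4 * 3 * 2 * 1
    }
}
```

Each branch of the method corresponds to one clause of the definition, which is the pattern the list operations follow too.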
Inductive definitions
 An integer list is either
an empty list (base case), or
an integer paired with an integer list (inductive case)
Integer lists in Java
The inductive definition gives us
guidance on ways to implement
integer lists in Java
One possibility (not really the best):
 An integer list is either
an empty list (use null), or
an integer paired with an integer list (define a new List class)
Integer lists in Java
public class List {
int head;
List tail;
public List(int n, List l) {
head = n;
tail = l;
}
}
 An integer list is either
an empty list, or
an integer paired with an integer list
How about the
length operation?
Another inductive definition
 The length of a list L is
0, if L is the empty list
1 + length of the tail of L, otherwise
Implementing length()
public class ListOps {
public static int length (List l) {
if (l==null)
return 0;
else
return 1 + length(l.tail);
}
}
The add operation
 The add of n onto the end of list L
is
the singleton list containing n, if L is
the empty list
otherwise, a list whose head is the
head of L and the tail is M
 where M is the result of adding n onto
the end of the tail of L
Implementing add()
public class ListOps {
…
public static List add (int n, List l) {
if (l==null)
return new List(n, null);
else
return new List(l.head, add(n, l.tail));
}
}
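To see length() and add() working together, here is a small self-contained sketch. The nested class Cell plays the role of the lecture's List class (renamed here only to avoid confusion with java.util.List), and the show() helper and the class name ListDemo are assumptions added for illustration.

```java
class ListDemo {
    // Stand-in for the lecture's List class: an int paired with a list.
    static final class Cell {
        final int head;
        final Cell tail;
        Cell(int head, Cell tail) { this.head = head; this.tail = tail; }
    }

    // 0 for the empty list, otherwise 1 + length of the tail.
    static int length(Cell l) {
        return (l == null) ? 0 : 1 + length(l.tail);
    }

    // Returns a new list with n at the end; the input list is not modified.
    static Cell add(int n, Cell l) {
        if (l == null) return new Cell(n, null);
        return new Cell(l.head, add(n, l.tail));
    }

    // Hypothetical helper: renders a list as "[1, 2, 3]".
    static String show(Cell l) {
        StringBuilder sb = new StringBuilder("[");
        for (Cell c = l; c != null; c = c.tail) {
            sb.append(c.head);
            if (c.tail != null) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        Cell l = null;          // the empty list
        l = add(1, l);
        l = add(2, l);
        l = add(3, l);
        System.out.println(show(l) + " has length " + length(l));
    }
}
```

Note that add() copies every cell on the way down, so each call builds a fresh list rather than changing the old one.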
Running time
 How much time does it take to
compute length()?
 and also add()?
The “step”
 In order to abstract from a
particular piece of hardware,
operating system, and language,
we will focus on counting the
number of steps of an algorithm
 A “step” should execute in
constant time
That is, its execution time should not
vary much when the size of the input
varies
Constant-time operations
public class ListOps {
public static int length (List l) {
if (l==null)
return 0;
else
return 1 + length(l.tail);
}
}
The recursive call, length(l.tail), is the only operation in length()
that does not run in a constant
amount of time.
Hence, we want to know how many
times this operation is invoked.
Constant-time operations
public static int length(List l) {
if (l==null)
return 0;
else
return 1 + length(l.tail);
}
Each call to length() requires at most
a constant amount of time plus the
time for a recursive call on the tail
So, the “steps” we want are the
number of recursive calls
length()
 How many steps for length()?
for a list with N elements, length()
makes N recursive calls, so it requires N steps
 Since length() requires ~N steps
for an “input” of size N, we say
that length() runs in linear time
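One way to check the step count is to instrument length() with a counter. This is only a sketch: the counter field, the range() helper, and the class name StepCounter are assumptions added for illustration, and Cell stands in for the lecture's List class.

```java
class StepCounter {
    static final class Cell {
        final int head;
        final Cell tail;
        Cell(int head, Cell tail) { this.head = head; this.tail = tail; }
    }

    static int recursiveCalls = 0;

    static int length(Cell l) {
        if (l == null) return 0;
        recursiveCalls++;               // one recursive call per element
        return 1 + length(l.tail);
    }

    // Hypothetical helper: builds the list 1, 2, ..., n.
    static Cell range(int n) {
        Cell l = null;
        for (int i = n; i >= 1; i--) l = new Cell(i, l);
        return l;
    }

    // Number of recursive calls length() makes on a list of n elements.
    static int countCalls(int n) {
        recursiveCalls = 0;
        length(range(n));
        return recursiveCalls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 10, 100}) {
            System.out.println("N = " + n + ": " + countCalls(n) + " recursive calls");
        }
    }
}
```

The count grows in direct proportion to N, which is what "linear time" means here.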
Why do we care about “steps”?
Running time as a function of the input size n:
n            100n μs     7n^2 μs      2^n μs
1            100 μs      7 μs         2 μs
5            0.5 ms      175 μs       32 μs
10           1 ms        0.7 ms       1 ms
45           4.5 ms      14 ms        1 year
100          10 ms       70 ms        10^16 years
1,000        100 ms      7 sec        --
10,000       1 sec       12 min       --
1,000,000    1.6 min     0.22 year    --
Our goal
 Our goal is to compare algorithms
against each other
Not compute the “wall-clock” time
 We will also want to know if an
algorithm is “fast”, “slow”, or
maybe so slow as to be impractical
What about add()?
public static List add(int n,
List l) {
if (l==null)
return new List(n, null);
else
return new List(l.head,
add(n, l.tail));
}
Let’s Try Something a
Bit Harder…
Reverse
 The reversal of a list L is:
L, if L is empty
otherwise, the head of L added to the
end of M
 where M is the reversal of the tail of L
Implementing reverse()
public static List reverse(List l) {
if (l==null)
return null;
else {
List r = reverse(l.tail);
return add(l.head, r);
}
}
How many steps?
 How many “steps” does reverse
take?
 Think back to the inductive
definition:
The reversal of a list L is:
 L, if L is empty
 otherwise, the head of L added to M
• where M is the reversal of the tail of L
Running time for reverse
The running time is given by the
following recurrence equation:
t(0) = 0
t(n) = n + t(n-1)
where n is the time required to add the
head to the end, and t(n-1) is the time
required to reverse the tail
Solving for t would tell
us how many steps it
takes to reverse a list
Reverse
t(0) = 0
public static List reverse(List l) {
if (l==null)
return null;
else {
List r = reverse(l.tail);
return add(l.head, r);
}
}
t(n) = n + t(n-1)
Solving recurrence equations
 A common first step is to use repeated
substitution:
t(n) = n + t(n-1)
     = n + (n-1) + t(n-2)
     = n + (n-1) + (n-2) + t(n-3)
     and so on…
     = n + (n-1) + (n-2) + (n-3) + … + 1
Klaus says that this is easy…
t(n) = n + (n-1) + (n-2) + … + 1 = n(n+1)/2
But how on earth did he come up with this
beautiful little closed-form solution?
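Before deriving the closed form, it can at least be checked empirically. The sketch below counts every call to add() made while reversing a list of n elements and compares the total with n(n+1)/2. The counter, the range() helper, and the class name ReverseSteps are assumptions for illustration, and Cell stands in for the lecture's List class.

```java
class ReverseSteps {
    static final class Cell {
        final int head;
        final Cell tail;
        Cell(int head, Cell tail) { this.head = head; this.tail = tail; }
    }

    static long addCalls = 0;

    static Cell add(int n, Cell l) {
        addCalls++;                              // count every call to add()
        if (l == null) return new Cell(n, null);
        return new Cell(l.head, add(n, l.tail));
    }

    static Cell reverse(Cell l) {
        if (l == null) return null;
        return add(l.head, reverse(l.tail));
    }

    // Hypothetical helper: builds the list 1, 2, ..., n.
    static Cell range(int n) {
        Cell l = null;
        for (int i = n; i >= 1; i--) l = new Cell(i, l);
        return l;
    }

    // Total calls to add() needed to reverse a list of n elements.
    static long stepsToReverse(int n) {
        Cell l = range(n);
        addCalls = 0;
        reverse(l);
        return addCalls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 2, 10, 100}) {
            long closedForm = (long) n * (n + 1) / 2;
            System.out.println("n = " + n + ": counted " + stepsToReverse(n)
                    + ", n(n+1)/2 = " + closedForm);
        }
    }
}
```

Adding the head to a reversed tail of n-1 elements costs n calls to add(), which is exactly the n in t(n) = n + t(n-1).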
Incrementing series
 By the way, this is an arithmetic series
that comes up over and over again in
computer science, because it
characterizes many nested loops:
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= i; j++) {
        f();  // runs 1 + 2 + … + n = n(n+1)/2 times in total
    }
}
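A quick sketch that counts how often the inner body of such a loop nest runs, assuming inclusive bounds (i from 1 to n, j from 1 to i). The counter stands in for the call to f(), and the class name NestedLoopCount is made up for illustration.

```java
class NestedLoopCount {
    // Counts executions of the inner loop body for a given n.
    static long count(int n) {
        long calls = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= i; j++) {
                calls++;               // stands in for f()
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 10, 100}) {
            long closedForm = (long) n * (n + 1) / 2;
            System.out.println("n = " + n + ": counted " + count(n)
                    + ", n(n+1)/2 = " + closedForm);
        }
    }
}
```

With strict bounds (i < n, j < i) the count is smaller by a linear term, but it is still a sum of consecutive integers and still grows like n^2.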
Mathematical handbooks
 For really common series like this
one, standard textbooks and
mathematical handbooks will
usually provide closed-form
solutions.
So, one way is simply to look up the
answer.
 Another way is to try to think
visually…
Visualizing it
[Figure: grid with both axes labeled 0, 1, 2, 3, …, n]
Area: n^2/2
Area of the leftovers: n/2
So: n^2/2 + n/2
= (n^2+n)/2
= n(n+1)/2
Proving it
 Yet another approach is to start
with an answer or a guess, and
then verify it by induction.
Base case:
t(1) = 1(1+1)/2 = 1
Inductive case:
 for n>1, assume t(n-1) = (n-1)(n-1+1)/2 = (n^2 - n)/2
 then t(n) = n + (n^2 - n)/2
= (n^2 + n)/2
= n(n+1)/2
Summations
 Arithmetic and geometric series
come up everywhere in analysis of
algorithms.
 Some series come up so
frequently that every computer
scientist should know them by
heart.
Quadratic time
 Very roughly speaking,
f(n) = n(n+1)/2
 grows no faster than
g(n) = n^2
 In such cases, we say that
reverse() runs in quadratic time
(we’ll be more precise about this
later in the course)
n^2 is an upper bound
[Plot: f(n) and g(n) for n from 0 to 10, vertical axis from 0 to 120]
How about Sorting?
Everybody knows how to sort an array, but
we have singly linked lists.
As always, think inductively:
sort(nil) = nil
sort(L)
= insert the head into the right
place in sort(tail(L))
Ordered Insert
Need to insert an element, in order, into an already
sorted list.
[Figure: inserting 12 into the sorted list 2, 5, 10, 20, 50 gives 2, 5, 10, 12, 20, 50]
Code for ordered insert
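public static List order_insert(int x, List l) {
    // Empty list, or a head that is already >= x: x goes in front.
    if (l == null || x <= l.head)
        return new List(x, l);
    // Otherwise keep the head and insert x somewhere in the tail.
    List t = order_insert(x, l.tail);
    return new List(l.head, t);
}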
The running time depends on the position of x in
the new list.
But in the worst case this could take n steps.
Analysis of sort()
sort(nil) = nil
sort(L)
= insert the head into the right
place in sort(tail(L))
t(0) = 0
t(n) = n + t(n-1)
which we already know to be “very
roughly” n^2, or quadratic time.
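The inductive definition of sort can be sketched directly in Java. This is an illustration rather than the lecture's own code: Cell stands in for the List class, orderInsert includes a null check so that inserting into an empty list works, and the of() and show() helpers and the class name ListSort are assumptions.

```java
class ListSort {
    static final class Cell {
        final int head;
        final Cell tail;
        Cell(int head, Cell tail) { this.head = head; this.tail = tail; }
    }

    // Inserts x into an already sorted list, returning a new sorted list.
    static Cell orderInsert(int x, Cell l) {
        if (l == null || x <= l.head) return new Cell(x, l);
        return new Cell(l.head, orderInsert(x, l.tail));
    }

    // sort(nil) = nil; sort(L) = insert head(L) into sort(tail(L)).
    static Cell sort(Cell l) {
        if (l == null) return null;
        return orderInsert(l.head, sort(l.tail));
    }

    // Hypothetical helper: builds a list from the given values, in order.
    static Cell of(int... xs) {
        Cell l = null;
        for (int i = xs.length - 1; i >= 0; i--) l = new Cell(xs[i], l);
        return l;
    }

    // Hypothetical helper: renders a list as "[1, 2, 3]".
    static String show(Cell l) {
        StringBuilder sb = new StringBuilder("[");
        for (Cell c = l; c != null; c = c.tail) {
            sb.append(c.head);
            if (c.tail != null) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        System.out.println(show(sort(of(12, 50, 2, 20, 5, 10))));
    }
}
```

Each of the n elements is inserted into a sorted list of up to n-1 elements, which gives the recurrence t(n) = n + t(n-1) in the worst case.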
Insertion sort
This is yet another example of a doubly nested loop…
for i = 2 to n do
    insert a[i] in the proper place in a[1:i-1]
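The array pseudocode can be written out as a short Java sketch. Java arrays are 0-based, so the outer loop starts at index 1 rather than 2; the class name InsertionSort is an assumption for illustration.

```java
class InsertionSort {
    // Sorts a in place. Before each outer iteration, a[0..i-1] is sorted.
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int x = a[i];
            int j = i - 1;
            // Shift elements larger than x one slot to the right.
            while (j >= 0 && a[j] > x) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = x;      // drop x into its proper place
        }
    }

    public static void main(String[] args) {
        int[] a = {12, 50, 2, 20, 5, 10};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

The inner while loop is the "insert in the proper place" step, and it is the second level of the doubly nested loop.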
How fast is insertion sort?
We’ve essentially counted the number
of computation steps in the worst
case.
But what happens if the elements are
nearly sorted to begin with?
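To get a feel for the answer, the sketch below counts how many element shifts array insertion sort performs on sorted, nearly sorted, and reversed inputs. Counting shifts is just one possible choice of "step"; each outer iteration also does a constant amount of extra work. The helper names and the class name InsertionSortSteps are assumptions for illustration.

```java
class InsertionSortSteps {
    // Sorts a in place and returns the number of shifts in the inner loop.
    static long sortCountingShifts(int[] a) {
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int x = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > x) {
                a[j + 1] = a[j];   // move a larger element one slot right
                j--;
                shifts++;
            }
            a[j + 1] = x;
        }
        return shifts;
    }

    // Hypothetical helper: 0, 1, ..., n-1.
    static int[] ascending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        return a;
    }

    // Hypothetical helper: n, n-1, ..., 1.
    static int[] descending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = n - i;
        return a;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] nearlySorted = ascending(n);
        nearlySorted[0] = 1;       // swap the first two elements so that
        nearlySorted[1] = 0;       // exactly one adjacent pair is out of order
        System.out.println("sorted:        " + sortCountingShifts(ascending(n)));
        System.out.println("nearly sorted: " + sortCountingShifts(nearlySorted));
        System.out.println("reversed:      " + sortCountingShifts(descending(n)));
    }
}
```

On sorted input the inner loop never runs, so the total work is linear; on reversed input every element is shifted past all earlier ones, giving n(n-1)/2 shifts.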
A preview of some questions
 Question: Insertion sort takes n^2
steps in the worst case, and n
steps in the best case. What do
we expect in the average case?
What is meant by “average”?
 Question: What is the fastest that
we could ever hope to sort? How
could we prove our answer?
Worst-case analysis
 We’ll have much more to say, later
in the course, about “worst-case”
vs “average-case” vs “expected
case” performance.
Better sorting
 The sorting algorithm we have just
shown is called insertion sort.
 It is OK for very small data sets,
but otherwise is slow.
 Later we will look at several
sorting algorithms that run in
many fewer steps.
Quiz Break
Doubling summation
Like the incrementing summation, sums of
powers of 2 are also encountered
frequently in computer science.
What is the closed-form solution for this
sum?
2^0 + 2^1 + 2^2 + … + 2^(n-1)
Prove your answer by induction.
Hint 1: Visualizing it
[Figure: a glass with levels labeled 2^(n-5), 2^(n-4), 2^(n-3), 2^(n-2), 2^(n-1)]
Imagine filling a glass by
halves…
Hint 2: Visualizing it
 A somewhat geekier hint:
term    in binary
2^0     1
2^1     10
2^2     100
2^3     1000
2^4     10000
…
What is the sum of this
column of binary numbers?
Proving it
 Base case:
When n=1, then 2^0 = 2^1 - 1
 Induction step, when n>1.
Assume true for n’<n, consider n:
2^0 + 2^1 + … + 2^(n-1) = (2^0 + 2^1 + … + 2^(n-2)) + 2^(n-1)
By the IH, then
= (2^(n-1) - 1) + 2^(n-1) = 2^n - 1
Can you think of an algorithm
whose running time is
characterized by this series?
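The closed form suggested by the base case above, 2^0 + 2^1 + … + 2^(n-1) = 2^n - 1, can also be checked numerically. A minimal sketch, assuming n is at most 62 so that everything fits in a long; the class and method names are made up for illustration.

```java
class PowerSum {
    // Computes 2^0 + 2^1 + ... + 2^(n-1) by adding the terms one by one.
    static long sumOfPowers(int n) {
        long sum = 0;
        long term = 1;                 // 2^0
        for (int i = 0; i < n; i++) {
            sum += term;
            term *= 2;                 // next power of 2
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            long closedForm = (1L << n) - 1;   // 2^n - 1
            System.out.println("n = " + n + ": sum = " + sumOfPowers(n)
                    + ", 2^n - 1 = " + closedForm);
        }
    }
}
```

A numerical check like this is not a proof, but it is a cheap way to catch a wrong guess before attempting the induction.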
Summary
 Counting constant-time “steps” of
computation
 Linear time and quadratic time
 Recurrence equations
 Sums of geometric series
 Simple list algorithms
 Next time: Programming tips