powerpoint - Asian Institute of Technology

Download Report

Transcript powerpoint - Asian Institute of Technology

Data Structures and
Algorithms (AT70.02)
Comp. Sc. and Inf. Mgmt.
Asian Institute of Technology
Instructor: Prof. Sumanta Guha
Slide Sources: CLRS “Intro. To
Algorithms” book website
(copyright McGraw Hill)
adapted and supplemented
CLRS “Intro. To Algorithms”
Ch. 8: Sorting in Linear Time
Theorem 8.1: Any comparison sort algorithm requires (n log n) comparisons in the
worst case.
Proof: The decision tree must have n! leaves, one corresponding to each possible
output. Moreover, the branching-factor of the tree is two, i.e., the tree is binary,
because each comparison gives one of two results.
The height of the decision tree is the number of comparisons made in the worst case.
Therefore, the minimum possible height of a binary tree with n! leaves is the number of
comparisons in the worst case.
Min. possible ht. = log (n!) = (n logn) (using Stirling’s approximation).
COUNTING-SORT assumes that each of the input elements is an integer
in the range 0 to k, inclusive.
A [1..n] is the input array, B [1..n] is the output array, C [0..k] is a
temporary working array.
The running time of COUNTING-SORT is (k+n) (why?), which is (n) if k ≤ n.
Intuitively, explain why COUNTING-SORT can beat the lower bound of (n logn)
for comparison sorts. Doesn’t it do comparisons as well…!?
A stable sort is one where items with the same value appear in the
same order in the output as they appeared in the input.
Lemma 8.3: Given n d-digit numbers in which each digit can take up to k possible
values, RADIX-SORT correctly sorts these numbers in (d (n +k)) time.
Proof: COUNTING-SORT on each column * no. of columns
Is COUNTING-SORT stable?!!
BUCKET-SORT assumes each input element
A [i ] lies in 0 ≤ A [i ] < 1 (i.e., normalized).
A [1..n] is the input array, B [0..n-1] is an
auxiliary array of linked lists (=buckets).
Analysis of BUCKET-SORT
The running time is
T(n) = (n) + ∑i=0..n-1 O(ni2)
where ni is the random variable representing the number of
elements in bucket B[i], because insertion sort is quadratic-time.
Therefore,
E[T(n)] = E[ (n) + ∑i=0..n-1 O(ni2) ] = (n) + ∑i=0..n-1 E[O(ni2)]
= (n) + ∑i=0..n-1 O(E[ni2])
Claim : E[ni2] = 2 – 1/n
If the claim is proved:
E[T(n)] = (n) + ∑i=0..n-1 O(2 – 1/n)
= (n) + n * O(2 – 1/n)
= (n)
To prove the claim E[ni2] = 2 – 1/n, let the indicator variable
Xij = I{A[j] falls in bucket i}
for i = 0, …, n-1 and j = 1, 2, …, n.
Therefore, ni = ∑j=1..n Xij and
E[ni2] = E[ (∑j=1..n Xij)2 ]
= ∑j=1..n E[Xij2] + ∑j=1..n ∑k=1..n & k≠j E[Xij Xik]
Now, Xij is 1 with probability 1/n and 0 with probability 1 – 1/n.
Therefore,
E[Xij2] = 12 * 1/n + 02 * (1 – 1/n) = 1/n
When k ≠ j, Xij and Xik and independent, so
E[Xij Xik] = E[Xij] * E[Xik] = 1/n * 1/n = 1/n2
Substituting above:
E[ni2] = ∑j=1..n 1/n + ∑j=1..n ∑k=1..n & k≠j 1/n2
= n * 1/n + n(n-1) 1/n2
= 1 + (n-1)/n = 2 – 1/n
Problems

Ex. 8.1-4
You are given a sequence of n elements to sort. The input sequence
consists of n/k subsequences, each containing k elements. The elements in
a given subsequence are all smaller than the elements in the succeeding
subsequence and larger than the elements in the preceding subsequence.
Thus, all that is needed to sort the whole sequence of length n is to sort
the k elements in each of the n/k subsequences. Show an (n lg k) lower
bound on the number of comparisons needed to solve this variant of the
sorting problem. (Hint: It is not rigorous to simply combine the lower
bounds for the individual subsequences.)




Ex.
Ex.
Ex.
Ex.
8.2-1
8.2-2
8.2-3
8.2-4
Problems





Ex.
Ex.
Ex.
Ex.
Ex.
8.3-1
8.3-2
8.4-1
8.4-2
8.4-3
Let X be a random variable that is equal to the number of heads in two
flips of a fair coin. What is E [X 2]? What is E2[X]?


Prob. 8-4 (see next slide)
Prob. 8-6
Solution to Ex. 8-4 c:
MATCH-JUGS(R, B)
if |R| = 0 // Sets are empty
then return
if |R| = 1 // Sets contain just one jug each
then let R = {r} and B = {b}
output “(r, b)”
return
else r ← a randomly chosen jug in R
compare r to every jug of B
B< ← the set of jugs in B that are smaller than r
B> ← the set of jugs in B that are larger than r
b ← the one jug in B with the same size as r
compare b to every jug of R − {r}
R< ← the set of jugs in R that are smaller than b
R> ← the set of jugs in R that are larger than b
output “(r, b)”
MATCH-JUGS(R<, B<)
MATCH-JUGS(R>, B>)