PARTITION(A, p, r) - Computer Graphics at Stanford University

Download Report

Transcript PARTITION(A, p, r) - Computer Graphics at Stanford University

Quicksort!
A practical sorting algorithm
• Quicksort
 Very efficient
• Tight code
• Good cache performance
• Sort in place
 Easy to implement
 Used in older version of C++ STL sort(begin, end)
 Average case performance: O(n log n)
 Worst case performance: O(n2)
Quicksort
• Divide & Conquer
 Divide into two arrays around the last element
 Recursively sort each array
QUICKSORT
Divide around
this element
Quicksort
• Divide & Conquer
 Divide into two arrays around the last element
 Recursively sort each array
QUICKSORT
Divide around
this element
Quicksort
• Divide & Conquer
 Divide into two arrays around the last element
 Recursively sort each array
QUICKSORT
Exchange
Quicksort
• Divide & Conquer
 Divide into two arrays around the last element
 Recursively sort each array
QUICKSORT
QUICKSORT
Partitioning the array
PARTITION(A, p, r)
x = A[r]
i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
i
j
x
Partitioning the array
PARTITION(A, p, r)
x = A[r]
i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
i
j
x
Partitioning the array
PARTITION(A, p, r)
x = A[r]
i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
i
j
x
Partitioning the array
PARTITION(A, p, r)
x = A[r]
i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
i
j
x
Partitioning the array
PARTITION(A, p, r)
x = A[r]
i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
i
j x
Putting it together: Quicksort
QUICKSORT(A, p, r)
if p < r
then q = PARTITION(A, p, r)
QUICKSORT(A, p,
q-1)
QUICKSORT(A, q+1, r)
QUICKSORT
QUICKSORT
p
QUICKSORT
q
r
Putting it together: Quicksort
QUICKSORT(A, p, r)
if p < r
then q = PARTITION(A, p, r)
QUICKSORT(A, p,
q-1)
QUICKSORT(A, q+1, r)
PARTITION(A, p, r)
x = A[r]; i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
• To sort an array A of length n:
QUICKSORT(A, 1, n)
Running Time
QUICKSORT(A, p, r)
if p < r
then q = PARTITION(A, p, r)
QUICKSORT(A, p,
q-1)
QUICKSORT(A, q+1, r)
PARTITION(A, p, r)
x = A[r]; i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
• PARTITION(A, p, r)
 Outside loop:
• ~5 operations
• 0 comparisons
 Inside loop:
• r-p comparisons
• c(r-p) total operations, for some c ~= 5
Time O(r-p)
Running Time
QUICKSORT(A, p, r)
if p < r
then q = PARTITION(A, p, r)
QUICKSORT(A, p,
q-1)
QUICKSORT(A, q+1, r)
PARTITION(A, p, r)
x = A[r]; i = p–1
for j = p to r-1
if A[j] <= x
then i++; exchange A[i] and A[j]
exchange A[i+1] and A[r]
return i+1
• QUICKSORT (A, 1, n)
• Let T(n) be the worst-case running time, for any n
• Then, T(n) ≤ max{1 ≤ q ≤ n-1} ( T(q-1) + T(n-q) + (n) )
(why?)
Worst-Case Running Time
•
Sort A = [n, n-1, n-2, …, 3, 2, 1]
1.
2.
3.
4.
5.
6.
7.
PARTITION(A, 1, n)
PARTITION(A, 2, n)
PARTITION(A, 2, n-1)
PARTITION(A, 3, n-1)
PARTITION(A, 3, n-2)
PARTITION(A, 4, n-2)
PARTITION(A, 4, n-3)
…
n. PARTITION(A, n/2, n/2+1)
A = [1, n-1, n-2, …, 3, 2, n]
A = [1, n-1, n-2, …, 3, 2, n]
A = [1, 2, n-2, …, 3, n-1, n]
A = [1, 2, n-2, …, 3, n-1, n]
A = [1, 2, 3, …, n-2, n-1, n]
Total comparisons: (n-1) + (n-2) + … + 1 = (n2)
Worst-Case Running Time
T(n) ≤ max{1 ≤ q ≤ n-1} ( T(q-1) + T(n – q) + (n) )
Substitution Method:
• Guess T(n) < cn2
T(n) ≤ max{1 ≤ q ≤ n} ( c(q – 1)2 + c(n – q)2 ) + dn
= c×max{1 ≤ q ≤ n} ((q – 1)2 + (n – q)2 ) + dn
 This is max for q = 1 (or, q = n)
= c(n2 – 2n + 1) + dn
= cn2 – c(2n – 1) + dn
≤ cn2,
if we pick c such that c(2n – 1) > dn
A randomized version of Quicksort
• Pick a random number in the array as a pivot
RANDOMIZED-PARTITION(A, p, r)
i = RANDOM(p, r)
exchange A[r] and A[i]
return PARTITION(A, p, r)
RANDOMIZED-QUICKSORT(A, p, r)
if p < r
then q = RANDOMIZED-PARTITION(A, p, r)
RANDOMIZED-QUICKSORT(A, p, q – 1)
RANDOMIZED-QUICKSORT(A, q + 1, r)
• What is the expected running time?
Expected running time of Quicksort
• Observe that the running time is a constant times the number of
comparison
Lemma
Let X be the number of comparisons (A[j] <= x) performed by
PARTITION.
Then, the running time of QUICKSORT is (n+X)
Proof
There are n calls to PARTITION.
Each call does a constant amount of work, plus the for loop.
Each iteration of the for loop executes the comparison A[j] <= x.
Therefore, running time of PARTITION is a (# comparisons)
A quick review of probability
Define a sample space S, which is a set of events
S = { A1, A2, …. }
Each event A has a probability P(A), such that
1. P(A) ≥ 0
2. P(S) = P(A1) + P(A2) + … = 1
3. P(A or B) = P(A) + P(B)
More generally, for any subset T = { B1, B2, … } of S,
P(T) = P(B1) + P(B2) + …
A quick review of probability
• Example: Let S be all sequences of 2 coin tosses
S = {HH, HT, TH, TT}
• Let each head/tail occur with 50% = 0.5 probability, independent of
the other coin flips
• Then P(HH) = P(HT) = P(TH) = P(TT) = 0.25
• Let T = “no two tails occurring one after another”
= { HH, HT, TH }
• Then, P(T) = P(HH) + P(TH) + P(TT) = 0.75
A quick review of probability
• A random variable X is a function from S to real numbers
X: S  R
Example: X = “$3 for every heads”
X: {HH, HT, TH, TT}  R
X(HH) = 6;
X(HT) = X(TH) = 3;
X(TT) = 0
A quick review of probability
• Given two random variables X and Y,
• The joint probability density function f(x, y) is
 f(x, y) = P(X = x and Y = y)
• Then,
 P(X = x0) = y P(X = x0 and Y = y)
 P(Y = yo) = x P(X = x and Y = y0)
• The conditional probability
 P(X = x | Y = y) = P(X = x and Y = y) / P(Y = y)
A quick review of probability
• Two random variables X and Y are independent, if for all
values x and y,
P(X = x and Y = y) = P(X = x) P(Y = y)
Example:
X = 1, if first coin toss is heads; 0 otherwise
Y = 1, if second coin toss is heads; 0 otherwise
Easy to check that X and Y are independent
A quick review of probability
• The expected value E(X) of a random variable X, is
E(X) = x x P(X = x)
• Going back to the coin tosses, estimate E(X)
E(X) = 6 P(X = 6) + 3 P(X = 3) + 0 P(X = 0)
= 6×0.25 + 3×0.5 + 0×0.25
= 1.5 + 1.5 + 0 = 3
A quick review of probability
Linearity of expectation:
E(X + Y) = E(X) + E(Y)
For any function g(x),
E(g(X)) = x g(x) P(X = x)
Assume X and Y are independent:
E[XY] = x y xyP(X = x and Y = y)
= x y P(X = x) P(Y = y)
=
x P(X = x) y P(Y = y)
= E(X) E(Y)
A quick review of probability
Define indicator variable I(A), where A is an event
1, if A occurs
 I(A) =
0, otherwise
Lemma: Given a sample space S, and an event A,
E[ I(A) ] = P(A)
Proof:
E[ I(A) ] = 1×P(A) + 0×P(S – A) = P(A)
Back to Quicksort
Lemma
Let X be the number of comparisons (A[j] <= x) performed by
PARTITION.
Then, the running time of QUICKSORT is O(n+X)
Define indicator random variables:
Xij = I( zi is compared to zj during the course of the algorithm )
Then, the total number of comparisons performed is:
X = i=1…n-1 j=i+1…n Xij
Expected running time of Quicksort
• Running time = X = i=1…n-1 j=i+1…n Xij
• Expected running time = E[X]
E[X] = E[ i=1…n-1 j=i+1…n Xij ]
= i=1…n-1 j=i+1…n E[ Xij ]
= i=1…n-1 j=i+1…n P( zi is compared to zj )
• We just need to estimate P( zi is compared to zj )
Expected running time of Quicksort
During partition of zi……zj
• Once a pivot p is selected
 p is compared to all elements except itself
 no element < p is ever compared to an element > p
smaller than pivot
larger than pivot
Expected running time of Quicksort
• Denote by Zij = { zi,……,zj }
In order to compare zi and zj
• First element chosen as pivot within Zij should be zi, or zj
P( zi is compared to zj ) =
= P( zi or zj is first pivot from Zij )
= P( zi is first pivot from Zij ) + P( zj is first pivot from Zij )
= 1/(j – i +1) + 1/(j – i + 1)
= 2/(j – i + 1)
Expected running time of Quicksort
Now we are ready to compute E[X]:
E[X] = i=1…n-1 j=i+1…n 2/(j – i + 1)
Let k = j – i
E[X] = i=1…n-1 j=i+1…n 2/(k + 1)
< i=1…n j=1…n 2/k
= i=1…n (lg n + O(1))
= O(n lg n )
by CLRS A.7
Selection: Find the ith number
2
7
4
1
3
5
6
8
9
10
13
11
12
• The ith order statistics is the ith smallest element
14
15