Class notes on "Selection" algorithms.

Download Report

Transcript Class notes on "Selection" algorithms.

Medians and Order Statistics
• i-th order statistic: i-th smallest element
• n elements: median is
– n odd: (n+1)/2
– n even: n/2 or n/2+1
• Assume distinct numbers.
• Input: A, n, 1<=i<=n
• Output: element x of A larger than i-1
elements of A.
Solutions
• O(n log n) time based on …
• O(n) time average.
• O(n) time worst case.
Minimum and Maximum
• How many comparisons?
– At most n-1.
– Examine each element and keep trach of
smallest one:
• Comparison based
• Each element must be compared
• Each must loose once (except winner).
• What about simultaneous min and max?
Min & Max
• Can do with 2n-2 comparisons.
• Can do better
– Form pairs of elements
– Compare elements in each pair
– Pair (ai, ai+1), assume ai < ai+1, then
• Compare (min,ai), (ai+1,max)
• 3 comparisions for each pair.
Average Time Median Selection
• Divide-and-Conquer (prune-and-search).
• Randomized: behavior determined by output
of random number generator.
• Based on QuickSort:
– Partition input array recursively, but
– Work only on one side!
Randomized Selection
• QuickSort(A,p,r)
If p < r then
q=partition(A,p,r)
QuickSort(A,p,q)
QuickSort(A,q+1,r).
– First call: QuickSort(A,1,n)
– After partition(A,p,q):
• A[i]<A[q}, i<q;
• A[q]<A[j}, q<j.
• RandSelect(A,p,r,i)
If p == r then return A[p]
q=RandPartition(A,p,r)
k=q-p+1 /* size of A[p..q]
If i ≤ k then return
RandSelect(A,p,q,i)
Else return
RandSelect(A,q+1,r,i-k).
• First call:
RandSelect(A,1,n,i).
• Returns the i-th smallest
element in A[p..r].
Selection (cont.)
• RandPartition (see 8.3, 8.4 textbook) gives
partition with low side:
– 1 element with probability 2/n
– j elements with probability 1/n, for j=2,3,…,n.
• Assume i-th element always on larger side:
T(n)≤(T(max(1,n-1)+Σk=1..n-1T(max(k,n-k)))/n+O(n)
≤(T(n-1)+2 Σk=n/2..n-1T(k))/n+O(n)
=2(Σk=n/2..n-1T(k))/n+O(n), since T(n-1)=O(n2).
• Then T(n)=O(n) (proof by substitution).
Worst Case Linear Time Selection
• O(n) worst case algorithm.
• Works in similar way: recursively partition input
array
• Idea: guarantee good split
– E.g., in QuickSort assume at each recursion level have
T(n)=T(9n/10)+T(n/10)+O(n).
• Then, T(n)=O(n log n).
– Use deterministic partitioning:
• Compute the element to partition around.
Steps to find i-th smallest element
Algorithm Select
1.
Divide elements in n/5 groups of 5 elements, plus at most one group with
(n mod 5) elements.
Find median of each group:
2.
•
•
3.
Insertion sort: O(1) time (at most 5 elements).
Take middle element (largest if two medians).
Use Select recursively to find median x of medians.
Algorithm Select (cont.)
4. Partition input array around median-ofmedians x. Let k be the number of
elements on low side, n-k on high side.
•
•
a1,a2,…,ak | ak+1,ak+2,…,an
ai < aj, for 1 ≤ i ≤ k, k+1 ≤ j ≤ n.
5. Use Select recursively to:
•
•
Find i-th smallest element on low side, if i ≤ k
Find (i-k)-th smallest on high side, if i > k.
Analysis
• Find lower bound on number of elements greater than x.
– At least half of medians in step 2 greater than x. Then,
– At least half of the groups contribute 3 elements that are greater
than x, except:
• Last group (if less than 5 elements);
• x own group.
– Discard those two groups:
• Number of elements greater than x is ≥ 3((n/5)/2-2)=3n/10-6.
• Similarly, number of elements smaller than x is ≥3n/10-6.
• Then, in worst case, Select is called recursively in Step 5
on at most 7n/10+6 elements (upper bound).
Analysis (cont.)
• Steps 1,2 and 4: O(n) time.
• Step 3: T(n/5)
• Step 5: at most T(7n/10+6)
• 7n/10+6 < n for n > 20.
• T(n) ≤ T(|¯n/5¯|)+T(7n/10+6)+O(n), n > n1.
• Use substitution to solve:
• Assume T(n) ≤ cn, for n > n1; find n1 and c.
Analysis (cont.)
• T(n) ≤ c|¯n/5¯| + c(7n/10+6) + O(n)
≤ cn/5 + c + 7cn/10 + 6c +O(n)
= 9cn/10 + 7c + O(n)
• Want T(n) ≤ cn:
– Pick c such that c(n/10-7) ≥ c1n, where c1 is
constant from O(n) above (n1 = 80).
Questions
• Why not groups of 7 elements?
• Why not groups of 3 elements?
– T(n)=O(?)