Counting sort

Download Report

Transcript Counting sort

Lecture 3
Sorting and Selection
Comparison Sort
In a comparison sort, we use only comparison s between
elements to gain order informatio n about input sequence
{a1 , a2 ,..., an }.
Insertion Sort, Merge Sort, Heapsort and Quicksort are
comparison sorts.
Each comparison partitions all permutatio ns of input sequence
into two parts. Therefore, a comparison sort can be viewed as
a decision t ree.
Decision Tree
a1  a2 ?
yes
no
a2  a3 ?
a1  a3 ?
yes
no
1,2,3
a1  a3 ?
yes
1,3,2
no
yes
a2  a3 ?
2,1,3
no
3,1,2
yes
2,3,1
no
3,2,1
Running Time
In a comparison sort, the (worst case) running time is the depth
(or height) of the decision t ree.
Since the decision t ree has n! leaves, its depth T (n) satisfies
T (n)
2
n
 n! 2n  
e
n

12 n
n
 n 1
e   n! e

e
 e 
Thus, T (n)  (n lg n).
n
n 1
Is there a faster sorting of other type?
Counting sort
for i  1 to k
A : 3, 6, 4, 1, 3, 4, 1, 4
do C[i ]  0;
for j  1 to length[ A]
do C[ A[ j ]]  C[ A[ j ]]  1;
for i  2 to k
do C[i ]  C[i ]  C[i  1];
for j  length[ A] downto 1
do begin
B[C[ A[ j ]]]  A[ j ];
C[ A[ j ]]  C[ A[ j ]]  1;
end - for
C : 2, 0, 2, 3, 0, 1
C : 2, 2, 4, 7, 7, 8
A : 3, 6, 4, 1, 3, 4, 1, 4̂
B : , , , , , , 4,
C : 2, 2, 4, 6, 7, 8
A : 3, 6, 4, 1, 3, 4, 1̂, 4
A : 3, 6, 4, 1, 3, 4̂, 1, 4
B : , 1, , , , , 4,
B : , 1, , , , 4, 4,
C : 1, 2, 4, 6, 7, 8
C : 1, 2, 4, 5, 7, 8
A : 3, 6, 4, 1, 3̂, 4, 1, 4
A : 3, 6, 4, 1̂, 3, 4, 1, 4
B : , 1, , 3, , 4, 4,
B : 1, 1, , 3, , 4, 4,
C : 1, 2, 3, 5, 7, 8
C : 0, 2, 3, 5, 7, 8
A : 3, 6, 4̂, 1, 3, 4, 1, 4
A : 3, 6̂, 4, 1, 3, 4, 1, 4
B : 1, 1, , 3, 4, 4, 4,
B : 1, 1, , 3, 4, 4, 4, 6
C : 0, 2, 3, 4, 7, 8
A : 3̂, 6, 4, 1, 3, 4, 1, 4
B : 1, 1, 3, 3, 4, 4, 4, 6
C : 0, 2, 2, 4, 7, 7
C : 0, 2, 3, 4, 7, 7
Counting sort (Running Time)
for i  1 to k
O (k )
do C[i ]  0;
for j  1 to length[ A]
O (n)
do C[ A[ j ]]  C[ A[ j ]]  1;
for i  2 to k
O (k )
do C[i ]  C[i ]  C[i  1];
for j  length[ A] downto 1
O (n)
do begin
B[C[ A[ j ]]]  A[ j ];
C[ A[ j ]]  C[ A[ j ]]  1;
end - for
O(n  k )
A Student’s Variation
for i  1 to k
do C[i ]  0;
for j  1 to length[ A]
do C[ A[ j ]]  C[ A[ j ]]  1;
k  length[C ];
for j  length[ A] downto 1
do begin
B[ j ]  k ;
C[k ]  C[k ]  1;
while C[k ]  0 do
k  k  1;
end - for
A : 3, 6, 4, 1, 3, 4, 1, 4
C : 2, 0, 2, 3, 0, 1
B: , , , , , , ,6
C : 2, 0, 2, 3, 0, 0
B : , , , , , , 4, 6
B : , , , , , 4 , 4, 6
B : , , , , 4, 4, 4, 6
C : 2, 0, 2, 0, 0, 0
B : , , , 3, 4, 4, 4, 6
B : , , 3, 3, 4, 4, 4, 6
Counting sort is stable.
That is, the same value appear in the output
array in the same order as they do in the
input array.
Radix Sort
In input array A, each element is a number of d digit.
Radix - Sort ( A, d )
for i  1 to d
do " use a stable sort to sort array A on digit i;
329
720
720
329
457
657
839
355
436
457
329
436
839
355
436
457
436
720
657
329
355
457
657
720
355
839
657
839
Running Time
O(d (n  k ))
if each digit is in the range 1 to k .
Since k  O(n), O(d (n  k ))  O(dn).
An Application:
Selection Problem
Input : A set A of n distinct numbers and an integer i,
1  i  n.
Output : x  A, which is the ith smallest number, i.e.,
larger tha n exactly i  1 elements in A.
1. Sort input numbers.
2. Count from the leftmost t o the ith to the left.
O(n  k )
If we use comparison, what is the lower bound?
There are only n possibliti es :
(*, , , ..., )
( ,* , , ..., )
...
( , , , ... , *)
Therefore, O(log n) is a lower bound.
O (n)
Randomized Selection
Rondomized  Select ( A, p, r , i )
if p  r
then return A[ p ];
q  Randomized - Partition ( A, p, r );
k  q  p  1;
if i  k
then return Randomized - Select ( A, p, q, i )
else if i  1  k ,
then return A[q  1]
else return Randomized - Select ( A, q  2, r , i  k  1);
Running time
n 1
1
E[T (n)]   (E[T (max( k , n  k  1))]  c1n) for some c1  0.
k 0 n
Guess E[T (n)]  cn. Prove it by induction.
E[T (1)]  c. (choose c such that E[T (1)]  c.)
Assume E[T (k )]  ck for k  n. Then,
n 1
1
E[T (n)]  (2  ck )  c1n
n k  n / 2 
c
 ( n / 2  n  1)(( n  1)  n / 2  1)  c1n
n
3
 c   n  c1n  cn. (choose c  4c1 )
4
Selection with O(n) Comparisons
Algorithm Select :
1. Divide n elements into n/5  groups of 5 elements, possibly
except one of less than 5 elements.
2. Find the median of each group by insertion sort. In case,
the exceptiona l one has two medians, take the larger one.
3. Use Select recursivel y to find the median x of n/5  group
medians.
4. Exchange x with the last element in input array and apply
Partition subroutine . Let k be the number of elements on the
low side of the partition.
5. Use Select recursivel y to find the ith smallest element on the
low side if i  k , or the (i  k ) th smallest element on the high
side if i  k .
1. Divide n elements into n/5  groups of 5 elements, possibly
except one of less than 5 elements.
O (n)
2. Find the median of each group by insertion sort. In case,
the exceptiona l one has two medians, take the smaller one.
3. Use Select recursivel y to find the median x of n/5  group
medians.
x
T ( n / 5)
4. Exchange x with the last element in input array and apply
Partition subroutine . Let k be the number of elements on the
low side of the partition.
x
O (n)
Partition ( A,1, n)
high side
low side
x
k
x
x
5. Use Select recursivel y to find the ith smallest element on the
low side if i  k , or x is the ith smallest one if i  1  k ,
or the (i  k  1) th smallest element on the high side if i  1  k .
x
Partition ( A,1, n)
high side
low side
x
k
x
x
n  k 1
 1  n 
T (n  3(      1))
 2  5 
Why? See next page!!!
(
(
+ low side) contains red part
+ high side) contains blue part
 1  n 
k  1  3(      1)
 2  5 
 1  n 
n  k  3(      1)
 2  5 
Analysis
Therefore,
  1  n  
T (n)  T ( n / 5)  T (n  3      1)  O(n)
  2  5  
In each iteration, at least
  1  n    3n
3      1 
3
  2  5    10
elements are deleted.
Thus, Select is called recursivel y on at most 7 n / 10  3 elements.
Analysis
 1  n 
T (n)  T ( n / 5)  T (n  3(      1))  O(n)
 2  5 
Guess T (n)  O(n), i.e., T (n)  cn.
Prove it by induction.
T (1)  c (choose c to satisfy T (1)  c !).
Assume T (k )  ck for k  n.
Then
T (n)  c(n / 5  1)  c(7 n / 10  3)  c' n
 9cn / 10  4c  c' n  cn
whe n choose c big enough such that c(n / 10  4)  c' n.
What we learnt in this lecture?
•
•
•
•
•
What is “comparison sort”?.
Lower bound for comparison sort.
Counting sort and Radix sort.
What is selection problem?
expected linear time and linear time
selections.
Puzzle
If we use comparison s to solve selection problem,
then in the worst case, how many comparison s has
to be done to find a median?