Algorithm Efficiency & Sorting

(Walls & Mirrors - Remainder of Chapter 9)
Overview
• Mergesort
• Quicksort
• Finding the Kth Smallest Item
• Radix Sort
Mergesort: Basic Idea
1) Split an array into two halves.
2) Sort each half by recursively invoking mergeSort.
3) Merge the two sorted halves in the correct order.
• If we do enough splitting, we will eventually obtain an
array with one element, which is trivial to sort — i.e. do
nothing (base case).
• Most of the work goes into merging.
Mergesort: Example
Problem: Sort the following array from smallest to largest.
index:     0     1     2     3     4     5
a[ ]:     38    16    27    39    12    27

                          SPLIT
           38  16  27               39  12  27
           mergeSort                mergeSort
           16  27  38               12  27  39
                          merge
                 12  16  27  27  38  39
Mergesort: Example (Cont’d.)
Details behind the two calls to mergeSort (on previous slide):
Left half:
           38  16  27
              SPLIT
        38  16        27
       mergeSort     mergeSort
        16  38        27
              merge
           16  27  38

Right half:
           39  12  27
              SPLIT
        39  12        27
       mergeSort     mergeSort
        12  39        27
              merge
           12  27  39
Mergesort: Example (Cont’d.)
Details behind the four calls to mergeSort (on previous slide):
mergeSort on [ 38  16 ]:
           38  16
            SPLIT
         38      16
      mergeSort   mergeSort
         38      16
            merge
           16  38

mergeSort on [ 39  12 ]:
           39  12
            SPLIT
         39      12
      mergeSort   mergeSort
         39      12
            merge
           12  39

(The single-element halves, [ 27 ] and [ 27 ], are base cases: mergeSort
returns them unchanged.)
Mergesort: C++ Implementation
void mergeSort( DataType a[ ], int first, int last )
{
    if( first < last )
    {
        // find the midpoint of a[ first .. last ]
        int mid = (first + last) / 2;
        // sort each half
        mergeSort( a, first, mid );
        mergeSort( a, mid + 1, last );
        // merge the two halves
        merge( a, first, mid, last );
    }
}
Merge: Basic Idea
Problem: Merge sorted sub-arrays, a[ first .. mid ] and a[ mid+1 ..
last ], so that a[ first .. last ] is put in sorted order.
1) Set first1 = first and first2 = mid + 1.
2) If a[first1] < a[first2], copy a[first1] into array temp[ ] and
increment first1; otherwise, copy a[first2] into temp[ ] and
increment first2.
3) Repeat step 2 with new first1 or new first2 until all elements of
either a[ first .. mid ] or a[ mid+1 .. last ] have been copied into
temp[ ].
4) If all elements of a[ first .. mid ] were copied to temp[ ], then
copy remaining elements of a[ mid+1 .. last ] into temp[ ].
Otherwise, copy remaining elements of a[ first .. mid ] into
temp[ ].
5) Copy temp[ ] into a[ first .. last ].
Merge: Example
a[ first .. mid ]:   13  21  55        a[ mid+1 .. last ]:   34  89
                     ^ first1                                ^ first2

Copy the smaller of a[ first1 ] and a[ first2 ] into temp[ ] at each step:

   compare       copy      temp[ ]
   13 vs 34       13       13
   21 vs 34       21       13  21
   55 vs 34       34       13  21  34
   55 vs 89       55       13  21  34  55
   (a[ first .. mid ] exhausted; copy the rest of a[ mid+1 .. last ])
                  89       13  21  34  55  89
Merge: C++ Implementation
void merge( DataType a[ ], int first, int mid, int last )
{
    DataType temp[MaxSize];
    // in turn, copy the smallest of a[ first .. mid ] or a[ mid+1 .. last ] into
    // temp[ ] until either sub-array has been copied into temp[ ]
    int first1 = first, last1 = mid, first2 = mid + 1, last2 = last, index;
    for( index = first1; first1 <= last1 && first2 <= last2; index++ )
        if( a[ first1 ] < a[ first2 ] )
        {   temp[ index ] = a[ first1 ];  first1++;  }
        else
        {   temp[ index ] = a[ first2 ];  first2++;  }
    // the function continues on the next slide: copy the remaining elements of
    // a[ first .. mid ] or a[ mid+1 .. last ] into temp[ ], then copy temp[ ] into a[ ]
Merge: C++ Implementation (Cont’d.)
    // copy the remaining elements of a[ first .. mid ] into temp[ ]
    for( ; first1 <= last1; first1++, index++ )
        temp[ index ] = a[ first1 ];
    // copy the remaining elements of a[ mid+1 .. last ] into temp[ ]
    for( ; first2 <= last2; first2++, index++ )
        temp[ index ] = a[ first2 ];
    // copy temp[ first .. last ] into a[ first .. last ]
    for( index = first; index <= last; index++ )
        a[ index ] = temp[ index ];
}
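Putting the two pieces together, a minimal driver is sketched below. It assumes
DataType is typedef'd to int and MaxSize is a large-enough constant (the slides
leave both unspecified; in a single file they would have to appear before
merge( )), and that it is compiled together with the mergeSort( ) and merge( )
definitions above.

#include <iostream>

typedef int DataType;          // assumption: the slides do not define DataType
const int MaxSize = 100;       // assumption: capacity of the temporary array in merge( )

void mergeSort( DataType a[ ], int first, int last );      // defined above
void merge( DataType a[ ], int first, int mid, int last ); // defined above

int main( )
{
    DataType a[ ] = { 38, 16, 27, 39, 12, 27 };    // array from the Mergesort example
    mergeSort( a, 0, 5 );
    for( int i = 0; i <= 5; i++ )
        std::cout << a[ i ] << ' ';                // expected output: 12 16 27 27 38 39
    std::cout << '\n';
    return 0;
}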
Mergesort: Efficiency
• Each merge step merges a[ first .. mid ] with a[ mid+1 ..
last ]. If the total number of elements in these two subarrays is n, then merging them requires at most n – 1
comparisons.
• Each merge step involves n moves from a[ ] to temp[ ] and
n moves from temp[ ] back to a[ ].
• Therefore, each merge step requires at most
n–1+n+n = 3*n–1
data-item operations to merge n elements.
Mergesort: Efficiency (Cont’d.)
• Each call to mergeSort( ) divides the input array in half and
then calls mergeSort( ), recursively, on each half.
• As we saw with binary search, if n = 2^k for some k, then
we can divide an array of n things k times, where k =
log2 n.
• If n is not a power of 2 then, as we also saw with binary
search, k = 1 + ⌊ log2 n ⌋ = 1 + log2 n (rounded down).
• At level m of the recursion, 2^m calls to merge occur. Each
call merges n / 2^m elements, and consequently, requires
3 * (n / 2^m) – 1 major operations.
• The 2^m calls to merge at level m require at most
2^m * [ 3 * (n / 2^m) – 1 ] = 3 * n – 2^m = O( n ) operations
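Summing this per-level cost over all log2 n levels (taking n to be a power of 2)
gives the total; this intermediate step is implicit on the slides:

\[
\sum_{m=0}^{\log_2 n - 1} \bigl( 3n - 2^m \bigr)
  \;=\; 3n \log_2 n \;-\; (n - 1)
  \;=\; O( n \log_2 n ).
\]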
Mergesort: Efficiency (Cont’d.)
• Since there are either log2 n or 1 + log2 n levels of
recursion (depending on the number of times an array of n
things can be divided in half), it follows that, in worst case,
mergeSort( ) is
O( n * (1 + log2 n) ) = O( n * log2 n )
• In the average case, Mergesort is also
O( n * log2 n )
• Note that, after dividing an array in half, binary search
calls itself recursively on only one half of the array.
mergeSort( ), on the other hand, calls itself recursively on
both halves of the array.
Quicksort: Basic Idea
1) Choose a pivot from the array a[ first .. last ] and swap
elements, if necessary, to place it in a[ first ]. (Later, we
will indicate a strategy for choosing the pivot.)
2) Compare each element in a[ first + 1 .. last ] with the pivot.
– If an element is < the pivot, add it to an (initially empty)
block of numbers, S1, that are all < the pivot.
– If an element is ≥ the pivot, add it to an (initially empty)
block of numbers, S2, that are all ≥ the pivot.
3) Arrange a[ first .. last ] to be in the order S1 pivot S2.
4) Invoke quicksort( ) recursively on S1 and on S2.
Most of the work goes into forming S1 and S2.
Quicksort: Observations
Let pivotIndex be the index of the pivot in a[ first .. last ] after
a[ first .. last ] is arranged in the order S1 pivot S2. We cannot
expect that a[ first .. last ] is now completely sorted.
However:
• The elements in a[ first .. pivotIndex – 1] will remain in
a[ first .. pivotIndex – 1] when a[ first .. last ] is completely
sorted. (Their positions relative to each other may change.)
• Likewise, the elements in a[ pivotIndex + 1 .. last ] will
remain in a[ pivotIndex + 1 .. last ] when a[ first .. last ] is
completely sorted.
• The pivot will remain at a[ pivotIndex ] when a[ first .. last ]
is completely sorted.
Quicksort: Example
Problem: Sort the following array from smallest to largest.
index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    15    13

• Arbitrarily select a[0] as the pivot. S1 and S2 are empty.
• We now compare each element in a[1 .. 5] with the pivot.
• Since a[1] ≥ pivot, we add it to S2, yielding

index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    15    13
         pivot   S2   |------- UNKNOWN -------|
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    15    13
         pivot   S2   |------- UNKNOWN -------|

• Since a[2] < pivot, we add it to S1 as the first element.
• However, we swap a[2] with a[1] to keep S1 before S2, so
that S1 pivot S2 is easy to form. This yields

index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1    S2   |---- UNKNOWN ----|
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1    S2   |---- UNKNOWN ----|

• Since a[3] ≥ pivot, we add it to S2, yielding

index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1   |-- S2 --| |-- UNKNOWN --|

Note that this step required no elements of a[0 .. 5] to move.
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1   |-- S2 --| |-- UNKNOWN --|

• Since a[4] ≥ pivot, we add it to S2, yielding

index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1   |----- S2 -----| UNKNOWN

Note that this step required no elements of a[0 .. 5] to move.
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1   |----- S2 -----| UNKNOWN

• Since a[5] < pivot, we add it to S1 by swapping it with the
first element of S2, yielding

index:     0     1     2     3     4     5
a[ ]:     15    11    13    17    15    19
         pivot  |-- S1 --|  |------ S2 ------|

Note that we did not try to insert a[5] as the first element of
S1. This would require extra work.
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     15    11    13    17    15    19
         pivot  |-- S1 --|  |------ S2 ------|

• Note that every element of a[0 .. 5] is in either S1, S2, or pivot.
We now arrange a[0 .. 5] in the order: S1 pivot S2. This is done
by swapping the pivot with the last element of S1, yielding

index:     0     1     2     3     4     5
a[ ]:     13    11    15    17    15    19
         |-- S1 --| pivot   |------ S2 ------|

Note that we did not shift all the elements of S1 in order to
make room for the pivot after the last element of S1. This would
require extra work.
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     13    11    15    17    15    19
         |-- S1 --| pivot   |------ S2 ------|

• Note that a[0 .. 5] is not yet sorted.
• However, every element of a[0 .. 5] to the left of pivot is
< pivot, and every element to the right of pivot is ≥ pivot.
(This arrangement is called a partition of a[0 .. 5].)
• When a[0 .. 5] is finally sorted, everything that is now to the
left of pivot will remain on the left, and everything that is
now to the right of pivot will remain on the right.
Consequently, pivot will remain in its current position when
a[0 .. 5] is completely sorted.
Quicksort: Example (Cont’d.)
index:     0     1     2     3     4     5
a[ ]:     13    11    15    17    15    19
         |-- S1 --| pivot   |------ S2 ------|

• We now invoke quicksort( ) recursively on S1 and on S2.
• Consider first invoking quicksort( ) on S1:

index:     0     1
a[ ]:     13    11

• As before, we arbitrarily select a[0] as the pivot. Set new
S1 and S2 to empty.
• Since a[1] < pivot, we add a[1] to S1, yielding
Quicksort: Example (Cont’d.)
index:     0     1
a[ ]:     13    11
         pivot   S1

• Now, every element of a[0 .. 1] is in either S1, S2, or pivot
(S2 is empty). We arrange a[0 .. 1] in the order: S1 pivot
S2. This is done by swapping the pivot with a[1], yielding

index:     0     1
a[ ]:     11    13
           S1  pivot
Quicksort: Example (Cont’d.)
index:     0     1
a[ ]:     11    13
           S1  pivot

• The next step is to invoke quicksort( ) recursively on S1.
• Since S1 contains only one element, quicksort( )
immediately returns without doing any work.
• Now, consider invoking quicksort( ) on the original S2:

index:     3     4     5
a[ ]:     17    15    19
Quicksort: Example (Cont’d.)
index:     3     4     5
a[ ]:     17    15    19

• Arbitrarily select a[3] as the pivot. Set new S1 and S2 to
empty.
• We now compare each element in a[4 .. 5] with the pivot.
• Since a[4] < pivot, we add it to S1, yielding

index:     3     4     5
a[ ]:     17    15    19
         pivot   S1  UNKNOWN
Quicksort: Example (Cont’d.)
index:     3     4     5
a[ ]:     17    15    19
         pivot   S1  UNKNOWN

• Since a[5] ≥ pivot, we add it to S2, yielding

index:     3     4     5
a[ ]:     17    15    19
         pivot   S1    S2

• Since every element of a[3 .. 5] is in either S1, S2, or pivot,
we now swap pivot with a[4] to arrange a[3 .. 5] in the
order: S1 pivot S2. This yields
Quicksort: Example (Cont’d.)
index:     3     4     5
a[ ]:     15    17    19
           S1  pivot   S2

• The next step is to invoke quicksort( ) recursively on S1 and on S2.
• Since S1 and S2 each contain only one element, quicksort( ) returns
immediately from each recursive call without doing any work.
• Array a[0 .. 5] now looks like this

index:     0     1     2     3     4     5
a[ ]:     11    13    15    15    17    19

which is finally in sorted order.
Quicksort: C++ Implementation
void quicksort( DataType a[ ], int first, int last )
{
    int pivotIndex;
    if( first < last )
    {
        // select pivotIndex, and partition a[first .. last] so that each
        // element in a[first .. pivotIndex - 1] < a[pivotIndex], and
        // each element in a[pivotIndex + 1 .. last] >= a[pivotIndex]
        partition( a, first, last, pivotIndex );
        // sort a[first .. pivotIndex - 1] = S1
        quicksort( a, first, pivotIndex - 1 );
        // sort a[pivotIndex + 1 .. last] = S2
        quicksort( a, pivotIndex + 1, last );
    }
}
Partition: Basic Idea
Select pivotIndex, and partition a[ first .. last ] so that
• each element in S1 < a[ pivotIndex ], and
• each element in S2 >= a[ pivotIndex ]
Starting configuration (S1 = S2 = empty):

index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    15    13
         pivot  |--------- UNKNOWN ---------|

General case:

index:     0     1     2     3     4     5
a[ ]:     15    11    19    17    15    13
         pivot   S1    S2   |---- UNKNOWN ----|
Partition: C++ Implementation
void partition( DataType a[ ], int first, int last, int &pivotIndex )
{
    // select a pivot and place it in a[ first ]
    choosePivot( a, first, last );
    DataType pivot = a[ first ];
    int lastS1 = first, firstUnknown = first + 1;
    for( ; firstUnknown <= last; firstUnknown++ )
        if( a[ firstUnknown ] < pivot )
        {
            lastS1++;
            // swap a[ firstUnknown ] with the first element of S2
            swap( a[ firstUnknown ], a[ lastS1 ] );
        }
    swap( a[ first ], a[ lastS1 ] );
    pivotIndex = lastS1;
}
ChoosePivot: C++ Implementation
// one heuristic for choosing a Pivot and placing it in a[ first ]
void choosePivot( DataType a[ ], int first, int last )
{
    int pivotIndex, mid = (first + last) / 2;
    // set pivotIndex to the index of the middle value of
    // a[ first ], a[ mid ], and a[ last ]
    if( a[ first ] <= a[ mid ] )
    {
        if( a[ mid ] <= a[ last ] )  pivotIndex = mid;
        else  pivotIndex = ( a[ first ] <= a[ last ] ? last : first );
    }
    else if( a[ last ] <= a[ mid ] )  pivotIndex = mid;
    else  pivotIndex = ( a[ first ] <= a[ last ] ? first : last );
    swap( a[ first ], a[ pivotIndex ] );
}
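A minimal driver for the three routines above is sketched here; it assumes
DataType is int and that swap( ) is std::swap (the slides never define either),
and that it is compiled together with quicksort( ), partition( ), and
choosePivot( ).

#include <iostream>
#include <utility>

typedef int DataType;      // assumption: the slides do not define DataType
using std::swap;           // assumption: the swap( ) used above is std::swap

void quicksort( DataType a[ ], int first, int last );   // defined above, along with
                                                         // partition( ) and choosePivot( )
int main( )
{
    DataType a[ ] = { 15, 19, 11, 17, 15, 13 };    // array from the Quicksort example
    quicksort( a, 0, 5 );
    for( int i = 0; i <= 5; i++ )
        std::cout << a[ i ] << ' ';                // expected output: 11 13 15 15 17 19
    std::cout << '\n';
    return 0;
}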
Quicksort: Observations
• Note that the heuristic that selects a Pivot by choosing the middle
value of a[ first ], a[ mid ], and a[ last ] guarantees that, while first
< mid < last, S1 and S2 will each contain at least one element.
• Consequently, while first < mid < last, the segment passed to each
recursive call to quicksort( ) (S1 or S2) contains at least two fewer
elements than a[ first .. last ], since it excludes the pivot and at
least one of a[ first ], a[ mid ], and a[ last ].
• Also, note that this heuristic will always choose the middle value
of a[ first .. last ] if a[ first .. last ] is already arranged in either
ascending or descending order.
• Finally, note that in quicksort( ), most of the work is done by
partition( ) BEFORE the recursive calls to quicksort( ). On the
other hand, in mergeSort( ), most of the work is done by merge( )
AFTER the recursive calls to mergeSort( ).
Quicksort: Efficiency
• In the worst case, the pivot is always chosen as the smallest
or largest element in a[ ]. In these cases, each recursive call
to quicksort( ) produces a partition with either S1 or S2
empty.
• In these cases, the array segment processed by quicksort( )
decreases by only one element with each recursive call.
• Since n – 1 data-item comparisons are required to partition
an array of n elements, the total number of data-item
comparisons required by quicksort( ) in the worst case is
given by
(n – 1) + (n – 2) + (n – 3) + . . . + 2 + 1 = n * (n – 1) / 2
• If the preceding, choosePivot( ), is used, the worst-case
number of comparisons to partition n elements is given by
(n – 1) + (n – 3) + (n – 5) + . . . + 3 + 1 = n^2 / 4
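As a quick check of that closed form (not on the slide): for even n the terms
1, 3, 5, …, n – 1 are the first n/2 odd numbers, so

\[
(n-1) + (n-3) + \cdots + 3 + 1
  \;=\; \sum_{i=1}^{n/2} (2i - 1)
  \;=\; \left( \frac{n}{2} \right)^{2}
  \;=\; \frac{n^{2}}{4}.
\]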
Quicksort: Efficiency (Cont’d.)
• In the worst case, choosePivot( ) requires 3 data-item
comparisons to determine the middle value of a[ first ],
a[ mid ], and a[ last ]. Consequently, choosePivot( ) adds at
most 3 * (n – 1) data-item comparisons to the preceding.
• Note that partition( ) requires no data-item moves to place
an element in S2. (We simply increment firstUnknown.)
• However, a swap of data elements, requiring 3 data-item
moves, is required to place an element in S1.
• In the worst case, each data-item comparison needed to
partition a[ ] will be followed by swapping an element into
S1.
• Therefore, in the worst case, quicksort( ) will require O( n^2 )
data-item comparisons and O( n^2 ) data-item moves to sort
an array of n elements.
Quicksort: Summary
• In the worst case, Quicksort requires O( n^2 ) time to sort an
array of n elements, similar to Insertion sort, Selection
sort, and Bubble sort.
• Further analysis shows that Quicksort’s average running
time is O( n * log2 n ), like Mergesort.
• However, in practice, Quicksort usually runs faster than
Mergesort, and Quicksort does not need the extra space for
a temporary array as Mergesort does.
Kth Smallest Item: Basic Idea
1) As with quicksort( ), choose a pivot from the array a[first .. last]
and swap elements, if necessary, to place it in a[first].
2) Compare each element in a[first + 1 .. last] with the pivot.
– If an element is < the pivot, add it to an (initially empty)
block of numbers, S1, that are all < the pivot.
– If an element is ≥ the pivot, add it to an (initially empty)
block of numbers, S2, that are all ≥ the pivot.
3) Arrange a[first .. last] to be in the order S1 pivot S2.
– If S1 contains ≥ k items, then the kth smallest is in S1.
Invoke kthSmallest( ) recursively on S1.
– If S1 contains k – 1 items, then the kth smallest = pivot. Return
pivot.
– Otherwise, the kth smallest is in S2. Invoke kthSmallest( )
recursively on S2.
Kth Smallest Item: Example
Problem: Find the 4th smallest item in the following.

index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    16    13

• Arbitrarily select a[0] as the pivot. S1 and S2 are empty.
• We now compare each element in a[1 .. 5] with the pivot.
• Since a[1] ≥ pivot, we add it to S2, yielding

index:     0     1     2     3     4     5
a[ ]:     15    19    11    17    16    13
         pivot   S2   |------- UNKNOWN -------|
Kth Smallest Item: Example (Cont’d.)
• Continue as with quicksort( ) until we obtain

index:     0     1     2     3     4     5
a[ ]:     13    11    15    17    16    19
         |-- S1 --| pivot   |------ S2 ------|

• As with quicksort( ), everything to the left of pivot is
< pivot and everything to the right of pivot is ≥ pivot.
• In other words, pivot is now in the position it would be in
if a[0 .. 5] was completely sorted.
• Since pivot is in the 3rd position of a[0 .. 5], it is the 3rd
smallest item.
• Therefore, we know that the 4th smallest must be in S2.
Kth Smallest Item: Example (Cont’d.)
• Invoke kthSmallest( ) recursively on S2:

index:     3     4     5
a[ ]:     17    16    19

• Arbitrarily select a[3] as the pivot. Set new S1 and S2 to
empty.
• We now compare each element in a[4 .. 5] with the pivot.
• Since a[4] < pivot, we add it to S1, yielding

index:     3     4     5
a[ ]:     17    16    19
         pivot   S1  UNKNOWN
Kth Smallest Item: Example (Cont’d.)
• Continue as with quicksort( ) until we obtain

index:     3     4     5
a[ ]:     16    17    19
           S1  pivot   S2

• From the previous recursive step we know that a[2] contains the 3rd
smallest item and the 1st and 2nd smallest items are to the left of a[2]
(in some unknown order).
• Consequently, S1 pivot S2 must contain the 4th, 5th, and 6th smallest
items (in some order).
• Since all items in S1 < pivot ≤ all items in S2, and since S1 contains
only a[3], we immediately conclude that a[3] is the desired 4th
smallest element.
Kth Smallest Item: C++ Implementation
DataType kthSmallest( int k, DataType a[ ], int first, int last )
{
    int pivotIndex;
    if( first < last )
    {
        // select pivotIndex, and partition a[first .. last] so that each
        // element in a[first .. pivotIndex - 1] < a[pivotIndex], and
        // each element in a[pivotIndex + 1 .. last] >= a[pivotIndex]
        partition( a, first, last, pivotIndex );
        int sizeS1 = pivotIndex - first;
        if( k <= sizeS1 )
            return kthSmallest( k, a, first, pivotIndex - 1 );
        if( k == sizeS1 + 1 ) return a[ pivotIndex ];
        return kthSmallest( k - (sizeS1 + 1), a, pivotIndex + 1, last );
    }
    return a[ first ];   // base case: a single element is the kth smallest
}
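A small usage sketch, under the same assumptions as the earlier drivers
(DataType is int, swap( ) is std::swap, and partition( ) / choosePivot( ) are
compiled alongside). Note that kthSmallest( ) rearranges a[ ] as a side effect
of partitioning.

#include <iostream>
#include <utility>

typedef int DataType;                                // assumption: DataType is int
using std::swap;

DataType kthSmallest( int k, DataType a[ ], int first, int last );   // defined above

int main( )
{
    DataType a[ ] = { 15, 19, 11, 17, 16, 13 };      // array from the Kth smallest example
    std::cout << kthSmallest( 4, a, 0, 5 ) << '\n';  // expected output: 16
    return 0;
}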
Kth Smallest Item: Efficiency
• Recall that quicksort( ) invokes itself recursively on both
S1 and S2. However, kthSmallest( ) invokes itself
recursively on only one of S1 or S2 or not at all — if the
desired item is the pivot.
• As with quicksort( ), most of the work in kthSmallest( )
occurs in the partitioning step.
• Analysis similar to that done for quicksort( ) shows that the
average and worst-case running time of kthSmallest( ) is
the same as quicksort( ), namely
Average:     O( n * log2 n )
Worst-case:  O( n^2 )
• However, in the best case, the first pivot chosen is the
desired item, yielding O( n ) running time in the best case.
Radix Sort: Basic Idea
1) Arrange a list of (integer) numbers into groups, depending on the
value of the last digit in each number:
– numbers ending in “0” are placed in the 1st group;
– numbers ending in “1” are placed in the 2nd group; …
– numbers ending in “9” are placed in the 10th group.
2) Merge the groups in the order, 1st, 2nd, …, 10th, without
changing the relative order of the numbers in any group.
3) Arrange the numbers again into groups, now depending on the
value of the next-to-last digit in each number, without changing
the relative order of any number.
4) Repeat steps 2 and 3 for each successive digit position to the left
of the last digit.
5) Sorting is complete when the numbers have been grouped by the
value of the first (leading) digit and then merged back into a
single list.
Radix Sort: What’s a Radix?
• Radix is another word for the base of a number system.
• Specifically, it indicates the number of choices for values that
are allowed in any given digit’s place.
• For example, most of the time we work with a base 10 (or
radix 10) number system, where any of the 10 integers, 0 - 9,
are allowed to represent a digit.
• However, there are other possibilities, most common of which
are
    base 2 (binary):        digits are 0 and 1
    base 8 (octal):         digits are 0 - 7
    base 16 (hexadecimal):  digits are 0 - 9 and A - F
• Radix sort may be applied to strings of characters, where each
character position might contain, for example, one of 97
printable, ASCII characters, including ‘ ’, \t, \n. (Radix = 97)
Radix Sort: Example
Problem: Sort the following numbers
123, 2154, 222, 4, 283, 1560, 1061, 2150
• Pad to the left with zeroes, so that all numbers are of the
same length:
0123, 2154, 0222, 0004, 0283, 1560, 1061, 2150
• Group by rightmost digit:
(1560, 2150), (1061), (0222), (0123, 0283), (2154, 0004)
• Combine the groups into one, without changing the relative
order of any numbers:
1560, 2150, 1061, 0222, 0123, 0283, 2154, 0004
Radix Sort: Example (Cont’d.)
1560, 2150, 1061, 0222, 0123, 0283, 2154, 0004
• Group by the next-to-the-last digit, without changing the
relative order of any numbers within a group:
(0004), (0222, 0123), (2150, 2154), (1560, 1061), (0283)
• Combine the groups into one, without changing the relative
order of any numbers:
0004, 0222, 0123, 2150, 2154, 1560, 1061, 0283
Radix Sort: Example (Cont’d.)
0004, 0222, 0123, 2150, 2154, 1560, 1061, 0283
• Group by the 2nd-to-last digit, without changing the
relative order of any numbers within a group:
(0004, 1061), (0123, 2150, 2154), (0222, 0283), (1560)
• Combine the groups into one, without changing the relative
order of any numbers:
0004, 1061, 0123, 2150, 2154, 0222, 0283, 1560
Radix Sort: Example (Cont’d.)
0004, 1061, 0123, 2150, 2154, 0222, 0283, 1560
• Group by the 3rd-to-last digit, without changing the
relative order of any numbers within a group:
(0004, 0123, 0222, 0283), (1061, 1560), (2150, 2154)
• Combine the groups into one, without changing the relative
order of any numbers:
0004, 0123, 0222, 0283, 1061, 1560, 2150, 2154
Radix sort is now done, with all numbers in sorted order.
Radix Sort: C++ Implementation
void radixSort( int a[ ], int n, int ndigits )
{
    Queue queue[RADIX];                     // RADIX = 10
    for( int j = ndigits - 1; j >= 0; j-- )
    {
        // store a[0 .. n-1] in queue[0] … queue[RADIX - 1]
        for( int i = 0; i < n; i++ )
        {
            int k = digit( a[i], j );       // set k to the jth digit of a[i]
            queue[k].enqueue( a[i] );       // add a[i] to the end of queue[k]
        }
        // replace a[0 .. n-1] with the items from queue[0] … queue[RADIX - 1]
        for( int i = 0, k = 0; k < RADIX; k++ )
            while( queue[k].dequeue( a[i] ) ) i++;
    }
}
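The slide code relies on a Queue class and a digit( ) helper that are not shown.
Below is a minimal sketch of both; the use of std::queue, the exact dequeue( )
contract, and the file-scope NDIGITS constant used by digit( ) are assumptions
for illustration, not part of the slides. In a real program these definitions
(and RADIX) would appear before radixSort( ).

#include <queue>

const int RADIX   = 10;    // the slides assume RADIX = 10 (decimal digits)
const int NDIGITS = 4;     // assumed width for digit( ); matches the 4-digit example

// Minimal Queue (assumed behavior): dequeue( ) returns false when the queue is
// empty; otherwise it stores the front item in 'item', removes it, and returns true.
class Queue
{
public:
    void enqueue( int item )  { q.push( item ); }
    bool dequeue( int &item )
    {
        if( q.empty( ) ) return false;
        item = q.front( );
        q.pop( );
        return true;
    }
private:
    std::queue<int> q;
};

// Hypothetical digit( ) helper: returns the digit of x at position j, counted
// from the left of an NDIGITS-wide, zero-padded decimal number, so that
// j == NDIGITS - 1 selects the rightmost digit (the one processed first).
int digit( int x, int j )
{
    for( int p = NDIGITS - 1 - j; p > 0; p-- )
        x /= 10;             // drop the digits to the right of position j
    return x % 10;
}

With these in place, the earlier example could be sorted by calling
radixSort( a, 8, 4 ) on a[ ] = { 123, 2154, 222, 4, 283, 1560, 1061, 2150 }.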
Radix Sort: Efficiency
• Note that Radix sort requires no data-item comparisons!
• For each digit in a data item, Radix sort enqueues and
dequeues n elements of a[ ].
• The number of data-item moves, therefore, is
ndigits * n * 2
• For a given application, there is a constant, C, such that
ndigits < C. (For example, for 32-bit integers, C = 10.)
• Therefore, by this analysis, Radix sort requires O( n ) time
to sort n items in the best, average, and worst cases.
Radix Sort: Efficiency Caveat
• Although the preceding analysis is technically correct, a more
careful analysis shows that the loop
    for( int i = 0, k = 0; k < RADIX; k++ )
        while( queue[k].dequeue( a[i] ) ) i++;
executes dequeue( ) RADIX + n times.
• The running time of Radix sort is, therefore, proportional to
    ndigits * ( n + RADIX + n )
which is
    O( ndigits * ( RADIX + n ) )
• Although for a given application, there are constants C1 and C2,
such that ndigits < C1 and RADIX < C2, their impact cannot be
ignored if either or both of them is large. (RADIX could be
~100 (≤ 256) for character strings, and ndigits could be 40 or
more for RSA Security keys.)
Radix Sort: Summary
• Radix sort requires O( ndigits * ( RADIX + n )) = O( n ) time to
sort n items in the best, worst, and average cases.
• Radix sort requires no data-item comparisons.
• Radix sort requires that the data items all be of the same length.
If the data items are integers, they are padded (to the left) with
zeroes. If the data items are character strings, they are padded
(to the right) with blanks.
• Radix sort requires extracting each digit from each data item, or
n * ndigits extractions. (On a binary computer, this will be
most efficient if the RADIX is a power of 2 — e.g. 2, 8 or 16.)
• Radix sort requires extra storage and software to manage the
queues that are required. (Number of queues = RADIX.)
• For the three previous reasons, Radix sort can be somewhat
awkward to use, in spite of its O( n ) running time.
Radix Sort: Summary (Cont’d.)
• In practice, it is sometimes sufficient to use Radix sort on
the leading digits only (to form a set of queues), and then
use some other method, such as Quicksort or Insertion
sort, to sort the individual queues.
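One way to realize that hybrid is sketched below; it assumes non-negative
integers of at most ndigits decimal digits, and std::sort stands in for the
"other method" on each bucket. The name hybridRadixSort and the details are
illustrative, not the textbook's code.

#include <algorithm>
#include <vector>

// Hypothetical hybrid: one radix pass on the leading digit, then a
// comparison sort on each bucket, then concatenation in digit order.
void hybridRadixSort( int a[ ], int n, int ndigits )
{
    int divisor = 1;
    for( int i = 1; i < ndigits; i++ )
        divisor *= 10;                              // 10^(ndigits - 1)

    std::vector<int> bucket[10];                    // one bucket per leading digit
    for( int i = 0; i < n; i++ )
        bucket[ (a[i] / divisor) % 10 ].push_back( a[i] );

    int out = 0;
    for( int d = 0; d < 10; d++ )                   // smaller leading digits first
    {
        std::sort( bucket[d].begin( ), bucket[d].end( ) );
        for( int v : bucket[d] )
            a[ out++ ] = v;                         // copy the sorted bucket back
    }
}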
Growth Rates for Selected Sorting Algorithms
                   Best case       Average case     Worst case
Selection sort     n^2             n^2 (†)          n^2
Bubble sort        n               n^2 (†)          n^2
Insertion sort     n               n^2 (†)          n^2
Mergesort          n * log2 n      n * log2 n       n * log2 n
Quicksort          n * log2 n      n * log2 n       n^2
Radix sort         n (‡)           n (‡)            n (‡)

†  According to Knuth, the average growth rate of Insertion sort is 0.9 times
   the average growth rate of Selection sort and 0.4 times the average growth
   rate of Bubble sort.
‡  This is really O( ndigits * ( RADIX + n ) ).