Analysis of Algorithms CS 465/665

Download Report

Transcript Analysis of Algorithms CS 465/665

Analysis of Algorithms
CS 477/677
Lecture 8
Instructor: Monica Nicolescu
Quick Announcement
• Hand imaging collection (hand shape)
• Reza Amayeh: [email protected]
• Lab address: LME building room 314
• Time:
– Mon, Wed, Fri: 10:00 am until 5:00pm
– Thu, Thr:
3:00 pm until 5:00pm
CS 477/677 - Lecture 8
2
How Fast Can We Sort?
• Insertion sort, Bubble Sort, Selection Sort
• Merge sort
(nlgn)
• Quicksort
(nlgn)
(n2)
• What is common to all these algorithms?
– These algorithms sort by making comparisons between the input
elements
• To sort n elements, comparison sorts must make (nlgn)
comparisons in the worst case
CS 477/677 - Lecture 8
3
Decision Tree Model
• Represents the comparisons made by a sorting algorithm on an input
of a given size: models all possible execution traces
• Control, data movement, other operations are ignored
• Count only the comparisons
• Decision tree for insertion sort on three elements:
one execution trace
node
leaf:
CS 477/677 - Lecture 8
4
Counting Sort
• Assumption:
– The elements to be sorted are integers in the range 0
to k
• Idea:
– Determine for each input element x, the number of
elements smaller than x
– Place element x into its correct position in the output
array
1
A
2
3
4
5
6
7
8
0
B
2
3
4
2
3
4
5
C 2 2 4 7 7 8
2 5 3 0 2 3 0 3
1
1
5
6
7
8
0 0 2 2 3 3 3 5
CS 477/677 - Lecture 8
5
Analysis of Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
1.
for i ← 0 to k
(k)
2.
do C[ i ] ← 0
3.
for j ← 1 to n
(n)
4.
do C[A[ j ]] ← C[A[ j ]] + 1
5.
C[i] contains the number of elements equal to i
6.
for i ← 1 to k
(k)
7.
do C[ i ] ← C[ i ] + C[i -1]
8.
C[i] contains the number of elements ≤ i
9.
for j ← n downto 1
(n)
10.
do B[C[A[ j ]]] ← A[ j ]
11.
C[A[ j ]] ← C[A[ j ]] - 1
CS 477/677 - Lecture 8
Overall time: (n + k)
6
Analysis of Counting Sort
• Overall time: (n + k)
• In practice we use COUNTING sort when k = O(n)
 running time is (n)
• Counting sort is stable
– Numbers with the same value appear in the same order
in the output array
– Important when satellite data is carried around with the
sorted keys
CS 477/677 - Lecture 8
7
Radix Sort
• Considers keys as numbers in a base-R
number
– A d-digit number will occupy a field of d
columns
• Sorting looks at one column at a time
– For a d digit number, sort the least significant
digit first
– Continue sorting on the next least significant
digit, until all digits have been sorted
– Requires only d passes through the list
CS 477/677 - Lecture 8
8
RADIX-SORT
Alg.: RADIX-SORT(A, d)
for i ← 1 to d
do use a stable sort to sort array A on digit i
• 1 is the lowest order digit, d is the highest-order digit
CS 477/677 - Lecture 8
9
Analysis of Radix Sort
• Given n numbers of d digits each, where each
digit may take up to k possible values, RADIXSORT correctly sorts the numbers in (d(n+k))
– One pass of sorting per digit takes (n+k) assuming
that we use counting sort
– There are d passes (for each digit)
CS 477/677 - Lecture 8
10
Correctness of Radix sort
• We use induction on number of passes through each
digit
• Basis: If d = 1, there’s only one digit, trivial
• Inductive step: assume digits 1, 2, . . . , d-1 are sorted
– Now sort on the d-th digit
– If ad < bd, sort will put a before b: correct
a < b regardless of the low-order digits
– If ad > bd, sort will put a after b: correct
a > b regardless of the low-order digits
– If ad = bd, sort will leave a and b in the
same order and a and b are already sorted
on the low-order d-1 digits
CS 477/677 - Lecture 8
11
Bucket Sort
• Assumption:
– the input is generated by a random process that distributes
elements uniformly over [0, 1)
• Idea:
–
–
–
–
Divide [0, 1) into n equal-sized buckets
Distribute the n input values into the buckets
Sort each bucket
Go through the buckets in order, listing elements in each one
• Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
• Output: elements ai sorted
• Auxiliary array: B[0 . . n - 1] of linked lists, each list
initially empty
CS 477/677 - Lecture 8
12
BUCKET-SORT
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
do insert A[i] into list B[nA[i]]
for i ← 0 to n - 1
do sort list B[i] with insertion sort
concatenate lists B[0], B[1], . . . , B[n -1]
together in order
return the concatenated lists
CS 477/677 - Lecture 8
13
Example - Bucket Sort
1
.78
0
2
.17
1
.17
.12
3
.39
2
.26
.21
4
.26
3
.39
/
5
.72
4
/
6
.94
5
/
7
.21
6
.68
/
8
.12
7
.78
9
.23
8
10
.68
9
/
.72
/
.23
/
/
/
.94
/
CS 477/677 - Lecture 8
14
Example - Bucket Sort
.12
0
.17
.23
.21
.12
.17
2
.21
.23
3
.39
/
6
.68
/
7
.72
4
/
5
/
9
.39
.68
.72
.78
.94
/
1
8
.26
.78
/
.94
/
/
.26
/
/
Concatenate the lists from
0 to n – 1 together, in order
CS 477/677 - Lecture 8
15
/
Correctness of Bucket Sort
• Consider two elements A[i], A[ j]
• Assume without loss of generality that A[i] ≤ A[j]
• Then nA[i] ≤ nA[j]
– A[i] belongs to the same group as A[j] or to a group
with a lower index than that of A[j]
• If A[i], A[j] belong to the same bucket:
– insertion sort puts them in the proper order
• If A[i], A[j] are put in different buckets:
– concatenation of the lists puts them in the proper order
CS 477/677 - Lecture 8
16
Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
O(n)
do insert A[i] into list B[nA[i]]
for i ← 0 to n - 1
do sort list B[i] with insertion sort
(n)
concatenate lists B[0], B[1], . . . , B[n -1]
O(n)
together in order
return the concatenated lists
CS 477/677 - Lecture 8
(n)
17
Conclusion
• Any comparison sort will take at least nlgn to sort an
array of n numbers
• We can achieve a better running time for sorting if we
can make certain assumptions on the input data:
– Counting sort: each of the n input elements is an integer in the
range 0 to k
– Radix sort: the elements in the input are integers represented
with d digits
– Bucket sort: the numbers in the input are uniformly distributed
over the interval [0, 1)
CS 477/677 - Lecture 8
18
A Job Scheduling Application
• Job scheduling
– The key is the priority of the jobs in the queue
– The job with the highest priority needs to be executed
next
• Operations
– Insert, remove maximum
• Data structures
– Priority queues
– Ordered array/list, unordered array/list
CS 477/677 - Lecture 8
19
Example
CS 477/677 - Lecture 8
20
PQ Implementations & Cost
Worst-case asymptotic costs for a PQ with N items
Insert
Remove max
ordered array
N
1
ordered list
N
1
unordered array
1
N
unordered list
1
N
Can we implement both operations efficiently?
CS 477/677 - Lecture 8
21
Background on Trees
• Def: Binary tree = structure composed of a finite
set of nodes that either:
– Contains no nodes, or
– Is composed of three disjoint sets of nodes: a root
node, a left subtree and a right subtree
root
4
Left subtree
1
2
14
3
16 9
Right subtree
10
8
CS 477/677 - Lecture 8
22
Special Types of Trees
• Def: Full binary tree = a
binary tree in which each
node is either a leaf or has
degree exactly 2.
• Def: Complete binary tree = a
binary tree in which all leaves
have the same depth and all
internal nodes have degree 2.
4
1
3
2
14
16 9
8
10
7
12
Full binary tree
4
1
2
3
16 9
10
Complete binary tree
CS 477/677 - Lecture 8
23
The Heap Data Structure
• Def: A heap is a nearly complete binary tree with
the following two properties:
– Structural property: all levels are full, except
possibly the last one, which is filled from left to right
– Order (heap) property: for any node x
Parent(x) ≥ x
8
7
5
4
2
It doesn’t matter that 4 in
level 1 is smaller than 5 in
level 2
Heap
CS 477/677 - Lecture 8
24
Definitions
• Height of a node = the number of edges on a longest
simple path from the node down to a leaf
• Depth of a node = the length of a path from the root to
the node
• Height of tree = height of root node
= lgn, for a heap of n elements
Height of root = 3
4
1
Height of (2)= 1
2
14
3
16 9
10
Depth of (10)= 2
8
CS 477/677 - Lecture 8
25
Array Representation of Heaps
• A heap can be stored as an
array A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[ i/2 ]
– Heapsize[A] ≤ length[A]
• The elements in the subarray
A[(n/2+1) .. n] are leaves
• The root is the maximum
element of the heap
A heap is a binary tree that is filled in order
CS 477/677 - Lecture 8
26
Heap Types
• Max-heaps (largest element at root), have the
max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root), have the
min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]
CS 477/677 - Lecture 8
27
Operations on Heaps
• Maintain the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
• Priority queue operations
CS 477/677 - Lecture 8
28
Operations on Priority Queues
• Max-priority queues support the following
operations:
– INSERT(S, x): inserts element x into set S
– EXTRACT-MAX(S): removes and returns element of
S with largest key
– MAXIMUM(S): returns element of S with largest key
– INCREASE-KEY(S, x, k): increases value of element
x’s key to k (Assume k ≥ x’s current key value)
CS 477/677 - Lecture 8
29
Maintaining the Heap Property
•
Suppose a node is smaller than a
child
– Left and Right subtrees of i are max-heaps
•
Invariant:
– the heap condition is violated only at that
node
•
To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children
CS 477/677 - Lecture 8
30
Maintaining the Heap Property
Alg: MAX-HEAPIFY(A, i, n)
1. l ← LEFT(i)
– Left and Right
subtrees of i are 2. r ← RIGHT(i)
max-heaps
3. if l ≤ n and A[l] > A[i]
– A[i] may be
4. then largest ←l
smaller than its
5. else largest ←i
children
6. if r ≤ n and A[r] > A[largest]
7. then largest ←r
8. if largest  i
9. then exchange A[i] ↔ A[largest]
10.
MAX-HEAPIFY(A, largest, n)
• Assumptions:
CS 477/677 - Lecture 8
31
Example
MAX-HEAPIFY(A, 2, 10)
A[2]  A[4]
A[2] violates the heap property
A[4] violates the heap property
A[4]  A[9]
Heap property restored
CS 477/677 - Lecture 8
32
MAX-HEAPIFY Running Time
• Intuitively:
– A heap is an almost complete binary tree  must
process O(lgn) levels, with constant work at each
level
• Running time of MAX-HEAPIFY is O(lgn)
• Can be written in terms of the height of the heap,
as being O(h)
– Since the height of the heap is lgn
CS 477/677 - Lecture 8
33
Building a Heap
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(n/2+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and n/2
Alg: BUILD-MAX-HEAP(A)
1.
n = length[A]
2.
for i ← n/2 downto 1
3.
1
4
2
3
1
do MAX-HEAPIFY(A, i, n)
8
2
14
A:
4
3
4
1
CS 477/677 - Lecture 8
3
2
5
6
7
16 9
9
10
8
7
16
9
10 14
10
8
34
7
Readings
• Chapter 8, 6
CS 477/677 - Lecture 8
35