pq - Green Cedars

Download Report

Transcript pq - Green Cedars

Priority Queues
• Certain applications require that the largest record be
selected next from a group of records.
• Queues: insert new, delete oldest
• Stacks: insert new, delete newest
• Priority queues: insert new, delete largest
• Priority queues are a generalization of stacks and queues
using the appropriate priority assignment.
Some Applications
•
•
•
•
•
Simulations - cf. Keys are "time events".
Operating system job (process) scheduling
Numerical computations - cf. Pick largest error first
File compression
Graph searching1
1 More on this later in the semester.
Data Structure Discussion
• Need to build a d/s containing numerical keys (i.e.
priorities) and support the following operations:
–
–
–
–
Construct a priority queue from N given items
Insert a new item
Remove the largest item
Replace the largest item with a new item (unless the
new item is larger)
– Change the priority of an item
– Delete an arbitrary specified item
– Join two priority queues into one large one.
Unordered List implementation
insert(int v)
{
a[N++] = v;
}
int remove()
{
int j, max=0, v;
for (j=1; j<N; j++)
max = (a[j] > a[max]) ? j : max;
v = a[max];
a[max] = a[--N];
return v;
}
Use an unordered list:
insert without paying
attention to the key.
To implement 'change',
insert new value and
remove old value.
Ordered List implementation
• Use an ordered list - keep items ordered by key value.
• Remove: remove last (largest) item.
• Insert requires moving array elements to the right (linear time).
Priority Queues As Sorting Tools
• Priority queues can be turned into sorting algorithms
• Repeatedly use insert then repeatedly use remove to empty the
queue and get the results in order.
• Using a priority queue this way represented as an unordered list
=> selection sort
• Using a priority queue this way represented as an ordered list =>
insertion sort
Linked-list implementations
• Linked-list implementations are also possible. This can improve
deleting arbitrary items and joining operations into constant time.
• Other operations have same performance characteristics.
• Can deal with "unlimited" number of items.
Observations
• These simple implementations are useful and can outperform
more complicated methods.
• Unordered list implementation useful if we have only few
"remove largest" as opposed to a larger number of insertions.
• Ordered list implementation useful if the inserted items are close
to the largest elements in the priority queue.
Heap implementation
•Heap: store elements in an array satisfying the heap condition.
•Heap condition: the key in each node is >= keys in children (if any)
•Definition: a heap is a complete binary tree, represented as an
array, in which every node satisfies the heap condition. The largest
key is the 1st position of the array.
Recall that a complete binary tree is constructed by placing one node (the root)
and proceeding down the page and from left to right, connecting two nodes
beneath each node on the previous level until N nodes have been placed.
Heap implementation
1
X
3
2
4
8
A
O
T
5
G
9
10
E
R
6
S
11
7
M
N
12
A
I
k:
1
2
3
4
5
6
7
8
9
10
11
12
a[k]:
X
T
O
G
S
M
N
A
E
R
A
I
• Parent of node j is in slot floor(j/2)
• Children of node j are in slots 2j and 2j+1
Some Observations
k:
1
2
3
4
5
6
7
8
9
10
11
12
a[k]:
X
T
O
G
S
M
N
A
E
R
A
I
• Representation as an array does limit general utility as a data
structure, however is just enough flexibility to allow implementation
of efficient priority queues.
• All the algorithms operate along some path from the root to the
bottom of the heap (just moving from parent to child and from child
to parent). In a heap of size N, all paths have ~ lgN nodes. Each
generation has ~ half as many nodes as the next, thus there can be at
most lgN generations.
•All priority queue operations (except join) can be done in
logarithmic time using heaps.
Heap algorithms
• Make simple structural changes which could violate the heap
condition, then travel through the heap modifying it to fix this.
• Some algorithms travel through the heap top-to bottom, others
bottom-to-top.
• Assume that records are integer keys stored in an array 'a' of some
maximum size, with the current size of the heap kept in an integer
N.
• N is as much a part of the definition of a heap as the keys and
records themselves!
Insert algorithm
• This operation will increase N by one. The record to be inserted is
put into a[N], but this may violate the heap property.
• If there is a violation (i.e. a new node is greater than its parent),
then exchange the node with its parent (recursively).
1
X
First step of
inserting
element 'P'
3
2
4
8
A
O
T
5
S
R
11
G
9
10
E
6
7
M
12
A
I
P
N
Insert algorithm
1
Completed
operation after
inserting
element 'P'
X
3
2
4
8
A
P
T
5
G
9
10
E
R
6
S
11
7
O
12
A
I
M
N
Insert algorithms
insert adds a new item to a[N], then upheap fixes the heap condition.
#include <limits.h>
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
insert (int v) {
a[++N] = v; upheap(N);
}
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
k:
1
a[k]:
0
Insert 0
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
k=2, v=1
k:
1
2
a[k]:
0
1
k:
1
2
a[k]:
0
0
k:
1
2
a[k]:
1
0
Insert 1
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
k=3, v=2
k:
1
2
3
a[k]:
1
0
2
k:
1
2
3
a[k]:
1
0
1
k=1, v=2
k:
1
2
3
a[k]:
2
0
1
Insert 2
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
k=4, v=3
k:
1
2
3
4
a[k]:
2
0
1
3
k:
1
2
3
4
a[k]:
2
0
1
0
k=2, v=3
k:
1
2
3
4
a[k]:
2
2
1
0
k:
1
2
3
4
a[k]:
3
2
1
0
Insert 3
Some efficiency issues
#include <limits.h>
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k/2] <= v)
{ a[k] = a[k/2]; k /= 2; }
a[k] = v;
}
#include <limits.h>
upheap(int k) {
int v;
v = a[k]; a[0] = INT_MAX;
while (a[k>>1] <= v)
{ a[k] = a[k>>1]; k >>= 1; }
a[k] = v;
}
Replace algorithm
Replace X with C. This violates the heap condition, but is easily
fixed by exchanges the node with its child until the violation is
fixed.
1
C
3
2
4
8
A
P
T
5
S
R
11
G
9
10
E
6
7
O
12
A
I
M
N
Replace algorithm
Result after replacing X with C. Here, C ends up at a leaf.
1
T
3
2
4
8
A
P
S
5
G
9
10
E
C
6
R
11
7
O
12
A
I
M
N
Remove Largest algorithm
To remove the largest element, we shrink the queue by one
element (M, the last one), and then do a Replace to replace the
current largest element (T) with M.
1S
1T
2
4
8
A
G
9E
3
5
10
C
6O
R
11A
12
I
7
M
P
2R
P
S
3
N
4
8A
G
9E
5M
10
C
11A
6O
12
I
7N
Replace and Remove algorithms
downheap fixes the heap condition starting at the root and working
its way down.
downheap(int k) {
int i, j, v;
v = a[k];
while (k < N/2) {
j = 2 * k;
if (j<N) && ((a[j] < a[j+1]))
j++;
if v > a[j] break;
a[k] = a[j];
k = j;
}
a[k] = v;
}
// pick among siblings
// found replacing node
// else copy up...
// and move on
// final placement
Replace and Remove algorithms
int remove() {
int r;
r = a[1];
a[1] = a[N];
N--;
downheap(1);
//
//
//
//
save top value
copy last element to first pos.
decrement queue size
fix heap cond. if necessary.
}
int replace(int v) {
a[0] = v;
// save top value
downheap(0);
// fix heap cond. if necessary.
return a[0];
}
This code uses a[0] in an artificial way: its children are 0 (itself) and 1. So if v is
larger than the largest element in the heap, the heap is not touched. Otherwise, v is
put into the heap and a[1] returned.
Other operations
delete and change can be implemented using combinations of
what we’ve seen so far.
For example, if the priority of an element at position k is raised,
then upheap(k) is called, and if lowered, then downheap(k) is
called.
A Word on Performance
All of the basic operations insert, remove, replace, delete, and
change (this includes downheap and upheap) require less than
2lgN comparisons when performed on a heap of N elements.
All these operations involve moving along a path between the
root and the bottom of the heap, which includes no more than lgN
elements for a heap of size N. The factor of 2 comes from
downheap, which makes two comparisons in its inner loop. The
other operations require only lgN comparisons.
join requires a more sophisticated data structure to do efficiently,
however it is usually done much less frequently than the other
operations.
Heapsort
An elegant and efficient1 method of sorting using the basic
operations of priority queues. It uses no extra memory and is
guaranteed to sort M elements in about MlogM steps no matter
what the input.
1It
is twice as slow as quicksort on the average. We will look at
quicksort later in the semester.
Algorithm
• Build a heap containing the elements to be sorted.
• Remove them all in order.
In this discussion, M will be the number of elements to be sorted,
while N will continue to be the size of the heap.
A very simple implementation
N = 0;
for (k=1; k<=M; k++)
insert(a[k]);
for (k=M; k>=1; k--)
a[k] = remove();
Top-down heap construction1
Constructing a heap from
“ASORTINGEXAMPLE”
Figures from Algorithms,
R. Sedgewick (1988, 2nd
ed.)
Sorting from a heap
Bottom-up heap construction
K=5
K=6
K=1
Bottom-up heap construction is linear-time
K=5
This
is
because
most
of
the heaps processed are small. For
K=6
example, to build a heap of 127 elements, we call downheap()
on 64 heaps of size 1, 32 heaps of size 3, 16 heaps of size 7, 8
heaps of size 15, 4 heaps of size 31, 2 heaps of size 63, and 1
heap of size 127. Therefore: 64*0 + 32*1 + 16*2 + 8*3 + K=1
4*4 +
2*5 + 1*6 = 120 promotions (twice as many comparisons) are
required in the worst case.
For M=2m, an upper bound on the number of comparisons is:

1<=k<=M
(k-1)2m-k = 2m - m - 2 < M
Still a very simple implementation
heapsort()
{
int k, t;
N = M;
for (k=M/2; k>=1; k--)
downheap(k);
do {
t = a[1];
a[1] = a[N];
a[N] = t;
N--;
downheap(1);
} while (N>1);
}
// fill bottom-up
//
//
//
//
//
save largest
put the larges entry
at the end
… and shrink the queue
now fix heap
Heapsort data
movement
Build heap
Sort
Heapsorting a random permutation:
construction phase
Heapsorting a random permutation:
sorting phase
Some last remarks on performance
K=5
• All
of
the
basic
operations
insert, remove, replace, delete, and
K=6
change (this includes downheap and upheap) require less
than 2lgN comparisons when performed on a heap of N
elements.
K=1
• Bottom-up heap construction is linear-time
• Heapsort uses fewer than 2M lg M comparisons to sort M
elements.