Lecture 19 - UMass CS !EdLab


CMPSCI 187
Introduction to Programming
with Data Structures
Computer Science 187
Lecture 18
Heaps (and Heapsort) & Priority Queues
Announcements
1. Programming project 5 is up.
2. Final exam is on December 20 at 10:30
(ELAB303).
A Heap
Heaps


A heap orders its nodes, but in a way different from a binary search tree.
A heap is a binary tree T that stores a collection of keys (or key-element pairs) at its internal nodes and that satisfies two additional properties:
• Order Property: key(parent) ≤ key(child) (MinHeap)
• Structural Property: all levels are full, except the last one, which is left-filled (complete binary tree)
• We know there is an efficient implementation for a complete binary tree.
!Heaps
Bottom level not filled
key(parent)>key(child)
Note: This use of the word “heap” is entirely different from the heap that
is the allocation area in Java
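Height of a Heap - 1
• A heap T storing n keys has height $h = \lceil \log(n + 1) \rceil$, which is O(log n)
[Figure: heap with depths 0, 1, ..., h-2, h-1; depth i holds 2^i nodes for i ≤ h-2, and depth h-1 holds at least one node.]
$$n \ge 1 + 2 + 4 + \dots + 2^{h-2} + 1 = (2^{h-1} - 1) + 1 = 2^{h-1}$$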
Height of a Heap - 2
$$n \le 1 + 2 + 4 + \dots + 2^{h-1} = 2^{h} - 1$$
[Figure: heap with every depth 0, 1, ..., h-1 full; depth i holds 2^i nodes.]
• Therefore $2^{h-1} \le n \le 2^{h} - 1$
• Taking logs, we get $\log(n + 1) \le h \le \log n + 1$
• Which implies $h = \lceil \log(n + 1) \rceil$
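The bounds above can be checked numerically. The following is a minimal sketch, assuming h counts the levels of the heap and logs are base 2; the class and method names are illustrative and not from the lecture.

```java
class HeapHeight {
    // Smallest h with 2^h - 1 >= n, i.e. h = ceil(log2(n + 1)),
    // computed with integer arithmetic only.
    static int height(int n) {
        int h = 0;
        long capacity = 0; // 2^h - 1: the most keys that fit in h full levels
        while (capacity < n) {
            capacity = 2 * capacity + 1;
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 4, 7, 8, 12}) {
            int h = height(n);
            long lower = 1L << (h - 1);  // 2^(h-1)
            long upper = (1L << h) - 1;  // 2^h - 1
            System.out.println("n=" + n + "  h=" + h
                    + "  bounds: " + lower + " <= n <= " + upper);
        }
    }
}
```

For every n printed, the computed h satisfies both inequalities from the derivation.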

A Representation for Complete
Binary Trees
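Since the tree is complete, it maps nicely onto an array representation.
[Figure: a complete binary tree with nodes A through L numbered 0 through 11 in level order, stored in the array T below; "last" marks index 11.]
T:  index:  0  1  2  3  4  5  6  7  8  9  10  11  12
    value:  A  B  C  D  E  F  G  H  I  J  K   L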
Properties of the Array Representation

Data from the root node is always in T[0].

Suppose some node appears in T[i]
• data for its parent is always at location T[(i-1)/2] (using integer division)
• data for its children nodes appears in locations
  T[2*i+1] for the left child
  T[2*i+2] for the right child
• formulas provide an implicit representation of the edges
• Values outside of array ‘bounds’ [0->last] imply node does not exist
• can use these formulas to implement efficient algorithms for traversing the tree and moving around in various ways.
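The index formulas above can be sketched as small helper methods. This is an illustrative sketch only: the class name, the helper names, and the isMinHeap check are assumptions, not part of the lecture's code. It uses the A through L tree from the array representation slide.

```java
class HeapIndex {
    // For i = 0 Java's integer division gives (0 - 1) / 2 == 0,
    // so callers must treat the root as having no parent.
    static int parent(int i) { return (i - 1) / 2; }
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }

    // A node exists only if its index lies in [0, last].
    static boolean exists(int i, int last) { return i >= 0 && i <= last; }

    // Checks the min-heap order property key(parent) <= key(child) on T[0..last].
    static boolean isMinHeap(int[] t, int last) {
        for (int i = 1; i <= last; i++) {
            if (t[parent(i)] > t[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        char[] t = "ABCDEFGHIJKL".toCharArray();
        int last = t.length - 1;
        int i = 4; // node E
        System.out.println("parent of " + t[i] + ": " + t[parent(i)]);
        System.out.println("children of " + t[i] + ": " + t[left(i)] + ", " + t[right(i)]);
        System.out.println("F has a right child? " + exists(right(5), last));
    }
}
```

Here E at index 4 has parent B at index 1 and children J and K at indices 9 and 10, while the right child of F would fall at index 12, outside [0, last].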
Heap operations

A heap is a data structure which supports efficient
implementation of three operations:
insert - inserts a key and associated element
into the heap
findMin - return the element having the
smallest key
deleteMin - delete the element having the
smallest key



Such a heap is a min-heap, to be more precise.
Can also define a max-heap in which value at root
node is greater than value in the child nodes.
We will stick with min-heaps [book does this]
The minHeap Interface
public interface MinHeapInterface
{
    // Inserts 'key' and associated 'data' object into heap. Throws an
    // InvalidElementException if 'key' is not a valid key; throws a HeapException
    // if there is no room for this key and data.
    public void insert(Comparable key, Object data)
            throws InvalidElementException, HeapException;

    // Removes the minimum element from the heap; throws an
    // InvalidElementException if the reheap attempt fails.
    public Node removeMin() throws InvalidElementException;

    // Returns the key and data element (in node form) of the minimum element in
    // the heap. Does not alter the heap. Throws a HeapException if the heap is empty.
    public Node getMin() throws HeapException;

    // Returns true if the heap is empty, false otherwise.
    public boolean isEmpty();

    // Returns the number of elements in the heap.
    public int getSize();

    // Removes all elements from the heap. On completion, isEmpty() is true.
    public void clear();
} // end MinHeapInterface
Inserting Into a Heap

The key to insert is 6.
[Figure: a min-heap holding the keys 3, 4, 7, 8, 10, 13, 19, 20, 21, 22, 25, 28, with 3 at the root.]
Find a Spot for the Key

Add the key in the next available position in the heap.
[Figure: the same heap with 6 added in the next free position on the bottom level, as a child of 20.]
• Tree does not satisfy the heap constraints
• Move key in the tree until the constraints are satisfied
• Move key "UP" the tree; this is called "percolateUp"
percolateUp

Swap parent-child keys if they are out of order
• For a min heap, out of order if key(parent) > key(child)
[Figure: 6 is compared with its parent 20; since 20 > 6, the two keys are swapped.]
percolateUp continues
[Figure: 6 is compared with its new parent 7; since 7 > 6, the two keys are swapped.]
percolateUp Termination
[Figure: the final heap; 6 is now a child of the root 3, with 7 and 20 below it.]
percolateUp terminates when the new key is greater than or equal to the key of its parent, or the top of the heap is reached.
(total #swaps) ≤ (h - 1), which is O(log n)
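The percolateUp steps can be sketched iteratively on a plain int array. This is an illustrative sketch, not the lecture's implementation: it uses int keys instead of Node objects, and it counts swaps so the (h - 1) bound can be observed. The keys inserted are the ones the lecture's HeapTest uses.

```java
import java.util.Arrays;

class PercolateUpDemo {
    // Moves keys[idx] up until its parent is no larger; returns the number of swaps.
    static int percolateUp(int[] keys, int idx) {
        int swaps = 0;
        while (idx > 0) {                              // stop at the top of the heap
            int parentIdx = (idx - 1) / 2;
            if (keys[parentIdx] <= keys[idx]) break;   // order property holds
            int tmp = keys[parentIdx];
            keys[parentIdx] = keys[idx];
            keys[idx] = tmp;
            idx = parentIdx;
            swaps++;
        }
        return swaps;
    }

    public static void main(String[] args) {
        int[] keys = new int[6];
        int last = -1;
        for (int key : new int[] {33, 10, 3, -2, 12, 11}) {
            keys[++last] = key;                        // next available position
            int swaps = percolateUp(keys, last);
            System.out.println("inserted " + key + " with " + swaps + " swap(s)");
        }
        System.out.println(Arrays.toString(keys));     // [-2, 3, 10, 33, 12, 11]
    }
}
```

The final array matches the heap contents printed by the lecture's HeapTest after its six inserts.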
Removal from a Heap: removeMin
[Figure: the heap after inserting 6, with 3 at the root and 20 as the last key.]
The removal of the top key leaves a hole
• We need to fix the heap
• First, replace the hole with the last key in the heap
• Then, begin percolateDown
percolateDown
[Figure: 20 has replaced the root; it is compared with its smaller child 4, and the two keys are swapped.]
percolateDown compares the parent with the smallest child.
If the child is smaller, it switches the two.
percolateDown continues
[Figure: 20 is compared with its smaller child 10, and the two keys are swapped.]
percolateDown continues
[Figure: 20 is compared with its smaller child 13, and the two keys are swapped.]
percolateDown Termination
[Figure: the final heap, with 4 at the root and 20 on the bottom level.]
percolateDown terminates when the key is less than or equal to the keys of both its children, or the bottom of the heap is reached.
(total #swaps) ≤ (h - 1), which is O(log n)
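The removeMin and percolateDown steps can be sketched iteratively on a plain int array. This is an illustrative sketch under simplifying assumptions: int keys instead of Node objects, and a non-empty heap passed in by the caller. The starting array is the heap the lecture's HeapTest builds.

```java
import java.util.Arrays;

class PercolateDownDemo {
    // Sinks keys[idx] within keys[0..last] until both children are no smaller.
    static void percolateDown(int[] keys, int idx, int last) {
        while (true) {
            int child = 2 * idx + 1;                  // left child
            if (child > last) return;                 // no children: bottom reached
            if (child + 1 <= last && keys[child + 1] < keys[child]) {
                child = child + 1;                    // right child is the smaller one
            }
            if (keys[idx] <= keys[child]) return;     // order property holds
            int tmp = keys[idx];
            keys[idx] = keys[child];
            keys[child] = tmp;
            idx = child;
        }
    }

    // Removes the root of the non-empty heap keys[0..last]; returns the new 'last'.
    static int removeMin(int[] keys, int last) {
        keys[0] = keys[last];                         // fill the hole with the last key
        last--;
        percolateDown(keys, 0, last);
        return last;
    }

    public static void main(String[] args) {
        int[] keys = {-2, 3, 10, 33, 12, 11};
        int last = keys.length - 1;
        int min = keys[0];
        last = removeMin(keys, last);
        System.out.println("removed " + min);
        System.out.println(Arrays.toString(Arrays.copyOfRange(keys, 0, last + 1)));
        // [3, 11, 10, 33, 12]
    }
}
```

The result matches the "Heap after removal of minimum element" line in the lecture's HeapTest output.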
Heap Implementation

Based on array representation
• Nodes in the array contain a key and associated data element
• Example will use integer keys, but will be written to be general
• Heap methods to implement: insert, findMin, deleteMin
• plus various support methods, including
  - percolateUp
  - percolateDown
Uses Comparable to compare keys
Heap Nodes
public class Node
{
    protected Object element;
    protected Comparable key;

    public Node() // constructor
    {}

    public Node(Comparable myKey, Object myElement)
    {
        key = myKey;
        element = myElement;
    }

    public Object element()
    { return element; }

    public void setElement(Object myElement)
    { element = myElement; }

    public void setKey(Object myKey)
    { key = (Comparable) myKey; }

    public Object getKey()
    { return key; }
}
[Figure: a Node instance with two fields, key and element.]
The Heap Class
import java.util.Comparator; // needed for the 'comp' field below

public class Heap
{
    private Node nodes[];      // storage for heap data (use a vector??)
    private int last;          // index of last element in heap
    Comparator comp;           // object used to compare keys
    int heapCapacity = 100;    // the default size of the heap

    public Heap() // class constructor
    {
        nodes = new Node[heapCapacity];
        last = -1; // initially, heap is empty
    }
insert Method
public void insert(Comparable key, Object data) throws InvalidElementException,
        HeapException
{
    // check capacity before changing 'last', so a full heap is left untouched
    if (last + 1 >= heapCapacity) throw new HeapException("Heap capacity exhausted");
    last = last + 1;                    // make space for new node
    nodes[last] = new Node(key, data);  // put node at 'last' element of array
    percolateUp(last);                  // re-establish heap constraints by percolating to new location
}
Note that we could resize the heap here by calling a resize method rather than throwing an exception.
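One way such a resize could look is sketched below. Everything here is hypothetical: the lecture does not show a resize method, so the class GrowableStore, the helper ensureCapacity, and the doubling policy are assumptions chosen for illustration. The sketch covers only the storage growth and leaves out the percolateUp call.

```java
import java.util.Arrays;

class GrowableStore {
    private Object[] nodes = new Object[2]; // tiny on purpose, so growth is exercised
    private int last = -1;

    // Hypothetical helper: doubles the array when the next slot would not fit.
    private void ensureCapacity() {
        if (last + 1 >= nodes.length) {
            nodes = Arrays.copyOf(nodes, 2 * nodes.length);
        }
    }

    void append(Object node) {
        ensureCapacity();
        nodes[++last] = node;   // a heap would call percolateUp(last) at this point
    }

    int size()     { return last + 1; }
    int capacity() { return nodes.length; }

    public static void main(String[] args) {
        GrowableStore store = new GrowableStore();
        for (int i = 0; i < 5; i++) {
            store.append(Integer.valueOf(i));
        }
        System.out.println("size=" + store.size() + " capacity=" + store.capacity());
    }
}
```

Doubling keeps the amortized cost of growth constant per insert, so insert stays O(log n) on average.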
percolateUp Method
public void percolateUp(int idx) throws InvalidElementException
{
    // pre: heap does not satisfy heap constraints - the new element is
    //      out of position; idx references the new element
    // post: heap satisfies heap constraints
    if (idx == 0) return; // node is already in place, nothing to do
    int parentIdx = (idx - 1) / 2; // index of parent of idx
    // if the key at the parent is greater than the key at the child, swap them
    // and call percolateUp on the parent; otherwise key is at correct location
    if (((Comparable) nodes[parentIdx].getKey()).compareTo(
            (Comparable) nodes[idx].getKey()) > 0)
    {
        swap(idx, parentIdx);   // swap parent and child
        percolateUp(parentIdx); // now percolateUp the parent (notice the recursion)
    }
} // end percolateUp
Deleting the Min Node (root)
public Node removeMin() throws InvalidElementException
{
    if (last == -1) return null; // the heap is empty
    // the heap is now known not to be empty
    Node min = nodes[0]; // remove element at top
    // was this the last element in the heap?
    if (last == 0) // the last node was just removed
    {
        last = -1;
        return min;
    }
    // copy last element into root, and fix the heap by letting
    // the root element settle into final position.
    nodes[0] = nodes[last]; // move last element to top
    last--;                 // reduce heap size
    percolateDown(0);       // let element settle
    return min;
}
percolateDown method
public void percolateDown(int idx) throws InvalidElementException
{
    int childIdx = idx * 2 + 1;  // index of the left child of idx
    if (childIdx > last) return; // idx does not have any children
    // get the index of the smallest child of idx
    if ((childIdx + 1 <= last) && (((Comparable) nodes[childIdx + 1].getKey())
            .compareTo((Comparable) nodes[childIdx].getKey())) < 0)
    { childIdx = childIdx + 1; }
    // see if the nodes are out of order
    if (((Comparable) nodes[idx].getKey()).compareTo(
            (Comparable) nodes[childIdx].getKey()) > 0)
    {   // if they are, swap them and continue to percolate downward
        swap(idx, childIdx);
        percolateDown(childIdx); // notice the recursion
    }
} // end percolateDown
See the text for a non-recursive definition of removeMin.
buildHeap Method
public void buildHeap(Comparable[] keys, Object[] data)
        throws InvalidElementException, HeapException
// converts an array into a heap by repeatedly calling insert
// for the elements of the array
{
    if (keys.length != data.length) throw new
            HeapException("Can't build heap.");
    for (int i = 0; i < data.length; i++)
    { insert(keys[i], data[i]); }
}
Remaining Methods
public boolean isEmpty()
{ return last == -1; }

public int getSize()
{ return last + 1; }

public void clear()
{
    for (int i = 0; i <= last; i++)
        nodes[i] = null;
    last = -1;
}
HeapTest
public class HeapTest
{
    public static void main(String[] args) throws
            InvalidElementException, HeapException
    {
        Heap h = new Heap();
        System.out.println("Inserting first (0) element.");
        h.insert(Integer.valueOf(33), null);
        System.out.println("Inserting second (1) element.");
        h.insert(Integer.valueOf(10), null);
        System.out.println("Inserting third (2) element.");
        h.insert(Integer.valueOf(3), null);
        System.out.println("Inserting fourth (3) element.");
        h.insert(Integer.valueOf(-2), null);
        System.out.println("Inserting fifth (4) element.");
        h.insert(Integer.valueOf(12), null);
        System.out.println("Inserting sixth (5) element.");
        h.insert(Integer.valueOf(11), null);
        h.print();
Output
[Allen-Hansons-Computer:~] al% /tmp/CodeWarriorJava.command; exit
cd /Users/al/187/MinHeap/MinHeap
java -cp .:/Users/al/187/MinHeap/MinHeap/JavaClasses.jar HeapTest
Inserting first (0) element.
Inserting second (1) element.
Inserting third (2) element.
Inserting fourth (3) element.
Inserting fifth (4) element.
Inserting sixth (5) element.
-2 3 10 33 12 11
HeapTest
        h.removeMin();
        System.out.println("Heap after removal of minimum element");
        h.print();
        System.out.println("Inserting new element.");
        h.insert(Integer.valueOf(102), null);
        System.out.println("Inserting new element.");
        h.insert(Integer.valueOf(2), null);
        h.print();
    }
} //end HeapTest
Output
[Allen-Hansons-Computer:~] al% /tmp/CodeWarriorJava.command; exit
cd /Users/al/187_F04/MinHeap_F04/MinHeap
java -cp .:/Users/al/187_F04/MinHeap_F04/MinHeap/JavaClasses.jar HeapTest
Inserting first (0) element.
Inserting second (1) element.
Inserting third (2) element.
Inserting fourth (3) element.
Inserting fifth (4) element.
Inserting sixth (5) element.
-2 3 10 33 12 11
Heap after removal of minimum element
3 11 10 33 12
Inserting new element.
Inserting new element.
2 11 3 33 12 102 10
logout
[Process completed]
Complexity of Heap Operations


findMin - O(1)
insert, deleteMin - complexity proportional to the height of the tree
• Height is O(log n)
• Complexity of insert and deleteMin is O(log n)
Recall the Priority Queue ADT



size
isEmpty
offer
element
peek
poll


A priority queue de-queues items in priority order
• Not in order of entry into the queue (not FIFO)
Heap is an efficient implementation of priority queue
• Operations cost at most O(log n)
A priority queue P supports the following methods:
• size(): Return the number of elements in P
• isEmpty(): Test whether P is empty
• *insertItem(k,e): Insert a new element e with key k into P
• minElement(): Return (but don't remove) an element of P with smallest key; an error occurs if P is empty.
• minKey(): Return the smallest key in P; an error occurs if P is empty.
• *removeMin(): Remove from P and return an element with the smallest key; an error occurs if P is empty.
Implement with a heap
Complexity: the starred (*) methods are O(log n); the others are all O(1)
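As an illustration of the ADT in use, the sketch below relies on java.util.PriorityQueue, the standard library's heap-based priority queue, rather than the lecture's Heap class. Its offer, peek, and poll play the roles of insertItem, minElement, and removeMin. The Item class and the drain helper are hypothetical names introduced for this example.

```java
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Hypothetical key-element pair; items are ordered by key only.
    static class Item implements Comparable<Item> {
        final int key;
        final String element;
        Item(int key, String element) { this.key = key; this.element = element; }
        @Override public int compareTo(Item other) { return Integer.compare(key, other.key); }
    }

    // Removes every item in priority order and returns the elements, space-separated.
    static String drain(PriorityQueue<Item> pq) {
        StringBuilder sb = new StringBuilder();
        while (!pq.isEmpty()) {
            sb.append(pq.poll().element);        // removeMin
            if (!pq.isEmpty()) sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PriorityQueue<Item> pq = new PriorityQueue<>();
        pq.offer(new Item(33, "low"));           // insertItem(k, e)
        pq.offer(new Item(3, "high"));
        pq.offer(new Item(10, "medium"));
        System.out.println("size = " + pq.size());
        System.out.println("min  = " + pq.peek().element);  // minElement, no removal
        System.out.println("order: " + drain(pq));
    }
}
```

Items leave the queue in key order (high, medium, low), not in the order they were offered.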
Heapsort

• Build a heap from the keys and data
Obvious method:
public void buildHeap(Comparable[] keys, Object[] data) throws
        InvalidElementException, HeapException
{
    if (keys.length != data.length) throw new HeapException("Can't build heap.");
    for (int i = 0; i < data.length; i++)
    { insert(keys[i], data[i]); }
}
Complexity? Each of the n inserts costs O(log n), so building the heap this way is O(n log n).
• Do n deleteMin operations.
Complexity O(n log n)
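The two phases above (n inserts, then n deleteMin operations) can be sketched as follows. This is a minimal sketch that assumes java.util.PriorityQueue as a stand-in for the lecture's Heap class and sorts int keys only; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class HeapSortSketch {
    // Returns a sorted copy of 'keys' using a min-heap.
    static int[] heapSort(int[] keys) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int key : keys) {
            heap.offer(key);                 // phase 1: n inserts, O(log n) each
        }
        int[] sorted = new int[keys.length];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = heap.poll();         // phase 2: n deleteMin, O(log n) each
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] keys = {33, 10, 3, -2, 12, 11};
        System.out.println(Arrays.toString(heapSort(keys)));
        // [-2, 3, 10, 11, 12, 33]
    }
}
```

Because each deleteMin returns the smallest remaining key, the output array fills in ascending order.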
HeapSort Performance



Heapsort is O(n log₂ n).
Heapsort is generally more efficient than any of the sorts that we will look at.
Direct comparison in terms of algorithm loops:

Number of Loops
n       Straight Insertion /   Shell         Heap
        Straight Selection     (not done)
25      625                    55            116
100     10,000                 316           664
500     250,000                2,364         4,482
1000    1,000,000              5,623         9,965
2000    4,000,000              13,374        10,965
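The table entries appear to follow simple formulas, and the sketch below recomputes them under that assumption. The formulas are inferred from the numbers, not stated in the lecture: n² for the straight sorts, n^1.25 for Shell, and n log₂ n for Heap, each truncated to an integer. They reproduce all table entries except the Heap value for n = 2000, where n log₂ n gives 21,931.

```java
class LoopCounts {
    // Assumed formulas, inferred from the table; results are truncated, not rounded.
    static long quadratic(int n) { return (long) n * n; }
    static long shell(int n)     { return (long) Math.pow(n, 1.25); }
    static long heap(int n)      { return (long) (n * Math.log(n) / Math.log(2)); }

    public static void main(String[] args) {
        System.out.println("n\tn^2\tn^1.25\tn log2 n");
        for (int n : new int[] {25, 100, 500, 1000, 2000}) {
            System.out.println(n + "\t" + quadratic(n) + "\t" + shell(n) + "\t" + heap(n));
        }
    }
}
```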