Tutorial 1 C++ Programming

Tutorial 10: Heap & Priority Queue
Heap – What is it?
• Special complete binary tree data structure
– Complete BT: no empty node when checked top-down (level by level), left to right!
– We actually implement this using a compact array (or a vector for size flexibility)
• One to one mapping between complete binary tree and compact array!
• Heap manipulation is very efficient with an array/vector:
– Indices start from 0
– Parent(i): floor((i-1)/2)
– Left(i): 2i+1
– Right(i): 2i+2
– Special because it must satisfy the recursive heap property for each node x:
• The node x is larger than or equal to each of its left/right children (MaxHeap), or
• The node x is smaller than or equal to each of its left/right children (MinHeap)
• PS: Compare this with BST property!
– Discussed in q2.
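The index formulas above can be written down directly. This is a minimal sketch in C++; the helper names parentOf, leftOf and rightOf are made up for illustration and are not from the tutorial.

```cpp
#include <cassert>
#include <cstddef>

// 0-based index arithmetic for a complete binary tree stored in a compact array.
// Helper names are hypothetical.
inline std::size_t parentOf(std::size_t i) {
    assert(i > 0);            // the root (index 0) has no parent
    return (i - 1) / 2;       // integer division acts as floor here
}
inline std::size_t leftOf(std::size_t i)  { return 2 * i + 1; }
inline std::size_t rightOf(std::size_t i) { return 2 * i + 2; }
```

For example, the node at index 3 has children at indices 7 and 8, and both of them map back to parent index 3.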
Heap – Operations
• Note: some other lecture notes/text books use different terms!
• Two key operations: bubble up or bubble down
– Fix the heap property by bubbling a node upwards (exchange with parent) or
downwards (exchange with the larger child in a MaxHeap, or the smaller child in a MinHeap).
• Standard operations
– Insertion: HeapInsert() a.k.a: fixHeap(), BubbleUp()
• Insert to last, then bubble up → O(log n), see q1
– Deletion: HeapDelete() a.k.a: ExtractMax(), ExtractMin()
• Take the root, replace with last, and then bubble down → O(log n)
• For deleting non-root nodes, see q1
– No default searching operation:
• We usually only access root (max/min) in O(1)
• Accessing any other node will normally incur O(n) cost, but see q3
– Updating a node value requires bubble up or bubble down, see q3
• Special operation
– ArrayHeap: Heapify() a.k.a: HeapConstruction()  O(n), not O(n log n), see q1
Heap - Usage
• Heap can be used as an efficient Priority Queue
– Items are inserted as per normal: O(log n)
– But these items come out (are dequeued) in order of their priority value,
as we perform ExtractMax() or ExtractMin() on a max (or min) heap: O(log n)
– Faster than using other data structures, e.g. a sorted array, where each insertion is O(n)
• Heap can also be used for sorting (discussed in q3):
– HeapSort() – O(n log n)
– Or even for “partial sort” – O(k log n), e.g. Google default top 10 search results
• The term “Barack Obama” on 6 Nov 08 produces 83 million hits
• Sorting in O(80m log 80m) and displaying the top 10 is “slow”
• Creating a heap in O(80m) + taking the top 10 in O(10 log 80m) is much faster
• The heap can be built beforehand and stored on the Google server, so it is just O(10 log 80m)
• However, we can do better by mapping the search query to an “answer page” in O(1)…
– Only possible if we have large storage space.
– Other sorting algorithms can do a partial sort, but less naturally
• Quick sort can skip the right part when the pivot's position is beyond k
• The partitioning step ensures that everything on the left side of the pivot belongs to the top k…
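The "create heap, then take the top k" idea can be sketched with the standard library heap algorithms std::make_heap and std::pop_heap. The function name topK is hypothetical, and the input is copied so that the caller's data stays untouched.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the k largest values in descending order: O(n) to build, O(k log n) to extract.
std::vector<int> topK(std::vector<int> data, std::size_t k) {
    k = std::min(k, data.size());
    std::make_heap(data.begin(), data.end());   // max heap over the whole range
    std::vector<int> result;
    result.reserve(k);
    auto end = data.end();
    for (std::size_t i = 0; i < k; ++i) {
        std::pop_heap(data.begin(), end);       // moves the current max to position end-1
        --end;
        result.push_back(*end);
    }
    return result;
}
```

Calling topK({3, 1, 4, 1, 5, 9, 2, 6, 4}, 3) yields 9, 6, 5. When only a queue interface is needed, std::priority_queue wraps the same heap operations.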
Student Presentation
• Gr3 (average: 2 times)
1. Rebecca Chen or Cao Hoangdang or Ding Ying Shiaun or Du Xin
2. Nur Liyana Bte Roslie or Jashan Deep Kaur
3. Huang Chuanxian or Leow Wei Jie or Tan Kar Ann
• Gr4 (average: 3 times)
1. Cynthia Tan or Tan Peck Luan
2. Liew Hui Sun or Hanyenkno Afi
3. Goh Khoon Hiang or Jasmine Choy
• Gr5 (average: 4 times)
1. Zheng Yang
2. Stephanie Teo
3. Tan Yan Hao
• Gr6 (average: 3 times)
1. Koh Jye Yiing or Wang Shuling
2. Siddhartha or Laura Chua
3. Rasheilla or Brenda Koh or Gary Kwong or Low Wei Chen
Overview of the questions:
1. Trace Heap Operations (1 or 2 students)
2. IsHeap(Array) (1 student)
3. FindKthBest (1, 2, or 3 students)
Q1 – Trace Heap Operations
• Insert 3, 1, 4, 1, 5, 9, 2, 6, 4 into an empty max heap, then delete 5: building takes O(n log n)
• Heapify array 3, 1, 4, 1, 5, 9, 2, 6, 4 into a max heap, then delete 6: building takes O(n)
• The resulting max heaps (before deleting 5 and 6, respectively) are
different but both are valid. However, Heapify() is faster as it is O(n).
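Deleting a non-root node, as Q1 asks for 5 and 6, can be sketched like this: overwrite the node with the last item, shrink the array, then bubble up or down from that position. The function name deleteAt and the helper names are assumptions, and the caller is assumed to already know the index of the value to delete.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

void siftUp(std::vector<int>& a, std::size_t i) {
    while (i > 0 && a[(i - 1) / 2] < a[i]) {
        std::swap(a[(i - 1) / 2], a[i]);
        i = (i - 1) / 2;
    }
}

void siftDown(std::vector<int>& a, std::size_t i) {
    const std::size_t n = a.size();
    while (true) {
        std::size_t largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[largest] < a[l]) largest = l;
        if (r < n && a[largest] < a[r]) largest = r;
        if (largest == i) return;
        std::swap(a[i], a[largest]);
        i = largest;
    }
}

// Remove the item at index i from a max heap in O(log n).
void deleteAt(std::vector<int>& a, std::size_t i) {
    assert(i < a.size());
    a[i] = a.back();
    a.pop_back();
    if (i >= a.size()) return;   // the deleted item was the last one
    siftUp(a, i);                // the replacement may be larger than its parent...
    siftDown(a, i);              // ...or smaller than a child (at most one call moves it)
}
```

With the strict comparisons used here, inserting 3, 1, 4, 1, 5, 9, 2, 6, 4 one by one gives the array 9 6 5 4 1 3 2 1 4, and deleting 5 (index 2) leaves 9 6 4 4 1 3 2 1. Heapifying the same input gives 9 6 4 4 5 3 2 1 1, and deleting 6 (index 1) leaves 9 5 4 4 1 3 2 1.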
Q2 – IsHeap(Array)?
class BinaryHeap {
    int currentSize;      // maintains the number of elements in the heap
    Comparable[] array;   // stores the heap elements
    BinaryHeap() {        // constructor
        currentSize = 0;
        array = new Comparable[ DEFAULT_CAPACITY ];
    }
    // with other standard heap methods ...
}
• Give code for checking if the heap is a valid max-heap:
(is a complete tree and maintains heap property)
Q2 – IsHeap(Array)? (Answer)
bool isHeap() {
    for (int i = 0; i < currentSize; i++) {
        if (array[i] == NULL)      // a gap in the compact array: not a complete binary tree
            return false;
        int child = 2 * i + 1;     // left child; now check the heap property
        if (child < currentSize) {
            if (array[child] == NULL)
                return false;      // guard before comparing
            if ((child+1) < currentSize) {
                if (array[child+1] == NULL)
                    return false;
                child = (array[child] > array[child+1]) ? child : (child+1);  // larger child
            }
            if (array[child] > array[i])
                return false;      // a child is larger than its parent
        }
    }
    return true;
}
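The same check can be written as a self-contained C++17 function. This sketch assumes the heap stores int values and models an empty slot with std::optional; both choices, and the name isMaxHeap, are illustrative and not part of the tutorial's BinaryHeap class. It compares each node with its parent instead of its children, which tests the same property.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// True if the first currentSize slots form a complete tree that satisfies the max-heap property.
bool isMaxHeap(const std::vector<std::optional<int>>& array, std::size_t currentSize) {
    if (currentSize > array.size()) return false;
    for (std::size_t i = 0; i < currentSize; ++i) {
        if (!array[i]) return false;   // empty slot: not a complete tree
        // The parent index is smaller than i, so it was already checked to be non-empty.
        if (i > 0 && *array[(i - 1) / 2] < *array[i]) return false;
    }
    return true;
}
```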
Q3 – FindKthBest (1)
• findKthBest(int k), which returns the athlete with
the kth least penalty points. What is the complexity?
– Take/delete top k-1 items from min heap, so, after k-1 steps, you get the kth one
as the root of the remaining min heap (the answer), this is (k-1) log n
– Then, restore the deleted k-1 items into the heap again so that the next queries
are correct, this is another (k-1) log n
– Overall: (2k-2) log n which is O(k log n)
– The complexity depends on k and n, ‘k’ cannot be dropped…
• update(Athlete, points), which adds the new penalty points
to the Athlete's record. What is the complexity?
– Use a hash table to map each Athlete to its index in the Heap data structure!
• Otherwise we need O(n) to scan for the correct index for this Athlete, now just O(1)
– When we modify the value (points) in the Heap, we need to bubble up/down to fix the
heap property, which takes at most O(log n) swaps… Update the corresponding indices in
the Hash Table too, which is O(1) per swap.
– Overall complexity is O(1) + O(log n) * O(1) = O(log n)
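One possible shape for the heap plus hash table combination is sketched below. The class name PenaltyHeap, the use of the athlete's name as the key, and the method signatures are all assumptions made for illustration. Every swap inside the heap also updates the stored index, which is what keeps update() at O(log n).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class PenaltyHeap {
    struct Entry { std::string athlete; int points; };
    std::vector<Entry> heap_;                           // min heap on points
    std::unordered_map<std::string, std::size_t> pos_;  // athlete -> index in heap_

    void swapAt(std::size_t i, std::size_t j) {
        std::swap(heap_[i], heap_[j]);
        pos_[heap_[i].athlete] = i;                     // keep the hash table in sync
        pos_[heap_[j].athlete] = j;
    }
    void bubbleUp(std::size_t i) {
        while (i > 0 && heap_[i].points < heap_[(i - 1) / 2].points) {
            swapAt(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    void bubbleDown(std::size_t i) {
        while (true) {
            std::size_t s = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap_.size() && heap_[l].points < heap_[s].points) s = l;
            if (r < heap_.size() && heap_[r].points < heap_[s].points) s = r;
            if (s == i) return;
            swapAt(i, s);
            i = s;
        }
    }
    Entry extractMin() {
        Entry top = heap_.front();
        pos_.erase(top.athlete);
        Entry last = heap_.back();
        heap_.pop_back();
        if (!heap_.empty()) {
            heap_[0] = last;
            pos_[last.athlete] = 0;
            bubbleDown(0);
        }
        return top;
    }

public:
    void insert(const std::string& athlete, int points) {
        assert(pos_.count(athlete) == 0);
        heap_.push_back({athlete, points});
        pos_[athlete] = heap_.size() - 1;
        bubbleUp(heap_.size() - 1);
    }
    // O(1) lookup, then O(log n) to restore the heap property.
    void update(const std::string& athlete, int extraPoints) {
        heap_[pos_.at(athlete)].points += extraPoints;
        bubbleUp(pos_.at(athlete));
        bubbleDown(pos_.at(athlete));
    }
    // k is 1-based: remove k-1 items, read the root, then restore them. O(k log n).
    std::string findKthBest(std::size_t k) {
        assert(k >= 1 && k <= heap_.size());
        std::vector<Entry> removed;
        for (std::size_t i = 0; i + 1 < k; ++i) removed.push_back(extractMin());
        std::string answer = heap_.front().athlete;
        for (const Entry& e : removed) insert(e.athlete, e.points);
        return answer;
    }
};
```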
Q3 – FindKthBest (2)
• Is there some other data structure which can be used
to give more optimized performance? (findKthBest + update)
– If the data is static, we can put the scores in array, sort the array
one time O(n log n) and return array[k-1] O(1) as the answer,
faster than O(k log n) per query in Q3a.
– However, the data will be frequently updated,
thus, this method will be slow, as re-sorting the array takes O(n log n)!
– Since data is going to be frequently updated and we want top k results, we can
use an augmented BST instead of a heap to store the Athlete objects!
– http://en.wikipedia.org/wiki/Augmenting_Data_Structures
– Store the number of nodes in the left and right sub-tree in each node!
Thus, we can know where to go (left or right or stay) to find index-k in O(log n)!
– A (balanced) BST would take O(n log n) time to create (one time task),
but the query time for findKthBest will be O(log n), faster than O(k log n)
– The time for update would remain unchanged, still O(log n),
delete old value O(log n) and re-insert new value to BST O(log n).
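The "go left, go right, or stay" search can be sketched on a BST whose nodes store their subtree size. This sketch simplifies in two ways that should be read as assumptions: it stores one total size per node instead of separate left and right counts, and it does no rebalancing, so the O(log n) bound only holds if the tree happens to be balanced (a real solution would use a self-balancing tree).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Node {
    int points;
    std::size_t size = 1;                 // number of nodes in the subtree rooted here
    std::unique_ptr<Node> left, right;
    explicit Node(int p) : points(p) {}
};

std::size_t sizeOf(const std::unique_ptr<Node>& n) { return n ? n->size : 0; }

// Plain BST insertion that also maintains the subtree sizes on the way down.
void insert(std::unique_ptr<Node>& root, int points) {
    if (!root) { root = std::make_unique<Node>(points); return; }
    ++root->size;
    insert(points < root->points ? root->left : root->right, points);
}

// Returns the kth smallest value (k is 1-based) in time proportional to the tree height.
int kthSmallest(const std::unique_ptr<Node>& root, std::size_t k) {
    assert(root && k >= 1 && k <= root->size);
    const Node* n = root.get();
    while (true) {
        std::size_t leftSize = sizeOf(n->left);
        if (k == leftSize + 1) return n->points;      // stay: this node is the answer
        if (k <= leftSize) {
            n = n->left.get();                        // answer is in the left subtree
        } else {
            k -= leftSize + 1;                        // skip the left subtree and this node
            n = n->right.get();
        }
    }
}
```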
Homework: Trace Heap Sort
• Perform partial sort to get top 3 results on this max heap (from q1)!
• Continue all the way to complete the Heap Sort on this max heap!
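To check a hand trace, heap sort with an adjustable number of extraction steps can be sketched as below. The name heapSortSteps is hypothetical. After s steps the last s slots hold the s largest values in ascending order, so 3 steps give the partial sort and n steps give the full heap sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Bubble a[i] down within the first n items of a max heap.
void siftDown(std::vector<int>& a, std::size_t i, std::size_t n) {
    while (true) {
        std::size_t largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[largest] < a[l]) largest = l;
        if (r < n && a[largest] < a[r]) largest = r;
        if (largest == i) return;
        std::swap(a[i], a[largest]);
        i = largest;
    }
}

// Build a max heap in place, then perform `steps` extract-max rounds.
void heapSortSteps(std::vector<int>& a, std::size_t steps) {
    const std::size_t n = a.size();
    for (std::size_t i = n / 2; i-- > 0; ) siftDown(a, i, n);   // heapify, O(n)
    steps = std::min(steps, n);
    for (std::size_t done = 0; done < steps; ++done) {
        const std::size_t last = n - 1 - done;
        std::swap(a[0], a[last]);      // current max goes to its final position
        siftDown(a, 0, last);          // restore the heap on the shrunken prefix
    }
}
```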
Class Photo
• I usually take a class photo every semester
• Please move to the center
• We will take two photos:
– One without me.
– One with me (one student will do the photo taking)
• After Thursday, you can see our photos here:
– http://www.comp.nus.edu.sg/~stevenha/myteaching/myteachingrecord.html
Advertisement
• CS1102/C IEEE Revision Series
• Conducted by: me…
• Very relevant for this module!
– Venue: LT 3
– Time: Friday, 14 November 2008, 6-9pm
• I will discuss past papers and share how to tackle CS1102/C exam!
• Be there… you do not want to miss this!
Questions from last semesters
• Use them as additional learning materials…
Heap Insert/Delete
• Given the minheap h below,
• Show what the minheap h would look like after each of the
following pseudocode operations one after another:
a) h.heapInsert(8)
b) h.heapInsert(5)
c) h.heapDelete()
Solution
a) h.heapInsert(8)
b) h.heapInsert(5)
c) h.heapDelete()
Extension
After these three operations
a) h.heapInsert(8)
b) h.heapInsert(5)
c) h.heapDelete()
Continue with this:
d) h.heapInsert(14)
e) h.heapInsert(1) // assume this is 1a
f) h.heapDelete()
g) h.heapInsert(1) // duplicate, this is 1b
h) h.heapInsert(15)
Extension – Solution
Final heap:
Heap Construction
• Show the result of inserting:
12, 10, 1, 14, 6, 5, 8, 15, 3, 9, 7, 4, 11, 13, and 2
• One at a time!
• Into an initially empty maxheap...
• Then show/compare the result by using the
heap construction algorithm instead!
Solution
• Using Individual Insertions
• Using Heap Construction Algorithm