PowerPoint Source - University of Virginia, Department of Computer

Download Report

Transcript PowerPoint Source - University of Virginia, Department of Computer

CS 332: Algorithms
Augmenting Data Structures
David Luebke
1
3/30/2016
Administrivia
Midterm is postponed until Thursday, Oct 26
 Reminder: homework 3 due today




In the CS front office
Due at 5 PM (but don’t risk being there at 4:59!)
Check your e-mail for some clarifications & hints
David Luebke
2
3/30/2016
Review: Hash Tables

More formally:

Given a table T and a record x, with key (=
symbol) and satellite data, we need to support:
 Insert
(T, x)
 Delete (T, x)
 Search(T, x)


Don’t care about sorting the records
Hash tables support all the above in
O(1) expected time
David Luebke
3
3/30/2016
Review: Direct Addressing

Suppose:



The range of keys is 0..m-1
Keys are distinct
The idea:


Use key itself as the address into the table
Set up an array T[0..m-1] in which
 T[i]
=x
 T[i] = NULL

if x T and key[x] = i
otherwise
This is called a direct-address table
David Luebke
4
3/30/2016
Review: Hash Functions

Next problem: collision
T
0
U
(universe of keys)
h(k1)
h(k4)
k1
k4
K
(actual
keys)
k2
k5
h(k2) = h(k5)
h(k3)
k3
m-1
David Luebke
5
3/30/2016
Review: Resolving Collisions
How can we solve the problem of collisions?
 Open addressing




To insert: if slot is full, try another slot, and
another, until an open slot is found (probing)
To search, follow same sequence of probes as
would be used when inserting the element
Chaining


Keep linked list of elements in slots
Upon collision, just add new element to list
David Luebke
6
3/30/2016
Review: Chaining

Chaining puts elements that hash to the same
slot in a linked list:
T
——
U
(universe of keys)
k6
k2
k4 ——
k5
k2
——
——
——
k1
k4
K
(actual
k7
keys)
k1
k5
k7 ——
——
k8
k3
k3 ——
k8
k6 ——
——
David Luebke
7
3/30/2016
Review: Analysis Of Hash Tables
Simple uniform hashing: each key in table is
equally likely to be hashed to any slot
 Load factor  = n/m = average # keys per slot





Average cost of unsuccessful search = O(1+α)
Successful search: O(1+ α/2) = O(1+ α)
If n is proportional to m, α = O(1)
So the cost of searching = O(1) if we size our
table appropriately
David Luebke
8
3/30/2016
Review: Choosing A Hash Function

Choosing the hash function well is crucial


Bad hash function puts all elements in same slot
A good hash function:
 Should
distribute keys uniformly into slots
 Should not depend on patterns in the data

We discussed three methods:



Division method
Multiplication method
Universal hashing
David Luebke
9
3/30/2016
Review: The Division Method

h(k) = k mod m

In words: hash k into a table with m slots using the
slot given by the remainder of k divided by m
Elements with adjacent keys hashed to
different slots: good
 If keys bear relation to m: bad
 Upshot: pick table size m = prime number not
too close to a power of 2 (or 10)

David Luebke
10
3/30/2016
Review: The Multiplication Method
For a constant A, 0 < A < 1:
 h(k) =  m (kA - kA) 

Fractional part of kA

Upshot:



Choose m = 2P
Choose A not too close to 0 or 1
Knuth: Good choice for A = (5 - 1)/2
David Luebke
11
3/30/2016
Review: Universal Hashing
When attempting to foil an malicious
adversary, randomize the algorithm
 Universal hashing: pick a hash function
randomly when the algorithm begins (not upon
every insert!)



Guarantees good performance on average, no
matter what keys adversary chooses
Need a family of hash functions to choose from
David Luebke
12
3/30/2016
Review: Universal Hashing

Let  be a (finite) collection of hash functions



…that map a given universe U of keys…
…into the range {0, 1, …, m - 1}.
If  is universal if:


for each pair of distinct keys x, y  U,
the number of hash functions h  
for which h(x) = h(y) is ||/m
In other words:
a random hash function from , the chance of a
collision between x and y (x  y) is exactly 1/m
 With
David Luebke
13
3/30/2016
Review: A Universal Hash Function
Choose table size m to be prime
 Decompose key x into r+1 bytes, so that
x = {x0, x1, …, xr}




Only requirement is that max value of byte < m
Let a = {a0, a1, …, ar} denote a sequence of r+1
elements chosen randomly from {0, 1, …, m - 1}
Define corresponding hash function ha  :
 r

ha x     ai xi  mod m
 i 0

 With this definition,  has mr+1 members
David Luebke
14
3/30/2016
Augmenting Data Structures
This course is supposed to be about design and
analysis of algorithms
 So far, we’ve only looked at one design
technique (What is it?)

David Luebke
15
3/30/2016
Augmenting Data Structures
This course is supposed to be about design and
analysis of algorithms
 So far, we’ve only looked at one design
technique: divide and conquer
 Next up: augmenting data structures


Or, “One good thief is worth ten good scholars”
David Luebke
16
3/30/2016
Dynamic Order Statistics
We’ve seen algorithms for finding the ith
element of an unordered set in O(n) time
 Next, a structure to support finding the ith
element of a dynamic set in O(lg n) time





What operations do dynamic sets usually support?
What structure works well for these?
How could we use this structure for order statistics?
How might we augment it to support efficient
extraction of order statistics?
David Luebke
17
3/30/2016
Order Statistic Trees

OS Trees augment red-black trees:


Associate a size field with each node in the tree
x->size records the size of subtree rooted at x,
including x itself:
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
David Luebke
H
1
18
3/30/2016
Selection On OS Trees
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
How can we use this property
to select the ith element of the set?
David Luebke
19
3/30/2016
OS-Select
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
20
3/30/2016
OS-Select Example

Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
21
M
8
C
5
A
1
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example

Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
22
M
8
i=5
r=6
C
5
A
1
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example

Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
23
M
8
C
5
A
1
i=5
r=6
i=5
r=2
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example

Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
24
M
8
C
5
A
1
i=5
r=2
F
3
D
1
i=5
r=6
P
2
i=3
r=2
Q
1
H
1
3/30/2016
OS-Select Example

Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
25
M
8
C
5
A
1
P
2
i=5
r=2
F
3
D
1
i=5
r=6
i=3
r=2
H
1
Q
1
i=1
r=1
3/30/2016
OS-Select: A Subtlety
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
What happens at the leaves?
 How can we deal elegantly with this?

David Luebke
26
3/30/2016
OS-Select
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}

What will be the running time?
David Luebke
27
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
What is the rank of this element?
David Luebke
28
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
Of this one? Why?
David Luebke
29
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
Of the root? What’s the pattern here?
David Luebke
30
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
What about the rank of this element?
David Luebke
31
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
This one? What’s the pattern here?
David Luebke
32
3/30/2016
OS-Rank
OS-Rank(T, x)
{
r = x->left->size + 1;
y = x;
while (y != T->root)
if (y == y->p->right)
r = r + y->p->left->size + 1;
y = y->p;
return r;
}

What will be the running time?
David Luebke
33
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
 Next step: maintain sizes during Insert() and
Delete() operations


How would we adjust the size fields during
insertion on a plain binary search tree?
David Luebke
34
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
 Next step: maintain sizes during Insert() and
Delete() operations



How would we adjust the size fields during
insertion on a plain binary search tree?
A: increment sizes of nodes traversed during search
David Luebke
35
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
 Next step: maintain sizes during Insert() and
Delete() operations




How would we adjust the size fields during
insertion on a plain binary search tree?
A: increment sizes of nodes traversed during search
Why won’t this work on red-black trees?
David Luebke
36
3/30/2016
Maintaining Size Through Rotation
y
19
x
11
6
x
19
rightRotate(y)
7
leftRotate(x)
4
y
12
6
4
7
Salient point: rotation invalidates only x and y
 Can recalculate their sizes in constant time


Why?
David Luebke
37
3/30/2016
Augmenting Data Structures:
Methodology

Choose underlying data structure


Determine additional information to maintain


E.g., subtree sizes
Verify that information can be maintained for
operations that modify the structure


E.g., red-black trees
E.g., Insert(), Delete()
(don’t forget rotations!)
Develop new operations

David Luebke
E.g., OS-Rank(), OS-Select()
38
3/30/2016
The End

Up next:


Interval trees
Review for midterm
David Luebke
39
3/30/2016