PowerPoint Source - University of Virginia, Department of Computer
Download
Report
Transcript PowerPoint Source - University of Virginia, Department of Computer
CS 332: Algorithms
Augmenting Data Structures
David Luebke
1
3/30/2016
Administrivia
Midterm is postponed until Thursday, Oct 26
Reminder: homework 3 due today
In the CS front office
Due at 5 PM (but don’t risk being there at 4:59!)
Check your e-mail for some clarifications & hints
David Luebke
2
3/30/2016
Review: Hash Tables
More formally:
Given a table T and a record x, with key (=
symbol) and satellite data, we need to support:
Insert
(T, x)
Delete (T, x)
Search(T, x)
Don’t care about sorting the records
Hash tables support all the above in
O(1) expected time
David Luebke
3
3/30/2016
Review: Direct Addressing
Suppose:
The range of keys is 0..m-1
Keys are distinct
The idea:
Use key itself as the address into the table
Set up an array T[0..m-1] in which
T[i]
=x
T[i] = NULL
if x T and key[x] = i
otherwise
This is called a direct-address table
David Luebke
4
3/30/2016
Review: Hash Functions
Next problem: collision
T
0
U
(universe of keys)
h(k1)
h(k4)
k1
k4
K
(actual
keys)
k2
k5
h(k2) = h(k5)
h(k3)
k3
m-1
David Luebke
5
3/30/2016
Review: Resolving Collisions
How can we solve the problem of collisions?
Open addressing
To insert: if slot is full, try another slot, and
another, until an open slot is found (probing)
To search, follow same sequence of probes as
would be used when inserting the element
Chaining
Keep linked list of elements in slots
Upon collision, just add new element to list
David Luebke
6
3/30/2016
Review: Chaining
Chaining puts elements that hash to the same
slot in a linked list:
T
——
U
(universe of keys)
k6
k2
k4 ——
k5
k2
——
——
——
k1
k4
K
(actual
k7
keys)
k1
k5
k7 ——
——
k8
k3
k3 ——
k8
k6 ——
——
David Luebke
7
3/30/2016
Review: Analysis Of Hash Tables
Simple uniform hashing: each key in table is
equally likely to be hashed to any slot
Load factor = n/m = average # keys per slot
Average cost of unsuccessful search = O(1+α)
Successful search: O(1+ α/2) = O(1+ α)
If n is proportional to m, α = O(1)
So the cost of searching = O(1) if we size our
table appropriately
David Luebke
8
3/30/2016
Review: Choosing A Hash Function
Choosing the hash function well is crucial
Bad hash function puts all elements in same slot
A good hash function:
Should
distribute keys uniformly into slots
Should not depend on patterns in the data
We discussed three methods:
Division method
Multiplication method
Universal hashing
David Luebke
9
3/30/2016
Review: The Division Method
h(k) = k mod m
In words: hash k into a table with m slots using the
slot given by the remainder of k divided by m
Elements with adjacent keys hashed to
different slots: good
If keys bear relation to m: bad
Upshot: pick table size m = prime number not
too close to a power of 2 (or 10)
David Luebke
10
3/30/2016
Review: The Multiplication Method
For a constant A, 0 < A < 1:
h(k) = m (kA - kA)
Fractional part of kA
Upshot:
Choose m = 2P
Choose A not too close to 0 or 1
Knuth: Good choice for A = (5 - 1)/2
David Luebke
11
3/30/2016
Review: Universal Hashing
When attempting to foil an malicious
adversary, randomize the algorithm
Universal hashing: pick a hash function
randomly when the algorithm begins (not upon
every insert!)
Guarantees good performance on average, no
matter what keys adversary chooses
Need a family of hash functions to choose from
David Luebke
12
3/30/2016
Review: Universal Hashing
Let be a (finite) collection of hash functions
…that map a given universe U of keys…
…into the range {0, 1, …, m - 1}.
If is universal if:
for each pair of distinct keys x, y U,
the number of hash functions h
for which h(x) = h(y) is ||/m
In other words:
a random hash function from , the chance of a
collision between x and y (x y) is exactly 1/m
With
David Luebke
13
3/30/2016
Review: A Universal Hash Function
Choose table size m to be prime
Decompose key x into r+1 bytes, so that
x = {x0, x1, …, xr}
Only requirement is that max value of byte < m
Let a = {a0, a1, …, ar} denote a sequence of r+1
elements chosen randomly from {0, 1, …, m - 1}
Define corresponding hash function ha :
r
ha x ai xi mod m
i 0
With this definition, has mr+1 members
David Luebke
14
3/30/2016
Augmenting Data Structures
This course is supposed to be about design and
analysis of algorithms
So far, we’ve only looked at one design
technique (What is it?)
David Luebke
15
3/30/2016
Augmenting Data Structures
This course is supposed to be about design and
analysis of algorithms
So far, we’ve only looked at one design
technique: divide and conquer
Next up: augmenting data structures
Or, “One good thief is worth ten good scholars”
David Luebke
16
3/30/2016
Dynamic Order Statistics
We’ve seen algorithms for finding the ith
element of an unordered set in O(n) time
Next, a structure to support finding the ith
element of a dynamic set in O(lg n) time
What operations do dynamic sets usually support?
What structure works well for these?
How could we use this structure for order statistics?
How might we augment it to support efficient
extraction of order statistics?
David Luebke
17
3/30/2016
Order Statistic Trees
OS Trees augment red-black trees:
Associate a size field with each node in the tree
x->size records the size of subtree rooted at x,
including x itself:
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
David Luebke
H
1
18
3/30/2016
Selection On OS Trees
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
How can we use this property
to select the ith element of the set?
David Luebke
19
3/30/2016
OS-Select
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
20
3/30/2016
OS-Select Example
Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
21
M
8
C
5
A
1
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example
Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
22
M
8
i=5
r=6
C
5
A
1
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example
Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
23
M
8
C
5
A
1
i=5
r=6
i=5
r=2
P
2
F
3
D
1
Q
1
H
1
3/30/2016
OS-Select Example
Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
24
M
8
C
5
A
1
i=5
r=2
F
3
D
1
i=5
r=6
P
2
i=3
r=2
Q
1
H
1
3/30/2016
OS-Select Example
Example: show OS-Select(root, 5):
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
David Luebke
25
M
8
C
5
A
1
P
2
i=5
r=2
F
3
D
1
i=5
r=6
i=3
r=2
H
1
Q
1
i=1
r=1
3/30/2016
OS-Select: A Subtlety
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
What happens at the leaves?
How can we deal elegantly with this?
David Luebke
26
3/30/2016
OS-Select
OS-Select(x, i)
{
r = x->left->size + 1;
if (i == r)
return x;
else if (i < r)
return OS-Select(x->left, i);
else
return OS-Select(x->right, i-r);
}
What will be the running time?
David Luebke
27
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
What is the rank of this element?
David Luebke
28
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
Of this one? Why?
David Luebke
29
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
Of the root? What’s the pattern here?
David Luebke
30
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
What about the rank of this element?
David Luebke
31
3/30/2016
Determining The
Rank Of An Element
M
8
C
5
P
2
A
1
F
3
Q
1
D
1
H
1
This one? What’s the pattern here?
David Luebke
32
3/30/2016
OS-Rank
OS-Rank(T, x)
{
r = x->left->size + 1;
y = x;
while (y != T->root)
if (y == y->p->right)
r = r + y->p->left->size + 1;
y = y->p;
return r;
}
What will be the running time?
David Luebke
33
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
Next step: maintain sizes during Insert() and
Delete() operations
How would we adjust the size fields during
insertion on a plain binary search tree?
David Luebke
34
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
Next step: maintain sizes during Insert() and
Delete() operations
How would we adjust the size fields during
insertion on a plain binary search tree?
A: increment sizes of nodes traversed during search
David Luebke
35
3/30/2016
OS-Trees: Maintaining Sizes
So we’ve shown that with subtree sizes, order
statistic operations can be done in O(lg n) time
Next step: maintain sizes during Insert() and
Delete() operations
How would we adjust the size fields during
insertion on a plain binary search tree?
A: increment sizes of nodes traversed during search
Why won’t this work on red-black trees?
David Luebke
36
3/30/2016
Maintaining Size Through Rotation
y
19
x
11
6
x
19
rightRotate(y)
7
leftRotate(x)
4
y
12
6
4
7
Salient point: rotation invalidates only x and y
Can recalculate their sizes in constant time
Why?
David Luebke
37
3/30/2016
Augmenting Data Structures:
Methodology
Choose underlying data structure
Determine additional information to maintain
E.g., subtree sizes
Verify that information can be maintained for
operations that modify the structure
E.g., red-black trees
E.g., Insert(), Delete()
(don’t forget rotations!)
Develop new operations
David Luebke
E.g., OS-Rank(), OS-Select()
38
3/30/2016
The End
Up next:
Interval trees
Review for midterm
David Luebke
39
3/30/2016