Augmenting Data Structures 1
Download
Report
Transcript Augmenting Data Structures 1
CS473-Algorithms I
Lecture X
Augmenting Data Structures
CS 473
Lecture X
1
Augmenting Data Structures
When dealing with a new problem
•
Data structures play important role
•
Must design or addopt a data structure.
Only in rare situtations
• We need to create an entirely new type of data
structure.
More Often
• It suffices to augment a known data structure by
storing additional information.
• Then we can program new operations for the data
structure to support the desired application
CS 473
Lecture X
2
Augmenting Data Structures (2)
Not Always Easy
• Because, added info must be updated and maintained
by the ordinary operations on the data structure.
Operations
• Augmented data structure (ADS) has operations
inherited from underlying data structure (UDS).
• UDS Read/Query operations are not a problem. (ie.
Min-Heap Minimum Query)
• UDS Modify operations should update additional
information without adding too much cost. (ie. MinHeap Extract Min, Decrease Key)
CS 473
Lecture X
3
Dynamic Order Statistics
Example problem;
• Dynamic Order Statistics, where we need two
operations;
OS-SELECT(x,i): returns ith smallest key in
subtree rooted at x
OS-RANK(T,x): returns rank (position) of x in
sorted (linear) order of tree T.
Other operations
CS 473
•
Query: Search, Min, Max, Successor, Predecessor
•
Modify: Insert, Delete
Lecture X
4
Dynamic Order Statistics (2)
Sorted or linear order of a binary search tree T is
determined by inorder tree walk of T.
IDEA:
• Use Red-Black (R-B) tree as the underlying data
structure.
• Keep subtree size in nodes as additional information.
CS 473
Lecture X
5
Dynamic Order Statistics (3)
Relation Between Subtree Sizes;
size[x] = size[left[x]] + size[right[x]] + 1
The node
itself
Note on implementation;
• For convenience use sentinel NIL[T] such that;
size[NIL[T]] = 0
• Since high level languages do not have operations
on NIL values. (ie. Java has NullPointerException)
CS 473
Lecture X
6
Dynamic Order Statistics Notation
KEY
SUBTREE
SIZE
Node Structure:
• Key as with any Binary Search Tree (Tree is
indexed according to key)
• Subtree Size as additional Data on Node.
CS 473
Lecture X
7
Dynamic Order Statistics Example
10=7+2+1
M
10
C
7
A
1
P
2
F
5
D
1
CS 473
5=1+3+1
H
3
G
1
Q
1
1= size[NIL] + size[NIL] + 1
= 0+0+1
3=1+1+1
K
1
Lecture X
Linear Order
ACDFGHKMPQ
8
Retrieving an Element With a Given Rank
OS-SELECT(x, i)
r size[left[x]] + 1
if i = r then
return x
elseif i < r then
return OS-SELECT( left[x], i)
else
return OS-SELECT(right[x], i)
CS 473
Lecture X
9
OS-SELECT Example
M
10
r>i
Go
Left
C
7
A
1
P
2
F
5
D
1
Q
1
H
3
G
1
CS 473
i=6
r=8
K
1
Lecture X
10
r<i
Go
Right
i=6
r=2
OS-SELECT Example
M
10
C
7
A
1
P
2
F
5
D
1
Q
1
H
3
G
1
CS 473
i=6
r=8
K
1
Lecture X
11
OS-SELECT Example
M
10
i=6
r=2
C
7
A
1
P
2
F
5
D
1
i=4
r=2
r<i
Go
Right
Q
1
H
3
G
1
CS 473
i=6
r=8
K
1
Lecture X
12
OS-SELECT Example
M
10
i=6
r=2
C
7
A
1
P
2
F
5
D
1
i=4
r=2
H
3
G
1
CS 473
i=6
r=8
Q
1
i=2
r=2
r=i
Found
!!!
K
1
Lecture X
13
OS-SELECT
IDEA:
• Knowing size of left subtree (number of nodes smaller
than current) tells you which subtree the answer is in
Running Time: O(lgn)
• Each recursive call goes down one level in the OSTREE
• Running time = O(d) where d = depth of the ith element
• Worst case running time is proportional to the height of
the tree
• Since tree is a R-B tree, its height is balanced and O(lgn)
CS 473
Lecture X
14
Determining the Rank of an Element
OS-RANK(T, x)
r size[left[x]] + 1
yx
while y root[T] do
if y = right[p[y]] then
r r + size[left[p[y]]] + 1
yp[y]
return r
CS 473
Lecture X
15
Dynamic Order Statistics Example
M
10
C
7
A
1
P
2
F
5
D
1
H
3
G
1
CS 473
Q
1
init r=2
right child
r=2+1+1=4
K
1
Lecture X
16
Dynamic Order Statistics Example
M
10
C
7
A
1
P
2
F
5
D
1
Q
1
H
3
G
1
CS 473
right child
r=4+1+1=6
K
1
Lecture X
17
Dynamic Order Statistics Example
M
10
C
7
A
1
F
5
D
1
Q
1
H
3
G
1
CS 473
P
2
left child
r=6
K
1
Lecture X
18
Dynamic Order Statistics Example
M
10
Root and Answer!!!
r=6
C
7
A
1
F
5
D
1
Q
1
H
3
G
1
CS 473
P
2
K
1
Lecture X
19
OS-RANK
IDEA:
• rank[x] = # of nodes preceeding x in an inorder tree walk
+ 1 (itself)
• Follow the simple upward path from node x to the root.
All left subtrees in this path contribute to rank[x]
Running Time: O(lgn)
• Each iteration of the while-loop takes O(1) time
• y goes up one level in the tree with each iteration.
• Running time = O(dx), where dx is the depth of node x
• Running time, at worst proportional to the height of the
tree. (if x is a leaf node)
CS 473
Lecture X
20
Determining the Rank of an Element
r
z
Tz
y
x
Ty
Follow the nodes y (y=x
initially) on the path from x to
root.
Consider Subtree Tp[y] rooted at
p[y], where z is y’s sibling.
Important: r contains the number of nodes in Ty that
preceedes x.
CS 473
Lecture X
21
Determining the Rank of an Element
r=r+size[z]+1
z
Tz
y
x
Ty
Case 1:
If y is a right child z is a left child then
• All nodes in Tz and p[y] precedes x
• Thus, must add size[z] + 1 to r
CS 473
Lecture X
22
Determining the Rank of an Element
r=r
y
z
Tz
x
Ty
Case 2:
If y is a left child z is a right child then
• Neither p[y] nor any node in Tz precedes x.
• Don’t update r
CS 473
Lecture X
23
Maintaining Subtree Sizes
OS-SELECT and OS-RANK works if we are able to
update the subtree sizes with modifications.
Two operations INSERT and DELETE modifies the
contents of the Tree. We should try to update subtree
size without extra traversals.
If not, would have to make a pass over the tree to set up
the sizes whenever the tree is modified, at cost (n)
CS 473
Lecture X
24
Red Black Tree Insertion
Insertion is a two phase process;
• Phase 1: Insert node. Go down the tree from the
root. Inserting the new node as a child of an existing
node. (Search in O(lgn) time)
• Phase 2: Balance tree and correct colors. Go up the
tree changing colors and ultimately performing
rotations to maintain the R-B properties.
CS 473
Lecture X
25
Maintaining Subtree Sizes in an Insert Operation
Phase 1:
• Increment size[x] for each node x on the downward path
from root to leaves
• The new added node gets the size of 1
• O(lgn) operation
Phase 2:
• Only rotations cause structural changes
• At most two rotations
• Good News!!! Rotation is a local operation. Invalidates
only two size fields of the two nodes, around which the
rotation is performed
• O(1) time operation
CS 473
Lecture X
26
Maintaining Subtree Sizes in an Insert Operation
19
x
19
y
LEFT-ROTATE(T,x)
7
y 11
6
4
6
12
4
x
7
After the rotation:
size[y]size[x]
size[x]size[left[x]] + size[right[x]] + 1
Note: only size fields of x and y are modified
CS 473
Lecture X
27
Red Black Tree Deletion
Deletion is a two phase process;
Phase 1:
• Splice out one node y
Phase 2:
• Performs at most 3 rotations
• Otherwise performs no structural changes
CS 473
Lecture X
28
Maintaining Subtree Sizes in an Delete Operation
Phase 1:
• Traverse a path from node y up to the root. Decrementing
the size field of each node on the path.
• Length of this path = dy = O(lgn)
• O(lgn) Time Operation
Phase 2:
• The O(1) Time Rotations can be handled in the same
manner as for insertion.
CS 473
Lecture X
29
Application: Counting the Number of Inversions
Definition: Let A[1..n] be an array of n distinct numbers.
(i,j) is an inversion of A if i<j and A[i] > A[j]
Inversions of A = <12, 13, 18, 16, 11>
1
2 3 4
5
I(A) = {(1,5), (2,5), (3,4), (3,5), (4,5)}
CS 473
Lecture X
30
Application: Counting the Number of Inversions
Question : What array with elements from the set {1...n}
has the most inversions?
• A = <n, n-1, ... 2, 1>
• number of inversions;
n(n 1)
| I(A) | i
2
i 1
n 1
CS 473
Lecture X
31
Application: Counting the Number of Inversions
Question : What is the relationship between the running
time of insertion sort and the number of inversions
• |I(A)| = # of
inversions = # of element moves (shifts)
n
| I(A) | i | I j ( A) | where;
i 2
I j (A) {i : i j & A[i ] A[ j ]} {i : (i, j ) I ( A) & i j}
n
i.e; I(A) I j ( A)
j 2
Let r(j) be the rank of A[j] in A[i...j], then;
| I j (A) | j r ( j ) r ( j ) j | I j ( A) |
CS 473
Lecture X
32
Number of inversions
INVERSION(A)
sum0
T
for j1 to n do
x MAKE-NEW-NODE()
left[x]right[x]p[x]NIL
key[x]A[j]
OS-INSERT(T,x)
rOS-RANK(T,x)
Ijj-r
sumsum+Ij
return sum
CS 473
Lecture X
33