CS473-Algorithms I
Download
Report
Transcript CS473-Algorithms I
CS473-Algorithms I
Lecture X
Augmenting Data Structures
CS 473
Lecture X
1
How to Augment a Data Structure
FOUR STEP PROCEDURE:
1. Choose an underlying data structure (UDS)
2. Determine additional info to be maintained in the UDS
3. Verify that additional info can be maintained for the
modifying operations on the UDS
4. Develop new operations
Note: Order of steps may vary and be intermixed
in real design.
CS 473
Lecture X
2
Example
Design of our order statistic trees:
1. Choose Red Black (R-B) TREES
2. Additional Info: Subtree sizes
3. INSERT, DELETE => ROTATIONS
4. OS-RANK, OS-SELECT
Bad design choice for OS-TREES:
2. Additional Info: Store in each node its rank in the subtree
• OS-RANK, OS-SELECT would run quickly but;
• Inserting a new minimum element would cause a change to
this info in every node of the tree.
CS 473
Lecture X
3
Augmenting R-B Trees: Theorem
Theorem:
• Let f be a field that augments a R-B Tree T of n nodes
• Suppose that f[x] for a node x can be computed using only
• The info in nodes, x, left[x], right[x]
• f [ left[x] ] and f [ right[x] ]
Proof Main idea:
• Changing f[x]
=> Update only f[p[x]] but nothing else
• Updating f[p[x]] => Update only f[p[p[x]]] but nothing else
• And so on up to the tree until f[root[x]] is updated
• When f[root] is updated, no other node depends on new value
• So the process terminates
• Changing an f field in a node costs O(lgn) time since the height
of a R-B tree is O(lgn)
CS 473
Lecture X
4
INTERVALS:DEFINITIONS
DEFINITION: A Closed interval
• An ordered pair of real numbers [t1,t2] with t1 ≤ t2
• [t1,t2] = { t ∈ R : t1 ≤ t≤ t2}
INTERVALS:
• Used to represent events that each occupy a continuous
period of time
• We wish to query a database of time intervals to find out
what events occurred during a given interval
• Represent an interval [t1,t2] as an object i, with the fields
low[i] = t1 & high[i] = t2
• Intervals i & i' overlap if i ∩ i' ≠ ∅ that is low[i] ≤ high
[i'] AND low[i'] ≤ high [i]
CS 473
Lecture X
5
INTERVALS:DEFINITIONS
Any two intervals satisfy the interval trichotomy
That is exactly one of the following 3 properties hold
a) i and i' overlap
i
i'
i
i'
b) high[i] < low[i']
i
c) high[i'] < low[i]
i'
CS 473
i
i'
i
i'
i'
i
Lecture X
6
INTERVAL TREES
Maintain a dynamic set of elements with each element x
containing an interval int[x]
Support the following operations:
• INSERT(T,x) : Adds an element x whose int field contains
an interval to the tree
• DELETE(T,x): Removes the element x from the tree T
• SEARCH(T,i): Returns a pointer to an element x in T such
that
int[x] overlaps with i'
NIL if no such element in the set.
CS 473
Lecture X
7
INTERVAL TREES(Cont.)
S1: Underlying Data Structure
• Choose R-B Tree
• Each node x contains an interval int[x]
• Key of x=low[ int[x] ]
Inorder tree walk of the tree lists the intervals in sorted order by
low endpoints
S2: Additional information
Store in each node x the maximum endpoint “max[x]” in the
subtree rooted at the node
CS 473
Lecture X
8
[7,10]
EXAMPLE
[5,11]
[17,19]
[4,8]
[15,18]
[21,23]
int
[17,19]
max
23
[5,11]
[21,23]
18
23
[4,8]
[15,18]
8
18
[7,10]
10
CS 473
Lecture X
9
INTERVAL TREES(cont.)
S3: Maintaining Additional Info(max[x])
• max[x] = minimum {high[int[x]], max[left[x]],
max[right[x]]}
• Thus, by theorem INSERT & DELETE run in
O(lgn) time
CS 473
INTERVAL TREES(cont.)
INSERT OPERATION
• Fix subtree Max’s on the way down
• As traverse path for INSERTION while comparing “new low” to that
of node intervals
• Use “new high” to update “max” of nodes as appropriate
• Restore balance with rotations; updating of “max” fields for
rotation
No change
[11,35]
35
[6,20]
Z
[6,20]
X
14
20
Y
30
X
Right Rotate
35
[11,35]
14
19
• Thus, fixing “max” fields for rotation takes O(1) time.
CS 473
Y
19
Z
35
14
INTERVAL TREES(cont.)
S4: Developing new operations
INTERVAL-SEARCH(T,i)
x root[T]
while x ≠NIL and i ∩int[x] = ∅ do
if left[x] ≠NIL and max[left[x]] < low[i] then
x left[x]
else
x right[x]
return x
CS 473
INTERVAL TREES(cont.)
Time: O(lgn)
Starts with x at the root and proceeds downward
• On a single path, until
• EITHER an overlapping interval is found
• OR x becomes NIL
Each iteration takes O(1) time
Height of the tree = O(lgn)
CS 473
Correctness of the Search Procedure
Key Idea: Need to check only 1 of the node’s 2 children
Theorem
• Case 1: If search goes right then
Either
overlap in the right subtree or no overlap
• Case 2: If search goes left then
Either
overlap in the left subtree or no overlap
CS 473
Correctness of the Search Procedure
Case 1: Go Right
• If overlap in right, then done
• Otherwise (if no overlap in RIGHT)
– Either left[x] = NIL No overlap in LEFT
– OR left[x] ≠ NIL and max[left[x]] < low[i]
For each interval i’’ in LEFT
high[i’’] <= max[left[x]]
< low[i]
Therefore, No overlap in LEFT
CS 473
Correctness of the Search Procedure
Case 2: GO LEFT
• If
overlap in left, then done
• Otherwise (if no overlap in LEFT)
– low[i] <= max[left[x]] =high[i’] for some i’ in LEFT
– Since i & i’ don’t overlap and low[i] <= high[i’]
We have high[i] < low [i’] (Interval Trichotomy)
– Since tree is sorted by lows we have
high[i] < low[i’]<Any lows in RIGHT
– Therefore, no overlap in RIGHT
CS 473
Pictorial View of Case 1 & Case 2
i’
i
i’’
i’’
max[left[x]]
Case 1
max[left[x]]
t
Case 2
i’’: any interval in left
i’’: any interval in right
i’ in left such that high[i’]=max[left[x]]
CS 473
Interval Trees
• How to enumarate all intervals overlapping a given interval
– Can do in O(klgn) time,
where k = # of overlapping intervals
– Find and Delete overlapping intervals one by one;
• When done reinsert them
• Theoritical Best is O(k+lgn)
CS 473
How to maintain a dynamic set of numbers
that support min-gap operations
MIN-GAP(Q): retuns the magnitude of the difference of the two
closest numbers in Q
Example: Q={1,5,9,15,18,22} MIN-GAP(Q) = 18-15 = 3
1.
Underlying Data Structure:
• A R-B Tree containing the numbers keyed on the numbers
2. Additional Info at each Node:
min-gap[x]: minimum gap value in the subtree TX rooted
at x
min[x]
: minimum value (key) in TX
max[x]
: maximum value (key) in TX
These values are ∞ if x is a leaf node
CS 473
3. Maintaining the Additional Info
min[x] =
min[left[x]]
key[x]
if left[x] NIL
otherwise
min[x] =
min[left[x]]
key[x]
if left[x] NIL
otherwise
min-gap[x] = Min
-
CS 473
min-gap[left[x]]
min-gap[right[x]]
key[x] – max[left[x]]
min[right[x]] – key[x]
Each field can be computed from info in the node & its children
Hence, by theorem, they would be maintained during insert & delete
operation without affecting the O(lgn) running time
How to maintain a dynamic set of numbers
that support min-gap operations(cont.)
• The reason for defining the min & max fields is to make it
possible to compute min-gap from the info
at the node & its children
Develop the new operation: MIN-GAP(Q)
• MIN-GAP(Q) simply returns the min-gap value of the root
– It is an O(1) time operation
It is also possible to find the two closest numbers in O(lgn) time
CS 473
How to maintain a dynamic set of numbers
that support min-gap operations(cont.)
CLOSEST-NUMBERS(Q)
x root[Q]
gapmin min-gap[x]
while x ≠ NIL do
if gapmin = min-gap[left[x]] then
x left[x]
elseif gapmin = min-gap[right[x]]
x right[x]
elseif gapmin = key[x] - max[left[x]]
return { key[x], max[left[x]] }
else
return { min[right[x]], key[x] }
CS 473
How to find the overlap of rectilinearly
rectangles
• Given a set R of n rectilinearly oriented rectangles
– i.e sides of all rectangles are paralled to the x & y
axis
• Each rectangle r R is represented with 4 values
– xmin[r], xmax[r], ymin[r], ymax[r]
• Give an O(nlgn)-time algorithm
To decide whether R contains two rectangle that
overlap
CS 473
OVERLAP(R)
TY ∅
SORT xmin & xmax values of rectangles in R
for each extremum x value in the sorted order do
r rectangle [x]
yint [ ymin[r], ymax[r] ]
if x = xmin[r] then
v INTERVAL-SEARCH(TY, yint)
if v ≠NIL then
return TRUE
else
z MAKE-NEW-NODE()
left[z] right[z] p[z] NIL
int[z] yint
INSERT(TY,z)
else /* x=xmax[r]
*/
DELETE(TY,yint)
return FALSE
CS 473