Constant-Time LCA Retrieval
Download
Report
Transcript Constant-Time LCA Retrieval
Ariel Rosenfeld
Bar-Ilan Uni.
In a rooted tree T, a node u is an ancestor of
a node v if u is on the unique path from the
root to v.
In a rooted tree T, the Lowest Common
Ancestor (LCA) of two nodes u and v is the
deepest node in T that is the ancestor of both
u and v.
1
2
3
4
5
6
Node 3 is the LCA of nodes 4 and 6.
Node 1 is the LCA of node 2 and 5.
The LCA problem is then, given a rooted tree
T for preprocessing, preprocess it in a way so
that the LCA of any two given nodes in T can
be retrieved in constant time.
Preprocessing
Querying
Restrictions\comments
None
O(n)
O(n^3)
O(1)
Naïve (do it yourselves)
O(n^2)
O(1)
DP
O(n)
O(1)
When the tree is complete
O(n)
O(1)
Using reduction. Article
added.
Let B denote a complete binary tree with n
nodes.
The key here is to encode the unique path
from the root to a node in the node itself.
We assign each node a path number, a logn
bit number that encodes the unique path
from the root to the node.
For each node v in B we encode a path number
in the following way:
◦ Counting from the left most bit, the i’th bit of the
path number for v corresponds to the i’th edge on the
path from the root to v.
◦ A 0 for the i’th bit from the left indicates that the i’th
edge on the path goes to a left child, and a 1 indicates
that it goes to a right child.
◦ Let k denote then number of edges on the path from
the root to v, then we mark the k+1 bit (the height bit)
of the path number 1, and the rest of the logn-k-1
bits 0.
1
0
node j
1
0
0
node i
Node i’s path number is
Node j’s path number is
0101
1010
The height bit is marked in blue
Padded bits are marked in red.
1000
0100
0010
0001
1100
0110
0011
0101
1010
0111 1001
1011
1110
1101
1111
Path numbers can easily be assigned in a
simple O(n) in-order traversal on B.
(good programing exercise)
Suppose now that u and v are two nodes in B,
and that path(u) and path(v) are their
appropriate path numbers.
We denote the lowest common ancestor of u
and v as lca(u,v).
We denote the prefix bits in the path number,
those that correspond to edges on the path
from the root, as the path bits of the path
number.
First we calculate path(u) XOR path(v) and find
the left most bit which equals 1.
If there is no such bit than path(u) = path(v)
and so u = v, so assume that the k’th bit of the
result is 1.
If both the k’th bit in path(u) and the k’th bit in
path(v) are path bits, then this means that u
and v agree on k-1 edges of their path from
the root, meaning that the k-1 prefix of each
node’s path number encodes within it the path
from the root to lca(u,v).
lca(u,v)
u
0100
0010
v
0111
path(u) XOR path(v) =
0010
XOR
0111
0101
path(lca(u,v) =
0 1 0 0
height bit
padded bits
lca(u’,v’)
1010
u’
1001
v’
1011
path(u’) XOR path(v’) =
1001
XOR
1011
0010
path(lca(u,v) =
1 0 1 0
height bit
padded bit
This concludes that if we take the prefix k1 bits of the result of path(u) XOR path(v),
add 1 as the k’th bit, and pad logn-k 0
suffix bits, we get path(lca(u,v)).
If either the k’th bit in path(u) or the k’th bit
in path(v) (or both) is not a path bit then
one node is ancestor to the other, and
lca(u,v) can easily be retrieved by
comparing path(u) and path(v)’s height bit.
The following are the two stages of the general
LCA algorithm for any arbitrary tree T:
First, we reduce the LCA problem to the Restricted
Range Minima {RRM} (simple case of Range
Minimum Query {RMQ}) problem.
Problem of finding the smallest number in an
interval of a fixed list of numbers, where the
difference between two successive numbers in the
list is exactly one.
We’ll solve the Restricted Range Minima problem
and thus solve the LCA problem.
Let T denote an arbitrary tree
Let lca(u,v) denote the lowest common
ancestor of nodes u and v in T.
First we execute a depth-first traversal of T to
label the nodes in the depth-first order they
are encountered.
In that same traversal we maintain a list L, of
nodes of T, in the same order that they were
visited.
The only property of the depth-first
numbering we need is that the number given
to any node is smaller then the number given
to any of it’s descendents.
000
001
010
011
100
110
101
111
The depth-first traversal creates these depth
numbers and the following list L:
L = { 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0 }
Now if want to find lca(u,v), we find the first
occurrence of the two nodes in L, this defines
an interval I in L.
Suppose u occurs in L before v. Now, I
describes the part of the traversal, from the
point we first discovered u to the point we first
discovered v.
lca(u,v) can be retrieved by finding the
minimum number in I.
This is due to the following two simple facts:
◦ If u is an ancestor of v then all nodes visited
between u and v are in u’s subtree, thus the depthnumber assigned to u is minimal in I.
◦ If u is not an ancestor of v, then all those nodes
visited between u and v are in lca(u,v)’s subtree,
and the traversal must visit lca(u,v). Thus the
minimum of I is the depth-number assigned to
lca(u,v).
000
001
010
011
100
101
110
111
L = { 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0 }
lca(3,7) = 2
◦
lca(0,7) = 0
So far we’ve shown how to reduce the LCA
problem to the range minima problem. This
next step shows how to achieve reduction to
the restricted range minima problem.
Denote level(u) as the number of edges in the
unique path from the root to node u in T.
If L = { l1, l2, … , lz } then we build the
following list :
L’={level(l1),level(l2),…level(lz)}.
We use L’ in the same manner we used L in the
previous reduction scheme.
This works because in every interval I = [u,v] in
L, lca(u,v) is the lowest node in I for the same
reasons mentioned earlier.
The difference between two adjacent elements
in L’ is exactly one.
This completes the reduction to the restricted
range minima problem.
Denote n as the number of nodes in T.
◦ Depth-first traversal can be done in O( n ) space and time
complexity.
◦ L is of size O( n ) and thus it’s creation and initialization can
be done in O( n ) space and time complexity.
◦ To find lca(u,v) we need the first occurrence of u and v in L.
This could be stored in a table of size O( n ). Thus the
creation and initialization of this table can be done in O( n )
space and time complexity.
The total space and time complexity of the reduction is then
O( n ).
The Range Minima problem is the problem of
finding the smallest number in an interval of a
fixed list of numbers.
The Restricted Range Minima problem is an
instance of the Range Minima problem where
the difference between two successive numbers
is exactly one.
The Restricted Range Minima problem is stated
formally in the following:
Given a list L = { l1 , l2 , … , ln } of n real
numbers, where for each i = 1… n-1 holds that
| li - li+1 | = 1
preprocess the list so that for any interval [ li ,
li+1 , … , lj ] ,1 i < j n, the minimum over the
interval can be retrieved in constant time.
I added the article explaining how to solve
RRM in O(n) time and space.
Somewhat challenging..
If LCA had a good solution, would it solve
(restricted) Range Maxima?
In other words, Can we reduce RM to LCA?
Surprisingly easy.
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
10
0
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
10
0
25
1
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
10
0
22
2
25
1
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34
34 77
10
0
22
2
34
25
1
3
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
7
4
10
0
22
2
34
25
1
3
19 10 12
26 16
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
10 25 22 34 7
19 10 12
26 16
4
0
6
2
1
5
7
3
9
8
O(n), O(1) -time solution for LCA,
If there is an
O(n), O(1) -time solution for
then there is an
RMQ.
Let A be the input array to the RMQ problem.
Let C be the Cartesian tree of A.
RMQA(i,j) = LCAC(i,j)
A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8]
10 25 22 34 7
19 10 12
A[9]
26 16
4
0
6
2
1
5
7
3
9
8