Tree Data Structures

Download Report

Transcript Tree Data Structures

Data Structures:
Disjoint Sets, Segment Trees,
Fenwick Trees
Disjoint Sets: the Union-Find tree
• Represent a set of disjoint sets
• Join sets together (union)
• Find which set an element belongs to
• Quickly test if two separate items are in the same set
• Note: Not the same as a vector of sets – that is much slower to go
through.
Implementing disjoint set
• Each set is a tree – store forest of trees.
• Tree is identified by the ID at the root.
• Each node knows its parent – follow to the root.
• Storage:
• array of values,
• array of parent indices,
• Initially, each item is its own parent
• array of tree height
• Actually an upper bound on height, not actual height
• Starts as height 0
Merge operation
• Merge two trees – set one tree’s root to have parent of other tree’s
root.
• Pick the shorter tree to attach to the longer one
• If the trees are the same height, pick either
• Increase the height of the tree which is the new root by 1
• Called “union-by-rank” (prevents trees getting too tall)
Find Operation
• Path Compression – shorten path to root when possible
• As you traverse the path to the root – do so recursively
• After determining root, set everything on path back down to that root.
• Over time, this compresses all paths to point to root more quickly – when lots
of queries, makes them fast.
• Book: see section 2.4.2 for implementation
Segment Trees
• Idea: create a tree on top of a base array that keeps information
about ranges of the array
• Example: where is the minimum in a range of the array
• Example: where is the maximum?
• Example: what is the sum over some range
• Segment trees are a way of computing this
• Work for dynamically changing data
• Static data has a dynamic programming solution that is easier (we will see
later)
• See section 2.4.3 of book
Storing the Segment Tree
• Similar to a heap
• Use an array of 4 times the base array size (could be smaller, but easy
upper bound)
• Root is index 1
• Covers [0,n-1] in original array
• Example: the index of the minimum value from the array, or max, or sum, etc.
• Left child of node i (covering [L,R]) is 2*i, right node is 2*i+1
• Left child covers range [L, (L+R)/2]
• Right child covers range [(L+R)/2+1, R]
Building and updating the tree
• At base level, each “leaf” will have the result for a single element of
the original array.
• When the node covers [k,k] it is the result for a[k] in the original array, a.
• Can build the tree bottom-up
• Compute children, then merge these for parent
• If an array element gets updated
• Update all nodes on the path to that node – can traverse tree appropriately,
and update bottom-up.
Answering range queries with the tree
• For query [a,b], checking node i: [L, R]
• Assumption: [a,b] is within [L,R], i.e. a >= L, b <= R
• If a==L and b==R, just return value
• If [a,b] overlaps one child, recursively call on that child
• if b <= (L+R)/2, return query [a,b] on left child node 2*i
• if a >= (L+R)/2+1, return query [a,b] on right child node 2*i+1
• If [a,b] overlaps both children, process result from call on both:
• query [a,(L+R)/2] on left child node 2*i
• query [(L+R)/2+1,b] on right child node 2*i+1
• return the result of both (e.g. sum, or min, or max, or lcm, etc.)
Square Root Decomposition
• Another way (simpler, but somewhat less powerful than segment trees) to
improve range queries.
• Idea: break an array of size n into buckets of size n/k
• Then, queries over a range will look at
• the first bucket item-by-item from the starting point on
• then the entire buckets until the last one
• then the last bucket item-by-item until the ending point
• Ideal bucket size is k = sqrt(n)
• Example: 16 elements, range [0,15], break into 4 buckets.
• Query: [1,9]: First bucket looks at 1, 2, 3; second bucket covers 4-7, third bucket looks
at 8,9
Fenwick Trees (Binary Indexed Tree)
• A very efficient implementation of trees (stored as an array)
• Uses binary codes to quickly address node elements
• Typically used for range-sum queries (RSQ) – sum over range
• Again, good for dynamic data
• Index into an array from 1 to n
• See book section 2.4.2
Storing the Fenwick tree
• LSOne = Least significant bit
• Can be calculated as x & (-x)
(-x is in two’s complement)
• Node i of the Fenwick tree covers [i-LSOne(i)+1, i]
•
•
•
•
•
•
•
•
•
1: [1,1]
2: [1,2]
3: [3,3]
4: [1,4]
5: [5,5]
6: [5,6]
7: [7,7]
8: [1,8]
etc.
1-1+1
2-2+1
3-1+1
4-4+1
5-1+1
6-2+1
7-1+1
8-8+1
Range Sum Query on Fenwick Tree
• For [1,b], sum results from a sequence of elements, stripping off LSOne
each time:
•
•
•
•
b
b’ = b-LSOne(b)
repeat until b’ = 0
Total number = number of 1 bits in binary representation.
• e.g. for [1,7]:
•
•
•
•
b = 7 (binary 0111)
b’ = 7-1 = 6 (binary 0110) [5,6]
b’’ = 6-2 = 4 (binary 0100) [1,4]
b’’’ = 4-4 = 0 (binary 0000)
[7,7]
stop
• To compute [a,b] compute for [1,b] – [1,a-1]
Updating Fenwick tree
• Updating element k:
• Must update range for sequence of nodes containing k
• k’ = k+LSOne(k)
• repeat until > n, where n is size of array
• e.g for n=12, k = 3
•
•
•
•
k=3
k’ = 3+1 = 4
k’’ = 4+4 = 8
k’’’ = 8+8=16
0011
0011 + 0001 = 0100
0100 + 0100 = 1000
1000 + 1000 = 10000 > 12 therefore stop