lec3 - Department of Computer Science

I/O-Algorithms
Lars Arge
Aarhus University
February 16, 2006
I/O-Model
[Figure: processor P, main memory M and disk D, with block I/O between memory and disk]
• Parameters
N = # elements in problem instance
B = # elements that fit in disk block
M = # elements that fit in main memory
K = output size in searching problem (written T in later query bounds)
• We often assume that $M > B^2$
• I/O: Movement of block between memory and disk
Fundamental Bounds
Problem        Internal       External
• Sorting:     $N \log N$     $\frac{N}{B}\log_{M/B}\frac{N}{B}$
• Permuting:   $N$            $\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\}$
• Searching:   $\log_2 N$     $\log_B N$
• Scanning:    $N$            $\frac{N}{B}$
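The bounds on this slide can be evaluated numerically to see how far apart the internal and external columns are. The following is a minimal sketch: the function name `io_bounds` and the sample values of N, M and B are illustrative assumptions, and the constants hidden by the O-notation are ignored.

```python
import math

def io_bounds(N, M, B):
    """Evaluate the external-memory bounds (without constant factors).

    N, M, B follow the I/O-model: problem size, memory size and
    block size, all counted in elements.
    """
    n = N / B                      # number of blocks in the input
    m = M / B                      # number of blocks that fit in memory
    sort = n * math.log(n, m)      # (N/B) log_{M/B} (N/B)
    return {
        "scan": n,                 # N/B
        "sort": sort,
        "permute": min(N, sort),   # min{N, sorting bound}
        "search": math.log(N, B),  # log_B N
    }

# Hypothetical instance: 2^30 elements, 2^20 in memory, 2^10 per block.
bounds = io_bounds(N=2**30, M=2**20, B=2**10)
for name, value in bounds.items():
    print(f"{name:8s} {value:16.1f}")
```

For this instance the sorting bound is about 2·2^20 I/Os, which is far below the N = 2^30 I/Os that one access per element would cost.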
B-tree
• B-tree with branching parameter b and leaf parameter k (b,k≥8)
– All leaves on same level and contain between $k/4$ and $k$ elements
– Except for the root, all nodes have degree between $b/4$ and $b$
– Root has degree between 2 and b
• B-tree with leaf parameter $k = \Omega(B)$
– O(N/B) space
– Height $O(\log_b \frac{N}{B})$
– $O(\frac{1}{k})$ amortized leaf rebalance operations
– $O(\frac{1}{b \cdot k}\log_b \frac{N}{B})$ amortized internal node rebalance operations
• B-tree with branching parameter $B^c$, $0 < c \le 1$, and leaf parameter B
– Space O(N/B), updates $O(\log_B N)$, queries $O(\log_B N + \frac{T}{B})$
Secondary Structures
• When secondary structures are used, a rebalance on v often requires O(w(v)) I/Os (w(v) is the weight of v, i.e. the number of elements below v)
– If $\Omega(w(v))$ inserts have to be made below v between operations
⇒ O(1) amortized split bound
⇒ $O(\log_B N)$ amortized insert bound
• Nodes in standard B-tree do not have this property
[Figure: (2,4)-tree]
Weight-balanced B-tree
• Idea: Combination of B-tree and BB[α]-tree
– Weight constraint on nodes instead of degree constraint
– Rebalancing performed using split/fuse as in B-tree
• Weight-balanced B-tree with parameters b and k (b>8, k≥8)
– All leaves on same level and contain between k/4 and k elements
– Internal node v at level l has $w(v) < b^l k$
– Except for the root, internal node v at level l has $w(v) > \frac{1}{4} b^l k$
– The root has more than one child
[Figure: node weights are in $\frac{1}{4} b^l k \ldots b^l k$ at level l and in $\frac{1}{4} b^{l-1} k \ldots b^{l-1} k$ at level l-1]
Weight-balanced B-tree
• Weight-balanced B-tree with branching parameter b and leaf parameter $k = \Omega(B)$
– O(N/B) space
– Height $O(\log_b \frac{N}{k})$
– $O(\log_b N)$ rebalancing operations after update
– $\Omega(w(v))$ updates below v between consecutive operations on v
• Weight-balanced B-tree with branching parameter $B^c$ and leaf parameter B
– Updates in $O(\log_B N)$ and queries in $O(\log_B N + \frac{T}{B})$ I/Os
• Construction bottom-up in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
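The Ω(w(v)) bound can be made plausible with a short calculation for the split case. This is a sketch under the assumption that a level-l node v is split into v' and v'' at the child boundary closest to the middle once w(v) reaches $b^l k$; the fuse case needs a similar but slightly longer argument and is omitted here.

```latex
% Children of a level-l node have weight at most b^{l-1} k, and b \ge 8,
% so b^{l-1} k \le \tfrac{1}{8} b^{l} k.
\begin{align*}
w(v'),\, w(v'') &\le \tfrac{1}{2} b^{l} k + b^{l-1} k \le \tfrac{5}{8} b^{l} k, \\
w(v'),\, w(v'') &\ge \tfrac{1}{2} b^{l} k - b^{l-1} k \ge \tfrac{3}{8} b^{l} k .
\end{align*}
% The next split needs the weight to grow back to b^l k,
% the next fuse needs it to drop to b^l k / 4.
\begin{align*}
\text{inserts before next split} &\ge b^{l} k - \tfrac{5}{8} b^{l} k = \tfrac{3}{8} b^{l} k, \\
\text{deletes before next fuse} &\ge \tfrac{3}{8} b^{l} k - \tfrac{1}{4} b^{l} k = \tfrac{1}{8} b^{l} k .
\end{align*}
```

Both quantities are $\Omega(b^l k) = \Omega(w(v))$.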
Persistent B-tree
• Update current version (getting new version)
• Query all versions
• N is number of updates performed
– O(N/B) space
– $O(\log_B N)$ update
– $O(\log_B N + \frac{T}{B})$ query in any version
Persistent B-tree
• Idea: Elements augmented with “existence interval” and stored in
one structure
• Persistent B-tree with parameter b (>16):
– Directed graph
* Nodes contain elements augmented with existence interval
* At any time t, nodes with elements alive at time t form B-tree
with leaf and branching parameter b
– B-tree with leaf and branching parameter b on the indegree-0 (root) nodes
⇒ If b = B:
– Query at any time t in $O(\log_B N + \frac{T}{B})$ I/Os
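The existence-interval idea can be illustrated independently of the tree structure. The sketch below is not the persistent B-tree itself: it keeps all versioned elements in a flat list and answers a query by a linear scan. The class name, the field names and the half-open interval convention are assumptions made for this illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class VersionedElement:
    key: int
    inserted_at: int               # version (time) at which the element appears
    deleted_at: float = math.inf   # version at which it disappears, if ever

    def alive_at(self, t):
        # Assumed convention: half-open existence interval [inserted_at, deleted_at).
        return self.inserted_at <= t < self.deleted_at

def range_query(elements, lo, hi, t):
    """Report keys in [lo, hi] that exist in version t (linear-scan stand-in)."""
    return sorted(e.key for e in elements if e.alive_at(t) and lo <= e.key <= hi)

# Hypothetical history: insert 5 (time 1), insert 9 (time 2),
# delete 5 (time 3), insert 7 (time 4).
history = [
    VersionedElement(5, inserted_at=1, deleted_at=3),
    VersionedElement(9, inserted_at=2),
    VersionedElement(7, inserted_at=4),
]
print(range_query(history, 0, 10, t=2))   # keys present in version 2
print(range_query(history, 0, 10, t=4))   # keys present in version 4
```

A delete never removes an element from the structure; it only closes its existence interval, which is why every earlier version stays queryable.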
B-tree Construction
• In internal memory we can sort N elements in O(N log N) time using
a balanced search tree:
– Insert all elements one-by-one (construct tree)
– Output in sorted order using in-order traversal
• Same algorithm using B-tree uses $O(N \log_B N)$ I/Os
– A factor of $O(B \frac{\log (M/B)}{\log B})$ non-optimal
• As discussed we could build B-tree bottom-up in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– But what about persistent B-tree?
– In general we would like to have dynamic data structure to use in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithms ⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/Os per operation
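The size of the gap between one-by-one insertion and bottom-up construction can be checked with a few lines of arithmetic. This is a sketch with assumed sample parameters; the helper names are hypothetical and constant factors are ignored.

```python
import math

def tree_sort_ios(N, B):
    # N insertions, each costing a root-to-leaf path: N * log_B N
    return N * math.log(N, B)

def bottom_up_ios(N, M, B):
    # Sorting bound: (N/B) * log_{M/B} (N/B)
    return (N / B) * math.log(N / B, M / B)

# Hypothetical instance: 2^30 elements, 2^20 in memory, 2^10 per block.
N, M, B = 2**30, 2**20, 2**10
gap = tree_sort_ios(N, B) / bottom_up_ios(N, M, B)
predicted = B * math.log(M / B) / math.log(B)
print(f"gap between the two bounds : {gap:.0f}")
print(f"B log(M/B) / log B         : {predicted:.0f}")
```

The two printed numbers differ only by the factor log N / log(N/B), which is 1.5 for these parameters.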
Buffer-tree Technique
[Figure: tree with fan-out $M/B$ and height $O(\log_{M/B}\frac{N}{B})$; each node has a buffer of M elements, leaves are blocks of B elements]
• Main idea: Logically group nodes together and add buffers
– Insertions done in a “lazy” way – elements inserted in buffers.
– When a buffer runs full elements are pushed one level down.
– Buffer-emptying in O(M/B) I/Os
⇒ every block touched constant number of times on each level
⇒ inserting N elements (N/B blocks) costs $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os.
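The insertion bound of the buffer-tree technique follows from the buffer-emptying cost by a short amortization argument. A sketch, assuming a buffer is emptied only when it holds $\Theta(M)$ elements:

```latex
\begin{align*}
\text{cost per element per level}
  &= \frac{O(M/B)\ \text{I/Os per buffer-emptying}}{\Theta(M)\ \text{elements pushed down}}
   = O\!\left(\tfrac{1}{B}\right), \\
\text{cost per element}
  &= O\!\left(\tfrac{1}{B}\right) \cdot O\!\left(\log_{M/B}\tfrac{N}{B}\right)
   = O\!\left(\tfrac{1}{B}\log_{M/B}\tfrac{N}{B}\right), \\
\text{cost of } N \text{ insertions}
  &= O\!\left(\tfrac{N}{B}\log_{M/B}\tfrac{N}{B}\right).
\end{align*}
```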
Basic Buffer-tree
• Definition:
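– B-tree with branching parameter $M/B$ and leaf parameter B
– Size M buffer in each internal node
[Figure: node with a buffer of M elements ($m$ blocks) and fan-out between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$; leaves are blocks of B elements]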
• Updates:
– Add time-stamp to insert/delete element
– Collect B elements in memory before inserting in root buffer
– Perform buffer-emptying when buffer runs full
Basic Buffer-tree
• Note:
– Buffer can be larger than M during recursive buffer-emptying
* Elements distributed in sorted order
⇒ at most M elements in buffer unsorted
– Rebalancing needed when “leaf-node” buffer emptied
* Leaf-node buffer-emptying only performed after all full internal node buffers are emptied
Basic Buffer-tree
• Internal node buffer-empty:
– Load first M (unsorted) elements into memory and sort them
– Merge elements in memory with rest of (already sorted) elements
– Scan through sorted list while
* Removing “matching” insert/deletes
* Distributing elements to child buffers
– Recursively empty full child buffers
• Emptying buffer of size X ≥ M takes $O(X/B + M/B) = O(X/B)$ I/Os
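The buffer-emptying steps for an internal node can be sketched in a few lines. This is an in-memory illustration only: it performs no I/O and does not recurse into full child buffers. The tuple layout `(key, timestamp, kind)`, the function names and the rule that a delete cancels the insert directly before it are assumptions made for the example.

```python
import heapq
from bisect import bisect_right
from itertools import groupby

INSERT, DELETE = "insert", "delete"

def cancel_matching(ops):
    """Drop insert/delete pairs among operations on a single key.

    `ops` is sorted by time stamp; a delete that directly follows an
    insert of the same key removes both (assumed matching rule).
    """
    kept = []
    for op in ops:
        if kept and kept[-1][2] == INSERT and op[2] == DELETE:
            kept.pop()
        else:
            kept.append(op)
    return kept

def empty_internal_buffer(buffer, M, routers):
    """Return one list of operations per child.

    buffer  : list of (key, timestamp, kind); only the first M entries
              may be unsorted, the rest is sorted by (key, timestamp).
    routers : sorted routing keys of the node; keys equal to a router
              go to the right child (an arbitrary choice for this sketch).
    """
    head = sorted(buffer[:M])                 # sort the unsorted part in memory
    merged = heapq.merge(head, buffer[M:])    # merge with the sorted rest
    children = [[] for _ in range(len(routers) + 1)]
    for key, group in groupby(merged, key=lambda op: op[0]):
        for op in cancel_matching(list(group)):
            children[bisect_right(routers, key)].append(op)
    return children

buffer = [(5, 1, INSERT), (2, 2, INSERT), (5, 3, DELETE),   # unsorted head
          (1, 5, DELETE), (9, 4, INSERT)]                   # sorted tail
result = empty_internal_buffer(buffer, M=3, routers=[4])
print(result)
```

In the example the insert and the later delete of key 5 cancel, so only three operations are passed on to the two children.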
Basic Buffer-tree
• Buffer-empty of leaf node with K elements in leaves
– Sort buffer as previously
– Merge buffer elements with elements in leaves
– Remove “matching” insert/deletes obtaining K’ elements
– If K’<K then
* Add K-K’ “dummy” elements and insert in “dummy” leaves
– Otherwise
* Place K elements in leaves
* Repeatedly insert block of elements in leaves and rebalance
• Delete dummy leaves and rebalance when all full buffers emptied
Basic Buffer-tree
• Invariant:
Buffers of nodes on path from root to emptied leaf-node are empty

• Insert rebalancing (splits) performed as in normal B-tree
[Figure: node v split into v’ and v’’]
• Delete rebalancing: v’ buffer emptied before fuse of v
– Necessary buffer emptyings performed before next dummy-block delete
– Invariant maintained
[Figure: node v fused with sibling v’]
Basic Buffer-tree
• Analysis:
– Not counting rebalancing, a buffer-emptying of node with X ≥ M elements (full) takes O(X/B) I/Os
⇒ total full node emptying cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Delete rebalancing buffer-emptying (non-full) takes O(M/B) I/Os
⇒ cost of one split/fuse O(M/B) I/Os
– During N updates
* O(N/B) leaf split/fuse
* $O(\frac{N/B}{M/B}\log_{M/B}\frac{N}{B})$ internal node split/fuse
⇒ Total cost of N operations: $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
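The internal-node rebalancing term matches the buffer-emptying term, which can be checked by multiplying the two bounds stated on this slide. The comment relating the count to the general B-tree bound assumes branching parameter $b = M/B$ and leaf parameter $k = B$.

```latex
% Number of internal split/fuse operations during N updates:
%   N \cdot \frac{1}{b k} \log_b \frac{N}{B}  with  b = M/B,\ k = B
%   = \frac{N}{M} \log_{M/B} \frac{N}{B} = \frac{N/B}{M/B} \log_{M/B} \frac{N}{B}.
\begin{align*}
\underbrace{O\!\left(\frac{N/B}{M/B}\log_{M/B}\frac{N}{B}\right)}_{\text{internal node split/fuse}}
\cdot
\underbrace{O\!\left(\frac{M}{B}\right)}_{\text{I/Os each}}
= O\!\left(\frac{N}{B}\log_{M/B}\frac{N}{B}\right).
\end{align*}
```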
Basic Buffer-tree
• Emptying all buffers after N insertions:
Perform buffer-emptying on all nodes in BFS-order
⇒ resulting full-buffer emptyings cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
⇒ empty $O(\frac{N/B}{M/B})$ non-full buffers using O(M/B) I/Os each ⇒ O(N/B) I/Os
• N elements can be sorted using buffer tree in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
Summary/Conclusion: Buffer-tree
• Batching of operations on B-tree using M-sized buffers
– $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/Os per update, amortized
– All buffers emptied in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
• One-dim. range search operations can also be supported in $O(\frac{1}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os amortized
– Search elements handled lazily like updates
– All elements in relevant sub-trees reported during buffer-emptying
– Buffer-emptying in O(X/B+T’/B), where T’ is the number of reported elements
• Using buffer technique persistent B-tree built in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
Buffered Priority Queue
• Basic buffer tree can be used in external priority queue
• To delete minimal element:
– Empty all buffers on leftmost path
– Delete $\frac{1}{4}M$ elements in leftmost leaf and keep in memory
– Deletion of next $\Theta(M)$ minimal elements is free
– Inserted elements checked against minimal elements in memory
• $O(\frac{M}{B}\log_{M/B}\frac{N}{B})$ I/Os every O(M) delete ⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized
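The interplay between the in-memory minimal elements and the lazily updated tree can be sketched as follows. This is an in-memory stand-in, not an external structure: a binary heap plays the role of the buffer tree, and the counter `refills` marks each point where the real structure would empty the leftmost path. The class name and all attribute names are hypothetical.

```python
import heapq
from bisect import insort

class BufferedMinQueue:
    """In-memory stand-in for the buffer-tree priority queue."""

    def __init__(self, M):
        self.quota = max(1, M // 4)   # the M/4 smallest elements kept in memory
        self.memory = []              # sorted list of current minimal elements
        self.external = []            # heap standing in for the buffer tree
        self.refills = 0              # counts the expensive "empty leftmost path" step

    def insert(self, x):
        if self.memory and x < self.memory[-1]:
            insort(self.memory, x)            # belongs among the minimal elements
            if len(self.memory) > self.quota:
                heapq.heappush(self.external, self.memory.pop())
        else:
            heapq.heappush(self.external, x)  # lazy insert into the tree

    def delete_min(self):
        if not self.memory:
            self._refill()
        return self.memory.pop(0)             # raises IndexError if the queue is empty

    def _refill(self):
        # Corresponds to emptying the leftmost path and removing M/4 elements.
        self.refills += 1
        for _ in range(min(self.quota, len(self.external))):
            self.memory.append(heapq.heappop(self.external))

q = BufferedMinQueue(M=8)
for x in (5, 3, 8, 1, 9, 2):
    q.insert(x)
first = q.delete_min()
q.insert(0)                                   # smaller than an element held in memory
rest = [q.delete_min() for _ in range(6)]
print(first, rest, q.refills)
```

Only every `quota`-th deletion triggers a refill; the deletions in between are served from memory, which is the source of the amortized bound.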
Other External Priority Queues
• Buffer technique can be used on other priority queue structures
– Heap
– Tournament tree
• Priority queue supporting update often used in graph algorithms
– $O(\frac{1}{B}\log_2 \frac{N}{B})$ on tournament tree
– Major open problem to do it in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/Os
• Worst case efficient priority queue has also been developed
– B operations require $O(\log_{M/B}\frac{N}{B})$ I/Os
Other Buffer-tree Technique Results
• Attaching $\Theta(B)$-size buffers to a normal B-tree can also be used to improve the update bound
• Buffered segment tree
– Has been used in batched range searching and rectangle intersection algorithms
• Has been used on String B-tree to obtain I/O-efficient string sorting algorithms
Summary/Conclusions: Fund. Data Structures
• B-tree
– O(N/B) space, $O(\log_B N)$ update, $O(\log_B N + \frac{T}{B})$ query
• Weight-balanced B-tree
– $\Omega(w(v))$ updates below v between consecutive operations on v
• Persistent B-tree
– Query in any previous version
• Buffer tree
– Batching of operations to obtain $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ bounds
References
• External Memory Geometric Data Structures
Lecture notes by Lars Arge.
– Section 5
Flow Accumulation
• Flow accumulation on grid terrain model:
– Initially one unit of water in each grid cell
– Water (initial and received) distributed from each cell to lowest
lower neighbor cell (if existing)
– Flow accumulation of cell is total flow through it
• Flow accumulation (in a more general form) is a basic index used in environmental sciences (e.g. to compute drainage networks)
Computing Flow Accumulation
• Process (sweep) points by decreasing height. At each cell:
– Read flow from flow grid and neighbor heights from height grid
– Update flow (flow grid) for downslope neighbors
One sweep ⇒ O(N log N) time algorithm
Flow Accumulation
• Computed for Appalachian Mountains (800km x 800km) by Duke
University environmental researchers
– 100m resolution ⇒ ~64M cells
⇒ ~128MB raw data (~500MB processing)
⇒ 14 days (on 512MB machine)
– ~1.2GB at 30m resolution, ~12GB at 10m, ~1.2TB at 1m
• Problem: Cells of same height distributed over the terrain
⇒ scattered access to flow grid and height grid
⇒ Ω(N) I/Os
I/O-Efficient Flow Accumulation
• Eliminating height grid scattered accesses
– Augment each cell with height of 8 neighbors
• Eliminating flow grid scattered accesses
– Utilize that flow to neighbor cell is only needed when sweep reaches its elevation:
* Distribute flow by inserting element in priority queue with priority equal to neighbor’s height (and grid position)
* Flow of cell obtained using DeleteMin operations
⇒ Turns O(N) grid accesses into O(N) priority queue operations
⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/O algorithm
• Appalachian Mountains in 3 hours!
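The priority-queue formulation of the sweep can be sketched on a small in-memory grid. Several simplifications are assumed here: Python's `heapq` stands in for the external priority queue, heights are read directly from the grid instead of being stored with each cell, and ties between equally low neighbors are broken by grid position. The function name is hypothetical.

```python
import heapq

def flow_accumulation(height):
    """Return the flow accumulation grid for a grid of heights."""
    rows, cols = len(height), len(height[0])
    # Sweep order: decreasing height, ties broken by grid position.
    order = sorted((-height[r][c], r, c) for r in range(rows) for c in range(cols))
    pq = []                                   # entries: (priority of receiving cell, flow)
    flow = [[0] * cols for _ in range(rows)]
    for prio in order:
        _, r, c = prio
        total = 1                             # initial unit of water
        while pq and pq[0][0] == prio:        # DeleteMin: flow sent to this cell
            total += heapq.heappop(pq)[1]
        flow[r][c] = total
        # Find the lowest strictly lower neighbor among the 8 neighbors.
        best = None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and height[nr][nc] < height[r][c]:
                    cand = (height[nr][nc], nr, nc)
                    if best is None or cand < best:
                        best = cand
        if best is not None:
            # Send the flow forward in sweep order instead of writing to the grid.
            heapq.heappush(pq, ((-best[0], best[1], best[2]), total))
    return flow

print(flow_accumulation([[3, 2],
                         [2, 1]]))
```

Every message is addressed to a cell that comes later in the sweep, so when the sweep reaches a cell all flow destined for it sits at the front of the queue and no scattered grid access is needed.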