Transcript Lock held
Data Structures &
Algorithms
Extermal Search
Richard Newman
based on book by R. Sedgewick
Rules of the Game
• Searching very large structures
• Too big to fit into memory
• Avoid sequential search
• Fast random access to data blocks
• Hard disk <-> memory pages
• Support insert/remove/find
Rules of the Game
• External access time dominates
• Orders of magnitude slower
• Minimize “probes” – first access to page
•
Models
• File-name based storage
• Virtual memory
Strategy
• Maintain index to data
• Get reference to item quickly
• Only bring in necessary pages
Indexed Sequential Access
• Keep sorted array with keys and
item references
• Use binary search on array
• Uses only lg N probes
• Problem: Index may be huge
• Won’t fit on single page
Indexed Sequential Access
• Store keys and item references in
fully balanced binary tree
• Internal nodes: keys and page pointers
• External nodes: keys and item pointers
• Page access dominates node access
• Implement M-ary tree
• Now logM N probes – nearly linear
Indexed Sequential Access
001
601
601
513
153
601
373
706
562
737
001
601
601
513
017
601
061
706
107
147
153
601
601
513
176
601
207
706
275
277
373
601
601
513
434
601
513
706
524
526
737
601
601
513
741
601
742
706
766
773
562
601
601
513
574
601
706
736
Indexed Sequential Access
• Sequential key organization +
indexed access
• Hence indexed sequential access
• Used by modern file systems
• Index = directory
• Not good with dynamic structures
• Have to rebuild database!
Indexed Sequential Access
Prop. 16.1: Search in indexed
sequential file requires effectively a
constant number of probes (logM N for
large M), but an insertion can involve
rebuilding the entire index.
B-Trees
• Desire to support dynamic sets
• And still be efficient
• Relax requirement that every node
have exactly M entries
• Rather, require at most M entries
• Still want space efficient
• Require at least M/2 entries
• Except possibly at root (at least one)
B-Trees
• A generalization of 2-3-4 trees
• Sedgewick allows different values
• Not just M/2 <= k <= M items
• Strict version Bayer & McCreight 1970
• Multiway nodes via arrays
• Index, not search structure
• Does not contain items
• Split from bottom up
B-Trees
EIQU
ABCD
FGH
JKLMNOP
RST
V WXYZ
Note that each node has one more
link than number of keys
Keys act to split search
B-Trees
Defn. 16.2: A B-Tree of order M is a
tree that either is empty or comprises knodes, with k-1 keys and k links to
trees representing the intervals
delimited by adjacent keys, with 2≤k≤M
at the root and M/2 ≤ k ≤ M at all other
nodes; all links to empty trees must be
at the same depth.
B-Trees
Check out the animation:
http://www.cs.usfca.edu/~galles/visualiz
ation/BTree.html
Prop. 16.2: A search or insertion in a BTree of order M with N items takes
between logM N and logM/2 N probes – a
constant number for practical purposes.
B-Plus Trees
B-Plus tree:
Add links from leaf to next leaf to make
sort traversal easier, as well as merge
Check out the animation:
http://www.cs.usfca.edu/~galles/visualiz
ation/BPlusTree.html
B-Trees
Prop. 16.3: A B-Tree of order M
constructed from N random items is
expected to have about 1.44N/M
pages.
Extendible Hashing
• Combines features of
• Hashing
• Multi-way tries
• Sequential-access
• Uses just one or two probes for
insert or search
Extendible Hashing
• Search
• Start with leading bits of key
• Index into table (power of 2 size)
• Items stored in pages split when
overflow
• Maintain directory with reference to
page containing item matching a key
Extendible Hashing
• Suppose have 2d disk pages
• Directory of 2d page references
• Use d bits of key to index into directory
• Keep in same page all keys whose first
d bits match, in order
• Sequential search on page
• OK to have multiple pointers to same pg
• Items stored in pages split when
overflow
Extendible Hashing
Defn. 16.3: An extendible hash table of
order d is a directory of 2d references to
pages that each contain up to M items
with keys. The items on each page are
identical in their first k bits, and the
directory contains 2d-k pointers to the
page, starting at the location specified
by the first k bits on the page.
Extendible Hashing
• Table growth
• Page split
• Distribute keys from one page over two
• Decided by first bit where keys differ
• Directory split
• Double the size of the directory, d++
• Do when necessary from page split
• May make multiple pointers to same pg
Extendible Hashing
Prop. 16.4: The extendible hash table
built from a set of keys depends only on
the values of those keys, not upon their
insertion order.
Extendible Hashing
M = 5 keys + item references per page
153
601
601
513
176
601
513
706
601
706
Now what? The page is full!
Split!
Insert:
601
153
513
706
176
773
Extendible Hashing
Split page, so split directory…
0x
153
176
373
1x
513
601
706
773
742
773
Now we can insert 773
Insert:
601
153
513
706
176
773
742
373
524
Extendible Hashing
Split page again, need to split directory…
00x
01x
10x
11x
153
176
373
275
373
374
601
706
742
773
776
Now we can insert 524
513
524
Now what?
Insert:
…
524
776
275
374
007
Extendible Hashing
Split page …
00x
01x
10x
11x
007
153
176
153
176
but no need to split directory
(2 pointers)
275
Insert:
373
…
374
007
601
706
742
773
776
Now we can insert 007
513
524
Concurrency Issues
• May have multiple processes
• All accessing same structure
• At the same time….
• No problem if two read same data
• Big problem if
• Read overlaps write
• Write overlaps write
Locking
• One solution is locking
• Prevent simultaneous access that
could cause problems
• Process requests lock
• Has to wait until lock is granted
• Once lock held, can proceed
Locking
• Single lock for whole data structure is
too restrictive
• Prevents all other accesses!
• So lock parts of structure needed
• Nodes along path
• Unlock as soon as no longer needed
• Allow others to access
Locking
• On insertion or removal
• May have to change root
• So can’t release until known
• Types of locks
• Depend on operation
• Reads are compatible
• Writes are not
Locking
• Read lock/write lock
• Use read lock on search
• Use write lock on insert/remove
• Lock compatibility:
• Read OK with read
• Read NOT OK with write
• Write NOT OK with write
Locking
• Read lock/write lock
• Use read lock on search
• Use write lock on insert/remove
Lock
requested
• Lock compatibility matrix:
Lock held
Requested
None
Read
Write
Read
OK
OK
X
Write
OK
X
X
Locking
• Add Alpha lock = tentative write
• Use alpha lock on insert/remove
• Until known if write needed
• Then write lock as required
Lock
requested
Lock held
Request
None
Read
Alpha
Write
Read
OK
OK
OK
X
Alpha
OK
OK
X
X
Write
OK
X
X
X
Locking
• Always lock from top to bottom
• Else danger of deadlock
• Only unlock when know it is safe
• If insert and node not full, then
• Can unlock the parent
• Process with alpha lock on a node
can request write lock
• Gets it when no more read locks
Summary
• Indexed Sequential Access
• B-Trees
• Extendible Hashing
• Concurrency issues