Lecture Notes on Chapter 10 - B

Download Report

Transcript Lecture Notes on Chapter 10 - B

External Storage
Chapter 10
Plus Considerable Instructor Notes from Other
Sources.
Introduction – External Storage
• One of the most important topics we will
discuss.
• We know about multi-way trees, such as 2-3-4
Trees, and these concepts generalize.
• But when it comes to external storage, there
are additional major considerations we
must consider.
2
Introduction – External Storage
• External Storage organization and access (not the
same!!!)
• External Storage – organization and retrieval are
much different than those techniques applied to
primary memory data structures and techniques!!!
3
Accessing External Data
A Very Different Ballgame!
The data structures so far: entirely in RAM (or primary memory)
RAM – ‘random access memory’ is many orders of magnitude quicker than ‘random access disk access.’
MANY ORDERS OF MAGNITUDE!!
.
•
RAM is very limiting indeed! Data is NOT persistent!
–
–
•
Advantage of Data Structures in RAM::
–
–
•
Amount of data limited by size of primary memory typically shared with many other concurrent users;
Data, of course, is volatile.
For data structures, the Speed of RAM is very enticing. But for real IS / IT applications, many applications
require external storage in the form of a database system or a file system.
Advantages of RAM for in-memory data structures are gone!
Disadvantage:
–
–
Persistence, number of data items (records) etc.
Amount of memory available to your process (program)
•
•
Significant IS applications in real world very often require lots of data to process.
External disk files - only practical way.
•
 Disk processing is really the mainstay of information processing whether the files are legacy
type, database files, etc
4
Accessing External Data Very Slow Access
• RAM Access: any location (byte addressable) in memory
accessed equally fast: hence “RAM.’
• Disks:
– Data is organized along circular (concentric) circles
• Tracks 000 to track 2999 for example (outer to inner)
• Disks come in various architectures
• Some have different storage capacities in different tracks.
– Called constant angular velocity and constant linear velocity disks!
LOOK UP!
• Traditional disks (CAV) have same number of bits/track; different
densities! (Velocity of outer tracks is much greater than inner tracks)
• Disk organization:
– Concentric platters, access arm(s), rotational speeds,
seek time, data transfer times, etc.
– Disk Controllers – small computers executing disk
5
commands; shift registers, etc.
Disk Access – Four Key Components
1. Seek time
– Movement of access arm to correct cylinder
– The most significant factor in access
– May take 20 msec
• 2. Head Select – electronic speeds; negligible; fastest component
• 3. Rotational Delay – half speed of rotation; maybe 5-10 msec
• 4. Data Transfer – pretty quick too;
– Tracks divided into fixed-sized sectors. Sector size predetermined at
factory. (Some are marked as defective.)
• Disk access times of 10 or less msec are common.
• Important to note: disk access times are being reduced all the
time, but so are main memory access time – and at a faster rate.
•  Book points out that the disparity continues to grow!
6
Sequential File Organization and Access
• Looking for a specific record in a sequential file:
– While one can search for a specific record in a
sequential file, for a large file, this is very painful.
– This is called sequential access.
–
– Individual record searches are NOT practical, but can be
done.
– Entire file might need to be searched!
– File is organized sequentially.
7
More on Sequential Files
• Sequential files are great for reporting, printing / displaying
information, etc., but not for on-line retrieval of records!!
• Insertions? Book does a poor job here.
– While it is true that conceptually inserting a record into an
existing file would require all records to be ‘moved’ forward
(assuming the blocks are full), in practice this is often NOT
done because the file would not be available for online
access during this physical update!
• Sequential files are NOT USED for online processing.
• Transactions may be batched up during the day and the file is
typically updated one time and offline
– (Note: no one can access the file during updating) prior to
generating end-of-day reports, exception reporting, summary
reports, inventory reports, sales reports, statistical reports,
listings, etc. if required...
8
Still More on Sequential Files
• Need quick retrieval? Sequential Files won’t cut it.
• If we have a need for near immediate response time
– we need a different kind of file organization – one that
provides for fast access too. (These go together)
•  Sequential Files are super – for the right purposes.
– Payroll, budgets, listings, rosters, accounts receivable,
payables, inventories, hosts and hosts of items –
– but NOT any kind of on-line processing (retrieval, updates)
– Can hold tremendous amounts of data.
– Often used for backup as well as master files. 9
B-Trees and Indexing
• Overall objective of B-Trees and similar
indexing schemes is to support:
–
–
–
–
Fast search
Fast insertion
Fast deletion
Fast access
 External Files
10
B-Trees
• Need a different kind of tree than 2-3-4 trees.
• 2-3-4 trees:
– Fine for in-memory operations, but volume and
persistency needs limit the applicability to large files.
• For large files we need more data items (records) per node
so when we retrieve them from disk, we retrieve into RAM
(and store to disk) more records / block.
• Idea is to have few disk retrievals (much time) and then
do sequential searches of retrieved blocks (nodes) in RAM
for the desired record– very quick.
• (Recall 2-3-4 trees: We retrieved a node and then
searched it sequentially to determine / find the item
we
11
desired. But we were limited by size and availability)
B-Trees – One Block per Node
• B-Trees are designed to have many children /node.
– Remember: 2-3-4 trees are B-trees of order 4.
– This structure will not support corporate file
needs in the industrial sector at all!
• B-Tree - much like the 2-3-4 tree but has many
more nodes and many more links to children,
• Keeps the tree height small  few disk I/Os
– (Nomenclature: If a node has (book example) 17 child
pointers, the tree is said to be an order 17 B-tree.)
12
B-Trees – Searching
• The number of levels in a B-Tree is relatively small and the
number of records in a node is relatively large
– Implies downward searching is fast (few levels) and
– Searching node sequentially is quick – once node is retrieved.
•  Typically a block (node) is randomly accessed based on some
kind of key or index, read into memory, and searched sequentially
– If record found, we are done.
• If records exist on all levels, (recall 2-3-4 trees), if the search for
the desired record gets a ‘high’ result, then the record is not in this
node, and the access method retrieves the next block / node using
another child pointer in the block.
• Record found or leaf reached w/o finding record. Search
13 top down.
B-Trees – Insertions - 1
• Want nodes reasonably full
– Facilitates likelihood of finding correct record
increased once we retrieve the node (block)
• To Insert:
– A node split divides the data items ‘equally’:
• half go to a newly created node and half remain in old node.
– Node splits performed bottom up, unlike 2-3-4
trees where they were split going top down.
14
B-Trees – Insertions – 2
• Book goes into a good example to show how a new
record is inserted. Do go through this. Will see.
• Approach is pretty simple, though.
• Starting with a leaf node, we will fill it up
20
40
60
80
• If we now add a 70 to this order-five B-Tree:
– This causes a node split, and the record numbers (only
keys are shown) indicates that 60 is the middle number
and is thus promoted upward, as we would expect.
15
B-Tree Insertion - 3 Example (Book)
So, we ‘simply have’ the resultant B-Tree.
60
20
40
70
80
Pretty straightforward.
Adding a 10 and 30 causes no problems
They are added to the lower left leaf node.
Note: records are maintained in an ascending (ordered) order.
10
20
30
40
16
B-Tree Insertion -4 Example (Book)
• But when we want to now add a 15 to this
leaf (remember, we insert from the bottom
up), we will have a node split again.
• The record keys are 10, 15, 20, 30, and 40
such that 20 is the middle and is promoted
up to its parent, and the arrangement now
looks like:
17
B-Trees – Insertion -5 Algorithm continued
The 20 is the middle record; the node is split evenly with the middle
record promoted up. Records in the node (block) are kept sequential.
20
10
15
60
30
40
70
80
The process continues. Ultimately, the root node above will
also become full, and, using the exact same algorithm, the tree
grows by one level – upwards!
18
B-Trees – Insertion -6 Algorithm
• Clearly, larger nodes means fewer levels needed;
implies more records within a node,
• This is both good and bad.
– Larger blocks are retrieved from disk but more RAM
(buffer space) is needed to hold the larger node in
your process area.
– But can sequentially search more items per node…
•  Note too: no node except the root will ever
be less than half full.
19
B-Tree Efficiency
• Very fast organization and access for retrievals / updates
• Because there are so many records / node, the number of
levels is relatively small.
– If we have a B-Tree of order 9, the height of tree is somewhat
less than log9n; (log base 9 of n) that is 9 raised to what power
equals n.
– Number is going to be one more than the max height of the tree.
– Example: Let n = 500,000 recs. 96= 531,441. Tree = six levels.
– This means that as a max, only six accesses are necessary to
find the correct node in the file of 500,000 records.
• Book: at 10 msec /access,  60 msec to get the right block (node)
(internal comparisons and searches are negligible re to disk accesses).
• Cannot even compare this to those figures of retrieval of a record in a
sequential file.
20
B-Tree Efficiency – 2
• Remember: no free lunch.
• Book hedges on size of the node; and it is NOT free!
• Biggest advantage of B-Trees is for adding and
deleting records. This is very significant!
– Of course we need to search to find the correct node into
which we ‘insert’ or from which we ‘delete.’
• Remember too, the vast majority of the time, in a BTree a node will not have to be split!
– Splitting the node requires time…and disk accesses.
• Records can be accessed both randomly and
sequentially.
21
Variations of B-Trees Implementation
• Important to note that in some implementations
of B-Trees, only the leaf nodes contain records
• This is a B+ Tree. Extremely important.
• In this scenario, non-leaf nodes only contain keys
and block numbers (indices, if you will, to nodes)
– Higher level nodes only contain keys / block numbers
and thus contain many ‘more’ keys / block!
– This is what is done in VSAM! (not mentioned in your book).
– VSAM is old but still prevails!!
– (IBM – Virtual Storage Access Method)
22
Supplementary Information
A B-tree is a tree data structure that keeps data sorted and allows searches,
sequential access, insertions, and deletions in logarithmic amortized time.
The B-tree is a generalization of a binary search tree in that a node can have
more than two children.
Unlike self-balancing binary search trees, the B-tree is optimized for systems
that read and write large blocks of data. It is commonly used in databases and
filesystems.
============
In B-trees, internal (non-leaf) nodes can have a variable number of child nodes
within some pre-defined range.
When data is inserted or removed from a node, number of child nodes changes.
In order to maintain the pre-defined range, internal nodes may be joined/split.
Because a range of child nodes is permitted, B-trees do not need re-balancing as
frequently as other self-balancing search trees, such as Red Black trees,
but may
23
waste some space, since nodes are not entirely full
Supplementary Information
A B-tree is kept balanced by requiring that all leaf nodes are at the same depth.
This depth will increase slowly as elements are added to the tree, but an increase in
the overall depth is infrequent, and results in all leaf nodes being one more node
further away from the root.
B-trees have substantial advantages over alternative implementations when node
access times far exceed access times within nodes.
This usually occurs when nodes are in secondary storage like disk drives.
By maximizing the number of child nodes within each internal node, the height of the
tree decreases and the number of expensive node accesses is reduced.
---------------------------------In addition, rebalancing the tree occurs less often.
The maximum number of child nodes depends on the information that must be
stored for each child node and the size of a full disk block or an analogous size in
secondary storage.
24
Supplementary Information: B+ and B* Trees
The term B-tree may refer to a specific design or it may refer to a general class of
designs.
In the narrow sense, a B-tree stores keys in its internal nodes but need not store
those keys again in the records at the leaves.
The general class includes variations such as the B+-tree and the B*-tree.
In the B+-tree, copies of the keys are stored in the internal nodes; the keys and
records are stored in leaves; in addition, This is IBM’s VSAM – KSDS. (Indexed
Sequential File Organization)
Leaf node may include pointer to next leaf node to speed sequential access
------------------------------The B*-tree balances more neighboring internal nodes to keep the internal nodes
more densely packed.
25
For example, a non-root node of a B-tree must be only half full, but a non-root
node
of a B*-tree must be two-thirds full.
Supplementary Information: B+ and B* Trees
A B+ tree or B plus tree is a type of tree which represents sorted data in a way
that allows for efficient insertion, retrieval and removal of records, each record of
which is identified by a key.
It is a dynamic, multilevel index, with maximum and minimum bounds on the
number of keys in each index segment (usually called a "block" or "node").
 In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of
the tree; only keys are stored in interior nodes.
The primary value of a B+ tree is in storing data for efficient retrieval in a blockoriented storage context—in particular, filesystems.
This is primarily because unlike binary search trees, B+ trees have very high fanout
(typically on the order of 100 or more), which reduces the number of I/O operations
to the next level down - required to find an element in the tree.
26
Supplementary Information: B+ and B* Trees
The leaves (the bottom-most index blocks) of the B+ tree are often linked to one
another in a linked list; this makes range queries or an (ordered) iteration through the
blocks simpler and more efficieny.
This does not substantially increase space consumption or maintenance on the tree.
This illustrates one of the significant advantages of a B+-tree over a B-tree; in a Btree, since not all keys are present in the leaves, such an ordered linked list cannot be
constructed.
A B+-tree is thus particularly useful as a database system index, where the data
typically resides on disk, as it allows the B+-tree to actually provide an efficient
structure for housing the data itself.
-----------------If nodes of the B+ tree are organized as arrays of elements, then it may take a
considerable time to insert or delete an element as half of the array will need to be
shifted on average, since the data is organized sequentially.
To overcome this problem, elements inside a node can be organized in a binary
tree or a B+ tree instead of an array.
27
Different Kinds of Trees
•
Binary trees: Binary search tree (BST) · Van Emde Boas tree · Cartesian tree · Top Tree ·
T-tree
•
Self-balancing binary search trees: Red-black tree · AVL tree · AA tree · Splay tree ·
Scapegoat tree · Treap
•
B-trees: B+ tree · B*-tree · UB-tree · 2-3 tree · 2-3-4 tree · (a,b)-tree · Dancing tree ·
Htree · Bx-tree
•
Tries: Suffix tree · Radix tree · Ternary search tree
•
Binary space partitioning (BSP) trees: Quadtree · Octree · kd-tree (implicit) · VP-tree
•
Non-binary trees: Exponential tree · Fusion tree · Interval tree · PQ tree · Range tree ·
SPQR tree
•
Spatial data partitioning trees: R-tree · R+ tree · R* tree · X-tree · M-tree · Segment tree ·
Fenwick tree · Hilbert R-tree
•
Other trees: Heap · Hash tree · Finger tree · Metric tree · Cover tree · BK-tree · Doublychained tree
28
Databases
A Database is a collection of data organized in a fashion that facilitates updating,
retrieving, and managing the data.
– Can consist of anything, including, but not limited to names, addresses, pictures, and
numbers. Databases are commonplace and are used everyday.
• Very common for a database to contain millions of records requiring many
gigabytes of storage.
• Because databases cannot typically be maintained entirely in memory, B-trees are
often used to index data and provide fast access.
• Searching an unindexed and unsorted database containing n key values will have
a worst case running time of O(n);
• If same data is indexed with a B-tree, same search operation will run in O(log n).
•
29
Indexed Sequential File Organization
• This type of file arrangement is called ‘indexed sequential’ because it
contains an index file for random access (searching, inserting, deleting,
etc) as well as an ordering of indices with pointers to support sequential
access, for reports and a host of standard business operations.
• In VSAM (Virtual Storage Access Method) there are levels of indices, that is
indexes for the indexes (called index sets / sequence sets)
• Indexed Sequential Files in VSAM are called Key Sequenced Data Sets and the
structure of the files are B+ Trees – (that is, data records are at the leafs;
indexes/keys are in the upper nodes: sequence sets and, if needed, index sets)
• Data records are stored in what IBM calls Control Areas and Control
Intervals.
• In IBM parlance, a node split is a control interval split. Same idea!
• Records are inserted bottom up as in a B-Tree.
30
Indexed File
• The index consists of small records of keys and pointers;
– Blocks of Records (control intervals) are indexed by the keys;
• An index may point to the highest key within a control interval.
– Pointers ‘point’ to a control interval (block) where the record with a
desired value (key) may be located.
•
• Theoretically, a random access requires two accesses at least:
one for the index and one for the node of records!
– In truth, the indexed file is usually brought into memory during these
operations and maintained in a cache store to facilitate fast retrieval.
• Here is what the VSAM organization looks like for Key
Sequenced Data Sets (KSDSs):
31
Pointers have values and block numbers
INDEX COMPONENT
…
INDEX SET
SEQUENCE
SET
...
DATA COMPONENT
CONTROL
INTERVALS
...
CONTROL AREA
CONTROL AREA
CONTROL AREA
32
Key values plus pointers to blocks
Note the points to support sequential
I1
operations.
INDEX
SET
62
9/S1 S2
D1
I2
FREE
SEQUENCE SETS
S1
3 9
D1 D2
KEY VALUES EXTREMELY EXAGGERATED!!
S2
36 62
D3 D4
FREE
D2
S3
FREE
D3
D4
36
1
3
FREE
5
9
FREE
35
CONTROL INTERVALS
FREE
42 43 62 FREE
CONTROL INTERVALS
CONTROL AREA
33
Index Sequential – more =- VSAM!!
•  So, an indexed sequential is organized ‘Indexed Sequentially’
and can be accessed both ‘sequentially’ and ‘randomly.’
– Organized by a record key (unique)
• Random Access of Records: Records can be accessed randomly
(via the indices (record key)) and sequentially via the nodes and
pointers.
• Sequential Access: Only the leaf nodes contain records, but in
turn, they point to the next node that contains the next group of
sequential records in order to support sequential processing.
• Random Access: does not need those pointers to the next node;
search proceeds as previously described using index sets and sequence
sets and pointers to control intervals.
34
Directories
• OK. Nice, but where’s the catch.
• Indexed Sequential File, there is No 1-1 correspondence between
indices and actual records. IS a correspondence between indices and
nodes / control intervals in IBM.
• Sometimes we need a directory.
• A one-to-one correspondence between an index record and a data
record, we call the index file a directory.
• (Think: telephone directory)
• Directories are VERY fast, but incur a lot of overhead to maintain this
1-1 correspondence between indices and data records...
• May be perfect for real time applications!!
35
• In practice (not in book), most index files are not
organized as ;
• Rather, they contain the largest key of a control interval
(node).
• Then, in looking for a record, we search <= index record
which will point us to a node where the desired record
may/may not be.
– This reduces maintenance on the index file, which would
slow down insertions / deletions dramatically.
36
Indexed File – Alternate Keys
• Now, in indexed sequential operations, we may have indexes of
more than one key.
• Perhaps we wish to have random access based on ssan (record
key?) as well as, perhaps, a secondary key, such as an account
number (unique), and, perhaps we also need to be able to retrieve
on last name (non-unique secondary key).
• We can do it all with additional indices!
• However, each index must be maintained and updated
whenever we add / delete records.
• Thus significant overhead. But if it is needed, it is needed.
• Thus there are decisions practitioners must make: Yes, we can
support a client’s being able to retrieve on any number of keys,
but the price may be high, and the complexity may also37 be high.