Physical Design - Dr. Hong-Mei Chen


The Database Design and
Implementation Process
Phase 1: Requirements Collection and Analysis
Phase 2: Conceptual Database Design
Phase 3: Choice of DBMS
Phase 4: Data Model Mapping (Logical Database
Design)
Phase 5: Physical Database Design <= You are
here!
Phase 6: Database System Implementation and
Tuning
Physical Design Inputs
• Logical design from Phase 4
• Intended use of the database: applications, and
queries & transactions (Q&T)
• Expected frequency of invocations of Q&T
• Timing constraints of these Q&T
• Expected frequencies of update operations
• “Uniqueness” of attributes: for designing access
path (indexes)
Physical DB design/Tuning
Decisions
• Storage level decisions:
– file organization, size of table, record size, block size, I/O and
device, DBMS functions optimization
• Deciding on indexing
• *Denormalization & view materialization
• *Tuning queries
– make use of indexes
– avoid joins and correlated subqueries when possible ==> break the
inner query out into a separate query
– the order of the tables can affect join processing
– defining views for every possible application is overkill
– UNION is often better than “OR” in the inner query
– write queries to match how the individual query optimizer works
Primary File Organization
Sequential: heap
Indexed: B-tree, B+, B* trees, ISAM, VSAM
Hash file
Indexes
• Primary index
– physical key order (unique)
• Clustering index
– Closeness properties
– physically ordered on non-unique, non-key field
– non-dense index (one entry for each distinct value of the indexing
field, not one for every record)
• Secondary index
– non-ordering field of the file
– dense index: one entry for each record
• Either primary or clustering (only one for a file); can
have many secondary indexes
• Single-level vs. Multi-level indexes
Disk Storage Devices
• Preferred secondary storage device for high
storage capacity and low cost.
• Data stored as magnetized areas on magnetic
disk surfaces.
• A disk pack contains several magnetic disks
connected to a rotating spindle.
• Disks are divided into concentric circular tracks
on each disk surface.
Disk Storage Devices -2
• Because a track usually contains a large amount
of information, it is divided into smaller blocks
or sectors.
• The division of a track into sectors is hardcoded on the disk surface and cannot be
changed.
• A track is divided into blocks. The block size B
is fixed for each system. Typical block sizes
range from B=512 bytes to B=4096 bytes.
• Whole blocks are transferred between disk and
main memory for processing.
Disk Storage Devices - 3
Disk Storage Devices - 4
• A read-write head moves to the track that
contains the block to be transferred.
• Disk rotation moves the block under the read-write head for reading or writing.
• Reading or writing a disk block is time
consuming because of the seek time s and
rotational delay (latency) rd.
• Double buffering can be used to speed up the
transfer of contiguous disk blocks.
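The cost of one random block access can be estimated from the seek time s and the rotational delay rd mentioned above. The sketch below is illustrative only: the function name and the drive parameters are hypothetical, and it adds a block transfer time term that the slide does not mention. It assumes the average rotational delay is half a revolution.

```python
def avg_block_access_ms(seek_ms, rpm, block_bytes, transfer_bytes_per_ms):
    """Estimate the time to read one random block: s + rd + transfer time."""
    rotation_ms = 60_000.0 / rpm        # time for one full revolution
    rd_ms = rotation_ms / 2             # on average, wait half a revolution
    transfer_ms = block_bytes / transfer_bytes_per_ms  # assumption: not in the slides
    return seek_ms + rd_ms + transfer_ms

# Hypothetical drive: 8 ms seek, 7200 rpm, 4096-byte blocks, 100,000 bytes/ms.
estimate = avg_block_access_ms(8.0, 7200, 4096, 100_000)
print(f"about {estimate:.2f} ms per random block access")
```

With numbers like these, the seek and rotational terms dominate, which is why reading contiguous blocks is much cheaper than reading scattered ones.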
Disk Storage Devices - 5
Records
• Fixed and variable length records
• Records contain fields which have values of a
particular type (e.g., amount, date, time, age)
• Fields themselves may be fixed length or
variable length
• Variable length fields can be mixed into one
record: separator characters or length fields are
needed so that the record can be “parsed”.
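A minimal sketch of the "length field" approach to parsing variable-length fields. The 2-byte little-endian length prefix and the function names are assumptions chosen for illustration; a real DBMS record format will differ.

```python
import struct

def pack_record(fields):
    """Store each field as a 2-byte length followed by the field bytes."""
    out = bytearray()
    for field in fields:
        out += struct.pack("<H", len(field))  # length field
        out += field
    return bytes(out)

def unpack_record(buf):
    """Parse a packed record back into its list of fields."""
    fields, pos = [], 0
    while pos < len(buf):
        (length,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        fields.append(buf[pos:pos + length])
        pos += length
    return fields

record = pack_record([b"Smith", b"", b"Honolulu"])
print(unpack_record(record))
```

Unlike separator characters, length fields allow any byte value inside a field, at the cost of two extra bytes per field.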
Blocking
• Blocking: refers to storing a number of records
in one block on the disk.
• Blocking factor (bfr) refers to the number of
records per block.
• There may be empty space in a block if an
integral # of records do not fit in one block.
• Spanned records: records that exceed the size of a block, or the
space left in one, and hence span a number of blocks.
Files of Records
• Databases are stored as files on disks.
• A file is a sequence of records, where each
record is a collection of data values (or items).
• A file descriptor (or file header) includes
information that describes the file, such as the
field names and their data types, and the
addresses of the file blocks on disk.
• Records are stored on disk blocks.
• The blocking factor bfr for a file is the (average)
number of file records stored in a disk block.
• A file can have fixed-length records or variable-length records.
Unordered Files
• Also called a heap or a pile file.
• New records are inserted at the end of the file.
• To search for a record, a linear search through
the file records is necessary (O(n)).
• This requires reading and searching half the file
blocks on the average, and is hence quite expensive.
• Record insertion is quite efficient.
• Reading the records in order of a particular field
requires sorting the file records (O(nlogn)).
Ordered Files
• Also called sequential files. Records are kept sorted by
the values of an ordering field.
• Insertion is expensive: records must be inserted in the
correct order
– it is common to keep a separate unordered overflow
(transaction) file for new records to improve insertion efficiency
– this is periodically merged with the main ordered file.
• A binary search can be used to search for a record on its
ordering field value
– requires reading and searching about log2(b) blocks on the
average, where b is the number of file blocks
– a big improvement over linear search!
• Reading the records in order of the ordering field is
quite efficient.
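The difference between linear and binary search can be illustrated by counting block reads on a simulated file. This is a sketch under simple assumptions: a "block" is a Python list of keys, every block read costs one access, and the helper names are hypothetical.

```python
def make_blocks(keys, bfr):
    """Pack keys into blocks of bfr records each."""
    return [keys[i:i + bfr] for i in range(0, len(keys), bfr)]

def linear_search(blocks, key):
    """Scan blocks in file order; return (found, blocks_read)."""
    for reads, block in enumerate(blocks, start=1):
        if key in block:
            return True, reads
    return False, len(blocks)

def binary_search(blocks, key):
    """Binary search on a file ordered by key; return (found, blocks_read)."""
    lo, hi, reads = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]
        reads += 1                      # one block access per probe
        if key < block[0]:
            hi = mid - 1
        elif key > block[-1]:
            lo = mid + 1
        else:
            return key in block, reads
    return False, reads

blocks = make_blocks(list(range(3000)), bfr=3)   # 1000 ordered blocks
print(linear_search(blocks, 2999))
print(binary_search(blocks, 2999))
```

The same blocks are used for both searches only to keep the sketch short; a heap file would not be sorted, but its linear search cost is counted the same way.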
Average Access Times
• The following table shows the average access
time to access a specific record for a given type
of file
Hashed Files
• Hashing for disk files is called external hashing
• The file blocks are divided into M equal-sized
buckets, numbered buck_0, buck_1, ..., buck_(M-1).
– Typically, a bucket corresponds to one (or a fixed #
of) disk block(s).
• One of the file fields is designated to be the
hash key.
• The record with hash key value K is stored in
bucket i, where i=h(K), and h is the hashing
function.
– E.g. i = PK mod M
• Search is very efficient on the hash key. Why?
Hashed Files - 2
Hashed Files - 3
• What happens when a new record hashes to a
bucket that is already full?
• Called a collision.
• What does a collision mean about your hashing
function?
• What to do about collision?
– An overflow file is kept for storing such records.
– Overflow records that hash to each bucket can be
linked together.
Hashed Files - 4
• There are numerous methods for collision resolution,
including the following:
– Open addressing: Proceeding from the occupied position
specified by the hash address, the program checks the
subsequent positions in order until an unused (empty) position
is found.
– Chaining: For this method, various overflow locations are kept,
usually by extending the array with a number of overflow
positions. In addition, a pointer field is added to each record
location. A collision is resolved by placing the new record in an
unused overflow location and setting the pointer of the occupied
hash address location to the address of that overflow location.
– Multiple hashing: The program applies a second hash function
if the first results in a collision. If another collision results, the
program uses open addressing or applies a third hash function
and then uses open addressing if necessary.
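Open addressing, the first method above, can be sketched with an in-memory table. This is a simplified illustration, not a disk-based implementation: each slot holds one key, the hash function is K mod M, and the function names are hypothetical. Deletion is not handled (it would need a "deleted" marker so that probe sequences are not cut short).

```python
def insert_open_addressing(table, key):
    """Insert key with h(K) = K mod M and linear probing; return the slot used."""
    m = len(table)
    start = key % m
    for step in range(m):
        slot = (start + step) % m       # check subsequent positions in order
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

def find_open_addressing(table, key):
    """Return the slot holding key, or None if it is not in the table."""
    m = len(table)
    start = key % m
    for step in range(m):
        slot = (start + step) % m
        if table[slot] is None:
            return None                 # an empty slot ends the probe sequence
        if table[slot] == key:
            return slot
    return None

table = [None] * 7
for k in (10, 17, 24):                  # all three hash to position 3
    print(k, "->", insert_open_addressing(table, k))
```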
Chaining
Example
Hashed Files - 5
• To reduce overflow records, a hash file is typically kept
70-80% full.
• The hash function h should distribute the records
uniformly among the buckets
– otherwise, search time will be increased because many overflow
records will exist.
• Main disadvantages of static external hashing:
– Fixed number of buckets M is a problem if the number of
records in the file grows or shrinks.
– Ordered access on the hash key is quite inefficient (requires
sorting the records).
Hashed Files - 6
Hashing - Exercise
• A PARTS file with Part# as hash key includes records
with the following Part# values:
2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
1620, 2428, 3943, 4750, 6975, 4981, 9208.
• The file uses 8 buckets, numbered 0 to 7. Each bucket is
one disk block and holds two records. Show how you
would load these records into the file in the given order
using the hash function h(K)=K mod 8.
• BONUS: Calculate the average number of block
accesses for a random retrieval on Part#.
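One way to check a hand-worked answer to this exercise is a short simulation. The sketch below assumes that a record arriving at a full bucket goes onto a per-bucket overflow chain, and that reaching the n-th record on a chain costs one extra block access per hop. The function names are hypothetical.

```python
PART_NUMBERS = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
                1620, 2428, 3943, 4750, 6975, 4981, 9208]

def load_hash_file(keys, num_buckets=8, bucket_capacity=2):
    """Load keys in order using h(K) = K mod num_buckets."""
    buckets = [[] for _ in range(num_buckets)]
    overflow = [[] for _ in range(num_buckets)]   # overflow chain per bucket
    for key in keys:
        i = key % num_buckets
        if len(buckets[i]) < bucket_capacity:
            buckets[i].append(key)
        else:
            overflow[i].append(key)               # collision: bucket is full
    return buckets, overflow

def average_block_accesses(buckets, overflow):
    """1 access for a record in its home bucket, 1 + n for the n-th overflow record."""
    total = count = 0
    for home, chain in zip(buckets, overflow):
        total += len(home)
        count += len(home)
        for position in range(1, len(chain) + 1):
            total += 1 + position
            count += 1
    return total / count

buckets, overflow = load_hash_file(PART_NUMBERS)
for i, (home, chain) in enumerate(zip(buckets, overflow)):
    print(i, home, "overflow:", chain)
print("average block accesses:", average_block_accesses(buckets, overflow))
```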
Indexes as Access Paths
• A single-level index is an auxiliary file that
makes it more efficient to search for a record in
the data file.
• The index is usually specified on one field of the
file (although it could be specified on several
fields).
• One form of an index is a file of entries
<field value, pointer to record>
which is ordered by field value
• The index is called an access path on the field.
Indexes as Access Paths - 2
• The index occupies considerably less space than
the data because its entries are much smaller.
• A binary search on the index yields a pointer to
the file record.
• Indexes can also be characterized as dense or
sparse.
– A dense index has an index entry for every search
key value (and hence every record) in the data file.
– A sparse (or nondense) index, on the other hand, has
index entries for only some of the search values
Indexes as Access Paths - 3
Example: Given the following data file:
EMPLOYEE(NAME, SSN, ADDRESS, JOB, SAL, ... )
Suppose that:
record size R=150 bytes
block size B=512 bytes
r=30000 records
Indexes as Access Paths - 4
Then, we get:
– blocking factor Bfr = B div R = 512 div 150 = 3 records/block
– number of file blocks b = ceil(r/Bfr) = ceil(30,000/3) = 10,000 blocks
For an index on the SSN field:
– assume the field size V_SSN = 9 bytes
– assume the record pointer size P_R = 7 bytes. Then:
– index entry size R_I = (V_SSN + P_R) = (9 + 7) = 16 bytes
– index blocking factor Bfr_I = B div R_I = 512 div 16 = 32 entries/block
– number of index blocks b_I = ceil(r/Bfr_I) = ceil(30,000/32) = 938 blocks
– binary search needs ceil(log2 b_I) = ceil(log2 938) = 10 block accesses
This is compared to an average linear search cost of:
(b/2) = 10,000/2 = 5,000 block accesses
If the file records are ordered, the binary search cost would be:
ceil(log2 b) = ceil(log2 10,000) = 14 block accesses
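The arithmetic of this example can be reproduced with a few lines of code. The variable names below mirror the symbols in the example; fractional block counts and search depths are rounded up. Note that the linear and ordered-file search costs are computed from b, the number of file blocks, not from r, the number of records.

```python
import math

B, R, r = 512, 150, 30_000      # block size, record size, number of records
V_SSN, P_R = 9, 7               # SSN field size and record pointer size (bytes)

bfr = B // R                                   # records per block
b = math.ceil(r / bfr)                         # file blocks
R_I = V_SSN + P_R                              # index entry size
bfr_I = B // R_I                               # index entries per block
b_I = math.ceil(r / bfr_I)                     # index blocks (dense index)
index_binary = math.ceil(math.log2(b_I))       # binary search on the index
linear_avg = b / 2                             # average linear search on the file
file_binary = math.ceil(math.log2(b))          # binary search on an ordered file

print(bfr, b, R_I, bfr_I, b_I, index_binary, linear_avg, file_binary)
```

Searching through the index also needs one more block access to fetch the data record that the index entry points to.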
Types of Single-Level Indexes
• Primary Index
– Defined on an ordered data file
– The data file is ordered on a key field
– Includes one index entry for each block in the data
file; the index entry has the key field value for the
first record in the block, which is called the block
anchor
– A primary index is a nondense (sparse) index, since it
includes one entry per disk block of the data file (the
key of the block's anchor record) rather than one entry
for every search value.
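A lookup through a sparse primary index can be sketched as follows. The data blocks are modeled as sorted Python lists, the index holds one (anchor key, block number) entry per block, and the function names are hypothetical. The search finds the last anchor that is less than or equal to the key, then reads only that block.

```python
from bisect import bisect_right

def build_primary_index(blocks):
    """One index entry per data block: (block anchor key, block number)."""
    return [(block[0], number) for number, block in enumerate(blocks)]

def lookup(index, blocks, key):
    """Return the block number holding key, or None if key is absent."""
    anchors = [anchor for anchor, _ in index]
    pos = bisect_right(anchors, key) - 1    # last anchor <= key
    if pos < 0:
        return None                         # key is smaller than every anchor
    block_number = index[pos][1]
    return block_number if key in blocks[block_number] else None

blocks = [[2, 5, 8], [12, 15, 18], [21, 24, 29]]   # ordered on a key field
index = build_primary_index(blocks)
print(index)
print(lookup(index, blocks, 15))
```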
Types of Single-Level Indexes
• Clustering Index
– Defined on an ordered data file
– The data file is ordered on a non-key field unlike
primary index, which requires that the ordering field
of the data file have a distinct value for each record.
– Includes one index entry for each distinct value of
the field; the index entry points to the first data block
that contains records with that field value.
Clustering index with a separate block cluster for each group of records that share the same value for the clustering field.
Types of Single-Level Indexes
• Secondary Index
– A secondary index provides a secondary means of accessing a
file for which some primary access already exists.
– The secondary index may be on a field which is a candidate key
and has a unique value in every record, or a nonkey with
duplicate values.
– The index is an ordered file with two fields.
• The first field is of the same data type as some nonordering
field of the data file that is an indexing field.
• The second field is either a block pointer or a record pointer.
There can be many secondary indexes (and hence, indexing
fields) for the same file.
– Includes one entry for each record in the data file; hence, it is a
dense index
A dense secondary index (with block pointers) on a nonordering key field of a file.
A secondary index (with record pointers) on a nonkey field implemented using one level of indirection so that index entries are of fixed length and have unique field values.
Properties of Index Types
Multi-Level Indexes
• Because a single-level index is an ordered file, we can
create a primary index to the index itself
– the original index file is called the first-level index and the
index to the index is called the second-level index.
• We can repeat the process, creating a third, fourth, ...,
top level until all entries of the top level fit in one disk
block
• A multi-level index can be created for any type of first-level
index (primary, secondary, clustering) as long as
the first-level index consists of more than one disk block
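The number of levels follows from repeatedly dividing by the index blocking factor (the fan-out) until one block remains. A small sketch, using a hypothetical helper name and, as sample input, a first-level index of 938 blocks with 32 entries per block:

```python
import math

def index_levels(first_level_blocks, fan_out):
    """Blocks at each level, first level first, until the top fits in one block."""
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    levels = [first_level_blocks]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fan_out))
    return levels

levels = index_levels(938, 32)
print(levels, "->", len(levels), "index levels")
```

A search reads one block per index level plus one data block, so the number of block accesses grows with the logarithm of the index size, base fan-out, rather than base 2.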
A two-level primary index resembling ISAM (Indexed Sequential Access Method).
Multi-Level Indexes
• Such a multi-level index is a form of search tree;
however, insertion and deletion of new index
entries is a severe problem because every level
of the index is an ordered file.
• Solution? B-trees
B-Tree Example
• A node in a search tree with pointers to subtrees below it
B-Tree Definition
A B-tree of order M is a multiway search tree such that:
• All leaves are on the bottom level.
• All internal nodes (except the root node) have at least
ceil(M/2) children.
• The root node can have as few as 2 children if it is an
internal node, and may have no children if the root node
is a leaf (i.e., the tree consists only of the root node).
• Each leaf node (other than the root node if it is a leaf)
must contain at least ceil(M/2) - 1 keys.
B-Tree Rules
A B-Tree of order 4 must meet the following conditions:
• The keys in each node are in ascending order.
• At every given Node the following is true:
– The subtree starting at Node.Branch[0] has only keys that are
less than Node.Key[0].
– The subtree starting at Node.Branch[1] has only keys that are
greater than Node.Key[0] and at the same time less than
Node.Key[1].
– The subtree starting at Node.Branch[2] has only keys that are
greater than Node.Key[1] and at the same time less than
Node.Key[2].
– The subtree starting at Node.Branch[3] has only keys that are
greater than Node.Key[2].
• Note: if fewer than the full number of keys are in the
Node, these 4 conditions are truncated so that they speak
of the appropriate number of keys and branches.
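The ordering rules above are what make searching possible: at each node, the position of the search key among Node.Key[] selects the single branch that can contain it. A minimal sketch, with a hypothetical Node class whose keys and branches lists play the role of Node.Key and Node.Branch:

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, branches=None):
        self.keys = keys                  # keys in ascending order
        self.branches = branches or []    # empty for a leaf, else len(keys) + 1

def search(node, key):
    """Return (node, position) where key is stored, or None if it is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)   # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.branches:
            return None                   # reached a leaf without finding key
        node = node.branches[i]           # only branch i can hold the key
    return None

root = Node([10, 20, 30], [Node([1, 5]), Node([12, 17]),
                           Node([22, 25]), Node([35, 40])])
hit = search(root, 17)
print(hit[0].keys, hit[1])
```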
Dynamic Multilevel Indexes Using
B-Trees and B+-Trees
• Because of the insertion and deletion problem,
most multi-level indexes use B-tree or B+-tree
data structures, which leave space in each tree
node (disk block) to allow for new index entries
• These data structures are variations of search
trees that allow efficient insertion and deletion
of new search values.
• In B-Tree and B+-Tree data structures, each
node corresponds to a disk block
• Each node is kept between half-full and
completely full
Dynamic Multilevel Indexes Using
B-Trees and B+-Trees - 2
• An insertion into a node that is not full is quite
efficient; if a node is full the insertion causes a
split into two nodes
• Splitting may propagate to other tree levels
• A deletion is quite efficient if a node does not
become less than half full
• If a deletion causes a node to become less than
half full, it must be merged with neighboring
nodes
B-tree versus B+-tree
• In a B-tree, pointers to data records exist at all
levels of the tree
• In a B+-tree, all pointers to data records exist at
the leaf-level nodes
• A B+-tree can have fewer levels (or higher
capacity of search values) than the
corresponding B-tree
B-tree structures. (a) A node in a B-tree with q – 1 search
values. (b) A B-tree of order p = 3. The values were inserted in
the order 8, 5, 1, 7, 3, 12, 9, 6.
The nodes of a B+-tree. (a) Internal node of a B+-tree with q – 1
search values. (b) Leaf node of a B+-tree with q – 1 search
values and q – 1 data pointers.
B-Tree Algorithm
• When inserting an item, first search for it in the B-tree.
• If the item is not there, this search will end at a leaf.
• If there is room in this leaf, insert the item here (this
may require moving some existing keys).
• If this leaf node is full then it must be "split" with about
half of the keys going into a new node to the right.
• The median (middle) key is moved up to the parent
node. (If the parent is full, it has to be split as well.)
• Note: If the root node is ever split, the median key
moves up into a new root node, thus causing the tree to
increase in height by one.
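The insertion steps above can be sketched in code. This is a simplified in-memory illustration with hypothetical class names, not a disk-based implementation: it stores keys only (no data pointers), and it inserts into the node first and then splits if the node overflows, which has the same effect as splitting a full node before inserting.

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # empty list means this is a leaf

class BTree:
    def __init__(self, order):
        self.order = order                # max children; max keys = order - 1
        self.root = BTreeNode()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:             # root was split: height grows by one
            median, right = split
            self.root = BTreeNode([median], [self.root, right])

    def _insert(self, node, key):
        """Insert below node; return (median, new right node) if node split."""
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None                   # key already present
        if not node.children:
            node.keys.insert(i, key)      # leaf: shift existing keys as needed
        else:
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            median, right = split         # child split: take its median key
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
        if len(node.keys) < self.order:
            return None                   # node still holds at most order - 1 keys
        mid = len(node.keys) // 2         # overflow: split around the median
        median = node.keys[mid]
        right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right

tree = BTree(order=3)
for k in (8, 5, 1, 7, 3, 12, 9, 6):
    tree.insert(k)
print(tree.root.keys)
```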
B-Tree Exercise
• Create a B-Tree of Order 5, containing the
following keys (in the order of their insertion):
– C N G A H E K Q M F W L T Z D P R X Y S
• Remember: Order 5 means that each node can
have a maximum of 5 children and 4 keys.
• BONUS QUESTIONS:
– What is the average number of block accesses for a
random retrieval on {this tree, a full tree}?
– What would the average numbers be if this was
{a heap, a sorted file}?