Transcript Kεφάλαιο 8
Kεφάλαιο 8
ISAM και B-Δέντρα
Φυσικός Σχεδιασμός για Βάσεις Δεδομένων
YV - B+ Trees and Physical Design
326
Memory Management
.
DBMS Application
.
SET-ORIENTED
Database Access
Methods
TUPLEORIENTED
Main
Online
External
Logging and
Recovery
Buffer Manager
Manages
BLOCK-ORIENTED
Manages
File Manager
Transaction
Programs
Tuple Management
Associative Access
Record Management
Buffer Management
File Management
Nearline
External
Manages
YV - B+ Trees and Physical Design
Archive Manager
327
Index Structures
A (single-level) index is an auxiliary file structure that
makes it more efficient to search for a record in the file
The index is usually specified on one field of the file
The index is often called access path on the field
An index file usually occupies much less space than the
actual data file (the entries are fewer and smaller)
A pointer to the block where a record is stored is found
with binary search on the index
FORM of a record in the index file:
key value
YV - B+ Trees and Physical Design
pointer to record’s block
328
Indexes
Single-level Indexes can be distinguished in:
– Primary Index: defined on a file which is ordered on the
key field. It includes one entry for each file block. The entry
has the key field value for the first record in the block. (also
called, sparse index)
– Clustering Index: defined on a file which is ordered on a
non-key field. It includes one index entry for each distinct
value of the field. The entry points to the first data block
that contain records with that field value.
– Secondary Index: defined on a file which is unordered.
Contains one entry per each record in the data file
(also called, dense index)
YV - B+ Trees and Physical Design
329
Multi-Level Indexes
Index files are simply files, therefore they can also be
indexed.
We thus end-up with a hierarchy of index structures (first
level, second level, etc.)
Every level of the index is an ordered file, therefore, deletion
and insertion require much work (maintain the index)
A multi-level index forms a search tree and it is assumed that
the top (first) level index fits in one disk block
YV - B+ Trees and Physical Design
330
Indexed Sequential Access Method (ISAM)
ISAM is a multi-level index structure (tree) for direct access
to tuples in a relation sorted on some attribute
Each node of the tree is a page (block) on the disk
The nodes of the tree hold <key value, pointer > pairs, sorted
on key value. The internal nodes point to lower nodes in the
tree (a lower level index), while the leaf nodes point to pages in
the relation. A pointer points to a subtree with key values
greater or equal the corresponding key value and less than the
key value for the next pointer
9
YV - B+ Trees and Physical Design
18
55
60
78
90
331
ISAM Example
ISAM on Salary
10
20
30
10
40
40
50
60
Gary 10
Mike 15
LIsa 20
Shirley 25
Bob 30
Robin 35
Ivan
Bill
40
45
Ron
Jill
50
55
Scott 43
OVERFLOW
Keith 60
Dan 65
YV - B+ Trees and Physical Design
332
Example ISAM Tree
Each node can hold 2 entries
Root
40
10*
15*
20
33
20*
27*
YV - B+ Trees and Physical Design
51
33*
37*
40*
46*
51*
63
55*
63*
97*
333
After Inserting 23*, 48*, 41*, 42* ...
Root
40
Index
Pages
20
33
20*
27*
51
63
Primary
Leaf
10*
15*
33*
37*
40*
46*
48*
41*
51*
55*
63*
97*
Pages
Overflow
23*
Pages
42*
YV - B+ Trees and Physical Design
334
... Then Deleting 42*, 51*, 97*
Root
40
20
10*
15*
20*
23*
33
27*
51
33*
37*
40*
48*
46*
63
55*
63*
41*
Note that 51* appears in index levels, but not in leaf!
YV - B+ Trees and Physical Design
335
ISAM Performance
PERFORMANCE. Suppose that we have D data pages
and k pointers in each node (assume that D is a power
of k, say D = kL )
LOOKUP COST:
Sequential Scan:
D
Binary Search on Relation:
log2 D + 1
Binary Search (single index):
(log2 (D/k) + 1) + 1
Traversing the ISAM tree: logk D + 1 = L + 1
YV - B+ Trees and Physical Design
336
ISAM Summary
ADVANTAGES:
– It provides a sorted (mostly) directory for a file (relation)
– It is very good for exact queries (e.g., Salary = 40000)
– ISAM facilitates the execution of range queries (e.g., Salary
between 35000 and 60000)
DISADVANTAGES
– It is a static structure and may easily get unbalanced
– If there are many updates (volatile data), then the relation does
not remain “mostly” sorted
– The index does consume some space which may be valuable
YV - B+ Trees and Physical Design
337
B+ - Trees
A B+-tree is a multilevel tree-structured directory to index a
sorted file
Leaf nodes contain tuples in sorted order, the other internal
nodes have a special form
Nodes correspond to a block (page)
Each node is kept between half- and completely-full
Insertions in nodes that are not full are very efficient; if a
node is full, a split occurs
Deletions are very efficient if the node does not become less
than half-full
The tree structure remains balanced at all times
YV - B+ Trees and Physical Design
338
B+ - Trees -- Summary
Form of internal nodes:
P1
K1 P2 K2 ...
... Pn-1 Kn-1 Pn
K1 < K1 < ... Kn-1
P1 points to a node containing key values n, n< K1
P2 points to a node containing key values n, K1 < = n < K1
Variations of B+-trees:
– B-trees : They are like B+-trees, but the internal nodes also
contain pointers to data. They can get deeper and are
difficult to implement, but occasionally, they are faster.
– B*-trees : They are like B+-trees, but keep each node 2/3 full
(at least). Shorter trees, faster, but worse for updates
YV - B+ Trees and Physical Design
339
B+ Tree: The Most Widely Used Index
Insert/delete at log F N cost; keep tree height-balanced. (F = fanout,
N = # leaf pages)
Minimum 50% occupancy (except for root). Each node contains d
<= m <= 2d entries. The parameter d is called the order of the tree.
Supports equality and range-searches efficiently.
Index Entries
(Direct search)
Data Entries
("Sequence set")
YV - B+ Trees and Physical Design
340
Example B+ Tree
Search begins at root, and key comparisons direct it to a
leaf (as in ISAM).
Search for 5*, 15*, all data entries >= 24* ...
Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24*
27* 29*
33* 34* 38* 39*
Based on the search for 15*, we know it is not in the tree!
YV - B+ Trees and Physical Design
341
B+ Trees in Practice
Typical order: 100. Typical fill-factor: 67%.
– average fanout = 133
Typical capacities:
– Height 4: 1334 = 312,900,700 records
– Height 3: 1333 = 2,352,637 records
Can often hold top levels in buffer pool:
– Level 1 =
1 page = 8 Kbytes
– Level 2 =
133 pages = 1 Mbyte
– Level 3 = 17,689 pages = 133 MBytes
YV - B+ Trees and Physical Design
342
Inserting a Data Entry into a B+ Tree
Find correct leaf L.
Put data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2)
» Redistribute entries evenly,
copy up middle key.
» Insert index entry pointing to L2 into parent of L.
This can happen recursively
– To split index node, redistribute entries evenly, but push up
middle key. (Contrast with leaf splits.)
Splits “grow” tree; root split increases height.
– Tree growth: gets wider or one level taller at top.
YV - B+ Trees and Physical Design
343
Inserting 8* into Example B+ Tree
Observe how
minimum
occupancy is
guaranteed in both
leaf and index pg
splits.
Note difference
between copy-up
and push-up; be
sure you
understand the
reasons for this.
YV - B+ Trees and Physical Design
Entry to be inserted in parent node.
(Note that 5 iss copied up and
continues to appear in the leaf.)
5
2*
3*
5*
13
8*
Entry to be inserted in parent node.
(Note that 17 is pushed up and only
appears once in the index. Contrast
this with a leaf split.)
17
5
7*
24
30
344
Example B+ Tree After Inserting 8*
Root
17
5
2*
3*
24
13
5*
7*
8*
14* 16*
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
Notice that root was split, leading to increase in height.
In this example, we can avoid split by re-distributing
entries; however, this is usually not done in practice.
YV - B+ Trees and Physical Design
345
Deleting a Data Entry from a B+ Tree
Start at root, find leaf L where entry belongs.
Remove the entry.
– If L is at least half-full, done!
– If L has only d-1 entries,
» Try to re-distribute, borrowing from sibling (adjacent node
with same parent as L).
» If re-distribution fails, merge L and sibling.
If merge occurred, must delete entry (pointing to L or sibling)
from parent of L.
Merge could propagate to root, decreasing height.
YV - B+ Trees and Physical Design
346
Example Tree After (Inserting 8*, Then)
Deleting 19* and 20* ...
Root
17
5
2*
3*
27
13
5*
7*
8*
14* 16*
22* 24*
30
27* 29*
33* 34* 38* 39*
Deleting 19* is easy.
Deleting 20* is done with re-distribution. Notice how middle key
is copied up.
YV - B+ Trees and Physical Design
347
... And Then Deleting 24*
Must merge.
Observe `toss’ of index entry
(on right), and `pull down’ of
index entry (below).
30
22*
27*
29*
33*
34*
38*
39*
Root
5
2*
3*
5*
7*
YV - B+ Trees and Physical Design
8*
13
14*
16*
17
30
22*
27*
29*
33*
34* 38*
39*
348
Example of Non-leaf Re-distribution
Tree is shown below during deletion of 24*. (What could be a
possible initial tree?)
In contrast to previous example, can re-distribute entry from left child
of root to right child.
Root
22
5
2*
3*
5*
7* 8*
YV - B+ Trees and Physical Design
13
14* 16*
17
30
20
17* 18*
20* 21*
22* 27* 29*
33* 34* 38* 39*
349
After Re-distribution
Intuitively, entries are re-distributed by `pushing through’ the splitting
entry in the parent node.
It suffices to re-distribute index entry with key 20; we’ve redistributed 17 as well for illustration.
Root
17
5
2* 3*
5* 7* 8*
YV - B+ Trees and Physical Design
13
14* 16*
20
17* 18*
20* 21*
22
30
22* 27* 29*
33* 34* 38* 39*
350
Prefix Key Compression
Important to increase fan-out. (Why?)
Key values in index entries only `direct traffic’; can often
compress them.
– E.g., If we have adjacent index entries with search key values
Dannon Yogurt, David Smith and Devarakonda Murthy, we can
abbreviate David Smith to Dav. (The other keys can be
compressed too ...)
» Is this correct? Not quite! What if there is a data entry Davey
Jones? (Can only compress David Smith to Davi)
» In general, while compressing, must leave each index entry
greater than every key value (in any subtree) to its left.
Insert/delete must be suitably modified.
YV - B+ Trees and Physical Design
351
Bulk Loading of a B+ Tree
If we have a large collection of records, and we want to create a
B+ tree on some field, doing so by repeatedly inserting records is
very slow.
Bulk Loading can be done much more efficiently.
Initialization: Sort all data entries, insert pointer to first (leaf)
page in a new (root) page.
Root
3* 4*
Sorted pages of data entries; not yet in B+ tree
6* 9*
YV - B+ Trees and Physical Design
10* 11*
12* 13*
20* 22*
23* 31*
35* 36*
38* 41*
44*
352
Summary of Bulk Loading
Option 1: multiple inserts.
– Slow.
– Does not give sequential storage of leaves.
Option 2: Bulk Loading
–
–
–
–
Has advantages for concurrency control.
Fewer I/Os during build.
Leaves will be stored sequentially (and linked, of course).
Can control “fill factor” on pages.
YV - B+ Trees and Physical Design
353
A Note on `Order’
Order (d) concept replaced by physical space criterion in practice
(`at least half-full’).
– Index pages can typically hold many more entries than leaf pages.
– Variable sized records and search keys mean differnt nodes will
contain different numbers of entries.
– Even with fixed length fields, multiple records with the same search
key value (duplicates) can lead to variable-sized data entries (if we
use Alternative (3)).
YV - B+ Trees and Physical Design
354
Summary
Tree-structured indexes are ideal for range-searches, also good for
equality searches.
ISAM is a static structure.
– Only leaf pages modified; overflow pages needed.
– Overflow chains can degrade performance unless size of data set and
data distribution stay constant.
B+ tree is a dynamic structure.
– Inserts/deletes leave tree height-balanced; log F N cost.
– High fanout (F) means depth rarely more than 3 or 4.
– Almost always better than maintaining a sorted file.
YV - B+ Trees and Physical Design
355
Summary (Contd.)
– Typically, 67% occupancy on average.
– Usually preferable to ISAM, modulo locking considerations;
adjusts to growth gracefully.
– If data entries are data records, splits can change rids!
Key compression increases fanout, reduces height.
Bulk loading can be much faster than repeated inserts for
creating a B+ tree on a large data set.
Most widely used index in database management systems
because of its versatility. One of the most optimized
components of a DBMS.
YV - B+ Trees and Physical Design
356
Overview of Physical Design
After ER design, schema refinement, and the definition of
views, we have the conceptual and external schemas for our
database.
The next step is to choose indexes, make clustering decisions,
and to refine the conceptual and external schemas (if
necessary) to meet performance goals.
We must begin by understanding the workload:
– The most important queries and how often they arise.
– The most important updates and how often they arise.
– The desired performance for these queries and updates.
YV - B+ Trees and Physical Design
357
Understanding the Workload
For each query in the workload:
– Which relations does it access?
– Which attributes are retrieved?
– Which attributes are involved in selection/join conditions? How
selective are these conditions likely to be?
For each update in the workload:
– Which attributes are involved in selection/join conditions? How
selective are these conditions likely to be?
– The type of update (INSERT/DELETE/UPDATE), and the attributes
that are affected.
YV - B+ Trees and Physical Design
358
Decisions to Make
What indexes should we create?
– Which relations should have indexes? What field(s) should be the
search key? Should we build several indexes?
For each index, what kind of an index should it be?
– Clustered? Hash/tree? Dynamic/static? Dense/sparse?
Should we make changes to the conceptual schema?
– Consider alternative normalized schemas? (Remember, there are
many choices in decomposing into BCNF, etc.)
– Should we ``undo’’ some decomposition steps and settle for a lower
normal form? (Denormalization.)
– Horizontal partitioning, replication, views ...
YV - B+ Trees and Physical Design
359
Choice of Indexes
One approach: consider the most important queries in turn.
Consider the best plan using the current indexes, and see if a
better plan is possible with an additional index. If so, create
it.
Before creating an index, must also consider the impact on
updates in the workload!
– Trade-off: indexes can make queries go faster, updates slower.
Require disk space, too.
YV - B+ Trees and Physical Design
360
Issues to Consider in Index Selection
Attributes mentioned in a WHERE clause are candidates for
index search keys.
– Exact match condition suggests hash index.
– Range query suggests tree index.
» Clustering is especially useful for range queries, although it can
help on equality queries as well in the presence of duplicates.
Try to choose indexes that benefit as many queries as
possible. Since only one index can be clustered per relation,
choose it based on important queries that would benefit the
most from clustering.
YV - B+ Trees and Physical Design
361
Issues in Index Selection (Contd.)
Multi-attribute search keys should be considered when a
WHERE clause contains several conditions.
– If range selections are involved, order of attributes should be
carefully chosen to match the range ordering.
– Such indexes can sometimes enable index-only strategies for
important queries.
» For index-only strategies, clustering is not important!
When considering a join condition:
– Hash index on inner is very good for Index Nested Loops.
» Should be clustered if join column is not key for inner, and inner
tuples need to be retrieved.
– Clustered B+ tree on join column(s) good for Sort-Merge.
YV - B+ Trees and Physical Design
362
Example
Hash index on D.dname supports ‘Toy’ selection.
– Given this, index on D.dno is not needed.
Hash index on E.dno allows us to get matching (inner) Emp
tuples for each selected (outer) Dept tuple.
What if WHERE included: `` ... AND E.age=25’’ ?
– Could retrieve Emp tuples using index on E.age, then join with
Dept tuples satisfying dname selection. Comparable to strategy
that used E.dno index.
– So, if E.age index is already created, this query provides much
less motivation for adding an E.dno index.
SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE D.dname=‘Toy’ AND E.dno=D.dno
YV - B+ Trees and Physical Design
363
Examples of Clustering
B+ tree index on E.age can be used to get qualifying
tuples.
– How selective is the condition?
– Is the index clustered?
Consider the GROUP BY query.
– If many tuples have E.age > 10, using
E.age index and sorting the retrieved
tuples may be costly.
– Clustered E.dno index may be better!
Equality queries and duplicates:
– Clustering on E.hobby helps!
YV - B+ Trees and Physical Design
SELECT E.dno
FROM Emp E
WHERE E.age>40
SELECT E.dno, COUNT (*)
FROM Emp E
WHERE E.age>10
GROUP BY E.dno
SELECT E.dno
FROM Emp E
WHERE E.hobby=Stamps
364
Multi-Attribute Index Keys
To retrieve Emp records with age=30 AND sal=4000, an index on
<age,sal> would be better than an index on age or an index on
sal.
– Such indexes also called composite or concatenated indexes.
– Choice of index key orthogonal to clustering etc.
If condition is: 20<age<30 AND 3000<sal<5000:
– Clustered tree index on <age,sal> or <sal,age> is best.
If condition is: age=30 AND 3000<sal<5000:
– Clustered <age,sal> index much better than <sal,age> index!
Composite indexes are larger, updated more often.
YV - B+ Trees and Physical Design
365
Summary
Database design consists of several tasks: requirements
analysis, conceptual design, schema refinement, physical
design and tuning.
– In general, have to go back and forth between these tasks to
refine a database design, and decisions in one task can
influence the choices in another task.
Understanding the nature of the workload for the application,
and the performance goals, is essential to developing a good
design.
– What are the important queries and updates? What
attributes/relations are involved?
YV - B+ Trees and Physical Design
366
Summary (Contd.)
Indexes must be chosen to speed up important queries (and
perhaps some updates!).
–
–
–
–
Index maintenance overhead on updates to key fields.
Choose indexes that can help many queries, if possible.
Build indexes to support index-only strategies.
Clustering is an important decision; only one index on a given
relation can be clustered!
– Order of fields in composite index key can be important.
Static indexes may have to be periodically re-built.
Statistics have to be periodically updated.
YV - B+ Trees and Physical Design
367
Relation Implementations
(a) STORE EACH RELATION IN A FILE
– For small relations, a heap may suffice
– For larger relations, ISAM, B-tree, or Hashing
– Allow for secondary indexes on user-specified fields
Example Commands:
modify R to isam on A1
index on R is S(A1)
(b) STORE EACH RELATION AS IN DBTG
– Store related tuples from different relations together
– Use multilist structures
YV - B+ Trees and Physical Design
368