Transcript chap8b

B + -trees
CM3117
Data Structures
B-Trees
 The
B-tree is similar to binary search trees
in another way
 While it is good for locating a particular key
it is not good for locating a series of keys
– It is possible to find the series of keys within
one node when in a leaf node
– It is not possible to find the next key otherwise
 The
B-tree can be modified to make this
possible
CM3117 -- Data Structures and File Organizations
B+-tree
modified B-tree is called a B+-tree
 The difference is primarily in the structure
of the tree
 The
– All keys and the associated data are found in
the leaf nodes
– The root and interior nodes contain only keys
and pointers
CM3117 -- Data Structures and File Organizations
B+-tree
 The
keys in the root and interior nodes are
duplicated in the leaf nodes
– These keys end up as the first key in each node
except the leftmost node
 This
means that the leaf nodes have a
different structure from the non-leaf nodes
CM3117 -- Data Structures and File Organizations
B+-trees
 The
last pointer of the leaf nodes are used to
connect each leaf node to the next node in
sequence
– The leaf nodes form a linked list
– They are called the sequence set
 The
non-leaf nodes are called the index set
since they are used exclusively to locate
data
CM3117 -- Data Structures and File Organizations
B+-trees
 Here
is an example of a B+-tree
Index Set
50
20
10
85
20
45
50
Sequence Set
CM3117 -- Data Structures and File Organizations
68
85
101
B+-trees
 The
structural differences lead to differences
in the operations as well
 For example, the search operation in B+-trees
must end in the leaf node since that is where
the data is actually located
 The keys in the index set serve only to direct
the search to the correct leaf node
CM3117 -- Data Structures and File Organizations
B+-trees
 There
are also differences in the insertion,
deletion, and redistribution operations
 Because of the structural differences between
the index set and the sequence set the
differences only apply to the sequence set
 Structurally the index set is the same as the
nodes in a B-tree, so these operations are the
same for the index set as in a B-tree
CM3117 -- Data Structures and File Organizations
Inserting into a B+-tree
 There
is no difference when there is room in
the node
 The difference is in the node split
CM3117 -- Data Structures and File Organizations
Inserting into a B+-tree
45 into the B+-tree below
 Since the node is full it needs to be split
 Insert
30
10
20
CM3117 -- Data Structures and File Organizations
30
50
Inserting into a B+-tree
 Create
the new node and determine that the
median key is 45
30
10
20
CM3117 -- Data Structures and File Organizations
30
50
Inserting into a B+-tree
 Move
from the median key on to the new
node
 The difference here is that the median key is
copied
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
 Since
this is a leaf node no need to copy any
pointer except the last pointer in the node
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
 The
last pointer in the original node is set to
point to the new node
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
 The
median key and the pointer to the new
node is still raised to the parent
30
10
20
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
 Need
to emphasize that this difference only
applies to the sequence set
 Suppose we insert 25 into this B+-tree
(without doing redistribution)
30
10
20
30
CM3117 -- Data Structures and File Organizations
45
45
50
Inserting into a B+-tree
 Create
the new node, identify 20 as the
median key, move the keys to the new node,
and set the pointers in the original and new
nodes
30
10
20
25
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
 When
the median key is raised, the parent
node needs to be split
 This node split will follow the same rules as
for a B-tree
20
10
30
20
25
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
 Create
the new node, move all keys and
pointers following the median key, and raise
the median key
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
 Since
this is a root split, create the new root
node, insert the median key and change the
pointer to the root
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
 The
redistributions are also slightly
different in the sequence set
 Suppose we insert 55 into this B+-tree
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
 Can
do a left redistribution
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
 Copy
the first key in the node to the last
position in the node to the left and move all
nodes down in the right node
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
 Insert
the new key and its data
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
 Replace
the key in the parent with the first
key in the right node
 Note there are no pointers to move
30
20
10
50
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
 Right
redistributions are also slightly
different
 Insert 12 into this B+-tree
30
20
10
15
50
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
 Move
the rightmost key in the left node and
insert it in the right node
30
20
10
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
 Copy
the rightmost node (now the first key
in the right node) to the parent
 Insert the new key in the left node
30
15
10
12
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
 As
before there are no pointers to change
30
15
10
12
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Deleting from a B+-tree
 As
in a B-tree, when deleting redistributions
must be done if possible
 The node combination is different, but again
only in the sequence set
CM3117 -- Data Structures and File Organizations
Deleting from a B+-tree
 Delete
10 from this B+-tree
32
10
25
68
89
32
51
CM3117 -- Data Structures and File Organizations
68
80
8
Deleting from a B+-tree
 A node
combination is necessary
 But in this case it is not necessary to copy
the key in the parent
32
25
68
89
32
51
CM3117 -- Data Structures and File Organizations
68
80
8
Deleting from a B+-tree
 Move
all keys in the right node to the left
node
 Note that you actually have one slot empty
32
25
32
68
51
CM3117 -- Data Structures and File Organizations
89
68
80
8
Deleting from a B+-tree
 Copy
the last pointer in the now empty node
to the last pointer of the left node
32
25
32
68
51
CM3117 -- Data Structures and File Organizations
89
68
80
8
Deleting from a B+-tree
 Remove
the key in the parent node and
release the empty node
68
25
32
51
89
68
CM3117 -- Data Structures and File Organizations
80
85
92
95
9
Deleting from a B+-tree
 When
the key that is deleted is the first in
the node it is also found in the index set
 Technically this should be changed in the
index set
 This can get to be a lot of processing
 Generally just leave the index set
unchanged
CM3117 -- Data Structures and File Organizations