Transcript chap8b
B + -trees
CM3117
Data Structures
B-Trees
The
B-tree is similar to binary search trees
in another way
While it is good for locating a particular key
it is not good for locating a series of keys
– It is possible to find the series of keys within
one node when in a leaf node
– It is not possible to find the next key otherwise
The
B-tree can be modified to make this
possible
CM3117 -- Data Structures and File Organizations
B+-tree
modified B-tree is called a B+-tree
The difference is primarily in the structure
of the tree
The
– All keys and the associated data are found in
the leaf nodes
– The root and interior nodes contain only keys
and pointers
CM3117 -- Data Structures and File Organizations
B+-tree
The
keys in the root and interior nodes are
duplicated in the leaf nodes
– These keys end up as the first key in each node
except the leftmost node
This
means that the leaf nodes have a
different structure from the non-leaf nodes
CM3117 -- Data Structures and File Organizations
B+-trees
The
last pointer of the leaf nodes are used to
connect each leaf node to the next node in
sequence
– The leaf nodes form a linked list
– They are called the sequence set
The
non-leaf nodes are called the index set
since they are used exclusively to locate
data
CM3117 -- Data Structures and File Organizations
B+-trees
Here
is an example of a B+-tree
Index Set
50
20
10
85
20
45
50
Sequence Set
CM3117 -- Data Structures and File Organizations
68
85
101
B+-trees
The
structural differences lead to differences
in the operations as well
For example, the search operation in B+-trees
must end in the leaf node since that is where
the data is actually located
The keys in the index set serve only to direct
the search to the correct leaf node
CM3117 -- Data Structures and File Organizations
B+-trees
There
are also differences in the insertion,
deletion, and redistribution operations
Because of the structural differences between
the index set and the sequence set the
differences only apply to the sequence set
Structurally the index set is the same as the
nodes in a B-tree, so these operations are the
same for the index set as in a B-tree
CM3117 -- Data Structures and File Organizations
Inserting into a B+-tree
There
is no difference when there is room in
the node
The difference is in the node split
CM3117 -- Data Structures and File Organizations
Inserting into a B+-tree
45 into the B+-tree below
Since the node is full it needs to be split
Insert
30
10
20
CM3117 -- Data Structures and File Organizations
30
50
Inserting into a B+-tree
Create
the new node and determine that the
median key is 45
30
10
20
CM3117 -- Data Structures and File Organizations
30
50
Inserting into a B+-tree
Move
from the median key on to the new
node
The difference here is that the median key is
copied
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
Since
this is a leaf node no need to copy any
pointer except the last pointer in the node
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
The
last pointer in the original node is set to
point to the new node
30
10
20
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
The
median key and the pointer to the new
node is still raised to the parent
30
10
20
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
Need
to emphasize that this difference only
applies to the sequence set
Suppose we insert 25 into this B+-tree
(without doing redistribution)
30
10
20
30
CM3117 -- Data Structures and File Organizations
45
45
50
Inserting into a B+-tree
Create
the new node, identify 20 as the
median key, move the keys to the new node,
and set the pointers in the original and new
nodes
30
10
20
25
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
When
the median key is raised, the parent
node needs to be split
This node split will follow the same rules as
for a B-tree
20
10
30
20
25
CM3117 -- Data Structures and File Organizations
45
30
45
50
Inserting into a B+-tree
Create
the new node, move all keys and
pointers following the median key, and raise
the median key
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Inserting into a B+-tree
Since
this is a root split, create the new root
node, insert the median key and change the
pointer to the root
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
The
redistributions are also slightly
different in the sequence set
Suppose we insert 55 into this B+-tree
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
Can
do a left redistribution
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
Copy
the first key in the node to the last
position in the node to the left and move all
nodes down in the right node
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
Redistribution in a B+-tree
Insert
the new key and its data
30
20
10
45
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
Replace
the key in the parent with the first
key in the right node
Note there are no pointers to move
30
20
10
50
20
25
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
Right
redistributions are also slightly
different
Insert 12 into this B+-tree
30
20
10
15
50
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
Move
the rightmost key in the left node and
insert it in the right node
30
20
10
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
Copy
the rightmost node (now the first key
in the right node) to the parent
Insert the new key in the left node
30
15
10
12
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Redistribution in a B+-tree
As
before there are no pointers to change
30
15
10
12
50
15
20
CM3117 -- Data Structures and File Organizations
30
45
50
55
Deleting from a B+-tree
As
in a B-tree, when deleting redistributions
must be done if possible
The node combination is different, but again
only in the sequence set
CM3117 -- Data Structures and File Organizations
Deleting from a B+-tree
Delete
10 from this B+-tree
32
10
25
68
89
32
51
CM3117 -- Data Structures and File Organizations
68
80
8
Deleting from a B+-tree
A node
combination is necessary
But in this case it is not necessary to copy
the key in the parent
32
25
68
89
32
51
CM3117 -- Data Structures and File Organizations
68
80
8
Deleting from a B+-tree
Move
all keys in the right node to the left
node
Note that you actually have one slot empty
32
25
32
68
51
CM3117 -- Data Structures and File Organizations
89
68
80
8
Deleting from a B+-tree
Copy
the last pointer in the now empty node
to the last pointer of the left node
32
25
32
68
51
CM3117 -- Data Structures and File Organizations
89
68
80
8
Deleting from a B+-tree
Remove
the key in the parent node and
release the empty node
68
25
32
51
89
68
CM3117 -- Data Structures and File Organizations
80
85
92
95
9
Deleting from a B+-tree
When
the key that is deleted is the first in
the node it is also found in the index set
Technically this should be changed in the
index set
This can get to be a lot of processing
Generally just leave the index set
unchanged
CM3117 -- Data Structures and File Organizations