Chapter 11_BTrees

B-Trees

Data Structures
and Other Objects
Using C++
This presentation shows the
potential problem of unbalanced
trees and one way to fix it
The problems of Unbalanced Trees
Solutions:
1. Periodically balance
the tree (Project 9 in
Chapter 10 on page 538)
2. Use a B-tree, whose rules keep
leaves from becoming too deep
[Figure: an unbalanced tree holding the entries 1, 2, 3, 4, 5]
B-Tree
Differences compared to BST:
1. B-tree nodes have many more than two
children
2. Each node contains more than just a single
entry
3. Rules ensure that leaves do not become too
deep
B-Tree Rules
1. Root may have as few as 1 entry; every other
node has at least MINIMUM entries
2. Maximum number of entries in a node is twice
MINIMUM
3. Entries of each B-tree node are stored in a
partially filled array – sorted in increasing order
4. Number of subtrees below a nonleaf node is 1
more than the number of entries in the node
B-Tree Rules
5. For any nonleaf node
a. An entry at index i is greater than all the entries in subtree number i of the
node
b. An entry at index i is less than all the entries in subtree number i+1 of the node
6. Every leaf in a B-tree has the same depth
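The six rules can be checked mechanically. The sketch below is only an illustration, not the textbook's code: it assumes a simplified `Node` that uses `std::vector` in place of the partially filled arrays, and the names `Node`, `check_node` and `is_btree` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries of this node
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Returns the depth of the leaves below node, or -1 if any rule is broken.
// lo and hi, when not null, are strict bounds on every entry in this subtree.
int check_node(const Node* node, std::size_t minimum, bool is_root,
               const int* lo, const int* hi) {
    assert(node != nullptr);
    std::size_t n = node->data.size();
    if (n < (is_root ? 1 : minimum) || n > 2 * minimum) return -1;       // rules 1 and 2
    for (std::size_t k = 0; k < n; ++k) {
        if (k > 0 && !(node->data[k - 1] < node->data[k])) return -1;    // rule 3
        if (lo && !(*lo < node->data[k])) return -1;                     // rule 5
        if (hi && !(node->data[k] < *hi)) return -1;                     // rule 5
    }
    if (node->subset.empty()) return 0;                                   // a leaf
    if (node->subset.size() != n + 1) return -1;                          // rule 4
    int depth = -2;                                                       // -2 means "not yet known"
    for (std::size_t k = 0; k <= n; ++k) {
        const int* child_lo = (k == 0) ? lo : &node->data[k - 1];
        const int* child_hi = (k == n) ? hi : &node->data[k];
        int d = check_node(node->subset[k], minimum, false, child_lo, child_hi);
        if (d < 0 || (depth != -2 && d != depth)) return -1;              // rule 6
        depth = d;
    }
    return depth + 1;
}

// True when the tree rooted at root (assumed nonempty) satisfies all six rules.
bool is_btree(const Node* root, std::size_t minimum) {
    return check_node(root, minimum, true, nullptr, nullptr) >= 0;
}
```

Passing the bounds down the recursion is one way to check rule 5 against whole subtrees rather than only against direct children.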
B-Tree Example
B-tree of 10 integers (with MINIMUM set to 1)
Root: [6]
Children of [6]: [2, 4] and [9]
Leaves below [2, 4]: [1], [3], [5]
Leaves below [9]: [7, 8], [10]
B-Tree Illustrations


https://www.youtube.com/watch?v=coRJrcIYbF4
https://www.cs.usfca.edu/~galles/visualization/BTree.html
Set Example

The author uses a set implemented as a B-tree
as an example
Class Invariant




Items in the set are stored in a B-tree
The number of entries in the root is stored in the member variable
data_count, and the number of subtrees in the member variable
child_count
The root's entries are stored in data[0] through
data[data_count - 1]
If the root has subtrees, they are stored in sets pointed to by
subset[0] through subset[child_count - 1]
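As a rough sketch, the invariant suggests a class layout like the one below. The class name `btree_set`, the value chosen for MINIMUM, the array sizes and the helper functions are assumptions made for illustration; only data, data_count, subset and child_count come from the invariant.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class name; the four member variables follow the invariant.
template <class Item>
class btree_set {
public:
    static const std::size_t MINIMUM = 2;            // assumed value, for illustration only
    static const std::size_t MAXIMUM = 2 * MINIMUM;  // rule 2: twice MINIMUM

    btree_set() : data_count(0), child_count(0) {}   // an empty set: a root with no entries

    bool is_leaf() const { return child_count == 0; }
    std::size_t entry_count() const { return data_count; }

    // Checks the bookkeeping part of the invariant for this node.
    void check_counts() const {
        assert(data_count <= MAXIMUM + 1);
        assert(child_count == 0 || child_count == data_count + 1);
    }

private:
    std::size_t data_count;          // entries live in data[0 .. data_count - 1]
    Item data[MAXIMUM + 1];          // assumed: one spare slot for a temporary excess entry
    std::size_t child_count;         // subtrees live in subset[0 .. child_count - 1]
    btree_set* subset[MAXIMUM + 2];  // one more subtree than entries
};
```

Because each subtree is itself a set, every member function can treat the node it runs on as "the root" and recurse through subset[i].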
Searching

Make a local variable i equal to the first index such that
data[i] is not less than the target. If no such index exists,
then set i equal to data_count, indicating that all of the
entries are less than the target
if (we found the target at data[i])
    return 1;
else if (the root has no children)
    return 0;
else
    return subset[i]->count(target);
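The search pseudocode above can be sketched as a free function. This is a minimal illustration, assuming a simplified `Node` built on `std::vector` instead of the class's arrays; `Node` and `btree_count` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries in increasing order
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Returns 1 if target is in the B-tree rooted at root, otherwise 0.
std::size_t btree_count(const Node* root, int target) {
    assert(root != nullptr);
    // First index i such that data[i] is not less than target, or data.size().
    std::size_t i = 0;
    while (i < root->data.size() && root->data[i] < target) ++i;

    if (i < root->data.size() && root->data[i] == target)
        return 1;                                   // found the target at data[i]
    if (root->subset.empty())
        return 0;                                   // the root has no children
    return btree_count(root->subset[i], target);    // search subtree number i
}
```

Each call looks at one node and then descends at most one level, so the number of nodes visited is at most the depth of the tree plus one.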
Searching Example
Inserting

Loose insertion might leave a node with MAXIMUM + 1 entries
Fix the loose insertion problem later
Before: root [6, 17]; leaves [4], [12], [19, 22]
After loose insertion: root [6, 17]; leaves [4], [12], [18, 19, 22]
Loose Inserting

Make a local variable i equal to the first index such that data[i] is not less than
entry. If no such index exists, then set i to data_count
if (we found the new entry at data[i]) {
2a. Return false with no further work (since the new entry is already in
the set).
}
else if (the root has no children) {
2b. Add the new entry to the root at data[i]. (The original entries at
data[i] and afterwards must be shifted right to make room for the new entry.)
Return true to indicate that we added the entry.
}
else {
2c. Save the value from this recursive call:
subset[i]->loose_insert(entry);
Then check whether the root of subset[i] now has an excess entry; if so,
then fix that problem.
Return the saved value from the recursive call.
}
Loose Inserting Example
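MINIMUM = 1, new entry 18
Before: root [6, 17]; leaves [4], [12], [19, 22]
After the recursive loose insertion: root [6, 17]; leaves [4], [12], [18, 19, 22]
After fixing the child: root [6, 17, 19]; leaves [4], [12], [18], [22]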
Fixing Child with Excess Entry

Split the child node into two nodes, each containing MINIMUM entries
Pass the median entry up to the parent
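MINIMUM = 2
Before: root [9, 28]
child [3, 6]: leaves [1, 2], [4, 5], [7, 8]
child [13, 16, 19, 22, 25]: leaves [11, 12], [14, 15], [17, 18], [20, 21], [23, 24], [26, 27]
child [33, 40]: leaves [31, 32], [34, 35], [50, 51]
After: root [9, 19, 28]
child [3, 6]: leaves [1, 2], [4, 5], [7, 8]
child [13, 16]: leaves [11, 12], [14, 15], [17, 18]
child [22, 25]: leaves [20, 21], [23, 24], [26, 27]
child [33, 40]: leaves [31, 32], [34, 35], [50, 51]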
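Loose insertion and the split step can be sketched together as follows. This is an illustration under stated assumptions, not the textbook's implementation: `Node` is a simplified vector-based node, `fix_excess` and `insert` are hypothetical names, and the top-level `insert` (which grows the tree when the root itself ends up with an excess entry) is a common way to finish the job that the slides do not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t MINIMUM = 2;             // assumed value, for illustration only
const std::size_t MAXIMUM = 2 * MINIMUM;

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries in increasing order
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Split parent->subset[i], which holds MAXIMUM + 1 entries, into two nodes of
// MINIMUM entries each, and pass the median entry up into parent->data[i].
void fix_excess(Node* parent, std::size_t i) {
    Node* left = parent->subset[i];
    assert(left->data.size() == MAXIMUM + 1);
    Node* right = new Node;
    int median = left->data[MINIMUM];
    right->data.assign(left->data.begin() + MINIMUM + 1, left->data.end());
    left->data.resize(MINIMUM);
    if (!left->subset.empty()) {           // nonleaf: the children are split as well
        right->subset.assign(left->subset.begin() + MINIMUM + 1, left->subset.end());
        left->subset.resize(MINIMUM + 1);
    }
    parent->data.insert(parent->data.begin() + i, median);
    parent->subset.insert(parent->subset.begin() + i + 1, right);
}

// Loose insertion: may leave root itself with MAXIMUM + 1 entries.
bool loose_insert(Node* root, int entry) {
    std::size_t i = 0;
    while (i < root->data.size() && root->data[i] < entry) ++i;
    if (i < root->data.size() && root->data[i] == entry)
        return false;                                        // 2a: already in the set
    if (root->subset.empty()) {                              // 2b: add to this leaf
        root->data.insert(root->data.begin() + i, entry);
        return true;
    }
    bool added = loose_insert(root->subset[i], entry);       // 2c: recurse, then repair
    if (root->subset[i]->data.size() > MAXIMUM) fix_excess(root, i);
    return added;
}

// Strict insertion: if the root has an excess entry, put a new empty root
// above it and split the old root, which makes the tree one level taller.
bool insert(Node*& root, int entry) {
    if (!loose_insert(root, entry)) return false;
    if (root->data.size() > MAXIMUM) {
        Node* new_root = new Node;
        new_root->subset.push_back(root);
        fix_excess(new_root, 0);
        root = new_root;
    }
    return true;
}
```

Splitting at the root is the only place the tree gains depth, which is why every leaf stays at the same depth. The sketch never frees its nodes; a real class would do that in its destructor.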
Deleting From B-Tree

Loose erase might leave the root of an internal subtree with fewer than
MINIMUM entries
Fix the problem later
Loose Erase From B-Tree


Make a local variable i equal to the first index such that
data[i] is not less than target. If there is no such index, then
set i equal to data_count
4 possibilities:
1. Root has no children & did not find target: done, the entry is not in the tree
2. Root has no children & found target: remove the entry and return true
3. Root has children & did not find target: make a recursive call on subset[i]
4. Root has children & found target: replace the target by the largest item
from subset[i], removing that item from the subtree
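The four possibilities can be sketched as below. This is a minimal illustration that assumes a simplified vector-based `Node`; `remove_biggest` is a hypothetical helper name. The sketch is "loose" in the fullest sense: it deliberately leaves out the repair of any child that drops below MINIMUM entries, and comments mark where that repair would go.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries in increasing order
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Removes and returns the largest entry in the subtree rooted at root.
int remove_biggest(Node* root) {
    assert(!root->data.empty());
    if (root->subset.empty()) {              // leaf: the last entry is the largest
        int biggest = root->data.back();
        root->data.pop_back();
        return biggest;
    }
    // Nonleaf: the largest entry is in the last subtree.
    // (A full version would repair a shortage in that subtree afterwards.)
    return remove_biggest(root->subset.back());
}

// Loose erase: removes target if present; nodes may be left short of entries.
bool loose_erase(Node* root, int target) {
    std::size_t i = 0;
    while (i < root->data.size() && root->data[i] < target) ++i;
    bool found = i < root->data.size() && root->data[i] == target;
    bool leaf = root->subset.empty();

    if (leaf && !found) return false;                    // case 1: not in the tree
    if (leaf && found) {                                 // case 2: remove from this leaf
        root->data.erase(root->data.begin() + i);
        return true;
    }
    if (!found)                                          // case 3: look in subtree i
        return loose_erase(root->subset[i], target);     // (shortage repair omitted)
    root->data[i] = remove_biggest(root->subset[i]);     // case 4: borrow the predecessor
    return true;                                         // (shortage repair omitted)
}
```

Case 4 works because the largest item in subtree i is the entry that sits just before data[i] in sorted order, so moving it up keeps rule 5 intact.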
Loose Erase Example
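MINIMUM = 2, target 28
Before: root [10, 28]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [13, 16, 19, 22]: leaves [11, 12], [14, 15], [17, 18], [20, 21], [23, 24, 26]
child [33, 40]: leaves [31, 32], [34, 35], [50, 51]
After: root [10, 26]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [13, 16, 19, 22]: leaves [11, 12], [14, 15], [17, 18], [20, 21], [23, 24]
child [33, 40]: leaves [31, 32], [34, 35], [50, 51]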
Fix Shortage in Child
Case 1: Transfer an extra entry from subset[i-1],
when subset[i-1] has more than MINIMUM
entries
Transfer data[i-1] down to the front of subset[i]
Transfer the final item of subset[i-1] up to replace data[i-1]
If subset[i-1] has children, transfer the final child of
subset[i-1] to the front of subset[i]
Case 2: Transfer an extra entry from subset[i+1]. Similar to Case 1
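Cases 1 and 2 can be sketched as a pair of mirror-image functions. As before this is only an illustration: `Node` is a simplified vector-based node, and the function names `transfer_from_left` and `transfer_from_right` and the value of MINIMUM are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t MINIMUM = 2;   // assumed value, for illustration only

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries in increasing order
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Case 1: parent->subset[i] is short and parent->subset[i-1] has a spare entry.
void transfer_from_left(Node* parent, std::size_t i) {
    assert(i >= 1);
    Node* left = parent->subset[i - 1];
    Node* child = parent->subset[i];
    assert(left->data.size() > MINIMUM);
    child->data.insert(child->data.begin(), parent->data[i - 1]);  // data[i-1] comes down
    parent->data[i - 1] = left->data.back();                       // final item goes up
    left->data.pop_back();
    if (!left->subset.empty()) {                                   // final child moves over
        child->subset.insert(child->subset.begin(), left->subset.back());
        left->subset.pop_back();
    }
}

// Case 2: the mirror image, borrowing from parent->subset[i+1].
void transfer_from_right(Node* parent, std::size_t i) {
    assert(i + 1 < parent->subset.size());
    Node* child = parent->subset[i];
    Node* right = parent->subset[i + 1];
    assert(right->data.size() > MINIMUM);
    child->data.push_back(parent->data[i]);                        // data[i] comes down
    parent->data[i] = right->data.front();                         // first item goes up
    right->data.erase(right->data.begin());
    if (!right->subset.empty()) {                                  // first child moves over
        child->subset.push_back(right->subset.front());
        right->subset.erase(right->subset.begin());
    }
}
```

The entry passes through the parent rather than moving directly between siblings, which keeps rule 5 true for the parent's entry that separates the two subtrees.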
Fix Shortage Example
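MINIMUM = 2
Before: root [10, 28]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [13, 16, 19, 22]: leaves [11, 12], [14, 15], [17, 18], [20, 21], [23, 24, 26]
child [33] (shortage): leaves [31, 32], [34, 35]
After: root [10, 22]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [13, 16, 19]: leaves [11, 12], [14, 15], [17, 18], [20, 21]
child [28, 33]: leaves [23, 24, 26], [31, 32], [34, 35]
The 22 has come up from the middle child.
The 28 has come down from the root.
The leaf [23, 24, 26] has been moved over.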
Fix Shortage in Child
Case 3: Combine subset[i] with subset[i-1]
Transfer data[i-1] down to the end of subset[i-1]
Transfer all items & children from subset[i] to the end of
subset[i-1]
Delete node subset[i] & shift subset[i+1] and any later subtrees leftward
to fill the gap
Case 4: Combine subset[i] with
subset[i+1]. Similar to Case 3
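Case 3 can be sketched as a single merge function. This is an illustration under assumptions: `Node` is a simplified vector-based node whose children are heap-allocated, and `merge_with_left` and `merge_with_right` are hypothetical names. Case 4 is written here as a merge of subset[i+1] into subset[i], which produces the same combined node but keeps the left-hand node instead of the right-hand one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node (an assumption): vectors stand in for the partially filled arrays.
struct Node {
    std::vector<int> data;       // entries in increasing order
    std::vector<Node*> subset;   // children; empty when the node is a leaf
};

// Case 3: combine parent->subset[i] with parent->subset[i-1].
// The node parent->subset[i] must have been allocated with new.
void merge_with_left(Node* parent, std::size_t i) {
    assert(i >= 1 && i < parent->subset.size());
    Node* left = parent->subset[i - 1];
    Node* child = parent->subset[i];
    // data[i-1] comes down to the end of subset[i-1]; erasing it from the
    // parent shifts the parent's later entries leftward.
    left->data.push_back(parent->data[i - 1]);
    parent->data.erase(parent->data.begin() + (i - 1));
    // All items and children of subset[i] move to the end of subset[i-1].
    left->data.insert(left->data.end(), child->data.begin(), child->data.end());
    left->subset.insert(left->subset.end(), child->subset.begin(), child->subset.end());
    // Delete the emptied node; erasing its pointer shifts later subtrees leftward.
    parent->subset.erase(parent->subset.begin() + i);
    delete child;
}

// Case 4, expressed through Case 3 (a simplification made in this sketch).
void merge_with_right(Node* parent, std::size_t i) {
    merge_with_left(parent, i + 1);
}
```

When the short node has MINIMUM - 1 entries and its neighbor has exactly MINIMUM, the combined node ends up with MINIMUM - 1 + 1 + MINIMUM = MAXIMUM entries, so the merge never creates an excess.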
Fix Shortage Example
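MINIMUM = 2
Before: root [10, 28]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [16, 19]: leaves [14, 15], [17, 18], [20, 21]
child [33] (shortage): leaves [31, 32], [34, 35]
After: root [10]
child [2, 5]: leaves [0, 1], [3, 4], [6, 7, 8]
child [16, 19, 28, 33]: leaves [14, 15], [17, 18], [20, 21], [31, 32], [34, 35]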
Time analysis



Insert, remove and search times are roughly
proportional to the depth of the tree in
binary search trees, heaps and B-trees.
Binary search trees suffer, since a tree of n
entries could have depth proportional to n.
Heaps and B-trees have depth proportional to
log(n), so the operations for a heap or a B-tree are O(log(n)).
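The logarithmic depth of a B-tree follows from the rules. As a short derivation, write M for MINIMUM, n for the number of entries, and d for the depth of the leaves (with the root at depth 0); these symbols are introduced here only for the calculation. The root has at least 1 entry and so at least 2 children, and every other nonleaf node has at least M entries and so at least M + 1 children.

```latex
% M = MINIMUM, n = number of entries, d = depth of the leaves (root at depth 0)
\begin{align*}
\text{nodes at depth } k &\ge 2\,(M+1)^{k-1} \qquad (k \ge 1) \\
n &\ge 1 + M \sum_{k=1}^{d} 2\,(M+1)^{k-1} = 2\,(M+1)^{d} - 1 \\
d &\le \log_{M+1}\frac{n+1}{2} = O(\log n)
\end{align*}
```

For the 10-integer example with MINIMUM set to 1, the bound gives d at most log base 2 of 5.5, about 2.46, which agrees with the depth of 2 in that tree.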
Summary
A B-tree is a tree for storing entries in a manner
that follows six rules.
The tree algorithms that we have seen for binary
search trees, heaps, and B-trees all have worst-case
time complexity of O(d), where d is the
depth of the tree.
The depth of a heap or B-tree is never more than
O(log n), where n is the number of entries.
