B Trees

Data Structures
• When data is too large to fit in main memory, it spills over to the disk.
– Disk access is highly expensive compared to a typical computer instruction
– The number of disk accesses will dominate the running time.
• Our goal is to devise a search tree that will minimize disk accesses.
Typical Disk Drive
• A balanced search tree designed to work well on direct‐access secondary storage devices.
• A generalized search tree.
• Each node corresponds to a block of data on the disk.
• Minimizes disk accesses.
– Tree of height 2 containing over one billion keys.
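The arithmetic behind a figure like "over one billion keys" can be sketched as follows. The node size used here (1000 keys per node, hence 1001 children per internal node) is an illustrative assumption, not a number given in these notes, and `keys_in_full_tree` is a hypothetical helper name.

```python
def keys_in_full_tree(keys_per_node: int, height: int) -> int:
    """Total keys in a tree of the given height in which every node holds
    keys_per_node keys and every internal node has keys_per_node + 1 children."""
    branching = keys_per_node + 1
    # Depth d contains branching**d nodes; sum over depths 0..height.
    nodes = sum(branching ** depth for depth in range(height + 1))
    return nodes * keys_per_node


# Assumed node size: 1000 keys per node, tree of height 2.
print(keys_in_full_tree(1000, 2))  # 1003003000
```

With the root kept in main memory, reaching any key in such a tree would take at most two disk accesses.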
• A B‐tree T is a rooted tree (at root[T]) having the following properties:
1. Every node x has the following fields:
a. n[x], the number of keys currently stored in node x,
b. the n[x] keys themselves, stored in non‐decreasing order, so that key_1[x] ≤ key_2[x] ≤ ··· ≤ key_{n[x]}[x],
c. leaf[x], a boolean value that is TRUE if x is a leaf and FALSE if x is an internal node.
2. Each internal node x also contains n[x] + 1 pointers c_1[x], c_2[x], ..., c_{n[x]+1}[x] to its children.
Leaf nodes have no children, so their c_i fields are undefined.
3. The keys key_i[x] separate the ranges of keys stored in each sub‐tree:
if k_i is any key stored in the sub‐tree with root c_i[x], then k_1 ≤ key_1[x] ≤ k_2 ≤ key_2[x] ≤ ··· ≤ key_{n[x]}[x] ≤ k_{n[x]+1}.
4. All leaves have the same depth, which is the tree's height h.
5. There are lower and upper bounds on the number of keys a node can contain.
These bounds can be expressed in terms of a fixed integer t ≥ 2 called the minimum degree of the B‐tree:
a. Every node other than the root must have at least t - 1 keys.
Every internal node other than the root thus has at least t children.
If the tree is nonempty, the root must have at least one key.
b. Every node can contain at most 2t - 1 keys.
Therefore, an internal node can have at most 2t children.
We say that a node is full if it contains exactly 2t - 1 keys.
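The five properties above can be checked mechanically. The following is a minimal sketch, assuming an in-memory node with 0-based Python lists in place of the fields n[x], key_i[x], c_i[x] and leaf[x]; the names `Node`, `check_btree` and `_require` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def _require(condition: bool, message: str) -> None:
    if not condition:
        raise ValueError(message)


def check_btree(x: Node, t: int, is_root: bool = True,
                lo: Optional[int] = None, hi: Optional[int] = None) -> int:
    """Return the height of the subtree rooted at x, or raise ValueError
    if one of the B-tree properties is violated."""
    n = len(x.keys)
    # Property 5: key-count bounds (an empty tree is a leaf root with 0 keys).
    min_keys = (0 if not x.children else 1) if is_root else t - 1
    _require(min_keys <= n <= 2 * t - 1, "key count out of bounds")
    # Property 1b: keys in non-decreasing order.
    _require(x.keys == sorted(x.keys), "keys not sorted")
    # Property 3: keys lie within the range allowed by the ancestors.
    for k in x.keys:
        _require((lo is None or lo <= k) and (hi is None or k <= hi),
                 "key outside separator range")
    if not x.children:
        return 0
    # Property 2: an internal node has n + 1 children.
    _require(len(x.children) == n + 1, "wrong number of children")
    bounds = [lo] + x.keys + [hi]
    heights = {check_btree(c, t, False, bounds[i], bounds[i + 1])
               for i, c in enumerate(x.children)}
    # Property 4: all leaves have the same depth.
    _require(len(heights) == 1, "leaves at different depths")
    return 1 + heights.pop()
```

For example, with t = 2 a root holding [10] over the leaves [3, 5] and [12] passes the check with height 1, while giving one leaf four keys raises an error.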
2‐3‐4 Tree
• The simplest B‐tree occurs when t = 2.
• Every internal node then has either 2, 3, or 4 children, and we have a 2‐3‐4 tree.
• In practice, however, much larger values of t are typically used.
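• Theorem:
If n ≥ 1, then for any n‐key B‐tree T of height h and minimum degree t ≥ 2,
$$h \le \log_t \frac{n+1}{2}.$$
• Proof:
If a B‐tree has height h, the number of its nodes is minimized when the root contains one key and all other nodes contain t - 1 keys.
In this case, there are
1 node at depth 0 (the root),
2 nodes at depth 1,
2t nodes at depth 2,
2t^2 nodes at depth 3,
and so on,
until at depth h there are 2t^{h-1} nodes.
Thus, the number n of keys satisfies the inequality
$$n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\,\frac{t^{h}-1}{t-1} = 2t^{h} - 1,$$
which implies
$$t^{h} \le \frac{n+1}{2}, \qquad\text{that is,}\qquad h \le \log_t \frac{n+1}{2}.$$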
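As a quick numeric check of the bound, the sketch below finds the largest height h for which the minimum key count 2t^h - 1 still fits in n keys. It uses integer arithmetic rather than a floating-point logarithm to avoid rounding problems; `max_height` is a hypothetical helper name, and the sample values of n and t are illustrative assumptions.

```python
def max_height(n: int, t: int) -> int:
    """Largest h such that 2 * t**h - 1 <= n, i.e. floor(log_t((n + 1) / 2))."""
    if n < 1 or t < 2:
        raise ValueError("requires n >= 1 and t >= 2")
    h = 0
    # Grow h while a tree of height h + 1 could still be built from n keys.
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


print(max_height(10**9, 1001))  # 2
print(max_height(7, 2))         # 2
```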
Basic Operations
• We always keep the root in main memory, so that a DISK‐READ on the root is never required.
• Any nodes that are passed as parameters must already have had a DISK‐READ performed on them.
• Any node that is changed must be written back with a DISK‐WRITE.
search (x, k)
    i = 1
    while (i ≤ n[x] and k > key_i[x]) do i = i + 1
    if (i ≤ n[x] and k = key_i[x]) then
        return (x, i)
    if (leaf[x]) then
        return null
    diskRead(c_i[x])
    return search(c_i[x], k)
CPU time is O(th) = O(t log_t n).
Disk access is O(h) = O(log_t n).
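The search procedure can be sketched in Python as follows. This is a minimal in-memory version under stated assumptions: indices are 0-based, a leaf is a node with an empty `children` list, and disk reads are omitted. The `Node` class is hypothetical and exists only for this example.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted list of keys
        self.children = children or []    # empty list for a leaf


def search(x, k):
    """Return (node, index) for key k in the subtree rooted at x, or None."""
    i = 0
    # Linear scan for the first key that is not smaller than k.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if not x.children:   # reached a leaf without finding k
        return None
    return search(x.children[i], k)


root = Node([10, 20], [Node([3, 5]), Node([12, 15]), Node([25])])
node, index = search(root, 15)
print(node.keys, index)   # [12, 15] 1
print(search(root, 4))    # None
```

The linear scan mirrors the pseudocode and accounts for the O(t) CPU cost per node; replacing it with a binary search (for instance via Python's `bisect` module) would reduce the work inside a node without changing the number of nodes visited.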
Creating an Empty B‐tree
create (T)
    x = allocateNode()
    leaf[x] = TRUE
    n[x] = 0
    diskWrite(x)
    root[T] = x
CPU time is O(1).
Disk access is O(1).
Splitting a Node
splitChild (x, i, y)
    z = allocateNode()
    leaf[z] ← leaf[y]
    n[z] ← t - 1
    for (j = 1 to t - 1) do key_j[z] ← key_{j+t}[y]
    if (not leaf[y]) then
        for (j = 1 to t) do c_j[z] ← c_{j+t}[y]
    n[y] ← t - 1
    for (j = n[x] + 1 downto i + 1) do c_{j+1}[x] ← c_j[x]
    c_{i+1}[x] ← z
    for (j = n[x] downto i) do key_{j+1}[x] ← key_j[x]
    key_i[x] ← key_t[y]
    n[x] ← n[x] + 1
    diskWrite(y), diskWrite(z), diskWrite(x)
CPU time is O(t).
Disk access is O(1).
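A minimal Python sketch of the split, assuming 0-based lists and no disk I/O. List slicing replaces the explicit copy loops of the pseudocode, and the child y is looked up as `x.children[i]` instead of being passed in; the `Node` class and the `split_child` signature are assumptions made for this example.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def split_child(x, i, t):
    """Split the full child x.children[i] (2t - 1 keys) around its median.
    x itself must not be full."""
    y = x.children[i]
    if len(y.keys) != 2 * t - 1:
        raise ValueError("child is not full")
    # z receives the t - 1 largest keys and, if y is internal, the last t children.
    z = Node(keys=y.keys[t:], children=y.children[t:])
    median = y.keys[t - 1]
    # y keeps the t - 1 smallest keys and the first t children.
    y.keys = y.keys[:t - 1]
    y.children = y.children[:t]
    # The median moves up into x, with z as the child to its right.
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


full = Node([1, 2, 3])
parent = Node(children=[full])
split_child(parent, 0, t=2)
print(parent.keys, [c.keys for c in parent.children])  # [2] [[1], [3]]
```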
insert (T, k)
    r = root[T]
    if (n[r] = 2t - 1) then
        s = allocateNode(), root[T] = s
        leaf[s] = false, n[s] = 0, c_1[s] = r
        splitChild(s, 1, r)
        r = s
    insertNonFull(r, k)
CPU time is O(th) = O(t log_t n).
Disk access is O(h) = O(log_t n).
insertNonFull (x, k)
    i = n[x]
    if (leaf[x]) then
        while (i ≥ 1 and k < key_i[x]) do key_{i+1}[x] = key_i[x], i = i - 1
        key_{i+1}[x] ← k
        n[x] ← n[x] + 1
        diskWrite(x)
    else
        while (i ≥ 1 and k < key_i[x]) do i = i - 1
        i = i + 1
        diskRead(c_i[x])
        if (n[c_i[x]] = 2t - 1) then
            splitChild(x, i, c_i[x])
            if (k > key_i[x]) then i = i + 1
        insertNonFull(c_i[x], k)
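Insertion in a single downward pass can be sketched in Python as below. This is an in-memory illustration under assumptions: 0-based lists, no disk I/O, and the root is returned from `insert` rather than stored in a tree object T. The names `Node`, `split_child`, `insert_nonfull`, `insert` and `inorder` are hypothetical.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def split_child(x, i, t):
    """Split the full child x.children[i] around its median key."""
    y = x.children[i]
    z = Node(keys=y.keys[t:], children=y.children[t:])
    median = y.keys[t - 1]
    y.keys, y.children = y.keys[:t - 1], y.children[:t]
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


def insert_nonfull(x, k, t):
    """Insert k into the subtree rooted at x, which must not be full."""
    i = len(x.keys) - 1
    if not x.children:
        x.keys.append(k)                 # make room for one more key
        while i >= 0 and k < x.keys[i]:  # shift larger keys to the right
            x.keys[i + 1] = x.keys[i]
            i -= 1
        x.keys[i + 1] = k
        return
    while i >= 0 and k < x.keys[i]:
        i -= 1
    i += 1
    if len(x.children[i].keys) == 2 * t - 1:
        split_child(x, i, t)
        if k > x.keys[i]:                # k belongs to the right half
            i += 1
    insert_nonfull(x.children[i], k, t)


def insert(root, k, t):
    """Insert k and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:
        s = Node(children=[root])        # the tree grows at the root
        split_child(s, 0, t)
        root = s
    insert_nonfull(root, k, t)
    return root


def inorder(x):
    """All keys of the subtree rooted at x in sorted order."""
    if not x.children:
        return list(x.keys)
    out = []
    for i, key in enumerate(x.keys):
        out += inorder(x.children[i]) + [key]
    return out + inorder(x.children[-1])


root = Node()
for key in [10, 20, 5, 6, 12, 30, 7, 17]:
    root = insert(root, key, t=2)
print(root.keys)      # [10, 20]
print(inorder(root))  # [5, 6, 7, 10, 12, 17, 20, 30]
```

Splitting every full node met on the way down guarantees that a split never has to propagate back up, which is why a single pass suffices.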
delete (x, k)
(1) if (the key k is in node x and x is a leaf)
        delete the key k from x
(2a) else if (k is in x and x is internal and
        the child y that precedes k has at least t keys)
        recursively delete the predecessor k′ of k from the sub‐tree rooted at y,
        replace k by k′ in x
(2b) symmetrically, else if (k is in x and x is internal and
        the child z that follows k has at least t keys)
        recursively delete the successor k′ of k from the sub‐tree rooted at z,
        replace k by k′ in x
(2c) else if (k is in x and x is internal)
        merge k and the child z into y,
        then recursively delete k from the merged node
(3) else if (k is not present in internal node x)
        determine the root r of the appropriate
        sub‐tree that must contain k
    (3a) if (r has only t - 1 keys but has an
            immediate sibling with at least t keys)
            give r an extra key by shifting a key from x down into r
            and a key from that sibling up into x
    (3b) else if (r and both of r's immediate siblings
            have t - 1 keys)
            merge r with one sibling, moving the separating key down from x
        finish by recurring on the appropriate child of x
CPU time is O(th) = O(t log_t n).
Disk access is O(h) = O(log_t n).
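The deletion cases can be sketched in Python as follows. This is one possible in-memory reading of the cases above, not a definitive implementation: it assumes 0-based lists, no disk I/O, distinct keys, and a wrapper `delete_key` that shrinks the tree when the root becomes empty. All names (`Node`, `merge`, `delete`, `delete_key`, `inorder`) are hypothetical.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def merge(x, i):
    """Merge x.keys[i] and x.children[i + 1] into x.children[i]."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children


def delete(x, k, t):
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    found = i < len(x.keys) and x.keys[i] == k
    if not x.children:                       # case 1: x is a leaf
        if found:
            x.keys.pop(i)
        return
    if found:
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:                 # case 2a: use the predecessor
            p = y
            while p.children:
                p = p.children[-1]
            x.keys[i] = p.keys[-1]
            delete(y, x.keys[i], t)
        elif len(z.keys) >= t:               # case 2b: use the successor
            s = z
            while s.children:
                s = s.children[0]
            x.keys[i] = s.keys[0]
            delete(z, x.keys[i], t)
        else:                                # case 2c: merge, then recurse
            merge(x, i)
            delete(y, k, t)
        return
    r = x.children[i]                        # case 3: k is further down
    if len(r.keys) == t - 1:
        if i > 0 and len(x.children[i - 1].keys) >= t:              # 3a, left
            left = x.children[i - 1]
            r.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if left.children:
                r.children.insert(0, left.children.pop())
        elif i < len(x.keys) and len(x.children[i + 1].keys) >= t:  # 3a, right
            right = x.children[i + 1]
            r.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if right.children:
                r.children.append(right.children.pop(0))
        else:                                                       # 3b
            if i == len(x.keys):
                i -= 1                       # no right sibling: merge with left
            merge(x, i)
            r = x.children[i]
    delete(r, k, t)


def delete_key(root, k, t):
    """Delete k and return the (possibly new) root."""
    delete(root, k, t)
    if not root.keys and root.children:      # the tree shrinks at the root
        root = root.children[0]
    return root


def inorder(x):
    if not x.children:
        return list(x.keys)
    out = []
    for i, key in enumerate(x.keys):
        out += inorder(x.children[i]) + [key]
    return out + inorder(x.children[-1])
```

A hand-built tree with t = 2 is enough to exercise it: for instance, a root [50] over internal nodes [20, 30] and [70], whose leaves are [10], [25], [35, 40] and [60], [80, 90]. Deleting 60 from that tree triggers case 3a twice, once at the root and once one level down.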
