Data Structures
B-Trees
Tzachi (Isaac) Rosen

Motivation
• When data is too large to fit in main memory, it spills over to the disk.
  – A disk access is far more expensive than a typical computer instruction.
  – The number of disk accesses will dominate the running time.
• Our goal is to devise a search tree that minimizes disk accesses.

Typical Disk Drive
[Figure: typical disk drive]

B-Tree
• A balanced search tree designed to work well on direct-access secondary storage devices.
• A generalization of the binary search tree.
• Each node corresponds to a block of data on the disk.
• Minimizes disk accesses.
  – For example, a tree of height 2 can contain over one billion keys (e.g. with 1000 keys per node).

Definition
• A B-tree T is a rooted tree (with root root[T]) having the following properties:
1. Every node x has the following fields:
   a. n[x], the number of keys currently stored in node x,
   b. the n[x] keys themselves, stored in non-decreasing order, so that key_1[x] ≤ key_2[x] ≤ ··· ≤ key_{n[x]}[x],
   c. leaf[x], a Boolean value that is TRUE if x is a leaf and FALSE if x is an internal node.
2. Each internal node x also contains n[x] + 1 pointers c_1[x], c_2[x], ..., c_{n[x]+1}[x] to its children.
   – Leaf nodes have no children, so their c_i fields are undefined.
3. The keys key_i[x] separate the ranges of keys stored in each sub-tree:
   – if k_i is any key stored in the sub-tree with root c_i[x], then k_1 ≤ key_1[x] ≤ k_2 ≤ key_2[x] ≤ ··· ≤ key_{n[x]}[x] ≤ k_{n[x]+1}.
4. All leaves have the same depth, which is the tree's height h.
5. There are lower and upper bounds on the number of keys a node can contain.
   – These bounds can be expressed in terms of a fixed integer t ≥ 2 called the minimum degree of the B-tree:
   a. Every node other than the root must have at least t - 1 keys.
      – Every internal node other than the root thus has at least t children.
      – If the tree is nonempty, the root must have at least one key.
   b. Every node can contain at most 2t - 1 keys.
      – Therefore, an internal node can have at most 2t children.
      – We say that a node is full if it contains exactly 2t - 1 keys.

2-3-4 Tree
• The simplest B-tree occurs when t = 2.
• Every internal node then has either 2, 3, or 4 children, and we have a 2-3-4 tree.
• In practice, however, much larger values of t are typically used.

Height
• Theorem: If n ≥ 1, then for any n-key B-tree T of height h and minimum degree t ≥ 2,
    $h \le \log_t \frac{n+1}{2}$.
• Proof: A B-tree of height h holds the fewest keys when the root contains one key and all other nodes contain t - 1 keys. In this case, there are
    1 node at depth 0 (the root),
    2 nodes at depth 1,
    2t nodes at depth 2,
    2t^2 nodes at depth 3,
    and so on, until at depth h there are 2t^(h-1) nodes.
  Thus, the number n of keys satisfies the inequality
    $n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\,\frac{t^h - 1}{t-1} = 2t^h - 1$,
  which implies
    $t^h \le \frac{n+1}{2}$,
  and taking base-t logarithms of both sides gives the bound.

Basic Operations
• We always keep the root in main memory, so that a DISK-READ on the root is never required.
• Any node passed as a parameter must already have had a DISK-READ performed on it.
• Any node that is changed must be written back with a DISK-WRITE.

Searching
search (x, k)
    i = 1
    while (i ≤ n[x] & k > key_i[x]) do
        i = i + 1
    if (i ≤ n[x] & k = key_i[x]) then
        return (x, i)
    if (leaf[x]) then
        return null
    else
        diskRead(c_i[x])
        return search(c_i[x], k)
• CPU time is O(th) = O(t log_t n).
• Disk access is O(h) = O(log_t n).

Insertion

Creating an Empty B-tree
create (T)
    x = ALLOCATE-NODE()
    leaf[x] = TRUE
    n[x] = 0
    diskWrite(x)
    root[T] = x
CPU time is O(1).
Disk access is O(1).

Splitting a Node
splitChild (x, i, y)
    z = allocateNode(), leaf[z] ← leaf[y], n[z] ← t - 1
    for (j = 1 to t - 1) do
        key_j[z] ← key_{j+t}[y]
    if (not leaf[y]) then
        for (j = 1 to t) do
            c_j[z] ← c_{j+t}[y]
    n[y] = t - 1
    for (j = n[x] + 1 downto i + 1) do
        c_{j+1}[x] ← c_j[x]
    c_{i+1}[x] ← z
    for (j = n[x] downto i) do
        key_{j+1}[x] ← key_j[x]
    key_i[x] ← key_t[y], n[x] ← n[x] + 1
    diskWrite(y), diskWrite(z), diskWrite(x)
• CPU time is O(t).
• Disk access is O(1).

Insertion
insert (T, k)
    r = root[T]
    if (n[r] = 2t - 1) then
        s = allocateNode(), root[T] = s, leaf[s] = false, n[s] = 0, c_1[s] = r
        splitChild(s, 1, r)
        r = s
    insertNonFull(r, k)

insertNonFull (x, k)
    i = n[x]
    if (leaf[x]) then
        while (i ≥ 1 & k < key_i[x]) do
            key_{i+1}[x] = key_i[x], i = i - 1
        key_{i+1}[x] ← k
        n[x] ← n[x] + 1
        diskWrite(x)
    else
        while (i ≥ 1 & k < key_i[x]) do
            i = i - 1
        i = i + 1
        diskRead(c_i[x])
        if (n[c_i[x]] = 2t - 1) then
            splitChild (x, i, c_i[x])
            if (k > key_i[x]) then
                i = i + 1
        insertNonFull (c_i[x], k)
• CPU time is O(th) = O(t log_t n).
• Disk access is O(h) = O(log_t n).

Deletion
delete (x, k)
    (1) if (the key k is in node x & x is a leaf)
            delete the key k from x.
    (2a) else if (k is in x & x is internal & the child y that precedes k has at least t keys)
            recursively delete the predecessor k′ of k, replace k by k′ in x.
    (2b) symmetrically, else if (k is in x & x is internal & the child z that follows k has at least t keys)
            recursively delete the successor k′ of k, replace k by k′ in x.
    (2c) else if (k is in x & x is internal, so both y and z have only t - 1 keys)
            merge k and all of z into y, then recursively delete k from y.

For deletion as a whole, CPU time is O(th) = O(t log_t n).
Disk access is O(h) = O(log_t n).

    (3) else if (k is not present in internal node x)
            determine the root r of the appropriate subtree that must contain k
        (3a) if (r has only t - 1 keys but has an immediate sibling with at least t keys)
                give r an extra key by shifting: move a key from x down into r and move a key from that sibling up into x.
        (3b) else if (r and both of r's immediate siblings have t - 1 keys)
                merge r with one sibling, moving a key from x down into the merged node as its median key.
            finish by recurring on the appropriate child of x.
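
The search, splitChild, insert and insertNonFull pseudocode above can be tried out with a small Python sketch. This is only an illustration under stated assumptions, not the slides' implementation: nodes live in memory, so diskRead and diskWrite are omitted, indices are 0-based instead of 1-based, and the names Node, BTree, _split_child, _insert_nonfull and height are hypothetical choices made for this example.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """One B-tree node. On disk this would be one block; here it is in memory."""
    leaf: bool = True
    keys: list = field(default_factory=list)       # n[x] is len(keys)
    children: list = field(default_factory=list)   # empty for a leaf


class BTree:
    def __init__(self, t: int) -> None:
        if t < 2:
            raise ValueError("minimum degree t must be at least 2")
        self.t = t
        self.root = Node()  # create(T): the empty tree is a single leaf

    def search(self, k, x: Node | None = None):
        """Return (node, index) of key k, or None if k is absent."""
        x = self.root if x is None else x
        i = 0
        while i < len(x.keys) and k > x.keys[i]:
            i += 1
        if i < len(x.keys) and k == x.keys[i]:
            return x, i
        if x.leaf:
            return None
        return self.search(k, x.children[i])  # a disk-based tree would diskRead here

    def _split_child(self, x: Node, i: int) -> None:
        """Split the full child x.children[i]; its median key moves up into x."""
        t = self.t
        y = x.children[i]
        z = Node(leaf=y.leaf, keys=y.keys[t:], children=y.children[t:])
        median = y.keys[t - 1]
        y.keys = y.keys[:t - 1]
        y.children = y.children[:t]
        x.keys.insert(i, median)
        x.children.insert(i + 1, z)

    def insert(self, k) -> None:
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first; this is the only way the tree grows taller.
            s = Node(leaf=False, children=[self.root])
            self.root = s
            self._split_child(s, 0)
        self._insert_nonfull(self.root, k)

    def _insert_nonfull(self, x: Node, k) -> None:
        i = len(x.keys) - 1
        while i >= 0 and k < x.keys[i]:
            i -= 1
        i += 1                      # k belongs at position i / in child i
        if x.leaf:
            x.keys.insert(i, k)
            return
        if len(x.children[i].keys) == 2 * self.t - 1:
            self._split_child(x, i)
            if k > x.keys[i]:
                i += 1
        self._insert_nonfull(x.children[i], k)

    def height(self) -> int:
        """Number of edges from the root down to a leaf."""
        h, x = 0, self.root
        while not x.leaf:
            x = x.children[0]
            h += 1
        return h


# Small usage example with t = 2 (a 2-3-4 tree).
tree = BTree(t=2)
for key in [10, 20, 5, 6, 12, 30, 7, 17]:
    tree.insert(key)
found = tree.search(6) is not None   # 6 was inserted
missing = tree.search(15)            # 15 was never inserted
```

Because every full node on the path is split before descending into it, the insertion makes a single downward pass, which is what keeps the disk accesses at O(h) in the disk-based version. Deletion is left out of the sketch to keep it short.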