B-Trees
CS 583 Analysis of Algorithms
7/1/2016 CS583 Fall'06: B-Trees

Outline
• Data Structures on Secondary Storage
  – Magnetic disks
  – Efficient operations
• B-Trees
  – Definitions
  – Searching
  – Inserting
• Self-test
  – 18.1-1, 18.1-2, 18.2-1, 18.2-2

Magnetic Disks
• The main memory of a computer system consists of silicon memory chips.
  – It is typically two orders of magnitude more expensive per bit stored than magnetic storage technology.
• Magnetic disks are cheaper and have higher capacity than main memory.
  – However, they are much slower because they have moving parts.
  – To amortize the time spent on mechanical movement, a disk accesses several items at a time.
    • Information is divided into equal-size pages.
    • Pages appear as consecutive bits within cylinders.
    • Once the read/write head is positioned at the desired page, large amounts of data can be accessed quickly.

Disk Operations
When x is an object that resides on disk, the following pseudocode conventions are used:
    x = <a pointer to some object>
    Disk-Read(x)
    <access and modify fields of x>
    Disk-Write(x)
In most systems the running time of a B-tree algorithm is determined mainly by the number of Disk-Read and Disk-Write operations. Hence a B-tree node is usually as large as a whole disk page.
Example: a B-tree with a branching factor of 1001 and height 2 can store more than one billion keys. Since the root node is kept in main memory, at most two disk accesses are needed to find any key.

B-tree Definition
• We assume that any satellite information associated with a key is stored in the same node as the key.
• A B-tree is a rooted tree with the following properties:
  – Every node x has the following fields:
    • n[x], the number of keys stored in x.
    • The n[x] keys themselves, stored in non-decreasing order: key_1[x] <= key_2[x] <= ... <= key_{n[x]}[x]
    • leaf[x] = true if x is a leaf, and false otherwise.
  – Each internal node x contains n[x]+1 pointers to its children: c_1[x], c_2[x], ..., c_{n[x]+1}[x]

B-tree Definition (cont.)
• Properties (cont.):
  – The keys key_i[x] separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree with root c_i[x], then
    • k_1 <= key_1[x] <= k_2 <= key_2[x] <= ... <= key_{n[x]}[x] <= k_{n[x]+1}
  – All leaves have the same depth, which is the tree’s height h.
  – There are lower and upper bounds on the number of keys in a node. They are expressed in terms of an integer t >= 2 called the minimum degree:
    • Every node other than the root must have at least t-1 keys.
    • Every node can contain at most 2t-1 keys. We say the node is full if it contains exactly 2t-1 keys.

Height of the Tree
The number of disk accesses for most B-tree operations is proportional to the height of the tree.
Theorem 18.1 If n >= 1, then for any n-key B-tree T of height h and minimum degree t >= 2:
    h <= log_t((n+1)/2)
Proof. If a B-tree has height h, the root contains at least one key and all other nodes contain at least t-1 keys. Thus there are at least 2 nodes at depth 1, at least 2t nodes at depth 2, and so on, up to at least 2t^{h-1} nodes at depth h.

Height of the Tree (cont.)
The number of keys n therefore satisfies the inequality:
    n >= 1 + (t-1) * sum_{i=1}^{h} 2t^{i-1}
       = 1 + 2(t-1) * (t^h - 1)/(t-1)
       = 2t^h - 1
    => t^h <= (n+1)/2
    => h <= log_t((n+1)/2)
Hence the height of a B-tree grows as O(log_t n). The height of a red-black tree is O(lg n); because the base t of the logarithm can be large, a B-tree is shallower by a factor of about lg t. This means that the number of disk accesses is substantially reduced for most tree operations.

Basic Operations
• The root of the B-tree is always in main memory.
  – Disk-Read on the root is never required.
  – Disk-Write is required when the root node is changed.
• Any nodes that are passed as parameters have already had Disk-Read performed on them.
• All basic procedures are “one-pass” algorithms:
  – They proceed downward from the root of the tree, without having to back up.

Searching
The search algorithm takes as input a pointer to the root node x of a subtree and a key k. It returns a pair (y, i) such that key_i[y] = k, or NIL if k is not in the subtree.
B-Tree-Search(x,k)
1 i = 1
2 while i <= n[x] and k > key_i[x]
3     i++
4 if i <= n[x] and k = key_i[x]
5     return (x,i)
6 if leaf[x]
7     return NIL
8 else
9     Disk-Read(c_i[x]) // read ith child of x
10    return B-Tree-Search(c_i[x],k)

Searching: Performance
• The nodes encountered during the recursion form a path downward from the root of the tree.
• The number of disk pages accessed by B-Tree-Search is O(h) = O(log_t n).
• For each node, n[x] < 2t, hence the while loop in lines 2-3 takes O(t) time.
• Therefore the total CPU time is O(t h) = O(t log_t n).

Inserting
• General algorithm:
  – Search for the leaf node y at which to insert the new key.
    • If the node y is full (it has 2t-1 keys):
      – Split the full node around its median key key_t[y]:
        • Create two nodes with t-1 keys each.
        • Move the median key up to y’s parent.
        • If y’s parent is also full, it must be split as well.
• The key is inserted in a single pass down the tree.
  – Each full node met along the way is split.
  – This ensures that whenever a node y needs to be split, its parent is not full.

Splitting a Node
• The procedure B-Tree-Split-Child takes as input a non-full node x, an index i, and a full child y of x: y = c_i[x].
• The procedure then splits y in two and adjusts x so that it has an additional child.
• When the root needs to be split, a new root is created first.
  – The tree grows in height by one.
  – Splitting is the only way the tree grows in height.
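
The B-Tree-Search procedure above can be sketched in Python as follows. This is only an in-memory illustration: the `Node` class and the `reads` log (which stands in for Disk-Read) are assumptions made for this example, and indices are 0-based rather than 1-based as on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical in-memory stand-in for one B-tree node (one disk page)."""
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf

    @property
    def leaf(self) -> bool:
        return not self.children


def btree_search(
    x: Node, k: int, reads: Optional[List[Node]] = None
) -> Optional[Tuple[Node, int]]:
    """Return (node, index) with node.keys[index] == k, or None if k is absent.

    `reads` is an assumed log: each child we descend into is appended to it,
    standing in for one Disk-Read in the pseudocode.
    """
    i = 0
    # Linear scan for the first key that is >= k.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if x.leaf:
        return None
    if reads is not None:
        reads.append(x.children[i])
    return btree_search(x.children[i], k, reads)


# A small hand-built tree with t = 2: root [10, 20] and three leaves.
root = Node(keys=[10, 20], children=[Node([3, 7]), Node([12, 15, 18]), Node([25])])

reads: List[Node] = []
hit = btree_search(root, 15, reads)   # found in the middle leaf
miss = btree_search(root, 11)         # not stored anywhere in the tree
```

Because the root is assumed to be in memory already, finding key 15 in this height-1 tree logs a single simulated disk read, which matches the O(h) bound on page accesses.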
Splitting Node: Pseudocode
B-Tree-Split-Child(x,i,y)
1 z = Allocate-Node() // allocate a disk page
2 leaf[z] = leaf[y]
3 n[z] = t-1
4 for j = 1 to t-1
5     key_j[z] = key_{j+t}[y]
6 if not leaf[y]
7     for j = 1 to t
8         c_j[z] = c_{j+t}[y]
9 n[y] = t-1
  // shift children to the right
10 for j = n[x]+1 downto i+1
11     c_{j+1}[x] = c_j[x]
12 c_{i+1}[x] = z // add z as a new child

Splitting Node: Pseudocode (cont.)
  // make room for the median
13 for j = n[x] downto i
14     key_{j+1}[x] = key_j[x]
15 key_i[x] = key_t[y]
16 n[x]++
17 Disk-Write(y)
18 Disk-Write(z)
19 Disk-Write(x)
The CPU time is dominated by the loops in lines 4-5 and 7-8, which take Θ(t) time. Note that the other loops perform O(t) iterations. The procedure performs Θ(1) disk operations.

Inserting a Key: Algorithm
B-Tree-Insert(T,k)
1 r = root[T]
2 if n[r] = 2t-1 // full node
3     s = Allocate-Node()
4     root[T] = s
5     leaf[s] = FALSE
6     n[s] = 0
7     c_1[s] = r
      // split the old root
8     B-Tree-Split-Child(s,1,r)
9     B-Tree-Insert-Nonfull(s,k)
10 else
11    B-Tree-Insert-Nonfull(r,k)

Inserting a Key: Algorithm (cont.)
// Insert key k into a non-full node x
B-Tree-Insert-Nonfull(x,k)
1 i = n[x]
2 if leaf[x] // k is inserted in the ordered list
3     while i >= 1 and k < key_i[x]
4         key_{i+1}[x] = key_i[x]
5         i--
6     key_{i+1}[x] = k
7     n[x]++
8     Disk-Write(x)
9 else // search the leaf to insert into

Inserting a Key: Algorithm (cont.)
10    while i >= 1 and k < key_i[x]
11        i--
12    i++
13    Disk-Read(c_i[x])
14    if n[c_i[x]] = 2t-1 // full node
15        B-Tree-Split-Child(x,i,c_i[x])
16        if k > key_i[x]
17            i++
18    B-Tree-Insert-Nonfull(c_i[x], k)

Inserting a Key: Performance
• The number of disk accesses performed by B-Tree-Insert is O(h) for a B-tree of height h.
  – Only O(1) Disk-Read and Disk-Write operations are performed at each level in B-Tree-Insert-Nonfull.
• The total CPU time is O(t h) = O(t log_t n).
  – At each level of the tree the number of CPU operations is determined by the while loops in B-Tree-Insert-Nonfull.
    • The maximum number of iterations in these loops is 2t-1, hence the total time at each level is O(t).
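
The insertion procedures above can be put together in a short Python sketch. It is an in-memory illustration under stated assumptions, not the slides' disk-based version: the `Node` class, the helper names, the demo value `T = 2`, and the scrambled test keys are all hypothetical, indices are 0-based, and list slicing replaces the explicit copy loops of B-Tree-Split-Child.

```python
import math
from typing import List

T = 2  # minimum degree t (assumed value for this demo)


class Node:
    """Hypothetical in-memory B-tree node; no disk pages are involved."""

    def __init__(self, leaf: bool = True) -> None:
        self.leaf = leaf
        self.keys: List[int] = []
        self.children: List["Node"] = []


def split_child(x: Node, i: int, t: int = T) -> None:
    """Split the full child x.children[i] around its median key."""
    y = x.children[i]
    z = Node(leaf=y.leaf)
    median = y.keys[t - 1]
    z.keys = y.keys[t:]          # the t-1 largest keys move to z
    y.keys = y.keys[:t - 1]      # y keeps the t-1 smallest keys
    if not y.leaf:
        z.children = y.children[t:]
        y.children = y.children[:t]
    x.children.insert(i + 1, z)  # z becomes the child right after y
    x.keys.insert(i, median)     # the median moves up into x


def insert_nonfull(x: Node, k: int, t: int = T) -> None:
    """Insert k into the subtree rooted at x, which must not be full."""
    i = len(x.keys) - 1
    if x.leaf:
        x.keys.append(k)  # grow by one slot, then shift larger keys right
        while i >= 0 and k < x.keys[i]:
            x.keys[i + 1] = x.keys[i]
            i -= 1
        x.keys[i + 1] = k
        return
    while i >= 0 and k < x.keys[i]:
        i -= 1
    i += 1
    if len(x.children[i].keys) == 2 * t - 1:  # split a full child on the way down
        split_child(x, i, t)
        if k > x.keys[i]:
            i += 1
    insert_nonfull(x.children[i], k, t)


def insert(root: Node, k: int, t: int = T) -> Node:
    """Insert k and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:
        s = Node(leaf=False)
        s.children.append(root)
        split_child(s, 0, t)  # the only place where the tree grows in height
        root = s
    insert_nonfull(root, k, t)
    return root


def height(x: Node) -> int:
    return 0 if x.leaf else 1 + height(x.children[0])


def inorder(x: Node) -> List[int]:
    if x.leaf:
        return list(x.keys)
    out: List[int] = []
    for i, child in enumerate(x.children):
        out += inorder(child)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out


root = Node()
keys = [(7 * i) % 101 for i in range(100)]  # 100 distinct keys in scrambled order
for k in keys:
    root = insert(root, k)

n = len(keys)
bound = math.log((n + 1) / 2, T)  # Theorem 18.1: h <= log_t((n+1)/2)
```

After the loop, `inorder(root)` should return the keys in sorted order and `height(root)` should not exceed `bound`, which gives a quick sanity check of both the one-pass insertion and the height theorem.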