More Trees II 2-3-4 Trees, Red-Black Trees, B Trees 1 Objectives You will be able to Describe Red-Black trees. Describe B-Trees. 2 Red-Black Trees A way of constructing 2-3-4 trees from 2-nodes. Defined as a BST which: Has two kinds of links (red and black). Every path from root to a leaf node has same number of black links. No path from root to leaf node has more than two consecutive red links. Data member added to store color of link from parent. 3 Red-Black Trees enum ColorType {RED, BLACK}; class RedBlackTreeNode { public: DataType data; ColorType parentColor; // RED or BLACK RedBlackTreeNode * parent; RedBlackTreeNode * left; RedBlackTreeNode * right; } Now possible to construct a red-black tree to represent a 2-3-4 tree 4 Red-Black Trees We will convert each 3-Node in the 2-3-4 tree into two 2-Nodes. Each 4-Node into three 2-Nodes. 5 Red-Black Trees 2-3-4 tree represented by red-black trees as follows: Make a link black if it is a link in the 2-3-4 tree. Make a link red if it connects nodes containing values in same node of the 2-3-4 tree. Some authors use h and v instead of red and black. h (horizontal) links connect nodes from the same node of the 2-3-4 tree. v (vertical) links are links from the 2-3-4 tree. 6 Example 2-3-4 tree Corresponding red-black tree 59 55 7 Adding to a Red-Black Tree 1. 2. Do top-down insertion as with 2-3-4 tree Search for place to insert new node – Keep track of parent, grandparent, great grandparent. When 4-node q encountered, split as follows: a. Change both links of q to black b. Change link from parent to red: 8 Adding to a Red-Black Tree 3. If now two consecutive red links, (from grandparent gp to parent p to q) Perform appropriate AVL type rotation determined by direction (left, right, left-right, right-left) from gp -> p-> q 9 Example Let's convert the quiz solution into a Red-Black tree. 10 After Adding WY MA GA DE IL IN PA TX MI NY OH RI VT WY 11 Corresponding Red-Black Tree MA GA DE IL IN TX PA MI NY OH RI VT WY 12 Add NM MA GA DE IL IN TX PA MI NY OH RI VT WY Will be right child of MI We have to split the "4-node" of MI-NY-OH 13 After Adding NM MA GA DE IL IN NY MI NM TX PA OH RI VT WY End of Section 14 B-Trees Drozdek Chapter 7 15 B-Trees Etymology unknown Rudolf Bayer and Ed McCreight invented the B-tree while working at Boeing Research Labs in 1971, but they did not explain what, if anything, the B stands for. Balanced Trees ? Bayer Trees ? Boeing Trees ? See http://en.wikipedia.org/wiki/B-tree Many variations: B+ Trees, B* Trees See Drozdek Chapter 7 16 B-Trees Previous trees used in internal searching schemes B-trees are intended for external searching Tree sufficiently small to be all in memory Data stored in secondary memory Each node is a block on disk. Typically the "data" in a node is really a pointer. B-tree of order m has properties: The root has at least two subtrees unless it is a leaf. Each node stores at most m – 1 data values and has at most m links to subtrees. Each internal node stores at least ceil(m/2) data values. All leaves on same level 17 B-Trees A 2-3-4 tree is a B-tree of order 4 Note example below of order 5 B-tree Best performance for disk storage found to be with values for 50 ≤ m ≤ 400 End of Presentation 18