The Design and Analysis of Algorithms
Chapter 6: Transform and Conquer
2-3-4 Trees, Red-Black Trees

Chapter 6. 2-3-4 Trees, Red-Black Trees
• Basic Idea
• 2-3-4 Trees: Definition, Operations, Complexity
• Red-Black Trees: Definition, Properties, Insertion

Basic Idea
• Disadvantage of binary search trees: the worst-case complexity is O(N)
• Solutions to the problem:
    • AVL trees: keep the tree balanced
    • Multi-way search trees: decrease the number of tree levels (2-3 and 2-3-4 trees). Disadvantage: nodes with different structure
    • Red-black trees: use the advantages of 2-3-4 trees with binary nodes

Multi-Way Trees
• A node with n links holds n-1 ordered keys: $k_1 < k_2 < \dots < k_{n-1}$
• From left to right, the subtrees hold the keys $< k_1$, the keys in $[k_1, k_2)$, and so on up to the keys $\ge k_{n-1}$

2-3-4 Trees - Definition
Three types of nodes:
• 2-node: contains one key, has two links
• 3-node: contains two ordered keys, has three links
• 4-node: contains three ordered keys, has four links
All leaves must be on the same level, i.e. the tree is perfectly height-balanced. This is achieved by allowing more than one key in a node.

2-3-4 Trees - Example
[Figure: example 2-3-4 tree; node labels as extracted: J D B P L N FH K M R TW O Q S U V X]

2-3-4 Trees - Operations
• Search: straightforward. Start comparing with the root and branch accordingly.
• Insert: the new key is inserted at the lowest internal level.

Insert in a 2-node
The 2-node becomes a 3-node.
[Figure: inserting P into the 2-node M gives the 3-node MP]

Insert in a 3-node
The 3-node becomes a 4-node.
[Figure: inserting R into the 3-node MP gives the 4-node MPR]

Insert in a 4-node
Bottom-up Insertion: Promotion
• The 4-node is split, and the middle element is moved up, i.e. inserted in the parent node.
• The process is called promotion and may continue up to the top of the tree.
• If the 4-node is the root (no parent), then a new root is created.
• After the split, the insertion proceeds as in the previous cases.

Insert in a 4-node - Example
[Figure: splitting a 4-node and promoting its middle key; node labels as extracted: N GN C FG L CF L]

Top-down Insertion
On our way down the tree, whenever we reach a 4-node, we break it up into two 2-nodes and move the middle element up into the parent node.
In this way we make sure there will be room for the new key.

Complexity of Search and Insert
Height of the tree:
• A 2-3-4 tree with the minimum number of keys corresponds to a perfect binary tree:
  $N \ge 1 + 2 + \dots + 2^h = 2^{h+1} - 1$, so $h \le \log_2(N+1) - 1$
• A 2-3-4 tree with the maximum number of keys corresponds to a perfect 4-ary tree:
  $N \le 3(1 + 4 + 4^2 + \dots + 4^h) = 3 \cdot \frac{4^{h+1} - 1}{3} = 4^{h+1} - 1$, so $4^{h+1} \ge N + 1$ and $h \ge \log_4(N+1) - 1 = \frac{1}{2}\log_2(N+1) - 1$
• Therefore $h = \Theta(\log N)$

Complexity of Search and Insert
• A search visits O(log N) nodes
• An insertion requires O(log N) node splits
• Each node split takes constant time
• Hence, operations Search and Insert each take time O(log N)

Red-Black Trees - Definition
• Edges are colored red or black
• No two consecutive red edges on any root-leaf path
• The same number of black edges on every root-leaf path (this number is the black height of the tree)
• Edges connecting leaves are black

2-3-4 and Red-Black Trees
2-3-4 tree      red-black tree
2-node          2-node
3-node          two nodes connected with a red link (left or right)
4-node          three nodes connected with red links
[Figure: a 3-node and its red-black equivalents; node labels as extracted: G GN N F CF L C L]
[Figure: a 4-node and its red-black equivalent; node labels as extracted: N G NP P G C L C L O]

2-3-4 and Red-Black Trees
[Figure: a 2-3-4 tree and the alternative red-black trees that represent it]

Red-Black Trees
$\frac{1}{2}\log_2(N+1) \le B \le \log_2(N+1)$
$\log_2(N+1) \le H \le 2\log_2(N+1)$
where:
• N is the number of internal nodes
• L is the number of leaves (L = N + 1)
• H is the height
• B is the black height (count the black edges only)
This implies that searches take time O(log N).

Red-Black Trees: Insertion
• Perform a standard search to find the leaf where the key should be added
• Replace the leaf with an internal node holding the new key
• Color the incoming edge of the new node red
• Add two new leaves, and color their incoming edges black
• If the parent had an incoming red edge, we now have two consecutive red edges. We must reorganize the tree to remove that violation. What must be done depends on the sibling of the parent.
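The violation described above (two consecutive red edges) and the equal-black-height condition from the definition can be checked mechanically. The following is a minimal Python sketch, not taken from the slides: the names `Node`, `RED`, `BLACK` and `black_height` are illustrative assumptions. Each node stores the color of its incoming edge, and `None` stands for a leaf, whose incoming edge is always black.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"


@dataclass
class Node:
    key: int
    color: str = RED                 # color of the edge coming INTO this node
    left: Optional["Node"] = None    # None stands for a leaf
    right: Optional["Node"] = None


def black_height(node: Optional[Node], incoming_red: bool = False) -> int:
    """Return the number of black edges on every path from `node` to a leaf.

    Raises ValueError if two consecutive red edges are found, or if two
    paths below `node` contain different numbers of black edges.
    """
    if node is None:                 # a leaf has no edges below it
        return 0
    heights = []
    for child in (node.left, node.right):
        # Edges to leaves are black by definition.
        child_red = child is not None and child.color == RED
        if incoming_red and child_red:
            raise ValueError(f"two consecutive red edges at key {node.key}")
        heights.append(black_height(child, child_red) + (0 if child_red else 1))
    if heights[0] != heights[1]:
        raise ValueError(f"unequal black heights below key {node.key}")
    return heights[0]


# Usage: a 3-node drawn as a black-rooted node 14 with a red left child 7.
root = Node(14, BLACK, left=Node(7, RED))
print(black_height(root))            # 1
```

Hanging a second red node under key 7 (for example `root.left.left = Node(3, RED)`) reproduces the situation right after an insertion below a red parent, and the checker raises `ValueError`.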
Restructuring
• Case: the incoming edge of p is red and the incoming edge of its sibling is black
• Fix: single rotation
(g: grandparent, p: parent, n: new node)
[Figure: single rotation of p above g]

Restructuring
Double rotations: the new node is between its parent and grandparent in the inorder sequence.
[Figure: left-right double rotation]

Restructuring
[Figure: right-left double rotation]

Promotion: bottom-up rebalancing
• Case: the incoming edge of p is red and the incoming edge of its sibling is also red
• Recolor: the incoming edges of p and its sibling become black, and the incoming edge of g becomes red
• The black depth remains unchanged for all of the descendants of g
• This process continues upward beyond g if necessary: rename g as n and repeat
[Figure: promotion at g]
• Promotions may continue up the tree and are executed O(log N) times
• The time complexity of an insertion is O(log N)
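The insertion steps and the two repair cases (restructuring by rotation, promotion by recoloring) can be put together in a short Python sketch. This is an illustration under stated assumptions, not the slides' own code: the class and method names (`Node`, `RedBlackTree`, `_rotate`, `insert`, `inorder`) are hypothetical, each node stores the color of its incoming edge, `None` stands for a leaf, duplicate keys are ignored, and the root (which has no incoming edge) is simply kept black.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.color = RED          # color of the edge coming INTO this node
        self.parent = parent
        self.left = None          # None stands for a leaf (black edge)
        self.right = None


def is_red(node):
    return node is not None and node.color == RED


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, child):
        """Lift `child` above its parent, preserving the inorder sequence."""
        parent, grand = child.parent, child.parent.parent
        if parent.left is child:
            parent.left, child.right = child.right, parent
            moved = parent.left
        else:
            parent.right, child.left = child.left, parent
            moved = parent.right
        if moved is not None:
            moved.parent = parent
        parent.parent, child.parent = child, grand
        if grand is None:
            self.root = child
        elif grand.left is parent:
            grand.left = child
        else:
            grand.right = child

    def insert(self, key):
        # Standard search for the leaf where the key belongs.
        parent, node = None, self.root
        while node is not None:
            if key == node.key:
                return            # assumption: duplicates are ignored
            parent, node = node, (node.left if key < node.key else node.right)
        n = Node(key, parent)     # new internal node, incoming edge red
        if parent is None:
            self.root = n
        elif key < parent.key:
            parent.left = n
        else:
            parent.right = n
        # Repair two consecutive red edges, moving upward if needed.
        while is_red(n.parent):
            p, g = n.parent, n.parent.parent
            sibling = g.right if g.left is p else g.left
            if is_red(sibling):
                # Promotion: recolor and continue with g renamed as n.
                p.color = sibling.color = BLACK
                g.color = RED
                n = g
            else:
                # Restructuring: n between p and g needs a double rotation.
                if (g.left is p) != (p.left is n):
                    self._rotate(n)
                    n, p = p, n   # the old n is now the middle node
                self._rotate(p)
                p.color, g.color = BLACK, RED
                break
        self.root.color = BLACK   # the root has no incoming red edge

    def inorder(self):
        out, stack, node = [], [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            out.append(node.key)
            node = node.right
        return out


tree = RedBlackTree()
for key in [10, 20, 30, 15, 25, 5, 1]:
    tree.insert(key)
print(tree.inorder())             # [1, 5, 10, 15, 20, 25, 30]
```

In this sketch a restructuring ends the repair loop immediately, while a promotion may repeat once per level. This matches the O(log N) bound on the number of promotions stated above.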