Balanced Trees (AVL and Red-Black)

Binary Search Trees
• Optimal behavior
  ▫ O(log2 N): perfectly balanced tree (e.g. a complete tree with all levels filled)
• Degenerate behavior
  ▫ O(N): the tree becomes a list
• How to avoid degenerate trees?
  ▫ Keep them perfectly balanced
    - Not possible in general
  ▫ Keep them balanced within some criteria
    - Maintain balance over insertions and deletions

Balanced Trees
• Balanced binary trees
  ▫ AVL Tree
  ▫ Red-Black Tree
• Balanced N-ary trees
  ▫ B-Trees
  ▫ B+-Trees

Height of BST
• A BST is not necessarily balanced
• E.g., a BST on the keys 5, 12, 23, 35, 65, 85
  [Figure: example BST on these keys]

AVL Trees
Introduced by two Soviet mathematicians, Adelson-Velskii and Landis, in 1962.
Mathematical definition: If T is a nonempty binary search tree, then T is an AVL tree if and only if its left and right subtrees are AVL trees and |left height - right height| ≤ 1.
"In English" definition: A binary search tree that is somewhat balanced, but not necessarily complete or full.

AVL Trees?
[Figure: example trees to classify as AVL or not; node keys 45, 20, 10, 10, 70, 30, 60, 50, 7, 3, 15, 9, 13, 11, 30]

AVL Trees
Representation: Use the linked representation, with an addition to the node class. The addition is a balance factor (bf), defined for a node x as:
bf(x) = height of x's left subtree - height of x's right subtree
In an AVL tree, the only possible balance factors are -1, 0, and 1.

AVL Trees
Insertion: Using the standard insertion logic for a binary search tree can violate the AVL property. In other words, the tree can become unbalanced.

AVL Trees
Example (balance factors in parentheses):
• Before: JFK (+1) has left child DCA (-1) and right child ORD (0); DCA has right child GCM (0).
• Adding the node "DUS" places it as the left child of GCM and results in: JFK (+2), DCA (-2), GCM (+1), DUS (0), ORD (0).

AVL Trees
• When an insertion is performed, the balancing information for ALL of the nodes on the path back to the root must be updated.
• If the AVL property is violated, it can be restored by a simple modification to the tree, known as a rotation.
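The balance-factor definition and the DUS example above can be checked with a short Python sketch. This is an illustration, not part of the original slides: the names Node, height, bf, bst_insert, is_avl, and factors are assumptions, height counts nodes (an empty subtree has height 0), and insertion is the plain BST insertion with no rebalancing. The AVL check uses |bf| ≤ 1, which matches the allowed balance factors -1, 0, and 1.

```python
class Node:
    """BST node (hypothetical helper); balance factors are derived from heights."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def height(node):
    # Height counted in nodes: an empty subtree is 0, a single node is 1.
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

def bf(node):
    # bf(x) = height of x's left subtree - height of x's right subtree
    return height(node.left) - height(node.right)

def bst_insert(root, key):
    # Standard BST insertion, with no rebalancing at all.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root

def is_avl(node):
    # Every node must have a balance factor of -1, 0, or 1.
    if node is None:
        return True
    return abs(bf(node)) <= 1 and is_avl(node.left) and is_avl(node.right)

def factors(node, out=None):
    # Collect {key: balance factor} for every node in the tree.
    out = {} if out is None else out
    if node is not None:
        out[node.key] = bf(node)
        factors(node.left, out)
        factors(node.right, out)
    return out

root = None
for code in ["JFK", "DCA", "ORD", "GCM"]:
    root = bst_insert(root, code)
before = factors(root)           # JFK: 1, DCA: -1, ORD: 0, GCM: 0

root = bst_insert(root, "DUS")   # DUS becomes the left child of GCM
after = factors(root)            # JFK: 2, DCA: -2, GCM: 1, DUS: 0, ORD: 0
print(before, after, is_avl(root))
```

Recomputing heights on every call is quadratic and only suits a small demonstration; a real AVL node would store its height or balance factor and update it along the insertion path.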
AVL Trees
To fix the tree, start at the inserted node and trace its path back to the root, stopping at the first node that violates the AVL property (call this node A). There are four possible violations:
1. Inserted into the left subtree of A's left child
2. Inserted into the right subtree of A's right child
3. Inserted into the right subtree of A's left child
4. Inserted into the left subtree of A's right child

AVL Trees
1. Inserted into the left subtree of A's left child (call this child B). This problem is fixed by applying a single right rotation.
- Change the parent of A to point to B
- Make A's left link equal to B's right link
- Make B's right link point to A
[Figure: before, A = 10 has left child B = 7, which has left child 3; after the rotation, B = 7 is the subtree root with children 3 and A = 10]

AVL Trees
2. Inserted into the right subtree of A's right child (call this child B). This problem is fixed by applying a single left rotation.
- Change the parent of A to point to B
- Make A's right link equal to B's left link
- Make B's left link point to A
[Figure: before, A = 45 has right child B = 55, which has right child 65; after the rotation, B = 55 is the subtree root with children A = 45 and 65]

AVL Trees
Let's build a tree using the nodes 3, 2, 1, 4, 5, 6, 7.
• Insert 3, 2, 1: node 3 becomes unbalanced (case 1); a single right rotation makes 2 the root with children 1 and 3.
• Insert 4: it becomes the right child of 3; no rotation is needed.
• Insert 5: node 3 becomes unbalanced (case 2); a single left rotation makes 4 the parent of 3 and 5.
• Insert 6: the root 2 becomes unbalanced (case 2); a single left rotation, carried out in the three steps listed for case 2, makes 4 the root with children 2 and 5, and 3 becomes the right child of 2.
• Insert 7: node 5 becomes unbalanced (case 2); a single left rotation makes 6 the parent of 5 and 7.
[Figure: final tree with root 4; 4 has children 2 and 6; 2 has children 1 and 3; 6 has children 5 and 7]

AVL Trees
3. Inserted into the right subtree of A's left child (call this child B). This problem is fixed by applying a left-right rotation:
- Let C be B's right child; make A's left link point to C
- Make B's right link equal to C's left link
- Make C's left link point to B
- Change the parent of A to point to C
- Make A's left link equal to C's right link
- Make C's right link point to A
[Figure: before, A = 15 has left child B = 7, which has right child C = 12; after the first three steps, A = 15 has left child C = 12, which has left child B = 7; after the last three steps, C = 12 is the subtree root with children B = 7 and A = 15]

AVL Trees
4. Inserted into the left subtree of A's right child (call this child B). This problem is fixed by applying a right-left rotation.
- Let C be B's left child; make A's right link point to C
- Make B's left link equal to C's right link
- Make C's right link point to B
- Change the parent of A to point to C
- Make A's right link equal to C's left link
- Make C's left link point to A
[Figure: before, A = 15 has right child B = 32, which has left child C = 21; after the first three steps, A = 15 has right child C = 21, which has right child B = 32; after the last three steps, C = 21 is the subtree root with children A = 15 and B = 32]

AVL Trees
Let's build onto the previous example by adding the nodes 17, 16, 15, 13, 14.
• Insert 17: it becomes the right child of 7; no rotation is needed.
• Insert 16: node 7 becomes unbalanced (case 4); a right-left rotation makes 16 the parent of 7 and 17.
• Insert 15: node 6 becomes unbalanced (case 4); a right-left rotation makes 7 the parent of 6 and 16, with 5 below 6 and with 15 and 17 below 16.
• Insert 13: the root 4 becomes unbalanced (case 2); a single left rotation makes 7 the root with children 4 and 16, and 6 becomes the right child of 4.
• Insert 14: the last insertion of the example:
[Figure: inserting 14 unbalances node 15 (case 3); a left-right rotation makes 14 the parent of 13 and 15. Final tree: root 7 with children 4 and 16; 4 has children 2 and 6; 2 has children 1 and 3; 6 has left child 5; 16 has children 14 and 17; 14 has children 13 and 15]

Red-Black Trees
• Similar idea to AVL trees
• Similar worst-case time complexity
• Keep the tree somewhat balanced
  ▫ Not as rigidly balanced as AVL
  ▫ A red-black tree with n nodes has height at most 2 log2(n+1), so the common operations run in O(log n) time.
  ▫ The longest path is no more than twice as long as the shortest path

Red-Black Trees
• A red-black tree is an extended binary tree (every node has two children or is a leaf) which satisfies the following properties:
  ▫ Root property: the root is black (if the tree is non-empty)
  ▫ Red property: if a node is red, its parent must be black
  ▫ Path property: every path from the root to a leaf/single-child node contains the same number of black nodes. This number is called the black-height of the tree.

Red-Black?
[Figures: four example trees to classify as red-black or not, with yes/no answers; the trees are not recoverable]

Properties Revisited
  ▫ Root property, red property, and path property, as defined above

Adding null leaf nodes
• Every node has 2 children
  ▫ Don't include null nodes in the black-height of the tree

Properties Revisited
• Objective:
  ▫ Repair a red property violation while preserving the other properties (i.e. the path property).
  ▫ Techniques: re-color and/or rotate.

Property Restoration
• Invariant:
  ▫ There is exactly one red node x in the tree whose parent may be red.
  ▫ When a new node is added, it is always added as a red node.
• Strategy:
  ▫ Fix the violation of the red property at x.
  ▫ This may violate the red property at some ancestor.
  ▫ Continue fixing the property at that ancestor (the ancestor becomes x).

Property Restoration
• Terminology:
  ▫ x is the current node in violation of the red property
    - It is red, and its parent is also red
  ▫ parent(x) is its parent
  ▫ parent(parent(x)) is its grandparent (i.e.
grandparent(x))
  ▫ The sibling of parent(x) is its uncle (i.e. uncle(x))

Property Restoration
There are several cases:
• Case 1: x is the root
• Case 2: parent(x) is black
• Case 3: parent(x) and uncle(x) are both red
• Case 4: parent(x) is red, but uncle(x) is black
  ▫ Multiple parts to this case

Property Restoration
Case 1: x is the root
Change its color to black.

Property Restoration
Case 2: parent(x) is black
Don't do anything.

Property Restoration
Case 3: parent(x) and uncle(x) are both red
Change the color of the parent and the uncle to black. Change the color of the grandparent to red, and repeat with x = grandparent(x).
[Figure: g with children p and u, and x below p, before and after re-coloring]

Property Restoration
Case 4: parent(x) is red, but uncle(x) is black
There are 4 different sub-cases, much like the 4 different AVL cases:
• left-left: p is the left child of g, and x is the left child of p
• left-right: p is the left child of g, and x is the right child of p
• right-left: p is the right child of g, and x is the left child of p
• right-right: p is the right child of g, and x is the right child of p
left-left and right-right are symmetric; left-right and right-left are symmetric.
[Figure: the four arrangements of g, p, u, and x]

Property Restoration
Case 4 left-left: First rotate the parent RIGHT, so that p moves up over g. Then re-color by switching the colors of p and g.
Case 4 right-right is symmetric: simply rotate LEFT instead.
[Figure: g with children p and u, x below p, and subtrees 1 to 5; after the rotation, p is the subtree root with children x and g, and u stays below g]

Property Restoration
Case 4 left-right: First rotate x LEFT, so that x moves up over p. Then rotate RIGHT, so that x moves up over g, and re-color by switching the colors of x and g. This is just like the left-left case, but with p and x switching roles.
Case 4 right-left is symmetric: simply rotate RIGHT, then LEFT.
[Figure: after the first rotation, x is the left child of g with p below it; after the second rotation, x is the subtree root with children p and g, and u stays below g]

Insertion Complexity
• Observations:
  ▫ Each of the cases takes constant time.
  ▫ Case 3 brings x two steps closer to the root and only re-colors nodes, without any rotations.
  ▫ For the sub-cases of case 4, one or two rotations plus a re-coloring are required, and then we are done.
• Lemma:
  ▫ An insertion into a red-black tree with n nodes takes O(log2 n) time and performs at most two rotations.

Insertion Example
• Snapshots taken from: http://www.ececs.uc.edu/~franco/C321/html/RedBlack/redblack.html
  ▫ Note that their rules are numbered differently than ours
• Insert 50, 30, 70 (recall that new nodes are always inserted as red):
  ▫ Case 1 for 50, case 2 for 30 and 70
  ▫ Null nodes (all black) are not shown

Insertion Example
• Insert 50, 30, 70, 65
  ▫ Red property violation: the parent (70) and uncle (30) are both red (case 3)
  ▫ Re-color the parent and uncle black and the grandparent (50) red

Insertion Example
  ▫ Continue with the grandparent (50) as x
  ▫ The root must be black (case 1): color it black

Insertion Example
• Insert 50, 30, 70, 65, 80
  ▫ No violations (case 2)

Insertion Example
• Insert 50, 30, 70, 65, 80, 75
  ▫ Case 3: the uncle (65) and parent (80) are both red; re-color them black and the grandparent (70) red

Insertion Example
• Insert 50, 30, 70, 65, 80, 75
  ▫ Continue recursively with the grandparent (70) as x
  ▫ Case 2: no violations

Insertion Example
• Insert 50, 30, 70, 65, 80, 75, 68, 67
  ▫ Inserting 68 causes no violation (case 2); inserting 67 does
  ▫ Case 4: parent & uncle have a different
color: right-left
  ▫ Balance: first rotate 67 right, then left

Insertion Example
• Insert 50, 30, 70, 65, 80, 75, 68, 67
  ▫ Flip the colors of G (65) and X (67)
[Figure: tree after inserting 67; 67 is black with red children 65 and 68]

Insertion Example
• Insert 50, 30, 70, 65, 80, 75, 68, 67, 55
  ▫ Case 3: the parent (65) and uncle (68) are both red; re-color them black and the grandparent (67) red

Insertion Example
• Insert 50, 30, 70, 65, 80, 75, 68, 67, 55
  ▫ Continue recursively with 67 as the new x
  ▫ Case 4: 70 (parent, red) and 30 (uncle, black) have different colors: right-left
  ▫ First rotate 67 right
  ▫ Then rotate 67 left
  ▫ Then flip the colors of 67 and 50
  ▫ Done!
[Figure: final tree with black root 67 and red children 50 and 70]

Benefits/Usage of Red-Black Trees
• Properties
  ▫ Search, insert, and remove run in O(log2 N) time; the worst case is Θ(log2 N)
• Usage
  ▫ Underlying structure for maps, sets, multimaps, and multisets in common Standard Template Library implementations (hash maps and hash sets use hash tables instead)
  ▫ The Completely Fair Scheduler (CFS) in Linux

Comparison of Red-Black and AVL
• Both are binary search trees
• Both have the same worst-case big-O time complexity for:
  ▫ Search
  ▫ Insert
  ▫ Delete
• AVL trees are more rigidly balanced
  ▫ AVL trees are slightly faster for lookup
  ▫ AVL trees are more costly for insert and delete
• Red-black trees are the more common choice in implementations
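The red-black insertion cases described earlier can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the slides' own implementation: the names RedBlackTree, _rotate_up, _fix, and black_height are hypothetical, null leaves are represented by None and treated as black, and a single helper rotates a node above its parent (a right rotation if it is a left child, a left rotation otherwise). The sketch replays the insertion example 50, 30, 70, 65, 80, 75, 68, 67, 55 and checks the root, red, and path properties after every insertion.

```python
RED, BLACK = "R", "B"

class Node:
    def __init__(self, key):
        self.key, self.color = key, RED      # new nodes are always inserted red
        self.left = self.right = self.parent = None

class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate_up(self, x):
        """Rotate x above its parent: right if x is a left child, else left."""
        p, g = x.parent, x.parent.parent
        if x is p.left:
            p.left, x.right = x.right, p
            if p.left is not None:
                p.left.parent = p
        else:
            p.right, x.left = x.left, p
            if p.right is not None:
                p.right.parent = p
        p.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

    def insert(self, key):
        x, parent, cur = Node(key), None, self.root
        while cur is not None:               # ordinary BST descent
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        x.parent = parent
        if parent is None:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        self._fix(x)

    def _fix(self, x):
        while True:
            p = x.parent
            if p is None:                    # Case 1: x is the root
                x.color = BLACK
                return
            if p.color == BLACK:             # Case 2: nothing to do
                return
            g = p.parent                     # exists, since a red node is never the root
            u = g.right if p is g.left else g.left
            if u is not None and u.color == RED:
                p.color = u.color = BLACK    # Case 3: re-color and move up
                g.color = RED
                x = g
                continue
            # Case 4: the uncle is black (None counts as a black null leaf)
            if (x is p.left) != (p is g.left):   # left-right or right-left
                self._rotate_up(x)               # first rotation: x moves above p
                x, p = p, x                      # p and x switch roles
            self._rotate_up(p)                   # p moves above g
            p.color, g.color = BLACK, RED        # switch the colors of p and g
            return

def black_height(node):
    """Return the black-height, asserting the red and path properties."""
    if node is None:
        return 0                             # null leaves are not counted
    if node.color == RED:
        for child in (node.left, node.right):
            assert child is None or child.color == BLACK, "red property violated"
    lh, rh = black_height(node.left), black_height(node.right)
    assert lh == rh, "path property violated"
    return lh + (1 if node.color == BLACK else 0)

tree = RedBlackTree()
for key in [50, 30, 70, 65, 80, 75, 68, 67, 55]:
    tree.insert(key)
    assert tree.root.color == BLACK          # root property
    black_height(tree.root)                  # red and path properties

# The final tree has black root 67 with red children 50 and 70.
print(tree.root.key, tree.root.left.key, tree.root.right.key)
```

Using one "rotate up" helper instead of separate left and right rotations keeps the four sub-cases of case 4 in a few lines; an implementation with explicit left and right rotations would behave the same way.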