Introduction to Algorithms
Jiafen Liu
Sept. 2013

Today's Tasks
Balanced Search Trees
• Red-black trees
  – Height of a red-black tree
  – Rotations
  – Insertion

Balanced Search Trees
• Balanced search tree: a search-tree data structure for which a height of O(lg n) is guaranteed when implementing a dynamic set of n items.
• Examples:
  – AVL tree
  – 2-3-4 tree
  – B-tree
  – Red-black tree

Red-black trees
• This data structure requires an extra 1-bit color field in each node.
• Red-black properties:
  – (1) Every node is either red or black.
  – (2) The root and leaves (NIL's) are black.
  – (3) If a node is red, then its parent is black.
  – (4) All simple paths from any node x to a descendant leaf have the same number of black nodes = black-height(x).

Example of a red-black tree
• Convention: the black-height of x does not include x itself.
• A path may contain several black nodes in a row, but never two red nodes in a row.

Goals of red-black trees
• There are a couple of goals that we are trying to achieve.
  – These properties should force the tree to have logarithmic height, O(lg n).
  – The properties should also be easy to maintain.
• We can create a tree in the beginning that has these properties.
• A perfectly balanced binary tree with all nodes black satisfies them.
• The tricky part is to maintain them when we make changes to the tree.

Height of a red-black tree
• Let's look at the height of a red-black tree, and we will start to see where these properties come from.
• Theorem. A red-black tree with n keys has height h ≤ 2 lg(n + 1).
  – Can be proved by induction, as in the book, p. 164.
  – Another, informal way: intuition.

Intuition
• Merge red nodes into their black parents.
• This process produces another type of balanced search tree: a 2-3-4 tree.
  – Any guesses why it's called a 2-3-4 tree?
  – Another nice property:
• All of the leaves have the same depth. Why?
• By Property 4.

Height of merged tree h'
• Now we will bound the height h' of the merged tree.
• The first question: how many leaves are there in a red-black tree?
  – #(leaves) = #(internal nodes) + 1 = n + 1
  – It can also be proved by induction. Try it.

Height of Red-Black Tree
• The number of leaves in each tree is n + 1.
• Every internal node of the merged tree has 2 to 4 children and all leaves are at depth h', so 2^(h') ≤ n + 1 ≤ 4^(h').
• Taking logarithms of the left inequality: h' ≤ lg(n + 1).
• How to connect h' and h?
• By Property 3, at most half of the nodes on any root-to-leaf path are red, so h' ≥ h/2.
• Therefore h ≤ 2h' ≤ 2 lg(n + 1).

Query operations
• Corollary. The queries SEARCH, MIN, MAX, SUCCESSOR, and PREDECESSOR all run in O(lg n) time on a red-black tree with n nodes.

Modifying operations
• The operations INSERT and DELETE cause modifications to the red-black tree.
• How to INSERT and DELETE a node in a red-black tree?
  – The first thing we do is just use the BST operation.
  – It preserves the binary-search-tree property, but does not necessarily preserve balance.
  – The second thing is to set the color of the new internal node, to preserve Property 1.
  – We color it red, so Property 3 may no longer hold (the new node's parent may be red).
  – The good news is that Property 4 is still true.

How to fix Property 3?
• We are going to move the violation of Property 3 up the tree.
• IDEA: Only red-black Property 3 might be violated. Move the violation up the tree by recoloring until it can be fixed with rotations and recoloring.
• We have to restructure the links of the tree via "rotation".

Rotations
• Rotations maintain the inorder ordering of keys: a ∈ α, b ∈ β, c ∈ γ ⇒ a ≤ A ≤ b ≤ B ≤ c.
• A rotation can be performed in O(1) time.
  – Because we only change a constant number of pointers.

Implementation of rotation
• See the textbook, p. 166.

Example
• Insert 15.
• Color it red.
• Handle Property 3 by recoloring.
• Failed! Recoloring alone is not enough.
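
The rotation described above can be sketched in Python. This is a minimal illustration, not the textbook's code: the `Node` class, the `inorder` helper, and the choice to return the (possibly new) root from each rotation are assumptions made for this sketch, and `None` stands in for the NIL leaves.

```python
class Node:
    """Plain BST node with a parent pointer (hypothetical helper class)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def left_rotate(root, x):
    """LEFT-ROTATE at x; returns the root of the whole tree afterwards."""
    y = x.right                  # y moves up, x becomes its left child
    x.right = y.left             # middle subtree (beta) moves from y to x
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:         # x was the root
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    return root


def right_rotate(root, y):
    """RIGHT-ROTATE at y; mirror image of left_rotate."""
    x = y.left                   # x moves up, y becomes its right child
    y.left = x.right             # middle subtree (beta) moves from x to y
    if x.right is not None:
        x.right.parent = y
    x.parent = y.parent
    if y.parent is None:         # y was the root
        root = x
    elif y is y.parent.left:
        y.parent.left = x
    else:
        y.parent.right = x
    x.right = y
    y.parent = x
    return root


def inorder(node):
    """Keys in sorted order; used to confirm a rotation keeps the ordering."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Each rotation touches only a constant number of pointers, which is why it runs in O(1) time, and calling `inorder` before and after a rotation should give the same list of keys.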
RIGHT-ROTATE
• RIGHT-ROTATE(18)
• What does the tree turn into?
• There is still a violation between 10 and 18, but the path is now straighter.
• The tree doesn't look more balanced than before.
• We cannot resolve the violation by recoloring. Then?

LEFT-ROTATE
• LEFT-ROTATE(7)
• What does the tree turn into?
• LEFT-ROTATE(7) and recolor at the same time.
• The result satisfies all 4 properties.

Pseudocode of insertion
RB-INSERT(T, x)
    TREE-INSERT(T, x)
    color[x] ← RED    // only RB property 3 can be violated
    while x ≠ root[T] and color[p[x]] = RED
        do if p[x] = left[p[p[x]]]
              then y ← right[p[p[x]]]    // y = aunt or uncle of x
                   if color[y] = RED
                      then 〈Case 1〉

Case 1
• Let each triangle in the figure denote a subtree with a black root. All of these subtrees have the same black-height.
• How? Push C's black onto A and D, and recurse.
• Caution: We don't know whether B was the right child or the left child. In fact, it doesn't matter.

Pseudocode of insertion
RB-INSERT(T, x)
    TREE-INSERT(T, x)
    color[x] ← RED    // only RB property 3 can be violated
    while x ≠ root[T] and color[p[x]] = RED
        do if p[x] = left[p[p[x]]]
              then y ← right[p[p[x]]]    // y = aunt/uncle of x
                   if color[y] = RED
                      then 〈Case 1〉
                      else if x = right[p[x]]    // y is black; x, p[x], p[p[x]] form a zigzag
                              then 〈Case 2〉    // Case 2 falls through into Case 3

Case 2
• How? We want a straight path from x up to its grandparent.
• After a left rotation at p[x], the zigzag becomes a zigzig, which is Case 3.

Pseudocode of insertion
RB-INSERT(T, x)
    TREE-INSERT(T, x)
    color[x] ← RED    // only RB property 3 can be violated
    while x ≠ root[T] and color[p[x]] = RED
        do if p[x] = left[p[p[x]]]
              then y ← right[p[p[x]]]    // y = aunt/uncle of x
                   if color[y] = RED
                      then 〈Case 1〉
                      else if x = right[p[x]]
                              then 〈Case 2〉    // Case 2 falls through into Case 3
                           〈Case 3〉    // straight path

Case 3
• How? Right-rotate at p[p[x]] and recolor.
• Done! No more violations of RB property 3 are possible.
• Property 4 is also preserved.
Pseudocode of insertion
RB-INSERT(T, x)
    TREE-INSERT(T, x)
    color[x] ← RED    // only RB property 3 can be violated
    while x ≠ root[T] and color[p[x]] = RED
        do if p[x] = left[p[p[x]]]
              then y ← right[p[p[x]]]    // y = aunt/uncle of x
                   if color[y] = RED
                      then 〈Case 1〉
                      else if x = right[p[x]]
                              then 〈Case 2〉    // Case 2 falls through into Case 3
                           〈Case 3〉
              else 〈"then" clause with "left" and "right" swapped〉
    color[root[T]] ← BLACK

Analysis
• Go up the tree performing Case 1, which only recolors nodes.
• If Case 2 or Case 3 occurs, perform 1 or 2 rotations, and terminate.
• Running time of insertion?
  – O(lg n) with O(1) rotations.
• RB-DELETE: same asymptotic running time and number of rotations as RB-INSERT (see textbook).
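
The insertion pseudocode and its analysis can be tried out with a short Python sketch. It is an illustration under stated assumptions, not the textbook's implementation: the names `RBTree`, `Node`, `_rotate`, `keys`, and `check` are hypothetical, a single shared black `nil` node plays the role of all NIL leaves, both mirror-image branches are handled by one `side` variable, and the `rotations` counter exists only to observe the O(1) claim.

```python
import math

RED, BLACK = "red", "black"


class Node:
    def __init__(self, key=None, color=BLACK):
        self.key, self.color = key, color
        self.left = self.right = self.parent = None


class RBTree:
    def __init__(self):
        self.nil = Node()            # shared black NIL leaf (assumption of this sketch)
        self.root = self.nil
        self.rotations = 0           # counted only for illustration

    def _rotate(self, x, side):
        """side='left' performs LEFT-ROTATE(x); side='right' performs RIGHT-ROTATE(x)."""
        other = "right" if side == "left" else "left"
        y = getattr(x, other)                    # y moves up into x's place
        setattr(x, other, getattr(y, side))      # middle subtree changes parent
        if getattr(y, side) is not self.nil:
            getattr(y, side).parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        setattr(y, side, x)
        x.parent = y
        self.rotations += 1

    def insert(self, key):
        x = Node(key, RED)                       # new node is colored red
        x.left = x.right = self.nil
        parent, cur = self.nil, self.root
        while cur is not self.nil:               # ordinary BST insertion
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        x.parent = parent
        if parent is self.nil:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        while x.parent.color == RED:             # property 3 is violated at x
            g = x.parent.parent
            side = "left" if x.parent is g.left else "right"
            other = "right" if side == "left" else "left"
            y = getattr(g, other)                # aunt or uncle of x
            if y.color == RED:                   # Case 1: recolor, move violation up
                x.parent.color = y.color = BLACK
                g.color = RED
                x = g
            else:
                if x is getattr(x.parent, other):    # Case 2: zigzag, straighten it
                    x = x.parent
                    self._rotate(x, side)
                x.parent.color = BLACK               # Case 3: rotate at grandparent
                g.color = RED
                self._rotate(g, other)
        self.root.color = BLACK

    def keys(self):
        """Inorder traversal, returned as a list."""
        out, stack, n = [], [], self.root
        while stack or n is not self.nil:
            while n is not self.nil:
                stack.append(n)
                n = n.left
            n = stack.pop()
            out.append(n.key)
            n = n.right
        return out


def check(tree):
    """Assert properties 2 to 4; return (height, black nodes on any root-to-NIL path)."""
    def walk(n):
        if n is tree.nil:
            return 0, 0
        if n.color == RED:                       # property 3
            assert n.left.color == BLACK and n.right.color == BLACK
        (hl, bl), (hr, br) = walk(n.left), walk(n.right)
        assert bl == br                          # property 4
        return 1 + max(hl, hr), bl + (n.color == BLACK)
    assert tree.root.color == BLACK              # property 2
    return walk(tree.root)


tree = RBTree()
for key in range(1, 1001):                       # sorted input, worst case for a plain BST
    tree.insert(key)
height, _ = check(tree)
print(height, "<=", 2 * math.log2(1000 + 1))
```

Because the loop tests the color of the parent of `x` and the root's parent is the black `nil` node, the separate `x ≠ root[T]` test from the pseudocode is not needed in this sketch.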