Red-Black Trees
Chapter 13 in CLRS (2nd and 3rd editions); Chapter 14 in CLR

Introduction
We have seen that a binary search tree is a useful tool: if its height is h, then we can implement any basic operation on it in O(h) time.
The problem: given an input of size n, how can we arrange it in a binary search tree of height O(log n)? [We cannot expect better than that.]
Red-black trees are one of many search-tree schemes that provide a good balanced solution to this problem.

14.1: Properties of red-black trees
A red-black tree is a binary search tree with one extra bit of storage per node: its color, which can be either RED or BLACK. By constraining the way nodes can be colored, red-black trees ensure that the tree is approximately balanced: the longest path from the root to a leaf is at most twice as long as the shortest one.

Each node of the tree contains the fields color, key, left, right, and p. If a child or the parent of a node does not exist, the corresponding pointer field of the node contains the value NIL. We shall regard these NILs as pointers to external nodes (leaves) of the binary search tree, and the normal, key-bearing nodes as internal nodes of the tree.

A binary search tree is a red-black tree if it satisfies the following red-black properties:
1. Every node is either red or black.
2. Every leaf (NIL) is black.
3. If a node is red, then both its children are black.
4. For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
5. (The root is black.)

We call the number of black nodes on any path from a node x [but not including x] down to a leaf the black-height of the node, denoted by bh(x).
By property 4, the notion of black-height is well defined, since all descending paths from a given node have the same number of black nodes. The black-height of the tree is defined as the black-height of its root.

Red-black trees are good search trees:
Lemma: A red-black tree with n internal nodes has height at most $2\log_2(n+1)$.

Proof: We first show that the subtree rooted at any node x contains at least $2^{bh(x)} - 1$ internal nodes. We prove this claim by induction on the height of x. If the height of x is 0, then x must be a leaf (NIL), and the subtree rooted at x indeed contains at least $2^{bh(x)} - 1 = 2^0 - 1 = 0$ internal nodes.

For the inductive step, consider a node x that has positive height and is an internal node with two children. Each child has a black-height of either bh(x) or bh(x) - 1, depending on whether its color is red or black, respectively. Since the height of a child is less than the height of x itself, we can apply the inductive hypothesis to conclude that the subtree rooted at each child has at least $2^{bh(x)-1} - 1$ internal nodes. Thus, the subtree rooted at x contains at least
$(2^{bh(x)-1} - 1) + (2^{bh(x)-1} - 1) + 1 = 2^{bh(x)} - 1$
internal nodes, which proves the claim.

To complete the proof of the lemma, let h be the height of the tree. According to property 3, at least half the nodes on any simple path from the root to a leaf, not including the root, must be black. Consequently, the black-height of the root must be at least h/2; thus $n \ge 2^{h/2} - 1$. Moving the 1 to the left-hand side and taking logarithms gives $\log_2(n+1) \ge h/2$, that is, $h \le 2\log_2(n+1)$. █

Consequence: the dynamic-set operations SEARCH, MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR can all be implemented in time O(log n).
Still open: How do we implement TREE-INSERT and TREE-DELETE in time O(log n), so that the output maintains the red-black properties?
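The definition of black-height and the lemma above can be checked mechanically on small trees. The following is a minimal Python sketch, not code from the book: the `Node` class and the helpers `is_black`, `black_height`, `height`, and `size` are illustrative names, and `None` is assumed to stand in for the black NIL leaves. It verifies only properties 3 and 4 (not the search-tree ordering), then confirms the two inequalities used in the proof on a three-node example.

```python
import math
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"

@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None    # None stands in for a black NIL leaf
    right: Optional["Node"] = None

def is_black(x: Optional[Node]) -> bool:
    # Property 2: every NIL leaf counts as black.
    return x is None or x.color == BLACK

def black_height(x: Optional[Node]) -> int:
    """Return bh(x); raise ValueError if property 3 or 4 fails in x's subtree."""
    if x is None:
        return 0
    if x.color == RED and not (is_black(x.left) and is_black(x.right)):
        raise ValueError(f"property 3 violated at key {x.key}")
    # bh(x) excludes x itself, so each black child contributes one extra node.
    via_left = black_height(x.left) + int(is_black(x.left))
    via_right = black_height(x.right) + int(is_black(x.right))
    if via_left != via_right:
        raise ValueError(f"property 4 violated at key {x.key}")
    return via_left

def height(x: Optional[Node]) -> int:
    # NIL leaves have height 0, as in the proof of the lemma.
    return 0 if x is None else 1 + max(height(x.left), height(x.right))

def size(x: Optional[Node]) -> int:
    # Number of internal (key-bearing) nodes.
    return 0 if x is None else 1 + size(x.left) + size(x.right)

# A black root with two red children is a valid red-black tree.
root = Node(10, BLACK, Node(5, RED), Node(15, RED))
bh, n, h = black_height(root), size(root), height(root)
assert n >= 2 ** bh - 1            # the claim proved by induction
assert h <= 2 * math.log2(n + 1)   # the lemma
```

Returning the black-height from the same recursion that checks property 4 mirrors the proof: the check at x relies only on the values already computed for its two children.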
14.2: Rotations
The search-tree operations TREE-INSERT and TREE-DELETE run in time O(log n) when applied to a red-black tree with n keys. However, the result may violate the red-black properties. To restore these properties, we must recolor some of the nodes in the tree and make some pointer changes.
We use rotations to change the pointer structure. They are shown in the following figures (for the code, refer to the book).
Note that both LEFT-ROTATION and RIGHT-ROTATION run in O(1) time. Only the pointers are changed; the other fields remain the same.

14.3: Insertion
General description:
• We begin by inserting a node x into a tree T, as if T were an ordinary search tree.
• We color x red.
• We fix up the modified tree by recoloring nodes and performing rotations, to guarantee that the red-black properties are preserved.
Insertion is accomplished in O(log n) time.

14.3: Insertion – Analysis of the code
• The only property that might be violated after the first two lines is property 3: if x’s parent is red, then we have two reds in a row.
• The rest of the code pushes this violation up the tree. It is corrected either somewhere on the way up or at the root. The other properties are maintained.
• Assuming that each move up the tree takes O(1) time, the whole process ends in O(h) time, as desired.
• We consider 6 cases, but 3 are symmetric copies of the other 3. It all depends on whether x’s parent p[x] is a left or a right child of x’s grandparent p[p[x]].
Note that there is an important assumption: the root of the tree is black!
• Case 1 is distinguished from Cases 2 & 3 by the color of x’s uncle (denoted by y). If y is red, then Case 1 is executed. Otherwise control passes to Cases 2 & 3.
• In all cases, x’s grandfather is black (since x’s father is red).
• Case 1 is shown in the following figure.
[Both p[x] and y are red, and their father is black. The two sons are colored black, their father is colored red, and the possible problem is pushed up the tree.]
• In Cases 2 & 3 the color of x’s uncle y is black. The two cases are distinguished by whether x is the right child or the left child of p[x]. Using a left rotation we can move from Case 2 to Case 3. After that, x is the left son of p[x], both are red, and the uncle y is black. Some color changes and a right rotation are necessary, but there is no need to continue the while loop, since p[x] is now colored black. The following figure sums this up.

14.4: Deletion
Deletion of a node also takes time O(log n). The procedure we use is called RB-DELETE. It deletes a node as in a “regular” binary search tree, and then calls the procedure RB-DELETE-FIXUP to fix colors and perform rotations, in order to restore the red-black properties.

There are differences between TREE-DELETE and RB-DELETE:
• All pointers to NIL are replaced by pointers to nil[T] (to save space). Technical.
• The test in line 7 of TREE-DELETE (whether x is NIL) has been removed, so x’s parent pointer is always set. Semi-technical.
• A call to RB-DELETE-FIXUP is made in lines 16-17 if y is black. Most important!
The last point needs an explanation: y is the node that is being spliced out. If it is red, then there is no problem: the removal of a red node is “always welcome” and does not violate any of the rules, since no black-heights have changed and no two reds have become adjacent.

However, if y is black we have a problem. We pass y’s sole child, called x, to the procedure RB-DELETE-FIXUP (that is, if y had a non-NIL child; otherwise, we pass nil[T], whose parent is guaranteed to be y’s original parent [by the change in line 7]).
Why do we have a problem? Because the removal of y causes any path that previously contained it to have one fewer black node.
Therefore, property 4 is now violated by any ancestor of y in the tree. However, if we count x as “having” two blacks, the problem is fixed. In other words, splicing out y and pushing its black color into its son x seems OK. The problem is that this “coloring” is not legal (it violates property 1).

The procedure RB-DELETE-FIXUP attempts to restore a proper coloring. The goal is to push the extra black up the tree until either x reaches a red node that may be colored black, or x points to the root, in which case the extra black may simply be removed. During the process, some nodes are recolored and some rotations are performed.
There are 4 cases to consider, based on the colors of x’s brother and nephews (or nieces). The different cases are described in the following figures. The code appears after the figures.

Animations
Animations and presentations:
• http://www.ece.uc.edu/~franco/C321/html/RedBlack/redblack.html
• http://www.eli.sdsu.edu/courses/fall95/cs660/notes/RedBlackTree/RedBlack.html
• http://www.nist.gov/dads/HTML/redblack.html
• http://www.cs.auckland.ac.nz/software/AlgAnim/red_black.html
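As a supplement to Sections 14.2 and 14.3, the rotations and the insertion fix-up can be sketched in Python. This is a hedged illustration written for these notes, not the book's pseudocode: the class names `Node` and `RBTree`, the method names, and the helper `height` are assumptions, and a shared sentinel `self.nil` is assumed to play the role of nil[T]. The three cases in the loop follow the case analysis given for insertion, with the mirrored branch handling a parent that is a right child.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color=BLACK):
        self.key, self.color = key, color
        self.left = self.right = self.p = None

class RBTree:
    def __init__(self):
        self.nil = Node(None, BLACK)      # shared black sentinel, like nil[T]
        self.root = self.nil

    def left_rotate(self, x):
        y = x.right                       # y moves up; x becomes its left child
        x.right = y.left
        if y.left is not self.nil:
            y.left.p = x
        y.p = x.p
        if x.p is self.nil:
            self.root = y
        elif x is x.p.left:
            x.p.left = y
        else:
            x.p.right = y
        y.left, x.p = x, y

    def right_rotate(self, y):
        x = y.left                        # mirror image of left_rotate
        y.left = x.right
        if x.right is not self.nil:
            x.right.p = y
        x.p = y.p
        if y.p is self.nil:
            self.root = x
        elif y is y.p.left:
            y.p.left = x
        else:
            y.p.right = x
        x.right, y.p = y, x

    def insert(self, key):
        x = Node(key, RED)                # new nodes are colored red
        x.left = x.right = self.nil
        parent, cur = self.nil, self.root
        while cur is not self.nil:        # ordinary search-tree descent
            parent = cur
            cur = cur.left if key < cur.key else cur.right
        x.p = parent
        if parent is self.nil:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        while x.p.color == RED:           # two reds in a row: property 3 fails
            g = x.p.p                     # grandparent; black, since x.p is red
            if x.p is g.left:
                y = g.right               # uncle
                if y.color == RED:        # Case 1: recolor, move violation up
                    x.p.color = y.color = BLACK
                    g.color = RED
                    x = g
                else:
                    if x is x.p.right:    # Case 2: rotate into Case 3
                        x = x.p
                        self.left_rotate(x)
                    x.p.color = BLACK     # Case 3: recolor and rotate; loop ends
                    x.p.p.color = RED
                    self.right_rotate(x.p.p)
            else:                         # symmetric: parent is a right child
                y = g.left
                if y.color == RED:
                    x.p.color = y.color = BLACK
                    g.color = RED
                    x = g
                else:
                    if x is x.p.left:
                        x = x.p
                        self.right_rotate(x)
                    x.p.color = BLACK
                    x.p.p.color = RED
                    self.left_rotate(x.p.p)
        self.root.color = BLACK           # keep the root black

def height(t, x):
    # Height with the sentinel leaves at height 0.
    return 0 if x is t.nil else 1 + max(height(t, x.left), height(t, x.right))

tree = RBTree()
for k in range(1, 101):                   # sorted input: worst case for a plain BST
    tree.insert(k)
```

Inserting keys in sorted order would produce a path of height 100 in an unbalanced search tree; here the fix-up keeps the height within the bound of the lemma from Section 14.1.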