Red-Black Trees CS 583 Analysis of Algorithms 7/1/2016

CS583 Fall'06: Red-Black Trees
Outline
• Red-Black Trees
– Definitions
– Rotations
• Augmenting Data Structures
– Definitions
– Dynamic order statistics
– Determining the rank of an element
– Maintaining subtree sizes
Definitions
• A red-black tree is a binary search tree with one
extra item per node: its color, which can be either
RED or BLACK.
– By constraining how nodes can be colored, red-black trees ensure
the following balancing rule:
• No path from the root to a leaf is more than twice as long as
any other such path.
– Each node contains the following fields:
• color, key, left, right, and parent.
– If a child of a node does not exist, the corresponding pointer is NIL.
• NILs are treated as leaves, called external nodes; all other,
key-bearing nodes are internal nodes.
Red-Black Trees Properties
• Red-black trees must satisfy the following properties:
1) Every node is either red or black.
2) The root is black.
3) Every leaf (NIL) is black.
4) If a node is red, then both its children are black.
5) For each node, all paths from the node to descendant leaves
contain the same number of black nodes.
• We use a single sentinel nil[T] to represent NIL.
– Its color field is BLACK, and all other fields are set to
arbitrary values.
– All pointers to NIL are replaced by pointers to sentinel
nil[T].
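The five properties and the shared sentinel can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the names `Node`, `NIL`, `make`, `black_height`, and `is_red_black` are assumptions introduced here for illustration.

```python
RED, BLACK = "RED", "BLACK"


class Node:
    """A node with the fields listed on the Definitions slide."""

    def __init__(self, key, color):
        self.key = key
        self.color = color
        self.left = self.right = self.parent = None


# The single shared sentinel nil[T]: colored BLACK, other fields arbitrary.
NIL = Node(None, BLACK)


def make(key, color, left=NIL, right=NIL):
    """Hypothetical helper: build a node and link its children to it."""
    node = Node(key, color)
    node.left, node.right, node.parent = left, right, NIL
    for child in (left, right):
        if child is not NIL:
            child.parent = node
    return node


def black_height(x):
    """Return bh(x); raise ValueError if property 1, 4 or 5 fails below x."""
    if x is NIL:
        return 0  # property 3 holds by construction of the sentinel
    if x.color not in (RED, BLACK):
        raise ValueError("property 1 violated")
    if x.color == RED and RED in (x.left.color, x.right.color):
        raise ValueError("property 4 violated")
    # Black nodes strictly below x: the child's own black-height,
    # plus one if the child itself is black.
    left_bh = black_height(x.left) + (x.left.color == BLACK)
    right_bh = black_height(x.right) + (x.right.color == BLACK)
    if left_bh != right_bh:
        raise ValueError("property 5 violated")
    return left_bh


def is_red_black(root):
    """Check all five properties for the tree rooted at root."""
    if root.color != BLACK:  # property 2
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True


good = make(2, BLACK, make(1, RED), make(3, RED))
red_red = make(2, BLACK, make(1, RED, make(0, RED)))  # breaks property 4
lopsided = make(2, BLACK, make(1, BLACK))             # breaks property 5
```

Here `is_red_black(good)` holds, while the other two trees are rejected. Because every missing child points at the same `NIL` object, the checker never needs a special case for absent children.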
Height of the Tree
The number of black nodes on any path from, but not including, a node x down
to a leaf is called the black-height of the node, denoted bh(x). By property 5,
this notion is well defined.
Lemma 13.1
A red-black tree with n internal nodes has height at most 2 lg (n+1).
Proof.
First, show that the subtree rooted at any node x contains at least
$2^{bh(x)} - 1$ internal nodes. We prove it by induction on the height of the
subtree rooted at x.
If the height of x is 0, then x is nil[T], hence its subtree contains
$0 = 2^{0} - 1$ internal nodes.
Height of the Tree (cont.)
Now, consider an internal node x with two children. Each child has a
black-height of bh(x) (if it is RED), or bh(x) - 1 (if it is BLACK).
By the inductive hypothesis, the subtree rooted at each child contains at least
$2^{bh(x)-1} - 1$ internal nodes (the bound for a BLACK child, which is the
weaker of the two).
Thus, the subtree rooted at x contains at least
$2(2^{bh(x)-1} - 1) + 1 = 2^{bh(x)} - 1$ internal nodes, which proves the claim.
Note that if x has only one non-NIL child y, then y cannot be BLACK (that would
violate property 5). Hence y is RED and bh(y) = bh(x), so
$N(x) \ge (2^{bh(x)} - 1) + 1$, where N(x) is the number of internal nodes in
the subtree rooted at x.
Height of the Tree (cont.)
To complete the proof, let h be the height of the tree. According to property 4,
at least half the nodes on any simple path from the root to a leaf, not including
the root, must be black. (On such a path, each red node is followed by a child
that must be black.) Consequently, the black-height of the root must
be at least h/2; hence:
$$n \ge 2^{h/2} - 1 \iff 2^{h/2} \le n + 1 \iff \frac{h}{2} \lg 2 \le \lg(n+1) \iff h \le 2 \lg(n+1)$$
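As a quick sanity check of Lemma 13.1, the bound can be evaluated for two sample sizes. The values of n are chosen here purely for illustration, so that n+1 is a power of two:

```latex
\begin{align*}
n = 15:   &\quad h \le 2\lg(15 + 1)   = 2 \cdot 4  = 8 \\
n = 1023: &\quad h \le 2\lg(1023 + 1) = 2 \cdot 10 = 20
\end{align*}
```

So a red-black tree holding 1023 keys can never have height greater than 20.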
Rotations
• The insert and delete operations on a red-black tree take O(lg n) time.
– However, they modify the tree, which may violate the
red-black tree properties.
– To restore those properties, we must change the colors of
some nodes and the pointer structure.
• The pointer structure is changed through rotation,
a local operation in a search tree that preserves
the binary-search-tree property.
– There are two kinds of rotations: left and right.
Left Rotation
The left rotation for node x assumes its right child y is not nil[T]. It "pivots"
around the link from x to y:
- It makes y the new root of the subtree.
- x becomes y's left child.
- y's former left child becomes x's right child.
      x                       y
     / \                     / \
    a   y        --->       x   c
       / \                 / \
      b   c               a   b
A rotation operation preserves the BST properties:
key[a] <= key[x] <= key[b] <= key[y] <= key[c]
Left Rotation: Pseudocode
Left-Rotate(T,x)
  y = right[x]                    // y is x's right child
  right[x] = left[y]              // y's left subtree becomes x's right subtree
  if left[y] <> nil[T]
      parent[left[y]] = x
  parent[y] = parent[x]           // link x's parent to y
  if parent[x] = nil[T]
      root[T] = y
  else
      if x = left[parent[x]]      // x is left child
          left[parent[x]] = y
      else                        // x is right child
          right[parent[x]] = y
  left[y] = x                     // put x on y's left
  parent[x] = y
The rotation operation runs in Θ(1) time; only pointers are changed, all other
fields remain the same.
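The pseudocode above translates almost line for line into Python. This is a sketch under stated assumptions: the `Node` and `Tree` classes, the `Tree.node` helper, and `inorder` are hypothetical names introduced here, and the color field is omitted because a rotation never reads or writes it.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self):
        self.nil = Node(None)  # sentinel nil[T]
        self.root = self.nil

    def node(self, key, left=None, right=None):
        """Hypothetical helper: create a node and attach its children."""
        n = Node(key)
        n.parent = self.nil
        n.left = left if left is not None else self.nil
        n.right = right if right is not None else self.nil
        for child in (n.left, n.right):
            if child is not self.nil:
                child.parent = n
        return n


def left_rotate(T, x):
    y = x.right                  # assumes y is not nil[T]
    x.right = y.left             # y's left subtree becomes x's right subtree
    if y.left is not T.nil:
        y.left.parent = x
    y.parent = x.parent          # link x's parent to y
    if x.parent is T.nil:
        T.root = y
    elif x is x.parent.left:     # x is left child
        x.parent.left = y
    else:                        # x is right child
        x.parent.right = y
    y.left = x                   # put x on y's left
    x.parent = y


def inorder(T, x):
    """Return the keys of the subtree rooted at x in sorted (inorder) order."""
    if x is T.nil:
        return []
    return inorder(T, x.left) + [x.key] + inorder(T, x.right)


# The shape from the diagram: a = 5, x = 10, b = 15, y = 20, c = 30.
T = Tree()
y = T.node(20, T.node(15), T.node(30))
x = T.node(10, T.node(5), y)
T.root = x
before = inorder(T, T.root)
left_rotate(T, x)
after = inorder(T, T.root)
```

After the call, `y` is the root and `x` is its left child, while `before` and `after` hold the same key sequence, which is what preserving the binary-search-tree property means in practice.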
Augmenting Data Structures
• Many software engineering problems can be solved
by using “textbook” data structures such as doubly linked lists, hash tables, or binary search trees.
– For example, the C++ STL is sufficient for
many financial algorithms.
• In some situations, however, using a straightforward
data structure is not sufficient.
– It is very rare that an entirely new data structure has to be
invented.
– More often, it will suffice to augment a standard data
structure by storing additional information in it.
• This is often not straightforward, as the new information must be
updated and maintained by the structure's ordinary operations.
Dynamic Order Statistics
• Recall that the ith order statistic of a set of n elements
is the element with the ith smallest key.
– We saw that any order statistic can be retrieved in O(n)
time from an unordered set.
– Now we will augment a red-black tree so that any
order statistic can be determined in O(lg n) time.
• The rank of an element is its position in the linear
order of the set.
– It can also be determined in O(lg n) time in an augmented
red-black tree.
Order-Statistics Tree
• An order-statistics tree T is a red-black tree with
additional information stored in each node.
– In addition to key[x], color[x], p[x], left[x], and right[x],
we have another field size[x].
• This field contains the number of internal nodes in the subtree
rooted at x (including x itself).
– If we define size[nil[T]] = 0 for the sentinel, then:
• size[x] = size[left[x]] + size[right[x]] + 1
• We do not require keys to be distinct.
– Equal keys make the rank of an element ambiguous.
– By convention, the rank of an element is the position at which
it would be visited in an inorder walk of the tree.
Retrieving an ith Order Element
The procedure below returns a pointer to the node containing the ith smallest
key in the subtree rooted at x.
OS-Select(x,i)
  r = size[left[x]] + 1           // rank of x within the subtree rooted at x
  if i = r
      return x
  else
      if i < r
          return OS-Select(left[x], i)
      else
          return OS-Select(right[x], i-r)
Each recursive call goes down one level in the tree, hence the total time for
this procedure is proportional to the height of the tree, which is O(lg n) for the
red-black tree. Thus, the running time of OS-Select is O(lg n).
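The selection procedure can be tried out on a small size-augmented tree. The sketch below makes two simplifying assumptions that are not in the slides: it uses `None` in place of the sentinel nil[T], and it builds a balanced plain BST from a sorted list instead of a real red-black tree (OS-Select never looks at colors). The names `Node`, `size`, `build`, and `os_select` are introduced here for illustration.

```python
def size(x):
    """size[x], with size of a missing child defined as 0."""
    return x.size if x is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = size(left) + size(right) + 1


def build(keys):
    """Build a balanced BST from an already sorted list of keys."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build(keys[:mid]), build(keys[mid + 1:]))


def os_select(x, i):
    """Return the node holding the ith smallest key (1-based) below x."""
    r = size(x.left) + 1          # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)
    return os_select(x.right, i - r)


keys = [3, 7, 10, 12, 14, 16, 19, 20, 21, 26]
root = build(keys)
third = os_select(root, 3).key    # the 3rd smallest key in the tree
```

The caller is expected to pass an i between 1 and size[x]; the sketch, like the pseudocode, does not guard against values outside that range.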
Determining the Rank
The procedure below returns the position of x in the linear order determined
by an inorder tree walk of T.
OS-Rank(T, x)
1 r = size[left[x]] + 1
2 y = x
3 while y <> root[T]
4     if y = right[p[y]]
5         r = r + size[left[p[y]]] + 1
6     y = p[y]
7 return r
The rank of x can be viewed as the number of nodes preceding x in an inorder
tree walk, plus 1 for x itself. Loop invariant:
At the start of each iteration of the while loop (lines 3-6), r is the rank of
key[x] in the subtree rooted at y.
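The walk from x up to the root can be sketched in Python as follows. As assumptions for this illustration only, `None` stands in for nil[T], the tree is a balanced plain BST built from a sorted list rather than a real red-black tree, and the names `Node`, `size`, `build`, `inorder_nodes`, and `os_rank` are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1


def size(x):
    """size[x], with size of a missing child defined as 0."""
    return x.size if x is not None else 0


def build(keys, parent=None):
    """Build a balanced BST with parent pointers from a sorted list."""
    if not keys:
        return None
    mid = len(keys) // 2
    node = Node(keys[mid])
    node.parent = parent
    node.left = build(keys[:mid], node)
    node.right = build(keys[mid + 1:], node)
    node.size = size(node.left) + size(node.right) + 1
    return node


def inorder_nodes(x):
    """Return the nodes below x in inorder order."""
    if x is None:
        return []
    return inorder_nodes(x.left) + [x] + inorder_nodes(x.right)


def os_rank(root, x):
    """Return the 1-based inorder position of node x in the tree."""
    r = size(x.left) + 1                  # rank of x in its own subtree
    y = x
    while y is not root:
        if y is y.parent.right:
            # Everything in the parent's left subtree, and the parent
            # itself, precedes x in the inorder walk.
            r += size(y.parent.left) + 1
        y = y.parent
    return r


keys = [1, 2, 2, 3, 5, 8]                 # duplicate keys are allowed
root = build(keys)
ranks = [os_rank(root, n) for n in inorder_nodes(root)]
```

Because the rank is defined by inorder position, the two nodes with key 2 get the distinct ranks 2 and 3.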
Determining the Rank: Correctness
• Initialization:
– Prior to the first iteration, r is the rank of x in the subtree
rooted at x, and y=x.
• Maintenance:
– At the end of each iteration y is set to p[y]. Hence r must
become the rank of key[x] in the subtree rooted at p[y].
• If y is a left child, then neither p[y] nor any node in p[y]'s right
subtree precedes x, so r is unchanged.
• Otherwise, we need to add all nodes in p[y]'s left subtree and p[y]
itself (line 5).
• Termination:
– The loop terminates when y=root[T], hence r is the rank
of key[x] in the entire tree.
Determining the Rank: Performance
• Each iteration of the while loop takes Θ(1) time.
• Node y goes up one level in the tree with each
iteration.
• Hence, the running time of OS-Rank is at worst
proportional to the height of the tree: O(lg n) on an
n-node order-statistics tree.
Maintaining Subtree Sizes
• The size field in each node helps quickly compute
order-statistics information.
• This field should be maintained for both insertion
and deletion operations on the red-black trees
without affecting the asymptotic running time of
these operations.
• The insertion operation has two phases:
– The first phase walks down the tree from the root and inserts the
new node as a child of an existing node.
• Simply increment size[x] for each x on the path traversed.
– The second phase walks back up the tree, changing colors and
performing rotations to restore the red-black properties.
• A rotation invalidates the size fields of only the two nodes involved.
• Since at most two rotations are performed during insertion, only
O(1) extra time is added, not affecting the asymptotic time.
Maintaining Subtree Sizes: Left Rotation
Left-Rotate(T,x)
  y = right[x]
  right[x] = left[y]
  if left[y] <> nil[T]
      parent[left[y]] = x
  parent[y] = parent[x]
  if parent[x] = nil[T]
      root[T] = y
  else
      if x = left[parent[x]]      // x is left child
          left[parent[x]] = y
      else                        // x is right child
          right[parent[x]] = y
  left[y] = x
  parent[x] = y
  size[y] = size[x]               // y now roots the subtree x used to root
  size[x] = size[left[x]] + size[right[x]] + 1
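The effect of the two added lines can be checked on a five-node example. This Python sketch is an illustration only: the `Node` and `Tree` classes, the `Tree.node` helper, and `sizes_ok` are hypothetical names, and colors are left out because the rotation does not touch them.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1


class Tree:
    def __init__(self):
        self.nil = Node(None)        # sentinel nil[T]
        self.nil.size = 0            # size[nil[T]] = 0
        self.root = self.nil

    def node(self, key, left=None, right=None):
        """Hypothetical helper: create a node with children and size set."""
        n = Node(key)
        n.parent = self.nil
        n.left = left if left is not None else self.nil
        n.right = right if right is not None else self.nil
        for child in (n.left, n.right):
            if child is not self.nil:
                child.parent = n
        n.size = n.left.size + n.right.size + 1
        return n


def left_rotate(T, x):
    y = x.right                      # assumes y is not nil[T]
    x.right = y.left
    if y.left is not T.nil:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is T.nil:
        T.root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    y.size = x.size                  # y now roots the subtree x used to root
    x.size = x.left.size + x.right.size + 1


def sizes_ok(T, x):
    """Check size[x] = size[left[x]] + size[right[x]] + 1 everywhere."""
    if x is T.nil:
        return True
    return (x.size == x.left.size + x.right.size + 1
            and sizes_ok(T, x.left) and sizes_ok(T, x.right))


T = Tree()
y = T.node(20, T.node(15), T.node(30))
x = T.node(10, T.node(5), y)
T.root = x
left_rotate(T, x)
```

After the rotation `y.size` is 5 and `x.size` is 3. No other size field needs to change, because every other node keeps exactly the same descendants.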