12. Balanced Trees

```
241-423 Advanced Data Structures and Algorithms, Semester 2, 2012-2013
12. Balanced Search Trees
Objectives
– discuss various kinds of balanced search trees:
AVL trees, 2-3-4 trees, and Red-Black trees
1
Contents
1. What is a Balanced Binary Search Tree?
2. AVL Trees
3. 2-3-4 Trees
4. Red-Black Trees
2
1. What is a Balanced Binary Search Tree?
• A balanced search tree is one where all the
branches from the root have almost the
same height.
[Figure: a balanced tree and an unbalanced tree]
continued
3
• As a tree becomes more unbalanced, search
running time degrades from O(log n) to
O(n)
– because the tree shape turns into a list
• We want to keep the binary search tree
balanced as nodes are added/removed, so
searching/insertion remain fast.
4
1.1. Balanced BSTs: AVL Trees
• An AVL tree maintains height balance
– for each node, the difference in height of its two
subtrees is in the range -1 to 1
5
1.2. 2-3-4 Trees
• A multiway tree where each node has at most 4
children, and a node can hold up to 3 values.
• A 2-3-4 tree can be perfectly balanced
– no difference in height between branches
6
1.3. Red-Black Trees
• A red-black tree is a binary version of a 2-3-4 tree
– the nodes have a 'color' attribute: BLACK or RED
– drawn in Ford and Topp (and here) in white and gray
• The tree maintains a balance measure called
the BLACK height
[Figure: a red-black tree with BLACK and RED nodes]
7
1.4. B-Trees
• A multiway tree where each node has at
most m children, and a node can hold up to
m-1 values
– a more general version of a 2-3-4 tree
• B-Trees are most commonly used in
databases and filesystems
– most nodes are stored in secondary storage such
as hard drives
8
2. AVL Trees
• For each AVL tree node, the difference
between the heights of its left and right
subtrees is either -1, 0 or +1
– this is called the balance factor of a node
• balanceFactor =
height(left subtree) - height(right subtree)
(written L - R)
– if balanceFactor > 1 or < -1 then the tree is too
unbalanced, and needs 'rearranging' to make it
more balanced
9
• Heaviness
– if the balanceFactor is positive, then the node is
"heavy on the left"
•
the height of the left subtree is greater than the
height of the right subtree
– a negative balanceFactor means the node is
"heavy on the right"
continued
10
[Figure: three balanced trees]
– L – R = 0 - 1: root is heavy on the right, but still balanced
– L – R = 2 - 1: root is heavy on the left, but still balanced
– L – R = 1 - 2: root is heavy on the right, but still balanced
11
2.1. The AVLTree Class
12
Using AVLTree
String[] stateList = {"NV", "NY", "MA", "CA", "GA"};
AVLTree<String> avltreeA = new AVLTree<String>();
for (int i = 0; i < stateList.length; i++)
  avltreeA.add(stateList[i]);   // loop body restored; assumes a Collection-style add()
System.out.println("States: " + avltreeA);

int[] arr = {50, 95, 60, 90, 70, 80, 75, 78};
AVLTree<Integer> avltreeB = new AVLTree<Integer>();
for (int i = 0; i < arr.length; i++)
  avltreeB.add(arr[i]);         // loop body restored; assumes a Collection-style add()
// display the tree
System.out.println(avltreeB.displayTree(2));
avltreeB.drawTree(2);
13
Execution
States: [CA, GA, MA, NV, NY]
              70(-1)
      60(1)             90(1)
  50(0)           78(0)     95(0)
               75(0) 80(0)
Each node is shown as value(balanceFactor): the root, 70(-1), is
heavy on the right, but still balanced
14
The AVLTree Node
• An AVLNode contains the node's value,
references to the node's two subtrees, and the
node height.
height(node) = max ( height(node.left), height(node.right) ) + 1;
[Figure: an AVLNode object with fields nodeValue, height, left and right]
continued
15
private static class AVLNode<T>
{
  public T nodeValue;              // node data
  public int height;               // height of the subtree rooted at this node
  public AVLNode<T> left, right;   // links to the two subtrees

  public AVLNode (T item)
  { nodeValue = item;
    height = 0;
    left = null; right = null;
  }
}
// the coding style is due to Ford & Topp
16
2.2. Adding a Node to the Tree
• The addition of a node may cause the tree to
go out of balance.
– addNode() reorders the tree as it returns from the
adding point back to the root
– reordering is the new idea in AVL trees
– the reordering is done using single and double
rotations
17
2.2.1. Too Heavy on the Left
[Figure: two cases where node P has balance factor 2 (L – R = 3-1)]
– case 1: the new node (11) is below the outside grandchild
– case 2: the new node (22) is below the inside grandchild
– in both cases, whether the new node is on the left or right
branch of the grandchild doesn't matter
continued
18
• Inserting a node in the left subtree of P (e.g.
adding 11 or 22) may cause P to become
"too heavy on the left"
– balance factor == 2
• The new node can either be in the outside or
inside grandchild subtree:
– outside grandchild = left-left
– inside grandchild = left-right
continued
19
20
2.2.2. Too Heavy on the Right
[Figure: two cases where node P has balance factor -2 (L – R = 1-3)]
– case 1: the new node (40) is below the outside grandchild
– case 2: the new node (29) is below the inside grandchild
– in both cases, whether the new node is on the left or right
branch of the grandchild doesn't matter
continued
21
• Inserting a node in the right subtree of P (e.g.
adding 29 or 40) may cause P to become "too
heavy on the right"
– balance factor == -2
• The new node can either be in the outside or
inside grandchild subtree:
– outside grandchild = right-right
– inside grandchild = right-left
continued
22
23
2.3. Single Rotations
• When a new item is added to the subtree for
an outside grandchild, the imbalance is fixed
with a single right or left rotation
• Two cases:
– left outside grandchild (left-left) -->
single right rotation
– right outside grandchild (right-right) -->
single left rotation
24
2.3.1. Single Right Rotation
• A single right rotation occurs when a new
element is added to the subtree of the left
outside grandchild (left-left)
continued
25
[Figure: single right rotation for the left outside grandchild
(left-left) case; two links are cut and reattached]
continued
26
• A single right rotation rotates the left child
(LC) to replace the parent
– the parent becomes the new right child
• The right subtree of LC (RGC) is attached as a
left child of P
– ok since the nodes in RGC are greater than LC but
less than P
27
singleRotateRight()
// single right rotation on p
private static <T> AVLNode<T> singleRotateRight(
                                   AVLNode<T> p)
{
  AVLNode<T> lc = p.left;
  p.left = lc.right;     // 1 & 4 on slide 26
  lc.right = p;          // 2 & 3
  // update p first, since it is now a child of lc
  p.height = max(height(p.left), height(p.right)) + 1;
  lc.height = max(height(lc.left),
                  height(lc.right)) + 1;
  return lc;
}
28
private static <T> int height(AVLNode<T> t)
{
if (t == null)
return -1;
else
return t.height;
}
29
2.3.2. Single Left Rotation
• A single left rotation occurs when a new
element is added to the subtree of the right
outside grandchild (right-right).
• The rotation exchanges the parent (P) and
right child (RC) nodes, and attaches the
subtree LGC as the right subtree of P.
continued
30
[Figure: single left rotation for the right outside grandchild
(right-right) case; two links are cut and reattached]
31
singleRotateLeft()
// single left rotation on p
private static <T> AVLNode<T> singleRotateLeft(
                                   AVLNode<T> p)
{
  AVLNode<T> rc = p.right;
  p.right = rc.left;     // 1 & 4 on slide 31
  rc.left = p;           // 2 & 3
  p.height = max(height(p.left), height(p.right)) + 1;
  rc.height = max(height(rc.left),
                  height(rc.right)) + 1;
  return rc;
}
32
2.4. Double Rotations
• When a new item is added to the subtree for
an inside grandchild, the imbalance is fixed
with a double right or left rotation
– a double rotation is two single rotations
• Two cases:
– left inside grandchild (left-right) -->
double right rotation
– right inside grandchild (right-left) -->
double left rotation
33
2.4.1. A Double Right Rotation
[Figure: the left inside grandchild (left-right) case]
– single left rotation, then single right rotation
– watch RGC rise to the top
– the result is balanced
34
doubleRotateRight()
private static <T> AVLNode<T> doubleRotateRight(
AVLNode<T> p)
/* double right rotation on p is
left rotation, then right rotation */
{
p.left = singleRotateLeft(p.left);
return singleRotateRight(p);
}
35
2.4.2. A Double Left Rotation
[Figure: the right inside grandchild (right-left) case, with nodes
P, LC, RC, LGC, RGC and subtrees A, B]
– single right rotation, then single left rotation
– watch LGC rise to the top
– the result is balanced
36
doubleRotateLeft()
private static <T> AVLNode<T> doubleRotateLeft(
AVLNode<T> p)
/* double left rotation on p is
right rotation, then left rotation */
{
p.right = singleRotateRight(p.right);
return singleRotateLeft(p);
}
37
• addNode() recurses down to the insertion point
and inserts the node.
• As it returns, it visits the nodes in reverse order,
fixing any imbalances using rotations.
• It must handle four cases:
– balance factor == 2: left-left, left-right
– balance factor == -2: right-left, right-right
38
addNode() for a plain binary search tree: no AVL rotation
// t was P in earlier slides
private Node<T> addNode(Node<T> t, T item)
{
  if (t == null)
    // found insertion point
    t = new Node<T>(item);
  else if (((Comparable<T>)item).compareTo(t.nodeValue) < 0)
    // visit left subtree
    t.left = addNode(t.left, item);
  else if (((Comparable<T>)item).compareTo(t.nodeValue) > 0)
    // visit right subtree
    t.right = addNode(t.right, item);
  else
    // duplicate error
    throw new IllegalStateException();
  return t;
}
39
private AVLNode<T> addNode(AVLNode<T> t, T item)
{
  if (t == null)
    // found insertion point
    t = new AVLNode<T>(item);
  else if (((Comparable<T>)item).compareTo(t.nodeValue) < 0) {
    // visit left subtree: add node then maybe rotate
    t.left = addNode(t.left, item);
    if (height(t.left) - height(t.right) == 2) {   // too heavy on left
      if (((Comparable<T>)item).compareTo(t.left.nodeValue) < 0)
        // problem on left-left
        t = singleRotateRight(t);
      else
        // problem on left-right
        t = doubleRotateRight(t);   // left then right rotation
    }
  }
:
continued
40
  else if (((Comparable<T>)item).compareTo(t.nodeValue) > 0) {
    // visit right subtree: add node then maybe rotate
    t.right = addNode(t.right, item);
    if (height(t.left) - height(t.right) == -2) {  // too heavy on right
      if (((Comparable<T>)item).compareTo(t.right.nodeValue) > 0)
        // problem on right-right
        t = singleRotateLeft(t);
      else
        // problem on right-left
        t = doubleRotateLeft(t);    // right then left rotation
    }
  }
  else
    // duplicate; throw IllegalStateException
    throw new IllegalStateException();

  // calculate new height of t
  t.height = max(height(t.left), height(t.right)) + 1;
  return t;
}
41
// public interface for inserting an item
// (the signature was lost from the slide; a Collection-style add() is assumed)
public boolean add(T item)
{
  try {
    root = addNode(root, item);   // start from root
  }
  catch (IllegalStateException e)
  { return false; }               // item is a duplicate

  // increment the tree size and modCount
  treeSize++;
  modCount++;
  return true;
}
42
2.6. Building an AVL Tree
[Figure: the gray node is too heavy; left outside grandchild (left-left) case]
continued
43
[Figure: right outside grandchild (right-right) case]
continued
44
[Figure: right inside grandchild (right-left) case;
double rotate left (right then left rotation)]
continued
45
[Figure: left inside grandchild (left-right) case;
double rotate right (left then right rotation)]
continued
46
2.7. Efficiency of AVL Tree Insertion
• Detailed analysis shows:
int(log2 n) <= height < 1.4405 log2(n+2) - 1.3277
• So the worst case running time for insertion is
O(log2 n).
• The worst case for deletion is also O(log2 n).
47
2.8. Deletion in an AVL Tree
• Deletion can easily cause an imbalance
– e.g. delete 32
[Figure: the tree with nodes 44, 17, 62, 32, 50, 78, 48, 54, 88
before the deletion of 32, and the tree after the deletion]
48
3. 2-3-4 Trees
• In a 2-3-4 tree:
– a 2-node has 1 value and a max of 2 children
(same as a binary tree node)
– a 3-node has 2 values and a max of 3 children
– a 4-node has 3 values and a max of 4 children
• The numbers refer to the maximum number of
branches that can leave the node
49
3.1. Searching a 2-3-4 Tree
• To find an item:
– start at the root and compare the item with all
the values in the node;
– if there's no match, move down to the
appropriate subtree;
– repeat until you find a match or reach an empty
subtree
50
Search Example
Try finding 9 and 30
51
3.2. Inserting into a 2-3-4 Tree
• Search to the bottom for an insertion node
– 2-node at bottom: convert to 3-node
– 3-node at bottom: convert to 4-node
– 4-node at bottom: ??
52
Splitting 4-nodes
• Transform tree on the way down:
– ensures last node is not a 4-node
– local transformation to split a 4-node
Insertion at the bottom is now easy since it's not a 4-node
53
Example
• To split a 4-node, move the middle value up.
54
3.3. Building a 2-3-4 Tree
This 4-node will be split during
the next insertion.
insert 4
This 4-node will be split during
the next insertion.
continued
55
insert 10
Insertions happen at the bottom.
This 4-node will be split during
the next insertion.
56
insert 55
The insertion point is at level 1, so the new 4-node
at level 0 is not split during this insertion.
continued
57
insert 11
[Figure: the tree before and after inserting 11]
– Split 4-node (4, 12, 25)
– Insert 11 (next to 10)
This 4-node will be split during
the next insertion.
58
Another Example
[Figure: two successive insertions]
The search missed the 4-nodes on the left, so they are
not changed.
59
3.4. Efficiency of 2-3-4 Trees
• Searching for an item in a 2-3-4 tree with n
elements:
– the max number of nodes visited during the
search is int(log2 n) + 1 (fast!)
• Inserting an element into a 2-3-4 tree:
– requires splitting no more than int(log2 n) + 1
4-nodes
– normally requires far fewer splits
60
3.5. Drawbacks of 2-3-4 Trees
• Since any node may become a 4-node, all
nodes must have space for 3 values and 4 links
– but most nodes are not 4-nodes
– lots of wasted memory, unless impl. is fancier
– slower to process than binary search trees
61
4. Red-Black Trees
• A red-black tree is a binary search tree
where each node has a 'color'
– BLACK or RED
• A red-black tree is a binary version of a
2-3-4 tree, using different color combinations
to represent 3-nodes and 4-nodes.
– a 2-node is already a binary node
62
[Figure: a red-black tree with BLACK and RED nodes]
BLACK and RED are drawn in Ford and Topp
(and here) in white and gray
63
4.1. From 2-3-4 Tree Nodes to
Red-Black Nodes
2-node Conversion
• A 2-node is already a binary node so doesn't
need to change its shape.
• The color of a 2-node is always BLACK
(drawn as white in these slides).
continued
64
4-node Conversion
• A 4-node has its middle value become a
BLACK (white) parent and the other values
become RED (gray) children.
[Figure: a 4-node and its red-black form, a BLACK parent with two RED children]
continued
65
3-node Conversion
• Represent a 3-node as:
– a BLACK parent and a smaller RED left child or
– a BLACK parent and a larger RED right child
[Figure: a 3-node (A, B) in a 2-3-4 tree, with subtrees S, T, U,
and its two red-black tree representations]
(a) A is a black parent; B is a red right child
OR
(b) B is a black parent; A is a red left child
66
4.2. Changing a 2-3-4 Tree into
a Red-Black Tree
change
this node
continued
67
change
this node
continued
68
change
these nodes
69
4.3. Three Properties of a Red-Black Tree
that must always be true for the tree to be red-black
• 1. The root must always be BLACK
(white in our pictures)
• 2. A RED parent never has a RED child
– in other words: there are never two successive
RED nodes in a path
continued
70
• 3. Every path from the root to an empty
subtree contains the same number of
BLACK nodes
– called the black height
• We can use black height to measure the
balance of a red-black tree.
71
Check the Example Properties
72
4.4. Inserting a Node
Three things
to do.
• 1. Search down the tree to find the insertion
point, splitting any 4-nodes (a BLACK parent
with two RED children) by coloring the
children BLACK and the parent RED
– called a color flip
• Splitting a 4-node may involve additional
rotations and color changes
– there are 4 cases to consider (section 4.4.1)
continued
73
• 2. Once the insertion point is found, add
the new item as a RED leaf node (section
4.4.2)
– this may create two successive RED nodes
– again use rotation and recoloring to
reorder/rebalance the tree
• 3. Keep the root as a BLACK node.
74
4.4.1. Four Cases for Splitting a 4-Node
[Figure: the four cases, drawn relative to the grandparent G]
1. LL/black parent (also mirror case, RR)
2. LR/black parent (also mirror case, RL)
3. LL/red parent (also mirror case, RR)
4. LR/red parent (also mirror case, RL)
75
Case 1 (LL/black P): An Example
• If the parent is BLACK, only a color flip is
needed.
[Figure: case 1 before and after the color flip]
76
Case 2 (LR/black P): Insert 55
[Figure: case 2 before and after the color flip]
Only a color flip is required
for the same reason as case 1
77
Case 3 (LL / red parent)
[Figure: case 3 before and after the color flip]
The color-flip creates two successive red nodes
-- this breaks property 2, so must be fixed
78
• To fix the red color conflict, carry out a single
left or right rotation of node P:
• LL → right rotation of P
• RR (mirror case) → left rotation of P
• Also change the colors of nodes P and G.
continued
79
[Figure: case 3 and its mirror case]
LL of G → right rot of P
RR of G → left rot of P
and P and G color changes
80
Case 3 (LL / red P) as a 2-3-4 Tree
[Figure: the same case drawn as a 2-3-4 tree, with the 4-node
(X, P, G) and subtrees A, B, C, D]
81
Case 4 (LR / red parent)
[Figure: case 4 before and after the color flip]
The color-flip creates two successive red nodes
-- this breaks property 2, so must be fixed
82
• To fix the red color conflict, carry out a
double left or right rotation of node X:
• LR → double right rotation of X
– left then right rotations
• RL (mirror case) → double left rot of X
– right then left rotations
• Also change the colors of nodes X and G.
83
LR Example
[Figure: case 4 (LR) example with nodes X and P]
– left rotation of X
– right rotation of X
– X and G recoloured
84
Same Example, with 2-3-4 Views
[Figure: the LR example drawn alongside its 2-3-4 tree views]
85
4.4.2. Inserting a New Item
• Always add a new item to a tree as a RED leaf
node
– this may create two successive RED nodes, which
breaks property 2
• Fix using single / double rotation and color flip
as used for splitting 4-nodes:
– LL/RR → single right/left rotation of parent (P)
– LR/RL → double right/left rot of new node (X)
86
Example: Insert 14
[Figure: RR case]
single left rotation of 12 and color flip
87
[Figure: RL case]
RL = double left rotation of node 10 (right then left)
– single right rotation of 10
– single left rotation and color flip
88
4.5. Building a Red-Black Tree
[Figure: LL case]
single right rotation of 20 and color flip
continued
89
these numbers are
from section 4.4.
1
2
3
Insert 35
Insert 25
now
continued
90
Insert 30
[Figure: LR case]
LR = double right rotation of node 30
(left then right)
91
4.6. Search Running Time
• The worst-case running time to search a red-black
tree or insert an item is O(log2 n)
– the maximum length of a path in a red-black tree
with black height B is 2*B-1
[Figure note: but this cannot happen if the insertion rules are followed]
92
4.7. Deleting a Node
• Deletion is more difficult than insertion!
– must usually replace the deleted node
– but no further action is necessary when the
replacement node is RED
Delete 75
Replacement node 78 is RED
[Figure: the tree with nodes 75, 60, 90, 50, 70, 80, 100, 78 before the
deletion, and the tree after 75 is replaced by 78, the replacement
(next biggest) value]
continued
93
• But deletion requires recoloring and rotations
when the replacement node is BLACK.
Delete 90
Replacement node is BLACK
[Figure: three trees]
– the original tree, where 100 is the replacement
– Replace 90 with 100
– Right rotation with pivot 80 and recoloring
94
4.8. The RBTree Class
• RBTree implements the Collection interface
and uses RBNode objects to create a red-black
tree.
[Figure: an RBNode object with fields nodeValue, color, parent, left and right]
95
class RBTree<T> implements Collection<T>          (package ds.util)

Constructor
  RBTree()
      Creates an empty red-black tree.

Methods
  String displayTree(int maxCharacters)
      Returns a string that gives a hierarchical view of the tree. An
      asterisk (*) marks red nodes.
  void drawTree(int maxCharacters)
      Creates a single frame that gives a graphical display of the tree.
      Nodes are colored.
  String drawTrees(int maxCharacters)
      Creates of the action of the function and any return value.
  String toString()
      Returns a string that describes the elements in a comma-separated
      list enclosed in brackets.
96
• Ford and Topp's tutorial, "RBTree Class.pdf",
provides more explanation of the RBTree class
– local copy at
http://fivedots.coe.psu.ac.th/Software.coe/
• Includes:
– a discussion of the private data
– explanation of the algorithms for splitting a 4-node
and performing a top-down insertion
97
Using the RBTree Class
import ds.util.RBTree;
public class UseRBTree
{
public static void main (String[] args)
{
// 10 values for the red-black tree
int[] intArr = {10, 25, 40, 15, 50, 45,
                30, 65, 70, 55};
RBTree<Integer> rbtree = new RBTree<Integer>();
// load the tree with values
for(int i = 0; i < intArr.length; i++) {
  rbtree.add(intArr[i]);   // loop body restored; add() comes from Collection<T>
  rbtree.drawTrees(4);     // display on-going tree in a JFrame
}
:
98
// display final tree to stdout
System.out.println(rbtree.displayTree(2));
/*
// remove red-node 25
rbtree.remove(25);
rbtree.drawTrees(4);
// tree shown in JFrame
// remove black-node root 45
rbtree.remove(45);
rbtree.drawTree(3);
// tree shown in JFrame
*/
} // end of main()
}
// end of UseRBTree class
99
Execution
Tree changes are shown graphically
by drawTrees() and drawTree()
100
single left rot of 25
BLACK RED
40
25
started red, then
flipped to black
15
a
50
b
split 25; color change 25
continued
101
double right rotation of 45
split 45
45
30
c
d
65
continued
102
single left rotation of 65
70
e
f
55
split 65; left rotation of 45
103
```

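The AVL insertion steps from section 2 (recurse down to the insertion point, then repair any node whose balance factor reaches 2 or -2 on the way back up) can be pulled together into one runnable sketch. This is an illustration under stated assumptions, not Ford and Topp's `AVLTree`: the class name `AvlSketch` and the helpers `fixHeight`, `isBalanced`, `inorder` and `build` are invented here, and each double rotation is written inline as two single rotations.

```java
import java.util.ArrayList;
import java.util.List;

public class AvlSketch {
    // Node layout follows the slides: value, height, two child links.
    static class AVLNode<T> {
        T nodeValue;
        int height = 0;
        AVLNode<T> left, right;
        AVLNode(T item) { nodeValue = item; }
    }

    // An empty subtree has height -1, as in the slides.
    static <T> int height(AVLNode<T> t) { return (t == null) ? -1 : t.height; }

    // Hypothetical helper: recompute a node's height from its children.
    static <T> void fixHeight(AVLNode<T> t) {
        t.height = Math.max(height(t.left), height(t.right)) + 1;
    }

    static <T> AVLNode<T> singleRotateRight(AVLNode<T> p) {
        AVLNode<T> lc = p.left;
        p.left = lc.right;
        lc.right = p;
        fixHeight(p);    // p is now below lc, so update it first
        fixHeight(lc);
        return lc;
    }

    static <T> AVLNode<T> singleRotateLeft(AVLNode<T> p) {
        AVLNode<T> rc = p.right;
        p.right = rc.left;
        rc.left = p;
        fixHeight(p);
        fixHeight(rc);
        return rc;
    }

    static <T extends Comparable<T>> AVLNode<T> addNode(AVLNode<T> t, T item) {
        if (t == null) return new AVLNode<T>(item);      // found insertion point
        int cmp = item.compareTo(t.nodeValue);
        if (cmp < 0) {
            t.left = addNode(t.left, item);
            if (height(t.left) - height(t.right) == 2) { // too heavy on left
                if (item.compareTo(t.left.nodeValue) > 0)
                    t.left = singleRotateLeft(t.left);   // left-right: first half of double rotation
                t = singleRotateRight(t);
            }
        } else if (cmp > 0) {
            t.right = addNode(t.right, item);
            if (height(t.left) - height(t.right) == -2) { // too heavy on right
                if (item.compareTo(t.right.nodeValue) < 0)
                    t.right = singleRotateRight(t.right); // right-left: first half of double rotation
                t = singleRotateLeft(t);
            }
        } else {
            throw new IllegalStateException("duplicate: " + item);
        }
        fixHeight(t);
        return t;
    }

    // Hypothetical helper: true if every balance factor is -1, 0 or +1.
    static <T> boolean isBalanced(AVLNode<T> t) {
        if (t == null) return true;
        int bf = height(t.left) - height(t.right);
        return bf >= -1 && bf <= 1 && isBalanced(t.left) && isBalanced(t.right);
    }

    static <T> void inorder(AVLNode<T> t, List<T> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.nodeValue);
        inorder(t.right, out);
    }

    // Hypothetical helper: insert the values one by one, starting from an empty tree.
    static AVLNode<Integer> build(int[] values) {
        AVLNode<Integer> root = null;
        for (int v : values) root = addNode(root, v);
        return root;
    }

    public static void main(String[] args) {
        AVLNode<Integer> root = build(new int[] {50, 95, 60, 90, 70, 80, 75, 78});
        List<Integer> sorted = new ArrayList<>();
        inorder(root, sorted);
        System.out.println("root = " + root.nodeValue + ", height = " + root.height);
        System.out.println("in-order = " + sorted + ", balanced = " + isBalanced(root));
    }
}
```

Run on the deck's input {50, 95, 60, 90, 70, 80, 75, 78}, the sketch ends with 70 at the root and a height of 3, which agrees with the `70(-1)` root in the Execution listing.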

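The three red-black properties from section 4.3 (BLACK root, no RED parent with a RED child, and the same number of BLACK nodes on every path to an empty subtree) are easy to check mechanically. The sketch below is a hypothetical checker, not part of the `RBTree` class: `RedBlackCheck`, its simplified `RBNode`, `blackHeight`, `isRedBlack` and `sample` are names assumed for this example, and the sample tree is built by hand rather than by insertion. It does not check the search-tree ordering of the values.

```java
public class RedBlackCheck {
    static final boolean RED = true, BLACK = false;

    // Simplified node for the illustration: no parent link.
    static class RBNode {
        int nodeValue;
        boolean color;
        RBNode left, right;
        RBNode(int nodeValue, boolean color, RBNode left, RBNode right) {
            this.nodeValue = nodeValue;
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the black height of the subtree rooted at t,
    // or -1 if property 2 or property 3 is broken somewhere inside it.
    static int blackHeight(RBNode t, boolean parentIsRed) {
        if (t == null) return 0;                        // empty subtree: no BLACK nodes
        if (parentIsRed && t.color == RED) return -1;   // property 2: two successive RED nodes
        int lh = blackHeight(t.left, t.color == RED);
        int rh = blackHeight(t.right, t.color == RED);
        if (lh < 0 || rh < 0 || lh != rh) return -1;    // property 3: unequal black counts
        return lh + (t.color == BLACK ? 1 : 0);
    }

    static boolean isRedBlack(RBNode root) {
        if (root == null) return true;
        if (root.color == RED) return false;            // property 1: root must be BLACK
        return blackHeight(root, false) >= 0;
    }

    // Hand-built example: BLACK 25 at the root, BLACK children 10 and 40,
    // each with one RED right child (15 and 50).
    static RBNode sample() {
        RBNode n15 = new RBNode(15, RED, null, null);
        RBNode n50 = new RBNode(50, RED, null, null);
        RBNode n10 = new RBNode(10, BLACK, null, n15);
        RBNode n40 = new RBNode(40, BLACK, null, n50);
        return new RBNode(25, BLACK, n10, n40);
    }

    public static void main(String[] args) {
        RBNode root = sample();
        System.out.println("valid = " + isRedBlack(root)
                + ", black height = " + blackHeight(root, false));
        root.left.color = RED;   // 10 becomes RED above RED 15: breaks property 2
        System.out.println("after recoloring 10: valid = " + isRedBlack(root));
    }
}
```

The sample tree has black height 2. Recoloring node 10 to RED puts two RED nodes in a row, so the second check fails.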

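The search procedure from section 3.1 (compare the item with all the values in a node, then move down to the appropriate subtree until a match or an empty subtree is reached) can be sketched as follows. The class `Tree234Search`, the `Node234` layout, and the helpers `leaf` and `sample` are assumptions made for this illustration, and the sample tree is a small hand-built 2-3-4 tree rather than the one drawn on the slides.

```java
public class Tree234Search {
    // A node holds 1 to 3 sorted values; an internal node has values.length + 1 children.
    static class Node234 {
        final int[] values;
        final Node234[] children;   // empty array for a node at the bottom
        Node234(int[] values, Node234... children) {
            this.values = values;
            this.children = children;
        }
    }

    // Hypothetical helper: a node with no children.
    static Node234 leaf(int... values) {
        return new Node234(values);
    }

    static boolean contains(Node234 t, int item) {
        while (t != null) {
            // compare the item with the values in this node, left to right
            int i = 0;
            while (i < t.values.length && item > t.values[i]) i++;
            if (i < t.values.length && item == t.values[i]) return true;   // match
            // no match: move down to subtree i, or stop at an empty subtree
            t = (t.children.length == 0) ? null : t.children[i];
        }
        return false;
    }

    // Hand-built example: a 2-node root (12), 2-node children (4) and (25),
    // and bottom nodes (2), (8, 10, 11), (15), (35, 55).
    static Node234 sample() {
        Node234 left = new Node234(new int[] {4}, leaf(2), leaf(8, 10, 11));
        Node234 right = new Node234(new int[] {25}, leaf(15), leaf(35, 55));
        return new Node234(new int[] {12}, left, right);
    }

    public static void main(String[] args) {
        Node234 root = sample();
        for (int item : new int[] {11, 9, 30}) {
            System.out.println(item + " -> " + contains(root, item));
        }
    }
}
```

In this sample tree the search finds 11 in the 4-node (8, 10, 11), while 9 and 30 each end at an empty subtree and are reported as absent.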