Week 8 - Friday
What did we talk about last time?
Practiced with 2-3 trees
Red-black trees
We can do an insertion with a red-black tree using a series of rotations and recolors
We do a regular BST insert
Then, we work back up the tree as the recursion unwinds
If the right child is red and the left is black, we rotate the current node left
If the left child is red and the left child of the left child is red, we rotate the current node right
If both children are red, we recolor them black and the current node red
You have to do all these checks, in order!
Multiple rotations can happen
It doesn't make sense to have a red root, so we always color the root black after the insert
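The checks above can be sketched in Java as follows. This is a minimal illustration, assuming int keys, ignored duplicates, and hypothetical names (RedBlackSketch, isRed, flipColors, and so on) that do not come from the lecture.

```java
// Sketch of red-black insertion following the three checks listed above.
// All class, field, and method names here are illustrative assumptions.
class RedBlackSketch {
    private static final boolean RED = true;
    private static final boolean BLACK = false;

    private static class Node {
        int key;
        Node left, right;
        boolean color = RED;                  // a newly inserted node is red
        Node(int key) { this.key = key; }
    }

    private Node root;

    private static boolean isRed(Node n) {
        return n != null && n.color == RED;   // null links count as black
    }

    // Rotate the current node left: its right child becomes the subtree root.
    private static Node rotateLeft(Node current) {
        Node y = current.right;
        current.right = y.left;
        y.left = current;
        y.color = current.color;
        current.color = RED;
        return y;
    }

    // Rotate the current node right: its left child becomes the subtree root.
    private static Node rotateRight(Node current) {
        Node x = current.left;
        current.left = x.right;
        x.right = current;
        x.color = current.color;
        current.color = RED;
        return x;
    }

    // Recolor: both children become black and the current node becomes red.
    private static void flipColors(Node current) {
        current.color = RED;
        current.left.color = BLACK;
        current.right.color = BLACK;
    }

    public void insert(int key) {
        root = insert(root, key);
        root.color = BLACK;                   // the root is always colored black
    }

    private static Node insert(Node current, int key) {
        if (current == null) return new Node(key);      // regular BST insert
        if (key < current.key) current.left = insert(current.left, key);
        else if (key > current.key) current.right = insert(current.right, key);
        // Equal keys are ignored in this sketch.

        // The three checks, in order, as the recursion unwinds.
        if (isRed(current.right) && !isRed(current.left)) current = rotateLeft(current);
        if (isRed(current.left) && isRed(current.left.left)) current = rotateRight(current);
        if (isRed(current.left) && isRed(current.right)) flipColors(current);
        return current;
    }

    public boolean contains(int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    // Height counted in nodes on the longest root-to-leaf path (0 if empty).
    public int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Because each check runs on every node along the insertion path, a single insert can trigger several rotations and recolorings on the way back up.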
[Figure: left rotation at the current node, shown before and after, with nodes X and Y and subtrees A, B, and C]
We perform a left rotation when the right child is red
[Figure: right rotation at the current node, shown before and after, with nodes X, Y, and Z and subtrees A, B, C, and D]
We perform a right rotation when the left child is red and its left child is red
[Figure: recoloring at the current node Y with children X and Z and subtrees A, B, C, and D, shown before and after]
We recolor both children and the current node when both children are red
Add the following keys to a red-black tree:
62, 11, 32, 7, 45, 24, 88, 25, 28, 90
The height of a red-black tree is no more than 2 log n
Find is Θ(height), so find is Θ(log n)
Since we only have to go down that path and back up to insert, insert is Θ(log n)
Delete in red-black trees is messy, but it is also Θ(log n)
Another approach to balancing is the AVL tree
Named for its inventors G.M. Adelson-Velskii and E.M. Landis
Invented in 1962
An AVL tree is better balanced than a red-black tree, but it takes more work to keep such a good balance
It's faster for finds (because of the better balance)
It's slower for inserts and deletes
Like a red-black tree, all operations are Θ(log n)
An AVL tree is a binary search tree where
The left and right subtrees of the root have heights that differ by at most one
The left and right subtrees are also AVL trees
[Figure: example tree containing the keys 1, 2, 6, 7, 9, 10, 14, 15, and 17]
The balance factor of a tree (or subtree) is the height of its right subtree minus the height of its left
In an AVL tree, the balance factor of every subtree is -1, 0, or +1
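As a small illustration of this definition, the sketch below computes a balance factor directly from subtree heights. It assumes the convention that an empty tree has height -1 and a single node has height 0; the class and method names are hypothetical.

```java
// Sketch: balance factor = height of right subtree - height of left subtree.
// Names are illustrative; an empty subtree is assumed to have height -1.
class BalanceFactorSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Height in edges of the longest path down from n.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int balanceFactor(Node n) {
        return n == null ? 0 : height(n.right) - height(n.left);
    }

    // Checks only the balance condition at every node, not the BST ordering.
    static boolean isBalanced(Node n) {
        if (n == null) return true;
        int bf = balanceFactor(n);
        return bf >= -1 && bf <= 1 && isBalanced(n.left) && isBalanced(n.right);
    }
}
```

Recomputing heights from scratch like this is slow for large trees; an AVL implementation would normally store the height (or the balance factor) in each node.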
[Figure: tree containing the keys 6, 1, 9, and 2]
What’s the balance factor of this tree?
Adding to an AVL tree has to be done carefully
Every time we do an add, we have to make sure that the tree is still balanced
There are 4 cases
1. Left Left
2. Right Right
3. Left Right
4. Right Left
[Figure: Left Left case, fixed with a right rotation]
[Figure: Right Right case, fixed with a left rotation]
[Figure: Left Right case, fixed with a left rotation followed by a right rotation]
[Figure: Right Left case, fixed with a right rotation followed by a left rotation]
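The four cases can be sketched as an AVL add in Java. This is one possible arrangement, assuming int keys, ignored duplicates, a height stored in every node, and hypothetical names such as AvlSketch and update.

```java
// Sketch of AVL insertion covering the four rebalancing cases.
// Names and the stored-height representation are illustrative assumptions.
class AvlSketch {
    private static class Node {
        int key;
        int height;                     // a leaf has height 0
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    private static int height(Node n) { return n == null ? -1 : n.height; }

    private static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Balance factor: right height minus left height.
    private static int balance(Node n) { return height(n.right) - height(n.left); }

    private static Node rotateRight(Node top) {
        Node x = top.left;
        top.left = x.right;
        x.right = top;
        update(top);                    // top is now lower, so update it first
        update(x);
        return x;
    }

    private static Node rotateLeft(Node top) {
        Node y = top.right;
        top.right = y.left;
        y.left = top;
        update(top);
        update(y);
        return y;
    }

    public void add(int key) { root = add(root, key); }

    private static Node add(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = add(n.left, key);
        else if (key > n.key) n.right = add(n.right, key);
        else return n;                  // duplicates are ignored in this sketch
        update(n);

        int bf = balance(n);
        if (bf < -1) {                  // left side is too tall
            if (balance(n.left) > 0)    // Left Right: left rotation first
                n.left = rotateLeft(n.left);
            return rotateRight(n);      // Left Left: right rotation
        }
        if (bf > 1) {                   // right side is too tall
            if (balance(n.right) < 0)   // Right Left: right rotation first
                n.right = rotateRight(n.right);
            return rotateLeft(n);       // Right Right: left rotation
        }
        return n;
    }

    // Height in edges (-1 for an empty tree).
    public int height() { return height(root); }
}
```

The double-rotation cases reuse the single rotations: the first rotation turns a Left Right shape into Left Left (or Right Left into Right Right), and the second rotation fixes it.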
Let’s just add these numbers in order:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
What about these?
7, 24, 92, 32, 2, 57, 67, 84, 66, 75
What if we knew ahead of time which numbers we were going to put into a tree?
How could we make sure the tree is balanced?
Answer:
Sort the numbers
Recursively add the numbers such that the tree stays balanced
Write a recursive method that adds a sorted array of data such that the tree stays balanced
Assume that an add(int key) method exists
Use the usual convention that start is the beginning of a range and end is the location after the last legal element in the range
public void balance(int[] data, int start, int end)
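One possible solution is sketched below: add the middle element of the range first, then recurse on the two halves. The surrounding class, its Node type, and the plain unbalanced add(int key) are assumptions standing in for whatever tree class the method would actually live in.

```java
// Sketch: build a balanced BST from a sorted array by always adding the
// middle element of the current range first. The tree class is an assumption.
class BalancedBuildSketch {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Ordinary unbalanced BST insertion, standing in for the assumed add(int key).
    public void add(int key) { root = add(root, key); }

    private static Node add(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = add(n.left, key);
        else if (key > n.key) n.right = add(n.right, key);
        return n;
    }

    // start is the first index in the range; end is one past the last index.
    public void balance(int[] data, int start, int end) {
        if (start >= end) return;               // empty range: nothing to add
        int mid = start + (end - start) / 2;
        add(data[mid]);                         // the middle key becomes a subtree root
        balance(data, start, mid);              // keys smaller than data[mid]
        balance(data, mid + 1, end);            // keys larger than data[mid]
    }

    // Height in edges (-1 for an empty tree).
    public int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

A call such as balance(data, 0, data.length) on a sorted array covers the whole array.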
The DSW (Day-Stout-Warren) algorithm takes a potentially unbalanced tree and turns it into a balanced tree
Step 1: Turn the tree into a degenerate tree (backbone or vine) by right rotating any nodes with left children
Step 2: Turn the degenerate tree into a balanced tree by doing sequences of left rotations starting at the root
The analysis is not obvious, but it takes O(n) total rotations
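The two steps can be sketched in Java as below. This follows the commonly published Stout-Warren formulation with a temporary pseudo-root node; the class name, the helper names treeToVine and compress, and the unbalanced add are assumptions rather than the lecture's own code.

```java
// Sketch of DSW rebalancing: tree -> vine (right rotations), then
// vine -> balanced tree (passes of left rotations). Names are illustrative.
class DswSketch {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Plain unbalanced BST insertion (iterative), ignoring duplicates.
    public void add(int key) {
        Node fresh = new Node(key);
        if (root == null) { root = fresh; return; }
        Node n = root;
        while (true) {
            if (key < n.key) {
                if (n.left == null) { n.left = fresh; return; }
                n = n.left;
            } else if (key > n.key) {
                if (n.right == null) { n.right = fresh; return; }
                n = n.right;
            } else return;
        }
    }

    // Step 1: right rotate every node that has a left child; returns node count.
    private static int treeToVine(Node pseudoRoot) {
        Node tail = pseudoRoot;
        Node rest = tail.right;
        int size = 0;
        while (rest != null) {
            if (rest.left == null) {        // already part of the vine
                tail = rest;
                rest = rest.right;
                size++;
            } else {                        // right rotation at rest
                Node temp = rest.left;
                rest.left = temp.right;
                temp.right = rest;
                rest = temp;
                tail.right = temp;
            }
        }
        return size;
    }

    // One pass of count left rotations down the right spine.
    private static void compress(Node pseudoRoot, int count) {
        Node scanner = pseudoRoot;
        for (int i = 0; i < count; i++) {
            Node child = scanner.right;
            scanner.right = child.right;
            scanner = scanner.right;
            child.right = scanner.left;
            scanner.left = child;
        }
    }

    public void rebalance() {
        Node pseudoRoot = new Node(0);      // temporary parent of the real root
        pseudoRoot.right = root;
        int size = treeToVine(pseudoRoot);
        // Largest count of the form 2^k - 1 that is <= size.
        int full = Integer.highestOneBit(size + 1) - 1;
        compress(pseudoRoot, size - full);  // Step 2: handle the bottom level first
        size = full;
        while (size > 1) {
            size /= 2;
            compress(pseudoRoot, size);
        }
        root = pseudoRoot.right;
    }

    public boolean contains(int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    // Height in edges (-1 for an empty tree).
    public int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Apart from the temporary pseudo-root, the sketch only relinks existing nodes, so it allocates no new tree nodes while rebalancing.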
How much time does it take to insert n items with:
Red-black or AVL tree
Balanced insertion method
Unbalanced insertion + DSW rebalance
How much space does it take to insert n items with:
Red-black or AVL tree
Balanced insertion method
Unbalanced insertion + DSW rebalance
Balanced binary trees ensure that accessing any element takes O(log n) time
But, maybe only a few elements are accessed repeatedly
In this case, it may make more sense to restructure the tree so that these elements are closer to the root
1. Single rotation: Rotate a child about its parent if an element in a child is accessed, unless it's the root
2. Moving to the root: Repeat the child-parent rotation until the element being accessed is the root
A concept called splaying takes the rotate-to-the-root idea further, doing special rotations depending on the relationship of the child R with its parent Q and grandparent P
Cases:
1. Node R's parent is the root
Do a single rotation to make R the root
2. Node R is the left child of Q and Q is the left child of P (respectively right and right)
Rotate Q around P
Rotate R around Q
3. Node R is the left child of Q and Q is the right child of P (respectively right and left)
Rotate R around Q
Rotate R around P
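The three cases can be sketched with a recursive splay routine, shown below. This recursive formulation is an assumption chosen for brevity (it needs no parent pointers); the class and method names are hypothetical, and add is a plain BST insert that does not splay, which is a simplification.

```java
// Sketch of splaying an accessed key to the root, recursively.
// Names are illustrative; add() does not splay, to keep the sketch short.
class SplaySketch {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    public void add(int key) { root = add(root, key); }

    private static Node add(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = add(n.left, key);
        else if (key > n.key) n.right = add(n.right, key);
        return n;
    }

    private static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        return x;
    }

    private static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        return x;
    }

    // Brings key (or the last node on its search path) to the root of h.
    private static Node splay(Node h, int key) {
        if (h == null) return null;
        if (key < h.key) {
            if (h.left == null) return h;
            if (key < h.left.key) {             // left-left: rotate Q around P
                h.left.left = splay(h.left.left, key);
                h = rotateRight(h);
            } else if (key > h.left.key) {      // left-right: rotate R around Q
                h.left.right = splay(h.left.right, key);
                if (h.left.right != null) h.left = rotateLeft(h.left);
            }
            return h.left == null ? h : rotateRight(h);   // final rotation
        } else if (key > h.key) {
            if (h.right == null) return h;
            if (key > h.right.key) {            // right-right
                h.right.right = splay(h.right.right, key);
                h = rotateLeft(h);
            } else if (key < h.right.key) {     // right-left
                h.right.left = splay(h.right.left, key);
                if (h.right.left != null) h.right = rotateRight(h.right);
            }
            return h.right == null ? h : rotateLeft(h);
        }
        return h;                               // key is already at the root
    }

    // Accesses key, splaying it to the root; returns true if it was found.
    public boolean access(int key) {
        root = splay(root, key);
        return root != null && root.key == key;
    }

    // Assumes the tree is not empty.
    public int rootKey() { return root.key; }
}
```

After a successful access the key sits at the root, so an immediate repeat access to the same key finishes in constant time.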
We are not going deeply into self-organizing trees
It's an interesting concept, and rigorous mathematical analysis shows that splay trees should perform well, in terms of Big O bounds
In practice, however, they usually do not
An AVL tree or a red-black tree is generally the better choice
If you come upon some special problem where you repeatedly access a particular element most of the time, only then should you consider splay trees
Student Lecture
We can define a symbol table ADT with a few essential operations:
put(Key key, Value value)
▪ Put the key-value pair into the table
get(Key key)
▪ Retrieve the value associated with key
delete(Key key)
▪ Remove the value associated with key
contains(Key key)
▪ See if the table contains a key
isEmpty()
size()
It's also useful to be able to iterate over all keys
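The operations above could be written down as a Java interface, with a deliberately simple unordered implementation to show how they fit together. The interface name, the keys() method, and the linked-list implementation are assumptions for illustration only; every operation that looks up a key takes O(n) time here.

```java
// Sketch of the symbol table ADT as an interface, plus a simple unordered
// linked-list implementation. All names besides the listed operations are
// illustrative assumptions.
interface SymbolTable<Key, Value> {
    void put(Key key, Value value);     // put the key-value pair into the table
    Value get(Key key);                 // value for key, or null if absent
    void delete(Key key);               // remove the key and its value
    boolean contains(Key key);
    boolean isEmpty();
    int size();
    Iterable<Key> keys();               // iterate over all keys
}

class ListSymbolTable<Key, Value> implements SymbolTable<Key, Value> {
    private class Node {
        Key key;
        Value value;
        Node next;
        Node(Key key, Value value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node first;
    private int count;

    // Sequential search for the node holding key, or null.
    private Node find(Key key) {
        for (Node x = first; x != null; x = x.next)
            if (x.key.equals(key)) return x;
        return null;
    }

    public void put(Key key, Value value) {
        Node x = find(key);
        if (x != null) { x.value = value; return; }   // overwrite existing key
        first = new Node(key, value, first);
        count++;
    }

    public Value get(Key key) {
        Node x = find(key);
        return x == null ? null : x.value;
    }

    public void delete(Key key) {
        Node prev = null;
        for (Node x = first; x != null; prev = x, x = x.next) {
            if (x.key.equals(key)) {
                if (prev == null) first = x.next;
                else prev.next = x.next;
                count--;
                return;
            }
        }
    }

    public boolean contains(Key key) { return find(key) != null; }
    public boolean isEmpty() { return count == 0; }
    public int size() { return count; }

    public Iterable<Key> keys() {
        java.util.ArrayList<Key> result = new java.util.ArrayList<>();
        for (Node x = first; x != null; x = x.next) result.add(x.key);
        return result;
    }
}
```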
We have been talking a lot about trees and other ways to keep ordered symbol tables
Ordered symbol tables are great, but we may not always need that ordering
Keeping an unordered symbol table might allow us to improve our running time
Balanced binary search trees give us:
O(log n) time to find a key
O(log n) time to do insertions and deletions
Can we do better?
What about:
O(1) time to find a key
O(1) to do an insertion or a deletion
We make a huge array, so big that we’ll have more spaces in the array than we expect data values
We use a hashing function that maps keys to indexes in the array
Using the hashing function, we know where to put each key but also where to look for a particular key
Let’s make a hash table to store integer keys
Our hash table will be 13 elements long
Our hashing function will be simply modding each value by 13
Insert these keys: 3, 19, 7, 104, 89
Index:  0    1   2   3   4   5   6   7   8   9   10  11  12
Key:    104  -   -   3   -   -   19  7   -   -   -   89  -
Find these keys:
19: YES!
88: NO!
16: NO!
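The example above can be reproduced with a short Java sketch. It assumes a fixed table of 13 slots and the mod-13 hashing function from the slides, and it deliberately does not handle collisions: an insert into an occupied slot is simply rejected. The class and method names are hypothetical.

```java
// Sketch of the 13-slot hash table from the example, with key mod 13 as the
// hashing function. Collisions are NOT handled; names are illustrative.
class ModHashSketch {
    private static final int SIZE = 13;
    private final int[] keys = new int[SIZE];
    private final boolean[] used = new boolean[SIZE];

    // Maps a key to an index in 0..12 (floorMod also handles negative keys).
    static int hash(int key) {
        return Math.floorMod(key, SIZE);
    }

    // Returns false if the slot already holds a different key.
    public boolean insert(int key) {
        int i = hash(key);
        if (used[i] && keys[i] != key) return false;
        keys[i] = key;
        used[i] = true;
        return true;
    }

    // Looks only at the one slot where the key would have to be.
    public boolean contains(int key) {
        int i = hash(key);
        return used[i] && keys[i] == key;
    }
}
```

With the keys 3, 19, 7, 104, and 89 inserted, contains(19) is true, while contains(88) and contains(16) are false: slot 10 is empty, and slot 3 holds 3 rather than 16.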
We are using a hash table for a space/time tradeoff
Lots of space means we can get down to O(1)
How much space do we need?
How do we pick a good hashing function?
What happens if two values collide (map to the same location)?
Hashing functions
Keep working on Project 3
Sign up for your teams by today!
Finish Assignment 4
Due tonight by midnight