AVL tree


Week 8 - Friday

What did we talk about last time?

Practiced with 2-3 trees

Red-black trees

We can do an insertion with a red-black tree using a series of rotations and recolors

We do a regular BST insert

Then, we work back up the tree as the recursion unwinds

 If the right child is red and the left is black, we rotate the current node left

 If the left child is red and the left child of the left child is red, we rotate the current node right

 If both children are red, we recolor them black and the current node red

You have to do all these checks, in order!

 Multiple rotations can happen

It doesn't make sense to have a red root, so we always color the root black after the insert
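The checks above, applied in order as the recursion unwinds, can be sketched in Java; the `Node` and `put` names below are made up for illustration (this is the left-leaning red-black scheme, not necessarily the course's code):

```java
// A minimal sketch of red-black insertion with the three checks, in order
class LLRB {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int key;
        Node left, right;
        boolean color = RED;       // new nodes start out red
        Node(int key) { this.key = key; }
    }

    static boolean isRed(Node n) { return n != null && n.color == RED; }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node insert(Node h, int key) {
        if (h == null) return new Node(key);            // regular BST insert
        if (key < h.key) h.left = insert(h.left, key);
        else if (key > h.key) h.right = insert(h.right, key);

        // the three checks, in order, as the recursion unwinds
        if (isRed(h.right) && !isRed(h.left)) h = rotateLeft(h);
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left) && isRed(h.right)) {          // recolor
            h.color = RED;
            h.left.color = BLACK;
            h.right.color = BLACK;
        }
        return h;
    }

    static Node put(Node root, int key) {
        root = insert(root, key);
        root.color = BLACK;        // always color the root black after insert
        return root;
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```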

[Diagram: the current node X has a red right child Y; after the rotation, Y is the current node and X is its red left child, with subtrees A, B, C reattached]

We perform a left rotation when the right child is red

[Diagram: the current node Z has a red left child Y whose left child X is also red; after the rotation, Y is the current node with children X and Z, and subtrees A, B, C, D reattached]

We perform a right rotation when the left child is red and its left child is red

[Diagram: the current node Y has two red children X and Z; after recoloring, X and Z are black and Y is red, with subtrees A, B, C, D unchanged]

We recolor both children and the current node when both children are red

 Add the following keys to a red-black tree: 62, 11, 32, 7, 45, 24, 88, 25, 28, 90

The height of a red-black tree is no more than 2 log n

Find is Θ(height), so find is Θ(log n)

Since we only have to go down that path and back up to insert, insert is Θ(log n)

Delete in red-black trees is messy, but it is also actually Θ(log n)

Another approach to balancing is the AVL tree

 Named for its inventors G.M. Adelson-Velskii and E.M. Landis

 Invented in 1962

An AVL tree is better balanced than a red-black tree, but it takes more work to keep such a good balance

 It's faster for finds (because of the better balance)

 It's slower for inserts and deletes

 Like a red-black tree, all operations are Θ(log n)

 An AVL tree is a binary search tree where

 The left and right subtrees of the root have heights that differ by at most one

 The left and right subtrees are also AVL trees

[Diagram: example AVL tree containing the keys 1, 2, 6, 7, 9, 10, 14, 15, 17]

The balance factor of a tree (or subtree) is the height of its right subtree minus the height of its left subtree

In an AVL tree, the balance factor of every subtree is -1, 0, or +1
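The definition above can be sketched directly; the bare `Node` class and method names here are assumptions for illustration:

```java
// Computing balance factors and checking the AVL property
class BalanceFactor {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // balance factor = height of right subtree minus height of left subtree
    static int balanceFactor(Node n) {
        return height(n.right) - height(n.left);
    }

    // an AVL tree has a balance factor of -1, 0, or +1 at every subtree
    static boolean isAVL(Node n) {
        if (n == null) return true;
        int bf = balanceFactor(n);
        return bf >= -1 && bf <= 1 && isAVL(n.left) && isAVL(n.right);
    }
}
```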

[Diagram: tree with root 6, left child 1 (whose right child is 2), and right child 9]

 What's the balance factor of this tree? Count carefully.

Every time we do an add, we have to make sure that the tree is still balanced

There are 4 cases:

1. Left Left
2. Right Right
3. Left Right
4. Right Left

[Diagram: Left Left case — fixed with a right rotation]

[Diagram: Right Right case — fixed with a left rotation]

[Diagram: Left Right case — fixed with a left rotation followed by a right rotation]

[Diagram: Right Left case — fixed with a right rotation followed by a left rotation]

 Let’s just add these numbers in order:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

 What about these?

7, 24, 92, 32, 2, 57, 67, 84, 66, 75
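The four cases can be wired into a sketch of AVL insertion; the `Node` class and method names below are assumptions, not the course's code:

```java
// AVL insertion: after a regular BST insert, rebalance on the way back up
class AVL {
    static class Node {
        int key, height = 1;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? 0 : n.height; }

    static int balanceFactor(Node n) { return height(n.right) - height(n.left); }

    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n;                           // no duplicates
        update(n);
        int bf = balanceFactor(n);
        if (bf < -1) {                           // left-heavy
            if (balanceFactor(n.left) > 0)       // Left Right: extra left rotation
                n.left = rotateLeft(n.left);
            n = rotateRight(n);                  // Left Left: right rotation
        } else if (bf > 1) {                     // right-heavy
            if (balanceFactor(n.right) < 0)      // Right Left: extra right rotation
                n.right = rotateRight(n.right);
            n = rotateLeft(n);                   // Right Right: left rotation
        }
        return n;
    }
}
```

Inserting 1 through 10 in order with this sketch yields a tree of height 4 rather than a degenerate chain of height 10.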

What if we knew ahead of time which numbers we were going to put into a tree?

How could we make sure the tree is balanced?

Answer:

 Sort the numbers

 Recursively add the numbers such that the tree stays balanced

 Write a recursive method that adds a sorted array of data such that the tree stays balanced

Assume that an add(int key) method exists

Use the usual convention that start is the beginning of a range and end is the location after the last legal element in the range:

public void balance(int[] data, int start, int end)
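One possible solution to the exercise above: add the middle element first, then recurse on each half. Here the assumed add(int key) method just records the insertion order so the sketch can be checked:

```java
import java.util.ArrayList;
import java.util.List;

// Adding a sorted array to a BST so that the tree stays balanced
class BalancedInsert {
    List<Integer> added = new ArrayList<>();   // stand-in for the real tree

    // start is the beginning of the range; end is one past the last element
    public void balance(int[] data, int start, int end) {
        if (start >= end) return;              // empty range: nothing to add
        int mid = (start + end) / 2;
        add(data[mid]);                        // middle element roots this subtree
        balance(data, start, mid);             // left half
        balance(data, mid + 1, end);           // right half
    }

    void add(int key) { added.add(key); }      // a real tree insert would go here
}
```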

The DSW algorithm takes a potentially unbalanced tree and turns it into a balanced tree

Step 1

 Turn the tree into a degenerate tree (backbone or vine) by right rotating any nodes with left children

Step 2

 Turn the degenerate tree into a balanced tree by doing sequences of left rotations starting at the root

The analysis is not obvious, but it takes O(n) total rotations
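The two steps above can be sketched as follows, assuming a bare Node class with only a key and child links (names are made up, not the course's code):

```java
// DSW rebalancing: flatten to a vine, then compress with left rotations
class DSW {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Step 1: right-rotate away every left child, leaving a right-leaning vine
    static void treeToVine(Node pseudoRoot) {
        Node tail = pseudoRoot;
        Node rest = pseudoRoot.right;
        while (rest != null) {
            if (rest.left == null) {        // no left child: move down the vine
                tail = rest;
                rest = rest.right;
            } else {                        // right rotation about rest
                Node temp = rest.left;
                rest.left = temp.right;
                temp.right = rest;
                rest = temp;
                tail.right = temp;
            }
        }
    }

    // One pass of left rotations down the vine
    static void compress(Node pseudoRoot, int count) {
        Node scanner = pseudoRoot;
        for (int i = 0; i < count; i++) {
            Node child = scanner.right;
            scanner.right = child.right;
            scanner = scanner.right;
            child.right = scanner.left;
            scanner.left = child;
        }
    }

    // Step 2: compress the vine into a balanced tree
    static Node rebalance(Node root) {
        Node pseudoRoot = new Node(0);
        pseudoRoot.right = root;
        treeToVine(pseudoRoot);

        int n = 0;
        for (Node cur = pseudoRoot.right; cur != null; cur = cur.right) n++;

        // extra leaves on the bottom level of the final tree
        int leaves = n + 1 - Integer.highestOneBit(n + 1);
        compress(pseudoRoot, leaves);
        for (int size = n - leaves; size > 1; size /= 2)
            compress(pseudoRoot, size / 2);
        return pseudoRoot.right;
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```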

How much time does it take to insert n items with:

 Red-black or AVL tree

 Balanced insertion method

 Unbalanced insertion + DSW rebalance

How much space does it take to insert n items with:

 Red-black or AVL tree

 Balanced insertion method

 Unbalanced insertion + DSW rebalance

Balanced binary trees ensure that accessing any element takes O(log n) time

But, maybe only a few elements are accessed repeatedly

In this case, it may make more sense to restructure the tree so that these elements are closer to the root

1. Single rotation

 Rotate a child about its parent if an element in the child is accessed, unless it's the root

2. Moving to the root

 Repeat the child-parent rotation until the element being accessed is the root

A concept called splaying takes the rotate-to-the-root idea further, doing special rotations depending on the relationship of the child R with its parent Q and its grandparent P

Cases:

Node R's parent is the root

 Do a single rotation to make R the root

Node R is the left child of Q and Q is the left child of P (respectively right and right)

 Rotate Q around P

 Rotate R around Q

Node R is the left child of Q and Q is the right child of P (respectively right and left)

 Rotate R around Q

 Rotate R around P

We are not going deeply into self-organizing trees

It's an interesting concept, and rigorous mathematical analysis shows that splay trees should perform well, in terms of Big O bounds

In practice, however, they usually do not

An AVL tree or a red-black tree is generally the better choice

If you come upon some special problem where you repeatedly access a particular element most of the time, only then should you consider splay trees

Student Lecture

We can define a symbol table ADT with a few essential operations:

 put(Key key, Value value)

▪ Put the key-value pair into the table

 get(Key key)

▪ Retrieve the value associated with key

 delete(Key key)

▪ Remove the value associated with key

 contains(Key key)

▪ See if the table contains a key

 isEmpty()

 size()

It's also useful to be able to iterate over all keys
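The operations above can be collected into a Java interface; the HashMap-backed implementation here is only a placeholder to show the contract, not a suggested design:

```java
import java.util.HashMap;
import java.util.Map;

// The symbol table ADT as a Java interface
interface SymbolTable<Key, Value> {
    void put(Key key, Value value);   // put the key-value pair into the table
    Value get(Key key);               // retrieve the value associated with key
    void delete(Key key);             // remove the value associated with key
    boolean contains(Key key);        // see if the table contains a key
    boolean isEmpty();
    int size();
    Iterable<Key> keys();             // iterate over all keys
}

// Throwaway implementation backed by java.util.HashMap
class MapSymbolTable<Key, Value> implements SymbolTable<Key, Value> {
    private final Map<Key, Value> map = new HashMap<>();

    public void put(Key key, Value value) { map.put(key, value); }
    public Value get(Key key) { return map.get(key); }
    public void delete(Key key) { map.remove(key); }
    public boolean contains(Key key) { return map.containsKey(key); }
    public boolean isEmpty() { return map.isEmpty(); }
    public int size() { return map.size(); }
    public Iterable<Key> keys() { return map.keySet(); }
}
```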

We have been talking a lot about trees and other ways to keep ordered symbol tables

Ordered symbol tables are great, but we may not always need that ordering

Keeping an unordered symbol table might allow us to improve our running time

 Balanced binary search trees give us:

 O(log n) time to find a key

 O(log n) time to do insertions and deletions

Can we do better?

What about:

 O(1) time to find a key

 O(1) to do an insertion or a deletion

We make a huge array, so big that it has more spaces than the number of data values we expect

We use a hashing function that maps keys to indexes in the array

Using the hashing function, we know where to put each key but also where to look for a particular key

Let’s make a hash table to store integer keys

Our hash table will be 13 elements long

Our hashing function will be simply modding each value by 13

 Insert these keys: 3, 19, 7, 104, 89

0 1 2 3 4 5 6 7 8 9 10 11 12
104 – – 3 – – 19 7 – – – 89 –

 Find these keys:

 19 → YES! (19 mod 13 = 6, and slot 6 holds 19)

 88 → NO! (88 mod 13 = 10, and slot 10 is empty)

 16 → NO! (16 mod 13 = 3, but slot 3 holds 3, not 16)
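A minimal sketch of the example above: a 13-element table of Integer keys, hashed by mod 13, with no collision handling yet (the class and method names are made up):

```java
// A bare-bones hash table of int keys, table length 13, hash = key % 13
class IntHashTable {
    Integer[] table = new Integer[13];   // null means the slot is empty

    int hash(int key) { return key % 13; }

    void insert(int key) { table[hash(key)] = key; }

    boolean contains(int key) {
        Integer slot = table[hash(key)];
        return slot != null && slot == key;   // slot may hold a different key
    }
}
```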

We are using a hash table for a space/time tradeoff

Lots of space means we can get down to O(1)

How much space do we need?

How do we pick a good hashing function?

What happens if two values collide (map to the same location)?

 Hashing functions

Keep working on Project 3

 Sign up for your teams by today!

Finish Assignment 4

 Due tonight by midnight
