Lecture 9 -- More on BST

DCO20105 Data structures and algorithms
Lecture 9: More on BST
• Removal from a BST
• Some advanced balanced BST trees (AVL trees): 2-3-4 tree, Red-Black tree

-- By Rossella Lau
Rossella Lau
Lecture 9, DCO20105, Semester A,2005-6
Re-visit on BST
• A BST is a binary tree where, at every node, all the values of the left sub-tree are less than the node's value and all the values of the right sub-tree are greater than it
• It supports O(log n) execution time for both search and insert in optimal cases, when the BST has high density
• The worst execution time may be O(n), when the BST is sparse
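The difference between a dense and a sparse BST can be seen by building both from the same keys. The following is a minimal sketch, not the course's BSTree<T>: the Node type and the functions insert, contains, height and heightAfterInserting are names assumed for this illustration only.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical minimal node type for this sketch (int keys only).
struct Node {
    int item;
    std::unique_ptr<Node> left, right;
    explicit Node(int v) : item(v) {}
};

// Walk down from the root to the empty link where the key belongs.
// Duplicates are ignored.
void insert(std::unique_ptr<Node>& root, int key) {
    std::unique_ptr<Node>* cur = &root;
    while (*cur) {
        if (key == (*cur)->item) return;
        cur = key < (*cur)->item ? &(*cur)->left : &(*cur)->right;
    }
    *cur = std::make_unique<Node>(key);
}

// Search visits at most one node per level, so its cost is bounded
// by the height of the tree.
bool contains(const Node* root, int key) {
    while (root) {
        if (key == root->item) return true;
        root = key < root->item ? root->left.get() : root->right.get();
    }
    return false;
}

// Number of levels in the tree (0 for an empty tree).
int height(const Node* root) {
    if (!root) return 0;
    return 1 + std::max(height(root->left.get()), height(root->right.get()));
}

// Build a BST by inserting the keys in the given order; return its height.
int heightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<Node> root;
    for (int k : keys) insert(root, k);
    return height(root.get());
}
```

Inserting 4, 2, 6, 1, 3, 5, 7 gives a dense tree of height 3, while inserting the same seven keys in sorted order gives a chain of height 7, so a search may have to visit every node.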
Some facts of a BST
• A binary search tree's in-order traversal sequence is in sorted order
  • Insertion into a BST can therefore be treated as a tree sort method, and this is another O(n log n) sort algorithm in optimal cases
• The minimum value of a BST is at the left-most node
    BSTNode<T> *cur = root; // assume size() >= 1
    while (cur->left) cur = cur->left;
    return cur->item;
• The maximum value of a BST is at the right-most node
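The tree sort idea and the right-most rule can be sketched as follows. This is an illustration under assumptions, not the course library: Node, insert, inorder, destroy, treeSort and maxItem are hypothetical names, and keys are plain int.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal node type for this sketch.
struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Recursive insertion; equal keys are placed in the right sub-tree.
Node* insert(Node* root, int key) {
    if (!root) return new Node(key);
    if (key < root->item) root->left = insert(root->left, key);
    else root->right = insert(root->right, key);
    return root;
}

// In-order traversal: left sub-tree, node, right sub-tree.
void inorder(const Node* root, std::vector<int>& out) {
    if (!root) return;
    inorder(root->left, out);
    out.push_back(root->item);
    inorder(root->right, out);
}

// Release every node of the tree.
void destroy(Node* root) {
    if (!root) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}

// Tree sort: insert every key, then read the keys back in order.
std::vector<int> treeSort(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) root = insert(root, k);
    std::vector<int> sorted;
    inorder(root, sorted);
    destroy(root);
    return sorted;
}

// Mirror image of the minimum search: keep following right links.
int maxItem(const Node* root) {
    assert(root && "tree must not be empty");
    while (root->right) root = root->right;
    return root->item;
}
```

Note that the node reached by maxItem may still have a left child, so it is the right-most node but not necessarily a leaf.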
BST removal
• Removing a node from a BST must leave the resulting tree a BST
  • No node may have more than two children
  • left sub-tree < root < right sub-tree
• Should consider the different situations of the node to be removed (or a sub-tree)
  • A leaf
  • A node with a single child
  • A full node, which has two children
Deletion of an item which is a leaf
[Figure: a BST holding 50, 28, 22, 75, 40, 35, 90, 87, 95, shown before and after deleting the leaf 22]
• Delete 22: when the item is found, delete it!
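The algorithm for deletion of a leaf
template <typename T>
bool BSTree<T>::remove (T const & target)
{
    // find() is assumed to return a reference to the link
    // (parent's child pointer or the root) that points at the target
    BSTNode<T> *& contentAt (find (target));
    if (! contentAt ) return false;

    BSTNode<T> *forDelete (contentAt);
    if (contentAt->isLeaf())
        contentAt = 0;              // detach the leaf from its parent

    forDelete->left = forDelete->right = 0;
    delete forDelete;
    countNodes--;
    return true;
}

template <typename T>
bool BSTNode<T>::isLeaf(void)
{ return !left && !right; }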
Deletion of an item which has one child
[Figure: a BST holding 50, 28, 75, 40, 35, 90, 87, 95, shown before and after deleting 75]
• Delete 75: when the item is found, put its only child at its place
The algorithm for deletion of single child node
template <typename T>
bool BSTree<T>::remove (T const & target)
{
    BSTNode<T> *& contentAt (find (target));
    if (! contentAt ) return false;

    BSTNode<T> *forDelete (contentAt);
    if (!contentAt->isLeaf() && !contentAt->isFull())  // with single child
        contentAt = contentAt->left ? contentAt->left  // promote the only child
                                    : contentAt->right;

    forDelete->left = forDelete->right = 0;
    delete forDelete;
    countNodes--;
    return true;
}

template <typename T>
bool BSTNode<T>::isFull(void)
{ return left && right; }
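The leaf case and the single-child case can be handled by one assignment, because a leaf's "only child" is simply a null pointer. The following runnable sketch illustrates this under assumptions: Node, findLink, insert and removeSimple are hypothetical names, keys are int, and full nodes are deliberately rejected rather than handled.

```cpp
// Hypothetical minimal node type for this sketch.
struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Return a reference to the link (the root pointer or a parent's child
// pointer) that points at the target, or to the null link where the
// target would be inserted.
Node*& findLink(Node*& root, int target) {
    Node** link = &root;
    while (*link && (*link)->item != target)
        link = target < (*link)->item ? &(*link)->left : &(*link)->right;
    return *link;
}

// Insert a key if it is not already present.
void insert(Node*& root, int key) {
    Node*& link = findLink(root, key);
    if (!link) link = new Node(key);
}

// Remove a leaf or a node with a single child.
// Returns false if the target is missing or is a full node.
bool removeSimple(Node*& root, int target) {
    Node*& contentAt = findLink(root, target);
    if (!contentAt) return false;
    if (contentAt->left && contentAt->right) return false;  // full node

    Node* forDelete = contentAt;
    // Leaf: both children are null, so the link becomes null.
    // Single child: the only child takes the removed node's place.
    contentAt = forDelete->left ? forDelete->left : forDelete->right;
    delete forDelete;
    return true;
}
```

Because contentAt is a reference to the link itself, assigning to it rewires the parent (or the root) without tracking a separate parent pointer.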
Deletion of an item which has two children
[Figure: a BST holding 50, 28, 90, 40, 87, 35, 95, shown before deleting 50 and after, with either 87 (the in-order successor) or 40 (the in-order predecessor) taking the place of 50]
• Delete 50:
  • Theory: the in-order successor of a node with two children has at most one child, on its right-hand side; the in-order predecessor has at most one child, on its left-hand side
  • When the item is found at node n, replace n's data with that of n's in-order successor s or predecessor p, then deletion goes to s or p -- s or p is either a leaf or a node with a single child!

The algorithm for deletion of an internal node
template <typename T>
bool BSTree<T>::remove (T const & target)
{
    BSTNode<T> *& contentAt (find (target));
    if (! contentAt ) return false;

    BSTNode<T> *& forDelete (prepareRemoval (contentAt));
    BSTNode<T> *realDelete (forDelete);
    …… // deletion of a leaf or a single child’s parent
}

template <typename T>
BSTNode<T> *& BSTree<T>::prepareRemoval (BSTNode<T> *& contentAt)
{
    if (contentAt->isFull()) {
        // move the target's item down to its in-order successor, which
        // is a leaf or has a single child; getItem() is assumed to
        // return a reference so that swap() can exchange the items
        BSTNode<T> *& succ (successor (contentAt));
        swap (succ->getItem(), contentAt->getItem());
        return succ;
    }
    else return contentAt;
}
The algorithm for finding an inorder successor
template <typename T>
BSTNode<T> *& BSTree<T>::successor (BSTNode<T> const *p)
{
    // Assume that the input node (p) has two children
    BSTNode<T> *it (const_cast<BSTNode<T>*> (p));
    if (it->right->left) {      // successor is the left-most node
                                // of the right sub-tree
        it = it->right;
        while (it->left->left) it = it->left;
        return it->left;
    }
    else                        // successor is the right child
        return it->right;
}
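Putting the pieces together, removal of a full node reduces to removal of its in-order successor. The sketch below is one possible arrangement, not the course's BSTree<T>: Node, findLink, insert, successorLink, removeItem and inorder are hypothetical names, and the successor's item is copied up rather than swapped, since the target's item is discarded anyway.

```cpp
#include <vector>

// Hypothetical minimal node type for this sketch.
struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Reference to the link pointing at the target (or to the null link
// where it would be).
Node*& findLink(Node*& root, int target) {
    Node** link = &root;
    while (*link && (*link)->item != target)
        link = target < (*link)->item ? &(*link)->left : &(*link)->right;
    return *link;
}

void insert(Node*& root, int key) {
    Node*& link = findLink(root, key);
    if (!link) link = new Node(key);
}

// Link to the in-order successor of a node with two children:
// the left-most node of its right sub-tree.
Node*& successorLink(Node* full) {
    Node** link = &full->right;
    while ((*link)->left) link = &(*link)->left;
    return *link;
}

bool removeItem(Node*& root, int target) {
    Node*& contentAt = findLink(root, target);
    if (!contentAt) return false;

    Node** victim = &contentAt;
    if (contentAt->left && contentAt->right) {
        Node*& succ = successorLink(contentAt);
        contentAt->item = succ->item;  // successor's item moves up
        victim = &succ;                // the successor node is removed instead
    }
    // The victim is now a leaf or has a single child.
    Node* forDelete = *victim;
    *victim = forDelete->left ? forDelete->left : forDelete->right;
    delete forDelete;
    return true;
}

void inorder(const Node* root, std::vector<int>& out) {
    if (!root) return;
    inorder(root->left, out);
    out.push_back(root->item);
    inorder(root->right, out);
}
```

With the keys 50, 28, 90, 40, 87, 95, 35 inserted in that order (an assumed order that reproduces the lecture's example keys), removing 50 puts 87 at the root and the in-order sequence stays sorted.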
Notes on const_cast
• C++ supports the following type cast operators:
  • const_cast, to cast away the constant attribute
    • In the previous example, p is passed as a pointer pointing to a constant object.
    • However, successor() traverses p's children and returns a modifiable reference to one of the child pointers (e.g. return it->left;), which the compiler would not allow through a pointer to a constant object.
    • To allow this, const_cast is needed to temporarily cast away the constant attribute of p
  • static_cast, the new way to do the former type cast
    • Former way: doubleResult = (double) intA / intB;
    • C++: doubleResult = static_cast<double> (intA) / intB;
  • The other two, which are not encouraged:
    • dynamic_cast, reinterpret_cast
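A small sketch of the two encouraged casts follows. Node, leftMost and average are hypothetical names chosen for this illustration. Note that writing through a pointer obtained with const_cast is only well defined when the underlying object was not itself declared const.

```cpp
// Hypothetical minimal node type for this sketch.
struct Node {
    int item;
    Node* left;
    Node* right;
};

// Accepts a pointer to a constant node but hands back a modifiable
// pointer to its left-most descendant (or to the node itself).
Node* leftMost(const Node* p) {
    Node* it = const_cast<Node*>(p);  // cast away the constant attribute
    while (it->left) it = it->left;
    return it;
}

// static_cast converts one operand to double so that the division is
// done in floating point instead of integer arithmetic.
double average(int sum, int count) {
    return static_cast<double>(sum) / count;
}
```

For example, average(7, 2) yields 3.5, whereas the plain integer expression 7 / 2 yields 3.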
Exercises on BST removal
• BST removal:
  • Ford’s exercise: 10:26: delete 30, 80, 25; 10:32
• Other BST removal related functions
  • find a predecessor
Complexity for remove()
• The main logic of remove() is still find(). However, it also requires a function successor() to search for an in-order successor. successor() only walks down from the node that find() located, so its complexity is less than or equal to that of find()
• Therefore remove() has the same big O complexity as find()
Balanced Binary Tree
• To solve the problem of a "linear" BST and maintain an optimal complexity, the problem becomes how to maintain a balanced binary tree
• A balanced binary tree is also called an AVL tree
  • It was discovered by two Russian mathematicians: Adel'son-Vel'skii and Landis
• First, the height of a tree is defined as its depth
• Then, a balanced binary tree is a binary tree in which the heights of the two sub-trees of every node never differ by more than 1.
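The balance condition can be checked directly from the definition. The sketch below uses assumed names (Node, balancedHeight, isBalanced) and tests only the height condition, not the BST ordering of the keys.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical minimal node type for this sketch.
struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Returns the height of the sub-tree rooted at n (0 for an empty
// sub-tree), or -1 as soon as some node's two sub-trees differ in
// height by more than 1.
int balancedHeight(const Node* n) {
    if (!n) return 0;
    int lh = balancedHeight(n->left);
    if (lh < 0) return -1;
    int rh = balancedHeight(n->right);
    if (rh < 0) return -1;
    if (std::abs(lh - rh) > 1) return -1;
    return 1 + std::max(lh, rh);
}

// True when every node satisfies the AVL height condition.
bool isBalanced(const Node* root) {
    return balancedHeight(root) >= 0;
}
```

Each node is visited once, so the check runs in O(n) time. A three-node chain (1, then 2 as its right child, then 3 as the right child of 2) fails the check at its root, where the sub-tree heights are 0 and 2.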
Examples of AVL BST and non-AVL BST
[Figure: two example trees with letter-labelled nodes, one captioned "AVL tree" and the other "Non AVL tree"]
Efficiency concerns on an AVL BST
• There are efficient algorithms to maintain a binary tree as an AVL tree
  • After inserting a node, the tree can be restored to an AVL tree with O(1) rotations (not counting the search); removal may need a rotation at each level on the way back to the root, i.e. O(log n)
  • Ford: supplementary material on the book web site
  • Goodrich et al.: Chapter 9
  • Collins: Chapter 9
• It requires more information in each node: the height of the node
• With an AVL BST, the search process on a BST is always optimal, i.e. O(log n)
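The slides leave the maintenance algorithms to the references. As a hedged illustration only, the standard building block is the rotation, which rearranges a constant number of links and then recomputes the stored heights. AvlNode, heightOf, updateHeight and rotateLeft are names assumed for this sketch, and a full AVL insert would also need the mirror-image rotation and the case analysis that decides which rotation to apply.

```cpp
#include <algorithm>

// Hypothetical AVL node that stores the height of its own sub-tree.
struct AvlNode {
    int item;
    int height = 1;  // a single node has height 1
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
    explicit AvlNode(int v) : item(v) {}
};

int heightOf(const AvlNode* n) { return n ? n->height : 0; }

// Recompute a node's height from its children's stored heights.
void updateHeight(AvlNode* n) {
    n->height = 1 + std::max(heightOf(n->left), heightOf(n->right));
}

// Left rotation: x's right child y becomes the root of the sub-tree,
// x becomes y's left child, and y's old left sub-tree moves under x.
// The in-order sequence is unchanged. Returns the new sub-tree root.
AvlNode* rotateLeft(AvlNode* x) {
    AvlNode* y = x->right;
    x->right = y->left;
    y->left = x;
    updateHeight(x);  // x is now below y, so update it first
    updateHeight(y);
    return y;
}
```

Applied to the chain 1, 2, 3 (each the right child of the previous), rotateLeft on node 1 yields a tree rooted at 2 with children 1 and 3, reducing the height from 3 to 2.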
B-Tree
• A node storing only one item is not efficient, especially considering that I/O is based on “blocks” and a block usually stores about 512 bytes
• B-Tree is an extension of a balanced binary tree
  • When saying a B-tree of order n, it means that the tree allows a node to have up to n children and to store up to n-1 items
• Searching on a B-tree costs only as many block I/Os as the tree has levels, when each node is treated as an I/O block; searching within a node, whose items are stored in a vector, can apply binary search
A sample B-Tree of Order 5
A (root): • 367 •
  B: • 103 • 218 •
    D: 17 87
    E: 119 165 198
    F: 245 272 330
  C: • 492 • 661 • 815 • 912 •
    G: 408 435
    H: 524 602
    I: 686 770 799
    J: 832 871
    K: 956 968 975 991
Searching on a B-Tree
• Search for 832
  1. Getting block A, linear or binary search on the key values; 832 > 367 -> go to block C along the right pointer of 367
  2. Getting block C, 832 is in between 815 and 912 -> go to block J along the pointer between 815 and 912
  3. Getting block J, search for 832 -> found!
• Search for 65
  Getting block A, then B, and D; 65 does not exist in D -> not found!
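The two searches above can be traced in code. This is a minimal in-memory sketch, assuming each node stands for one I/O block: BNode, search and sampleTree are hypothetical names, and the grouping of leaf keys into blocks D to K is inferred from the key ranges of blocks B and C.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical B-tree node: sorted keys plus child pointers.
// A leaf has no children; an internal node has keys.size() + 1.
struct BNode {
    std::vector<int> keys;
    std::vector<BNode*> children;
};

// Returns true if target is in the tree. blocksRead is incremented
// once per node visited, i.e. once per simulated block I/O.
bool search(const BNode* node, int target, int& blocksRead) {
    while (node) {
        ++blocksRead;
        // Binary search inside the node for the first key >= target.
        auto it = std::lower_bound(node->keys.begin(), node->keys.end(), target);
        if (it != node->keys.end() && *it == target) return true;
        if (node->children.empty()) return false;  // leaf: not found
        // Follow the pointer just left of that key (or the last pointer).
        std::size_t idx = static_cast<std::size_t>(it - node->keys.begin());
        node = node->children[idx];
    }
    return false;
}

// The order-5 sample tree, built by hand for illustration.
const BNode* sampleTree() {
    static BNode D{{17, 87}, {}}, E{{119, 165, 198}, {}}, F{{245, 272, 330}, {}};
    static BNode G{{408, 435}, {}}, H{{524, 602}, {}}, I{{686, 770, 799}, {}};
    static BNode J{{832, 871}, {}}, K{{956, 968, 975, 991}, {}};
    static BNode B{{103, 218}, {&D, &E, &F}};
    static BNode C{{492, 661, 815, 912}, {&G, &H, &I, &J, &K}};
    static BNode A{{367}, {&B, &C}};
    return &A;
}
```

Searching for 832 reads blocks A, C and J; searching for 65 reads A, B and D before giving up. In both cases the number of blocks read equals the number of levels in the tree.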
2-3-4 Tree
• A special case of a B-Tree is the 2-3-4 tree, a B-tree of order 4, in which a node can have up to four children and stores up to three items
• Ford’s slides: Chapter 12: 10-15
• Ford’s exercises: Chapter 12: 26(b)
  • Draw the 2-3-4 tree built when you insert the keys from E A S Y Q U T I O N into an initially empty tree.
Red-Black Tree
• To implement a B-Tree is complicated, and to implement a 2-3-4 tree is easier but still complicated
• Using a Red-Black tree to implement (represent) a 2-3-4 tree is easier
  • A Red-Black tree is a binary tree in which
    • The root is BLACK
    • A RED parent never has a RED child
    • Every path from the root to an empty sub-tree has the same number of BLACK nodes
  • It is close to a balanced tree and easier to construct
• Ford’s slides: 12:16-17; exercises: 12:26(c)
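The three colouring rules of a Red-Black tree can be verified with one recursive pass. The sketch below uses assumed names (Color, RbNode, blackHeight, isRedBlack) and checks only the colouring rules, not the ordering of the keys.

```cpp
enum class Color { Red, Black };

// Hypothetical minimal Red-Black node for this sketch.
struct RbNode {
    int item;
    Color color;
    RbNode* left = nullptr;
    RbNode* right = nullptr;
    RbNode(int v, Color c) : item(v), color(c) {}
};

// Returns the number of BLACK nodes on every path from n down to an
// empty sub-tree, or -1 if a RED node has a RED child or two paths
// disagree on their BLACK count.
int blackHeight(const RbNode* n) {
    if (!n) return 0;
    if (n->color == Color::Red) {
        if (n->left && n->left->color == Color::Red) return -1;
        if (n->right && n->right->color == Color::Red) return -1;
    }
    int lb = blackHeight(n->left);
    int rb = blackHeight(n->right);
    if (lb < 0 || rb < 0 || lb != rb) return -1;
    return lb + (n->color == Color::Black ? 1 : 0);
}

// True when the root is BLACK and both path rules hold everywhere.
// An empty tree is accepted.
bool isRedBlack(const RbNode* root) {
    if (!root) return true;
    return root->color == Color::Black && blackHeight(root) >= 0;
}
```

A BLACK root with two RED children passes the check. Adding a RED child under one of those RED nodes breaks the second rule, and recolouring only that parent BLACK then breaks the third rule, because the two sides of the root no longer have the same BLACK count.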
Summary
• Construction of a BST is also a sorting method, which is O(n log n) for optimal cases
• The in-order successor/predecessor of a node with two children must be either a leaf or a node with a single child
• Erasing a node from a BST can therefore be reduced to two cases: deleting a leaf and deleting a node with a single child
• To avoid the worst case of a BST, constructing a BST should ensure that it is a balanced BST (AVL)
• An extension of a BST is a B-Tree, and a special case is the 2-3-4 tree
• Using a Red-Black tree to implement/represent a 2-3-4 tree greatly reduces the complexity
Reference
• Ford: 10.5-6, 12.6-7
• Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia, David M. Mount: Chapter 9
-- END --