DCO20105 Data structures and algorithms
Lecture 9: More on BST
Removal from a BST
Some advanced balanced search trees: AVL tree, 2-3-4 tree, Red-Black tree
-- By Rossella Lau

Rossella Lau Lecture 9, DCO20105, Semester A, 2005-6

Re-visit on BST

A BST is a binary tree in which, at every node, all the values of the left sub-tree are less than the node's value and all the values of the right sub-tree are greater.
It supports O(log n) execution time for both search and insert in optimal cases, when the BST has high density.
The worst execution time may be O(n), when the BST is sparse.

Some facts of a BST

A binary search tree's in-order traversal sequence is in sorted order.
Insertion into a BST can therefore be treated as a tree sort method, which is another O(n log n) sort algorithm in optimal cases.
The minimum value of a BST is at the left-most node (it has no left child, but it may have a right child):

    BSTNode<T> *cur = root;   // assume size() >= 1
    while (cur->left)         // keep going left until there is no left child
        cur = cur->left;
    return cur->item;

The maximum value of a BST is at the right-most node.

BST removal

Removing a node from a BST must leave the resulting tree a BST:
  • It is still a binary tree, so no node may end up with three children
  • left sub-tree < root < right sub-tree still holds at every node
Different situations of the node to be removed (or of its sub-tree) should be considered:
  • A leaf
  • A node with a single child
  • A full node, which has two children

Deletion of an item which is a leaf

[Figure: a BST before and after deleting 22. Before: root 50; the children of 50 are 28 and 75; the children of 28 are 22 and 40; 40 has left child 35; 75 has right child 90; the children of 90 are 87 and 95. After: the same tree without 22.]
Delete 22: When the item is found, delete it!

The algorithm for deletion of a leaf

    template <typename T>
    bool BSTree<T>::remove (T const & target) {
        // find() returns a reference to the link (the root or a child
        // pointer) that points at the target node; the link is null if
        // the target is not in the tree
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;
        BSTNode<T> *forDelete (contentAt);
        if (contentAt->isLeaf())          // only the leaf case is handled here
            contentAt = 0;                // unlink the leaf from its parent
        // detach the children first, in case ~BSTNode() deletes its sub-trees
        forDelete->left = forDelete->right = 0;
        delete forDelete;
        countNodes--;
        return true;
    }

    template <typename T>
    bool BSTNode<T>::isLeaf(void) { return !left && !right; }

Deletion of an item which has one child

[Figure: a BST before and after deleting 75. Before: root 50; left child 28, which has right child 40, which has left child 35; right child 75, which has right child 90 with children 87 and 95. After: 90 takes the place of 75 as the right child of 50.]
Delete 75: When the item is found, put its only child at its place.

The algorithm for deletion of single child node

    template <typename T>
    bool BSTree<T>::remove (T const & target) {
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;
        BSTNode<T> *forDelete (contentAt);
        if (!contentAt->isLeaf() &&       // with single
            !contentAt->isFull() )        // child
            // the only child takes the place of the removed node
            contentAt = contentAt->left ? contentAt->left
                                        : contentAt->right;
        forDelete->left = forDelete->right = 0;
        delete forDelete;
        countNodes--;
        return true;
    }

    template <typename T>
    bool BSTNode<T>::isFull(void) { return left && right; }

Deletion of an item which has two children

[Figure: deleting 50 from the tree with root 50, left child 28 (which has right child 40, which has left child 35) and right child 90 (with children 87 and 95). Replacing 50 with its in-order successor gives a tree with root 87; replacing it with its in-order predecessor gives a tree with root 40.]
Delete 50:
Theory: The in-order successor of a node with two children has no left child, so it has at most one child, on its right-hand side. Symmetrically, the in-order predecessor has at most one child, on its left-hand side.
When the item is found at node n, replace n's data with that of n's in-order successor s or predecessor p; the deletion then moves to s or p, which is either a leaf or a node with a single child!

The algorithm for deletion of an internal node

    template <typename T>
    bool BSTree<T>::remove (T const & target) {
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;
        // for a full node, the node actually removed is its successor
        BSTNode<T> *& forDelete (prepareRemoval (contentAt));
        BSTNode<T> *realDelete (forDelete);
        …… // deletion of a leaf or of a node with a single child
    }

    template <typename T>
    BSTNode<T> *& BSTree<T>::prepareRemoval (BSTNode<T> *& contentAt) {
        if (contentAt->isFull()) {
            BSTNode<T> *& succ ( successor (contentAt) );
            // move the target's item down to the successor's node
            swap ( succ->getItem(), contentAt->getItem() );
            return succ;
        }
        else return contentAt;
    }

The algorithm for finding an inorder successor

    template <typename T>
    BSTNode<T> *& BSTree<T>::successor (BSTNode<T> const *p) {
        // Assume that the input node (p) has two children
        BSTNode<T> *it (const_cast<BSTNode<T>*> (p));
        if (it->right->left) {        // successor is the left-most
                                      // node of the right sub-tree
            it = it->right;
            while (it->left->left)    // stop at the successor's parent
                it = it->left;
            return it->left;          // the link that points at the successor
        }
        else                          // successor is the right child
            return it->right;
    }

Notes on const_cast

C++ supports the following type cast operators:
const_cast: to cast away the constant attribute
  • In the previous example, p is passed as a pointer pointing to a constant object.
  • However, successor() walks down p's children with a non-const pointer (it = it->left;) and returns a modifiable reference to one of the links, which the compiler does not allow through a pointer to a constant object.
  • To allow it to traverse the children in this way, const_cast is needed to temporarily cast away the constant attribute of p.
static_cast: the new way to do the former type cast
  • Former way: doubleResult = (double) intA / intB;
  • C++: doubleResult = static_cast<double> (intA) / intB;
The other two, which are not encouraged:
  • dynamic_cast, reinterpret_cast

Exercises on BST removal

BST removal: Ford's exercise: 10:26: delete 30, 80, 25; 10:32
Other BST removal related functions: find a predecessor

Complexity for remove()

The main logic of remove() is still find(). However, it also requires a function successor() to search for an in-order successor.
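To make the cost argument concrete, here is a minimal, self-contained sketch of removal in which the find() walk and the successor walk are both visible. It is an illustration under assumptions, not the lecture's BSTree<T> class: the names Node, findLink, insert, removeValue and inorder are hypothetical, items are plain int, and duplicates are rejected.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical minimal node type; the lecture's BSTNode<T> may differ.
struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Return a reference to the link (the root or a child pointer) that points
// at the node holding target, or to the null link where the search stopped.
Node*& findLink(Node*& root, int target) {
    Node** link = &root;
    while (*link && (*link)->item != target)
        link = target < (*link)->item ? &(*link)->left : &(*link)->right;
    return *link;
}

// Insert without duplicates; returns false if the value is already present.
bool insert(Node*& root, int value) {
    Node*& link = findLink(root, value);
    if (link) return false;
    link = new Node(value);
    return true;
}

// Remove target: one find() walk plus, for a full node, one successor walk.
bool removeValue(Node*& root, int target) {
    Node** link = &findLink(root, target);
    if (!*link) return false;                  // not found
    if ((*link)->left && (*link)->right) {     // full node: two children
        Node** succ = &(*link)->right;         // successor = left-most node
        while ((*succ)->left)                  // of the right sub-tree
            succ = &(*succ)->left;
        std::swap((*link)->item, (*succ)->item);
        link = succ;                           // delete the successor's node
    }
    Node* doomed = *link;
    assert(!doomed->left || !doomed->right);   // a leaf or a single child
    *link = doomed->left ? doomed->left : doomed->right;  // splice child in
    delete doomed;
    return true;
}

// In-order traversal, used here only to check that the order is preserved.
void inorder(const Node* n, std::vector<int>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->item);
    inorder(n->right, out);
}
```

Inserting 50, 28, 75, 22, 40, 90, 35, 87, 95 in that order rebuilds the example tree of the deletion slides; removing 22, 75 and 50 then exercises the leaf, single-child and two-children cases in turn. Both walks only move downwards, so together they visit at most one root-to-leaf path.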
successor() only walks further down the tree from the node that find() has reached, so its complexity is less than or equal to that of find(); therefore, the big O function of remove() is still the same as that of find().
remove() is similar to find() and has the same complexity as find().

Balanced Binary Tree

To solve the problem of a "linear" BST and maintain an optimal complexity, the problem becomes how to maintain a balanced binary tree.
A balanced binary tree is also called an AVL tree.
It was invented by two Soviet mathematicians: Adel'son-Vel'skii and Landis.
First, the height of a tree is defined as its depth (the length of the longest path from its root to a leaf).
Then, a balanced binary tree is a binary tree in which the heights of the two sub-trees of every node never differ by more than 1.

Examples of AVL BST and non-AVL BST

[Figure: two example binary trees with nodes labelled by letters from A to T, one captioned "AVL tree" and the other "Non AVL tree"; the shapes of the trees are not recoverable from the extracted text.]

Efficiency concerns on an AVL BST

There are efficient algorithms to maintain a binary tree as an AVL tree.
After inserting a node (not counting the search), the AVL property can be restored with O(1) rotations; after removing a node, rotations may be needed at several ancestors, but the total work is still bounded by the height of the tree, O(log n).
  • Ford: Supplementary material on the book web site
  • Goodrich et al.: Chapter 9
  • Collins: Chapter 9
It requires more information in each node: the height of the node.
With an AVL BST, the search process on a BST is always optimal.

B-Tree

A node storing only one item is not efficient, especially considering that I/O is based on "blocks" and a block usually stores about 512 bytes.
A B-Tree is an extension of a balanced binary tree.
A B-tree of order n allows a node to have up to n children and to store up to n-1 items.
When each node is treated as an I/O block, searching on a B-tree involves only as many block I/Os as the tree has levels, plus a search within each node; the items of a node are stored in a vector, so binary search can be applied.

A sample B-Tree of Order 5

    A (root): 367
      B (left of 367): 103 218
        D: 17 87
        E: 119 165 198
        F: 245 272 330
      C (right of 367): 492 661 815 912
        G: 408 435
        H: 524 602
        I: 686 770 799
        J: 832 871
        K: 956 968 975 991

Searching on a B-Tree

Search for 832
  1. Get block A and do a linear or binary search on the key values: 832 > 367, so go to block C along the right pointer of 367.
  2. Get block C: 832 is in between 815 and 912, so go to block J along the pointer between 815 and 912.
  3. Get block J and search for 832: found!
Search for 65
  Get block A, then B, and then D: 65 does not exist in D, so it is not found!

2-3-4 Tree

A special case of a B-Tree is the 2-3-4 tree, a B-tree of order 4, in which a node can have up to four children and stores up to 3 items.
Ford's slides: Chapter 12: 10-15
Ford's exercises: Chapter 12: 26(b) Draw the 2-3-4 tree built when you insert the keys from E A S Y Q U T I O N into an initially empty tree.

Red-Black Tree

To implement a B-Tree is complicated; to implement a 2-3-4 tree is easier but still complicated.
Using a Red-Black tree to implement (represent) a 2-3-4 tree is easier.
A Red-Black tree is a binary tree in which:
  • The root is BLACK
  • A RED parent never has a RED child
  • Every path from the root to an empty sub-tree has the same number of BLACK nodes
It is close to a balanced tree and easier to construct.
Ford's slides: 12:16-17; exercises: 12:26(c)

Summary

Construction of a BST is also a sorting method, which is O(n log n) in optimal cases.
The in-order successor/predecessor of a node with two children must be either a leaf or a node with a single child.
Erasing a node from a BST can therefore be reduced to two cases: deleting a leaf and deleting a node with a single child.
To avoid the worst case of a BST, the construction of a BST should ensure that it stays a balanced BST (AVL).
An extension of a BST is a B-Tree, and a special case of a B-Tree is the 2-3-4 tree.
Using a Red-Black tree to implement/represent a 2-3-4 tree greatly reduces the complexity of the implementation.

Reference

Ford: 10.5-6, 12.6-7
Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia, David M. Mount: Chapter 9
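Supplementary sketch: the walk-through in "Searching on a B-Tree" can be expressed in C++ roughly as follows. This is only an in-memory illustration under assumptions: BNode, SearchResult, search, leaf and buildSample are hypothetical names, each node stands in for one I/O block, and std::lower_bound performs the binary search within a node.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical in-memory B-tree node; on disk, one node would be one block.
struct BNode {
    std::vector<int> keys;                         // sorted items of the node
    std::vector<std::unique_ptr<BNode>> children;  // empty for a leaf
};

struct SearchResult {
    bool found;
    int blocksRead;   // number of nodes ("blocks") visited
};

// One binary search per visited node; one node is visited per level.
SearchResult search(const BNode* node, int target) {
    int blocks = 0;
    while (node) {
        ++blocks;
        assert(node->children.empty() ||
               node->children.size() == node->keys.size() + 1);
        auto it = std::lower_bound(node->keys.begin(), node->keys.end(), target);
        if (it != node->keys.end() && *it == target) return {true, blocks};
        if (node->children.empty()) break;         // leaf reached: not found
        // Follow the pointer to the left of the first key greater than target.
        auto idx = static_cast<std::size_t>(it - node->keys.begin());
        node = node->children[idx].get();
    }
    return {false, blocks};
}

// Build a node from its keys (no children yet).
std::unique_ptr<BNode> leaf(std::vector<int> keys) {
    auto n = std::make_unique<BNode>();
    n->keys = std::move(keys);
    return n;
}

// The sample B-tree of order 5 from the lecture (blocks A to K).
std::unique_ptr<BNode> buildSample() {
    auto b = leaf({103, 218});                        // block B
    b->children.push_back(leaf({17, 87}));            // D
    b->children.push_back(leaf({119, 165, 198}));     // E
    b->children.push_back(leaf({245, 272, 330}));     // F
    auto c = leaf({492, 661, 815, 912});              // block C
    c->children.push_back(leaf({408, 435}));          // G
    c->children.push_back(leaf({524, 602}));          // H
    c->children.push_back(leaf({686, 770, 799}));     // I
    c->children.push_back(leaf({832, 871}));          // J
    c->children.push_back(leaf({956, 968, 975, 991})); // K
    auto a = leaf({367});                             // block A (root)
    a->children.push_back(std::move(b));
    a->children.push_back(std::move(c));
    return a;
}
```

With this tree, search(root.get(), 832) visits blocks A, C and J and reports the key as found, while search(root.get(), 65) visits A, B and D and reports it as not found, matching the two walk-throughs in the lecture.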