Threaded Binary Trees

Given a binary tree with n nodes, the total number of links in the tree is 2n. Each node except the root has exactly one incoming link, so only n - 1 links point to nodes and the remaining n + 1 links are null. These null links can be used to simplify some traversal processes.

A threaded binary search tree is a BST in which the unused links are employed to point to other tree nodes. Traversals (as well as other operations, such as backtracking) are made more efficient. A BST can be threaded with respect to inorder, preorder, or postorder successors.

Given the following BST, thread it to facilitate inorder traversal:

[Diagram: BST containing nodes A through M]

The first node visited in an inorder traversal is the leftmost leaf, node A. Since A has no right child, the next node visited in this traversal is its parent, node B. Use the right pointer of node A as a thread to its parent B to make backtracking easy.

[Diagram: the same BST, with the thread from A to B shown as an arrow]

The next node visited is C, and since its right pointer is null, it too can be used as a thread to its parent D.

Node D has a null right pointer, which can be replaced with a pointer to D's inorder successor, node E.

The next node visited with a null right pointer is G. Replace the null pointer with the address of G's inorder successor, node H.

Finally, we replace first the null right pointer of I with a pointer to its parent and then, likewise, the null right pointer of J with a pointer to its parent.

[Diagram: the BST with all of these threads added]

Algorithm to right-thread BST

Perform an inorder traversal of the BST. Whenever a node x with a null right pointer is encountered, replace this right link with a thread to the inorder successor of x. If x is a left child of S, then S is the inorder successor of x.
If x is not a left child, then the inorder successor S of x is the nearest ancestor of x that contains x in its left subtree.

The following steps perform an inorder traversal of a right-threaded BST:

1. Initialize a pointer p to the root of the tree.
2. While p is not null:
   a. While p->left is not null, set p = p->left.
   b. Visit the node pointed to by p.
   c. While p->right is a thread:
      i. Set p = p->right.
      ii. Visit the node pointed to by p.
   d. Set p = p->right.

The traversal must be able to distinguish between right links that are threads and those that are pointers to right children, so add a boolean field rightThread to the node declaration:

    class BinNode
    {
    public:
      DataType data;
      bool rightThread;
      BinNode *left, *right;

      BinNode()
      {
        rightThread = false;   // reset when tree threaded
        left = right = 0;
      }

      BinNode(DataType item)
      {
        data = item;
        rightThread = false;   // reset when tree threaded
        left = right = 0;
      }
    }; // end of class BinNode declaration

Tree Balancing

Binary search trees are designed for fast searching, but the order of insertion into a BST determines the shape of the tree and hence the efficiency with which it can be searched. If the tree grows so that it is as “fat” as possible (fewest levels), it can be searched most efficiently. However, if the tree grows “lopsided”, with many more items in one subtree than another, then the search time degrades from logarithmic to linear.

Optimally, a BST with n nodes should have about $\log_2 n$ levels. In the worst case there could be n levels (one node per level), in which case binary search degenerates to sequential search.

Example: a BST of state abbreviations

Construct the BST of state abbreviations when the states NY, IL, GA, RI, MA, PA, DE, IN, VT, TX, OH, and WY are inserted, in this order, into an empty tree.

[Figure: the resulting BST. Level 1: NY. Level 2: IL, RI. Level 3: GA, MA, PA, VT. Level 4: DE, IN, OH, TX, WY.]

For such nicely balanced trees, the search time is $O(\log_2 n)$, where n is the number of nodes in the tree.
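The effect of insertion order on the shape of the tree can be checked with a small sketch. The code below is only an illustration: the `Node` struct and the helper names `insert`, `levels`, `destroy`, and `levelsAfterInserting` are assumptions made for this example and are not the BinNode class from the slides. It builds a plain (unthreaded, unbalanced) BST and counts its levels.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Minimal BST node for this sketch (hypothetical; not the BinNode class above).
struct Node {
    std::string data;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const std::string& item) : data(item) {}
};

// Standard recursive BST insertion; duplicate items are ignored.
Node* insert(Node* root, const std::string& item) {
    if (root == nullptr) return new Node(item);
    if (item < root->data)
        root->left = insert(root->left, item);
    else if (item > root->data)
        root->right = insert(root->right, item);
    return root;
}

// Number of levels in the tree (0 for an empty tree).
int levels(const Node* root) {
    if (root == nullptr) return 0;
    return 1 + std::max(levels(root->left), levels(root->right));
}

// Free every node of the tree.
void destroy(Node* root) {
    if (root == nullptr) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}

// Insert the items in the given order into an empty BST
// and report how many levels the resulting tree has.
int levelsAfterInserting(const std::vector<std::string>& items) {
    Node* root = nullptr;
    for (const std::string& s : items) root = insert(root, s);
    int result = levels(root);
    destroy(root);
    return result;
}
```

Calling `levelsAfterInserting` with the twelve abbreviations in the order listed above gives a tree with 4 levels, while sorting the same twelve abbreviations first (for instance with `std::sort`) gives one node per level, that is, 12 levels.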
If the abbreviations were inserted alphabetically, i.e. DE, GA, IL, IN, MA, MI, NY, OH, PA, RI, TX, VT, WY, then DE would be the root, GA its right child, IL the right child of GA, IN the right child of IL, and so on. The BST degenerates into a linked list, and search time increases to O(n). We need a way to keep a BST height-balanced.

AVL Trees

An AVL tree is a binary tree that is height-balanced: the difference in height between the left and right subtrees at any point in the tree is restricted.

Define the balance factor of node x to be the height of x's left subtree minus the height of its right subtree. An AVL tree is a BST in which the balance factor of each node is 0, -1, or 1.

[Figure: sample trees, grouped as AVL trees and not AVL trees]

AVL tree class definition

    template <typename DataType>
    class AVLTree
    {
    private:
      class AVLNode
      {
      public:
        DataType data;
        short int balanceFactor;
        AVLNode *left, *right;

        AVLNode()   // default constructor: there is no item to store
        { balanceFactor = 0; left = right = 0; }

        AVLNode(DataType item)
        { data = item; balanceFactor = 0; left = right = 0; }
      }; // end of class AVLNode declaration
      . . .
    }; // end of class AVLTree declaration

Keeping AVL trees balanced

When a new item is inserted into a balanced binary tree, the resulting tree may become unbalanced. The tree can be rebalanced by transforming the subtree rooted at the node that is the nearest ancestor of the new node with an unacceptable balance factor.

This transformation is carried out by a rotation: the relative order of nodes in a subtree is shifted, changing the root and the number of nodes in the left and right subtrees. Four standard types of rotations are performed on AVL trees:

1. Simple right rotation: used when the new item is inserted in the left subtree of the left child B of the nearest ancestor A with balance factor +2.
2. Simple left rotation: used when the new item is inserted in the right subtree of the right child B of the nearest ancestor A with balance factor -2.
3.
Left-right rotation: used when the new item is inserted in the right subtree of the left child B of the nearest ancestor A with balance factor +2.
4. Right-left rotation: used when the new item is inserted in the left subtree of the right child B of the nearest ancestor A with balance factor -2.

Each of these rotations is performed by simply resetting some links. For example, consider a simple right rotation, used when an item is inserted in the left subtree of the left child B of the nearest ancestor A with balance factor +2. A simple right rotation involves resetting three links:

1. Reset the link from the parent of A so that it points to B.
2. Set the left link of A equal to the right link of B.
3. Set the right link of B to point to A.

Rebalancing cost/benefit

Balanced binary trees with n nodes have depth $O(\log_2 n)$, so AVL trees guarantee that search time will be $O(\log_2 n)$. The overhead involved in rebalancing an AVL tree is justified when search operations exceed insertions: faster searches compensate for slower insertions.

Empirical studies indicate that, on average, rebalancing is required for approximately 45 percent of the insertions. Roughly one-half of these require double rotations.

M-ary trees

Other trees, such as parse trees, game trees, and genealogical trees, need to be searched efficiently but have nodes with more than two children. There may be several different directions that a search path can follow (rather than left if less, right if greater), and search paths are shorter because there are fewer levels in the tree.

Definition: An m-node in a search tree stores m - 1 data values $k_1 < k_2 < \dots < k_{m-1}$ and has links to m subtrees $T_1, \dots, T_m$, where for each i, all data values in $T_i$ < $k_i$ <= all data values in $T_{i+1}$.

2-3-4 tree

A 2-3-4 tree is a tree with the following properties:

1. Each node stores at most 3 data values.
2. Each internal node is a 2-node, a 3-node, or a 4-node.
3. All the leaves are on the same level.
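To make the m-node definition concrete, here is a minimal sketch of searching a 2-3-4 tree of integers. The `Node234` layout and the function name `contains` are assumptions made for this example only; the slides do not give a node declaration for 2-3-4 trees. At each node the search skips the stored values smaller than the target and then either finds the target or follows the link to the one subtree that could contain it.

```cpp
// Hypothetical node layout for a 2-3-4 tree of ints (an assumption for this sketch).
struct Node234 {
    int count = 0;             // number of data values stored: 1, 2, or 3
    int keys[3] = {0, 0, 0};   // the first `count` entries are used, in increasing order
    Node234* child[4] = {nullptr, nullptr, nullptr, nullptr};  // all null in a leaf
};

// Returns true if `target` is stored in the tree rooted at `node`.
bool contains(const Node234* node, int target) {
    while (node != nullptr) {
        int i = 0;
        // Skip every stored value that is smaller than the target.
        while (i < node->count && target > node->keys[i]) ++i;
        if (i < node->count && target == node->keys[i]) return true;
        // The target is smaller than keys[i], or larger than every key
        // in this node: in both cases subtree i is the only candidate.
        node = node->child[i];
    }
    return false;   // ran off the bottom of the tree without finding the target
}
```

Because all leaves are on the same level, the loop visits one node per level, so the number of nodes examined equals the height of the tree.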
Basic operations:
Construct, determine if empty, and search.
Insert a new item in the 2-3-4 tree so the result is a 2-3-4 tree.
Delete an item from the 2-3-4 tree so the result is a 2-3-4 tree.

Links in 2-3-4 trees

Each node must have one link for each possible child, even though most of the nodes will not use all of these links, so the amount of “wasted” space may be quite large. Say a 2-3-4 tree has n nodes. A linked representation requires 4 links for each node, for a total of 4n links, but only n - 1 of these links are used to connect the n nodes. Thus 4n - (n - 1) = 3n + 1 of the links are null, and the fraction of unused links is (3n + 1)/4n, approximately 75%.

To avoid this waste, a 2-3-4 tree can be represented by a special kind of binary tree called a red-black tree.

Red-black trees

A red-black tree is a binary search tree with two kinds of links, red and black, which satisfies the following properties:

1. Every path from the root to a leaf has the same number of black links.
2. No path from the root to a leaf has two or more consecutive red links.

Basic operations:
Construct, determine if empty, search, and traverse, as for BSTs.
Insert a new item in the red-black tree and maintain the red-black properties.
Delete an item from the red-black tree and maintain the red-black properties.

Red-black tree class

    enum ColorType {RED, BLACK};

    class RedBlackTreeNode
    {
    public:
      TreeElementType data;
      ColorType parentColor;   // RED if link from parent is red,
                               // BLACK otherwise
      RedBlackTreeNode *parent, *left, *right;
    };

AVL rotations

AVL trees have been replaced in many applications by 2-3-4 trees or red-black trees, but AVL rotations are still used to keep a red-black tree balanced.

To construct a red-black tree, use top-down 2-3-4 tree insertion with 4-node splitting during descent:

1. Search for a place to insert the new node. (Keep track of its parent, grandparent, and great-grandparent.)
2. When a 4-node q is found along the search path, split it as follows:
   a.
Change both links of q to black.
   b. Change the link from the parent to q to red.
3. If there are now two consecutive red links (from grandparent gp to parent p to q), perform the appropriate AVL-type rotation as determined by the direction (LL, RR, LR, RL).

B-trees

A B-tree of order m is a generalization of 2-3-4 trees, used for external searching. It is a tree with the following properties:

1. Each node stores at most m - 1 data values.
2. Each internal node is a 2-node, a 3-node, ..., or an m-node.
3. All the leaves are on the same level.

Basic operations:
Construct, determine if empty, and search.
Insert a new item in the B-tree so the result is a B-tree.
Delete an item from the B-tree so the result is a B-tree.

[Figure: sample B-tree]
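The three B-tree properties listed above can be expressed as a small checker. This is only a sketch under stated assumptions: the `BTreeNode` layout and the names `leafDepth` and `hasBTreeShape` are invented for this example, every node is assumed to store at least one value, and m is assumed to be at least 2. It checks the shape of the tree only; it does not verify that the values are in search order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical B-tree node (an assumption for this sketch): an internal node
// with k data values has k + 1 children, and a leaf has none.
struct BTreeNode {
    std::vector<int> keys;
    std::vector<BTreeNode*> children;
};

// Returns the number of links from `node` down to its leaves if the subtree
// satisfies the three properties for order m, or -1 if any property fails.
int leafDepth(const BTreeNode* node, std::size_t m) {
    if (node == nullptr) return -1;
    const std::size_t k = node->keys.size();
    if (k < 1 || k > m - 1) return -1;              // property 1: at most m - 1 values
    if (node->children.empty()) return 0;           // a leaf
    if (node->children.size() != k + 1) return -1;  // property 2: a (k + 1)-node
    int depth = -1;
    for (const BTreeNode* c : node->children) {
        int d = leafDepth(c, m);
        if (d < 0) return -1;                       // a subtree already failed
        if (depth == -1) depth = d;                 // first child sets the expected depth
        else if (d != depth) return -1;             // property 3: leaves on the same level
    }
    return depth + 1;
}

// True if the tree rooted at `root` has the shape of a B-tree of order m.
bool hasBTreeShape(const BTreeNode* root, std::size_t m) {
    return leafDepth(root, m) >= 0;
}
```

With m = 4 the same checker describes the shape of a 2-3-4 tree, which is the sense in which a B-tree of order m generalizes it.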