Binary Search Trees
One of the tree applications in Chapter 10 is binary search trees. In Chapter 10, binary search trees are used to implement bags. This presentation illustrates how another data type, called a dictionary, is implemented with binary search trees.

Binary Search Trees (BST)
A data structure for efficient searching, insertion and deletion.
Binary search tree property: for every node x,
- all the keys in its left subtree are smaller than the key value in x
- all the keys in its right subtree are larger than the key value in x
[Figure: two example trees, one captioned "A binary search tree" and one captioned "Not a binary search tree".]
- The same set of keys may have different BSTs.
- Average depth of a BST is O(log n) (for example, when the keys are inserted in random order).
- Maximum depth of a BST is O(n) (for example, when the keys are inserted in sorted order).

The Dictionary Data Type
A dictionary is a collection of items, similar to a bag. But unlike a bag, each item has a string attached to it, called the item's key.
Example: The items I am storing are records containing data about a state. The key for each record is the name of the state, such as "Washington".

The insertion procedure for a dictionary has two parameters:
    void dictionary::insert(The key for the new item, The new item);

When you want to retrieve an item, you specify the key, and the retrieval procedure returns the item:
    Item dictionary::retrieve("Washington");

We'll look at how a binary tree can be used as the internal storage mechanism for the dictionary.
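As a minimal sketch of the storage idea, a node might pair a key with an item, and the binary search tree property stated earlier can be checked recursively. The names `StateRecord`, `Node`, `make_node` and `has_bst_property`, and the `capital` field, are hypothetical and chosen only for illustration; they are not taken from the dictionary class in the slides.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical item type: the slides only say that each item is a
// record containing data about a state.
struct StateRecord {
    std::string capital;
};

// One tree node: a key (the state's name), the item, and two subtrees.
struct Node {
    std::string key;
    StateRecord item;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Convenience builder for hand-made trees (an assumption of this sketch).
std::unique_ptr<Node> make_node(std::string key, std::string capital,
                                std::unique_ptr<Node> left = nullptr,
                                std::unique_ptr<Node> right = nullptr) {
    auto node = std::make_unique<Node>();
    node->key = std::move(key);
    node->item.capital = std::move(capital);
    node->left = std::move(left);
    node->right = std::move(right);
    return node;
}

// True if every key in this subtree lies strictly between *low and *high.
// A null bound means there is no limit on that side.
bool has_bst_property(const Node* node,
                      const std::string* low = nullptr,
                      const std::string* high = nullptr) {
    if (node == nullptr) return true;  // an empty tree satisfies the property
    if (low != nullptr && !(*low < node->key)) return false;
    if (high != nullptr && !(node->key < *high)) return false;
    return has_bst_property(node->left.get(), low, &node->key) &&
           has_bst_property(node->right.get(), &node->key, high);
}

// Self-check: one tree that obeys the property and one that does not.
void run_example() {
    auto good = make_node("Oklahoma", "Oklahoma City",
                          make_node("Massachusetts", "Boston"),
                          make_node("Washington", "Olympia"));
    assert(has_bst_property(good.get()));

    // Swapping the two children puts a larger key on the left.
    auto bad = make_node("Oklahoma", "Oklahoma City",
                         make_node("Washington", "Olympia"),
                         make_node("Massachusetts", "Boston"));
    assert(!has_bst_property(bad.get()));
}
```

Passing the bounds down the recursion checks each key against all of its ancestors, not just its parent, which is what "all the keys in its left subtree" requires.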
A Binary Search Tree of States
[Figure: a binary search tree of states. Node labels on the slide include Oklahoma, Colorado, Mass., Arizona, Arkansas, West Virginia, Washington and New Hampshire.]
The data in the dictionary will be stored in a binary tree, with each node containing an item and a key.
Storage rules:
- Every key to the left of a node is alphabetically before the key of the node.
- Every key to the right of a node is alphabetically after the key of the node.
Example: "Massachusetts" and "New Hampshire" are alphabetically before "Oklahoma".

Searching BST
[Figure: a binary search tree whose root holds the key 15.]
- If we are searching for 15, then we are done.
- If we are searching for a key < 15, then we should search in the left subtree.
- If we are searching for a key > 15, then we should search in the right subtree.

Searching (Find)
Find X: return a pointer to the node that has key X, or NULL if there is no such node.
Time complexity: O(depth of the tree)

Retrieving Data
Start at the root.
1. If the current node has the key, then stop and retrieve the data.
2. If the current node's key is too large, move left and repeat 1-3.
3. If the current node's key is too small, move right and repeat 1-3.

Retrieve "New Hampshire"
[Figure: the state tree, with the search for "New Hampshire" following steps 1-3 from the root.]
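The retrieval steps can be sketched as a short loop. This is only an illustration: `Node`, `make_node` and `find_node` are hypothetical names, the item is reduced to a placeholder string, and the small tree in `run_example` is an assumed fragment of the state tree rather than the exact tree from the slides.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Node {
    std::string key;   // the state's name
    std::string item;  // stand-in for the state's record (hypothetical)
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Convenience builder for hand-made trees (an assumption of this sketch).
std::unique_ptr<Node> make_node(std::string key, std::string item,
                                std::unique_ptr<Node> left = nullptr,
                                std::unique_ptr<Node> right = nullptr) {
    auto node = std::make_unique<Node>();
    node->key = std::move(key);
    node->item = std::move(item);
    node->left = std::move(left);
    node->right = std::move(right);
    return node;
}

// Returns the node whose key equals `key`, or nullptr if there is none.
// Each pass through the loop moves one level down, so the work is
// proportional to the depth of the tree.
const Node* find_node(const Node* root, const std::string& key) {
    const Node* current = root;  // start at the root
    while (current != nullptr) {
        if (key == current->key) {
            return current;                  // step 1: found, stop
        } else if (key < current->key) {
            current = current->left.get();   // step 2: current key too large
        } else {
            current = current->right.get();  // step 3: current key too small
        }
    }
    return nullptr;  // ran out of nodes: the key is not in the tree
}

void run_example() {
    // Assumed fragment of the state tree, rooted at Oklahoma.
    auto root = make_node("Oklahoma", "record for Oklahoma",
        make_node("Massachusetts", "record for Massachusetts", nullptr,
                  make_node("New Hampshire", "record for New Hampshire")),
        make_node("West Virginia", "record for West Virginia",
                  make_node("Washington", "record for Washington")));

    const Node* hit = find_node(root.get(), "New Hampshire");
    assert(hit != nullptr);
    assert(hit->item == "record for New Hampshire");
    assert(find_node(root.get(), "Texas") == nullptr);
}
```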
Inorder Traversal of BST
Inorder traversal of a BST prints out all the keys in sorted order.
Inorder: 2, 3, 4, 6, 7, 9, 13, 15, 17, 18, 20

findMin/findMax
Goal: return the node containing the smallest (largest) key in the tree.
Algorithm: start at the root and go left (right) as long as there is a left (right) child. The stopping point is the smallest (largest) element.
Time complexity = O(depth of the tree)

Insertion
- Proceed down the tree as you would with a find.
- If x is found, do nothing (or update something).
- Otherwise, insert x at the last spot on the path traversed.
Time complexity = O(depth of the tree)
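The inorder traversal and findMin/findMax slides can be sketched with integer keys. The names `Node`, `make_node`, `inorder`, `find_min`, `find_max` and `example_tree` are hypothetical, and the tree shape built in `example_tree` is an assumption: the slides give only the keys, and any BST holding them produces the same inorder sequence.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node {
    int key;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Convenience builder for hand-made trees (an assumption of this sketch).
std::unique_ptr<Node> make_node(int key,
                                std::unique_ptr<Node> left = nullptr,
                                std::unique_ptr<Node> right = nullptr) {
    auto node = std::make_unique<Node>();
    node->key = key;
    node->left = std::move(left);
    node->right = std::move(right);
    return node;
}

// Appends the keys in inorder: left subtree, then the node, then right subtree.
void inorder(const Node* node, std::vector<int>& out) {
    if (node == nullptr) return;
    inorder(node->left.get(), out);
    out.push_back(node->key);
    inorder(node->right.get(), out);
}

// Go left as long as there is a left child; returns nullptr for an empty tree.
const Node* find_min(const Node* root) {
    if (root == nullptr) return nullptr;
    while (root->left != nullptr) root = root->left.get();
    return root;
}

// Go right as long as there is a right child; returns nullptr for an empty tree.
const Node* find_max(const Node* root) {
    if (root == nullptr) return nullptr;
    while (root->right != nullptr) root = root->right.get();
    return root;
}

// One possible BST holding the keys from the slide (assumed shape).
std::unique_ptr<Node> example_tree() {
    return make_node(15,
        make_node(6,
            make_node(3, make_node(2), make_node(4)),
            make_node(7, nullptr, make_node(13, make_node(9)))),
        make_node(18, make_node(17), make_node(20)));
}

void run_example() {
    auto root = example_tree();
    std::vector<int> keys;
    inorder(root.get(), keys);
    const std::vector<int> expected = {2, 3, 4, 6, 7, 9, 13, 15, 17, 18, 20};
    assert(keys == expected);                 // sorted order
    assert(find_min(root.get())->key == 2);   // leftmost node
    assert(find_max(root.get())->key == 20);  // rightmost node
}
```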
Adding a New Item with a Given Key
Pretend that you are trying to find the key, but stop when there is no node to move to.
Add the new node at the spot where you would have moved to if there had been a node.

Adding Iowa
[Figure: the state tree as the search for Iowa's spot proceeds, ending with Iowa added as a new node.]
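The procedure for adding a new item can be sketched as follows. The names `Node`, `insert_item` and `build_state_tree` are hypothetical, the item is a placeholder string, and replacing the item when the key already exists is just one possible choice. The insertion order in `build_state_tree`, and the tree shape it produces with Florida at the root, are assumptions reconstructed from the slide text.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node {
    std::string key;   // the state's name
    std::string item;  // stand-in for the state's record (hypothetical)
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Adds (key, item) to the tree. Returns true if a new node was created,
// false if the key was already present (its item is then replaced).
bool insert_item(std::unique_ptr<Node>& root, const std::string& key,
                 const std::string& item) {
    std::unique_ptr<Node>* spot = &root;  // the link we are about to follow
    while (*spot != nullptr) {
        Node& current = **spot;
        if (key == current.key) {
            current.item = item;  // key already present: update its item
            return false;
        }
        spot = (key < current.key) ? &current.left : &current.right;
    }
    // There is no node to move to, so the new node goes exactly here.
    *spot = std::make_unique<Node>();
    (*spot)->key = key;
    (*spot)->item = item;
    return true;
}

// Builds the state tree by repeated insertion (assumed order and shape).
std::unique_ptr<Node> build_state_tree() {
    const char* order[] = {"Florida", "Arizona", "Oklahoma", "Colorado",
                           "Massachusetts", "West Virginia", "Arkansas",
                           "New Hampshire", "Washington"};
    std::unique_ptr<Node> root;
    for (const char* name : order) insert_item(root, name, "record");
    return root;
}

void run_example() {
    auto root = build_state_tree();

    bool added = insert_item(root, "Iowa", "record for Iowa");
    assert(added);
    const Node* mass = root->right->left.get();
    assert(mass->key == "Massachusetts");
    assert(mass->left->key == "Iowa");  // Iowa becomes a new left child

    added = insert_item(root, "Kazakhstan", "record for Kazakhstan");
    assert(added);
    assert(mass->left->right->key == "Kazakhstan");  // right child of Iowa
    (void)added;
}
```

Walking a pointer to the link, rather than a pointer to the node, means the spot "where you would have moved to" is already in hand when the search runs out of nodes, so the empty tree needs no special case.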
Adding Kazakhstan
Where would you add this state?
Kazakhstan is the new right child of Iowa.

Deletion
When we delete a node, we need to consider how we take care of the children of the deleted node. This has to be done such that the property of the search tree is maintained.

Deletion under Different Cases
Case 1: the node is a leaf
- Delete it immediately.
Case 2: the node has one child
- Adjust a pointer from the parent to bypass that node.
Case 3: the node has 2 children
- Replace the key of that node with the minimum element of the right subtree.
- Delete that minimum element. It has either no child or only a right child, because if it had a left child, that left child would be smaller and would have been chosen. So invoke case 1 or 2.
Time complexity = O(depth of the tree)

Removing an Item with a Given Key
1. Find the item.
2. If necessary, swap the item with one that is easier to remove.
3. Remove the item.

Removing "Florida"
[Figure: the state tree, now including Iowa, with Florida marked for removal.]
Find the item.
Florida cannot be removed at the moment, because removing Florida would break the tree into two pieces.
If necessary, do some rearranging.
The problem of breaking the tree happens because Florida has 2 children.
For the rearranging, take the smallest item in the right subtree (in this tree, Iowa), copy that smallest item onto the item that we're removing, then remove the extra copy of the item we copied and reconnect the tree.
[Figure: the state tree after the removal, with Iowa in Florida's former position.]

Why did I choose the smallest item in the right subtree?
Because every key must be smaller than the keys in its right subtree.

Removing an Item with a Given Key
1. Find the item.
2. If the item has a right child, rearrange the tree:
   - Find the smallest item in the right subtree.
   - Copy that smallest item onto the one that you want to remove.
   - Remove the extra copy of the smallest item (making sure that you keep the tree connected).
   Otherwise, just remove the item, linking its left child (if any) to its parent so that the tree stays connected.

Summary
- Binary search trees are a good implementation of data types such as sets, bags, and dictionaries.
- Searching for an item is generally quick, since you move from the root to the item without looking at many other items.
- Adding and deleting items is also quick: each operation takes O(depth of the tree) time.
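The removal procedure described above can be sketched as follows. The names `Node`, `make_node`, `detach_min` and `remove_key` are hypothetical, the item is a placeholder string, and the tree built in `run_example` (Florida at the root, Iowa already added) is an assumed shape reconstructed from the slide text.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Node {
    std::string key;   // the state's name
    std::string item;  // stand-in for the state's record (hypothetical)
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Convenience builder for hand-made trees (an assumption of this sketch).
std::unique_ptr<Node> make_node(std::string key,
                                std::unique_ptr<Node> left = nullptr,
                                std::unique_ptr<Node> right = nullptr) {
    auto node = std::make_unique<Node>();
    node->item = "record for " + key;
    node->key = std::move(key);
    node->left = std::move(left);
    node->right = std::move(right);
    return node;
}

// Unlinks and returns the node with the smallest key in a non-empty subtree.
// Its right child (if any) takes its place, so the tree stays connected.
std::unique_ptr<Node> detach_min(std::unique_ptr<Node>& subtree) {
    std::unique_ptr<Node>* link = &subtree;
    while ((*link)->left != nullptr) link = &(*link)->left;
    std::unique_ptr<Node> smallest = std::move(*link);
    *link = std::move(smallest->right);
    return smallest;
}

// Removes the item with the given key; returns false if there is no such key.
bool remove_key(std::unique_ptr<Node>& root, const std::string& key) {
    // Find the item, remembering the link that points at it.
    std::unique_ptr<Node>* link = &root;
    while (*link != nullptr && (*link)->key != key) {
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    if (*link == nullptr) return false;

    Node& target = **link;
    if (target.right != nullptr) {
        // Copy the smallest item of the right subtree onto the target;
        // the detached node is the "extra copy" and is discarded.
        std::unique_ptr<Node> smallest = detach_min(target.right);
        target.key = std::move(smallest->key);
        target.item = std::move(smallest->item);
    } else {
        // No right child: the left child (possibly null) moves up.
        std::unique_ptr<Node> left = std::move(target.left);
        *link = std::move(left);  // this destroys the removed node
    }
    return true;
}

void run_example() {
    auto root = make_node("Florida",
        make_node("Arizona", nullptr,
                  make_node("Colorado", make_node("Arkansas"))),
        make_node("Oklahoma",
                  make_node("Massachusetts", make_node("Iowa"),
                            make_node("New Hampshire")),
                  make_node("West Virginia", make_node("Washington"))));

    bool removed = remove_key(root, "Florida");
    assert(removed);
    assert(root->key == "Iowa");                 // Iowa copied onto the root
    assert(root->right->left->key == "Massachusetts");
    assert(root->right->left->left == nullptr);  // extra copy of Iowa is gone

    removed = remove_key(root, "West Virginia");  // node with one (left) child
    assert(removed);
    assert(root->right->right->key == "Washington");
    (void)removed;
}
```

Following the closing slide, this sketch rearranges whenever the node has a right child, which also covers a node with only a right child; bypassing such a node with a parent pointer, as in case 2, would work equally well.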