Binary Trees

[Figure: Tree Example]

Tree Structures
A tree is a hierarchical structure that places elements in nodes along branches that originate from a root.
Nodes in a tree are subdivided into levels, with the root node alone at the topmost level.
Any node in a tree can have multiple successors at the next level, so a tree is a nonlinear structure.
Operating systems use a general tree to maintain file structures.

Tree Terminology
Tree structure: a collection of nodes that originate from a unique starting node called the root.
Each node consists of a value and a set of zero or more links to successor nodes.
The terms parent and child describe the relationship between a node and any of its successor nodes.
[Figure: tree with nodes A through J, labeling the root, a parent, a child, siblings, an interior (internal) node, a leaf node, and a subtree]
A tree consists of nodes connected by edges.
A tree is an instance of a more general category called a graph (a later slideshow discusses graphs).
[Figure: nodes connected by edges that form a graph but not a tree]

Tree Terminology
Tree: recursively defined as empty or a root node with zero or more subtrees
Node: a holder for data plus edges to children
Edge: connects a parent node to a child node
Root: a pointer to the first node, if it exists, or NULL
Leaf node: a node with no children
Internal node: a node with at least one child (one or two children in a binary tree)
Path: sequence of edges between two nodes
Height: length of the longest path in a tree from the root to any other node
Depth: number of edges from the root to a node

A path between a parent node P and any node N in its subtree is a sequence of nodes P = X0, X1, ..., Xk = N, where each node Xi in the sequence is the parent of Xi+1 for 0 ≤ i ≤ k-1. The length of the path is k.
The depth (also called the level) of a node is the length of the path from the root to the node, or equivalently the number of edges from the root to the node.
Viewing a node as the root of its subtree, the height of a node is the length of the longest path from the node to a leaf in the subtree.
The height of a tree is the maximum level in the tree.
[Figure: example tree with height = 3]

Sorted Binary Trees
In a binary tree, each parent has no more than two children.
Each node (item in the tree) has a value.
A total order (linear order) is defined on these values.
The left subtree of a node contains only values less than the node's value.
The right subtree contains only values greater than or equal to the node's value.
[Figure: sorted binary tree holding the values 21, 10, 5, 2, 34, 25, 7, 39, 33]

Binary Trees
A compiler builds unsorted binary trees while parsing expressions in a program's source code.
Each node of a binary tree defines a left and a right subtree. Each subtree is itself a tree.
[Figure: node T with its left child and right child]
A recursive definition of a binary tree: T is a binary tree if T has no node (T is an empty tree) or T has at most two subtrees, each of which is itself a binary tree.

Height of a Binary Tree
The height of a binary tree is the length of the longest path from the root to a leaf node.
Let TN be the subtree with root N, and let TL and TR be the left and right subtrees of N. Then
    height(N) = height(TN) = -1                                 if TN is empty
    height(N) = height(TN) = 1 + max(height(TL), height(TR))    if TN is not empty
A leaf node always has a height of 0.
[Figure: degenerate binary tree (least dense)]

Density of a Binary Tree
In a binary tree, the number of nodes at each level falls within a range of values.
At level 0 there is 1 node, the root; at level 1 there can be 1 or 2 nodes.
At any level k, the number of nodes is in the range from 1 to 2^k.
The number of nodes per level contributes to the density of the tree.
Intuitively, density is a measure of the size of a tree (number of nodes) relative to the height of the tree.
If degenerate trees are allowed, search in a basic binary tree is O(N): the value could be anywhere in the tree, so the tree is no better than a list.
A complete binary tree of height h has all possible nodes through level h-1, and the nodes at depth h exist left to right with no gaps.
Complete binary trees are excellent storage structures because they pack a large number of nodes near the root.

Determine the minimum height of a complete tree that holds n elements.
Through level h-1, the total number of nodes is
    1 + 2 + 4 + ... + 2^(h-1) = 2^h - 1
At depth h, the number of additional nodes ranges from a minimum of 1 to a maximum of 2^h.
Hence the number of nodes n in a complete binary tree of height h satisfies
    2^h - 1 + 1 = 2^h ≤ n ≤ 2^h - 1 + 2^h = 2^(h+1) - 1 < 2^(h+1)

Density of a Complete Binary Tree
After applying the logarithm base 2 to all terms in the inequality, we have
    h ≤ log2 n < h+1
and conclude that a complete binary tree with n nodes must have height h = int(log2 n).
Search in a complete binary tree is O(log2 n).

Binary Tree Nodes
Define a binary tree node as an instance of the generic TNode class. A node contains three fields:
the data value, called nodeValue
the reference variables left and right, which identify the left child and the right child of the node respectively
The TNode class allows us to construct a binary tree as a collection of TNode objects.

TNode Class
// declared as an inner class within the class building the tree

Building a Binary Tree
Using the previous slide's TNode class

Scanning a Binary Tree
Next issue: how will you retrieve the data stored in the tree?

Iterative Level-Order Scan
A level-order scan visits the root, then nodes on level 1, then nodes on level 2, etc.
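The level-order scan just described can be sketched as follows. This is a minimal illustration, not the deck's own code: the TNode class is a hypothetical stand-in with the three fields named above (nodeValue, left, right), written as a static nested class, and the names LevelOrderSketch, levelOrder, and sampleTree are assumptions made for the example. It uses the standard java.util.Queue interface with an ArrayDeque as the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LevelOrderSketch {
    // Hypothetical stand-in for the TNode class described in the slides.
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // Visit nodes level by level, left to right, and return their
    // values separated by single spaces.
    static <T> String levelOrder(TNode<T> root) {
        StringBuilder s = new StringBuilder();
        Queue<TNode<T>> queue = new ArrayDeque<>();
        if (root != null) {
            queue.add(root);               // initially, the root enters the queue
        }
        while (!queue.isEmpty()) {
            TNode<T> node = queue.remove();    // remove a node from the queue
            if (s.length() > 0) {
                s.append(' ');
            }
            s.append(node.nodeValue);          // the "visit" action
            if (node.left != null) {
                queue.add(node.left);          // children enter the queue
            }
            if (node.right != null) {
                queue.add(node.right);
            }
        }
        return s.toString();
    }

    // Five-node tree: A has children B and C; B has left child D; C has left child E.
    static TNode<String> sampleTree() {
        TNode<String> d = new TNode<>("D", null, null);
        TNode<String> e = new TNode<>("E", null, null);
        TNode<String> b = new TNode<>("B", d, null);
        TNode<String> c = new TNode<>("C", e, null);
        return new TNode<>("A", b, c);
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(sampleTree()));   // prints "A B C D E"
    }
}
```

Null children are never added to the queue here, so the loop body does not need a null check of its own.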
A level-order scan is an iterative process that uses a queue as an intermediate storage collection.
Initially, the root enters the queue.
Start a loop that ends when the queue is empty:
    Remove a node from the queue
    Perform some action with the node
    Add its children to the queue
Because siblings enter the queue during a visit of their parent, the siblings (on the same level) exit the queue in successive iterations.
The next example uses Java's Queue interface.

Example on a five-node tree:
    Step 1: remove A, then add B and C to the queue
    Step 2: remove B, then add D to the queue
    Step 3: remove C, then add E to the queue
    Step 4: remove D
    Step 5: remove E

Example on the ten-node tree A through J (queue contents, then the removal applied to them):
    A          remove A
    B C        remove B
    C D        remove C
    D E F      remove D
    E F G H    remove E
    F G H      remove F
    G H I      remove G
    H I        remove H
    I J        remove I
    J          remove J
Notice that the order in which nodes are removed from the queue is the order in which they are appended to the output string s.

Recursive Binary Tree-Scan Algorithms
The stopping condition is current node == null.
To scan a tree recursively:
    Visit and display the node (D)
    Scan the left subtree (L)
    Scan the right subtree (R)
The order in which you perform the D, L, R tasks determines the order in which nodes are retrieved.
In the following code, t is initially the reference to the root node.

Inorder Scan (L D R)
The scan is in order of visits to the left subtree, the node's own value, and visits to the right subtree.
Inorder scan: G D J H B A E C F I
inorderDisplay method call stack:
inorderDisplay(a)
  inorderDisplay(b)
    inorderDisplay(d)
      inorderDisplay(g)
        inorderDisplay(null)
        append g to s
        inorderDisplay(null)
      append d to s
      inorderDisplay(h)
        inorderDisplay(j)
          inorderDisplay(null)
          append j to s
          inorderDisplay(null)
        append h to s
        inorderDisplay(null)
    append b to s
    inorderDisplay(null)
  append a to s
  inorderDisplay(c)
    inorderDisplay(e)
      inorderDisplay(null)
      append e to s
      inorderDisplay(null)
    append c to s
    inorderDisplay(f)
      inorderDisplay(null)
      append f to s
      inorderDisplay(i)
        inorderDisplay(null)
        append i to s
        inorderDisplay(null)

Postorder Scan (L R D)
The scan is in order of visits to the left subtree, visits to the right subtree, and the node's own value.
Scan order: G J H D B E I F C A
Can you write the method call stack for this?

Method Call Stack for Postorder
postorderDisplay method call stack:
postorderDisplay(a)
  postorderDisplay(b)
    postorderDisplay(d)
      postorderDisplay(g)
        postorderDisplay(null)
        postorderDisplay(null)
        append g to s
      postorderDisplay(h)
        postorderDisplay(j)
          postorderDisplay(null)
          postorderDisplay(null)
          append j to s
        postorderDisplay(null)
        append h to s
      // rest left to you...

Preorder Scan (D L R)
The scan is in order of the node's own value, visits to the left subtree, and visits to the right subtree.
Scan order: A B D G H J C E F I

More Recursive Scanning Examples
    Preorder (NLR):   A B D G C E H I F
    Inorder (LNR):    D G B A H E I C F
    Postorder (LRN):  G D B H I E F C A

Visitor Design Pattern
Use it when an action is needed on each element of a collection and the type of each element is not known in advance.
The pattern defines the visit() method, which denotes what a visitor does.
For a specific visitor pattern:
    Create a Visitor interface
    Create a class that implements the Visitor interface
In the class scanning a tree:
    Create an object of the class implementing the Visitor interface
    During traversal, call the Visitor object's visit() and pass the current value as an argument
Another possibility requires T to be "Comparable".

scanInorder()
The recursive method scanInorder() provides a generalized inorder traversal of a tree that performs an action specified by a visitor.
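A generalized inorder traversal of this kind might look like the sketch below. The names Visitor, visit(), and scanInorder() come from the slides; everything else is an assumption made for illustration: the TNode stand-in, the exact method signatures, the VisitMax implementation (which relies on T being Comparable), and the helper names sampleTree and inorderString. The sample tree is the ten-node tree A through J used in the scan examples.

```java
import java.util.ArrayList;
import java.util.List;

class VisitorSketch {
    // Hypothetical stand-in for the TNode class.
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // The visitor decides what happens at each node (assumed signature).
    interface Visitor<T> {
        void visit(T x);
    }

    // Inorder (L D R) traversal that delegates the "D" step to the visitor.
    static <T> void scanInorder(TNode<T> t, Visitor<T> v) {
        if (t == null) {
            return;                    // stopping condition
        }
        scanInorder(t.left, v);
        v.visit(t.nodeValue);
        scanInorder(t.right, v);
    }

    // Assumed implementation of a visitor that remembers the largest value seen.
    static class VisitMax<T extends Comparable<T>> implements Visitor<T> {
        private T max;

        public void visit(T x) {
            if (max == null || x.compareTo(max) > 0) {
                max = x;
            }
        }

        T getMax() {
            return max;
        }
    }

    // The tree A..J from the scan examples.
    static TNode<String> sampleTree() {
        TNode<String> g = new TNode<>("G", null, null);
        TNode<String> j = new TNode<>("J", null, null);
        TNode<String> h = new TNode<>("H", j, null);
        TNode<String> d = new TNode<>("D", g, h);
        TNode<String> b = new TNode<>("B", d, null);
        TNode<String> e = new TNode<>("E", null, null);
        TNode<String> i = new TNode<>("I", null, null);
        TNode<String> f = new TNode<>("F", null, i);
        TNode<String> c = new TNode<>("C", e, f);
        return new TNode<>("A", b, c);
    }

    // Collects the inorder sequence by passing a lambda as the visitor.
    static String inorderString(TNode<String> root) {
        List<String> out = new ArrayList<>();
        Visitor<String> collect = x -> out.add(x);
        scanInorder(root, collect);
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(inorderString(sampleTree()));   // prints "G D J H B A E C F I"
        VisitMax<String> vm = new VisitMax<>();
        scanInorder(sampleTree(), vm);
        System.out.println(vm.getMax());                   // prints "J"
    }
}
```

Because the traversal only calls visit(), the same scanInorder() serves for printing, collecting, or computing a maximum without any change to the scanning code.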
scanInorder() uses the Visitor object's visit() instead of System.out.println.
If the Visitor parameter is a VisitorOutput, scanInorder() prints output to the console.
If the Visitor parameter is a VisitMax, the VisitMax object stores the tree element with the maximum value after scanInorder() finishes.
Program B_Tree illustrates use of all the scanning methods.

Computing Tree Height
Recall that the height of a binary tree can be computed recursively:
    height(T) = -1                                 if T is empty
    height(T) = 1 + max(height(TL), height(TR))    if T is nonempty

Copying a Binary Tree
Simple case: an exact duplicate.
Duplicate with additional information: the copy contains nodes with an additional field. One possibility is a field that references the parent, which allows a scan up the tree along the path of parents.
Copy a tree using a postorder scan, which builds the duplicate tree from the bottom up.

Clearing a Binary Tree
Clear a tree with a postorder scan: remove the left and right subtrees before removing the node.

Binary Search (Sorted) Trees
Assume each data element has some key value.
For every node, the key is greater than all keys found in the left subtree and less than all keys found in the right subtree.
All nodes can be considered ordered (i.e., sorted).
The average depth of a balanced tree is on the order of log2 N.

    Function               Running time
    Make_Empty             O(N)
    Find                   O(log2 N)
    Find_Min / Find_Max    O(log2 N)
    Insert                 O(log2 N)
    Remove                 O(log2 N)

Most operations on a binary search tree take time directly proportional to the tree's height, so it is desirable to keep the height small.
Ordinary (unbalanced) binary search trees have the primary disadvantage that they can attain very large heights in rather ordinary situations, such as when the keys are inserted in order.
The result is a data structure similar to a linked list, making all operations on the tree expensive.
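To make the height problem concrete, here is a small sketch that computes height with the recursive definition above and compares two insertion orders for the same seven keys. It is an illustration under stated assumptions: the node class is a simplified int-valued stand-in for TNode, and insert() is a plain unbalanced insertion written for this example (values greater than or equal to the current node go right), not code taken from the slides.

```java
class HeightSketch {
    // Simplified, int-valued stand-in for TNode (an assumption for this example).
    static class TNode {
        int nodeValue;
        TNode left, right;

        TNode(int item) {
            this.nodeValue = item;
        }
    }

    // height(T) = -1 if T is empty, else 1 + max(height(TL), height(TR)).
    static int height(TNode t) {
        if (t == null) {
            return -1;
        }
        return 1 + Math.max(height(t.left), height(t.right));
    }

    // Plain unbalanced insertion: smaller values go left, others go right.
    static TNode insert(TNode root, int item) {
        TNode node = new TNode(item);
        if (root == null) {
            return node;               // first node becomes the root
        }
        TNode current = root;
        while (true) {
            if (item < current.nodeValue) {
                if (current.left == null) {
                    current.left = node;
                    break;
                }
                current = current.left;
            } else {
                if (current.right == null) {
                    current.right = node;
                    break;
                }
                current = current.right;
            }
        }
        return root;
    }

    // Builds a tree by inserting the keys in the order given.
    static TNode build(int... keys) {
        TNode root = null;
        for (int k : keys) {
            root = insert(root, k);
        }
        return root;
    }

    public static void main(String[] args) {
        // Keys inserted in sorted order: every node has only a right child.
        System.out.println(height(build(1, 2, 3, 4, 5, 6, 7)));   // prints 6
        // Same keys, inserted so that the tree fills level by level.
        System.out.println(height(build(4, 2, 6, 1, 3, 5, 7)));   // prints 2
    }
}
```

With seven keys, the sorted insertion order yields a chain of height 6, while the second order yields a tree of height 2, which is int(log2 7).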
Binary Search (Sorted) Trees
A total order (linear order) is defined on node or key values.
The left subtree of a node contains only values less than the node or key value.
The right subtree contains only values greater than or equal to the node or key value.
[Figure: sorted binary tree holding the values 21, 10, 5, 2, 34, 25, 7, 39, 33]

Inserting into a Sorted Binary Tree
Create the new node (set children to null)
If it is the first node in the tree
    Make the new node the root
Else determine where to insert the new node
    Set current node to root
    Loop until done
        If new node is greater than or equal to current node
            If current.right is null (end of a branch)
                Set current.right to the new node
                Set done to true
            Else
                Set current to current.right
        Similar for less than on the left side...

Deleting from a Sorted Binary Tree
Several cases to consider:
    1. Deleting a leaf node: remove the node from the tree
    2. Deleting a node with one child: delete the node and replace it with its child
    3. Deleting a node with two children:
        Find the successor of the node
        Copy the successor value into the deletion position
        Delete the successor node (case 1 or 2 above)
        The successor will not have a left branch
[Figure: deleting a node with two children. In a tree rooted at 50, node 25 is deleted: the successor value 30 is copied into its position, then the original 30 node is deleted.]

How to find the successor (the node replacing the deleted node):
    Start with the deleted node's right child
    Then follow the path of left children to the end; this is the successor
    In the example, deleting 25 starts at 35 and moves left to 30; 30 has no left child, so it is the successor
    The successor will never have a left child

Searching a Sorted Binary Tree for a Value
Set current to root
Loop while current is not null and current's value isn't the item's value
    If the sought value is less than current's value
        Set current to current.left
    Else
        Set current to current.right
If current is null, the item was not found

Heaps
Chapter 22, Ford and Topp

Array-Based Binary Trees
A complete binary tree of depth d contains all possible nodes
through level d-1, with the nodes at level d in the leftmost positions in the tree.
An array a can be viewed as a complete binary tree:
    the root is a[0]
    the first-level children are a[1] and a[2]
    the second-level children are a[3], a[4], a[5], a[6]
    and so forth.
Integer[] arr = {5, 1, 3, 9, 6, 2, 4, 7, 0, 8};
For element a[i] in an n-element array-based binary tree:
    Left child of a[i] is a[2*i + 1]     undefined if 2*i + 1 ≥ n
    Right child of a[i] is a[2*i + 2]    undefined if 2*i + 2 ≥ n
    Parent of a[i] is a[(i-1)/2]         undefined if i = 0

Heaps
A heap is an array-based tree structure.
Max heap: if B is a child node of A, then key(A) >= key(B)
Min heap: the relationship is reversed, key(A) <= key(B)
The root is the maximum in a max heap and the minimum in a min heap.
A maximum heap is an array-based tree in which the value of a parent is ≥ the value of its children. A minimum heap uses the relation ≤.

Inserting into a Heap
Assume that the elements in the index range 0 ≤ i < last of an n-element array (last < n) form a heap.
The new element enters the array at index last, with the heap expanding by one element.
Move nodes on the path of parents down one level until the item can be assigned to a position where heap ordering holds.
[Figure: path of parents for insertion of item = 50]
The static method pushHeap() of the class Heaps inserts a new value in the heap. The parameter list includes the array arr, the index last, the new value item, and a Comparator object of type Greater or Less indicating whether the heap is a maximum or minimum heap.
The algorithm uses an iterative scan with variable currPos initially set to last. At each step, compare item with the value of the parent; if item is larger (for a maximum heap), copy the parent value to the element at index currPos and assign the parent index as the new value for currPos. The effect is to move the parent down one level.
Stop when the parent is larger, or when currPos reaches the root, and assign item to the position currPos.

Deleting from a Heap
Deletion from a heap is normally restricted to the root only. Hence, the operation removes the maximum (or minimum) element.
To erase the root of an n-element heap, exchange the element at index n-1 with the root and filter the new root down into its correct position in the tree.
[Figure: filtering the value 18 down from arr[0] in a heap stored in arr[0] through arr[8]]

adjustHeap()
The implementation of adjustHeap() uses the integer variables currPos and childPos to scan the path of children.
Let currPos = first and target = arr[first].
The iterative scan proceeds until we reach a leaf node or target is ≥ the values of the children at the current position (≤ for a minimum heap).
Move currPos and childPos down the path of children in tandem.
Set childPos = index of the largest (smallest) of arr[2*currPos + 1] and arr[2*currPos + 2].

Implementing popHeap()
The implementation first captures the root and then exchanges it with the last value in the heap (arr[last-1]).
A call to adjustHeap() reestablishes heap order in a heap which now has index range [0, last-1).
Method popHeap() concludes by returning the original root value.

Complexity of Heap Operations
A heap stores elements in an array-based tree that is a complete tree.
The pushHeap() and adjustHeap() operations reorder elements in the tree by moving up the path of parents for a push and down the path of largest (smallest) children for a pop.
Assuming the heap has n elements, the maximum length of a path between a leaf node and the root is log2 n, so the runtime efficiency of the algorithms is O(log2 n).

Sorting with a Heap
If the original array is a maximum heap, an efficient sorting algorithm can be devised.
For each iteration i, the largest element is arr[0].
Exchange arr[0] with arr[i] and then reorder the array so that elements in the index range [0, i) are a heap.
This is precisely the action of popHeap(), which is an O(log2 n) algorithm.
By first transforming an arbitrary array into a heap, this algorithm will sort the array.

Building a Heap
Transforming an arbitrary array into a heap is called "heapifying" the array.
The method makeHeap() in the Heaps class performs this transformation.
Turn an n-element array into a heap by filtering down each parent in the tree, beginning with the last parent at index (n-2)/2 and ending with the root node at index 0.
Integer[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19};
[Figure: building a heap from arr. (a) adjustHeap() at 4: no changes. (b) adjustHeap() at 3: move 30 down 1 level. (c) adjustHeap() at 2: move 17 down 1 level.]

Heapsort
The heap sort is a modified version of the selection sort for an array arr that is a heap.
For each i = n, n-1, ..., 2, call popHeap(), which pops arr[0] from the heap and assigns it at index i-1.
With a maximum heap, the array is sorted in ascending order. A minimum heap sorts the array in descending order.

// requires: import java.util.Comparator;
public static <T> void heapSort(T[] arr, Comparator<? super T> comp)
{
    // "heapify" the array arr
    Heaps.makeHeap(arr, comp);

    int i, n = arr.length;

    // iteration that determines elements arr[n-1] ... arr[1]
    for (i = n; i > 1; i--)
    {
        // call popHeap() to move the next largest
        // (or smallest) element to arr[i-1]
        Heaps.popHeap(arr, i, comp);
    }
}

A mathematical analysis shows that the worst-case running time of makeHeap() is O(n).
During the second phase of the heap sort, popHeap() executes n - 1 times. Each operation has efficiency O(log2 n).
The worst-case complexity of the heap sort is O(n) + O(n log2 n) = O(n log2 n).

Implementing a Priority Queue
Recall that the HeapPQueue class implements the PQueue interface.
The class uses a heap as the underlying storage structure.
The user is free to specify either a Less or a Greater comparator, which dictates whether a deletion removes the minimum or the maximum element from the collection.

The HeapPQueue Class
import java.util.Comparator;
import java.util.NoSuchElementException;

public class HeapPQueue<T> implements PQueue<T>
{
    // heapElt holds the priority queue elements
    private T[] heapElt;
    // number of elements in the priority queue
    private int numElts;
    // Comparator used for comparisons
    private Comparator<T> comp;

    // create an empty maximum priority queue
    public HeapPQueue()
    {
        comp = new Less<T>();
        numElts = 0;
        heapElt = (T[]) new Object[10];
    }
    . . .
}

HeapPQueue Class peek()
// return the highest priority item
// Precondition: the priority queue is not empty;
// if it is empty, throws NoSuchElementException
public T peek()
{
    // check for an empty heap
    if (numElts == 0)
        throw new NoSuchElementException(
            "HeapPQueue peek(): empty queue");

    // return the root of the heap
    return heapElt[0];
}

HeapPQueue Class pop()
// erase the highest priority item and return it
// Precondition: the priority queue is not empty;
// if it is empty, throws NoSuchElementException
public T pop()
{
    // check for an empty priority queue
    if (numElts == 0)
        throw new NoSuchElementException(
            "HeapPQueue pop(): empty queue");

    // pop the heap and save the return value in top
    T top = Heaps.popHeap(heapElt, numElts, comp);

    // heap has one less element
    numElts--;

    return top;
}

HeapPQueue Class push()
// insert item into the priority queue
public void push(T item)
{
    // if the current capacity is used up, reallocate
    // with double the capacity
    if (numElts == heapElt.length)
        enlargeCapacity();

    // insert item into the heap
    Heaps.pushHeap(heapElt, numElts, item, comp);
    numElts++;
}
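The Heaps methods that HeapPQueue and heapSort() rely on are described in prose but their code is not shown in this section. The sketch below is one possible reading of those descriptions, simplified under clearly stated assumptions: it works on int[] arrays, hard-codes maximum-heap ordering instead of taking a Comparator, and the class name MaxHeapSketch and the reduced parameter lists are hypothetical. It should not be taken as the actual Heaps class.

```java
import java.util.Arrays;

class MaxHeapSketch {
    // Insert item into the max heap in arr[0..last-1]; afterwards the heap
    // occupies arr[0..last]. Parents on the path move down one level until
    // item can be placed with heap ordering intact.
    static void pushHeap(int[] arr, int last, int item) {
        int currPos = last;
        int parentPos = (currPos - 1) / 2;
        while (currPos != 0 && item > arr[parentPos]) {
            arr[currPos] = arr[parentPos];     // move the parent down
            currPos = parentPos;
            parentPos = (currPos - 1) / 2;
        }
        arr[currPos] = item;
    }

    // Filter arr[first] down within the index range [first, last).
    static void adjustHeap(int[] arr, int first, int last) {
        int currPos = first;
        int target = arr[first];
        int childPos = 2 * currPos + 1;
        while (childPos < last) {
            // pick the larger of the two children, if a right child exists
            if (childPos + 1 < last && arr[childPos + 1] > arr[childPos]) {
                childPos++;
            }
            if (target >= arr[childPos]) {
                break;                         // heap order holds here
            }
            arr[currPos] = arr[childPos];      // move the child up
            currPos = childPos;
            childPos = 2 * currPos + 1;
        }
        arr[currPos] = target;
    }

    // Exchange the root with arr[last-1], restore heap order in
    // [0, last-1), and return the original root.
    static int popHeap(int[] arr, int last) {
        int top = arr[0];
        arr[0] = arr[last - 1];
        arr[last - 1] = top;
        adjustHeap(arr, 0, last - 1);
        return top;
    }

    // Filter down every parent, from the last parent back to the root.
    static void makeHeap(int[] arr) {
        for (int i = (arr.length - 2) / 2; i >= 0; i--) {
            adjustHeap(arr, i, arr.length);
        }
    }

    // Heapify, then repeatedly pop the maximum into arr[i-1].
    static void heapSort(int[] arr) {
        makeHeap(arr);
        for (int i = arr.length; i > 1; i--) {
            popHeap(arr, i);
        }
    }

    public static void main(String[] args) {
        int[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19};
        heapSort(arr);
        // prints [4, 9, 12, 17, 19, 20, 30, 50, 60, 65]
        System.out.println(Arrays.toString(arr));
    }
}
```

A priority queue built on these helpers would call pushHeap(heapElt, numElts, item) followed by numElts++ in push(), and popHeap(heapElt, numElts) followed by numElts-- in pop(), mirroring the HeapPQueue methods above.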