COP 3540 Data Structures with OOP
Chapter 8 - Part 1
Binary Trees

Why Trees?
Trees are one of the fundamental data structures.
Many real-world phenomena cannot be represented with the data structures we have had so far.
Think of arrays:
  • Easy to search, especially if ordered: O(log n) performance with binary search. Great.
  • Inserting? Deleting? Horrible if ordered. Must find the item or the place first, then shift the elements after it, which takes O(n) time.

Why Trees?
How about linked lists?
  • Inserts and deletes? Great. They take O(1) time, the best you can get (if inserting / deleting at one end).
  • Searching? Searching in order to insert / delete / change? Not nearly as good as O(1) or even O(log n)!
  • On average, must search n/2 items, so the process requires O(n) time.
  • Ordering the linked list may help a little, but we must still search sequentially to find an item.

Trees
First, trees in general.
A tree consists of nodes connected by edges.
Trees have indegree <= 1 and no cycles.
  • This implies exactly one path from the root to any node.
Structures with indegree > 1 or with cycles are general graphs.
  • More later on graphs.

Trees
[Figure: a general tree with root node A and nodes B, C, D, E, F, G, H; one edge is labeled.]
Edges connect nodes. The only way to get to a node is along an indicated path (edges), and these paths run downward.
Edges are represented in a program by references; nodes are represented by objects (or, in simple cases, primitives).
The one node at the top of the tree is called the root. A tree can have only one root.
Binary trees have outdegree <= 2 for ALL nodes.
A multiway tree can have outdegree > 2 for at least one node (see the figure above).

Trees
[Figure: the same general tree.]
Parent: every node except the root has exactly one parent.
Child: any node may have one or more lines coming from it; these lead to its children. A binary tree has at most two children emanating from a node.
Leaf: a node with no children.
Subtree: any node may be considered the root of a subtree, even a leaf.
Visiting: a node is visited when program control arrives at it, usually to process the data at that node. Merely passing over a node does not constitute a visit.
Traversing: visiting all nodes in a tree in a prescribed order.
Levels: the root is at level 0, its children are at level 1, and so on.
Keys: normally what is displayed on the tree, like A, B, C, … above.

Trees
[Figure: a binary tree with root node A and nodes B, D, E, F, G, H; one edge is labeled.]
Binary tree: every node has no more than two children.
A child node is called a left child or a right child, but may, in turn, be the root of a subtree!
A node in a binary tree may have no children.
We normally talk in terms of binary search trees.
In theory (as a logical abstraction), a binary tree can have any number of levels; in practice (in an implementation), we can run out of memory.

How Do Binary Search Trees Work?
We need to carry out basic tree operations such as finding a node, traversing a tree (getting around in the tree), adding a node, deleting a node, etc.
This is what this chapter is all about.

Unbalanced Trees
[Figure: an unbalanced tree with a root and nodes A, D, F, H, G.]
Note: the tree is ‘unbalanced.’ This means its nodes are mostly on one side or the other.
A tree may be nearly balanced down to a certain level and then become quite unbalanced (skewed).
Trees can become unbalanced due to the order in which they were built. Trees built from randomly ordered keys are generally more balanced.
We greatly prefer balanced trees; they have very welcome properties.
Unbalanced trees cannot be processed as efficiently.
Red-Black trees address unbalanced trees (further ahead).

Trees in Java Code
So, how do we implement binary trees in Java?
Normally, we store the nodes at unrelated places in memory, with references to the children, as we are accustomed to doing.
We can also represent a tree in memory as an array, with children located in specific positions within the array. Will look at this later.
Let’s look at some code segments now.

The Node Class
We need a class of node objects. These contain the appropriate data and up to two references to children (a node can have MORE, as we shall see later…).

   class Node
   {
      int iData;            // data used as key value
      double fData;         // other data
      Node leftChild;       // this node’s left child
      Node rightChild;      // this node’s right child

      public void displayNode()
      {
         // whatever…
      } // end displayNode()
   } // end class Node

Sometimes the data are objects rather than primitives and are better simply referenced from the Node itself.

   class Node
   {
      Person p;             // reference to a Person object
      Node leftChild;       // this node’s left child
      Node rightChild;      // this node’s right child

      public void displayNode()
      {
         // whatever…
      } // end displayNode()
   } // end class Node

   class Person
   {
      int iData;
      double fData;
      …
   } // end class Person

The Tree Class
We need a Tree class from which a tree object can be instantiated.
Class Tree has one field: a Node variable that references the root.
This is analogous to ‘first’ and ‘last’ for linked lists: it gives us a place to get started.
Also, since we do not allocate the entire structure at one time (as with an array), we can ONLY hold a reference to the first element, the root.
Consider the basic format of a Tree class:

Tree Class

   class Tree
   {
      private Node root;    // only data field in Tree; but key!

      public Node find(int key)
      {
         return null;       // stub: not showing details of this method here
      } // end find()

      public void insert(int id, double dd)
      {
         // stub; placeholder
      } // end insert()

      public void delete(int id)
      {
         // stub
      } // end delete()
   } // end class Tree

The TreeApp Class

   class TreeApp
   {
      public static void main(String[] args)
      {
         Tree theTree = new Tree();       // make a tree: creates an object of type Tree (previous slide)
         theTree.insert(50, 1.5);         // insert three nodes
         theTree.insert(25, 1.7);         // invoking tree methods…
         theTree.insert(75, 1.9);         // no implementations shown yet
         Node found = theTree.find(25);   // find node with key 25
         if (found != null)               // So what does this do??? (find() returns null if the key is absent)
            System.out.println("Found the node with key 25");
         else
            System.out.println("Could not find node with key 25");
      } // end main()
   } // end class TreeApp

Java Code for Finding a Node
Insert and delete come a bit later. Start with finding a node.
Remember, nodes have values.
In building a binary search tree, we assume the nodes are placed in order: every value in the left subtree of a node is smaller than the value in that node, while every value in its right subtree is larger.
Assume the binary search tree is already built:

Sample Binary Tree
Note the left and right, less than, greater than relationships.
[Figure: binary search tree with root 50; the children of 50 are 30 and 60; the children of 30 are 20 and 40.]

Discuss the code:

   public Node find(int key)                // assumes non-empty tree
   {
      Node current = root;                  // start at root
      while (current.iData != key)          // if no match
      {
         if (key < current.iData)
            current = current.leftChild;    // recall: current = current.next??
         else
            current = current.rightChild;
         if (current == null)
            return null;                    // not found; boundary condition
      } // end while
      return current;                       // returns reference to node
   } // end find()

Java Code for Inserting a Node
We must find the place where the new node belongs.
Follow a path to the parent under which we can insert the child.
Connect the new node to the parent as its left or right child, depending on whether the new node’s value is less than or greater than the value of the parent.
First: create a new node.
Then use logic similar to find() to locate the spot for the new node.
Ignore duplicates at this time.

   public void insert(int id, double dd)    // method of class Tree
   {
      Node newNode = new Node();            // make new node
      newNode.iData = id;                   // move in its data
      newNode.fData = dd;
      if (root == null)                     // no node in root
         root = newNode;                    // if true, we are done
      else                                  // root occupied
      {
         Node current = root;               // current is a reference to a node; it starts at root
         Node parent;                       // reference to the parent
                                            // (think singly-linked list!!! Needed ‘previous’)
         while (true)                       // search for the right spot (exits internally)
         {
            parent = current;
            if (id < current.iData)         // go left?
            {
               current = current.leftChild; // but maybe there IS no left child
               if (current == null)         // there is no left child
               {
                  parent.leftChild = newNode;   // insert on left: link this new node in
                  return;                       // we are done if we get here
               } // end if
            }
            else                            // id >= current.iData, so go right
            {
               current = current.rightChild;
               if (current == null)         // if end of the line, insert on right
               {
                  parent.rightChild = newNode;  // link in, and we are done again
                  return;                       // we are done if we get here
               } // end if
            } // end else
         } // end while
      } // end else not root
   } // end insert()

Note: we use ‘parent’ to keep track of where we are; it always references the last non-null node on the path.

Traversing the Tree
Traversing the tree means ‘visiting’ all the nodes in some specified order.
Not particularly fast, since every node must be visited, which takes O(n) time; recursion makes the code short, not faster.
Three basic ways (there are several others…):
  • Preorder traversal (NLR scan or traversal)
  • Inorder traversal (LNR traversal)
  • Postorder traversal (LRN traversal)
Most common is inorder.

Inorder Traversal – Binary Tree (LNR Traversal)
Results in an ascending scan based on key values.
The simplest way to traverse a tree is via recursion.
Start with a node as the argument (start with the root).
The recursive routine will:
  • Call itself to traverse the node’s left subtree
  • Visit the node (implies do something with it!)
  • Call itself to traverse the node’s right subtree

Java Code for Traversing – Inorder Recursive…
Traversals typically take a total of three statements, if executed recursively. Here’s the LNR traversal:

   private void inOrder(Node localRoot)     // initially called with root, as in inOrder(root);
   {
      if (localRoot != null)
      {
         inOrder(localRoot.leftChild);
         System.out.print(localRoot.iData + " ");
         inOrder(localRoot.rightChild);
      } // end if
   } // end inOrder()

It continues until there are no more nodes to visit. All nodes are visited!
Each visit displays the value of the data via the System.out.print statement.
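The find, insert, and inorder routines above can be pulled together into one small program. What follows is a minimal sketch, not the course's reference implementation: the class and field names follow the slides, but find() here also tolerates an empty tree (the slide version assumes a non-empty tree), and inOrderString() is a hypothetical helper, added only so the scan returns its result instead of printing it.

```java
// Minimal binary search tree sketch: insert, find, recursive inorder (LNR) scan.
// Names mirror the slides; inOrderString() is a hypothetical helper.
class Node {
    int iData;          // key
    double fData;       // other data
    Node leftChild;
    Node rightChild;
}

class Tree {
    private Node root;  // null while the tree is empty

    // Returns the node holding key, or null if the key is absent.
    // Unlike the slide version, this also works on an empty tree.
    public Node find(int key) {
        Node current = root;
        while (current != null && current.iData != key) {
            current = (key < current.iData) ? current.leftChild : current.rightChild;
        }
        return current;
    }

    // Inserts a new node; a key equal to an existing key goes to the right.
    public void insert(int id, double dd) {
        Node newNode = new Node();
        newNode.iData = id;
        newNode.fData = dd;
        if (root == null) { root = newNode; return; }
        Node current = root;
        while (true) {
            Node parent = current;                  // last non-null node seen
            if (id < current.iData) {
                current = current.leftChild;
                if (current == null) { parent.leftChild = newNode; return; }
            } else {
                current = current.rightChild;
                if (current == null) { parent.rightChild = newNode; return; }
            }
        }
    }

    // Hypothetical helper: collects the keys in LNR order instead of printing them.
    public String inOrderString() {
        StringBuilder sb = new StringBuilder();
        inOrder(root, sb);
        return sb.toString().trim();
    }

    private void inOrder(Node localRoot, StringBuilder sb) {
        if (localRoot != null) {
            inOrder(localRoot.leftChild, sb);           // L
            sb.append(localRoot.iData).append(' ');     // N (the visit)
            inOrder(localRoot.rightChild, sb);          // R
        }
    }
}

class BstSketchDemo {
    public static void main(String[] args) {
        Tree theTree = new Tree();
        int[] keys = {50, 30, 60, 20, 40};              // builds the sample tree
        for (int k : keys) theTree.insert(k, k / 10.0);
        System.out.println(theTree.inOrderString());    // prints: 20 30 40 50 60
        System.out.println(theTree.find(40) != null);   // prints: true
        System.out.println(theTree.find(99) != null);   // prints: false
    }
}
```

Inserting the keys in the order 50, 30, 60, 20, 40 reproduces the sample tree, and the inorder scan returns them in ascending order.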
Execute the algorithm: LNR Traversal (inOrder)

   private void inOrder(Node localRoot)     // initially called with root, as in inOrder(root);
   {
      if (localRoot != null)
      {
         inOrder(localRoot.leftChild);
         System.out.print(localRoot.iData + " ");
         inOrder(localRoot.rightChild);
      } // end if
   } // end inOrder()

[Figure: binary search tree with root 50; the children of 50 are 30 and 60; the children of 30 are 20 and 40.]
inOrder means priority Left!
Start with 50 (root). Check left child, L: not null.
Recursive call with 30 as root. Check left child, L: not null; it is 20.
Recursive call with 20 as root. Check left child, L: null.
Visit 20 (print), N. (System.out.print above…)
Recursive call with right child of 20, R: null. This completes the most recent call to the left.
Visit 30 (print), N. (System.out.print above…)
Recursive call with right child of 30 (40).
Recursive call with left child of 40, L: it is null.
Visit 40 (print), N.
Recursive call with right child of 40: it is null. This completes another call, on the right.
Visit 50, N. (System.out.print above)
Recursive call with right subtree (60): not null.
Recursive call with left subtree of 60, L: it is null.
Visit 60, N.
Recursive call with right subtree of 60: it is null.
Done!

Priority: L-N-R
Inorder traversal (LNR) means:
  • Left. Go to the left if at all possible; continue going to the left as much as possible. THEN process that node.
  • Node. Next, the root (node) N; process it.
  • Right. Lastly, go to the right, R. But then try to go left if at all possible, then N, and process it, then Right… process the Right if there is nothing to the left.
Note: the Right then becomes N. But even when you go to the right, before you process it, you must determine whether you can go to the left again, as much as possible… recursively…

Inorder, preorder, and postorder get their names from the position of N in LNR, NLR, and LRN.
In traversing the tree, the first letter in the scan is the first priority, the second letter is the second priority, and the third letter is the last priority.

Pre-Order (NLR) and Post-Order (LRN) Traversals
NLR and LRN traversals. Very important.
Same statements in the algorithm: change the order!!
NLR: visit the node first, then go left. The left child (if present and not null) is the node, N, of the left subtree, so process it (visit the node). Go left again. If not null, this node is now the new root, N, of another subtree. We process that… etc.
Simple priority: N – L – R.
We can see that only the order of the recursive calls is changed.

   // LNR recursive scan (inOrder)
   inOrder(localRoot.leftChild);
   System.out.print(localRoot.iData + " ");
   inOrder(localRoot.rightChild);

   // NLR recursive scan (preOrder)
   System.out.print(localRoot.iData + " ");
   preOrder(localRoot.leftChild);
   preOrder(localRoot.rightChild);

   // LRN recursive scan (postOrder)
   postOrder(localRoot.leftChild);
   postOrder(localRoot.rightChild);
   System.out.print(localRoot.iData + " ");

Pre-Order and Post-Order Traversals
There are specific applications for these traversals.
A binary tree (not a binary search tree) can be used to represent algebraic expressions that involve binary operators.
The root holds an operator, and each of the other nodes holds either a variable (an operand) or another operator.
Each subtree is a valid algebraic expression.
Consider the following:

Pre-Order Traversal: (NLR)
[Figure: expression tree with root *; the children of * are + and C; the children of + are A and B.]
This represents the expression (A+B)*C.
This is called ‘infix’ notation, which is what we are used to.
For preorder traversals, NLR, we have the algorithm:
  • Visit the node
  • Call itself to traverse the node’s left subtree
  • Call itself to traverse the node’s right subtree
See the priority???
Above, preorder would be: *+ABC
This is also called ‘prefix’ notation (sometimes called Polish prefix).
Advantage: parentheses are never required.
Starting from the left, each operator is applied to the next two operands. In *+ABC, the + needs two operands and takes A and B; A+B is a temporary result and is itself considered an operand. The * is then applied to (A+B) and C, giving (A+B)*C.
Of course, there are more advanced parse trees! (Different one in book)

Post-Order Traversal – LRN.
[Figure: expression tree from the book, with root *; the children of * are A and +; the children of + are B and C.]
You can guess the execution sequence:
  • Call itself to traverse the node’s left subtree
  • Call itself to traverse the node’s right subtree
  • Visit the node, N
Will use the tree in the book.
Priority: L R N, in that order. So: ABC+*
Note: you will only pop after visiting a node…
My words:
Start with the root. Go left. Get A. Can I go to the left? No. Can I go to the right? No. Therefore visit A.
Go back to the previous node (the root here). Have gone left. Then go right. (I’m at +.)
But before I visit +, any chance to go left? Yes! Go left. (I’m at B.)
Any more to the left? No (I’m at B). Anything to the right? No. So visit B.
Back to the node, +. Next priority is to the right, or C.
But before I ‘visit’ C, can I go to the left? NO. Can I go to the right? NO. Ergo, visit C.
Back to the node, +. Have visited L and R. So N is left. Visit N ( + ).
Now return to its parent. Have gone to the left; have gone to the right. So visit that node, *. We are done.
Priority: go left first, then right, then the node (visit): L – R – N.
This gives us ABC+*, which is postfix notation!!

Evaluating a Postfix Notation – postorder traversal…
So, how would you evaluate an expression in postfix notation?
Create the tree using the postfix notation as input.
Once the tree is built, look for operators.
Play with this… you will see these again.

Enter: Pseudo Code for Iterative Scans:
One GREAT way to approach these (as a few of you have emailed me) is to design the modules first; that is, use pseudo-code to draft out your logic.
You all have progressed to the point where punching away at code is extremely frustrating and very error prone. You can easily get to the point where you mess up more than you fix.
So it really is time to hone your skills to the next level and design the logic before trying to implement it in code.
Please consider the following as an example, for the inorder scan (LNR):

Inorder Scan: (LNR)
   Clear stack                          // since we use the stack in three separate iterative scans, clear it prior to each one
   Set current to root node             // need this to get started into the tree
   Loop while true
      If current is not null
      {
         Push current onto stack
         Set current to left child of current    // remember: “L” is top dog. Go to the left as
                                                 // much as possible before ‘visiting’ the node.
      } // end if
      else
      {
         If stack is empty              // nothing left to pop; boundary condition
            return
         Pop stack and set current equal to object popped from stack
         Visit this node                // in this app, print it out
         Set current to right child of current
      } // end else
   End loop
End inorder (LNR) scan.

Pre-Order (NLR) Iterative Scan
Create a stack (or use the current one, but clear it)   // Stack myStack = new Stack();
// Of course, Stack must have push(), pop(), isEmpty(), isFull(), etc., and
// these routines must adjust the stack index as expected…
Could pass the root as a parameter, as in NLR-Iterative-Scan (Node current) or something like this…

Pre-Order NLR Iterative Scan:
   If current is not null               // start at root: make root the current node
   {
      Push current onto stack           // push root node onto stack
      Loop while stack is not empty
      {
         Pop stack and set current equal to object popped   // maybe current = (Node) myStack.pop()
         Visit popped object            // display the node in our application
         If right child of current is not null      // this sequence is critical
            Push right child onto the stack         // myStack.push(the object)
         If left child of current is not null       // note: right child pushed before left child. Why?
            Push left child onto the stack          // so that the left child is popped, and visited, first
      } // end loop
   } // end if
End NLR iterative preorder scan.

Will leave the iterative LRN scan for you… (not as obvious…)

Finding Maximum and Minimum Values
Pretty easy to do in a binary search tree.
Minimum: start at the root and keep going to the left, as the LNR scan does. When you reach a node with no left subtree, ‘that’ is the minimum.
Maximum? Go to the right. Same song.
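The two pseudo-code scans and the minimum / maximum search above translate fairly directly into Java. The sketch below is one possible rendering, under some assumptions: it uses java.util.ArrayDeque as the stack rather than a hand-written Stack class, it collects the visited keys in a list instead of printing them, and the names IterativeScans and sampleTree() are made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of the iterative LNR and NLR scans plus minimum/maximum.
// IterativeScans and sampleTree() are hypothetical names, not from the slides.
class IterativeScans {
    static class Node {
        int iData;
        Node leftChild, rightChild;
        Node(int key) { iData = key; }
    }

    // Iterative inorder (LNR): go left as far as possible, then pop, visit, go right.
    static List<Integer> inOrder(Node root) {
        List<Integer> visited = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        Node current = root;
        while (true) {
            if (current != null) {
                stack.push(current);
                current = current.leftChild;
            } else {
                if (stack.isEmpty()) return visited;   // boundary condition
                current = stack.pop();
                visited.add(current.iData);            // the visit
                current = current.rightChild;
            }
        }
    }

    // Iterative preorder (NLR): right child is pushed before left, so left pops first.
    static List<Integer> preOrder(Node root) {
        List<Integer> visited = new ArrayList<>();
        if (root == null) return visited;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            visited.add(current.iData);                // the visit
            if (current.rightChild != null) stack.push(current.rightChild);
            if (current.leftChild != null) stack.push(current.leftChild);
        }
        return visited;
    }

    // Minimum: keep going left. Returns null for an empty tree.
    static Node minimum(Node root) {
        Node current = root;
        while (current != null && current.leftChild != null) current = current.leftChild;
        return current;
    }

    // Maximum: keep going right. Returns null for an empty tree.
    static Node maximum(Node root) {
        Node current = root;
        while (current != null && current.rightChild != null) current = current.rightChild;
        return current;
    }

    // Builds the sample tree by hand: 50 at the root, 30 and 60 below it, 20 and 40 below 30.
    static Node sampleTree() {
        Node root = new Node(50);
        root.leftChild = new Node(30);
        root.rightChild = new Node(60);
        root.leftChild.leftChild = new Node(20);
        root.leftChild.rightChild = new Node(40);
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(inOrder(root));    // prints: [20, 30, 40, 50, 60]
        System.out.println(preOrder(root));   // prints: [50, 30, 20, 40, 60]
        System.out.println(minimum(root).iData + " " + maximum(root).iData);  // prints: 20 60
    }
}
```

Finding the minimum or maximum this way follows a single path from the root, so it does not need a stack or a full scan of the tree.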