Fundamentals of Python: From First Programs Through Data Structures
Chapter 18
Hierarchical Collections: Trees

Objectives
After completing this chapter, you will be able to:
• Describe the difference between trees and other types of collections using the relevant terminology
• Recognize applications for which general trees and binary trees are appropriate
• Describe the behavior and use of specialized trees, such as heaps, BSTs, and expression trees
• Analyze the performance of operations on binary search trees and heaps
• Develop recursive algorithms to process trees

An Overview of Trees
• In a tree, the ideas of predecessor and successor are replaced with those of parent and child
• Trees have two main characteristics:
  – Each item can have multiple children
  – All items, except a privileged item called the root, have exactly one parent

Tree Terminology
• Note: The height of a tree containing one node is 0
• By convention, the height of an empty tree is –1

General Trees and Binary Trees
• In a binary tree, each node has at most two children:
  – The left child and the right child

Recursive Definitions of Trees
• A general tree is either empty or consists of a finite set of nodes T
  – Node r is called the root
  – The set T – {r} is partitioned into disjoint subsets, each of which is a general tree
• A binary tree is either empty or consists of a root plus a left subtree and a right subtree, each of which is a binary tree

Why Use a Tree?
• A parse tree describes the syntactic structure of a particular sentence in terms of its component parts
• File system structures are also tree-like
• Sorted collections can also be represented as tree-like structures
  – Called a binary search tree, or BST for short
    • Can support logarithmic searches and insertions

The Shape of Binary Trees
• The shape of a binary tree can be described more formally by specifying the relationship between its height and the number of nodes contained in it
• A full binary tree contains the maximum number of nodes for a given height H
• At the other extreme, a linear (chain-like) binary tree with N nodes has height N – 1
• The number of nodes, N, contained in a full binary tree of height H is 2^(H + 1) – 1
• The height, H, of a full binary tree with N nodes is log2(N + 1) – 1
• The maximum amount of work that it takes to access a given node in a full binary tree is O(log N)

Three Common Applications of Binary Trees
• In this section, we introduce three special uses of binary trees that impose an ordering on their data:
  – Heaps
  – Binary search trees
  – Expression trees

Heaps
• In a min-heap, each node is less than or equal to both of its children
• A max-heap places larger nodes nearer to the root
• Heap property: Constraint on the order of nodes
• Heap sort builds a heap from data and repeatedly removes the root item and adds it to the end of a list
• Heaps are also used to implement priority queues
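
The heap sort idea above can be sketched in a few lines. This is a minimal illustration that assumes Python's standard heapq module (a min-heap kept in a list) rather than the chapter's own heap class; the function name heap_sort is made up for the example.

```python
import heapq


def heap_sort(items):
    """Return a new list with the items in ascending order."""
    heap = list(items)       # copy so the caller's data is left untouched
    heapq.heapify(heap)      # rearrange the list into a min-heap in place
    result = []
    while heap:
        # The root of a min-heap is always the smallest remaining item;
        # remove it and add it to the end of the result list.
        result.append(heapq.heappop(heap))
    return result


print(heap_sort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```

Each removal restores the heap property in O(log n) time, so sorting n items this way takes O(n log n) overall.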

Binary Search Trees
• A BST imposes a sorted ordering on its nodes
  – Nodes in left subtree of a node are < node
  – Nodes in right subtree of a node are > node
• When shape approaches that of a perfectly balanced binary tree, searches and insertions are O(log n) in the worst case
• Not all BSTs are perfectly balanced
  – In worst case, they become linear and support linear searches

Expression Trees
• Another way to process expressions is to build a parse tree during parsing
  – Expression tree
• An expression tree is never empty
• An interior node represents a compound expression, consisting of an operator and its operands
• Each leaf node represents a numeric operand
• Operators of higher precedence usually appear near the bottom of the tree, unless overridden in the source expression by parentheses

Binary Tree Traversals
• Four standard types of traversals for binary trees:
  – Preorder traversal: Visits root node, and then traverses left subtree and right subtree in similar way
  – Inorder traversal: Traverses left subtree, visits root node, and traverses right subtree
    • Appropriate for visiting items in a BST in sorted order
  – Postorder traversal: Traverses left subtree, traverses right subtree, and visits root node
  – Level order traversal: Beginning with level 0, visits the nodes at each level in left-to-right order
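
The four traversals can be sketched as follows. The Node class and the bst_add helper are hypothetical names introduced only to build a small BST for the demonstration; they are not the chapter's binary tree ADT.

```python
from collections import deque


class Node:
    """A binary tree node; left and right are None when absent."""

    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def bst_add(node, item):
    """Insert item below node in BST order; return the subtree's root."""
    if node is None:
        return Node(item)
    if item < node.data:
        node.left = bst_add(node.left, item)
    else:
        node.right = bst_add(node.right, item)
    return node


def preorder(node):
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)


def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]


def level_order(node):
    """Visit nodes level by level, left to right, using a queue."""
    result = []
    queue = deque([node] if node is not None else [])
    while queue:
        current = queue.popleft()
        result.append(current.data)
        for child in (current.left, current.right):
            if child is not None:
                queue.append(child)
    return result


root = None
for item in [4, 2, 6, 1, 3, 5, 7]:
    root = bst_add(root, item)

print(preorder(root))     # [4, 2, 1, 3, 6, 5, 7]
print(inorder(root))      # [1, 2, 3, 4, 5, 6, 7]
print(postorder(root))    # [1, 3, 2, 5, 7, 6, 4]
print(level_order(root))  # [4, 2, 6, 1, 3, 5, 7]
```

Note that the inorder result comes out sorted, which is why that traversal suits a BST.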

A Binary Tree ADT
• Provides many common operations required for building more specialized types of trees
• Should support basic operations for creating trees, determining if a tree is empty, and traversing a tree
• Remaining operations focus on accessing, replacing, or removing the component parts of a nonempty binary tree: its root, left subtree, and right subtree

The Interface for a Binary Tree ADT

Processing a Binary Tree
• Many algorithms for processing binary trees follow the trees’ recursive structure
• Programmers are occasionally interested in the frontier, or set of leaf nodes, of a tree
  – Example: Frontier of parse tree for English sentence shown earlier contains the words in the sentence
• frontier expects a binary tree and returns a list
  – Two base cases:
    • Tree is empty: return an empty list
    • Tree is a leaf node: return a list containing root item

Implementing a Binary Tree

The String Representation of a Tree
• __str__ can be implemented with any of the traversals

Developing a Binary Search Tree
• A BST imposes a special ordering on the nodes in a binary tree, so as to support logarithmic searches and insertions
• In this section, we use the binary tree ADT to develop a binary search tree, and assess its performance

The Binary Search Tree Interface
• The interface for a BST should include a constructor and basic methods to test a tree for emptiness, determine the number of items, add an item, remove an item, and search for an item
• Another useful method is __iter__, which allows users to traverse the items in BST with a for loop

Data Structures for the Implementation of BST
…

Searching a Binary Search Tree
• find returns the first matching item if the target item is in the tree; otherwise, it returns None
  – We can use a recursive strategy

Inserting an Item into a Binary Search Tree
• add inserts an item in its proper place in the BST
• Item’s proper place will be in one of three positions:
  – The root node, if the tree is already empty
  – A node in the current node’s left subtree, if new item is less than item in current node
  – A node in the current node’s right subtree, if new item is greater than or equal to item in current node
• For options 2 and 3, add uses a recursive helper function named addHelper
• In all cases, an item is added as a leaf node

Removing an Item from a Binary Search Tree
• Save a reference to root node
• Locate node to be removed, its parent, and its parent’s reference to this node
• If item is not in tree, return None
• Otherwise, if node has a left and right child, replace node’s value with largest value in left subtree and delete that value’s node from left subtree
  – Otherwise, set parent’s reference to node to node’s only child
• Reset root node to saved reference
• Decrement size and return item
• Fourth step is fairly complex: Can be factored out into a helper function, which takes node to be deleted as a parameter (node containing item to be removed is referred to as the top node):
  – Search top node’s left subtree for node containing the largest item (rightmost node of the subtree)
  – Replace top node’s value with the item
  – If top node’s left child contained the largest item, set top node’s left child to its left child’s left child
  – Otherwise, set parent node’s right child to that right child’s left child

Complexity Analysis of Binary Search Trees
• BSTs are set up with intent of replicating O(log n) behavior for the binary search of a sorted list
• A BST can also provide fast insertions
• Optimal behavior depends on height of tree
  – A perfectly balanced tree supports logarithmic searches
  – Worst case (items are inserted in sorted order): tree’s height is linear, as is its search behavior
• Insertions in random order result in a tree with close-to-optimal search behavior

Case Study: Parsing and Expression Trees
• Request:
  – Write a program that uses an expression tree to evaluate expressions or convert them to alternative forms
• Analysis:
  – Like the parser developed in Chapter 17, current program parses an input expression and prints syntax error messages if errors occur
  – If expression is syntactically correct, program prints its value and its prefix, infix, and postfix representations
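
To make the analysis concrete, here is a rough sketch of how an expression tree can report its value and its prefix, infix, and postfix forms. The class names LeafNode and InteriorNode, the method names, and the OPERATIONS table are assumptions made for illustration; the case study's actual classes may differ, and no parsing is done here.

```python
import operator

# Hypothetical operator table for the four basic arithmetic operators.
OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


class LeafNode:
    """A leaf holding a numeric operand."""

    def __init__(self, number):
        self.number = number

    def value(self):
        return self.number

    def prefix(self):
        return str(self.number)

    # A lone operand looks the same in all three notations.
    infix = postfix = prefix


class InteriorNode:
    """An operator together with its left and right operand subtrees."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def value(self):
        return OPERATIONS[self.op](self.left.value(), self.right.value())

    def prefix(self):
        return f"{self.op} {self.left.prefix()} {self.right.prefix()}"

    def infix(self):
        # Fully parenthesized, so no precedence rules are needed.
        return f"({self.left.infix()} {self.op} {self.right.infix()})"

    def postfix(self):
        return f"{self.left.postfix()} {self.right.postfix()} {self.op}"


# Tree for 4 + 5 * 2: the higher-precedence * sits lower in the tree.
tree = InteriorNode("+", LeafNode(4),
                    InteriorNode("*", LeafNode(5), LeafNode(2)))
print(tree.value())    # 14
print(tree.prefix())   # + 4 * 5 2
print(tree.infix())    # (4 + (5 * 2))
print(tree.postfix())  # 4 5 2 * +
```

The three string methods are the preorder, inorder, and postorder traversals of the same tree.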
• Design and Implementation of the Node Classes:
• Design and Implementation of the Parser Class:
  – Easiest to build an expression tree with a parser that uses a recursive descent strategy
    • Borrow parser from Chapter 17 and modify it
  – parse should now return an expression tree to its caller, which uses that tree to obtain information about the expression
  – factor processes either a number or an expression nested in parentheses
    • Calls expression to parse nested expressions

An Array Implementation of Binary Trees
• An array-based implementation of a binary tree is difficult to define and practical only in some cases
• For complete binary trees, there is an elegant and efficient array-based representation
  – Elements are stored by level
• The array representation of a binary tree is pretty rare and is used mainly to implement a heap
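
As an illustration of the level-by-level layout, the sketch below stores a min-heap in a Python list. The index formulas are the usual convention for zero-based arrays and are an assumption here, since the slides' own figures are not reproduced in the text; the class name ArrayHeap and its methods are hypothetical.

```python
def parent(i):
    return (i - 1) // 2


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


class ArrayHeap:
    """Min-heap whose items are stored level by level in a list."""

    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)

    def add(self, item):
        """Append at the bottom, then walk up while smaller than the parent."""
        items = self._items
        items.append(item)
        i = len(items) - 1
        while i > 0 and items[i] < items[parent(i)]:
            items[i], items[parent(i)] = items[parent(i)], items[i]
            i = parent(i)

    def pop(self):
        """Remove and return the smallest item (the root)."""
        items = self._items
        if not items:
            raise IndexError("pop from an empty heap")
        top = items[0]
        last = items.pop()
        if items:
            # Move the last item to the root, then walk it down.
            items[0] = last
            i = 0
            while True:
                smallest = i
                for child in (left(i), right(i)):
                    if child < len(items) and items[child] < items[smallest]:
                        smallest = child
                if smallest == i:
                    break
                items[i], items[smallest] = items[smallest], items[i]
                i = smallest
        return top


heap = ArrayHeap()
for item in [5, 3, 8, 1, 9, 2]:
    heap.add(item)
print([heap.pop() for _ in range(len(heap))])  # [1, 2, 3, 5, 8, 9]
```

Because a complete tree has no gaps, the list needs no child references at all; position alone encodes the tree's shape.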

Implementing Heaps
• At most, log2 n comparisons must be made to walk up the tree from the bottom, so add is O(log n)
• Method may trigger a doubling in the array size
  – O(n), but amortized over all additions, it is O(1)

Using a Heap to Implement a Priority Queue
• In Chapter 15, we implemented a priority queue with a sorted linked list; alternatively, we can use a heap

Summary
• Trees are hierarchical collections
  – The topmost node in a tree is called its root
  – In a general tree, each node below the root has exactly one parent node, and zero or more child nodes
  – Nodes without children are called leaves
  – Nodes that have children are called interior nodes
  – The root of a tree is at level 0
• In a binary tree, nodes have at most two children
  – A complete binary tree fills each level of nodes before moving to next level; a full binary tree includes all the possible nodes at each level
• Four standard types of tree traversals: Preorder, inorder, postorder, and level order
• Expression tree: Type of binary tree in which the interior nodes contain operators and the successor nodes contain their operands
• Binary search tree: Nonempty left subtree has data < datum in its parent node and a nonempty right subtree has data > datum in its parent node
  – Logarithmic searches/insertions if close to complete
• Heap: Binary tree in which smaller data items are located near root (a min-heap; a max-heap places larger items near the root)

Reading from course text:
Questions?

Devon M. Simmonds
Computer Science Department
University of North Carolina Wilmington