Chapter 5 Trees: Outline Introduction Representation Of Trees Binary Trees Binary Tree Traversals Additional Binary Tree Operations Threaded Binary Trees Heaps Binary Search Trees Selection Trees Forests Introduction (1/8) A tree structure means that the data are organized so that items of information are related by branches Examples: Introduction (2/8) Definition (recursively): A tree is a finite set of one or more nodes such that There is a specially designated node called root. The remaining nodes are partitioned into n>=0 disjoint set T1,…,Tn, where each of these sets is a tree. T1,…,Tn are called the subtrees of the root. Every node in the tree is the root of some subtree Introduction (3/8) Some Terminology node: the item of information plus the branches to each node. degree: the number of subtrees of a node degree of a tree: the maximum of the degree of the nodes in the tree. terminal nodes (or leaf): nodes that have degree zero nonterminal nodes: nodes that don’t belong to terminal nodes. children: the roots of the subtrees of a node X are the children of X parent: X is the parent of its children. Introduction (4/8) Some Terminology (cont’d) siblings: children of the same parent are said to be siblings. Ancestors of a node: all the nodes along the path from the root to that node. The level of a node: defined by letting the root be at level one. If a node is at level l, then it children are at level l+1. Height (or depth): the maximum level of any node in the tree Example Introduction (5/8) A is the root node Property: (# edges) = (#nodes) - 1 B is the parent of D and E C is the sibling of B D and E are the children of B D, E, F, G, I are external nodes, or leaves A, B, C, H are internal nodes Level The level of E is 3 The height (depth) of the tree is 4 A 1 The degree of node B is 2 The degree of the tree is 3 B C The ancestors of node I is A, C, H 2 The descendants of node C is F, G, H, I H D E F G I 3 4 Introduction (6/8) Representation Of Trees List Representation we can write of Figure 5.2 as a list in which each of the subtrees is also a list ( A ( B ( E ( K, L ), F ), C ( G ), D ( H ( M ), I, J ) ) ) The root comes first, followed by a list of sub-trees Introduction (7/8) Representation Of Trees (cont’d) Left ChildRight Sibling Representation Introduction (8/8) Representation Of Trees (cont’d) Representation As A Degree Two Tree Binary Trees (1/9) Binary trees are characterized by the fact that any node can have at most two branches Definition (recursive): A binary tree is a finite set of nodes that is either empty or consists of a root and two disjoint binary trees called the left subtree and the right subtree Thus the left subtree and the right subtree are A distinguished B A B Any tree can be transformed into binary tree by left child-right sibling representation Binary Trees (2/9) The abstract data type of binary tree Binary Trees (3/9) Two special kinds of binary trees: (a) skewed tree, (b) complete binary tree The all leaf nodes of these trees are on two adjacent levels Binary Trees (4/9) Properties of binary trees Lemma 5.1 [Maximum number of nodes]: 1. The maximum number of nodes on level i of a binary tree is 2i-1, i 1. 2. The maximum number of nodes in a binary tree of depth k is 2k-1, k1. Lemma 5.2 [Relation between number of leaf nodes and degree-2 nodes]: For any nonempty binary tree, T, if n0 is the number of leaf nodes and n2 is the number of nodes of degree 2, then n0 = n2 + 1. These lemmas allow us to define full and complete binary trees Binary Trees (5/9) Definition: A full binary tree of depth k is a binary tree of death k having 2k-1 nodes, k 0. A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes numbered from 1 to n in the full binary tree of depth k. From Lemma 5.1, the height of a complete binary tree with n nodes is log2(n+1) Binary Trees (6/9) Binary tree representations (using array) Lemma 5.3: If a complete binary tree with n nodes is represented sequentially, then for any node with index i, 1 i n, we have 1. parent(i) is at i /2 if i 1. If i = 1, i is at the root and has no parent. 2. LeftChild(i) is at 2i if 2i n. If 2i n, then i has no left child. A 1 3. RightChild(i) is at 2i+1 if 2i+1 n. If 2i +1 n, then i has no left child B 2 [1] [2] [3] [4] [5] [6] [7] C 3 A B C — D — E D Level 1 Level 2 Level 3 4 5 E 6 7 Binary Trees (7/9) Binary tree representations (using array) Waste spaces: in the worst case, a skewed tree of depth k requires 2k-1 spaces. Of these, only k spaces will be occupied Insertion or deletion of nodes from the middle of a tree requires the movement of potentially many nodes to reflect the change in the level of these nodes Binary Trees (8/9) Binary tree representations (using link) Binary Trees (9/9) Binary tree representations (using link) Binary Tree Traversals (1/9) How to traverse a tree or visit each node in the tree exactly once? There are six possible combinations of traversal LVR, LRV, VLR, VRL, RVL, RLV Adopt convention that we traverse left before right, only 3 traversals remain LVR (inorder), LRV (postorder), VLR (preorder) left_child L: moving left data right_child V : visiting node R: moving right Binary Tree Traversals (2/9) Arithmetic Expression using binary tree inorder traversal (infix expression) A/B*C*D+E preorder traversal (prefix expression) +**/ABCDE postorder traversal (postfix expression) AB/C*D*E+ level order traversal +*E*D/CAB Binary Tree Traversals (3/9) Inorder traversal (LVR) (recursive version) output: A / B * C * D + E ptr L V R Binary Tree Traversals (4/9) Preorder traversal (VLR) (recursive version) output: + * * / A B C D E V L R Binary Tree Traversals (5/9) Postorder traversal (LRV) (recursive version) output: A B / C * D * E + L R V Binary Tree Traversals (6/9) Iterative inorder traversal we use a stack to simulate recursion 5 4 11 8 3 14 2 17 1 A B / *C D * E + L V R output: A / B*C * D + E node Binary Tree Traversals (7/9) Analysis of inorder2 (Non-recursive Inorder traversal) Let n be the number of nodes in the tree Time complexity: O(n) Every node of the tree is placed on and removed from the stack exactly once Space complexity: O(n) equal to the depth of the tree which (skewed tree is the worst case) Binary Tree Traversals (8/9) Level-order traversal method: We visit the root first, then the root’s left child, followed by the root’s right child. We continue in this manner, visiting the nodes at each new level from the leftmost node to the rightmost nodes This traversal requires a queue to implement Binary Tree Traversals (9/9) Level-order traversal (using queue) output: + * E * D / C A B 1 17 3 14 4 11 5 2 *+ E * D / 8 C A B FIFO ptr Additional Binary Tree Operations (1/7) Copying Binary Trees we can modify the postorder traversal algorithm only slightly to copy the binary tree similar as Program 5.3 Additional Binary Tree Operations (2/7) Testing Equality Binary trees are equivalent if they have the same topology and the information in corresponding nodes is identical V L R the same topology and data as Program 5.6 Additional Binary Tree Operations (3/7) Variables: x1, x2, …, xn can hold only of two possible values, true or false Operators: (and), (or), ¬(not) Propositional Calculus Expression A variable is an expression If x and y are expressions, then ¬x, xy, xy are expressions Parentheses can be used to alter the normal order of evaluation (¬ > > ) Example: x1 (x2 ¬x3) Additional Binary Tree Operations (4/7) Satisfiability problem: Is there an assignment to make an expression true? Solution for the Example x1 (x2 ¬x3) : If x1 and x3 are false and x2 is true false (true ¬false) = false true = true For n value of an expression, there are 2n possible combinations of true and false Additional Binary Tree Operations (5/7) (x1 ¬x2) (¬ x1 x3) ¬x3 postorder traversal data value X1 X3 X2 X3 X1 Additional Binary Tree Operations (6/7) node structure For the purpose of our evaluation algorithm, we assume each node has four fields: We define this node structure in C as: Additional Binary Tree Operations (7/7) Satisfiability function To evaluate the tree is easily obtained by modifying the original recursive postorder traversal TRUE node TRUE FALSE FALSE TRUE TRUE T TRUE TRUE FALSE F F FALSE FALSE T TRUE T L R V Threaded Binary Trees (1/10) Threads Do you find any drawback of the above tree? Too many null pointers in current representation of binary trees n: number of nodes number of non-null links: n-1 total links: 2n null links: 2n-(n-1) = n+1 Solution: replace these null pointers with some useful “threads” Threaded Binary Trees (2/10) Rules for constructing the threads If ptr->left_child is null, replace it with a pointer to the node that would be visited before ptr in an inorder traversal If ptr->right_child is null, replace it with a pointer to the node that would be visited after ptr in an inorder traversal Threaded Binary Trees (3/10) A Threaded Binary Tree t: true thread f: false child root A dangling f B f dangling C t E t D F G inorder traversal: H I H D I B E A F C G Threaded Binary Trees (4/10) Two additional fields of the node structure, left-thread and right-thread If ptr->left-thread=TRUE, then ptr->left-child contains a thread; Otherwise it contains a pointer to the left child. Similarly for the right-thread Threaded Binary Trees (5/10) If we don’t want the left pointer of H and the right pointer of G to be dangling pointers, we may create root node and assign them pointing to the root node Threaded Binary Trees (6/10) Inorder traversal of a threaded binary tree By using of threads we can perform an inorder traversal without making use of a stack (simplifying the task) Now, we can follow the thread of any node, ptr, to the “next” node of inorder traversal 1. If ptr->right_thread = TRUE, the inorder successor of ptr is ptr->right_child by definition of the threads 2. Otherwise we obtain the inorder successor of ptr by following a path of left-child links from the right-child of ptr until we reach a node with left_thread = TRUE Threaded Binary Trees (7/10) Finding the inorder successor (next node) of a node threaded_pointer insucc(threaded_pointer tree){ threaded_pointer temp; temp = tree->right_child; if (!tree->right_thread) while (!temp->left_thread) temp = temp->left_child; return temp; } Inorder tree temp Threaded Binary Trees (8/10) Inorder traversal of a threaded binary tree void tinorder(threaded_pointer tree){ /* traverse the threaded binary tree inorder */ threaded_pointer temp = tree; output: H D I B E A FC G for (;;) { temp = insucc(temp); if (temp==tree) tree break; printf(“%3c”,temp->data); } } Time Complexity: O(n) Threaded Binary Trees (9/10) Inserting A Node Into A Threaded Binary Tree Insert child as the right child of node parent 1. change parent->right_thread to FALSE 2. set child->left_thread and child->right_thread to TRUE 3. set child->left_child to point to parent 4. set child->right_child to parent->right_child 5. change parent->right_child to point to child Threaded Binary Trees (10/10) Right insertion in a threaded binary tree void insert_right(thread_pointer parent, threaded_pointer child){ /* insert child as the right child of parent in a threaded binary tree */ threaded_pointer temp; root child->right_child = parent->right_child; parent child->right_thread = parent->right_thread; A child->left_child = parent; B child->left_thread = TRUE; X C child parent->right_child = child; parent->right_thread = FALSE; temp If(!child->right_thread){ parent A temp = insucc(child); child B temp->left_child = child; X } C } D First Case Second Case E successor F Heaps (1/6) The heap abstract data type Definition: A max(min) tree is a tree in which the key value in each node is no smaller (larger) than the key values in its children. A max (min) heap is a complete binary tree that is also a max (min) tree Basic Operations: creation of an empty heap insertion of a new elemrnt into a heap deletion of the largest element from the heap Heaps (2/6) The examples of max heaps and min heaps Property: The root of max heap (min heap) contains the largest (smallest) element Heaps (3/6) Abstract data type of Max Heap Heaps (4/6) Queue in Chapter 3: FIFO Priority queues Heaps are frequently used to implement priority queues delete the element with highest (lowest) priority insert the element with arbitrary priority Heaps is the only way to implement priority queue machine service: amount of time (min heap) amount of payment (max heap) factory: time tag Heaps (5/6) Insertion Into A Max Heap Analysis of insert_max_heap The complexity of the insertion function is O(log2 n) insert 2 51 *n= 6 5 i= 1 6 7 3 [1] 20 21 [2] 15 [4] parent sink item upheap [3] [5] 14 10 20 52 [6] [7] 2 5 Deletion from a max heap Heaps (6/6) After deletion, the heap is still a complete binary tree Analysis of delete_max_heap The complexity of the insertion function is O(log2 n) parent = 4 1 2 child = 8 2 [1] 4 [2] < 15 20 [3] 15 14 [4] *n= 5 4 [5] 14 10 10 2 item.key = 20 temp.key = 10 Binary Search Trees (1/8) Why do binary search trees need? Heap is not suited for applications in which arbitrary elements are to be deleted from the element list a min (max) element is deleted O(log2n) deletion of an arbitrary element O(n) search for an arbitrary element O(n) Definition of binary search tree: Every element has a unique key The keys in a nonempty left subtree (right subtree) are smaller (larger) than the key in the root of subtree The left and right subtrees are also binary search trees Binary Search Trees (2/8) Example: (b) and (c) are binary search trees medium smaller larger Binary Search Trees (3/8) Search: Search(25) Search(76) 44 17 88 65 32 28 97 54 29 82 76 80 Binary Search Trees (4/8) Searching a binary search tree O(h) Binary Search Trees (5/8) Inserting into a binary search tree An empty tree Binary Search Trees (6/8) Deletion from a binary search tree Three cases should be considered case 1. leaf delete case 2. one child delete and change the pointer to this child case 3. two child either the smallest element in the right subtree or the largest element in the left subtree Binary Search Trees (7/8) Height of a binary search tree The height of a binary search tree with n elements can become as large as n. It can be shown that when insertions and deletions are made at random, the height of the binary search tree is O(log2n) on the average. Search trees with a worst-case height of O(log2n) are called balance search trees Binary Search Trees (8/8) Time Complexity Searching, insertion, removal O(h), where h is the height of the tree Worst case - skewed binary tree O(n), where n is the # of internal nodes Prevent worst case rebalancing scheme AVL, 2-3, and Red-black tree Selection Trees (1/6) Problem: suppose we have k order sequences, called runs, that are to be merged into a single ordered sequence Solution: straightforward : k-1 comparison selection tree : log2k+1 There are two kinds of selection trees: winner trees and loser trees Selection Trees (2/6) Definition: (Winner tree) a selection tree is the binary tree where each node represents the smaller of its two children root node is the smallest node in the tree a winner is the record with smaller key Rules: tournament : between sibling nodes put X in the parent node X tree where X = winner or loser Winner Tree Selection Trees (3/6) sequential allocation scheme (complete binary tree) Each node represents the smaller of its two children ordered sequence Selection Trees (4/6) Analysis of merging runs using winner trees # of levels: log2K +1 restructure time: O(log2K) merge time: O(nlog2K) setup time: O(K) merge time: O(nlog2K) Slight modification: tree of loser consider the parent node only (vs. sibling nodes) Selection Trees (5/6) After one record has been output 6 6 6 6 15 Selection Trees (6/6) Tree of losers can be conducted by Winner tree 0 6 8 9 10 17 20 9 90 Forests (1/4) Definition: A forest is a set of n 0 disjoint trees Transforming a forest into a binary tree Definition: If T1,…,Tn is a forest of trees, then the binary tree corresponding to this forest, denoted by B(T1,…,Tn ): is empty, if n = 0 has root equal to root(T1); has left subtree equal to B(T11,T12,…,T1m); and has right subtree equal to B(T2,T3,…,Tn) where T11,T12,…,T1m are the subtrees of root (T1) Forests (2/4) Rotate the tree clockwise by 45 degrees A Leftmost child A B C G E D Right sibling F H B I E F C D G H I Forest traversals Forests (3/4) Forest preorder traversal (1)If F is empty, then return. (2)Visit the root of the first tree of F. (3)Traverse the subtrees of the first tree in tree preorder. (4)Traverse the remaining tree of F in preorder. Forest inorder traversal (1)If F is empty, then return (2)Traverse the subtrees of the first tree in tree inorder (3)Visit the root of the first tree of F (4)Traverse the remaining tree of F in inorder Forest postorder traversal (1)If F is empty, then return (2)Traverse the subtrees of the first tree in tree postorder (3)Traverse the remaining tree of F in postorder (4)Visit the root of the first tree of F preorder: A B C D E F G H I inorder: B C A E D G H F I A E, D, G, H, F, I B, C preorder: A B C (D E F G H I) inorder: B C A (E D G H F I) A D B C E F, G, H, I Forests (4/4) Set Representation(1/13) S1={0, 6, 7, 8}, S2={1, 4, 9}, S3={2, 3, 5} 6 7 2 4 0 8 1 Si Sj = 9 3 5 Two operations considered here Disjoint set union S1 S2={0,6,7,8,1,4,9} Find(i): Find the set containing the element i. 3 S3, 8 S1 Set Representation(2/13) Union and Find Operations Make one of trees a subtree of the other 0 6 7 8 0 4 1 9 6 7 1 8 Possible representation for S1 union S2 4 9 Set Representation(3/13) set name pointer S1 0 6 7 S2 8 4 S3 1 9 2 3 5 *Figure 5.41:Data Representation of S1S2and S3 (p.240) Set Representation(4/13) Array Representation for Set i [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] parent -4 4 -3 2 -3 2 0 0 0 4 int find1(int i) { for(; parent[i] >= 0; i = parent[i]); return i; } void union1(int i, int j) { parent[i] = j; } Program 5.18: Initial attempt at union-find function (p.241) Set Representation(5/13) union operation O(n) n-1 n-1 find operation O(n2) n n-2 i i 2 union(0,1), find(0) union(1,2), find(0) . . . union(n-2,n-1),find(0) 0 degenerate tree *Figure 5.43:Degenerate tree (p.242) Set Representation(6/13) weighting rule for union(i,j): if # of nodes in i < # in j then j the parent of i Set Representation(7/13) Modified Union Operation void union2(int i, int j) Keep a count in the root of tree { int temp = parent[i] + parent[j]; if (parent[i] > parent[j]) { parent[i] = j; /*make j the new root*/ parent[j] = temp; } else { parent[j] = i;/* make i the new root*/ parent[i] = temp; } If the number of nodes in tree i is } less than the number in tree j, then make j the parent of i; otherwise make i the parent of j. Set Representation(8/13) Figure 5.45:Trees achieving worst case bound (p.245) log28+1 Set Representation(9/13) The definition of Ackermann’s function used here is : A ( p, q) = P=0 2 q 0 q=0 and p >= 1 P>=1 and p = 1 0 A( p 1, A( p, q 1)) p>=1 and q >= 2 Set Representation(10/13) Modified Find(i) Operation Int find2(int i) { int root, trail, lead; for (root=i;parent[root]>=0;oot=parent[root]) ; for (trail=i; trail!=root; trail=lead) { lead = parent[trail]; parent[trail]= root; } If j is a node on the path from return root; i to its root then make j a child } of the root Set Representation(11/13) 0 0 1 4 2 3 5 1 6 2 4 3 5 6 7 7 find(7) find(7) find(7) find(7) find(7) find(7) find(7) find(7) go up reset 3 1 1 2 12 moves (vs. 24 moves) 1 1 1 1 1 Set Representation(12/13) Applications Find equivalence class i j Find Si and Sj such that i Si and j Sj (two finds) Si = Sj do nothing Si Sj union(Si , Sj) example 0 4, 3 1, 6 10, 8 9, 7 4, 6 8, 3 5, 2 11, 11 0 {0, 2, 4, 7, 11}, {1, 3, 5}, {6, 8, 9, 10} Set Representation(13/13) Counting Binary trees(1/10) Distinct Binary Trees : If n=0 or n=1, there is only one binary tree. If n=2 and n=3, Counting Binary trees(2/10) Stack Permutations preorder: inorder: A ABCDEF GHI BCAED GH FI A D, E, F, G, H, I B, C A C D, E, F, G, H, I B C D B E F G I H Counting Binary trees(3/10) Figure5.49(c) with the node numbering of Figure 5.50.Its preorder permutation is 1,2…,9, and its inorder 1 permutation is 2,3,1,5,4, 4 7,8,6,9. 2 3 5 6 7 9 8 Counting Binary trees(4/10) If we start with the numbers1,2,3, then the possible permutations obtainable by a stack are: (1,2,3) (1,3,2) (2,1,3) (2,3,1) (3,2,1) Obtaining(3,1,2) is impossible. 1 1 1 2 2 3 3 1 1 2 2 2 3 3 3 Counting Binary trees(5/10) Matrix Multiplication Suppose that we wish to compute the product of n matrices: M1 * M2 . . .* Mn Since matrix multiplication is associative, we can perform these multiplications in any order. We would like to know how many different ways we can perform these multiplications . For example, If n =3, there are two possibilities: (M1*M2)*M3 M1*(M2*M3) Counting Binary trees(6/10) Let bn be the number of different ways to compute the product of n matrices. Then b2 1, , b3 2 and b4 5 . Let M ij , i j be the product M i * M i1 * ... * M j . . The product we wish to compute is M ln by computing any one of the products M 1i * M i 1,n ,1 i n. The number of distinct ways to obtain M 1i andM i 1,n are bi and bni ,respectively. Therefore, letting bi =1, we have: n 1 bn b b i 0 i n i ,n 1 Counting Binary trees(7/10) Now instead let bn be the number of distinct binary trees with n nodes. Again an expression for bn in terms of n is what we want. Than we see that bn is the sum of all the possible binary trees formed in the following way: a root and two subtrees with bi and bn i 1 nodes, for 0 i n . This explanation says that bn n 1 b b i 0 i n i 1 ,n 1 and b0 1 bn bi bn-i-1 Counting Binary trees(8/10) Number of Distinct Binary Trees: To obtain number of distinct binary trees with n nodes, we must solve the recurrence of Eq.(5.5).To i B ( x ) b x begin we let: (5.6) i0 i Which is the generating function for the number of binary trees. Next observe that by the recurrence relation we get the identity: xB2 ( x ) b( x ) 1 Using the formula to solve quadratics and the fact (Eq. (5.5)) that B(0) = b0 = 1 ,we get 1 1 4x B( x ) 2x Counting Binary trees(9/10) Number of Distinct Binary Trees: We can use the binomial theorem to expand 1 4 x 1 / 2 to obtain: n 1 / 2 1 B( x ) 1 ( 4 x ) 2 n 0 n 1/ 2 m 2 m 1 m ( 1 ) 2 x m0 m 1 (5.7) Counting Binary trees(10/10) Number of Distinct Binary Trees: Comparing Eqs.(5.6) and (5.7) we see that bn , which is the coeffcient of x n in B(x), is : 1 / 2 1n 2 2n1 n 1 Some simplification yields the more compact form 1 2n bn n 1 n which is approximately bn O(4 n n 3/ 2 )