Lesson 10: Tree data structures

10.1. Introduction

A tree is a non-linear data structure that organizes data hierarchically in nodes. A tree consists of a collection of nodes connected by directed (or undirected) edges. A conceptual picture of a tree is shown below:

[Figure 1: a tree with root A. A has children E and F; F has children K and G; G has children H and L. E, K, H and L are leaves.]

Features of tree ADT
i. One node is distinguished as the root.
ii. Every node (excluding the root) is connected by a directed edge from exactly one other node.

10.2. Lesson objectives

By the end of this lesson, the student should be able to:
[i]. Explain applications of trees.
[ii]. Discuss operations that can be applied on a tree.
[iii]. Describe various types of trees: binary trees, binary search trees and heaps.

10.3. Lesson outline

This lesson is organized as follows:
10.1. Introduction
10.2. Lesson objectives
10.3. Lesson outline
10.4. Terminologies
10.5. Types of trees
10.6. Binary trees
10.7. Binary search trees
10.8. Heaps
10.9. Revision questions
10.10. Summary
10.11. Suggested reading

10.4. Terminologies

Root: The first (topmost) node in a tree. For example, node A in figure 1.

Edges: The directed lines from one node to the next.

Leaves: The bottom nodes, which have no outgoing edges. For example, nodes E, H, L and K in figure 1.

Path: A sequence of zero or more connected nodes. For example:
i. AE
ii. FGL
iii. AFGH
iv. AFK
v. AFGL

Length: The number of nodes in a path. Note that this lesson counts nodes, not edges. For example:

Path    Length
AE      2
FGL     3
AFGH    4
AFK     3
AFGL    4
FG      2
G       1

Height: The length of the longest path from the root to a leaf. For example, using the table above, the height of the tree in figure 1 is 4.

Depth: The depth of a node is the length of the path from the root to that node. For example, using figure 1:

Node    Depth
A       1
E       2
F       2
G       3
H       4
L       4
K       3

Parent: A node that has an outgoing edge. For example, in figure 1, A, F and G are parents.

Child: A node that has an incoming edge. For example, in figure 1, E, F, G, H, L and K are children.

Sub-tree: A child of a given parent node together with all of its descendants.

Implementation of tree data structures
Tree data structures can be implemented using:
[i]. Class data structures
[ii]. Arrays
[iii]. Linked lists

Applications of tree ADT
[i]. Representation of family genealogies
[ii]. Decision-making algorithms
[iii]. Fast access to information in a database through indexes
[iv]. Compiler design

10.5. Types of trees

a) General tree: A tree where each node may have zero or more children. General trees are used to model applications such as file systems.
b) Binary tree: A tree in which each node can have a maximum of 2 children. The two children are distinguished as the left child and the right child, so the order in which data items are stored matters.

Building a tree ADT

A tree data structure can be declared as follows:

    struct TreeNode {
        int item;          // The data in this node.
        TreeNode *left;    // Pointer to the left subtree.
        TreeNode *right;   // Pointer to the right subtree.
    };

Operations on tree ADT
[i]. Enumerate: lists all the items in the tree.
[ii]. Search: searches for an item or value in a node.
[iii]. Add: adds a new item at a certain position in the tree.
[iv]. Delete: deletes an item or node.
[v]. Prune: removes a whole section of a tree.
[vi]. Graft: adds a whole section to a tree.
[vii]. Traverse: iterates through the tree nodes, for example to display their contents.

Advantages of trees
[i]. Quick search
[ii]. Quick inserts
[iii]. Quick deletes

Disadvantages of trees (when the tree is stored linearly in an array)
[i]. Additions and deletions of nodes are inefficient because of the data movements in the array.
[ii]. Space is wasted if the binary tree is not complete. That is, the linear representation is useful only if the number of missing nodes is small.

10.6. Binary trees

A binary tree is a tree in which each parent node can have a maximum of two child nodes. Each node has two pointers: a left pointer and a right pointer.

Implementations of binary trees

Binary trees can be implemented using:
[a]. Class data structures
[b]. Arrays
[c]. Linked lists
[d]. Structures

Operations on binary trees

Operations performed on binary trees include:
[a]. Traverse
[b]. Search
[c]. Delete
[d]. Prune
[e]. Graft

Types of binary trees

This lesson considers three types of binary trees:
[a]. Binary search trees
[b]. Expression trees
[c]. Heaps

Binary tree traversals

It is often useful to iterate through the nodes in a tree:
[i]. to print all values
[ii]. to determine if there is a node with some property
[iii]. to make a copy

There are many different orders in which we might visit the nodes. The three common traversal orders for binary trees are pre-order, post-order and in-order, all described below.

[a]. Pre-order traversal
In this method we visit the root first, then traverse the left sub-tree and finally the right sub-tree. The algorithm is:
[i]. Visit the root.
[ii]. Traverse the left sub-tree in pre-order.
[iii]. Traverse the right sub-tree in pre-order.
Summary: Root->Left->Right

[b]. Post-order traversal
In this method we traverse the left sub-tree, then the right sub-tree, and finally visit the root. The algorithm is:
[i]. Traverse the left sub-tree in post-order.
[ii]. Traverse the right sub-tree in post-order.
[iii]. Visit the root.
Summary: Left->Right->Root

[c]. In-order traversal
An in-order traversal involves visiting the root "in between" visiting its left and right sub-trees. Therefore, an in-order traversal only makes sense for binary trees. The (recursive) definition is:
[i]. Traverse the left sub-tree in in-order.
[ii]. Visit the root.
[iii]. Traverse the right sub-tree in in-order.
Summary: Left->Root->Right

Example: Use in-order, pre-order and post-order traversals to print out the values in the tree below.

[Figure: a binary tree with root 2. 2 has left child 6 and right child 9; 6 has right child 7; 9 has left child 8.]

Answer
In-order: 6, 7, 2, 8, 9
Pre-order: 2, 6, 7, 9, 8
Post-order: 7, 6, 8, 9, 2

10.7. Binary search trees

A binary search tree (BST) is a binary tree in which each node has a comparable key and satisfies the restriction that the key in any node is larger than the keys in all nodes in that node's left sub-tree and smaller than the keys in all nodes in that node's right sub-tree.

Applications of binary search trees

Binary search trees can be used to implement a dictionary data structure for lookup of values. A BST as defined here does not allow duplicate keys.

Implementations of binary search trees

A binary search tree can be implemented using:
[a]. Class data structures
[b]. Structures
[c]. Arrays
[d]. Linked lists

Operations on binary search trees

Operations performed on a BST include:
[a]. Traverse
[b]. Search
[c]. Delete
[d]. Graft
[e]. Prune

Construction of a binary search tree

Algorithm for creating a binary search tree:
Step 1: Read the first element and store it in the first node (the root).
Step 2: Read the next value and compare it with the root value.
Step 3: If the next value is a match, return the message "duplicate".
Step 4: If it is smaller, examine the left sub-tree.
    If the sub-tree is empty, store the value there.
    If the sub-tree is non-empty, compare the value with the root of that sub-tree and repeat steps 3 to 5 on that sub-tree.
Step 5: If it is larger, examine the right sub-tree.
    If the sub-tree is empty, store the value there.
    If the sub-tree is non-empty, compare the value with the root of that sub-tree and repeat steps 3 to 5 on that sub-tree.

Example: Use the algorithm above to construct a BST to store 3, 6, 2, 9, 5.

Answer
3 is the root. 2 is the left child of 3 and 6 is the right child of 3. 5 is the left child of 6 and 9 is the right child of 6.

10.8. Heap data structure

A heap is a binary tree which is complete and satisfies the heap order property.

Complete property: Each level is completely filled except possibly the bottom level. The bottom level must be filled from left to right.

Heap order property: The data stored in each node is greater than or equal to the data stored in its children. This describes a maximum heap; in a minimum heap, each node is less than or equal to its children.

Operations on heaps
[i]. Creation of an empty heap
[ii]. Insertion of a new element into the heap
[iii]. Deletion of the largest (smallest) element from the heap
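The three heap operations listed above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name MaxHeap, its member names, the choice of int elements, and the 0-based std::vector storage (children of index i at 2i+1 and 2i+2) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical array-backed maximum heap of ints (0-based indexing assumed).
class MaxHeap {
public:
    // Creation: a default-constructed heap is empty.
    MaxHeap() = default;

    bool empty() const { return data_.empty(); }
    std::size_t size() const { return data_.size(); }

    // Insertion: place the value in the next free slot of the bottom
    // level, then move it up while it is larger than its parent.
    void insert(int value) {
        data_.push_back(value);
        std::size_t i = data_.size() - 1;
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (data_[parent] >= data_[i]) break;  // heap order holds
            std::swap(data_[parent], data_[i]);
            i = parent;
        }
    }

    // Deletion of the largest element: take the root, move the last
    // element into the root position, then sift it down.
    int removeMax() {
        assert(!data_.empty());  // precondition: heap must not be empty
        int top = data_.front();
        data_.front() = data_.back();
        data_.pop_back();
        siftDown(0);
        return top;
    }

private:
    // Swap the node at index i with its larger child until neither
    // child is larger than it.
    void siftDown(std::size_t i) {
        const std::size_t n = data_.size();
        while (true) {
            std::size_t left = 2 * i + 1;
            std::size_t right = 2 * i + 2;
            std::size_t largest = i;
            if (left < n && data_[left] > data_[largest]) largest = left;
            if (right < n && data_[right] > data_[largest]) largest = right;
            if (largest == i) break;
            std::swap(data_[i], data_[largest]);
            i = largest;
        }
    }

    std::vector<int> data_;
};
```

For example, after inserting 7, 12, 4, 10, 9, 6 and 5 into an empty MaxHeap, three successive calls to removeMax() return 12, 10 and 9. Both insert and removeMax follow a single root-to-leaf path, so each takes time proportional to the height of the heap.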
Applications of heaps

The heap data structure can be applied in heap sort. Heap sort is a sorting method that repeatedly removes the largest element from the heap and places it in its final position in the sorted order.

Implementation of heaps
[a]. A heap can be implemented using linked lists, where a node has pointer(s) to the next node(s).
[b]. A heap can also be implemented using arrays. In this case the nodes in the heap are numbered from top to bottom, numbering the nodes on each level from left to right, and the ith node is stored in the ith location of the array.

Implementation using arrays

We need to know the length (the number of elements in the array) and the heap size (the number of heap elements in the array). After numbering the nodes, we assign the indices and then store the data:

Index:  [1]  [2]  [3]  [4]  [5]  [6]  [7]
Value:  12   10   9    7    6    4    5

With this numbering, the children of the node at index i are at indices 2i and 2i+1. Here 12 has children 10 and 9, 10 has children 7 and 6, and 9 has children 4 and 5.

Building a heap

Given an array of n values, we can build a heap to store the values by sifting each internal node down to its proper location.

Procedure
[1]. Start with the last internal node.
[2]. Swap the current internal node with its larger child, if necessary.
[3]. Follow the swapped value down, repeating step [2] until it is no smaller than its children or it reaches a leaf.
[4]. Move to the previous internal node and continue until all internal nodes are done.

Following steps [1] to [4] for every internal node yields a heap.

10.9. Revision questions

[a]. Discuss any three applications of tree data structures.
[b]. Explain the following operations on a tree data structure.
    [i]. Prune();
    [ii]. Graft();
    [iii]. Search();
[c]. Discuss any three operations performed on a binary tree.
[d]. Explain three ways of implementing a binary tree.
[e]. Discuss any two types of binary trees.
[f]. Traverse the tree representing the feeding hierarchy of wild animals using in-order, post-order and pre-order traversal.
[g]. Describe two ways of implementing binary search trees.
[h]. Explain any three operations performed on a BST.
[i]. Construct a BST to store the following values.
    [i]. Chelsea, Liverpool, Manchester United, Wigan, Fulham, Newcastle United.
    [ii]. 23, 54, 33, 12, 20, 19, 22, 28, 31.
    [iii]. C, B, A, E, D
[j]. Explain any three operations performed on a heap ADT.
[k]. Describe one application of the heap data structure.

10.10. Summary

In this lesson, we have learnt that a tree is a non-linear structure that organizes data hierarchically in nodes. Trees are implemented using pointers, structures and classes. Trees are useful in decision-making algorithms, compiler design, database indexes, etc. Tree data structures fall into categories such as general trees and binary trees. Operations performed on trees include grafting, adding, deleting, enumerating, traversing, searching and pruning.

A binary tree is a tree in which each parent node can have a maximum of two child nodes. We showed that one of the most important operations performed on a binary tree is traversal. We have seen that a binary tree can be implemented using classes, structures and pointers. We noted that a binary tree can be classified as a binary search tree or a heap.

A binary search tree is a binary tree in which each node has a comparable key (and an associated value) and satisfies the restriction that the key in any node is larger than the keys in all nodes in that node's left sub-tree and smaller than the keys in all nodes in that node's right sub-tree. It can be implemented using pointers. A BST can be used to implement dictionaries, i.e. lookup. Operations performed on a BST are add, search, remove and get a value.

A heap is a binary tree which is complete and satisfies the heap order property. This data structure can be implemented using linked lists and arrays. Heaps can be classified as maximum and minimum heaps. Some of the operations on heaps include create, insert and delete. Heaps are used in the heap sort algorithm.

10.11. Suggested reading

[1]. Data Structures Using C and C++, 2nd Edition, by Yedidyah Langsam, Moshe J. Augenstein and Aaron M. Tenenbaum. Publisher: Pearson.
[2]. Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia and David Mount. Publisher: Wiley.
[3]. Fundamentals of Data Structures in C++ by Ellis Horowitz, Sartaj Sahni and Dinesh Mehta. Publisher: Galgotia.
[4]. Introduction to Data Structures and Algorithms with C++ by Glenn W. Rowe. Publisher: Prentice Hall.