Unit 6 Trees and Graphs: Basic terminology, binary trees and their representation, insertion and deletion of nodes in a binary tree, binary search tree and its traversal, threaded binary tree, heap, balanced trees. Terminology and representation of graphs using adjacency matrix, Warshall's algorithm.

Basic terminology
A tree is either empty or consists of one node called the root and zero or more subtrees. Every node except the root is connected by a directed edge from exactly one other node, called its parent. Each node can be connected to an arbitrary number of nodes, called its children. Nodes with no children are called leaves, or external nodes. Nodes which are not leaves are called internal nodes. Nodes with the same parent are called siblings.

In the example tree: A is the root; B, C and D are children of A; C is the parent of E, F and G; B, D, E, F and G are leaves; B, C and D are siblings; A and C are internal nodes.

Binary Tree
A binary tree is either empty or consists of a root, a left subtree and a right subtree.
The depth of a node is the number of edges from the root to the node.
The height of a node is the number of edges from the node to the deepest leaf below it.
The height of a tree is the height of its root.
A binary tree in which each node has exactly zero or two children is called a full binary tree.
A complete binary tree is a binary tree that is completely filled, with the possible exception of the bottom level, which is filled from left to right.

Tree Traversals
There are four standard methods of traversing trees:
PreOrder traversal - visit the parent first and then the left and right children;
InOrder traversal - visit the left child, then the parent and then the right child;
PostOrder traversal - visit the left child, then the right child and then the parent;
LevelOrder traversal - visit nodes by levels from top to bottom and from left to right.
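The first three traversals follow the recursive shape of the tree, but a level-order traversal needs a queue of nodes waiting to be visited. The following is a minimal sketch in C, not taken from the notes: the node type `struct tnode`, the function `level_order` and the bound `MAX_NODES` are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 64   /* assumed upper bound on tree size for this sketch */

/* Hypothetical node type used only for this illustration. */
struct tnode {
    int data;
    struct tnode *left;
    struct tnode *right;
};

/* Writes the level-order sequence of the tree into out[] and returns
   the number of nodes visited. The tree must have at most MAX_NODES
   nodes and out[] must have room for that many values. */
size_t level_order(const struct tnode *root, int out[])
{
    const struct tnode *queue[MAX_NODES];   /* simple array queue */
    size_t head = 0, tail = 0, count = 0;

    if (root == NULL)
        return 0;
    queue[tail++] = root;

    while (head < tail) {
        const struct tnode *n = queue[head++];   /* dequeue */
        out[count++] = n->data;                  /* visit */
        if (n->left != NULL) {
            assert(tail < MAX_NODES);
            queue[tail++] = n->left;             /* left child first */
        }
        if (n->right != NULL) {
            assert(tail < MAX_NODES);
            queue[tail++] = n->right;
        }
    }
    return count;
}
```

For a five-node tree with root 1, children 2 and 3, and 4 and 5 as the children of 2, `level_order` fills the output array with 1, 2, 3, 4, 5. Each node enters the queue exactly once, so the traversal takes time proportional to the number of nodes.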
Example (traversals of the tree with root 8):
PreOrder Traversal - 8, 5, 9, 7, 1, 12, 2, 4, 11, 3
InOrder Traversal - 9, 5, 1, 7, 2, 12, 8, 4, 3, 11
PostOrder Traversal - 9, 1, 2, 12, 7, 5, 3, 11, 4, 8
LevelOrder Traversal - 8, 5, 4, 9, 7, 11, 1, 12, 3, 2

The three depth-first traversals can be represented as a single algorithm by assuming that we visit each node three times. An Euler tour is a walk around the binary tree where each edge is treated as a wall which you cannot cross. In this walk each node is visited on the left, from below, and on the right. The Euler tour in which we process nodes when we reach them on the left produces a preorder traversal. When we process nodes from below, we get an inorder traversal. And when we process nodes on the right, we get a postorder traversal.

Traversals can be easily implemented recursively.

Complexity of traversal
Assume that T(n) is the function that describes the complexity of traversing a binary tree with n nodes. T(n) accumulates the cost of visiting the root, which takes constant time c, and of visiting the left subtree, T(m), and the right subtree, T(k), where m + k = n - 1. Therefore,
T(n) = c + T(m) + T(k)
We solve this recurrence asymptotically by guessing that T(j) <= a*j for all j < n, for some constant a >= c, with T(0) = 0. It follows that
T(n) = c + T(m) + T(k) <= c + a*m + a*k = c + a*(n - 1) = a*n - (a - c) <= a*n
so T(n) = O(n).

Binary tree representation
A tree is said to be a binary tree if each node of the tree has at most two children. The children of a node of a binary tree are ordered: one child is called the left child and the other is called the right child.
A node of a binary tree is represented by a structure containing a data part and two pointers to other structures of the same type.

struct node {
    int data;
    struct node *left;
    struct node *right;
};

Binary Search Trees
A binary search tree is a data structure that lets us maintain a sorted list of numbers efficiently. It is called a binary tree because each tree node has a maximum of two children.
It is called a search tree because it can be used to search for the presence of a number in O(log(n)) time when the tree is balanced; in general a search takes time proportional to the height of the tree.
The properties that separate a binary search tree from a regular binary tree are:
All nodes of the left subtree are less than the root node.
All nodes of the right subtree are more than the root node.
Both subtrees of each node are also BSTs, i.e. they have the above two properties.

Searching, Insertion and deletion in BST

Searching a key
To search for a given key in a binary search tree, we first compare it with the root. If the key is present at the root, we return the root. If the key is greater than the root's key, we recur for the right subtree of the root node. Otherwise we recur for the left subtree.

/* The fragments in this section assume each node stores its value in a field named key. */
struct node* search(struct node* root, int key)
{
    /* base cases: the tree is empty or the key is at the root */
    if (root == NULL || root->key == key)
        return root;
    /* key is greater than the root's key */
    if (root->key < key)
        return search(root->right, key);
    /* key is smaller than the root's key */
    return search(root->left, key);
}

Insertion of a key
A new key is always inserted at a leaf. We search for the key from the root until we fall off the tree below a leaf node. The new node is then added as a child of that leaf node.

/* newNode(key) is assumed to allocate a node with the given key and NULL children. */
struct node* insert(struct node* node, int key)
{
    /* empty position reached: the new node goes here */
    if (node == NULL)
        return newNode(key);
    if (key < node->key)
        node->left = insert(node->left, key);
    else if (key > node->key)
        node->right = insert(node->right, key);
    /* equal keys are not inserted again */
    return node;
}

Deletion of a key
1) Node to be deleted is a leaf: simply remove it from the tree.
2) Node to be deleted has only one child: copy the child to the node and delete the child.
3) Node to be deleted has two children: find the inorder successor of the node, copy the contents of the inorder successor to the node and delete the inorder successor. Note that the inorder predecessor can also be used.
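The three deletion cases can be sketched recursively as follows. This is a minimal sketch rather than the notes' own implementation: it assumes a node type with a `key` field as in the search and insert fragments, and the names `new_node`, `bst_insert`, `min_node` and `bst_delete` are illustrative.

```c
#include <stdlib.h>

struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Allocates a leaf node; returns NULL if allocation fails. */
struct node *new_node(int key)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->key = key;
        n->left = n->right = NULL;
    }
    return n;
}

/* Same insertion rule as in the notes: duplicate keys are ignored. */
struct node *bst_insert(struct node *root, int key)
{
    if (root == NULL)
        return new_node(key);
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else if (key > root->key)
        root->right = bst_insert(root->right, key);
    return root;
}

/* Leftmost node of a non-empty subtree, i.e. the one with the smallest key. */
static struct node *min_node(struct node *n)
{
    while (n->left != NULL)
        n = n->left;
    return n;
}

/* Removes key from the subtree and returns the new root of that subtree. */
struct node *bst_delete(struct node *root, int key)
{
    if (root == NULL)
        return NULL;                          /* key not present */
    if (key < root->key) {
        root->left = bst_delete(root->left, key);
    } else if (key > root->key) {
        root->right = bst_delete(root->right, key);
    } else if (root->left != NULL && root->right != NULL) {
        /* Case 3: copy the inorder successor, then delete the successor. */
        struct node *succ = min_node(root->right);
        root->key = succ->key;
        root->right = bst_delete(root->right, succ->key);
    } else {
        /* Cases 1 and 2: zero or one child, so the child (or NULL) takes the node's place. */
        struct node *child = (root->left != NULL) ? root->left : root->right;
        free(root);
        return child;
    }
    return root;
}
```

Returning the new subtree root from each call lets the caller relink the parent pointer, so deleting the root of the whole tree needs no special case: the caller writes `root = bst_delete(root, key);`.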
#include <stdio.h>
#include <stdlib.h>

#define TRUE 1
#define FALSE 0

struct btreenode
{
    struct btreenode *leftchild;
    int data;
    struct btreenode *rightchild;
};

void insert(struct btreenode **, int);
void inorder(struct btreenode *);
void preorder(struct btreenode *);
void postorder(struct btreenode *);
void delete(struct btreenode **, int);
void search(struct btreenode **, int, struct btreenode **,
            struct btreenode **, int *);

int main(void)
{
    struct btreenode *bt;
    int req, i, num;

    bt = NULL;   /* empty tree */

    printf("Specify the number of items to be inserted: ");
    if (scanf("%d", &req) != 1)
        return 1;

    for (i = 1; i <= req; i++)
    {
        printf("Enter the data: ");
        if (scanf("%d", &num) != 1)
            return 1;
        insert(&bt, num);
    }

    printf("\nIn-order Traversal: ");
    inorder(bt);
    printf("\nPre-order Traversal: ");
    preorder(bt);
    printf("\nPost-order Traversal: ");
    postorder(bt);
    printf("\n");
    return 0;
}

/* inserts a new node in a binary search tree */
void insert(struct btreenode **sr, int num)
{
    if (*sr == NULL)
    {
        *sr = malloc(sizeof(struct btreenode));
        if (*sr == NULL)
            return;   /* out of memory: nothing is inserted */
        (*sr)->leftchild = NULL;
        (*sr)->data = num;
        (*sr)->rightchild = NULL;
    }
    else   /* search the node to which the new node will be attached */
    {
        /* if new data is less, traverse to left */
        if (num < (*sr)->data)
            insert(&((*sr)->leftchild), num);
        else   /* else traverse to right */
            insert(&((*sr)->rightchild), num);
    }
}

/* traverse a binary search tree in LDR (Left-Data-Right) fashion */
void inorder(struct btreenode *sr)
{
    if (sr != NULL)
    {
        inorder(sr->leftchild);
        /* print the data of the node whose left subtree is empty
           or has already been traversed */
        printf("\t%d", sr->data);
        inorder(sr->rightchild);
    }
}

/* traverse a binary search tree in DLR (Data-Left-Right) fashion */
void preorder(struct btreenode *sr)
{
    if (sr != NULL)
    {
        printf("\t%d", sr->data);    /* print the data of the node */
        preorder(sr->leftchild);     /* then traverse the left subtree */
        preorder(sr->rightchild);    /* then traverse the right subtree */
    }
}

/* traverse a binary search tree in LRD (Left-Right-Data) fashion */
void postorder(struct btreenode *sr)
{
    if (sr != NULL)
    {
        postorder(sr->leftchild);
        postorder(sr->rightchild);
        printf("\t%d", sr->data);
    }
}

/* deletes the node holding num from the binary search tree */
void delete(struct btreenode **root, int num)
{
    int found;
    struct btreenode *parent, *x, *xsucc, *child;

    if (*root == NULL)   /* if tree is empty */
    {
        printf("\nTree is empty");
        return;
    }

    parent = x = NULL;

    /* call to search function to find the node to be deleted */
    search(root, num, &parent, &x, &found);

    /* if the node to be deleted is not found */
    if (found == FALSE)
    {
        printf("\nData to be deleted, not found");
        return;
    }

    /* if the node to be deleted has two children: copy the inorder
       successor into it, then delete the successor node instead */
    if (x->leftchild != NULL && x->rightchild != NULL)
    {
        parent = x;
        xsucc = x->rightchild;
        while (xsucc->leftchild != NULL)
        {
            parent = xsucc;
            xsucc = xsucc->leftchild;
        }
        x->data = xsucc->data;
        x = xsucc;
    }

    /* x now has no child, only a left child or only a right child;
       its single child (or NULL) takes its place */
    child = (x->leftchild != NULL) ? x->leftchild : x->rightchild;

    if (parent == NULL)                  /* x is the root of the tree */
        *root = child;
    else if (parent->leftchild == x)
        parent->leftchild = child;
    else
        parent->rightchild = child;

    free(x);
}

/* returns the address of the node to be deleted, the address of its
   parent and whether the node was found */
void search(struct btreenode **root, int num, struct btreenode **par,
            struct btreenode **x, int *found)
{
    struct btreenode *q;

    q = *root;
    *found = FALSE;
    *par = NULL;

    while (q != NULL)
    {
        /* if the node to be deleted is found */
        if (q->data == num)
        {
            *found = TRUE;
            *x = q;
            return;
        }
        *par = q;
        if (q->data > num)
            q = q->leftchild;
        else
            q = q->rightchild;
    }
}

Expression Trees
Typically we deal with mathematical expressions in infix notation, where the operators (e.g. +, *) are written between the operands:
2 + 5
Here, 2 and 5 are called operands, and '+' is the operator. The above arithmetic expression is called infix, since the operator is in between the operands.
Writing the operators after the operands gives a postfix form:
2 5 +
In prefix form, the operator precedes its operands:
+ 2 5
The infix form has the disadvantage that parentheses (or precedence rules) must be used to indicate the order of evaluation. For example, the following expression is ambiguous without them:
7 + 10 * 15
Evaluation becomes much easier if we can evaluate the operators from left to right. This leads to a postfix form expression. Postfix form has the nice property that parentheses are unnecessary:
7 10 15 * +

Threaded Binary Trees
Binary trees have a lot of wasted space: the leaf nodes each have 2 null pointers.
We can use these pointers to help us in inorder traversals.
We have the pointers reference the next node in an inorder traversal; these pointers are called threads.
We need to know if a pointer is an actual link or a thread, so we keep a boolean for each pointer.

class Node {
    Node left, right;
    boolean leftThread, rightThread;
}

Threaded Tree Traversal
We start at the leftmost node in the tree, print it, and follow its right thread.
If we follow a thread to the right, we output the node and continue to its right.
If we follow a link to the right, we go to the leftmost node, print it, and continue.

Example (tree with nodes 6, 8, 3, 1, 5, 7, 11, 9, 13):
Start at leftmost node, print it.
Output: 1
Follow thread to right, print node.
Output: 1 3
Follow link to right, go to leftmost node and print.
Output: 1 3 5
Follow thread to right, print node.
Output: 1 3 5 6
Follow link to right, go to leftmost node and print.
Output: 1 3 5 6 7
Follow thread to right, print node.
Output: 1 3 5 6 7 8
Follow link to right, go to leftmost node and print.
Output: 1 3 5 6 7 8 9
Follow thread to right, print node.
Output: 1 3 5 6 7 8 9 11
Follow link to right, go to leftmost node and print.
Output: 1 3 5 6 7 8 9 11 13

Threaded Tree Modification
We're still wasting pointers, since half of our leaves' pointers are still null.
We can add threads to the previous node in an inorder traversal as well, which we can use to traverse the tree backwards or even to do postorder traversals.

Heaps
A heap is a certain kind of complete binary tree.
When a complete binary tree is built, its first node must be the root.
The second node is always the left child of the root.
The third node is always the right child of the root.
The next nodes always fill the next level from left to right.

Each node in a heap contains a key that can be compared to other nodes' keys.
The "heap property" requires that each node's key is >= the keys of its children.
Example heap, listed level by level: 45; 35, 23; 27, 21, 22, 4; 19

Adding a Node to a Heap
Put the new node in the next available spot.
Push the new node upward, swapping with its parent until the new node reaches an acceptable location.
Example, adding 42 (heap listed level by level):
45; 35, 23; 27, 21, 22, 4; 19, 42 (42 placed in the next available spot)
45; 35, 23; 42, 21, 22, 4; 19, 27 (42 swapped with its parent 27)
45; 42, 23; 35, 21, 22, 4; 19, 27 (42 swapped with its parent 35)
The new node stops moving when the parent has a key that is >= the new node's key, or the node reaches the root.
The process of pushing the new node upward is called reheapification upward.

Removing the Top of a Heap
Move the last node onto the root.
Push the out-of-place node downward, swapping with its larger child until the node reaches an acceptable location.
Example, removing 45 (heap listed level by level):
45; 42, 23; 35, 21, 22, 4; 19, 27 (before removal)
27; 42, 23; 35, 21, 22, 4; 19 (last node 27 moved onto the root)
42; 27, 23; 35, 21, 22, 4; 19 (27 swapped with its larger child 42)
42; 35, 23; 27, 21, 22, 4; 19 (27 swapped with its larger child 35)
The node stops moving when its children all have keys <= the out-of-place node's key, or the node reaches a leaf.
The process of pushing the out-of-place node downward is called reheapification downward.

Implementing a Heap
We will store the data from the nodes in a partially-filled array.
Data from the root goes in the first location of the array.
Data from the next row goes in the next two array locations, and so on, row by row.
Example: the heap 42; 35, 23; 27, 21 (listed level by level) is stored as the array 42, 35, 23, 27, 21.
We don't care what's in the unused part of the array.

Important Points about the Implementation
The links between the tree's nodes are not actually stored as pointers, or in any other way.
The only way we "know" that "the array is a tree" is from the way we manipulate the data.
If you know the index of a node, then it is easy to figure out the indexes of that node's parent and children. Formulas are given in the book. With the array positions numbered from [1] as in the example ([1] 42, [2] 35, [3] 23, [4] 27, [5] 21), the children of the node at index i are at indexes 2i and 2i+1, and its parent is at index i/2 (integer division).

Balanced Trees
The disadvantage of a binary search tree is that its height can be as large as N-1.
This means that the time needed to perform insertion, deletion and many other operations can be O(N) in the worst case.
We want a tree with small height.
A binary tree with N nodes has height at least floor(log2 N).
Thus, our goal is to keep the height of a binary search tree O(log N).
Such trees are called balanced binary search trees. Examples are the AVL tree and the red-black tree.

AVL Trees
Here are some important notions:
[1] The length of the longest path from the root node to one of the terminal nodes is what we call the height of a tree.
[2] The difference between the height of the right subtree and the height of the left subtree is what we call the balancing factor.
[3] The binary tree is balanced when the balancing factors of all the nodes are -1, 0 or +1.
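Notions [1] to [3] translate almost directly into code. The following is a minimal sketch, assuming a hypothetical node type `struct anode` and the convention that an empty tree has height -1, so that a single node has height 0 when counting edges; the function names are illustrative.

```c
#include <stddef.h>

/* Hypothetical node type used only for this illustration. */
struct anode {
    int key;
    struct anode *left;
    struct anode *right;
};

/* Height in edges; an empty tree is given height -1 by convention. */
int height(const struct anode *n)
{
    if (n == NULL)
        return -1;
    int hl = height(n->left);
    int hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balancing factor of a non-empty node: height of the right subtree
   minus height of the left subtree. */
int balance_factor(const struct anode *n)
{
    return height(n->right) - height(n->left);
}

/* Returns 1 if every node has balancing factor -1, 0 or +1, else 0. */
int is_balanced(const struct anode *n)
{
    if (n == NULL)
        return 1;
    int bf = balance_factor(n);
    if (bf < -1 || bf > 1)
        return 0;
    return is_balanced(n->left) && is_balanced(n->right);
}
```

This version recomputes heights at every node, which is simple but can take quadratic time on a degenerate tree. Practical AVL implementations usually store the height (or the balancing factor) in each node instead.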
Formally: | hd - hs | ≤ 1 for every node X in the tree, where hs and hd represent the heights of the left and the right subtrees of X.

Insertion into AVL Trees
An imbalance can occur during insertion into an AVL tree.
Depending on the kind of imbalance, the tree is rebalanced by one of four rotations:
LL rotation
RR rotation
LR rotation
RL rotation

AVL Tree Example:
Trees are written here as node(left subtree, right subtree), with - for an empty subtree.
Insert 14, 17, 11, 7, 53, 4, 13 into an empty AVL tree.
After inserting 14, 17, 11, 7, 53, 4: 14(11(7(4, -), -), 17(-, 53)); node 11 is out of balance.
After rotating right at 11 and inserting 13: 14(7(4, 11(-, 13)), 17(-, 53))
Now insert 12: 14(7(4, 11(-, 13(12, -))), 17(-, 53)); node 11 is out of balance.
Rotate right at 13: 14(7(4, 11(-, 12(-, 13))), 17(-, 53))
Rotate left at 11. Now the AVL tree is balanced: 14(7(4, 12(11, 13)), 17(-, 53))
Now insert 8: 14(7(4, 12(11(8, -), 13)), 17(-, 53)); node 7 is out of balance.
Rotate right at 12: 14(7(4, 11(8, 12(-, 13))), 17(-, 53))
Rotate left at 7. Now the AVL tree is balanced: 14(11(7(4, 8), 12(-, 13)), 17(-, 53))

In Class Exercises
Build an AVL tree with the following values: 15, 20, 24, 10, 13, 7, 30, 36, 25
[Figures: step-by-step solution trees.] Final tree: 13(10(7, -), 24(20(15, -), 30(25, 36)))
Remove 24 and 20 from the AVL tree.
[Figures: step-by-step solution trees.]
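The rebalancing used in the AVL insertion example can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the notes' own code: the type `struct avl` and the functions `rotate_right`, `rotate_left` and `avl_insert` are illustrative names, each node stores its height counted in nodes (a leaf has height 1), and the case names in the comments follow one common convention for LL, RR, LR and RL.

```c
#include <stdlib.h>

struct avl {
    int key;
    int height;              /* height in nodes: a leaf has height 1 */
    struct avl *left;
    struct avl *right;
};

static int h(const struct avl *n) { return n ? n->height : 0; }

static void update(struct avl *n)
{
    int hl = h(n->left), hr = h(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

/* Right rotation: the left child y->left becomes the subtree root. */
static struct avl *rotate_right(struct avl *y)
{
    struct avl *x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

/* Left rotation: the right child x->right becomes the subtree root. */
static struct avl *rotate_left(struct avl *x)
{
    struct avl *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

/* Inserts key and returns the (possibly new) root of the subtree. */
struct avl *avl_insert(struct avl *n, int key)
{
    if (n == NULL) {
        struct avl *leaf = malloc(sizeof *leaf);
        if (leaf == NULL)
            return NULL;     /* allocation failed: key is not inserted */
        leaf->key = key;
        leaf->height = 1;
        leaf->left = leaf->right = NULL;
        return leaf;
    }
    if (key < n->key)
        n->left = avl_insert(n->left, key);
    else if (key > n->key)
        n->right = avl_insert(n->right, key);
    else
        return n;            /* duplicate keys are ignored */

    update(n);
    int bf = h(n->right) - h(n->left);   /* right minus left, as in the notes */

    if (bf < -1) {                       /* left subtree too tall */
        if (key > n->left->key)          /* left-right case: double rotation */
            n->left = rotate_left(n->left);
        return rotate_right(n);          /* left-left case ends here too */
    }
    if (bf > 1) {                        /* right subtree too tall */
        if (key < n->right->key)         /* right-left case: double rotation */
            n->right = rotate_right(n->right);
        return rotate_left(n);           /* right-right case ends here too */
    }
    return n;
}
```

Inserting 14, 17, 11, 7, 53, 4, 13, 12, 8 in that order with `root = avl_insert(root, key);` yields a tree rooted at 14 whose left child is 11 and whose right child is 17. Storing the height in each node keeps every rebalancing step constant time, at the cost of one extra integer per node.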