Tree Data Structures

Trees
  Tree
    Nodes
    Each node can have 0 or more children
    A node can have at most one parent
  Binary tree
    Tree with 0–2 children per node
  [Figure: a general tree and a binary tree]

Trees Terminology
  Root: no parent
  Leaf: no child
  Interior: non-leaf
  Height: distance from root to the deepest leaf
  [Figure: tree with the root node, interior nodes, leaf nodes, and height labeled]

Binary Search Trees
  Key property
    Value at node
      Smaller values in left subtree
      Larger values in right subtree
    Example: node X with left child Y and right child Z
      X > Y
      X < Z

Binary Search Trees
  Examples
  [Figure: three trees on the keys 2, 5, 10, 25, 30, 45; two are binary search trees, one is not a binary search tree]

Iterative Search of Binary Tree
    Node *Find(Node *n, int key) {
        while (n != NULL) {
            if (n->data == key)     // Found it
                return n;
            if (n->data > key)      // In left subtree
                n = n->left;
            else                    // In right subtree
                n = n->right;
        }
        return NULL;                // Key is not in the tree
    }

    Node *n = Find(root, 5);

Example Binary Searches
  Find(root, 2)
    Tree rooted at 10: 10 > 2, left; 5 > 2, left; 2 = 2, found
    Tree rooted at 5: 5 > 2, left; 2 = 2, found
  [Figure: the two search trees with the search path marked]

Example Binary Searches
  Find(root, 25)
    Tree rooted at 10: 10 < 25, right; 30 > 25, left; 25 = 25, found
    Tree rooted at 5: 5 < 25, right; 45 > 25, left; 30 > 25, left; 10 < 25, right; 25 = 25, found
  [Figure: the two search trees with the search path marked]

Types of Binary Trees
  Degenerate: only one child
  Complete (Full): always two children
  Balanced: “mostly” two children
  More formal definitions exist; the above are intuitive ideas
  [Figure: degenerate binary tree, balanced binary tree, complete binary tree]

Binary Trees Properties
  Degenerate
    Height = O(n) for n nodes
    Similar to linked list
  Balanced
    Height = O(log(n)) for n nodes
    Useful for searches
  [Figure: degenerate binary tree and balanced binary tree]

Binary Search Properties
  Time of search
    Proportional to height of tree
    Balanced binary tree: O(log(n)) time
    Degenerate tree: O(n) time
      Like searching linked list / unsorted array

Binary Search Tree Construction
  How to build & maintain binary trees?
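
One way to start on this question in code: the sketch below reuses the comparisons of Find above and adds a hypothetical Insert helper that hangs a new leaf where the search falls off the tree, so the key property is kept. The names Insert, Build, Height, and Destroy are assumptions made for illustration and are not part of the slides; height is counted in edges. It is a minimal sketch and does no rebalancing.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Node layout assumed from the slides' Find routine: data, left, right.
struct Node {
    int data;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int d) : data(d) {}
};

// Iterative search using the same comparisons as Find in the slides.
Node* Find(Node* n, int key) {
    while (n != nullptr) {
        if (n->data == key) return n;               // found it
        n = (n->data > key) ? n->left : n->right;   // go left if key is smaller
    }
    return nullptr;                                 // key is not in the tree
}

// Hypothetical helper: walk down as Find does and attach a new leaf at the
// empty link where the search ends. Duplicate keys are ignored.
Node* Insert(Node* root, int key) {
    Node** link = &root;
    while (*link != nullptr) {
        if ((*link)->data == key) return root;
        link = ((*link)->data > key) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(key);
    return root;
}

// Height in edges on the longest root-to-leaf path; an empty tree gives -1.
int Height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(Height(n->left), Height(n->right));
}

// Build a tree by inserting the keys in the order given.
Node* Build(std::initializer_list<int> keys) {
    Node* root = nullptr;
    for (int k : keys) root = Insert(root, k);
    return root;
}

// Free every node of the tree.
void Destroy(Node* n) {
    if (n == nullptr) return;
    Destroy(n->left);
    Destroy(n->right);
    delete n;
}
```

Under these assumptions, Build({10, 5, 30, 2, 25, 45}) gives a tree rooted at 10 with height 2, while Build({2, 5, 10, 25, 30, 45}) gives a right-leaning chain of height 5. The insertion order alone decides whether the result is closer to the balanced or the degenerate case.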
  Insertion
  Deletion
  Maintain key property (invariant)
    Smaller values in left subtree
    Larger values in right subtree

Binary Search Tree – Insertion
  Algorithm
    1. Perform search for value X
    2. Search will end at node Y (if X not in tree)
    3. If X < Y, insert new leaf X as new left subtree for Y
    4. If X > Y, insert new leaf X as new right subtree for Y
  Observations
    O(log(n)) operation for balanced tree
    Insertions may unbalance tree

Example Insertion
  Insert(20)
    10 < 20, right
    30 > 20, left
    25 > 20, left
    Insert 20 on left
  [Figure: tree on the keys 10, 5, 30, 2, 25, 45 with the new leaf 20 attached to the left of 25]

Binary Search Tree – Deletion
  Algorithm
    1. Perform search for value X
    2. If X is a leaf, delete X
    3. Else (must delete internal node)
       a) Replace with largest value Y in left subtree OR smallest value Z in right subtree
       b) Delete replacement value (Y or Z) from subtree
  Observation
    O(log(n)) operation for balanced tree
    Deletions may unbalance tree

Example Deletion (Leaf)
  Delete(25)
    10 < 25, right
    30 > 25, left
    25 = 25, delete
  [Figure: the tree before and after leaf 25 is removed]

Example Deletion (Internal Node)
  Delete(10)
    Replacing 10 with largest value in left subtree
    Replacing 5 with largest value in left subtree
    Deleting leaf
  [Figure: the tree after each of the three steps]

Example Deletion (Internal Node)
  Delete(10)
    Replacing 10 with smallest value in right subtree
    Deleting leaf
    Resulting tree
  [Figure: the tree after each step, ending with 25 at the root]

Balanced Search Trees
  Kinds of balanced binary search trees
    Height balanced vs. weight balanced
    “Tree rotations” used to maintain balance on insert/delete
  Non-binary search trees
    2/3 trees
      Each internal node has 2 or 3 children
      All leaves at same depth (height balanced)
    B-trees
      Generalization of 2/3 trees
      Each internal node has between k/2 and k children
        Each node has an array of pointers to children
      Widely used in databases

Other (Non-Search) Trees
  Parse trees
    Convert from textual representation to tree representation
    Textual program to tree
      Used extensively in compilers
  Tree representation of data
    E.g.
      HTML data can be represented as a tree
        called the DOM (Document Object Model) tree
      XML
        Like HTML, but used to represent data
        Tree structured

Parse Trees
  Expressions, programs, etc. can be represented by tree structures
  E.g. Arithmetic Expression Tree
    A-(C/5 * 2) + (D*5 % 4)
  [Figure: expression tree with + at the root; left subtree A - ((C / 5) * 2), right subtree (D * 5) % 4]

Preorder, Inorder, Postorder
  In Preorder, the root is visited before (pre) the subtree traversals
  In Inorder, the root is visited in between the left and right subtree traversals
  In Postorder, the root is visited after (post) the subtree traversals
  Preorder Traversal:
    1. Visit the root
    2. Traverse left subtree
    3. Traverse right subtree
  Inorder Traversal:
    1. Traverse left subtree
    2. Visit the root
    3. Traverse right subtree
  Postorder Traversal:
    1. Traverse left subtree
    2. Traverse right subtree
    3. Visit the root

Tree Traversal
  Goal: visit every node of a tree
  In-order traversal
    void Node::inOrder() {
        if (left != NULL) {
            cout << "(";
            left->inOrder();
            cout << ")";
        }
        cout << data << endl;
        if (right != NULL)
            right->inOrder();
    }
  Output (brackets omitted): A - C / 5 * 2 + D * 5 % 4
  To disambiguate: print brackets
  [Figure: the expression tree]

Tree Traversal (contd.)
  Pre-order and post-order:
    void Node::preOrder() {
        cout << data << endl;
        if (left != NULL)
            left->preOrder();
        if (right != NULL)
            right->preOrder();
    }
  Output: + - A * / C 5 2 % * D 5 4
    void Node::postOrder() {
        if (left != NULL)
            left->postOrder();
        if (right != NULL)
            right->postOrder();
        cout << data << endl;
    }
  Output: A C 5 / 2 * - D 5 * 4 % +
  [Figure: the expression tree]

More Illustrations for Traversals
  Assume: visiting a node is printing its label
  Preorder: 1 3 5 4 6 7 8 9 10 11 12
  Inorder: 4 5 6 3 1 8 7 9 11 10 12
  Postorder: 4 6 5 3 8 11 12 10 9 7 1
  [Figure: binary tree with node labels 1 and 3 through 12]

More Illustrations for Traversals (Contd.)
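
A further illustration in code: the sketch below cross-checks the three listings of the previous example (labels 1 and 3 through 12). The tree shape is an assumption, inferred from the preorder and inorder listings rather than taken from the figure. Node, BuildIllustrationTree, and the free functions are hypothetical names, and each traversal appends labels to a vector instead of printing them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node type; a node owns its children and frees them on delete.
struct Node {
    int data;
    Node* left;
    Node* right;
    Node(int d, Node* l = nullptr, Node* r = nullptr)
        : data(d), left(l), right(r) {}
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;
    ~Node() { delete left; delete right; }
};

// Preorder: visit the root, then the left subtree, then the right subtree.
void PreOrder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->data);
    PreOrder(n->left, out);
    PreOrder(n->right, out);
}

// Inorder: left subtree, then the root, then the right subtree.
void InOrder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    InOrder(n->left, out);
    out.push_back(n->data);
    InOrder(n->right, out);
}

// Postorder: left subtree, then the right subtree, then the root.
void PostOrder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    PostOrder(n->left, out);
    PostOrder(n->right, out);
    out.push_back(n->data);
}

// Assumed shape: 1 has children 3 and 7; 3 has only a left child 5, whose
// children are 4 and 6; 7 has children 8 and 9; 9 has only a right child 10,
// whose children are 11 and 12.
Node* BuildIllustrationTree() {
    return new Node(1,
        new Node(3, new Node(5, new Node(4), new Node(6))),
        new Node(7, new Node(8),
                    new Node(9, nullptr,
                             new Node(10, new Node(11), new Node(12)))));
}
```

Running the three functions on the tree returned by BuildIllustrationTree reproduces the preorder, inorder, and postorder sequences listed for that example. The caller owns the returned tree and releases it with delete.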
  Assume: visiting a node is printing its data
  Preorder: 15 8 2 6 3 7 11 10 12 14 20 27 22 30
  Inorder: 2 3 6 7 8 10 11 12 14 15 20 22 27 30
  Postorder: 3 7 6 2 10 14 12 11 8 22 30 27 20 15
  [Figure: binary search tree rooted at 15, with children 8 and 20]

Application of Traversal: Sorting a BST
  Observe the output of the inorder traversal of the preceding BST example
  It is sorted
  This is no coincidence
  As a general rule, if you output the keys (data) of the nodes of a BST using inorder traversal, the data comes out sorted in increasing order

Array Representation of Full Trees and Almost Complete Trees
  A canonically label-able tree, like full binary trees and almost complete binary trees, can be represented by an array A of the same length as the number of nodes
  A[k] is identified with the node of label k
  That is, A[k] holds the data of node k
  Advantage:
    No need to store left and right pointers in the nodes, which saves memory
    Direct access to nodes: to get to node k, access A[k]

Illustration of Array Representation
  [Figure: an almost complete binary tree with 11 nodes and the corresponding array A[1..11]]
  Notice:
    Left child of A[5] (of data 11) is A[2*5] = A[10] (of data 18), and its right child is A[2*5+1] = A[11] (of data 12).
    Parent of A[4] is A[4/2] = A[2], and parent of A[5] is A[5/2] = A[2] (integer division)

Application of Almost Complete Binary Trees: Heaps
  A heap (or min-heap to be precise) is an almost complete binary tree where
    Every node holds a data value (or key)
    The key of every node is ≤ the keys of the children
  Note: A max-heap has the same definition except that the key of every node is ≥ the keys of the children

Example of a Min-heap
  [Figure: min-heap with 5 at the root, on the keys 5, 8, 11, 12, 15, 16, 18, 20, 27, 30, 33]

Operations on Heaps
  Delete the minimum value and return it. This operation is called deleteMin.
  Insert a new data value
  Applications of Heaps:
  • A heap implements a priority queue, which is a queue that orders entities not on a first-come, first-served basis, but on a priority basis: the item of highest priority is at the head, and the item of the lowest priority is at the tail
  • Another application: sorting, which will be seen later

DeleteMin in Min-heaps
  The minimum value in a min-heap is at the root!
  To delete the min, you can’t just remove the data value of the root, because every node must hold a key
  Instead, take the last node from the heap, move its key to the root, and delete that last node
  But now, the tree is no longer a heap (still almost complete, but the root key value may no longer be ≤ the keys of its children)

Illustration of First Stage of deleteMin
  [Figure: the example min-heap before and after the key of its last node, 12, replaces the root key 5]

Restore Heap
  To bring the structure back to its “heapness”, we restore the heap
  If the new root key is larger than one of its children, swap it with the smaller child
  Now the potential bug is one level down. If it is not already ≤ the keys of its children, swap it with its smaller child
  Keep repeating the last step until the “bug” key becomes ≤ its children, or it becomes a leaf

Illustration of Restore-Heap
  [Figure: the key 12 moves down from the root, swapping first with 8 and then with 11]
  Now it is a correct heap

XML Data Representation
  E.g.
    <dependency>
      <object>sample1.o</object>
      <depends>sample1.cpp</depends>
      <depends>sample1.h</depends>
      <rule>g++ -c sample1.cpp</rule>
    </dependency>
  Tree representation
    dependency
      object: sample1.o
      depends: sample1.cpp
      depends: sample1.h
      rule: g++ -c …

Heaps (Ref.
www.cse.unt.edu/~rada/CSCE3110/Lectures/Heaps.ppt )
  A heap is a binary tree T that stores key-element pairs at its internal nodes
  It satisfies two properties:
  • MinHeap: key(parent) ≤ key(child)
  • [OR MaxHeap: key(parent) ≥ key(child)]
  • All levels are full, except the last one, which is left-filled
  [Figure: min-heap with 4 at the root, holding 13 keys]

What are Heaps Useful for?
  To implement priority queues
  Priority queue = a queue where all elements have a “priority” associated with them
  Remove in a priority queue removes the element with the smallest priority
    insert
    removeMin

Heap or Not a Heap?
  [Figure: example trees to classify as heap or not a heap]

Heap Properties
  A heap T storing n keys has height h = ⌈log(n + 1)⌉, which is O(log n)
  [Figure: the min-heap with 4 at the root]

Heap Insertion
  Insert 6
  Add key in next available position
  Begin upheap
  Terminate upheap when
    reach root
    key child is greater than key parent
  [Figures: the heap after each upheap step]

Heap Removal
  Remove element from priority queue: removeMin()
  Begin downheap
  Terminate downheap when
    reach leaf level
    key parent is less than or equal to key child
  [Figures: the heap after each downheap step]

Building a Heap
  Build (n + 1)/2 trivial one-element heaps
  Build three-element heaps on top of them
  Downheap to preserve the order property
  Now form seven-element heaps
  [Figures: the heaps after each stage]

Heap Implementation
  Using arrays
  Parent = k; Children = 2k, 2k+1
  Why is it efficient?
  [Figure: small heaps with their nodes numbered by array position]
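
A possible answer to the efficiency question, as a minimal sketch: with the keys in an array, parent and child positions are plain index arithmetic (k/2, 2k, 2k+1), no pointers are stored, and insert and removeMin each follow a single path between root and leaf level, so each takes O(log n) steps. MinHeap and its member names are hypothetical. Slot 0 of the vector is left unused so the 1-based formulas apply directly, and the keys are plain ints instead of key-element pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical array-based min-heap. Node k lives in a_[k]; its parent is
// a_[k / 2] and its children are a_[2k] and a_[2k + 1].
class MinHeap {
public:
    MinHeap() : a_(1) {}  // a_[0] is an unused placeholder

    bool Empty() const { return a_.size() == 1; }
    std::size_t Size() const { return a_.size() - 1; }

    // Add the key in the next available position, then upheap.
    void Insert(int key) {
        a_.push_back(key);
        std::size_t k = a_.size() - 1;
        // Stop at the root or once the parent key is <= the child key.
        while (k > 1 && a_[k / 2] > a_[k]) {
            std::swap(a_[k / 2], a_[k]);
            k /= 2;
        }
    }

    // Remove and return the smallest key: move the last key to the root,
    // drop the last position, then downheap.
    int RemoveMin() {
        assert(!Empty());
        int min = a_[1];
        a_[1] = a_.back();
        a_.pop_back();
        std::size_t n = Size();
        std::size_t k = 1;
        while (2 * k <= n) {                      // k still has a child
            std::size_t child = 2 * k;
            if (child + 1 <= n && a_[child + 1] < a_[child]) {
                ++child;                          // pick the smaller child
            }
            if (a_[k] <= a_[child]) break;        // order property holds
            std::swap(a_[k], a_[child]);
            k = child;
        }
        return min;
    }

private:
    std::vector<int> a_;
};
```

Calling RemoveMin repeatedly until Empty() returns the keys in nondecreasing order, which is the idea behind using a heap as a priority queue or for sorting.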