241-423 Advanced Data Structures and Algorithms
Semester 2, 2012-2013

11. Heaps

Objectives
– implement heaps (a kind of array-based complete binary tree), heap sort, priority queues

ADSA: Heaps/11

Contents
1. Tree Terminology
2. Binary Trees
3. The Comparator Interface
4. Generalized Array Sorting
5. Heaps
6. Sorting with a Max Heap
7. API for the Heaps Class
8. Priority Queue Collection

1. Tree Terminology

[Figure: a tree with nodes A to J (A is the root), labelled with the terms root, parent, child, sibling, interior (internal) node, leaf node, subtree]

• A path between a parent node X0 and a subtree node N is a sequence of nodes P = X0, X1, . . ., Xk (where Xk is N)
  – k is the length of the path
  – each node Xi is the parent of Xi+1 for 0 ≤ i ≤ k-1

• The level of a node is the length of the path from the root to that node.
• The height of a tree is the maximum level in the tree.

2. Binary Trees

• In a binary tree, each node has at most two children.
• Each node of a binary tree defines a left and a right subtree. Each subtree is a tree.

[Figure: a binary tree T with its left child and right child labelled]

Height of a Binary Tree Node
o Node N starts a subtree TN, with TL and TR the left and right subtrees. Then:

  height(N) = height(TN) = -1                                  if TN is empty
                           1 + max(height(TL), height(TR))     if TN is not empty

[Figure: a degenerate binary tree (a list)]

A Complete Binary Tree
• A complete binary tree of height h has all its nodes filled through level h-1, and the nodes at depth h run left to right with no gaps.

An Array-based Complete Binary Tree
• An array arr[] can be viewed as a complete binary tree if:
  – the root is stored in arr[0]
  – the level 1 children are in arr[1], arr[2]
  – the level 2 children are in arr[3], arr[4], arr[5], arr[6]
  – etc.
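The level-by-level layout listed above can be sketched in a few lines of Java. This is only an illustration: the class name LevelLayout and the methods levelOf() and levels() are hypothetical names invented for this example, and the array holds arbitrary sample values. The sketch assumes the usual rule that the parent of index i sits at index (i-1)/2.

```java
import java.util.ArrayList;
import java.util.List;

class LevelLayout
{
  // level of index i in an array-based complete binary tree;
  // level k holds the indices 2^k - 1 .. 2^(k+1) - 2
  public static int levelOf(int i)
  {
    int level = 0;
    while (i > 0) {
      i = (i - 1) / 2;   // step up to the parent index
      level++;
    }
    return level;
  }

  // group the array elements by level, root level first
  public static <T> List<List<T>> levels(T[] arr)
  {
    List<List<T>> result = new ArrayList<List<T>>();
    for (int i = 0; i < arr.length; i++) {
      int lvl = levelOf(i);
      if (lvl == result.size())        // first index on a new level
        result.add(new ArrayList<T>());
      result.get(lvl).add(arr[i]);
    }
    return result;
  }

  public static void main(String[] args)
  {
    Integer[] arr = {5, 1, 3, 9, 6, 2, 4, 7, 0, 8};
    List<List<Integer>> lv = levels(arr);
    for (int k = 0; k < lv.size(); k++)
      System.out.println("level " + k + ": " + lv.get(k));
    // level 0: [5]
    // level 1: [1, 3]
    // level 2: [9, 6, 2, 4]
    // level 3: [7, 0, 8]
  }
}
```

The last level is only partly filled (three of a possible eight slots), which is what "complete" allows as long as the nodes run left to right with no gaps.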
  Integer[] arr = {5, 1, 3, 9, 6, 2, 4, 7, 0, 8};

o For arr[i] in an n-element array-based complete binary tree:
  Left child of arr[i] is arr[2*i + 1]; undefined if (2*i + 1) ≥ n
  Right child of arr[i] is arr[2*i + 2]; undefined if (2*i + 2) ≥ n
  Parent of arr[i] is arr[(i-1)/2]; undefined if i = 0

3. The Comparator Interface

interface Comparator<T>    (java.util)
Methods
  int compare(T x, T y)
      Compares its two arguments for order. Returns a negative integer, zero, or a positive integer to specify that x is less than, equal to, or greater than y.
  boolean equals(Object obj)
      Returns true if this object equals obj and false otherwise.

3.1. Comparator vs. Comparable
• Comparator.compare(T x, T y)
  – comparison is done between two objects, x and y
  – the method is implemented outside the T class
• Comparable.compareTo(T y)
  – comparison is done between 'this' object and object y
  – the method is implemented by the T class

3.2. Benefits of Comparator
The comparison code is in a separate place from the other code for a class.
1. The same objects can be compared in different ways by defining different Comparators
   – see the two comparators for circles
2. A Comparator can be passed to a sort method as an argument to control the ordering of objects.
   • the same sort method can sort in different ways depending on its Comparator argument
   • e.g. sorting an array into ascending or descending order
   • e.g. sorting data into max heaps or min heaps

3.3.
Comparing Circles

  public class Circle
  {
    private double xCenter, yCenter;
    private double radius;

    public Circle(double x, double y, double r)
    {  xCenter = x;  yCenter = y;  radius = r;  }

    public double getX()
    {  return xCenter;  }

    public double getY()
    {  return yCenter;  }

    public double getRadius()
    {  return radius;  }

    public double getArea()
    {  return Math.PI * radius * radius;  }
      :
    // more methods, but no comparison function
  }  // end of Circle class

Comparing Circle Radii

  import java.util.Comparator;

  public class RadiiComp implements Comparator<Circle>
  {
    public int compare(Circle c1, Circle c2)
    {
      double radC1 = c1.getRadius();
      double radC2 = c2.getRadius();

      // returns < 0 if radC1 < radC2,
      //           0 if radC1 == radC2,
      //         > 0 if radC1 > radC2
      // Double.compare() is used rather than (int)(radC1 - radC2),
      // because the cast turns any difference smaller than 1 into 0
      return Double.compare(radC1, radC2);
    }  // end of compare()

    // equals() is inherited from Object superclass
  }  // end of RadiiComp class

Comparing Circle Positions

  import java.util.Comparator;

  public class PosComp implements Comparator<Circle>
  {
    public int compare(Circle c1, Circle c2)
    {
      // squared distances of the circle centers from the origin
      double c1Dist = (c1.getX() * c1.getX()) + (c1.getY() * c1.getY());
      double c2Dist = (c2.getX() * c2.getX()) + (c2.getY() * c2.getY());

      // returns < 0 if c1Dist < c2Dist,
      //           0 if c1Dist == c2Dist,
      //         > 0 if c1Dist > c2Dist
      return Double.compare(c1Dist, c2Dist);
    }  // end of compare()

    // equals() is inherited from Object superclass
  }  // end of PosComp class

Comparing Circles in Two Ways

  Circle c1 = new Circle(0, 0, 5);
  Circle c2 = new Circle(3, 2, 7);

  RadiiComp rComp = new RadiiComp();
  if (rComp.compare(c1, c2) < 0)
    System.out.println("Circle 1 is smaller than circle 2");

  PosComp pComp = new PosComp();
  if (pComp.compare(c1, c2) < 0)
    System.out.println("Circle 1 is nearer to origin than circle 2");

Output:
  Circle 1 is smaller than circle 2
  Circle 1 is nearer to origin than circle 2

3.4.
Comparators for Sorting
• The Less and Greater Comparator classes make sorting and searching functions more flexible
  – see selectionSort() in section 4
• Less and Greater will also be used to create min and max heaps.

The Less Comparator

  import java.util.Comparator;

  // the < Comparator
  public class Less<T> implements Comparator<T>
  {
    // uses T's compareTo() to compare x and y
    public int compare(T x, T y)
    {  return ((Comparable<T>)x).compareTo(y);  }
  }

  Less.compare(x, y) < 0 when x.compareTo(y) < 0, i.e. when x < y

Using Less

  Comparator<Integer> lessInt = new Less<Integer>();
  Integer a = 3, b = 5;

  if (lessInt.compare(a, b) < 0)
    System.out.println(a + " < " + b);

Output:
  3 < 5

The Greater Comparator

  import java.util.Comparator;

  // the > Comparator
  public class Greater<T> implements Comparator<T>
  {
    // uses T's compareTo() to compare x and y
    public int compare(T x, T y)
    {  return -((Comparable<T>)x).compareTo(y);  }
  }

  Greater.compare(x, y) < 0 when -(x.compareTo(y)) < 0, i.e. when x > y

Using Greater

  Comparator<Integer> greaterInt = new Greater<Integer>();
  Integer a = 9, b = 5;

  if (greaterInt.compare(a, b) < 0)
    System.out.println(a + " > " + b);

Output:
  9 > 5

4. Generalized Array Sorting
• A single selectionSort() method can sort an array in different ways by being passed different Comparator arguments.
• The Less or Greater Comparators let the method sort an array of objects into ascending or descending order.

Selection Sort with Comparator
Compare with selectionSort() in Part 2, which uses Comparable.

  // new version adds a Comparator parameter
  public static <T> void selectionSort(T[] arr, Comparator<? super T> comp)
  {
    int smallIndex;
    int n = arr.length;

    for (int pass = 0; pass < n-1; pass++) {
      // scan the sublist starting at index pass
      smallIndex = pass;
      // j traverses the sublist arr[pass+1] to arr[n-1]
      for (int j = pass+1; j < n; j++)
        // if smaller element found, assign posn to smallIndex
        if (comp.compare(arr[j], arr[smallIndex]) < 0)
          smallIndex = j;

      // swap smallest element into arr[pass]
      T temp = arr[pass];
      arr[pass] = arr[smallIndex];
      arr[smallIndex] = temp;
    }
  }  // end of selectionSort()

Example

  String[] arr = {"red", "green", "blue", "yellow", "teal", "orange"};

  Less<String> lessComp = new Less<String>();
  Greater<String> gtComp = new Greater<String>();

  Arrays.selectionSort(arr, lessComp);
  System.out.println("Less sort: " + Arrays.toString(arr));

  Arrays.selectionSort(arr, gtComp);
  System.out.println("Greater sort: " + Arrays.toString(arr));

Output:
  Less sort: [blue, green, orange, red, teal, yellow]       // ascending
  Greater sort: [yellow, teal, red, orange, green, blue]    // descending

5. Heaps
• A maximum heap is an array-based complete binary tree in which the value of a parent is ≥ the value of both its children.

  [Figure: max heap drawn as a tree with levels 0 to 3]
  index:   0   1   2   3   4   5   6   7   8
  value:  55  50  52  25  10  11   5  20  22

  there's no ordering within a level (between siblings)

  [Figure: max heap drawn as a tree with levels 0 to 2]
  index:   0   1   2   3
  value:  40  15  30  10

• A minimum heap uses the relation ≤.

  [Figure: min heap drawn as a tree with levels 0 to 3]
  index:   0   1   2   3   4   5   6   7   8
  value:   5  10  50  11  20  52  55  25  22

  [Figure: min heap drawn as a tree with levels 0 to 2]
  index:   0   1   2   3
  value:  10  15  30  40

A max heap will be ordered using the Greater comparator.
  – I'll focus on max heaps in these notes
A min heap will be ordered using the Less comparator.
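One way to make the heap ordering concrete is a small checker that tests every parent-child pair against a comparator. The sketch below is not part of the Heaps class: HeapCheck and isHeap() are hypothetical names, and the standard-library comparators Comparator.reverseOrder() and Comparator.naturalOrder() are used as stand-ins for Greater and Less so that the example is self-contained.

```java
import java.util.Comparator;

class HeapCheck
{
  // true if arr[0] .. arr[n-1] is in heap order for comp, using the
  // convention that compare(x, y) < 0 means x belongs nearer the root than y
  public static <T> boolean isHeap(T[] arr, int n, Comparator<? super T> comp)
  {
    for (int child = 1; child < n; child++) {
      int parent = (child - 1) / 2;
      // a child that orders strictly before its parent breaks the heap
      if (comp.compare(arr[child], arr[parent]) < 0)
        return false;
    }
    return true;
  }

  public static void main(String[] args)
  {
    Integer[] maxHeap = {55, 50, 52, 25, 10, 11, 5, 20, 22};
    Integer[] minHeap = {5, 10, 50, 11, 20, 52, 55, 25, 22};

    // assumed stand-ins for the Greater and Less comparators
    Comparator<Integer> greater = Comparator.reverseOrder();
    Comparator<Integer> less = Comparator.naturalOrder();

    System.out.println(isHeap(maxHeap, maxHeap.length, greater));  // true
    System.out.println(isHeap(minHeap, minHeap.length, less));     // true
    System.out.println(isHeap(maxHeap, maxHeap.length, less));     // false
  }
}
```

Only parent-child pairs are tested, never siblings, which matches the remark that there is no ordering within a level.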
Heap Uses
• Heapsort
  – one of the best sorting methods
  • in-place; no quadratic worst-case
• Selection algorithms
  – the min (or max) is available in constant time; other selections, such as the median or k-th element, can be done in sublinear time on data that is already in a heap
• Graph algorithms
  – Prim's minimum spanning tree
  – Dijkstra's shortest path

Max Heap Operations
o Inserting an element: pushHeap()
o Deleting an element: popHeap()
  o most of the work is done by calling adjustHeap()
o Array --> heap conversion: makeHeap()
  o most of the work is done by calling adjustHeap()
o Heap sorting: heapSort()
  o utilizes makeHeap() then popHeap()

5.1. Inserting into a Max Heap: pushHeap()
• Assume that the array is a maximum heap.
  – a new item will enter the array at index last, with the heap expanding by one element
• Insert an item by moving the nodes on the path of parents down one level until the item is correctly placed as a parent in the heap.

  [Figure: inserting 50 into a max heap, with the path of parents highlighted]

• At each step, compare item with parent
  – if item is larger, move parent down one level
    • arr[currPos] = parent;
    • currPos = the parent index;
• Stop when the parent is ≥ item, or when the root is reached
  – assign item to the currPos position
    • arr[currPos] = item;

pushHeap()
Depending on the comparator, pushHeap() can work with max or min heaps.

  public static <T> void pushHeap(T[] arr, int last, T item, Comparator<?
                                     super T> comp)
  {
    // insert item into the heap arr[];
    // assume that arr[0] to arr[last-1] are in heap order

    // currPos is the index that moves up the path of parents
    int currPos = last;
    int parentPos = (currPos-1)/2;   // the parent of arr[i] is arr[(i-1)/2]

    // move up parents path to the root
    while (currPos != 0) {
      // compare target and parent value
      if (comp.compare(item, arr[parentPos]) < 0) {
        arr[currPos] = arr[parentPos];   // move data from parent --> current
        currPos = parentPos;
        parentPos = (currPos-1)/2;       // get next parent
      }
      else   // heap condition is ok
        break;
    }
    arr[currPos] = item;   // put item in right location
  }  // end of pushHeap()

5.2. Deleting from a Heap: popHeap()
• Deletion is normally restricted to the root only
  – remove the maximum element (in a max heap)
• To erase the root of an n-element heap:
  – exchange the root with the last element (the one at index n-1); delete the moved root
  – filter (sift) the new root down to its correct position in the heap by calling adjustHeap()

Deletion Example for a Max Heap
• Delete 63
  – exchange with 18; remove 63
  – filter down 18 to correct position

  [Figure: the heap after the exchange, with 18 at the root]
  index:   0   1   2   3   4   5   6   7   8   9
  value:  18  30  40  10  25   8  38   5   3  (63, removed)

• Move 18 down:
  – smaller than 30 and 40; swap with 40
• Move 18 down:
  – smaller than 38; swap with 38
• Stop since 18 is now a leaf node.

Filter (Sift) Down a Max Heap: adjustHeap()
• Move root value down the tree:
  – compare value with its two children
  – if value < a child then heap order is wrong
  – select largest child and swap with value
  – repeat algorithm but with new child
  – continue until value ≥ both children or at a leaf

adjustHeap()
Depending on the comparator, adjustHeap() can work with max or min heaps.
  // filter a value down the heap
  public static <T> void adjustHeap(T[] arr, int first, int last, Comparator<? super T> comp)
  {
    int currentPos = first;   // start at first
    T value = arr[first];     // filter value down the heap

    // compute the left child index (2*i + 1);
    // the index of the right child is childPos+1
    int childPos = 2*currentPos + 1;

    // scan down path of children
    while (childPos < last) {
      // compare the two children; select bigger
      if ((childPos+1 < last) &&
          comp.compare(arr[childPos+1], arr[childPos]) < 0)
        childPos = childPos + 1;   // arr[childPos+1] comes first in comp order

      // compare selected child with value
      if (comp.compare(arr[childPos], value) < 0) {
        // swap child and value
        arr[currentPos] = arr[childPos];
        arr[childPos] = value;

        // update indices to continue the scan
        currentPos = childPos;
        childPos = 2*currentPos + 1;   // left child index
      }
      else   // value in right position
        break;
    }
  }  // end of adjustHeap()

Deletion method: popHeap()
(see the start of 5.2 for more details)
• Exchange the root with the last value in the heap (arr[last-1])
• Call adjustHeap() to filter value down heap
  – heap now has index range [0, last-1)
• Return the original root value

  // delete the maximum (or minimum) element in the heap
  // and return its value
  public static <T> T popHeap(T[] arr, int last, Comparator<? super T> comp)
  {
    // value that is removed from the heap
    T temp = arr[0];

    // exchange last element in heap with the root value
    arr[0] = arr[last-1];
    arr[last-1] = temp;

    // filter the value down over the range [0, last-1)
    adjustHeap(arr, 0, last-1, comp);

    return temp;
  }

5.3.
Using Heaps

  import ds.util.Heaps;
  import ds.util.Greater;
  import ds.util.Less;

  public class UsingHeaps
  {
    public static void main(String[] args)
    {
      // integer array, and arrA and arrB heaps
      Integer[] intArr = {15, 29, 52, 17, 21, 39, 8};
      Integer[] heapArrA = new Integer[intArr.length];
      Integer[] heapArrB = new Integer[intArr.length];

      // comparators for maximum and minimum heaps
      Greater<Integer> gtComp = new Greater<Integer>();
      Less<Integer> lessComp = new Less<Integer>();

      // load intArr into heapArrA to form a maximum heap
      // and into heapArrB to form a minimum heap
      for (int i = 0; i < intArr.length; i++) {
        Heaps.pushHeap(heapArrA, i, intArr[i], gtComp);     // max
        Heaps.pushHeap(heapArrB, i, intArr[i], lessComp);   // min
      }

      // print heapArrA
      System.out.println("Display maximum heap:");
      System.out.println(
            Heaps.displayHeap(heapArrA, heapArrA.length, 2));

      // draw heapArrB before popHeap()
      Heaps.drawHeap(heapArrB, heapArrB.length, 2);

      // pop minimum value
      Integer minObj = Heaps.popHeap(heapArrB, heapArrB.length, lessComp);
      System.out.println("\nMinimum value is " + minObj);

      // draw heapArrB after popHeap();
      // the heap now occupies the index range [0, heapArrB.length-1)
      Heaps.drawHeap(heapArrB, heapArrB.length-1, 2);
    }  // end of main()
  }  // end of UsingHeaps class

Execution

  [Figure: program output. displayHeap() shows the max heap heapArrA as text; drawHeap() shows the minimum heap heapArrB before and after popHeap() removes 8]

5.4. Complexity of Heap Operations
• A heap stores elements in an array-based complete tree.
• pushHeap() reorders elements in the tree by moving up the path of parents.
• adjustHeap() reorders elements in the tree by moving down the path of the children.
  – their cost depends on path length
• Assuming the heap has n elements, the maximum length for a path between a leaf node and the root is ⌊log2 n⌋
  • since the tree is balanced
• The runtime efficiency of the algorithms is O(log2 n)

5.5.
Heapifying: makeHeap()   (O(n)!)
• Transforming an array into a heap is called "heapifying the array".
• Turn an n-element array into a heap by filtering down each parent in the tree
  – begin with the last parent at index (n-2)/2
  – end with the root node at index 0

Max Heap Creation

  Integer[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19};

  [Figures: arr drawn as a tree after each adjustment. The grey nodes are the parents.]
  Adjust in order: 50, 30, 17, 12, 9

Depending on the comparator, makeHeap() can create max or min heaps.

  // arrange array elements into a heap
  public static <T> void makeHeap(T[] arr, Comparator<? super T> comp)
  {
    int lastPos = arr.length;        // heap size
    int heapPos = (lastPos - 2)/2;   // the index of the last parent

    // filter down every parent in order
    // from last parent up to root
    while (heapPos >= 0) {
      adjustHeap(arr, heapPos, lastPos, comp);
      heapPos--;
    }
  }  // end of makeHeap()

6. Sorting with a Max Heap: heapSort()
• If the array is a maximum heap, it has an efficient sorting algorithm:
  – For each iteration i, the largest element is arr[0].
  – Exchange arr[0] with arr[i] and then reorder the array so that elements in the index range [0, i) are a heap.
  – This is done by popHeap(), which is O(log2 n)

Max Heap Sort

  [Figures: the array after each popHeap() step]
  the max heap sort is into ascending order

Heapsort
• Heap sort is a modified version of selection sort for an array that is a heap.
  – for each i = n, n-1, ..., 2, call popHeap(), which pops arr[0] from the heap and assigns it to index i-1.
• A maximum heap is sorted into ascending order.
• A minimum heap is sorted into descending order.

Depending on the comparator, heapSort() can work with max or min heaps.

  public static <T> void heapSort(T[] arr, Comparator<?
                                     super T> comp)
  {
    Heaps.makeHeap(arr, comp);   // "heapify" arr[]

    int n = arr.length;

    // iteration through arr[n-1] ... arr[1]
    for (int i = n; i > 1; i--) {
      // popHeap() moves largest (smallest) to arr[i-1]
      Heaps.popHeap(arr, i, comp);
    }
  }  // end of heapSort()

Example

  Integer[] arr = {7, 1, 9, 0, 8, 2, 4, 3, 6, 5};

  // make a max heap
  Arrays.heapSort(arr, new Greater<Integer>() );
  System.out.println("Sort ascending: " + Arrays.toString(arr));

  // make a min heap
  Arrays.heapSort(arr, new Less<Integer>() );
  System.out.println("Sort descending: " + Arrays.toString(arr));

Output:
  Sort ascending: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]     // max heap
  Sort descending: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]    // min heap

• The worst-case running time of makeHeap() is O(n), not O(n log2 n).
• During the second phase of the heap sort, popHeap() executes n-1 times. Each operation has efficiency O(log2 n).
• The worst-case complexity of the heap sort is O(n) + O(n log2 n) = O(n log2 n).

7. API for the Heaps Class

class Heaps    (ds.util)
Static Methods
  static <T> void adjustHeap(T[] arr, int first, int last, Comparator<? super T> comp)
      Filters the array element arr[first] down the heap. Called by makeHeap() to convert an array to a heap, and by popHeap() to restore heap order after a deletion.
  static String displayHeap(Object[] arr, int n, int maxCharacters)
      Returns a string that presents the array elements in the index range [0, n) as a complete binary tree.
  static void drawHeap(Object[] arr, int n, int maxCharacters)
      Provides a graphical display of a heap as a complete binary tree.
  static void drawHeaps(Object[] arr, int n, int maxCharacters)
      Provides a graphical display of a heap as a complete binary tree. Keeps the window open for new frames created by subsequent calls to drawHeap() or drawHeaps().
  static <T> void heapSort(T[] arr, Comparator<?
                                     super T> comp)
      Sorts the array in the order specified by the comparator; if comp is Greater, the array is sorted in ascending order; if comp is Less, the array is sorted in descending order.
  static <T> void makeHeap(T[] arr, Comparator<? super T> comp)
      Converts an array to a heap. At each element that may not satisfy the heap property, the algorithm reorders elements in the subtree by calling adjustHeap().
  static <T> T popHeap(T[] arr, int last, Comparator<? super T> comp)
      The index range [0, last) is a heap. Deletes the optimum element from the heap, stores the deleted value at index last-1, and returns the value. The index range [0, last-1) is a heap with one less element.
  static <T> void pushHeap(T[] arr, int last, T item, Comparator<? super T> comp)
      Inserts item into a heap that consists of the array elements in the index range [0, last). The elements in the index range [0, last+1) become a heap.

8. Priority Queue Collection
o In a priority queue, all the elements have priority values.
o A deletion always removes the element with the highest priority.
o Two types of priority queues:
  – maximum priority queue
    • remove the largest value first
    • what I'll be using
  – minimum priority queue
    • remove the smallest value first

PQueue Interface
o The generic PQueue resembles a queue with the same method names.

interface PQueue<T>    (ds.util)
  boolean isEmpty()
      Returns true if the priority queue is empty and false otherwise.
  T peek()
      Returns the value of the highest-priority item. If empty, throws a NoSuchElementException.
  T pop()
      Removes the highest-priority item from the queue and returns its value. If empty, throws a NoSuchElementException.
  void push(T item)
      Inserts item into the priority queue.
  int size()
      Returns the number of elements in this priority queue.

HeapPQueue Class Example

  // create a max priority queue of Strings
  // (the max priority order for Strings is z --> a)
  HeapPQueue<String> pq = new HeapPQueue<String>();
  pq.push("green");
  pq.push("red");
  pq.push("blue");

  // output the size, and element with highest priority
  System.out.println(pq.size() + ", " + pq.peek());

  // use pop() to empty the pqueue and list elements
  // in decreasing priority order
  while ( !pq.isEmpty() )
    System.out.print( pq.pop() + " ");

Output:
  3, red
  red green blue

Implementing PQueue
• We can use a heap (the HeapPQueue class) to implement the PQueue interface.
• The user can specify either a Greater (max heap) or Less (min heap) comparator
  • this dictates whether deletion removes the maximum or the minimum element from the collection
  • max heap --> maximum priority queue
  • min heap --> minimum priority queue

The HeapPQueue Class

  public class HeapPQueue<T> implements PQueue<T>
  {
    private T[] heapElt;          // the heap of queue elems
    private int numElts;          // number of elements in queue
    private Comparator<T> comp;

    public HeapPQueue()
    // create an empty maximum priority queue
    {
      comp = new Greater<T>();    // so ordered big --> small
      numElts = 0;
      heapElt = (T[]) new Object[10];
    }

    // more methods. . .
  }  // end of HeapPQueue class

  public T peek()
  // return the highest priority item; O(1)
  {
    // check for an empty heap
    if (numElts == 0)
      throw new NoSuchElementException("HeapPQueue peek(): empty queue");

    return heapElt[0];   // return heap root
  }  // end of peek()

  // erase the highest priority item and return it
  public T pop()   // O(log n)
  {
    // check for an empty priority queue
    if (numElts == 0)
      throw new NoSuchElementException("HeapPQueue pop(): empty queue");

    // pop heap and save return value in top
    T top = Heaps.popHeap(heapElt, numElts, comp);
    numElts--;   // heap has one less element
    return top;
  }  // end of pop()

  public void push(T item)
  // insert item into the priority queue; O(log n)
  {
    // if full then double the capacity
    if (numElts == heapElt.length)
      enlargeCapacity();

    // insert item into the heap
    Heaps.pushHeap(heapElt, numElts, item, comp);
    numElts++;
  }  // end of push()

Heapify Analysis (not examinable)
• Heapify performs better than O(n log n) because most of the node adjusting is done over heights less than log n.

  [Figure: a tree of height 3 = log n]
  level:   0   1   2   3
  height:  3   2   1   0

Cost of Heap Building
• adjusting a node costs less than O(log n) most of the time
• The summation converges to 2. Why?

Proof of Convergence
• Start from the sum of a geometric series.
• Take the derivatives of both sides.
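The two proof steps named above can be worked through as follows. This is a reconstruction under an assumption: the summation in question is taken to be the sum of h/2^h over all heights h, which is the series that normally appears in heap-building analyses. The formulas below are the standard derivation and are not copied from the original slides.

```latex
\begin{align*}
\sum_{k=0}^{\infty} x^{k} &= \frac{1}{1-x}, \qquad |x| < 1
  && \text{(sum of a geometric series)} \\
\sum_{k=1}^{\infty} k\,x^{k-1} &= \frac{1}{(1-x)^{2}}
  && \text{(derivative of both sides)} \\
\sum_{k=1}^{\infty} k\,x^{k} &= \frac{x}{(1-x)^{2}}
  && \text{(multiply both sides by } x\text{)} \\
\sum_{k=1}^{\infty} \frac{k}{2^{k}} &= \frac{1/2}{(1/2)^{2}} = 2
  && \text{(set } x = \tfrac{1}{2}\text{)}
\end{align*}
```

Under the same assumption, a heap of n elements has at most about n/2^(h+1) nodes at height h, and filtering one of them down costs O(h), so the total cost is O(n) times a sum bounded by 2, which gives the O(n) bound for makeHeap().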