CSE 310 Review
2/17/2016
Patrick Michaelson, Ian Nall

Topics
• Divide and Conquer Algorithms
  • Base concept
  • Merge sort
  • Quicksort
• Analysis of Algorithms
  • Recurrence
  • Iterative
• Insertion Sort
  • Correctness through loop invariants
• Heaps
  • Priority queue
• Decision Trees

Divide and Conquer
• Split the problem into smaller pieces to make it easier to solve
• Usually done through recursion
• Makes sorting problems easier to solve
• At each level:
  • Divide – break the problem into smaller subproblems
  • Conquer – solve the subproblems recursively, bringing them down to a trivial size
  • Combine – put the solutions of the subproblems back together into a solution of the original problem
• Can be done iteratively, but the bookkeeping becomes much more involved

Merge-Sort
• Divide: Split the problem from n elements into two subproblems of n/2 elements
• Conquer: Sort the subproblems recursively using merge sort; a subproblem of size 1 is already sorted, so size 1 is the base case
• Combine: Merge the two sorted subproblems to produce one sorted sequence of n elements

Example (figure): 5 1 4 7 6 3 9 2 is repeatedly halved down to single elements (the dividing process), then merged back up into sorted runs – 1 4 5 7 and 2 3 6 9 – and finally 1 2 3 4 5 6 7 9 (the merging process)

Merge-Sort Pseudo Code
• MERGE-SORT(A, p, r)
  • if p < r then
    • q = floor((p + r) / 2)
    • MERGE-SORT(A, p, q)
    • MERGE-SORT(A, q+1, r)
    • MERGE(A, p, q, r)

Merge Pseudo Code
• MERGE(A, p, q, r)
  • B[p..r] = A[p..r] // a temporary array to hold the data
  • i = p
  • j = q + 1
  • z = p
  • while i ≤ q and j ≤ r do
    • if B[i] ≤ B[j] then
      • A[z] = B[i]
      • i++
    • else
      • A[z] = B[j]
      • j++
    • z++
  • if i ≤ q then
    • A[z..r] = B[i..q] // copy the leftovers of the left run
  • (any leftovers of the right run are already in their final positions)

Analysis
• T(1) = constant
• If n > 1: T(n) = 2T(n/2) + cn + b
  • The 2T(n/2) term comes from the two recursive merge-sort calls on the halves
  • The constant b covers the time steps for finding the middle of the array
  • The cn term comes from MERGE, which does linear-time array work

Quick Sort
• Divide: The array is rearranged into two non-empty subarrays, A[p..q] and A[q+1..r], out of
A[p..r], such that each element of the left subarray A[p..q] is less than or equal to every element of the right subarray A[q+1..r] (the index q is determined by the partition function)
• Conquer: Sort the subarrays A[p..q] and A[q+1..r] (the left and right arrays, respectively) recursively
• Combine: Nothing to do – the subarrays are sorted in place, so A[p..r] is already sorted and no further work is needed

Quick Sort Example
• Pick an index of the array as the pivot: 10, 12, 7, 2, 15, 6
• Our pivot choice will always be the last index
• 6 is the pivot, so everything less than 6 goes to its left and everything greater than or equal to 6 goes to its right
  • 2 | 6 | 10, 12, 7, 15
• We then take the last index of each subarray as its pivot
  • 2 | 6 | 10, 12, 7 | 15
• We continue this until all of the cells are single-sized, at which point the array is back together in sorted order
  • 2 | 6 | 7 | 10, 12 | 15
  • 2 | 6 | 7 | 10 | 12 | 15
  • 2, 6, 7, 10, 12, 15

Quicksort Pseudo Code
• QUICKSORT(A, p, r)
  • if p < r then
    • q = PARTITION(A, p, r)
    • QUICKSORT(A, p, q-1)
    • QUICKSORT(A, q+1, r)

Partition
• The partition scheme can be set up arbitrarily by the designer
• The pivot could be the median of the array, the first element, or the last element – it doesn't matter for correctness
• Some choices make the logic harder to follow than others, though
• In practice:
  • Choose A[r] as the pivot element
  • Scan from the left until an element ≥ A[r] is found
  • Scan from the right until an element < A[r] is found
  • Swap these two elements
  • Continue until the scan pointers meet
• Swap A[r] with the element at the leftmost position of the right sublist (the element that both pointers end up pointing at)

Partition Subroutine
• PARTITION(A, p, r)
  • pivot = A[r]
  • i = p
  • j = r - 1
  • while TRUE do
    • while i < r and A[i] < pivot do i++   // scan left-to-right for an element ≥ pivot
    • while j > p and A[j] ≥ pivot do j--   // scan right-to-left for an element < pivot
    • if i < j then
      • exchange A[i] and A[j]
      • i++; j--
    • else
      • exchange A[i] and A[r]
      • return i

Analysis
• Total time complexity (balanced case):
  • T(n) = Θ(1) if n ≤ 1
  • T(n) = 2T(n/2) + Θ(n) otherwise
• T(n) = Θ(n log n)

Analysis of Algorithms
• Many forms of algorithms, with different methods of analysis:
  • Recurrence
  • Recursion
  • Iterative

Recurrence
• Multiple ways to solve recurrence relations:
  • Recursion tree
  • Substitution
  • Master method

Recursion Tree – Merge-Sort example
• Expand the recurrence down to the base case (n = 1 for merge sort):
  • T(n) = 2T(n/2) + cn + b = 2(2T(n/4) + c(n/2) + b) + cn + b = … down to T(1), which is easier to see in tree form
• Level costs (k = number of nodes at the level):
  • k = 1 = 2^0: one node of cost cn + b
  • k = 2 = 2^1: two nodes of cost c(n/2) + b each, total 2(c(n/2) + b)
  • k = 4 = 2^2: four nodes of cost c(n/4) + b each, total 4(c(n/4) + b)
  • In general, a level with k nodes costs k(c(n/k) + b)
  • At the leaf level k becomes n = 2^h (h is the height of the tree); each T(1) is constant, total n(c(n/n) + b)
• Total: Σ_{k=1,2,4,…,n} k(c(n/k) + b) = Σ_{k=1,2,4,…,n} (cn + kb) = cn · Σ_{k=1,2,4,…,n} 1 + b · Σ_{k=1,2,4,…,n} k
  (we can pull factors independent of k out of the summation)
• Σ_{k=1,2,4,…,n} 1 = log2 n (how many times do we add 1? once per level: n = 2^h, so h = log2 n)
• Σ_{k=1,2,4,…,n} k = 1 + 2 + 4 + 8 + … + (n/4) + (n/2) + n = 2n - 1 (you can try n = 16 to verify this)
• Thus, Total = cn·log2 n + b(2n - 1) = Θ(n log2 n)
• Therefore merge sort is Θ(n log2 n), or by convention (omitting the base) Θ(n log n)
• Merge sort is more efficient than insertion sort for large enough inputs.
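The merge-sort pseudocode analyzed above can be rendered as runnable code. The following is a minimal Python sketch of the same divide/conquer/combine steps (using 0-based indices rather than the 1-based pseudocode; not part of the original slides):

```python
def merge_sort(A, p=0, r=None):
    """Sort A[p..r] in place, mirroring MERGE-SORT(A, p, r)."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = (p + r) // 2            # divide: find the middle
        merge_sort(A, p, q)         # conquer: sort left half
        merge_sort(A, q + 1, r)     # conquer: sort right half
        merge(A, p, q, r)           # combine: merge the sorted halves

def merge(A, p, q, r):
    """Merge sorted runs A[p..q] and A[q+1..r] via a temporary copy B."""
    B = A[p:r + 1]                  # temporary array, like B[p..r] = A[p..r]
    i, j = 0, q - p + 1             # heads of the left and right runs in B
    for z in range(p, r + 1):
        # take from the left run if the right run is exhausted, or if the
        # left head is smaller or equal (keeps the sort stable)
        if j > r - p or (i <= q - p and B[i] <= B[j]):
            A[z] = B[i]; i += 1
        else:
            A[z] = B[j]; j += 1

data = [5, 1, 4, 7, 6, 3, 9, 2]     # the example sequence from the slides
merge_sort(data)
print(data)                          # [1, 2, 3, 4, 5, 6, 7, 9]
```

Copying each run's leftovers separately (as the pseudocode does) also works; the sentinel-free check above simply folds that step into the main loop.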
Substitution
• There are two steps to this process:
  • Guess what form the solution will take
  • Use mathematical induction to find the constants and show that the solution works
• Steps of mathematical induction:
  • Prove the base case (n = 1 for merge sort); if it holds, our guessed solution works at least that far
  • Prove that if the guess holds for n = k, it also holds for the next possible value of n
  • If both steps pass, we know the guess holds for every possible n

Merge-Sort substitution example
• T(n) = a if n = 1; T(n) = 2T(n/2) + cn + b if n > 1 (a, b, c are constants)
• A guess based on the form of the recurrence: T(n) = cn·log2 n + 2bn - b (for n a power of 2)
• Base case: n = 2
  • From the guess: T(2) = 2c·log2 2 + 4b - b = 2c + 3b
  • From the given recurrence: T(2) = 2T(1) + 2c + b = 2c + 3b if T(1) = b
• Induction: assume T(k) = ck·log2 k + 2bk - b. Then
  • T(2k) = 2T(k) + c(2k) + b (by the given recurrence)
  • = 2(ck·log2 k + 2bk - b) + c(2k) + b
  • = c(2k·log2 k) + 4bk - 2b + c(2k·log2 2) + b   (since log2 2 = 1)
  • = c(2k)(log2 k + log2 2) + 2b(2k) - b
  • = c(2k)·log2(2k) + 2b(2k) - b
• So T(n) = cn·log2 n + 2bn - b holds for n = 2k whenever it holds for n = k
• Conclusion: T(n) = cn·log2 n + 2bn - b for n = 2, 4, 8, …, 2^i, …

Master Theorem (Theorem 4.1)
• Suppose that T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is a function of n. Then:
  1. If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log2 n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

Master Theorem – Merge-Sort example
• T(n) = 2T(n/2) + cn + b
• a = 2, b = 2, f(n) = cn + b, log_b a = log2 2 = 1
• Thus f(n) = Θ(n^(log_b a)) = Θ(n).
• The second case applies: T(n) = Θ(n^(log_b a) · log2 n) = Θ(n·log2 n)

Iterative
• A process that is repeated until it reaches a specific goal
• Example: Insertion Sort
• INSERTION-SORT(A) // overall runtime is O(n^2)
  • for i = 2 to A.length do   // runs n - 1 times
    • key = A[i]
    • j = i - 1
    • while j ≥ 1 and A[j] > key do   // shifts up to i - 1 elements, so O(n) per pass
      • A[j+1] = A[j]
      • j--
    • A[j+1] = key
  • return A

Correctness
• Loop Invariant – a condition (among program variables) that is necessarily true immediately before and immediately after each iteration of a loop. (Note that this says nothing about its truth or falsity partway through an iteration.)
• Initialization
• Maintenance
• Termination

Heaps
• The binary heap data structure is an array object, which can be viewed as a nearly complete binary tree
• A complete binary tree is a binary tree that is completely filled on all levels except possibly the lowest; e.g., a tree with three levels would have 1 node, then 2 nodes, then somewhere between 1 and 4 nodes at its lowest level
• The lowest level is filled from left to right

Heap visual example
• (Figure: a max-heap stored as the array A = [20, 18, 10, 7, 12, 8, 9, 5, 4, 2, 1, 7], indices 1–12, drawn both as an array and as the corresponding binary tree with 20 at the root.)
• PARENT(i): return floor(i/2) // parent of i in the tree
• LEFT(i): return 2i // left child of i in the tree
• RIGHT(i): return 2i + 1 // right child of i in the tree

Procedures of Heaps
• MAX-HEAPIFY: maintains the heap property (O(log n))
• BUILD-MAX-HEAP: produces a heap from an unordered input array (O(n))
• HEAPSORT: sorts an array in place (O(n log n))
• EXTRACT-MAX and INSERT: allow the heap data structure to be used as a priority queue (O(log n))

MAX-HEAPIFY pseudo code
• MAX-HEAPIFY(A, i) // T(n) ≤ T(2n/3) + Θ(1), so T(n) = O(log n)
  • l = LEFT(i)
  • r = RIGHT(i)
  • if l ≤ A.heap-size and A[l] > A[i] then
    • largest = l
  • else
    • largest = i
  • if r ≤ A.heap-size and A[r] > A[largest] then
    • largest = r
  • if largest ≠ i then
    • exchange A[i] and A[largest]
    •
MAX-HEAPIFY(A, largest) // recurse on the subtree that received the old root

BUILD-MAX-HEAP pseudo code
• BUILD-MAX-HEAP(A)
  • A.heap-size = A.length
  • for i = floor(A.length/2) down to 1 do
    • MAX-HEAPIFY(A, i)

Priority Queue
• Maintains a set of elements we'll call S; each element has an associated value called a key
• Operations:
  • INSERT – inserts an element into the set
  • MAXIMUM – returns the element of S with the largest key
  • EXTRACT-MAX – removes and returns the element of S with the largest key
  • INCREASE-KEY – increases the value of an element's key to a new value
• Can use a linked list or a heap to create a priority queue

Priority Queue heap-based pseudo code
• HEAP-MAXIMUM(A)
  • return A[1]
• HEAP-EXTRACT-MAX(A) // running time O(log n)
  • if A.heap-size < 1 then
    • error: no element to extract
  • else
    • max = A[1]
    • A[1] = A[A.heap-size]
    • A.heap-size--
    • MAX-HEAPIFY(A, 1)
    • return max
• HEAP-INCREASE-KEY(A, i, key) // running time O(log n)
  • if key < A[i] then
    • error: new key is smaller than current key
  • A[i] = key
  • while i > 1 and A[PARENT(i)] < A[i] do
    • exchange A[i] and A[PARENT(i)]
    • i = PARENT(i)
• MAX-HEAP-INSERT(A, key) // running time O(log n)
  • A.heap-size++
  • A[A.heap-size] = -∞
  • HEAP-INCREASE-KEY(A, A.heap-size, key)

Decision Trees
• A model that shows a comparison-sorting process
• Shows all possible permutations; each input ordering follows exactly one root-to-leaf path
• Leaves correspond to permutations
• Internal nodes represent pairwise comparisons; the root is the first comparison
• An execution of the algorithm corresponds to tracing the path from the root to a leaf

EXAMPLE – decision tree for INSERTION-SORT operating on <a1, a2, a3>
• a1 < a2?
  • yes: a2 < a3?
    • yes: <a1, a2, a3>
    • no: a1 < a3?
      • yes: <a1, a3, a2>
      • no: <a3, a1, a2>
  • no: a1 < a3?
    • yes: <a2, a1, a3>
    • no: a2 < a3?
      • yes: <a2, a3, a1>
      • no: <a3, a2, a1>
• Each of the n! permutations of the elements must appear as a leaf of the tree for the sorting algorithm to sort properly.
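The n!-leaves requirement gives the classic lower bound for comparison sorting: a binary tree with n! leaves has height at least ⌈log2(n!)⌉, which is Ω(n log n). A small numeric check of that bound (an illustrative sketch, not part of the original slides):

```python
import math

def decision_tree_lower_bound(n):
    """Minimum height of a binary decision tree with n! leaves,
    i.e. a lower bound on worst-case comparisons to sort n items."""
    return math.ceil(math.log2(math.factorial(n)))

# For n = 3: 3! = 6 leaves, ceil(log2 6) = 3 comparisons,
# matching the depth of the <a1, a2, a3> decision tree above.
for n in (3, 4, 10):
    print(n, math.factorial(n), decision_tree_lower_bound(n))
```

Since merge sort and heapsort make O(n log n) comparisons in the worst case, they are asymptotically optimal among comparison sorts.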