So far so good, but can we do better? Yes, cheaper by halves...
http://www.cs.miami.edu/~burt/learning/Csc517.101/workbook/cheaperbyhalf.html
Let’s try it out …

CSC317 1

Toward merge sort

What is it going to cost us?
• Divide Left and Right: constant $C_1$
• Sort Left with Insertion Sort: $C_2 (n/2)^2$
• Sort Right with Insertion Sort: $C_2 (n/2)^2$
• Merge: $C_3 n$
• Total: $C_1 + 2C_2 (n/2)^2 + C_3 n$
Have we gained from Insertion Sort ($Cn^2$)?

Another example: find min
Input: array A
Output: minimum value in array A
Cost? $C_1 n$

Is splitting the array in half going to help us? Let’s see:
• Divide Left and Right: constant $C_1$
• Find min Left: $C_2 (n/2)$
• Find min Right: $C_2 (n/2)$
• Merge: $C_3 n$
• Total: $C_1 + 2C_2 (n/2) + C_3 n$
Have we gained from find min on the full array ($Cn$)?

Back to merge sort: If we can split once, can we do it more often?

Merge sort: pseudo code (step-by-step)
Merge-Sort
    If array larger than size 1
        Divide array into Left and Right arrays    // divide
        Merge-Sort(Left array)                     // conquer left; recursive call
        Merge-Sort(Right array)                    // conquer right; recursive call
        Merge sorted Left and Right arrays         // combine

Merge sort: pseudo code (more formal)
[Pseudocode figure not recovered. Surviving annotations: length of array A; // divide (split in half); // conquer left; // conquer right; // combine]
At what step do we have most work? The merging step; otherwise we just split arrays.

Merge sort: run time analysis
Total work in each step:
• Divide: constant
• Combine: $Cn$
• Conquer: recursively solve two subproblems, each of size $n/2$
We’ll write out the recursion as
$T(n) = 2T(n/2) + Cn$
where $2T(n/2)$ is the recursion term and $Cn$ is the combination term. This grows like $n \log_2 n$.
• Compared to insertion sort, is this good or bad?
• And how did we end up here?

What is this?
A recursion tree for $T(n) = 2T(n/2) + Cn$, where the combination cost is $Cn$: the root $T(n)$ costs $Cn$ and has two children, $T(n/2)$ and $T(n/2)$.
Next step in the recursion:
• Root $T(n)$: combination cost $Cn$
• Its two children: combination cost $Cn/2$ each
• Four grandchildren still to expand: $T(n/4)$ each

Let’s keep on going (I mean splitting)!
Each row adds up to how much work?
• Row 0: $Cn$
• Row 1: $Cn/2 + Cn/2 = Cn$
• Row 2: $4 \cdot Cn/4 = Cn$
• Last row: $n$ leaves of cost $C$ each, $Cn$ in total
Cost per level stays the same: $Cn$!

Subproblem size at each level:
• Level 0: $n/2^0$
• Level 1: $n/2^1$
• Level 2: $n/2^2$
• Level k: $n/2^k$

We know that at the last level the subproblems have size 1, i.e. $n/2^k = 1$.
We need to find that level $k$ (the height of the tree):
$n/2^k = 1$
$n = 2^k$
$k = \log_2 n$
This is the number of levels, and each level needs work $Cn$.
Total work: $Cn \log_2 n$

CONCLUSION:
• Merge sort grows as $n \log_2 n$
• Insertion sort: $n^2$
Is merge sort always faster? Any disadvantages?

Correctness and loop invariants
• How do we know that an algorithm is correct (i.e. always gives the right answer)?
• We use loop invariants (what does that mean?).
• Invariant = something that does not change
• Loop invariant = a property of the algorithm that holds before every iteration of the loop
• Usually it is the property we would like to prove about the algorithm!
• Intuitive, but we would like to state it mathematically

Loop invariant example: insertion sort
Question: What invariant property makes this algorithm correct?
Answer: Before each iteration of the for loop, the elements thus far are sorted.
Next question: Can we state that mathematically?
Insertion sort loop invariant: at the start of each iteration of the for loop, A[1,...,j-1] consists of the elements originally in A[1,...,j-1], but in sorted order.

Can we prove this? 
3 steps:
• Initialization: The invariant is true prior to the first iteration of the loop (base case).
• Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration (like an induction step).
• Termination: When the loop terminates, the invariant gives a useful property that shows the algorithm is correct.

Insertion sort pseudocode:
for j = 2 to n
    key = A[j]
    Insert key into sorted array A[1,...,j-1] by comparing and swapping into correct position

Initialization: When j = 2, A[1,...,j-1] is just the single element A[1], which is the original element in A[1] and is trivially sorted.

Maintenance: Suppose the invariant is true at the start of iteration j, i.e. A[1,...,j-1] is sorted. It might not be true partway through the loop body, but the insert step makes sure the array remains sorted, by pairwise swaps, so the invariant is true again before the next iteration.
• If A[1,...,j-1] is sorted before an iteration of the loop, then for key = A[j] we pairwise swap it into its correct position, so now A[1,...,j] is also sorted.
• Also, A[1,...,j-1] includes elements originally in A[1,...,j-1]. 
Then A[1,...,j] includes those elements and the element A[j], so it must include the elements originally in A[1,...,j].

Termination: When the loop terminates (i.e. j = n+1), the invariant says that A[1,...,j-1] is in sorted order. Since j-1 = n, this is A[1,...,n], the entire array.

Loop invariants example: find min
Animation example (Burt Rosenberg)
http://www.cs.miami.edu/~burt/learning/Csc517.101/workbook/findmin.html
Input: array A
Output: minimum value in array A
Question: What invariant would make this algorithm correct?
Answer: At each point of the algorithm, the current min is known.
More formally …
Loop invariant: At the start of each iteration of the for loop, min is the smallest element in A[1,...,i-1].

Loop invariant of merge
Loop invariant: At the start of each iteration of the for loop, A[p,...,k-1] contains the k-p smallest elements of L and R, in sorted order. Also, L[i] and R[j] are the smallest elements of their arrays not yet copied back into A.
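
The merge loop invariant can be checked mechanically. Below is a minimal Python sketch, not the course's own code: the names merge, merge_sort and expected are chosen here for illustration, indices are 0-based (the slides use 1-based), and the merge uses explicit bounds checks in place of whatever end-of-array handling the original pseudocode used. The assert statements are there only to illustrate the invariant; they cost extra time and would be removed before measuring running time.

```python
def merge(A, p, q, r):
    """Merge the sorted runs A[p..q] and A[q+1..r] (inclusive, 0-based) in place."""
    L = A[p:q + 1]
    R = A[q + 1:r + 1]
    # Reference answer, used only to check the invariant (illustration only).
    expected = sorted(L + R)
    i = j = 0
    for k in range(p, r + 1):
        # Loop invariant: A[p..k-1] holds the k-p smallest elements
        # of L and R, in sorted order.
        assert A[p:k] == expected[:k - p]
        # Copy from L if R is used up, or if L's next element is not larger.
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
    # Termination: the loop ended with k = r+1, so A[p..r] is sorted.
    assert A[p:r + 1] == expected


def merge_sort(A, p=0, r=None):
    """Sort A[p..r] in place and return A."""
    if r is None:
        r = len(A) - 1
    if p < r:                    # array larger than size 1
        q = (p + r) // 2         # divide (split in half)
        merge_sort(A, p, q)      # conquer left
        merge_sort(A, q + 1, r)  # conquer right
        merge(A, p, q, r)        # combine
    return A
```

For example, merge_sort([5, 2, 4, 7, 1, 3, 2, 6]) returns [1, 2, 2, 3, 4, 5, 6, 7], and no assertion fails along the way, which is what the invariant predicts.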