CS6045: Advanced Algorithms
Sorting Algorithms

Sorting
• Input: a sequence of numbers a1, a2, ..., an
• Output: a sorted sequence a1 ≤ a2 ≤ ... ≤ an

Insertion Sort

    InsertionSort(A, n) {
      for i = 2 to n {
        key = A[i]                          // next current
        j = i - 1                           // go left
        while (j > 0) and (A[j] > key) {    // find place for current
          A[j+1] = A[j]                     // shift sorted right
          j = j - 1                         // go left
        }
        A[j+1] = key                        // put current in place
      }
    }

An Example: Insertion Sort
Trace of InsertionSort(A, 4) on A = 30 10 40 20 (positions 1 2 3 4). Each row shows the state after the listed step; "-" means the value is not yet set or undefined.

    A[1..4]        i  j  key  A[j]  A[j+1]   step
    30 10 40 20    -  -  -    -     -        initial array
    30 10 40 20    2  1  10   30    10       key = A[2]; j = i - 1
    30 30 40 20    2  1  10   30    30       A[2] = A[1] (shift right)
    30 30 40 20    2  0  10   -     30       j = j - 1; loop ends because j = 0
    10 30 40 20    2  0  10   -     10       A[1] = key
    10 30 40 20    3  2  40   30    40       key = A[3]; A[2] > key is false, nothing moves
    10 30 40 20    4  3  20   40    20       key = A[4]; j = i - 1
    10 30 40 40    4  3  20   40    40       A[4] = A[3] (shift right)
    10 30 40 40    4  2  20   30    40       j = j - 1
    10 30 30 40    4  2  20   30    30       A[3] = A[2] (shift right)
    10 30 30 40    4  1  20   10    30       j = j - 1; A[1] > key is false
    10 20 30 40    4  1  20   10    20       A[2] = key

Done!

Correctness
• Which elements are in sorted order after each iteration?
• Loop invariant: at the start of each iteration of the outer for loop, the subarray A[1 … j-1] consists of the elements originally in A[1 … j-1], but in sorted order.
• Note: in the correctness argument, j denotes the index of the outer for loop (written i in the pseudocode above).

Correctness
• To use a loop invariant to prove correctness, we must show three things about it:
  – Initialization: It is true prior to the first iteration of the loop.
  – Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
  – Termination: When the loop terminates, the invariant (usually along with the reason that the loop terminated) gives us a useful property that helps show that the algorithm is correct.

Insertion Sort Correctness
• Initialization: Just before the first iteration, j = 2. The subarray A[1 … j-1] is the single element A[1], which is the element originally in A[1], and it is trivially sorted.
• Maintenance: To be precise, we would need to state and prove a loop invariant for the “inner” while loop. Rather than getting bogged down in another loop invariant, we instead note that the body of the inner while loop works by moving A[j-1], A[j-2], A[j-3], and so on, by one position to the right until the proper position for key (which has the value that started out in A[j]) is found.
At that point, the value of key is placed into this position.
• Termination: The outer for loop ends when j > n, which occurs when j = n + 1. Therefore, j - 1 = n. Plugging n in for j - 1 in the loop invariant, the subarray A[1 … n] consists of the elements originally in A[1 … n], but in sorted order. In other words, the entire array is sorted.

Analyze Algorithm’s Running Time
• Depends on
  – input size
  – input quality (for example, partially ordered input)
• Kinds of analysis
  – Worst case (standard)
  – Average case (sometimes)
  – Best case (never)

Asymptotic Analysis
• Ignore machine-dependent constants
• Look at the growth of T(n) as n → ∞
  – Drop lower-order terms
  – Ignore the constant coefficient in the leading term
• O (big-O notation) represents the order of growth

Asymptotic Notations
• Big O: O
  – f = O(g) if f grows no faster than g
  – f / g < some constant
• Big Omega: Ω
  – f = Ω(g) if f grows no slower than g
  – f / g > some constant
• Big Theta: Θ
  – f = Θ(g) if f has the same growth rate as g
  – some constant < f / g < some other constant

Time Analysis of Insertion Sort
• How many times will the inner while loop execute?

    Statement                                Effort
    InsertionSort(A, n) {
      for i = 2 to n {                       c1n
        key = A[i]                           c2(n-1)
        j = i - 1                            c3(n-1)
        while (j > 0) and (A[j] > key) {     c4T
          A[j+1] = A[j]                      c5(T-(n-1))
          j = j - 1                          c6(T-(n-1))
        }                                    0
        A[j+1] = key                         c7(n-1)
      }                                      0
    }

• T = t_2 + t_3 + … + t_n, where t_i is the number of while-condition evaluations during the i-th iteration of the for loop

Analyzing Insertion Sort
• T(n) = c1n + c2(n-1) + c3(n-1) + c4T + c5(T - (n-1)) + c6(T - (n-1)) + c7(n-1)
       = c8T + c9n + c10
• What can T be?
  – Best case: the inner loop body is never executed
    • t_i = 1, so T = n - 1 and T(n) is a linear function
  – Worst case: the inner loop body is executed for all previous elements
    • t_i = i, so T = n(n+1)/2 - 1 and T(n) is a quadratic function
  – Average case
    • ???
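The quantity T = t_2 + … + t_n can be checked empirically. The following is a minimal Python sketch, not part of the original slides: the function name `insertion_sort_count` is a hypothetical helper, and the array is 0-based, so the slide's i = 2..n corresponds to indices 1..n-1 here. It counts every evaluation of the while condition, including the final failing one.

```python
def insertion_sort_count(values):
    """Sort a copy of `values` and return (sorted_list, T).

    T is the total number of while-condition evaluations,
    i.e. the sum t_2 + ... + t_n from the analysis.
    """
    a = list(values)
    total_tests = 0
    for i in range(1, len(a)):          # slide's i = 2..n, shifted to 0-based
        key = a[i]
        j = i - 1
        while True:
            total_tests += 1            # one evaluation of the while condition
            if not (j >= 0 and a[j] > key):
                break
            a[j + 1] = a[j]             # shift sorted element right
            j -= 1
        a[j + 1] = key                  # put key in place
    return a, total_tests


n = 8
_, best = insertion_sort_count(range(1, n + 1))   # already sorted input
_, worst = insertion_sort_count(range(n, 0, -1))  # reverse-sorted input
print(best, worst)
```

For n = 8 the sorted input gives T = n - 1 = 7 (each t_i = 1), and the reverse-sorted input gives T = n(n+1)/2 - 1 = 35 (each t_i = i), matching the linear and quadratic cases.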
Insertion Sort Analysis
• Best case: O(n)
• Worst case: O(n^2)
• Average case: O(n^2)

Merge Sort
• Divide (into two equal parts)
• Conquer (solve for each part separately)
• Combine separate solutions
• Merge sort
  – Divide into two equal parts
  – Sort each part using merge sort (recursion!)
  – Merge two sorted subsequences

Merge Sort
[Figures: Example 1 and Example 2]

Merging
• Design an algorithm that takes O(n) time:
  – split the array A into L and R
  – add sentinels at the end of arrays L and R
  – repeatedly find the smaller value from L and R and save it back to array A

Analysis of Merge Sort

    Statement                                Effort
    MergeSort(A, left, right) {              T(n)
      if (left < right) {                    Θ(1)
        mid = floor((left + right) / 2);     Θ(1)
        MergeSort(A, left, mid);             T(n/2)
        MergeSort(A, mid+1, right);          T(n/2)
        Merge(A, left, mid, right);          Θ(n)
      }
    }

• So T(n) = Θ(1) when n = 1, and 2T(n/2) + Θ(n) when n > 1
• So what (more succinctly) is T(n)?

Recurrences
• The expression
  $$T(n) = \begin{cases} c & n = 1 \\ 2T(n/2) + cn & n > 1 \end{cases}$$
  is a recurrence.
  – Recurrence: an equation that describes a function in terms of its value on smaller inputs

Recurrence Examples
  $$s(n) = \begin{cases} 0 & n = 0 \\ c + s(n-1) & n > 0 \end{cases}$$
  $$s(n) = \begin{cases} 0 & n = 0 \\ n + s(n-1) & n > 0 \end{cases}$$
  $$T(n) = \begin{cases} c & n = 1 \\ 2T(n/2) + c & n > 1 \end{cases}$$
  $$T(n) = \begin{cases} c & n = 1 \\ aT(n/b) + cn & n > 1 \end{cases}$$

Solving Recurrences
• Substitution method
• Iteration method
• Master method

Solving Recurrences
• The substitution method (Textbook1 4.1)
  – A.k.a. the “making a good guess method”
  – Guess the form of the answer, then use induction to find the constants and show that the solution works
  – Examples:
    • T(n) = 2T(n/2) + Θ(n) → T(n) = Θ(n lg n)
    • T(n) = 2T(n/2) + n → T(n) = Θ(n lg n)
    • T(n) = 2T(n/2 + 17) + n → T(n) = Θ(n lg n)

Analyze Merge Sort
[Figure: merge tree with log n levels; single elements merge pairwise into 1 5, 3 8, 4 7, 2 6, then into 1 3 5 8 and 2 4 6 7, then into 1 2 3 4 5 6 7 8]
• n comparisons per level
• log n levels
• total runtime = n log n

Solving Recurrences
• Another option is what the book calls the “iteration method”
  – Expand the recurrence
  – Work some algebra to express it as a summation
  – Evaluate the summation
• We will show several examples

Iteration Example 1
  $$s(n) = \begin{cases} 0 & n = 0 \\ c + s(n-1) & n > 0 \end{cases}$$
• s(n) = c + s(n-1)
       = c + c + s(n-2) = 2c + s(n-2)
       = 2c + c + s(n-3) = 3c + s(n-3)
       …
       = kc + s(n-k) = ck + s(n-k)
• So far, for n >= k we have
  – s(n) = ck + s(n-k)
• What if k = n?
  – s(n) = cn + s(0) = cn
• Thus in general
  – s(n) = cn

Iteration Example 2
  $$s(n) = \begin{cases} 0 & n = 0 \\ n + s(n-1) & n > 0 \end{cases}$$
• s(n) = n + s(n-1)
       = n + (n-1) + s(n-2)
       = n + (n-1) + (n-2) + s(n-3)
       = n + (n-1) + (n-2) + (n-3) + s(n-4)
       …
       = n + (n-1) + (n-2) + (n-3) + … + (n-(k-1)) + s(n-k)
• So far, for n >= k we have
  $$s(n) = \sum_{i=n-k+1}^{n} i + s(n-k)$$
• What if k = n?
  $$s(n) = \sum_{i=1}^{n} i + s(0) = \sum_{i=1}^{n} i + 0 = \frac{n(n+1)}{2}$$
• Thus in general
  $$s(n) = \frac{n(n+1)}{2}$$

Iteration Example 3
  $$T(n) = \begin{cases} c & n = 1 \\ 2T(n/2) + c & n > 1 \end{cases}$$
• T(n) = 2T(n/2) + c
       = 2(2T(n/2/2) + c) + c
       = 2^2 T(n/2^2) + 2c + c
       = 2^2 (2T(n/2^2/2) + c) + 3c
       = 2^3 T(n/2^3) + 4c + 3c
       = 2^3 T(n/2^3) + 7c
       = 2^3 (2T(n/2^3/2) + c) + 7c
       = 2^4 T(n/2^4) + 15c
       …
       = 2^k T(n/2^k) + (2^k - 1)c
• So far, for n >= 2^k we have
  – T(n) = 2^k T(n/2^k) + (2^k - 1)c
• What if k = lg n?
  – T(n) = 2^(lg n) T(n/2^(lg n)) + (2^(lg n) - 1)c
         = n T(n/n) + (n - 1)c
         = n T(1) + (n - 1)c
         = nc + (n - 1)c
         = (2n - 1)c

The Master Theorem
• Given: a divide-and-conquer algorithm
  – An algorithm that divides a problem of size n into a subproblems, each of size n/b
  – Let the cost of each stage (i.e., the work to divide the problem plus the work to combine the solved subproblems) be described by the function f(n)
• Then the Master Theorem gives us a cookbook for the algorithm’s running time:

The Master Theorem
• If T(n) = aT(n/b) + f(n), then
  $$T(n) = \begin{cases}
  \Theta\left(n^{\log_b a}\right) & \text{if } f(n) = O\left(n^{\log_b a - \varepsilon}\right) \text{ for some } \varepsilon > 0 \\
  \Theta\left(n^{\log_b a} \log n\right) & \text{if } f(n) = \Theta\left(n^{\log_b a}\right) \\
  \Theta\left(f(n)\right) & \text{if } f(n) = \Omega\left(n^{\log_b a + \varepsilon}\right) \text{ for some } \varepsilon > 0, \text{ AND } a f(n/b) \le c f(n) \text{ for some } c < 1 \text{ and large } n
  \end{cases}$$

Using The Master Method
• T(n) = 9T(n/3) + n
  – a = 9, b = 3, f(n) = n
  – n^(log_b a) = n^(log_3 9) = Θ(n^2)
  – Since f(n) = O(n^(log_3 9 - ε)), where ε = 1, case 1 applies: T(n) = Θ(n^(log_b a)) when f(n) = O(n^(log_b a - ε))
  – Thus the solution is T(n) = Θ(n^2)
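When the driving function is a plain polynomial, f(n) = Θ(n^d), choosing a case reduces to comparing d with log_b a. The following is a minimal Python sketch under that assumption only; the function name `master_case` and its return format are hypothetical, not from the slides. For polynomial f(n) in case 3, the regularity condition a f(n/b) ≤ c f(n) holds automatically, because a / b^d < 1 exactly when d > log_b a.

```python
import math


def master_case(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n**d).

    Assumes f(n) is a plain polynomial Theta(n**d); the general
    theorem also covers other f(n), which this sketch does not.
    Returns (case_number, growth_rate_as_text).
    """
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("need a >= 1, b > 1, d >= 0")
    crit = math.log(a) / math.log(b)    # critical exponent log_b a
    # Compare with a tolerance: log(9)/log(3) may not be exactly 2.0.
    if math.isclose(d, crit, rel_tol=1e-9, abs_tol=1e-9):
        return 2, f"Theta(n^{d:g} log n)"
    if d < crit:
        return 1, f"Theta(n^{crit:g})"
    return 3, f"Theta(n^{d:g})"


print(master_case(9, 3, 1))   # the example T(n) = 9T(n/3) + n
print(master_case(2, 2, 1))   # merge sort: T(n) = 2T(n/2) + Theta(n)
print(master_case(2, 2, 0))   # T(n) = 2T(n/2) + c
```

The first call reproduces the worked example (case 1, Θ(n^2)); the second gives case 2 for merge sort, Θ(n log n); the third gives case 1 with a linear result, agreeing with the (2n - 1)c obtained by the iteration method.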