CS6045: Advanced Algorithms
Sorting Algorithms

Sorting
• Input: a sequence of numbers a1, a2, ..., an
• Output: a sorted sequence with a1 ≤ a2 ≤ ... ≤ an

Insertion Sort
[Pseudocode figure; its step annotations, in order:]
    // next current
    // go left
    // find place for current
    // shift sorted right
    // go left
    // put current in place

Example
[figure]

Correctness
• Which elements are in sorted order after each iteration?
• Loop invariant: at the start of each iteration of the outer loop, the subarray A[1 … j-1] consists of the elements originally in A[1 … j-1], but in sorted order.
• To use a loop invariant to prove correctness, we must show three things about it:
  – Initialization: It is true prior to the first iteration of the loop.
  – Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
  – Termination: When the loop terminates, the invariant (usually along with the reason that the loop terminated) gives us a useful property that helps show that the algorithm is correct.

Insertion Sort Correctness
• Initialization: Just before the first iteration, j = 2. The subarray A[1 … j-1] is the single element A[1], which is the element originally in A[1], and it is trivially sorted.
• Maintenance: To be precise, we would need to state and prove a loop invariant for the "inner" while loop. Rather than getting bogged down in another loop invariant, we instead note that the body of the inner while loop works by moving A[j-1], A[j-2], A[j-3], and so on, by one position to the right until the proper position for key (which has the value that started out in A[j]) is found. At that point, the value of key is placed into this position. The subarray A[1 … j] then consists of the elements originally in A[1 … j], but in sorted order, so incrementing j for the next iteration preserves the invariant.
• Termination: The outer for loop ends when j > n, which occurs when j = n + 1. Therefore, j - 1 = n. Plugging n in for j - 1 in the loop invariant, the subarray A[1 … n] consists of the elements originally in A[1 … n], but in sorted order. In other words, the entire array is sorted.
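The loop analyzed above can be sketched in Python as follows. This is a minimal illustration, not the slides' own pseudocode: it assumes 0-based indexing (the slides use 1-based A[1 … n]), and the function name insertion_sort is an illustrative choice. The comments reuse the step annotations from the pseudocode figure.

```python
def insertion_sort(a):
    """Sort the list a in place and return it.

    0-based sketch: the outer index j runs over 1..n-1 instead of 2..n.
    """
    for j in range(1, len(a)):
        # Invariant at this point: a[0..j-1] holds its original elements, sorted.
        key = a[j]                      # next current
        i = j - 1                       # go left
        while i >= 0 and a[i] > key:    # find place for current
            a[i + 1] = a[i]             # shift sorted right
            i -= 1                      # go left
        a[i + 1] = key                  # put current in place
    return a
```

When the input is already sorted, the while condition fails immediately in every iteration, so only the outer loop contributes to the running time.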
Analyze Algorithm's Running Time
• Depends on
  – input size
  – input quality (e.g., partially ordered)
• Kinds of analysis
  – Worst case (standard)
  – Average case (sometimes)
  – Best case (never)

Asymptotic Analysis
• Ignore machine-dependent constants
• Look at the growth of T(n) as n → ∞
  – Drop lower-order terms
  – Ignore the constant coefficient in the leading term
• O (big-O notation) represents the order of growth

Asymptotic Notations
• Big O: O
  – f = O(g) if f grows no faster than g
  – f / g < some constant, for all sufficiently large n
• Big Omega: Ω
  – f = Ω(g) if f grows no slower than g
  – f / g > some positive constant, for all sufficiently large n
• Big Theta: Θ
  – f = Θ(g) if f has the same growth rate as g
  – some positive constant < f / g < some constant, for all sufficiently large n

Analyze Insertion Sort
[figure]

Insertion Sort Analysis
• Best case: O(n) (input already sorted)
• Worst case: O(n^2) (input in reverse sorted order)
• Average case: O(n^2)

Merge Sort
• Divide (into two equal parts)
• Conquer (solve for each part separately)
• Combine separate solutions
• Merge sort
  – Divide into two equal parts
  – Sort each part using merge sort (recursion!)
  – Merge the two sorted subsequences

Merge Sort
[figures: Example 1, Example 2]

Merging
• Design a merging algorithm that takes O(n) time.
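One possible answer to the merging question, as a hedged Python sketch: two indices walk the sorted inputs once, so the work is linear in the total length. The names merge and merge_sort are illustrative, and this version returns new lists rather than merging in place, which is an assumption made for brevity.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list in O(len(left) + len(right)) time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # <= takes from the left first on ties (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])          # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    """Return a sorted copy of a: divide, sort each half recursively, then merge."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```

For example, merge([1, 3, 5, 8], [2, 4, 6, 7]) yields [1, 2, 3, 4, 5, 6, 7, 8]; each element is appended exactly once.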
Analyze Merge Sort
[Figure: recursion tree with 1 2 3 4 5 6 7 8 at the root, halves 1 3 5 8 and 2 4 6 7, pairs 1 5, 3 8, 4 7, 2 6, and single elements at the leaves; height log n]
• at most n comparisons per level
• log n levels
• total runtime = O(n log n)

Quicksort
• Sorts in place, like insertion sort and unlike merge sort
• Divide into two parts such that
  – every element of the left part ≤ every element of the right part
• Conquer: recursively solve for each part separately
• Combine: trivial, nothing to do

Quicksort(A, p, r)
    if p < r then
        q = Partition(A, p, r)      // divide
        Quicksort(A, p, q-1)        // conquer left
        Quicksort(A, q+1, r)        // conquer right

Divide = Partition

PARTITION(A, p, r)
    // Partition the array from A[p] to A[r] with pivot A[r]
    // Result: all elements <= the original A[r] have index <= i;
    //         the pivot ends at index i + 1
    x = A[r]
    i = p - 1
    for j = p to r - 1
        if A[j] <= x
            i = i + 1
            exchange A[i] with A[j]
    exchange A[i+1] with A[r]
    return i + 1

Loop Invariant
[figure]

Runtime of Quicksort
• Worst case:
  – Partition produces one subproblem with n-1 elements and one with 0 elements
  – O(n^2)
[Figure: recursion tree on 0123456789 that splits off one element per level, down to 8 and 9; depth n]

Runtime of Quicksort
• Best case:
  – every partition splits into (almost) equal parts
  – O(n log n)
• Average case
  – O(n log n)

Randomized Quicksort
• Idea: select a randomly chosen element as the pivot
• Randomized algorithms:
  – include a (pseudo)random-number generator
  – the behavior depends not only on the input but also on the random-number generator
• Simple approach: randomly permute the input
  – same result, but more difficult to analyze

Randomized Quicksort
• Partition around first element: O(n^2) worst case
• Average case: O(n log n)
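The partition scheme and the random pivot idea can be combined as in the following Python sketch. It assumes 0-based inclusive bounds p and r, and the helper names randomized_partition and randomized_quicksort are illustrative. Swapping a randomly chosen element into position r before partitioning is one common way to randomize the pivot; the slides do not show their exact variant.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1                           # a[p..i] holds the elements <= x seen so far
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]     # place the pivot between the two parts
    return i + 1


def randomized_partition(a, p, r):
    """Move a uniformly random element of a[p..r] to position r, then partition."""
    k = random.randint(p, r)            # randint includes both endpoints
    a[k], a[r] = a[r], a[k]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place (the whole list by default) and return a."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)   # divide
        randomized_quicksort(a, p, q - 1)   # conquer left
        randomized_quicksort(a, q + 1, r)   # conquer right
    return a
```

With a random pivot, no fixed input (such as an already sorted array) reliably triggers the O(n^2) behavior; the worst case still exists but depends on the random choices rather than on the input order.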