7. Introduction to Computational Complexity: The Sorting Problem

Foundations of Algorithms, Chapter 7

Chapter 7 introduces “computational complexity analysis”: the analysis of all algorithms that can solve a given problem, rather than of one particular algorithm.

The Sorting Problem

Can we develop a sorting algorithm that has $\Theta(n)$ average-case time complexity?

Two approaches:
1. Try to come up with a $\Theta(n)$ sorting algorithm.
2. Try to prove that it is not possible to develop such an algorithm.

7.1 Computational Complexity

Computational Complexity Analysis
- So far, we have analyzed the complexity of a particular algorithm that solves a given problem.
- But what about a lower bound on the efficiency of all algorithms for a given problem? Establishing such a bound is the task of computational complexity analysis.

Example: Sorting only by comparisons of keys

               Exchange         Quick                Merge
    Worst      $\Theta(n^2)$    $\Theta(n^2)$        $\Theta(n \lg n)$
    Average    $\Theta(n^2)$    $\Theta(n \lg n)$    $\Theta(n \lg n)$

We will see later that $\Omega(n \lg n)$ is a lower bound for algorithms that sort only by comparing keys.

7.2 Insertion Sort and Selection Sort

Insertion Sort - an algorithm that sorts by inserting keys into an existing sorted array.

Method:
1. Assume that the keys in the first (i-1) array slots are sorted, and let x be the key in the ith slot.
2. Compare x with S[i-1], S[i-2], etc. until a key smaller than x is found. Let j be the index of that slot (j = 0 if no such key exists).
3. Move the keys S[j+1] through S[i-1] to S[j+2] through S[i], and insert x in the (j+1)st slot.
4. Repeat this process for i = 2 through i = n.

Example (the circled key is the key x being inserted in that pass):

    i=2:     2 ⑧ 5 9 4 7 3 1
    i=3:     2 8 ⑤ 9 4 7 3 1
    i=4:     2 5 8 ⑨ 4 7 3 1
    i=5:     2 5 8 9 ④ 7 3 1
    i=6:     2 4 5 8 9 ⑦ 3 1
    i=7:     2 4 5 7 8 9 ③ 1
    i=8:     2 3 4 5 7 8 9 ①
    sorted:  1 2 3 4 5 7 8 9

Implementation (index and keyType are the text's placeholder types for array indices and keys; S is indexed from 1):

    public static void insertionSort(int n, keyType[] S) {
        index i, j;
        keyType x;
        for (i = 2; i <= n; i++) {
            x = S[i];                      // key to insert into the sorted prefix S[1..i-1]
            j = i - 1;
            while (j > 0 && S[j] > x) {    // basic operation: compare S[j] with x
                S[j+1] = S[j];             // shift the larger key one slot to the right
                j--;
            }
            S[j+1] = x;
        }
    }

Worst Case Time Complexity:
Basic Operation: comparison of S[j] with x
Input Size: n, the number of keys

Note: The worst case occurs when the input is sorted in reverse order. In that case, the comparison is done (i-1) times for a given i ($2 \le i \le n$), so

$$W(n) = \sum_{i=2}^{n} (i-1) = \frac{n(n-1)}{2}$$

Average Case Time Complexity:
For a given i, there are i slots where x can be inserted.

    Slot where x is inserted    i    i-1    i-2    …    2      1
    # of comparisons            1    2      3      …    i-1    i-1

Since the probability of placing x in slot k is the same for all k, $1 \le k \le i$, the average number of comparisons needed to insert x for a given i is

$$\frac{1 + 2 + 3 + \dots + (i-1) + (i-1)}{i} = \frac{\frac{i(i-1)}{2} + (i-1)}{i} = \frac{i+1}{2} - \frac{1}{i}$$

Thus, the average number of comparisons needed to sort the array is

$$A(n) = \sum_{i=2}^{n} \left( \frac{i+1}{2} - \frac{1}{i} \right) \approx \frac{(n+4)(n-1)}{4} - \ln n$$

7.3 Lower Bounds for Algorithms that remove at most one inversion per comparison

Definition: Inversion - a pair of keys that are in the wrong order.

Example: In (5,8,2,4,3), there are 7 inversions: (5,2), (5,4), (5,3), (8,2), (8,4), (8,3), and (4,3).

Note: After each comparison, insertion sort either moves the key in the jth slot to the (j+1)st slot or does nothing. Thus it removes at most one inversion per comparison.
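The inversion count in the example and the note about insertion sort can be checked with a short sketch. This is an illustrative version under stated assumptions, not the text's code: the class name `InversionDemo` and its method names are hypothetical, keys are plain `int` values, and the array is 0-based rather than 1-based.

```java
import java.util.Arrays;

class InversionDemo {
    // Counts pairs of positions (p, q) with p < q and s[p] > s[q].
    static int countInversions(int[] s) {
        int count = 0;
        for (int p = 0; p < s.length; p++) {
            for (int q = p + 1; q < s.length; q++) {
                if (s[p] > s[q]) {
                    count++;
                }
            }
        }
        return count;
    }

    // Insertion sort on a 0-based array (an assumption; the slides index from 1).
    // Sorts s in place and returns the number of key comparisons performed.
    static int insertionSortComparisons(int[] s) {
        int comparisons = 0;
        for (int i = 1; i < s.length; i++) {
            int x = s[i];
            int j = i - 1;
            while (j >= 0) {
                comparisons++;          // basic operation: compare s[j] with x
                if (s[j] <= x) {
                    break;              // insertion point found
                }
                s[j + 1] = s[j];        // one shift removes exactly one inversion
                j--;
            }
            s[j + 1] = x;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] keys = {5, 8, 2, 4, 3};
        System.out.println("inversions  = " + countInversions(keys));
        System.out.println("comparisons = " + insertionSortComparisons(keys));
        System.out.println("sorted      = " + Arrays.toString(keys));
    }
}
```

Because each comparison removes at most one inversion, the comparison count reported for an input can never be smaller than its inversion count.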
Theorem 7.1
Any algorithm that sorts n distinct keys only by comparisons of keys and removes at most one inversion after each comparison must in the worst case do at least $n(n-1)/2$ comparisons of keys and on the average at least $n(n-1)/4$ comparisons of keys.

Proof of Theorem 7.1

Worst Case:
We just need to show that there is a permutation with $n(n-1)/2$ inversions, since an algorithm that removes at most one inversion per comparison needs at least that many comparisons to sort it. In fact, (n, n-1, …, 2, 1) is such a permutation:

$$(n-1) + (n-2) + \dots + 2 + 1 = \frac{n(n-1)}{2}$$

Average Case:
There are n! permutations for a given n. The average number of inversions in the input is a lower bound on the number of comparisons the algorithm does on average. It can be computed as follows:

$$\frac{1}{n!} \sum_{i=1}^{n!} (\text{number of inversions in the } i\text{th permutation})$$

If a permutation p has k inversions, its transpose (the same keys in reverse order) will have $C(n,2) - k$ inversions. Why? Each of the $C(n,2)$ pairs of keys is in the wrong order in exactly one of p and its transpose.

    Permutation                        # of inversions
    1, 2, 3, 4, …, n-2, n-1, n         0
    1, 2, 3, 4, …, n-2, n, n-1         1
    …
    3, n, 2, 1, …, n-8, n-2, 6         k
    6, n-2, n-8, …, 1, 2, n, 3         C(n,2) - k   (transpose of the row above)
    …
    n-1, n, n-2, …, 4, 3, 2, 1         C(n,2) - 1
    n, n-1, n-2, …, 4, 3, 2, 1         C(n,2) - 0

Thus, a permutation and its transpose together contain $C(n,2)$ inversions. Since there are n!/2 such pairs,

$$\frac{1}{n!} \sum_{i=1}^{n!} (\text{number of inversions in the } i\text{th permutation}) = \frac{C(n,2) \cdot \frac{n!}{2}}{n!} = \frac{C(n,2)}{2} = \frac{n(n-1)}{4}$$

7.6 HeapSort

Heap - An essentially complete binary tree in which the value stored at each node is greater than or equal to the values stored at its children (the heap property).

[Figure: an example heap; keys shown: 67, 33, 16, 52, 31, 45.]

HeapSort
- Given a heap, repeatedly remove the key at the root while maintaining the heap property.
- Put the removed keys in an array, starting from the nth slot and going down to the first slot.

We need to figure out:
1. How to construct the initial heap
2. How to remove the keys while maintaining the heap property

How to construct the initial heap?

1. Given an array S[1..n] of keys, first construct an essentially complete binary tree with S[1] as the root key.

    index    1   2   3   4   5   6   7   8   9   10
    S        2   4   5   3   1   9   6   7   10  8

    level 0:  2
    level 1:  4  5
    level 2:  3  1  9  6
    level 3:  7  10  8

2. Now make the essentially complete binary tree (of depth d) a heap. Do this by converting the level-d subtrees to heaps first, then the level-(d-1) subtrees, and so on until the whole tree becomes a heap.

Example:

First round (level 3): the subtrees rooted at 7, 10, and 8 are single nodes, so each is already a heap.

Second round (level 2): the subtree 3 (children 7, 10) becomes 10 (children 7, 3); the subtree 1 (child 8) becomes 8 (child 1); 9 and 6 are single nodes. Now all level-2 subtrees are heaps.

Third round (level 1): in the left subtree, 4 is sifted below 10 and then below 7, giving root 10 with children 7 (children 4, 3) and 8 (child 1); in the right subtree, 5 (children 9, 6) becomes 9 (children 5, 6). Now all level-1 subtrees are heaps.
Fourth round (level 0): the root key 2 is sifted below 10 and then below 8, where it stops because its only child is 1. The whole tree is now a heap.

Result:

    level 0:  10
    level 1:  8  9
    level 2:  7  2  5  6
    level 3:  4  3  1

    S = 10  8  9  7  2  5  6  4  3  1

How to remove the root key from a heap?
1. After removing the key at the root, move the last key at the bottom level to the root.
2. Then sift that key down until the heap property is restored.

Example: removing 10 from the heap above moves 1, the last key at the bottom level, to the root. Sifting 1 below 9 and then below 6 restores the heap property, and the removed key 10 is stored in S[10].

    level 0:  9
    level 1:  8  6
    level 2:  7  2  5  1
    level 3:  4  3

Implementation of HeapSort (high-level pseudocode in Java syntax; heap, index, and keyType are the text's placeholder types):

    public static void heapSort(int n, heap H, keyType[] S) {
        makeHeap(n, H);
        removeKeys(n, H, S);
    }

    public static void makeHeap(int n, heap H) {
        index i;
        heap Hsub;
        for (i = depth(H) - 1; i >= 0; i--)     // depth(H) is d, the depth of the tree
            for (all subtrees Hsub whose roots have depth i)
                siftDown(Hsub);
    }

    public static void removeKeys(int n, heap H, keyType[] S) {
        index i;
        for (i = n; i >= 1; i--)
            S[i] = root(H);
    }

    public static keyType root(heap H) {
        keyType keyOut;
        keyOut = key at the root;
        move the key at the bottom node to the root;
        delete the bottom node;
        siftDown(H);
        return keyOut;
    }

    public static void siftDown(heap H) {
        index parent, largerChild;
        parent = root of H;
        largerChild = parent's child containing larger key;
        while (parent has a child and key at parent is smaller than key at largerChild) {
            exchange key at parent and key at largerChild;
            parent = largerChild;
            largerChild = parent's child containing larger key;
        }
    }

Worst Case Time Complexity of HeapSort
Basic Operation: the comparison of keys in siftDown
Input Size: n, the number of keys to be sorted
Assumption: $n = 2^d$.
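To make the basic operation concrete, the pseudocode for makeHeap, removeKeys, and siftDown can be sketched as runnable array-based Java. This is a sketch under stated assumptions, not the text's implementation: the class name `HeapSortDemo` is hypothetical, keys are `int` values, the heap is stored in the array itself (node i has children 2i and 2i+1, with slot 0 unused), the sort is done in place instead of using a separate heap H and output array S, and a static counter records the key comparisons made in siftDown.

```java
import java.util.Arrays;

class HeapSortDemo {
    static long comparisons = 0;   // key comparisons performed inside siftDown

    // Sifts the key at node i down within the heap stored in h[1..size].
    static void siftDown(int[] h, int i, int size) {
        int key = h[i];
        int parent = i;
        while (2 * parent <= size) {                  // parent has at least one child
            int larger = 2 * parent;                  // start with the left child
            if (larger < size) {                      // a right child exists
                comparisons++;
                if (h[larger + 1] > h[larger]) {
                    larger++;                         // right child holds the larger key
                }
            }
            comparisons++;
            if (key >= h[larger]) {
                break;                                // heap property restored
            }
            h[parent] = h[larger];                    // move the larger child up
            parent = larger;
        }
        h[parent] = key;
    }

    // Visits nodes n/2, n/2 - 1, ..., 1, so deeper subtrees become heaps first.
    static void makeHeap(int[] h, int n) {
        for (int i = n / 2; i >= 1; i--) {
            siftDown(h, i, n);
        }
    }

    // Sorts s[1..n] in nondecreasing order; s[0] is unused.
    static void heapSort(int[] s, int n) {
        makeHeap(s, n);
        for (int size = n; size >= 2; size--) {
            int root = s[1];
            s[1] = s[size];                // move the bottom key to the root
            siftDown(s, 1, size - 1);      // restore the heap on the remaining keys
            s[size] = root;                // the removed key goes into slot `size`
        }
    }

    public static void main(String[] args) {
        int[] s = {0, 2, 4, 5, 3, 1, 9, 6, 7, 10, 8};   // slot 0 unused
        heapSort(s, 10);
        System.out.println(Arrays.toString(Arrays.copyOfRange(s, 1, 11)));
        System.out.println("comparisons in siftDown: " + comparisons);
    }
}
```

Storing the heap in the array is a design choice of this sketch: it removes the need for an explicit tree type, and slot `size` is free to receive each removed key because the heap shrinks by one node at every removal.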
[Figure: an essentially complete binary tree with $n = 2^d$ nodes; levels 0 through d-1 are full and level d contains a single node.]

Since makeHeap and removeKeys both call siftDown, we analyze both routines.

makeHeap

Since there is only one node at level d, we first ignore that node when counting the number of nodes through which keys are sifted, and then add d back to the result.

    Level    # of nodes    Greatest # of nodes through which a key would be sifted
    0        $2^0$         d-1
    1        $2^1$         d-2
    …        …             …
    j        $2^j$         d-j-1
    …        …             …
    d-1      $2^{d-1}$     0

The total is

$$\sum_{j=0}^{d-1} 2^j (d-j-1) = (d-1) \sum_{j=0}^{d-1} 2^j - \sum_{j=0}^{d-1} j 2^j = 2^d - d - 1$$

The two sums follow from the geometric series and its derivative:

$$1 + x + x^2 + \dots + x^d = \frac{1 - x^{d+1}}{1 - x}$$

$$1 + 2x + \dots + d x^{d-1} = \frac{-(d+1) x^d (1-x) + (1 - x^{d+1})}{(1-x)^2}$$

Applying the second identity with x = 2 and upper index d-1 gives

$$\sum_{j=0}^{d-1} j 2^j = 2 \left( d 2^{d-1} + (1 - 2^d) \right) = d 2^d - 2^{d+1} + 2$$

By adding d back to the total, we have $2^d - d - 1 + d = 2^d - 1 = n - 1$.

removeKeys

For each removal of the first $2^{d-1}$ keys, the new top key is sifted through at most (d-1) nodes in the worst case. Thus, the total number of nodes through which these $2^{d-1}$ keys are sifted is at most $(d-1) \cdot 2^{d-1}$.

For each removal of the next $2^{d-2}$ keys, the new top key is sifted through at most (d-2) nodes in the worst case. Thus, the total number of nodes through which these $2^{d-2}$ keys are sifted is at most $(d-2) \cdot 2^{d-2}$.
Considering all levels from d-1 down to 1, we have

$$\sum_{j=1}^{d-1} j 2^j = d 2^d - 2^{d+1} + 2 = n \lg n - 2n + 2$$

7.8 Lower Bounds for Sorting Only by Comparisons of Keys

Decision Trees for Sorting Algorithms

    void sortThree(keyType[] S) {
        keyType a, b, c;
        a = S[1]; b = S[2]; c = S[3];
        if (a < b) {
            if (b < c)      S = a, b, c;
            else if (a < c) S = a, c, b;
            else            S = c, a, b;
        }
        else if (b < c) {
            if (a < c)      S = b, a, c;
            else            S = b, c, a;
        }
        else                S = c, b, a;
    }

Decision tree for sortThree:

    a < b?
      yes: b < c?
             yes: a,b,c
             no:  a < c?
                    yes: a,c,b
                    no:  c,a,b
      no:  b < c?
             yes: a < c?
                    yes: b,a,c
                    no:  b,c,a
             no:  c,b,a

    void exchangeSort(int n, keyType[] S) {
        index i, j;
        // For n = 3, let a = S[1], b = S[2], c = S[3] be the initial keys.
        for (i = 1; i < n; i++)
            for (j = i + 1; j <= n; j++)
                if (S[j] < S[i])
                    exchange S[i] and S[j];
    }

Decision tree for exchangeSort (n = 3). Two of the comparisons have only one possible outcome, because their result is already determined by earlier comparisons:

    b < a?
      yes: c < b?
             yes: b < a?
                    yes: c,b,a
             no:  c < a?
                    yes: b,c,a
                    no:  b,a,c
      no:  c < a?
             yes: a < b?
                    yes: c,a,b
             no:  c < b?
                    yes: a,c,b
                    no:  a,b,c

Lemma 7.1: To every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid binary decision tree containing exactly n! leaves.

Lemma 7.2: The worst-case number of comparisons done by a decision tree is equal to its depth.

Lemma 7.3: Let m be the number of leaves in a binary tree and d its depth. Then $d \ge \lg m$.

Theorem 7.2: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least $\lg(n!)$ comparisons of keys.

Lemma 7.4: For any positive integer n, $\lg(n!) \ge n \lg n - 1.45n$.

Proof:

$$\lg(n!) = \sum_{i=2}^{n} \lg i \ge \int_{1}^{n} \lg x \, dx = \frac{1}{\ln 2} \Big[ x \ln x - x \Big]_{1}^{n} = \frac{1}{\ln 2} (n \ln n - n + 1) = n \lg n - (n-1) \lg e$$

Since $\lg e < 1.45$, it follows that $n \lg n - (n-1) \lg e \ge n \lg n - 1.45n$.

Theorem 7.3: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least $n \lg n - 1.45n$ comparisons of keys.
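The bounds in Theorem 7.2 and Lemma 7.4 can be compared numerically with a small sketch. The class name `LowerBoundDemo` and its method names are hypothetical, and the values are floating-point approximations rather than exact quantities.

```java
class LowerBoundDemo {
    // Base-2 logarithm.
    static double lg(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // lg(n!) computed as lg 2 + lg 3 + ... + lg n, which avoids overflowing n!.
    static double lgFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += lg(i);
        }
        return sum;
    }

    // The closed-form bound from Lemma 7.4: n lg n - 1.45 n.
    static double lemmaBound(int n) {
        return n * lg(n) - 1.45 * n;
    }

    public static void main(String[] args) {
        int[] sizes = {3, 10, 100, 1000};
        for (int n : sizes) {
            System.out.printf("n = %d: lg(n!) = %.2f, n lg n - 1.45n = %.2f%n",
                    n, lgFactorial(n), lemmaBound(n));
        }
    }
}
```

For n = 3 the value of lg(3!) is about 2.58, so at least 3 comparisons are needed in the worst case, which matches the depth of the sortThree decision tree.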