Algorithms Homework – Fall 2000

8.1-1 Using Figure 8.1 as a model, illustrate the operation of PARTITION on the array A = 13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21.

The pivot is x = A[1] = 13.
Initial array:                                   13 19 9 5 12 8 7 4 11 2 6 21
After the first exchange (i = 1, j = 11):         6 19 9 5 12 8 7 4 11 2 13 21
After the second exchange (i = 2, j = 10):        6 2 9 5 12 8 7 4 11 19 13 21
On the next pass the pointers cross (i = 10, j = 9), so PARTITION returns q = j = 9.
SPLIT: A[1…9] = 6, 2, 9, 5, 12, 8, 7, 4, 11 and A[10…12] = 19, 13, 21.

8.1-2 What value of q does PARTITION return when all elements in the array A[p…r] have the same value?

q = ⌊(p + r)/2⌋, where p is the index of the first element of the subarray and r is the index of the last. With all elements equal, i and j each advance one position per iteration and meet in the middle.

8.1-3 Give a brief argument that the running time of PARTITION on a subarray of size n is Θ(n).

The pointers i and j only move toward each other, and PARTITION returns as soon as they meet or cross, so together they scan each of the n elements at most once, making one comparison per element scanned. In the worst case, j moves only one position (to the next-to-last element) and i moves all the way up to j; in the average and best cases, j stops at or near the halfway point and i moves up to meet it. Either way, about n comparisons are made in total, so the running time is Θ(n).

8.2-1 Show that the running time of QUICKSORT is Θ(n lg n) when all elements of array A have the same value.

By Exercise 8.1-2, every call to PARTITION splits its subarray exactly in half, so
T(n) = 2T(n/2) + Θ(n).
This is case 2 of the master theorem: f(n) = Θ(n) = Θ(n^(log_2 2)), so T(n) = Θ(n lg n).

8.2-2 Show that the running time of QUICKSORT is Θ(n²) when the array A is sorted in nonincreasing order.

Each call to PARTITION peels off one element, so the recurrence in this case is
T(n) = T(n – 1) + Θ(n)
     = T(n – 2) + Θ(n – 1) + Θ(n)
     = T(n – 3) + Θ(n – 2) + Θ(n – 1) + Θ(n)
     …
     = Σ_{k=1}^{n} Θ(k)
     = Θ(n²).

8.2-3 Banks often record transactions on an account in order of the times of the transactions, but many people like to receive their bank statements with checks listed in order by check number. People usually write checks in order by check number, and merchants usually cash them with reasonable dispatch. The problem of converting time-of-transaction ordering to check-number ordering is therefore the problem of sorting almost-sorted input. Argue that the procedure INSERTION-SORT would tend to beat the procedure QUICKSORT on this problem.

As the data becomes closer to sorted, insertion sort approaches Θ(n), since each element is shifted only a short distance, while quicksort (which partitions around A[p]) approaches its Θ(n²) worst case, since the splits become increasingly unbalanced. There is therefore a point at which insertion sort performs better than quicksort. Moreover, even while quicksort still behaves like Θ(n lg n), its constant factor grows as the input becomes more nearly sorted, whereas insertion sort's constant factor shrinks.

8.2-4 Suppose that the splits at every level of quicksort are in the proportion 1 – α to α, where 0 < α ≤ ½ is a constant. Show that the minimum depth of a leaf in the recursion tree is approximately –lg n / lg α and the maximum depth is approximately –lg n / lg(1 – α). (Don't worry about the integer round-off.)

A path that always follows the α-fraction side shrinks the subarray fastest; it reaches size 1 after
log_{1/α} n = (log_b n)/(log_b (1/α)) = –(log_b n)/(log_b α) = –(lg n)/(lg α) levels.
A path that always follows the (1 – α)-fraction side shrinks the subarray slowest; it reaches size 1 after
log_{1/(1–α)} n = (log_b n)/(log_b (1/(1 – α))) = –(log_b n)/(log_b (1 – α)) = –(lg n)/(lg (1 – α)) levels.

8.2-5 See Quiz 4 answer sheet.

8.3-2 During the running of the procedure RANDOMIZED-QUICKSORT, how many calls are made to the random-number generator RANDOM in the worst case? How does the answer change in the best case?

In both cases the answer is exactly n – 1: each call to RANDOMIZED-PARTITION makes one call to RANDOM, and the recursion tree has n leaves (size-1 subarrays) and therefore n – 1 internal nodes, each corresponding to one partitioning step.

8.3-3 Describe an implementation of the procedure RANDOM(a, b) that uses only fair coin flips. What is the expected running time of your procedure?

Do ⌈lg(b – a + 1)⌉ flips and store them as a bit vector where H = 1 and T = 0, giving a number x with 0 ≤ x < 2^⌈lg(b – a + 1)⌉. If a + x ≤ b, return a + x; otherwise discard the flips and repeat. Each attempt succeeds with probability greater than 1/2, so the expected number of attempts is less than 2 and the expected running time is Θ(lg(b – a + 1)). (A sketch of this procedure appears after Exercise 8.4-4 below.)

8.4-4 The running time of quicksort can be improved in practice by taking advantage of the fast running time of insertion sort when its input is "nearly" sorted. When quicksort is called on a subarray with fewer than k elements, let it simply return without sorting the subarray. After the top-level call to quicksort returns, run insertion sort on the entire array to finish the sorting process. Argue that this sorting algorithm runs in O(nk + n lg(n/k)) expected time. How should k be picked, both in theory and in practice?

The truncated quicksort stops recursing once subarrays have fewer than k elements, so its recursion tree has expected depth O(lg(n/k)) and it does O(n lg(n/k)) expected work. Afterwards the array consists of blocks of fewer than k elements, each already containing exactly the elements that belong there, so no element is more than k positions from its final place and the final insertion sort does O(k) work per element, or O(nk) in total. In theory, k should be picked so that the two terms balance:
nk ≈ n lg(n/k), i.e. k ≈ lg(n/k) = lg n – lg k, giving k = Θ(lg n).
In practice, k should be chosen empirically, by timing the combined algorithm on representative inputs for several small values of k and keeping the fastest.
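As an illustration of the 8.4-4 modification, here is a minimal Python sketch (not part of the original solution set): a Hoare-style PARTITION as in Section 8.1, a quicksort that leaves subarrays with fewer than k elements unsorted, and a finishing pass of insertion sort. The function names and the default cutoff k = 16 are illustrative choices, not values from the text.

    def partition(A, p, r):
        # Hoare-style PARTITION from Section 8.1 (0-indexed): pivot x = A[p];
        # returns j with p <= j < r and A[p..j] <= x <= A[j+1..r].
        x = A[p]
        i, j = p - 1, r + 1
        while True:
            j -= 1
            while A[j] > x:
                j -= 1
            i += 1
            while A[i] < x:
                i += 1
            if i < j:
                A[i], A[j] = A[j], A[i]
            else:
                return j

    def quicksort_cutoff(A, p, r, k):
        # Exercise 8.4-4: subarrays with fewer than k elements are simply left unsorted.
        if r - p + 1 >= max(k, 2):
            q = partition(A, p, r)
            quicksort_cutoff(A, p, q, k)
            quicksort_cutoff(A, q + 1, r, k)

    def insertion_sort(A):
        for j in range(1, len(A)):
            key, i = A[j], j - 1
            while i >= 0 and A[i] > key:
                A[i + 1] = A[i]
                i -= 1
            A[i + 1] = key

    def hybrid_sort(A, k=16):
        # k = 16 is only an illustrative cutoff; Theta(lg n) is the theoretical choice.
        quicksort_cutoff(A, 0, len(A) - 1, k)
        insertion_sort(A)  # every element is now within k positions of its final place

For example, hybrid_sort(data) with data = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21] (the array from Exercise 8.1-1) sorts data in place; with k ≤ 2 the procedure degenerates to plain quicksort, and with k > len(data) to plain insertion sort.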
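Referring back to Exercise 8.3-3, here is a minimal sketch of RANDOM(a, b) built from fair coin flips, using rejection when b – a + 1 is not a power of two. The flip() helper is simulated with Python's random module only so the sketch is self-contained; a real implementation would read an actual source of fair coin flips.

    import random

    def flip():
        # One fair coin flip: 1 for heads, 0 for tails (simulated here).
        return 1 if random.random() < 0.5 else 0

    def coin_random(a, b):
        # Return an integer uniformly distributed on [a, b] using only fair flips.
        n = b - a + 1
        bits = max(1, (n - 1).bit_length())   # ceil(lg n) flips per attempt
        while True:
            x = 0
            for _ in range(bits):             # read the flips as a binary number
                x = (x << 1) | flip()
            if x < n:                         # rejection step: succeeds with probability > 1/2
                return a + x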
PROBLEMS

8-1 Partition correctness
Give a careful argument that the procedure PARTITION in Section 8.1 is correct. Prove the following:

a. The indices i and j never reference an element of A outside the interval [p…r].

The scan of j starts at r + 1 and decrements before each reference, so j ≤ r whenever A[j] is examined. On the first iteration, i stops at p (since A[p] = x ≥ x) and j stops at p at the latest (since A[p] = x ≤ x), so neither leaves [p…r]. On every later iteration, the preceding exchange has left A[i] ≤ x and A[j] ≥ x with i < j, so the leftward scan of j stops no lower than the old i and the rightward scan of i stops no higher than the old j. Hence i and j always stay within [p…r].

b. The index j is not equal to r when PARTITION terminates (so that the split is always non-trivial).

On the first iteration, i stops at index p, where the pivot element is. If the procedure returns immediately (i ≥ j), then j = p < r. Otherwise an exchange occurs and a second iteration begins, during which j is decremented below its first stopping point; since j ≤ r after the first iteration and j only decreases, the value returned is at most r – 1. In other words, j has at least two chances to move before i can reach it, because i stops at the first index, so j ≠ r at termination.

8-3 Stooge sort
Professors Howard, Fine, and Howard have proposed the following "elegant" sorting algorithm:

STOOGE-SORT(A, i, j)
1  if A[i] > A[j]
2      then exchange A[i] ↔ A[j]
3  if i + 1 ≥ j
4      then return
5  k ← ⌊(j – i + 1)/3⌋          | Round down.
6  STOOGE-SORT(A, i, j – k)     | First two-thirds.
7  STOOGE-SORT(A, i + k, j)     | Last two-thirds.
8  STOOGE-SORT(A, i, j – k)     | First two-thirds again.

a. Argue that STOOGE-SORT(A, 1, length[A]) correctly sorts the input array A[1…n], where n = length[A].

The array is sorted in place, by induction on the size of the subarray A[i…j]. If the subarray has at most two elements, lines 1–4 sort it with a single compare-and-exchange. For a larger subarray, the first and last two-thirds overlap in the middle third. After line 6 sorts the first two-thirds (by induction), every element that belongs in the last third lies in the last two-thirds, so after line 7 sorts the last two-thirds, the last third holds its final elements in order. Line 8 then re-sorts the first two-thirds, so the entire subarray is sorted.

b. Give a recurrence for the worst-case running time of STOOGE-SORT and a tight asymptotic (Θ-notation) bound on the worst-case running time.

T(n) = T(2n/3) + T(2n/3) + T(2n/3) + Θ(1) = 3T(2n/3) + Θ(1).
By case 1 of the master theorem (a = 3, b = 3/2, f(n) = Θ(1)):
T(n) = Θ(n^(log_{3/2} 3)) ≈ Θ(n^2.71).

c. Compare the worst-case running time of STOOGE-SORT with that of insertion sort, merge sort, heapsort, and quicksort. Do the professors deserve tenure?

No. The worst-case running times are Θ(n²) for insertion sort and quicksort and Θ(n lg n) for merge sort and heapsort, all asymptotically better than STOOGE-SORT's Θ(n^2.71).
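For reference, here is a minimal 0-indexed Python transcription of the STOOGE-SORT pseudocode (a sketch added for illustration; the function name and the guard for empty input are my own):

    def stooge_sort(A, i=0, j=None):
        # 0-indexed transcription of STOOGE-SORT; sorts A[i..j] in place.
        if j is None:
            j = len(A) - 1
        if j <= i:                      # zero or one element: nothing to do
            return
        if A[i] > A[j]:                 # lines 1-2: compare-exchange the endpoints
            A[i], A[j] = A[j], A[i]
        if i + 1 >= j:                  # lines 3-4: two elements are now sorted
            return
        k = (j - i + 1) // 3            # line 5: round down
        stooge_sort(A, i, j - k)        # line 6: first two-thirds
        stooge_sort(A, i + k, j)        # line 7: last two-thirds
        stooge_sort(A, i, j - k)        # line 8: first two-thirds again

For example, stooge_sort(data) with data = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21] (the array from Exercise 8.1-1) leaves data sorted in place.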
8-4 Stack depth for quicksort
The QUICKSORT algorithm of Section 8.1 contains two recursive calls to itself. After the call to PARTITION, the left subarray is recursively sorted and then the right subarray is recursively sorted. The second recursive call in QUICKSORT is not really necessary; it can be avoided by using an iterative control structure. This technique, called tail recursion, is provided automatically by good compilers. Consider the following version of quicksort, which simulates tail recursion.

QUICKSORT(A, p, r)
1  while p < r
2      do | Partition and sort the left subarray.
3         q ← PARTITION(A, p, r)
4         QUICKSORT(A, p, q)
5         p ← q + 1

Compilers usually execute recursive procedures by using a stack that contains pertinent information, including the parameter values, for each recursive call. The information for the most recent call is at the top of the stack, and the information for the initial call is at the bottom. When a procedure is invoked, its information is pushed onto the stack; when it terminates, its information is popped. Since we assume that array parameters are actually represented by pointers, the information for each procedure call on the stack requires O(1) stack space. The stack depth is the maximum amount of stack space used at any time during a computation.

b. Describe a scenario in which the stack depth of QUICKSORT is Θ(n) on an n-element input array.

The array is already sorted in nonincreasing order. Then (as in Exercise 8.2-2) every call to PARTITION produces a left subarray one element smaller than the current subarray and a right subarray of size 1, so line 4 makes a recursive call on a subarray only one element smaller before the while loop can advance p. The recursion therefore nests n – 1 levels deep and the stack depth is Θ(n).

8-5 Median-of-3 partition
One way to improve the RANDOMIZED-QUICKSORT procedure is to partition around an element x that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose x as the median (middle element) of a set of 3 elements randomly selected from the subarray. For this problem, let us assume that the elements in the input array A[1…n] are distinct and that n ≥ 3. We denote the sorted output array by A′[1…n]. Using the median-of-3 method to choose the pivot element x, define p_i = Pr{ x = A′[i] }.

a. Give an exact formula for p_i as a function of n and i for i = 2, 3, …, n – 1. (Note that p_1 = p_n = 0.)

For x to equal A′[i], one of the three selected elements must be A′[i] itself, one must come from the i – 1 smaller elements, and one from the n – i larger elements. There are 3!(i – 1)(n – i) ordered ways to draw such a triple out of n(n – 1)(n – 2) ordered triples of distinct elements, so
p_i = 6(i – 1)(n – i) / (n(n – 1)(n – 2)).

b. By what amount have we increased the likelihood of choosing x = A′[(n + 1)/2], the median of A′[1…n], compared to the ordinary implementation? Assume that n → ∞, and give the limiting ratio of these probabilities.

The ordinary implementation chooses the median with probability 1/n. With median-of-3 (taking n odd and i = (n + 1)/2):
p_(n+1)/2 = 6((n – 1)/2)((n – 1)/2) / (n(n – 1)(n – 2)) = 3(n – 1) / (2n(n – 2)).
The ratio of the two probabilities is therefore
[3(n – 1)/(2n(n – 2))] / (1/n) = 3(n – 1)/(2(n – 2)) → 3/2 as n → ∞,
so the median-of-3 method is about 1.5 times as likely to pick the true median.
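As a sanity check on part (a), here is a small Monte Carlo sketch (added for illustration, assuming the three elements are drawn without replacement; n = 21 and the trial count are arbitrary test values). It estimates p_i empirically, compares it with the formula above, and prints n·p_median, which approaches the limiting ratio 3/2 from part (b).

    import random

    def estimate_median_of_3(n, trials=200_000):
        # Empirical Pr{x = A'[i]} when x is the median of 3 distinct random elements.
        counts = [0] * (n + 1)                         # 1-indexed: counts[i]/trials ~ p_i
        for _ in range(trials):
            ranks = random.sample(range(1, n + 1), 3)  # ranks of the 3 chosen elements
            counts[sorted(ranks)[1]] += 1              # rank of their median
        return [c / trials for c in counts]

    n = 21
    est = estimate_median_of_3(n)
    exact = [0.0] + [6 * (i - 1) * (n - i) / (n * (n - 1) * (n - 2)) for i in range(1, n + 1)]
    mid = (n + 1) // 2
    print(est[mid], exact[mid], n * exact[mid])        # n * p_median approaches 3/2 as n grows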