Sorting, Part 2: Merge Sort and Quick Sort

Merge Sort
• Merge Sort is a "divide and conquer" style algorithm.
• It recursively splits an array in half.
• Next, it merges sorted parts of the array into a single sorted region.
• It relies on the fact that once elements are split into subregions that contain a single element, each of those subregions is already sorted.
• From there, it merges adjacent sorted regions until the entire array is sorted.
• Let's look at it in depth.

Merge Sort, general idea
• Start with an array, then repeatedly divide it in half.
• Next, merge adjacent parts into larger sorted regions.
• This means repeatedly choosing the smaller of the next elements from the two subregions.

Merge Sort
• This algorithm relies on two functions.
• MergeSort: the top-level function that divides the array into subparts.
• Merge: takes two sorted regions and merges them into a single sorted region of the original array.
• Let's look at the pseudo-code.

Merge Sort, pseudo-code
• MergeSort itself is a fairly simple recursive function.
• The indexes are explained in the sketch at the end of this section.

Merge Sort, pseudo-code
• The Merge algorithm is actually the most challenging part.
• We start by creating temporary storage for the left and right subregions.
• Next, these data are merged back into the original array.
• Let's "divide and conquer" this merge function!

Merge Sort, pseudo-code
• Create temporary arrays.
• Sizes and indexes are assigned.
• The temporary arrays each have 1 extra slot.
• This slot stores a special value that is "always larger" than any element in the other array. This is known as a "sentinel" value.

Merge Sort, pseudo-code
• Copy the values from the original array into the temporary storage.
• Assign the last element of each temporary array to the sentinel value, in this case MAX.
• In C++ you can use std::numeric_limits<int>::max() with "#include <limits>".

Merge Sort, pseudo-code
• Now merge the sorted temporary arrays into the original.
• Start with two indexes to walk through the elements of the temporary arrays.
• If the current left element is less than (or equal to) the current right element, insert the left element.
• Else, insert the right element.
• Be sure to increase the left or right index after its element is added to the array.

Merge Sort, pseudo-code
• Don't forget to clean up!
• Remember, if you allocated memory with "new", you need to delete it at the end of the function.
• This language-specific detail is not in the pseudo-code.
• In C++, call "delete[] arrayLeft" and "delete[] arrayRight".
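Merge Sort, example code

The pseudo-code figures from the original slides are not reproduced here, so below is a C++ sketch of the two functions as the bullets describe them: temporary arrays with one sentinel slot each, a merge loop running from startLeft to end, and delete[] cleanup. The names arrayLeft and arrayRight come from the slides; the exact signatures and the index names startLeft, startRight, start, mid, and end are assumptions, not the textbook's code.

#include <limits>

// Merge the adjacent sorted regions A[startLeft..startRight-1] and
// A[startRight..end] back into A, using sentinel values.
void merge(int A[], int startLeft, int startRight, int end) {
    int sizeLeft = startRight - startLeft;  // size of the left region
    int sizeRight = end - startRight + 1;   // size of the right region

    // Temporary arrays with 1 extra slot each for the sentinel.
    int* arrayLeft = new int[sizeLeft + 1];
    int* arrayRight = new int[sizeRight + 1];

    // Copy the values from the original array into temporary storage.
    for (int i = 0; i < sizeLeft; i++) arrayLeft[i] = A[startLeft + i];
    for (int j = 0; j < sizeRight; j++) arrayRight[j] = A[startRight + j];

    // The sentinel is "always larger" than any real element.
    arrayLeft[sizeLeft] = std::numeric_limits<int>::max();
    arrayRight[sizeRight] = std::numeric_limits<int>::max();

    // Walk both temporary arrays, always taking the smaller element.
    int left = 0, right = 0;
    for (int k = startLeft; k <= end; k++) {
        if (arrayLeft[left] <= arrayRight[right]) {
            A[k] = arrayLeft[left++];
        } else {
            A[k] = arrayRight[right++];
        }
    }

    // Clean up: memory allocated with new must be deleted.
    delete[] arrayLeft;
    delete[] arrayRight;
}

// Top-level recursive function: split in half, sort each half, merge.
void mergeSort(int A[], int start, int end) {
    if (start < end) {                // a one-element region is already sorted
        int mid = (start + end) / 2;
        mergeSort(A, start, mid);
        mergeSort(A, mid + 1, end);
        merge(A, start, mid + 1, end);
    }
}

For example, mergeSort(A, 0, 5) sorts a six-element array; note that end is the last valid index, a convention that matters again for Quick Sort below.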
Merge Sort, Time Complexity
• What is the time complexity of Merge Sort?
• There are several ways to calculate it.
• We can get a clue from the structure of the algorithm: there are two phases:
• Splitting
• Merging

Merge Sort, Time Complexity
• Let's think about the Merge function first.
• What is the time complexity of Merge? O(n).
• The relevant part is the for-loop from startLeft to end: it touches each position in the region exactly once.

Merge Sort, Time Complexity
• Merging costs c*n, where n is the size of the region being merged.
• So at each step, the time cost can be written like this:
• T(n) = 2*T(n/2) + c*n
• The c*n is the O(n) merge cost.
• The 2*T(n/2) represents the cost of merge-sorting the two subarrays before the merge.
• To go deeper, we need to calculate T(n/2).
• We can just substitute n/2 for n in the definition, then substitute this T(n/2) back in:
• T(n) = 2*[2*T(n/4) + c*(n/2)] + c*n

Merge Sort, Time Complexity
• From this definition we can rewrite it as shown in the textbook:
• T(n) = 2*[2*T(n/4) + c*(n/2)] + c*n
•      = 4*T(n/4) + 2*c*(n/2) + c*n
•      = 4*T(n/4) + c*n + c*n
• Eventually, the recursion will attempt to sort an array of size 1. One element is already sorted, so it will have a constant cost: T(1) = c.
• We are left with:
• T(n) = n*T(1) + log2(n)*c*n, or …
• T(n) = n*c + log2(n)*c*n

Merge Sort, Time Complexity
• Taking the last equation and rewriting, we can see what the leading term is:
• T(n) = c*n*log2(n) + c*n
• This leads to a time complexity of O(n log n).

Merge Sort, Time Complexity
• To illustrate, the original slides show a few diagrams: first the cost of sorting the whole array, then, going a step further, the cost of sorting the left and right halves.
• Ultimately, the splitting cost is overshadowed by the cost of merging.
• Each split needs to be merged.
• Each level holds a set of small arrays, but merging them all results in O(n) cost per level.
• So, the cost is n times the number of levels.
• With even division by 2, there are log2(n) levels.
• O(n log n)

Note: Merge Sort guarantees O(n log n) time. This means that there is no distinction between best case, worst case, or average case.

Merge Sort, Space Complexity
• With space complexity, most of the algorithms we study are O(n) due to the space cost of the input array.
• A constant multiple of n is still O(n). This means that using 3 or 4 times the memory of the input array is still O(n).
• However, we may be interested in auxiliary space.
• Auxiliary space is the amount of extra space (beyond the input array) needed by the algorithm.

Merge Sort, Space Complexity
• How much auxiliary space does Merge Sort consume?
• Since Merge Sort uses recursion, it accumulates stack frames. We can expect roughly O(log n) stack frames, each taking a constant amount of memory.
• Additionally, each merge step requires temporary arrays. This requires O(n) memory.
• So, we may expect a space cost of S(n) = c*log n + c*n.
• This gives an auxiliary space complexity of O(n).
• But recall that the overall space complexity is still O(n):
• c*n + c*n + c*log n is still O(n).

In-Place Sorting
• Now that we understand auxiliary space, we can introduce a new term.
• A sorting algorithm is known as in-place if it requires only O(1) auxiliary space. Sometimes this is relaxed to any auxiliary space less than O(n).
• An in-place algorithm sorts the array in its original location without creating extra copies of the data.
• We may also consider algorithms that use only O(log n) auxiliary stack space to be in-place.

Quick Sort
• Another sorting algorithm.
• Uses recursion.
• Provides a good example for algorithm study for several reasons.
• Let's look at the algorithm first.

Quick Sort
• General strategy:
• Choose a number and designate it as the pivot.
• Any value less than the pivot goes on the left of the pivot.
• Put the pivot in its place after all the smaller values.
• Any value greater than the pivot goes on the right of the pivot.
• Recursively quick sort all values to the left of the pivot.
• Recursively quick sort all values to the right of the pivot.

Quick Sort, from the textbook …
• The textbook's version uses a wrapper function, so callers pass only the array and its size.
• Note: end should be end - 1 or length - 1 (size - 1); that is, end should be the last valid index.
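Quick Sort, example code

Since the textbook's figure is not reproduced here, the following is a C++ sketch of the wrapper and the recursive function as just described; the partition function it calls is developed in the next section. The name pivotIndex follows the slides, but the signatures are assumptions, not the textbook's code.

// Developed in the next section: partitions A[start..end] around a pivot
// and returns the pivot's final position.
int partition(int A[], int start, int end);

// Recursive quick sort over A[start..end], where end is the last valid index.
void quickSort(int A[], int start, int end) {
    if (start < end) {
        int pivotIndex = partition(A, start, end);
        quickSort(A, start, pivotIndex - 1);  // values less than the pivot
        quickSort(A, pivotIndex + 1, end);    // values greater than the pivot
    }
}

// Wrapper: callers pass just the array and its size.
void quickSort(int A[], int size) {
    quickSort(A, 0, size - 1);  // size - 1 is the last valid index
}

The wrapper hides the index bookkeeping: a caller writes quickSort(A, 6) rather than quickSort(A, 0, 5).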
A closer look at the partition function
• Partition works by selecting a pivot, then considering a range of values in the array.
• smallIndex marks the end of the values that are less than the pivot.
• index traverses the array.
• When index sees a value less than the pivot, smallIndex is increased and the value is exchanged into the new smallIndex position.
• A C++ sketch follows the illustration below.

An illustration of partition, in the middle of execution …
• Suppose the array is [5, 3, 2, 7, 1, 9]. Our pivot is 5.
• smallIndex is at position 2 (the value 2), and index is at position 4, so we are considering the value 1.
• 1 is less than the pivot, so we need to …
• First, add 1 to smallIndex.
• Second, call exchange(smallIndex, index).
• Now the exchange has been made: [5, 3, 2, 1, 7, 9].
• index will move to the next position in the array.
• Now the last position is being considered. 9 is not less than the pivot, so nothing moves.
• The end of the loop is reached. The final step is to exchange the pivot and smallIndex: [1, 3, 2, 5, 7, 9].

Finishing partition …
• Now partition is done. We return the value of smallIndex, in this case 3 (positions are numbered 0 through 5).
• The pivot is in its final position in sorted order: the values less than the pivot (1, 3, 2) are on its left, and the values greater than the pivot (7, 9) are on its right.

Continuing with Quick Sort
• Recursively quick sort the left unsorted values (1, 3, 2) and the right unsorted values (7, 9).
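Partition, example code

Here is a C++ sketch of partition matching the walkthrough above: the pivot is taken from the start position, smallIndex marks the end of the "less than pivot" region, and index traverses the rest of the array. Using std::swap for the slides' exchange is an assumption.

#include <utility>  // std::swap

// Partition A[start..end] around the pivot A[start] and
// return the pivot's final position.
int partition(int A[], int start, int end) {
    int pivot = A[start];
    int smallIndex = start;  // end of the region of values less than the pivot
    for (int index = start + 1; index <= end; index++) {
        if (A[index] < pivot) {
            smallIndex++;                        // grow the small region ...
            std::swap(A[smallIndex], A[index]);  // ... and exchange the value into it
        }
    }
    std::swap(A[start], A[smallIndex]);  // put the pivot in its final place
    return smallIndex;
}

Running this on the example above, partition applied to [5, 3, 2, 7, 1, 9] with start = 0 and end = 5 rearranges the array to [1, 3, 2, 5, 7, 9] and returns 3.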
Quick Sort complexity, worst case
• Let's think about the worst case for quick sort.
• At the start of partition, we set index to start + 1.
• index then goes along the whole array to the end.
• So, partition must be an O(n) procedure.
• We looked at an example where our pivot ended up near the middle of the array.
• After partition runs, we call quick sort on the rest of the array to the left and right of the pivot.
• What if our pivot was not near the middle but near an end?

Quick Sort complexity, worst case
• Suppose the pivot ended up in the start position.
• This means our first recursive call returns immediately, doing nothing.
• The second recursive call will begin at start + 1, just past the pivot.
• The second call runs on n-1 array positions.
• This gives us a chain of partitions over smaller and smaller ranges, so our complexity would look like this:
• c*n + c*(n-1) + c*(n-2) + … + c*2 + c*1
• This leads to O(n^2), in a similar way as selection sort.

Quicksort, best case
• What would be the best case?
• A better situation would be if we always split the array in half.
• This would mean that our pivot was always the median.
• If the pivot was always near the middle, then the recursive calls would be doing nearly equal amounts of work on the remaining halves of the array, about half of the elements each.

Quicksort, best case
• Let's work out the complexity mathematically.
• Consider n elements in an array …
• We need to partition: O(n), as we discussed.
• Then recursively quick sort two halves. This is a little trickier.
• Partition, then apply the same algorithm on the left and right:
• T(n) = c*n + T(n/2) + T(n/2) = c*n + 2*T(n/2)

Quicksort, best case
• Let's think about T(n/2). Just substitute n/2 for n and rewrite:
• T(n/2) = c*(n/2) + T((n/2)/2) + T((n/2)/2)
• Note that (n/2)/2 = n*(1/2)*(1/2) = n*(1/4) = n/4.
• Now let's substitute into T(n):
• T(n) = c*n + 2*[c*(n/2) + 2*T(n/4)] = c*n + (c*n/2 + c*n/2) + 4*T(n/4)

Quicksort, best case
• Let's visualize this expansion:
• c*n + c*n/2 + c*n/2 + c*n/4 + c*n/4 + c*n/4 + c*n/4 + …
• Combining some terms we get: c*n + 2*(1/2)*c*n + 4*(1/4)*c*n + …
• Simplifying, we are adding up a lot of c*n values: c*n + c*n + c*n + …
• The question is how many c*n terms we get.
• The answer is something like this: T(n) = X * c * n, but we are not sure yet what X is.

Quicksort, best case
• We need to find the number of splits before each subset of the array is size 1.
• After k substitutions we reach something like this:
• T(n) = k*c*n + 2^k * T(n/2^k)
• So we want to find the k where n/2^k = 1, because T(1) = c.
• If n = 16, what k solves 2^k = 16?
• It's 4, or log2(16).
• So, our algorithm will take about log2(n) * c * n operations in the best case.
• This is written O(n log n).

Some context
• How likely is the worst case for quick sort?
• If we assume that the order of an array is random, then any of the n values could appear in any location with equal probability.
• What is the probability of the smallest value in the array appearing in position 0?
• If the smallest value is equally likely to be in any position, then its appearing in position 0 is just 1/n.
• Next, what is the probability that the next smallest value appears in position 1?
• Treat this as a similar array of n-1 elements.
• We get 1/(n-1).
• The probability that these two events both happen is:
• (1/n) * (1/(n-1))
• Following this logic, the probability that the array is already sorted just by chance is:
• 1/n!

Some context
• So, the probability of encountering the worst case for Quick Sort just by chance is 1/n!.
• If n is even as small as 8, the probability is:
• 1/8! = 1/40320
• This is 0.00002480158 in decimal form. A quick check of this arithmetic is sketched below.
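A short C++ snippet verifying the 1/n! arithmetic above (assuming double precision, which is plenty for these small values of n):

#include <iostream>

// Probability that an n-element array is already sorted by chance: 1/n!.
int main() {
    double factorial = 1.0;
    for (int n = 2; n <= 8; n++) {
        factorial *= n;  // running value of n!
        std::cout << "n = " << n << ": 1/n! = " << (1.0 / factorial) << "\n";
    }
    return 0;
}

For n = 8 this prints approximately 2.48016e-05, matching the decimal value above.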
• This means that in practice, Quick Sort is a great algorithm.
• Many variations of Quick Sort have been studied that may make it even more likely to avoid the worst case; one well-known example is sketched below.
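One common variation is to choose the pivot position at random before partitioning, so the worst case depends on unlucky random draws rather than on the order of the input. A minimal sketch, reusing the partition function from earlier; this particular code is illustrative, not the textbook's:

#include <cstdlib>   // std::rand (seed with std::srand if desired)
#include <utility>   // std::swap

int partition(int A[], int start, int end);  // the partition sketched earlier

// Pick a random position in [start, end], move that value into the
// pivot slot, then run the same partition as before.
int randomizedPartition(int A[], int start, int end) {
    int r = start + std::rand() % (end - start + 1);
    std::swap(A[start], A[r]);
    return partition(A, start, end);
}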