Lecture 5: Sorting 4
CSC2100 Data Structure
Yufei Tao
CSE department, CUHK
February 13, 2011

In the last lecture, we showed that every algorithm in the comparison class has a time lower bound of Ω(n log n) when the array to be sorted has n numbers. In this lecture, we will learn an algorithm outside the comparison class. This algorithm, called counting sort, achieves a cost of O(n) when the data domain has a size of O(n).

1 Problem definition
2 Counting sort
    Main idea
    Formal description
    Time complexity

Problem definition

We first consider a simplified version of the problem that counting sort solves efficiently. This simplification lets us focus on the core of the algorithm. As we will see, it is fairly straightforward to extend the algorithm to the general version of the problem.

Problem (Sort-within-ten)
Let A be an array of n numbers, each of which is an integer from 1 to 10. The objective is to arrange the numbers of A in ascending order.

Note
Numbers such as 0, -2, 11, and 31 are not allowed in A.

Counting sort

Main idea

Rationale
Let us first get the idea from an example. Assume that the array to be sorted is A = {8, 3, 8, 9, 2, 5, 8, 5}.

Example
1 First, create an array cnt of size 10 (i.e., the size of our domain) with all numbers set to 0.
2 Scan A once. For each number v read, increase cnt[v] by 1.
    After reading A[1] = 8, increase cnt[8] by 1.
    After reading A[2] = 3, increase cnt[3] by 1.
    ...
  At the end of the scan, cnt becomes {0, 1, 1, 0, 2, 0, 0, 3, 1, 0}.

Note
Now cnt[i] (1 ≤ i ≤ 10) equals the number of occurrences of i in A.

Rationale (cont.)

Example (cont.)
3 Create an array pos of size 10, such that pos[i] (1 ≤ i ≤ 10) equals the sum of the first i numbers in cnt.
  pos = {0, 1, 2, 2, 4, 4, 4, 7, 8, 8}.
  For example, pos[5] is the sum of the first 5 numbers in cnt.

Note
pos[i] (1 ≤ i ≤ 10) gives the last position of i in the sorted order, provided that i appears in A. For example, pos[8] = 7 means that, in the sorted order, the last 8 (of the three 8’s in A) is at the 7th position.

Rationale (cont.)

Example (cont.)
4 Create an array A′ of size n for storing the final sorted list.
5 Scan A another time. For each number v read, copy v to position pos[v] in A′, and decrease pos[v] by 1 (to indicate where the next occurrence of v should go).
    Put A[1] = 8 at position pos[8] = 7 in A′; decrease pos[8] to 6.
    Put A[2] = 3 at position pos[3] = 2 in A′; decrease pos[3] to 1.
    Put A[3] = 8 at position pos[8] = 6 in A′; decrease pos[8] to 5.
    ...

Order of filling A′ (_ marks a slot that is still empty)
    {_, _, _, _, _, _, 8, _}
    {_, 3, _, _, _, _, 8, _}
    {_, 3, _, _, _, 8, 8, _}
    {_, 3, _, _, _, 8, 8, 9}
    ...

Formal description

Pseudo-code
Algorithm CSort(A[1..n])
    cnt = an array of size 10 with all elements set to 0
    for i = 1 to n
        cnt[A[i]]++
    pos = an array of size 10 with all elements set to 0
    pos[1] = cnt[1]
    for i = 2 to 10
        pos[i] = pos[i − 1] + cnt[i]
    A′ = an array of size n
    for i = 1 to n
        A′[pos[A[i]]] = A[i]
        pos[A[i]]--

Time complexity

Cost analysis
O(n). Creating A′ and each of the two scans of A take O(n) time, while cnt and pos have a constant size (10), so initializing them and computing pos takes O(1) time.

General problem

Problem (Small-domain Sort)
Let A be an array of n numbers, each of which is an integer from 1 to k. The objective is to arrange the numbers of A in ascending order.

To think
Extend the counting sort algorithm to solve the above problem in O(n + k) time.

Summary

Playback of this lecture:
Counting sort.
Time complexity O(n + k), where k is the size of the data domain.
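
As an illustration of the O(n + k) bound, here is a minimal Python sketch of one possible way to extend CSort from the domain 1..10 to the domain 1..k. It is not taken from the lecture: the function name counting_sort, the parameter k, the range check, and the unused padding slot at index 0 (which lets cnt[v] and pos[v] keep the 1-based meaning used in the slides) are all assumptions made for this example.

```python
def counting_sort(a, k):
    """Return a sorted copy of a, whose values must be integers in 1..k.

    Hypothetical helper for illustration; runs in O(n + k) time.
    """
    n = len(a)

    # cnt[v] = number of occurrences of v in a (index 0 is unused padding).
    cnt = [0] * (k + 1)
    for v in a:
        if not 1 <= v <= k:
            raise ValueError(f"value {v} is outside the domain 1..{k}")
        cnt[v] += 1

    # pos[v] = sum of cnt[1..v], i.e. the last 1-based position of v
    # in the sorted order (when v occurs in a).
    pos = [0] * (k + 1)
    for v in range(1, k + 1):
        pos[v] = pos[v - 1] + cnt[v]

    # Second scan: place each value at its current last position.
    out = [None] * n
    for v in a:
        out[pos[v] - 1] = v  # convert 1-based position to 0-based index
        pos[v] -= 1          # the next occurrence of v goes one slot earlier
    return out


if __name__ == "__main__":
    # The example array from the lecture, with k = 10.
    print(counting_sort([8, 3, 8, 9, 2, 5, 8, 5], 10))
```

The two loops over a cost O(n) and the two arrays of size k + 1 cost O(k) to create and fill, which gives O(n + k) in total under these assumptions.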