CSE 2331/5331
Topic 6: Selection; More on Sorting

Order Statistics
  Rank: the position of an element in the sorted sequence
  Selection: return the element with a given rank

Examples
  Return rank(1) (i.e., the minimum) or rank(n) (i.e., the maximum)
    n-1 comparisons
    Best possible
  Return both rank(1) and rank(n)
    Better than 2n-2 comparisons? Yes (about 3n/2 suffice when elements are processed in pairs)
  Return the median

Select ( r )
  Return the element with rank r.
  Straightforward approach:
    Sort, then return A[r]
    Takes $\Theta(n \lg n)$ time
    Computes a lot of extra, unnecessary information

Divide and Conquer Again!
  Similar to QuickSort
  Select ( A, 1, n, r )
    First, perform m = Partition(A, 1, n); A[m] is the pivot
    Case 1: r = m
      return A[m]
    Case 2: r < m
      return Select ( A, 1, m-1, r )
    Case 3: r > m
      return Select ( A, m+1, n, r-m )

Pseudo-code
  Select ( A, s, t, r )
    if ( s = t ) return A[s];
    m = Partition ( A, s, t );
    if ( m = r+s-1 ) return A[m];
    if ( m > r+s-1 )
      return Select ( A, s, m-1, r );
    else
      // A[s..m] holds m-s+1 elements, so the rank drops by m-s+1
      return Select ( A, m+1, t, r+s-m-1 );
  Initial call: Select(A, 1, n, r)
  $T(n) = T(\max(m-1, n-m)) + O(n)$

Analysis
  $T(n) = T(\max(m-1, n-m)) + cn$
  Worst case:
    $T(n) = T(n-1) + cn = \Theta(n^2)$
  Lucky case: every partition is balanced, e.g., no worse than a 9-to-1 split:
    $T(n) = T(9n/10) + cn$
    $     = T((9/10)^2 n) + (9/10) cn + cn$
    $     = \dots$
    $     = T(1) + cn \left( 1 + 9/10 + (9/10)^2 + \dots \right)$
    $     = \Theta(n)$
    (the geometric series sums to at most 10)

Pseudo-code
  Rand-Select ( A, s, t, r )
    if ( s = t ) return A[s];
    m = Rand-Partition ( A, s, t );
    if ( m = r+s-1 ) return A[m];
    if ( m > r+s-1 )
      return Rand-Select ( A, s, m-1, r );
    else
      return Rand-Select ( A, m+1, t, r+s-m-1 );

Rand-Select ( A, 1, n, r )
    m = Rand-Partition ( A, 1, n );
    if ( m = r ) return A[m];
    if ( m > r )
      return Rand-Select ( A, 1, m-1, r );
    else
      return Rand-Select ( A, m+1, n, r-m );
  Expected running time:
    $ET(n) \le \sum_{k=1}^{n} \Pr[m = k] \cdot ET(\max(k-1, n-k)) + O(n)$

Solving Recursion
  The pivot position m is equally likely to be any k in 1..n, so $\Pr[m = k] = 1/n$:
    $ET(n) \le \sum_{k=1}^{n} \Pr[m = k] \cdot ET(\max(k-1, n-k)) + O(n)$
    $      = \sum_{k=1}^{n} \frac{1}{n} \cdot ET(\max(k-1, n-k)) + O(n)$
    $      = \frac{1}{n} \sum_{k=1}^{n} ET(\max(k-1, n-k)) + O(n)$

Recursion cont.
  $\max(k-1, n-k) = k-1$ if $k > n/2$, and $n-k$ otherwise
  Each value in $\lfloor n/2 \rfloor, \dots, n-1$ therefore appears at most twice in the sum:
    $\sum_{k=1}^{n} ET(\max(k-1, n-k)) \le 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} ET(k)$
  Hence
    $ET(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ET(k) + O(n)$
  Proof by the substitution method that $ET(n) = O(n)$.
  That is, the expected running time of randomized selection is $ET(n) = O(n)$.

Remarks
  One can in fact obtain a deterministic selection algorithm with time complexity $\Theta(n)$
    A smart way to guarantee that one always finds a good partition
  This yields a deterministic QuickSort algorithm with time complexity $\Theta(n \lg n)$

Lower Bound for Sorting
  Model
    What types of operations are allowed
    E.g., partial sum
      Both addition and subtraction
      Addition-only model
  For sorting:
    Comparison-based model

Decision Tree
  [Figure: binary decision tree whose internal nodes are comparisons such as a_i > a_j, a_k > a_m, a_s > a_t, with yes/no branches]

Example
  For insertion sort with 3 elements

Decision Tree
  Leaves are not necessarily at the same depth
  Worst-case complexity:
    Longest root-to-leaf path
  Each leaf:
    A possible outcome
    I.e., a permutation of the input
  Every possible outcome must appear as some leaf
    #leaves ≥ n!

Lower Bound
  Any binary tree of height h has at most $2^h$ leaves
  A binary tree with m leaves has height at least lg m
  The worst-case complexity of any comparison-based algorithm sorting n elements is
    $\Omega(\lg(n!)) = \Omega(n \lg n)$ (by Stirling's approximation)

Non-comparison Based Sorting
  Assume inputs are integers in [1, …, k], with k = O(n)
  Count-Sort (A, n) (simplified version)
    Initialize array C[1..k] with C[i] = 0
    for i = 1 to n do
      C[ A[i] ] ++
    output based on C
  Time and space complexity: $O(n + k) = O(n)$

Summary
  Selection
    Linear-time algorithm
    Both by a randomized algorithm (in expectation) and by a deterministic algorithm
  Sorting
    Under the comparison model, requires $\Omega(n \lg n)$ comparisons
    MergeSort (worst case) and randomized QuickSort (in expectation) are asymptotically optimal under this model
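
The Examples slide states that rank(1) and rank(n) can be found together in fewer than 2n-2 comparisons but does not show how. One standard way, sketched here as an illustration rather than as the method used in the lecture, is to process the elements in pairs: compare the two elements of a pair, then compare the smaller with the running minimum and the larger with the running maximum. The function name `min_and_max` and the returned comparison counter are assumptions made for this sketch.

```python
def min_and_max(a):
    """Return (minimum, maximum, comparisons_used) for a non-empty list."""
    n = len(a)
    if n == 0:
        raise ValueError("need at least one element")
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element seeds both extremes for free.
        lo = hi = a[0]
        start = 1
    else:
        # Even length: one comparison orders the first pair.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    # The remaining n - start elements form complete pairs.
    for i in range(start, n, 2):
        x, y = a[i], a[i + 1]
        comparisons += 3  # one inside the pair, one against lo, one against hi
        small, large = (x, y) if x < y else (y, x)
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

For even n this uses 3n/2 - 2 comparisons and for odd n it uses 3(n-1)/2, compared with 2n-2 for two independent scans.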
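
The Rand-Select pseudo-code can be sketched in Python as follows. This is a minimal illustration under several assumptions that differ from the slides: indices are 0-based, the recursion is written as a loop, and the target is tracked as an absolute index (so no rank adjustment is needed when moving to the right part). The helper `rand_partition` is a hypothetical stand-in for Rand-Partition using a Lomuto-style partition; the slides do not specify which partition scheme is used.

```python
import random


def rand_partition(a, s, t):
    """Partition a[s..t] (inclusive) around a random pivot; return the pivot's final index."""
    p = random.randint(s, t)      # random pivot position, both ends inclusive
    a[p], a[t] = a[t], a[p]       # park the pivot at the right end
    pivot = a[t]
    i = s                         # a[s..i-1] holds the elements smaller than the pivot
    for j in range(s, t):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[t] = a[t], a[i]       # move the pivot into its sorted position
    return i


def rand_select(items, r):
    """Return the element of rank r (1-based) without modifying the input."""
    if not 1 <= r <= len(items):
        raise ValueError("rank out of range")
    a = list(items)               # work on a copy
    s, t = 0, len(a) - 1
    target = r - 1                # absolute 0-based index of the answer
    while s < t:
        m = rand_partition(a, s, t)
        if m == target:
            return a[m]
        if m > target:
            t = m - 1             # answer lies left of the pivot
        else:
            s = m + 1             # answer lies right of the pivot
    return a[s]
```

A call such as `rand_select([7, 2, 9, 4, 1], 3)` returns the median of the five values. Only one side of each partition is explored, which is where the saving over sorting comes from.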
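
The slides state that the substitution method gives a linear bound on the expected running time but omit the calculation. One possible version is sketched below, under the assumption that the $O(n)$ term is at most $an$ for some constant $a$ (the constants $a$ and $c$ are introduced here for illustration and do not appear in the slides).

```latex
% Claim: ET(n) <= c n for a suitable constant c.
% Inductive hypothesis: ET(k) <= c k for all k < n.
\begin{align*}
ET(n) &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ET(k) + an \\
      &\le \frac{2c}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} k + an \\
      &\le \frac{2c}{n} \cdot \frac{3n^2}{8} + an
        && \text{since } \sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3(n^2-1)}{8} \text{ for } n \ge 2 \\
      &= \frac{3}{4}\,cn + an \\
      &\le cn && \text{whenever } c \ge 4a .
\end{align*}
% Choosing c >= max(4a, ET(1)) also covers the base case, so ET(n) = O(n).
```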
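
To make the decision-tree bound concrete, the following sketch computes the smallest height a binary tree with at least n! leaves can have, that is, the ceiling of lg(n!). The function name `comparison_lower_bound` is an assumption for this example. It uses exact integer arithmetic (`int.bit_length`) rather than floating-point logarithms.

```python
import math


def comparison_lower_bound(n):
    """Return ceil(lg(n!)): the minimum worst-case number of comparisons
    any comparison-based algorithm needs to sort n distinct elements."""
    leaves = math.factorial(n)          # one leaf per permutation
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)).
    return (leaves - 1).bit_length()
```

For three elements the bound is 3, which matches the height of the insertion sort decision tree in the example with 3 elements.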
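
The simplified Count-Sort can be sketched in Python as follows. The slides leave the step "output based on C" unspecified; the version here assumes that the output is simply each value v repeated C[v] times, which is enough for plain integers but is not the stable, record-carrying variant. The function name `count_sort` and the range check are assumptions added for the example.

```python
def count_sort(a, k):
    """Sort a list of integers drawn from 1..k and return a new list."""
    counts = [0] * (k + 1)            # counts[v] = occurrences of v; index 0 is unused
    for v in a:
        if not 1 <= v <= k:
            raise ValueError("value outside the range 1..k")
        counts[v] += 1
    out = []
    for v in range(1, k + 1):         # emit values in increasing order
        out.extend([v] * counts[v])
    return out
```

The first loop takes time proportional to n and the second to n + k, so the total is linear when k = O(n). No element is ever compared with another, which is why the comparison-based lower bound does not apply.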