Recitation 4: Bucket Sort, Applications of Median

Outline
• Applications of Median
  – Median Partitioned Quicksort
• Bucket Sort

Applications of Median

Median Partitioned Quicksort
• Quicksort has average running time $\Theta(n \lg n)$ but a worst case running time $\Theta(n^2)$.
• Can we improve the worst case?
  – Partition around the median!!
    • Finding the median runs in $\Theta(n)$ worst case!
  – Recursively sort both halves.
• $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \lg n)$ by Master Theorem case 2.
• Theoretically interesting but not used much in practice: the constants hidden in the $\Theta(n)$ median-finding step are large.

Bucket Sort
• Linear time sorts:
  – Counting sort: Bound on the size of the numbers known
  – Radix sort: Bound on the number of bits known
• Linear time sorting of real numbers
  – Bucket sort: Expected run time is $\Theta(n)$ when the inputs are drawn independently from the uniform distribution over [0,1).

Bucket Sort
• Partition [0,1) into n equal-sized intervals (buckets).
• Distribute the inputs into the buckets.
  – A linked list is normally used for each bucket
• Sort each bucket using insertion sort
• Go through the buckets in order, listing the elements.

BUCKET-SORT(A)
    n ← length[A]
    for i ← 1 to n
        do insert A[i] into list B[⌊n·A[i]⌋]
    for i ← 0 to n − 1
        do sort list B[i] with insertion sort
    concatenate the lists B[0], B[1], ..., B[n − 1] together in order

Running time Analysis
• Let $n_i$ be the number of elements that fall into bucket i. Then

\[
T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)
\]

• Worst case running time
  – All elements fall into the same bucket
  – $O(n^2)$
• Average case running time
  – $\Theta(n)$

\[
\begin{aligned}
E[T(n)] &= E\Big[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\Big] \\
&= \Theta(n) + \sum_{i=0}^{n-1} E[O(n_i^2)] \qquad \text{(linearity of expectation)} \\
&= \Theta(n) + \sum_{i=0}^{n-1} O(E[n_i^2])
\end{aligned}
\]

• We claim that $E[n_i^2] = 2 - 1/n$
• So $E[T(n)] = \Theta(n)$

• Let

\[
X_{ij} =
\begin{cases}
1 & \text{if } A[j] \text{ falls in bucket } i \\
0 & \text{otherwise}
\end{cases}
\]

  for i = 0, 1, ..., n − 1 and j = 1, 2, ..., n (indicator random variable).
• We have

\[
n_i = \sum_{j=1}^{n} X_{ij}
\]

• Expanding the square:

\[
\begin{aligned}
E[n_i^2] &= E\Big[\Big(\sum_{j=1}^{n} X_{ij}\Big)^2\Big] \\
&= E\Big[\sum_{j=1}^{n} \sum_{k=1}^{n} X_{ij} X_{ik}\Big] \\
&= E\Big[\sum_{j=1}^{n} X_{ij}^2 + \sum_{1 \le j \le n} \sum_{\substack{1 \le k \le n \\ k \ne j}} X_{ij} X_{ik}\Big] \\
&= \sum_{j=1}^{n} E[X_{ij}^2] + \sum_{1 \le j \le n} \sum_{\substack{1 \le k \le n \\ k \ne j}} E[X_{ij} X_{ik}]
\end{aligned}
\]

• Each input falls in bucket i with probability 1/n, so

\[
E[X_{ij}^2] = 1 \cdot \frac{1}{n} + 0 \cdot \Big(1 - \frac{1}{n}\Big) = \frac{1}{n}
\]

• When k ≠ j, $X_{ij}$ and $X_{ik}$ are independent

\[
E[X_{ij} X_{ik}] = E[X_{ij}]\,E[X_{ik}] = \frac{1}{n} \cdot \frac{1}{n} = \frac{1}{n^2}
\]

• Substituting both values:

\[
\begin{aligned}
E[n_i^2] &= \sum_{j=1}^{n} \frac{1}{n} + \sum_{1 \le j \le n} \sum_{\substack{1 \le k \le n \\ k \ne j}} \frac{1}{n^2} \\
&= n \cdot \frac{1}{n} + n(n-1) \cdot \frac{1}{n^2} \\
&= 1 + \frac{n-1}{n} \\
&= 2 - \frac{1}{n}
\end{aligned}
\]

• Even if the input distribution is not uniform, the expected running time is still linear if $E[n_i^2] = O(1)$.

Exercise
A ship arrives at a port, and the 40 sailors on board go ashore for revelry. When they return, each chooses a random cabin in their state of drunkenness. What is the expected number of sailors in their own cabins?
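One way to sanity-check an answer to the exercise is a small Monte Carlo simulation. The sketch below is not part of the recitation: the function name `simulate_own_cabin`, its parameters, and the modelling assumption that each sailor independently picks one of the cabins uniformly at random are all illustrative assumptions.

```python
import random


def simulate_own_cabin(num_sailors=40, trials=20000, seed=0):
    """Estimate the expected number of sailors who end up in their own cabin.

    Assumption: each sailor independently picks one of the num_sailors
    cabins uniformly at random (several sailors may pick the same cabin).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(trials):
        # Sailor j owns cabin j; count the sailors whose random pick matches.
        total += sum(
            1 for j in range(num_sailors) if rng.randrange(num_sailors) == j
        )
    return total / trials


if __name__ == "__main__":
    print(simulate_own_cabin())
```

The printed estimate can be compared with the value obtained by defining one indicator random variable per sailor and applying linearity of expectation, in the same way the analysis of bucket sort uses the indicators for the bucket sizes.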
Summary
• Application of median:
  – Good worst case theoretical bound for median partitioned quicksort
• Bucket sort:
  – Linear time sorting of real numbers
  – Only for expected run time
  – Only for certain distributions of inputs
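The bucket sort summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `bucket_sort` and `insertion_sort` are hypothetical, Python lists stand in for the linked lists, and the input is assumed to contain only values in [0, 1).

```python
import math


def insertion_sort(bucket):
    """Sort a list in place with insertion sort and return it."""
    for j in range(1, len(bucket)):
        key = bucket[j]
        i = j - 1
        # Shift larger elements one slot to the right.
        while i >= 0 and bucket[i] > key:
            bucket[i + 1] = bucket[i]
            i -= 1
        bucket[i + 1] = key
    return bucket


def bucket_sort(a):
    """Return a sorted copy of a, assuming every value lies in [0, 1)."""
    n = len(a)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]  # n equal-sized intervals of [0, 1)
    for x in a:
        if not 0.0 <= x < 1.0:
            raise ValueError("bucket_sort expects values in [0, 1)")
        # Bucket index is floor(n * x); the min() is a defensive clamp.
        buckets[min(n - 1, math.floor(n * x))].append(x)
    result = []
    for b in buckets:  # visit the buckets in order
        result.extend(insertion_sort(b))
    return result


if __name__ == "__main__":
    print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```

If all inputs land in one bucket, the single insertion sort dominates and the running time becomes quadratic, which is why the linear bound holds only in expectation and only for suitable input distributions.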