TDDB56 – DALGOPT-D Algorithms and optimization
Lecture 10: Sorting (Part III)
Jan Maluszynski - HT 2005

Content:
Lecture 8: Sorting part I:
– Intro: aspects of sorting, different strategies when sorting
– Insertion Sort, Quick Sort, details about Quick Sort
Lecture 9: Sorting part II:
– Heap Sort, Merge Sort, comparison sort movie
Lecture 10: Sorting part III:
– Theoretical lower bound for comparison sorting
– Bucket Sort, Radix Sort
– Selection, median finding: Quick Select

Recall: O(x), Ω(x) and Θ(x)

How fast can comparison based sorting be?
We have seen O(n²) and O(n log n) algorithms... is there anything faster?
No – and we can prove it!
1. Take an arbitrary but fixed data set S of length n.
2. Execute a sorting algorithm and record each comparison between two data elements:
– record a 1 if the comparison was successful
– record a 0 if not
The execution of the algorithm then yields a sequence of 1's and 0's – a trace of all decisions taken.

How fast...
3. What is the minimum length of such a trace?
4. For each possible permutation of S that we sort, there will be a unique trace. Why?
– Consider all possible permutations of S.
– If two different permutations, S and S', when sorted, resulted in the same trace...
– ...then our sorting algorithm would have taken exactly the same decisions for S and S'...
– ...and thus at most one of them could end up correctly sorted!
5. Look at all the traces from all permutations of S at once, as a binary decision tree:
• The root represents the state before the first decision (0 or 1).
• If the next token in the trace is a <, follow the left branch.
• If the next token in the trace is a >, follow the right branch.
[Decision tree: the root compares positions 0:1; its children compare 0:2; the next level compares 1:2; each leaf ends one trace.]

How fast... cont. (recall properties of binary trees)
6. Each leaf node represents the end of one trace.
– Since each trace is unique, there is one leaf node per trace, i.e. one per permutation of S.
– There are n! permutations of S, thus n! leaf nodes.
7. A binary tree with n! leaves has height at least log₂(n!)
(since there can be at most 2^h nodes on level h).
8. n! = n × (n−1) × (n−2) × ... × 1 ≥ (n/2)^(n/2)
– Look at the first n/2 factors: each can be written as k·(n/2) with k ∈ [1..2], so their product alone is ≥ (n/2)^(n/2).
Therefore log(n!) ≥ log((n/2)^(n/2)) = (n/2)·log(n/2).
9. Thus the length of a trace is Ω(n log n)... and so is the execution time of any comparison based sorting algorithm!
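To make step 7's bound concrete, here is a minimal Python check (comparison_lower_bound is my own name, not from the slides): it computes the exact decision-tree bound ⌈log₂(n!)⌉ and compares it with the slide's simpler estimate (n/2)·log₂(n/2).

import math

# Any comparison sort must distinguish n! permutations, so its
# decision tree has n! leaves and height at least log2(n!).
def comparison_lower_bound(n: int) -> int:
    return math.ceil(math.log2(math.factorial(n)))

for n in (4, 16, 256):
    print(n, comparison_lower_bound(n), (n / 2) * math.log2(n / 2))
# n=16: ceil(log2(16!)) = 45 comparisons; estimate (n/2)·log2(n/2) = 24

Both quantities grow as Θ(n log n); the estimate is just easier to derive by hand.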
Bucket-Sort – not comparison based...
• Don't compare keys – use them as indices into a table B.
• If we know that all keys are in {0..255}:
...create a table B with room for 256 counters,
..."move" all items into B, using the key as index,
...copy them back into A in order.

Procedure BucketSort (table <K> A[0:m]):
  table <integer> B[0:|K|]
  for i from 0 to |K| do B[i] ← 0
  for i from 0 to m do B[A[i]]++
  j ← 0
  for i from 0 to |K| do
    while B[i]-- > 0 do A[j++] ← i

...where |K| is the number of different keys of type K.
Example: A = [5 2 0 2 6] → counts B = [1 0 2 0 0 1 1] → A = [0 2 2 5 6]

Bucket-sort – complexity?
Procedure BucketSort (table <K> A[0:m]):
  table <integer> B[0:|K|]
1   for i from 0 to |K| do B[i] ← 0               – t1 = |K|
2   for i from 0 to m do B[A[i]]++                – t2 = m
3   j ← 0
4   for i from 0 to |K| do                        – t4 = |K|
5     while B[i]-- > 0 do A[j++] ← i              – t5 = m in total (each item is written back once)
For a fixed key range |K| we get O(m + |K|).

Bucket-sort – stability? Complex data?
A variant of Bucket Sort: if the table holds complex data, we must temporarily copy each item into a queue for its key (a runnable sketch follows at the end of this part):

Procedure BucketSort (table <K,D> A[0:m]):
  table <Queue> B[0:|K|]
1   for i from 0 to |K| do B[i] ← new Queue()
2   for i from 0 to m do B[A[i].getKey()].append(A[i])
3   j ← 0
4   for i from 0 to |K| do
5     while not B[i].isEmpty() do A[j++] ← B[i].removeFirst()

• Organizing items as lists/queues, e.g.
Input: 7,a ; 1,b ; 3,c ; 7,d ; 3,e ; 7,f
Buckets: B[1] = (1,b) ; B[3] = (3,c)(3,e) ; B[7] = (7,a)(7,d)(7,f) ; the rest empty
Output: 1,b ; 3,c ; 3,e ; 7,a ; 7,d ; 7,f
• Since we use queues, relative order is preserved!
• We still get O(m + |K|) for a fixed size |K|, since all queue operations are O(1).

If keys are large – Radix Sort
• Divide each key into smaller parts, each suitable for Bucket Sort.
• For all keys, sort by the last part using Bucket Sort.
• Redo the sort using the second to last part.
• ...
• Finally sort using the first part of each key.

Radix Sort – Example
Alphabetic keys, split by character, buckets on letters:
mary anna bill adam mona   (input data)
anna mona bill adam mary   (sorted by 4th letter)
adam bill anna mona mary   (sorted by 3rd letter, relative order kept)
mary adam bill anna mona   (sorted by 2nd letter)
adam anna bill mary mona   (sorted by 1st letter – done)
Since Bucket Sort is stable, Radix Sort is stable as well!
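The queue variant translates almost line by line into Python. A minimal sketch, assuming integer keys in 0..num_keys−1 (bucket_sort and its parameter names are mine, not the course's):

from collections import deque

def bucket_sort(items, key, num_keys):
    # One queue per key value, as in table <Queue> B[0:|K|].
    buckets = [deque() for _ in range(num_keys)]
    for item in items:                      # "move" items into B
        buckets[key(item)].append(item)
    out = []
    for b in buckets:                       # copy back in key order;
        while b:                            # queues keep relative order
            out.append(b.popleft())
    return out

pairs = [(7, 'a'), (1, 'b'), (3, 'c'), (7, 'd'), (3, 'e'), (7, 'f')]
print(bucket_sort(pairs, key=lambda p: p[0], num_keys=10))
# [(1,'b'), (3,'c'), (3,'e'), (7,'a'), (7,'d'), (7,'f')] – stable, as on the slide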
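And a matching sketch of Radix Sort on the slide's example, assuming fixed-length lowercase keys (radix_sort is my own name). Each pass is a stable bucket pass on one character position, from the last position to the first:

from collections import deque

def radix_sort(words):
    length = len(words[0])                      # assumes equal-length keys a..z
    for pos in reversed(range(length)):         # last character first
        buckets = [deque() for _ in range(26)]  # one queue per letter
        for w in words:
            buckets[ord(w[pos]) - ord('a')].append(w)
        # stability: queues preserve the order from the previous pass
        words = [w for b in buckets for w in b]
    return words

print(radix_sort(["mary", "anna", "bill", "adam", "mona"]))
# ['adam', 'anna', 'bill', 'mary', 'mona']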
Selection / Median Finding
New problem (but related to sorting):
• Find the median element in a set of unsorted elements.
• Generalized: find the element of rank i (the i-th smallest) in a set.
That's easy! Sort the set and pick element i. But:
• Overkill – we do not need the other elements sorted.
• It limits us to O(n log n) algorithms.
• For i close to either end of the data set we can easily get O(n)... but in general?

Yes – possible in O(n)
procedure quickSelect (table S[1..n], k)
  if n = 1 then return S[1]
  x ← S[random(1,n)]
  partition(S, x, L, E, G)            – divide into three groups:
                                        L smaller than x, E equal to x, G larger than x
  if k ≤ |L| then
    return quickSelect(L, k)          – recurse into the proper group...
  else if k ≤ |L| + |E| then
    return x
  else
    return quickSelect(G, k − |L| − |E|)
(A Python transcription follows at the end of the lecture.)

Proving time complexity of QuickSelect
t(n) ≤ n·g(n) + t(3n/4)
...where g(n) is the number of times we scan all data (bad partitions), with E(g(n)) = 2, and t(3n/4) bounds the time to search the good partition.
• Call a recursive invocation "good" if it partitions S so that |L| ≤ 3n/4 and |G| ≤ 3n/4 ⇒ this happens with probability at least 50%; otherwise it is a bad partition.
• Let's overdo it! Assume a bad partition leaves all n elements in either L or G.
• g(n) = number of recursive calls before we get a good partition. Since each try succeeds with probability 50% and the tries are independent, we expect a good partition after 2 calls ⇒ E(g(n)) = 2.
[Figure: e.g. 5 1 8 3 7 2 4 6 – a good partition leaves a part of size between n/4 and 3n/4 to recurse into; assume the worst (largest) size 3n/4.]

Proving time complexity of QuickSelect (cont.)
• Recall linearity of expectation E:
– E(x + y) = E(x) + E(y)
– E(cx) = c·E(x) (constants move out)
E(t(n)) ≤ E(n·g(n) + t(3n/4)) = n·E(g(n)) + E(t(3n/4)) = 2n + E(t(3n/4))
Reapply: E(t(n)) ≤ 2n + 2n·(3/4) + E(t(n·(3/4)²))
And again: E(t(n)) ≤ 2n + 2n·(3/4) + 2n·(3/4)² + E(t(n·(3/4)³))

... O(n) in expected case??
Continuing until the subproblem has size 1 (after log_{4/3} n steps):
E(t(n)) ≤ 2n · Σ_{i=0..log_{4/3} n} (3/4)^i
This is a geometric sum: Σ_{i=0..n} a^i = (a^(n+1) − 1)/(a − 1), which for a < 1 stays below a constant c.
Here a = 3/4, so the sum is below 1/(1 − 3/4) = 4.
Thus E(t(n)) ≤ 2nc = 8n ⇒ O(n).
Comment: when we repeatedly divide the data in half we get log₂ n levels; choosing a partition of size 3n/4 gives log_{4/3} n levels instead.
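For reference, a direct Python transcription of the quickSelect pseudocode (quick_select is my naming; the slides leave partition abstract, here it is done with three list comprehensions):

import random

def quick_select(S, k):
    # k-th smallest element of S, 1-indexed; expected O(n).
    if len(S) == 1:
        return S[0]
    x = random.choice(S)                 # x ← S[random(1,n)]
    L = [e for e in S if e < x]          # smaller than x
    E = [e for e in S if e == x]         # equal to x
    G = [e for e in S if e > x]          # larger than x
    if k <= len(L):
        return quick_select(L, k)
    if k <= len(L) + len(E):
        return x
    return quick_select(G, k - len(L) - len(E))

data = [5, 1, 8, 3, 7, 2, 4, 6]
print(quick_select(data, (len(data) + 1) // 2))   # median → 4

The three-way partition is what makes duplicates harmless: if rank k falls inside E, the pivot itself is the answer.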
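To see the 8n bound in practice, here is a small instrumented variant (my own illustration, not from the slides) that counts how many elements the partition passes scan in total; on random inputs the count typically lands well below the pessimistic 8n:

import random

def quick_select_cost(S, k):
    scanned = 0
    while len(S) > 1:
        scanned += len(S)                # one full pass to partition S
        x = S[random.randrange(len(S))]
        L = [e for e in S if e < x]
        E = [e for e in S if e == x]
        G = [e for e in S if e > x]
        if k <= len(L):
            S = L
        elif k <= len(L) + len(E):
            return x, scanned
        else:
            S, k = G, k - len(L) - len(E)
    return S[0], scanned

n = 100_000
data = random.sample(range(10 * n), n)
_, cost = quick_select_cost(data, (n + 1) // 2)
print(cost, "elements scanned; 8n =", 8 * n)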