TDDB57 DALGOPT-D – Lecture 11: Sorting II: Information-theoretic lower bound, digital sorting.
J. Maluszynski, IDA, Linköpings Universitet, 2005.

Overview
  Mergesort [L/D p. 29]
  Lower Bound of Comparison-based Sorting [L/D 11.5]
  Bucket sort, Radix sort, Radix Exchange sort [L/D 11.6]
  External Sorting: Merge sorts; Generating the Initial Runs [L/D 11.7]

Mergesort Idea
  Split: divide the table into halves, recursively; a one-element table is sorted
  Merge: merge two sorted tables into one
  [Figure: mergesort on the eight keys 3, 4, 7, 9, 10, 13, 33, 55; the table is split down to one-element tables, which are then merged pairwise into the sorted table]

Merge Sort
  procedure MergeSort(table T[a..b]):
    if a ≥ b then return
    middle ← ⌊(a + b)/2⌋
    MergeSort(T[a..middle])
    MergeSort(T[middle+1..b])
    Merge(T[a..middle], T[middle+1..b])

Mergesort analysis
  Split in constant time (at every level)
  At most n − 1 comparisons to merge at every level
  Number of levels: O(log n)
  Mergesort time complexity is O(n log n)
  Cannot be done in place: extra memory needed for merging

Binary decision tree T:
  internal nodes correspond to comparisons
  every leaf corresponds to a permutation of the initial sequence
  all permutations appear: n! permutations
  [Figure: decision tree for sorting three elements; internal nodes are labelled with the compared positions (0:1, 1:2), edges with < and >]
  Stirling's approximation: n! = √(2πn) · (n/e)^n · (1 + Θ(1/n))
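The MergeSort procedure and the decision-tree view of comparison sorting can be tried out together. The following is a minimal Python sketch, not the course's reference implementation: `merge_sort` follows the slide's inclusive bounds a..b and additionally counts key comparisons, and `comparison_lower_bound` (a helper name chosen here for illustration) evaluates ⌈log2 n!⌉, the minimum height of a binary tree with n! leaves.

```python
import math


def merge_sort(table, a=0, b=None):
    """Sort table[a..b] (inclusive bounds) and return the number of key comparisons."""
    if b is None:
        b = len(table) - 1
    if a >= b:                       # zero or one element: already sorted
        return 0
    middle = (a + b) // 2
    count = merge_sort(table, a, middle) + merge_sort(table, middle + 1, b)

    # Merge the two sorted halves; the copies are the extra memory merging needs.
    left = table[a:middle + 1]
    right = table[middle + 1:b + 1]
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        count += 1
        if left[i] <= right[j]:      # "<=" keeps equal keys in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    table[a:b + 1] = merged
    return count


def comparison_lower_bound(n):
    """Worst-case comparisons any comparison-based sort needs: ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))


keys = [13, 4, 55, 7, 10, 33, 9, 3]
used = merge_sort(keys)
print(keys)                          # [3, 4, 7, 9, 10, 13, 33, 55]
print(used, comparison_lower_bound(len(keys)))
```

The lower bound concerns the worst case over all inputs, so on one particular input mergesort may well use fewer comparisons than ⌈log2 n!⌉.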
Information-Theoretic Lower Bound for Comparison-based Sorting
  How many comparisons for sorting n elements?
  The decision tree T has at least n! leaves
  The height of T is at least log2 n! = number of comparisons in the worst case
  By Stirling's approximation of n!: log2 n! ∈ Ω(n log n)

Digital Sorting
  No key comparison.
  Use properties of the binary representation of a key, e.g. to compute its address or table index in the sorted sequence.
  The lower bound for comparison-based sorting does not apply!
  + Bucket Sort
  + Radix Sort
  + Radix Exchange Sort (no slides, see the book)

Bucket Sort [L/D Algorithm 11.9]
  Prerequisite: keys in domain 0..N−1
  1. Create a table S[0..N−1] of pointers
  2. Place in S[k] pointer(s) to record(s) with key k (for multiple occurrences of the same key: use linked lists or counters)
  3. Scan S, concatenate elements pointed to from nonempty fields of S
  [Figure: bucket sort example; records placed in the table S by key]
  Time complexity of bucket sort? O(n + N) if N fits the word size ...

Radix Sort
  For long keys:
  Split key k into K parts: key_0, ..., key_{K−1}
  Bucket sort on successive parts, in ascending order of significance, keeping the relative order of items within each bucket (stable!)
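The bucket sort steps and their repeated use in radix sort might look as follows in Python. This is a sketch under stated assumptions: Python lists stand in for the linked lists hanging off the table S, the function names `bucket_sort_by` and `radix_sort_words` are made up for this illustration, and the keys are assumed to be lowercase words over a to z that all have the same length.

```python
def bucket_sort_by(items, key_part, n_buckets):
    """Stable bucket sort of items on key_part(item), a value in range(n_buckets)."""
    buckets = [[] for _ in range(n_buckets)]   # table S[0..N-1]
    for item in items:
        buckets[key_part(item)].append(item)   # appending keeps arrival order
    result = []
    for bucket in buckets:                     # scan S, concatenate nonempty fields
        result.extend(bucket)
    return result


def radix_sort_words(words, length):
    """Radix sort: one stable bucket sort per character, least significant first."""
    for pos in range(length - 1, -1, -1):
        words = bucket_sort_by(words, lambda w, p=pos: ord(w[p]) - ord("a"), 26)
    return words


print(bucket_sort_by([3, 1, 2, 1], lambda k: k, 4))
# [1, 1, 2, 3]
print(radix_sort_words(["mary", "anna", "bill", "adam", "mona"], 4))
# ['adam', 'anna', 'bill', 'mary', 'mona']
```

Each pass costs O(n + N) for n items and N buckets, so K passes cost O(K(n + N)). Stability of the per-part bucket sort is what lets later passes preserve the order established by earlier ones.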
Radix Sort Example
  Example: alphabetic keys, split by character, buckets on letters
    mary anna bill adam mona
    anna mona bill adam mary   (sorted by 4th letter)
    adam bill anna mona mary   (sorted by 3rd letter, keep relative order)
    mary adam bill anna mona   (sorted by 2nd letter)
    adam anna bill mary mona   (sorted by 1st letter)

A comment on External Sorting
  Based on merge sort
  Place blocks of sorted information (runs) on external devices
  Read two runs, merge them and write as a new run
  Minimize the number of accesses: initial runs as large as possible

Straight Merge Sort
  [Figure: straight merge sort of runs 0 to 7 on four tapes (Tape 1 to Tape 4) over passes 1 to 4, ending with the single run 01234567]

Replacement Selection
  Generating the initial runs by Replacement Selection:
  a buffer totally filled with two heaps H1 and H2
  fill H1 with input data; H2 is empty
  Repeat:
    if Empty(H1) then H1 ← H2
    x ← DeleteMin(H1); output(x)
    input(y)
    if y ≥ x then insert(y, H1) else insert(y, H2)

Internal Sorting Summary
  Sorting algorithm    Worst case     Average        Stable
  Insertion sort       Θ(n^2)         Θ(n^2)         yes
  Selection sort       Θ(n^2)         Θ(n^2)         yes
  Heap sort            Θ(n log n)     Θ(n log n)     no
  Quick sort           Θ(n^2)         Θ(n log n)     no
  Merge sort           Θ(n log n)     Θ(n log n)     yes
  Bucket/Radix sort    Θ(n)           Θ(n)           ?
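The replacement selection loop from the external sorting part of this lecture can be illustrated in Python. This is a minimal in-memory sketch, not an external-memory implementation: the function name `replacement_selection`, the `buffer_size` parameter, and returning the runs as lists (instead of writing them to tapes) are assumptions made for this example. The two heaps H1 and H2 are kept as separate `heapq` lists whose combined size never exceeds the buffer size.

```python
import heapq

_END = object()   # sentinel marking the end of the input (illustrative helper)


def replacement_selection(stream, buffer_size):
    """Split an input iterable into sorted initial runs using a buffer of buffer_size keys."""
    source = iter(stream)
    h1 = []                          # H1: keys that can still join the current run
    h2 = []                          # H2: keys held back for the next run
    for key in source:               # fill H1 with input data; H2 is empty
        h1.append(key)
        if len(h1) == buffer_size:
            break
    heapq.heapify(h1)

    runs, current = [], []
    while h1 or h2:
        if not h1:                   # current run is finished: H1 <- H2
            runs.append(current)
            current = []
            h1, h2 = h2, []
        x = heapq.heappop(h1)        # x <- DeleteMin(H1); output(x)
        current.append(x)
        y = next(source, _END)       # input(y), if any input is left
        if y is not _END:
            if y >= x:
                heapq.heappush(h1, y)    # y can still extend the current run
            else:
                heapq.heappush(h2, y)    # y must wait for the next run
    if current:
        runs.append(current)
    return runs


print(replacement_selection([5, 1, 9, 2, 8, 3, 7], 3))
# [[1, 2, 5, 8, 9], [3, 7]]
```

With a buffer of three keys, the first run here already holds five keys, which shows how replacement selection produces initial runs longer than the buffer.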