TDDB57 DALG – Lecture 9: Sorting I: Comparison-based sorting.
J. Maluszynski, IDA, Linköpings Universitet, 2004.

Sorting: Overview

Lecture 9: Comparison-based sorting
    Kinds of sorting [L/D 11.1]
    Insertion sort [L/D 11.2]
    Selection Sort and Heap Sort [L/D 11.3]
    Lab introduction: heapsort
Lecture 10: QuickSort, lab introduction
    QuickSort [L/D 11.4]
    Lab introduction: quicksort
Lecture 11: Comparison-based sorting 2; digital sorting
    Mergesort [L/D p. 29]
    Lower Bound of Comparison-based Sorting [L/D 11.5]
    Bucket sort and Radix sort [L/D 11.6]

The Sorting Problem

Input: A list $L$ of data items with keys.
Output: A list $L'$ of the same items placed in order of keys.

If the items are much larger than the keys (e.g., database records): do not move the items themselves, but sort an array of pointers to them.

Kinds of Sorting

Internal sorting: all data kept in main memory.
External sorting: not all data can be kept in main memory.

Sorting in place: the original data items are rearranged within their positions in memory.
Sorting with auxiliary data structure: extra memory is used.

Sorting by comparison: the (keys of) data items are compared to establish their relative order.

Insertion Sort

In-place sorting algorithm using a table $A[0..n-1]$.

At the beginning of each step $i$, for $1 \le i < n$, the table consists of:
    left, sorted part $A[0..i-1]$,
    marked element $x = A[i]$,
    right, unsorted part $A[i+1..n-1]$.

The marked element $x$ is inserted in the sorted part; the rest of the sorted part is shifted to the right:

    $j \leftarrow i$
    while $j \ge 1$ and $A[j-1] > x$ do
        $A[j] \leftarrow A[j-1]$
        $j \leftarrow j-1$
    $A[j] \leftarrow x$
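The insertion step described above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `insertion_sort` and the use of a Python list for the table A are assumptions made for the example.

```python
def insertion_sort(a):
    """Sort the list a in place by insertion and return it.

    Hypothetical helper: mirrors the shift-right loop of the slide,
    with a[0..i-1] sorted and x = a[i] the marked element.
    """
    for i in range(1, len(a)):
        x = a[i]          # marked element
        j = i
        # Shift larger elements of the sorted part one step to the right.
        while j >= 1 and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x          # insert x into the gap
    return a


data = [5, 2, 4, 6, 1, 3]
insertion_sort(data)
print(data)  # [1, 2, 3, 4, 5, 6]
```

Because the loop only shifts elements that are strictly greater than x, items with equal keys keep their relative order in this sketch.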
Kinds of Sorting (continued)

Digital sorting: the binary representation of (the key of) a data item is used to determine its resulting position.

Stable sorting algorithm: items with equal keys never change their relative position.
Unstable sorting algorithm: otherwise.

Iterative in-place sorting

Sorting by insertion: place the next unsorted element in its place in the sorted part.
    Insertion Sort, Shell Sort
Sorting by selection: find the least unsorted element and attach it to the sorted list.
    Selection Sort, Heap Sort

Insertion Sort Complexity

What is the worst case?
Consider the number of inversions in the unsorted array = number of pairs $(A[i], A[j])$, for all $i < j$, with $A[i] > A[j]$.
Each inversion requires a comparison.
Number of comparisons in the worst case: $n(n-1)/2$, i.e. $O(n^2)$.
For which data does it work well? Estimation?

Shell Sort Idea

Motivation: reduce the maximal shifting distance in each phase.

InsertionSort only elements at positions $i, i+h, i+2h, \ldots$
Repeat for $i = 0, 1, \ldots, h-1$.
InsertionSort the result.
Improvement: iterate on $k$, decreasing $h_k$.

Example for $h = 5$:

    A (input):              11 10  9  8  7  6  5  4  3  2  1  0
    A0 (positions 0,5,10):  11  6  1   sorted:  1  6 11
    A1 (positions 1,6,11):  10  5  0   sorted:  0  5 10
    A2 (positions 2,7):      9  4      sorted:  4  9
    A3 (positions 3,8):      8  3      sorted:  3  8
    A4 (positions 4,9):      7  2      sorted:  2  7
    A (after h = 5):         1  0  4  3  2  6  5  9  8  7 11 10
    A (after InsertionSort): 0  1  2  3  4  5  6  7  8  9 10 11

Choosing Shell Sort Increments

For $h = 1$, Shell Sort is InsertionSort.
How many increments?
Empirical studies show good results with:
    $h_1 = 1$, $h_{i+1} = 3h_i + 1$ while $h_{i+1} < n$
    ($h = 1, 4, 13, 40, 121, \ldots$), taken in decreasing order.

Selection Sort

In place, using a table $A[0..n-1]$.
(Figure: in each step the minimum (min) of the unsorted part is found and swapped (swap) to the end of the sorted part.)
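The two in-place methods above, Shell sort with the $3h+1$ increments and selection sort, can be sketched in Python as follows. The names `shell_increments`, `h_sort`, `shell_sort` and `selection_sort` are hypothetical helpers chosen for this illustration. The `h_sort` pass handles the h subsequences in one interleaved loop rather than one after another, which is an implementation choice assumed here and leaves each subsequence sorted just the same.

```python
def shell_increments(n):
    """Increments 1, 4, 13, 40, ... (h -> 3h + 1) below n, largest first."""
    hs = [1]
    while 3 * hs[-1] + 1 < n:
        hs.append(3 * hs[-1] + 1)
    return hs[::-1]


def h_sort(a, h):
    """Insertion-sort every subsequence i, i+h, i+2h, ... of a, in place."""
    for i in range(h, len(a)):
        x = a[i]
        j = i
        # Shift larger elements of the same subsequence h steps to the right.
        while j >= h and a[j - h] > x:
            a[j] = a[j - h]
            j -= h
        a[j] = x
    return a


def shell_sort(a):
    """Shell sort: one h_sort pass per increment, ending with h = 1."""
    for h in shell_increments(len(a)):
        h_sort(a, h)
    return a


def selection_sort(a):
    """Selection sort: swap the minimum of the unsorted part into place."""
    n = len(a)
    for i in range(n - 1):
        m = i                      # index of the minimum found so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]    # swap min to the end of the sorted part
    return a


print(h_sort(list(range(11, -1, -1)), 5))
# [1, 0, 4, 3, 2, 6, 5, 9, 8, 7, 11, 10]
```

The printed pass reproduces the $h = 5$ step for the descending input 11, 10, ..., 0. The inner loop of `selection_sort` always runs to the end of the table, so its number of comparisons does not depend on the input order.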
Complexity of Selection Sort

Is there a “worst case”?
Number of comparisons: $n(n-1)/2$, i.e. $O(n^2)$.
How to improve it?

Heap Sort

Improving Selection Sort:
    Put data items on heap in $O(n \log n)$ time.
    Delete minimal data items from heap until empty.
How to do this in place?

Unsorted to Heap: In Place
(Figure: step-by-step array and tree diagrams; not recoverable from the extraction.)

Heap to Sorted: In Place
(Figure: step-by-step array and tree diagrams; not recoverable from the extraction.)

Complexity of Heap Sort

The worst time of HeapSort is $O(n \log n)$. Explain!

Lab 5 Introduction
Yet another heap, and heap sort

The main idea:
    Partially ordered trees with reverse order: parent not smaller than the children.
    Heapification: as usual (see Lecture 8), but a heap with reverse order: heapification in $T$ with root at $T[0]$, maximal key at $T[0]$.
    Sorting:
        Residual heap kept at the beginning of $T$.
        deleteMax exchanges $T[0]$ with the last element of the heap.
        shiftdown: you have to implement re-heapification.
        Elements beyond the heap are sorted.
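The main idea of the lab can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the lab solution: the names `shift_down` and `heap_sort` are hypothetical, the children of index i are assumed to sit at 2i+1 and 2i+2 (the usual layout for a root at T[0]), and the heap is built bottom-up with `shift_down`, which is one common way and may differ from the heapification shown in Lecture 8.

```python
def shift_down(t, root, size):
    """Re-heapify t[0:size] below index root (max-heap: parent >= children)."""
    while True:
        child = 2 * root + 1               # left child (assumed layout)
        if child >= size:
            return                         # root is a leaf
        if child + 1 < size and t[child + 1] > t[child]:
            child += 1                     # pick the larger child
        if t[root] >= t[child]:
            return                         # heap order already holds
        t[root], t[child] = t[child], t[root]
        root = child


def heap_sort(t):
    """Sort the list t in place with a max-heap kept at the front of t."""
    n = len(t)
    # Heapification (assumption: bottom-up, from the last parent to the root).
    for root in range(n // 2 - 1, -1, -1):
        shift_down(t, root, n)
    # deleteMax: exchange t[0] with the last heap element, shrink the heap,
    # then restore heap order; elements beyond the heap are sorted.
    for last in range(n - 1, 0, -1):
        t[0], t[last] = t[last], t[0]
        shift_down(t, 0, last)
    return t


print(heap_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Each `shift_down` call follows a single path from its start index towards a leaf, so it takes at most about log n swaps; with n - 1 deleteMax steps this is consistent with the $O(n \log n)$ worst case stated above.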