Object-Oriented Programming in Python
Goldwasser and Letscher
Chapter 14: Sorting Algorithms
Terry Scott, University of Northern Colorado
2007 Prentice Hall

Introduction: What is Covered in Chapter 14
• Customizing use of Python's sort.
• Selection Sort.
• Insertion Sort.
• Merge Sort.
• Quicksort.
• Which algorithm does Python use?

Standard Lexicographic Sorting
    >>> data = ['bread', 'soda', 'cheese', 'milk', 'pretzels']
    >>> data.sort()
    >>> print data
    ['bread', 'cheese', 'milk', 'pretzels', 'soda']

lengthCmp Method
    # A comparison function for lengths.
    # Return -1, 0, or 1 depending on whether len(a) < len(b),
    # len(a) == len(b), or len(a) > len(b).
    def lengthCmp(a, b):
        if len(a) < len(b):
            return -1
        elif len(a) == len(b):
            return 0
        else:
            return 1

Sorting by String Length
    >>> # can use the built-in cmp function
    >>> data = ['bread', 'soda', 'cheese', 'milk', 'pretzels']
    >>> def lengthCmp(a, b): return cmp(len(a), len(b))
    >>> data.sort(lengthCmp)    # lengthCmp with no parentheses is a reference to the function
    >>> print data
    ['soda', 'milk', 'bread', 'cheese', 'pretzels']

Sorting Decorated Tuples
    >>> data = ['bread', 'soda', 'cheese', 'milk', 'pretzels']
    >>> decorated = []
    >>> for s in data: decorated.append((len(s), s))
    >>> print decorated
    [(5, 'bread'), (4, 'soda'), (6, 'cheese'), (4, 'milk'), (8, 'pretzels')]
    >>> decorated.sort()

Sorting Decorated Tuples (continued)
    >>> for i in range(len(data)): data[i] = decorated[i][1]
    >>> print data
    ['milk', 'soda', 'bread', 'cheese', 'pretzels']
• Tuples with equal lengths are ordered by their second element, so 'milk' now comes before 'soda'.

Decorator Function
    # The sort method has a built-in ability to apply a decorator function.
    def lengthDecorator(s):
        return len(s)

    data.sort(key=lengthDecorator)    # keyword parameter passing

    # Of course we could have just done
    data.sort(key=len)

Selection Sort
1. Make one pass through the list and find the smallest item.
2. Swap the element at position 0 with the smallest item that was just found.
3. Repeat steps 1 and 2 with the rest of the list.
4. Steps 1 and 2 are performed n - 1 times, where n is the number of items. Once n - 1 items are in place, the nth item must be in place as well.
5. The order of the algorithm is n², where n is the number of items in the list.

Selection Sort Code
    def selectionSort(data):
        hole = 0                            # next index to fill
        while hole < len(data) - 1:         # last item sorted is in place
            small = hole
            walk = hole + 1
            while walk < len(data):
                if data[walk] < data[small]:
                    small = walk            # new minimum found
                walk += 1
            data[hole], data[small] = data[small], data[hole]
            hole += 1

[Figure: First Seven Passes of Selection Sort]

Insertion Sort
• The first item is already in its correct place among the first 1 items.
• Place the 2nd item in its correct spot among the first 2 items.
• Place the 3rd item in its correct spot among the first 3 items.
• Continue until all items are sorted.
• This is done once per item, so n times in all.
• On average, each pass moves the item past about half of the items looked at.
• The worst-case order is n², but if the list is nearly ordered the order can be nearly n.
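The difference between the worst case and the nearly ordered case can be made concrete by counting how many items insertion sort slides. The sketch below is an illustration only: the function name insertion_sort_shifts and the two sample lists are assumptions, not part of the chapter, and the code is written to run under Python 3.

```python
def insertion_sort_shifts(data):
    """Insertion-sort a copy of data; return (sorted_copy, number_of_shifts)."""
    items = list(data)          # work on a copy so the caller's list is untouched
    shifts = 0                  # how many times an item is slid one position
    for nxt in range(1, len(items)):
        value = items[nxt]
        hole = nxt
        while hole > 0 and items[hole - 1] > value:
            items[hole] = items[hole - 1]
            hole -= 1
            shifts += 1
        items[hole] = value
    return items, shifts


nearly_sorted = [1, 2, 4, 3, 5, 6, 8, 7]      # hypothetical data: two adjacent pairs out of order
reversed_list = list(range(8, 0, -1))         # hypothetical data: worst case for 8 items

print(insertion_sort_shifts(nearly_sorted)[1])    # 2
print(insertion_sort_shifts(reversed_list)[1])    # 28, which is 8 * 7 / 2
```

For the reversed list every item must slide past all the items before it, giving n(n - 1)/2 shifts, while the nearly ordered list needs only a couple of shifts on top of the n - 1 passes.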
Insertion Sort Code
    def insertionSort(data):
        next = 1                            # index of next item to insert
        while next < len(data):
            value = data[next]              # will insert this value in place
            hole = next
            while hole > 0 and data[hole-1] > value:
                data[hole] = data[hole-1]   # slide data[hole-1] forward
                hole -= 1                   # and the hole back one
            data[hole] = value
            next += 1

[Figure: First Seven Passes of Insertion Sort]

Merge Sort
• Divide and conquer solutions: solve the problem by dividing it into smaller problems and solving the smaller problems.
• Merge sort is a divide and conquer algorithm.
• If the list is divided in half and each half is sorted, then the two halves can be merged to make the entire list sorted.
• Each half of the list is sorted in the same way: divide it in half, sort each half, then merge them.
• This is repeated recursively until the list size is 1, at which point the list is trivially sorted. It is merged with another list of size 1, and the merging continues until the whole list is sorted.
• To avoid having to merge in place, the data is copied into a temporary list.

Merge Sort Code
    def merge(data, start, mid, stop, temp):
        i = start
        while i < stop:                     # copy data[start:stop] into temp
            temp[i] = data[i]
            i += 1
        mergedMark = start
        leftMark = start
        rightMark = mid

Merge Sort Code (continued)
        while mergedMark < stop:
            if leftMark < mid and (rightMark == stop or temp[leftMark] < temp[rightMark]):
                data[mergedMark] = temp[leftMark]
                leftMark += 1
            else:
                data[mergedMark] = temp[rightMark]
                rightMark += 1
            mergedMark += 1

Merge Diagram
• The next slide shows how data would be merged.
• This data is in the temporary buffer and will be copied back into the original list.
• It starts by comparing 2 and 3, which are at the start of the first sorted list and the start of the second sorted list.
• 2 is copied back to the original list since it is smaller.
• Then 3 and 9 are compared, and 3 is copied.
• This continues until the end of both lists is reached.

[Figure: Merge Sort]

Recursive Merge Sort
• To perform the steps of the merge sort, first set up a function whose signature is easy for the user to call.
• That function calls a recursive helper, which keeps halving the list until the size is down to one and combines the halves with the previously defined merge.

Recursive Merge Sort Code
    def mergeSort(data):
        # Create the temporary list here so that it is allocated only once.
        _recursiveMergeSort(data, 0, len(data), [None] * len(data))

    def _recursiveMergeSort(data, start, stop, tempList):
        if start < stop - 1:                # more than one item remains
            mid = (start + stop) // 2
            _recursiveMergeSort(data, start, mid, tempList)
            _recursiveMergeSort(data, mid, stop, tempList)
            _merge(data, start, mid, stop, tempList)

    # _merge is the merge function defined previously, now treated as a local (private) function.

Merge Sort
• First line is the unsorted list.
• Second line is after the left half is sorted.
• Third line is after the right half is sorted.
• Fourth line is after the merge has occurred.

Merge Sort
• This is similar to the previous figure but shows the sorting of the left half of the list.
• Merge sort has order Θ(n log₂ n).

Quick Sort
• Quick sort is another divide and conquer algorithm.
• In the best case it has order Θ(n log₂ n), the same as merge sort.
• The worst case occurs when a partition leaves all of the other items on one side of the pivot, either all larger or all smaller.
  If this were to occur for every partition, the order would degenerate to Θ(n²).
• It has the advantage over merge sort of not requiring an extra temporary list.

Quick Sort Algorithm
• Partition the list into two pieces: smaller values and larger values.
• To do this, pick a pivot item. Here the last item in the list is chosen as the pivot.
• All items less than the pivot are placed at the beginning of the list, and those that are larger are placed at the end of the list.

Quick Sort
• The picture shows the process partway through the partitioning.
• The 14 is the pivot item.
• 9, 10, 2, and 10 are smaller than the pivot and are at the beginning of the list.
• 20 and 18 are larger and come after the smaller items.
• The algorithm is now ready to process 17. It is larger than the pivot and so remains where it is.
• The 3 needs to be moved up to the smaller items, so it will be swapped with the 20.
• This is shown in more detail on the next slide.
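The partitioning step described above can be traced in code. This is a sketch under stated assumptions: the list below is hypothetical (it contains the values named on the slide, but the full list in the original figure is not known), and partition_trace is an illustrative name. It uses the last item as the pivot, as the slides do, and is written to run under Python 3.

```python
def partition_trace(data, start, stop):
    """Partition data[start:stop] around its last item.

    Returns (final_pivot_index, swaps), where swaps lists each pair of
    distinct values that exchanged places.
    """
    pivot_val = data[stop - 1]
    big = start                 # index of the first item known to be larger than the pivot
    swaps = []
    for unknown in range(start, stop):
        if data[unknown] <= pivot_val:
            if big != unknown:  # record only swaps that actually move something
                swaps.append((data[big], data[unknown]))
            data[big], data[unknown] = data[unknown], data[big]
            big += 1
    return big - 1, swaps


values = [9, 10, 2, 10, 20, 18, 17, 3, 14]    # hypothetical list; 14 is the pivot
pivot_index, swaps = partition_trace(values, 0, len(values))

print(swaps)          # [(20, 3), (18, 14)]
print(values)         # [9, 10, 2, 10, 3, 14, 17, 20, 18]
print(pivot_index)    # 5
```

The first recorded swap is the one the slide describes, the 3 trading places with the 20. The second swap drops the pivot between the smaller and larger groups.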
[Figure: Quick Sort]

[Figure: Quick Sort (continued)]

Quick Sort Partitioning Code
    def _quickPartition(data, start, stop):
        """Partition data[start:stop] around the pivot value.

        Return the final index of the pivot."""
        pivotVal = data[stop - 1]
        big = unknown = start               # initially everything is unknown
        while unknown < stop:
            if data[unknown] <= pivotVal:
                data[big], data[unknown] = data[unknown], data[big]
                big += 1
            unknown += 1
        return big - 1

Quick Sort Recursive Code
    def _recursiveQuicksort(data, start, stop):
        if start < stop - 1:
            pivot = _quickPartition(data, start, stop)
            _recursiveQuicksort(data, start, pivot)
            _recursiveQuicksort(data, pivot + 1, stop)

Quick Sort: Complete Trace
• First line: the unsorted data.
• Second line: after the first partition. The arrow shows the pivot element being moved.
• Remaining lines show the succeeding recursions.

Which Algorithm Does Python Use for Sorting?
• For larger lists it uses a variant of merge sort that does not require as much extra space as the version discussed in this chapter.
• Once the pieces being sorted are small, it uses insertion sort.
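The idea of combining the two algorithms can be sketched as follows. This is not Python's actual sorting algorithm, only a simplified illustration of switching from merge sort to insertion sort on small pieces. The names hybrid_sort and CUTOFF, the cutoff value of 8, and the sample list are all assumptions made for the example. The code is written to run under Python 3.

```python
CUTOFF = 8    # hypothetical threshold, chosen only for illustration


def _insertion_sort_range(data, start, stop):
    """Insertion-sort the slice data[start:stop] in place."""
    for nxt in range(start + 1, stop):
        value = data[nxt]
        hole = nxt
        while hole > start and data[hole - 1] > value:
            data[hole] = data[hole - 1]
            hole -= 1
        data[hole] = value


def _merge(data, start, mid, stop, temp):
    """Merge sorted data[start:mid] and data[mid:stop] back into data."""
    temp[start:stop] = data[start:stop]
    left, right = start, mid
    for merged in range(start, stop):
        # <= takes from the left half on ties, keeping equal items in order
        if left < mid and (right == stop or temp[left] <= temp[right]):
            data[merged] = temp[left]
            left += 1
        else:
            data[merged] = temp[right]
            right += 1


def _hybrid(data, start, stop, temp):
    if stop - start <= CUTOFF:
        _insertion_sort_range(data, start, stop)    # small piece: insertion sort
    else:
        mid = (start + stop) // 2                   # large piece: split and merge
        _hybrid(data, start, mid, temp)
        _hybrid(data, mid, stop, temp)
        _merge(data, start, mid, stop, temp)


def hybrid_sort(data):
    """Sort data in place, allocating the temporary list once."""
    _hybrid(data, 0, len(data), [None] * len(data))


values = [37, 5, 12, 90, 1, 44, 8, 23, 61, 3, 70, 15]    # hypothetical data
hybrid_sort(values)
print(values)    # [1, 3, 5, 8, 12, 15, 23, 37, 44, 61, 70, 90]
```

With twelve items and a cutoff of 8, the list is split once into two halves of six, each half is insertion sorted, and the halves are merged, so both code paths are exercised.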