Uploaded by andres elias

Collins ch11

Chapter 11
Sorting
Assumptions:
- Mostly, comparison-based sorts
- We will sort sequences only:
  - Array
  - ArrayList
  - LinkedList
For now, assume we want to sort x [0], x [1], …, x [n – 1], an array of n int values.
Insertion sort:
public static void insertionSort (int[ ] x)
{
for (int i = 1; i < x.length; i++)
{
// At this point, x [0] <= x [1] <= … <= x [i-1].
// Sift x [i] down to its proper place.
} // for
} // insertionSort
Example:
80, 60, 90, 75, 55, 90, …
Sorted prefix x [0 … i] after each pass of the outer loop:
start: 80
i = 1: 60, 80
i = 2: 60, 80, 90
i = 3: 60, 75, 80, 90
i = 4: 55, 60, 75, 80, 90
i = 5: 55, 60, 75, 80, 90, 90
Sift x [i] down by comparing it to x [i-1],
x [i-2], …, until the proper place for x [i]
is found.
/**
* Sorts a specified array of int values into ascending
* order.
* The worstTime(n) is O(n * n).
*
* @param x – the array to be sorted.
*
*/
public static void insertionSort (int[ ] x)
{
for (int i = 1; i < x.length; i++)
for (int j = i; j > 0 && x [j -1] > x [j]; j--)
swap (x, j, j -1);
} // method insertionSort
Example, starting with i = 3:
60, 80, 90, 75, 55, 90 // i = 3; j = 3; swap
60, 80, 75, 90, 55, 90 // i = 3; j = 2; swap
60, 75, 80, 90, 55, 90 // i = 3; j = 1; no swap
60, 75, 80, 90, 55, 90 // i = 4; j = 4; swap
60, 75, 80, 55, 90, 90 // i = 4; j = 3; swap
and so on
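The method above calls a swap helper that the slides do not show. The following is a minimal runnable sketch, assuming swap simply exchanges two array elements; the class name InsertionSortDemo is made up for illustration.

```java
import java.util.Arrays;

class InsertionSortDemo {
    /** Swaps x[i] and x[j]. Assumed helper: the slides call swap but do not define it. */
    static void swap(int[] x, int i, int j) {
        int tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    /** Sorts x into ascending order by sifting each x[i] down into the sorted prefix. */
    static void insertionSort(int[] x) {
        for (int i = 1; i < x.length; i++)
            for (int j = i; j > 0 && x[j - 1] > x[j]; j--)
                swap(x, j, j - 1);
    }

    public static void main(String[] args) {
        int[] x = {80, 60, 90, 75, 55, 90};
        insertionSort(x);
        System.out.println(Arrays.toString(x)); // [55, 60, 75, 80, 90, 90]
    }
}
```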
Worst case: x in descending order
Let n = x.length.
Outer-loop iterations = n-1
Inner-loop iterations = 1 + 2 + … + (n-1):
\sum_{i=1}^{n-1} i = 1 + 2 + 3 + \dots + (n-2) + (n-1) = \frac{n(n-1)}{2}
worstTime(n) ≈ max number of loop iterations = (n-1) + n(n-1) / 2
worstTime(n) is quadratic in n.
averageTime(n) ≈ average number of loop iterations ≈ (n-1) + n(n-1) / 4
averageTime(n) is quadratic in n.
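The worst-case count can be checked empirically. This sketch counts inner-loop iterations (one per swap) on a descending array and compares the total with n(n-1)/2; the names SwapCount, countSwaps and descending are hypothetical.

```java
class SwapCount {
    /** Runs insertion sort on x and returns the number of inner-loop iterations (swaps). */
    static long countSwaps(int[] x) {
        long swaps = 0;
        for (int i = 1; i < x.length; i++)
            for (int j = i; j > 0 && x[j - 1] > x[j]; j--) {
                int tmp = x[j];
                x[j] = x[j - 1];
                x[j - 1] = tmp;
                swaps++;
            }
        return swaps;
    }

    /** Builds the worst case for ascending insertion sort: n, n-1, ..., 1. */
    static int[] descending(int n) {
        int[] x = new int[n];
        for (int i = 0; i < n; i++)
            x[i] = n - i;
        return x;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 10, 100})
            System.out.println("n = " + n + ": " + countSwaps(descending(n))
                    + " swaps; n(n-1)/2 = " + (long) n * (n - 1) / 2);
    }
}
```

The same counter can be applied to any input, including arrays that are only partly out of order.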
Exercise: Determine the exact number of
inner-loop iterations (= number of swaps)
in applying insertion sort to the following
array of int values:
96, 86, 76, 56, 66
What if we want a version of insertionSort
to sort an array of String elements into
lexicographic order?
public static void insertionSort (Object[ ] x)
{
for (int i = 1; i < x.length; i++)
for (int j = i; j > 0 && x [j -1] > x [j]; j--)
swap (x, j, j -1);
} // method insertionSort
Illegal! The > operator cannot be applied to operands of type Object.
public static void insertionSort (Object[ ] x)
{
for (int i = 1; i < x.length; i++)
for (int j = i; j > 0 &&
(((Comparable)x [j -1]).compareTo (x [j]) > 0); j--)
swap (x, j, j -1);
} // method insertionSort
To call this method:
insertionSort (words);
The Comparable interface is appropriate
when you want the elements sorted by
the “natural” ordering, for example,
String objects in lexicographic order.
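Here is a small runnable sketch of the Comparable-based version applied to strings. The swap helper and the sample words are assumptions for illustration; the cast to the raw Comparable type mirrors the slide and produces an unchecked warning, suppressed here.

```java
import java.util.Arrays;

class ComparableSortDemo {
    /** Assumed helper: exchanges x[i] and x[j]. */
    static void swap(Object[] x, int i, int j) {
        Object tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    /** Sorts x by the elements' natural ordering (their compareTo method). */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void insertionSort(Object[] x) {
        for (int i = 1; i < x.length; i++)
            for (int j = i; j > 0 && ((Comparable) x[j - 1]).compareTo(x[j]) > 0; j--)
                swap(x, j, j - 1);
    }

    public static void main(String[] args) {
        String[] words = {"true", "maybe", "yes", "here"};
        insertionSort(words);
        System.out.println(Arrays.toString(words)); // [here, maybe, true, yes]
    }
}
```

If an element does not implement Comparable, the cast fails at run time with a ClassCastException rather than at compile time.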
What if you want the elements sorted into
an unnatural order?
For example, Integer objects in decreasing
order, or String objects sorted by the length
of the string.
public interface Comparator<T>
{
/**
* Compares two specified elements.
*
* @param element1 – one of the elements.
* @param element2 – the other element.
*
* @return a negative integer, 0, or a positive
* integer, depending on whether element1
* is less than, equal to, or greater than
* element2.
*/
int compare (T element1, T element2);
} // interface Comparator
For example, to sort an array intArray of
Integer elements into decreasing order:
public class Decreasing
implements Comparator<Integer>
{
public int compare (Integer i, Integer j)
{
return j.compareTo (i);
} // method compare
} // class Decreasing
public static <T> void insertionSort (T[ ] x,
Comparator<T> comp)
{
for (int i = 1; i < x.length; i++)
for (int j = i; j > 0 && comp.compare (x [j -1], x [j]) > 0; j--)
swap (x, j, j -1);
} // method insertionSort
insertionSort (intArray, new Decreasing());
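Put together, the Comparator-based sort and the Decreasing class can be run as follows. This is a sketch: the generic swap helper and the sample values are assumptions, since the slides do not show them.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComparatorSortDemo {
    /** Assumed helper: exchanges x[i] and x[j]. */
    static <T> void swap(T[] x, int i, int j) {
        T tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    /** Insertion sort driven entirely by the supplied comparator. */
    static <T> void insertionSort(T[] x, Comparator<T> comp) {
        for (int i = 1; i < x.length; i++)
            for (int j = i; j > 0 && comp.compare(x[j - 1], x[j]) > 0; j--)
                swap(x, j, j - 1);
    }

    public static void main(String[] args) {
        Integer[] intArray = {80, 60, 90, 75, 55, 90};
        insertionSort(intArray, new Decreasing());
        System.out.println(Arrays.toString(intArray)); // [90, 90, 80, 75, 60, 55]
    }
}

/** Reverses the natural ordering of Integer by swapping the operands of compareTo. */
class Decreasing implements Comparator<Integer> {
    public int compare(Integer i, Integer j) {
        return j.compareTo(i);
    }
}
```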
Now suppose you want to sort stringArray,
an array of strings, in increasing order of
the lengths of the strings, with lexicographic
comparison of equal-length strings, so that
“yes” < “here” < “true” < “maybe”
public class ByLength implements Comparator<String>
{
/**
* Compares two specified String objects
* lexicographically if they have the same length, and
* otherwise returns the difference in their lengths.
*
* @param s1 – one of the specified String objects.
* @param s2 – the other specified String object.
*
* @return s1.compareTo (s2) if s1 and s2 have the
* same length; otherwise, return
* s1.length() - s2.length().
*
*/
public int compare (String s1, String s2)
{
int len1 = s1.length(),
len2 = s2.length();
if (len1 == len2)
return s1.compareTo (s2);
return len1 - len2;
} // method compare
} // class ByLength
insertionSort (stringArray, new ByLength ());
The insertionSort method is unchanged!
The Comparator interface allows you to
compare elements in a class any way you
want, even if you cannot modify the class.
For example, you cannot modify the Integer
or String classes.
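To see the length-based ordering in action, the sketch below sorts the four example strings with ByLength. The generic insertionSort is repeated so the block runs on its own; the swap is inlined as an assumption about what the slides' helper does.

```java
import java.util.Arrays;
import java.util.Comparator;

class ByLengthDemo {
    /** Insertion sort driven by a comparator, with the swap written inline. */
    static <T> void insertionSort(T[] x, Comparator<T> comp) {
        for (int i = 1; i < x.length; i++)
            for (int j = i; j > 0 && comp.compare(x[j - 1], x[j]) > 0; j--) {
                T tmp = x[j];
                x[j] = x[j - 1];
                x[j - 1] = tmp;
            }
    }

    public static void main(String[] args) {
        String[] stringArray = {"maybe", "true", "here", "yes"};
        insertionSort(stringArray, new ByLength());
        System.out.println(Arrays.toString(stringArray)); // [yes, here, true, maybe]
    }
}

/** Shorter strings first; equal-length strings in lexicographic order. */
class ByLength implements Comparator<String> {
    public int compare(String s1, String s2) {
        int len1 = s1.length(), len2 = s2.length();
        if (len1 == len2)
            return s1.compareTo(s2);
        return len1 - len2;
    }
}
```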
Exercise: Determine the ordering
of “Yes”, “here”, “true”, “maybe”,
and “yes” by the following version
of compare:
public int compare (String s1, String s2)
{
return s1.length() - s2.length();
} // method compare
How Fast Can We Sort?
A decision tree is a binary tree in which
each non-leaf represents a comparison
between two elements and each leaf
represents a sorted sequence of those
elements.
Left branch: Yes
Right branch: No
Example: Apply insertion sort to a1, a2, a3.
a1 < a2?
  Yes: a2 < a3?
    Yes: a1 a2 a3
    No: a1 < a3?
      Yes: a1 a3 a2
      No: a3 a1 a2
  No: a1 < a3?
    Yes: a2 a1 a3
    No: a2 < a3?
      Yes: a2 a3 a1
      No: a3 a2 a1
A decision tree has one leaf for each
permutation of the n elements to be sorted.
The number of permutations of n distinct
elements is ?
n!
So a decision tree to sort n elements must
have n! leaves.
By the binary tree theorem, for
any non-empty tree t,
leaves(t) <= 2^{height(t)}
Since n! = leaves(t), we must have
n! <= 2^{height(t)}
which implies that
log2(n!) <= height(t)
In the context of a decision tree, height(t)
represents the maximum number of
comparisons needed to sort the n
elements.
So log2(n!) <= the maximum number
of comparisons to sort n elements.
Therefore,
worstTime(n) >= log2(n!)
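For small n the bound can be computed exactly. The sketch below returns the smallest h with 2^h >= n!, which is the minimum possible height of a decision tree with n! leaves; the class and method names are made up for illustration.

```java
import java.math.BigInteger;

class DecisionTreeBound {
    /** Smallest h with 2^h >= n!, that is, ceil(log2(n!)). */
    static int minWorstCaseComparisons(int n) {
        BigInteger factorial = BigInteger.ONE;
        for (int i = 2; i <= n; i++)
            factorial = factorial.multiply(BigInteger.valueOf(i));
        // For a positive integer m, ceil(log2(m)) equals the bit length of m - 1.
        return factorial.subtract(BigInteger.ONE).bitLength();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++)
            System.out.println("n = " + n + ": at least "
                    + minWorstCaseComparisons(n) + " comparisons in the worst case");
    }
}
```

For n = 3 the result is 3, which matches the longest root-to-leaf path in the decision tree for a1, a2, a3.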
By concept exercise 11.7,
log2(n!) >= n/2 log2(n/2)
So
worstTime(n) >= n/2 log2(n/2)
So worstTime(n) is Ω(n log n) for any
comparison-based sort.
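One way to justify the inequality cited from concept exercise 11.7 is sketched below. It assumes n is even for simplicity and is not necessarily the book's intended solution: keep only the largest n/2 factors of n!, each of which exceeds n/2.

```latex
n! \;=\; \prod_{i=1}^{n} i
   \;\ge\; \prod_{i=n/2+1}^{n} i
   \;\ge\; \left(\frac{n}{2}\right)^{n/2}
\quad\Longrightarrow\quad
\log_2(n!) \;\ge\; \frac{n}{2}\,\log_2\!\left(\frac{n}{2}\right)
```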
What can we say about averageTime(n)?
averageTime(n) >= average number of comparisons
= total number of comparisons / n!
In a decision tree, what is the total number
of comparisons equal to?
Hint: The length of each path from the root to a leaf
equals the number of comparisons in that path.
The total number of comparisons is equal
to the sum of all root-to-leaf path lengths.
E(t), the external path length of tree t, is
the sum of all root-to-leaf path lengths in t.
So the average number of comparisons is
E(t) / n!
In a decision tree, the number of leaves is n!,
so, by the external path length theorem,
averageTime(n) >= average # comparisons
= E(t) / n!
>= (n! / 2) floor (log2(n!)) / n!
= (1 / 2) floor (log2(n!))
>= (1 / 4) (log2(n!))
>= (n / 8) (log2(n / 2))
For any comparison-based sort,
averageTime(n) is Ω(n log n).
Exercise: Suppose, for some sort method,
worstTime(n) is linear logarithmic in n.
True or false:
1. averageTime(n) must be linear
logarithmic in n.
2. averageTime(n) is O(n log n).
3. averageTime(n) is Ω(n log n).
Fast Sorts
Merge Sort
Given an array x of objects, keep splitting
into subarrays until the size of a subarray
is less than 7. Apply insertion sort to that
subarray and merge the subarrays back
together.
For example, suppose x.length = 25.
Here is an outline based on size:
25 Split
  12 Split
    6 Ins. sort
    6 Ins. sort
    Merge into 12
  13 Split
    6 Ins. sort
    7 Split
      3 Ins. sort
      4 Ins. sort
      Merge into 7
    Merge into 13
  Merge into 25
To simplify the merging, we’ll use an auxiliary array.
Suppose we want to merge two sorted subarrays of 5
elements each:
20 30 44 71 95
15 17 28 33 88
15, the smaller of the two, is copied to the auxiliary
array and the right index is incremented:
20 30 44 71 95
15 17 28 33 88
15 (in the auxiliary array)
17, the smaller of the two, is appended to the auxiliary
array and the right index is again incremented:
20 30 44 71 95
15 17 28 33 88
15 17 (in the auxiliary array)
20, the smaller of the two, is copied to the auxiliary
array and the left index is incremented:
20 30 44 71 95
15 17 28 33 88
15 17 20 (in the auxiliary array)
28, the smaller of the two, is copied to the auxiliary
array and the right index is incremented. And so on.
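The merging step just described can be sketched on its own, separate from the full sort. The class name MergeDemo and the variable names p and q (left and right indexes) are illustrative assumptions.

```java
import java.util.Arrays;

class MergeDemo {
    /** Merges two sorted arrays into a new auxiliary array. */
    static int[] merge(int[] left, int[] right) {
        int[] aux = new int[left.length + right.length];
        int p = 0, q = 0; // left index and right index
        for (int i = 0; i < aux.length; i++) {
            // Take from left when right is used up, or when left's current
            // element is no larger than right's current element.
            if (q >= right.length || (p < left.length && left[p] <= right[q]))
                aux[i] = left[p++];
            else
                aux[i] = right[q++];
        }
        return aux;
    }

    public static void main(String[] args) {
        int[] left = {20, 30, 44, 71, 95};
        int[] right = {15, 17, 28, 33, 88};
        System.out.println(Arrays.toString(merge(left, right)));
        // [15, 17, 20, 28, 30, 33, 44, 71, 88, 95]
    }
}
```

Using <= rather than < when comparing keeps equal elements from the left subarray ahead of those from the right.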
/**
* Sorts a specified array of objects according to the
* compareTo method in the specified class of elements.
* The worstTime(n) is linear-logarithmic in n.
*
* @param a – the array of objects to be sorted.
*
*/
public static void sort(Object[ ] a) {
Object aux[ ] = (Object[ ])a.clone();
mergeSort(aux, a, 0, a.length);
}
/**
* Sorts, by the Comparable interface, a specified range of a
* specified array into the same range of another specified array.
* The worstTime(k) is linear-logarithmic in k, where k is the
* size of the subarray.
*
* @param src – the specified array whose elements are to be
* sorted into another specified array.
* @param dest – the specified array whose subarray is to be
* sorted.
* @param low: the smallest index in the range to be sorted.
* @param high: 1 + the largest index in the range to be
* sorted.
*/
private static void mergeSort (Object src[ ],
Object dest[ ],
int low, int high)
aux: 59 46 32 80 46 55 87 43 44 81
mergeSort (aux, a, 0, 10)
a: 59 46 32 80 46 | 55 87 43 44 81
mergeSort (a, aux, 0, 5): Insertion Sort
mergeSort (a, aux, 5, 10): Insertion Sort
aux: 32 46 46 59 80 | 43 44 55 81 87
merge
a: 32 43 44 46 46 55 59 80 81 87
aux: 59 46 32 80 46 55 87 43 44 81 95 12 17 80 75 33 40 61 16 50
mergeSort (aux, a, 0, 20)
a: 59 46 32 80 46 55 87 43 44 81 | 95 12 17 80 75 33 40 61 16 50
mergeSort (a, aux, 0, 10) and mergeSort (a, aux, 10, 20)
aux: 59 46 32 80 46 | 55 87 43 44 81 | 95 12 17 80 75 | 33 40 61 16 50
mergeSort (aux, a, 0, 5), mergeSort (aux, a, 5, 10), mergeSort (aux, a, 10, 15), mergeSort (aux, a, 15, 20): Insertion Sort on each
a: 32 46 46 59 80 | 43 44 55 81 87 | 12 17 75 80 95 | 16 33 40 50 61
merge (first two runs), merge (last two runs)
aux: 32 43 44 46 46 55 59 80 81 87 | 12 16 17 33 40 50 61 75 80 95
merge
a: 12 16 17 32 33 40 43 44 46 46 50 55 59 61 75 80 80 81 87 95
private static void mergeSort (Object src[ ], Object dest[ ],
int low, int high)
{
int length = high - low;
// Use Insertion Sort for small subarrays.
if (length < 7)
{
for (int i = low; i < high; i++)
for (int j = i; j >low &&
((Comparable)dest[j-1]).compareTo(dest[j]) > 0; j--)
swap (dest, j, j-1);
return;
} // if length < 7
// Sort left and right halves of src into dest.
int mid = (low + high) / 2;
mergeSort (dest, src, low, mid);
mergeSort (dest, src, mid, high);
// If the largest element of the left half is <= the smallest element of the
// right half, the two halves are already in order: just copy src to dest.
if (((Comparable)src [mid-1]).compareTo (src [mid]) <= 0) {
System.arraycopy (src, low, dest, low, length);
return;
}
// Merge sorted subarrays in src into dest.
for (int i = low, p = low, q = mid; i < high; i++)
if (q>=high || (p<mid &&
((Comparable)src[p]).compareTo (src[q])<= 0))
dest [i] = src [p++];
else
dest[i] = src[q++];
} // method mergeSort
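For experimentation, the two methods can be assembled into one runnable class. This is a sketch: the swap helper is an assumption (the slides call it without defining it), the class name MergeSortDemo is made up, and the sample data is the 20-element array traced earlier.

```java
import java.util.Arrays;

class MergeSortDemo {
    /** Sorts a by the elements' natural ordering. */
    static void sort(Object[] a) {
        Object[] aux = a.clone();
        mergeSort(aux, a, 0, a.length);
    }

    /** Sorts src[low .. high-1] into dest[low .. high-1]; the roles swap at each level. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    private static void mergeSort(Object[] src, Object[] dest, int low, int high) {
        int length = high - low;
        if (length < 7) { // insertion sort on small subarrays
            for (int i = low; i < high; i++)
                for (int j = i; j > low && ((Comparable) dest[j - 1]).compareTo(dest[j]) > 0; j--)
                    swap(dest, j, j - 1);
            return;
        }
        int mid = (low + high) / 2;
        mergeSort(dest, src, low, mid);
        mergeSort(dest, src, mid, high);
        // Halves already in order: copy without merging.
        if (((Comparable) src[mid - 1]).compareTo(src[mid]) <= 0) {
            System.arraycopy(src, low, dest, low, length);
            return;
        }
        for (int i = low, p = low, q = mid; i < high; i++)
            if (q >= high || (p < mid && ((Comparable) src[p]).compareTo(src[q]) <= 0))
                dest[i] = src[p++];
            else
                dest[i] = src[q++];
    }

    /** Assumed helper: exchanges x[i] and x[j]. */
    private static void swap(Object[] x, int i, int j) {
        Object tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    public static void main(String[] args) {
        Integer[] a = {59, 46, 32, 80, 46, 55, 87, 43, 44, 81,
                       95, 12, 17, 80, 75, 33, 40, 61, 16, 50};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```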
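Subarray sizes at each level of splitting, and the max comparisons needed to merge that level:
n                      max comparisons: n
n/2  n/2               max comparisons: n
n/4  n/4  n/4  n/4     max comparisons: n
…                      max comparisons: n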
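For merge sort, worstTime(n) is O(n log n).
Since averageTime(n) <= worstTime(n), averageTime(n) is O(n log n); and for any
comparison-based sort, averageTime(n) is Ω(n log n).
Therefore
averageTime(n) is linear-logarithmic in n.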
Exercise: Show the steps, including the
calls to mergeSort, in merge sorting the
following:
aux
27 26 25 … 2 1
mergeSort (aux, a, 0, 27)
a
Arrays.java also has a version of merge sort
that takes a Comparator parameter:
public static <T> void sort(T[] a, Comparator<? super T> c)
To call this version:
Arrays.sort (myArray, new ByLength ());
Collections.java also has two versions
of merge sort: One to sort List<T> in the
natural order, and one to sort List<T> that
has a second parameter of type Comparator.
Collections.sort (myList);
Collections.sort (myList, new Reverse());
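The library calls above can be tried as follows. This is a sketch: BY_LENGTH is a lambda standing in for the ByLength class, and Collections.reverseOrder() stands in for the Reverse class, which the slides mention but do not define.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LibrarySortDemo {
    /** Shorter strings first; equal-length strings in lexicographic order. */
    static final Comparator<String> BY_LENGTH = (s1, s2) ->
            s1.length() != s2.length() ? s1.length() - s2.length() : s1.compareTo(s2);

    public static void main(String[] args) {
        String[] myArray = {"maybe", "true", "here", "yes"};
        Arrays.sort(myArray, BY_LENGTH);
        System.out.println(Arrays.toString(myArray)); // [yes, here, true, maybe]

        List<Integer> myList = new ArrayList<>(Arrays.asList(80, 60, 90, 75, 55));
        Collections.sort(myList); // natural order
        System.out.println(myList); // [55, 60, 75, 80, 90]

        Collections.sort(myList, Collections.reverseOrder()); // comparator version
        System.out.println(myList); // [90, 80, 75, 60, 55]
    }
}
```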
Quick Sort
/**
* Sorts a into ascending order.
* The worstTime(n) is O(n * n), and averageTime(n) is
* linear-logarithmic in n.
*/
public static void sort (int[ ] a)
{
sort1(a, 0, a.length);
} // method sort
/**
* Sorts the array x, from index off (inclusive) to index
* off + len (exclusive), into ascending order.
*/
private static void sort1(int x[ ], int off, int len)
If len < 7, use insertion sort
Otherwise, partition about a pivot element.
For now take pivot to be the median of the
first, middle and last elements.
59 46 32 80 46 55 87 43 44 81 95 12 17 80 75 33 40 61 16 50
v = pivot = median of x [0], x [0 + 20 / 2], x [19]
= median of {59, 95, 50}
=?
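A median-of-three helper might look like the sketch below. It returns the index of the median rather than its value, so the caller can read the pivot from x; the name med3 and this signature are assumptions, as the slides do not show the helper.

```java
class MedianOfThree {
    /** Returns whichever of the indexes a, b, c holds the median of x[a], x[b], x[c]. */
    static int med3(int[] x, int a, int b, int c) {
        return x[a] < x[b]
                ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a);
    }

    public static void main(String[] args) {
        int[] x = {59, 46, 32, 80, 46, 55, 87, 43, 44, 81,
                   95, 12, 17, 80, 75, 33, 40, 61, 16, 50};
        int m = med3(x, 0, 0 + 20 / 2, 19);
        System.out.println("pivot index " + m + ", pivot value " + x[m]);
    }
}
```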
In a loop, we will partition x into a left
subarray of items <= v and a right subarray
of items >= v.
We then recursively call sort1 for the left
and right subarrays.
Here is the basic loop and calls:
int v = x[m];
// v is the pivot
int b = off, c = off + len - 1;
while(true) {
while (x[b] < v) b++;
while (x[c] > v) c--;
if (b > c)
break;
swap(x, b++, c--);
}
if (c - off + 1 > 1)
sort1 (x, off, c - off + 1);
if (off + len - b > 1)
sort1 (x, b, off + len - b);
59 46 32 80 46 55 87 43 44 81 95 12 17 80 75 33 40 61 16 50
Increment b until x [b] >= 59; decrement c until x [c] <= 59.
Then swap x [b] with x [c] and bump b and c.
After swapping x [0] and x [19]:
50 46 32 80 46 55 87 43 44 81 95 12 17 80 75 33 40 61 16 59
After swapping x [3] and x [18]:
50 46 32 16 46 55 87 43 44 81 95 12 17 80 75 33 40 61 80 59
After swapping x [6] and x [16]:
50 46 32 16 46 55 40 43 44 81 95 12 17 80 75 33 87 61 80 59
After swapping x [9] and x [15]:
50 46 32 16 46 55 40 43 44 33 95 12 17 80 75 81 87 61 80 59
After swapping x [10] and x [12]:
50 46 32 16 46 55 40 43 44 33 17 12 95 80 75 81 87 61 80 59
Now b = c = 11. b advances to 12 and c stays at 11, so b > c and the loop ends:
50 46 32 16 46 55 40 43 44 33 17 12 95 80 75 81 87 61 80 59
c = 11 and b = 12
Every element in x [0 … 11] <= 59
Every element in x [12 … 19] >= 59
So we now call:
sort1 (x, 0, 12);
sort1 (x, 12, 8);
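The pieces so far (insertion sort for len < 7, median-of-three pivot, basic partition loop, recursive calls) can be combined into a runnable sketch. The swap and med3 helpers and the class name BasicQuickSort are assumptions; the slides leave them out.

```java
import java.util.Arrays;

class BasicQuickSort {
    static void sort(int[] a) {
        sort1(a, 0, a.length);
    }

    /** Sorts x[off .. off+len-1] into ascending order. */
    static void sort1(int[] x, int off, int len) {
        if (len < 7) { // insertion sort on small subarrays
            for (int i = off; i < off + len; i++)
                for (int j = i; j > off && x[j - 1] > x[j]; j--)
                    swap(x, j, j - 1);
            return;
        }
        int m = med3(x, off, off + len / 2, off + len - 1);
        int v = x[m]; // v is the pivot
        int b = off, c = off + len - 1;
        while (true) {
            while (x[b] < v) b++;
            while (x[c] > v) c--;
            if (b > c)
                break;
            swap(x, b++, c--);
        }
        if (c - off + 1 > 1)
            sort1(x, off, c - off + 1);
        if (off + len - b > 1)
            sort1(x, b, off + len - b);
    }

    /** Assumed helper: index of the median of x[a], x[b], x[c]. */
    static int med3(int[] x, int a, int b, int c) {
        return x[a] < x[b]
                ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a);
    }

    /** Assumed helper: exchanges x[i] and x[j]. */
    static void swap(int[] x, int i, int j) {
        int tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    public static void main(String[] args) {
        int[] x = {59, 46, 32, 80, 46, 55, 87, 43, 44, 81,
                   95, 12, 17, 80, 75, 33, 40, 61, 16, 50};
        sort(x);
        System.out.println(Arrays.toString(x));
    }
}
```

The inner scans need no bounds checks: the pivot stops both scans on the first pass, and after each swap the swapped elements act as sentinels for the next pass.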
For a refinement, elements equal to the pivot
will be moved to locations between c and b,
so they are not involved in any further
partitioning.
During partitioning, elements equal to the
pivot are stored at either end of the subarray,
and then moved to the middle after
partitioning.
int v = x[m];
// Establish Invariant: v* (<v)* (>v)* v*
int a = off, b = a, c = off + len - 1, d = c;
while(true) {
while (b <= c && x[b] <= v) {
if (x[b] == v)
swap(x, a++, b);
b++;
}
while (c >= b && x[c] >= v) {
if (x[c] == v)
swap(x, c, d--);
c--;
}
if (b > c)
break;
swap(x, b++, c--);
}
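Before partitioning (pivot v = 59):
59 46 59 80 46 55 87 43 44 81 95 12 17 80 75 33 40 59 16 50
After partitioning (c = 12, b = 13), with copies of the pivot at both ends:
59 59 46 50 46 55 16 43 44 40 33 12 17 80 75 95 81 80 87 59
After moving the copies of the pivot to the middle:
12 17 46 50 46 55 16 43 44 40 33 59 59 59 75 95 81 80 87 80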
sort1 (x, 0, 11);
sort1 (x, 14, 6);
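The slides show the partition loop but not the step that moves the pivot copies to the middle. The sketch below fills that in with a block-swap helper (called vecswap here), following the classic Bentley-McIlroy scheme; the helper names and the class name ThreeWayQuickSort are assumptions.

```java
import java.util.Arrays;

class ThreeWayQuickSort {
    static void sort(int[] a) {
        sort1(a, 0, a.length);
    }

    static void sort1(int[] x, int off, int len) {
        if (len < 7) { // insertion sort on small subarrays
            for (int i = off; i < off + len; i++)
                for (int j = i; j > off && x[j - 1] > x[j]; j--)
                    swap(x, j, j - 1);
            return;
        }
        int v = x[med3(x, off, off + len / 2, off + len - 1)];
        // Invariant during partitioning: v* (<v)* (>v)* v*
        int a = off, b = a, c = off + len - 1, d = c;
        while (true) {
            while (b <= c && x[b] <= v) {
                if (x[b] == v)
                    swap(x, a++, b);
                b++;
            }
            while (c >= b && x[c] >= v) {
                if (x[c] == v)
                    swap(x, c, d--);
                c--;
            }
            if (b > c)
                break;
            swap(x, b++, c--);
        }
        // Move the pivot copies from both ends into the middle.
        int n = off + len;
        int s = Math.min(a - off, b - a);
        vecswap(x, off, b - s, s);
        s = Math.min(d - c, n - d - 1);
        vecswap(x, b, n - s, s);
        // Recurse only on the parts strictly less than and strictly greater than v.
        if ((s = b - a) > 1)
            sort1(x, off, s);
        if ((s = d - c) > 1)
            sort1(x, n - s, s);
    }

    /** Swaps the n elements starting at a with the n elements starting at b. */
    static void vecswap(int[] x, int a, int b, int n) {
        for (int i = 0; i < n; i++, a++, b++)
            swap(x, a, b);
    }

    static int med3(int[] x, int a, int b, int c) {
        return x[a] < x[b]
                ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a);
    }

    static void swap(int[] x, int i, int j) {
        int tmp = x[i];
        x[i] = x[j];
        x[j] = tmp;
    }

    public static void main(String[] args) {
        int[] x = {59, 46, 59, 80, 46, 55, 87, 43, 44, 81,
                   95, 12, 17, 80, 75, 33, 40, 59, 16, 50};
        sort(x);
        System.out.println(Arrays.toString(x));
    }
}
```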
For another refinement, suppose len, the
size of the subarray to be sorted, is > 40.
Then split the subarray into 3 segments,
and take the median of each segment.
pivot = median of the 3 medians
For example, suppose we have len = 90.
Then we split x up into subarrays x [0 … 29], x [30 … 59] and x [60 … 89].
segment    x [0 … 29]    x [30 … 59]    x [60 … 89]
index      0  15  29     30  45  59     60  75  89
element    55 87  22     92  33  12     21  46  67
median     55            33             46
pivot = median of {55, 33, 46} = 46
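The pivot choice for large subarrays can be sketched as below. It follows the three-segment description on the slide (first, middle and last element of each third) rather than any particular library's spacing; choosePivot and the class name are hypothetical, and the test array fills in only the nine sampled positions from the example.

```java
class PivotDemo {
    /** Index of the median of x[a], x[b], x[c]. */
    static int med3(int[] x, int a, int b, int c) {
        return x[a] < x[b]
                ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a);
    }

    /** Returns the index of the pivot for x[off .. off+len-1]. */
    static int choosePivot(int[] x, int off, int len) {
        int first = off, mid = off + len / 2, last = off + len - 1;
        if (len > 40) {
            int seg = len / 3;                       // length of each segment
            int s1 = off + seg, s2 = off + 2 * seg;  // starts of segments 2 and 3
            first = med3(x, off, off + seg / 2, s1 - 1);
            mid = med3(x, s1, s1 + seg / 2, s2 - 1);
            last = med3(x, s2, s2 + seg / 2, off + len - 1);
        }
        return med3(x, first, mid, last);
    }

    public static void main(String[] args) {
        int[] x = new int[90]; // only the sampled positions matter here
        x[0] = 55; x[15] = 87; x[29] = 22;
        x[30] = 92; x[45] = 33; x[59] = 12;
        x[60] = 21; x[75] = 46; x[89] = 67;
        int m = choosePivot(x, 0, 90);
        System.out.println("pivot index " + m + ", pivot value " + x[m]); // index 75, value 46
    }
}
```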
For the estimate of averageTime(n) and
worstTime(n), we can imagine that quick
sort creates a binary search tree. The root
of the tree is the pivot.
After the first partitioning, the pivot of the
left subarray becomes the root of the left
subtree, and so on.
Suppose x = {0, 1, 2, 3, … 98, 99}
Already in order!
Pivots at each level of the tree (figure: best-case partitioning):
level 0: 50
level 1: 25, 75
level 2: 12, 38, 63, 88
level 3: 6, 19, 32, 44, 57, 69, 82, 94
Best-Case Partitioning:
log2(n/7) levels; at most n comparisons per level
Total number of comparisons <= n log2 n
Average-Case Partitioning:
Average height of a binary search tree: logarithmic in n
log2(n/7) levels; at most n comparisons per level
Total number of comparisons ≈ n log2 n
The averageTime(n): linear logarithmic in n.
Worst-Case Partitioning?
Suppose x = {1, 2, . . . , 18, 0, 38, 19, 20, 21, . . . , 37}
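Pivot tree (figure: worst-case partitioning). Each pivot splits off a single element:
pivot    element split off    # of comparisons
37       38                   39
1        0                    37
35       36                   35
3        2                    33
33       34                   31
5        …                    …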
Worst-Case Partitioning:
# of comparisons ≈ n + (n-2) + (n-4) + … + 8
= \sum_{i=4}^{n/2} 2i \approx 2 \cdot \frac{(n/2)(n/2 + 1)}{2} \approx \frac{n^2}{4}
The worstTime(n) is quadratic in n.
Exercise: Suppose x starts out as
10, 9, 8, 7, 6, 16, 17, 18, 19, 20, 5, 4, 3, 2, 1, 11, 12, 13, 14, 15
the median of x [0], x [10], x [19] = 10.
Just before the first swap of x[b] and x[c],
b = 5 and c = 14.
After all swaps, what does x contain?
What are the recursive calls to quick sort:
sort1 (x, ?, ?);
sort1 (x, ?, ?);