Lecture 12

Advanced Sorting Methods: Shellsort
Shellsort is an extension of insertion sort that gains speed by allowing
exchanges of elements that are far apart.
The idea: Rearrange the file so that taking every h-th element
(starting anywhere) yields a sorted file. Such a file is called h-sorted. That is, an h-sorted file
is h independent sorted files interleaved together.
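To make the definition concrete, here is a minimal Python sketch that checks whether a list is h-sorted. The helper name is_h_sorted and the sample data (one 4-sorted arrangement of the numbers 1 to 15) are assumptions made for illustration, not part of the lecture.

```python
def is_h_sorted(a, h):
    """Return True if every h-th element of a, from any start, is in order."""
    return all(a[i] <= a[i + h] for i in range(len(a) - h))


# A 4-sorted list: it is not sorted, but each interleaved sub-file is.
data = [2, 8, 4, 1, 5, 9, 7, 3, 6, 14, 10, 12, 13, 15, 11]

# The h interleaved sub-files are a[0::h], a[1::h], ..., a[h-1::h].
segments = [data[k::4] for k in range(4)]

print(is_h_sorted(data, 4))  # each of the 4 sub-files is sorted
print(is_h_sorted(data, 1))  # 1-sorted would mean fully sorted
print(segments)
```

Slicing with a step of h is just a convenient way to view the h independent sub-files; Shellsort itself never copies them out.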
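Example:
Let h = 13 during the first step, h = 4 during the second step, and
h = 1 during the final step (plain insertion sort).
Step 1 (h = 13): compare and exchange elements that are 13 positions apart (15 with 6, then 8 with 10).
15 8 7 3 2 14 11 1 5 9 4 12 13 6 10
Step 2 (h = 4): the file is now 13-sorted; sort the four interleaved sub-files.
6 8 7 3 2 14 11 1 5 9 4 12 13 15 10
Step 3 (h = 1): the file is now 4-sorted; finish with an ordinary insertion sort.
2 8 4 1 5 9 7 3 6 14 10 12 13 15 11
Result:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15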
To implement Shell sort we need a helper method, SegmentedInsertionSort.
Input: A, input array;
       N, number of elements;
       H, distance between elements in the same segment.
Output: Array, A, H-sorted.
Algorithm SegmentedInsertionSort (A, N, H)
for l := H + 1 to N do
    j := l - H    /* j counts down through the current segment */
    while j > 0 do
        if precedes (A[j + H], A[j]) then
            swap (A[j + H], A[j])
            j := j - H
        else
            j := 0
        endif
    endwhile
endfor
The Shell sort method now becomes:
Input: A, input array;
       N, number of elements.
Output: Array, A, sorted.
Algorithm ShellSort (A, N)
H := N / 2    /* integer division */
while H > 0 do
    SegmentedInsertionSort (A, N, H)
    H := H / 2
endwhile
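The two procedures above can be sketched in Python as follows. This is a 0-based translation under the assumption that precedes means "strictly less than"; the function names are chosen here for illustration.

```python
def segmented_insertion_sort(a, h):
    """h-sort the list a in place (0-based version of SegmentedInsertionSort)."""
    for i in range(h, len(a)):
        j = i - h
        # Walk down the current segment, swapping while the pair is out of order.
        while j >= 0 and a[j + h] < a[j]:
            a[j + h], a[j] = a[j], a[j + h]
            j -= h


def shell_sort(a):
    """Sort a in place using the halving increments H := H / 2."""
    h = len(a) // 2
    while h > 0:
        segmented_insertion_sort(a, h)
        h //= 2
    return a


data = [15, 8, 7, 3, 2, 14, 11, 1, 5, 9, 4, 12, 13, 6, 10]
shell_sort(data)
print(data)
```

The `while j >= 0 and ...` condition plays the role of the pseudocode's `j := 0` exit: the scan of a segment stops as soon as one pair is already in order.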
Notes:
1. H := H / 2 is a "bad" increment sequence, because it repeatedly compares the
same values, while some values are not compared to each other
until H = 1.
2. Any increment sequence can be used, as long as its last
value is 1. Here are examples of "good" increment sequences:
- H := 3 * H + 1 gives the following increment sequence:
  … 1093, 364, 121, 40, 13, 4, 1.
- H := 2 * H + 1 gives the following increment sequence:
  … 127, 63, 31, 15, 7, 3, 1.
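As an illustration of the 3 * H + 1 sequence, the following Python sketch generates the increments smaller than N and uses them to drive a Shellsort. The function names are assumptions, and the inner loop shifts items instead of swapping them, which is a common variant and not the lecture's exact procedure.

```python
def increments_3h_plus_1(n):
    """Return the increments 1, 4, 13, 40, ... that are below n, largest first."""
    seq = []
    h = 1
    while h < n:
        seq.append(h)
        h = 3 * h + 1
    return seq[::-1]


def shell_sort_3h1(a):
    """Sort a in place, h-sorting it for each increment in turn."""
    for h in increments_3h_plus_1(len(a)):
        for i in range(h, len(a)):
            item = a[i]
            j = i
            # Shift larger items of the same segment up by h positions.
            while j >= h and a[j - h] > item:
                a[j] = a[j - h]
                j -= h
            a[j] = item
    return a


print(increments_3h_plus_1(15))
print(shell_sort_3h1([15, 8, 7, 3, 2, 14, 11, 1, 5, 9, 4, 12, 13, 6, 10]))
```

For a 15-element file this yields the increments 13, 4, 1 used in the example at the start of the lecture.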
Efficiency of Shell sort
Let the increment sequence be H := H / 2, for example …, 64, 32, 16, 8, 4, 2, 1.
Then:
- The number of calls to SegmentedInsertionSort is O(log N).
- The outer loop in each SegmentedInsertionSort is O(N).
- The inner loop of each SegmentedInsertionSort depends on the current
order of the data within that segment.
Therefore, the total number of comparisons in this case is O(A * N * log N),
where A, the cost contributed by the inner loop, is unknown.
Empirical results for a better incremental sequence, H = 3 * H + 1, show the
average efficiency of Shell sort in terms of number of comparisons to be
O(N * (log N)^2), which is almost O(N^1.5).
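One way to explore these claims is to count comparisons directly. The sketch below is an assumption-laden experiment, not a result from the lecture: the function names, the input size of 2000 and the fixed random seed are all arbitrary choices, and the counts it prints will differ for other inputs.

```python
import random


def shell_sort_count(a, increments):
    """Shellsort a copy of a; return (sorted list, number of comparisons)."""
    a = list(a)
    comparisons = 0
    for h in increments:
        for i in range(h, len(a)):
            j = i - h
            while j >= 0:
                comparisons += 1
                if a[j + h] < a[j]:
                    a[j + h], a[j] = a[j], a[j + h]
                    j -= h
                else:
                    break
    return a, comparisons


def halving(n):
    """Increments n/2, n/4, ..., 1."""
    seq, h = [], n // 2
    while h > 0:
        seq.append(h)
        h //= 2
    return seq


def three_h_plus_one(n):
    """Increments ..., 40, 13, 4, 1 that are below n."""
    seq, h = [], 1
    while h < n:
        seq.append(h)
        h = 3 * h + 1
    return seq[::-1]


random.seed(0)  # fixed seed so that repeated runs use the same data
data = [random.random() for _ in range(2000)]
for name, make in (("H/2", halving), ("3H+1", three_h_plus_one)):
    result, count = shell_sort_count(data, make(len(data)))
    print(name, count)
```

Running it for several sizes and plotting the counts against N is a simple way to compare the two increment sequences empirically.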
Advanced sorting: Merge sort
The idea: Given two files in ascending order, put them into a third file
also arranged in ascending order.
Example:
file A: 3 7 9 12 13 14
file B: 1 5 8 10 17 19
file C: 1 3 5 7 8 9 10 12 13 14 17 19
The efficiency of this process is O(N).
The algorithm (let us call this procedure merge):
1. Compare the current numbers of the two files.
2. Transfer the smaller number to the third file.
3. Advance to the next number in the file it came from and go to 1. Repeat until one of the files is emptied.
4. Move the numbers left in the other file to the third file.
Algorithm merge (source, destination, lower, mid, upper)
Input: source array, in which source[lower..mid] and source[mid+1..upper] are each sorted,
       and a copy of it, destination;
       lower, mid and upper are integers defining the sublists to be merged.
Output: destination[lower..upper] sorted.
int s1 := lower; int s2 := mid + 1; int d := lower
while (s1 <= mid and s2 <= upper) {
    if (precedes (source[s1], source[s2])) {
        destination[d] := source[s1]; s1 := s1 + 1 }
    else {
        destination[d] := source[s2]; s2 := s2 + 1 }
    // end if
    d := d + 1
} // end while
if (s1 > mid) {
    while (s2 <= upper) {
        destination[d] := source[s2]; s2 := s2 + 1; d := d + 1 } }
else {
    while (s1 <= mid) {
        destination[d] := source[s1]; s1 := s1 + 1; d := d + 1 } }
// end if
Efficiency of merge: O(N), where N is the number of items in source and destination.
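The merge procedure can be sketched in Python as follows. Indices are 0-based and inclusive, and `<=` is used where the pseudocode calls precedes, so that equal items keep their original order; both choices are assumptions of this sketch. The sample data are files A and B from the example, stored one after the other in a single array.

```python
def merge(source, destination, lower, mid, upper):
    """Merge sorted source[lower..mid] and source[mid+1..upper] into destination."""
    s1, s2, d = lower, mid + 1, lower
    while s1 <= mid and s2 <= upper:
        if source[s1] <= source[s2]:
            destination[d] = source[s1]
            s1 += 1
        else:
            destination[d] = source[s2]
            s2 += 1
        d += 1
    # One of the two runs is exhausted; copy what is left of the other.
    while s1 <= mid:
        destination[d] = source[s1]
        s1 += 1
        d += 1
    while s2 <= upper:
        destination[d] = source[s2]
        s2 += 1
        d += 1


source = [3, 7, 9, 12, 13, 14, 1, 5, 8, 10, 17, 19]  # file A followed by file B
destination = [None] * len(source)
merge(source, destination, 0, 5, 11)
print(destination)
```

Every item is copied exactly once, which is where the O(N) cost comes from.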
Note that merge takes two already sorted files. Therefore, we need another procedure,
mergeSort, to actually sort these files. mergeSort is a recursive procedure which, at each
step, takes a file to be sorted, sorts its two halves recursively, and merges them. Because mergeSort
repeatedly calls merge, and merge works on two arrays that start out identical, we must create a copy of
the original array, source, which we will call destination.
Algorithm mergeSort (source, destination, lower, upper)
Input: source array;
       a copy of source, destination;
       lower and upper are integers defining the current sublist to be sorted.
Output: destination[lower..upper] sorted.
if (lower < upper) {
    mid := (lower + upper) / 2    // integer division
    mergeSort (destination, source, lower, mid)      // leaves source[lower..mid] sorted
    mergeSort (destination, source, mid+1, upper)    // leaves source[mid+1..upper] sorted
    merge (source, destination, lower, mid, upper)
}
Algorithm Sort (A, N)
Input: Array, A, of items to be sorted;
       integer N defining the number of items to be sorted.
Output: Array, A, sorted.
create destination[N] and initialize it as a copy of A
mergeSort (A, destination, 1, N)
copy destination back into A    // the sorted result is left in destination
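A minimal Python sketch of the whole scheme follows, using 0-based inclusive indices. The names merge_sort and sort_in_place are chosen here for illustration, and the final copy back into the input list is an assumption made so that the caller's list ends up sorted.

```python
def merge(source, destination, lower, mid, upper):
    """Merge sorted source[lower..mid] and source[mid+1..upper] into destination."""
    s1, s2 = lower, mid + 1
    for d in range(lower, upper + 1):
        # Take from the left run while it has items and its head is not larger.
        if s2 > upper or (s1 <= mid and source[s1] <= source[s2]):
            destination[d] = source[s1]
            s1 += 1
        else:
            destination[d] = source[s2]
            s2 += 1


def merge_sort(source, destination, lower, upper):
    """Leave destination[lower..upper] sorted.

    On entry, source and destination must hold the same items in that range.
    """
    if lower < upper:
        mid = (lower + upper) // 2
        # The roles of the two arrays swap at each level of the recursion,
        # so the sorted halves end up in source, ready to be merged.
        merge_sort(destination, source, lower, mid)
        merge_sort(destination, source, mid + 1, upper)
        merge(source, destination, lower, mid, upper)


def sort_in_place(a):
    destination = list(a)          # the copy the scheme requires
    merge_sort(a, destination, 0, len(a) - 1)
    a[:] = destination             # the sorted result is in destination
    return a


print(sort_in_place([15, 8, 7, 3, 2, 14, 11, 1, 5, 9, 4, 12, 13, 6, 10]))
```

Swapping the roles of the two arrays avoids copying items back after every merge; only one copy is made at the start and one at the end.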
Quick sort
The idea (assume the list of items to be sorted is represented as an array):
1. Select a data item, called the pivot, which will be placed in its proper
   place at the end of the current step. Remove it from the array.
2. Scan the array from right to left, comparing the data items with the
   pivot until an item with a smaller value is found. Put this item in the
   pivot's place.
3. Scan the array from left to right, comparing data items with the pivot,
   and find the first item which is greater than the pivot. Place it in the
   position freed by the item moved at the previous step.
4. Continue alternating steps 2-3 until no more exchanges are possible.
   Place the pivot in the empty space, which is the proper place for that
   item.
5. Consider the sub-file to the left of the pivot, and repeat the same
   process.
6. Consider the sub-file to the right of the pivot, and repeat the same
   process.
Example
Consider the following list of items, and let the pivot be the leftmost item.
In each row, ( ) marks the position that is currently empty.
Step 1 (pivot 15):
15 8 7 3 2 14 11 1 5 9 4 12 13 6 10
10 8 7 3 2 14 11 1 5 9 4 12 13 6 15
Step 2 (pivot 10):
10 8 7 3 2 14 11 1 5 9 4 12 13 6 15
6 8 7 3 2 14 11 1 5 9 4 12 13 ( ) 15
6 8 7 3 2 ( ) 11 1 5 9 4 12 13 14 15
6 8 7 3 2 4 11 1 5 9 ( ) 12 13 14 15
6 8 7 3 2 4 ( ) 1 5 9 11 12 13 14 15
6 8 7 3 2 4 9 1 5 ( ) 11 12 13 14 15
6 8 7 3 2 4 9 1 5 10 11 12 13 14 15
Step 3 (pivot 6):
6 8 7 3 2 4 9 1 5 10 11 12 13 14 15
5 8 7 3 2 4 9 1 ( ) 10 11 12 13 14 15
5 ( ) 7 3 2 4 9 1 8 10 11 12 13 14 15
5 1 7 3 2 4 9 ( ) 8 10 11 12 13 14 15
5 1 ( ) 3 2 4 9 7 8 10 11 12 13 14 15
5 1 4 3 2 6 9 7 8 10 11 12 13 14 15
Step 4 (pivots 5 and 9):
5 1 4 3 2 6 9 7 8 10 11 12 13 14 15
2 1 4 3 ( ) 6 8 7 9 10 11 12 13 14 15
2 1 4 3 5 6 8 7 9 10 11 12 13 14 15
Step 5 (pivots 2 and 8):
2 1 4 3 5 6 7 ( ) 9 10 11 12 13 14 15
1 ( ) 4 3 5 6 7 8 9 10 11 12 13 14 15
1 2 4 3 5 6 7 8 9 10 11 12 13 14 15
Step 6 (pivot 4):
1 2 4 3 5 6 7 8 9 10 11 12 13 14 15
1 2 3 ( ) 5 6 7 8 9 10 11 12 13 14 15
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
The partition method
Algorithm partition (A, lo, hi)
Input: Array, A, of items to be sorted;
       lo and hi, integers defining the scope of the array to be sorted.
Output: Assuming A[lo] to be the pivot value, array A is returned in
        partitioned form, together with pivotPoint, the index of the final
        position of the pivot.
int pivot := A[lo]
while (lo < hi) {
    while ((lo < hi) and precedes (pivot, A[hi]))
        hi := hi - 1
    if (hi <> lo) {
        A[lo] := A[hi]; lo := lo + 1 }
    while ((lo < hi) and precedes (A[lo], pivot))
        lo := lo + 1
    if (hi <> lo) {
        A[hi] := A[lo]; hi := hi - 1 }
} // end while
A[hi] := pivot; pivotPoint := hi
return pivotPoint
The quickSort and sort procedures
Algorithm quickSort (A, lo, hi)
Input: Array, A, of items to be sorted;
lo and hi, integers defining the scope of the array to be sorted.
Output: Array, A, sorted.
int pivotPoint := partition (A, lo, hi)
if (lo < pivotPoint)
    quickSort (A, lo, pivotPoint-1)
if (hi > pivotPoint)
    quickSort (A, pivotPoint+1, hi)
Algorithm Sort (A, N)
Input: Array, A, of items to be sorted;
integer N defining the number of items to be sorted.
Output: Array, A, sorted.
quickSort (A, 1, N)
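The partition and quickSort procedures can be sketched in Python as below. This is a 0-based translation that assumes precedes means "strictly less than"; checking lo < hi before recursing is a small change made here so that empty ranges are handled as well.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index."""
    pivot = a[lo]                      # taking the pivot out leaves a hole at lo
    while lo < hi:
        # Scan from the right for an item that does not belong after the pivot.
        while lo < hi and pivot < a[hi]:
            hi -= 1
        if hi != lo:
            a[lo] = a[hi]              # fill the hole; the new hole is at hi
            lo += 1
        # Scan from the left for an item that does not belong before the pivot.
        while lo < hi and a[lo] < pivot:
            lo += 1
        if hi != lo:
            a[hi] = a[lo]              # fill the hole; the new hole is at lo
            hi -= 1
    a[hi] = pivot                      # lo == hi is the remaining hole
    return hi


def quick_sort(a, lo, hi):
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)
        quick_sort(a, p + 1, hi)


data = [15, 8, 7, 3, 2, 14, 11, 1, 5, 9, 4, 12, 13, 6, 10]
first = partition(data, 0, len(data) - 1)
after_first = list(data)               # state after the first partitioning step
quick_sort(data, 0, len(data) - 1)
print(first, after_first)
print(data)
```

Python passes lo and hi by value, so the changes partition makes to them are local, as the pseudocode assumes.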
Example and the partitioning method modified
Consider the same list as in the previous example. Let the pivot be the
rightmost item, and let us scan the file from both ends simultaneously
exchanging elements that are out of order. When two pointers cross,
exchange the pivot with the leftmost element of the right subfile.
Step 1 (pivot 10):
15 8 7 3 2 14 11 1 5 9 4 12 13 6 10
6 8 7 3 2 14 11 1 5 9 4 12 13 15 10
6 8 7 3 2 4 11 1 5 9 14 12 13 15 10
6 8 7 3 2 4 9 1 5 11 14 12 13 15 10
6 8 7 3 2 4 9 1 5 10 14 12 13 15 11
Steps 2 - end:
6 8 7 3 2 4 9 1 5 10 14 12 13 15 11
1 8 7 3 2 4 9 6 5 10 14 12 13 15 11
1 4 7 3 2 8 9 6 5 10 14 12 13 15 11
1 4 2 3 7 8 9 6 5 10 14 12 13 15 11
1 4 2 3 5 8 9 6 7 10 11 12 13 15 14
1 2 4 3 5 6 9 8 7 10 11 12 13 14 15
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
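The modified partitioning method (rightmost pivot, two pointers scanning toward each other) can be sketched in Python as follows. The function names and the handling of items equal to the pivot are assumptions of this sketch; the lecture describes the method only in words.

```python
def partition_right_pivot(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i, j = lo, hi - 1
    while True:
        while i <= j and a[i] < pivot:   # skip items already on the correct side
            i += 1
        while i <= j and a[j] > pivot:
            j -= 1
        if i >= j:                       # the pointers have met or crossed
            break
        a[i], a[j] = a[j], a[i]          # exchange the out-of-order pair
        i += 1
        j -= 1
    # i is now the leftmost position of the right sub-file.
    a[i], a[hi] = a[hi], a[i]
    return i


def quick_sort_right_pivot(a, lo, hi):
    if lo < hi:
        p = partition_right_pivot(a, lo, hi)
        quick_sort_right_pivot(a, lo, p - 1)
        quick_sort_right_pivot(a, p + 1, hi)


data = [15, 8, 7, 3, 2, 14, 11, 1, 5, 9, 4, 12, 13, 6, 10]
p = partition_right_pivot(data, 0, len(data) - 1)
after_first = list(data)
quick_sort_right_pivot(data, 0, len(data) - 1)
print(p, after_first)
print(data)
```

Unlike the first method, this one exchanges pairs of items directly instead of moving single items into an empty position.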
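Static representation of the partitioning process
Example (original)
[Figure: diagram of the partitioning of the original example, starting from the first pivot, 15.]
Static representation of the partitioning process
Example (modified)
[Figure: diagram of the partitioning of the modified example, starting from the first pivot, 10.]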
Efficiency results
Note that in the best case, if at each partitioning stage the file is divided into 2 equal
parts, we have:
- 1 call to quickSort with a segment of size N;
- 2 calls to quickSort with a segment of size N/2;
- 4 calls to quickSort with a segment of size N/4;
- 8 calls to quickSort with a segment of size N/8, etc.
That is, the tree of recursive calls has (log N) levels in this best case, and about N comparisons
are made at each level. Therefore, the total number of comparisons will be about N log N.
The following recurrence relation describes this case:
C(N) = 2 * C(N/2) + N    for N >= 2, with C(1) = 0
To solve this relation, assume N = 2^n, and divide both sides by 2^n:
C(2^n) / 2^n = C(2^(n-1)) / 2^(n-1) + 1
             = C(2^(n-2)) / 2^(n-2) + 1 + 1
             = C(2^(n-3)) / 2^(n-3) + 1 + 1 + 1
             = ...
             = C(2^0) / 2^0 + n = C(1) / 1 + n = 0 + n = n
Restoring the denominator (multiplying both sides by 2^n = N) gives
C(N) = N * n = N log N, where log is taken to base 2.
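The closed form can be checked numerically. The short Python sketch below evaluates the recurrence directly for powers of two and compares it with N log N (base 2); the function name is an assumption made for illustration.

```python
def best_case_comparisons(n):
    """Evaluate C(N) = 2 * C(N/2) + N with C(1) = 0, for N a power of two."""
    if n == 1:
        return 0
    return 2 * best_case_comparisons(n // 2) + n


for k in range(11):
    n = 2 ** k
    # For N = 2^k, N log N is simply n * k.
    assert best_case_comparisons(n) == n * k
    print(n, best_case_comparisons(n))
```

The check covers only N = 2^k, which is the same assumption the derivation makes.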
Result 1: The best case efficiency of quick sort is N log N (the pivot always
divides the file into two equal halves).
Result 2: The worst case efficiency of quick sort is N^2 (for example, when the file is
already sorted and the pivot is the leftmost item).
Result 3: The average case efficiency of quick sort is about 1.38 N log N. This result
makes quick sort a good "general-purpose" sort. Its inner loop is very
short, which makes quick sort faster in practice than other N log N sorting
methods.
Also: Quick sort is an "in-place" method, which uses only a small auxiliary
stack for recursion.
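The worst case can be observed by counting comparisons on an already sorted file. The sketch below is an illustration under stated assumptions: it uses a leftmost-pivot partition in the style of the lecture's first method, counts only comparisons between items, and uses a small input (200 items) to stay within Python's default recursion limit.

```python
def quick_sort_count(a):
    """Sort a in place with a leftmost-pivot quick sort; return the comparison count."""
    count = 0

    def less(x, y):
        nonlocal count
        count += 1
        return x < y

    def partition(lo, hi):
        pivot = a[lo]
        while lo < hi:
            while lo < hi and less(pivot, a[hi]):
                hi -= 1
            if hi != lo:
                a[lo] = a[hi]
                lo += 1
            while lo < hi and less(a[lo], pivot):
                lo += 1
            if hi != lo:
                a[hi] = a[lo]
                hi -= 1
        a[hi] = pivot
        return hi

    def sort(lo, hi):
        if lo < hi:
            p = partition(lo, hi)
            sort(lo, p - 1)
            sort(p + 1, hi)

    sort(0, len(a) - 1)
    return count


n = 200
sorted_input = list(range(n))
worst = quick_sort_count(sorted_input)
print(worst, n * (n - 1) // 2)
```

On sorted input every partitioning step peels off only the pivot, so a segment of size m costs m - 1 comparisons and the total is N(N-1)/2.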