Sorting 1

Sorting
Pseudocode of Insertion Sort
Insertion Sort
To sort array A[0..n-1], sort A[0..n-2] recursively and
then insert A[n-1] in its proper place among the
sorted A[0..n-2]
• Usually implemented bottom up (nonrecursively)
Example: Sort 6, 4, 1, 8, 5
6|4 1 8 5
4 6|1 8 5
1 4 6|8 5
1 4 6 8|5
1 4 5 6 8
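The bottom-up (nonrecursive) version mentioned above can be sketched in Python as follows. This is a minimal illustration, not the slide's original pseudocode; the function name `insertion_sort` is an assumption made for the example.

```python
def insertion_sort(a):
    """Sort the list a in place (bottom-up insertion sort) and return it."""
    for i in range(1, len(a)):
        v = a[i]                 # next element to insert into sorted a[0..i-1]
        j = i - 1
        # Shift larger elements one position right; strict '>' keeps the sort stable.
        while j >= 0 and a[j] > v:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v
    return a


print(insertion_sort([6, 4, 1, 8, 5]))  # [1, 4, 5, 6, 8]
```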
Analysis of Insertion Sort
• Time efficiency
Cworst(n) = n(n-1)/2 ∈ Θ(n^2)
Cavg(n) ≈ n^2/4 ∈ Θ(n^2)
Cbest(n) = n - 1 ∈ Θ(n) (also fast on almost sorted arrays)
• Space efficiency: in-place
• Stability: yes
• Best elementary sorting algorithm overall
• Variant: binary insertion sort (finds the insertion position by binary search)
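One possible sketch of binary insertion sort, assuming Python's standard `bisect` module is used for the binary search. The function name `binary_insertion_sort` is illustrative. Binary search reduces the number of comparisons, but the element shifts still cost Θ(n^2) in the worst case.

```python
from bisect import bisect_right


def binary_insertion_sort(a):
    """Insertion sort that locates each insertion position by binary search."""
    for i in range(1, len(a)):
        v = a[i]
        # bisect_right places v after any equal keys, which keeps the sort stable.
        pos = bisect_right(a, v, 0, i)
        a[pos + 1:i + 1] = a[pos:i]   # shift a[pos..i-1] one position right
        a[pos] = v
    return a
```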
Merge Sort
• Divide: Divide the n-element array to be
sorted into two subarrays of about n/2 elements each.
• Conquer: Sort the two subarrays recursively
using merge sort.
• Combine: Merge the two sorted subarrays to
produce the sorted array.
Mergesort
• Split array A[0..n-1] in two about equal halves and make copies
of each half in arrays B and C
• Sort arrays B and C recursively
• Merge sorted arrays B and C into array A as follows:
– Repeat the following until no elements remain in one of the
arrays:
• compare the first elements in the remaining
unprocessed portions of the arrays
• copy the smaller of the two into A, while incrementing
the index indicating the unprocessed portion of that
array
– Once all elements in one of the arrays are processed, copy
the remaining unprocessed elements from the other array
into A.
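The steps above can be sketched in Python as follows. This is a minimal illustration that keeps the slide's names A, B and C as lowercase `a`, `b`, `c`; the function names `merge` and `mergesort` are assumptions made for the example.

```python
def merge(b, c, a):
    """Merge sorted lists b and c into a, where len(a) == len(b) + len(c)."""
    i = j = k = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:          # '<=' takes from b on ties, so the sort is stable
            a[k] = b[i]
            i += 1
        else:
            a[k] = c[j]
            j += 1
        k += 1
    # One list is exhausted: copy the unprocessed rest of the other one.
    if i == len(b):
        a[k:] = c[j:]
    else:
        a[k:] = b[i:]


def mergesort(a):
    """Sort the list a by copying its halves, sorting them, and merging back."""
    if len(a) > 1:
        mid = len(a) // 2
        b, c = a[:mid], a[mid:]   # copies of the two halves
        mergesort(b)
        mergesort(c)
        merge(b, c, a)
    return a


print(mergesort([8, 3, 2, 9, 7, 1, 5, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```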
Mergesort Example
8 3 2 9 7 1 5 4                   (input)
8 3 2 9 | 7 1 5 4                 (split)
8 3 | 2 9 | 7 1 | 5 4             (split)
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4     (split)
3 8 | 2 9 | 1 7 | 4 5             (merge)
2 3 8 9 | 1 4 5 7                 (merge)
1 2 3 4 5 7 8 9                   (merge)
Pseudocode of Mergesort
Pseudocode of Merge
Analysis of Mergesort
• All cases have same efficiency: Θ(n log n)
• Number of comparisons in the worst case is
close to the theoretical minimum for comparison-based sorting:
log_2 n! ≈ n log_2 n - 1.44n
Quick Sort
• Divide: The array A[l..r] is partitioned into two
nonempty subarrays A[l..m] and A[m+1..r] such that
each element of A[l..m] is less than or equal to each
element of A[m+1..r]. The index m is computed as
part of the partitioning process.
• Conquer: The two subarrays are sorted in place by
recursive calls to quick sort.
• Combine: Since the subarrays are sorted in place, no
work is needed to combine them.
Quicksort
• Select a pivot (partitioning element) – here, the first element
• Rearrange the list so that all the elements in the first s
positions are smaller than or equal to the pivot and all the
elements in the remaining n-s positions are larger than or
equal to the pivot
[ p | A[i] ≤ p | A[i] ≥ p ]
• Exchange the pivot with the last element in the first (i.e., ≤)
subarray; the pivot is now in its final position
• Sort the two subarrays recursively
Quicksort Example
5 3 1 9 8 2 4 7
Quick Sort Algorithm
Quicksort(A[l..r])
  if l < r
    then
      m = Partition(A[l..r])
      Quicksort(A[l..m])
      Quicksort(A[m+1..r])
Partitioning Algorithm
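A minimal Python sketch of a partition that matches the Quicksort(A[l..m]), Quicksort(A[m+1..r]) scheme. It assumes a Hoare-style two-pointer partition with the first element as pivot. This is one common choice and not necessarily the procedure shown on the original slide; the function names are illustrative.

```python
def partition(a, l, r):
    """Hoare-style partition of a[l..r] around the pivot a[l].

    Returns m with l <= m < r such that every element of a[l..m]
    is <= every element of a[m+1..r].
    """
    pivot = a[l]
    i, j = l - 1, r + 1
    while True:
        i += 1
        while a[i] < pivot:       # scan right for an element >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:       # scan left for an element <= pivot
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort(a, l=0, r=None):
    """Sort a[l..r] in place; with default arguments, sort the whole list."""
    if r is None:
        r = len(a) - 1
    if l < r:
        m = partition(a, l, r)
        quicksort(a, l, m)
        quicksort(a, m + 1, r)
    return a


print(quicksort([5, 3, 1, 9, 8, 2, 4, 7]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```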
Analysis of Quicksort
Best case (split in the middle): Θ(n log n)
Worst case (already sorted array, with the first element as pivot): Θ(n^2)
Average case (random arrays): Θ(n log n)
Improvements:
– better pivot selection: median of three partitioning
– switch to insertion sort on small subfiles
– elimination of recursion
These combine to 20-25% improvement
• Considered the method of choice for internal sorting of large
files (n ≥ 10000)
Heap and Heap Sort
Definition:
A heap is a binary tree with the following conditions:
• it is essentially complete: all its levels are full, except
last level where only some rightmost leaves may be
missing
• The key at each node is ≥ keys at its children
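Example
a heap:
       10
     5    7
    4 2  1
not a heap (the leftmost position of the last level is empty, so the tree is not essentially complete):
       10
     5    7
      2  1
not a heap (key 6 is larger than its parent 5):
       10
     5    7
    6 2  1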
Note: Heap’s elements are ordered top down (along any path
down from its root), but they are not ordered left to right
Some Important Properties of a Heap
• Given n, there exists a unique binary tree with n
nodes that is essentially complete, with height h = ⌊log_2 n⌋
• The root contains the largest key
• The subtree rooted at any node of a heap is also a
heap
• A heap can be represented as an array
Heap’s Array Representation
Store heap’s elements in an array (whose elements indexed, for
convenience, 1 to n) in top-down left-to-right order
Example:
       9
     5   3
    1 4 2
index: 1 2 3 4 5 6
key:   9 5 3 1 4 2
Left child of node j is at 2j
Right child of node j is at 2j+1
Parent of node j is at ⌊j/2⌋
Parental nodes are represented in the first ⌊n/2⌋ locations
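The index arithmetic can be checked with a short Python sketch. It assumes the heap is stored in a list whose slot 0 is unused, so that positions run from 1 to n as above. The helper names `left`, `right`, `parent` and `is_heap` are illustrative.

```python
def left(j):
    return 2 * j


def right(j):
    return 2 * j + 1


def parent(j):
    return j // 2            # integer division gives floor(j/2)


def is_heap(h):
    """Check parental dominance for keys stored in h[1..n] (h[0] is unused)."""
    n = len(h) - 1
    return all(h[parent(j)] >= h[j] for j in range(2, n + 1))


heap = [None, 9, 5, 3, 1, 4, 2]      # the example above
print(heap[left(2)], heap[right(2)])  # children of node 2 (key 5): 1 4
print(is_heap(heap))                  # True
```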
Heap Construction (bottom-up)
Step 0: Initialize the structure with keys in the order
given
Step 1: Starting with the last (rightmost) parental node,
fix the heap rooted at it, if it doesn’t satisfy the
heap condition: keep exchanging it with its
largest child until the heap condition holds
Step 2: Repeat Step 1 for the preceding parental node
Example of Heap Construction
Construct a heap for the list 2, 9, 7, 6, 5, 8
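(each tree is listed level by level: root; its children; the last level)
2; 9 7; 6 5 8   >   2; 9 8; 6 5 7                       (7 is exchanged with its child 8)
2; 9 8; 6 5 7                                           (9 is already ≥ its children 6 and 5)
2; 9 8; 6 5 7   >   9; 2 8; 6 5 7   >   9; 6 8; 2 5 7   (2 is exchanged with 9, then with 6)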
Bottom-up heap construction algorithm
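A minimal Python sketch of the bottom-up construction described in Steps 0 to 2. It assumes the 1-based array layout (slot 0 unused) and is not the slide's original pseudocode. The names `sift_down` and `heap_bottom_up` are illustrative.

```python
def sift_down(h, k, n):
    """Restore the heap condition for the subtree rooted at position k of h[1..n]."""
    v = h[k]
    while 2 * k <= n:
        j = 2 * k                          # left child
        if j < n and h[j + 1] > h[j]:
            j += 1                         # right child is the larger one
        if v >= h[j]:
            break
        h[k] = h[j]                        # move the larger child up
        k = j
    h[k] = v


def heap_bottom_up(keys):
    """Build a max-heap from keys; returns the heap in level order."""
    h = [None] + list(keys)                # slot 0 unused, keys in h[1..n]
    n = len(keys)
    for k in range(n // 2, 0, -1):         # last parental node down to the root
        sift_down(h, k, n)
    return h[1:]


print(heap_bottom_up([2, 9, 7, 6, 5, 8]))  # [9, 6, 8, 2, 5, 7]
```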
Heap sort Algorithm:
1. Build heap
2. Remove root – exchange with last (rightmost) leaf
3. Fix up heap (excluding last leaf)
Repeat 2, 3 until heap contains just one node.
Root deletion
The root of a heap can be deleted and the heap fixed up as
follows:
• exchange the root with the last leaf
• compare the new root (formerly the leaf) with each of its
children and, if one of them is larger than the root, exchange it
with the larger of the two.
• continue the comparison/exchange with the children of the
new root until it reaches a level of the tree where it is larger
than both its children
Example of Sorting by Heapsort
Sort the list 2, 9, 7, 6, 5, 8 by heapsort
Stage 1 (heap construction)
2 9 7 6 5 8
2 9 8 6 5 7
2 9 8 6 5 7
9 2 8 6 5 7
9 6 8 2 5 7
Stage 2 (root/max removal)
9 6 8 2 5 7
7 6 8 2 5|9
8 6 7 2 5|9
5 6 7 2|8 9
7 6 5 2|8 9
2 6 5|7 8 9
6 2 5|7 8 9
5 2|6 7 8 9
5 2|6 7 8 9
2|5 6 7 8 9
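The two stages can be sketched in Python as follows. This sketch assumes 0-based indexing (children of position k at 2k+1 and 2k+2) rather than the 1-based layout used earlier, and the function name `heapsort` is illustrative.

```python
def heapsort(a):
    """Sort the list a in place using a max-heap stored in a itself."""

    def sift_down(k, size):
        # Fix the heap rooted at k, considering only a[0..size-1].
        while 2 * k + 1 < size:
            j = 2 * k + 1
            if j + 1 < size and a[j + 1] > a[j]:
                j += 1                      # pick the larger child
            if a[k] >= a[j]:
                break
            a[k], a[j] = a[j], a[k]
            k = j

    n = len(a)
    # Stage 1: bottom-up heap construction.
    for k in range(n // 2 - 1, -1, -1):
        sift_down(k, n)
    # Stage 2: exchange the root with the last leaf, then fix the smaller heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


print(heapsort([2, 9, 7, 6, 5, 8]))  # [2, 5, 6, 7, 8, 9]
```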
Analysis of Heapsort
Recall algorithm:
1. Build heap: Θ(n)
2. Remove root – exchange with last (rightmost) leaf
3. Fix up heap (excluding last leaf): Θ(log n)
Repeat 2, 3 until heap contains just one node: n – 1 times
Total: Θ(n) + Θ(n log n) = Θ(n log n)
• Note: this is the worst case. Average case also Θ(n log n).
Priority queues
• A priority queue is the ADT of an ordered set
with the operations:
– find element with highest priority
– delete element with highest priority
– insert element with assigned priority
• Heaps are very good for implementing priority
queues
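A minimal sketch of the three operations, assuming Python's standard `heapq` module. Because `heapq` is a min-heap, priorities are stored negated to obtain highest-priority-first behaviour. The class name `MaxPriorityQueue` and its method names are illustrative.

```python
import heapq
from itertools import count


class MaxPriorityQueue:
    """Priority queue backed by a binary heap (heapq is a min-heap, so negate)."""

    def __init__(self):
        self._heap = []
        self._order = count()      # tie-breaker: equal priorities leave in FIFO order

    def insert(self, item, priority):
        heapq.heappush(self._heap, (-priority, next(self._order), item))

    def find_max(self):
        return self._heap[0][2]    # highest priority is always at the root

    def delete_max(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


pq = MaxPriorityQueue()
for name, prio in [("a", 2), ("b", 9), ("c", 7)]:
    pq.insert(name, prio)
print(pq.find_max())     # b
print(pq.delete_max())   # b
print(pq.delete_max())   # c
```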
Insertion of a New Element into a Heap
• Insert the new element at last position in heap.
• Compare it with its parent and, if it violates heap condition,
exchange them
• Continue comparing the new element with nodes up the tree until
the heap condition is satisfied
Example: Insert key 10
(each tree is listed level by level: root; its children; the last level)
9; 6 8; 2 5 7 10   >   9; 6 10; 2 5 7 8   >   10; 6 9; 2 5 7 8
Efficiency: O(log n)
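The insertion procedure can be sketched in Python as follows. The sketch assumes a 0-based list, so the parent of position k is at (k - 1) // 2 rather than ⌊k/2⌋. The function name `heap_insert` is illustrative.

```python
def heap_insert(h, key):
    """Insert key into the max-heap h (0-based list) and return h."""
    h.append(key)                      # place the new key at the last position
    k = len(h) - 1
    # Sift up: exchange with the parent while the heap condition is violated.
    while k > 0 and h[(k - 1) // 2] < h[k]:
        p = (k - 1) // 2
        h[p], h[k] = h[k], h[p]
        k = p
    return h


print(heap_insert([9, 6, 8, 2, 5, 7], 10))  # [10, 6, 9, 2, 5, 7, 8]
```

The new key moves up at most one level per exchange, which is where the O(log n) bound comes from.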
Bottom-up vs. Top-down heap construction
• Top down: Heaps can be constructed by
successively inserting elements into an
(initially) empty heap
• Bottom-up: Put everything in and then fix it
Radix Sort
• Based on examining digits in some base-b
numeric representation of items (or keys)
• Least significant digit radix sort
– Processes digits from right to left
– Used in early punched-card sorting machines
• Create groupings of items with same value in
specified digit
– Collect in order and create grouping with next
significant digit
Radix Sort
• Sort each digit (or field) separately.
• Start with the least-significant digit.
• Radix sort must invoke a stable sort.
RADIX-SORT(A, d)
  for i ← 1 to d
    do use a stable sort to sort array A on digit i
Running Time of Radix Sort
• use counting sort as the invoked stable sort, if
the range of digits is not large
• if digit range is 1..k, then each pass takes
Θ(n+k) time
• there are d passes, for a total of Θ(d(n+k))
• if k = O(n), time is Θ(dn)
• when d is constant, we have Θ(n): linear!
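A minimal Python sketch of least-significant-digit radix sort with counting sort as the stable per-digit sort. It assumes nonnegative integer keys. The parameter `exp` (the place value of the current digit) and the function names are illustrative, and here the digit range is 0..base-1.

```python
def counting_sort_by_digit(a, exp, base=10):
    """Stable counting sort of a on the digit with place value exp: Θ(n + k)."""
    counts = [0] * base
    for x in a:
        counts[(x // exp) % base] += 1
    # Turn the counts into the starting output position of each digit value.
    total = 0
    for d in range(base):
        counts[d], total = total, total + counts[d]
    out = [0] * len(a)
    for x in a:                        # left-to-right scan preserves stability
        d = (x // exp) % base
        out[counts[d]] = x
        counts[d] += 1
    return out


def radix_sort(a, base=10):
    """LSD radix sort of nonnegative integers; returns a new sorted list."""
    out = list(a)
    if not out:
        return out
    largest = max(out)
    exp = 1
    while largest // exp > 0:          # one pass per digit of the largest key
        out = counting_sort_by_digit(out, exp, base)
        exp *= base
    return out


print(radix_sort([329, 457, 657, 839, 436, 720, 355]))
# [329, 355, 436, 457, 657, 720, 839]
```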
Radix Sort Example
data[ ]
123 234 345 456 543 987 654 23 76 934 765 452 857 356 805 294 490 780 120 200 73
Buckets[ ] (distributed on the ones digit)
0: 490 780 120 200
1:
2: 452
3: 123 543 23 73
4: 234 654 934 294
5: 345 765 805
6: 456 76 356
7: 987 857
8:
9:
Radix Sort Example (Cont.)
data[ ]
490 780 120 200 452 123 543 23 73 234 654 934 294 345 765 805 456 76 356 987 857
Buckets[ ] (distributed on the tens digit)
0: 200 805
1:
2: 120 123 23
3: 234 934
4: 543 345
5: 452 654 456 356 857
6: 765
7: 73 76
8: 780 987
9: 490 294
Radix Sort Example (Cont.)
data[ ]
200 805 120 123 23 234 934 543 345 452 654 456 356 857 765 73 76 780 987 490 294
buckets[ ] (distributed on the hundreds digit)
0: 23 73 76
1: 120 123
2: 200 234 294
3: 345 356
4: 452 456 490
5: 543
6: 654
7: 765 780
8: 805 857
9: 934 987
data[ ]
23 73 76 120 123 200 234 294 345 356 452 456 490 543 654 765 780 805 857 934 987