Computer Science 101
Fast Searching and Sorting
Improving Efficiency
• We got a better best case by tweaking the
selection sort and the bubble sort
• We would like to improve the worst cases,
too
Example: Sequential Search
set Current to 1
set Found to false
while Current <= N and not Found do
    if A(Current) = Target then
        set Found to true
    else
        increment Current
if Found then
    output Current
else
    output 0
If the data items are in random order, then each one
must be examined in the worst case
Requires N comparisons in the worst case
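A minimal Python sketch of the sequential search above. The function name sequential_search is only illustrative, and positions are kept 1-based to match the pseudocode, so 0 still means "not found".

```python
def sequential_search(a, target):
    """Return the 1-based position of target in a, or 0 if it is absent."""
    current = 1
    found = False
    while current <= len(a) and not found:
        if a[current - 1] == target:   # pseudocode counts from 1, Python from 0
            found = True
        else:
            current += 1
    return current if found else 0


print(sequential_search([34, 16, 90, 22], 90))  # 3
print(sequential_search([34, 16, 90, 22], 5))   # 0, after examining all N items
```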
Searching a Sorted List
• When we search a phone book, we don’t
begin with the first name and look at each
successor
• We skip over large numbers of names until
we find the target or give up
Binary Search
• Strategy
– Have pointers marking left and right ends of the list still to be
processed
– Compute the position of the midpoint between the two pointers
– If the target equals the value at midpoint, quit with the position
found
– Otherwise, if the target is less than that value, search just the
positions to the left of midpoint
– Otherwise, search just the positions to the right of midpoint
– Give up when the pointers cross
[Figure: target 66 in the sorted list 14 16 22 32 34 66 80 90; begin points to 14, mid to 32, end to 90]
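[Figure: the next step for target 66 in the list 14 16 22 32 34 66 80 90; begin now points to 34, mid to 66, end to 90, so the target is found at mid]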
The Binary Search Space
[Figure: the binary search space for the sorted list 34 41 56 63 72 89 95 (positions 0 through 6), showing the sublists left to search after each halving]
The Binary Search Algorithm
set Begin to 1
set End to N
set Found to false
while Begin <= End and not Found do
    compute the midpoint
    if Target = A(Mid) then
        set Found to true
    else if Target < A(Mid) then
        search to the left of the midpoint
    else
        search to the right of the midpoint
if Found then
    output Mid
else
    output 0
The Binary Search Algorithm
set Begin to 1
set End to N
set Found to false
while Begin <= End and not Found do
    set Mid to (Begin + End) / 2, rounded down
    if Target = A(Mid) then
        set Found to true
    else if Target < A(Mid) then
        set End to Mid - 1
    else
        set Begin to Mid + 1
if Found then
    output Mid
else
    output 0
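The finished algorithm might look like this in Python. This is a sketch only: the name binary_search is hypothetical, the list is assumed to be sorted in ascending order, and positions stay 1-based as in the pseudocode.

```python
def binary_search(a, target):
    """Return the 1-based position of target in the sorted list a, or 0."""
    begin, end = 1, len(a)
    while begin <= end:
        mid = (begin + end) // 2        # integer division rounds down
        if target == a[mid - 1]:
            return mid
        elif target < a[mid - 1]:
            end = mid - 1               # search to the left of the midpoint
        else:
            begin = mid + 1             # search to the right of the midpoint
    return 0                            # the pointers crossed: give up


data = [14, 16, 22, 32, 34, 66, 80, 90]
print(binary_search(data, 66))  # 6
print(binary_search(data, 15))  # 0
```

Returning from inside the loop replaces the Found flag; the behavior is otherwise the same.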
Analysis of Binary Search
• On each pass through the loop, ½ of the positions
in the list are discarded
• In the worst case, the number of comparisons
equals the number of times the size of the list can
be divided by 2
• How many comparisons for a list of size N, in the
worst case?
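One way to explore this question is simply to count the halvings. The sketch below (worst_case_passes is an illustrative name) divides the list size by 2 until nothing is left; the count grows like log2(N).

```python
def worst_case_passes(n):
    """Count how many times a list of size n can be halved before it is empty."""
    passes = 0
    while n > 0:
        n //= 2          # each pass discards about half of the positions
        passes += 1
    return passes


for n in (8, 1000, 1_000_000):
    print(n, worst_case_passes(n))
# 8 4
# 1000 10
# 1000000 20
```

So a million sorted items need only about 20 passes through the loop, against up to a million comparisons for sequential search.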
Improving on Sorting
• Several algorithms have been developed to
break the (N² - N) / 2 barrier for sorting
• Most of them use a divide-and-conquer
strategy
• Break the list into smaller pieces and apply
another algorithm to them
Quicksort
• Strategy - Divide and Conquer:
– Partition list into two parts, with small elements in the
first part and large elements in the second part
– Sort the first part
– Sort the second part
• Question - How do we sort the sections?
Answer - Apply Quicksort to them
• Recursive algorithm - one which makes use of
itself to solve smaller problems of the same type
Quicksort
• Question - Will this recursive process ever
stop?
• Answer - Yes, when the problem is small
enough, we no longer use recursion. Such
cases are called base cases
Partitioning a List
• To partition a list, we choose a pivot
element
• The elements that are less than or equal to
the pivot go into the first section
• The elements larger than the pivot go into
the second section
Partitioning a List
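[Figure: the list 19 8 15 5 30 10 1 28 25 12 20 is partitioned around the pivot 10, the element at the midpoint. Afterwards the elements 8, 5, and 1 lie to the left of 10 and the other seven elements lie to its right; each side is a sublist to sort.]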
Data are where they should be relative to the pivot
The Quicksort Algorithm
if the list to sort has more than 1 element then
    if the list has exactly two elements then
        if the elements are out of order then
            exchange them
    else
        perform the Partition Algorithm on the list
        apply QuickSort to the first section
        apply QuickSort to the second section
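The recursive structure can be sketched in Python as follows. This is a simplified version under stated assumptions, not the in-place algorithm: it builds new lists, picks the midpoint element as the pivot, and keeps the pivot out of both sections so that every recursive call works on a strictly smaller list.

```python
def quicksort(items):
    """Return a new sorted list (a simple, non-in-place sketch)."""
    if len(items) <= 1:                  # base case: nothing to sort
        return list(items)
    if len(items) == 2:                  # base case: at most one exchange
        a, b = items
        return [a, b] if a <= b else [b, a]
    mid = len(items) // 2
    pivot = items[mid]                   # assumption: pivot is the midpoint element
    rest = items[:mid] + items[mid + 1:]
    first = [x for x in rest if x <= pivot]    # first section
    second = [x for x in rest if x > pivot]    # second section
    return quicksort(first) + [pivot] + quicksort(second)


print(quicksort([19, 8, 15, 5, 30, 10, 1, 28, 25, 12, 20]))
```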
Partitioning: Choosing the Pivot
• Ideal would be to choose the median
element as the pivot, but this would take too
long
• Some versions just choose the first element
• Our choice - the median of the first three
elements
Partitioning a List
[Figure: the list 19 8 15 5 30 10 1 28 25 12 20 is partitioned around the pivot 15, the median of the first three items (19, 8, 15). The result is 10 8 12 5 1, then the pivot 15, then 30 28 25 19 20.]
The median of the first three elements is a better
approximation to the actual median than the element at
the midpoint and results in more even splits
The Partition Algorithm
exchange the median of the first 3 elements with the first
set P to first position of list
set L to second position of list
set U to last position of list
while L <= U do
    while L <= U and A(L) <= A(P) do
        set L to L + 1
    while A(U) > A(P) do
        set U to U - 1
    if L < U then
        exchange A(L) and A(U)
exchange A(P) and A(U)

A   The list
P   The position of the pivot element
L   Probes for elements > pivot
U   Probes for elements <= pivot
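The partition steps could be written in Python roughly as below. Treat it as a sketch: the name partition and its first/last parameters are illustrative, indexes are 0-based, the sublist is assumed to hold at least three elements, and the extra l <= u test in the first inner loop is an added safeguard that keeps L inside the sublist when many elements equal the pivot.

```python
def partition(a, first, last):
    """Partition a[first..last] in place and return the pivot's final index."""
    # Move the median of the first three elements into the first position.
    trio = sorted(range(first, first + 3), key=lambda i: a[i])
    median = trio[1]
    a[first], a[median] = a[median], a[first]

    p, l, u = first, first + 1, last
    while l <= u:
        while l <= u and a[l] <= a[p]:   # L probes for an element > pivot
            l += 1
        while a[u] > a[p]:               # U probes for an element <= pivot
            u -= 1                       # stops at p at the latest
        if l < u:
            a[l], a[u] = a[u], a[l]
    a[p], a[u] = a[u], a[p]              # put the pivot between the sections
    return u


data = [19, 8, 15, 5, 30, 10, 1, 28, 25, 12, 20]
print(partition(data, 0, len(data) - 1))  # 5
print(data)  # [10, 8, 12, 5, 1, 15, 30, 28, 25, 19, 20]
```

Everything left of the returned index is <= the pivot and everything right of it is larger, so Quicksort can recurse on the two sides and skip the pivot itself.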
Quicksort: Rough Analysis
• For simplification, assume that we always get
even splits when we partition
• When we partition the entire list, each element is
compared with the pivot - approximately n
comparisons
• Each of the halves is partitioned, each taking
about n/2 comparisons, thus about n more
comparisons
• Each of the fourths is partitioned, each taking
about n/4 comparisons, thus about n more
Quicksort: Rough Analysis
• How many levels of about n comparisons do we
get?
• Roughly, we keep splitting until the pieces are
about size 1
• How many times must we divide n by 2 before we
get 1?
• log2(n) times, of course
• Thus comparisons ≈ n log2(n) in the ideal or best
case
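To get a feel for the difference, the short sketch below compares the ideal n log2(n) estimate with the (N² - N) / 2 comparisons of the earlier sorts. Both function names are illustrative, and the figures are rough estimates rather than exact counts.

```python
import math


def ideal_comparisons(n):
    """About n comparisons on each of log2(n) levels."""
    return n * math.log2(n)


def quadratic_comparisons(n):
    """Comparisons made by selection sort or bubble sort in the worst case."""
    return (n * n - n) // 2


for n in (16, 1024, 1_000_000):
    print(n, round(ideal_comparisons(n)), quadratic_comparisons(n))
```

For n = 16 the two estimates are 64 and 120; the gap widens quickly as n grows.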
Call Tree For a Best Case
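[Call tree: 34 41 56 63 72 89 95 splits around the pivot 63 into 34 41 56 and 72 89 95; these split in turn into 34 and 56 (pivot 41) and into 72 and 95 (pivot 89)]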
We select the midpoint element as the pivot.
The median element happens to be at the midpoint on each call.
But the array was already sorted!
Worst Case
• What if the value at the midpoint is near the
largest value on each call?
• Or near the smallest value on each call?
• Then there will be approximately n
subdivisions, and the total number of
comparisons will degenerate to the order of n²
Call Tree For a Worst Case
[Call tree: 34 41 56 63 72 89 95, then 41 56 63 72 89 95, then 56 63 72 89 95, then 63 72 89 95, then 72 89 95, then 89 95, then 95; each call removes only its pivot]
We select the first element as the pivot.
The smallest element happens to be the first one on each call.
n subdivisions!
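The best and worst cases can be compared by counting comparisons on the same sorted list. The sketch below is a simplified model, not the in-place algorithm: count_comparisons and choose_pivot are hypothetical names, and each non-pivot element is charged one comparison against the pivot per partition.

```python
def count_comparisons(items, choose_pivot):
    """Count element-versus-pivot comparisons made by a simple quicksort."""
    if len(items) <= 1:
        return 0
    k = choose_pivot(items)              # index of the pivot in this sublist
    pivot = items[k]
    rest = items[:k] + items[k + 1:]
    first = [x for x in rest if x <= pivot]
    second = [x for x in rest if x > pivot]
    return (len(rest)                    # one comparison per remaining element
            + count_comparisons(first, choose_pivot)
            + count_comparisons(second, choose_pivot))


data = [34, 41, 56, 63, 72, 89, 95]
print(count_comparisons(data, lambda s: 0))             # first element: 21
print(count_comparisons(data, lambda s: len(s) // 2))   # midpoint element: 10
```

With the first element as pivot, the sorted list costs 6 + 5 + 4 + 3 + 2 + 1 = 21 comparisons, which is (n² - n) / 2 for n = 7.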
Other Methods of Selecting the Pivot Element
• Pick a random element
• Pick the median of the first three elements
• Pick the median of the first, middle, and
last elements
• Pick the median element - not! Finding the
median is itself an O(n) algorithm on every call
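Two of these choices are sketched below in Python. The function names and the first/last parameters are illustrative assumptions; each function returns the index of the chosen pivot within a[first..last].

```python
import random


def median_of_three_index(a, first, last):
    """Index of the median of the first, middle, and last elements."""
    middle = (first + last) // 2
    candidates = sorted((first, middle, last), key=lambda i: a[i])
    return candidates[1]


def random_index(a, first, last):
    """Index of a randomly chosen element; randint includes both ends."""
    return random.randint(first, last)


data = [34, 41, 56, 63, 72, 89, 95]
print(median_of_three_index(data, 0, len(data) - 1))  # 3, the position of 63
```

On an already sorted list the median of the first, middle, and last elements is the middle one, which avoids the lopsided splits produced by a first-element pivot.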
For Monday
Continue Reading in Chapter 3