CS 332: Algorithms
Quicksort
David Luebke
1
3/14/2016
Review: Analyzing Quicksort

What will be the worst case for the algorithm?
  Partition is always unbalanced
What will be the best case for the algorithm?
  Partition is balanced
Which is more likely?
  The latter, by far, except...
Will any particular input elicit the worst case?
  Yes: Already-sorted input
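The already-sorted worst case can be observed directly by counting comparisons. The following is a minimal sketch, assuming a Lomuto-style partition that always picks the last element as the pivot; the course's partition() may differ, and the name quicksort_comparisons is made up for this illustration.

```python
import random


def quicksort_comparisons(items):
    """Sort a copy of items; return (sorted_list, number_of_comparisons).

    Assumption: last-element pivot (Lomuto partition), not necessarily
    the partition() procedure used in the course.
    """
    a = list(items)
    comparisons = 0

    def partition(lo, hi):
        nonlocal comparisons
        pivot = a[hi]
        i = lo - 1
        for j in range(lo, hi):
            comparisons += 1  # one comparison against the pivot
            if a[j] <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        return i + 1

    def sort(lo, hi):
        if lo < hi:
            p = partition(lo, hi)
            sort(lo, p - 1)
            sort(p + 1, hi)

    sort(0, len(a) - 1)
    return a, comparisons


n = 200
_, worst = quicksort_comparisons(range(n))  # already-sorted input

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)
_, typical = quicksort_comparisons(shuffled)  # randomly ordered input

print("sorted input:  ", worst)
print("shuffled input:", typical)
```

With this pivot rule, sorted input makes every partition maximally unbalanced, so the comparison count is n(n-1)/2, while the shuffled input needs far fewer.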
Review: Analyzing Quicksort

In the worst case:
T(1) = Θ(1)
T(n) = T(n - 1) + Θ(n)

Works out to
T(n) = Θ(n^2)
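As a sketch of why this recurrence works out to a quadratic, assume the Θ(n) partitioning cost is exactly cn for some constant c > 0 (c is introduced here only for illustration) and unroll the recurrence:

```latex
\begin{aligned}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &= T(1) + c\sum_{k=2}^{n} k \\
     &= \Theta(1) + c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2)
\end{aligned}
```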
Review: Analyzing Quicksort

In the best case:
T(n) = 2T(n/2) + Θ(n)

What does this work out to?
T(n) = Θ(n lg n)
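One way to see this, as a sketch: assume the Θ(n) cost is exactly cn for a constant c > 0 and that n is a power of 2 (both assumptions made only to keep the algebra clean), then expand the recurrence level by level:

```latex
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &= 2^{i}\,T\!\left(n/2^{i}\right) + i \cdot cn \\
     &= n\,T(1) + cn\lg n \qquad \text{(taking } i = \lg n\text{)} \\
     &= \Theta(n \lg n)
\end{aligned}
```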
Review: Analyzing Quicksort (Average Case)

Intuitively, a real-life run of quicksort will produce a mix of “bad” and “good” splits
  Randomly distributed among the recursion tree
  Pretend for intuition that they alternate between best-case (n/2 : n/2) and worst-case (n-1 : 1)
  What happens if we bad-split the root node, then good-split the resulting size (n-1) node?
    We end up with three subarrays, size 1, (n-1)/2, (n-1)/2
    Combined cost of splits = n + n - 1 = 2n - 1 = O(n)
    No worse than if we had good-split the root node!
Review: Analyzing Quicksort (Average Case)

Intuitively, the O(n) cost of a bad split (or 2 or 3 bad splits) can be absorbed into the O(n) cost of each good split
  Thus the running time of alternating bad and good splits is still O(n lg n), with slightly higher constants
  How can we be more rigorous?

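The alternating-splits intuition can be checked numerically. The sketch below assumes a pair of mutually recursive cost functions: a good split at size n costs n plus two bad-split subproblems of size n/2, and a bad split costs n plus one good-split subproblem of size n-1. The exact cost n in place of O(n), the zero base cost, the integer division, and both function names are assumptions made for illustration.

```python
from functools import lru_cache
from math import log2


@lru_cache(maxsize=None)
def good_then_bad(n):
    """Cost when this node splits evenly and its two children split badly."""
    if n <= 1:
        return 0
    return 2 * bad_then_good(n // 2) + n


@lru_cache(maxsize=None)
def bad_then_good(n):
    """Cost when this node splits (n-1 : 1) and the large child splits evenly."""
    if n <= 1:
        return 0
    return good_then_bad(n - 1) + n


for n in (16, 256, 4096, 65536):
    cost = bad_then_good(n)
    print(n, cost, round(cost / (n * log2(n)), 3))
```

The printed ratio to n lg n stays bounded by a small constant, consistent with the claim that alternating splits only raise the constant factor.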
Analyzing Quicksort: Average Case

For simplicity, assume:
  All inputs distinct (no repeats)
  Slightly different partition() procedure
    partition around a random element, which is not included in subarrays
    all splits (0:n-1, 1:n-2, 2:n-3, … , n-1:0) equally likely
What is the probability of a particular split happening?
  Answer: 1/n

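The 1/n answer can be confirmed by enumerating every possible pivot. This is a small sketch under the slide's assumptions (distinct inputs, pivot chosen uniformly at random and excluded from both subarrays); split_distribution is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction


def split_distribution(values):
    """Map each split (left_size, right_size) to its exact probability.

    Assumes distinct values and a pivot chosen uniformly at random.
    """
    n = len(values)
    counts = Counter()
    for pivot in values:  # every element is equally likely to be the pivot
        smaller = sum(1 for v in values if v < pivot)
        counts[(smaller, n - 1 - smaller)] += 1
    return {split: Fraction(c, n) for split, c in counts.items()}


for split, prob in sorted(split_distribution([31, 4, 15, 9, 26]).items()):
    print(split, prob)
```

Each of the n splits (k : n-1-k) arises from exactly one pivot choice, so each has probability 1/n regardless of the input order.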
Analyzing Quicksort: Average Case

So partition generates splits (0:n-1, 1:n-2, 2:n-3, … , n-2:1, n-1:0), each with probability 1/n
If T(n) is the expected running time,

$$T(n) = \frac{1}{n}\sum_{k=0}^{n-1}\left[T(k) + T(n-1-k)\right] + \Theta(n)$$

What is each term under the summation for?
What is the Θ(n) term for?

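Analyzing Quicksort: Average Case

So…

$$
\begin{aligned}
T(n) &= \frac{1}{n}\sum_{k=0}^{n-1}\left[T(k) + T(n-1-k)\right] + \Theta(n) \\
     &= \frac{2}{n}\sum_{k=0}^{n-1} T(k) + \Theta(n)
\end{aligned}
$$

Each T(k) appears twice in the first sum, once as T(k) and once as T(n-1-k), which gives the factor of 2.
Note: write it on the board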
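Before solving the recurrence analytically, it can be evaluated exactly for small n. The sketch below assumes the Θ(n) term is exactly n and that T(0) = 0; both choices, and the name expected_costs, are illustrative assumptions rather than part of the lecture.

```python
import math
from fractions import Fraction


def expected_costs(n_max):
    """Return [T(0), ..., T(n_max)] for T(n) = (2/n) * sum_{k<n} T(k) + n.

    Assumes T(0) = 0 and takes the Theta(n) term to be exactly n.
    """
    costs = [Fraction(0)]
    running_sum = Fraction(0)  # holds T(0) + ... + T(n-1)
    for n in range(1, n_max + 1):
        costs.append(Fraction(2, n) * running_sum + n)
        running_sum += costs[-1]
    return costs


costs = expected_costs(256)
for n in (2, 16, 64, 256):
    ratio = float(costs[n]) / (n * math.log2(n))
    print(n, round(float(costs[n]), 2), round(ratio, 3))
```

The ratio of T(n) to n lg n stays below 2 for these values, which suggests the guess T(n) = O(n lg n) used in the substitution argument.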
Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method
  Guess the answer
    T(n) = O(n lg n)
  Assume that the inductive hypothesis holds
    T(n) ≤ an lg n + b for some constants a and b
  Substitute it in for some value < n
    The value k in the recurrence
  Prove that it follows for n
    Grind through it…
Analyzing Quicksort: Average Case

$$
\begin{aligned}
T(n) &= \frac{2}{n}\sum_{k=0}^{n-1} T(k) + \Theta(n) && \text{The recurrence to be solved} \\
     &\le \frac{2}{n}\sum_{k=0}^{n-1}\left(ak\lg k + b\right) + \Theta(n) && \text{Plug in inductive hypothesis} \\
     &= \frac{2}{n}\left[b + \sum_{k=1}^{n-1}\left(ak\lg k + b\right)\right] + \Theta(n) && \text{Expand out the $k=0$ case} \\
     &= \frac{2}{n}\sum_{k=1}^{n-1}\left(ak\lg k + b\right) + \frac{2b}{n} + \Theta(n) \\
     &= \frac{2}{n}\sum_{k=1}^{n-1}\left(ak\lg k + b\right) + \Theta(n) && \text{$2b/n$ is just a constant, so fold it into $\Theta(n)$}
\end{aligned}
$$

Note: leaving the same recurrence as the book
Analyzing Quicksort: Average Case

$$
\begin{aligned}
T(n) &\le \frac{2}{n}\sum_{k=1}^{n-1}\left(ak\lg k + b\right) + \Theta(n) && \text{The recurrence to be solved} \\
     &= \frac{2}{n}\sum_{k=1}^{n-1} ak\lg k + \frac{2}{n}\sum_{k=1}^{n-1} b + \Theta(n) && \text{Distribute the summation} \\
     &= \frac{2a}{n}\sum_{k=1}^{n-1} k\lg k + \frac{2b}{n}(n-1) + \Theta(n) && \text{Evaluate the summation: $b+b+\dots+b = b(n-1)$} \\
     &\le \frac{2a}{n}\sum_{k=1}^{n-1} k\lg k + 2b + \Theta(n) && \text{Since $n-1<n$, $2b(n-1)/n < 2b$}
\end{aligned}
$$

This summation gets its own set of slides later
Analyzing Quicksort: Average Case

$$
\begin{aligned}
T(n) &\le \frac{2a}{n}\sum_{k=1}^{n-1} k\lg k + 2b + \Theta(n) && \text{The recurrence to be solved} \\
     &\le \frac{2a}{n}\left(\frac{1}{2}n^2\lg n - \frac{1}{8}n^2\right) + 2b + \Theta(n) && \text{We'll prove this later} \\
     &= an\lg n - \frac{a}{4}n + 2b + \Theta(n) && \text{Distribute the $(2a/n)$ term} \\
     &= an\lg n + b + \left(\Theta(n) + b - \frac{a}{4}n\right) && \text{Remember, our goal is to get $T(n) \le an\lg n + b$} \\
     &\le an\lg n + b && \text{Pick $a$ large enough that $an/4$ dominates $\Theta(n)+b$}
\end{aligned}
$$
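To make the last step concrete, here is one possible choice of constants, offered as a sketch. Assume the Θ(n) term is at most cn for all n ≥ 1, where c is a hypothetical constant introduced only for this illustration, and assume b ≥ 0. Since b ≤ bn when n ≥ 1:

```latex
\Theta(n) + b - \frac{a}{4}n
  \;\le\; cn + bn - \frac{a}{4}n
  \;=\; \left(c + b - \frac{a}{4}\right)n
  \;\le\; 0
  \qquad \text{whenever } a \ge 4(c + b)
```

So any a at least 4(c + b) makes the parenthesized term non-positive, which is what "an/4 dominates Θ(n)+b" requires.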
Analyzing Quicksort: Average Case

So T(n) ≤ an lg n + b for certain a and b
  Thus the induction holds
  Thus T(n) = O(n lg n)
  Thus quicksort runs in O(n lg n) time on average (phew!)
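The O(n lg n) average can also be observed empirically. The sketch below follows the simplified model from the analysis (distinct values, uniformly random pivot excluded from both subarrays) and charges n - 1 comparisons per partition; the function names, trial count, and seed are illustrative assumptions.

```python
import math
import random


def count_comparisons(values, rng):
    """Comparisons made by quicksort with a uniformly random pivot.

    Assumes distinct values; the pivot is excluded from both subarrays.
    """
    if len(values) <= 1:
        return 0
    pivot = values[rng.randrange(len(values))]
    smaller = [v for v in values if v < pivot]
    larger = [v for v in values if v > pivot]
    # Charge n - 1 comparisons for partitioning n elements around the pivot.
    return (len(values) - 1
            + count_comparisons(smaller, rng)
            + count_comparisons(larger, rng))


def average_comparisons(n, trials=50, seed=0):
    """Average comparison count over several runs on n distinct values."""
    rng = random.Random(seed)
    data = list(range(n))
    return sum(count_comparisons(data, rng) for _ in range(trials)) / trials


for n in (64, 256, 1024):
    avg = average_comparisons(n)
    print(n, round(avg), round(avg / (n * math.log2(n)), 2))
```

The ratio of the average to n lg n should stay roughly constant as n grows, in line with the bound just derived.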
Oh yeah, the summation…
Tightly Bounding The Key Summation

$$
\begin{aligned}
\sum_{k=1}^{n-1} k\lg k &= \sum_{k=1}^{\lceil n/2\rceil - 1} k\lg k + \sum_{k=\lceil n/2\rceil}^{n-1} k\lg k && \text{Split the summation for a tighter bound} \\
&\le \sum_{k=1}^{\lceil n/2\rceil - 1} k\lg k + \sum_{k=\lceil n/2\rceil}^{n-1} k\lg n && \text{The $\lg k$ in the second term is bounded by $\lg n$} \\
&= \sum_{k=1}^{\lceil n/2\rceil - 1} k\lg k + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{Move the $\lg n$ outside the summation}
\end{aligned}
$$
Tightly Bounding The Key Summation

$$
\begin{aligned}
\sum_{k=1}^{n-1} k\lg k &\le \sum_{k=1}^{\lceil n/2\rceil - 1} k\lg k + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{The summation bound so far} \\
&\le \sum_{k=1}^{\lceil n/2\rceil - 1} k\lg(n/2) + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{The $\lg k$ in the first term is bounded by $\lg n/2$} \\
&= \sum_{k=1}^{\lceil n/2\rceil - 1} k\left(\lg n - 1\right) + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \lg n/2 = \lg n - 1 \\
&= \left(\lg n - 1\right)\sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{Move $(\lg n - 1)$ outside the summation}
\end{aligned}
$$
Tightly Bounding The Key Summation

$$
\begin{aligned}
\sum_{k=1}^{n-1} k\lg k &\le \left(\lg n - 1\right)\sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{The summation bound so far} \\
&= \lg n\sum_{k=1}^{\lceil n/2\rceil - 1} k - \sum_{k=1}^{\lceil n/2\rceil - 1} k + \lg n\sum_{k=\lceil n/2\rceil}^{n-1} k && \text{Distribute the $(\lg n - 1)$} \\
&= \lg n\sum_{k=1}^{n-1} k - \sum_{k=1}^{\lceil n/2\rceil - 1} k && \text{The two $\lg n$ summations cover adjacent ranges; combine them} \\
&= \lg n\left(\frac{(n-1)n}{2}\right) - \sum_{k=1}^{\lceil n/2\rceil - 1} k && \text{The Gaussian series}
\end{aligned}
$$
Tightly Bounding The Key Summation

$$
\begin{aligned}
\sum_{k=1}^{n-1} k\lg k &\le \left(\frac{(n-1)n}{2}\right)\lg n - \sum_{k=1}^{\lceil n/2\rceil - 1} k && \text{The summation bound so far} \\
&\le \frac{1}{2}n(n-1)\lg n - \sum_{k=1}^{n/2 - 1} k && \text{Rearrange first term, place upper bound on second} \\
&= \frac{1}{2}n(n-1)\lg n - \frac{1}{2}\left(\frac{n}{2}\right)\left(\frac{n}{2} - 1\right) && \text{Gaussian series} \\
&= \frac{1}{2}\left[n^2\lg n - n\lg n\right] - \frac{1}{8}n^2 + \frac{n}{4} && \text{Multiply it all out}
\end{aligned}
$$
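Tightly Bounding The Key Summation

$$
\begin{aligned}
\sum_{k=1}^{n-1} k\lg k &\le \frac{1}{2}\left[n^2\lg n - n\lg n\right] - \frac{1}{8}n^2 + \frac{n}{4} \\
&\le \frac{1}{2}n^2\lg n - \frac{1}{8}n^2 \quad \text{when } n \ge 2
\end{aligned}
$$

Done!!!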
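As a sanity check on the final bound, the two sides can be compared numerically for a range of n. This is a quick sketch, not a substitute for the proof; the function names k_lg_k_sum and slide_bound are made up for illustration.

```python
import math


def k_lg_k_sum(n):
    """Left-hand side: sum of k * lg k for k = 1 .. n-1."""
    return sum(k * math.log2(k) for k in range(1, n))


def slide_bound(n):
    """Right-hand side: (1/2) n^2 lg n - (1/8) n^2."""
    return 0.5 * n * n * math.log2(n) - n * n / 8


for n in (2, 10, 100, 1000):
    print(n, round(k_lg_k_sum(n), 1), round(slide_bound(n), 1))
```

For every n ≥ 2 tried, the sum stays below the bound, as the derivation predicts.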