
Intro. to the Divide-and-Conquer Strategy via Merge Sort
CMPSC 465 – CLRS Sections 2.3, Intro. to and various parts of Chapter 4
I. Algorithm Design and Divide-and-Conquer
There are various strategies we can use to design algorithms.
One is an incremental approach, where we start with a solution to a single-element problem and figure out, one increment at a time, how to build the solution to the next larger problem from the prior solution. Insertion sort fits this strategy.
Question: Why? Because at each step, we have already _______________________________________
we insert _________________________ into its place, ___________________,
and we get out ______________________________.
Another common strategy is called divide-and-conquer. It is for situations that are naturally recursive and has three
components:

• Divide …
• Conquer …
• Combine …
This is one of three major algorithm design strategies we’ll study in this course. The other two are greedy algorithms (you got
a taste of that in 360 with minimum spanning trees) and dynamic programming.
II. Merge Sort
We’ve been discussing the merge sort algorithm informally all along. In this lesson, we’ll focus on merge sort and use it as an
introduction to the important ideas of analyzing divide-and-conquer algorithms.
We looked at merge sort on the first day. The idea is to sort a subarray A[p..r], where we start with p = ____ and r = _____.
However, the algorithm is recursive, and at each level, the values of p and r change.
Here’s the high-level overview of merge sort:
1. Divide the array into two subarrays _______________ and _______________, where q is the midpoint of p and r.
2. Conquer by recursively sorting the two subarrays.
3. Combine by ___________________ the two sorted subarrays A[p..q] and A[q+1..r] to produce a single sorted subarray ________________.
Question: All recursive definitions and algorithms need a base case. What’s the base case here?
Page 1 of 9
Prepared by D. Hogan referencing CLRS - Introduction to Algorithms (3rd ed.) for PSU CMPSC 465
Here’s the pseudocode for merge sort from CLRS:
MERGE-SORT(A, p, r)
    if p < r                        // check for base case
        q = ⌊(p + r) / 2⌋           // divide
        MERGE-SORT(A, p, q)         // conquer
        MERGE-SORT(A, q + 1, r)     // conquer
        MERGE(A, p, q, r)           // combine
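As a concrete sketch of this pseudocode (not the CLRS code itself: Python uses 0-based indices, and the helper merge below is a simple two-pointer stand-in for the MERGE procedure we develop in Section III):

```python
def merge_sort(A, p, r):
    """Sort the subarray A[p..r] in place (0-based, inclusive indices)."""
    if p < r:                      # base case: a subarray of size 1 is already sorted
        q = (p + r) // 2           # divide: floor of the midpoint
        merge_sort(A, p, q)        # conquer the left half
        merge_sort(A, q + 1, r)    # conquer the right half
        merge(A, p, q, r)          # combine the two sorted halves

def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] into a single sorted A[p..r]."""
    left, right = A[p:q + 1], A[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # take from the left pile when the right is exhausted,
        # or when the left's top card is no larger (this keeps the sort stable)
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]; i += 1
        else:
            A[k] = right[j]; j += 1

A = [5, 2, 4, 7, 1, 3, 2, 6]   # the first trace example below
merge_sort(A, 0, len(A) - 1)
print(A)  # [1, 2, 2, 3, 4, 5, 6, 7]
```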
Let’s trace the algorithm a few times.
Example: Trace MERGE-SORT for the following initial array:

index   1   2   3   4   5   6   7   8
value   5   2   4   7   1   3   2   6

Example: Trace merge sort for the following initial array:

index   1   2   3   4   5   6   7   8   9   10   11
value   4   7   2   6   1   4   7   3   5    2    6
III. Merging in Linear Time
The concept of merging is relatively intuitive, but implementing it or expressing it in pseudocode isn’t quite as
straightforward. The problem’s input and output are the following:
Input with preconditions: Array A and indices p, q, r s.t. p ≤ q < r, subarray A[p..q] is sorted, and subarray A[q+1..r] is
sorted
Output/postcondition: Subarrays A[p..q] and A[q+1..r] have been merged into a single sorted subarray in A[p..r]
Last week, we quickly discussed why this can be done in Θ(n) time, but let’s look a little more carefully here.
Let’s visualize this as two piles of cards, where each pile is sorted and we can see the smallest card from each pile on top.
We’ll have the input piles face up. We want to merge those cards into a single sorted pile, face down. Let’s act this out, but
you need to count as we go. I’ll leave you space to take notes and keep count…
So,
• Each basic step was to ____________________________________________________ and move it to an output pile, exposing a new top card.
• We repeatedly perform basic steps until ______________________________________________.
• Then, we finish by ________________________________________________________.

Each basic step takes __________________ time. What’s the maximum number of basic steps? ___________.
What’s the running time? _____________
The CLRS version does empty-pile detection via a sentinel card, whose value is guaranteed to lose in a comparison to any value. So, ∞ works for this, and you’ll see it in the pseudocode. In this mindset…
• Given that the input array is indexed from p to r, there are exactly __________ non-sentinel cards.
• So, we can use this to stop the algorithm after _____________ basic steps. Thus we can just fill up the output array from index p through index r.
Here’s the pseudocode for merging from CLRS:
MERGE(A, p, q, r)
    n1 = q − p + 1
    n2 = r − q
    let L[1..n1 + 1] and R[1..n2 + 1] be new arrays
    for i = 1 to n1
        L[i] = A[p + i − 1]
    for j = 1 to n2
        R[j] = A[q + j]
    L[n1 + 1] = ∞                  // sentinels
    R[n2 + 1] = ∞
    i = 1
    j = 1
    for k = p to r
        if L[i] ≤ R[j]
            A[k] = L[i]
            i = i + 1
        else
            A[k] = R[j]
            j = j + 1
Example: Suppose array A is as below and trace MERGE(A, 9, 12, 16):

index   …   9   10   11   12   13   14   15   16   …
value   …   2    4    5    7    1    2    3    6   …
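The sentinel approach can be sketched in Python (an illustration, not CLRS’s exact code: indices are 0-based, so the 1-based call MERGE(A, 9, 12, 16) above becomes merge_with_sentinels(A, 8, 11, 15), and math.inf plays the role of ∞):

```python
import math

def merge_with_sentinels(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] (0-based) using infinite sentinels.

    Each of the r - p + 1 basic steps is one comparison plus one move,
    so the running time is Theta(n) for n = r - p + 1.
    """
    L = A[p:q + 1] + [math.inf]      # left pile, sentinel at the bottom
    R = A[q + 1:r + 1] + [math.inf]  # right pile, sentinel at the bottom
    i = j = 0
    for k in range(p, r + 1):        # exactly r - p + 1 basic steps, then stop
        if L[i] <= R[j]:             # a sentinel loses to every real card
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1

# the trace example above, shifted to 0-based indices 8..15
A = [0] * 8 + [2, 4, 5, 7, 1, 2, 3, 6] + [0]
merge_with_sentinels(A, 8, 11, 15)
print(A[8:16])  # [1, 2, 2, 3, 4, 5, 6, 7]
```

Because the loop runs exactly r − p + 1 times, the sentinels are never copied into the output; they only absorb comparisons once a pile empties.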
IV. The Correctness of the Merge Algorithm
Let’s study the correctness of the Merge algorithm by proving the correctness of its last loop, which is the heart of the algorithm as far as correctness goes:
Loop Invariant:
Correctness Proof:
Initialization:
Maintenance:
Termination:
V. Back to Recurrences
The key tool to analyze divide-and-conquer algorithms is the recurrence. We can use a recurrence to describe the running
time of a divide-and-conquer algorithm very naturally, because of the recursion in the algorithm.
In this world, we use the following conventions:
• We let T(n) be a function for the running time for a problem of size n.
• A base case corresponds to a small enough problem size, say n ≤ c for some constant c, where we use a simple or brute-force strategy we say takes Θ(1) time.
• The recursive case is where we divide a problem into a subproblems, each 1/b the size of the original.
• We express the time to divide a size-n problem as D(n).
• We express the time to combine the solutions to the subproblems as C(n).
So, using these conventions, a general form of a recurrence for the running time of a divide-and-conquer algorithm is

    T(n) = Θ(1)                        for n ≤ c
    T(n) = aT(n/b) + D(n) + C(n)       otherwise
As always, our goal with a recurrence is to solve it for a closed form, i.e. a formula that determines the same sequence.
Expanding upon the list we have from discrete math, there are several methods for solving recurrences:
1. Iteration: Start with the recurrence and keep applying the recurrence equation until we get a pattern. (The result is a guess at the closed form.)
2. Substitution: Guess the solution; prove it using induction. (The result here is a proven closed form. It’s often difficult to come up with the guess blindly, though, so in practice we need another strategy to guess the closed form.)
3. Recursion Tree: Draw a tree that illustrates the decomposition of a problem of size n into subproblems and tally the costs of each level of the tree. Use this to find a closed form. (Like iteration, this is really a guess at the closed form.)
4. Master Theorem: Plug into a formula that gives an asymptotic bound on the solution. (The result here is only a bound on the closed form, not an exact solution. For many of our purposes, that’s good enough.)
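As a preview of the iteration method on the recurrence we will shortly derive for merge sort, T(n) = 2T(n/2) + cn with T(1) = c and n a power of 2:

\begin{align*}
T(n) &= 2T(n/2) + cn \\
     &= 2\bigl(2T(n/4) + cn/2\bigr) + cn = 4T(n/4) + 2cn \\
     &= 4\bigl(2T(n/8) + cn/4\bigr) + 2cn = 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k T(n/2^k) + kcn \,.
\end{align*}

The pattern bottoms out at k = lg n, giving nT(1) + cn lg n = cn + cn lg n = Θ(n lg n), which is the guess that substitution would then prove.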
We’ll look at the new techniques in depth in this chapter.
Here are some other issues that are specific to the use of recurrences in analyzing algorithms:
• Floors and ceilings:
• Expressing boundary conditions:
• Asymptotic notation:
VI. The Merge Sort Recurrence
In Epp 11.5, we derived a merge sort recurrence. We’ll simplify it a bit here and state it in the same symbols as above. Some
notes:
• The base case occurs when n = 1.
• For n ≥ 2,
  o The divide step is a computation of the average of p and r and takes Θ(1) time.
  o The conquer step is to solve two recursive subproblems, each of size n/2.
  o The combine step is to merge an n-element array, which takes Θ(n) time.
So the recurrence is:

    T(n) = Θ(1)               for n = 1
    T(n) = 2T(n/2) + Θ(n)     for n > 1

as Θ(n) + Θ(1) = Θ(n).
VII. Analysis via Recursion Tree
Let’s simplify the recurrence as follows, using c as a constant (in general the constants hidden in the two cases are not the same, but a single c works for this case and keeps the analysis clean):

    T(n) = c                for n = 1
    T(n) = 2T(n/2) + cn     for n > 1
While we could solve this recurrence with our old iteration strategy, we’ll use a new strategy called a recursion tree, which visually represents the recursion and is less prone to errors. At each step:
• The root of the recursion tree (or subtree) is labeled with the cost of dividing and combining at that step.
• We branch to children, with one child for each subproblem.
  o Initially, we label the nodes with the recurrence function’s value.
  o Then, we apply another round of recursion, repeating the whole process and replacing the roots with their costs as described.
We continue this expansion until problem sizes reach the base case of 1.
Let’s draw the first step of the recursion tree for merge sort:
Let’s draw the recursion tree after the first two steps:
Now let’s draw the complete recursion tree:
Now, we must analyze the tree’s cost at each level:
• What is the cost of the top level? ______________
• What is the cost per subproblem on the second level? _______ How many subproblems? _______ Second level cost? ______________
• Third level cost?
• Cost per level?
Next,
• What is the height of the tree? _________
• How many levels does the tree have? __________ (we’ll prove this momentarily)
• What is the total cost of the tree? What is the merge sort running time?
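The level-by-level bookkeeping can be checked with a short script (an illustration with c = 1 and n a power of 2: level i of the tree has 2^i subproblems, each of size n/2^i):

```python
import math

def level_costs(n, c=1):
    """Cost of each level of the merge sort recursion tree, for n a power of 2."""
    levels = int(math.log2(n)) + 1                    # lg n + 1 levels
    # level i: 2**i subproblems, each contributing c * (n / 2**i)
    return [2 ** i * c * (n // 2 ** i) for i in range(levels)]

costs = level_costs(16)
print(costs)       # [16, 16, 16, 16, 16]  -- every level costs cn
print(sum(costs))  # 80 = cn(lg n + 1) = 16 * 5
```

The script just confirms the tally: every level contributes cn, and with lg n + 1 levels the total is cn(lg n + 1) = Θ(n lg n).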
Finally, let’s be clean in our analysis and prove the claim that there are lg n + 1 levels, for problem sizes that are powers of 2,
inductively:
Homework: CLRS Exercises 2.3-1, 2.3-2