Lecture 7

Paradigm #5: Greedy Algorithms

Ref.: CLRS, Chap. 16

Example 1 (Making change)

Suppose you buy something that costs less than $5, and you get your change in Canadian coins. How can you minimize the number of coins returned to you?

Formally: given an amount x in cents, where 0 ≤ x < 500, we wish to minimize

    t + l + q + d + n + p

subject to

    x = 200t + 100l + 25q + 10d + 5n + p,

where t, l, q, d, n, p are non-negative integers (the numbers of toonies, loonies, quarters, dimes, nickels, and pennies).

Greedy algorithm: choose as many toonies as possible (the largest t such that 200t ≤ x); then, for the remainder x - 200t, choose as many loonies as possible; and so on down through the denominations.

Coin changing

Example: make change for $4.87. We take, in turn, t = 2, l = 0, q = 3, d = 1, n = 0, p = 2, for a total of 8 coins.

Theorem. In the Canadian system of coinage, the greedy algorithm always produces a solution where the total number of coins is minimized.

I will give proof ideas in class, and you will prove this formally in Assignment 3.

Note: this theorem does not necessarily hold for arbitrary coin systems. For example, consider a system of denominations (12, 5, 1) instead of (200, 100, 25, 10, 5, 1). Then the greedy algorithm provides the solution 15 = 1*12 + 0*5 + 3*1, using a total of four coins, but there is a better solution: 15 = 3*5, using only three coins.

Moral of the story: the greedy algorithm doesn't always give the best solution.

How can you tell whether the greedy algorithm always gives the optimal solution for a given coin system? It turns out that, for the change-making problem, this can be decided efficiently (in polynomial time). See D. Pearson, A polynomial-time algorithm for the change-making problem, Operations Research Letters 33(3), 2005, 231-234.

Scheduling competing activities

Ref.: CLRS, Section 16.1

We have n "activities". Activity i has a start time s(i) and a finish time f(i); activities take up the half-open interval [s(i), f(i)). We say activity i is compatible with activity j if s(i) ≥ f(j) or s(j) ≥ f(i), that is, if their intervals do not overlap.

The activity scheduling problem: produce a schedule containing the maximum number of mutually compatible activities. This problem is also called "activity selection" or "interval scheduling".
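To make the compatibility definition and the problem statement concrete, here is a minimal Python sketch. It assumes each activity is represented as a (start, finish) pair; the function names compatible, is_schedule, and max_schedule_bruteforce are hypothetical and chosen only for illustration. The brute-force search takes exponential time, so it is only a reference for checking candidate strategies on small inputs, not a practical algorithm.

```python
from itertools import combinations

def compatible(a, b):
    """True if the half-open intervals a = (s, f) and b = (s, f) do not overlap."""
    return a[0] >= b[1] or b[0] >= a[1]

def is_schedule(activities):
    """True if every pair of activities in the collection is compatible."""
    return all(compatible(a, b) for a, b in combinations(activities, 2))

def max_schedule_bruteforce(activities):
    """Largest pairwise-compatible subset, found by trying all subsets.

    Exponential time: intended only as a reference answer on tiny inputs.
    """
    for k in range(len(activities), 0, -1):
        for subset in combinations(activities, k):
            if is_schedule(subset):
                return list(subset)
    return []
```

Because the intervals are half-open, an activity that starts exactly when another finishes is compatible with it: compatible((1, 4), (4, 6)) is true, while compatible((1, 4), (3, 5)) is false.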
Attempts …

This problem has lots of possible "greedy" strategies, e.g.

1. Select the activity that starts earliest and is compatible with previous choices.
2. Select the activity of shortest duration from the remaining ones.
3. Select the activity with the smallest number of conflicts.

None of these always produces an optimal schedule. Can you find counterexamples?

A greedy algorithm that works

First, sort the activities by finish time. Then, starting with the first, repeatedly choose the next activity that is compatible with all those already chosen.

For example, suppose the activity numbers and start and finish times are as follows:

    act. #   start   finish
       1        1       4
       2        3       5
       3        0       6
       4        5       7
       5        3       8
       6        5       9
       7        6      10
       8        8      11
       9        8      12
      10        2      13
      11       12      14

The greedy algorithm first chooses 1; then 2 and 3 are out, so 4 is chosen; then 5, 6, and 7 are not compatible with {1, 4}, so 8 is chosen; then 9 and 10 are ruled out, and 11 is chosen. Final schedule: 1, 4, 8, 11.

Theorem. The greedy algorithm always produces a schedule with the maximum number of compatible activities.

Proof: We illustrate the argument on the example; the general case is the same. Suppose the greedy schedule is, as above, (1, 4, 8, 11), and there exists a schedule with more activities, say (a1, a2, a3, a4, a5), listed in order of finish time. We show how to modify this (supposed) longer schedule to get one that (essentially) coincides with the greedy one, while retaining the same number of activities.

I claim (1, a2, a3, a4, a5) is a valid schedule. For activity a1 either equals 1 or finishes no earlier than 1, because 1 has the earliest finish time of all activities. Since the activities a2, a3, a4, a5 all begin no earlier than the finish time of a1, they begin no earlier than the finish time of 1.

Now I claim (1, 4, a3, a4, a5) is a valid schedule. For a2 is compatible with 1, and 4 was chosen greedily as the activity with the earliest finish time among those compatible with 1, so a2 finishes no earlier than 4. Hence 4 is compatible with a3, a4, a5.

Continuing in this fashion, we see that (1, 4, 8, 11, a5) is a valid schedule. But this is impossible, since if a5 were compatible with the others, the greedy algorithm would have chosen a further activity after activity 11. This completes the proof.

Example 3: The knapsack problem

In the knapsack problem, we are given a bunch of items numbered from 1 to n.
Each item i has a weight in kilos, say w_i, and a value in dollars, v_i. For simplicity, let's assume all these quantities are integers. Your knapsack has a maximum weight it can hold, say W. Which items should we choose to maximize the total value we can hold in our knapsack?

Knapsack example

Suppose W = 100 and suppose there are 6 items with values and weights as follows:

    item #   value   weight   value/weight
       1       80      25         3.20
       2       70      40         1.75
       3       85      70         1.21
       4       40      15         2.67
       5       75      20         3.75
       6       65       5        13.00

You can think of many possible greedy strategies:

1. Choose the most valuable item first.
2. Choose the heaviest item first.
3. Choose the lightest item first.
4. Choose the item with the highest ratio of value to weight first.

None of them works.

Greedy strategies do not work for this knapsack problem

The best solution is to take items 1, 2, 5, and 6, for a total weight of 90 and a total value of 290.

In each strategy below, items that no longer fit are skipped.

Strategy 1 (most valuable first) chooses 3, then 1, then 6, for a total weight of 100 and a total value of 230.

Strategy 2 (heaviest first) chooses 3, then 1, then 6, which is the same as strategy 1.

Strategy 3 (lightest first) chooses 6, then 4, then 5, then 1, for a total weight of 65 and a total value of 260.

Strategy 4 (highest value/weight first) chooses 6, 5, 1, 4, which is the same as strategy 3, but in a different order.

Now let's change our problem so that we are allowed to take fractional amounts of each item. Instead of each item representing a single physical object, it represents a sack of some substance that can be arbitrarily divided, such as a sack of sugar, salt, or gold dust. This is called the fractional knapsack problem.

Now there is a greedy strategy that works. Not surprisingly, it involves considering the items in decreasing order of the ratio value/weight, and taking as much of each item in turn as will fit in our knapsack.

So for the example above we would take all of item 6, all of item 5, all of item 1, and all of item 4. At this point we have used up 65 kilos and there are 35 left.
Item 2 has the next highest ratio, and we take 35/40 of it (since it weighs 40). This fraction gives us a value of 70 * 35/40 = 61.25, so our final result weighs 100 kilos and is worth 260 + 61.25 = 321.25.

Here is the algorithm:

    greedy(n, v[1..n], w[1..n], W)
    /* n is the number of items,
       v is the array of values of the items,
       w is the array of weights of the items,
       W is the knapsack capacity */
        sort the items (v and w together) in decreasing order of the ratio v[i]/w[i];
        free := W;
        sum := 0;
        for i := 1 to n do
            x := min(w[i], free);
            sum := sum + v[i]*(x/w[i]);
            free := free - x;
        return(sum);

Why does it work?

Suppose the optimal solution S is better than our greedy solution G. Here we assume the items have been sorted in decreasing order of value/weight, and, to make life simpler, that the ratios v[i]/w[i] are all distinct. Then S must agree with G for some number of items (perhaps 0) and then differ.

Consider the first place where S differs from G, say when G chooses a certain amount of item k and S chooses a different amount. If S chose more of item k, then taking more would have been possible without exceeding W, so G would have done that too. So S must choose less of item k.

Say G chooses g kilos of item k and S chooses s < g kilos of item k. Then S leaves room for at least g - s more kilos of item k, room that it either fills with later items or leaves unused. We can therefore replace g - s kilos of later items (or of unused capacity) in S with g - s kilos of item k. Doing so would strictly increase the total value of S, since v[k]/w[k] > v[l]/w[l] for l > k. So S was not optimal after all, a contradiction.
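The pseudocode above can be turned into a short runnable sketch. The version below is one possible Python rendering, not the definitive implementation: the function name fractional_knapsack and the extra returned list of per-item amounts are assumptions added for illustration, and it assumes integer values and positive integer weights (as in the lecture) so that exact rational arithmetic from the standard fractions module can be used instead of floating point.

```python
from fractions import Fraction

def fractional_knapsack(values, weights, capacity):
    """Ratio-greedy strategy for the fractional knapsack problem.

    Returns (total value, kilos taken of each item), both as exact Fractions.
    Assumes integer values and positive integer weights.
    """
    # Item indices in decreasing order of value/weight.
    order = sorted(range(len(values)),
                   key=lambda i: Fraction(values[i], weights[i]),
                   reverse=True)
    free = capacity                      # remaining capacity in kilos
    total = Fraction(0)
    taken = [Fraction(0)] * len(values)
    for i in order:
        if free == 0:
            break                        # knapsack is full
        x = min(weights[i], free)        # kilos of item i that still fit
        taken[i] = Fraction(x)
        total += Fraction(values[i]) * x / weights[i]
        free -= x
    return total, taken

# The six-item example from the lecture, with W = 100.
values = [80, 70, 85, 40, 75, 65]
weights = [25, 40, 70, 15, 20, 5]
total, taken = fractional_knapsack(values, weights, 100)
print(float(total))  # 321.25
```

On the lecture's example this takes all of items 6, 5, 1, and 4, then 35 of the 40 kilos of item 2, matching the hand computation. Sorting indices rather than the arrays themselves keeps the original item numbering available in the returned list.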