COMP 482: Design and Analysis of Algorithms
Spring 2013
Lecture 7
Prof. Swarat Chaudhuri

Recap: Interval Scheduling
Interval scheduling. Job j starts at s_j and finishes at f_j. Two jobs are compatible if they don't overlap. Goal: find a maximum subset of mutually compatible jobs.
[Figure: jobs a through h drawn as intervals on a timeline from 0 to 11.]

Recap: Interval Scheduling
Algorithm. Greedy algorithm; the selection rule is Earliest Finish Time.
Argument for optimality: greedy stays ahead.
[Figure: Greedy picks i_1, i_2, …, i_r, i_{r+1}; OPT picks j_1, j_2, …, j_r, j_{r+1}. Job i_{r+1} finishes before j_{r+1}, so why not replace job j_{r+1} with job i_{r+1}?]
Also applicable to the Truck Driver's problem (selection of breakpoints).

Variant: 24-7 Interval Scheduling
You have a processor that can operate 24-7. People submit requests to run daily jobs on the processor. Each such job comes with a start time and an end time; if the job is accepted it must run continuously for the period between the start and end times, EVERY DAY. (Note that some jobs can start before midnight and end after midnight.)
Given a list of n such jobs, your goal is to accept as many jobs as possible (regardless of length), subject to the constraint that the processor can run at most one job at any given point of time. Give an algorithm to do this.
For example, here you have four jobs: (6pm, 6am), (9pm, 4am), (3am, 2pm), (1pm, 7pm). The optimal solution is to pick the second and fourth jobs.

Solution
Let I_1, …, I_n be the n intervals. We call a solution I_j-restricted if it contains the interval I_j.
Here's an algorithm, for fixed j, to compute an I_j-restricted solution of maximum size. Let x be a point in I_j. First delete I_j and all intervals that overlap it. The remaining intervals do not contain the point x, so we can cut the timeline at x and produce an instance of the Interval Scheduling Problem from class. Solving that instance and adding I_j back gives the restricted solution. This takes O(n) time assuming intervals are sorted by ending time.
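The restricted step above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from the lecture: times are hours on a 24-hour clock, a job is a pair (start, end) whose end may be smaller than its start when it runs past midnight, every job is longer than 0 and shorter than 24 hours, and jobs that only touch at an endpoint count as compatible. The names `length`, `overlaps`, and `restricted_solution` are hypothetical. The sketch also sorts inside the call, so it costs O(n log n) per j rather than the O(n) quoted for presorted intervals.

```python
DAY = 24  # assumed: times are hours on a 24-hour clock


def length(job):
    """Duration of job (start, end); end < start means it wraps past midnight."""
    start, end = job
    return (end - start) % DAY


def overlaps(a, b):
    """True if the daily jobs a and b are ever running at the same time."""
    # Two arcs on a circle intersect exactly when one starts inside the other.
    return (b[0] - a[0]) % DAY < length(a) or (a[0] - b[0]) % DAY < length(b)


def restricted_solution(jobs, j):
    """Return a maximum-size compatible set of jobs that contains jobs[j]."""
    x = jobs[j][0]  # cut point x: the start of I_j, which lies in I_j
    # Delete I_j and everything overlapping it, then re-express each survivor
    # as an ordinary interval measured in hours after x.
    linear = []
    for k, job in enumerate(jobs):
        if k != j and not overlaps(jobs[j], job):
            start = (job[0] - x) % DAY
            linear.append((start + length(job), start, job))
    # Interval Scheduling from class: earliest finish time first.
    chosen, last_end = [jobs[j]], 0
    for end, start, job in sorted(linear):
        if start >= last_end:
            chosen.append(job)
            last_end = end
    return chosen
```

With the four example jobs written as `[(18, 6), (21, 4), (3, 14), (13, 19)]`, calling `restricted_solution(jobs, 1)` keeps the second job and adds the fourth, matching the optimal solution stated in the example.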
Now, the algorithm for the full problem is to compute an I_j-restricted solution of maximum size for each j = 1, …, n. This takes a total of O(n^2) time. We then pick the largest of these solutions and claim that it is optimal. Why? Consider an optimal solution to the full problem, and let S be its set of intervals. There must be SOME I_j in S, so S is an optimal I_j-restricted solution. Our algorithm computes an I_j-restricted solution of that same maximum size, so the largest solution it finds is as large as S.

Recap: Interval Partitioning
Interval partitioning. Lecture j starts at s_j and finishes at f_j. Goal: find the minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room.
Ex: This schedule uses 4 classrooms to schedule 10 lectures.
[Figure: lectures a through j assigned to 4 classrooms on a timeline from 9:00 to 4:30.]

Recap: Interval Partitioning
Greedy algorithm. Consider lectures in increasing order of start time. Assign each lecture to any compatible existing classroom. If not possible, open a new classroom.
[Figure: lectures a through j on a timeline from 9:00 to 4:30.]

Interval Partitioning: Lower Bound on Optimal Solution
Def. The depth of a set of open intervals is the maximum number of intervals that contain any single point in time.
Key observation. Number of classrooms needed by any algorithm ≥ depth.
Argument for optimality. Show that depth ≥ number of classrooms opened by the greedy algorithm.
[Figure: lectures a through j on a timeline from 9:00 to 4:30.]

Greedy Analysis Strategies
Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's.
Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value.
Then show that your algorithm always achieves this bound.
Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality.

Q1: Coin Changing
Goal. Given currency denominations 1, 5, 10, 25, 100, devise a method to pay an amount to a customer using the fewest number of coins.
Ex: 34¢.
Cashier's algorithm. At each iteration, add a coin of the largest value that does not take us past the amount to be paid.
Ex: $2.89.

Q1: Coin-Changing: Greedy Algorithm
Cashier's algorithm. At each iteration, add a coin of the largest value that does not take us past the amount to be paid.

    Sort coin denominations by value: c_1 < c_2 < … < c_n.
    S ← ∅    (coins selected)
    while (x ≠ 0) {
        let k be the largest integer such that c_k ≤ x
        if (k = 0) return "no solution found"
        x ← x - c_k
        S ← S ∪ {k}
    }
    return S

Q1. Is the cashier's algorithm optimal?

Coin-Changing: Analysis of Greedy Algorithm
Theorem. Greed is optimal for U.S. coinage: 1, 5, 10, 25, 100.
Pf. (by induction on x)
Consider the optimal way to change x, where c_k ≤ x < c_{k+1}: greedy takes coin k.
We claim that any optimal solution must also take coin k.
  – if not, it needs enough coins of type c_1, …, c_{k-1} to add up to x
  – the table below indicates that no optimal solution can do this
The problem reduces to coin-changing x - c_k cents, which, by induction, is optimally solved by the greedy algorithm. ▪

(P, N, D, Q = number of pennies, nickels, dimes, quarters.)

    k   c_k   All optimal solutions must satisfy   Max value of coins 1, 2, …, k-1 in any OPT
    1   1     P ≤ 4                                -
    2   5     N ≤ 1                                4
    3   10    N + D ≤ 2                            4 + 5 = 9
    4   25    Q ≤ 3                                20 + 4 = 24
    5   100   no limit                             75 + 24 = 99

Coin-Changing: Analysis of Greedy Algorithm
Observation. Greedy algorithm is sub-optimal for US postal denominations: 1, 10, 21, 34, 70, 100, 350, 1225, 1500.
Counterexample. 140¢.
Greedy: 100, 34, 1, 1, 1, 1, 1, 1.
Optimal: 70, 70.

Greedy Analysis Strategies
Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's.
Structural.
Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound.
Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality.

4.2 Scheduling to Minimize Lateness

Scheduling to Minimize Lateness
Minimizing lateness problem.
Single resource processes one job at a time.
Job j requires t_j units of processing time and is due at time d_j.
If j starts at time s_j, it finishes at time f_j = s_j + t_j.
Lateness: ℓ_j = max { 0, f_j - d_j }.
Goal: schedule all jobs to minimize maximum lateness L = max_j ℓ_j.
Ex:

    j     1   2   3   4   5    6
    t_j   3   2   1   4   3    2
    d_j   6   8   9   9   14   15

[Figure: a schedule running jobs 3, 2, 6, 1, 5, 4 in that order on a timeline from 0 to 15. Job 1 has lateness 2, job 5 has lateness 0, and job 4 gives max lateness = 6.]

Minimizing Lateness: Greedy Algorithms
Greedy template. Consider jobs in some order.
[Shortest processing time first] Consider jobs in ascending order of processing time t_j.
[Earliest deadline first] Consider jobs in ascending order of deadline d_j.
[Smallest slack] Consider jobs in ascending order of slack d_j - t_j.

Minimizing Lateness: Greedy Algorithms
Greedy template. Consider jobs in some order.
[Shortest processing time first] Consider jobs in ascending order of processing time t_j.
Counterexample:

    j     1     2
    t_j   1     10
    d_j   100   10

[Smallest slack] Consider jobs in ascending order of slack d_j - t_j.
Counterexample:

    j     1   2
    t_j   1   10
    d_j   2   10

Minimizing Lateness: Greedy Algorithm
Greedy algorithm. Earliest deadline first.

    Sort n jobs by deadline so that d_1 ≤ d_2 ≤ … ≤ d_n
    t ← 0
    for j = 1 to n
        Assign job j to interval [t, t + t_j]
        s_j ← t, f_j ← t + t_j
        t ← t + t_j
    output intervals [s_j, f_j]

[Figure: the earliest-deadline-first schedule runs jobs 1 through 6 in order on a timeline from 0 to 15; max lateness = 1.]

Minimizing Lateness: No Idle Time
Observation. There exists an optimal schedule with no idle time.
[Figure: two schedules of the same three jobs (d = 4, d = 6, d = 12) on a timeline from 0 to 11, with and without idle time.]
Observation.
The greedy schedule has no idle time.

Minimizing Lateness: Inversions
Def. An inversion in schedule S is a pair of jobs i and j such that d_i < d_j but j is scheduled before i.
[Figure: an inversion before the swap, with j scheduled before i.]
Observation. Greedy schedule has no inversions.
Observation. All schedules with no idle time and no inversions have the same maximum lateness.
Observation. If a schedule (with no idle time) has an inversion, it has one with a pair of inverted jobs scheduled consecutively.

Minimizing Lateness: Inversions
[Figure: before the swap, j runs immediately before i and i finishes at f_i; after the swap, i runs immediately before j and j finishes at f'_j = f_i.]
Claim. Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness.
Pf. Let ℓ be the lateness before the swap, and let ℓ' be the lateness afterwards.
  ℓ'_k = ℓ_k for all k ≠ i, j
  ℓ'_i ≤ ℓ_i
  If job j is late:

      ℓ'_j = f'_j - d_j    (definition)
           = f_i - d_j     (j finishes at time f_i)
           ≤ f_i - d_i     (i < j, i.e. d_i < d_j)
           ≤ ℓ_i           (definition)

Minimizing Lateness: Analysis of Greedy Algorithm
Theorem. Greedy schedule S is optimal.
Pf. Define S* to be an optimal schedule that has the fewest number of inversions, and let's see what happens.
Can assume S* has no idle time.
If S* has no inversions, then S = S* (up to the order of jobs with equal deadlines, which does not change the maximum lateness).
If S* has an inversion, let i-j be an adjacent inversion.
  – swapping i and j does not increase the maximum lateness and strictly decreases the number of inversions
  – this contradicts the definition of S* ▪

Q2: Subsequences
Suppose you have a collection of possible events (e.g., possible transactions) and a sequence S of n events. A given event may occur multiple times; e.g., you could have an event "buy Google stock" multiple times in a log of transactions.
A sequence S' is a subsequence of a sequence S if there's a way to delete certain events from S such that the remaining sequence equals S'. For example, the reason to do this could be pattern matching.
Give an algorithm that takes two sequences of events, S' of length m and S of length n, and decides in time O(m+n) whether S' is a subsequence of S.

Solution
Greedy algorithm: Let the i-th event of S be S(i). Find the first event in S that matches S'(1), then the first event after it that matches S'(2), and so on. The running time is O(m+n).
It is easy to show that if the algorithm finds a match, then S' is in fact a subsequence of S.
More difficult direction: if the algorithm does not find a match, then no match exists. The proof of this is by contradiction. Suppose S' matches the subsequence S(l_1).S(l_2)…S(l_m) but greedy fails to find a match. Suppose GREEDY produces the sequence S(k_1).S(k_2)…. Show by induction on i that k_i ≤ l_i for all i, so greedy can produce a match all the way up to S(k_m), which is a contradiction. This is done in a way similar to the proof in interval scheduling.
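The greedy matching above can be sketched in Python as a single left-to-right scan of S. This is a minimal sketch, not code from the lecture: the function name `greedy_match` is hypothetical, events are assumed to be comparable with `==`, and positions are 0-based rather than the 1-based S(i) used on the slide.

```python
def greedy_match(s_prime, s):
    """Return positions k_1 < k_2 < ... in s that match s_prime, or None.

    Each event of s_prime is matched to the earliest possible event of s
    that comes after the previous match, as in the greedy rule.
    """
    positions = []
    i = 0  # index of the next event of s_prime still to be matched
    for k, event in enumerate(s):
        if i < len(s_prime) and event == s_prime[i]:
            positions.append(k)
            i += 1
    # s_prime is a subsequence exactly when every one of its events was matched.
    return positions if i == len(s_prime) else None
```

The scan touches each event of S once and advances through S' at most m times, which is where the O(m+n) bound comes from. For instance, `greedy_match(["buy G", "sell Y"], ["buy Y", "buy G", "buy G", "sell Y"])` matches the first "buy G" at position 1 and "sell Y" at position 3.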