Greedy Algorithms

A greedy algorithm can be viewed as a simplified version of dynamic programming. Like dynamic programming, it is used for optimization problems, i.e. to find an optimal solution.

Main idea: when presented with a choice, make the choice that looks best at the moment, hoping it will lead to a globally optimal solution.

Note: a greedy algorithm is not guaranteed to produce the optimal solution unless this has been proven mathematically for the specific problem. For example, Huffman coding is a greedy algorithm that does produce the optimal solution.

Practice: what is the main difference between the dynamic programming approach and the greedy approach?

Example problem that can be solved using either dynamic programming or the greedy approach:

Scheduling a room

Assume that various activities compete for one room (for example, classes in room POST 126). Schedule the room so that the maximum number of non-overlapping activities can take place (i.e. classes that do not overlap).

Note: this is very similar to the matrix chain problem or the assembly line problem: find the shortest “path” and all the “nodes” that are on that “path.” In this case, find the maximum number of non-overlapping activities and then find the activities themselves. (The problem can also be changed to find the activities that produce maximum revenue, or maximum time scheduled, or whatever optimization criterion is desired.)

Notation:
a_i: activity i, which takes place in the time interval [s_i, f_i).
s_i: the start time.
f_i: the finish time, with 0 ≤ s_i < f_i < ∞.
S_ij: the set of all activities that are compatible with (i.e. do not overlap) both a_i and a_j, that is, activities that start no earlier than a_i finishes and finish no later than a_j starts.

To solve the problem, we need to find a maximum-size subset of mutually compatible activities in S_ij.

Example:

i     1   2   3   4   5   6   7   8   9   10  11
s_i   1   3   0   5   3   5   6   8   8   2   12
f_i   4   5   6   7   8   9   10  11  12  13  14

So, activity 1 starts at 1 o’clock and ends at 4, activity 2 starts at 3 and ends at 5, etc. There are several possible sets of non-overlapping activities. Which one has the most activities in it?
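For an instance this small, the question can be checked by brute force. The following is a minimal Python sketch and is not part of the original notes; the names activities, compatible, and largest_compatible_sets are chosen here for illustration. It tries subsets from largest to smallest and keeps those whose activities are pairwise non-overlapping, treating each activity as the half-open interval [s_i, f_i).

```python
from itertools import combinations

# Example instance from the table: activity number -> (start, finish).
activities = {
    1: (1, 4), 2: (3, 5), 3: (0, 6), 4: (5, 7), 5: (3, 8), 6: (5, 9),
    7: (6, 10), 8: (8, 11), 9: (8, 12), 10: (2, 13), 11: (12, 14),
}

def compatible(a, b):
    """True if the half-open intervals [s, f) of a and b do not overlap."""
    (s1, f1), (s2, f2) = a, b
    return f1 <= s2 or f2 <= s1

def largest_compatible_sets(acts):
    """Return every maximum-size set of pairwise compatible activities."""
    ids = sorted(acts)
    # Try the largest subset sizes first and stop at the first size that works.
    for size in range(len(ids), 0, -1):
        found = [
            combo for combo in combinations(ids, size)
            if all(compatible(acts[x], acts[y]) for x, y in combinations(combo, 2))
        ]
        if found:
            return found
    return []

best = largest_compatible_sets(activities)
print(len(best[0]), best)
```

For the table above this reports a maximum of 4 activities, reached by more than one set (for example activities 1, 4, 8, 11). Brute force takes exponential time in the number of activities, so it is only useful as a check on small inputs.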
Solution using dynamic programming

Introduce two fictitious activities a_0 and a_{n+1} in order to have both ends covered: f_0 = 0, s_{n+1} = ∞. Assume that the activities are sorted according to finish time: f_0 ≤ f_1 ≤ f_2 ≤ … ≤ f_{n+1}.

Step 1: assume that the optimal solution for S_ij has activity a_k in it, where f_i ≤ s_k < f_k ≤ s_j. So now we have two subproblems, S_ik and S_kj. Optimal solution: A_ij = A_ik U {a_k} U A_kj, where i ≥ 0, j ≤ n+1, i < j.

Step 2: recursive solution. Let c[i, j] be the maximum number of mutually compatible activities in S_ij, the set of all activities compatible with both a_i and a_j. If a_k is part of the optimal solution, then c[i, j] = c[i, k] + 1 + c[k, j]. Try every possible k to find the maximum number of activities.

Dynamic programming formula, for 0 ≤ i ≤ n and 1 ≤ j ≤ n+1:

c[i, j] = 0                                 if S_ij is an empty set, e.g. i ≥ j
c[i, j] = max { c[i, k] + c[k, j] + 1 }     otherwise, where the max is taken over all k with i < k < j and a_k in S_ij

Exercise: write pseudocode to implement this formula in recursive, memoized, and iterative ways.

Greedy Solution (proven to be optimal)

Initial call: RecursiveActivitySelector(s, f, 0, n+1)

//Input:
//  arrays s and f which describe activities 1, …, n
//  activities i and j (a_0 and a_{n+1} are the fictitious activities introduced above)
//Assumptions: all activities are sorted in non-decreasing order based on their finish time
//Output: a largest set of mutually compatible activities in S_ij
//High level pseudocode: keep on adding the next compatible activity
RecursiveActivitySelector(s, f, i, j) {
    m = i+1
    //notice: activity m must finish before activity j
    while m < j and s_m < f_i    //activity m is not compatible with activity i, it starts too early
        m = m+1
    if m < j    //found the first activity compatible with activities i and j
        return {a_m} U RecursiveActivitySelector(s, f, m, j)
    else
        return EmptySet
}

Exercise: convert this pseudocode into an iterative version.

Note that this code looks very much like the code for searching a list: keep going forward through the sequence until you find what you are looking for.
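The two approaches can be compared on the example instance. The following Python sketch is one possible way to write them, not the notes' own code: dp_max_activities is a memoized version of the c[i, j] recurrence with the sentinels a_0 and a_{n+1}, and greedy_select is an iterative form of the selector. The function names and the use of functools.lru_cache are assumptions made here for illustration, and the input arrays are assumed to be sorted by finish time already.

```python
from functools import lru_cache

# Example instance, already sorted by finish time (activities 1..11).
s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

def dp_max_activities(s, f):
    """Memoized c[i, j] recurrence; returns the size of an optimal solution."""
    n = len(s)
    inf = float("inf")
    # Positions 0 and n+1 hold the sentinels: f_0 = 0 and s_{n+1} = infinity.
    # S[0] and F[n+1] are placeholders that the recurrence never reads.
    S = [0] + list(s) + [inf]
    F = [0] + list(f) + [inf]

    @lru_cache(maxsize=None)
    def c(i, j):
        best = 0  # stays 0 when S_ij is empty
        for k in range(i + 1, j):
            if F[i] <= S[k] and F[k] <= S[j]:  # a_k belongs to S_ij
                best = max(best, c(i, k) + c(k, j) + 1)
        return best

    return c(0, n + 1)

def greedy_select(s, f):
    """Iterative greedy: one forward scan, keeping each activity that
    starts no earlier than the finish time of the last activity kept."""
    chosen = []
    last_finish = 0  # plays the role of f_0 = 0
    for m in range(len(s)):
        if s[m] >= last_finish:
            chosen.append(m + 1)  # report 1-based activity numbers
            last_finish = f[m]
    return chosen

selected = greedy_select(s, f)
print(selected, dp_max_activities(s, f))
```

On the example data the greedy scan selects activities 1, 4, 8, 11, and the dynamic programming recurrence confirms that 4 is the maximum. The greedy version makes a single pass over the sorted activities, while the memoized recurrence examines every pair (i, j) and every k between them.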