Approximation Algorithms
• Motivation and Definitions
• TSP
• Vertex Cover
• Scheduling

Motivation
• Consider an NP-hard optimization problem like TSP
  – Input: n cities, pairwise distances d(i,j)
  – Task: find a tour of minimum length
• Since the problem is NP-hard, we are unlikely to find an optimal solution in polynomial time.
• What are our options?
  – Give up
  – Try to build the fastest possible algorithm that returns an optimal solution
  – Find a polynomial-time algorithm that returns good solutions that are “approximately” optimal.
• The last option is the one we focus on today.

Definitions
• For any input instance I,
  – let OPT(I) denote both the optimal solution and the value of the optimal solution
  – let A(I) denote both the approximate solution generated by running polynomial-time algorithm A and the value of that solution.
• So what do we mean by a good solution that is “approximately” optimal (for a minimization problem)?
• Additive approximation
  – For any input I, A(I) ≤ OPT(I) + c for some constant c.
• Multiplicative approximation
  – For any input I, A(I) ≤ c · OPT(I) for some constant c.

Are both approximations possible?
• Consider TSP
  – Suppose c = 3.
  – Additive: if OPT(I) = 10, then A(I) ≤ 13. If OPT(I2) = 100, then A(I2) ≤ 103.
  – Multiplicative: if OPT(I) = 10, then A(I) ≤ 30. If OPT(I2) = 100, then A(I2) ≤ 300.
  – Argue that at least one of these approximations is impossible unless we can solve TSP optimally in polynomial time.
  – Hint: think about scaling the integers in your input.
  – Is the other approximation possible for TSP?
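To make the scaling hint concrete, here is a small Python sketch. The 4-city distance matrix and the helper names (tour_length, all_tour_lengths, scale) are made up for illustration, and the brute-force enumeration is only viable for tiny n. It lists every distinct tour length before and after multiplying all distances by c + 1.

```python
from itertools import permutations

def tour_length(order, d):
    """Length of the closed tour that visits the cities in the given order."""
    n = len(order)
    return sum(d[order[i]][order[(i + 1) % n]] for i in range(n))

def all_tour_lengths(d):
    """Brute force: distinct lengths of all tours starting at city 0."""
    n = len(d)
    return sorted({tour_length((0,) + p, d) for p in permutations(range(1, n))})

def scale(d, k):
    """Multiply every distance by the integer k."""
    return [[k * x for x in row] for row in d]

# Hypothetical instance with integer distances (an assumption, not from the slides).
d = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
c = 3
lengths = all_tour_lengths(d)
scaled_lengths = all_tour_lengths(scale(d, c + 1))
print(lengths, scaled_lengths)
```

After scaling, every tour length is a multiple of c + 1, so two tours of different length differ by more than c. That observation is the starting point for the argument the slide asks for.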
Definitions II
• Because scaling typically rules out additive approximation, we aim for multiplicative approximation
• For a minimization problem, an algorithm A is a c-approximation algorithm if for all inputs I,
  – A(I) ≤ c · OPT(I)
• For a maximization problem, an algorithm A is a c-approximation algorithm if for all inputs I,
  – A(I) ≥ (1/c) · OPT(I)
• Approximation goal: for an NP-hard optimization problem, find a polynomial-time algorithm that is a c-approximation algorithm for the smallest c possible.

Example Problems
• TSP
  – general TSP
  – metric TSP
• Vertex cover
• Scheduling

TSP
• We showed earlier there can be no additive approximation algorithm unless P=NP
• Show now that for any constant c, there can be no c-approximation algorithm for TSP unless P=NP
• Hint: show that if there is a c-approximation algorithm for some constant c, then Hamiltonian circuit can be solved in polynomial time

Metric TSP
• In the metric TSP, the city distances must satisfy the triangle inequality.
  – For any 3 cities i, j, k, it must be the case that d(i,j) ≤ d(i,k) + d(k,j)
• Observe that the instances built in our previous argument violate this triangle inequality
• Show how to come up with a c-approximation algorithm for metric TSP based on a minimum spanning tree
• What value of c can you come up with?

Christofides Improvement
• Start with MST T as before
• Identify the nodes with odd degree in T
• Find a minimum-weight perfect matching M on these nodes
• Now compute an Euler tour of the graph T ∪ M (with shortcuts to prevent visiting a node twice)
• This solution is guaranteed to have length at most 3/2 times the length of an optimal tour.
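The MST-based idea for metric TSP can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes a symmetric distance matrix that satisfies the triangle inequality, uses a simple O(n²) version of Prim's algorithm, and the names mst_tour and tour_length as well as the sample points (with Manhattan distances) are hypothetical.

```python
def mst_tour(d):
    """Tour obtained from a preorder walk of a minimum spanning tree."""
    n = len(d)
    in_tree = [False] * n
    best = [float("inf")] * n      # cheapest known edge from the tree to each city
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0
    for _ in range(n):
        # Prim's step: add the cheapest city that is not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d[u][v] < best[v]:
                best[v] = d[u][v]
                parent[v] = u
    # Preorder walk of the tree: keeping only the first visit to each city
    # is the "shortcut" version of walking around the tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(tour, d):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))

# Hypothetical cities in the plane; Manhattan distance is a metric.
points = [(0, 0), (0, 2), (3, 2), (3, 0), (1, 1)]
d = [[abs(x1 - x2) + abs(y1 - y2) for (x2, y2) in points] for (x1, y1) in points]
tour = mst_tour(d)
print(tour, tour_length(tour, d))
```

Comparing the printed tour length with the total weight of the spanning tree on a few small instances is a useful way to guess the value of c before proving it.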
Vertex Cover
• Input: Graph G = (V,E)
• Task: Find a subset C of V of minimum size such that for each edge (u,v) in E, either u is in C or v is in C
• Greedy strategy (repeatedly pick a node of maximum degree): the approximation ratio can be as bad as Θ(lg n)
  – Hint: make a bipartite graph
  – Make x nodes in one set all have the same max degree
  – Make y nodes in the other set have varying degrees
  – Greedy picks all the y nodes (with bad tie-breaking), optimal picks all the x nodes

Vertex Cover
• Better solution
  – Find a maximal matching M in G
    • Matching: set of edges in G that do not share any common vertices
    • Maximal matching: no edge can be added to the matching to produce a larger matching
  – Return as C all the endpoints of the edges in M
• What approximation ratio does this guarantee and why?

Scheduling
• Input:
  – Set of m identical machines
  – Set of n jobs, where job i has length p_i
• Task: Assign the n jobs to the m machines with the goal of minimizing the maximum total length of jobs assigned to any one machine
• Greedy strategy
  – Take the jobs one by one in any order
  – Assign the current job to the currently least loaded machine
  – What approximation bound can be derived?
  – What if we sort the jobs first? Should we do longest or shortest first?

Scheduling
• What approximation bound can be derived for the greedy strategy above?
  – Lower bounds on OPT
    • What bounds can we derive on the best schedule?
    • How can we relate this algorithm’s schedule to OPT’s?
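The matching-based vertex cover and the greedy scheduling strategy are both short enough to sketch in Python. The function names (matching_vertex_cover, greedy_schedule) and the sample inputs are hypothetical; the matching is grown by scanning the edges in the given order, which is one simple way to obtain a maximal matching, and the least loaded machine is tracked with a heap.

```python
import heapq

def matching_vertex_cover(edges):
    """Both endpoints of every edge in a greedily grown maximal matching."""
    cover = set()
    for u, v in edges:
        # The edge joins the matching only if neither endpoint is matched yet.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def greedy_schedule(lengths, m):
    """Assign each job, in the given order, to the currently least loaded machine."""
    heap = [(0, machine) for machine in range(m)]   # (load, machine id)
    assignment = [[] for _ in range(m)]
    for job, p in enumerate(lengths):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(job)
        heapq.heappush(heap, (load + p, machine))
    makespan = max(load for load, _ in heap)
    return makespan, assignment

# Hypothetical inputs, chosen only for illustration.
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
cover = matching_vertex_cover(edges)

jobs = [2, 3, 4, 6, 2, 2]
makespan_any_order, _ = greedy_schedule(jobs, 3)
makespan_longest_first, _ = greedy_schedule(sorted(jobs, reverse=True), 3)
print(cover, makespan_any_order, makespan_longest_first)
```

Running both orderings on a few job lists gives a feel for the sorting question, and comparing the makespan with the average machine load and with the longest single job suggests which lower bounds on OPT are worth trying.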