Set Cover
First-year M.S. student, Computer Science and Information Engineering, 9562635, 簡裕峰

Problem 2.1 (Set Cover) Given a universe U of n elements, a collection of subsets of U, S = {S1,…,Sk}, and a cost function c : S -> Q+, find a minimum cost subcollection of S that covers all elements of U.

Set Cover - ex
Unit
– U = {a, b, c, d, e}
– S = {S1, S2, S3, S4}
– S1 = {a, b, c}
– S2 = {b, c, d}
– S3 = {d, e}
– S4 = {a, c}
Weighted
– U = {a, b, c, d, e}
– S = {S1, S2, S3, S4}
– S1 = {a, b, c}, c(S1) = 5
– S2 = {b, c, d}, c(S2) = 2
– S3 = {d, e}, c(S3) = 3
– S4 = {a, c}, c(S4) = 2

Define the frequency of an element to be the number of sets it is in. Let f denote the frequency of the most frequent element. The various approximation algorithms for set cover achieve one of two factors:
– O(log n)
– f

Set Cover & Vertex Cover
When f = 2: (Ex2.7)
– U = {a, b, c, d, e}
– S1 = {a, b}
– S2 = {a}
– S3 = {d, e}
– S4 = {c, e}
– S5 = {b, c, d}
[Figure: graph whose vertices are the sets S1,…,S5 and whose edges are the elements a,…,e; each edge joins the sets that contain that element]
Factor 2 approximation algorithm in Chapter 1

When f = 2: (Ex2.7)
– U = {a, b, c, d, e, f}
– S1 = {a, b}
– S2 = {a, f}
– S3 = {d, e}
– S4 = {c, e}
– S5 = {b, c, d}
[Figure: the same graph on the vertices S1,…,S5, with the additional element f]
Factor 2 approximation algorithm in Chapter 1

2.1 The Greedy algorithm
Algorithm 2.2 Greedy set cover algorithm
Iteratively pick the most cost-effective set and remove the covered elements, until all elements are covered.
Here the cost-effectiveness of a set S is c(S) / |S - C|, where C is the set of elements already covered, and the price of an element is the cost-effectiveness of the set that first covers it.

Example: run the algorithm on the unit and weighted instances given in "Set Cover - ex" above.

Number the elements of U in the order in which the algorithm covers them, breaking ties arbitrarily. Let e1,…,en be this numbering.

Lemma 2.3
For each k ∈ {1,...,n}, $\mathrm{price}(e_k) \le \frac{OPT}{n-k+1}$.
Proof: In any iteration, the sets of the optimal solution can cover the remaining elements $\overline{C} = U - C$ at a cost of at most OPT, so some set has cost-effectiveness at most $OPT / |\overline{C}|$. When $e_k$ is covered, $\overline{C}$ contains at least n-k+1 elements, hence
$\mathrm{price}(e_k) \le \frac{OPT}{|\overline{C}|} \le \frac{OPT}{n-k+1}$.

Theorem 2.4
The greedy algorithm is an Hn factor approximation algorithm for the minimum set cover problem, where Hn = 1 + 1/2 + … + 1/n.
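Algorithm 2.2 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes that the cost-effectiveness of a set means its cost divided by the number of still-uncovered elements it contains, and that ties are broken by the order in which the sets are listed. The function name `greedy_set_cover` and the returned `price` dictionary are hypothetical names chosen for this sketch. It is run on the unit and weighted instances from the example slide.

```python
from fractions import Fraction


def greedy_set_cover(universe, sets, cost):
    """Greedy set cover sketch.

    Returns the names of the picked sets (in pick order) and the price
    assigned to each element, i.e. the cost-effectiveness of the set
    that first covered it.
    """
    uncovered = set(universe)
    picked, price = [], {}
    while uncovered:
        # Only sets that still cover something new are candidates.
        useful = [name for name in sets if sets[name] & uncovered]
        if not useful:
            raise ValueError("the given sets do not cover the universe")
        # Most cost-effective set: smallest cost per newly covered element.
        best = min(
            useful,
            key=lambda name: Fraction(cost[name], len(sets[name] & uncovered)),
        )
        newly = sets[best] & uncovered
        for e in newly:
            price[e] = Fraction(cost[best], len(newly))
        uncovered -= newly
        picked.append(best)
    return picked, price


universe = "abcde"
sets = {"S1": set("abc"), "S2": set("bcd"), "S3": set("de"), "S4": set("ac")}
weighted = {"S1": 5, "S2": 2, "S3": 3, "S4": 2}
unit = {name: 1 for name in sets}

picked_w, price_w = greedy_set_cover(universe, sets, weighted)
picked_u, price_u = greedy_set_cover(universe, sets, unit)
print(picked_w, sum(price_w.values()))
print(picked_u, sum(price_u.values()))
```

The sum of the prices equals the total cost of the picked sets, which is the quantity bounded in Theorem 2.4. Exact fractions are used so that comparisons between cost-effectiveness values are not affected by rounding.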
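Proof: The total cost = $\sum_{k=1}^{n} \mathrm{price}(e_k)$. By Lemma 2.3, this is at most $\left(1 + \frac{1}{2} + \dots + \frac{1}{n}\right) \cdot OPT = H_n \cdot OPT$.

Example 2.5
The following is a tight example for Algorithm 2.2.
[Figure: n singleton sets of costs 1/n, 1/(n-1), …, 1, and one set containing all n elements of cost 1+ε]
The greedy algorithm picks the n singleton sets, at total cost Hn, while the optimal cover is the single large set of cost 1+ε.

2.2 Layering
Let w : V -> Q+ be the function assigning weights to the vertices of the given graph G = (V,E).
The function w is degree-weighted if there is a constant c > 0 such that the weight of each vertex v ∈ V is c · deg(v).

Degree-weighted example
– w(S1) = 2c
– w(S2) = c
– w(S3) = 2c
– w(S4) = 2c
– w(S5) = 3c
[Figure: graph on the vertices S1,…,S5 with edges a,…,e]

Lemma 2.6
Let w : V -> Q+ be a degree-weighted function. Then w(V) ≤ 2·OPT.
Proof: Every edge has an endpoint in an optimal vertex cover, so the degrees of the vertices in that cover sum to at least |E| and OPT ≥ c|E|. Since the degrees of all vertices sum to 2|E|, w(V) = 2c|E| ≤ 2·OPT.

Let us define the largest degree-weighted function in w as follows:
– Remove all degree zero vertices from the graph.
– Over the remaining vertices, compute c = min{ w(v) / deg(v) }.
– t(v) = c · deg(v) is the desired function.
– w'(v) = w(v) – t(v) is the residual weight function.

The layer algorithm repeats this step on graphs G0, G1, …, Gk. In layer i, Di is the set of degree zero vertices removed from Gi and Wi is the set of vertices whose residual weight becomes zero; Gi+1 is obtained from Gi by removing Di and Wi. The process stops at Gk, in which all remaining vertices have degree zero. Then
$C = W_0 \cup \dots \cup W_{k-1}$, $V - C = D_0 \cup \dots \cup D_k$.
[Figure: the nested layers G0, G1, …, Gk-1, Gk with the removed sets D0, D1, …, Dk and W0, W1, …, Wk-1]
[Figure: worked example of the layer algorithm on a vertex-weighted graph, showing the weights and residual weights layer by layer]

Theorem 2.7
The layer algorithm achieves an approximation guarantee of factor 2 for the vertex cover problem, assuming arbitrary vertex weights.
Proof:
1. The chosen vertices form a vertex cover.
2. w(C) ≤ 2·OPT. Let C* be an optimal vertex cover.
If $v \in W_j$, then $w(v) = \sum_{i \le j} t_i(v)$.
If $v \in V - C$ with $v \in D_j$, then $w(v) \ge \sum_{i < j} t_i(v)$.
In each layer i, $C^* \cap G_i$ is a vertex cover for $G_i$, so by Lemma 2.6, $t_i(C \cap G_i) \le 2\, t_i(C^* \cap G_i)$. Therefore
$w(C) = \sum_{i=0}^{k-1} t_i(C \cap G_i) \le 2 \sum_{i=0}^{k-1} t_i(C^* \cap G_i) \le 2\, w(C^*)$.

2.3 Application to shortest superstring
Problem 2.9
Given a finite alphabet Σ, and a set of n strings, S = {s1,…,sn} ⊆ Σ+, find a shortest string s that contains each si as a substring.
Without loss of generality, we may assume that no string si is a substring of another string sj, i ≠ j.

Algorithm 2.10
1. Use the greedy set cover algorithm to find a cover for the instance S.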
Let set(π1),…, set(πk) be the sets picked by this cover.
2. Concatenate the strings π1,…,πk, in any order.
3. Output the resulting string, say s.

Lemma 2.11
OPT ≤ OPTs ≤ 2·OPT
OPT: the length of the shortest superstring
OPTs: the cost of an optimal solution to the set cover instance S

[Figure: the shortest superstring s with the occurrences of sb1, se1, sb2, se2, sb3, se3 marked and grouped into the strings π1, π2, π3]

Theorem 2.12
This algorithm is a 2Hn factor algorithm for the shortest superstring problem, where n is the number of strings in the given instance.
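Algorithm 2.10 can be sketched in Python as follows. The slides do not spell out the set cover instance, so the sketch assumes the following construction: the candidate strings π are the input strings themselves plus every string obtained by overlapping two distinct input strings on k > 0 matching symbols; set(π) is the set of input strings that are substrings of π; and the cost of set(π) is the length of π. The function names `overlaps` and `superstring_via_set_cover` are hypothetical, and the input is assumed to consist of distinct non-empty strings, none a substring of another.

```python
from fractions import Fraction


def overlaps(s, t):
    """Yield s merged with t for each k > 0 such that the last k
    symbols of s equal the first k symbols of t."""
    for k in range(1, min(len(s), len(t))):
        if s[-k:] == t[:k]:
            yield s + t[k:]


def superstring_via_set_cover(strings):
    # Candidate strings pi (assumed construction, see the lead-in).
    candidates = set(strings)
    for s in strings:
        for t in strings:
            if s != t:
                candidates.update(overlaps(s, t))

    # set(pi): the input strings contained in pi; cost(pi) = len(pi).
    # Sorting only makes tie-breaking deterministic.
    cover_of = {pi: {s for s in strings if s in pi} for pi in sorted(candidates)}

    # Step 1: greedy set cover over the candidate strings.
    uncovered, chosen = set(strings), []
    while uncovered:
        useful = [pi for pi in cover_of if cover_of[pi] & uncovered]
        best = min(
            useful,
            key=lambda pi: Fraction(len(pi), len(cover_of[pi] & uncovered)),
        )
        chosen.append(best)
        uncovered -= cover_of[best]

    # Steps 2 and 3: concatenate the chosen strings (any order works).
    return "".join(chosen)


strings = ["abc", "bcd", "cde"]
result = superstring_via_set_cover(strings)
print(result)
```

On this small input the overlap of "abc" and "cde" on the symbol "c" gives the candidate "abcde", which covers all three strings at cost 5 and is picked first, so the output is a single string rather than a concatenation.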