The Matroid Median Problem
Viswanath Nagarajan, IBM Research
Joint with R. Krishnaswamy, A. Kumar, Y. Sabharwal, B. Saha

k-Median Problem
- Set of locations in a metric space $(V,d)$
  - Symmetric, triangle inequality
- Place $k$ facilities such that the sum of connection costs (to the nearest facility) is minimized:
  $$\min_{F \subseteq V,\ |F| \le k} \ \sum_{u \in V} d(u,F)$$

k-Median Results
- poly(log n)-approximation via tree embeddings [B ’96]
- LP rounding, O(1)-approximation [CGST ’99]
- Lagrangian relaxation + primal dual [JV ’01]
- Local search with p-exchanges [AGKMMP ’04]: best known ratio $3+\epsilon$
- Hardness of approximation $\approx 1.46$ [GK ’98]

Red-Blue Median
- Facilities are of two different types
- Separate bounds $k_r$ and $k_b$ on facilities
- Recently introduced [HKK ’10]
- Motivated by Content Distribution Networks
- Partition $V$ into red and blue sets
- $T$ facility-types (RB Median is $T=2$)
- O(1)-approximation ratio via Local Search
[Figure: example with $k_r = 3$, $k_b = 2$]

Matroid Median
- Given matroid $M$ on ground-set $V$
- Locate facilities $F$ that are independent in $M$
- Minimize connection cost
- Recap: matroid $M = (V, \mathcal{I} \subseteq 2^V)$
  - $A, B \in \mathcal{I}$ and $|A| < |B|$ $\Rightarrow$ $\exists\, e \in B \setminus A : A \cup \{e\} \in \mathcal{I}$
- Substantial generalization of RB Median
- The CDN application with $T$ facility-types reduces to a partition matroid constraint
[Figure: exchange property with sets A, B and element e; partition matroid with bounds $k_1=2$, $k_2=3$, $k_3=1$, $k_4=2$]

Talk Outline
- Thm: 16-approximation for Matroid Median
- Bad example for Local Search
- LP relaxation
- Phase I: sparsification
- Phase II: reformulation

Local Search?
- Partition matroid with $T$ parts
- $(T-1)$-exchange local search
  - Swap up to $T-1$ facilities in each step
- Unlikely to work beyond $T = O(1)$
- E.g. $T=5$
  - Uniform metric on $T+1$ locations
  - Clients: $n = mT+1$
  - OPT = 1 (small fac.)
  - LOPT = $m$ (big fac.)
  - Locality gap $\Omega(n/T)$
[Figure: example with $T=5$; six locations with client counts m, 1, m, m, m, m]

LP relaxation
$$\begin{aligned}
\min\ & \sum_{u}\sum_{v} d(u,v)\cdot x_{uv} \\
\text{s.t.}\ & \sum_{v} x_{uv} = 1 && \forall u \in V \\
& x_{uv} \le y_v && \forall u,v \in V \\
& \sum_{v \in S} y_v \le r(S) && \forall S \subseteq V \\
& x, y \ge 0.
\end{aligned}$$
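As a sanity check of the LP relaxation, the sketch below evaluates the objective and tests every constraint for a partition matroid, enumerating all subsets $S$ for the rank constraints. It is a brute-force illustration intended only for tiny instances. The helper names (`partition_rank`, `lp_objective`, `is_lp_feasible`), the four-point line metric and the red/blue bounds are assumptions made for this example and are not taken from the talk.

```python
from itertools import combinations


def partition_rank(S, part_of, bound):
    """Rank of S in a partition matroid: sum over parts of min(|S in part|, bound)."""
    counts = {}
    for v in S:
        counts[part_of[v]] = counts.get(part_of[v], 0) + 1
    return sum(min(c, bound[p]) for p, c in counts.items())


def lp_objective(d, x):
    """LP objective: sum over u, v of d(u,v) * x_uv."""
    n = len(d)
    return sum(d[u][v] * x[u][v] for u in range(n) for v in range(n))


def is_lp_feasible(x, y, part_of, bound, tol=1e-9):
    """Check connection, nonnegativity and (brute-force) rank constraints."""
    n = len(y)
    V = range(n)
    if any(y[v] < -tol for v in V):
        return False
    for u in V:
        if abs(sum(x[u]) - 1.0) > tol:  # sum_v x_uv = 1
            return False
        if any(x[u][v] < -tol or x[u][v] > y[v] + tol for v in V):  # 0 <= x_uv <= y_v
            return False
    # Rank constraints y(S) <= r(S) for every nonempty S (exponentially many).
    for size in range(1, n + 1):
        for S in combinations(V, size):
            if sum(y[v] for v in S) > partition_rank(S, part_of, bound) + tol:
                return False
    return True


# Hypothetical instance: four points on a line, two red and two blue,
# with at most one open facility of each colour.
pos = [0, 1, 2, 3]
d = [[abs(a - b) for b in pos] for a in pos]
part_of = ["red", "red", "blue", "blue"]
bound = {"red": 1, "blue": 1}

# Fractional solution: every facility half open, each client split over two facilities.
y_frac = [0.5, 0.5, 0.5, 0.5]
x_frac = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]

# Integral solution that opens both red facilities and so violates the rank constraint.
y_bad = [1.0, 1.0, 0.0, 0.0]
x_bad = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
```

Calling `is_lp_feasible(x_frac, y_frac, part_of, bound)` accepts the half-open solution, while the same call on `x_bad`, `y_bad` rejects it because $y(\{0,1\}) = 2$ exceeds the rank of the two red points.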
[Figure: clients $u$ and facilities $v$ joined by connection variables $x_{uv}$, with $y \in M$; the constraints on $x$ are the connection constraints, the constraints on $y(S)$ are the matroid rank constraints]

Solving the LP
- Exponential number of rank constraints
- Use separation oracle: $\min_{S \subseteq V}\ r(S) - \sum_{v \in S} y_v$
  - An instance of submodular minimization
  - Also more efficient algorithms to separate over the matroid polytope [C ’84]
- Solvable in poly-time via the Ellipsoid algorithm

Idea for approach (1)
- Problem is non-trivial even if the metric is a tree
  - Even an O(log n)-approximation is not obvious
- What's easier than a tree?
- Suppose the input is a special star-like instance
  - One root facility (can help any client)
  - Others are private facilities (help only 1 client)
[Figure: star-like instance with client 1, client 2, client 3 and a root facility]

Idea for approach (2)
- Recall LP variables
  - $y_j$: facility opening (in matroid polytope)
  - $x_{ij}$: connection
- For any client $i$ and private facility $j \in P(i)$, we may assume $x_{ij} = y_j$
- Connection constraint: $\sum_j x_{ij} = 1$
- So $x_{ir} = 1 - \sum_{j \in P(i)} x_{ij} = 1 - \sum_{j \in P(i)} y_j$
- Can eliminate all connection variables!
[Figure: client $i$, its private facilities $P(i)$, and root $r$]

Idea for approach (3)
- Reformulate the LP:
$$\begin{aligned}
\min\ & \sum_{i} \Big[ \sum_{j \in P(i)} d_{ij} \cdot y_j + d_{ir} \cdot \Big(1 - \sum_{j \in P(i)} y_j\Big) \Big] \\
\text{s.t.}\ & \sum_{j \in P(i)} y_j \le 1 && \forall \text{ clients } i \quad (\text{to ensure } x_{ir} \ge 0) \\
& y \in M && \text{(matroid constraint)}
\end{aligned}$$
  (in the objective, $y_j$ stands for $x_{ij}$ and $1 - \sum_{j \in P(i)} y_j$ stands for $x_{ir}$)
- This is just an instance of intersection of $M$ with the partition matroid from the $P(i)$s

Idea for approach (4)
- Start with LP optimum $(x,y)$ of an arbitrary matroid median instance
- Phase I: Use $(x,y)$ to form clusters of disjoint star-like instances
- Phase II: Re-solve the new star-LP
  - $(x,y)$ itself, restricted to the stars, is not integral
  - Show that the new LP is integral ($\approx$ matroid intersection)

Phase I: sparsify LP solution
Outline
- Modify LP connections $x$ in four steps
- Key: no change in facility variables $y$
- Similar to [CGST ’99]
  - Need to ensure $y$ remains in the matroid polytope
  - Not true in [CGST ’99]
  - Requires some more (technical) work

Step 1: cluster clients
- $L_u = \sum_v d_{uv} \cdot x_{uv}$, the contribution of $u$ to the LP objective
- $B(u)$ is the local ball of $u$: vertices within distance $2 \cdot L_u$ from $u$
- Order clients $u$ in increasing $L_u$
- Pick a maximal disjoint set of local balls
  - $T$ are the chosen clients
- Move each client to a $T$-client close to it
- Loss in objective $\le 4 \cdot LP^*$ (additive)
[Figure: clients numbered 1 to 6 with their local balls, before and after moving to $T$-clients]

Obs on step 1
- Local balls of $T$ clients are disjoint
- $y$-value inside any local ball $\ge \frac{1}{2}$
  - Markov inequality
- Restrict to clients $T$ (now weighted)
- For any $p, q \in T$: $d(p,q) \ge 2 \cdot (L_p + L_q)$
  - Well separated clients
[Figure: separated $T$ balls, each with $y \ge \frac{1}{2}$]

More obs on step 1
- Suppose the $y$-value in each $T$-client's local ball is $\ge 1$
- Then this is an instance of matroid intersection:
  - Matroid $M$ and the partition from local-ball($T$)
  - Re-solving a suitable LP $\Rightarrow$ integral solution
- Will need intersection with 'laminar' constraints, not just a partition matroid

Step 2: private facilities
- Ensure that each facility is in some $T$-ball or helps at most one client (i.e. is private)
- Break connections to facility $j$ from all clients except the closest one, client 1
  - Reconnect to facilities in $B(1)$, $y$-value $\ge \frac{1}{2}$
  - Total reconnection for any client $\le \frac{1}{2}$
- Constant factor loss in objective
[Figure: facility $j$ with clients 1, 2, 3]

Step 3: uniform objective
- Each connection from client $p$ to any facility in $B(q)$ will pay the same objective $d(p,q)$
- Since $p, q$ are well separated, $d(p,q) \le O(1) \cdot d(p,j)$ for any $j \in B(q)$
- Constant factor loss in objective
[Figure: clients $p$ and $q$]

Step 4: building stars
- We may assume each client $i \in T$ is connected to:
  - its private facilities $P(i)$, OR
  - its closest other client $k \in T$, i.e. a facility in $B(k)$
- Set of 'outer' connections $\approx$ directed tree
  - Unique out-edge from each client
- Lem: Can modify outer connections to a 'star'
  - Constant factor loss in objective

The star structure
- One pseudo-root $\{r, r'\}$
- Every other client is connected to either $r$ or $r'$
- All LP-connections $x$ are from client $i$ to:
  - a private facility $j \in P(i)$, objective $d(i,j)$, OR
  - a facility in $B(k)$ with $k \in \{r, r'\}$, uniform objective $d(i,k)$
[Figure: pseudo-root $r$, $r'$ and a client $i$]

Phase II: using star
- Will drop all the connection $x$-variables
- We may assume $x_{ij} = y_j$ for $j \in P(i)$ (private facilities)
- Total outer connection $= 1 - \sum_{j \in P(i)} x_{ij} = 1 - \sum_{j \in P(i)} y_j$
- Each outer connection pays the same objective $d(i,r)$
- Want the property (in an integral solution) that $P(i) = \emptyset$ $\Rightarrow$ there is a recourse connection to $r$
  - Do not quite ensure this, but…

Phase II contd.
- Add the constraint $y(P(r)) + y(P(r')) \ge 1$
- Indeed feasible for $(x,y)$, since each local ball has $y$-value $\ge \frac{1}{2}$
- This ensures (in an integral solution) that $P(i) = \emptyset$ $\Rightarrow$ there is a recourse connection to $r$ or $r'$
- Lose another constant factor in objective

Phase II: new LP
- Apply the constraints for each star to get the LP:
$$\begin{aligned}
\min\ & \sum_{i} \Big[ \sum_{j \in P(i)} d_{ij} \cdot y_j + d(i, r(i)) \cdot \Big(1 - \sum_{j \in P(i)} y_j\Big) \Big] \\
\text{s.t.}\ & \sum_{j \in P(i)} y_j \le 1 && \forall \text{ clients } i \\
& y(P(r)) + y(P(r')) \ge 1 && \forall \text{ pseudo-roots } \{r, r'\} \\
& y \in M && \text{(matroid constraint)}
\end{aligned}$$
  (the first two families are the laminar constraints)
- Lem: Integral polytope (via a proof similar to matroid intersection)

Summarize
- Using the LP solution and metric properties, reduce to star-like instances
- Formulate a new LP for star-like instances, with only facility variables
- New LP is integral

Other Results
- O(1)-approximation for the prize-collecting version of matroid median
- Knapsack Median problem (knapsack constraint on open facilities)
  - Give a bi-criteria approximation, violating the budget by $w_{\max}$
  - Can we get a true O(1)-approximation?
- Handle other constraints in k-median?

Thank You
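
Backup: Step 1 clustering as code
The clustering in Step 1 of Phase I can be sketched in a few lines of Python: compute $L_u$, form local balls of radius $2 L_u$, scan clients in increasing $L_u$, and keep a client whenever its ball is disjoint from all balls kept so far. This is only a sketch under assumptions. The function names and the toy instance are hypothetical, and the rule used for "move each client to a $T$-client close to it" (move to the nearest chosen client whose ball intersects the client's own ball) is one plausible reading of the slide, not necessarily the rule used in the talk.

```python
def local_ball(u, d, L):
    """Local ball B(u): all vertices within distance 2 * L_u of u."""
    return {v for v in range(len(d)) if d[u][v] <= 2 * L[u]}


def cluster_clients(d, x):
    """Greedy Step 1 sketch: returns (L, chosen clients T, map client -> T-client)."""
    n = len(d)
    # L_u = sum_v d_uv * x_uv, the contribution of u to the LP objective.
    L = [sum(d[u][v] * x[u][v] for v in range(n)) for u in range(n)]
    balls = [local_ball(u, d, L) for u in range(n)]
    chosen = []    # the set T, in the order clients were picked
    moved_to = {}  # client -> T-client it is moved to
    for u in sorted(range(n), key=lambda c: L[c]):  # increasing L_u
        # Already chosen clients whose local ball intersects B(u).
        blockers = [t for t in chosen if balls[t] & balls[u]]
        if not blockers:
            chosen.append(u)  # B(u) is disjoint from all chosen balls
            moved_to[u] = u
        else:
            # Assumed rule: move u to the nearest blocking T-client.
            moved_to[u] = min(blockers, key=lambda t: d[u][t])
    return L, chosen, moved_to


# Hypothetical instance: five points on a line with a made-up fractional x.
pos = [0, 1, 2, 10, 11]
d = [[abs(a - b) for b in pos] for a in pos]
x = [
    [1.0, 0.0, 0.0, 0.0, 0.0],  # L_0 = 0
    [0.5, 0.5, 0.0, 0.0, 0.0],  # L_1 = 0.5
    [0.0, 1.0, 0.0, 0.0, 0.0],  # L_2 = 1
    [0.0, 0.0, 0.0, 1.0, 0.0],  # L_3 = 0
    [0.0, 0.0, 0.0, 1.0, 0.0],  # L_4 = 1
]
L, T, moved_to = cluster_clients(d, x)
```

On this instance the chosen clients are 0 and 3; clients 1 and 2 move to client 0 and client 4 moves to client 3. Under the assumed rule, a moved client $u$ and its target $t$ have intersecting balls and $L_t \le L_u$, so $d(u,t) \le 2L_u + 2L_t \le 4 L_u$, which is consistent with the additive $4 \cdot LP^*$ loss stated for Step 1.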