Algorithms for submodular objectives: continuous extensions & dependent randomized rounding
Chandra Chekuri, Univ. of Illinois, Urbana-Champaign

Modern Aspects of Submodularity
Georgia Institute of Technology, March 19-22, 2012. Location: TBD.

Organizing Committee: Shabbir Ahmed (School of ISyE, GaTech), Nina Balcan (School of CS, GaTech), Satoru Iwata (RIMS, Kyoto University), and Prasad Tetali (School of Math and School of CS, GaTech).

Workshop Theme: Submodular functions are discrete analogues of convex functions, arising in various fields of computer science and operations research. Since the seminal work of Jack Edmonds (1970), submodularity has long been recognized as a common structure of many efficiently solvable combinatorial optimization problems. Algorithmic developments in the past decade include combinatorial strongly polynomial algorithms for minimization, constant-factor approximation algorithms for maximization, and efficient methods for learning submodular functions. In addition, submodular functions find novel applications in combinatorial auctions, machine learning, and social networks. This workshop aims to provide a forum for researchers from a variety of backgrounds to exchange results, ideas, and problems on submodular optimization and its applications. The first day will be devoted to tutorial-style lectures.

Confirmed Speakers: Jeff Bilmes (University of Washington), Chandra Chekuri (UIUC), Lisa Fleischer (Dartmouth), Satoru Fujishige (Kyoto University), Gagan Goel (Google Research), Michel Goemans (MIT), Carlos Guestrin (CMU), Nick Harvey (UBC), Satoru Iwata (Kyoto University), Kamal Jain (Microsoft Research), Andreas Krause (ETH), Jon Lee (U. Michigan), Tom McCormick (UBC), Aranyak Mehta (Google Research), Vahab Mirrokni (Google Research), Kazuo Murota (University of Tokyo), Kiyohito Nagano (University of Tokyo), Maurice Queyranne (UBC), Amin Saberi (Stanford), Akiyoshi Shioura (Tohoku University), Maxim Sviridenko (IBM), Zoya Svitkina (Google), Jan Vondrak (IBM), Laszlo Vegh (Georgia Tech).

For additional information, please visit: http://arc.gatech.edu/upcoming_events.php
Sponsors: Algorithms & Randomness Center, ACO Ph.D. Program (Georgia Tech), Yandex Corporation (Russia), ...

Combinatorial Optimization
• N: a finite ground set
• w : N → R: weights on N
max/min w(S) s.t. S ⊆ N satisfies constraints

Combinatorial Optimization
• N: a finite ground set
• w : N → R: weights on N
• S ⊆ 2^N: feasible solutions to the problem
max/min w(S) s.t. S ∈ S

Examples: poly-time solvable
• max weight matching
• s-t shortest path in a graph
• s-t minimum cut in a graph
• max weight independent set in a matroid, and in the intersection of two matroids
• ...

Examples: NP-Hard
• max cut
• min-cost multiway/multiterminal cut
• min-cost (metric) labeling
• max weight independent set in a graph
• ...

Approximation Algorithms
A is an approximation algorithm for a problem if:
• A runs in polynomial time
• for a maximization problem: for all instances I of the problem, A(I) ≥ α OPT(I)
• for a minimization problem: for all instances I of the problem, A(I) ≤ α OPT(I)
• α is the worst-case approximation ratio of A

This talk
min/max f(S) s.t. S ∈ S, where f is a non-negative submodular set function on N
Motivation:
• several applications
• mathematical interest
• modeling power and new results

Submodular Set Functions
A function f : 2^N →
R+ is submodular if
f(A+j) − f(A) ≥ f(B+j) − f(B) for all A ⊂ B and j ∈ N\B.
Equivalently (decreasing marginal values):
f(A+j) − f(A) ≥ f(A+i+j) − f(A+i) for all A ⊆ N and i, j ∈ N\A.
Equivalently: f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) for all A, B ⊆ N.

Cut functions in graphs
• G=(V,E) undirected graph
• f : 2^V → R+ where f(S) = |δ(S)|

Coverage in Set Systems
• X1, X2, ..., Xn subsets of a set U
• f : 2^{1,2,...,n} → R+ where f(A) = |∪_{i ∈ A} Xi|

Submodular Set Functions
• Non-negative submodular set functions: f(A) ≥ 0 for all A, which implies f(A) + f(B) ≥ f(A ∪ B) (sub-additive)
• Monotone submodular set functions: f(∅) = 0 and f(A) ≤ f(B) for all A ⊆ B
• Symmetric submodular set functions: f(A) = f(N\A) for all A

Other examples
• Cut functions in hypergraphs (symmetric, non-negative)
• Cut functions in directed graphs (non-negative)
• Rank functions of matroids (monotone)
• Generalizations of coverage in set systems (monotone)
• Entropy/mutual information of a set of random variables
• ...

Max-Cut
max f(S) s.t. S ∈ S
• f is the cut function of a given graph G=(V,E)
• S = 2^V: unconstrained
• NP-Hard!

Unconstrained problem
min/max f(S)
• minimization is poly-time solvable assuming a value oracle for f:
  • Ellipsoid method [GLS'79]
  • Strongly polynomial time combinatorial algorithms [Schrijver, Iwata-Fleischer-Fujishige '00]
• maximization is NP-Hard even for an explicit cut function

Techniques
min/max f(S) s.t. S ∈ S, where f is a non-negative submodular set function on N
• Greedy
• Local Search
• Mathematical Programming Relaxation + Rounding

Math. Programming approach
min/max w(S) s.t. S ∈ S   ⇒   min/max w·x s.t. x ∈ P(S)
• x_i ∈ [0,1]: indicator variable for i
• Exact algorithm: P(S) = convexhull({1_S : S ∈ S})

Math. Programming approach
min/max w(S) s.t. S ∈ S   ⇒   min/max w·x s.t. x ∈ P(S)
Round x* ∈ P(S) to S* ∈ S
• Exact algorithm: P(S) = convexhull({1_S : S ∈ S})
• Approx. algorithm: P(S) ⊇ convexhull({1_S : S ∈ S})
• P(S) solvable: can do linear optimization over it

Math.
Programming approach
min/max f(S) s.t. S ∈ S   ⇒   min/max g(x) s.t. x ∈ P(S)
• P(S) ⊇ convexhull({1_S : S ∈ S}) and solvable
• Round x* ∈ P(S) to S* ∈ S

Math. Programming approach
min/max f(S) s.t. S ∈ S   ⇒   min/max g(x) s.t. x ∈ P(S)
• What is the continuous extension g?
• How do we optimize with objective g?
• How do we round x* ∈ P(S) to S* ∈ S?

Continuous extensions of f
For f : 2^N → R+ define g : [0,1]^N → R+ s.t.
• for any S ⊆ N we want f(S) = g(1_S)
• given x = (x1, x2, ..., xn) ∈ [0,1]^N we want a polynomial time algorithm to evaluate g(x)
• for minimization we want g to be convex, and for maximization we want g to be concave

Canonical extensions: convex and concave closure
For x = (x1, x2, ..., xn) ∈ [0,1]^N:
min/max Σ_S α_S f(S)
s.t. Σ_S α_S = 1; Σ_{S : i ∈ S} α_S = x_i for all i; α_S ≥ 0 for all S
f−(x) (the min) for minimization and f+(x) (the max) for maximization: convex and concave respectively, for any f.

Submodular f
• For minimization, f−(x) can be evaluated in poly-time via submodular function minimization; it is equivalent to the Lovász extension
• For maximization, f+(x) is NP-Hard to evaluate even when f is monotone submodular; rely instead on the multilinear extension

Lovász extension of f
f^L(x) = E_{θ ∈ [0,1]}[ f(x_θ) ] where x_θ = { i | x_i ≥ θ }
Example: x = (0.3, 0, 0.7, 0.1)
x_θ = {1,3} for θ = 0.2 and x_θ = {3} for θ = 0.6
f^L(x) = (1−0.7) f(∅) + (0.7−0.3) f({3}) + (0.3−0.1) f({1,3}) + (0.1−0) f({1,3,4}) + (0−0) f({1,2,3,4})

Properties of f^L
• f^L is convex iff f is submodular
• f^L(x) = f−(x) for all x when f is submodular
• f^L is easy to evaluate
• For submodular f: solve the relaxation min f^L(x) s.t. x ∈ P(S) via convex optimization

Multilinear extension of f [Calinescu-C-Pal-Vondrak'07], inspired by [Ageev-Sviridenko]
For f : 2^N → R+ define F : [0,1]^N → R+ as follows. Given x = (x1, x2, ..., xn) ∈ [0,1]^N, let R be a random set that includes each i independently with prob. x_i:
F(x) = E[ f(R) ] = Σ_{S ⊆ N} f(S) Π_{i ∈ S} x_i Π_{i ∈ N\S} (1−x_i)

Properties of F
• F(x) can be evaluated by random sampling
• F is a smooth submodular function
• ∂²F/∂x_i∂x_j ≤ 0 for all i, j.
Recall f(A+j) − f(A) ≥ f(A+i+j) − f(A+i) for all A, i, j
• F is concave along any non-negative direction vector
• ∂F/∂x_i ≥ 0 for all i if f is monotone

Maximizing F
max { F(x) | x_i ∈ [0,1] for all i } is NP-Hard: it is equivalent to unconstrained maximization of f.
When f is monotone, max { F(x) | Σ_i x_i ≤ k, x_i ∈ [0,1] for all i } is NP-Hard.

Approximately maximizing F [Vondrak'08]
Theorem: For any monotone f, there is a (1−1/e)-approximation for the problem max { F(x) | x ∈ P }, where P ⊆ [0,1]^N is any solvable polytope.
Algorithm: Continuous-Greedy

Approximately maximizing F [C-Vondrak-Zenklusen'11]
Theorem: For any non-negative f, there is a 1/4-approximation for the problem max { F(x) | x ∈ P }, where P ⊆ [0,1]^N is any down-closed solvable polytope.
Remark: a 0.325-approximation can be obtained.
Remark: the current best is 1/e ≈ 0.3678 [Feldman-Naor-Schwartz'11].
Algorithms: variants of local search and continuous-greedy.

Math. Programming approach
min/max f(S) s.t. S ∈ S   ⇒   min/max g(x) s.t. x ∈ P(S)
Round x* ∈ P(S) to S* ∈ S
• What is the continuous extension g? ✔ Lovász extension for min and multilinear extension for max
• How do we optimize with objective g? ✔ Convex optimization for min and O(1)-approximation algorithms for max
• How do we round?
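The two extensions answered above are straightforward to implement for small ground sets. Below is a minimal Python sketch (function and variable names are my own, not from the talk): the Lovász extension is integrated exactly over the level sets of x, and the multilinear extension is computed by its defining sum (exponential in n, so tiny instances only) or estimated by independent-rounding samples.

```python
import itertools
import random

def lovasz_extension(f, x):
    """f^L(x) = E_{theta ~ U[0,1]}[ f({i : x_i >= theta}) ], computed exactly:
    the level set {i : x_i >= theta} is constant between consecutive distinct
    coordinate values of x, so we integrate piecewise."""
    breakpoints = sorted(set(list(x) + [0.0, 1.0]))
    total, prev = 0.0, 0.0
    for t in breakpoints[1:]:
        level = frozenset(i for i, xi in enumerate(x) if xi >= t)
        total += (t - prev) * f(level)  # level set for theta in (prev, t]
        prev = t
    return total

def multilinear_exact(f, x):
    """F(x) = sum_S f(S) * prod_{i in S} x_i * prod_{i not in S} (1 - x_i)."""
    n = len(x)
    total = 0.0
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            p = 1.0
            for i in range(n):
                p *= x[i] if i in S else 1.0 - x[i]
            total += p * f(frozenset(S))
    return total

def multilinear_sample(f, x, trials=20000, seed=0):
    """Monte Carlo estimate of F(x): include i independently w.p. x_i."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        R = frozenset(i for i, xi in enumerate(x) if rng.random() < xi)
        acc += f(R)
    return acc / trials

# Sanity check with a modular f(S) = |S|: both extensions reduce to sum(x).
card = lambda S: len(S)
x = [0.3, 0.0, 0.7, 0.1]
print(lovasz_extension(card, x))   # ~1.1
print(multilinear_exact(card, x))  # ~1.1
```

For the worked example x = (0.3, 0, 0.7, 0.1) from the slides, `lovasz_extension` reproduces exactly the five-term sum (1−0.7)f(∅) + (0.7−0.3)f({3}) + (0.3−0.1)f({1,3}) + (0.1−0)f({1,3,4}), with elements 0-indexed.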
Rounding
Rounding and approximation depend on S and P(S). Two competing issues:
• Obtain a feasible solution S* from the fractional x*
• Want f(S*) to be close to g(x*)

Rounding approach
Viewpoint: the objective function is complex
• round x* to S* to approximately preserve the objective
• fix/alter S* to satisfy the constraints
• analyze the loss incurred in fixing/altering

Rounding to preserve objective
x*: fractional solution to the relaxation
Minimization: f^L(x) = E_{θ ∈ [0,1]}[ f(x_θ) ]. Pick θ uniformly at random in [0,1] (or [a,b]) and set S* = { i | x*_i ≥ θ }.
Maximization: F(x) = E[ f(R) ]. Form S* by picking each i ∈ N independently with probability α x*_i (α ≤ 1).

Maximization
max f(S) s.t. S ∈ I
I ⊆ 2^N is a downward-closed family: A ∈ I and B ⊂ A ⇒ B ∈ I
Captures "packing" problems

Maximization
High-level results:
• optimal rounding in matroid polytopes [Calinescu-C-Pal-Vondrak'07, C-Vondrak-Zenklusen'09]
• contention resolution scheme based rounding framework [C-Vondrak-Zenklusen'11]

Max k-Coverage
max f(S) s.t. S ∈ I
• X1, X2, ..., Xn subsets of U, and an integer k
• N = {1, 2, ..., n}
• f is the set coverage function (monotone)
• I = { A ⊆ N : |A| ≤ k } (cardinality constraint)
• NP-Hard

Greedy [Nemhauser-Wolsey-Fisher'78, FNW'78]
• Greedy gives a (1−1/e)-approximation for the problem max { f(S) | |S| ≤ k } when f is monotone. Obtaining a (1−1/e+ε)-approximation requires exponentially many value queries to f.
• Greedy gives a 1/2-approximation for maximizing monotone f over a matroid constraint.
• Unless P=NP there is no (1−1/e+ε)-approximation even for the special case of Max k-Coverage [Feige'98]

Matroid Rounding
[Calinescu-C-Pal-Vondrak'07] + [Vondrak'08] = [CCPV'09]
Theorem: There is a randomized (1−1/e) ≈ 0.632 approximation for maximizing a monotone f subject to any matroid constraint.
[C-Vondrak-Zenklusen'09]
Theorem: There is a (1−1/e−ε)-approximation for monotone f subject to a matroid and a constant number of packing/knapsack constraints.
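The greedy rule from the Max k-Coverage slide above is simple enough to sketch. A minimal Python version (the function name and the toy instance are my own): repeatedly pick the set that covers the most yet-uncovered elements.

```python
def greedy_max_coverage(sets, k):
    """Greedy for max k-coverage: k times, pick the index whose set adds the
    most uncovered elements. This achieves a (1 - 1/e)-approximation for
    monotone coverage objectives [Nemhauser-Wolsey-Fisher'78]."""
    covered, chosen = set(), []
    remaining = set(range(len(sets)))
    for _ in range(min(k, len(sets))):
        i = max(remaining, key=lambda j: len(sets[j] - covered))
        chosen.append(i)
        covered |= sets[i]
        remaining.discard(i)
    return chosen, covered

# Toy instance: 4 sets over U = {1,...,6}, budget k = 2.
sets = [{1, 2, 3}, {3, 4}, {5, 6}, {1, 4, 5, 6}]
chosen, covered = greedy_max_coverage(sets, 2)
print(chosen, len(covered))  # picks set 3 first (gain 4), then set 0 (gain 2)
```

On this instance greedy happens to be optimal (it covers all of U); the (1−1/e) bound is tight for worst-case families of sets.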
Rounding in Matroids
[Calinescu-C-Pal-Vondrak'07]
Theorem: Given any point x in P(M), there is a randomized polynomial time algorithm to round x to a vertex X (hence an independent set of M) such that
• E[X] = x
• f(X) = F(X) ≥ F(x)
[C-Vondrak-Zenklusen'09]: a different rounding with additional properties and applications.

Contention Resolution Schemes
• I: an independence family on N
• P(I): a relaxation for I, and x ∈ P(I)
• R: random set obtained from independent rounding of x
A CR scheme for P(I): given x and R, outputs R' ⊆ R s.t.
1. R' ∈ I
2. for all i, Pr[i ∈ R' | i ∈ R] ≥ c

Rounding and CR schemes
max F(x) s.t. x ∈ P(I); round x* ∈ P(I) to S* ∈ I
Theorem: A monotone CR scheme for P(I) can be used to round so that E[f(S*)] ≥ c F(x*).
The proof is via the FKG inequality.

Summary for maximization
• Optimal results in some cases
• Several new technical ideas and results
• Questions led to results even for the modular case
• Similar results for modular and submodular objectives (within constant factors) for most known problems

Minimization
• The landscape is more complex
• Many problems that are "easy" in the modular case are hard in the submodular case: shortest paths, spanning trees, sparse cuts, ...
• Some successes via the Lovász extension
• Future: need to understand special families of submodular functions and applications

Submodular-cost Vertex Cover
• Input: G=(V,E) and f : 2^V → R+
• Goal: min f(S) s.t. S is a vertex cover in G
• a 2-approximation for the modular case is well-known
• 2-approximation for submodular costs [Koufogiannakis-Young'09, Iwata-Nagano'09, Goel et al.'09]

Submodular-cost Vertex Cover
• Input: G=(V,E) and f : 2^V →
R+
• Goal: min f(S) s.t. S is a vertex cover in G
Relaxation:
min f^L(x)
s.t. x_i + x_j ≥ 1 for all ij ∈ E; x_i ≥ 0 for all i ∈ V

Rounding
min f^L(x) s.t. x_i + x_j ≥ 1 for all ij ∈ E, x_i ≥ 0 for all i ∈ V
Pick θ ∈ [0, 1/2] uniformly at random and output S = { i | x*_i ≥ θ }.

Rounding Analysis
Claim 1: S is a vertex cover with probability 1 (on every edge ij, at least one of x*_i, x*_j is ≥ 1/2).
Claim 2: E[ f(S) ] ≤ 2 f^L(x*).
Proof: 2 f^L(x*) = 2 ∫₀¹ f(x*_θ) dθ ≥ 2 ∫₀^{1/2} f(x*_θ) dθ = E[ f(S) ].

Submodular-cost Set Cover
• Input: subsets X1, ..., Xn of U, and f : 2^N → R+
• Goal: min f(S) s.t. ∪_{i ∈ S} Xi = U
• Rounding according to the objective gives only a k-approximation, where k is the maximum element frequency; there is also an integrality gap of Ω(k)
• [Iwata-Nagano'09]: Ω(k/log k)-hardness

A Labeling Problem
Given G=(V,E) and a set of labels, assign a label to each vertex to minimize the cost:
• labeling v with i costs c(v,i)
• if the endpoints of edge uv are assigned different labels, pay cost w(uv)

A Labeling Problem
• The labeling problem has many applications in computer vision [Boykov-Veksler-Zabih, ...]
• Generalized to metric labeling by [Kleinberg-Tardos'99]
• 2-approximation for the uniform case via an interesting relaxation and rounding
• O(log k)-approximation for general metrics, building on the uniform case
• LP relaxation and O(log k) gap [C-Khanna-Naor-Zosin'01]
• Multiway cut is a special case

Submodular Cost Allocation [C-Ene'11]
A new model:
• V: a finite ground set
• L = {1, 2, ..., k}: set of labels
• for each label i, a submodular cost function f_i : 2^V → R+
• Goal: assign labels to V to minimize Σ_i f_i(A_i), where A_i ⊆ V is the set assigned label i

Labeling Problem as Submodular Cost Allocation
For each label i define the function f_i : 2^V →
R, where f_i(S) = Σ_{v ∈ S} c(v,i) + w(δ(S))/2.

Submodular Cost Allocation: Relaxation
x(u,i) ∈ {0,1}: whether u is allocated label i; x_i = (x(u1,i), x(u2,i), ..., x(un,i))
min Σ_i f_i^L(x_i)
s.t. Σ_i x(u,i) = 1 for all u ∈ V; x(u,i) ≥ 0 for all u ∈ V, i ∈ L

Relaxation
• The [Calinescu-Karloff-Rabani'98] relaxation for multiway cut is a special case
• The [Kleinberg-Tardos'99] relaxation for uniform metric labeling is a special case

Rounding
Pick θ ∈ [0,1] at random; for each i let A_i = { u | x(u,i) ≥ θ }.
Claim: E[ Σ_i f_i(A_i) ] = Σ_i f_i^L(x_i)
But the A_i may overlap, and not all vertices get assigned labels!
[Figure: rounding on the simplex with vertices (0,0,1), (0,1,0), (1,0,0).]

Fixing Overlap
• If the f_i are monotone, overlap does not matter
• If the f_i arise from a single symmetric function, uncross using posimodularity
• Ensure no overlap by picking θ ∈ (1/2, 1]

Not all vertices labeled
• Repeat the rounding until all vertices are labeled
• Or assign all unlabeled vertices to some label

Results [C-Ene'11a, C-Ene'11b]
• The rounding "explains" [CKR'98, KT'99, ...]
• Submodular multiway partition: 1.5-approximation for the symmetric case and 2-approximation in general, improving the previous 2 and (k−1) respectively
• Recovers the 2-approximation for uniform metric labeling
• Obtains O(log k) for generalizations of labeling, hub-location, non-metric facility location, ...

Concluding Remarks
• Substantial progress on constrained submodular function optimization in the last few years
• New tools and connections, including a general framework via continuous extensions and dependent randomized rounding
• Increased awareness and applications
• Much work to be done, especially on minimization problems

Thanks!