Geometry and Expansion: A survey of some results
Sanjeev Arora, Princeton
(touches upon: S. A., Satish Rao, Umesh Vazirani, STOC'04; S. A., Elad Hazan, and Satyen Kale, FOCS'04; S. A., James Lee, and Assaf Naor, STOC'05 & JAMS'08; S. A., S. Kale, STOC 2007; + papers that are not mine)

Outline:
• Graph partitioning problems: intro and history
• New approximation via expander flows
• New approximation algorithm via semidefinite programming (+ analysis using “Structure Theorem”) [A., Rao, Vazirani]
• Outline of proof of “S. T.”
• Uses of “S. T.” in geometric embeddings
• Open problems

Sparsest Cut / Edge Expansion
G = (V, E)
$\alpha(G) = \min_{S \subseteq V,\ |S| < |V|/2} \frac{|E(S, S^c)|}{|S|}$
c-balanced separator
$\alpha_c(G) = \min_{S \subseteq V,\ c|V| < |S| < |V|/2} \frac{|E(S, S^c)|}{|S|}$
Both NP-hard

Why these problems are important
• Analysis of random walks, PRAM simulation, packet routing, clustering, VLSI layout etc.
• Underlie many divide-and-conquer graph algorithms (surveyed by Shmoys'95)
• Discrete analog of isoperimetry; useful in Riemannian geometry (via 2nd eigenvalue of Laplacian) (Cheeger'70)
• Graph-theoretic parameters of inherent interest (cf. Lipton-Tarjan planar separator theorem)

Previous approximation algorithms
1) Eigenvalue approaches (Cheeger'70, Alon'85, Alon-Milman'85)
   Only yield factor n approximation.
   $2c(G) \ge \lambda(G) \ge c(G)^2/2$ (c(G): conductance; $\lambda(G)$: 2nd eigenvalue of the Laplacian)
2) O(log n)-approximation via LP (multicommodity flows) (Leighton-Rao'88)
   • Approximate max-flow mincut theorems
   • Region-growing argument
3) Embeddings of finite metric spaces into $\ell_1$ (Linial, London, Rabinovich'94; AR'94)
   • Geometric approach; more general result (but still O(log n) approximation)

New results of [ARV'04]
1. $O(\sqrt{\log n})$-approximation to sparsest cut and conductance
2. $O(\sqrt{\log n})$-pseudoapproximation to c-balanced separator (algorithm outputs a c'-balanced separator, c' < c)
3. Existence of expander flows in every graph (approximate certificates of expansion)
Disparate approaches from previous slide get “unified”
Subsequent work: [AHK'05], [AK'07], [S'09]: $O(m + n^{1.5})$ time!

The three main characters
• Expansion
• Isoperimetry (continuous analog of expansion)
• Geometry (and geometric embeddings of finite metric spaces)

Identifying sparse cuts via “traffic flows”
Approach 1: traffic congestion identifies sparse cuts
[SM'87]: Stress a network by passing traffic “flow” through it. Look at congested edges to identify sparse cuts.
[LR88]: O(log n) approximation to sparsest cut. Route 1 unit of traffic between every pair of nodes.
[ARV'04]: Traffic flow is like embedding a weighted graph; $w_{ij}$ = amount of traffic from i to j. Solve a math program to find the “right” flow pattern. ([AHK'05]: do it in $O(n^2)$ time)

Expander traffic flows
G = (V, E)
[ARV'04] A D-regular flow graph w s.t.
$\forall S:\ w(S, S^c) = \Omega(D\,|S|)$   (*)
(certifies expansion = $\Omega(D)$)
Weighted graph w satisfies (*) iff $\lambda(L(w)) = \Omega(1)$, where L(w) is the Laplacian of w [Cheeger]
Our Thm: If G has expansion $\alpha$, then a D-regular expander flow exists in it where $D = \alpha/\sqrt{\log n}$
Formal statement: $\exists \beta_0 > 0$ s.t. the following LP is feasible for $D = \alpha(G)/\sqrt{\log n}$
$P_{ij}$ = paths whose endpoints are i, j
$\forall i:\ \sum_j \sum_{p \in P_{ij}} f_p = D$   (degree)
$\forall e \in E:\ \sum_{p \ni e} f_p \le 1$   (capacity)
$\forall S \subseteq V:\ \sum_{i \in S,\ j \in S^c} \sum_{p \in P_{ij}} f_p \ge \beta_0 D\,|S|$   (demand graph is an expander)
$f_p \ge 0$ for all paths p in G
WHY IS THIS FEASIBLE???

Feasibility Criterion for LP on prev. slide (via Farkas's Lemma)
Existence of such i, j proved in [ARV'04].
When we fail to find such i, j, we find a cut of small expansion.

Overall approximation algorithm via flows
Try to solve the above LP to find a D-regular expander flow.
If succeed, have verified that expansion is > D/10.
If fail, then use [ARV04] ideas to find a cut $(S, S^c)$ of capacity $O(D\sqrt{\log n}\,|S|)$.
Note: Before finding this cut we already had a D/2-regular flow.

Next: The SDP-based approach to Graph partitioning (ARV'04)

Semidefinite relaxation for c-balanced separator
$\alpha_c(G) = \min_{S \subseteq V,\ c|V| < |S| < |V|/2} \frac{|E(S, S^c)|}{|S|}$
Exact formulation: assign {+1, -1} to $v_1, v_2, \dots, v_n$ to minimize $\sum_{(i,j) \in E} |v_i - v_j|^2/4$
(“cut semimetric”: $|v_i - v_j|^2/4 = 1$ if i, j are on opposite sides of the cut, $|v_i - v_j|^2 = 0$ otherwise)
Relaxation: find unit vectors in $\mathbb{R}^n$ instead
Subject to
$\sum_{i<j} |v_i - v_j|^2/4 \ge c(1-c)n^2$
Triangle inequality: $|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$

Unit $\ell_2^2$ space
Unit vectors $v_1, v_2, \dots, v_n \in \mathbb{R}^d$
$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$ (the angle at $v_j$ in the triangle $v_i, v_j, v_k$ is non-obtuse!)
Example: Hypercube $\{-1, 1\}^k$
$|u - v|^2 = \sum_i |u_i - v_i|^2 = 2\sum_i |u_i - v_i| = 2\,|u - v|_1$
In fact, $\ell_2$ and $\ell_1$ are subcases of $\ell_2^2$

Structure Theorem for $\ell_2^2$ spaces [ARV'04]
Subsets S and T are $\Delta$-separated if for every $v_i \in S$, $v_j \in T$: $|v_i - v_j|^2 \ge \Delta$
Thm: If $\sum_{i<j} |v_i - v_j|^2 = \Omega(n^2)$ then $\exists$ S, T of size $\Omega(n)$ that are $\Delta$-separated for $\Delta = \Omega(1/\sqrt{\log n})$

Main thm ⇒ $O(\sqrt{\log n})$-approximation
$v_1, v_2, \dots, v_n \in \mathbb{R}^d$ is the optimum SDP soln; $SDP_{opt} = \sum_{(i,j) \in E} |v_i - v_j|^2$
S, T: $\Delta$-separated sets of size $\Omega(n)$
Do BFS from S until you hit T. Take the level of the BFS tree with the fewest edges and output the cut $(R, R^c)$ defined by this level.
(figure: an edge (i, j) whose endpoints lie at distances d(S, i) and d(S, j) from S)
$\sum_{(i,j) \in E} |v_i - v_j|^2 \ge |E(R, R^c)| \times \Delta$
⇒ $|E(R, R^c)| \le SDP_{opt}/\Delta \le O(\sqrt{\log n}\; SDP_{opt})$

Other new $\sqrt{\log n}$-approximation algorithms
• MIN-2-CNF deletion and several graph deletion problems [Agarwal, Charikar, Makarychev, Makarychev'04]. Weighted version of S.T.
• MIN-LINEAR ARRANGEMENT [Charikar, Karloff, Rao'04]
• General SPARSEST CUT [A., Lee, Naor'04]
• Min-ratio VERTEX SEPARATORS and Balanced VERTEX SEPARATORS [Feige, Hajiaghayi, Lee'04]
All use the Structure Theorem (+ other ideas)

Outline:
• Graph partitioning problems: intro and history
• New approximation algorithm via semidefinite programming (+ analysis using “Structure Theorem”) [A., Rao, Vazirani]
• Outline of proof of “S. T.” (algorithm to produce $\Delta$-separated sets S, T of size $\Omega(n)$)
• Uses of “S. T.” in geometric embeddings
• Introduction to expander flows and $O(n^2)$ time algorithms
• Open problems

Algorithm to produce two $\Delta$-separated sets
(figure: random direction u in $\mathbb{R}^d$; $S_u$ and $T_u$ are the points at the two ends along u, separated by a gap of $0.01/\sqrt{d}$)
Easy: $S_u$ and $T_u$ are likely to have size $\Omega(n)$
Delete any $v_i \in S_u$, $v_j \in T_u$ s.t. $|v_i - v_j|^2 < \Delta$ (till no such pair remains)
If $S_u$, $T_u$ still have size $\Omega(n)$, output them
Main difficulty: Show that whp only o(n) points get deleted
“Stretched pair”: $v_i, v_j$ such that $|v_i - v_j|^2 \le \Delta$ and $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
Obs: Deleted pairs are stretched and they form a matching.

Naïve analysis of random projection fails
(figure: for a unit vector v and a random unit vector u in $\mathbb{R}^d$, the projection $\langle u, v \rangle$ has standard deviation about $1/\sqrt{d}$, and $\Pr[\,|\langle u, v \rangle| \ge t/\sqrt{d}\,] \approx e^{-t^2/2}$)
Stretched pair: $|v_i - v_j|^2 < \Delta$; $|\langle v_i - v_j, u \rangle| > 0.01/\sqrt{d}$ = $O(1/\sqrt{\Delta})$ standard deviations
E[# of stretched pairs] = $n^2 \exp(-1/\Delta) \gg n$

Proof by contradiction:
Suppose a matching of $\Omega(n)$ size exists with probability $\Omega(1)$…
…stretched pairs are almost everywhere you look!
(figure: Ball($v_i$, $\Delta$) containing $v_j$, with projection gap $0.01/\sqrt{d}$ along u)
Idea: Put stretched pairs together; derive a very improbable event

Walks in unit $\ell_2^2$ space
Unit vectors $v_1, v_2, \dots, v_n \in \mathbb{R}^d$
$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$
Angles are non-obtuse
Taking r steps of length s only takes you squared distance $r s^2$ (i.e. distance $\sqrt{r}\, s$)

Proof by contradiction (contd.)
Claim: $\exists$ walk on stretched edges
Stretched pair: $|v_i - v_j|^2 < \Delta$; $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
r steps of length s ⇒ squared distance $\le r s^2$ (distance $\le \sqrt{r}\, s$), so $|v_{final} - v_0| \le \sqrt{r}\, s$
Each step gains $0.01/\sqrt{d}$ along u, so $\langle v_{final} - v_0, u \rangle \ge 0.01\, r/\sqrt{d}$
Projection = $(0.01\sqrt{r}/s)$ × standard deviation ⇒ VERY UNLIKELY IF r large enough
Walk impossible (CONTRADICTION)
Why the walk is possible: delicate argument; measure concentration

Outline:
• Graph partitioning problems: intro and history
• New approximation algorithm via semidefinite programming (+ analysis using “Structure Theorem”) [A., Rao, Vazirani]
• Outline of proof of “S. T.”
• Geometric embeddings of metric spaces
• Open problems

Finite metric space (X, d) → $\mathbb{R}^k$ (with $\ell_2$ norm)
(figure: map f sending points x, y at distance d(x, y) to f(x), f(y))
Distortion of f is the minimum $C \ge 1$ such that
$d(x, y) \le |f(x) - f(y)|_2 \le C\, d(x, y) \quad \forall x, y$
Thm (Bourgain'85): For every n-point metric space, a map exists with distortion O(log n)
[LLR'94]: Efficient algorithm to find the map; proof that O(log n) cannot be improved in general
Qs: Improve O(log n) for X = $\ell_2^2$ (say) or $\ell_1$?

Embeddings and Cuts (LLR'94, AR'94)
Recall: Cut semi-metric (distance 1 across the cut, 0 within each side)
Fact: Metric (X, d) embeds isometrically in $\ell_1$ iff it can be written as a positive combination of cut semimetrics
Embedding $\ell_2^2$ into $\ell_1$ gives a way to produce cuts from the SDP solution

Status report of this area
Best lower bound:
• $\ell_1$ into $\ell_2$: $\log^{0.5} n$ [Enflo'69]
• $\ell_2^2$ into $\ell_1$ (exactly the integrality gap of the SDP for general SPARSEST CUT [LLR'94, AR'94]):
  1.16 [Zatloukal'04];
  superconstant [Khot, Vishnoi'04] (disproves Goemans-Linial conjecture; uses Fourier techniques developed for PCPs!);
  $(\log n)^{0.01}$ [Cheeger, Kleiner '08] (uses new metric differentiation techniques)
• $\ell_2^2$ into $\ell_2$: $\log^{0.5} n$ [Enflo'69]
Best upper bound (successive improvements):
• $\log n$ [Bourgain'85]
• $\log^{0.75} n$ [Chawla, Gupta, Racke'04]
• $\log^{0.5} n \cdot \log\log n$ [A., Lee, Naor'04]

Upperbounds: Frechet's recipe to embed metric space (X, d) into $\mathbb{R}^k$
Pick k suitable subsets $A_1, A_2, \dots, A_k$ of X
Map $x \in X$ to $(d(x, A_1), d(x, A_2), \dots, d(x, A_k))$
Note: $d(x, A_1) - d(y, A_1) \le d(x, y)$
In recent embeddings, the $A_i$'s are chosen using S.T. and the “Measured descent” idea of [Krauthgamer, Lee, Naor, and Mendel'04]

Embedding lowerbounds (Khot-Vishnoi'05)
Explicit unit-$\ell_2^2$ space (X, d) that requires distortion log log log n into $\ell_1$
Main observation: Need a good handle on the cut structure of X
Use hypercube as building block!
Cut ≡ Boolean function
Number of cut edges = average sensitivity
(Fourier analysis a la KKL, Friedgut, Hastad, Bourgain etc.; isoperimetric theorems)

OPEN PROBLEMS
• Better approximation factor than $O(\sqrt{\log n})$?
(log log n “lowerbound” assuming UGC)
• Better distortion bound for embedding $\ell_2^2$ into $\ell_1$? (upper bound v/s lower bound)
• Combinatorial approximation algorithms for other problems? (similar to the one for SPARSEST CUT from [A., Hazan, Kale])
• Other applications of expander flows? (Useful in some geometric results [Naor, Rabani, Sinclair'04])
• Ways to use spectral ideas a la [ABS'10] for SPARSEST CUT?

Example of expander flow
n-cycle
Take any 3-regular expander on n nodes
Put a weight of 1/(3n) on each edge
Embed this into the n-cycle
Routing of edges does not exceed any capacity ⇒ expansion = $\Omega(1/n)$

Other extensions of flow-based techniques
• Generalization to problems other than sparsest cut [A., Kale07]: “Primal-dual approach to SDP.”
• Very fast algorithms for O(log n) approximation: $O(n^{1.5} + m)$ time (faster than [LR88] type algorithms!)
• Very simple algorithms; use only maxflow and eigenvalue computations [KRV06]
Looking forward to more progress…
Thanks!

New Result (A., Hazan, Kale; FOCS'04)
$O(n^2)$ time algorithm that, given any graph G, finds for some D > 0
• a D-regular expander flow
• a cut of expansion $O(D\sqrt{\log n})$
⇒ $\Omega(D) \le \alpha(G) \le O(D\sqrt{\log n})$
Ingredients:
• Approximate eigenvalue computations
• Approximate flow computations (Garg-Konemann; Fleischer)
• Random sampling (Benczur-Karger + some more)
Idea: Define a zero-sum game whose optimum solution is an expander flow; solve approximately using the Freund-Schapire approximate solver.

Expander flows: LP view
(figure: flow in G with load ≤ 1 on each edge and degree ≤ D at each node)
LP feasible ⇒ $\alpha \ge \Omega(D)$
Thm [ARV]: $\exists \beta_0$ s.t. the LP is feasible with $D = \alpha/\sqrt{\log n}$

Open problems (circa April'04)
• Better running time/combinatorial algorithm? $O(n^2)$ time [A., Hazan, Kale]
• Improve approximation ratio to O(1); better rounding??
(our conjectures may be useful…)
• Extend result to other expansion-like problems (multicut, general sparsest cut; MIN-2CNF deletion). Integrality gap is $\Omega(\log n)$ [Charikar]
• Resolve conjecture about embeddability of $\ell_2^2$ into $\ell_1$; of $\ell_1$ into $\ell_2$. $\log^{3/4} n$ distortion [Chawla, Gupta, Racke]
• Any applications of expander flows? Yes [Naor, Sinclair, Rabani]; better embeddings of $\ell_p$ into $\ell_q$ [Lee]

Various new results
• $O(n^2)$ time combinatorial algorithm for sparsest cut (does not use semidefinite programs) [A., Hazan, Kale'04]
• New results about embeddings: (i) $\ell_p$ into $\ell_q$ [J. Lee'04]; (ii) $\ell_2^2$ and $\ell_1$ into $\ell_2$ [CGR'04] (approx for general sparsest cut)
• Clearer explanation of expander flows and their connection to embeddings [NRS'04]

A concrete conjecture (prove or refute)
G = (V, E); $\alpha = \alpha(G)$
For every distribution on n/3-balanced cuts $\{z_S\}$ (i.e., $\sum_S z_S = 1$) there exist $\Omega(n)$ disjoint pairs $(i_1, j_1), (i_2, j_2), \dots$ such that for each k,
• the distance between $i_k$, $j_k$ in G is $O(1/\alpha)$
• $i_k$, $j_k$ are across an $\Omega(1)$ fraction of the cuts in $\{z_S\}$ (i.e., $\sum_{S:\ i_k \in S,\ j_k \in S^c} z_S = \Omega(1)$)
Conjecture ⇒ existence of d-regular expander flows for $d = \alpha$

Example of $\ell_2^2$ space: hypercube $\{-1, 1\}^k$
$|u - v|^2 = \sum_i |u_i - v_i|^2 = 2\sum_i |u_i - v_i| = 2\,|u - v|_1$
In fact, every $\ell_1$ space is also $\ell_2^2$
Conjecture (Goemans, Linial): Every $\ell_2^2$ space is $\ell_1$ up to distortion O(1)

LP and semidefinite relaxations for c-balanced separator
Motivation: Every cut $(S, S^c)$ defines a (semi)metric $X_{ij} \in \{0, 1\}$ ($X_{ij} = 1$ if i, j are separated by the cut, 0 otherwise)
Min $\sum_{(i,j) \in E} X_{ij}$
$0 \le X_{ij} \le 1$
$X_{ij} + X_{jk} \ge X_{ik}$
$\sum_{i<j} X_{ij} \ge c(1-c)n^2$
Semidefinite constraint: there exist unit vectors $v_1, v_2, \dots, v_n \in \mathbb{R}^n$ such that $X_{ij} = |v_i - v_j|^2/4$

Semidefinite relaxation (contd)
Min $\sum_{(i,j) \in E} |v_i - v_j|^2/4$
$|v_i|^2 = 1$
$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$   (unit $\ell_2^2$ space)
$\sum_{i<j} |v_i - v_j|^2 \ge 4c(1-c)n^2$
Many other NP-hard problems have similar relaxations.
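
As a sanity check on why the program above is a relaxation, the following minimal Python sketch takes an actual cut, builds the corresponding ±1 vectors, and confirms that they satisfy the constraints with objective equal to the number of cut edges. The 6-cycle instance and all function names are hypothetical choices made for illustration; they do not come from the slides.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cut_vectors(n, side_s):
    """One-dimensional unit vectors: +1 for nodes in S, -1 for the rest."""
    return [(1.0,) if i in side_s else (-1.0,) for i in range(n)]

def sdp_objective(edges, vecs):
    """Objective of the relaxation: sum over edges of |v_i - v_j|^2 / 4."""
    return sum(sq_dist(vecs[i], vecs[j]) for i, j in edges) / 4

def satisfies_triangle(vecs):
    """Check |v_i - v_j|^2 + |v_j - v_k|^2 >= |v_i - v_k|^2 for all i, j, k."""
    idx = range(len(vecs))
    return all(
        sq_dist(vecs[i], vecs[j]) + sq_dist(vecs[j], vecs[k])
        >= sq_dist(vecs[i], vecs[k])
        for i in idx for j in idx for k in idx
    )

def spread(vecs):
    """Left-hand side of the spread constraint: sum over i < j of |v_i - v_j|^2."""
    return sum(sq_dist(a, b) for a, b in combinations(vecs, 2))

# Hypothetical toy instance (an assumption): the 6-cycle with S = {0, 1, 2}, so c = 1/2.
n = 6
edges = [(i, (i + 1) % n) for i in range(n)]
S = {0, 1, 2}
c = len(S) / n

vecs = cut_vectors(n, S)
cut_size = sum(1 for i, j in edges if (i in S) != (j in S))

print(sdp_objective(edges, vecs), cut_size)      # objective and number of cut edges
print(satisfies_triangle(vecs))                  # triangle inequality on squared distances
print(spread(vecs) >= 4 * c * (1 - c) * n * n)   # spread constraint
```

Since every cut yields a feasible solution of this kind, the optimum of the relaxation can only be smaller than or equal to the best cut value; the rounding slides address how much smaller.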
Algorithm to produce two $\Delta$-separated sets
(figure: random direction u in $\mathbb{R}^d$; $S_u$ and $T_u$ are the points at the two ends along u, separated by a gap of $0.01/\sqrt{d}$)
Check if $S_u$ and $T_u$ have size $\Omega(n)$
If any $v_i \in S_u$ and $v_j \in T_u$ satisfy $|v_i - v_j|^2 \le \Delta$, delete them and repeat until no such $v_i$, $v_j$ remain
If $S_u$, $T_u$ still have size $\Omega(n)$, output them
Main difficulty: Show that whp only o(n) points get deleted
“Stretched pair”: $v_i, v_j$ such that $|v_i - v_j|^2 \le \Delta$ and $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
Obs: Deleted pairs are stretched and they form a matching.

Next 10-12 min: Proof-sketch of Structure Thm
(algorithm to produce $\Delta$-separated S, T of size $\Omega(n)$; $\Delta = 1/\sqrt{\log n}$)

“Matching is of size o(n) whp”: naive argument fails
“Stretched pair”: $v_i, v_j$ such that $|v_i - v_j|^2 \le \Delta$ and $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
$0.01/\sqrt{d}$ = $O(1/\sqrt{\Delta})$ × standard deviation
⇒ $\Pr_u[v_i, v_j \text{ get stretched}] = \exp(-1/\Delta) = \exp(-\sqrt{\log n})$
E[# of stretched pairs] = $O(n^2) \times \exp(-\sqrt{\log n})$

Generating a contradiction: the walk on stretched pairs
(figure: walk of r steps from $v_i$ to $v_{final}$, each step gaining $0.01/\sqrt{d}$ along u)
$|v_{final} - v_i| < \sqrt{r}$
$|\langle v_{final} - v_i, u \rangle| \ge 0.01\, r/\sqrt{d}$ = $O(\sqrt{r})$ × standard dev.
Contradiction if r is large enough!

Measure concentration (P. Levy, Gromov etc.)
A: measurable set on the unit sphere in $\mathbb{R}^d$ with $\mu(A) \ge 1/4$
$A_\gamma$: points with distance $\le \gamma$ to A
$\mu(A_\gamma) \ge 1 - \exp(-\gamma^2 d)$
Reason: Isoperimetric inequality for spheres

Expander flows (approximate certificates of expansion)
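
Returning to the slide "Algorithm to produce two Δ-separated sets", the projection-and-deletion procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides define $S_u$ and $T_u$ only in a figure, so here they are assumed to be the points whose projection on u is at least $+0.005/\sqrt{d}$ or at most $-0.005/\sqrt{d}$ (giving a gap of $0.01/\sqrt{d}$), and the function names, the seed, and the normalized-hypercube test instance are all hypothetical.

```python
import itertools
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def random_unit_vector(d, rng):
    """Random direction: a Gaussian vector scaled to unit length."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(g, g))
    return [x / norm for x in g]

def find_close_pair(s_idx, t_idx, vecs, delta):
    """Return some (i, j) with i in S_u, j in T_u and |v_i - v_j|^2 < delta, or None."""
    for i in s_idx:
        for j in t_idx:
            if sq_dist(vecs[i], vecs[j]) < delta:
                return i, j
    return None

def separated_sets(vecs, delta, gap_const=0.01, seed=0):
    """Project on a random direction, then delete close pairs until none remain."""
    rng = random.Random(seed)
    d = len(vecs[0])
    u = random_unit_vector(d, rng)
    half_gap = gap_const / (2 * math.sqrt(d))   # assumed thresholds, see lead-in
    proj = [dot(v, u) for v in vecs]
    s_idx = [i for i, p in enumerate(proj) if p >= half_gap]
    t_idx = [i for i, p in enumerate(proj) if p <= -half_gap]
    pair = find_close_pair(s_idx, t_idx, vecs, delta)
    while pair is not None:
        # Each deleted pair uses one point from each side, so the pairs form a matching.
        s_idx.remove(pair[0])
        t_idx.remove(pair[1])
        pair = find_close_pair(s_idx, t_idx, vecs, delta)
    return s_idx, t_idx

# Hypothetical instance: the hypercube {-1, 1}^4 scaled to unit vectors,
# which is a unit l_2^2 space by the hypercube example in the slides.
k = 4
points = [tuple(x / math.sqrt(k) for x in p)
          for p in itertools.product((-1.0, 1.0), repeat=k)]
S, T = separated_sets(points, delta=1.5)
print(len(S), len(T))
```

By construction the returned sets are always Δ-separated; what the sketch cannot show, and what the Structure Theorem proves, is that for $\Delta = \Omega(1/\sqrt{\log n})$ the sets still have size $\Omega(n)$ with constant probability.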