TOWARDS AN SDP-BASED APPROACH TO SPECTRAL METHODS: A NEARLY-LINEAR-TIME ALGORITHM FOR GRAPH PARTITIONING AND DECOMPOSITION

Lorenzo Orecchia, UC Berkeley
Nisheeth K. Vishnoi, MSR Bangalore
Speaker: Lorenzo Orecchia
SODA 2011

GRAPH PARTITIONING

Undirected weighted graph $G = (V, E, w)$ with $|V| = n$ and $|E| = m$.

Conductance of $S \subseteq V$:
$$\phi(S) = \frac{w(S, \bar{S})}{\min\{\mathrm{vol}(S), \mathrm{vol}(\bar{S})\}}$$

Example (figure): $\mathrm{vol}(S) = w(S, V) = 8$, $w(S, \bar{S}) = 4$, and $\phi(S) = \tfrac{1}{2}$.

DECISION PROBLEM: Does $G$ have a $c$-balanced cut of conductance $< \gamma$?
A cut $S$ is $c$-balanced if $\tfrac{1}{2} > \frac{\mathrm{vol}(S)}{\mathrm{vol}(V)} > c$.
The problem is NP-HARD.

APPROXIMATION ALGORITHMS

Algorithm | Method | Distinguishes $\ge \gamma$ and | Running time
--- | --- | --- | ---
Recursive Eigenvector | Spectral | $O(\sqrt{\gamma})$ | $\tilde{O}(n^2)$
[Leighton, Rao '88] | Flow | $O(\gamma \log n)$ | $\tilde{O}(n^{3/2})$ [AK'07, OSVV'08]
[Arora, Rao, Vazirani '04] | SDP (Flow + Spectral) | $O(\gamma \sqrt{\log n})$ | $\tilde{O}(n^{3/2})$ [Sherman '09]

GOAL: NEARLY-LINEAR TIME ALGORITHMS
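The conductance definition from the GRAPH PARTITIONING slide can be checked numerically. The following is a minimal sketch, not code from the talk: the function names and the 4-cycle example are hypothetical, chosen so that the numbers match the figure values ($\mathrm{vol}(S) = 8$, $w(S, \bar{S}) = 4$, $\phi(S) = 1/2$). The graph is assumed to be given as a dictionary from undirected edges to weights, with no self-loops.

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def volume(weights: Dict[Edge, float], S: Set[Hashable]) -> float:
    """vol(S) = w(S, V): sum of the weighted degrees of the vertices in S."""
    return sum(w * ((u in S) + (v in S)) for (u, v), w in weights.items())


def cut_weight(weights: Dict[Edge, float], S: Set[Hashable]) -> float:
    """w(S, S-bar): total weight of edges with exactly one endpoint in S."""
    return sum(w for (u, v), w in weights.items() if (u in S) != (v in S))


def conductance(weights: Dict[Edge, float], vertices: Set[Hashable],
                S: Set[Hashable]) -> float:
    """phi(S) = w(S, S-bar) / min(vol(S), vol(S-bar))."""
    smaller_side = min(volume(weights, S), volume(weights, vertices - S))
    if smaller_side == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut_weight(weights, S) / smaller_side


# Hypothetical example: a 4-cycle a-b-d-c-a in which every edge has weight 2.
weights = {("a", "b"): 2.0, ("b", "d"): 2.0, ("d", "c"): 2.0, ("c", "a"): 2.0}
vertices = {"a", "b", "c", "d"}
S = {"a", "b"}
print(volume(weights, S), cut_weight(weights, S), conductance(weights, vertices, S))
```

Each call scans every edge once, so the sketch favors readability over speed.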
FOCUS ON SPECTRAL METHODS

SPECTRAL ALGORITHMS

Algorithm | Method | Distinguishes $\ge \gamma$ and | Time
--- | --- | --- | ---
Recursive Eigenvector | Eigenvector | $O(\sqrt{\gamma})$ | $\tilde{O}(n^2)$
[Spielman, Teng '04] | Local Random Walks | $O\left(\sqrt{\gamma \log^3 n}\right)$ | $\tilde{O}\left(\frac{m}{\gamma^2}\right)$
[Andersen, Chung, Lang '07] | Local Random Walks | $O\left(\sqrt{\gamma \log n}\right)$ | $\tilde{O}\left(\frac{m}{\gamma}\right)$
[Andersen, Peres '09] | Evolving Sets | $O\left(\sqrt{\gamma \log n}\right)$ | $\tilde{O}\left(\frac{m}{\sqrt{\gamma}}\right)$
[Orecchia, Vishnoi '11] | SDP-based | $O(\sqrt{\gamma})$ | $\tilde{O}\left(\frac{m}{\gamma}\right)$

UNIQUE GAMES IMPLICATIONS

OUR RESULT

[Orecchia, Vishnoi '11], SDP-based: distinguishes $\ge \gamma$ and $O(\sqrt{\gamma})$ in time $\tilde{O}\left(\frac{m}{\gamma}\right)$.

• First spectral nearly-linear time algorithm that matches the optimal approximation guarantee.
• Outputs certificates of a special form.
  o Allows application to constructing graph decompositions.
• Uses an SDP formulation to obtain a fast spectral algorithm.
  o Arora-Kale framework.
• The algorithm has a natural random walk interpretation.
GRAPH DECOMPOSITION

$(\alpha, \epsilon)$-decomposition: partition $V$ into subsets $C_1, C_2, C_3, \ldots, C_t$ (figure: clusters $C_1, C_2, \ldots, C_t$) such that

• WELL-CONNECTED CLUSTERS: $\forall i,\ \phi(G|_{C_i}) \ge \alpha$
• FEW INTERCLUSTER EDGES: $\sum_{i<j} w(C_i, C_j) \le \epsilon \cdot \mathrm{vol}(V)$

Applications:
– Clustering [Kannan, Vempala, Vetta '00]
– Sparsification [Spielman, Teng '04]
– Preconditioning [Spielman, Teng '04], [Koutis, Miller '08]
– Fast Graph Algorithms and Heuristics

NB: must be computed in nearly-linear time.
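To make the two conditions of an $(\alpha, \epsilon)$-decomposition concrete, here is a small checker. It is only a sketch under stated assumptions: the first condition is read as "the subgraph induced on each cluster has conductance at least $\alpha$", and that conductance is found by brute force over all cuts, which is exponential and only usable on tiny clusters (nothing like the nearly-linear time the slides require). All function names and the two-triangle example are hypothetical.

```python
from itertools import combinations


def induced_volume_and_cut(weights, cluster, S):
    """Volume of S, volume of cluster - S, and cut weight, all inside G|cluster."""
    vol_S = vol_rest = cut = 0.0
    for (u, v), w in weights.items():
        if u not in cluster or v not in cluster:
            continue  # edge leaves the cluster, so it is not in the induced subgraph
        in_u, in_v = u in S, v in S
        vol_S += w * (in_u + in_v)
        vol_rest += w * ((not in_u) + (not in_v))
        if in_u != in_v:
            cut += w
    return vol_S, vol_rest, cut


def min_conductance_bruteforce(weights, cluster):
    """Smallest conductance of any cut of the subgraph induced on `cluster`."""
    nodes = list(cluster)
    best = float("inf")  # a single-vertex cluster has no cut to check
    # Conductance is symmetric in S and its complement, so half the sizes suffice.
    for k in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, k):
            vol_S, vol_rest, cut = induced_volume_and_cut(weights, cluster, set(subset))
            smaller = min(vol_S, vol_rest)
            # A zero-volume side means an isolated part: treat it as conductance 0.
            best = min(best, cut / smaller if smaller > 0 else 0.0)
    return best


def intercluster_fraction(weights, clusters):
    """Weight of edges between different clusters, divided by vol(V)."""
    label = {u: i for i, C in enumerate(clusters) for u in C}
    crossing = sum(w for (u, v), w in weights.items() if label[u] != label[v])
    return crossing / (2.0 * sum(weights.values()))


def is_decomposition(weights, clusters, alpha, eps):
    well_connected = all(min_conductance_bruteforce(weights, C) >= alpha for C in clusters)
    return well_connected and intercluster_fraction(weights, clusters) <= eps


# Hypothetical example: two unit-weight triangles joined by the edge (2, 3).
triangles = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
             (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0, (2, 3): 1.0}
print(is_decomposition(triangles, [{0, 1, 2}, {3, 4, 5}], alpha=0.5, eps=0.1))
```

In this example each triangle has induced conductance 1 and the single crossing edge accounts for 1/14 of the total volume, so the two-cluster partition passes for $\alpha = 0.5$, $\epsilon = 0.1$.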
SPECIAL CERTIFICATE ENABLES CONSTRUCTION OF GRAPH DECOMPOSITIONS

DECOMPOSITION AND PARTITIONING

PARTITIONING ALGORITHM
Partition($G$, $\gamma$, $c$) either outputs
o a $c$-balanced cut $T$ with $\phi(T) \le f(\gamma)$; or
o a certificate that for all $c$-balanced $S \subset V$, $\phi(S) \ge \gamma$.

CERTIFICATE: a set $U$ with $\mathrm{vol}(U) \le c/2$ ($U$ UNBALANCED) such that, for all $S$ with $\mathrm{vol}(S) \le \mathrm{vol}(V)/2$ and $\phi(S) \le \gamma$,
$$\mathrm{vol}(S \cap U) \ge \tfrac{1}{2}\,\mathrm{vol}(S).$$

SPECIAL CERTIFICATE: the set $U$ additionally satisfies $\phi(U) \le f(\gamma)$. $U$ is a SPARSE UNBALANCED CUT CORRELATED WITH ALL SPARSE UNBALANCED CUTS.

DECOMPOSITION
Nearly-linear time Partition($G$, $\gamma$, $c$), APPLIED RECURSIVELY:
Can construct an $(\alpha, f(\alpha) \log^2 n)$-decomposition in
nearly-linear time [Spielman, Teng '04].

OUR ALGORITHM

BALCUT($G$, $\gamma$, $c$) runs in time $\tilde{O}\left(\frac{m}{\gamma}\right)$ and outputs either
o a $c$-balanced cut $T$ with $\phi(T) \le O(\sqrt{\gamma})$; or
o a special certificate that for all $c$-balanced $S \subset V$, $\phi(S) \ge \gamma$.

THE ALGORITHM

For each $t$, keep a graph $G_t$ and a rate $\eta_t$. $G_0 = G$ and $\eta_0 = 1$.
(Figure legend: vertices carry $+1$ or $-1$ charge.)

At iteration $t$, repeat $O(\log n)$ times:
• Consider $G_t$.
• Pick a random assignment of $\pm 1$ charge to the vertices.
• Mix the charge along the edges of $G_t$ using the heat kernel with rate $\eta_t$.
• Sort the final distribution by charge.
• Check all $c$-balanced sweep cuts $S_1, \ldots, S_k$ for $\phi(S_i) \le O(\sqrt{\gamma})$ in $G$.

What if no sparse balanced cut is found?
Consider the $O(\log n)$-dimensional vector embedding of the vertices given by the final distributions.
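The inner loop above (random charges, heat-kernel mixing, sweep cuts) can be sketched in plain Python. This is an illustration under assumptions that the slides do not fix: the heat kernel is taken to be $\exp(-\eta (I - W))$ with $W = A D^{-1}$ and is approximated by a truncated power series, vertices are sorted by charge per unit of degree, and "$c$-balanced" is taken to mean that the smaller side holds at least a $c$ fraction of the total volume. The helper names and the two-clique test graph are hypothetical.

```python
import math
import random


def degrees(adj):
    return {u: sum(adj[u].values()) for u in adj}


def heat_kernel_mix(adj, charge, eta, terms=40):
    """Approximate exp(-eta * (I - W)) applied to `charge`, where W = A D^{-1}.

    Uses exp(-eta (I - W)) = e^{-eta} * sum_k (eta^k / k!) W^k, i.e. a random
    walk run for a Poisson(eta) number of steps. Truncating at `terms` terms
    is accurate only for small eta.
    """
    deg = degrees(adj)
    walk = dict(charge)
    coeff = math.exp(-eta)
    mixed = {u: coeff * walk[u] for u in adj}
    for k in range(1, terms):
        walk = {u: sum(w * walk[v] / deg[v] for v, w in adj[u].items()) for u in adj}
        coeff *= eta / k
        for u in adj:
            mixed[u] += coeff * walk[u]
    return mixed


def conductance(adj, S):
    deg = degrees(adj)
    vol_S = sum(deg[u] for u in S)
    vol_rest = sum(deg.values()) - vol_S
    cut = sum(w for u in S for v, w in adj[u].items() if v not in S)
    return cut / min(vol_S, vol_rest)


def best_balanced_sweep_cut(adj, score, c):
    """Sparsest prefix of the vertices sorted by `score` that is c-balanced."""
    deg = degrees(adj)
    total = sum(deg.values())
    order = sorted(adj, key=lambda u: score[u], reverse=True)
    best_phi, best_cut = float("inf"), None
    prefix, vol = set(), 0.0
    for u in order[:-1]:  # the full vertex set is not a cut
        prefix.add(u)
        vol += deg[u]
        if min(vol, total - vol) < c * total:
            continue  # not c-balanced
        phi = conductance(adj, prefix)
        if phi < best_phi:
            best_phi, best_cut = phi, set(prefix)
    return best_phi, best_cut


def random_charge_embedding(adj, eta, reps, rng):
    """One coordinate per repetition: mixed random +-1 charge, per unit degree."""
    deg = degrees(adj)
    embedding = {u: [] for u in adj}
    for _ in range(reps):
        charge = {u: rng.choice((-1.0, 1.0)) for u in adj}
        mixed = heat_kernel_mix(adj, charge, eta)
        for u in adj:
            embedding[u].append(mixed[u] / deg[u])
    return embedding


def two_cliques():
    """Hypothetical test graph: two 4-cliques joined by the single edge (3, 4)."""
    adj = {u: {} for u in range(8)}
    for block in (range(0, 4), range(4, 8)):
        for u in block:
            for v in block:
                if u != v:
                    adj[u][v] = 1.0
    adj[3][4] = adj[4][3] = 1.0
    return adj


adj = two_cliques()
embedding = random_charge_embedding(adj, eta=1.0, reps=3, rng=random.Random(0))
first_coordinate = {u: embedding[u][0] for u in adj}
print(best_balanced_sweep_cut(adj, first_coordinate, c=0.25))
```

The sweep here recomputes the conductance of every prefix from scratch, which is quadratic in the worst case; a nearly-linear implementation would update the cut weight incrementally and would approximate the matrix exponential far more carefully than this truncated series does.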
THE ALGORITHM

CASE 1:
$$\sum_{\{i,j\} \in E} \|v_i - v_j\|^2 \ge \gamma \cdot \sum_{i \in V} \|v_i\|^2$$
Random walks have not mixed enough.
How to fix it? Increase the rate: $\eta_{t+1} = \eta_t + 1$.

CASE 2:
$$\sum_{\{i,j\} \in E} \|v_i - v_j\|^2 \le \gamma \cdot \sum_{i \in V} \|v_i\|^2$$
GOAL: find $U$ such that
• $\phi(U) \le O(\sqrt{\gamma})$;
• $U$ captures a large fraction of the variance of the embedding:
$$\sum_{i \in U} \|v_i\|^2 \ge \Omega(1) \cdot \sum_{i \in V} \|v_i\|^2.$$

SOLUTION: Check all ball cuts centered around the origin, that is, all sweep cuts of the radius vector $r_i = \|v_i\|$. Long vectors imply that the random walks got stuck in $U$.

NB: constructing a sparse unbalanced $U$ is crucial in obtaining a special certificate.

How to fix it? Given such a $U$, set
$$G_{t+1} = G_t + \frac{\gamma}{n} \sum_{i \in U} \mathrm{Star}_i.$$

If no sparse balanced cut is found in $T$ iterations, for $T = O\left(\frac{\log n}{\gamma}\right)$, we can show that $\bigcup_i U_i$ is a special certificate that $G$ has no $c$-balanced cuts of conductance less than $\gamma$.
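The Case 1 / Case 2 test and the ball-cut search can be sketched as a single decision function. This is a hedged illustration, not the authors' implementation: edge terms are weighted by $w$, the caps `max_vol_fraction` and `min_var_fraction` stand in for the unspecified "unbalanced" and $\Omega(1)$ constants, the function returns the sparsest qualifying ball cut and leaves the comparison with $O(\sqrt{\gamma})$ to the caller, and the star update of $G_t$ is not implemented. All names and the example graph are hypothetical.

```python
def squared_norm(vec):
    return sum(x * x for x in vec)


def oracle_step(adj, emb, gamma, max_vol_fraction=0.3, min_var_fraction=0.5):
    """Return ("increase_rate", None, None) in Case 1, else the sparsest ball cut.

    Assumes max_vol_fraction <= 0.5, so U is always the smaller side of its cut.
    """
    # Each undirected edge appears twice in the symmetric adjacency map, hence 0.5.
    energy = 0.5 * sum(
        w * squared_norm([a - b for a, b in zip(emb[u], emb[v])])
        for u in adj for v, w in adj[u].items()
    )
    variance = sum(squared_norm(emb[u]) for u in adj)
    if energy >= gamma * variance:
        return "increase_rate", None, None  # Case 1: walks have not mixed enough

    # Case 2: sweep over the radii r_i = ||v_i||, longest vectors first.
    deg = {u: sum(adj[u].values()) for u in adj}
    total = sum(deg.values())
    order = sorted(adj, key=lambda u: squared_norm(emb[u]), reverse=True)
    best_phi, best_U = float("inf"), None
    U, vol, captured = set(), 0.0, 0.0
    for u in order:
        U.add(u)
        vol += deg[u]
        captured += squared_norm(emb[u])
        if vol > max_vol_fraction * total:
            break  # U must stay unbalanced (small volume)
        if vol == 0 or captured < min_var_fraction * variance:
            continue  # U does not yet hold enough of the variance
        cut = sum(w for a in U for b, w in adj[a].items() if b not in U)
        phi = cut / vol
        if phi < best_phi:
            best_phi, best_U = phi, set(U)
    return "ball_cut", best_U, best_phi


def make_graph(edges):
    """Symmetric unit-weight adjacency map from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1.0
        adj.setdefault(v, {})[u] = 1.0
    return adj


# Hypothetical example: a 5-clique on 0..4 with a triangle {5, 6, 7} hanging off
# vertex 4, and a 1-dimensional embedding that is long only on the triangle.
k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
adj = make_graph(k5 + [(5, 6), (6, 7), (5, 7), (4, 5)])
emb = {u: [0.1] if u < 5 else [1.0] for u in adj}
print(oracle_step(adj, emb, gamma=0.1)[0])
print(oracle_step(adj, emb, gamma=0.5)[:2])
```

In the example the only edge with nonzero length is the bridge (4, 5), so a small $\gamma$ triggers the rate increase while a larger $\gamma$ leads to the ball cut around the three long vectors.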
Running time:
$$\underbrace{\tilde{O}(m)}_{\text{time per iteration}} \times \underbrace{O\left(\frac{\log n}{\gamma}\right)}_{\text{iterations}} = \tilde{O}\left(\frac{m}{\gamma}\right)$$

SDP FORMULATION

SHORT EDGES: $\mathbb{E}_{\{i,j\} \in E_G} \|v_i - v_j\|^2 \le \gamma$
FIXED VARIANCE: $\mathbb{E}_{\{i,j\} \in V \times V} \|v_i - v_j\|^2 = \frac{1}{2m}$
LENGTH OF STAR EDGES (SHORT RADIUS): $\forall i \in V,\ \mathbb{E}_{j \in V} \|v_i - v_j\|^2 \le \frac{1}{c} \cdot \frac{1}{2m}$

SEPARATION ORACLE

1. BROKEN FIRST CONSTRAINT: if
$$\sum_{\{i,j\} \in E} \|v_i - v_j\|^2 \ge \gamma \cdot \sum_{i \in V} \|v_i\|^2,$$
increase the rate: $\eta_{t+1} = \eta_t + 1$.
2. BROKEN STAR CONSTRAINTS: otherwise, grow a ball to find an unbalanced cut $U$ such that
o $U$ contains most of the variance of the embedding,
o $\phi(U) \le O(\sqrt{\gamma})$.
Set:
$$G_{t+1} = G_t + \frac{\gamma}{n} \sum_{i \in U} \mathrm{Star}_i$$

CERTIFICATE

• SDP dual implies $L(G_T) \ge 2\gamma L(K_n)$.
• Figure: $\phi\left(\bigcup_i U_i\right) \le \sqrt{\gamma}$ and $\phi(T) \le \gamma$.
• Any sparse unbalanced (small) cut $T$ must contain the root of many stars. $T$ is well-correlated with $\bigcup_i U_i$.

THANK YOU