Approximating the Exponential, the Lanczos Method and an Õ(m)-Time Spectral Algorithm for Balanced Separator

Lorenzo Orecchia, MIT
Sushant Sachdeva, Princeton University
Nisheeth Vishnoi, MSR India

Talk on Tuesday at 2:25 pm, Session 13A

BALANCED GRAPH PARTITIONING AND SPECTRAL METHODS

For an undirected unweighted instance graph $G=(V,E)$ with $|V|=n$ and $|E|=m$, the conductance of a cut $S \subseteq V$ is defined as

$$\phi(S) = \frac{|E(S,\bar{S})|}{\min\{\mathrm{vol}(S), \mathrm{vol}(\bar{S})\}}.$$

Deciding whether $G$ contains a $b$-balanced cut of conductance at most $\gamma$ is NP-hard. However, approximation algorithms exist:

Algorithm | Method | Distinguishes $\ge \gamma$ and | Running Time
Iterative Eigenvector | Spectral | $O(\sqrt{\gamma})$ | $\tilde{O}(n^2)$
[Leighton, Rao '88] | Flow | $O(\gamma \log n)$ | $\tilde{O}(m^{3/2})$ [AK '07, OSVV '08]
[Arora, Rao, Vazirani '04] | SDP (Flow + Spectral) | $O(\gamma \sqrt{\log n})$ | $\tilde{O}(m^{3/2})$ [Sherman '09]
[Madry '10] | SDP (Flow + Spectral) | $O(\gamma\,\mathrm{polylog}(n))$ | $\tilde{O}(m^{1+\epsilon})$

Spectral methods for finding balanced cuts perform well in many applications, and their fast running times and optimized implementations make them a popular choice among practitioners. The following are the theoretical guarantees of some spectral algorithms for this problem.

Algorithm | Method | Distinguishes $\ge \gamma$ and | Time
Iterative Eigenvector | Eigenvector | $O(\sqrt{\gamma})$ | $\tilde{O}(n^2)$
[Andersen, Peres '09] | Local Random Walks | $O(\sqrt{\gamma \log n})$ | $\tilde{O}(m/\sqrt{\gamma})$
[Orecchia, Vishnoi '10] | SDP | $O(\sqrt{\gamma})$ | $\tilde{O}(m/\gamma)$

OUR THEOREM: We give an algorithm that either outputs an $\Omega(b)$-balanced cut $S \subset V$ such that $\phi(S) \le O(\sqrt{\gamma})$, or outputs a certificate that no $b$-balanced cut of conductance $\gamma$ exists. The algorithm runs in time $O(m\,\mathrm{poly}(\log n))$.

ITERATIVE EIGENVECTOR APPROACH

At iteration $t$, with $\lambda_2$ the second Laplacian eigenvalue of the current graph $G_t$:
• If $\lambda_2 \ge \gamma$, output NO. Otherwise, sweep the eigenvector to find a cut $S_t$ such that $\phi(S_t) \le O(\sqrt{\gamma})$.
• If $\bigcup_i S_i$ (the union of the cuts found so far) is $(b/2)$-balanced, output $\bigcup_i S_i$.
• Otherwise, let $G_{t+1}$ be the graph induced by $G_t$ on $V - S_t$, with self-loops replacing the edges going to $S_t$. Recurse on $G_{t+1}$.
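The conductance definition and the eigenvector sweep described above can be sketched as follows. This is a minimal dense-matrix illustration, not the poster's implementation: the function names, the use of the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$, and the `two_cliques` example graph are assumptions introduced here.

```python
import numpy as np

def conductance(adj, in_S):
    """phi(S) = |E(S, S-bar)| / min(vol(S), vol(S-bar)) for a 0/1 adjacency matrix."""
    in_S = np.asarray(in_S, dtype=bool)
    deg = adj.sum(axis=1)
    cut = adj[in_S][:, ~in_S].sum()          # number of edges leaving S
    denom = min(deg[in_S].sum(), deg[~in_S].sum())
    return cut / denom if denom > 0 else float("inf")

def second_eigenvector(adj):
    """lambda_2 and the matching eigenvector of the normalized Laplacian (assumed variant)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(deg)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)         # eigenvalues in ascending order
    return vals[1], d_inv_sqrt * vecs[:, 1]

def sweep_cut(adj, x):
    """Best prefix cut when the vertices are sorted by their value in x."""
    order = np.argsort(x)
    n = len(x)
    best_phi, best_S = float("inf"), None
    for size in range(1, n):
        in_S = np.zeros(n, dtype=bool)
        in_S[order[:size]] = True
        phi = conductance(adj, in_S)
        if phi < best_phi:
            best_phi, best_S = phi, in_S
    return best_phi, best_S

def two_cliques(k=4):
    """Hypothetical example graph: two k-cliques joined by a single edge."""
    n = 2 * k
    adj = np.zeros((n, n))
    adj[:k, :k] = 1
    adj[k:, k:] = 1
    np.fill_diagonal(adj, 0)
    adj[k - 1, k] = adj[k, k - 1] = 1
    return adj
```

Calling `sweep_cut(adj, x)` with `x` taken from `second_eigenvector(adj)` returns the lowest-conductance prefix cut of that eigenvector. The sweep recomputes each conductance from scratch for clarity; an incremental update would be the natural choice on large graphs.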
We can find $S_t$ with $\phi(S_t) \le O(\sqrt{\gamma})$. If this low-conductance cut is unbalanced, we perform a soft removal of it by modifying the current random walk as follows:

$$P_{t+1} = \exp\Big(-\tau \Big(L + \gamma \sum_{j=1}^{t} \sum_{i \in S_j} L(\mathrm{Star}_i)\Big)\Big)$$

where $\mathrm{Star}_i$ is the star graph rooted at vertex $i$. The transition rate from $S_t$ to all vertices increases, making the process converge faster to its stationary distribution.

MIXING ANALYSIS: Using properties of the heat-kernel random walk, we show that the mixing improves significantly:

$$\Psi(P_{t+1}, V) \le \Psi(P_t, V) - \tfrac{1}{2}\Psi(P_t, S_t) \le \tfrac{3}{4}\Psi(P_t, V).$$

After $T = O(\log n)$ iterations, if no low-conductance $\Omega(b)$-balanced cut is found, the following holds:

$$\Psi(P_T, V) \le \frac{1}{\mathrm{poly}(n)}.$$

For a $b$-balanced cut $T$:

$$\phi(T) \ge \gamma - \gamma \frac{|\bigcup S_t|}{|T|} \ge \gamma - \gamma \frac{b/2}{b} \ge \frac{\gamma}{2}.$$

HOW TO COMPUTE OUR RANDOM-WALK VECTORS

GOAL: For a symmetric diagonally-dominant matrix $A$ with sparsity $m$ and a vector $u$, compute a vector $v$ such that $\|e^{-A}u - v\| \le \delta$ in time $\tilde{O}(m \cdot \mathrm{polylog}(\|A\|) \cdot \mathrm{polylog}(1/\delta))$. In our case, $A = \frac{\log n}{\gamma} L$, so that $\|A\| = \tilde{O}(1/\gamma)$ and the running time is $\tilde{O}(m)$.

ITERATIVE APPROACHES:
• Taylor Series Approximation: requires $\Theta(\|A\| \log(1/\delta))$ terms and yields a running time of $\tilde{O}(m/\gamma)$.
• Direct Lanczos Method: requires $\Theta(\sqrt{\|A\|}\,\mathrm{polylog}(1/\delta))$ iterations. The running time is $\tilde{O}(m/\sqrt{\gamma})$.

OUR APPROACH exploits the speed of the linear-system solver by using it to perform the inverse iteration. We speed up convergence by applying the Lanczos method. Running times:
- Using Spielman-Teng solver: $\tilde{O}((m + n) \cdot \log(2 + \|A\|) \cdot \mathrm{polylog}(1/\delta))$, yielding $\tilde{O}(m)$
- Using Conjugate Gradient: $\tilde{O}((m + n) \cdot \sqrt{1 + \|A\|} \cdot \log(2 + \|A\|) \cdot \mathrm{polylog}(1/\delta))$, yielding $\tilde{O}(m/\sqrt{\gamma})$

REVIEW OF LANCZOS METHOD

IDEA: Perform $k$ matrix-vector multiplications to obtain the subspace

$$R_k = \mathrm{Span}\{u, Au, A^2u, A^3u, A^4u, \dots, A^k u\}.$$

Compute an orthonormal basis $Q_k$ for $R_k$ and let $T_k$ be $A$ restricted to $R_k$:

$$T_k = Q_k^T A Q_k.$$

As $k$ grows, $T_k$ becomes a better approximation to $A$. If a function $f$ is close to a polynomial $p$ of degree $k$ on the spectrum of $A$, then $Q_k f(T_k) Q_k^T u$ is close to $f(A)u$.
ROUNDING. Three possible cases:

We take $O(\log n)$ random projections and find the best $\Omega(b)$-balanced sweep cut $S$ of the resulting vectors.

• WALK HAS MIXED: All vectors are short. The total deviation is $\Psi(P_t, V) \le \frac{1}{\mathrm{poly}(n)}$. This yields a certificate that $\lambda_2(G_t) \ge \gamma$, which we turn into a NO certificate.
• WALK HAS NOT MIXED AND BALANCED CUT IS FOUND: If $\phi(S) \le O(\sqrt{\gamma})$, we output $S$ and answer YES.
• WALK HAS NOT MIXED AND NO BALANCED CUT IS FOUND: Some vectors must be very long, i.e. the walk converges very poorly from some vertices. The unbalanced cut $S_t$ found in this case satisfies $\Psi(P_t, S_t) \ge \frac{1}{2}\Psi(P_t, V)$.

TECHNICAL COMPONENTS:
1) SDP primal-dual iterative algorithm with a simple random walk interpretation
2) Novel analysis of Lanczos methods for computing heat-kernel vectors

HOW WE UPDATE THE GRAPH AND OUR MIXING ANALYSIS

NB: It is not possible to argue that $\lambda_2(G_t)$ grows significantly at every iteration.

We show that the bound $\Psi(P_T, V) \le \frac{1}{\mathrm{poly}(n)}$ can be turned into a NO certificate for the $b$-Balanced Cut problem by lower-bounding the conductance of every balanced cut $T$.

[Figure panels: EXPANDER; Increased Transition Rates via $L(\mathrm{Star}_i)$; $\lambda_2(G_5) \ge \gamma$.]

APPLYING LANCZOS METHOD TO THE INVERSE ITERATION

A direct application of the Lanczos method does not meet our goal, but applying Lanczos to the inverse iteration yields our result.

DIRECT LANCZOS: There exists a polynomial $p$ such that

$$\sup_{x \in (0, \|A\|)} |p(x) - e^{-x}| \le \frac{1}{\mathrm{poly}(n)}$$

and $p$ has degree $\tilde{O}(\sqrt{\|A\|})$.

IMPLICATION: $\tilde{O}(\sqrt{\|A\|}) = \tilde{O}(1/\sqrt{\gamma})$ iterations are sufficient.

LOWER BOUND: This bound is tight: $\tilde{O}(\sqrt{\|A\|})$ iterations may be necessary.

IDEA: Apply the Lanczos method to $B = (I + A/k)^{-1}$, i.e. compute the subspace

$$R_k = \mathrm{Span}\{u, Bu, B^2u, B^3u, B^4u, \dots, B^k u\}.$$

NB: We can use the linear-system solvers to compute, for any vector $x$, $Bx = (I + A/k)^{-1} x$. Our algorithm relies on a linear-system solver for $A$, and its running time depends on the solver used.
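The idea of applying Lanczos to $B = (I + A/k)^{-1}$ can be illustrated with a small dense sketch. It is only a hedged illustration under stated assumptions: `np.linalg.solve` stands in for the fast linear-system solver, full reorthogonalization replaces the three-term Lanczos recurrence for numerical simplicity, and the function names are hypothetical. It uses the identity $e^{-A} = g(B)$ with $g(y) = e^{k(1 - 1/y)}$, which follows from $A = k(B^{-1} - I)$.

```python
import numpy as np

def krylov_basis(apply_op, u, dim):
    """Orthonormal basis of Span{u, Op u, ..., Op^(dim-1) u}.

    Full Gram-Schmidt reorthogonalization is used instead of the three-term
    recurrence; this is a simplification made for this sketch.
    """
    basis = [u / np.linalg.norm(u)]
    for _ in range(dim - 1):
        w = apply_op(basis[-1])
        for _pass in range(2):               # two passes for numerical stability
            for q in basis:
                w = w - (q @ w) * q
        norm = np.linalg.norm(w)
        if norm < 1e-12:                     # subspace is invariant; stop early
            break
        basis.append(w / norm)
    return np.column_stack(basis)

def exp_minus_A_times_u(A, u, k):
    """Approximate exp(-A) u from the Krylov subspace of B = (I + A/k)^{-1}."""
    n = A.shape[0]
    M = np.eye(n) + A / k

    def apply_B(x):
        # Stand-in for a fast SDD linear-system solver (assumption of this sketch).
        return np.linalg.solve(M, x)

    Q = krylov_basis(apply_B, u, k + 1)      # spans u, Bu, ..., B^k u
    BQ = np.column_stack([apply_B(Q[:, j]) for j in range(Q.shape[1])])
    T = Q.T @ BQ                             # B restricted to the subspace
    T = (T + T.T) / 2                        # remove rounding asymmetry
    theta, Y = np.linalg.eigh(T)             # Ritz values of B, in (0, 1]
    g = np.exp(k * (1.0 - 1.0 / theta))      # e^{-x} at x = k (1/theta - 1)
    return Q @ (Y @ (g * (Y.T @ (Q.T @ u))))
```

The only access to $A$ is through solves with $I + A/k$, which is the point of the inverse iteration: each Krylov step costs one linear-system solve rather than one multiplication by $A$.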
We use “spectral methods” to refer to algorithms that explore the graph by performing matrix-vector multiplications involving the graph Laplacian $L$. Such algorithms detect low-conductance cuts by exploiting the connection between the mixing of random walks in the graph and the cut structure of $G$.

Figure: Balanced cuts and unbalanced cuts, both of low conductance, in a citation graph.

Spectral methods are targeted towards finding low-conductance cuts, regardless of how balanced they are. For this reason, spectral methods may have to find and remove all unbalanced cuts of low conductance before finding a balanced cut.

The iterative eigenvector approach sets $G_1 = G$ and, for $t = 1, \dots, n$, begins each iteration with:
• Compute the slowest-mixing eigenvector of $G_t$ and the corresponding Laplacian eigenvalue $\lambda_2$.

QUADRATIC RUNNING TIME: Worst-case instance: $\Omega(n)$ iterations. Each iteration takes time $\Omega(n)$, yielding a total running time of $\Omega(n^2)$.

The algorithmic challenge is to detect and remove unbalanced cuts of low conductance quickly.

OUR RANDOM-WALK APPROACH FOR FINDING UNBALANCED CUTS

IDEA BEHIND OUR ALGORITHM: Replace the eigenvector by a multidimensional embedding of the heat-kernel random walk.

The eigenvector is unstable, as it may capture only a small cut and be oblivious to the rest of the graph. Hence, we switch to a more stable distribution over low eigenvectors, represented by the transition matrix of a random walk that has converged to have most of its norm over eigenvectors with eigenvalue at most $O(\gamma)$.

For simplicity, take $G$ to be regular. The graph Laplacian is $L = I - W$. The walk is

$$P_t = e^{-\tau L(G_t)}, \qquad \tau = \frac{\log n}{\gamma}.$$

Denote the embedding $\{v_i\}$ given by $v_i = P e_i$ with $P = P_t$.

MIXING: Define the deviation from stationary for a set $S \subseteq V$ for process $P$:

$$\Psi(P, S) = \sum_{i \in S} \left\| v_i - \vec{1}/n \right\|^2.$$

$\Psi(P, V)$ measures the total deviation from stationary, which is also the variance of the geometric embedding $\{v_i\}$.
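The heat-kernel embedding $v_i = P e_i$ and the deviation $\Psi(P, S)$ can be sketched numerically as follows. This is an illustration for small regular graphs only: it forms $e^{-\tau L}$ by a dense eigendecomposition, which is exactly the expensive step the poster avoids, and all function names are assumptions introduced here.

```python
import numpy as np

def regular_laplacian(adj):
    """L = I - W with W = adj / d, for a d-regular graph (as assumed in the text)."""
    deg = adj.sum(axis=1)
    assert np.allclose(deg, deg[0]), "this sketch assumes a regular graph"
    return np.eye(len(deg)) - adj / deg[0]

def heat_kernel(lap, tau):
    """P = exp(-tau * L) via the eigendecomposition of the symmetric matrix L."""
    lam, V = np.linalg.eigh(lap)
    return (V * np.exp(-tau * lam)) @ V.T

def deviation(P, S):
    """Psi(P, S) = sum over i in S of ||P e_i - 1/n||^2 (P e_i is column i of P)."""
    n = P.shape[0]
    cols = P[:, list(S)]
    return float(np.sum((cols - 1.0 / n) ** 2))
```

For a regular graph the stationary distribution is uniform, so `deviation(P, range(n))` shrinks as `tau` grows, and a set `S` that carries a large share of it marks the vertices from which the walk mixes poorly.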
b-BALANCED CUT PROBLEM: Given graph $G$, parameter $\gamma \in (0,1)$ and balance $b \in (0,1/2)$, does $G$ contain a $b$-balanced cut of conductance at most $\gamma$? Here $\mathrm{vol}(S)$ is the total degree in set $S$, and a cut $S$ is $b$-balanced if $\mathrm{vol}(S) \ge b \cdot \mathrm{vol}(V)$.

DETECTING AND REMOVING UNBALANCED CUTS

Unbalanced cuts of low conductance are obstacles to detecting balanced cuts.

If a polynomial $p$ of degree $k$ satisfies $|p(x) - f(x)| \le \delta$ for all $x \in \mathrm{Spectrum}(A)$, then the Lanczos approximation satisfies

$$\|Q_k f(T_k) Q_k^T u - f(A)u\| \le 2\delta \|u\|.$$

APPROXIMATION: In fact, the quality of the $k$-approximation depends on the existence of a good approximation to the exponential function by rational functions, a larger class of functions than polynomials.

THEOREM (Saff, Schönhage, Varga '75): There exists a polynomial $p$ such that

$$\sup_{x \in (0, \|A\|)} \left| \frac{p(x)}{(1 + x/k)^k} - e^{-x} \right| \le \delta$$

and $p$ has degree $\tilde{O}(\log(2 + \|A\|) \cdot \mathrm{polylog}(1/\delta))$.

STRONG ERROR GUARANTEE: the degree depends only polylogarithmically on $1/\delta$.
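A short worked derivation, offered as a sketch, of why a rational approximation of the form in the theorem above corresponds to a polynomial in $B = (I + A/k)^{-1}$. The symbols $y$ and $q$ are introduced here for illustration, and the derivation assumes that $p$ has degree at most $k$ and that $A$ is symmetric with spectrum in $[0, \|A\|]$.

```latex
% Substitute y = (1 + x/k)^{-1}, so that x = k(1/y - 1).
\[
\frac{p(x)}{(1 + x/k)^k}
  \;=\; y^k \, p\!\left(k\left(\tfrac{1}{y} - 1\right)\right)
  \;=:\; q(y).
\]
% If deg p <= k, each term y^k (1/y - 1)^j with j <= k equals y^{k-j} (1 - y)^j,
% a polynomial in y, so q is a polynomial of degree at most k.
\[
\sup_{x \in (0, \|A\|)} \left| q\!\left(\tfrac{1}{1 + x/k}\right) - e^{-x} \right| \le \delta
\quad\Longrightarrow\quad
\left\| q(B)\,u - e^{-A}u \right\| \le \delta \|u\|,
\qquad B = (I + A/k)^{-1}.
\]
% The implication applies the scalar bound to each eigenvalue of A
% (extended to the endpoints of the interval by continuity).
% Since deg q <= k, the vector q(B)u lies in Span{u, Bu, ..., B^k u}.
```

Under these assumptions a good approximation to $e^{-A}u$ already lies in the Krylov subspace of $B$ generated by $k$ solves, which is the subspace the Lanczos method searches.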