SDD models Algorithm Computations

Solving SDPs over Symmetric, Diagonally Dominant Matrices
David Phillips
joint work with Michael Lewis (William & Mary) and Rui Zhang (Smith-U. Maryland)
United States Naval Academy
November 11, 2014

David Phillips SDPs over SDDs 1/18

Symmetric, diagonally dominant matrices

A matrix $X$ is symmetric, diagonally dominant (SDD) if
$$X = X^\top \quad\text{and}\quad X_{ii} \ge \sum_{j \ne i} |X_{ij}| \ \text{ for all } i.$$

Talking points
- $X$ is positive semidefinite by Gershgorin disks.
- Enforcing $X$ to be SDD requires $n$ linear constraints.
- Graph Laplacians are SDD.
- Solving linear systems of equations is easier over SDDs (e.g., Spielman and Teng '14).

Laplacian

For $G = (N, E)$ with degree sequence $d$ and node-node adjacency matrix $A$, define
$$L(G) = \operatorname{diag}(d) - A,$$
the Laplacian of $G$.

[Figure: two graphs on nodes a, b, c, d, e, f, each with node degrees a = 2, b = 2, c = 3, d = 3, e = 2, f = 2]

Laplacian of the first graph:
$$L = \begin{pmatrix}
 2 & -1 & -1 &  0 &  0 &  0 \\
-1 &  2 & -1 &  0 &  0 &  0 \\
-1 & -1 &  3 & -1 &  0 &  0 \\
 0 &  0 & -1 &  3 & -1 & -1 \\
 0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 & -1 & -1 &  2
\end{pmatrix}$$

Laplacian of the second graph:
$$L = \begin{pmatrix}
 2 &  0 & -1 &  0 & -1 &  0 \\
 0 &  2 & -1 &  0 &  0 & -1 \\
-1 & -1 &  3 & -1 &  0 &  0 \\
 0 &  0 & -1 &  3 & -1 & -1 \\
-1 &  0 &  0 & -1 &  2 &  0 \\
 0 & -1 &  0 & -1 &  0 &  2
\end{pmatrix}$$

Def: $\lambda_2(G)$ is the second smallest eigenvalue of $L(G)$.
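The SDD definition, the Laplacian, and $\lambda_2$ can be checked numerically on the two six-node examples. The sketch below is illustrative only: the function names are hypothetical, the mapping of nodes a..f to indices 0..5 is an assumed ordering, and the edge lists are read off from the two example Laplacian matrices.

```python
import numpy as np

def is_sdd(X, tol=1e-12):
    """True if X is symmetric and each diagonal entry dominates its row."""
    X = np.asarray(X, dtype=float)
    if not np.allclose(X, X.T, atol=tol):
        return False
    off_diag = np.sum(np.abs(X), axis=1) - np.abs(np.diag(X))
    return bool(np.all(np.diag(X) >= off_diag - tol))

def laplacian(n, edges):
    """Build L(G) = diag(d) - A for an unweighted graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def lambda2(L):
    """Second smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(L)[1])

# Nodes a..f mapped to 0..5 (an assumed ordering).
a, b, c, d, e, f = range(6)
# Edge lists read off from the two example Laplacians (same degrees, different wiring).
edges_first = [(a, b), (a, c), (b, c), (c, d), (d, e), (d, f), (e, f)]
edges_second = [(a, c), (a, e), (b, c), (b, f), (c, d), (d, e), (d, f)]

L_first = laplacian(6, edges_first)
L_second = laplacian(6, edges_second)

print(is_sdd(L_first), is_sdd(L_second))   # both Laplacians are SDD
print(round(lambda2(L_first), 4), round(lambda2(L_second), 4))  # about 0.4384 and 1.0
```

Both graphs share the degree sequence (2, 2, 3, 3, 2, 2), yet their second eigenvalues differ, which is why the degree sequence alone does not determine connectivity strength.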
Theorem [Fiedler ('77)]
For any graph $G$, $\lambda_2(G) > 0$ iff $G$ is connected.

[Figure: four example graphs. Two graphs on nodes a through f have $\lambda_2(G) \approx .44$ and $\lambda_2(G) = 1.0$; two graphs on nodes a through g have $\lambda_2(G) \approx .34$ and $\lambda_2(G) \approx .60$.]

Extremal eigenvalues in SDDs

Let $\mathcal{E}$ denote a class of SDDs. We wish to find
$$\max\{\lambda_i(X) : X \in \mathcal{E}\}$$
for $i = 1$ or $i = 2$.
- $i = 2$ will correspond to the Fiedler value for the right $\mathcal{E}$.
- $i = 1$ will correspond to a similar Fiedler-like value for a different kind of $\mathcal{E}$.

This leads to the following SDP:
$$\max\ \rho \quad \text{s.t.} \quad X \succeq \rho B,\quad X \in \mathcal{E},$$
where $B \succeq 0$.
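To see how the constraint $X \succeq \rho B$ encodes an eigenvalue, it can help to fix $X$ and search for the largest feasible $\rho$. The sketch below does this by bisection; it is an illustration, not the talk's algorithm. The function names, the 3-node path graph, and the two choices of $B$ (the identity, and the centering matrix $I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$) are assumptions chosen for the example.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M)[0] >= -tol)

def max_rho_fixed_X(X, B, hi, iters=60):
    """Bisect for the largest rho in [0, hi] with X - rho*B PSD, for a fixed X.

    Assumes X and B are PSD, so rho = 0 is feasible and feasibility is
    monotone in rho; hi should be an infeasible upper guess.
    """
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_psd(X - mid * B):
            lo = mid
        else:
            hi = mid
    return lo

# Laplacian of the path graph on 3 nodes; its eigenvalues are 0, 1, 3.
L_path = np.array([[1.0, -1.0, 0.0],
                   [-1.0, 2.0, -1.0],
                   [0.0, -1.0, 1.0]])
n = L_path.shape[0]

# With the centering matrix as B, the best rho matches lambda_2 of the Laplacian.
B_center = np.eye(n) - np.ones((n, n)) / n
rho_fiedler = max_rho_fixed_X(L_path, B_center, hi=10.0)

# With B = I, the best rho matches the smallest eigenvalue of X.
X_shifted = L_path + 0.5 * np.eye(n)
rho_first = max_rho_fixed_X(X_shifted, np.eye(n), hi=10.0)

print(round(rho_fiedler, 6), round(rho_first, 6))
```

In the SDP itself $X$ is also a variable, so this only illustrates the inner relationship between $\rho$, $B$, and the eigenvalues of a given $X$.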
Degree-constrained network design

$$\max\ \rho \quad \text{s.t.} \quad X \succeq \rho B,\quad X \in \mathcal{E}.$$

Given a positive vector $b$ and a base graph $G$, find the edge-weighted subgraph $H$ of $G$ with node degrees given by $b$ and maximized Fiedler value.
- $X_{ii} = b_i$;
- $X_{ii} + \sum_{j \ne i} X_{ij} = 0$;
- $X_{ij} = 0$ for $ij \notin G$, $-1 \le X_{ij} \le 0$ for $ij \in G$; and
- Set $B = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$.

Absolute algebraic connectivity

$$\max\ \rho \quad \text{s.t.} \quad X \succeq \rho B,\quad X \in \mathcal{E}.$$

Given a positive value $c$ and a base graph $G = (V, A)$, add weighted edges to $G$ up to a budget of $c$ so as to maximize the Fiedler value.
- $X_{ii} + \sum_{j \ne i} X_{ij} = 0$;
- Add the constraint $\sum_i X_{ii} \le |A| + c$ to $\mathcal{E}$;
- $X_{ij} = -1$ for $ij \in G$, $-1 \le X_{ij} \le 0$ for $ij \notin G$; and
- Set $B = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$.

Nanoporous materials

$$\max\ \rho \quad \text{s.t.} \quad X \succeq \rho B,\quad X \in \mathcal{E}.$$

Given a positive vector $b$, a positive value $c$, and a base graph $G$, find the edge weights for $G$, up to a budget of $c$, that maximize the first eigenvalue of $L(G) + \operatorname{diag}(b)$.
- Add the constraint $-\sum_{i > j} X_{ij} \le c$ to $\mathcal{E}$;
- $X_{ii} = b_i - \sum_{j \ne i} X_{ij}$;
- $X_{ij} = 0$ for $ij \notin G$, $-1 \le X_{ij} \le 0$ for $ij \in G$; and
- Set $B = I$.
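As a concrete instance of one of these classes, the nanoporous constraints can be written as a membership check. This is a minimal sketch under stated assumptions: the function names, the 3-node path graph, and the values of the weights, `b`, and `c` are all made up for illustration. Since each row satisfies $X_{ii} - \sum_{j \ne i} |X_{ij}| = b_i$, Gershgorin disks give $\lambda_1(X) \ge \min_i b_i$, which the example also checks.

```python
import numpy as np

def nanoporous_matrix(weights, b):
    """Build X = L(G, w) + diag(b) from a dict of edge weights w_ij in [0, 1]."""
    X = np.diag(np.asarray(b, dtype=float))
    for (i, j), w in weights.items():
        X[i, j] -= w
        X[j, i] -= w
        X[i, i] += w
        X[j, j] += w
    return X

def in_nanoporous_class(X, edges, b, c, tol=1e-9):
    """Check symmetry, edge bounds, the row identities, and the budget."""
    n = len(b)
    allowed = {frozenset(e) for e in edges}
    if not np.allclose(X, X.T, atol=tol):
        return False
    spent = 0.0
    for i in range(n):
        for j in range(i):
            x = X[i, j]
            if frozenset((i, j)) in allowed:
                if not (-1.0 - tol <= x <= tol):
                    return False
            elif abs(x) > tol:
                return False
            spent -= x  # the budget constraint sums -X_ij over i > j
        off_diag_sum = X[i].sum() - X[i, i]
        if abs(X[i, i] - (b[i] - off_diag_sum)) > tol:
            return False
    return bool(spent <= c + tol)

# Hypothetical instance: a path on 3 nodes with weights 0.5 and 1.0.
edges = [(0, 1), (1, 2)]
b = [1.0, 2.0, 1.0]
X = nanoporous_matrix({(0, 1): 0.5, (1, 2): 1.0}, b)

print(in_nanoporous_class(X, edges, b, c=2.0))   # total weight 1.5 fits a budget of 2
print(in_nanoporous_class(X, edges, b, c=1.0))   # but not a budget of 1
print(np.linalg.eigvalsh(X)[0] >= min(b))        # Gershgorin lower bound on lambda_1
```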
SDP is easy! Just use barrier!

- In theory: $\tilde{O}(n^{6.5})$ for our SDP
- In practice (8 GB RAM, 1.7 GHz, dual 2-core i7):

    n      time
    100    > 7 min
    200    > 6 h
    300    Crash!

So use first-order methods:
- They avoid Cholesky decomposition by only using gradient information
- They generally require knowledge of problem structure

Penalizing semidefiniteness

$$\max\ \rho \quad \text{s.t.} \quad X \succeq \rho B,\quad X \in \mathcal{E}.$$

At each iteration:
- Take a candidate solution $X$ and calculate a penalty matrix $Y$ on $X \succeq \rho B$.
- Solve a linear optimization problem minimizing $Y \bullet X := \operatorname{Tr}(Y^\top X)$ over $\mathcal{E}$.
- Update the solution with a partial step in the resulting direction.

Potential penalty function iteration

1. If the current iterate $X$ is "good" then terminate
2. Else let $Y = B^{-1/2} \exp(-\alpha B^{-1/2} X B^{-1/2}) B^{-1/2}$
3. Solve for $\hat{X} = \arg\min\{Y \bullet Z : Z \in \mathcal{E}\}$
4. Update $X = \sigma \hat{X} + (1 - \sigma) X$

- "Good" means that relaxed duality conditions hold.
- Refs: Plotkin, Shmoys, and Tardos (PST95), Grigoriadis and Khachiyan (GK94), Bienstock (B02). For SDP see also Klein & Lu ('96), Iyengar, P., and Stein ('11), Arora & Kale ('07).
- Relative $\epsilon$-optimal solution in $\tilde{O}(\epsilon^{-2}(\tau + \kappa))$, where $\tau$ is the time to compute $\exp(-\alpha M)$ and $\kappa$ is the time to solve linear optimization over $\mathcal{E}$.
- Computing $Y = \exp(-\alpha(B^{-1/2} X B^{-1/2}))$ exactly requires $\tilde{O}(n^3)$; approximation results in $\tilde{O}(\epsilon^{-1/2} nm)$, where $m$ is the number of nonzeros.
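Steps 2 and 4 of the iteration can be sketched for the special case $B = I$, where the penalty matrix reduces to $Y = \exp(-\alpha X)$. This is an illustrative sketch only: the function names and the values of `alpha` and `sigma` are assumptions, the exponential is computed exactly through an eigendecomposition (the $\tilde{O}(n^3)$ route), and the linear optimization oracle of step 3 and the "good" test of step 1 are left out.

```python
import numpy as np

def penalty_matrix(X, alpha):
    """Y = exp(-alpha * X) for symmetric X, via an eigendecomposition (B = I case)."""
    lam, U = np.linalg.eigh(X)
    weights = np.exp(-alpha * lam)
    # Scale each eigenvector column by its weight, then recombine.
    return (U * weights) @ U.T

def partial_step(X, X_hat, sigma):
    """Convex-combination update X <- sigma * X_hat + (1 - sigma) * X."""
    return sigma * X_hat + (1.0 - sigma) * X

# A small symmetric example with eigenvalues 1 and 3.
X = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
Y = penalty_matrix(X, alpha=5.0)

# Y concentrates on the eigenvector of the smallest eigenvalue, so the
# normalized inner product Y . X / Tr(Y) sits just above lambda_min(X) = 1.
soft_min = float(np.sum(Y * X) / np.trace(Y))
print(soft_min)

# One partial step from X toward a hypothetical oracle answer X_hat.
X_hat = np.eye(2)
X_next = partial_step(X, X_hat, sigma=0.25)
print(X_next)
```

The weight $e^{-\alpha \lambda_i}$ is largest on the smallest eigenvalues, which is what makes $Y$ act as a penalty on the directions where $X \succeq \rho B$ is closest to failing.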
Key features of our algorithm

- We penalize $X \not\succeq \rho B$, in contrast to previous methods:
  - Previous first-order methods for SDP require "simple" linear constraints in the easy set, and each iteration generates a psd solution;
  - We put all linear constraints in the easy set, and each iteration generates a solution that may or may not satisfy the psd constraint.
- The resulting method requires some linear algebraic massaging of the previous analyses.
- Using an inner iteration similar to PST95 yields an absolute $\epsilon$-approximation; a bisection search method of B02 then yields a relative $\epsilon$-approximation.

Linear optimization over $\mathcal{E}$

- For our applications, $\kappa = \tilde{O}(m^{3.5})$ if we use interior point, where $m$ is the number of edges in our graphs.
- For the network design problem, the underlying problem is a b-matching problem and can be solved in $\tilde{O}(n^3)$.
- But, in practice, $\kappa$ is not the bottleneck.

Theorem
The potential algorithm can find a relative $\epsilon$-optimal solution in $\tilde{O}(\epsilon^{-2} m^{3.5})$ time.
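One way to read the theorem's bound is to combine the per-iteration costs quoted earlier. The derivation below is a sketch under two assumptions that are ours rather than the talk's: the matrix exponential is computed exactly, so $\tau = \tilde{O}(n^3)$, and the graph is connected, so $m \ge n - 1$.

```latex
\begin{align*}
\tilde{O}\!\left(\epsilon^{-2}(\tau + \kappa)\right)
  &= \tilde{O}\!\left(\epsilon^{-2}\left(n^{3} + m^{3.5}\right)\right)
     && \text{using } \tau = \tilde{O}(n^{3}),\ \kappa = \tilde{O}(m^{3.5}) \\
  &= \tilde{O}\!\left(\epsilon^{-2}\left((m+1)^{3} + m^{3.5}\right)\right)
     && \text{using } n \le m + 1 \\
  &= \tilde{O}\!\left(\epsilon^{-2} m^{3.5}\right)
     && \text{since } (m+1)^{3} \le 8m^{3} \le 8m^{3.5} \text{ for } m \ge 1 .
\end{align*}
```

Under these assumptions the interior point bound on $\kappa$ dominates the worst-case estimate, even though $\kappa$ is reported not to be the bottleneck in practice.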
Some computational questions:
- How does the graph topology affect runtime?
- How does the algorithm scale?
- Approximate versus exact matrix exponential?

Data

Dense and sparse instances use degree sequences from rudy, a graph generator of Rinaldi.
- 10 instances each for n = 50, 100, 200, 400, 600, 800, 1000
- Dense data: number of edges was $O(pn^2)$ for p = .05, .1, .15
- Sparse data: number of edges was $O(kn)$ for k = 5, 10, 15
- A line-tree was added (i.e., degree of each node increased by 2 or 1) to ensure connectivity
- Mosek and MATLAB; two 1.7 GHz Intel Core i7 dual-core processors; 8 GB RAM
- $\epsilon = 0.001$

Results

[Figure: number of nodes (n) versus runtime (CPU s); series: Dense, Sparse, and Dense with approx. MatExp]

[Figure: log scale of n versus runtime; series: Dense and best-fit line]
Slope of best fit: $\approx 2.6 \Rightarrow O(n^3)$.

Conclusions & future work

- Method can solve large SDPs (n = 1000 means $10^6$ variables)
  - But how much larger can we solve?
- Sparse and approx. mat. exp. are faster, but not much faster...
  - How do we leverage the sparsity?
  - How to leverage SDD structure to help us?
- Practical and theoretical runtimes encouraging but...
  - Still quite technical to use: how do we simplify the interface?
- Eigenvalue opt. over SDDs includes some interesting problems but...
  - Other graph Laplacian problems?
  - SDDs that aren't graph Laplacians?

Thanks!