SDD models Algorithm Computations
Solving SDPs over Symmetric, Diagonally
Dominant Matrices
David Phillips
joint work with
Michael Lewis (William & Mary)
Rui Zhang (Smith-U. Maryland)
United States Naval Academy
November 11, 2014
David Phillips
SDPs over SDDs 1/18
Symmetric, diagonally dominant matrices
A matrix X is symmetric, diagonally dominant (SDD) if:
$X = X^\top$ and $X_{ii} \ge \sum_{j \ne i} |X_{ij}|$ for all $i$.
Talking points
X is positive semidefinite by Gershgorin disks.
Enforcing X to be SDD requires n linear constraints
Graph Laplacians are SDD.
Solving linear systems of equations is easier over SDDs
(e.g., Spielman and Teng ‘14).
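The definition and the Gershgorin argument can be checked numerically. The sketch below is a minimal illustration in plain Python; the function names and the 3x3 matrix are hypothetical examples, not taken from the talk.

```python
def is_sdd(X, tol=1e-12):
    """Return True if X (a list of rows) is symmetric and diagonally dominant."""
    n = len(X)
    for i in range(n):
        for j in range(n):
            if abs(X[i][j] - X[j][i]) > tol:
                return False  # not symmetric
        off_diagonal = sum(abs(X[i][j]) for j in range(n) if j != i)
        if X[i][i] < off_diagonal - tol:
            return False  # row i violates X_ii >= sum_{j != i} |X_ij|
    return True


def gershgorin_lower_bound(X):
    """Smallest left endpoint of the Gershgorin disks of X.

    Every eigenvalue of a symmetric X is at least this value, so a
    nonnegative bound certifies positive semidefiniteness.
    """
    n = len(X)
    return min(X[i][i] - sum(abs(X[i][j]) for j in range(n) if j != i)
               for i in range(n))


# Hypothetical 3x3 example (not from the talk).
X = [[2.0, -1.0, 0.5],
     [-1.0, 3.0, -2.0],
     [0.5, -2.0, 2.5]]
print(is_sdd(X))
print(gershgorin_lower_bound(X))
```

For an SDD matrix the bound is nonnegative by definition, which is exactly why SDD implies positive semidefinite.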
Laplacian
For G = (N, E) with degree sequence d and node-node adjacency
matrix A,
L(G) = diag(d) − A
is the Laplacian of G.
[Figure: example graph on nodes a, b, c, d, e, f with degrees 2, 2, 3, 3, 2, 2.]
$$
L = \begin{pmatrix}
 2 &  0 & -1 &  0 & -1 &  0 \\
 0 &  2 & -1 &  0 &  0 & -1 \\
-1 & -1 &  3 & -1 &  0 &  0 \\
 0 &  0 & -1 &  3 & -1 & -1 \\
-1 &  0 &  0 & -1 &  2 &  0 \\
 0 & -1 &  0 & -1 &  0 &  2
\end{pmatrix}
$$
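As a sketch of L(G) = diag(d) − A, the following builds a Laplacian from an edge list. The node ordering a, b, c, d, e, f and the edge list are assumptions read off the off-diagonal −1 entries of the 6x6 example matrix; they are not stated explicitly in the talk.

```python
def laplacian(nodes, edges):
    """Return L(G) = diag(d) - A as a list of rows for an unweighted graph."""
    index = {v: k for k, v in enumerate(nodes)}
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1
    d = [sum(row) for row in A]  # degree sequence
    return [[(d[i] if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]


# Assumed example: node order and edges inferred from the 6x6 matrix.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "c"), ("a", "e"), ("b", "c"), ("b", "f"),
         ("c", "d"), ("d", "e"), ("d", "f")]
L = laplacian(nodes, edges)
for row in L:
    print(row)
```

Each row sums to zero and the diagonal equals the sum of the absolute off-diagonal entries, so any graph Laplacian built this way is SDD.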
Def: λ2(G) is the second smallest eigenvalue of L(G).
Theorem [Fiedler (‘77)]
For any graph G, λ2(G) > 0 iff G is connected.
[Figures: four example graphs on nodes a to f, some with an additional node g, with λ2(G) ≈ .44, λ2(G) = 1.0, λ2(G) ≈ .34, and λ2(G) ≈ .60.]
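A minimal numerical sketch of Fiedler's theorem using NumPy. The two graphs are hypothetical examples rather than the graphs drawn in the talk: a 6-cycle (connected) and two disjoint triangles (disconnected).

```python
import numpy as np


def laplacian(n, edges):
    """Dense Laplacian diag(d) - A of an unweighted graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return np.diag(A.sum(axis=1)) - A


def fiedler_value(L):
    """Second smallest eigenvalue of a symmetric matrix L."""
    return np.linalg.eigvalsh(L)[1]  # eigvalsh returns ascending eigenvalues


cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]      # connected
triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]  # two components

lam2_connected = fiedler_value(laplacian(6, cycle))
lam2_disconnected = fiedler_value(laplacian(6, triangles))
print(lam2_connected, lam2_disconnected)
```

The connected cycle has a strictly positive second eigenvalue, while the disconnected graph has a zero eigenvalue for each component, so its second eigenvalue is zero up to rounding.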
Extremal eigenvalues in SDDs
Let E denote a class of SDDs.
We wish to find $\max\{\lambda_i(X) : X \in E\}$ for i = 1 or i = 2, where $\lambda_i$ denotes the i-th smallest eigenvalue.
i = 2 will correspond to the Fiedler value for the right E.
i = 1 will correspond to a similar Fiedler-like value for a
different kind of E.
Leads to the following SDP:
$$
\begin{aligned}
\max\ & \rho \\
\text{s.t.}\ & X \succeq \rho B \\
& X \in E,
\end{aligned}
$$
where $B \succeq 0$.
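A short derivation of why this SDP captures the extremal eigenvalues. It assumes eigenvalues are sorted in increasing order and, for the i = 2 case, that B is the projection $I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ and that every $X \in E$ is positive semidefinite with $X\mathbf{1} = 0$ (as a graph Laplacian is). These assumptions are an interpretation of the slides, not something they state in this form.

```latex
% Case i = 1, with B = I:
X \succeq \rho I \iff \lambda_1(X) \ge \rho,
\qquad\text{so}\qquad
\max\{\rho : X \succeq \rho I\} = \lambda_1(X).

% Case i = 2, with B = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top and X\mathbf{1} = 0:
(X - \rho B)\mathbf{1} = 0,
\qquad
v^\top (X - \rho B)\, v = v^\top X v - \rho\, v^\top v
\quad\text{for all } v \perp \mathbf{1}.

% Hence only directions orthogonal to \mathbf{1} matter:
X \succeq \rho B
\iff \min_{v \perp \mathbf{1},\ \|v\| = 1} v^\top X v \ge \rho
\iff \lambda_2(X) \ge \rho,
\qquad\text{so}\qquad
\max\{\rho : X \succeq \rho B\} = \lambda_2(X).
```

Maximizing ρ jointly over ρ and $X \in E$ therefore maximizes $\lambda_1$ or $\lambda_2$ over the class, depending on the choice of B.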
Degree-constrained network design
Given a positive vector b and a base graph G, find the edge-weighted
subgraph H of G with node degrees given by b
and maximum Fiedler value.
$X_{ii} = b_i$;
$X_{ii} + \sum_{j \ne i} X_{ij} = 0$;
$X_{ij} = 0$ for $ij \notin G$, $-1 \le X_{ij} \le 0$ for $ij \in G$; and
Set $B = I - \frac{1}{n}$.
Absolute algebraic connectivity
Given a positive value c and a base graph G = (V, A), add
weighted edges to G up to a budget of c so as to maximize
the Fiedler value.
$X_{ii} + \sum_{j \ne i} X_{ij} = 0$;
Add $\sum_i X_{ii} \le |A| + c$ constraint to E;
$X_{ij} = 1$ for $ij \in G$, $-1 \le X_{ij} \le 0$ for $ij \notin G$; and
Set $B = I - \frac{1}{n}$.
Nanoporous materials
Given a positive vector b, a positive value c, and a base
graph G, find the edge weights for G up to a budget of c
that maximize the first eigenvalue of L(G) + diag(b).
Add $-\sum_{i > j} X_{ij} \le c$ constraint to E;
$X_{ii} = b_i - \sum_{j \ne i} X_{ij}$;
$X_{ij} = 0$ for $ij \notin G$, $-1 \le X_{ij} \le 0$ for $ij \in G$; and
Set $B = I$.
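To make the nanoporous constraints concrete, the sketch below builds one feasible X from edge weights and checks the budget. The helper names, the 3-node path, and the numbers are hypothetical. Because every off-diagonal entry is nonpositive, each row's Gershgorin slack $X_{ii} - \sum_{j \ne i} |X_{ij}|$ equals $b_i$, so the first eigenvalue is at least $\min_i b_i$.

```python
def nanoporous_matrix(n, weights, b):
    """Build X with X_ij = -w_ij on edges and X_ii = b_i - sum_{j != i} X_ij."""
    X = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        X[i][j] = X[j][i] = -w
    for i in range(n):
        X[i][i] = b[i] - sum(X[i][j] for j in range(n) if j != i)
    return X


def within_budget(X, c):
    """Check the budget constraint -sum_{i > j} X_ij <= c."""
    n = len(X)
    return -sum(X[i][j] for i in range(n) for j in range(i)) <= c


# Hypothetical data: a path 0-1-2 with edge weights in [0, 1].
weights = {(0, 1): 0.5, (1, 2): 1.0}
b = [0.2, 0.3, 0.4]
X = nanoporous_matrix(3, weights, b)
row_slack = [X[i][i] - sum(abs(X[i][j]) for j in range(3) if j != i)
             for i in range(3)]
print(within_budget(X, c=2.0))
print(row_slack)
```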
SDP is easy! Just use barrier!
In theory: $\tilde{O}(n^{6.5})$ for our SDP
In practice (8 GB RAM, 1.7 GHz, dual 2-core i7):
n      time
100    >7m
200    >6h
300    Crash!
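As a rough sanity check on these numbers, one can extrapolate from the n = 100 timing under the assumption that runtime scales exactly as $n^{6.5}$. This ignores constants, logarithmic factors, and memory effects, and the 7-minute reference is only a lower bound, so the result is indicative at best.

```python
def predicted_minutes(n, n_ref=100, minutes_ref=7.0, exponent=6.5):
    """Extrapolate runtime from a reference point assuming time ~ n**exponent."""
    return minutes_ref * (n / n_ref) ** exponent


for n in (200, 300):
    hours = predicted_minutes(n) / 60.0
    print(f"n = {n}: about {hours:.1f} hours")
```

Doubling n multiplies the estimate by $2^{6.5} \approx 90$, giving roughly 10.6 hours at n = 200, which is consistent with the observed ">6h".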
So use first-order methods:
Avoid Cholesky decomposition by using only gradient
information
Generally require knowledge of problem structure
Penalizing semidefiniteness
At each iteration:
Take a candidate solution X and calculate penalty matrix
Y on $X \succeq \rho B$.
Solve linear optimization minimizing $Y \bullet X := \mathrm{Tr}(Y^\top X)$
over E.
Update solution with a partial step in resulting direction.
Potential penalty function iteration
1 If the current iterate X is “good” then terminate
2 Else let $Y = B^{-1/2} \exp(-\alpha B^{-1/2} X B^{-1/2}) B^{-1/2}$
3 Solve for $\hat{X} = \arg\min\{Y \bullet Z : Z \in E\}$
4 Update $X = \sigma \hat{X} + (1 - \sigma) X$
“Good” means that relaxed duality conditions hold.
Refs: Plotkin, Shmoys, and Tardos (PST95), Grigoriadis
and Khachiyan (GT94), Bienstock (B02).
For SDP see also Klein & Lu (‘96), Iyengar, P., and Stein
(‘11), Arora & Kale (‘07)
Relative ε-optimal solution in $\tilde{O}(\epsilon^{-2}(\tau + \kappa))$, where τ is the time
to compute exp(−αM) and κ is the time to solve linear
optimization over E.
Exact computation of $Y = \exp(-\alpha(B^{-1/2} X B^{-1/2}))$ requires $\tilde{O}(n^3)$;
approximation results in $\tilde{O}(\epsilon^{-1/2} nm)$, where m is the
number of nonzeros.
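A rough sketch of the penalty matrix in the special case B = I (as in the nanoporous model), where Y = exp(−αX). The exact exponential is computed from a symmetric eigendecomposition, which is the cubic-time route mentioned above. The function name, the shift by the smallest eigenvalue, and the test matrix are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def penalty_matrix(X, alpha):
    """Return exp(-alpha * (X - lambda_min * I)) for symmetric X, with B = I.

    Subtracting lambda_min only rescales Y by a positive constant and
    avoids underflow, so any linear optimization using Y is unaffected.
    """
    eigenvalues, V = np.linalg.eigh(X)          # X = V diag(eigenvalues) V^T
    scaled = np.exp(-alpha * (eigenvalues - eigenvalues[0]))
    return (V * scaled) @ V.T                   # V diag(scaled) V^T


# Hypothetical 3x3 SDD matrix (a path Laplacian plus 0.5 on the diagonal).
X = np.array([[1.5, -1.0, 0.0],
              [-1.0, 2.5, -1.0],
              [0.0, -1.0, 1.5]])
Y = penalty_matrix(X, alpha=50.0)

# With large alpha, Tr(Y X) / Tr(Y) approaches the smallest eigenvalue of X.
soft_min = np.trace(Y @ X) / np.trace(Y)
print(soft_min, np.linalg.eigvalsh(X)[0])
```

The exponential concentrates Y on the eigenvectors where X has its smallest eigenvalues, which is where the semidefinite constraint is closest to being violated.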
Key features of our algorithm
We penalize $X \not\succeq \rho B$ versus previous methods
Previous first-order methods for SDP require “simple” linear
constraints in the easy set and each iteration generates a
psd solution;
We put all linear constraints in the easy set and each
iteration generates a solution that may or may not satisfy the psd
constraint.
Resulting method requires some linear algebraic massaging
of the previous analyses.
Using an inner iteration similar to PST95 yields an
absolute ε-approximation, then a bisection search method
of B02 yields a relative ε-approximation.
Linear optimization over E
For our applications $\kappa = \tilde{O}(m^{3.5})$, where m is the number
of edges in our graphs, if we use interior point.
For the network design problem, the underlying problem is
a b-matching problem and can be solved in $\tilde{O}(n^3)$.
But, in practice, κ is not the bottleneck.
Theorem
The potential algorithm can find a relative ε-optimal solution in
$\tilde{O}(\epsilon^{-2} m^{3.5})$ time.
Data Results
Some computational questions:
How does the graph topology affect runtime?
How does the algorithm scale?
Approximate versus exact matrix exponential?
Dense and sparse instances were generated using degree sequences from
rudy, a graph generator of Rinaldi.
10 instances each for n = 50, 100, 200, 400, 600, 800, 1000
Dense data
Number of edges was $O(pn^2)$ for p = .05, .1, .15
Sparse data
Number of edges was O(kn) for k = 5, 10, 15
A line-tree was added (i.e., degree of each node increased
by 2 or 1) to ensure connectivity
Mosek and MATLAB, two 1.7 GHz Intel Core i7 dual-core
processors, 8 GB RAM
ε = 0.001
[Figure: number of nodes (n) versus runtime (CPU s), for n from 100 to 1000; series: Dense, Sparse, and Dense with approx. MatExp.]
[Figure: log scale of n versus runtime for the dense instances, with best fit line.]
Slope of best fit: ≈ 2.6 ⇒ $O(n^3)$.
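The slope of a log-log best fit line estimates the exponent p in runtime ≈ C·n^p. The sketch below shows the computation with an ordinary least-squares fit. The runtimes are synthetic values generated from an exact power law, not the measured data from the experiments.

```python
import math


def loglog_slope(ns, times):
    """Least-squares slope of log(time) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic runtimes following an exact power law, NOT the measured data.
ns = [50, 100, 200, 400, 600, 800, 1000]
times = [1e-5 * n ** 2.6 for n in ns]
print(round(loglog_slope(ns, times), 3))
```

On real timings the fit is noisy, so a slope near 2.6 is read as consistent with cubic growth rather than as an exact exponent.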
Conclusions & future work
Method can solve large SDPs (n = 1000 means $10^6$
variables)
But how much larger can we solve?
Sparse and approx. mat. exp. are faster, but not much
faster...
How do we leverage the sparsity?
How to leverage SDD structure to help us?
Practical and theoretical runtimes encouraging but...
Still quite technical to use – how do we simplify the
interface?
Eigenvalue opt. over SDDs includes some interesting
problems but...
Other graph Laplacian problems?
SDDs that aren’t graph Laplacians?
Thanks!