Matrix sparsification and the sparse null space problem

Lee-Ad Gottlieb (Weizmann Institute)
Tyler Neylon (Bynomial Inc.)


Matrix sparsification

Problem definition: Given a matrix, make it as sparse as possible (minimize the number of non-zeros; we want lots of 0's) using elementary row operations.

$$\begin{pmatrix} 1 & 0 & 2 & 2 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$

Could try Gaussian elimination (6 non-zeros)…

$$\begin{pmatrix} 1 & 0 & 2 & 2 \\ 0 & 1 & -1 & -1 \end{pmatrix}$$

but can we do better? Replacing row 2 by 2·(row 2) − (row 1) leaves only 5 non-zeros:

$$\begin{pmatrix} 1 & 0 & 2 & 2 \\ 1 & 2 & 0 & 0 \end{pmatrix}$$

Applications:
- Computational speed-up for many fundamental matrix operations
- Machine learning [SS-00]
- Discovery of cycle bases of graphs [KMMP-04]


Matrix sparsification

What's known about matrix sparsification? Precious little… mostly work by McCormick and coauthors.
- It's NP-hard [M-83]
- Known results:
  - Heuristic [CM-02]
  - Algorithm under a limiting condition: [HM-84] gave an approximation algorithm for matrices that satisfy the Haar condition


Sparse null space problem

First recall the definition of null space:
- The null space of A is the set of all vectors b for which Ab = 0
- A null matrix for A spans the null space of A. Finding such a matrix is a basic function in linear algebra.

Problem definition: Given a matrix A, find an optimally sparse matrix N that is a full null matrix for A:
- N is full rank
- Columns of N span the null space of A
- N is sparse

Applications:
- Helps solve Linear Equality Problems [CP-86] (optimization via gradient descent, dual variable method for Navier-Stokes, quadratic programming)
- Efficient solution to the force method for structural analysis [GH-87]
- Finds correlations between time series, such as financial stocks [ZZNS-05]


Sparse null space problem

What's known about sparse null space?
Precious little…
- First considered in [P-84]; known to be NP-hard [CP-86]
- Known results: heuristics [BHKLPW-85, CP-86, GH-87]


Two matrix problems

It's not difficult to see that matrix sparsification and sparse null space are equivalent.

Let B be a full null matrix for A. The following statements are equivalent:
- N = BX for some invertible matrix X
- N is a full null matrix for A

Surprisingly then, these two lines of work have proceeded independently.


Our contribution

Two matrix problems:
- Have been around since the 80's
- Have many applications
- Are equivalent; from now on, we'll refer only to matrix sparsification

Can we prove something concrete about matrix sparsification?
- Hardness of approximation?
- Approximation algorithms?

We can do both…

Hardness of approximation:
- As hard as label cover (quite hard to approximate)
- Hard to approximate within factor $2^{\log^{0.5-o(1)} n}$ of optimal (with some caveats…)

Approximation algorithms:
- For the hard problem
- Under limiting assumptions


Min unsatisfy

As a first step towards proving hardness of approximation, we'll show that matrix sparsification is closely related to the min unsatisfy problem introduced in [ABSZ-97].

Problem definition: Given a linear system Ax = b, provide a vector x that minimizes the number of equations not satisfied.

$$\begin{pmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$$

What's known about min unsatisfy:
- As hard to approximate as label cover [ABSZ-97]
- Over $\mathbb{Q}$, hard to approximate within factor $2^{\log^{0.5-o(1)} n}$ of optimal, assuming NP does not admit quasi-polynomial time algorithms
- Randomized Θ(m/log m) approximation algorithm (m is the number of rows) [BK-01]


Hardness of matrix sparsification

We'll give a reduction from min unsatisfy to matrix sparsification, which will prove hardness of approximation for matrix sparsification.

Preliminary note: There exist matrices that are unsparsifiable, and these can be constructed in polynomial time.
- M = (I X), where I is the identity matrix and X contains no 0 entries:

$$M = \begin{pmatrix} 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 2 & 4 & 8 \end{pmatrix}$$

- The identity portion can always be achieved via Gaussian elimination


Hardness of matrix sparsification

Proof outline:
- Let (A, y) be an instance of min unsatisfy
- We'll create a matrix M with a few copies of A, but many copies of y
- Minimizing the number of non-zero entries in M reduces to finding a sparse linear combination of y with some vectors of A
- That is, solving the instance of min unsatisfy

Construction: Let $(I_q \; X)$ be an unsparsifiable matrix, and let $\otimes$ be the Kronecker product. We choose $q = n^2$.

$$M = \begin{pmatrix} I_q \otimes y & X \otimes y \\ 0 & I_q \otimes A \end{pmatrix}$$


Approximation algorithm

Our first result: We conclude that matrix sparsification is as hard as min unsatisfy, which itself is as hard as label cover.
- Matrix sparsification is hard to approximate within factor $2^{\log^{0.5-o(1)} n}$ of optimal

So what can be done for matrix sparsification?
- We will further show that a solution to min unsatisfy implies a similar solution for matrix sparsification.
- Hence, the randomized Θ(m/log m) approximation algorithm for min unsatisfy [BK-01] carries over to matrix sparsification.
- More importantly, we will also show how to extend a large number of heuristics and algorithms under limiting assumptions to apply to min unsatisfy, and therefore to matrix sparsification.
- In particular, we'll show that the well-known $\ell_1$ minimization heuristic applies to matrix sparsification.
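
To make the min unsatisfy objective behind the reduction concrete, here is a minimal brute-force sketch on the small 3-equation, 2-unknown system from the problem definition. It is an illustration only, not the algorithm of [BK-01] or of the paper. The function names (`solve_square`, `count_unsatisfied`, `min_unsatisfy_bruteforce`) are hypothetical, and the sketch assumes A has full column rank, so that some best x is pinned down by n independent equations. Its running time is exponential in general.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(rows, rhs):
    """Solve a square system exactly by Gauss-Jordan elimination; None if singular."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def count_unsatisfied(A, b, x):
    """Number of equations of Ax = b that x fails to satisfy."""
    return sum(1 for row, bi in zip(A, b)
               if sum(a * xi for a, xi in zip(row, x)) != bi)

def min_unsatisfy_bruteforce(A, b):
    """Try every choice of n equations, solve it exactly, keep the best x.

    Assumption (not from the slides): A has full column rank.
    """
    n = len(A[0])
    best = None
    for idx in combinations(range(len(A)), n):
        x = solve_square([A[i] for i in idx], [b[i] for i in idx])
        if x is None:
            continue  # the chosen equations are linearly dependent
        bad = count_unsatisfied(A, b, x)
        if best is None or bad < best[0]:
            best = (bad, x)
    return best

# The example system from the min unsatisfy slide.
A = [[1, 2], [0, 1], [1, 0]]
b = [1, 2, 1]
bad, x = min_unsatisfy_bruteforce(A, b)
print(bad, x)
```

On this system no x satisfies all three equations, so the sketch reports one unsatisfied equation.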

Another look at min unsatisfy

Consider the exact dictionary representation (EDR) problem, the major problem in sparse approximation theory.

Problem definition: Given a matrix D of dictionary vectors and a target vector s, find the smallest subset D' of the dictionary vectors such that some linear combination of the vectors in D' equals s.

What's known about this problem:
- A variant appeared in a paper of Schmidt in 1907 [T-03]
- NP-hard [N-95]
- Applications in signal representation [CW-92, NP-09], amplitude optimization [S-90], function approximation [N-95], and data mining [CRT-06, ZGSD-06, GGIMS-02, GMS-05]
- A large number of heuristics have been studied for this problem
- Also approximation algorithms under limiting assumptions


Another look at min unsatisfy

EDR is in fact equivalent to min unsatisfy ([AK-95] proved one direction), although this seems to have escaped the notice of the sparse approximation theory community.

We'll show how to extend the heuristics and algorithms for EDR (and therefore min unsatisfy) to matrix sparsification.


Matrix sparsification

The following greedy algorithm solves matrix sparsification. We assume the existence of a subroutine SIV(A, B), which returns the sparsest vector in the span of matrix A that is not in the span of matrix B.

Algorithm for matrix sparsification (builds matrix B one column at a time):

    B ← null
    For i = n…1:
        a = SIV(A, B)
        B ← [B a]    (append a as a new column of B)

Notes:
- The subroutine SIV can be easily implemented using a heuristic or approximation algorithm for min unsatisfy (see paper)
- The matrix sparsification algorithm above is a slight simplification (again, see paper)
- [CP-86] proved that the greedy algorithm works for matroids


Algorithms for matrix sparsification

We conclude that all algorithms for min unsatisfy (and EDR) apply to matrix sparsification as well.
- There exists a randomized Θ(m/log m) approximation algorithm for matrix sparsification.
- A large number of heuristics for EDR carry over to matrix sparsification.

Practical contribution:
- The popular $\ell_1$ minimization heuristic for EDR carries over to matrix sparsification
- This heuristic finds a vector v that satisfies Dv = s while minimizing $\|v\|_1$ instead of the number of non-zeros in v
- The heuristic is also an approximation algorithm under certain limiting assumptions [F-04]
- This heuristic for matrix sparsification has already been implemented since the public posting of our paper!


Conclusion

- Considered the matrix sparsification and sparse null space problems.
- Showed that they are very hard to approximate.
- Showed how to extend a large number of studied heuristics and algorithms to these problems.
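
As a rough illustration of the $\ell_1$ minimization heuristic for EDR described above, the sketch below minimizes $\|v\|_1$ subject to Dv = s on a toy dictionary. In practice this is posed as a linear program and handed to an LP solver. To stay dependency-free, this sketch instead enumerates basic solutions, relying on the standard fact that an LP optimum is attained at one of them, so it is only a toy-scale stand-in. The names `solve_square` and `l1_minimizer` and the example D and s are hypothetical, and the sketch assumes D has full row rank and that Dv = s is solvable.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(rows, rhs):
    """Solve a square system exactly by Gauss-Jordan elimination; None if singular."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def l1_minimizer(D, s):
    """Return (l1 norm, v) minimizing ||v||_1 subject to Dv = s.

    Assumptions (not from the slides): D has full row rank and the system
    is feasible. Enumerates every set of m columns that forms a basis,
    which is exponential in general and meant for tiny examples only.
    """
    m, k = len(D), len(D[0])
    best = None
    for cols in combinations(range(k), m):
        sub = [[D[r][c] for c in cols] for r in range(m)]
        coeffs = solve_square(sub, s)
        if coeffs is None:
            continue  # the chosen columns are linearly dependent
        v = [Fraction(0)] * k
        for c, val in zip(cols, coeffs):
            v[c] = val
        norm = sum(abs(t) for t in v)
        if best is None or norm < best[0]:
            best = (norm, v)
    return best

# Hypothetical toy dictionary: the columns of D are the dictionary vectors.
D = [[1, 0, 1],
     [0, 1, 1]]
s = [1, 1]
norm, v = l1_minimizer(D, s)
nonzeros = sum(1 for t in v if t != 0)
print(norm, v, nonzeros)
```

Here s equals the third dictionary vector, and the $\ell_1$ minimizer picks exactly that one-vector representation rather than the two-vector one. The heuristic is not guaranteed to find the sparsest representation in general, which is why the slides call it a heuristic outside the limiting assumptions of [F-04].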