Matrix Extensions to Sparse Recovery Yi Ma Allen Yang

Yi Ma (1,2), Allen Yang (3), John Wright (1)
(1) Microsoft Research Asia, (2) University of Illinois at Urbana-Champaign, (3) University of California, Berkeley
CVPR Tutorial, June 20, 2009

FINAL TOPIC – Generalizations: sparsity to degeneracy
The tools and phenomena underlying sparse recovery generalize very nicely to low-rank matrix recovery:
• Matrix completion: given an incomplete subset of the entries of a low-rank matrix, fill in the missing values.
• Robust PCA: given a low-rank matrix that has been grossly corrupted, recover the original matrix.

THIS TALK – From sparse recovery to low-rank recovery
Examples of degenerate data:
• Face images. Degeneracy: illumination models. Errors: occlusion, corruption.
• Relevancy data. Degeneracy: user preferences co-predict. Errors: missing rankings, manipulation.
• Video. Degeneracy: temporal, dynamic structures. Errors: anomalous events, mismatches…
KEY ANALOGY – Connections between rank and sparsity

                          Sparse recovery        Rank minimization
Unknown                   Vector x               Matrix A
Observations              y = Ax                 y = L[A] (linear map)
Combinatorial objective   $\|x\|_0$              $\mathrm{rank}(A)$
Convex relaxation         $\|x\|_1$              $\|A\|_*$ (nuclear norm)
Algorithmic tools         Linear programming     Semidefinite programming

This talk: exploiting this connection for matrix completion and RPCA.

CLASSICAL PCA – Fitting degenerate data
If degenerate observations are stacked as columns of a matrix $D$, then $D$ is (approximately) low-rank.
Principal Component Analysis via singular value decomposition:
• Stable, efficient computation
• Optimal estimate of the low-rank matrix under i.i.d. Gaussian noise
• Fundamental statistical tool, huge impact in vision, search, bioinformatics
But… PCA breaks down under even a single corrupted observation.

ROBUST PCA – Problem formulation
D – observation, A – low-rank structure, E – sparse errors.
Problem: given $D = A_0 + E_0$, recover $A_0$.
Properties of the errors:
• Each multivariate data sample (column) may be corrupted in some entries.
• Corruption can be arbitrarily large in magnitude (not Gaussian!).
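The claim that classical PCA breaks down under a single gross error can be checked numerically. The following is a minimal sketch, not taken from the slides: the sizes, the random seed, the corruption magnitude, and the helper names `best_rank_k` and `relative_error` are all assumptions chosen for illustration. It builds $D = A_0 + E_0$ with one corrupted entry and compares the truncated-SVD estimate with and without the corruption.

```python
import numpy as np

def best_rank_k(M, k):
    """Best rank-k approximation in Frobenius norm via truncated SVD (classical PCA)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def relative_error(estimate, truth):
    """Frobenius-norm error of the estimate relative to the true matrix."""
    return np.linalg.norm(estimate - truth, "fro") / np.linalg.norm(truth, "fro")

# Hypothetical problem sizes, chosen only for illustration.
rng = np.random.default_rng(0)
m, n, r = 20, 30, 1
A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # low-rank structure
E0 = np.zeros((m, n))
E0[0, 0] = 1e3                                                  # a single gross error
D = A0 + E0                                                     # corrupted observation

err_clean = relative_error(best_rank_k(A0, r), A0)    # PCA on uncorrupted data
err_corrupt = relative_error(best_rank_k(D, r), A0)   # PCA on corrupted data
print(f"clean: {err_clean:.2e}, one corrupted entry: {err_corrupt:.2e}")
```

With the clean data the rank-1 fit reproduces the low-rank matrix up to rounding, while the single large entry captures the leading singular direction and the estimate no longer resembles the underlying structure.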
Numerous heuristic methods in the literature:
• Random sampling [Fischler and Bolles ‘81]
• Multivariate trimming [Gnanadesikan and Kettenring ‘72]
• Alternating minimization [Ke and Kanade ‘03]
• Influence functions [de la Torre and Black ‘03]
No polynomial-time algorithm with strong performance guarantees!

ROBUST PCA – Semidefinite programming formulation
Seek the lowest-rank A that agrees with the data up to some sparse error:
    $\min_{A,E} \ \mathrm{rank}(A) + \gamma \|E\|_0 \quad \text{subject to} \quad A + E = D$
Not directly tractable, relax:
    $\min_{A,E} \ \|A\|_* + \lambda \|E\|_1 \quad \text{subject to} \quad A + E = D$
The relaxed objective is a convex envelope of the original one over a bounded set. The result is a semidefinite program, solvable in polynomial time.

MATRIX COMPLETION – Motivation for the nuclear norm
Related problem: we observe only a small known subset $\Omega$ of the entries of a rank-$r$ matrix $A_0$. Can we exactly recover $A_0$?
Convex optimization heuristic [Candes and Recht]:
    $\min_{A} \ \|A\|_* \quad \text{subject to} \quad A_{ij} = (A_0)_{ij}, \ (i,j) \in \Omega$
• For incoherent $A_0$, exact recovery with … [Candes and Tao]
• Spectral trimming also succeeds with … for … [Keshavan, Montanari and Oh]

ROBUST PCA – Exact recovery?
CONJECTURE: If $D = A_0 + E_0$ with $A_0$ sufficiently low-rank and $E_0$ sufficiently sparse, then solving the relaxed convex program exactly recovers $A_0$.
Empirical evidence: probability of correct recovery vs. rank and sparsity.
[Figure: recovery probability; axes: rank, sparsity of error; region of perfect recovery marked.]

ROBUST PCA – Which matrices and which errors?
Fundamental ambiguity – very sparse matrices are also low-rank: a matrix with a single nonzero entry can be decomposed as a rank-1 $A$ plus a 0-sparse $E$, or as a rank-0 $A$ plus a 1-sparse $E$.
Obviously we can only hope to uniquely recover $A_0$ that are incoherent with the standard basis.
Can we recover almost all low-rank matrices from almost all sparse errors?
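The ambiguity between very sparse and low-rank matrices can be made concrete. This is a small sketch under assumptions: it takes the relaxed objective to have the usual form $\|A\|_* + \lambda \|E\|_1$, and the size `m` and weight `lam` are hypothetical values picked only for illustration. For a matrix with a single nonzero entry, it evaluates the cost of the two competing decompositions.

```python
import numpy as np

m = 8                      # hypothetical matrix size
M = np.zeros((m, m))
M[0, 0] = 1.0              # a single nonzero entry: both rank-1 and 1-sparse
lam = 1.0 / np.sqrt(m)     # hypothetical weight on the sparse term (assumption)

rank_M = np.linalg.matrix_rank(M)   # rank of the matrix
nnz_M = np.count_nonzero(M)         # number of nonzero entries

# Cost of explaining M as purely low-rank (A = M, E = 0): the nuclear norm of M.
cost_low_rank = np.linalg.norm(M, "nuc")
# Cost of explaining M as purely sparse (A = 0, E = M): lam times the l1 norm of M.
cost_sparse = lam * np.abs(M).sum()

print(rank_M, nnz_M, cost_low_rank, cost_sparse)
```

Both explanations are equally simple in the combinatorial sense (rank 1 with no errors, or rank 0 with one error), so which one a convex program returns depends only on the weight. This is why the recovery question is posed for matrices that are incoherent with the standard basis.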
Random orthogonal model (of rank r) [Candes & Recht ‘08]: $A_0 = U \Sigma V^T$, with $U$ and $V$ independent samples from the invariant measure on the Stiefel manifold of orthobases of rank $r$. The singular values $\Sigma$ are arbitrary.
Bernoulli error signs-and-support (with parameter $\rho_s$): each entry of $E_0$ is independently nonzero with probability $\rho_s$, with either sign equally likely. The magnitude of $E_0$ is arbitrary.

MAIN RESULT – Exact Solution of Robust PCA
“Convex optimization recovers almost any matrix of rank … from errors affecting … of the observations!”

BONUS RESULT – Matrix completion in proportional growth
“Convex optimization exactly recovers matrices of rank …, even with … entries missing!”

MATRIX COMPLETION – Contrast with literature
• [Candes and Tao 2009]: correct completion whp for … Does not apply to the large-rank case.
• This work: correct completion whp for … even with … The proof exploits rich regularity and independence in the random orthogonal model.
Caveats:
- [C-T ‘09] is tighter for small r.
- [C-T ‘09] generalizes better to other matrix ensembles.

ROBUST PCA – Solving the convex program
Semidefinite program in millions of unknowns.
Scalable solution: apply a first-order method with $O(1/k^2)$ convergence [Nesterov, Beck & Teboulle] to the relaxed problem. It forms a sequence of quadratic approximations, each solved via soft thresholding (E) and singular value thresholding (A).
• Iteration complexity $O(1/\sqrt{\epsilon})$ for an $\epsilon$-suboptimal solution.
• Dramatic practical gains from continuation.

SIMULATION – Recovery in various growth scenarios
Correct recovery with rank fraction and error fraction fixed, dimension $m$ increasing.
Empirically, almost constant number of iterations: provably robust PCA at only a constant factor more computation than conventional PCA.
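The two thresholding steps named under "Solving the convex program" can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a plain (non-accelerated) proximal gradient loop in place of the accelerated Nesterov / Beck & Teboulle scheme, it omits continuation, and it assumes a penalized objective $\mu\|A\|_* + \mu\lambda\|E\|_1 + \tfrac{1}{2}\|D - A - E\|_F^2$. The function names, problem sizes, and the values of `lam` and `mu` are hypothetical.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: the proximal operator of tau * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink singular values by tau: the proximal operator of tau * ||X||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(D, A, E, lam, mu):
    """Assumed penalized objective: mu*||A||_* + mu*lam*||E||_1 + 0.5*||D-A-E||_F^2."""
    fit = 0.5 * np.linalg.norm(D - A - E, "fro") ** 2
    return mu * np.linalg.norm(A, "nuc") + mu * lam * np.abs(E).sum() + fit

def rpca_proximal_gradient(D, lam, mu, n_iter=200):
    """Plain proximal gradient with step 1/2 (the quadratic term's gradient is 2-Lipschitz)."""
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    history = [objective(D, A, E, lam, mu)]
    for _ in range(n_iter):
        G = A + E - D  # gradient of the quadratic term with respect to both A and E
        A_next = singular_value_threshold(A - 0.5 * G, 0.5 * mu)
        E_next = soft_threshold(E - 0.5 * G, 0.5 * mu * lam)
        A, E = A_next, E_next
        history.append(objective(D, A, E, lam, mu))
    return A, E, history

# Hypothetical test problem: rank-2 matrix plus about 5% large sparse errors.
rng = np.random.default_rng(0)
m, r = 40, 2
A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, m))
E0 = (rng.random((m, m)) < 0.05) * rng.uniform(-10.0, 10.0, (m, m))
D = A0 + E0

A_hat, E_hat, history = rpca_proximal_gradient(D, lam=1.0 / np.sqrt(m), mu=0.1)
print(f"objective: {history[0]:.3f} -> {history[-1]:.3f}")
```

Each iteration costs one SVD plus entrywise operations, which is the sense in which such methods stay within a constant factor of conventional PCA per step. The accelerated variant adds a momentum term to the points at which the gradient is evaluated, and continuation decreases `mu` over the iterations.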
SIMULATION – Phase Transition in Rank and Sparsity
[Figure: fraction of successes over the (rank, sparsity) plane. Panels: [0,1] x [0,1] and [0,.4] x [0,.4] (10 trials); [0,1] x [0,1] and [0,.5] x [0,.5] (65 trials).]

EXAMPLE – Background modeling from video
Static camera surveillance video: 200 frames, 72 x 88 pixels, significant foreground motion.
[Figure: video, low-rank approximation, sparse error.]

EXAMPLE – Background modeling from video
Static camera surveillance video: 550 frames, 64 x 80 pixels, significant illumination variation.
[Figure: video, low-rank approximation, sparse error; background variation and anomalous activity marked.]

EXAMPLE – Faces under varying illumination
29 images of one person under varying lighting.
[Figure: input images and RPCA output; specularity and self-shadowing marked.]

EXAMPLE – Face tracking and alignment
Initial alignment, inappropriate for recognition.
Final result: per-pixel alignment.

CONJECTURES – Phase Transition in Rank and Sparsity
Hypothesized breakdown behavior as m → ∞.
[Figure: unit square of rank fraction vs. error fraction, with regions marked "This work" and "Classical PCA" under "What we know so far".]
CONJECTURE I: convex programming succeeds in proportional growth.
CONJECTURE II: for small ranks, any fraction of errors can eventually be corrected. Similar to Dense Error Correction via L1 Minimization, Wright and Ma ‘08.
CONJECTURE III: for any rank fraction, there exists a nonzero fraction of errors that can eventually be corrected with high probability.
CONJECTURE IV: there is an asymptotically sharp phase transition between correct recovery with overwhelming probability and failure with overwhelming probability.

CONJECTURES – Connections to Matrix Completion
Our results also suggest the possibility of a proportional growth phase transition for matrix completion.
[Figure: breakdown curves for matrix completion and robust PCA on the unit square.]
• How do the two breakdown points compare?
• How much is gained by knowing the location of the corruption?
Similar to Recht, Xu and Hassibi ‘08.

FUTURE WORK – Stronger results on RPCA?
• RPCA with noise and errors: bounded noise (e.g., Gaussian). Conjecture: stable recovery with … Tradeoff between estimation error and robustness to corruption?
• Deterministic conditions on the matrix
• Simultaneous error correction and matrix completion: we observe …

FUTURE WORK – Algorithms and Applications
• Faster algorithms: smarter continuation strategies; parallel implementations, GPU, multi-machine
• Further applications:
  - Computer vision: photometric stereo, tracking, video repair
  - Relevancy data: search, ranking and collaborative filtering
  - Bioinformatics
  - System identification

REFERENCES + ACKNOWLEDGEMENT
• Reference: Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization, submitted to the Journal of the ACM
• Collaborators: Prof. Yi Ma (UIUC, MSRA), Dr. Zhouchen Lin (MSRA), Dr. Shankar Rao (UIUC), Arvind Ganesh (UIUC), Yigang Peng (MSRA)
• Funding: Microsoft Research Fellowship (sponsored by Live Labs); Grants NSF CRS-EHS-0509151, NSF CCF-TF-0514955, ONR YIP N00014-04-1-0633, NSF IIS 07-03756

THANK YOU! Questions, please?
John Wright, Robust PCA: Exact Recovery of Corrupted Low-Rank Matrices
