Matrix Extensions to Sparse Recovery

John Wright¹, Yi Ma¹,², Allen Yang³

¹ University of Illinois at Urbana-Champaign
² Microsoft Research Asia
³ University of California, Berkeley

CVPR Tutorial, June 20, 2009
FINAL TOPIC – Generalizations: sparsity to degeneracy
The tools and phenomena underlying sparse recovery
generalize very nicely to low-rank matrix recovery
Matrix completion:
Given an incomplete subset of the entries of a
low-rank matrix, fill in the missing values.
Robust PCA:
Given a low-rank matrix which has been grossly
corrupted, recover the original matrix.
THIS TALK – From sparse recovery to low-rank recovery

Examples of degenerate data:
• Face images. Degeneracy: illumination models. Errors: occlusion, corruption.
• Relevancy data. Degeneracy: user preferences co-predict. Errors: missing rankings, manipulation.
• Video. Degeneracy: temporal, dynamic structures. Errors: anomalous events, mismatches.
KEY ANALOGY – Connections between rank and sparsity

                          Sparse recovery        Rank minimization
Unknown                   vector x               matrix A
Observations              y = Ax                 y = L[A]  (linear map)
Combinatorial objective   ||x||_0                rank(A)
Convex relaxation         ||x||_1                ||A||_*  (nuclear norm)
Algorithmic tools         linear programming     semidefinite programming
This talk: exploiting this connection for matrix completion and RPCA
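To make the analogy concrete, here is a small numpy illustration (an added example, not from the slides): the l1 norm sums the absolute entries of a vector, while the nuclear norm sums the singular values of a matrix; each plays the role of convex surrogate in its column of the table above.

```python
# Illustrative only: l1 norm vs. nuclear norm as the two convex surrogates.
import numpy as np

x = np.array([3.0, 0.0, -4.0])
l1 = np.abs(x).sum()                    # surrogate for ||x||_0 (here 7.0)

A = np.outer([1.0, 2.0], [1.0, -1.0])   # a rank-1 matrix
sigma = np.linalg.svd(A, compute_uv=False)
nuclear = sigma.sum()                   # surrogate for rank(A): one nonzero
                                        # singular value, sqrt(10)
print(l1, nuclear)
```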
CLASSICAL PCA – Fitting degenerate data

If degenerate observations are stacked as columns of a matrix D, then

    D = A_0 + N,

with A_0 low-rank (rank(A_0) << min(m, n)) and N a small, dense perturbation.

Principal Component Analysis via the singular value decomposition:

    A_hat = arg min ||D - A||  subject to  rank(A) <= r

• Stable, efficient computation
• Optimal estimate of A_0 under iid Gaussian noise
• Fundamental statistical tool, with huge impact in vision, search, bioinformatics

But… PCA breaks down under even a single grossly corrupted observation.
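As a concrete reference point, a minimal numpy sketch of classical PCA as a truncated SVD (the rank r and the data matrix D are assumed inputs):

```python
# Best rank-r approximation of D in Frobenius norm (Eckart-Young theorem).
import numpy as np

def pca_low_rank(D, r):
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]   # keep only the top r singular values
```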
ROBUST PCA – Problem formulation

Problem: Given D = A_0 + E_0, with A_0 low-rank and E_0 sparse, recover A_0.

Low-rank structure (A_0): the uncorrupted data columns lie near a low-dimensional subspace.
Sparse errors (E_0):
• Each multivariate data sample (column) may be corrupted in some entries
• Corruption can be arbitrarily large in magnitude (not Gaussian!)

Numerous heuristic methods in the literature:
• Random sampling [Fischler and Bolles '81]
• Multivariate trimming [Gnanadesikan and Kettenring '72]
• Alternating minimization [Ke and Kanade '03]
• Influence functions [de la Torre and Black '03]

No polynomial-time algorithm with strong performance guarantees!
ROBUST PCA – Semidefinite programming formulation

Seek the lowest-rank A that agrees with the data up to some sparse error:

    min rank(A) + γ ||E||_0   subject to   A + E = D

Not directly tractable (both terms are combinatorial), so relax:

    min ||A||_* + λ ||E||_1   subject to   A + E = D

Here the nuclear norm ||A||_* (the sum of the singular values) is the convex envelope of rank over the spectral-norm ball, and the entrywise l1 norm ||E||_1 is the convex envelope of ||E||_0 over the unit cube. The result is a semidefinite program, solvable in polynomial time.
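For small problems the relaxation can be written down directly. The following cvxpy sketch is an added illustration (the problem sizes, corruption model, and the weight λ = 1/sqrt(max(m, n)), a common choice in the RPCA literature, are all assumptions); it is not the scalable solver described later in the talk.

```python
# A minimal cvxpy sketch of: min ||A||_* + lam ||E||_1  s.t.  A + E = D.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n, r = 30, 30, 2
A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))       # low-rank
E0 = 10.0 * rng.standard_normal((m, n)) * (rng.random((m, n)) < 0.05) # sparse
D = A0 + E0

lam = 1.0 / np.sqrt(max(m, n))
A = cp.Variable((m, n))
E = cp.Variable((m, n))
problem = cp.Problem(cp.Minimize(cp.norm(A, 'nuc') + lam * cp.sum(cp.abs(E))),
                     [A + E == D])
problem.solve()
print('relative error:', np.linalg.norm(A.value - A0) / np.linalg.norm(A0))
```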
MATRIX COMPLETION – Motivation for the nuclear norm

Related problem: we observe only a small known subset Ω of the entries of a rank-r matrix A_0. Can we exactly recover A_0?

Convex optimization heuristic [Candès and Recht '08]:

    min ||A||_*   subject to   A_ij = (A_0)_ij  for all (i, j) in Ω

• For incoherent A_0, exact recovery from O(n r polylog n) observed entries [Candès and Tao '09]
• Spectral trimming also succeeds with O(n log n) entries for r = O(1) [Keshavan, Montanari and Oh '09]
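The completion heuristic admits an equally short sketch (again an added cvxpy illustration with an assumed 0/1 observation mask, not code from the talk):

```python
# Nuclear-norm matrix completion: min ||A||_*  s.t.  A agrees with D on Omega.
import cvxpy as cp

def complete(D, mask):
    """D: matrix with observed entries filled in; mask: 0/1 array marking Omega."""
    A = cp.Variable(D.shape)
    constraints = [cp.multiply(mask, A) == cp.multiply(mask, D)]
    cp.Problem(cp.Minimize(cp.norm(A, 'nuc')), constraints).solve()
    return A.value
```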
ROBUST PCA – Exact recovery?

CONJECTURE: If D = A_0 + E_0, with A_0 sufficiently low-rank and E_0 sufficiently sparse, then solving

    min ||A||_* + λ ||E||_1   subject to   A + E = D

exactly recovers A_0.

Empirical evidence:
[Figure: probability of correct recovery as a function of rank (horizontal axis) and sparsity of the error (vertical axis); a large region shows perfect recovery.]
ROBUST PCA – Which matrices and which errors?

Fundamental ambiguity: very sparse matrices are also low-rank. For example, a matrix with a single nonzero entry, D = e_1 e_1^T, decomposes as

    D = e_1 e_1^T + 0   (rank-1 plus 0-sparse)   or   D = 0 + e_1 e_1^T   (rank-0 plus 1-sparse)

Obviously, we can only hope to uniquely recover matrices A_0 that are incoherent with the standard basis.

Can we recover almost all low-rank matrices from almost all sparse errors?
ROBUST PCA – Which matrices and which errors?

Random orthogonal model of rank r [Candès & Recht '08]: the left and right singular bases of A_0 are independent samples from the invariant measure on the Stiefel manifold of rank-r orthobases; the singular values are arbitrary.

Bernoulli error signs-and-support model (with parameter ρ): each entry of E_0 is independently nonzero with probability ρ, with a random sign; the magnitude of each nonzero entry of E_0 is arbitrary.
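Both models are easy to sample. A numpy sketch (the magnitude choices are assumptions, since both models leave magnitudes arbitrary):

```python
# Sampling the two models. QR of a Gaussian matrix yields (up to column sign
# conventions) Haar-distributed orthobases on the Stiefel manifold.
import numpy as np

def random_orthogonal_model(m, n, r, rng):
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    s = 1.0 + rng.random(r)              # singular values: arbitrary in the model
    return (U * s) @ V.T

def bernoulli_errors(m, n, rho, rng):
    support = rng.random((m, n)) < rho            # entries nonzero w.p. rho
    signs = rng.choice([-1.0, 1.0], size=(m, n))  # equiprobable random signs
    return 5.0 * signs * support          # magnitude 5.0 is an arbitrary choice
```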
MAIN RESULT – Exact Solution of Robust PCA

"Convex optimization recovers almost any matrix whose rank grows nearly in proportion to the dimension, from errors affecting a constant fraction of the observations!"
BONUS RESULT – Matrix completion in proportional growth

"Convex optimization exactly recovers matrices of rank proportional to the dimension, even with a constant fraction of entries missing!"
MATRIX COMPLETION – Contrast with literature

• [Candès and Tao '09]: correct completion whp from O(n r polylog n) observed entries. Does not apply to the large-rank case.
• This work: correct completion whp for rank proportional to the dimension, even with a constant fraction of entries missing.

Proof exploits rich regularity and independence in the random orthogonal model.

Caveats:
• [C-T '09] is tighter for small r.
• [C-T '09] generalizes better to other matrix ensembles.
ROBUST PCA – Solving the convex program

Semidefinite program in millions of unknowns; off-the-shelf interior-point solvers do not scale.

Scalable solution: apply a first-order method with O(1/k^2) convergence [Nesterov; Beck & Teboulle], minimizing a sequence of separable quadratic approximations to the objective. Each step is solved in closed form via soft thresholding (for E) and singular value thresholding (for A).

• Iteration complexity O(1/sqrt(ε)) to reach an ε-suboptimal solution.
• Dramatic practical gains from continuation (gradually decreasing the penalty parameter).
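A minimal numpy sketch of the two proximal operators, wrapped in a plain alternating scheme for the penalized problem min ||A||_* + lam ||E||_1 + (1/(2 mu)) ||D - A - E||_F^2. This is a simplification for illustration: the talk's solver adds Nesterov-style acceleration and continuation, and the parameter values here are assumptions.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: the proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def sv_threshold(X, tau):
    """Singular value shrinkage: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_alternating(D, lam=None, mu=1e-1, n_iter=300):
    """Alternately minimize the penalized objective over A and over E."""
    lam = lam if lam is not None else 1.0 / np.sqrt(max(D.shape))
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(n_iter):
        A = sv_threshold(D - E, mu)           # exact minimization over A
        E = soft_threshold(D - A, lam * mu)   # exact minimization over E
    return A, E
```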
SIMULATION – Recovery in various growth scenarios

Correct recovery with the rank fraction and the error fraction held fixed, and the matrix dimension increasing. Empirically, the number of iterations stays almost constant: provably robust PCA at only a constant factor more computation than conventional PCA.
SIMULATION – Phase Transition in Rank and Sparsity

[Figure: fraction of successful recoveries as the rank and error fractions vary over [0,1] x [0,1] (10 trials), with a zoomed view over [0,.4] x [0,.4]; and a second sweep over [0,1] x [0,1] (65 trials), with a zoomed view over [0,.5] x [0,.5].]
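The phase-transition plots can be reproduced in outline with a simple sweep. A sketch under assumed parameters, reusing the rpca_alternating routine from the previous sketch:

```python
# Sweep rank and error fractions; count successful recoveries per cell.
import numpy as np

def success_probability(m=50, rank_frac=0.1, err_frac=0.1, trials=10):
    rng = np.random.default_rng(1)
    wins = 0
    for _ in range(trials):
        r = max(1, int(rank_frac * m))
        A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, m))
        E0 = 10.0 * rng.standard_normal((m, m)) * (rng.random((m, m)) < err_frac)
        A, _ = rpca_alternating(A0 + E0)
        wins += np.linalg.norm(A - A0) / np.linalg.norm(A0) < 1e-3
    return wins / trials
```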
EXAMPLE – Background modeling from video

Static camera surveillance video: 200 frames, 72 x 88 pixels, significant foreground motion.

[Figure: video frames, low-rank approximation (background), and sparse error (foreground).]
EXAMPLE – Background modeling from video

Static camera surveillance video: 550 frames, 64 x 80 pixels, significant illumination variation.

[Figure: video frames, low-rank approximation (background and illumination variation), and sparse error (anomalous activity).]
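As a usage illustration, the background-modeling experiments amount to stacking vectorized frames as the columns of D. A sketch assuming a hypothetical frames array and the rpca_alternating routine sketched earlier:

```python
import numpy as np

frames = np.load('frames.npy')       # hypothetical input: (T, height, width)
T, h, w = frames.shape
D = frames.reshape(T, h * w).T       # one column per frame

A, E = rpca_alternating(D)           # low-rank background, sparse foreground
background = A.T.reshape(T, h, w)
foreground = E.T.reshape(T, h, w)
```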
EXAMPLE – Faces under varying illumination

29 images of one person under varying lighting.

[Figure: RPCA separates the low-rank illumination component of the face images from sparse errors such as specularities and self-shadowing.]
EXAMPLE – Face tracking and alignment

Initial alignment, inappropriate for recognition.

[Animation: the alignment is iteratively refined across frames.]

Final result: per-pixel alignment.
SIMULATION – Phase Transition in Rank and Sparsity (recap)

[Figure repeated from above: fraction of successful recoveries as the rank and error fractions vary.]
CONJECTURES – Phase Transition in Rank and Sparsity

Hypothesized breakdown behavior as m → ∞:

[Figure: conjectured phase-transition curve over the unit square of rank fraction vs. error fraction.]
CONJECTURES – Phase Transition in Rank and Sparsity

What we know so far:

[Figure: the region of the unit square covered by this work's guarantee, versus classical PCA, which tolerates no gross errors.]
CONJECTURES – Phase Transition in Rank and Sparsity

CONJECTURE I: convex programming succeeds in proportional growth.
CONJECTURES – Phase Transition in Rank and Sparsity

CONJECTURE II: for small ranks, any fraction of errors less than one can eventually be corrected.

Similar to Dense Error Correction via L1 Minimization [Wright and Ma '08].
CONJECTURES – Phase Transition in Rank and Sparsity

CONJECTURE III: for any rank fraction, there exists a nonzero fraction of errors that can eventually be corrected with high probability.
CONJECTURES – Phase Transition in Rank and Sparsity

CONJECTURE IV: there is an asymptotically sharp phase transition between correct recovery with overwhelming probability, and failure with overwhelming probability.
CONJECTURES – Connections to Matrix Completion

Our results also suggest the possibility of a proportional-growth phase transition for matrix completion.

[Figure: conjectured phase-transition regions for matrix completion and Robust PCA in the same unit square.]

• How do the two breakdown points compare?
• How much is gained by knowing the location of the corruption?

Similar to Recht, Xu and Hassibi '08.
FUTURE WORK – Stronger results on RPCA?

• RPCA with noise and errors: D = A_0 + E_0 + Z, with Z bounded noise (e.g., Gaussian). Conjecture: stable recovery, with estimation error proportional to the noise level. What is the tradeoff between estimation error and robustness to corruption?
• Deterministic conditions on the matrix A_0.
• Simultaneous error correction and matrix completion: we observe only a subset of the entries of D = A_0 + E_0.
FUTURE WORK – Algorithms and Applications

• Faster algorithms:
  - Smarter continuation strategies
  - Parallel implementations: GPU, multi-machine
• Further applications:
  - Computer vision: photometric stereo, tracking, video repair
  - Relevancy data: search, ranking and collaborative filtering
  - Bioinformatics
  - System identification
REFERENCES + ACKNOWLEDGEMENT

• Reference:
  "Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization," submitted to the Journal of the ACM.
• Collaborators:
  Prof. Yi Ma (UIUC, MSRA), Dr. Zhouchen Lin (MSRA), Dr. Shankar Rao (UIUC), Arvind Ganesh (UIUC), Yigang Peng (MSRA)
• Funding:
  Microsoft Research Fellowship (sponsored by Live Labs); grants NSF CRS-EHS-0509151, NSF CCF-TF-0514955, ONR YIP N00014-04-1-0633, NSF IIS 07-03756
THANK YOU!
Questions, please?