Confluence of
Visual Computing & Sparse Representation
Yi Ma
Electrical and Computer Engineering, UIUC
&
Visual Computing Group, MSRA
CVPR, June 19th, 2009
CONTEXT - Massive High-Dimensional Data
Recognition
Surveillance
Search and Ranking
Bioinformatics
The curse of dimensionality:
… increasingly demand inference with limited samples for very high-dimensional data.
The blessing of dimensionality:
… real data highly concentrate on low-dimensional, sparse, or degenerate
structures in the high-dimensional space.
But nothing is free: Gross errors and irrelevant measurements are now
ubiquitous in massive cheap data.
CONTEXT - New Phenomena with High-Dimensional Data
KEY CHALLENGE: efficiently and reliably recover sparse or degenerate
structures from high-dimensional data, despite gross observation errors.
A sobering message: human intuition is severely limited in high-dimensional spaces:
[Figure: Gaussian samples in 2D, and what happens as the dimension grows proportionally with the number of samples…]
A new regime of geometry, statistics, and computation…
CONTEXT - High-dimensional Geometry, Statistics, Computation
Exciting confluence of
Analytical Tools:
• Powerful tools from high-dimensional geometry, measure
concentration, combinatorics, coding theory …
Computational Tools:
• Linear programming, convex optimization, greedy pursuit,
boosting, parallel processing …
Practical Applications:
• Compressive sensing, sketching, sampling, audio,
image, video, bioinformatics, classification, recognition …
THIS TALK - Outline
PART I: Face recognition as sparse representation
Striking robustness to corruption
PART II: From sparse to dense error correction
How is such good face recognition performance possible?
PART III: A practical face recognition system
Alignment, illumination, scalability
PART IV: Extensions, other applications, and future directions
Part I: Key Ideas and Application
Robust Face Recognition via Sparse Representation
CONTEXT – Face recognition: hopes and high-profile failures
# Pentagon Makes Rush Order for Anti-Terror Technology. Washington Post, Oct. 26, 2001.
# Boston Airport to Test Face Recognition System. CNN.com, Oct. 26, 2001.
# Facial Recognition Technology Approved at Va. Beach. 13News (wvec.com), Nov. 13, 2001.
# ACLU: Face-Recognition Systems Won't Work. ZDNet, Nov. 2, 2001.
# ACLU Warns of Face Recognition Pitfalls. Newsbytes, Nov. 2, 2001.
# Identix, Visionics Double Up. CNN / Money Magazine, Feb. 22, 2002.
# 'Face testing' at Logan is found lacking. Boston Globe, July 17, 2002.
# Reliability of face scan technology in dispute. Boston Globe, August 5, 2002.
# Tampa drops face-recognition system. CNET, August 21, 2003.
# Airport anti-terror systems flub tests. USA Today, September 2, 2003.
# Anti-terror face recognition system flunks tests. The Register, September 3, 2003.
# Passport ID technology has high error rate. The Washington Post, August 6, 2004.
# Smiling Germans ruin biometric passport system. VNUNet, November 10, 2005.
# U.K. cops look into face-recognition tech. ZDNet News, January 17, 2006.
# Police build national mugshot database. Silicon.com, January 16, 2006.
# Face Recognition Algorithms Surpass Humans in Matching Faces. PAMI, 2007.
# 100% Accuracy in Automatic Face Recognition. Science, January 25, 2008.
and the drama goes on and on…
FORMULATION – Face recognition under varying illumination
Face Subspaces
Training Images
Images of the same face under varying illumination lie approximately on a low-dimensional (roughly nine-dimensional) subspace, known as the harmonic plane [Basri & Jacobs, PAMI 2003].
FORMULATION – Face recognition as sparse representation
Assumption: the test image $y$ can be expressed as a linear combination of the $k$ training images of the same subject:
$y = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k$.
The solution $x_0 = [0, \cdots, 0, \alpha_1, \dots, \alpha_k, 0, \cdots, 0]^T$ should be a sparse vector: all of its entries should be zero, except for the ones associated with the correct subject.
ROBUST RECOGNITION – Occlusion + varying illumination
ROBUST RECOGNITION – Occlusion and Corruption
ROBUST RECOGNITION – Properties of the Occlusion
Several characteristics of the occlusion error $e$:
• Randomly supported errors (location is unknown and unpredictable)
• Gross errors (arbitrarily large in magnitude)
• Sparse errors? (concentrated on relatively small part(s) of the image)
ROBUST RECOGNITION – Problem Formulation
Problem: find the correct (sparse) solution $x_0$ to the corrupted, overdetermined system of linear equations $y = A x_0 + e_0$.
Conventionally, the minimum 2-norm (least-squares) solution $\hat x = \arg\min_x \|y - Ax\|_2$ is used, but least squares spreads gross errors over all of the coefficients.
ROBUST RECOGNITION – Joint Sparsity
Thus, we are looking for a sparse solution $w_0 = [x_0;\, e_0]$ to an under-determined system of linear equations:
$y = [A \;\; I]\,[x;\, e] = B\,w, \qquad \min \|w\|_1 \ \text{s.t.}\ B w = y.$
The problem can be solved efficiently via linear programming, and the solution is stable under moderate noise [Candes & Tao ’04, Donoho ’04].
The $\ell^0$–$\ell^1$ equivalence holds iff $\|w_0\|_0$ is below the equivalence breakdown point (EBP) of $B$.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
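As a concrete illustration, here is a minimal sketch (not the authors' code) of solving this ℓ1 problem with an off-the-shelf LP solver; the dictionary and data below are synthetic stand-ins for the face dictionary A.

```python
# Sketch: min ||w||_1 s.t. B w = y, with B = [A I], via the standard
# LP reformulation w = u - v, u, v >= 0. Uses only numpy and scipy.
import numpy as np
from scipy.optimize import linprog

def l1_min(B, y):
    """Return argmin ||w||_1 subject to B w = y."""
    m, n = B.shape
    c = np.ones(2 * n)                  # sum(u) + sum(v) = ||w||_1
    res = linprog(c, A_eq=np.hstack([B, -B]), b_eq=y, bounds=(0, None))
    u, v = res.x[:n], res.x[n:]
    return u - v

# Toy demo: a sparse x0 plus sparse gross errors e0.
rng = np.random.default_rng(0)
m, n = 60, 30
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)          # unit-norm columns
x0 = np.zeros(n); x0[:3] = rng.standard_normal(3)
e0 = np.zeros(m); e0[:6] = 5 * rng.standard_normal(6)
y = A @ x0 + e0

w = l1_min(np.hstack([A, np.eye(m)]), y)
x_hat = w[:n]
# In this easy regime, x_hat should match x0 despite the gross errors.
print(np.linalg.norm(x_hat - x0))
```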
ROBUST RECOGNITION – Geometric Interpretation
Face recognition as determining which facet of the polytope the test image belongs to.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
ROBUST RECOGNITION - L1 versus L2 Solution
[Figure: input test image; the ℓ1 solution is sparse and concentrated on the correct subject, while the ℓ2 (least-squares) solution is dense and uninformative.]
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
ROBUST RECOGNITION – Classification from Coefficients
[Figure: sparse coefficients concentrated on subject i among subjects 1…n, and the corresponding per-class residuals.]
Classification criterion: assign the test image to the class with the smallest residual $r_i(y) = \|y - A\,\delta_i(x)\|_2$, where $\delta_i(x)$ keeps only the coefficients of class $i$.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
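A short sketch of this classification rule (our own illustration; `A` holds training images as columns, `labels[j]` gives the subject of column j, and `x` is the recovered sparse coefficient vector):

```python
# Residual-based classification: keep only the coefficients of class c
# (delta_c(x) in the paper's notation), and pick the class whose
# reconstruction best explains the test image y.
import numpy as np

def src_classify(A, labels, x, y):
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A @ np.where(labels == c, x, 0.0))
                 for c in classes]
    return classes[int(np.argmin(residuals))]
```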
ROBUST RECOGNITION – Algorithm Summary
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
EXPERIMENTS – Varying Level of Random Corruption
Extended Yale B Database (38 subjects)
Training: subsets 1 and 2 (717 images)
Testing: subset 3 (453 images)
Recognition rates: 99.3% at 30% corruption, 90.7% at 50%, 37.5% at 70%.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
EXPERIMENTS – Varying Levels of Contiguous Occlusion
Extended Yale B Database (38 subjects)
Training: subsets 1 and 2 (717 images),
EBP ~ 13.3%.
Testing: subset 3 (453 images)
Recognition rates: 98.5% at 30% occlusion, 90.3% at 40%, 65.3% at 50%.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
EXPERIMENTS – Recognition with Face Parts Occluded
Results corroborate findings in human vision: the eyebrow or eye region is
most informative for recognition [Sinha’06].
However, the difference is less significant for our algorithm than for humans.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
EXPERIMENTS – Recognition with Disguises
The AR Database (100 subjects)
Training: 799 images (unoccluded)
EBP = 11.6%.
Testing: 200 images (with glasses)
200 images (with scarf)
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
Part II: Theory Inspired by Face Recognition
Dense Error Correction via L1 Minimization
PRIOR WORK - Face Recognition as Sparse Representation
Represent any test image with respect to the entire training set as
$y = A\,x_0 + e_0$
(test image $y$; training dictionary $A$; coefficients $x_0$; corruption/occlusion $e_0$).
The solution $x_0$ should be unique and sparse: ideally, supported only on images of the same subject. The error $e_0$ is also expected to be sparse: occlusion only affects a subset of the pixels.
Seek the sparsest solution:
$\min \|x\|_0 + \|e\|_0 \ \text{s.t.}\ y = Ax + e \quad\longrightarrow\quad \min \|x\|_1 + \|e\|_1 \ \text{s.t.}\ y = Ax + e$ (convex relaxation).
PRIOR WORK - Striking Robustness to Random Corruption
Behavior under varying levels of random pixel corruption:
[Figure: recognition rate vs. corruption level: 99.3% at 30%, 90.7% at 50%, 37.5% at 70%.]
Can existing theory explain this phenomenon?
PRIOR WORK - Error Correction by $\ell^1$ Minimization
Candes and Tao [IT ‘05]:
• Apply a parity-check matrix $F$ s.t. $F A = 0$, yielding $F y = F(Ax + e) = F e$: an underdetermined system in the sparse $e$ only.
• Set $\hat e = \arg\min \|e\|_1$ s.t. $F e = F y$.
• Recover $x$ from the clean system $A x = y - \hat e$.
Succeeds whenever $\ell^0$–$\ell^1$ equivalence holds in the reduced system $F e = F y$.
This work:
• Instead solve $\min \|x\|_1 + \|e\|_1$ s.t. $y = A x + e$ directly. Can be applied when $A$ is wide (no parity-check matrix exists).
Succeeds whenever $\ell^0$–$\ell^1$ equivalence holds in the expanded system $y = [A\;\;I]\,[x;\,e]$.
PRIOR WORK - $\ell^0$–$\ell^1$ Equivalence
Algebraic sufficient conditions:
• (In)coherence [Gribonval + Nielsen ‘03; Donoho + Elad ‘03]: small mutual coherence of the dictionary suffices.
• Restricted isometry [Candes + Tao ‘05; Candes + Romberg + Tao ‘06]: a restricted isometry property suffices.
“The columns of $A$ should be uniformly well-spread.”
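For reference, one standard statement of the incoherence result (a textbook formulation, given here under the usual unit-norm-columns convention):

```latex
\[
\mu(A) \;=\; \max_{i \neq j} \bigl|\langle a_i, a_j \rangle\bigr|,
\qquad
\|x_0\|_0 \;<\; \tfrac{1}{2}\Bigl(1 + \tfrac{1}{\mu(A)}\Bigr)
\;\Longrightarrow\;
\ell^0\text{--}\ell^1 \text{ equivalence holds.}
\]
```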
FACE IMAGES - Contrast with Existing Theory
Face images are highly coherent: the training dictionary occupies a tiny volume of the image space.
$x_0$ is very sparse: its support size is the number of images per subject, and it is often nonnegative (illumination cone models).
$e_0$ should be allowed to be as dense as possible: robustness to the highest possible corruption.
Existing theory: $\ell^1$ minimization should not succeed here.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
SIMULATION - Dense Error Correction?
As the dimension $m \to \infty$, an even more striking phenomenon emerges:
Conjecture: if the matrices $A$ are sufficiently coherent, then for any error fraction $\rho < 1$, as $m \to \infty$, solving $\min \|x\|_1 + \|e\|_1$ s.t. $y = Ax + e$ corrects almost any error $e$ with $\|e\|_0 \le \rho\,m$.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
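A small synthetic experiment in the spirit of this conjecture (our own sketch, not the paper's code; the bouquet scaling ν²/m below is an assumption matching the model on the next slide):

```python
# A tightly clustered ("bouquet") dictionary plus gross errors on a
# large fraction of entries, decoded by min ||x||_1 + ||e||_1.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n, rho, nu = 400, 10, 0.4, 0.1       # rho = fraction of corrupted entries
mu = np.ones(m) / np.sqrt(m)            # common mean of the bouquet
A = mu[:, None] + (nu / np.sqrt(m)) * rng.standard_normal((m, n))

x0 = np.zeros(n); x0[0] = 1.0           # very sparse signal
e0 = np.zeros(m)
supp = rng.choice(m, int(rho * m), replace=False)
e0[supp] = 10 * rng.standard_normal(supp.size)   # gross errors
y = A @ x0 + e0

# LP over [x+, x-, e+, e-] >= 0, minimizing the total ell_1 norm.
B = np.hstack([A, -A, np.eye(m), -np.eye(m)])
res = linprog(np.ones(2 * (n + m)), A_eq=B, b_eq=y, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:2 * n]
print(np.linalg.norm(x_hat - x0))  # near 0 when correction succeeds;
# per the conjecture, success becomes more reliable as m grows.
```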
DATA MODEL - Cross-and-Bouquet
Our model for $A$ should capture the fact that the columns are tightly clustered around a common mean $\mu$ in the image space:
• the $\ell^2$-norm of the deviations from $\mu$ is well-controlled (of size $\nu$);
• the mean $\mu$ is mostly incoherent with the standard (error) basis.
We call this the “Cross-and-Bouquet” (CAB) model: the bouquet is the tight cluster of face images, and the cross is the set of coordinate axes carrying the error.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
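One way to write the bouquet formally (our paraphrase of the cited model; treat the exact scaling as indicative):

```latex
\[
A = [a_1, \dots, a_n], \qquad
a_i \overset{\text{iid}}{\sim} \mathcal{N}\!\Bigl(\mu, \tfrac{\nu^2}{m} I_m\Bigr), \qquad
\|\mu\|_2 = 1,
\]
```
with the cross given by the columns of $[I, -I]$ that carry the error $e$.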
ASYMPTOTIC SETTING - Weak Proportional Growth
• Observation dimension $m \to \infty$.
• Problem size grows proportionally: $n / m \to \delta$.
• Error support grows proportionally: $\|e_0\|_0 / m \to \rho$.
• Support size sublinear in $m$: $k = \|x_0\|_0 = o(m)$.
Sublinear growth of $k$ is necessary to correct arbitrary fractions of errors: we need at least $k$ “clean” equations.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
MAIN RESULT - Correction of Arbitrary Error Fractions
“$\ell^1$ minimization recovers any sparse signal from almost any error with density less than 1.”
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
SIMULATION - Comparison to Alternative Approaches
• “L1 - [A I]”: $\min \|x\|_1 + \|e\|_1$ s.t. $y = [A\;\;I]\,[x;\,e]$ (this work).
• “L1 - comp”: the parity-check approach of Candes + Tao ‘05.
• “ROMP”: regularized orthogonal matching pursuit [Needell + Vershynin ‘08].
SIMULATION - Arbitrary Errors in WPG
[Figure: fraction of correct recoveries for increasing m.]
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
IMPLICATIONS (1) - Error Correction with Real Faces
For real face images, weak proportional growth corresponds to the setting where
the total image resolution grows proportionally to the size of the database.
[Figure: fraction of correct recoveries vs. corruption level; above, corrupted input images at the 50%-probability-of-correct-recovery level; below, their reconstructions.]
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
IMPLICATIONS (2) – Verification via Sparsity
A valid test subject yields a sparse, concentrated coefficient vector; an invalid subject yields coefficients spread across many classes.
Reject as invalid if the Sparsity Concentration Index $\mathrm{SCI}(x) = \dfrac{n \cdot \max_i \|\delta_i(x)\|_1 / \|x\|_1 - 1}{n - 1}$ falls below a threshold $\tau$.
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
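A minimal sketch of the SCI computation (our illustration; `labels` assigns each coefficient to one of the n classes):

```python
import numpy as np

def sci(x, labels):
    """Sparsity Concentration Index: 1 when all coefficient energy
    lies in one class, 0 when it is spread evenly over all n classes."""
    classes = np.unique(labels)
    n = len(classes)
    total = np.abs(x).sum()
    best = max(np.abs(x[labels == c]).sum() for c in classes)
    return (n * best / total - 1) / (n - 1)

# Reject the test image as invalid if sci(x, labels) < tau, for a
# threshold tau in [0, 1] chosen on held-out validation data.
```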
IMPLICATIONS (2) – Receiver Operating Characteristic (ROC)
Extended Yale B, 19 valid subjects, 19 invalid, under different levels of occlusion (0%, 10%, 20%, 30%, 50%):
[Figure: ROC curves.]
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
IMPLICATIONS (3) - Communications through Bad Channels
The transmitter encodes the message $x$ as $A x$ and sends it through an extremely corrupting channel. The receiver observes the corrupted version $y = A x + e$ and recovers $x$ by linear programming.
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
IMPLICATIONS (4) - Application to Information Hiding
Alice intentionally corrupts her messages: she sends $y = A x + e$, which reads as gibberish (“?????????”) to an eavesdropper. Bob knows $A$ and can recover $x$ by linear programming.
For the eavesdropper, code breaking becomes a dictionary learning problem…
Wright, and Ma. ICASSP 2009, submitted to IEEE Trans. Information Theory.
Part III: A Practical Automatic Face Recognition System
FACE RECOGNITION – Toward a Robust, Real-World System
So far: surprisingly good laboratory results, strong theoretical foundations.
Remaining obstacles to truly practical automatic face recognition:
• Pose and misalignment: real face detector imprecision!
• Obtaining sufficient training: which illuminations are truly needed?
• Scalability to large databases: both in speed and accuracy.
All three difficulties can be addressed within the same
unified framework of sparse representation.
FACE RECOGNITION – Coupled Problems of Pose and Illumination
Sufficient training illuminations, but no explicit alignment: recognition fails.
Alignment corrected, but insufficient training illuminations: recognition fails.
Robust alignment and training set selection: recognition succeeds.
ROBUST POSE AND ALIGNMENT – Problem Formulation
What if the input image is misaligned, or exhibits some pose?
If the transformation $\tau$ were known, we would still have a sparse representation: $y \circ \tau = A x + e$.
Seek the $\tau$ that gives the sparsest representation:
$\min_{x, e, \tau} \|e\|_1 \ \text{s.t.}\ y \circ \tau = A x + e.$
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
POSE AND ALIGNMENT – Iterative Linear Programming
Robust alignment as sparse representation:
$\min_{x, e, \tau} \|e\|_1 \ \text{s.t.}\ y \circ \tau = A x + e$ is nonconvex in $\tau$.
Linearize about the current estimate $\tau_i$:
$\min_{x, e, \Delta\tau} \|e\|_1 \ \text{s.t.}\ y \circ \tau_i + J\,\Delta\tau = A x + e$, a linear program.
Solve, set $\tau_{i+1} = \tau_i + \Delta\tau$, and repeat.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
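A sketch of one such linearized step (our own code, not the authors'; `warp` and `jac` are hypothetical helpers that apply the transformation τ to the test image and return the Jacobian ∂(y∘τ)/∂τ):

```python
import numpy as np
from scipy.optimize import linprog

def align_step(A, y, tau, warp, jac):
    """One iteration: linearize y∘tau about the current tau and solve
    min ||e||_1 s.t. warp(y, tau) + J dtau = A x + e as an LP."""
    y_tau = warp(y, tau)         # warped test image, length m
    J = jac(y, tau)              # m x p Jacobian in the p warp parameters
    m, n = A.shape
    p = J.shape[1]
    # Variables [x, dtau, e+, e-]; only the error is penalized.
    c = np.concatenate([np.zeros(n + p), np.ones(2 * m)])
    A_eq = np.hstack([A, -J, np.eye(m), -np.eye(m)])
    bounds = [(None, None)] * (n + p) + [(0, None)] * (2 * m)
    res = linprog(c, A_eq=A_eq, b_eq=y_tau, bounds=bounds)
    dtau = res.x[n:n + p]
    return tau + dtau            # tau_{i+1}
```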
POSE AND ALIGNMENT – How well does it work?
Succeeds up to >45° of pose:
Succeeds for translations up to 20% of the face width and for in-plane rotations up to 30°:
[Figure: recognition rate for synthetic misalignments (Multi-PIE).]
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
POSE AND ALIGNMENT – L1 vs L2 solutions
Crucial role of sparsity in robust alignment:
[Figure: minimum $\ell^1$-norm solution vs. least-squares solution.]
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
POSE AND ALIGNMENT – Algorithm details
• First align the test image to each subject separately (efficient multi-scale implementation).
• Select the $k$ subjects with the smallest alignment residuals, compute a global sparse representation against them, and classify based on it.
Excellent classification, validation, and robustness with a linear-time algorithm that is efficient in practice and highly parallelizable.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
LARGE-SCALE EXPERIMENTS – Multi-PIE Database
Training: 249 subjects appearing in Session 1, 9 illuminations per subject.
Testing: 336 subjects appearing in Sessions 2,3,4. All 18 illuminations.
Examples of failures: drastic changes in personal appearance over time.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
LARGE-SCALE EXPERIMENTS – Multi-PIE Database
Training: 249 subjects appearing in Session 1, 9 illuminations per subject.
Testing: 336 subjects appearing in Sessions 2,3,4. All 18 illuminations.
Receiver Operating Characteristic (ROC)
Validation performance: is the subject in the database of 249 people?
NN, NS, and LDA perform not much better than chance.
Our method achieves an equal error rate below 10%.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
FACE RECOGNITION – Coupled Problems of Pose and Illumination
Sufficient training illuminations, but no explicit alignment: recognition fails.
Alignment corrected, but insufficient training illuminations: recognition fails.
Robust alignment and training set selection: recognition succeeds.
ACQUISITION SYSTEM – Efficient training collection
Generate different illuminations by reflecting light from DLP projectors off the walls onto the subject:
Fast: hundreds of images in a matter of seconds; flexible and easy to assemble.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
WHICH ILLUMINATIONS ARE NEEDED?
[Figure: real-data representation error as a function of (a) coverage of the sphere of illumination directions (rear illuminations!) and (b) granularity of the partition (32 illumination cells).]
• Rear illuminations are critical for representing real-world variability, yet they are missing from standard datasets such as AR, PIE, and Multi-PIE!
• 30–40 distinct illumination patterns suffice.
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
REAL-WORLD EXPERIMENTS – Our Dataset
Sufficient set of 38 training illuminations.
Recognition performance over 74 subjects:
Subset 1: 95.9%   Subset 2: 91.5%   Subset 3: 62.3%   Subset 4: 73.7%   Subset 5: 53.5%
Wagner, Wright, Ganesh, Zhou and Ma. To appear in CVPR 09
Part IV: Extensions, Other Applications, and Future Directions
EXTENSIONS (1) – Topological Sparse Solutions
[Figure: recognition rates from Parts I–II: 99.3%, 90.7%, 37.5% under 30%, 50%, 70% random corruption; 98.5%, 90.3%, 65.3% under 30%, 40%, 50% contiguous occlusion.]
EXTENSIONS (1) – Topological Sparse Solutions
How to better exploit the spatial characteristics of the error e in face recognition?
Simple solution: Markov random field and L1 minimization.
[Figure: query image with 60% occlusion; recovered error support, recovered error, and recovered image.]
Longer-term direction: sparse representation on structured domains (à la [Baraniuk ’08, Do ’07]):
Z. Zhou, A. Wagner, J. Wright, and Ma. Submitted to ICCV09.
EXTENSIONS (2) – Does Feature Selection Matter?
[Figure: a 12x10-pixel downsampled face and a random projection, each yielding 120-dimensional features.]
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
EXTENSIONS (2) – Does Feature Selection Matter?
Compressed sensing:
– The number of linear measurements matters more than the specific details of how those measurements are taken.
– $d > 2k \log(N/d)$ random measurements suffice to efficiently reconstruct any $k$-sparse signal. [Donoho and Tanner ’07]
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
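In that spirit, a random-projection feature extractor is a few lines (a sketch; the 192x168 image size matches the Extended Yale B resolution used below):

```python
# d random linear measurements of a face image; recognition then
# proceeds exactly as before, with the dictionary A replaced by R @ A.
import numpy as np
rng = np.random.default_rng(0)
d, m = 120, 192 * 168
R = rng.standard_normal((d, m)) / np.sqrt(d)
# features = R @ image.ravel()
```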
EXTENSIONS (2) – Does Feature Selection Matter?
Extended Yale B: 38 subjects, 2,414 images of size 192x168
Training: 1,207 random images, Testing: remaining 1,207 images
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009
OTHER APPLICATIONS (1) - Image Super-resolution
Enhance images by sparse representation in coupled dictionaries
(high- and low-resolution) of image patches:
[Figure: comparison against MRF/BP [Freeman IJCV ‘00] and the soft edge prior [Dai ICCV ‘07]; our method vs. the originals.]
J. Yang, Wright, Huang, and Ma. CVPR 2008
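A sketch of the coupled-dictionary idea (our illustration, not the paper's code; `Dl` and `Dh` are hypothetical low- and high-resolution patch dictionaries with corresponding columns, learned jointly offline):

```python
import numpy as np
from sklearn.linear_model import Lasso

def super_resolve_patch(Dl, Dh, p_low, alpha=0.1):
    """Sparse-code the low-res patch against Dl, then synthesize the
    high-res patch from the same coefficients against Dh."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    coder.fit(Dl, p_low)     # min ||p_low - Dl a||^2 + alpha ||a||_1
    return Dh @ coder.coef_  # shared sparse code, high-res synthesis
```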
OTHER APPLICATIONS (2) - Face Hallucination
J. Yang, H. Tang, Huang, and Ma. ICIP 2008
OTHER APPLICATIONS (3) - Activity Detection & Recognition
Precision 98.8% and recall 94.2%, far better than existing detectors and classifiers.
A. Yang et al. (at UC Berkeley). CVPR 2008
OTHER APPLICATIONS (4) - Robust Motion Segmentation
Deals with incomplete or mistracked features, even with the dataset 80% corrupted!
S. Rao, R. Tron, R. Vidal, and Ma. CVPR 2008
OTHER APPLICATIONS (5) - Data Imputation in Speech
91% accuracy at SNR -5 dB on AURORA-2, compared to 61% with conventional…
J.F. Gemmeke and G. Cranen, EUSIPCO’08
FUTURE WORK (1) – High-Dimensional Pattern Recognition
Toward an understanding of high-dimensional pattern classification…
Data tasks beyond error correction:
• Excellent classification performance even with a highly coherent dictionary.
• Excellent validation behavior based on the sparsity of the solution.
Understanding either behavior requires a much more expressive model for “what happens inside the bouquet?”
FUTURE WORK (2) – From Sparse Vectors to Low-Rank Matrices
$D = A + E$: $D$ is the observation, $A$ is low-rank, $E$ is a sparse error.
Robust PCA Problem: given $D$, recover $A$.
Convex relaxation:
$\min_{A, E} \|A\|_* + \lambda \|E\|_1 \ \text{s.t.}\ A + E = D,$
where $\|A\|_*$ is the nuclear norm (the sum of the singular values of $A$).
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
ROBUST PCA – Which matrices and which errors?
Random orthogonal model (of rank r) [Candes & Recht ‘08]: $A = U \Sigma V^T$, with $U$ and $V$ independent samples from the invariant measure on the Stiefel manifold of orthobases of rank $r$, and $\Sigma$ arbitrary.
Bernoulli error signs-and-support (with parameter $\rho$): each entry of $E$ is nonzero with probability $\rho$, with random sign; the magnitude of $E$ is arbitrary.
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
MAIN RESULT – Exact Solution of Robust PCA
“Convex optimization recovers almost any matrix of rank O(m / log m) from errors affecting O(m²) of the observations!”
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
ROBUST PCA – Contrast with literature
• [Chandrasekaran et al. 2009]: correct recovery whp for …
Only guarantees recovery from vanishing fractions of errors, even when r = O(1).
• This work: correct recovery whp for rank O(m / log m), even with a constant fraction of the entries corrupted.
Key technique: iterative surgery for producing a certifying dual vector (extends [Wright and Ma ’08]).
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
BONUS RESULT – Matrix completion in proportional growth
“Convex optimization exactly recovers matrices of rank O(m), even when O(m²) of the entries are missing!”
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
MATRIX COMPLETION – Contrast with literature
• [Candes and Tao 2009]: correct completion whp for …; the bound is empty for large r.
• This work: correct completion whp for rank O(m), even with O(m²) entries missing.
Exploits rich regularity and independence in the random orthogonal model.
Caveats:
- [C-T ‘09] is tighter for small r.
- [C-T ‘09] generalizes better to other matrix ensembles.
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
FUTURE WORK (2) – Robust PCA via Iterative Thresholding
Efficient solutions to the convex program? It is a semidefinite program in millions of unknowns!
Iterate: shrink singular values, then shrink absolute values; repeat.
Provable (and efficient) convergence to the global optimum.
Future direction: sampling approximations to the singular value thresholding operator [Rudelson and Vershynin ’08]?
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
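A minimal sketch of that recipe (our own illustration: each update below exactly solves one block of a penalized form of the convex program; the threshold and weight are common defaults, not the paper's):

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: shrink absolute values toward zero."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: shrink singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def robust_pca(D, n_iter=500):
    lam = 1.0 / np.sqrt(max(D.shape))   # common weight for ||E||_1
    tau = 0.1 * np.abs(D).max()         # illustrative threshold
    A = np.zeros_like(D); E = np.zeros_like(D)
    for _ in range(n_iter):
        A = svt(D - E, tau)             # shrink singular values
        E = shrink(D - A, lam * tau)    # shrink absolute values
    return A, E
```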
FUTURE WORK (2) - Video Coding and Anomaly Detection
Videos are highly coherent data. Errors correspond to pixels that cannot be well interpolated from the preceding frames.
[Figure: 550 frames at 64 x 80 pixels: the video, its low-rank approximation (capturing significant illumination variation and background variation), and the sparse error (capturing anomalous activity).]
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
FUTURE WORK (2) - Background modeling
Static camera
surveillance video
Video
Low-rank appx.
Sparse error
200 frames,
72 x 88 pixels,
Significant foreground
motion
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
FUTURE WORK (2) - Face under different illuminations
[Figure: original images, low-rank approximation, and sparse error. Extended Yale B database, 29 images of one subject, each 96 x 84 pixels.]
Wright, Ganesh, Rao and Ma, submitted to the Journal of the ACM.
CONCLUSIONS
Analytic and algorithmic tools from sparse representation lead to a new
approach in face recognition:
• Robustness to corruption and occlusion
• Performance exceeds expectations and even human ability
Face recognition reveals new phenomena in high-dim statistics & geometry:
• Dense error correction with a coherent dictionary
• Recovery of corrupt low-rank matrices
Theoretical insights into mathematical models lead back to practical gains:
• Robust to misalignment, illumination, and occlusion
• Scalable in both computation and performance in realistic scenarios
MANY NEW APPLICATIONS BEYOND FACE RECOGNITION…
REFERENCES + ACKNOWLEDGEMENT
- Robust Face Recognition via Sparse Representation
IEEE Trans. on Pattern Analysis and Machine Intelligence, February 2009.
- Dense Error Correction via L1-minimization
ICASSP 2009; submitted to IEEE Trans. Information Theory, September 2008.
- Towards a Practical Face Recognition System:
Robust Alignment and Illumination via Sparse Representation
IEEE Conference on Computer Vision and Pattern Recognition, June 2009.
- Robust Principal Component Analysis:
Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization
Submitted to the Journal of the ACM, May 2009.
John Wright, Allen Yang, Andrew Wagner, Arvind Ganesh, Zihan Zhou
This work was funded by NSF, ONR, and MSR
THANK YOU
Questions, please?
EXPERIMENTS – Design of Robust Training Sets
The Equivalence Breakdown Point
Extended Yale B
AR Database
Sharon, Wright, and Ma. Bounding EBP, submitted to ACC ‘09
FEATURE SELECTION – Extended Yale B Database
38 subjects, 2,414 images of size 192x168
Training: 1,207 random images, Testing: remaining 1,207 images
L1
Dimension (d)    30     56     120    504
Eigen [%]        80.0   89.6   94.0   97.0
Laplacian [%]    80.6   91.7   93.9   96.5
Random [%]       81.9   90.8   95.0   96.8
Downsample [%]   76.2   87.6   92.7   96.9
Fisher [%]       85.9   N/A    N/A    N/A
Nearest Subspace
Dimension (d)    30     56     120    504
Eigen [%]        72.0   79.8   83.9   85.8
Laplacian [%]    75.6   81.3   85.2   87.7
Random [%]       60.1   66.5   67.8   66.4
Downsample [%]   46.7   54.7   61.8   65.4
Fisher [%]       87.7   N/A    N/A    N/A

Nearest Neighbor
Dimension (d)    30     56     120    504
Eigen [%]        89.9   91.1   92.5   93.2
Laplacian [%]    89.0   90.4   91.9   93.4
Random [%]       87.4   91.5   93.9   94.1
Downsample [%]   80.8   88.2   91.1   93.4
Fisher [%]       81.9   N/A    N/A    N/A
FEATURE SELECTION – AR Database
100 subjects, 1,400 images of size 165x120
Training: 700 images, varying lighting, expression
Testing: 700 images from second session
L1
Dimension (d)    30     56     120    504
Eigen [%]        71.1   80.0   85.7   92.0
Laplacian [%]    73.7   84.7   91.0   94.3
Random [%]       57.8   75.5   87.5   94.7
Downsample [%]   46.8   67.0   84.6   93.9
Fisher [%]       87.0   92.3   N/A    N/A
Nearest Neighbor
Dimension (d)    30     56     120    504
Eigen [%]        68.1   74.8   79.3   80.5
Laplacian [%]    73.1   77.1   83.8   89.7
Random [%]       56.7   63.7   71.4   75.0
Downsample [%]   51.7   60.9   69.2   73.7
Fisher [%]       83.4   86.8   N/A    N/A

Nearest Subspace
Dimension (d)    30     56     120    504
Eigen [%]        64.1   77.1   82.0   85.1
Laplacian [%]    66.0   77.5   84.3   90.3
Random [%]       59.2   68.2   80.0   83.3
Downsample [%]   56.2   67.7   77.0   82.1
Fisher [%]       80.3   85.8   N/A    N/A
FEATURE SELECTION – Recognition with Face Parts
[Figure: feature masks and examples of test features.]

Features      nose    right eye   mouth & chin
Dimension     4,270   5,050       12,936
L1 [%]        87.3    93.7        98.3
NN [%]        49.2    68.8        72.7
NS [%]        83.7    78.6        94.4
SVM [%]       70.8    85.8        95.3
NOTATION - Correct Recovery of Solutions
Whether $(x_0, e_0)$ is recovered depends only on the signs and support of $e_0$. Call a sign-and-support pattern $\ell^1$-recoverable if $(x_0, e_0)$ is recovered, and the minimizer is unique, for any error with these signs and support.
PROOF (1) - Problem Geometry
Consider a fixed sign-and-support pattern; w.l.o.g., let the signs on the support be positive. Success iff … Restricting to the support and rewriting, with some manipulation the optimality condition becomes …
PROOF (1) - Problem Geometry
Introduce … The necessary and sufficient condition: the hyperplane … and the unit ball of … are disjoint.
PROOF (1) - Problem Geometry
… is a complicated polytope. Instead, look for a hyperplane separating … and … in the higher-dimensional space.
PROOF (2) - When Does the Iteration Succeed?
Lemma: success if …
Proof: we want to show … Consider the three statements: …
Base case: trivial; use that …
Inductive step: …
Inductive step (cont’d): bound the magnitude …