Local Manifold Matching for Face Recognition
Wei Liu1, Wei Fan1, Yunhong Wang1,2, and Tieniu Tan1
1 NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, 100080, P.R. China
2 School of Computer Science and Engineering, Beihang University, Beijing, 100083, P.R. China
Email: {wliu, wfan, wangyh, tnt}@nlpr.ia.ac.cn
Abstract— In this paper, we propose a novel classification method, called local manifold matching (LMM), for face recognition. LMM expands the representational capacity of the available prototypes and is based on the local linearity assumption that each data point and its k nearest neighbors from the same class lie on a linear manifold locally embedded in the image space. We present a supervised local manifold learning algorithm for learning all locally linear manifold structures, and then propose the nearest manifold criterion for classification, in which the query feature point is assigned to the best matching face manifold. Experimental results show that kernel PCA combined with the LMM classifier achieves the best face recognition performance.
I. INTRODUCTION
In general, appropriate facial representation and effective classification rules are two central issues in most face recognition systems. In this paper, we mainly explore classification rules in order to design a robust classifier. Many pattern classification methods have been proposed to date; one of the most popular among them is the nearest neighbor (NN) classifier [2]. Although NN is a simple and convenient method, its representational capacity of the face database is limited to the available prototypes in each class, which restricts its performance.
To extend the capacity to cover more variations of a face class, Li et al. presented the nearest feature line (NFL) classifier [4]. The method creates virtual prototype feature points that complement the limited prototypes, thus improving the performance of the NN method by expanding the representational capacity of the available prototypes.
In this paper, we combine the advantage of virtual samples with manifold learning techniques. First, we present a supervised local manifold learning algorithm to learn all local manifolds, which are actually virtual ones. We then propose a local manifold matching (LMM) classifier that is numerically stable and achieves the best balance between recognition rate and computational cost.
II. OUR METHOD
The locally linear manifold is intended to create high-dimensional virtual prototype feature points that benefit classification when training samples are scarce. It therefore expands the representational capacity of the available prototypes and covers sufficient facial variations.

A. Locally Linear Assumption
A key issue for nearest feature classifiers such as NFL is how and where to generate virtual prototype feature points. Since NFL uses a linear model to generate an infinite number of virtual prototypes, we argue that virtual prototypes should be created within a patch that is linear or close to linear.
Fig. 1. Local manifold examples. (a) x and its 5 nearest neighbors form a local manifold; (b) 7 local manifolds make up a global manifold; LM 2,3 overlap with LM 6,7.
Motivated by LLE [5], we assume that each feature point x_i^c (i = 1, ..., N_c; c = 1, ..., C) and its k (1 ≤ k ≤ K = min_c{N_c} − 1) nearest neighboring points of the same class lie on a linear Euclidean subspace, called a locally linear manifold or local manifold. Virtual samples will be created within this local manifold. Strictly speaking, local manifolds are virtual manifolds, which facilitate effective manifold analysis on limited training samples and benefit classification. A local manifold usually lies in a k-dimensional subspace, and degenerates to a feature line when k equals 1. Examples of local manifolds are shown in Fig. 1.
B. Definition of Local Manifold and Local Manifold Distance
We construct the local manifold as follows: select a neighborhood for each prototype, learn a Euclidean subspace spanned by that neighborhood, and generate virtual samples within this subspace. In the sequel, each prototype together with these artificial samples forms the local manifold. Given the training set {x_i | x_i ∈ R^d, 1 ≤ i ≤ N}, for any prototype feature point x_i, denote its k nearest neighbors as x_{N(i,j)} (1 ≤ j ≤ k), where N(i, j) is the index of the jth nearest neighbor of x_i. We define the local manifold on which x_i and its neighbors reside as

M(x_i, k) = M{x_i, x_{N(i,1)}, x_{N(i,2)}, ..., x_{N(i,k)}}    (1)
From the viewpoint of set theory, we can unite or split local manifolds with the operators ⊕ and ⊖, for example

M(A) ⊕ M(B) = M(A ∪ B),    M(A) ⊖ M(B) = M(A − B)    (2)

where A and B are two neighborhood sets such as {x_i, x_{N(i,1)}, ..., x_{N(i,k)}}, and M(A) and M(B) are the local manifolds constructed from these two sets.
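To make the construction above concrete, the following sketch (our illustration, not the authors' code; the name same_class_neighbors and the use of NumPy are assumptions) computes the supervised neighbor index map N(i, j): the neighbors of each prototype x_i are drawn only from prototypes carrying the same class label, as required by the locally linear assumption. It presumes the labels are stored in an integer array y and that every class contributes at least K + 1 prototypes.

import numpy as np

def same_class_neighbors(X, y, K):
    """Return an (N, K) integer array whose row i holds the indices N(i, 1..K)."""
    N = X.shape[0]
    nbrs = np.zeros((N, K), dtype=int)
    for i in range(N):
        mask = (y == y[i])                          # candidates from the same class
        mask[i] = False                             # exclude x_i itself
        cand = np.flatnonzero(mask)
        dist = np.linalg.norm(X[cand] - X[i], axis=1)
        nbrs[i] = cand[np.argsort(dist)[:K]]        # the K closest same-class points
    return nbrs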
Because the local manifold is assumed to be linear, we apply classical linear techniques such as Principal Component Analysis (PCA) [3] to obtain its linearly embedded structure. For simplicity we denote each local manifold M(x_i, k) as M_i, and define the covariance matrix of the samples on M_i as

C_{M_i} = \sum_{j=1}^{k} (x_{N(i,j)} − x_i)(x_{N(i,j)} − x_i)^T    (3)

We apply PCA or SVD to C_{M_i} to obtain the principal subspace U_{M_i}, which contains most of the information about the neighborhood in which x_i resides. Note that the parameter k is not only the number of neighbors that, together with x_i, construct the local manifold M_i, but also an upper bound on the intrinsic dimension of M_i.
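As a concrete illustration (a sketch under our own naming, not the paper's implementation), the principal subspace U_{M_i} can be obtained from an SVD of the matrix of neighbor differences: the left singular vectors of that matrix are exactly the eigenvectors of C_{M_i} in Eq. (3).

import numpy as np

def local_subspace(X, i, neighbor_idx):
    """Principal subspace U_Mi of the local manifold around X[i], per Eq. (3)."""
    D = X[neighbor_idx] - X[i]                      # rows are x_N(i,j) - x_i
    # The left singular vectors of D^T are the eigenvectors of C_Mi = D^T D.
    U, s, _ = np.linalg.svd(D.T, full_matrices=False)
    r = int(np.sum(s > 1e-10))                      # keep at most k informative directions
    return U[:, :r]                                 # orthonormal columns spanning M_i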
For a new point x, to measure how well it matches the local manifold M_i, i.e., the possibility that x lies on M_i, a natural similarity measure is the Euclidean distance between x and its projection onto the target local manifold M_i. The projection point, denoted p_i, is exactly the best matching virtual sample of x in the local manifold M_i. We can detail this similarity measure using projective geometry.
Projecting the difference vector x − x_i onto M_i (equivalently, onto the principal subspace U_{M_i}), we obtain the difference vector p_i − x_i and its coordinates U_{M_i}^T (x − x_i). By the Pythagorean theorem (which implies ||p_i − x_i|| = ||U_{M_i}^T (x − x_i)||), we derive the distance from x to the local manifold M_i:

d(x, M_i) = ||x − p_i|| = \sqrt{ ||x − x_i||^2 − ||U_{M_i}^T (x − x_i)||^2 }    (4)
which is called the local manifold distance, or LM distance. From Eq. (4), the LM distance can be computed readily once x_i and U_{M_i} are given. Therefore, we may describe a local manifold by a binary element that caters exclusively for classification:

M_i = M(x_i, k) ≃ ⟨x_i, U_{M_i}⟩    (5)

where U_{M_i} is also called the regularization matrix, since it rectifies the original Euclidean distance between the query and a single prototype into the manifold distance between the query and the local manifold containing that prototype. As a binary element, only two matrices are needed to find the best matching manifold during classification. Moreover, the class of a manifold is determined by the class label of x_i.
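With the binary element ⟨x_i, U_{M_i}⟩ in hand, Eq. (4) reduces to a few vector operations. A minimal sketch (the function name lm_distance is ours), assuming U has orthonormal columns as produced by the subspace sketch above; passing an empty d × 0 matrix recovers the ordinary Euclidean distance of the k = 0 case in Eq. (9).

import numpy as np

def lm_distance(x, x_i, U):
    """LM distance d(x, M_i) of Eq. (4) for the binary element <x_i, U_Mi>."""
    diff = x - x_i
    proj = U.T @ diff                               # coordinates U_Mi^T (x - x_i)
    sq = diff @ diff - proj @ proj                  # ||x - x_i||^2 - ||U^T (x - x_i)||^2
    return float(np.sqrt(max(sq, 0.0)))             # clamp tiny negatives from round-off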
C. Supervised Local Manifold Learning (SLML)
There is a key parameter in our model: the number of nearest neighbors k, which should be set to a proper value so that the local linearity assumption holds. For each local manifold M_i, we define a decomposition error to measure the extent of linearity of M_i:

ε(M_i) = d(x_{N(i,k)}, M{x_i, x_{N(i,1)}, ..., x_{N(i,k−1)}})^2 = d(x_{N(i,k)}, M_i ⊖ M{x_{N(i,k)}})^2    (6)

where we first decompose M_i and consider the degraded local manifold on which x_i and its k − 1 nearest neighbors reside; the squared LM distance between the kth nearest neighbor of x_i and this degraded manifold is then taken as the decomposition error of the local manifold.
Motivated by the definition of the reconstruction error in the well-known LLE framework [5], we can show that the decomposition error ε(M_i) is exactly the reconstruction error incurred when the k points x_i, x_{N(i,1)}, ..., x_{N(i,k−1)} are used to linearly estimate the point x_{N(i,k)}. The decomposition error can thus be understood as the manifold reconstruction error of using k − 1 prototypes to represent all k prototypes in each local manifold. The smaller the decomposition error ε(M_i), the better the local linearity of M_i holds. If x_i is kept fixed, M_i expands or shrinks as k changes, so the decomposition error is a function of k.
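Under these definitions, the decomposition error of Eq. (6) is straightforward to evaluate: drop the kth neighbor, rebuild the degraded manifold from x_i and the first k − 1 neighbors, and measure the squared LM distance of the dropped point. The sketch below reuses the hypothetical local_subspace and lm_distance helpers from the earlier sketches.

def decomposition_error(X, i, neighbor_idx, k):
    """Squared LM distance of Eq. (6) for the local manifold around X[i]."""
    dropped = X[neighbor_idx[k - 1]]                # the k-th neighbor x_N(i,k)
    U = local_subspace(X, i, neighbor_idx[:k - 1])  # degraded manifold (k-1 neighbors)
    return lm_distance(dropped, X[i], U) ** 2       # its distance to the degraded manifold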
Integrating the linearity of all local manifolds, we define the sum of decomposition errors, as a function of k:

ε_k = \sum_{i=1}^{N} ε(M_i)    (7)

The sum of errors ε_k reflects the global degree of local linearity, since good linearity of a single local manifold does not ensure that the other local manifolds are equally linear. Letting k increase from 0, the ideal k* is the value at which the sum of errors starts to decrease and the local manifolds attain stationary structures; thus we identify the parameter k by

D = {k | Δε_k = ε_k − ε_{k−1} ≤ 0, 1 ≤ k ≤ K},    k* = \min_{k ∈ D} k    (8)

where K is the maximum permissible number of nearest neighbors. If there are many prototypes for each subject in training, a smaller value can be assigned to K.
We propose a supervised local manifold learning (SLML) algorithm as an iterative procedure, as shown in Tab. I. In the first step, the K nearest same-class neighbors of each prototype are identified in a supervised way. Second, the ideal parameter k* is found through a finite number of iterations based on the sum of decomposition errors; in each iteration, PCA is used to extract the principal subspace of each local manifold. Finally, all local manifolds M_i ≃ ⟨x_i, U_{M_i}⟩ with their corresponding regularization matrices are learned. Note that for the special case k = 0 we define the local manifold as follows (and assume ε_0 = 0):

M_i = M(x_i, 0) = M{x_i} ≃ ⟨x_i, 0⟩,    d(x, M_i) = ||x − x_i||    (9)
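Putting Eqs. (6)-(8) together, the selection of k* can be sketched as a single loop over k that stops at the first non-increasing step of the summed error; this mirrors Tab. I below but is only an illustrative reading of it, building on the hypothetical helpers introduced earlier.

import numpy as np

def select_k_star(X, y):
    """Choose k* per Eqs. (7)-(8): the first k at which eps_k stops growing."""
    K = min(np.sum(y == c) for c in np.unique(y)) - 1    # maximum permissible neighbors
    nbrs = same_class_neighbors(X, y, K)                 # supervised K-NN (earlier sketch)
    eps_prev = 0.0                                       # eps_0 = 0 by definition
    for k in range(1, K + 1):
        eps_k = sum(decomposition_error(X, i, nbrs[i], k) for i in range(len(X)))
        if eps_k <= eps_prev:                            # Eq. (8): first non-increasing step
            return k
        eps_prev = eps_k
    return K                                             # no drop observed within 1..K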
TABLE I
SUPERVISED LOCAL MANIFOLD LEARNING ALGORITHM

Input: A training set with C classes: Z = {(x_i, y_i) | x_i ∈ R^d, y_i = L(x_i) ∈ Y}, with label set Y = {1, 2, ..., C}.
Initialize: the maximum permissible number of nearest neighbors K = \min_{1 ≤ c ≤ C}{N_c − 1}; the sums of decomposition errors ε_j = 0 (j = 0, ..., K). For each prototype x_i, find its K nearest neighbors within the class labeled L(x_i) in Z, denoted x_{N(i,j)} (j = 1, ..., K); set the local manifold M_i = M{x_i} and U_{M_i} = 0.
Loop: for k = 1, 2, ..., K
  Step 1. Calculate the current sum of decomposition errors:
    for i = 1, 2, ..., N: ε_k ← ε_k + d(x_{N(i,k)}, M_i)^2.
  Step 2. Expand all local manifolds M_i and update the principal subspaces U_{M_i}:
    for i = 1, 2, ..., N: M_i ← M_i ⊕ M{x_{N(i,k)}}; learn a subspace U_{M_i} for the new manifold; update the binary element M_i ≃ ⟨x_i, U_{M_i}⟩.
  Step 3. If ε_k ≤ ε_{k−1}, abort the loop.
Output: The most suitable k* and the corresponding local manifolds M_i ≃ ⟨x_i, U_{M_i}⟩.

D. Local Manifold Matching (LMM)
For classification, the optimal matching between the query and the learned local manifolds is performed based on the LM distance of Eq. (4). The class label of the nearest manifold is assigned to the query feature point; hence we propose the nearest manifold criterion for classification, formulated as follows (L(x) denotes the class to which sample x belongs):

i* = \arg\min_{1 ≤ i ≤ N} d(x, M_i),    L(x) = L(x_{i*})    (10)

In fact, the computational complexity of LMM can be reduced further. We have learned N local manifolds, many of which are likely to repeat, especially when the optimal k* approaches K, so the redundant manifolds should be discarded to save computational cost. In SLML, an extra checking step against repetitions is performed so that only distinct manifolds are kept. Denoting these independent manifolds by M_{i_t} (1 ≤ t ≤ N̄), where N̄ is the number of independent manifolds, we rewrite (10) as

t* = \arg\min_{1 ≤ t ≤ N̄} d(x, M_{i_t}),    L(x) = L(x_{i_{t*}})    (11)
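Classification by Eqs. (10)-(11) then amounts to a single pass over the learned, de-duplicated binary elements. A minimal sketch, assuming manifolds is a list of (x_i, U_Mi, label) triples produced by SLML and reusing the lm_distance helper sketched earlier:

import numpy as np

def lmm_classify(x, manifolds):
    """Nearest manifold criterion of Eqs. (10)-(11)."""
    best_label, best_dist = None, np.inf
    for x_i, U, label in manifolds:                 # the independent manifolds M_{i_t}
        d = lm_distance(x, x_i, U)                  # LM distance of Eq. (4)
        if d < best_dist:
            best_dist, best_label = d, label
    return best_label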
III. EXPERIMENTS
Our experiments are carried out on a mixed database of 125 persons and 985 images, which is a collection of three databases: 1) the ORL database, with 40 persons and 10 different images per person; 2) the YALE database, with 15 persons and 11 facial images per person; 3) a FERET subset, with 70 persons selected from the FERET database and six different images per person. All images are resized to 92 × 112. The images exhibit variations in facial expression, illumination, and pose. To reduce the influence of extreme illumination, histogram equalization is applied to the images as pre-processing. Fig. 2 shows some samples.
Fig. 2. Samples from the mixture database.
A. Demonstration of SLML
To test the supervised local manifold learning (SLML) algorithm, we track the sum of decomposition errors ε_k (×10^8) and the error rate (%) with respect to the number of nearest neighbors k (k ≤ 5). For simplicity, we only use the ORL database. Fig. 3 shows the average sum of errors and the error rate as functions of the number of nearest neighbors k. In each round, 6 images per subject are randomly selected from the database for training and the remaining images of the same subject are used for testing. Twenty tests are performed with different configurations of the training and test sets, and the results are averaged. The standard eigenface method of Turk and Pentland [6] is first applied to the set of training images to reduce the dimensionality of the facial images; in this experiment, we use 50 eigenfaces as the facial features.
As plotted in Fig. 3(a), the sum of decomposition errors ε_k starts to decrease when k equals 4, at which point we consider the local manifolds to attain stationary structures; increasing k further, however, would cause SLML to overfit. From Fig. 3(b) we observe that the error rate increases after k = 4, which strongly justifies our choice of the optimal k* (given by (8)). The LMM error rate (k* = 4 in Fig. 3(b)) with 50 eigenfaces is 3.0625%, whereas the NN error rate (corresponding to LMM at k = 0 in Fig. 3(b)) is 4.69%.
TABLE II
COMPARISON OF SIX RECOGNITION METHODS

Method     | Dims | Error Rate (%) | Run Time (ms)
PCA+NN     |  80  |     18.09      |     8.649
PCA+NFL    |  80  |     16.14      |    65.53
LDA+NN     | 124  |     13.23      |     9.340
KPCA+NN    | 100  |     15.06      |     8.981
PCA+LMM    |  80  |     10.89      |    18.57
KPCA+LMM   | 100  |      8.72      |    20.48

Fig. 3. Demonstration of the SLML algorithm. (a) Sum of decomposition errors ε_k vs. k on ORL; (b) error rate vs. k on ORL.
B. Performance of LMM
To demonstrate the efficiency of local manifold matching (LMM), extensive experiments are conducted on the mixed database. All methods are compared on the same training and testing sets. The mixture database is divided into two non-overlapping sets for training and testing. The training set consists of 500 images: 5, 6, and 3 images per person are randomly selected from the ORL database, the YALE database, and the FERET subset, respectively. The remaining 485 images are used for testing. Twenty runs are performed with different random partitions between training and testing images, and the results are averaged.
The eigenface method (PCA) [6], the Fisherface method (LDA) [1], and KPCA [7] are applied to reduce the dimensionality of the facial images and to provide the features for recognition. Incorporated with PCA or KPCA, we run SLML on each subset (ORL, YALE, and FERET) of the mixed database and learn the optimal k*, which equals 4, 3, and 2, respectively.
The error rates and recognition times are listed in Tab. II; our method shows encouraging overall performance. In particular, incorporated with KPCA, the proposed LMM method yields the lowest error rate (8.72%) on the mixture database with acceptable recognition time. Because LDA breaks the manifold structure of the face data, we do not combine the LMM classifier with Fisherface features. All experiments are implemented in MATLAB V6.1 on a Pentium IV personal computer with a clock speed of 2.4 GHz.
IV. CONCLUSION
Following the work on NFL, we have presented the local manifold matching (LMM) method for face classification. Our method improves on the NN and NFL methods by expanding the representational capacity of the available prototypes. In contrast to NFL, LMM creates many more virtual prototype feature points, a substantial part of which benefits classification when prototypes are limited. Moreover, the close connections between our manifold matching method and state-of-the-art manifold learning techniques such as Locally Linear Embedding (LLE) have been discussed. Experiments carried out on a mixed database demonstrate the effectiveness of our method.
ACKNOWLEDGMENT
This work is sponsored by the Natural Science Foundation of China under Grants No. 60121302 and 60335010. The authors thank the Olivetti Research Laboratory in Cambridge (UK), the FERET program (USA), and Yale University for providing the face databases. Special thanks to Yilin Dong for her encouragement.
REFERENCES
[1] P.N. Belhumeur, J.P. Hespanha and D.J. Kriegman, “Eigenfaces vs.
Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE
Trans. on PAMI, vol. 19, no. 7, pp. 711-720, July 1997.
[2] T.M. Cover and P.E. Hart, “Nearest Neighbor Pattern Classification,”
IEEE Trans. on Information Theory, vol. 13, pp. 57-67, January 1967.
[3] I.T. Jolliffe, Principal Component Analysis, Springer-Verlag, New York,
1986.
[4] S.Z. Li and J-W. Lu, “Face Recognition Using the Nearest Feature Line
Method,” IEEE Trans. on Neural Networks, vol. 10, no. 2, pp. 439-443,
March 1999.
[5] S.T. Roweis and L.K. Saul, “Nonlinear Dimensionality Reduction by
Locally Linear Embedding,” Science, vol. 290, pp. 2323-2326, December
2000.
[6] M.A. Turk and A.P. Pentland, “Face Recognition Using Eigenfaces,” in
Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, pp.
586-591, June 1991.
[7] M.H. Yang, N. Ahuja and D. Kriegman, “Face Recognition Using Kernel
Eigenfaces,” in Proc. of IEEE Int. Conf. on Image Processing, 2000.