Pose Invariant Face Recognition Using Probability Distribution

IEEE SIGNAL PROCESSING LETTERS, VOL. 15, 2008
Pose Invariant Face Recognition Using Probability
Distribution Functions in Different Color Channels
Hasan Demirel and Gholamreza Anbarjafari
Abstract—In this letter a new and high performance pose invariant face recognition system based on the probability distribution functions (PDF) of pixels in different color channels is proposed. The PDFs of the equalized and segmented face images are
used as statistical feature vectors for the recognition of faces by
minimizing the Kullback–Leibler distance (KLD) between the PDF
of a given face and the PDFs of faces in the database. Feature vector
fusion (FVF) and majority voting (MV) methods have been employed to combine feature vectors obtained from different color
channels in HSI and YCbCr color spaces to improve the recognition performance. The proposed system has been tested on the
FERET and the Head Pose face databases. The recognition rate
obtained using the FVF approach for the FERET database is 98.00%, compared with 94.60% and 68.80% for the MV and principal component
analysis (PCA)-based face recognition techniques, respectively.
Index Terms—Face recognition, feature vector fusion, Kullback–Leibler distance, majority voting, singular value decomposition.
I. INTRODUCTION
The earliest work in computer recognition of faces was
reported by Bledsoe [1], where manually located feature
points are used. Statistical face recognition systems such as principal component analysis (PCA)-based eigenfaces introduced
by Turk and Pentland [2] attracted a lot of attention. Belhumeur
et al. [3] introduced the fisherfaces method, which is based on
linear discriminant analysis (LDA).
Many of these methods are based on grayscale images; however, color images are increasingly being used since they add
additional biometric information for face recognition [4]. Color
PDFs of a face image can be considered as the signature of the
face, which can be used to represent the face image in a low
dimensional space. Images with small changes in translation,
rotation, and illumination still possess high correlation in their
corresponding PDFs, which prompts the idea of using PDFs for
face recognition.
Published face recognition papers that use histograms indirectly use PDFs for recognition. There is some published work on the application of histograms for the detection of objects [5]. However, there are few publications on the application of histogram or PDF-based methods in face recognition. Yoo et al. used chromatic histograms of faces [6]. Rodriguez et al. [7] divided a face into several blocks, extracted the local binary pattern (LBP) feature histograms from each block, and concatenated them into a single global feature histogram to represent the face image; the face was recognized by a simple distance-based grey-level histogram matching.
Manuscript received January 28, 2008; revised April 30, 2008. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Aleksandra Mojsilovic.
The authors are with the Department of Electrical and Electronic Engineering, Eastern Mediterranean University, Gazimağusa, KKTC, Mersin 10, Turkey (e-mail: hasan.demirel@emu.edu.tr).
Digital Object Identifier 10.1109/LSP.2008.926729
Face segmentation is one of the important preprocessing phases of face recognition. There are several methods for this task, such as skin tone-based face detection. Skin is a widely used feature in human image processing with a range of applications [8]. Human skin can be detected by identifying the presence of skin color pixels, and many methods have been proposed for achieving this. Chai and Ngan [9] modeled the skin color in YCbCr color space. Boussaid et al. modified Chai and Ngan’s method in order to make it VLSI friendly: they normalized the RGB components and detected skin color by using four threshold values, forming a rectangular region in the new RGB space [10]. One of the recent methods for face detection, proposed by Nilsson et al. [11], uses the local successive mean quantization transform (SMQT) technique. Local SMQT is robust to illumination changes, and the receiver operating characteristics of the method are reported to be very successful for the segmentation of faces.
In the present work, the local SMQT algorithm has been
adopted for face detection and cropping in the preprocessing
stage. Color PDFs in HSI and YCbCr color spaces of the
isolated face images are used as the face descriptors. Face
recognition is achieved using the Kullback–Leibler distance
(KLD) between the PDF of the input face and the PDFs of
the faces in the training set. FVF and MV methods have been
used to combine PDFs obtained from different color channels
to increase the recognition performance. In order to reduce the
effect of the illumination, a new method has been developed
which is based on the singular value decomposition of image
intensity matrices in respective R, G, and B color channels.
The Head Pose (HP) face database [12] and a subset from the FERET [13] database with faces containing varying poses changing from … to … of rotation around the vertical axis passing through the neck were used to test the proposed system.
1070-9908/$25.00 © 2008 IEEE
Authorized licensed use limited to: ULAKBIM UASL - DOGU AKDENIZ UNIV. Downloaded on November 12, 2008 at 03:48 from IEEE Xplore. Restrictions apply.
II. PREPROCESSING OF FACE IMAGES USING SINGULAR VALUE DECOMPOSITION
Several methods exist in the literature for equalizing a color image [14]. One of the simplest and most frequently used is to apply histogram equalization (HE) to each channel of the RGB color space separately. In this work, for each subject, the intensity matrix of each color channel in RGB space is described by using singular value decomposition (SVD). In general, for any intensity image matrix $A$, the SVD can be written as
$$A = U_A \Sigma_A V_A^T \tag{1}$$
where $U_A$ and $V_A$ are orthogonal square matrices and the matrix $\Sigma_A$ contains the sorted singular values on its main diagonal. As reported in [15], $\Sigma_A$ represents the intensity information of a given image intensity matrix. If an image has low contrast, this problem can be corrected by replacing the $\Sigma_A$ of the image with another singular value matrix obtained from a normal image with no contrast problem. A normalized intensity image matrix with no illumination problem can be considered to be one whose PDF is a Gaussian distribution with mean of 0.5 and variance of 1. Such a synthetic intensity matrix, with the same size as the original image, can easily be obtained by generating random pixel values with a Gaussian distribution with mean of 0.5 and variance of 1. Then the ratio of the largest singular value of the generated normalized matrix to that of the normalized image can be calculated according to
$$\xi = \frac{\max(\Sigma_N)}{\max(\Sigma_A)} \tag{2}$$
where $\Sigma_N$ is the singular value matrix of the synthetic intensity matrix. This coefficient can be used to regenerate a new singular value matrix, and hence an equalized intensity matrix of the image, generated by
$$\Xi_{\mathrm{equalized}_A} = U_A (\xi \Sigma_A) V_A^T \tag{3}$$
where $\Xi_{\mathrm{equalized}_A}$ represents the equalized image in the A-color channel.
This task, which is in effect an equalization of the images of a face subject, eliminates the illumination problem. The equalized image can then be used as the input to the face detector prepared by Nilsson [16] in order to segment the face region and eliminate the undesired background. The segmented face images are used for the generation of PDFs in the H, S, I, Y, Cb, and Cr color channels of the HSI and YCbCr color spaces. If there is no face in the image, the face detector software produces no output; hence, the probability of accepting random noise that has the same color distribution as a face but a different shape is zero, which makes the proposed method reliable. The proposed equalization has been tested on the Oulu face database [17] as well as the FERET and the HP face databases. Fig. 1 shows the general steps of the preprocessing phase of the proposed system.
Fig. 1. Algorithm, with a sample image with different illumination from Oulu face database, of preprocessing of the face images to obtain a segmented face from the input face image.
III. PDF-BASED POSE INVARIANT FACE RECOGNITION
Many face recognition systems use grayscale face images. From the information point of view, a color image has more information than a grayscale image. We therefore propose not to lose the available information by converting a color image into a grayscale image. In order to compare the amount of information in color and grayscale images, the entropy of an image can be used, which can be calculated by
$$H = -\sum_{k=0}^{255} P(k) \log_2 P(k) \tag{4}$$
where $H$ measures the information of the image and $P(k)$ is the probability of intensity level $k$. The average amount of information measured by using 650 face images of the FERET and HP face databases is shown in Table I. The entropy figures indicate that there is a significant amount of information in the different color channels, which should not simply be ignored by considering only the grayscale image.
TABLE I
ENTROPY OF COLOR IMAGES IN DIFFERENT COLOR CHANNELS COMPARED WITH THE GRAYSCALE IMAGES
The PDF of an image is a statistical description of the distribution in terms of occurrence probabilities of pixel intensities, which can be considered as a feature vector representing the image in a lower dimensional space. In a general mathematical sense, an image PDF is simply a mapping representing the probability of the pixel intensity levels that fall into various disjoint intervals, known as bins. The number of bins determines the size of the PDF vector. In this letter, the number of bins is assumed to be 256. Given a monochrome image, the bin counts $\eta_j$, $j = 0, \ldots, 255$, meet the following condition, where $N$ is the total number of pixels in the image:
$$\sum_{j=0}^{255} \eta_j = N \tag{5}$$
Then, the PDF feature vector, $\mathbf{p}$, is defined by
$$\mathbf{p} = [p_0, p_1, \ldots, p_{255}], \quad p_j = \frac{\eta_j}{N}, \quad j = 0, \ldots, 255 \tag{6}$$
The distance between the PDFs of two images can be measured by using the KLD between the PDFs of the respective images. This distance is derived from information theory and reflects the diversity in the information content between two PDFs. Given two PDF vectors $\mathbf{p}$ and $\mathbf{q}$, the KLD, $\kappa$, is defined as
$$\kappa(\mathbf{q} \,\|\, \mathbf{p}) = \sum_{j=0}^{\beta-1} q_j \log \frac{q_j}{p_j} \tag{7}$$
where $\beta$ is the number of bins. Let the PDFs of a set of training face images be $\mathbf{p}_1, \ldots, \mathbf{p}_M$, where $M$ is the number of image samples. Then, given a query face image, the PDF of the query image, $\mathbf{q}$, can be used to calculate the KLD between $\mathbf{q}$ and the PDFs of the images in the training samples as follows:
$$\chi_r = \min_{i} \kappa(\mathbf{q} \,\|\, \mathbf{p}_i), \quad i = 1, \ldots, M \tag{8}$$
Here, $\chi_r$ is the minimum KLD reflecting the similarity of the $r$th image in the training set and the query face.
TABLE II
PERFORMANCE OF THE PROPOSED PDF-BASED FACE RECOGNITION SYSTEM IN H, S, I, Y, CB, AND CR COLOR CHANNELS OF THE FERET AND THE HP FACE DATABASES
Fig. 2. Two subjects from FERET database with two different poses (a), their segmented faces (b) and their PDFs in H (c), S (d), I (e), r (f), g (g), and b (h) color channels, respectively.
The image with the lowest KLD from the training
face images is declared to be the identified image in the set.
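The matching step of (5) to (8) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the small constant `eps` that guards empty bins, the base-2 logarithm in the entropy, and the order of the arguments in the KLD are assumptions of the sketch, and the pixel values are made up.

```python
import math


def pdf_vector(pixels, bins=256):
    """Normalized histogram of integer intensities in [0, bins - 1]."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def entropy_bits(pdf):
    """Shannon entropy in bits; empty bins contribute nothing."""
    return -sum(p * math.log2(p) for p in pdf if p > 0)


def kld(q, p, eps=1e-10):
    """Kullback-Leibler distance sum_j q_j * log(q_j / p_j).

    eps is an assumption of this sketch: it keeps the ratio finite
    when a bin is empty in p.
    """
    return sum(qj * math.log((qj + eps) / (pj + eps))
               for qj, pj in zip(q, p) if qj > 0)


def identify(query_pdf, training_pdfs):
    """Return (index, distance) of the training PDF with minimum KLD."""
    distances = [kld(query_pdf, p) for p in training_pdfs]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Tiny 4-bin example with made-up pixel values.
train = [pdf_vector([0, 0, 1, 1, 2, 3], bins=4),
         pdf_vector([3, 3, 3, 2, 2, 1], bins=4)]
query = pdf_vector([0, 1, 1, 0, 2, 3, 0], bins=4)
best_index, best_distance = identify(query, train)
print(best_index, round(best_distance, 4))
```

In this toy case the query is matched to the first training PDF, because the second one has an empty bin where the query has mass and is therefore penalized heavily by the logarithm.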
Fig. 2 shows two subjects with two different poses and their segmented faces from the FERET face database, which is well known for its pose changes; the images also have different backgrounds with slight illumination variation. The intensity of each image has been equalized to minimize the illumination effect, where the shapes of the PDFs are more or less preserved, while only the amplitudes are changing. The color PDFs used in the proposed system are generated only from the segmented face, and hence, the effect of background regions is eliminated. The performance of the proposed system is tested on the FERET and the HP face databases, where there are 500 and 150 face images with changing poses and background, respectively. The faces in those datasets are converted from RGB to HSI and YCbCr color spaces, and the data set is divided into training and test sets. In this setup, the training set contains $n$ images per subject, and the rest of the images are used for the
test set.
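The letter does not state which conversion formulas were used, so the following is only a sketch of one common choice: full-range YCbCr with BT.601 (JFIF) coefficients and the usual geometric HSI definition. The function names and the `quantize` helper, which maps a channel value to one of 256 bins before the PDF is built, are hypothetical.

```python
import math


def rgb_to_ycbcr(r, g, b):
    """8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def rgb_to_hsi(r, g, b):
    """8-bit RGB to (H in degrees, S in [0, 1], I in [0, 1])."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0
    saturation = 1.0 - min(r, g, b) / intensity
    denom = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denom == 0.0:
        # Gray pixel: hue is undefined, set to 0 by convention.
        return 0.0, saturation, intensity
    ratio = 0.5 * ((r - g) + (r - b)) / denom
    theta = math.degrees(math.acos(max(-1.0, min(1.0, ratio))))
    hue = 360.0 - theta if b > g else theta
    return hue, saturation, intensity


def quantize(value, low, high, bins=256):
    """Map a value in [low, high] to an integer bin index in [0, bins - 1]."""
    index = int((value - low) / (high - low) * bins)
    return min(max(index, 0), bins - 1)


hue, sat, inten = rgb_to_hsi(0, 0, 255)   # pure blue
print(round(hue), round(sat, 3), round(inten, 3))
print(quantize(hue, 0.0, 360.0))
```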
The correct recognition rates in percent of the FERET and
HP face databases are shown in Table II. Each result is the average of 100 runs, where we have randomly shuffled the faces
in each class. It is important to note that the performance of
each color channel is different, which means that a person can
be recognized in one channel where the same person may fail
to be recognized in another channel. This observation prompts
the idea of combining the decisions of different color channels,
which may increase the probability of correct recognition of a
face image.
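The evaluation protocol described above (a random split per subject, repeated 100 times and averaged) might be organized as in the sketch below. The helper names, the fixed seed, and the toy one-dimensional "images" are assumptions made for illustration; any classifier with the signature `classify(train, image)` could be plugged in.

```python
import random


def split_per_subject(images_by_subject, n_train, rng):
    """Randomly pick n_train images of each subject for training."""
    train, test = [], []
    for subject, images in images_by_subject.items():
        shuffled = images[:]
        rng.shuffle(shuffled)
        train += [(subject, img) for img in shuffled[:n_train]]
        test += [(subject, img) for img in shuffled[n_train:]]
    return train, test


def average_recognition_rate(images_by_subject, n_train, classify,
                             runs=100, seed=0):
    """Mean percentage of correctly identified test images over random splits.

    classify(train, image) must return the predicted subject label.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(runs):
        train, test = split_per_subject(images_by_subject, n_train, rng)
        correct = sum(1 for subject, img in test
                      if classify(train, img) == subject)
        rates.append(100.0 * correct / len(test))
    return sum(rates) / len(rates)


def nearest_subject(train, image):
    """Toy classifier: label of the closest training value."""
    return min(train, key=lambda item: abs(item[1] - image))[0]


# Toy data: each "image" is a single number, two well-separated subjects.
toy = {"a": [0.0, 0.1, 0.2, 0.3], "b": [5.0, 5.1, 5.2, 5.3]}
rate = average_recognition_rate(toy, n_train=2, classify=nearest_subject)
print(rate)
```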
IV. MV VERSUS FVF METHODS TO COMBINE PDFS IN DIFFERENT COLOR CHANNELS
The face recognition procedure explained in the previous section can be applied to different color channels such as H, S, I, Y, Cb, and Cr. Hence, a given face image can be represented in these color spaces with a dedicated color PDF for each channel. Different color channels contain different information regarding the image; therefore, all six PDFs can be combined to represent a face image. This combination can be done by the use of majority voting (MV) or feature vector fusion (FVF) methods. The main idea behind MV is to achieve an increased recognition rate by combining the decisions of the PDF-based face recognition procedures of different color channels. Considering the H, S, I, Y, Cb, and Cr PDFs separately and combining their results by using MV increases the performance of the classification process.
TABLE III
PERFORMANCE OF THE PROPOSED FVF AND MV-BASED SYSTEM COMPARED WITH PCA-BASED SYSTEM
The MV procedure can be explained as follows. Consider $\{\mathbf{p}_{C,1}, \ldots, \mathbf{p}_{C,M}\}$ to be a set of PDFs of $M$ training face images in color channels $C$ ($C$ = H, S, I, Y, Cb, Cr); then, for a given query face image, the color PDFs of the query image, $\mathbf{q}_H$, $\mathbf{q}_S$, $\mathbf{q}_I$, $\mathbf{q}_Y$, $\mathbf{q}_{Cb}$, and $\mathbf{q}_{Cr}$, can be used to calculate the KLD between $\mathbf{q}_C$ and the PDFs of the images in the training samples as follows, with $\kappa$ denoting the KLD of (7):
$$\chi_C = \min_{i} \kappa(\mathbf{q}_C \,\|\, \mathbf{p}_{C,i}), \quad i = 1, \ldots, M \tag{9}$$
Thus, the similarity of the $i$th image in the training set and the query face can be reflected by $\chi_C$ with the minimum KLD value. The image with the minimum distance in a channel, $\chi_C$ (where $C$ = H, S, I, Y, Cb, or Cr), is declared to be the vector representing the recognized subject. Given the decisions of each classifier in each color space, the voted class $E$ can be chosen as follows:
$$E = \mathrm{MV}(E_H, E_S, E_I, E_Y, E_{Cb}, E_{Cr}) \tag{10}$$
where $E_C$ is the class decided in channel $C$ and $\mathrm{MV}$ is a majority voting function declaring the most repeated class.
The proposed system, which uses PDFs as the face feature vectors in different color channels, minimum KLD as the PDF matching measure, and the MV principle as the identifier, is tested on the FERET and the HP face databases, which contain 50 and 15 color image subjects with ten different poses, respectively. The face dataset is divided into a training set of $n$ images per subject, and the rest of the images are used as the test set. The correct recognition rates in percent are included in Table III. Each result is the average of 100 runs, where we have randomly shuffled the faces in each class. The results of the proposed system are encouraging, because with five training images per subject we can reach recognition rates of 95.24% and 94.11% for the FERET
and the HP face databases, respectively.
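The MV procedure described in this section can be sketched as follows. The code is a minimal illustration under stated assumptions: the data layout (dictionaries keyed by channel name), the `eps` guard in the KLD, the argument order of the KLD, and the tie-breaking rule are not taken from the letter, and the two-bin PDFs are made up to keep the example short.

```python
import math
from collections import Counter


def kld(q, p, eps=1e-10):
    """Kullback-Leibler distance; eps (an assumption) guards empty bins."""
    return sum(qj * math.log((qj + eps) / (pj + eps))
               for qj, pj in zip(q, p) if qj > 0)


def nearest_label(query_pdf, training):
    """training: list of (label, pdf). Return the label with minimum KLD."""
    return min(training, key=lambda item: kld(query_pdf, item[1]))[0]


def majority_vote(query_pdfs, training_by_channel):
    """Decide per channel, then return (most repeated label, all decisions).

    query_pdfs:          {channel: pdf}
    training_by_channel: {channel: [(label, pdf), ...]}
    Ties go to the label encountered first (Counter.most_common behavior);
    the source does not specify a tie rule.
    """
    decisions = [nearest_label(query_pdfs[c], training_by_channel[c])
                 for c in query_pdfs]
    return Counter(decisions).most_common(1)[0][0], decisions


# Made-up two-bin PDFs in three channels for two subjects.
gallery = [("alice", [0.9, 0.1]), ("bob", [0.1, 0.9])]
training_by_channel = {"H": gallery, "S": gallery, "I": gallery}
query_pdfs = {"H": [0.8, 0.2], "S": [0.7, 0.3], "I": [0.2, 0.8]}

voted, decisions = majority_vote(query_pdfs, training_by_channel)
print(voted, decisions)
```

Here two of the three channels decide for the first subject, so the vote overrides the single dissenting channel.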
TABLE IV
COMPARISON OF THE PROPOSED SVD-BASED EQUALIZATION WITH STANDARD HISTOGRAM EQUALIZATION (HE) ON THE FINAL RECOGNITION RATES, WHERE THERE ARE FIVE POSES IN THE TRAINING SET
MV is not the only way to combine the PDFs of the faces in different color channels. PDF vectors can also be simply concatenated with the FVF process, which can be explained as follows. Consider $\{\mathbf{p}_{C,1}, \ldots, \mathbf{p}_{C,M}\}$ to be a set of PDFs of $M$ training face images in color channel $C$ ($C$ = H, S, I, Y, Cb, Cr); then, for a given query face image, $\mathbf{f}_q$ is defined as a vector which is the combination of all PDFs of the query image as follows:
$$\mathbf{f}_q = [\mathbf{q}_H \;\; \mathbf{q}_S \;\; \mathbf{q}_I \;\; \mathbf{q}_Y \;\; \mathbf{q}_{Cb} \;\; \mathbf{q}_{Cr}] \tag{11}$$
This new PDF can be used to calculate the KLD between $\mathbf{f}_q$ and the fused vectors $\mathbf{f}_{p_i}$ of the images in the training samples as follows, with $\kappa$ denoting the KLD of (7):
$$\chi_r = \min_{i} \kappa(\mathbf{f}_q \,\|\, \mathbf{f}_{p_i}), \quad i = 1, \ldots, M \tag{12}$$
where $M$ is the number of images in the training set. Thus, the similarity of the $r$th image in the training set and the query face can be reflected by $\chi_r$, which is the minimum KLD value. The image with the lowest KLD, $\chi_r$, is declared to be the vector representing the recognized subject. The proposed system, which uses PDFs in different color channels as the face feature vector, minimum KLD as the matching measure, and FVF as the identifier, is also tested on the FERET and the HP face databases. The correct recognition rates in percent are included in Table III. Each result is the average of 100 runs, where
we have randomly shuffled the faces in each class.
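For comparison with MV, the FVF procedure of (11) and (12) can be sketched as a concatenation followed by a single nearest-neighbor search. As before, this is an illustrative sketch: the channel ordering, the data layout, the `eps` guard, and the two-bin toy PDFs are assumptions. Note that the fused vector of six PDFs sums to 6 rather than 1; the sketch applies the KLD formula to it unchanged.

```python
import math

CHANNELS = ("H", "S", "I", "Y", "Cb", "Cr")


def kld(q, p, eps=1e-10):
    """Kullback-Leibler distance; eps (an assumption) guards empty bins."""
    return sum(qj * math.log((qj + eps) / (pj + eps))
               for qj, pj in zip(q, p) if qj > 0)


def fuse(pdfs_by_channel):
    """Concatenate the per-channel PDFs in a fixed channel order."""
    fused = []
    for channel in CHANNELS:
        fused.extend(pdfs_by_channel[channel])
    return fused


def identify_fvf(query_pdfs, training):
    """training: list of (label, {channel: pdf}). Return the best label."""
    fused_query = fuse(query_pdfs)
    return min(training,
               key=lambda item: kld(fused_query, fuse(item[1])))[0]


# Made-up two-bin PDFs: four channels resemble "alice", two resemble "bob".
alice = {c: [0.9, 0.1] for c in CHANNELS}
bob = {c: [0.1, 0.9] for c in CHANNELS}
query = {"H": [0.8, 0.2], "S": [0.8, 0.2], "I": [0.8, 0.2],
         "Y": [0.8, 0.2], "Cb": [0.2, 0.8], "Cr": [0.2, 0.8]}

label = identify_fvf(query, [("alice", alice), ("bob", bob)])
print(label, len(fuse(query)))
```

Unlike MV, a single distance is computed per training image, so the channels contribute in proportion to how strongly their PDFs differ rather than one vote each.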
V. RESULTS AND DISCUSSIONS
With five samples per subject in the training set, the proposed system achieves recognition rates of 98.00% and 94.60% on the FERET face database by using the FVF and MV methods, respectively.
The MV and FVF results are 97.78% and 97.33% for the HP face database, respectively, when five samples per subject are available in the training set. These results are significant when compared with the recognition rates achieved by conventional PCA, which has been applied to grayscale segmented face images. The results obtained by the proposed system using FVF show a 29.20% improvement over PCA for the FERET database and a 5.78% improvement for the HP face database. In all cases, both the FVF and MV approaches outperform the conventional methods in the literature. These results indicate the robustness of the proposed PDF-based approach for pose varying faces. Obviously, there is a limit on the angle of rotation beyond which the face has completely changed, e.g., if only the back of the head is in the image.
In an attempt to show the effectiveness of the proposed SVD-based equalization technique, the comparison between the proposed method and HE in terms of the final recognition scores is shown in Table IV. As the results indicate, HE is not a suitable preprocessing technique for the proposed face recognition system, because it transforms the input image such that the PDF of the output image has a uniform distribution. This process dramatically reshapes the PDFs of the segmented face images,
which results in poor recognition performance.
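The SVD-based equalization of (2) and (3) can be sketched without a full SVD. Because every singular value is scaled by the same coefficient, the regenerated matrix in (3) is algebraically the original channel multiplied by that coefficient, so only the two largest singular values are needed. The sketch below estimates them by power iteration, which is a substitution made here to stay within the Python standard library and is not the procedure stated in the letter; the function names, the iteration count, the seed, and the toy image are likewise assumptions. Pixel values are assumed to lie in [0, 1].

```python
import math
import random


def largest_singular_value(a, iters=200):
    """Estimate the largest singular value of a (list of rows)
    by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))


def synthetic_matrix(rows, cols, rng):
    """Gaussian pixel values with mean 0.5 and variance 1."""
    return [[rng.gauss(0.5, 1.0) for _ in range(cols)] for _ in range(rows)]


def equalize_channel(channel, rng):
    """Scale one color channel by the ratio of largest singular values.

    Returns (equalized channel, coefficient). An all-zero channel would
    cause a division by zero and is not handled in this sketch.
    """
    synthetic = synthetic_matrix(len(channel), len(channel[0]), rng)
    xi = (largest_singular_value(synthetic)
          / largest_singular_value(channel))
    return [[xi * x for x in row] for row in channel], xi


# Made-up dark, low-contrast 6 x 6 channel.
rng = random.Random(0)
dark = [[0.10 + 0.01 * ((i + j) % 3) for j in range(6)] for i in range(6)]
equalized, xi = equalize_channel(dark, rng)
print(xi > 0, len(equalized), len(equalized[0]))
```

Applying the same function to the R, G, and B matrices separately gives the per-channel equalization described in Section II; histogram equalization, by contrast, remaps intensities nonlinearly and therefore changes the shape of the PDF.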
In terms of complexity, the PDF calculation is computationally cheaper than PCA, as it does not include any matrix operations in the training phase. The KLD is also a cheaper way of calculating the distance than the Euclidean distance used by PCA. Due to the repetition of the decision procedure in different color channels, the MV method has a higher computational cost than PCA. However, FVF, which only concatenates the PDF vectors for the KLD calculation, is cheaper than the Euclidean distance calculation.
VI. CONCLUSION
In this letter, we introduced a high performance pose invariant
face recognition system based on PDFs in different color channels. A new preprocessing procedure was employed to equalize
the images. Furthermore, local SMQT technique has been employed to isolate the faces from the background, and KLD-based
PDF matching is used to perform face recognition. Minimum
KLD between the PDF of a given face and the PDFs of the
faces in the database was used to perform the PDF matching.
The combination of feature vectors in different color channels,
by using MV and FVF, improved the performance of the proposed PDF-based system. The results show that the system is
robust to pose changes due to the high correlation between the
PDFs of the pose varying faces.
REFERENCES
[1] W. W. Bledsoe, The Model Method in Facial Recognition, Panoramic
Research, Inc., Palo Alto, CA, Rep. PRI:15, 1964.
[2] M. Turk and A. Pentland, “Face recognition using eigenfaces,” in Proc.
Int. Conf. Pattern Recognition, 1991, pp. 586–591.
[3] P. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. fisherfaces: Recognition using class specific linear projection,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997.
[4] S. Marcel and S. Bengio, “Improving face verification using skin color
information,” in Proc. 16th Int. Conf. Pattern Recognition, Aug. 11–15,
2002, vol. 2, pp. 378–381.
[5] I. Laptev, “Improvements of object detection using boosted histograms,” in Proc. BMVC, 2006, pp. III:949–III:958.
[6] T.-W. Yoo and I.-S. Oh, “A fast algorithm for tracking human faces
based on chromatic histograms,” Pattern Recognit. Lett., vol. 20, pp.
967–978, 1999.
[7] Y. Rodriguez and S. Marcel, “Face authentication using adapted local
binary pattern histograms,” in Proc. 9th Eur. Conf. Computer Vision (ECCV),
Graz, Austria, May 7–13, 2006, pp. 321–332.
[8] M.-H. Yang, D. Kriegman, and N. Ahuja, “Detecting faces in images:
A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 1, pp.
34–58, Jan. 2002.
[9] D. Chai and K. N. Ngan, “Face segmentation using skin color map
in videophone application,” IEEE Trans. Circuits Syst. Video Technol.,
vol. 9, no. 4, pp. 551–564, Jun. 1999.
[10] F. Boussaid, D. Chai, and A. Bouzerdoum, “On-chip skin detection
for color CMOS imagers,” in Proc. Int. Conf. MEMS, NANO, Smart
Systems, July 20–23, 2003, pp. 357–361.
[11] M. Nilsson, J. Nordberg, and I. Claesson, “Face detection using local
SMQT features and split up snow classifier,” in Proc. ICASSP 2007,
vol. 2, pp. II-589–II-592.
[12] N. Gourier, D. Hall, and J. L. Crowley, “Estimating face orientation
from robust detection of salient facial features,” in Proc. Pointing 2004,
ICPR, Int. Workshop Visual Observation of Deictic Gestures, Cambridge, U.K.
[13] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face recognition algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[14] M. Abdullah-Al-Wadud, M. H. Kabir, A. Dewan, and O. Chae, “A dynamic histogram equalization for image contrast enhancement,” IEEE
Trans. Consum. Electron., vol. 53, no. 2, pp. 593–600, May 2007.
[15] Y. Tian, T. Tan, Y. Wang, and Y. Fang, “Do singular values contain
adequate information for face recognition?,” Pattern Recognit., vol. 36,
pp. 649–655, 2003.
[16] M. Nilsson, A face detector software, provided in MathWorks
exchange file, retrieved in January 2008. [Online]. Available:
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=13701.
[17] E. Marszalec, B. Martinkauppi, M. Soriano, and M. Pietikäinen, “A
physics-based face database for color research,” J. Electron. Imag., vol.
9, no. 1, pp. 32–38, 2000.