Face Detection
(a MATLAB implementation of a face detection algorithm based on Viola-Jones’ approach)
MATTEO VENANZI
bvenanzi@libero.it
LEONARDO BENUCCI
dawhiterabbit@hotmail.com
Abstract. This paper describes the implementation, in MATLAB, of an efficient face detection
algorithm based on the Viola-Jones method, developed as a project for a Computer Vision course.
We implemented step by step the cascaded face detector based on the AdaBoost technique
described in the article by Viola and Jones¹; we then defined appropriate data structures and a
suitable usage strategy, seeking a trade-off between computation speed and detection
performance. Finally, we tested the detector on many instances while varying the configuration
parameters, and observed good performance in terms of detected faces and errors committed.
In this discussion, we show that the behaviour of the resulting system is similar to the one
described in Viola and Jones’ paper.
1. Introduction
This article describes a face detector implemented in MATLAB and based on Viola-Jones’
approach. Our face detection system classifies images based on the values of simple Haar-like
features, so the first step is to implement a function that randomly generates sets of features.
The second step is the boosting algorithm: given a set of simple weak classifiers (each with low
accuracy) evaluated on a training set of positive and negative face examples, it builds, round by
round, a strong classifier with high accuracy by combining the T best weak classifiers. The third
step is the cascade, an optimization designed to reduce the number of features that have to be
evaluated and thus to improve overall performance. The last step consists of testing the system
on a set of examples and observing its behaviour in terms of detection rate and accuracy.
2. Features
Viola-Jones’ method is based on the notion of a feature: a feature corresponds to a set of two or
more adjacent rectangles. To classify an image, we use a set of three different kinds of features:
• 2-rectangle feature (direct and rotated by 90°)
• 3-rectangle feature (direct and rotated by 90°)
• 4-rectangle feature
1 This article is available at http://www.dis.uniroma1.it/~visiope/Progetto2008/Repository/integralImages.pdf
The value of a feature is computed by subtracting the sum of the pixels within the white rectangle
from the sum of the pixels within the black one.
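As an illustration of this computation, the following Python sketch (not the authors' MATLAB code) evaluates a 2-rectangle feature through an integral image, the technique Viola and Jones use to obtain any rectangle sum from four table lookups. The function names, the toy image and the choice of which rectangle is white and which is black are assumptions made only for this example.

```python
def integral_image(img):
    """Return ii with an extra zero row and column: ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two side-by-side w x h rectangles: white on the left, black on the right (assumed).

    Value = sum(black) - sum(white), as described in the text.
    """
    white = rect_sum(ii, x, y, w, h)
    black = rect_sum(ii, x + w, y, w, h)
    return black - white


# Toy 2x4 image: dark left half, bright right half.
image = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
ii = integral_image(image)
value = two_rect_feature(ii, 0, 0, 2, 2)  # black sum 36 minus white sum 4
```

Once the integral image is built, the cost of a rectangle sum no longer depends on the rectangle's size, which is what makes evaluating a very large number of features affordable.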
We have created a MATLAB function that generates the entire set of features on a generic image
through the following procedure:
1. Build a feature of size 1x1 located at the top-left corner of the image.
2. Slide the feature over the image along all the rows and columns.
3. Increase the width by one and go back to step 1, as long as the feature still fits inside the image.
4. Increase the height by one and go back to step 1, as long as the feature still fits inside the image.
5. Repeat this procedure for every kind of feature.
Running this function, we obtained a set of about 160,000 features.
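The size of such a feature set can be checked with a short enumeration. The Python sketch below (not the authors' MATLAB function) assumes the usual Viola-Jones scheme, which may differ in detail from the procedure above: each kind of feature has a smallest base shape (2x1, 1x2, 3x1, 1x3, 2x2), grows in whole multiples of that shape so that its rectangles stay equal in size, and is placed at every position of a 24x24 window. All names are hypothetical.

```python
# Base shapes (width, height) of the five feature layouts: the 2-rectangle and
# 3-rectangle features in both orientations, plus the 4-rectangle feature.
BASE_SHAPES = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]


def enumerate_features(win_w, win_h, base_shapes):
    """Yield (base_shape, x, y, w, h) for every feature that fits in the window."""
    for base_w, base_h in base_shapes:
        # Grow the feature in multiples of its base shape.
        for w in range(base_w, win_w + 1, base_w):
            for h in range(base_h, win_h + 1, base_h):
                # Slide it over every position where it is fully contained.
                for y in range(win_h - h + 1):
                    for x in range(win_w - w + 1):
                        yield (base_w, base_h), x, y, w, h


def count_features(win_w, win_h, base_shapes):
    """Count the same features in closed form, without listing them."""
    total = 0
    for base_w, base_h in base_shapes:
        for w in range(base_w, win_w + 1, base_w):
            for h in range(base_h, win_h + 1, base_h):
                total += (win_w - w + 1) * (win_h - h + 1)
    return total


n_features = count_features(24, 24, BASE_SHAPES)  # 162336
```

Under these assumptions a 24x24 window yields 162,336 features, in line with the figure of about 160,000 reported above.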
3. AdaBoost
AdaBoost is the boosting procedure suggested in Viola-Jones’ paper for the training step: it
returns a strong classifier based on a linear combination of T weak classifiers. In our
implementation we work on a training set² composed of images of 59 faces and 49 non-faces;
round by round we evaluate the weak classifiers³, compute their weighted errors and choose as
the best weak classifier the one with the minimum error. The best weak classifier is then
evaluated on a validation set of 10 faces and 10 non-faces to obtain an estimate of its quality.
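The boosting loop can be sketched as follows. This is a minimal Python illustration of the AdaBoost variant given in Viola-Jones' paper, not the authors' MATLAB code: each weak classifier is a threshold on a single feature, and the data layout (feature_values[j][i] is the value of feature j on example i), the function names and the toy data are assumptions. The validation step described above is omitted.

```python
import math


def stump_predict(value, theta, polarity):
    """Weak classifier: 1 (face) if polarity * value <= polarity * theta, else 0."""
    return 1 if polarity * value <= polarity * theta else 0


def best_stump(values, labels, weights):
    """Return (weighted_error, theta, polarity) of the best threshold for one feature."""
    best = (float("inf"), 0.0, 1)
    for theta in sorted(set(values)):
        for polarity in (1, -1):
            err = sum(w for v, y, w in zip(values, labels, weights)
                      if stump_predict(v, theta, polarity) != y)
            if err < best[0]:
                best = (err, theta, polarity)
    return best


def adaboost(feature_values, labels, rounds):
    """Return a strong classifier as a list of (alpha, feature_index, theta, polarity)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = [1 / (2 * n_pos) if y == 1 else 1 / (2 * n_neg) for y in labels]
    strong = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]  # normalise to a distribution
        best = None
        for j, values in enumerate(feature_values):
            err, theta, polarity = best_stump(values, labels, weights)
            if best is None or err < best[0]:
                best = (err, j, theta, polarity)
        err, j, theta, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero and log(0)
        beta = err / (1 - err)
        # Correctly classified examples are down-weighted by beta.
        weights = [w * (beta if stump_predict(v, theta, polarity) == y else 1.0)
                   for w, v, y in zip(weights, feature_values[j], labels)]
        strong.append((math.log(1 / beta), j, theta, polarity))
    return strong


def strong_predict(strong, example):
    """example[j] is the value of feature j; vote with weights alpha."""
    score = sum(a for a, j, t, p in strong if stump_predict(example[j], t, p) == 1)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0


# Toy data: feature 0 separates the classes, feature 1 is noise.
feature_values = [[1, 2, 3, 7, 8, 9], [5, 1, 4, 2, 6, 3]]
labels = [1, 1, 1, 0, 0, 0]
strong = adaboost(feature_values, labels, rounds=3)
predictions = [strong_predict(strong, [f[i] for f in feature_values])
               for i in range(len(labels))]
```

The exhaustive threshold search in best_stump is quadratic in the number of examples; it is kept for readability, and sorting the feature values once per feature would be the usual optimisation.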
3.1 Accuracy
Figure 2 shows the accuracy of the best classifier as a function of the number of iterations; we
can observe that the optimal accuracy is reached after a few rounds (10-15). This behaviour is
almost identical to the one described in the AdaBoost literature.
2 Taken from http://www.mathworks.com/matlabcentral/fileexchange/
3 The weak classifier learning module is available in the package ABdemo.rar at the URL
http://www.pudn.com/upload_log.asp?e=ncepu_ly*126.com&n=ncepu_ly#
Figure 2: Accuracy curve for the 5000 feature classifier in 25 rounds;
it shows the convergence of the algorithm. The best accuracy is
already achieved after 15 iterations.
3.2 ROC curve
The ROC curve of our AdaBoost implementation (Figure 3) shows that the accuracy improves as
the false positive rate increases, in agreement with the theoretical results described in the paper.
Figure 3: Receiver operating characteristic (ROC) for the 5000 feature
classifier in 50 rounds.
The main difference between our chart and the one presented in the paper is that our detector
achieves an accuracy of 95% at a false positive rate of 8 x 10^-3, while the detector in the paper
reaches a good accuracy from a false positive rate of 3 x 10^-3.
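A ROC curve of this kind is obtained by sweeping the decision threshold of the strong classifier and recording, for each threshold, the false positive rate and the fraction of faces detected. The Python sketch below is a generic illustration, not the authors' code; the scores and labels are invented toy values, and the vertical axis is taken to be the detection rate.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, detection_rate) pairs, one per threshold.

    scores[i] is the strong classifier's score for example i (higher means more
    face-like); labels[i] is 1 for a face and 0 for a non-face.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    # Lowering the threshold accepts more windows: both rates can only grow.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

Each point trades missed faces against false alarms, which is why a higher tolerated false positive rate goes together with a higher detection rate.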
4. Cascade
The innovation introduced by Viola and Jones in the field of face detection is the cascade of
classifiers. Instead of learning a single classifier through AdaBoost on the entire set of features
(160,000 on a 24x24 sub window), which would require weeks, they process each sub window
through a series of classifiers. The first levels contain simpler classifiers whose only purpose is
to reject many of the negative sub windows; the later levels contain more complex classifiers that
pick out the real faces among the candidate “faces” that were not filtered out by the previous
levels. The algorithm keeps adding levels to the cascade until it reaches a given overall false
positive rate (Ftarget); each new level must provide a minimum improvement driven by the input
values f and d (Figure 4).
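[Diagram: all sub windows enter the strong classifier of level 1 (F1 ≤ f, D1 ≥ d). A sub window
classified as True passes to the strong classifier of level 2 (F2 ≤ f x F1, D2 ≥ d), and so on up to
level n (Fn ≤ f x Fn-1, Dn ≥ d). A sub window classified as False at any level is rejected; a sub
window classified as True at level n is accepted as a face.]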
Figure 4: Detection cascade schema.
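The early-rejection logic of the cascade can be sketched in a few lines of Python (an illustration, not the authors' MATLAB code). Each level is assumed to be a strong classifier stored as a list of (alpha, feature index, theta, polarity) weak classifiers plus a level threshold; this representation and the toy values are hypothetical.

```python
def level_score(stumps, window_features):
    """Weighted vote of one level: add alpha for every weak classifier that fires."""
    return sum(alpha for alpha, j, theta, polarity in stumps
               if polarity * window_features[j] <= polarity * theta)


def cascade_classify(levels, window_features):
    """Return True if the sub window passes every level, False at the first rejection."""
    for stumps, threshold in levels:
        if level_score(stumps, window_features) < threshold:
            return False  # rejected: later, more expensive levels are never evaluated
    return True


# Toy cascade: a one-feature first level followed by a two-feature second level.
levels = [
    ([(1.0, 0, 5.0, 1)], 0.5),
    ([(1.0, 1, 2.0, -1), (0.5, 0, 3.0, 1)], 1.0),
]

accepted = cascade_classify(levels, [2, 4])        # passes both levels
rejected_early = cascade_classify(levels, [9, 4])  # fails level 1
rejected_late = cascade_classify(levels, [4, 1])   # passes level 1, fails level 2
```

Because most sub windows of an image contain no face, most of them leave the loop at the first levels, where only a few features are evaluated.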
5. Results
We launched the cascade builder with these parameters:
• Target false positive rate = 10^-2
• f = 0.8
• d = 0.95
It reached a false positive rate of 8 x 10^-3 after about 1 hour of execution (CPU: 1.5 GHz;
RAM: 512 MB), building a cascade of 8 levels that uses 40 features. We then tried the classifier,
which is able to detect only 18x27 faces, on a small test set of images; the results are shown in
Figure 5.
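These parameters can be put in perspective with a little arithmetic, sketched below in Python. The helper names are hypothetical, and d is read here as a per-level detection rate, as in Viola-Jones' paper (where each new level must keep at least a fraction d of the detections of the previous ones); this reading is an assumption.

```python
def worst_case_levels(f, f_target):
    """Levels needed if every level only just meets F_i <= f * F_(i-1)."""
    levels, rate = 0, 1.0
    while rate > f_target:
        rate *= f
        levels += 1
    return levels


def detection_rate_bound(d, levels):
    """Lower bound on the overall detection rate if every level keeps a fraction d."""
    return d ** levels


levels_needed = worst_case_levels(0.8, 1e-2)  # 21, since 0.8^21 < 10^-2 < 0.8^20
bound_8 = detection_rate_bound(0.95, 8)       # about 0.66
```

With f = 0.8, a cascade in which each level barely satisfied its bound would need 21 levels to reach 10^-2, so the 8 levels reported above must have rejected, on average, more negatives than the bound requires. Under the per-level reading of d, 8 levels only guarantee an overall detection rate of about 0.66.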
5.1 Comments
Our classifier is not able to detect all the faces in an image, but most of its detections do fall on
a face. In this project we did not implement a mechanism for filtering out isolated incorrect
detections, so part of these errors could easily be removed; note that most of the false detections
are located in sub windows containing areas of skin or bright, light colours (similar to human
skin). The most important advantage highlighted by the tests is that our classifier behaves rather
well in the presence of rotated or tilted faces (this and changes in illumination are two typical
problems for common face detectors). Of course, the main cause of the suboptimal performance
is that this classifier works at a false positive rate close to 10^-2, while a practical face detector
for real applications needs a false positive rate on the order of 10^-6.
5.2 What else can we do
Below, we suggest two main extensions that were not included in this project (essentially for
lack of time) but that could easily improve and complete our work:
• Multiscale detection: our face detector was trained on a training set of 18x27 images, so it
cannot detect faces larger than this size. The technique for solving this problem is to scale
the features by a constant factor and apply them to a correspondingly larger sub window
(this will also increase the number of false detections).
• Postprocessing of the detected sub windows to combine overlapping detections into a
single detection: the inevitable presence of overlapping detections on the same face requires
a mechanism for combining them into a single bounding box that represents all of them. A
possible solution is to compute the centre of mass of a group of nearby detections and use it
to represent the whole group.
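Both extensions can be sketched briefly in Python (an illustration of the ideas above, not part of the authors' project). Rectangles and detections are assumed to be (x, y, width, height) tuples; the greedy grouping rule, the distance threshold and all names are hypothetical choices.

```python
def scale_rect(rect, factor):
    """Scale a feature rectangle by a constant factor, rounding to whole pixels."""
    x, y, w, h = rect
    return (round(x * factor), round(y * factor), round(w * factor), round(h * factor))


def centre(box):
    """Centre point of a box given as (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def mean_box(group):
    """Centre of mass of a group: the average position and size of its boxes."""
    n = len(group)
    return tuple(sum(box[i] for box in group) / n for i in range(4))


def merge_detections(boxes, max_dist):
    """Greedily group boxes whose centre is within max_dist (on both axes) of a
    group's centre of mass, then replace each group by its mean box."""
    groups = []
    for box in boxes:
        cx, cy = centre(box)
        for group in groups:
            gx, gy = centre(mean_box(group))
            if abs(cx - gx) <= max_dist and abs(cy - gy) <= max_dist:
                group.append(box)
                break
        else:
            groups.append([box])  # no nearby group: start a new one
    return [mean_box(group) for group in groups]


scaled = scale_rect((4, 8, 12, 4), 1.25)

detections = [(10, 10, 18, 27), (12, 11, 18, 27), (11, 12, 18, 27), (100, 50, 18, 27)]
merged = merge_detections(detections, max_dist=9)
```

In this toy run the three overlapping 18x27 detections collapse into one box and the distant one is kept on its own. A group containing a single detection could also be discarded, which is one simple way to filter isolated false detections.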
Figure 5: Output of the face detector with a false positive rate of 10^-2 on a set of test images.