Face Detection and Recognition
Zoltán VÁMOSSY
Budapest Tech, John von Neumann Faculty of Informatics
30/08/2006
IPCV 2006 Budapest
Face tasks in CV
 Face detection: Is there any face in the image (and where)?
– Concerned with a category of objects
 Face localization:
– Aim to determine the image position of a single face
– A simplified detection problem with the assumption that
an input image contains only one face
 Facial feature extraction:
– To detect the presence and location of features such
as eyes, nose, nostrils, eyebrow, mouth, lips, ears, etc
– Usually assume that there is only one face in an image
 Face recognition (identification): concerned with individual identity
 Facial expression recognition
 Human pose estimation
and tracking
Typical Biometric Authentication Workflow
[Diagram] Enrollment subsystem (Enroll): biometric reader → feature extractor → template (1010010…) → database
[Diagram] Authentication subsystem (Authenticate): biometric reader → feature extractor → template (1010010…) → biometric matcher (against the database) → match or no match
Identification vs. Verification
Identification (1:N): biometric reader → biometric matcher against the database → “This person is Emily Dawson”
Verification (1:1): claimed ID (“I am Emily Dawson”) plus biometric reader → biometric matcher against the database → match
Face Recognition Applications
Automatic border control: Sydney airport
Door lock control: Samsung’s Magicgate
A Solved Problem?
 Excellent results and
products:
– Fast, multi pose,
partial occlusion
 Is face recognition
and detection a solved
problem?
 Not quite
[Figure captions: Omron’s face detector; [Liu et al. 04]]
Outline
 Relevant psychophysics/neuroscience issues
 Face Detection
– Detection on gray scale images
– Detection on color images
– Research directions
 Face Recognition
– Still image face recognition
 Face projects at Budapest Tech
Psychophysics Issues
 Is face perception the result of holistic/configural or local feature analysis? Is it a dedicated process?
– Build a special or generic system
 Ranking of significance of facial features
– Different weights for features
 Movement and face recognition
– Favor video based recognition
 Role of race/gender/familiarity
– Suggest adaptive system
Inverted Faces
Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington
More Inverted Faces
Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington
Negated Faces
Human Face Recognition Limitations
Harmon and Julesz, 1973
Degraded images: humans can handle low resolution [Sinha05]
Outline of Detection Part
 Objective
– Survey major face detection works
– Address “how” and “why” questions
– Pros and cons of detection methods
– Research directions
 This part of the presentation is based on the survey by M.-H. Yang:
http://vision.ai.uiuc.edu/mhyang/face-detectionsurvey.html
Face Detection
 Identify and locate
human faces in an
image regardless
of their
– position
– scale
– orientation (in-plane rotation)
– pose (out-of plane rotation)
– and illumination
 Face is a highly non-rigid object
Where are the faces, if any?
Why Is Face Detection Important?
 First step for any fully automatic face
recognition system
 First step in many surveillance systems
 Lots of applications
 A step towards Automatic Target
Recognition (ATR) or generic object
detection/recognition
Why Is Face Detection Difficult?
 Pose (out-of-plane rotation): frontal, 45 degree, profile, upside down
 Presence or absence of structural components: beard, mustache and glasses
 Facial expression: face appearance is directly affected by a person's facial expression
 Occlusion: faces may be partially occluded by other objects
 Orientation (in-plane rotation): face appearance varies directly with rotation about the camera's optical axis
 Imaging conditions: lighting (spectra, source distribution and intensity), camera characteristics (sensor response, gain control, lenses) and resolution
In One Thumbnail Face Image
 Consider a thumbnail 19 × 19 face pattern
 $256^{361}$ possible combinations of gray values
– $256^{361} = 2^{8 \times 361} = 2^{2888}$
 Total world population 6,600,000,000
– $6{,}600{,}000{,}000 \approx 2^{32.6} < 2^{33}$
 The exponent alone is about 87 times larger ($2888 / 33 \approx 87$); the number of patterns exceeds the world population by a factor of about $2^{2855}$
 Extremely high dimensional space!
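The arithmetic on this slide can be checked directly with exact integers. The sketch below is only a sanity check; the constant names are illustrative, and the population figure is the one quoted on the slide.

```python
import math

SIDE = 19                          # thumbnail is 19 x 19 pixels
GRAY_LEVELS = 256                  # 8-bit gray values
WORLD_POPULATION = 6_600_000_000   # figure quoted on the slide

pixels = SIDE * SIDE                                   # 361 pixels
patterns = GRAY_LEVELS ** pixels                       # exact big integer
pattern_bits = pixels * int(math.log2(GRAY_LEVELS))    # 361 * 8 = 2888
population_bits = math.log2(WORLD_POPULATION)          # between 32 and 33

print(pixels, pattern_bits, round(population_bits, 1))
```

The comparison is between exponents: the pattern count needs 2888 bits, while the population count fits in 33.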
What is a Correct Detection?
 Different interpretations of a “correct detection”
 Precision of face location
 Affects the reported results: detection, false positive, false negative rates
Receiver Operating Characteristic Curve
 Useful for detailed performance
assessment
 Plot true positive (TP) proportion against
the false positive (FP) proportion for
various possible settings
 False positive: Predict a face when there is
actually none
 False negative: Predict a non-face where
there is actually one
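A minimal sketch of how such a curve could be computed from detector scores, assuming each candidate window has a confidence score and a ground-truth label. The function name `roc_points` and the toy data are illustrative and not part of the original slides.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: detector confidences (higher means "more face-like").
    labels: 1 for a true face window, 0 for a non-face window.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is detected
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Toy data (made up for illustration): 3 faces and 3 non-faces.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(toy_scores, toy_labels)
```

Lowering the threshold moves along the curve from (0, 0) towards (1, 1): more faces are found, at the price of more false positives.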
Receiver Operating Characteristic (ROC)
 Correct
identification of
nearly all objects
will cause a large
percentage of
unknown objects to
be incorrectly
classified into some
known class
Face Detector: Ingredients
 Target application domain: single image, video
 Representation: holistic, feature, etc
 Pre processing: histogram equalization, etc
 Cues: color, motion, depth, voice, etc
 Search strategy: exhaustive, greedy, focus of
attention, etc
 Classifier design: ensemble, cascade
 Post processing: interpret detection results
In our Focus
 Focus on detecting
– upright, frontal
faces
– in a single grayscale image
– with decent
resolution
– under good lighting
conditions
Methods to Detect/Locate Faces
 Knowledge-based methods:
– Encode human knowledge of what constitutes a typical
face (usually, the relationships between facial features)
 Feature invariant approaches:
– Aim to find structural features of a face that exist even
when the pose, viewpoint, or lighting conditions vary
 Template matching methods:
– Several standard patterns stored to describe the face
as a whole or the facial features separately
 Appearance based methods:
– The models (or templates) are learned from a set of
training images which capture the representative
variability of facial appearance
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks
Knowledge-Based Methods
 Top-down approach: Represent a face using a
set of human-coded rules
 Examples:
– The center part of face has uniform intensity
values
– The difference between the average intensity
values of the center part and the upper part is
significant
– A face often appears with two eyes that are
symmetric to each other, a nose and a mouth
 Use these rules to guide the search process
Knowledge-Based Method: [Yang and Huang 94]
 Multi-resolution focus-of-
attention approach
 Level 1 (lowest
resolution): apply the rule
“the center part of the
face has 4 cells with a
basically uniform
intensity” to search for
candidates
 Level 2: local histogram
equalization followed by
edge detection
 Level 3: search for eye
and mouth features for
validation
Knowledge-Based Method:
[Kotropoulos & Pitas 94]
 Horizontal/vertical projection to search for candidates
 Search eyebrow/eyes, nostrils/nose for validation
 Difficult to detect multiple people or faces in complex backgrounds
Knowledge-based Methods: Summary
 Pros:
– Easy to find simple rules to describe the features of a
face and their relationships
– Based on the coded rules, facial features in an input
image are extracted first and face candidates are
identified
– Work well for face localization in uncluttered
background
 Cons:
– Difficult to translate human knowledge into rules
precisely: detailed rules fail to detect faces and general
rules may find many false positives
– Difficult to extend this approach to detect faces in
different poses: implausible to enumerate all the
possible cases
Feature-based Methods
 Bottom-up approach: Detect facial features
(eyes, nose, mouth, etc) first
– How can we detect facial features: edge,
intensity, shape, texture, color, etc
– Aim to detect invariant features
 Group features into candidates and verify
them
Feature-based Example [Yang 2003]
 Use the Canny operator to find an edge
map
 Separate face from background by
grouping edges
 Best fit an ellipse to the remaining edges
Feature Grouping [Yow and Cipolla 90]
 Apply a 2nd derivative Gaussian filter to
search for interest points
 Group the edges near interest points into
regions
 Each feature and grouping is evaluated
within a Bayesian network
 Handle a few poses
Feature Grouping [Yow and Cipolla 90]
 Face model and
component
 Model facial feature as pair
of edges
 Apply interest point
operator and edge detector
to search for features
 Using Bayesian network to
combine evidence
Random Graph Matching [Leung et al. 95]
 Formulate as a problem to find the
correct geometric arrangement of
facial features
– Facial features are defined by the average responses of multi-orientation, multi-scale Gaussian derivative filters
– Learn the configuration of
features with Gaussian
distribution of mutual distance
between facial features
 Random graph matching among
the candidates to locate faces
Feature-Based Methods: Summary
 Pros:
– Features are invariant to pose and orientation
change
 Cons:
– Difficult to locate facial features due to several kinds of corruption (illumination, noise, occlusion)
– Difficult to detect features in complex
background
Template Matching Methods
 Store a template
– Predefined: based on edges or regions
– Deformable: based on facial contours
(e.g., Snakes)
 Templates are hand-coded (not learned)
 Use correlation to locate faces
Face Template
 Use relative pair-wise
ratios of the brightness of
facial regions (14 × 16
pixels): the eyes are
usually darker than the
surrounding face [Sinha 94]
 Use average area intensity values rather than absolute pixel values
 See also Point
Distribution Model (PDM)
[Lanitis et al. 95]
Template-Based Methods: Summary
 Pros:
– Relatively simple
 Cons:
– Templates need to be initialized near the face images
– Difficult to enumerate templates for different poses (similar to knowledge-based methods)
Appearance-Based Methods
Idea: train a classifier using positive (and
usually negative) examples of faces
 Representation
 Pre processing
 Train a classifier
 Search strategy
 Post processing
Appearance-Based Methods: Classifiers
 Neural network: Multilayer Perceptrons
 Principal Component Analysis (PCA), Factor Analysis
 Support vector machine (SVM)
 Mixture of PCA, Mixture of factor analyzers
 Distribution-based method
 Naive Bayes classifier
 Hidden Markov model
 Sparse network of winnows (SNoW)
 Kullback relative information
 Inductive learning: C4.5
 AdaBoost (Jones and Viola)
 …
Representation
 Holistic: Each image is raster scanned and
represented by a vector of intensity values
 Block-based: Decompose each face image into a
set of overlapping or non-overlapping blocks
– At multiple scales
– Further processed with vector quantization,
Principal Component Analysis, etc.
Face and Non-Face Examples
 Positive examples:
– Get as much variation as possible
– Manually crop and normalize each face image into a
standard size face (e.g., 19 × 19 pixels)
– Creating virtual examples [Sung and Poggio 94]
 Negative examples:
– Fuzzy idea
– Any images that
do not contain faces
– A large image subspace
Distribution-Based Method [Sung & Poggio 94]
 Masking: reduce the
unwanted noise in a face
pattern
 Illumination gradient correction: find the best-fit brightness plane and subtract it from the pattern to reduce heavy shadows caused by extreme lighting angles
 Histogram equalization: compensates for imaging effects due to changes in illumination and different camera input gains
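As an illustration of the histogram equalization step only, here is a minimal pure-Python sketch. It assumes the pattern is a 2-D list of integer gray values; the function name `equalize_histogram` is hypothetical, and masking and plane fitting are not covered.

```python
def equalize_histogram(img, levels=256):
    """Histogram-equalize a 2-D list of integer gray values in [0, levels)."""
    flat = [v for row in img for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    # Cumulative distribution of gray values.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:                      # constant image: nothing to spread
        return [row[:] for row in img]
    # Look-up table stretching the cumulative distribution over the full range.
    lut = [round((c - cdf_min) / (n - cdf_min) * (levels - 1)) if c > 0 else 0
           for c in cdf]
    return [[lut[v] for v in row] for row in img]

equalized = equalize_histogram([[10, 10], [20, 30]])
```

After equalization the darkest value present maps to 0 and the brightest to 255, which makes patterns captured under different camera gains more comparable.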
Creating Virtual Positive Examples
 Simple and very
effective method
 Randomly mirror,
rotate, translate
and scale face
samples by small
amounts
 Less sensitive to
alignment error
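A minimal sketch of generating virtual positive examples, assuming face samples are small 2-D lists of gray values. Only mirroring and small translations are shown; rotation and scaling are omitted for brevity. All function names and parameters here are illustrative.

```python
import random

def mirror(img):
    """Flip a 2-D list of gray values left to right."""
    return [row[::-1] for row in img]

def translate(img, dy, dx, fill=0):
    """Shift the image by (dy, dx) pixels, padding uncovered pixels with `fill`."""
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)] for y in range(h)]

def virtual_examples(img, count, max_shift=1, seed=0):
    """Create `count` randomly mirrored and shifted copies of one face sample."""
    rng = random.Random(seed)
    out = []
    for _ in range(count):
        sample = mirror(img) if rng.random() < 0.5 else img
        out.append(translate(sample,
                             rng.randint(-max_shift, max_shift),
                             rng.randint(-max_shift, max_shift)))
    return out
```

Training on such slightly perturbed copies is what makes the classifier less sensitive to alignment error in the cropped faces.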
Search over Space and Scale
 Scan an input image at one-pixel increments horizontally and vertically
 Downsample the input image by a factor of 1.2 and continue to search
Continue to Search over Space and Scale
 Continue to downsample the input image
and search until the image size is too small
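The search over space and scale can be sketched as follows, assuming a fixed-size window classifier passed in as a callable. The names `downsample`, `scan_pyramid` and `classify` are hypothetical, and nearest-neighbour resampling is used only to keep the example short.

```python
def downsample(image, factor):
    """Nearest-neighbour downsampling of a 2-D list of gray values."""
    new_h = int(len(image) / factor)
    new_w = int(len(image[0]) / factor)
    return [[image[int(r * factor)][int(c * factor)] for c in range(new_w)]
            for r in range(new_h)]

def scan_pyramid(image, classify, window=19, factor=1.2):
    """Yield (row, col, scale) for every window the classifier accepts."""
    scale = 1.0
    while len(image) >= window and len(image[0]) >= window:
        for r in range(len(image) - window + 1):          # one-pixel steps
            for c in range(len(image[0]) - window + 1):
                patch = [row[c:c + window] for row in image[r:r + window]]
                if classify(patch):
                    # Map the hit back to original-image coordinates.
                    yield int(r * scale), int(c * scale), scale
        image = downsample(image, factor)                  # next pyramid level
        scale *= factor
```

The window size stays fixed while the image shrinks, so larger faces are found at coarser pyramid levels.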
Detecting Faces in Different Pose
 Probabilistic Modeling of
Local Appearance
 Extend to detect faces in
different pose with
multiple detectors
 Each detector specializes
to a view: frontal, left pose
and right pose
[Schneiderman and Kanade 98]
Viola & Jones: Robust Real-time Face Detection
Viola and Jones
 Goals
 Motivation
 Features (rectangle features & integral
image)
 Learning classifier functions (AdaBoost)
 Cascade of classifiers
 Results
Goals of Viola and Jones
 Robust real-time face detection in images
 Decide for each sub-window (starting at
24x24) of the image whether it is a face
Motivation
 Rapid face detection in images and image
sequences
 Previous approaches too slow
– They are using auxiliary information
(image differences, colors)
 Only grey scale images needed
The Framework of the Method
 Haar-like Feature: A feature representation
 Integral Image: A new image representation
and a fast Haar-like feature calculation
method
 Perceptron: Classifier, uses the features
 Boost Algorithm: An efficient classifier
generating algorithm
 Cascade: Method for combining classifiers
Features
 Detection based on features, not pixels
 Why?
– Features can encode domain knowledge
that is difficult to learn
– Faster than a pixel-based system
– Scale independent
Rectangle Features
 (Sum of pixels in black rectangles) - (sum of
pixels in white rectangles)
 Features are very simple → rapid computation
 How to compute them rapidly?
Integral Image
 Rectangle Features
can be computed
rapidly using an
intermediate
representation of the
image: integral image
 Value at point (x,y) is
the sum of all the pixel
values above and to
the left
Integral Image
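 The integral image can be computed with the following recurrences:
$$s(x, y) = s(x, y-1) + i(x, y)$$
$$ii(x, y) = ii(x-1, y) + s(x, y)$$
 where
– i(x, y): original image, ii(x, y): integral image
– s(x, y): cumulative column sum, with s(x, -1) = 0 and ii(-1, y) = 0
 One pass over the image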
Computation of Rectangular Sums
Sum of rectangle D: ii(4) – ii(3) – ii(2) + ii(1)
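A minimal pure-Python sketch of the integral image and the four-reference rectangle sum, assuming the image is a 2-D list of gray values indexed as `img[y][x]`. The function names and the left-minus-right sign convention of the two-rectangle feature are illustrative choices, not taken from the slides.

```python
def integral_image(img):
    """ii[y][x] = sum of img over all rows <= y and columns <= x (one pass)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for x in range(w):
        s = 0                                   # cumulative column sum s(x, y)
        for y in range(h):
            s += img[y][x]                      # s(x, y) = s(x, y-1) + i(x, y)
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom and columns left..right (inclusive)."""
    def at(y, x):                               # integral image is 0 outside
        return ii[y][x] if y >= 0 and x >= 0 else 0
    # Four array references: ii(4) - ii(3) - ii(2) + ii(1).
    return (at(bottom, right) - at(top - 1, right)
            - at(bottom, left - 1) + at(top - 1, left - 1))

def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a (height x 2*width) window."""
    left_sum = rect_sum(ii, top, left, top + height - 1, left + width - 1)
    right_sum = rect_sum(ii, top, left + width,
                         top + height - 1, left + 2 * width - 1)
    return left_sum - right_sum
```

Once the integral image is built, the cost of any rectangle sum is independent of the rectangle size, which is what makes evaluating features at every position and scale affordable.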
Integral Image
 Rectangular sum can be computed in four
array references
 Two-rectangle features need six array
references
 Three-rectangle features need eight array
references
 Four-rectangle features need nine array
references
Feature properties
 Rough, but sensitive to simple image
structures
 Only 2 orientations
 Extremely efficient
 Efficiency compensates for these disadvantages
 Face detection for an entire image at every
scale in real-time
Learning Classification Functions
 Given: feature set and training set of positive +
negative face images
– In principle any machine learning approach
can be used (SVM, neural network, …)
– But: 45 396 rectangle features for each sub-window
 Experiments showed that only a few features are needed for an efficient classifier
 How to find them?
AdaBoost-Based Detector [Viola and Jones 01]
 Main idea:
– Feature selection: select important features
– Focus of attention: focus on potential regions
– Use an integral image for fast feature evaluation
 Use AdaBoost to learn
– A small set of important features (feature selection)
 sort them in the order of importance
 each feature can be used as a simple (weak)
classifier
– A cascade of classifiers that
 combine all the weak classifiers to do a difficult task
 filter out the regions that most likely do not contain
faces
Feature Selection [Viola and Jones 01]
 Training: If x is a face, then x
– most likely has feature 1 (easiest feature, and of greatest importance)
– very likely to have feature 2 (easy feature)
– …
– likely to have feature n (more complex feature, and of less importance since it does not exist in all the faces in the training set)
 Testing: Given a test sub-image x’
– if x’ has feature 1:
   test whether x’ has feature 2
    – …
    – test whether x’ has feature n
    – else, it is not a face
   else, it is not a face
– else, it is not a face
 Similar to a decision tree
AdaBoost
 Variant of AdaBoost is used both to select
features and train the classifier
 AdaBoost is used to boost classification
performance of a simple (weak) learning
algorithm
 It combines a collection of weak classifiers to form a stronger one
 The weak learning algorithm is designed to select
the single rectangle feature which best separates
the positive and negative examples
AdaBoost
 Searches over the set of possible weak
classifiers and selects those with the lowest
classification error
 Greedy feature selection process
 Efficient for selecting a small number of “good”
features (fj(x)) with significant variety
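Weak classifier: $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$, otherwise $0$ (threshold $\theta_j$, parity $p_j$)
 Error rates between 0.1 and 0.3 (early rounds) and between 0.4 and 0.5 (later rounds)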
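AdaBoost in Detail
 Given example images $(x_1, y_1), \dots, (x_n, y_n)$, where $y_i = 0, 1$ for negative and positive examples respectively
 Initialize weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$ respectively, where m and l are the numbers of negatives and positives
 For t = 1,…,T:
1. Normalize the weights: $w_{t,i} \leftarrow w_{t,i} / \sum_{j=1}^{n} w_{t,j}$
2. For each feature j, train a classifier $h_j$ which is restricted to using a single feature; its error is $\epsilon_j = \sum_i w_{t,i} \, |h_j(x_i) - y_i|$
3. Choose the classifier $h_t$ with the lowest error $\epsilon_t$
4. Update the weights: $w_{t+1,i} = w_{t,i} \, \beta_t^{1 - e_i}$, where $e_i = 0$ if $x_i$ is correctly classified, else $e_i = 1$, and $\beta_t = \epsilon_t / (1 - \epsilon_t)$
 Final strong classifier: $h(x) = 1$ if $\sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t$, otherwise $0$, where $\alpha_t = \log(1 / \beta_t)$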
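The boosting loop (normalize the weights, train one single-feature classifier per feature, keep the one with the lowest weighted error, re-weight the examples) can be sketched as below. This is an illustrative toy version, assuming precomputed feature values and an exhaustive threshold search; the names `train_adaboost` and `predict` are hypothetical, and the weight update follows the standard β = ε / (1 − ε) formulation.

```python
import math

def train_adaboost(features, labels, rounds):
    """features[i][j]: value of feature j on example i; labels[i] in {0, 1}.

    Returns a list of (alpha, feature index, threshold, parity) weak classifiers.
    """
    n, n_feat = len(labels), len(features[0])
    l = sum(labels)                  # number of positives
    m = n - l                        # number of negatives
    w = [1 / (2 * l) if y == 1 else 1 / (2 * m) for y in labels]
    strong = []
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]                       # 1. normalize
        best = None
        for j in range(n_feat):                            # 2. one stump per feature
            for theta in sorted({f[j] for f in features}):
                for p in (1, -1):
                    h = [1 if p * f[j] < p * theta else 0 for f in features]
                    err = sum(wi for wi, hi, y in zip(w, h, labels) if hi != y)
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, h)
        err, j, theta, p, h = best                         # 3. lowest error
        err = min(max(err, 1e-10), 1 - 1e-10)              # guard against log(0)
        beta = err / (1 - err)
        # 4. shrink the weights of correctly classified examples (e_i = 0)
        w = [wi * (beta if hi == y else 1.0) for wi, hi, y in zip(w, h, labels)]
        strong.append((math.log(1 / beta), j, theta, p))
    return strong

def predict(strong, x):
    """Final strong classifier: weighted vote against half the total alpha."""
    vote = sum(a for a, j, theta, p in strong if p * x[j] < p * theta)
    return 1 if vote >= 0.5 * sum(a for a, _, _, _ in strong) else 0
```

Because each weak classifier looks at exactly one feature, the list returned by training doubles as the selected feature set.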
Feature Selection by AdaBoost
 First and second feature
selected by AdaBoost
 60 microprocessor
instructions
AdaBoost – Learning Results
PIII 700 MHz, 2001
 200-feature-classifier
 Detection rate: 95%
 False positive rate: 1/14084
 Takes 0.7 seconds to scan 384x288 image
 Adding features to the classifier increases
computation time
Cascade of Classifiers
 0.7 s per image is too slow
 Idea: Cascade of Classifiers to reduce
computation time
 Simple Classifiers to reject majority of negative
sub-windows:
[Diagram: all sub-windows → classifier 1 –T→ classifier 2 –T→ classifier 3 –T→ further processing; an F at any stage sends the sub-window to the rejected sub-windows. Many sub-windows have no faces.]
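The cascade's early-rejection control flow can be sketched as below. The two stage functions are hypothetical stand-ins operating on a flat list of gray values; a real cascade would use boosted feature classifiers at each stage.

```python
def cascade_classify(window, stages):
    """Return True only if every stage accepts the window.

    stages: list of callables, cheapest first; each returns True (maybe a face)
    or False (reject). Evaluation stops at the first rejection.
    """
    for stage in stages:
        if not stage(window):
            return False      # rejected sub-window: no further processing
    return True               # survived all stages: pass on for further processing

# Hypothetical stand-in stages, for illustration only.
def bright_enough(window):
    return sum(window) / len(window) > 50

def has_contrast(window):
    return max(window) - min(window) > 30

stages = [bright_enough, has_contrast]
```

Since most sub-windows are rejected by the first cheap stages, the average cost per window stays far below the cost of evaluating every stage.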
Cascade of Classifiers - Experiment
 Monolithic 200-feature classifier vs. cascade of ten 20-feature classifiers
 Trained with 5000 faces and 10000 non-face
sub-windows
Results - Training
 Structure of the Cascade:
– 32 layers with 4297 features
– First layer uses two features, 60% non-faces
rejected, close to 100% faces detected
– Second layer uses five features, rejects 80%
non-faces
– 3rd 20-feature-classifier
– 4th 50-feature-classifier
– 5th 100-feature-classifier
– 20th 200-feature-classifier
Results - Speed
 Speed of the detector depends on the
number of features evaluated
 On the MIT+CMU test set, an average of 8 features (out of 4297) are evaluated per sub-window; on a 700 MHz Pentium III, a 384x288 image takes 0.067 s (2001)
Outlook
 Viola & Jones improved their detector to detect
not only frontal faces, but also non-upright and
non-frontal faces
 Additional rectangle features
 Pose estimator
 Classifier for each rotation class (12)
Haar-like Features Extension
 45º rotated rectangle
feature by Lienhart
 Feature is represented by
(type, x, y, w, h): type
[1(a), 2(b), etc.]; (x, y) for
location in detection
window and (w, h) for
feature size
 Number of features: for a 24x24 window there are 117,941 different features, a large number
Integral Image, Fast Feature
Calculation
 Corresponding to the original (upright) and 45º rotated features, there are two types of integral image, which Lienhart calls the Summed Area Table (SAT) and the Rotated Summed Area Table (RSAT)
SAT(x,y)
RSAT(x,y)
Integral Image, Fast Feature
Calculation
 SAT and RSAT Calculation:
 Sum of Rect Calculation:
 Feature value: calculated as the weighted
sum of the rectangles
Appearance-Based Methods: Summary
 Pros:
– Use powerful machine learning algorithms
– Have demonstrated good empirical results
– Fast and fairly robust
– Extended to detect faces in different poses and orientations
 Cons
– Usually needs to search over space and scale
– Need lots of positive and negative examples
– Limited view-based approach
Color-Based Face Detector
 Distribution of skin color across different ethnic
groups
– Under controlled illumination conditions:
compact
– Arbitrary conditions: less compact
 Color space
– RGB, normalized RGB, HSV, HSI, YCrCb, YIQ, YES, CIE XYZ, CIE LUV, …
 Statistical analysis
– Histogram, look-up table, Gaussian model,
mixture model, …
Methods of Skin Detection
 Pixel-Based Methods
– Classify each pixel as skin or non-skin individually,
independently from its neighbors
– Simpler, faster
– Color Based Methods fall in this category
 Region Based Methods
– Try to take the spatial arrangement of skin pixels into account during the detection stage to enhance the method’s performance
– More computationally intensive
– Additional knowledge in terms of texture etc. is required
Skin Color Based Methods - Advantages
 Allows fast processing
 Robust to geometric variations of the skin patterns
 Robust under partial occlusion
 Robust to resolution changes
 Experience suggests that human skin has a characteristic color, which is easily recognized by humans
Skin Modelling
 Explicitly defined skin region
– Using simple rules
 Non-parametric skin distribution modeling: estimate the skin color distribution from the training data without deriving an explicit model of the skin
– Look up table or Histogram Model
– Bayes Classifier
 Parametric skin distribution modeling: derive a parametric model from the training set
Explicit Rule
 Define explicitly (through a number of rules) the boundaries of the skin cluster in some color space
 E.g., in a chosen space (RGB, rgb, HSI, YUV, etc.)
– (R’,G’,B’) is classified as skin if:
 R’ > 95 and G’ > 40 and B’ > 20,
 Max{R’,G’,B’} – min{R’,G’,B’} > 15, and
 |R’-G’| > 15 and R’ > G’ and R’ > B’
Peer (2003)
 The simplicity of this method is attractive
 Simplicity of skin detection rules leads to construction of a
very rapid classifier
 The main difficulty in achieving high recognition rates is the need to find
– A good color space
– Adequate decision rules
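The explicit RGB rule attributed to Peer (2003) on this slide translates almost literally into code. The sketch below assumes 8-bit channel values; the function name `is_skin_rgb` is illustrative.

```python
def is_skin_rgb(r, g, b):
    """Explicit RGB skin rule from the slide (8-bit channel values assumed)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

# Example colors (made up for illustration).
print(is_skin_rgb(220, 170, 140))   # a light reddish tone
print(is_skin_rgb(128, 128, 128))   # mid gray
```

The rule needs only a handful of comparisons per pixel, which is why such classifiers are very fast, but its quality depends entirely on how well the thresholds fit the chosen color space and lighting.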
Example: YCrCb Space [Yoo and Kim]
 Skin-colors form a cluster in YCrCb color space
 We can approximate the skin color region with
several lines or with an ellipse
Color Histogram
 If
– h(rk, gk, bk)=nk,
 Then
– p(rk, gk, bk) = nk / N
– p(r’k, g’k, b’k) = (N-nk) / N
 That is:
– If nk of the N pixels in the image have the skin color (rk, gk, bk), the probability that any pixel is a skin pixel of that color is nk/N
– and the probability that it is NOT is (N-nk)/N
Bayes Probability - Recall
 P(A,B) = P(A|B) x P(B)
 P(A,B) = P(B|A) x P(A)
 If P(B) + P(~B) = 1, and P(A) + P(~A) = 1, then
– P(A) = P(A|B) x P(B) + P(A|~B) x P(~B)
– P(B) = P(B|A) x P(A) + P(B|~A) x P(~A)
 P(A|B) x P(B) = P(B|A) x P(A)
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} = \frac{P(B|A)\,P(A)}{P(B|A)\,P(A) + P(B|{\sim}A)\,P({\sim}A)}$$
Bayes Classifier (1)
 p(r,g,b|skin) is a conditional probability: the probability of observing the color (r,g,b), knowing that we see a skin pixel
 A more appropriate measure for skin detection would be p(skin|r,g,b): the probability of observing skin, given a concrete (r,g,b) color value
 To compute this probability, the Bayes rule is used:
$$p(\text{skin}|r,g,b) = \frac{p(r,g,b|\text{skin})\,p(\text{skin})}{p(r,g,b|\text{skin})\,p(\text{skin}) + p(r,g,b|\text{non-skin})\,p(\text{non-skin})}$$
Bayes Classifier (2)
 p(r,g,b|skin) and p(r,g,b|non-skin) can be
computed from skin and non-skin color
histograms
 The prior probabilities p(skin) and p(non-skin)
can also be estimated from the overall number
of skin and non-skin samples in the training set
An inequality
p(skin|r,g,b) >= Q,
where Q is a threshold value, can be used as a
skin detection rule
Bayes Classifier (3)
 Consider normalised RGB space
 Histograms approximate probability densities
– p(skin) can be estimated as the ratio of the skin size
Nskin to the size of the image Ntotal ,
– p(non-skin) can be estimated as the ratio of the non-skin size Nnon-skin to the size of the image Ntotal ,
– p((r,g,b)|skin) the probability density of (r,g,b) given
the skin,
– p((r,g,b)|non-skin) the probability density of (r,g,b)
given the non-skin,
Bayes Classifier (4)
 Then we can compute for any color (r,g,b)
the probability p(skin|(r,g,b)) → Probability
Map of the image
 Skin Probability Map (SPM)
 That is:
Ratio of
histograms
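A minimal sketch of the histogram-based Bayes classifier and the resulting skin probability map, assuming labeled training pixels given as (r, g, b) tuples. In practice the colors would usually be quantized into coarser bins first. All function names are illustrative.

```python
from collections import Counter

def train_skin_model(skin_pixels, non_skin_pixels):
    """Build color histograms and sample counts from labeled (r, g, b) pixels."""
    skin_hist, non_hist = Counter(skin_pixels), Counter(non_skin_pixels)
    return skin_hist, non_hist, len(skin_pixels), len(non_skin_pixels)

def p_skin_given_color(color, model):
    """Bayes rule: p(skin | color) from the two histograms."""
    skin_hist, non_hist, n_skin, n_non = model
    p_skin = n_skin / (n_skin + n_non)                 # prior p(skin)
    p_non = 1.0 - p_skin                               # prior p(non-skin)
    like_skin = skin_hist[color] / n_skin              # p(color | skin)
    like_non = non_hist[color] / n_non                 # p(color | non-skin)
    evidence = like_skin * p_skin + like_non * p_non
    return like_skin * p_skin / evidence if evidence > 0 else 0.0

def skin_mask(image, model, q=0.5):
    """Threshold the skin probability map: True where p(skin | color) >= q."""
    return [[p_skin_given_color(px, model) >= q for px in row] for row in image]
```

With priors estimated from the sample counts, the posterior reduces to the skin count of a color divided by its total count, which is the ratio of histograms.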
Bayes Classifier from [Rehg & Jones]
 Skin photos:
 Non skin photos:
Results from [Rehg & Jones]
 Used 18 696 images to build a general color model
 Density is concentrated around the gray line and is more sharply peaked at white than black
 Most colors fall on or near the gray line
 Black and white are by far the most frequent colors, with white occurring slightly more frequently
 There is a marked skew in the distribution toward the red corner of the color cube
 77% of the possible 24 bit RGB colors are never encountered (i.e. the histogram is mostly empty)
 52% of web images have people in them
Marginal Distributions [Rehg & Jones]
Skin model [Rehg & Jones]
Non Skin Model [Rehg & Jones]
Skin and Non-Skin Color Model
 Analyze 1 billion labeled pixels
 Skin and non-skin models
 A significant degree of separation between the skin and non-skin models
 Achieves 80% detection rate with 8.5%
false positives
 Histogram method outperforms Gaussian
mixture model
Experimental Results [Jones and Rehg 02]
 May have lots of
false positives in
the raw results
 Need further
processing to
eliminate false
negatives and
group color pixels
for face detection
Results from [Zarit et al.]
 They compared 5 different color spaces CIELab, HSV,
HS, Normalized RGB and YCrCb
 Four different metrics are used to evaluate the results of
the skin detection algorithms
– C %– Skin and Non Skin pixels identified correctly
– S %– Skin pixels identified correctly
– SE – Skin error – skin pixels identified as non skin
– NSE – Non Skin error – non skin pixels identified as
skin
 They compared the 5 color spaces with 2 color models: look-up table and Bayes classifier
Look up table results
 HSV, HS
gave the
best results
 Normalized
rg is not far
behind
 CIELAB and
YCrCb gave
poor results
Color-Based Face Detector: Summary
 Pros:
– Easy to implement
– Effective and efficient in constrained
environment
– Insensitive to pose, expression, rotation
variation
 Cons:
– Sensitive to environment and lighting change
– Noisy detection results (body parts, skin-tone-like regions)
Video-Based Face Detector
 Motion cues:
– Frame differencing
– Background modeling and subtraction
 Can also use depth cues (e.g., from stereo) when available
 Reduces the search space dramatically
HONDA Humanoid Robot: ASIMO
 Uses motion and depth cues
– Motion cue: from a gradient-based tracker
– Depth cue: from stereo camera
 Dramatically reduces the search space
 Cascade face detector
Video-Based Detectors: Summary
 Pros:
– An easier problem than detection in still
images
– Use all available cues: motion, depth, voice,
etc. to reduce search space
 Cons:
– Need efficient and effective methods to
process the multimodal cues
– Data fusion
Performance Evaluation
 Need to set the evaluation criteria/protocol
– Training set
– Test set
– What is a correct detection?
– Detection rate: false positives/negatives
– Precision of face location
– Speed: training/test stage
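One possible answer to "what is a correct detection?" is an overlap test between detected and annotated boxes. The sketch below assumes an intersection-over-union (IoU) criterion with a 0.5 threshold and greedy one-to-one matching; the slides do not prescribe this protocol, and the function names and boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def evaluate(detections, ground_truth, min_iou=0.5):
    """Count hits, false positives and misses with one-to-one matching."""
    unmatched = list(ground_truth)
    hits = 0
    false_positives = 0
    for det in detections:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= min_iou:
            unmatched.remove(best)  # each annotated face may be matched once
            hits += 1
        else:
            false_positives += 1
    rate = hits / len(ground_truth) if ground_truth else 0.0
    return {
        "detection_rate": rate,
        "false_positives": false_positives,
        "false_negatives": len(unmatched),
    }


ground_truth = [(10, 10, 50, 50), (100, 100, 140, 140)]
detections = [(12, 12, 52, 52), (200, 200, 240, 240)]
result = evaluate(detections, ground_truth)
# result == {"detection_rate": 0.5, "false_positives": 1, "false_negatives": 1}
```

Changing `min_iou` changes the reported numbers for the same detector, which is why the criterion has to be fixed before results are compared.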
Training Sets
 Cropped and preprocessed data set with face
and non-face images provided by MIT CBCL:
http://www.ai.mit.edu/projects/cbcl.old/softwaredatasets/index.html
Standard Test Sets
 MIT test set: subsumed by CMU test set
 CMU test set (http://www.cs.cmu.edu/~har)
(de facto benchmark): 130 gray scale
images with a total of 507 frontal faces
 CMU profile face test set: 208 images with
faces in profile views
 Kodak data set (Eastman Kodak Corp):
faces of multiple sizes and poses under varying
lighting conditions in color images
CMU Test Set I: Upright Frontal Faces
 130 images with 507 frontal faces
 Collected by K.-K. Sung and H. Rowley
 Including 23 images used in [Sung and Poggio 94]
 Some images have low resolution
 De facto benchmark
CMU Test Sets: Rotated and Profile
Faces
 Rotated faces: 50 images with 223 faces
with in-plane rotation, collected
by H. Rowley
 Profile faces: 208 images with 441 faces
in profile views, collected by
H. Schneiderman
Research Issues
 Representation: How to describe a typical face?
 Scale: How to deal with faces of different sizes?
 Search strategy: How to spot these faces?
 Speed: How to speed up the process?
 Precision: How to locate the faces precisely?
 Post processing: How to combine detection
results?
Face Detection: A Solved Problem?
 Not quite yet…
 Factors:
– Shadows
– Occlusions
– Robustness
– Resolution
 Lots of potential
applications
 Can be applied to
other domains
Outline of Recognition Part
 Objective
– Categorization
– Brief description of representative face
recognition techniques
– Open research issues
 This part of the presentation is based on the
survey by W. Zhao, R. Chellappa, J. Phillips and A.
Rosenfeld: Face Recognition: A Literature Survey,
ACM Computing Surveys, vol. 35, Dec. 2003
Recognition from Intensity Images
High-level Categorization
 1. Holistic matching methods
– Classification using whole face region
 2. Feature-based (structural) matching methods
– Structural classification using local features
 3. Hybrid Methods
– Using local features and whole face region
Holistic Approaches
 Principal Component Analysis (PCA)
– Eigenface
– Fisherface/Subspace LDA
– SVM
– ICA
 Other Representations
– LDA/FLD
– PDBNN
IPCV, Saint-Etienne, 2004, Prof. V. Bochko
What is PCA?
 Principal component analysis (PCA) is a
technique for extracting a smaller set of
variables with less redundancy from
high-dimensional data sets in order to retain as
much of the information as possible
 A small number of principal components
often gives most of the structure in the data
IPCV, Saint-Etienne, 2004, Prof. V. Bochko
Standard PCA
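 Assume that an n-dimensional observed vector is $x = (x_1, \dots, x_n)^T$
 The vector is centered: $x \leftarrow x - m$, where $m = E\{x\}$
 The covariance matrix is $C = E\{x x^T\}$, where $E\{\cdot\}$ is an
expectation
 The eigendecomposition is $C = U \Lambda U^T$
 The matrix of eigenvectors is $U = (u_1, \dots, u_n)$, where $U^T U = I$
 The eigenvectors $u_i$ correspond to the eigenvalues $\lambda_i$
 The eigenvalues are $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$
 The eigenvalues are in descending order: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$
 The first k principal components are $y = U_k^T x$, with $U_k = (u_1, \dots, u_k)$
 Finally, the reconstruction is $\hat{x} = U_k y$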
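The PCA steps listed above can be sketched in NumPy as follows. This is an illustration under assumptions: samples are stored as rows, the expectation is replaced by a sample average, and the function name and synthetic data are hypothetical.

```python
import numpy as np


def pca(X, k):
    """PCA of samples stored as rows of X (shape: samples x n)."""
    mean = X.mean(axis=0)
    Xc = X - mean                             # center the data
    C = (Xc.T @ Xc) / X.shape[0]              # sample covariance, n x n
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    U_k = eigvecs[:, :k]                      # first k eigenvectors
    Y = Xc @ U_k                              # first k principal components
    X_hat = Y @ U_k.T + mean                  # reconstruction from k components
    return eigvals, U_k, Y, X_hat


# Synthetic 3D data that varies almost entirely along one direction.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2.0 * t, -t]) + 0.01 * rng.normal(size=(200, 3))
eigvals, U_k, Y, X_hat = pca(X, k=1)
```

Because the data are nearly one-dimensional, a single component reconstructs them closely; keeping all three components reproduces `X` up to rounding.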
Facial Recognition: PCA - Overview
 Create training set of faces and
calculate the eigenfaces
 Project the new image onto the
eigenfaces
 Check if image is close to “face
space”
 Check closeness to one of the
known faces
 Add unknown faces to the
training set and re-calculate
Facial Recognition: PCA Training
 Find average of training
images
 Subtract average face
from each image
 Create covariance matrix
 Generate eigenfaces
 Each original image can
be expressed as a linear
combination of the
eigenfaces – face space
Facial Recognition: PCA Recognition
 A new image is projected into the “face space”
 Create a vector of weights that describes
this image
 The distance from the original image to its
projection in face space, and the distance to
each known face, are computed
 If they are within certain thresholds, then it is a
recognized face
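Training and recognition with eigenfaces can be sketched together as below. Several things here are assumptions: eigenfaces are obtained from an SVD of the centered images rather than from an explicit covariance matrix, the two thresholds are arbitrary, the return convention is invented for the example, and random vectors stand in for face images.

```python
import numpy as np


def train_eigenfaces(faces, k):
    """faces: (m, d) array with one flattened image per row."""
    mean_face = faces.mean(axis=0)
    A = faces - mean_face                      # subtract the average face
    # The right singular vectors of A are the covariance eigenvectors,
    # so the d x d covariance matrix never has to be formed.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    eigenfaces = Vt[:k]                        # (k, d), one eigenface per row
    weights = A @ eigenfaces.T                 # weight vector of each known face
    return mean_face, eigenfaces, weights


def recognize(image, mean_face, eigenfaces, weights, space_thr, face_thr):
    """Index of the matching known face, -1 for an unknown face, None if not a face."""
    centered = image - mean_face
    w = eigenfaces @ centered                  # project into face space
    reconstruction = eigenfaces.T @ w
    if np.linalg.norm(centered - reconstruction) > space_thr:
        return None                            # too far from face space
    dists = np.linalg.norm(weights - w, axis=1)
    best = int(np.argmin(dists))
    return best if dists[best] <= face_thr else -1


rng = np.random.default_rng(0)
faces = rng.normal(size=(6, 64))               # stand-in for six 8x8 face images
mean_face, eigenfaces, weights = train_eigenfaces(faces, k=5)

query = faces[2] + 0.01 * rng.normal(size=64)  # slightly noisy copy of face 2
match = recognize(query, mean_face, eigenfaces, weights, space_thr=1.0, face_thr=1.0)
non_face = recognize(10.0 * np.ones(64), mean_face, eigenfaces, weights, 1.0, 1.0)
```

The two tests correspond to the two checks in the overview: distance to face space decides whether the input is a face at all, and distance between weight vectors decides whose face it is.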
Feature Based Approaches
 Elastic bunch graph matching using Gabor
wavelets
Detecting Feature Points
 Using Gabor wavelets
 Create filters at 5 frequencies and 8 angles
[Figure: Gabor wavelet, spatial figure and top view]
Gabor filters
 The responses of the filters
[Figure: original image, filtered image, result]
 2D Gabor filters can be used to detect
dimensions along a specific orientation
 Jet: the set of 40 complex coefficients
(5 frequencies × 8 angles) at a reference point
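A jet can be sketched as the responses of a 5 × 8 Gabor filter bank at one pixel. The kernel formula and its parameters (frequency spacing, `sigma`, the 65-pixel kernel size) follow a formulation commonly used with elastic bunch graph matching and are assumptions here, as are the function names and the striped test image.

```python
import numpy as np


def gabor_kernel(nu, mu, size=65, sigma=2.0 * np.pi):
    """Complex Gabor kernel for frequency index nu (0..4) and angle index mu (0..7)."""
    k = 2.0 ** (-(nu + 2) / 2.0) * np.pi       # assumed half-octave frequency spacing
    phi = mu * np.pi / 8.0                     # 8 orientations over 180 degrees
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = (k**2 / sigma**2) * np.exp(-(k**2) * (x**2 + y**2) / (2 * sigma**2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-(sigma**2) / 2)  # removes DC
    return envelope * carrier


KERNELS = [gabor_kernel(nu, mu) for nu in range(5) for mu in range(8)]


def jet_at(image, row, col):
    """Jet: the 40 complex filter responses at one reference point."""
    half = KERNELS[0].shape[0] // 2
    padded = np.pad(image.astype(np.float64), half, mode="edge")
    patch = padded[row:row + 2 * half + 1, col:col + 2 * half + 1]
    # Inner product with each kernel (a correlation); for these kernels it
    # has the same magnitude as the corresponding convolution response.
    return np.array([np.sum(patch * kern) for kern in KERNELS])


# Vertical stripes with a 4-pixel period match the filter nu=0, mu=0.
cols = np.arange(96)
image = np.tile(100.0 * np.cos(np.pi / 2 * cols), (96, 1))
jet = jet_at(image, 48, 48)
strongest = int(np.argmax(np.abs(jet)))
```

The index of the strongest coefficient identifies the dominant frequency and orientation around the point, which is what makes jets usable as local descriptors.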
Face Graph
 Facial fiducial points
– Pupil, tip of mouth, etc.
 Face graph
– Nodes at fiducial pts.
– Un-directed graph
– The structure of graph is
the same for each face
– Fitting a face image to a
face graph is done
automatically
– Some nodes may be
undefined due to
occlusion
 Bunch
– A set of Jets assoc. with
the same fiducial point
– e.g. an eye bunch may
consist of jets for different types
of eyes: open, closed, etc.
 Face bunch graph (FBG):
– Same as a face graph,
except each node
consists of a jet bunch
rather than a jet
Face Bunch Graph
 FBG has the same structure as
individual face graph
– Each node labeled with a
bunch of jets
– Each edge labeled with
average distance between
corresponding nodes in face
samples
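Matching an image graph against a face bunch graph can be sketched as below. The similarity measures are assumptions modeled on a common elastic bunch graph matching formulation: jets are compared by the normalized dot product of their magnitudes, each node takes its best match within the bunch, and edge-length distortion is penalized with a weight `lam`. Random jets stand in for real filter responses.

```python
import numpy as np


def jet_similarity(j1, j2):
    """Normalized dot product of jet magnitudes (phase is ignored)."""
    a, b = np.abs(j1), np.abs(j2)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def bunch_similarity(jet, bunch):
    """Best match between an image jet and any jet in the node's bunch."""
    return max(jet_similarity(jet, candidate) for candidate in bunch)


def graph_similarity(jets, edge_lengths, fbg_bunches, fbg_edge_lengths, lam=1.0):
    """Average node similarity minus a penalty for distorted edge lengths."""
    node_term = np.mean([bunch_similarity(j, b) for j, b in zip(jets, fbg_bunches)])
    d = np.asarray(edge_lengths, dtype=float)
    d_ref = np.asarray(fbg_edge_lengths, dtype=float)  # average lengths stored in the FBG
    edge_term = np.mean(((d - d_ref) / d_ref) ** 2)
    return float(node_term - lam * edge_term)


rng = np.random.default_rng(0)


def random_jet():
    """Hypothetical jet of 40 complex coefficients."""
    return rng.normal(size=40) + 1j * rng.normal(size=40)


# Two nodes, three jets per bunch, one edge of average length 30 pixels.
fbg_bunches = [[random_jet() for _ in range(3)] for _ in range(2)]
fbg_edge_lengths = [30.0]

jets = [fbg_bunches[0][1], fbg_bunches[1][2]]  # image jets that appear in the bunches
score_same = graph_similarity(jets, [30.0], fbg_bunches, fbg_edge_lengths)
score_stretched = graph_similarity(jets, [45.0], fbg_bunches, fbg_edge_lengths)
```

Taking the maximum over the bunch is what lets one node cover several appearances, such as open and closed eyes, without a separate model for each.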
Still Face Recognition: Open Issues
 Addressing the issue of recognition being too
sensitive to inaccurate facial feature localization
 Robustly recognizing faces
– Small and/or noisy images
– Images acquired years apart
– Outdoor acquisition: lighting, pose
 What is the limit on the number of faces that can
be distinguished?
 What is the principled and optimal way to arbitrate
and combine local features and global features?
Web Resources (detection)
 Face detection home page
http://home.t-online.de/home/Robert.Frischholz/face.htm
 Henry Rowley’s home page
http://www-2.cs.cmu.edu/~har/faces.html
 Henry Schneiderman’s home page
http://www.ri.cmu.edu/projects/project_416.html
 MIT CBCL web page
http://www.ai.mit.edu/projects/cbcl.old/softwaredatasets/index.html
 Face detection resources
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html
Web Resources (recognition)
Face Recognition Home Page
http://www.cs.rug.nl/~peterkr/FACE/frhp.html
Face Databases
 UT Dallas
www.utdallas.edu/dept/bbs/FACULTY_PAGES/otoole/database.htm
 Notre Dame database www.nd.edu/~cvrl/HID-data.html
 MIT database ftp://whitechapel.media.mit.edu/pub/images
 Edelman ftp://ftp.wisdom.weizmann.ac.il/pub/FaceBase
 CMU PIE www.ri.cmu.edu/projects/project_418.htm
 Stirling database pics.psych.stir.ac.uk
 M2VTS multimodal www.tele.ucl.ac.be/M2VTS
 Yale database cvc.yale.edu/projects/yalefaces/yalefaces.htm
 Yale database B cvc.yale.edu/projects/yalefacesB/yalefacesB.htm
 Harvard database hrl.harvard.edu/pub/faces
 Weizmann database www.wisdom.weizmann.ac.il/~yael
 UMIST database images.ee.umist.ac.uk/danny/database.html
 Purdue rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html
References (detection)
 M.-H. Yang, D. J. Kriegman, and N. Ahuja,
“Detecting Faces in Images: A Survey” IEEE
Transactions on Pattern Analysis and Machine
Intelligence (PAMI), vol. 24, no. 1, pp. 34-58,
2002. (http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html)
 M.-H. Yang and N. Ahuja, Face Detection and
Hand Gesture Recognition for Human Computer
Interaction, Kluwer Academic Publishers, 2001.
 P. Viola and M. J. Jones, Robust Real-time
Object Detection, Cambridge Research Laboratory
Technical Report Series, CRL 2001/01, Feb. 2001.
Face and Some CV-based
Projects at Budapest Tech
Combined verification
[Figure labels: voice recording, voice sample, LP spectrum, neural network, face mask, original, filter response, features]
Morphing by Automatic Feature
Detection
Eye and Lip Tracking
 Real time face detection
 Eye and mouth motion tracking
 Client-server internet communication
MYRA
 Face detection
 Skin tone based
tracking
 Feature tracking
 Motion detection
 Recognition options:
– Eigenface method
– Gabor wavelet
based bunch
graph
Mobile robots – navigation using CV
Recognition and navigation