Face Detection and Recognition
Zoltán VÁMOSSY
Budapest Tech, John von Neumann Faculty of Informatics
IPCV 2006, Budapest, 30/08/2006

Face Tasks in CV
Face detection: are there any faces in the image, and if so, where?
– Concerned with a category of objects
Face localization:
– Aims to determine the image position of a single face
– A simplified detection problem, with the assumption that the input image contains only one face
Facial feature extraction:
– Detect the presence and location of features such as eyes, nose, nostrils, eyebrows, mouth, lips, ears, etc.
– Usually assumes that there is only one face in the image
Face recognition (identification): concerned with individual identity
Facial expression recognition
Human pose estimation and tracking

Typical Biometric Authentication Workflow
Enrollment subsystem: biometric reader → feature extractor → template (e.g., 1010010…) stored in the database
Authentication subsystem: biometric reader → feature extractor → biometric matcher compares the extracted features against the stored template → match or no match

Identification vs. Verification
Identification (1:N): biometric reader → biometric matcher searches the whole database → "This person is Emily Dawson"
Verification (1:1): the user claims an identity ("I am Emily Dawson") and presents an ID; biometric reader → biometric matcher compares against the stored template for that ID → match / no match (see the sketch below)
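The slides themselves contain no code; as a minimal illustration of the two matching modes above, the following Python sketch contrasts 1:1 verification with 1:N identification. The feature vectors, the Euclidean-distance matcher, the threshold and all names (`database`, `verify`, `identify`) are assumptions of this sketch, not part of the original material.

```python
import numpy as np

# Hypothetical enrollment database: identity -> template feature vector.
# In a real system the templates would come from a face feature extractor.
database = {
    "emily.dawson": np.array([0.11, 0.52, 0.33, 0.84]),
    "john.smith":   np.array([0.72, 0.18, 0.49, 0.05]),
}

def distance(a, b):
    """Assumed matcher: Euclidean distance between feature vectors."""
    return float(np.linalg.norm(a - b))

def verify(claimed_id, probe, threshold=0.25):
    """1:1 verification: compare the probe only with the claimed identity's template."""
    return distance(probe, database[claimed_id]) <= threshold      # True = match

def identify(probe, threshold=0.25):
    """1:N identification: search the whole database for the closest template."""
    best_id, best_d = min(((i, distance(probe, t)) for i, t in database.items()),
                          key=lambda x: x[1])
    return best_id if best_d <= threshold else None                # None = unknown person

probe = np.array([0.10, 0.50, 0.35, 0.80])   # features of the live sample
print(verify("emily.dawson", probe))         # 1:1 decision
print(identify(probe))                       # 1:N decision
```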
Face Recognition Applications
Automatic border control (Sydney airport)
Samsung's Magicgate: door lock control

A Solved Problem?
Excellent results and products:
– Fast, multi-pose, handles partial occlusion (e.g., Omron's face detector [Liu et al. 04])
Is face recognition and detection a solved problem? Not quite.

Outline
Relevant psychophysics/neuroscience issues
Face detection
– Detection in gray-scale images
– Detection in color images
– Research directions
Face recognition
– Still-image face recognition
Face projects at Budapest Tech

Psychophysics Issues
Is face perception the result of holistic/configural or local feature analysis?
Is it a dedicated process?
– Build a special or a generic system
Ranking of significance of facial features
– Different weights for features
Movement and face recognition
– Favors video-based recognition
Role of race/gender/familiarity
– Suggests an adaptive system

Inverted Faces / More Inverted Faces / Negated Faces
(Image examples; courtesy of Prof. Eric H. Chudler, Dept. of Anesthesiology, University of Washington.)

Human Face Recognition Limitations
Degraded images (Harmon and Julesz, 1973)
Humans can handle very low-resolution face images [Sinha05]

Outline of Detection Part
Objective
– Survey major face detection works
– Address "how" and "why" questions
– Pros and cons of detection methods
– Research directions
This part of the presentation is based on the survey by M.-H. Yang:
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

Face Detection
Identify and locate human faces in an image regardless of their
– position
– scale
– in-plane rotation
– orientation
– pose (out-of-plane rotation)
– and illumination
The face is a highly non-rigid object.
Where are the faces, if any?

Why Is Face Detection Important?
First step of any fully automatic face recognition system
First step in many surveillance systems
Lots of applications
A step towards Automatic Target Recognition (ATR) or generic object detection/recognition

Why Is Face Detection Difficult?
Pose (out-of-plane rotation): frontal, 45 degrees, profile, upside down
Presence or absence of structural components: beard, mustache, glasses
Facial expression: face appearance is directly affected by a person's facial expression
Occlusion: faces may be partially occluded by other objects
Orientation (in-plane rotation): face appearance varies directly with rotation about the camera's optical axis
Imaging conditions: lighting (spectra, source distribution and intensity), camera characteristics (sensor response, gain control, lenses), resolution

In One Thumbnail Face Image
Consider a thumbnail 19 × 19 face pattern
256^361 possible combinations of gray values
– 256^361 = 2^(8×361) = 2^2888
Total world population ≈ 6,600,000,000
– 6,600,000,000 ≈ 2^33
Roughly 2^2855 times more patterns than the world population (see the arithmetic check below)
Extremely high-dimensional space!
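The exponents on the thumbnail slide were garbled in extraction; a quick Python check of the arithmetic, recomputed from the numbers given on the slide (not part of the original deck):

```python
import math

patterns_bits = 361 * math.log2(256)        # 19x19 pixels, 256 gray levels each
population_bits = math.log2(6_600_000_000)  # world population, as on the slide

print(f"pattern space ~ 2^{patterns_bits:.0f}")                       # 2^2888
print(f"population    ~ 2^{population_bits:.1f}")                     # about 2^32.6
print(f"ratio         ~ 2^{patterns_bits - population_bits:.0f}")     # about 2^2855
```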
What Is a Correct Detection?
Different interpretations of "correct detection"
Precision of face location
These affect the reported results: detection, false positive, and false negative rates

Receiver Operating Characteristic (ROC) Curve
Useful for detailed performance assessment
Plot the true positive (TP) proportion against the false positive (FP) proportion for various possible settings
False positive: predict a face where there is actually none
False negative: predict a non-face where there is actually a face

Receiver Operating Characteristic (ROC)
Correct identification of nearly all objects will cause a large percentage of unknown objects to be incorrectly classified into some known class

Face Detector: Ingredients
Target application domain: single image, video
Representation: holistic, feature-based, etc.
Pre-processing: histogram equalization, etc.
Cues: color, motion, depth, voice, etc.
Search strategy: exhaustive, greedy, focus of attention, etc.
Classifier design: ensemble, cascade
Post-processing: interpret detection results

In Our Focus
Focus on detecting
– upright, frontal faces
– in a single gray-scale image
– with decent resolution
– under good lighting conditions

Methods to Detect/Locate Faces
Knowledge-based methods:
– Encode human knowledge of what constitutes a typical face (usually, the relationships between facial features)
Feature invariant approaches:
– Aim to find structural features of a face that exist even when the pose, viewpoint, or lighting conditions vary
Template matching methods:
– Several standard patterns are stored to describe the face as a whole or the facial features separately
Appearance-based methods:
– The models (or templates) are learned from a set of training images that capture the representative variability of facial appearance

Agenda (Detection)
Detecting faces in gray-scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
Detecting faces in color images
Detecting faces in video
Performance evaluation
Research directions and concluding remarks

Knowledge-Based Methods
Top-down approach: represent a face using a set of human-coded rules
Examples:
– The center part of the face has uniform intensity values
– The difference between the average intensity values of the center part and the upper part is significant
– A face often appears with two eyes that are symmetric to each other, a nose and a mouth
Use these rules to guide the search process

Knowledge-Based Method: [Yang and Huang 94]
Multi-resolution focus-of-attention approach
Level 1 (lowest resolution): apply the rule "the center part of the face has 4 cells with a basically uniform intensity" to search for candidates (a small sketch of this rule follows below)
Level 2: local histogram equalization followed by edge detection
Level 3: search for eye and mouth features for validation
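A minimal sketch, not from the slides, of the kind of Level-1 mosaic rule described above: the candidate window is reduced to a coarse grid of cell averages and the center cells are required to be roughly uniform. The grid size, the cell layout and the uniformity threshold are illustrative assumptions, not Yang and Huang's actual parameters.

```python
import numpy as np

def center_is_uniform(window, grid=4, max_spread=20.0):
    """Knowledge-based Level-1 rule (illustrative): average the window over a
    coarse grid and require the 2x2 center cells to have similar intensity."""
    h, w = window.shape
    ch, cw = h // grid, w // grid
    cells = window[:ch * grid, :cw * grid].reshape(grid, ch, grid, cw).mean(axis=(1, 3))
    center = cells[grid // 2 - 1: grid // 2 + 1, grid // 2 - 1: grid // 2 + 1]
    return center.max() - center.min() < max_spread   # "basically uniform intensity"

# Usage: scan low-resolution windows and keep only those passing the rule.
window = np.random.randint(0, 256, (16, 16)).astype(float)   # toy candidate window
print(center_is_uniform(window))
```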
Knowledge-Based Method: [Kotropoulos & Pitas 94]
Horizontal/vertical projections to search for candidates
Search for eyebrows/eyes and nostrils/nose for validation
Difficult to detect multiple people or faces in a complex background

Knowledge-Based Methods: Summary
Pros:
– Easy to find simple rules describing the features of a face and their relationships
– Based on the coded rules, facial features in an input image are extracted first and face candidates are identified
– Work well for face localization against uncluttered backgrounds
Cons:
– Difficult to translate human knowledge into rules precisely: detailed rules fail to detect faces and general rules may find many false positives
– Difficult to extend this approach to detect faces in different poses: implausible to enumerate all the possible cases

Agenda (Detection)

Feature-Based Methods
Bottom-up approach: detect facial features (eyes, nose, mouth, etc.) first
– How can we detect facial features: edge, intensity, shape, texture, color, etc.
– Aim to detect invariant features
Group features into candidates and verify them

Feature-Based Example [Yang 2003]
Use the Canny operator to find an edge map
Separate the face from the background by grouping edges
Fit an ellipse to the remaining edges

Feature Grouping [Yow and Cipolla 90]
Apply a 2nd-derivative Gaussian filter to search for interest points
Group the edges near interest points into regions
Each feature and grouping is evaluated within a Bayesian network
Handles a few poses

Feature Grouping [Yow and Cipolla 90]
Face model and components
Model each facial feature as a pair of edges
Apply an interest point operator and an edge detector to search for features
Use a Bayesian network to combine evidence

Random Graph Matching [Leung et al. 95]
Formulated as a problem of finding the correct geometric arrangement of facial features
– Facial features are defined by the average responses of multi-orientation, multi-scale Gaussian derivative filters
– Learn the configuration of features with a Gaussian distribution of the mutual distances between facial features
Random graph matching among the candidates locates the faces

Feature-Based Methods: Summary
Pros:
– Features are invariant to pose and orientation change
Cons:
– Difficult to locate facial features under various types of corruption (illumination, noise, occlusion)
– Difficult to detect features in a complex background

Agenda (Detection)

Template Matching Methods
Store a template
– Predefined: based on edges or regions
– Deformable: based on facial contours (e.g., snakes)
Templates are hand-coded (not learned)
Use correlation to locate faces

Face Template
Use relative pair-wise ratios of the brightness of facial regions (14 × 16 pixels): the eyes are usually darker than the surrounding face [Sinha 94]
Uses average region intensity values rather than absolute pixel values (see the sketch below)
See also Point Distribution Model (PDM) [Lanitis et al. 95]
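A toy sketch of the ratio-template idea on the slide above: average intensities of coarse facial regions are compared pairwise, and a candidate passes only if the expected brighter/darker relations hold. The region coordinates, the relation list and the margin are illustrative assumptions, not Sinha's actual template.

```python
import numpy as np

# Illustrative regions for a 14x16 (rows x cols) window, given as (r0, r1, c0, c1).
REGIONS = {
    "left_eye":  (3, 6, 2, 6),
    "right_eye": (3, 6, 10, 14),
    "forehead":  (0, 3, 4, 12),
    "cheeks":    (7, 10, 3, 13),
}
# Expected pairwise relations: region A should be darker than region B.
DARKER_THAN = [("left_eye", "forehead"), ("right_eye", "forehead"),
               ("left_eye", "cheeks"),   ("right_eye", "cheeks")]

def region_mean(win, box):
    r0, r1, c0, c1 = box
    return win[r0:r1, c0:c1].mean()

def passes_ratio_template(win, margin=1.1):
    """Candidate passes if every 'darker' region is darker by at least `margin`."""
    means = {name: region_mean(win, box) for name, box in REGIONS.items()}
    return all(means[b] > margin * means[a] for a, b in DARKER_THAN)

win = np.random.randint(0, 256, (14, 16)).astype(float)   # toy 14x16 window
print(passes_ratio_template(win))
```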
Template-Based Methods: Summary
Pros:
– Relatively simple
Cons:
– Templates need to be initialized near the face in the image
– Difficult to enumerate templates for different poses (similar to knowledge-based methods)

Agenda (Detection)

Appearance-Based Methods
Idea: train a classifier using positive (and usually negative) examples of faces
Representation
Pre-processing
Train a classifier
Search strategy
Post-processing

Appearance-Based Methods: Classifiers
Neural network: multilayer perceptrons
Principal Component Analysis (PCA), factor analysis
Support vector machine (SVM)
Mixture of PCA, mixture of factor analyzers
Distribution-based method
Naive Bayes classifier
Hidden Markov model
Sparse network of winnows (SNoW)
Kullback relative information
Inductive learning: C4.5
AdaBoost (Viola and Jones)
…

Representation
Holistic: each image is raster-scanned and represented by a vector of intensity values
Block-based: decompose each face image into a set of overlapping or non-overlapping blocks
– At multiple scales
– Further processed with vector quantization, Principal Component Analysis, etc.

Face and Non-Face Examples
Positive examples:
– Get as much variation as possible
– Manually crop and normalize each face image to a standard size (e.g., 19 × 19 pixels)
– Create virtual examples [Sung and Poggio 94]
Negative examples:
– A fuzzy notion
– Any images that do not contain faces
– A very large image subspace

Distribution-Based Method [Sung & Poggio 94]
Masking: reduce the unwanted noise in a face pattern
Illumination gradient correction: find the best-fit brightness plane and subtract it to reduce heavy shadows caused by extreme lighting angles
Histogram equalization: compensates for imaging effects due to changes in illumination and different camera input gains

Creating Virtual Positive Examples
Simple and very effective method
Randomly mirror, rotate, translate and scale face samples by small amounts
Makes the detector less sensitive to alignment error

Search over Space and Scale
Scan the input image at one-pixel increments horizontally and vertically
Then downsample the input image by a factor of 1.2 and continue the search

Continue to Search over Space and Scale
Continue to downsample the input image and search until the image size is too small
(A sliding-window/pyramid sketch follows below.)
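A compact sketch of the exhaustive search over space and scale described above. Assumptions of the sketch: a `classify_window(patch)` callable that returns True for faces, 19 × 19 windows, nearest-neighbour downsampling, and the 1.2 factor from the slide; a real detector would use a trained classifier here.

```python
import numpy as np

def downsample(img, scale=1.2):
    """Nearest-neighbour downsampling by `scale` (kept dependency-free for the sketch)."""
    h, w = img.shape
    ys = (np.arange(int(h / scale)) * scale).astype(int)
    xs = (np.arange(int(w / scale)) * scale).astype(int)
    return img[np.ix_(ys, xs)]

def detect_multiscale(image, classify_window, win=19, scale=1.2):
    """Slide a win x win window over an image pyramid; report boxes (x, y, size)
    in original-image coordinates."""
    detections, factor = [], 1.0
    img = image.astype(float)
    while min(img.shape) >= win:                   # stop when the image gets too small
        h, w = img.shape
        for y in range(h - win + 1):               # one-pixel steps vertically
            for x in range(w - win + 1):           # and horizontally
                if classify_window(img[y:y + win, x:x + win]):
                    detections.append((int(x * factor), int(y * factor), int(win * factor)))
        img = downsample(img, scale)               # next pyramid level
        factor *= scale
    return detections

# Usage with a dummy classifier that never fires:
image = np.random.randint(0, 256, (120, 160))
print(len(detect_multiscale(image, lambda patch: False)))
```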
Detecting Faces in Different Poses
Probabilistic modeling of local appearance
Extend to detect faces in different poses with multiple detectors
Each detector specializes in one view: frontal, left profile, right profile [Schneiderman and Kanade 98]

Viola & Jones: Robust Real-Time Face Detection
Goals
Motivation
Features (rectangle features & integral image)
Learning classifier functions (AdaBoost)
Cascade of classifiers
Results

Goals of Viola and Jones
Robust real-time face detection in images
Decide for each sub-window of the image (starting at 24×24) whether it is a face

Motivation
Rapid face detection in images and image sequences
Previous approaches were too slow
– They use auxiliary information (image differences, colors)
Only gray-scale images are needed

The Framework of the Method
Haar-like feature: the feature representation
Integral image: a new image representation and a fast Haar-like feature calculation method
Perceptron: the classifier that uses the features
Boosting algorithm: an efficient classifier-generating algorithm
Cascade: a method for combining classifiers

Features
Detection based on features, not pixels. Why?
– Features can encode domain knowledge that is difficult to learn
– Faster than a pixel-based system
– Scale independent

Rectangle Features
Feature value = (sum of pixels in black rectangles) − (sum of pixels in white rectangles)
The features are very simple → rapid computation
How can they be computed rapidly?

Integral Image
Rectangle features can be computed rapidly using an intermediate representation of the image: the integral image
The value at point (x, y) is the sum of all pixel values above and to the left of (x, y)

Integral Image
The integral image can be computed in one pass over the image with the recurrences
– s(x, y) = s(x, y−1) + i(x, y)
– ii(x, y) = ii(x−1, y) + s(x, y)
where s(x, y) is the cumulative column sum, s(x, −1) = 0 and ii(−1, y) = 0

Computation of Rectangular Sums
Sum over rectangle D: ii(4) − ii(3) − ii(2) + ii(1)
(1, 2, 3, 4 denote the integral-image values at the corners of D)

Integral Image
A rectangular sum can be computed with four array references
Two-rectangle features need six array references
Three-rectangle features need eight array references
Four-rectangle features need nine array references
(See the short integral-image sketch below.)
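A minimal NumPy sketch of the integral image and the four-corner rectangle sum described above. The function names and the two-rectangle feature layout are mine; a real detector would precompute the integral image once per pyramid level.

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of all pixels above and to the left of (x, y), inclusive.
    Computed in one pass with cumulative sums (the s/ii recurrences)."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four array references:
    ii(4) - ii(3) - ii(2) + ii(1)."""
    b, r = top + height - 1, left + width - 1
    total = ii[b, r]
    if top > 0:
        total -= ii[top - 1, r]
    if left > 0:
        total -= ii[b, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return int(total)

def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    return rect_sum(ii, top, left, height, half) - rect_sum(ii, top, left + half, height, half)

img = np.random.randint(0, 256, (24, 24))
ii = integral_image(img)
assert rect_sum(ii, 5, 5, 10, 10) == img[5:15, 5:15].sum()   # sanity check
print(two_rect_feature(ii, 0, 0, 24, 24))
```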
Feature Properties
Rough, but sensitive to simple image structures
Only two orientations
Extremely efficient, which compensates for the disadvantages
Face detection over an entire image at every scale in real time

Learning Classification Functions
Given: the feature set and a training set of positive and negative face images
– In principle any machine learning approach can be used (SVM, neural network, …)
– But: 45,396 rectangle features for each sub-window
Experiments showed that only a few features are needed for an efficient classifier
How to find them?

AdaBoost-Based Detector [Viola and Jones 01]
Main idea:
– Feature selection: select the important features
– Focus of attention: focus on potential regions
– Use the integral image for fast feature evaluation
Use AdaBoost to learn
– a small set of important features (feature selection), sorted in order of importance; each feature can be used as a simple (weak) classifier
– a cascade of classifiers that combines all the weak classifiers to do a difficult task and filters out the regions that most likely do not contain faces

Feature Selection [Viola and Jones 01]
Training: if x is a face, then x
– most likely has feature 1 (easiest feature, of greatest importance)
– very likely has feature 2 (easy feature)
– …
– likely has feature n (more complex feature, of less importance since it does not appear in all training faces)
Testing: given a test sub-image x'
– if x' has feature 1, test whether x' has feature 2; …; test whether x' has feature n
– otherwise, it is not a face
Similar to a decision tree

AdaBoost
A variant of AdaBoost is used both to select features and to train the classifier
AdaBoost boosts the classification performance of a simple (weak) learning algorithm
It combines a collection of weak classifiers to form a stronger one
The weak learning algorithm is designed to select the single rectangle feature that best separates the positive and negative examples

AdaBoost
Searches over the set of possible weak classifiers and selects those with the lowest classification error
Greedy feature selection process
Efficient for selecting a small number of "good" features f_j(x) with significant variety
Weak classifier: h_j(x) = 1 if p_j f_j(x) < p_j θ_j, and 0 otherwise (feature f_j, threshold θ_j, parity p_j)
Error rates are between 0.1 and 0.3 in early rounds and between 0.4 and 0.5 in later rounds

AdaBoost in Detail
Given example images (x_i, y_i) with y_i = 0, 1 for negative and positive examples
Initialize weights w_{1,i} = 1/(2m) for y_i = 0 and w_{1,i} = 1/(2l) for y_i = 1, where m and l are the numbers of negatives and positives
For t = 1, …, T:
1. Normalize the weights: w_{t,i} ← w_{t,i} / Σ_j w_{t,j}
2. For each feature j, train a classifier h_j restricted to using that single feature; its error is ε_j = Σ_i w_{t,i} |h_j(x_i) − y_i|
3. Choose the classifier h_t with the lowest error ε_t
4. Update the weights: w_{t+1,i} = w_{t,i} β_t^(1−e_i), where e_i = 0 if x_i is classified correctly and e_i = 1 otherwise, and β_t = ε_t / (1 − ε_t)
Final strong classifier: h(x) = 1 if Σ_t α_t h_t(x) ≥ ½ Σ_t α_t, and 0 otherwise, where α_t = log(1/β_t)
(A compact implementation sketch of this loop follows below.)
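A compact, illustrative implementation of the boosting loop above, using decision stumps over precomputed feature values. The variable names follow the slide; the toy data, the stump search and the clamping of zero error are assumptions of this sketch, not Viola and Jones' actual code.

```python
import numpy as np

def best_stump(F, y, w):
    """Weak learner: for each feature find the (threshold, parity) stump with the
    lowest weighted error; return the best stump over all features."""
    best = None
    for j in range(F.shape[1]):
        for thr in np.unique(F[:, j]):
            for parity in (1, -1):
                pred = (parity * F[:, j] < parity * thr).astype(int)
                err = np.sum(w * np.abs(pred - y))
                if best is None or err < best[0]:
                    best = (err, j, thr, parity)
    return best                                  # (error, feature index, threshold, parity)

def adaboost_train(F, y, T=10):
    """Train T rounds; F[i, j] = value of rectangle feature j on example i, y in {0, 1}."""
    m, l = np.sum(y == 0), np.sum(y == 1)
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))     # weight initialization
    classifiers = []
    for _ in range(T):
        w = w / w.sum()                                    # 1. normalize weights
        err, j, thr, parity = best_stump(F, y, w)          # 2.-3. best single-feature classifier
        err = max(err, 1e-10)                              # avoid division by zero for a perfect stump
        beta = err / (1.0 - err)
        pred = (parity * F[:, j] < parity * thr).astype(int)
        e = (pred != y).astype(int)                        # e_i = 0 if correct, 1 otherwise
        w = w * beta ** (1 - e)                            # 4. update weights
        classifiers.append((j, thr, parity, np.log(1.0 / beta)))   # alpha_t = log(1/beta_t)
    return classifiers

def adaboost_predict(classifiers, f):
    """Strong classifier: 1 if the alpha-weighted vote reaches half the total alpha."""
    votes = sum(a * int(p * f[j] < p * t) for j, t, p, a in classifiers)
    return int(votes >= 0.5 * sum(a for _, _, _, a in classifiers))

# Toy usage: random 'feature values', label determined by feature 0.
rng = np.random.default_rng(0)
F = rng.normal(size=(200, 25))
y = (F[:, 0] > 0).astype(int)
model = adaboost_train(F, y, T=5)
print(adaboost_predict(model, F[0]))
```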
Feature Selection by AdaBoost
The first and second features selected by AdaBoost
Evaluating this two-feature classifier takes about 60 microprocessor instructions

AdaBoost – Learning Results
Pentium III, 700 MHz, 2001
200-feature classifier
Detection rate: 95%
False positive rate: 1/14084
Takes 0.7 seconds to scan a 384×288 image
Adding features to the classifier increases computation time

Cascade of Classifiers
0.7 s for one image is too slow ⇒ idea: a cascade of classifiers to reduce computation time
Simple classifiers reject the majority of negative sub-windows:
all sub-windows → classifier 1 → classifier 2 → classifier 3 → … → further processing
A "false" at any stage sends the sub-window to the rejected set; only sub-windows passing every stage are processed further (most sub-windows contain no face)

Cascade of Classifiers – Experiment
Monolithic 200-feature classifier vs. a cascade of ten 20-feature classifiers
Trained with 5000 faces and 10000 non-face sub-windows

Results – Training
Structure of the cascade:
– 32 layers with 4297 features in total
– The first layer uses two features: 60% of non-faces rejected, close to 100% of faces detected
– The second layer uses five features and rejects 80% of the remaining non-faces
– 3rd layer: 20-feature classifier
– 4th layer: 50-feature classifier
– 5th layer: 100-feature classifier
– 20th layer: 200-feature classifier

Results – Speed
The speed of the detector depends on the number of features evaluated
On the MIT+CMU test set an average of 8 features (out of 4297) is evaluated per sub-window
On a 700 MHz Pentium III, a 384×288 image takes 0.067 s (2001)

Outlook
Viola & Jones improved their detector to detect not only frontal faces but also non-upright and non-frontal faces
Additional rectangle features
A pose estimator
A classifier for each rotation class (12 classes)

Haar-like Features Extension
45° rotated rectangle features introduced by Lienhart
A feature is represented by (type, x, y, w, h): type [1(a), 2(b), etc.]; (x, y) is the location in the detection window and (w, h) is the feature size
Number of features: for a 24×24 window there are 117,941 different features, a large number

Integral Image, Fast Feature Calculation
Corresponding to the upright and the 45°-rotated features there are two types of integral image, called by Lienhart the Sum Area Table SAT(x, y) and the Rotated Sum Area Table RSAT(x, y)

Integral Image, Fast Feature Calculation
SAT(x, y): sum of all pixels above and to the left of (x, y); RSAT(x, y): the analogous sum over the 45°-rotated region
Rectangle sums (upright or rotated) are computed from a few table look-ups, as with the basic integral image
Feature value: calculated as the weighted sum of the rectangle sums

Appearance-Based Methods: Summary
Pros:
– Use powerful machine learning algorithms
– Have demonstrated good empirical results
– Fast and fairly robust
– Extended to detect faces in different poses and orientations
Cons:
– Usually need to search over space and scale
– Need lots of positive and negative examples
– Limited, view-based approach

Agenda (Detection)

Color-Based Face Detector
Distribution of skin color across different ethnic groups
– Under controlled illumination conditions: compact
– Under arbitrary conditions: less compact
Color spaces
– RGB, normalized RGB, HSV, HSI, YCrCb, YIQ, YES, CIE XYZ, CIE Luv, …
Statistical analysis
– Histogram, look-up table, Gaussian model, mixture model, …

Methods of Skin Detection
Pixel-based methods
– Classify each pixel as skin or non-skin individually, independently of its neighbors
– Simpler and faster
– Color-based methods fall into this category
Region-based methods
– Try to take the spatial arrangement of skin pixels into account during detection to improve performance
– More computationally expensive
– Additional knowledge, e.g., texture, is required

Skin-Color-Based Methods – Advantages
Allow fast processing
Robust to geometric variations of the skin patterns
Robust under partial occlusion
Robust to resolution changes
Experience suggests that human skin has a characteristic color that is easily recognized by humans

Skin Modelling
Explicitly defined skin region
– Using simple rules
Non-parametric skin distribution modeling: estimate the skin color distribution from training data without deriving an explicit model of the skin
– Look-up table or histogram model
– Bayes classifier
Parametric skin distribution modeling: derive a parametric model from the training set

Explicit Rule
Define explicitly (through a number of rules) the boundaries of the skin cluster in some color space (RGB, rgb, HSI, YUV, etc.)
E.g., (R', G', B') is classified as skin if:
– R' > 95 and G' > 40 and B' > 20,
– max{R', G', B'} − min{R', G', B'} > 15,
– |R' − G'| > 15 and R' > G' and R' > B'   [Peer 2003]
The simplicity of this method is attractive: simple skin detection rules lead to a very rapid classifier
The main difficulty in achieving high recognition rates is the need to find a good color space and adequate decision rules
(A direct implementation of this rule follows below.)
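The explicit rule above translates directly into a vectorized pixel classifier. A minimal NumPy sketch; the function name and the uint8 R, G, B channel layout are my assumptions.

```python
import numpy as np

def skin_mask_rgb(img):
    """Pixel-wise skin mask from the explicit RGB rule on the slide (Peer 2003).
    img: H x W x 3 uint8 array in R, G, B order; returns an H x W boolean mask."""
    r = img[..., 0].astype(int)
    g = img[..., 1].astype(int)
    b = img[..., 2].astype(int)
    return ((r > 95) & (g > 40) & (b > 20) &
            (img.max(axis=-1).astype(int) - img.min(axis=-1).astype(int) > 15) &
            (np.abs(r - g) > 15) & (r > g) & (r > b))

# Usage: mark skin-colored pixels in a toy image.
img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
print(skin_mask_rgb(img))
```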
Example: YCrCb Space [Yoo and Kim]
Skin colors form a cluster in the YCrCb color space
We can approximate the skin color region with several lines or with an ellipse

Color Histogram
If h(r_k, g_k, b_k) = n_k (the histogram count of color (r_k, g_k, b_k) among the N pixels), then
– p(r_k, g_k, b_k) = n_k / N
– p(r'_k, g'_k, b'_k) = (N − n_k) / N
That is: if there are n_k pixels of skin color (r_k, g_k, b_k) in the image, the probability that a pixel is a skin pixel of that color is n_k / N, and the probability that a pixel is NOT such a skin pixel is (N − n_k) / N

Bayes Probability – Recall
P(A, B) = P(A|B) · P(B) and P(A, B) = P(B|A) · P(A)
If P(B) + P(~B) = 1 and P(A) + P(~A) = 1, then
– P(A) = P(A|B) · P(B) + P(A|~B) · P(~B)
– P(B) = P(B|A) · P(A) + P(B|~A) · P(~A)
Since P(A|B) · P(B) = P(B|A) · P(A),
– P(A|B) = P(B|A) · P(A) / P(B)
– P(A|B) = P(B|A) · P(A) / (P(B|A) · P(A) + P(B|~A) · P(~A))

Bayes Classifier (1)
p(r,g,b|skin) is a conditional probability: the probability of observing the color (r, g, b) given that we see a skin pixel
A more appropriate measure for skin detection is p(skin|r,g,b): the probability of skin given a concrete (r, g, b) color value
To compute this probability, Bayes' rule is used:
p(skin|r,g,b) = p(r,g,b|skin) · p(skin) / (p(r,g,b|skin) · p(skin) + p(r,g,b|non-skin) · p(non-skin))

Bayes Classifier (2)
p(r,g,b|skin) and p(r,g,b|non-skin) can be computed from skin and non-skin color histograms
The prior probabilities p(skin) and p(non-skin) can be estimated from the overall numbers of skin and non-skin samples in the training set
The inequality p(skin|r,g,b) ≥ Q, where Q is a threshold value, can be used as the skin detection rule

Bayes Classifier (3)
Consider normalized RGB space; histograms approximate the probability densities
– p(skin) is estimated as the ratio of the number of skin pixels N_skin to the total number of pixels N_total
– p(non-skin) is estimated as the ratio of the number of non-skin pixels N_non-skin to N_total
– p((r,g,b)|skin) is the probability density of (r,g,b) given skin
– p((r,g,b)|non-skin) is the probability density of (r,g,b) given non-skin

Bayes Classifier (4)
Then for any color (r,g,b) we can compute the probability p(skin|(r,g,b)) → a probability map of the image, the Skin Probability Map (SPM)
In effect, this is a ratio of histograms (see the sketch below)

Bayes Classifier from [Rehg & Jones]
(Example skin and non-skin training photos.)
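A small sketch of the histogram-ratio skin probability map described above. The 32-bins-per-channel histograms, the toy training data and the threshold Q = 0.5 are illustrative choices, not Rehg and Jones' actual settings.

```python
import numpy as np

BINS = 32                                   # bins per channel (assumed)

def color_hist(pixels):
    """3-D color histogram of an (N, 3) uint8 RGB pixel array."""
    idx = (pixels // (256 // BINS)).astype(int)
    hist = np.zeros((BINS, BINS, BINS))
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist

def train_spm(skin_pixels, nonskin_pixels):
    """Return a per-bin p(skin | r,g,b) table via Bayes' rule (ratio of histograms)."""
    h_skin, h_non = color_hist(skin_pixels), color_hist(nonskin_pixels)
    p_skin = len(skin_pixels) / (len(skin_pixels) + len(nonskin_pixels))   # prior
    like_skin = h_skin / max(h_skin.sum(), 1)
    like_non = h_non / max(h_non.sum(), 1)
    num = like_skin * p_skin
    den = num + like_non * (1 - p_skin)
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def skin_probability_map(img, table):
    """Look up p(skin | color) for every pixel of an H x W x 3 uint8 image."""
    idx = (img // (256 // BINS)).astype(int)
    return table[idx[..., 0], idx[..., 1], idx[..., 2]]

# Usage: threshold the probability map to get a skin mask (Q = 0.5 assumed).
rng = np.random.default_rng(1)
table = train_spm(rng.integers(0, 256, (5000, 3), dtype=np.uint8),
                  rng.integers(0, 256, (20000, 3), dtype=np.uint8))
img = rng.integers(0, 256, (48, 64, 3), dtype=np.uint8)
mask = skin_probability_map(img, table) >= 0.5
print(mask.mean())
```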
Results from [Rehg & Jones]
Used 18 696 images to build a general color model
Density is concentrated around the gray line and is more sharply peaked at white than at black
Most colors fall on or near the gray line
Black and white are by far the most frequent colors, with white occurring slightly more frequently
There is a marked skew in the distribution toward the red corner of the color cube
77% of the possible 24-bit RGB colors are never encountered (i.e., the histogram is mostly empty)
52% of web images have people in them

Marginal Distributions [Rehg & Jones]
(Plots of the marginal color distributions.)

Skin Model [Rehg & Jones]
(Visualization of the skin color model.)

Non-Skin Model [Rehg & Jones]
(Visualization of the non-skin color model.)

Skin and Non-Skin Color Model
Analyze 1 billion labeled pixels
Skin and non-skin models
A significant degree of separation between the skin and non-skin models
Achieves an 80% detection rate with 8.5% false positives
The histogram method outperforms the Gaussian mixture model

Experimental Results [Jones and Rehg 02]
The raw results may contain lots of false positives
Further processing is needed to eliminate false detections and to group skin-colored pixels for face detection

Results from [Zarit et al.]
They compared 5 different color spaces: CIELab, HSV, HS, normalized RGB and YCrCb
Four metrics are used to evaluate the skin detection algorithms:
– C% – skin and non-skin pixels identified correctly
– S% – skin pixels identified correctly
– SE – skin error: skin pixels identified as non-skin
– NSE – non-skin error: non-skin pixels identified as skin
They compared the 5 color spaces with 2 color models: look-up table and Bayes classifier

Look-up Table Results
HSV and HS gave the best results
Normalized rg is not far behind
CIELab and YCrCb gave poor results

Color-Based Face Detector: Summary
Pros:
– Easy to implement
– Effective and efficient in constrained environments
– Insensitive to pose, expression and rotation variation
Cons:
– Sensitive to environment and lighting changes
– Noisy detection results (body parts, skin-tone-like regions)

Agenda (Detection)

Video-Based Face Detector
Motion cues:
– Frame differencing (see the sketch below)
– Background modeling and subtraction
Depth cues (e.g., from stereo) can also be used when available
Reduces the search space dramatically
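A minimal frame-differencing motion cue, as mentioned above: pixels that changed between consecutive frames define a candidate region in which the face detector is then run. The threshold and the single-bounding-box heuristic are illustrative assumptions.

```python
import numpy as np

def motion_mask(prev_gray, curr_gray, threshold=25):
    """Frame differencing: mark pixels whose intensity changed by more than `threshold`."""
    return np.abs(curr_gray.astype(int) - prev_gray.astype(int)) > threshold

def motion_bbox(mask):
    """Bounding box (top, bottom, left, right) of the moving region, or None if static.
    A detector would then search only inside this (padded) region."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return ys.min(), ys.max(), xs.min(), xs.max()

prev_frame = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[100:150, 200:260] = 255           # simulate a bright moving object
print(motion_bbox(motion_mask(prev_frame, curr_frame)))
```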
HONDA Humanoid Robot: ASIMO
Uses motion and depth cues
– Motion cue: from a gradient-based tracker
– Depth cue: from a stereo camera
Dramatically reduces the search space
Cascade face detector

Video-Based Detectors: Summary
Pros:
– An easier problem than detection in still images
– Use all available cues (motion, depth, voice, etc.) to reduce the search space
Cons:
– Need efficient and effective methods to process the multimodal cues
– Data fusion

Agenda (Detection)

Performance Evaluation
Need to set the evaluation criteria/protocol
– Training set
– Test set
– What is a correct detection?
– Detection rate: false positives/negatives
– Precision of face location
– Speed: training/test stage

Training Sets
Cropped and pre-processed data set with face and non-face images provided by MIT CBCL:
http://www.ai.mit.edu/projects/cbcl.old/softwaredatasets/index.html

Standard Test Sets
MIT test set: subsumed by the CMU test set
CMU test set (http://www.cs.cmu.edu/~har) (de facto benchmark): 130 gray-scale images with a total of 507 frontal faces
CMU profile face test set: 208 images with faces in profile views
Kodak data set (Eastman Kodak Corp.): faces of multiple sizes and poses under varying lighting conditions, in color images

CMU Test Set I: Upright Frontal Faces
130 images with 507 frontal faces
Collected by K.-K. Sung and H. Rowley
Includes 23 images used in [Sung and Poggio 94]
Some images have low resolution
De facto benchmark

CMU Test Sets: Rotated and Profile Faces
50 images with 223 faces with in-plane rotation, collected by H. Rowley
208 images with 441 faces, collected by H. Schneiderman

Agenda (Detection)

Research Issues
Representation: how to describe a typical face?
Scale: how to deal with faces of different sizes?
Search strategy: how to spot the faces?
Speed: how to speed up the process?
Precision: how to locate the faces precisely?
Post-processing: how to combine detection results?

Face Detection: A Solved Problem?
Not quite yet…
Factors:
– Shadows
– Occlusions
– Robustness
– Resolution
Lots of potential applications
Can be applied to other domains

Outline of Recognition Part
Objective
– Categorization
– Brief description of representative face recognition techniques
– Open research issues
This part of the presentation is based on the survey by W. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld: "Face Recognition: A Literature Survey", ACM Computing Surveys, Vol. 35, Dec. 2003.

Recognition from Intensity Images: High-Level Categorization
1. Holistic matching methods
– Classification using the whole face region
2. Feature-based (structural) matching methods
– Structural classification using local features
3. Hybrid methods
– Using local features and the whole face region

Holistic Approaches
Principal Component Analysis (PCA)
– Eigenface
– Fisherface / subspace LDA
– SVM
– ICA
Other representations
– LDA/FLD
– PDBNN

What Is PCA?
(Slides on PCA courtesy of Prof. V. Bochko, IPCV, Saint-Etienne, 2004.)
Principal component analysis (PCA) is a technique for extracting a smaller set of variables with less redundancy from high-dimensional data sets, in order to retain as much of the information as possible
A small number of principal components often capture most of the structure in the data
Standard PCA
Assume an n-dimensional observed vector x ∈ R^n
The vector is centered: x̃ = x − m, where m = E[x]
The covariance matrix is C = E[x̃ x̃ᵀ], where E[·] denotes expectation
The eigendecomposition is C e_i = λ_i e_i
The matrix of eigenvectors is E = [e_1, …, e_n], where eigenvector e_i corresponds to eigenvalue λ_i
The eigenvalues are sorted in descending order: λ_1 ≥ λ_2 ≥ … ≥ λ_n
The first k principal components are y = E_kᵀ x̃, with E_k = [e_1, …, e_k]
Finally, the reconstruction is x̂ = E_k y + m

Facial Recognition: PCA – Overview
Create a training set of faces and calculate the eigenfaces
Project the new image onto the eigenfaces
Check whether the image is close to the "face space"
Check closeness to one of the known faces
Add unknown faces to the training set and re-calculate

Facial Recognition: PCA Training
Find the average of the training images
Subtract the average face from each image
Create the covariance matrix
Generate the eigenfaces
Each original image can be expressed as a linear combination of the eigenfaces – the face space

Facial Recognition: PCA Recognition
A new image is projected into the "face space"
Create a vector of weights that describes this image
The distance from this weight vector to those of the known faces is compared
If it is within certain thresholds, the face is recognized
(A compact eigenface sketch follows below.)
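A compact eigenface sketch of the training and recognition steps above. It uses NumPy's SVD instead of an explicit covariance matrix and an assumed distance threshold and gallery layout; it is a minimal illustration, not a production recognizer.

```python
import numpy as np

def train_eigenfaces(faces, k=8):
    """faces: (num_images, h*w) matrix of vectorized training faces.
    Returns the mean face and the top-k eigenfaces (principal components)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data gives the eigenvectors of the covariance matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]                          # each row of vt[:k] is an eigenface

def project(face, mean, eigenfaces):
    """Weight vector of a face in the eigenface ('face space') basis."""
    return eigenfaces @ (face - mean)

def recognize(face, mean, eigenfaces, gallery, threshold=2500.0):
    """Compare the probe's weight vector with each enrolled person's weights."""
    w = project(face, mean, eigenfaces)
    name, dist = min(((n, np.linalg.norm(w - gw)) for n, gw in gallery.items()),
                     key=lambda x: x[1])
    return name if dist < threshold else "unknown"

# Toy usage with random 19x19 'faces'.
rng = np.random.default_rng(0)
train = rng.integers(0, 256, (40, 19 * 19)).astype(float)
mean, eig = train_eigenfaces(train, k=8)
gallery = {f"person_{i}": project(train[i], mean, eig) for i in range(5)}
print(recognize(train[3], mean, eig, gallery))   # should report person_3
```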
Feature-Based Approaches
Elastic bunch graph matching using Gabor wavelets

Detecting Feature Points Using Gabor Wavelets
Create a filter bank with 5 frequencies and 8 orientations
(Figure: spatial view and top view of the Gabor kernels.)

Gabor Filters
The responses of the filters (original image, filtered image, result)
2D Gabor filters can be used to detect image structure along a specific orientation
Jet: the set of 40 complex filter coefficients at a reference point

Face Graph
– Nodes at facial fiducial points (pupil, tip of the mouth, etc.)
– Undirected graph
– The structure of the graph is the same for each face
– Fitting a face image to a face graph is done automatically
– Some nodes may be undefined due to occlusion
Bunch
– A set of jets associated with the same fiducial point
– E.g., an eye bunch may consist of jets for different types of eyes: open, closed, etc.
Face bunch graph (FBG)
– Same as a face graph, except that each node consists of a jet bunch rather than a single jet

Face Bunch Graph
The FBG has the same structure as an individual face graph
– Each node is labeled with a bunch of jets
– Each edge is labeled with the average distance between the corresponding nodes in the face samples

Still Face Recognition: Open Issues
Addressing the issue of recognition being too sensitive to inaccurate facial feature localization
Robustly recognizing faces in
– small and/or noisy images
– images acquired years apart
– outdoor acquisition: lighting, pose
What is the limit on the number of faces that can be distinguished?
What is the principled and optimal way to arbitrate and combine local and global features?

Web Resources (detection)
Face detection home page: http://home.t-online.de/home/Robert.Frischholz/face.htm
Henry Rowley's home page: http://www-2.cs.cmu.edu/~har/faces.html
Henry Schneiderman's home page: http://www.ri.cmu.edu/projects/project_416.html
MIT CBCL web page: http://www.ai.mit.edu/projects/cbcl.old/softwaredatasets/index.html
Face detection resources: http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

Web Resources (recognition)
Face Recognition Home Page: www.cs.rug.nl/~peterkr/FACE/frhp.html
Face databases:
– UT Dallas: www.utdallas.edu/dept/bbs/FACULTY_PAGES/otoole/database.htm
– Notre Dame database: www.nd.edu/~cvrl/HID-data.html
– MIT database: ftp://whitechapel.media.mit.edu/pub/images
– Edelman: ftp://ftp.wisdom.weizmann.ac.il/pub/FaceBase
– CMU PIE: www.ri.cmu.edu/projects/project_418.htm
– Stirling database: pics.psych.stir.ac.uk
– M2VTS multimodal: www.tele.ucl.ac.be/M2VTS
– Yale database: cvc.yale.edu/projects/yalefaces/yalefaces.htm
– Yale database B: cvc.yale.edu/projects/yalefacesB/yalefacesB.htm
– Harvard database: hrl.harvard.edu/pub/faces
– Weizmann database: www.wisdom.weizmann.ac.il/~yael
– UMIST database: images.ee.umist.ac.uk/danny/database.html
– Purdue: rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html

References (detection)
M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting Faces in Images: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 1, pp. 34-58, 2002. (http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html)
M.-H. Yang and N. Ahuja, Face Detection and Hand Gesture Recognition for Human Computer Interaction, Kluwer Academic Publishers, 2001.
P. Viola and M. J. Jones, Robust Real-time Object Detection, Cambridge Research Laboratory Technical Report Series, CRL 2001/01, Feb. 2001.

Face and Some CV-Based Projects at Budapest Tech

Combined Verification
(Diagram labels, translated from Hungarian: audio recording, voice sample, LP spectrum, neural network; original image, face mask, filter response, features.)

Morphing by Automatic Feature Detection

Eye and Lip Tracking
Real-time face detection
Eye and mouth motion tracking
Client-server internet communication

MYRA
Face detection
Skin-tone-based tracking
Feature tracking
Motion detection
Recognition options:
– Eigenface method
– Gabor wavelet based bunch graph

Mobile Robots – Navigation Using CV

Recognition and Navigation