Face recognition: component-based versus global approaches

Advisor: 萬書言
Presenter: 何炳杰
Presentation date: 2010/10/08

Paper source
Computer Vision and Image Understanding, Volume 91, Issues 1-2, July-August 2003, Pages 6-21, Special Issue on Face Recognition
Author affiliations:
- Honda Research Institute US, 145 Tremont St., Boston, MA 02111, USA
- Center for Biological and Computational Learning, M.I.T., Cambridge, MA, USA
- Hewlett-Packard, Cambridge, MA, USA
Received 15 February 2002; accepted 11 February 2003; available online 17 July 2003.

Abstract
In this paper, the authors present a component-based (local) method and global (whole-face) methods for face recognition, and evaluate the robustness of the resulting systems against rotations of the face.

Component system:
1. Locate facial components.
2. Extract them.
3. Combine them into a single feature vector.

The two global systems:
- The 1st system: trains a single SVM classifier for each person in the database.
- The 2nd system: consists of sets of view-specific SVM classifiers and involves clustering during training.

1. Introduction
Two approaches, focusing on the aspect of pose invariance:
(i) global approach
(ii) component-based approach

(I) Global approach
Global approach references:
- Minimum distance classification [2,3]
- Fisher's discriminant analysis [4]
- Neural networks [5]
- Fisher's discriminant analysis and kernel PCA [6,7]

1. Introduction - Global approach
Limitation of the global approach:
- Global techniques are not robust against pose changes, since global features are highly sensitive to translation and rotation of the face.
Solution:
- An alignment stage can be added before classifying the face.
- Aligning an input face image with a reference frontal face image requires computing correspondences between the two face images.
- Correspondences: a small number of prominent points in the face, such as the centers of the eyes, the nostrils, or the corners of the mouth.
(II) Component-based approach
Component-based references:
- Face recognition: features versus templates [14]
- Face recognition under varying pose [15]
- Face recognition by elastic bunch graph matching [16]
- An embedded HMM-based approach for face detection and recognition [17]
- Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class [18]

1. Introduction - Component-based approach
- A component-based approach classifies local facial components (eyes, nose, mouth, ...).
- The main idea of component-based recognition is to compensate for pose changes by allowing a flexible geometrical relation between the components in the classification stage.

1. Introduction - Global methods
We present two global approaches and a component-based approach to face recognition and evaluate their robustness against pose changes.
- The first global method: uses a straightforward face detector that extracts the face from an input image.
- The second global method: splits the images of each person into view-specific clusters. We then train view-specific SVM classifiers on each single cluster.

1. Introduction - Component-based methods
The component-based system:
- Uses a face detector that detects and extracts local components of the face.
- The detector consists of a set of SVM classifiers that locate learned facial components and a single geometrical classifier that checks whether the configuration of the components matches a learned geometrical face model.
[Diagram: input images, face detector, training stage, a set of SVM classifiers.]

1. Introduction - Outline
The outline of the paper is as follows:
Section 2: Gives a brief overview of SVM learning and strategies for multi-class classification with SVMs.
Section 3: Describes the two global methods for face recognition.
Section 4: Describes the component-based system.
Section 5: Contains experimental results and a comparison between the global and component systems.
Section 6: Concludes the paper and suggests future work.

2.1. Binary classification
SVMs belong to the class of maximum margin classifiers. They perform pattern recognition between two classes by finding a decision surface that has maximum distance to the closest points in the training set, which are termed support vectors.
[Diagram. Source: C. Cortes, V. Vapnik, Support vector networks, Mach. Learning 20 (1995) 1–25.]

2.1. Binary classification - linear classification
OSH (Optimal Separating Hyperplane)
Parameters:
- $x_i \in \mathbb{R}^n$, $i = 1, 2, \dots, N$: a training set of points, where each point $x_i$ belongs to one of two classes identified by the label $y_i \in \{-1, 1\}$.
- $\alpha_i$ and $b$: the solutions of a quadratic programming problem.
- Goal: separate the two classes by a hyperplane such that the distance to the support vectors is maximized.
Use: the distance $d$ is used to perform multi-class classification.
- $d$: the sign of $d$ is the classification result for $x$, and $|d|$ is the distance from $x$ to the hyperplane. The larger $|d|$, the more reliable the classification result.

2.1. Binary classification - non-linear classification
From linear to non-linear classification: the origin of $k(x, y)$.
- Each point $x$ in the input space is mapped to a point $z = \Phi(x)$ of a higher-dimensional space, called the feature space, where the data are separated by a hyperplane.
- $\Phi(\cdot)$: a property derived from the mathematical inner product. The mapping is subject to the condition that the dot product of two points in the feature space, $\Phi(x) \cdot \Phi(y)$, can be rewritten as a kernel function $k(x, y)$.
[Diagram: points in the input space (axes f1, f2) are mapped by $z = \Phi(x)$ to the feature space (axes f1, f2, f3).]
An important family of kernel functions is the polynomial kernel.
- $d$: the degree of the polynomial.
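The decision value and the polynomial kernel of Section 2.1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the standard SVM decision function $f(x) = \sum_i \alpha_i y_i k(x_i, x) + b$, the normalized distance $d(x) = f(x) / \lVert w \rVert$ for the linear case, and a polynomial kernel of the form $(1 + x \cdot y)^d$. All function names and the toy model values are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def dot(x: Vector, y: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Vector, y: Vector) -> float:
    """k(x, y) = x . y (linear classification)."""
    return dot(x, y)


def polynomial_kernel(x: Vector, y: Vector, degree: int = 2) -> float:
    """Assumed form of the polynomial kernel: (1 + x . y) ** degree."""
    return (1.0 + dot(x, y)) ** degree


def decision_value(x: Vector, support_vectors: Sequence[Vector],
                   labels: Sequence[int], alphas: Sequence[float],
                   b: float, kernel: Kernel = linear_kernel) -> float:
    """Assumed decision function: f(x) = sum_i alpha_i * y_i * k(x_i, x) + b."""
    return sum(a * y * kernel(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b


def signed_distance(x: Vector, support_vectors: Sequence[Vector],
                    labels: Sequence[int], alphas: Sequence[float],
                    b: float) -> float:
    """Linear case only: d(x) = f(x) / ||w||, with w = sum_i alpha_i * y_i * x_i."""
    w = [sum(a * y * sv[j] for sv, y, a in zip(support_vectors, labels, alphas))
         for j in range(len(x))]
    f = decision_value(x, support_vectors, labels, alphas, b, linear_kernel)
    return f / math.sqrt(dot(w, w))


# Toy model with hypothetical values: one support vector per class.
SUPPORT_VECTORS = [(1.0, 1.0), (-1.0, -1.0)]
LABELS = [1, -1]
ALPHAS = [0.25, 0.25]
BIAS = 0.0
```

With the toy model, `signed_distance((2.0, 0.0), SUPPORT_VECTORS, LABELS, ALPHAS, BIAS)` is positive, so the point is assigned to the class labeled +1, and its magnitude is the distance to the hyperplane.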
2.2. Multi-class classification
Two strategies:
- One-vs-all approach
- Pairwise approach

One-vs-all approach:
[Diagram: three classifiers C1, C2, C3 separating class 1, class 2, and class 3. Rule: take the largest (positive) output. Source: http://www.powercam.cc/slide/6556]

Pairwise approach, bottom-up comparison:
[Diagram. Source: G. Guodong, S. Li, C. Kapluk, Face recognition by support vector machines, in: Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2000, pp. 196–201.]

Pairwise approach, top-down comparison:
[Diagram. Source: J. Platt, N. Cristianini, J. Shawe-Taylor, Large margin dags for multiclass classification, Adv. Neural Inform. Process. Systems.]

A more recent comparison between several multi-class techniques [20] favors the one-vs-all approach because of its simplicity and excellent classification performance.
Source: R. Rifkin, Everything old is new again: a fresh look at historical approaches in machine learning, Ph.D. thesis, M.I.T., 2002.

3. Global approach
System process: input image → face detector (extracts the face from the input image) → face → face recognition.

3.1. Face detection - Global approach
In order to detect faces at different scales, we first computed a resolution pyramid for the input image and then shifted a 58×58 window over each image in the pyramid.
Pyramid structure:
- Level 4: N/16 × N/16
- Level 3: N/8 × N/8
- Level 2: N/4 × N/4
- Level 1: N/2 × N/2
- Level 0: N × N (original image)
Source: http://www.cs.pu.edu.tw/~ychu/class981/DataComp/15HierarchicalCoding.PPT

The training data for the face detector was generated by rendering seven textured 3-D head models [29].
The heads were rotated between -30° and +30° in depth and illuminated by ambient light and a single directional light pointing towards the center of the face.
[Figure: rendered heads at -30° and +30°. Source: A morphable model for synthesis of 3D faces, in: Comput. Graphics Proc. SIGGRAPH, Los Angeles, 1999, pp. 187–194.]

Sample size:
- We generated 2457 face images of size 58×58 pixels; some examples are shown in Fig. 2.
- The negative training set initially consisted of 10,209 non-face patterns of size 58×58, randomly extracted from 502 non-face images.

3.2. Recognition - Global approach
We implemented two global recognition systems. Both systems were based on the one-vs-all strategy for SVM multi-class classification described in the previous section.
- The first system: uses a linear SVM for every person in the database. Each SVM was trained to distinguish between all images of a single person (labeled +1) and all other images in the training set (labeled -1).
[Diagram: one-vs-all classifiers C1, C2, C3 for class 1, class 2, and class 3.]
For both training and testing, we first ran the face detector on the input image to extract the face.
Re-scaling:
- We re-scaled the face image to 40×40 pixels and converted the gray values into a feature vector.
- Given a set of q people and a set of q SVMs, each one associated with one person, the class label y of a face pattern x is computed from the following quantities:
- $d_i(x)$: computed according to Eq. (2) for the SVM trained to recognize person $i$.
- $t$: the classification threshold.
- The class label 0 stands for rejection.

Potential problems and limitations:
- Changes in the head pose lead to strong variations in the images of a person's face.
- These in-class variations complicate the recognition task.
Solution:
- For this reason, we developed a second method in which we split the training images of each person into clusters by a divisive clustering technique.
- The cluster with the highest variance is split into two by a hyperplane.
- N: the number of faces in the cluster.
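The one-vs-all labeling with rejection from Section 3.2 (distances $d_i(x)$, threshold $t$, label 0 for rejection) can be sketched as follows. The exact rule is an assumption here: the sketch accepts the person with the largest distance when that distance plus $t$ is positive, and rejects otherwise. The function name is hypothetical.

```python
from typing import Sequence


def classify_with_rejection(distances: Sequence[float], threshold: float) -> int:
    """Return a class label in 1..q, or 0 for rejection.

    distances[i] is d_(i+1)(x), the output of the SVM trained for person i + 1.
    Assumed rule: pick the SVM with the largest output and accept it only
    if that output plus the threshold t is positive.
    """
    if not distances:
        raise ValueError("at least one SVM output is required")
    best_index = max(range(len(distances)), key=lambda i: distances[i])
    if distances[best_index] + threshold > 0:
        return best_index + 1  # labels are 1-based; 0 is reserved for rejection
    return 0
```

For example, with outputs `[-0.4, 0.7, 0.1]` and `t = 0.0` the pattern is assigned to person 2. Raising `t` accepts more patterns and lowering it rejects more, which is how an ROC curve over the threshold can be traced.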
3.2. Recognition - Global approach (cont.)
Solution (cont.):
- The face with the minimum distance to all other faces in the same cluster is chosen to be the average face of the cluster.
[Diagram: clusters and their average faces.]

4. Component-based approach
System process: input image → face detector (detects facial components) → face → face recognition.

4.1. Detection
We implemented a two-level, component-based face detector.
The 14 facial components used in the detection system are shown in Fig. 5a; their dimensions are given in Table 1. The shapes and positions of the components were determined automatically from the training data.
Fig. 5. (a) The 14 components of our face detector. The centers of the components are marked by a white cross.
[Table 1: dimensions of the components.]
We trained 14 linear SVMs on the component data and applied them to the whole training set in order to generate the training data for the geometrical classifier.
In a final step, we trained the geometrical classifier, which was again a linear SVM, on the X–Y locations and continuous outputs of the 14 component classifiers.

Drawbacks:
- The component-based face detector was computationally more expensive than the global face detector.
- This was because the combined size of the 14 components was about 1.12 times the size of the face region used in the global detector.
- In addition, we had to locate the maxima of the responses of the component classifiers and compute the output of the geometrical classifier.
- On average, the component-based detector was about 1.2 times slower than the global detector.

4.2. Recognition
System process:
- Step 1: We first ran the component-based detector over each image in the training set.
- Step 2: We extracted the components. From the 14 original components we kept 10 for face recognition.
Selection criterion ($C^{14}_{10}$, keeping 10 of the 14 components):
- Remove those components that either contained few gray value structures (e.g., cheeks) or strongly overlapped with other components.
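The extraction and combination steps (extract the components, then combine them into a single feature vector) can be sketched as below. This is a minimal sketch under assumptions: the image is a list of rows of gray values, each component is given by a hypothetical tuple `(center_row, center_col, height, width)`, and no size normalization is applied. The function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]
Component = Tuple[int, int, int, int]  # (center_row, center_col, height, width)


def crop(image: Image, component: Component) -> List[List[float]]:
    """Cut a height x width patch positioned around the component center."""
    center_row, center_col, height, width = component
    top = center_row - height // 2
    left = center_col - width // 2
    if top < 0 or left < 0 or top + height > len(image) or left + width > len(image[0]):
        raise ValueError("component lies outside the image")
    return [list(row[left:left + width]) for row in image[top:top + height]]


def component_feature_vector(image: Image, components: Sequence[Component]) -> List[float]:
    """Concatenate the gray values of all component patches, in order."""
    features: List[float] = []
    for component in components:
        for row in crop(image, component):
            features.extend(row)
    return features


# Toy 6 x 6 image whose pixel value encodes its position (row * 6 + col).
TOY_IMAGE = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
TOY_COMPONENTS = [(1, 1, 2, 2), (4, 4, 2, 2)]
```

The resulting vector has one entry per pixel of every selected component, so its length is the sum of the component areas. A classifier such as the per-person linear SVM then operates on this single vector.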
4.2. Recognition (cont.)
The 10 components that were used for face recognition are shown in Fig. 5b.
Examples of the component-based face detector applied to images of the training set are shown in Fig. 6.

5. Experiments
Training set:
- 10,000 gray face images of 10 subjects, of which about 1400 were frontal views.
- The resolution of the face images ranged from 80×80 to 130×130 pixels, with rotations in azimuth up to about ±40°.
Test set:
- 1154 images of all 10 subjects in the database.
- The rotation in depth was again up to about ±40°.

We trained four different recognition systems on the 10,000 images:
(1) Global system using one linear SVM classifier per person.
(2) Global system using one second-degree polynomial SVM per person.
(3) Global system with one linear SVM for each cluster.
(4) Component-based approach with one linear SVM classifier per person.

The ROC curves for the four systems are shown in Fig. 7.
Some examples of misclassifications caused by false detections are shown in Figs. 8 and 9.

6. Conclusion and future work
We presented a component-based technique and two global techniques for face recognition and evaluated their performance with respect to robustness against pose changes.
The component-based system:
- It detected and extracted a set of 10 facial components and arranged them in a single feature vector that was classified by linear SVMs.
Both global systems:
- We detected the whole face, extracted it from the image, and used it as input to the classifiers.
In the experiments, the component-based system outperformed the global systems even though we used more powerful classifiers (i.e., non-linear instead of linear SVMs) for the global system.
Limitation of the current component-based classifier:
- The current component-based classifier cannot deal with the full range of poses (from frontal to profile views).
Solution:
- It will be necessary to train view-specific component classifiers, e.g., two mouth classifiers trained on frontal and profile views, respectively.

Thank You!