Face Recognition Using Neural Networks

Presented by: Hadis Mohseni, Leila Taghavi, Atefeh Mirsafian

Outline
- Overview
- Scaling Invariance
- Rotation Invariance
- Face Recognition Methods
  - Multi-Layer Perceptron
  - Hybrid NN (SOM + Convolutional NN)
- Conclusion

Scaling Invariance
- Goal: magnify an image while minimizing the loss of perceptual quality.
- Interpolation methods: weighted sum of neighboring pixels.
- Content-adaptive methods:
  - Edge-directed.
  - Classification-based.
  - Using multilayer neural networks.
- Proposed method: content-adaptive neural filters using pixel classification.

Pixel Classification
- Adaptive Dynamic Range Coding (ADRC):

      ADRC(x) = 0  if x < x_av
                1  otherwise

- Concatenating ADRC(x) over all pixels in the window gives the class code.
- If we invert the picture data, the filter coefficients should remain the same, so the number of classes can be halved.
- Number of classes: 2^(N-1) for a window with N pixels.

Content-Adaptive Neural Filters
- The original high-resolution image, y, and its downscaled version, x, are used as the training set.
- The pairs (x, y) are classified by applying ADRC to the input vector x.
- The optimal filter coefficients are obtained for each class.
- The coefficients are stored at the corresponding index of a look-up table (LUT).

Filter Architecture
- A simple 3-layer feedforward network with few neurons in the hidden layer.
- The hidden-layer activation function is tanh.
- The network can be described as:

      y1 = sum_{n=1}^{Nh} u_n * tanh(w_n . x + b_n) + b_0

- y2, y3 and y4 are computed in the same way by flipping the window symmetrically.

Pixel Classification Set Reduction
1. Compute the Euclidean distance between the normalized coefficient vectors of each pair of classes:

      D = sum_{i=1}^{9} (w_{i,a} - w_{i,b})^2

2. If the distance is below a threshold, merge the two classes.
3. Obtain the coefficients by training on the combined data of the merged classes.
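The ADRC class-code construction and the inversion-based halving of the class count can be sketched as follows; `adrc_class_code` is a hypothetical helper name, a minimal sketch of the scheme described above rather than the authors' implementation.

```python
# Minimal sketch of ADRC pixel classification (hypothetical helper name).
# ADRC(x) is 0 if the pixel is below the window average, 1 otherwise; the
# concatenated bits form the class code. Because an inverted image must map
# to the same filter coefficients, a code and its bitwise complement share
# a class, halving the 2^N codes to 2^(N-1).

def adrc_class_code(window):
    """window: flat list of N pixel values -> canonical class index."""
    n = len(window)
    avg = sum(window) / n
    bits = [0 if x < avg else 1 for x in window]
    code = 0
    for b in bits:
        code = (code << 1) | b
    complement = code ^ ((1 << n) - 1)   # code of the inverted window
    return min(code, complement)         # canonical representative

# A window and its inversion land in the same class:
w = [10, 200, 50, 180]
inv = [255 - x for x in w]
assert adrc_class_code(w) == adrc_class_code(inv)
```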
4. Repeat step 1 for the new class set until no distance falls below the threshold.

Rotation Invariance
- Handles in-plane rotation of the face.
- Uses a neural network called the router.
- The router's input is the same region that the detector network will receive as input.
- The router returns the angle of the face.

Output Encoding
- The output angle can be represented by:
  - a single unit,
  - 1-of-N encoding, or
  - Gaussian output encoding.
- The proposed method uses an array of 72 output units.
- For a face at angle θ, output unit i is trained to take the value cos(θ - i×5°).
- The input face angle is computed from the pair of sums

      ( sum_{i=0}^{71} output_i * cos(i*5°),  sum_{i=0}^{71} output_i * sin(i*5°) )

  i.e. the angle of the vector formed by these two sums.

Router Architecture
- The input is a 20×20 window of the scaled image.
- The router has a single hidden layer of 100 units, organized in 4 sets of 25 units.
- Each unit connects to a 4×4 region of the input; each set of 25 units covers the entire input without overlap.
- The hidden-layer activation function is tanh.
- The network is trained using the standard error backpropagation algorithm.

Training Data
- A set of manually labeled example images is generated.
- Aligning the labeled faces:
  1. Initialize F, a vector that will hold the average position of each labeled feature over all training faces.
  2. Align each face with F by computing a rotation and scaling.
  3. Since the transformation can be written as linear functions, it can be solved for the best alignment.
  4. After iterating these steps a small number of times, the alignments converge.
- To generate the training set, the faces are rotated to random orientations.

Empirical Results

Face Recognition Methods
- Database: ORL (Olivetti Research Laboratory).
- The database consists of 10 different 92×112 images of each of 40 distinct subjects.
- 5 images per person are used for the training set and 5 for the test set.
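The angle decoding above can be sketched in Python. The 72-unit cosine encoding is from the slides; recovering the angle from the two sums with `atan2` is an assumption about how "the angle of the vector" is computed.

```python
# Minimal sketch of the router's angle encoding/decoding.
# Unit i is trained toward cos(theta - i*5°); the angle is recovered as
# atan2 of the sine- and cosine-weighted sums (assumed interpretation).
import math

def encode_angle(theta_deg):
    """Ideal target outputs for a face at angle theta."""
    return [math.cos(math.radians(theta_deg - i * 5)) for i in range(72)]

def decode_angle(outputs):
    """Recover the angle from the two weighted sums described above."""
    c = sum(o * math.cos(math.radians(i * 5)) for i, o in enumerate(outputs))
    s = sum(o * math.sin(math.radians(i * 5)) for i, o in enumerate(outputs))
    return math.degrees(math.atan2(s, c)) % 360

# For ideal outputs the decoded angle matches the encoded one:
assert abs(decode_angle(encode_angle(30.0)) - 30.0) < 1e-6
```

Because the 72 units are spaced evenly over 360°, the two sums reduce to 36·cos θ and 36·sin θ for ideal outputs, so the decoding is exact; with noisy network outputs it acts as a robust average.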
- There are variations in facial expression and facial detail.

Multi-Layer Perceptron
- The training-set faces are run through PCA, and the 200 corresponding eigenvectors (principal components) are found; these can be displayed as eigenfaces.
- Each face in the training set can be reconstructed as a linear combination of the principal components.
- By projecting the test-set images onto the eigenvector basis, the eigenvector expansion coefficients are found (a dimensionality reduction!).

MLP Training
- A classifier is trained on the coefficients of the training-set images.
- A variable number of principal components, ranging from 25 to 200, is used in different simulations.
- Each simulation is repeated 5 times with random initialization of all MLP parameters, and the results for that number are averaged.
- The error backpropagation learning algorithm is applied with a small constant learning rate (normally < 0.01).

MLP Results

Hybrid NN

1. Local Image Sampling
- Two sampling methods over a (2W+1)×(2W+1) window centered at pixel (i, j):
  - the vector of raw pixel intensities:

        [ x_{i-W,j-W}, x_{i-W,j-W+1}, ..., x_{ij}, ..., x_{i+W,j+W-1}, x_{i+W,j+W} ]

  - the differences between the center pixel and its neighbors, with a weighted center value:

        [ x_{ij} - x_{i-W,j-W}, x_{ij} - x_{i-W,j-W+1}, ..., w_{ij} x_{ij}, ..., x_{ij} - x_{i+W,j+W-1}, x_{ij} - x_{i+W,j+W} ]

2. Self-Organizing Map
- m_i = [mu_{i1}, mu_{i2}, ..., mu_{in}]^T in R^n is a reference vector in the input space assigned to each node of the SOM.
- Update rule:

      m_i(t+1) = m_i(t) + h_ci(t) [ x(t) - m_i(t) ]

- h_ci(t) = h(||r_c - r_i||, t) is a neighborhood function, e.g. the Gaussian

      h_ci(t) = alpha(t) * exp( -||r_c - r_i||^2 / (2 sigma^2(t)) )

  where r_c is the node with the weight vector closest to the input, alpha(t) is a scalar-valued learning rate, and sigma(t) defines the width of the kernel.
- Figure: SOM image samples corresponding to each node, before and after training.

3.
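The SOM update rule can be sketched for a 1-D map; the linear decay schedules for alpha(t) and sigma(t), and the grid shape, are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch of the SOM update rule m_i(t+1) = m_i(t) + h_ci(t)[x(t) - m_i(t)]
# with a Gaussian neighborhood. Decay schedules and grid are assumptions.
import math, random

def gaussian_h(dist, t, alpha0=0.5, sigma0=2.0, T=100.0):
    """Neighborhood h_ci(t): learning rate alpha(t) times a Gaussian kernel."""
    alpha = alpha0 * (1.0 - t / T)            # assumed linear decay
    sigma = sigma0 * (1.0 - t / T) + 1e-3
    return alpha * math.exp(-dist ** 2 / (2.0 * sigma ** 2))

def som_step(nodes, x, t):
    """One update: find the winner c, pull every node toward x by h_ci(t)."""
    c = min(range(len(nodes)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], x)))
    for i, m in enumerate(nodes):
        h = gaussian_h(abs(c - i), t)         # ||r_c - r_i|| on a 1-D grid
        nodes[i] = [mv + h * (xv - mv) for mv, xv in zip(m, x)]
    return c

rng = random.Random(1)
nodes = [[rng.random(), rng.random()] for _ in range(5)]
for t in range(50):
    som_step(nodes, [1.0, 1.0], t)
# Repeated presentations of one input pull the winning node toward it.
```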
Convolutional NNs
- Invariant to some degree of:
  - shift,
  - deformation.
- Built on three ideas:
  - local receptive fields,
  - shared weights (aiding generalization),
  - spatial subsampling.

Network Layers
- Convolutional layers:
  - Each layer contains one or more planes.
  - Each plane can be considered a feature map with a fixed feature detector that is convolved with a local window scanned over the planes of the previous layer.
- Subsampling layers:
  - Local averaging and subsampling operations.

Simulation Details
- Initial weights are uniformly distributed random numbers in the range [-2.4/F_i, 2.4/F_i], where F_i is the fan-in of neuron i.
- Target outputs are -0.8 and 0.8 with the tanh output activation function.
- Weights are updated after each pattern presentation.

Experimental Results
- Experiment #1:
  - Variation of the number of output classes.
  - Variation of the dimensionality of the SOM.
  - Substituting the SOM with the KLT.
  - Replacing the CN with an MLP.
  - The tradeoff between rejection threshold and recognition accuracy.
  - Comparison with other known results on the same database.
  - Variation of the number of training images per person.
- Experiment #2.
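The fan-in-scaled weight initialization from the simulation details can be sketched as follows; the layer sizes are illustrative, not taken from the slides.

```python
# Minimal sketch of the weight initialization described above: each weight
# drawn uniformly from [-2.4/F_i, 2.4/F_i], where F_i is the fan-in of the
# neuron. Layer sizes here are illustrative assumptions.
import random

def init_layer(fan_in, fan_out, rng=None):
    """One weight matrix; each neuron's weights drawn from U[-2.4/fan_in, 2.4/fan_in]."""
    rng = rng or random.Random(0)
    bound = 2.4 / fan_in
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]

weights = init_layer(fan_in=25, fan_out=4)   # e.g. a 5x5 receptive field
assert all(abs(w) <= 2.4 / 25 for row in weights for w in row)
```

Scaling the range by the fan-in keeps each neuron's net input roughly within the linear region of tanh at the start of training, which is the usual rationale for this scheme.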
Conclusion
- The results of the face recognition experiments are greatly influenced by:
  - the training data,
  - the preprocessing function,
  - the type of network selected,
  - the activation functions.
- A fast, automatic face recognition system has been presented, combining a SOM and a convolutional network.
- This network is partially invariant to translation, rotation, scale, and deformation.