2011-2012 MITRE Computer Science Clinic
Landmarking and Pose Correction for Face Recognition

Background

The MITRE Corporation is a federally funded research-and-development corporation that has developed its own facial recognition system, known as MITRE Matcher. Non-frontal facial images pose a significant challenge to the recognition process for MITRE Matcher and other facial recognition systems, even when the pose variation is as small as ten or twenty degrees. This project's goal was to research, implement, and evaluate facial-landmarking algorithms and approaches to pose analysis and pose correction.

Pose correction pipeline

Overview: The original image is cropped to the face and landmarked to determine possible feature locations on the face. The best combination of feature locations is selected using a combination of spatial heuristics and statistical estimation. Using a statistical model of how landmarks vary with pose, the landmarks are moved to a neutral pose, and those neutrally posed landmarks are then used to warp the image to a frontal view.

[Figure: pipeline stages, from the original off-pose image through the landmark map and the best landmarks to the pose-corrected image.]

Facial Landmarking

Average filter (ASEF): The Average of Synthetic Exact Filters (ASEF) is a texture-based method for landmarking. To create an ASEF filter, we specify a desired synthetic output for each training image: a Gaussian dot centered at the ground-truth landmark position in that image. This yields an exact filter that transforms the training image into its synthetic output. We average all of a dataset's exact filters to obtain the final ASEF filter, which can then be applied to facial images to locate the landmark. (A sketch of this construction appears below.)

[Figure: training image, synthetic image, and exact filter.]

UMACE filter: We also investigated and implemented a feature-detection algorithm based on Unconstrained Minimum Average Correlation Energy (UMACE) filters. We divide the average frequency-domain values of a standard square region around the ground-truth eye location in each training image by the average power spectrum of that same region. This gives a correlation filter that we can apply to a standard eye-containing region to determine possible eye locations within it. (See the sketch below.)

[Figure: facial image with the right-eye region, and the response of the eye region to the filter.]

Landmarking heuristics

Multiple responses: King of the Hill is a technique for finding the n strongest local maxima in a two-dimensional array. We use it to determine the top three candidate locations in our UMACE or ASEF filter responses. (A sketch appears below.)

Geometric constraints: Facial features tend to fall in roughly the same area of each facial crop. We take advantage of this by constraining the area in which we search for each landmark.

Combining features: An Active Shape Model (described in the next section) and/or feature-strength heuristics can provide a probability that a particular set of landmarks forms a face shape. The current system uses only feature strength, but it can support additional metrics in the future.

Image warping

We use thin-plate splines to warp smoothly from one set of face landmarks to another. By isolating the yaw component from the Active Shape Model (ASM; see Modeling Faces below), we transform a landmarked face into a neutral, frontal pose. The other ASM-derived vectors enable other transformations. (A thin-plate spline sketch follows this section.)

[Figure: example yaw warp from an off-pose image (right) to a frontal pose (middle), shown with a forward-facing comparison image and the resulting match scores of 1.97 and 0.93; for reference, the self-match score is about 6.29.]
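As an illustration only (not the team's code), the sketch below shows the standard thin-plate spline formulation referenced above: it fits spline coefficients mapping one 2-D landmark set onto another and evaluates the resulting mapping at arbitrary points. The function names and the small regularizing constant are our own.

```python
import numpy as np

def fit_tps(src, dst):
    """Fit thin-plate spline coefficients mapping (n, 2) source landmarks
    onto (n, 2) destination landmarks."""
    n = src.shape[0]
    d = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=2)
    K = np.where(d > 0, d ** 2 * np.log(d ** 2 + 1e-12), 0.0)   # kernel U(r) = r^2 log r^2
    P = np.hstack([np.ones((n, 1)), src])                       # affine part
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b), src                           # (n+3, 2) coefficients

def tps_map(coeffs_src, pts):
    """Evaluate the fitted spline at arbitrary (m, 2) points."""
    coeffs, src = coeffs_src
    n = src.shape[0]
    d = np.linalg.norm(pts[:, None, :] - src[None, :, :], axis=2)
    U = np.where(d > 0, d ** 2 * np.log(d ** 2 + 1e-12), 0.0)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ coeffs[:n] + P @ coeffs[n:]                      # bending + affine terms
```

Warping a whole image would then typically mean fitting the spline from the target (frontal) landmarks back to the source landmarks and resampling the image at the mapped coordinates, for example with scipy.ndimage.map_coordinates; the details of how the team did this are not specified on the poster.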
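For the ASEF construction described under Facial Landmarking, a minimal NumPy sketch follows. It assumes same-size grayscale crops with known landmark positions; the Gaussian width sigma and the epsilon guarding the frequency-domain division are illustrative choices, not values from the project.

```python
import numpy as np

def train_asef(images, landmarks, sigma=2.0):
    """Build an ASEF filter from cropped training images and their
    ground-truth landmark positions (row, col), all the same size."""
    h, w = images[0].shape
    ys, xs = np.mgrid[0:h, 0:w]
    filt_sum = np.zeros((h, w), dtype=complex)
    for img, (r, c) in zip(images, landmarks):
        # Desired synthetic output: a Gaussian dot at the landmark.
        g = np.exp(-((ys - r) ** 2 + (xs - c) ** 2) / (2 * sigma ** 2))
        F = np.fft.fft2(img)
        G = np.fft.fft2(g)
        # Exact filter for this image (element-wise division in the
        # Fourier domain; the epsilon avoids division by zero).
        filt_sum += G / (F + 1e-8)
    return filt_sum / len(images)            # average of the exact filters

def apply_asef(asef_filter, image):
    """Correlate an image with the ASEF filter and return the location
    of the strongest response as the estimated landmark."""
    response = np.real(np.fft.ifft2(np.fft.fft2(image) * asef_filter))
    return np.unravel_index(np.argmax(response), response.shape)
```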
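Similarly, one compact reading of the UMACE construction described above: the average spectrum of eye-centered training patches is divided, element by element, by their average power spectrum. The patch cropping, the epsilon, and the correlation convention are assumptions made for the sketch.

```python
import numpy as np

def train_umace(eye_patches):
    """Build a UMACE-style filter from fixed-size training patches, each
    cropped so the ground-truth eye sits at the patch centre."""
    ffts = [np.fft.fft2(p) for p in eye_patches]
    mean_spectrum = np.mean(ffts, axis=0)                        # average spectrum
    avg_power = np.mean([np.abs(F) ** 2 for F in ffts], axis=0)  # average power spectrum
    return mean_spectrum / (avg_power + 1e-8)

def umace_response(umace_filter, region):
    """Correlate a search region with the filter; strong peaks in the
    response mark candidate eye locations within the region."""
    return np.real(np.fft.ifft2(np.fft.fft2(region) * np.conj(umace_filter)))
```

Candidate peaks in this response (or in an ASEF response) can then be extracted with the King-of-the-Hill search sketched next.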
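The poster does not spell out the King of the Hill procedure, so the sketch below is one plausible greedy reading: repeatedly take the strongest point in a filter response, then suppress a small neighborhood around it so the next pick is a distinct local peak. The suppression radius is an assumed parameter.

```python
import numpy as np

def top_local_maxima(response, n=3, radius=5):
    """Return the n strongest well-separated peaks in a 2-D response
    array as (row, col, value) tuples."""
    resp = np.array(response, dtype=float)
    peaks = []
    for _ in range(n):
        r, c = np.unravel_index(np.argmax(resp), resp.shape)
        peaks.append((r, c, response[r, c]))
        r0, r1 = max(r - radius, 0), min(r + radius + 1, resp.shape[0])
        c0, c1 = max(c - radius, 0), min(c + radius + 1, resp.shape[1])
        resp[r0:r1, c0:c1] = -np.inf        # knock the current "king" off the hill
    return peaks
```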
Modeling Faces with Active Shape Models

An Active Shape Model (ASM) uses a dataset's statistics to capture the possible shapes that objects of a certain class can take. We used face shapes consisting of sets of landmarks (nose tip, mouth corners, etc.) and trained the model on the configuration of the average face in our training data. We also find the most significant parameters that describe the ways a face can vary from the average while still representing a viable face. (A sketch of the alignment and variation-mode computation appears at the end of this poster.)

[Figure: the mean face (green) computed from 300 faces (white), after alignment via the Procrustes algorithm the team implemented.]
[Figure: three standard deviations from the mean along the largest mode of variation, roughly corresponding to yaw, and along the second-largest mode, roughly corresponding to pitch.]

Results and Deliverables

The team is delivering to MITRE:
• Code implementing ASEF and UMACE landmarking, Active Shape Models and their component algorithms, and our pose-correction technique, as well as scripts and applications for testing and demonstrating all of these algorithms.
• Accuracy results for landmarking and for pose-corrected match scores.

Because full-image pose correction can lead to lower match scores, improved face recognition may result from comparing feature-relative patches instead of warped full images. The team's feature-extraction routines will form the basis of that process.

[Figure: circles showing 5%, 10%, and 25% of interocular distance.]
[Figure: accuracy results for the best-match feature in each of six locations.]

Acknowledgments

Team Members: Elliot Godzich '12, Dylan Marriner '12, Emily Myers-Stanhope '12, Emma Taborsky '12 (PM), Heather Williams '12
MITRE Liaisons: Joshua Klontz '10, Mark Burge
Faculty Advisor: Zachary Dodds
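As a companion to the Active Shape Models section above, the rough sketch below outlines the usual pipeline it describes: Procrustes-align the training shapes, then use PCA to find the dominant modes of shape variation. Function names, the number of alignment passes, and the number of retained modes are illustrative rather than taken from the team's implementation.

```python
import numpy as np

def align_shapes(shapes):
    """Roughly Procrustes-align a list of (n, 2) landmark shapes by
    removing translation and scale, then rotating each shape toward
    the current mean shape for a few passes."""
    shapes = [s - s.mean(axis=0) for s in shapes]        # remove translation
    shapes = [s / np.linalg.norm(s) for s in shapes]     # remove scale
    mean = shapes[0]
    for _ in range(5):
        aligned = []
        for s in shapes:
            u, _, vt = np.linalg.svd(s.T @ mean)         # best rotation onto the mean
            aligned.append(s @ (u @ vt))
        shapes = aligned
        mean = np.mean(shapes, axis=0)
        mean /= np.linalg.norm(mean)
    return shapes, mean

def shape_modes(aligned_shapes, k=2):
    """PCA on the aligned shapes: the top-k eigenvectors are the main
    modes of face-shape variation (e.g., yaw-like and pitch-like)."""
    data = np.stack([s.ravel() for s in aligned_shapes])  # (num_faces, 2n)
    cov = np.cov(data - data.mean(axis=0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:k]
    return data.mean(axis=0), vals[order], vecs[:, order]
```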