2011-2012 MITRE Computer Science Clinic
Landmarking and Pose Correction for Face Recognition
Background
The MITRE Corporation is a federally funded research-and-development corporation that has developed its own facial recognition system, known as MITRE Matcher. Non-frontal facial images present a significant challenge to the recognition process for both MITRE Matcher and other facial recognition systems, even when the pose variation is as small as ten or twenty degrees. This project's goal was to research, implement, and evaluate facial-landmarking algorithms and approaches to pose analysis and pose correction.
Pose correction pipeline

Overview: The original image is cropped to the face and landmarked to determine possible feature locations on the face. The best combination of feature locations is selected using a combination of spatial heuristics and statistical estimation. Using a statistical model of landmark variation with pose, the landmarks are neutrally posed. These landmarks are then used to warp the image to a neutral pose.

[Figure: pipeline stages (original off-pose image → landmark map → best landmarks → image warping → pose-corrected image)]
Image warping

We use thin-plate splines to smoothly warp one set of face landmarks onto another. By isolating the yaw component from the Active Shape Model (ASM, described below), we transform a landmarked face into a neutral, frontal pose; the other ASM-derived vectors enable other transformations.
[Figure: example yaw warp, off-pose (right) to frontal pose (middle), shown with a forward-facing comparison image and the resulting match scores of 1.97 and 0.93; for reference, the self-match score is about 6.29.]
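The warping code itself is not shown on the poster, so the following is only a minimal sketch of a landmark-driven thin-plate-spline warp built from standard SciPy pieces, not the team's implementation; the function name, the use of RBFInterpolator, and the grayscale assumption are illustrative choices.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.ndimage import map_coordinates

def tps_warp(image, src_landmarks, dst_landmarks):
    """Warp a grayscale image so features at src_landmarks move to
    dst_landmarks (both (N, 2) arrays of (row, col) coordinates).

    Image warping needs the inverse map: for every pixel of the output
    (neutral-pose) image we find where to sample in the source (off-pose)
    image, so the thin-plate spline is fitted from dst back to src.
    """
    inverse_map = RBFInterpolator(dst_landmarks, src_landmarks,
                                  kernel="thin_plate_spline")
    rows, cols = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    grid = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    sample_coords = inverse_map(grid)              # (H*W, 2) source positions
    warped = map_coordinates(image, sample_coords.T, order=1, mode="nearest")
    return warped.reshape(image.shape)
```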
Facial Landmarking
ASEF

The Average of Synthetic Exact Filters (ASEF) is a texture-based method for landmarking. To create an ASEF filter we specify a desired synthetic output for each training image: a Gaussian dot centered at the ground-truth landmark position in that image. This yields an exact filter that transforms the training image into its synthetic image. We average all of a dataset's exact filters to get the final ASEF filter, which can then be applied to facial images to locate the landmark.

[Figure: a training image, its synthetic output image, the corresponding exact filter, and the average filter (ASEF)]

UMACE

We also investigated and implemented a feature-detection algorithm that uses Unconstrained Minimum Average Correlation Energy (UMACE) filters. We divide the average values of a standard square region around the ground-truth eye location of each training image by the average power spectrum for that same region. This gives us a correlation filter that we can apply to a standard eye-containing region to determine possible eye locations within that region.

[Figure: the UMACE filter, a facial image with the right-eye region marked, and the response of the eye region to the filter]
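Both filters live in the Fourier domain. The sketch below shows one way they could be built with NumPy, following the descriptions above; it omits the preprocessing (cropping, windowing, normalization) a real implementation would need, and the function names, sigma, and regularization constant eps are assumptions rather than the team's code.

```python
import numpy as np

def gaussian_peak(shape, center, sigma=2.0):
    """Synthetic desired output: a Gaussian dot at the ground-truth landmark."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((rows - center[0])**2 + (cols - center[1])**2) / (2 * sigma**2))

def train_asef(images, landmarks, sigma=2.0, eps=1e-5):
    """ASEF: one exact filter per training image (its desired output divided by
    the image in the Fourier domain, regularized), averaged over the dataset."""
    total = np.zeros(images[0].shape, dtype=complex)
    for img, (r, c) in zip(images, landmarks):
        F = np.fft.fft2(img)
        G = np.fft.fft2(gaussian_peak(img.shape, (r, c), sigma))
        total += G * np.conj(F) / (np.abs(F)**2 + eps)   # exact filter, roughly G / F
    return total / len(images)

def train_umace(eye_patches, eps=1e-5):
    """UMACE: the mean Fourier transform of standard eye regions divided by
    their average power spectrum."""
    spectra = np.array([np.fft.fft2(p) for p in eye_patches])
    H = spectra.mean(axis=0) / ((np.abs(spectra)**2).mean(axis=0) + eps)
    return np.conj(H)        # stored conjugated so apply_filter works for both

def apply_filter(region, W):
    """Correlation response of a region with a trained filter; bright peaks
    mark candidate landmark locations."""
    return np.real(np.fft.ifft2(np.fft.fft2(region) * W))
```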
Landmarking heuristics
multiple responses
King of the Hill is a technique for finding the n strongest local maxima in a two-dimensional array. We use it to determine the top three possible locations in our UMACE or ASEF filter responses.
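The poster names the technique but not its internals; one minimal way to pick the top n peaks is greedy selection with non-maximum suppression, as sketched below. The suppression radius is an assumed parameter, not a value from the project.

```python
import numpy as np

def king_of_the_hill(response, n=3, radius=10):
    """Greedily pick the n strongest local maxima of a 2-D filter response,
    suppressing a square neighborhood around each chosen peak so the next
    peak comes from a different part of the array."""
    resp = response.astype(float, copy=True)
    peaks = []
    for _ in range(n):
        r, c = np.unravel_index(np.argmax(resp), resp.shape)
        peaks.append((r, c))
        r0, r1 = max(0, r - radius), min(resp.shape[0], r + radius + 1)
        c0, c1 = max(0, c - radius), min(resp.shape[1], c + radius + 1)
        resp[r0:r1, c0:c1] = -np.inf
    return peaks
```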
geometric constraints
Facial features tend to end up in roughly the same area of each facial crop. We can take advantage of this by constraining the area in which we search for each landmark.
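As an illustration only, the constraint can be applied by suppressing a filter response outside each landmark's expected region before peak-finding; the window fractions below are made-up placeholders, not the project's values.

```python
import numpy as np

# Hypothetical per-landmark search windows as fractions of the face crop,
# given as (row_min, row_max, col_min, col_max).
SEARCH_WINDOWS = {
    "right_eye": (0.20, 0.45, 0.10, 0.50),
    "left_eye":  (0.20, 0.45, 0.50, 0.90),
    "nose_tip":  (0.40, 0.70, 0.30, 0.70),
}

def constrain_response(response, landmark_name):
    """Suppress the filter response outside the landmark's expected region so
    that peak-finding only considers geometrically plausible locations."""
    h, w = response.shape
    r0, r1, c0, c1 = SEARCH_WINDOWS[landmark_name]
    constrained = np.full(response.shape, -np.inf)
    rows, cols = slice(int(r0 * h), int(r1 * h)), slice(int(c0 * w), int(c1 * w))
    constrained[rows, cols] = response[rows, cols]
    return constrained
```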
combining features
An Active Shape Model (described in the next section) and/or feature-strength heuristics can provide a probability that a particular set of landmarks forms a face shape. The current system uses only feature strength, but it can support additional metrics in the future.
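One minimal way to realize the feature-strength-only selection is to score every combination of the top candidates per landmark and keep the best; the function below is an illustrative sketch, and an ASM shape-probability term could later be folded into the score.

```python
import itertools
import numpy as np

def best_landmark_combination(candidates, strengths):
    """candidates[i] is a list of (row, col) locations for landmark i (e.g. the
    top three King-of-the-Hill peaks) and strengths[i] holds the matching
    filter-response values. Returns the combination with the highest total
    feature strength."""
    best_score, best_combo = -np.inf, None
    for choice in itertools.product(*[range(len(c)) for c in candidates]):
        score = sum(strengths[i][j] for i, j in enumerate(choice))
        if score > best_score:
            best_score = score
            best_combo = [candidates[i][j] for i, j in enumerate(choice)]
    return best_combo, best_score
```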
Modeling Faces with Active Shape Models
An Active Shape Model (ASM) uses a dataset's statistics to capture the possible shapes that objects of a certain class can take. We used face shapes consisting of sets of landmarks (nose tip, mouth corners, etc.) and trained the model to recognize the configuration of the average face in our training data. We also found the most significant parameters that describe the ways a face can vary from that average while still representing a viable face.

[Figure: the mean face (green) computed from 300 faces (white) after alignment via the Procrustes algorithm the team implemented, plus shapes three standard deviations from the mean along the largest source of variation (roughly corresponding to yaw) and along the second-largest (roughly representing pitch)]
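A compact sketch of that training step, assuming each training shape is an (N, 2) array of landmark coordinates: iterative Procrustes alignment to the evolving mean followed by PCA over the aligned shapes. The function names, iteration count, and use of SVD are illustrative choices, not the team's code.

```python
import numpy as np

def align(shape, reference):
    """Similarity-align one (N, 2) landmark shape to a reference shape:
    remove translation and scale, then solve for the best rotation."""
    a = shape - shape.mean(axis=0)
    b = reference - reference.mean(axis=0)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    u, s, vt = np.linalg.svd(a.T @ b)      # orthogonal Procrustes problem
    return (a @ (u @ vt)) * s.sum()

def train_asm(shapes, n_iters=10):
    """Iteratively Procrustes-align all training shapes to their evolving mean,
    then run PCA on the aligned shapes. Returns the mean shape, the modes of
    variation, and the variance captured by each mode; in the team's data the
    two largest modes corresponded roughly to yaw and pitch."""
    shapes = np.asarray(shapes, dtype=float)          # (num_faces, N, 2)
    mean = shapes[0]
    for _ in range(n_iters):
        shapes = np.array([align(s, mean) for s in shapes])
        mean = shapes.mean(axis=0)
    centered = shapes.reshape(len(shapes), -1) - mean.ravel()
    _, singular, modes = np.linalg.svd(centered, full_matrices=False)
    variances = singular**2 / len(shapes)
    return mean, modes, variances

def synthesize(mean, modes, weights):
    """A plausible face shape: the mean plus a weighted sum of modes, e.g.
    weights = [3 * np.sqrt(variances[0])] for three standard deviations along
    the largest source of variation."""
    weights = np.asarray(weights, dtype=float)
    return (mean.ravel() + weights @ modes[:len(weights)]).reshape(mean.shape)
```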
Results and Deliverables
The team is delivering to MITRE:
• Code implementing ASEF and UMACE landmarking, Active Shape Models and their component algorithms, and our pose-correction technique, as well as scripts and applications for testing and demonstrating all of these algorithms.
• Accuracy results for landmarking and for pose-corrected match scores. Because full-image pose correction can lead to lower match scores, improved face recognition may result from comparing feature-relative patches instead of warped full images; the team's feature-extraction routines will form the basis of that process.
[Figure: landmarking accuracy for the best-match feature at each of six locations, with circles showing 5%, 10%, and 25% of interocular distance]
Acknowledgments
Team Members
Elliot Godzich '12
Dylan Marriner '12
Emily Myers-Stanhope '12
Emma Taborsky '12 (PM)
Heather Williams '12
MITRE Liaisons
Joshua Klontz '10
Mark Burge
Faculty Advisor
Zachary Dodds