Methods of Eye and Face Detection Devin Karns EGGN512 5/1/13

Eye and Face Detection

Widely used in many applications
  Biometrics
  Visual surveillance
  Human-robot interactions
Eyes represent most essential physical information of face
  Closely connected to other facial feature positions
  Can determine other facial characteristics from relative eye orientations
Methods

Segmentation texture
  Divide image into NxN blocks
  Get block textures from FFT
  Derive facial features from texture major axis projections
Dynamic time warping
  Feature vector is a waveform composed of horizontal and vertical projections of an image
  Template feature vector compared against image subsections
  Scores accumulated and thresholded
Local gradient patterns
  Pixel neighborhoods of image gradient converted to 8-bit codes
  Lookup table of AdaBoost pixel classifiers developed from massive database of face and non-face images
  Image pixel codes weighted based on a lookup table as either face or non-face
Segmentation Texture – FFT

Get facial symmetry axis from edge image inertia matrix
Image is divided into blocks with partial overlap in sampling
Take FFT of each block and threshold
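
The block sampling and FFT thresholding steps above could be sketched roughly as follows. This is a minimal illustration using NumPy, not the code used for the results in these slides: the function name `block_fft_binarized`, the block size, the step (which sets the overlap), and the relative threshold are all assumptions.

```python
import numpy as np

def block_fft_binarized(image, block=16, step=8, rel_thresh=0.1):
    """Split a grayscale image into overlapping blocks and binarize each block's FFT.

    block, step and rel_thresh are illustrative assumptions, not values from the slides.
    Returns a dict mapping the (row, col) of each block's top-left corner to a boolean mask.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    masks = {}
    # step < block gives partially overlapping blocks.
    for r in range(0, rows - block + 1, step):
        for c in range(0, cols - block + 1, step):
            patch = image[r:r + block, c:c + block]
            # Subtract the mean so the DC term does not dominate the spectrum.
            spectrum = np.abs(np.fft.fftshift(np.fft.fft2(patch - patch.mean())))
            peak = spectrum.max()
            if peak > 0:
                # Keep frequency bins above a fraction of the block's peak magnitude.
                masks[(r, c)] = spectrum > rel_thresh * peak
            else:
                # A perfectly flat block has no texture at all.
                masks[(r, c)] = np.zeros_like(spectrum, dtype=bool)
    return masks
```

With `step` equal to half of `block`, neighboring blocks overlap by 50%, which is one plausible reading of "partial overlap in sampling".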
Segmentation Texture – Face Region

Calculate Emajor from block eigenvalues and eigenvectors
Perform elliptical Hough transform on binarized Emajor distribution
Projected ellipse encloses face region
Segmentation Texture – Eye Position

Calculate binarized FFT block white-to-total pixel ratio (tau)
Max of projection of tau along rows (within face region) dictates eye row
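
A rough sketch of the tau computation and row projection described above, assuming the binarized FFT blocks are stacked into one boolean array and the face region is available as a per-block mask. The function name `eye_row_from_tau` and the array layout are hypothetical.

```python
import numpy as np

def eye_row_from_tau(binary_blocks, face_mask=None):
    """Estimate the eye row from binarized FFT blocks.

    binary_blocks: boolean array of shape (block_rows, block_cols, h, w) (assumed layout).
    face_mask: optional boolean array of shape (block_rows, block_cols) marking
               blocks inside the face region (assumed representation).
    Returns (eye_row_index, tau), where tau has one value per block.
    """
    blocks = np.asarray(binary_blocks, dtype=bool)
    # tau = white pixels / total pixels in each binarized block.
    tau = blocks.mean(axis=(2, 3))
    if face_mask is not None:
        # Blocks outside the face region do not contribute to the projection.
        tau = np.where(np.asarray(face_mask, dtype=bool), tau, 0.0)
    # Project tau along each row of blocks; the largest sum marks the eye row.
    row_projection = tau.sum(axis=1)
    return int(np.argmax(row_projection)), tau
```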
Segmentation Texture – Eye Position

Horizontal positions determined from local maxima in column projection of tau about symmetry axis
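
One way the horizontal search could look, as a simplified sketch: it assumes a vertical symmetry axis given as a block-column index (no head tilt) and takes the single largest column-projection value on each side of it rather than testing every local maximum. The name `eye_columns` and the optional `row_band` argument are hypothetical.

```python
import numpy as np

def eye_columns(tau, axis_col, row_band=None):
    """Pick one eye column on each side of a vertical symmetry axis.

    tau: 2D array of per-block white-to-total ratios.
    axis_col: column index of the symmetry axis (assumed vertical here).
    row_band: optional (start, stop) block rows around the eye row to project over.
    """
    tau = np.asarray(tau, dtype=float)
    rows = tau if row_band is None else tau[row_band[0]:row_band[1]]
    # Project tau along columns.
    col_projection = rows.sum(axis=0)
    left = col_projection[:axis_col]
    right = col_projection[axis_col + 1:]
    if left.size == 0 or right.size == 0:
        raise ValueError("symmetry axis must leave columns on both sides")
    left_col = int(np.argmax(left))
    right_col = axis_col + 1 + int(np.argmax(right))
    return left_col, right_col
```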
Segmentation Texture – Performance




Overall detection rate: ~50%
Proper detection: ~28%
Partial detection: ~42%
False detection: ~30%
Segmentation – Conclusions

Pros
  No database/template required, fast (~1-3 sec/image)
  Can account for some head tilt fairly well
  Usually finds at least one eye
Cons
  If face symmetry axis not correctly found, eye orientations will be skewed
  Emajor does not always sufficiently outline head, can lead to face region mismatching
  Fooled by glasses, dimples, moustaches
Dynamic Time Warping – Feature Vector


Measures similarity between two sequences
Image region row and column projections weighted by triangle function to emphasize nose bridge and eye regions
Vectors are concatenated to form a feature vector
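
A minimal sketch of how such a feature vector might be built. The slides do not give the exact shape of the triangle function, so a triangle centered on the region is assumed here, and the name `projection_feature_vector` is hypothetical.

```python
import numpy as np

def projection_feature_vector(region):
    """Concatenate triangle-weighted row and column projections of an image region."""
    region = np.asarray(region, dtype=float)
    n_rows, n_cols = region.shape
    row_proj = region.sum(axis=1)  # one value per row
    col_proj = region.sum(axis=0)  # one value per column
    # Centered triangular weights that peak at 1 in the middle (assumed shape).
    row_w = 1.0 - np.abs(np.linspace(-1.0, 1.0, n_rows + 2)[1:-1])
    col_w = 1.0 - np.abs(np.linspace(-1.0, 1.0, n_cols + 2)[1:-1])
    return np.concatenate([row_proj * row_w, col_proj * col_w])
```

For an NxM region the resulting vector has N + M entries, so template and image vectors of the same region size are directly comparable.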
Dynamic Time Warping – Eye Location



Warping path determines minimum distance between vectors
Minimum distances accumulated over image at each pixel
Minimum region of accumulated distances determines potential eye locations
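
For reference, the textbook dynamic-programming form of the DTW distance between two 1D sequences is sketched below. It uses absolute difference as the local cost and no window constraint; both are assumptions, since the slides do not state which variant was used.

```python
def dtw_distance(a, b):
    """Minimum cumulative distance between sequences a and b along any warping path."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor steps.
            cost[i][j] = local + min(cost[i - 1][j],      # advance in a only
                                     cost[i][j - 1],      # advance in b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]
```

Two sequences that differ only by local stretching, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have a DTW distance of zero, which is why the measure tolerates small changes in eye spacing or scale.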
Dynamic Time Warping – Process




Obtain edge image using Sobel filter
Sample an NxM region at every pixel (NxM is the template size) and convert it to a feature vector
Compare template feature vectors to image feature vectors
Accumulate minimum distances and threshold to determine eye locations
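
The first and last steps of this process could be sketched as follows. The Sobel kernels are the standard ones; the thresholding rule in `candidate_locations` (keep positions within a fraction of the range above the minimum) and the `fraction` parameter are assumptions, since the slides note that generalized thresholding is difficult and give no rule.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_edges(image):
    """Sobel gradient magnitude, same shape as the input (edges replicated at the border)."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)
    # Accumulate the 3x3 correlation one kernel tap at a time.
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + rows, dc:dc + cols]
            gx += SOBEL_X[dr, dc] * window
            gy += SOBEL_Y[dr, dc] * window
    return np.hypot(gx, gy)

def candidate_locations(accumulated, fraction=0.1):
    """Return (row, col) positions whose accumulated distance is near the minimum.

    fraction is a hypothetical tuning parameter, not a value from the slides.
    """
    accumulated = np.asarray(accumulated, dtype=float)
    lowest, highest = accumulated.min(), accumulated.max()
    cutoff = lowest + fraction * (highest - lowest)
    return [tuple(int(v) for v in rc) for rc in np.argwhere(accumulated <= cutoff)]
```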
Dynamic Time Warping – Performance




Overall detection rate: ~21.37%
Proper detection: ~10%
Partial detection: ~28.5%
False detection: ~61.5%
Dynamic Time Warping – Conclusions

Pros
  Only requires one or more templates
  Less sensitive to different head poses if enough templates are used
  With proper thresholding and weighting, will usually find eye rows properly
Cons
  If eyes are shaded, they will most likely be missed
  Increasing number of templates and sizes drastically decreases performance
  Frequently thinks that chins, necks, eyebrows, facial hair, cheeks, noses, teeth, foreheads, scalps, glasses, shadows, and the background are eyes
  Generalized thresholding is difficult
Local Gradient Pattern – Pixel Codes




Uses small kernel to summarize local structure of an image
Samples surrounding pixel intensities with bilinear interpolation
Surrounding-to-center intensity delta computed for local structure pixels
Deltas thresholded and read clockwise to form binary code
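
A simplified sketch of the pixel coding step. It makes three assumptions that the slides do not state: the eight integer-grid neighbors of a 3x3 kernel are used instead of bilinearly interpolated samples, the threshold is the mean absolute delta of the neighborhood, and the clockwise read starts at the top-left neighbor.

```python
# Clockwise (row, col) neighbor offsets, starting at the top-left pixel (assumed start).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lgp_code(image, r, c):
    """8-bit local gradient pattern code for the interior pixel at (r, c)."""
    center = image[r][c]
    # Absolute surrounding-to-center intensity deltas.
    deltas = [abs(image[r + dr][c + dc] - center) for dr, dc in OFFSETS]
    # Threshold each delta against the neighborhood's mean delta (assumption).
    threshold = sum(deltas) / len(deltas)
    code = 0
    for d in deltas:
        # First neighbor read becomes the most significant bit.
        code = (code << 1) | (1 if d >= threshold else 0)
    return code

def lgp_image(image):
    """LGP codes for all interior pixels of a 2D list-of-lists image."""
    rows, cols = len(image), len(image[0])
    return [[lgp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

Because the threshold is relative to the local deltas, adding a constant to every intensity leaves the code unchanged.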
Local Gradient Pattern – AdaBoost Learning





Defines weighting lookup tables based on collections of face and non-face images
Lookup table is a 3D matrix (NxNx256) of classifiers that defines a weighting for every pixel of an NxN region for every possible pixel code value (0-255)
Weights are learned through iterative searches for feature point values that are common through known face and non-face images
Scores image regions by scanning lookup table over LGP-coded image to obtain strong classifier values
Strong classifier maxima determine face regions
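
The scanning step could look roughly like this, assuming the learned lookup table is already available as an NxNx256 array and the strong classifier value of a window is the sum of the table entries selected by each pixel's code. The function name `score_map` is hypothetical, and AdaBoost training itself is not shown.

```python
import numpy as np

def score_map(codes, lut):
    """Slide an NxNx256 lookup table over an LGP-coded image.

    codes: 2D integer array of LGP codes (0-255).
    lut:   array of shape (N, N, 256); lut[y, x, k] is the weight for code k
           at window position (y, x) (assumed layout).
    Returns a 2D array with one strong classifier value per window position.
    """
    codes = np.asarray(codes, dtype=np.intp)
    lut = np.asarray(lut, dtype=float)
    n = lut.shape[0]
    rows, cols = codes.shape
    yy, xx = np.mgrid[0:n, 0:n]
    scores = np.empty((rows - n + 1, cols - n + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            window = codes[r:r + n, c:c + n]
            # Look up the weight for each pixel's code at its window position and sum.
            scores[r, c] = lut[yy, xx, window].sum()
    return scores
```

The position of the largest value, for example via `np.unravel_index(scores.argmax(), scores.shape)`, would then be the top-left corner of the most face-like window.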
Local Gradient Pattern – Procedure
Local Gradient Pattern – Conclusions

Could not get this to work
  Lack of database images? (~4400 here vs >1,000,000 in paper)
  Lack of variety of faces? (only 40 different people)
  Maybe I didn’t do it correctly?
With larger databases, AdaBoost learning time can take days on fast systems
  Significant processing time on normal machines
  Most likely was not coded optimally
Pros
  Could theoretically find faces efficiently at multiple scales, invariant to intensity
Cons…
Questions?