Application of a Post-processing Algorithm
for Improved Human Face Recognition
Metin Artiklar and Mohamad H. Hassoun
Dept. Electrical and Computer Engineering
Wayne State University
Detroit, MI 48202
metin@tarek.eng.wayne.edu

Paul Watta
Dept. Electrical and Computer Engineering
University of Michigan-Dearborn
Dearborn, MI 48128
watta@umich.edu
Abstract
This paper presents a shifting algorithm which
can be used in a pattern recognition system to
improve the system’s performance in the
presence of shifted input patterns. The algorithm
is outlined and simulation results are presented
for some human face recognition experiments. It
is shown that the shifting algorithm improves
recognition performance for several different
face recognition algorithms.

1. Introduction
The problems of shift, rotation, and scaling are
troublesome for image processing applications,
such as automatic recognition systems. This
paper presents an algorithm which can be used to
improve the classification performance of such a
system in the presence of shifted images.

Figure 1(a) shows an example of an
82x115-dimensional face image. To facilitate the
shifting process, the image is cropped to a size of
72x72, as shown in (b). The image is shown in gray
scale, but it can be made binary by simply
thresholding the gray levels at 127.

Figure 1. (a) An 82x115 face image and (b)
the same image cropped to 72x72.

Figure 2 shows average Hamming distances (on
the binary images) computed among 100 face
images such as (b). The Hamming distance is
computed between the image and itself, but with
various amounts of shift and various directions
of the shift. In the middle of the Figure, since the
left/right and top/bottom shifts are 0, the
Hamming distance is 0 (no shift is applied). As
the image is moved 1 pixel to the right, the
Hamming distance is 10.4% (on average). If it is
moved 2 pixels up, the average Hamming
distance is 14.6%. Figure 2 shows that even a
1-pixel shift in the image can yield a large change
in Hamming distance.

Figure 2. The average Hamming distance (in
percent and averaged over 100 faces) between a
face image and shifted versions of itself, plotted
against the left/right shift and the up/down shift.
The 0 in the center represents the case where no
shift is applied.
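As a minimal sketch of the measurement behind Figure 2, the following Python code thresholds a gray-scale image at 127, crops a 72x72 window, and reports the Hamming distance in percent between that window and a displaced copy of it. It assumes that a shift is realized by moving the crop window inside the larger 82x115 image. The function names, the window origin, the treatment of gray level 127 itself, and the random stand-in image are illustrative choices, not details taken from the experiments.

```python
import numpy as np

def binarize(gray, threshold=127):
    """Map gray levels above the threshold to 1, all others to 0 (assumed convention)."""
    return (gray > threshold).astype(np.uint8)

def crop(image, top, left, size=72):
    """Return the size x size window whose upper-left corner is (top, left)."""
    return image[top:top + size, left:left + size]

def hamming_percent(a, b):
    """Percentage of pixel positions at which two binary images differ."""
    return 100.0 * np.count_nonzero(a != b) / a.size

# Random stand-in for a face image (115 rows by 82 columns is an assumption).
rng = np.random.default_rng(0)
face = binarize(rng.integers(0, 256, size=(115, 82)))

top, left = 20, 5                       # assumed origin of the unshifted window
reference = crop(face, top, left)
moved = crop(face, top, left + 1)       # window displaced by one pixel

print(hamming_percent(reference, reference))   # 0.0 (no shift)
print(hamming_percent(reference, moved))       # distance after a 1-pixel displacement
```

On real face images neighboring pixels are correlated, so the distance after a small shift is far below the value a random image produces; the sketch only illustrates the bookkeeping.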
Note that all the images discussed here were
obtained in a laboratory setting using an
apparatus which constrains the amount of shift,
rotation, scale, and tilt of the face. In particular,
the experimental setup consists of a frame
attached to a tripod. The subject puts his or her
head in the frame, and the picture is snapped.
This setup eliminates the need for segmenting
the face from the rest of the image.
Even with this constrained method of snapping
the images, there can still be an appreciable
amount of shift present in (different) images of
the same person. Figure 3 shows two different
pictures of the same individual. The images were
snapped within minutes of each other. Figure 4
shows the resulting Hamming distances as image
(b) is shifted and compared to image (a).
Figure 3. Two different 72x72 images of the
same individual, (a) and (b).

 29.4   28.5   27.5   27.3   27.3   28.5   29.4   30.5   31.5
 27.9   27.0   25.8   25.4   25.6   26.4   27.8   29.1   30.6
 27.0   25.4   24.2   23.5   23.7   25.0   25.9   27.6   29.9
 25.8   23.7   22.0   21.4   21.4   22.8   24.5   26.8   29.2
 25.0   22.3   19.6   18.0  *17.7*  20.1   22.5   25.4   27.6
 24.0   21.2   17.2   14.2  *13.9*  17.0   21.0   24.4   27.1
 23.7   19.7   15.7   11.3  *10.4*  15.8   20.4   23.2   26.2
 24.2   20.8   16.9   12.9   12.8   16.6   20.4   23.5   26.9
 25.4   22.8   19.8   17.5   17.0   19.2   22.0   25.2   27.9

Figure 4. This diagram shows the Hamming
distance (in percent) between image 3(a) and
shifted versions of image 3(b).

At the center value in the table (where neither
image is shifted), the Hamming distance is
17.7%. Starting from this position, a path can be
outlined (marked with asterisks) which seeks to
minimize the Hamming distance between the
two images. This is accomplished by examining
the Hamming distance in the four nearest
neighbors: north, south, east, and west. In this
case, the Hamming distance is smallest in the
south direction, hence the path moves in that
direction. This process is continued until no
further improvements can be made, as shown in
the final Hamming distance value of 10.4%.

Here, by applying this shifting process, the
distance between the two images decreases by
41%. Increasing the match between a test image
of a person and a stored prototype is desirable in
many face recognition systems.

Unfortunately, this process can also improve the
Hamming distance for images of different
people. For example, Figure 5 shows two images
of different people, and Figure 6 shows the
corresponding Hamming distances as image (b)
is shifted and compared to image (a). Again, a
path can be taken which lowers the Hamming
distance, but notice that the improvement in
Hamming distance here (28%) is smaller than the
case of the same individual.

Figure 5. Images of two different people, (a)
and (b).

 25.3   26.4   28.2   30.7   31.8   32.8   33.7   35.4   37.5
 23.7   25.3   27.3   29.5   31.3   32.5   33.6   35.2   37.4
 23.1   24.5   27.1   29.2   31.2   32.9   33.7   35.2   37.5
 23.6   25.5   27.3   29.8   31.9   33.3   34.2   35.7   38.0
 24.3   26.2   28.3   30.5   32.6   34.0   34.8   36.3   38.5
 25.6   27.1   29.3   31.4   33.5   34.6   35.1   36.6   38.7
 26.8   28.7   30.9   32.7   34.3   35.2   35.7   37.0   38.9
 28.0   29.5   32.1   33.6   34.9   36.1   36.7   37.8   39.5
 30.0   31.8   33.5   34.9   35.8   37.2   38.0   39.1   40.8

Figure 6. This diagram shows the Hamming
distance (in percent) between the two images
shown in Figure 5 with various amounts of shift.
Of course, for recognition purposes, it is desired
that the distance between images of different
people be as large as possible. The conjecture
here is that even though the shifting process
decreases the distance between different people,
it tends to do so by a lesser amount than the
distance improvement for images of the same
person. Hence, there is an overall increase in
the separation ability of the classifier.
The process of shifting the input image to obtain
a better match between images (of the same
individual taken at different times) is the basis
for the post-processing algorithm which is
described in the next section.
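The path-following idea illustrated with Figure 4 can be sketched in a few lines of Python. The grid below copies the values of Figure 4, under the assumption that rows run from north to south and columns from west to east; the function name and the (row, column) indexing are illustrative. The sketch walks to the lowest of the four neighbors until none is lower than the current position.

```python
# Hamming distances (percent) from Figure 4. Assumed layout: rows run north
# to south, columns west to east, and the center entry is the unshifted case.
FIG4 = [
    [29.4, 28.5, 27.5, 27.3, 27.3, 28.5, 29.4, 30.5, 31.5],
    [27.9, 27.0, 25.8, 25.4, 25.6, 26.4, 27.8, 29.1, 30.6],
    [27.0, 25.4, 24.2, 23.5, 23.7, 25.0, 25.9, 27.6, 29.9],
    [25.8, 23.7, 22.0, 21.4, 21.4, 22.8, 24.5, 26.8, 29.2],
    [25.0, 22.3, 19.6, 18.0, 17.7, 20.1, 22.5, 25.4, 27.6],
    [24.0, 21.2, 17.2, 14.2, 13.9, 17.0, 21.0, 24.4, 27.1],
    [23.7, 19.7, 15.7, 11.3, 10.4, 15.8, 20.4, 23.2, 26.2],
    [24.2, 20.8, 16.9, 12.9, 12.8, 16.6, 20.4, 23.5, 26.9],
    [25.4, 22.8, 19.8, 17.5, 17.0, 19.2, 22.0, 25.2, 27.9],
]

def greedy_path(grid, start):
    """Step to the lowest of the four neighbors until no neighbor is lower."""
    row, col = start
    path = [(row, col)]
    while True:
        neighbors = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
        inside = [(r, c) for r, c in neighbors
                  if 0 <= r < len(grid) and 0 <= c < len(grid[0])]
        best = min(inside, key=lambda rc: grid[rc[0]][rc[1]])
        if grid[best[0]][best[1]] >= grid[row][col]:
            return path
        row, col = best
        path.append(best)

path = greedy_path(FIG4, (4, 4))               # start at the unshifted center
values = [FIG4[r][c] for r, c in path]
improvement = 100.0 * (values[0] - values[-1]) / values[0]
print(values)               # [17.7, 13.9, 10.4]
print(round(improvement))   # 41
```

The two southward steps and the 41% reduction agree with the values quoted for the same-person example.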
2. The Shifting Algorithm
This method requires the use of a pattern
recognition scheme which can produce an
ordered list of outputs ranked according to
similarity to an input pattern. In our case, we use
a simple nearest neighbor classification scheme
and rank the outputs on the basis of Hamming
distance (in the case of binary images) and city
block distance (in the case of gray-scale images).
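A minimal sketch of such a ranked nearest neighbor scheme is given below. The function names and the toy data are hypothetical; only the two distance measures (Hamming for binary images, city block for gray-scale images) come from the text.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of pixel positions at which two images differ."""
    return int(np.count_nonzero(a != b))

def city_block_distance(a, b):
    """Sum of absolute gray-level differences (L1 distance)."""
    # Cast to a signed type so that uint8 differences do not wrap around.
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def rank_prototypes(test_image, prototypes, distance):
    """Return prototype indices ordered from most to least similar."""
    distances = [distance(test_image, p) for p in prototypes]
    return sorted(range(len(prototypes)), key=lambda i: distances[i])

# Toy data: five random 72x72 gray-scale "prototypes" and a test image
# that is prototype 3 with a single gray level perturbed.
rng = np.random.default_rng(1)
prototypes = [rng.integers(0, 256, size=(72, 72), dtype=np.uint8) for _ in range(5)]
test_image = prototypes[3].copy()
test_image[0, 0] ^= 1

ranking = rank_prototypes(test_image, prototypes, city_block_distance)
print(ranking[0])   # 3
```

The full ordered list, not just its first entry, is what the post-processing step consumes.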
In this post-processing algorithm, we select two
parameters: k, the number of patterns from the
ordered list which will participate in the shifting
process (i.e., the k patterns at the top of the list),
and s, the maximum path length in the shifting
process. To minimize computational time, k
should be chosen as small as possible, but
sufficiently large so that there is a high
probability that the correct image is contained in
this subset of patterns.
As illustrated in the previous section, each of the
k candidate output patterns is shifted by 1 pixel
in 4 possible directions: north, east, south, and
west. After computing a similarity measure for
each of these directions, we select the direction
which yields the lowest distance and move the
kth pattern in that direction. This process is
repeated at most s times, or until there is no
improvement in the distance measure in any
direction. The final distance is recorded for each
of the k patterns, and the image which has the
least distance after this shifting process is taken
as the output of the system (the best match).
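The procedure just described might be sketched as follows for binary images. This is an illustration under stated assumptions rather than the implementation used in the experiments: each stored image is assumed to be kept slightly larger than 72x72 so that a shift amounts to moving a window over it, and all names (`shifted_distance`, `shift_postprocess`, `origin`, `ranked`) are hypothetical.

```python
import numpy as np

SIZE = 72
DIRECTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # north, east, south, west

def window_distance(test, stored, top, left):
    """Hamming distance between the test image and one window of a stored image."""
    window = stored[top:top + SIZE, left:left + SIZE]
    return int(np.count_nonzero(window != test))

def shifted_distance(test, stored, origin, s):
    """Greedy descent of at most s one-pixel steps; returns the final distance."""
    top, left = origin
    best = window_distance(test, stored, top, left)
    for _ in range(s):
        moves = []
        for dy, dx in DIRECTIONS:
            t, l = top + dy, left + dx
            # Skip moves that would push the window outside the stored image.
            if 0 <= t <= stored.shape[0] - SIZE and 0 <= l <= stored.shape[1] - SIZE:
                moves.append((window_distance(test, stored, t, l), t, l))
        if not moves:
            break
        d, t, l = min(moves)
        if d >= best:          # no direction improves the distance
            break
        best, top, left = d, t, l
    return best

def shift_postprocess(test, stored_images, ranked, origin, k, s):
    """Return the index, among the top k ranked candidates, with the least final distance."""
    finals = {i: shifted_distance(test, stored_images[i], origin, s) for i in ranked[:k]}
    return min(finals, key=finals.get)

# Toy check: the test image is stored image 1 displaced by one pixel.
rng = np.random.default_rng(0)
stored_images = [rng.integers(0, 2, size=(82, 82), dtype=np.uint8) for _ in range(2)]
origin = (5, 5)
test = stored_images[1][6:6 + SIZE, 5:5 + SIZE].copy()
best = shift_postprocess(test, stored_images, ranked=[0, 1], origin=origin, k=2, s=5)
print(best)   # 1
```

In the toy check the second-ranked candidate wins after shifting because a single step south aligns it exactly with the test image.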
Note that when computing the optimal path, it is
not necessary to generate an entire grid of all
possibilities, as shown in Figure 4. Rather, we
need only compute those Hamming distances
along the path (and those distances surrounding
the optimal path).
In addition, from a computational point of view,
there is no need to actually shift the pixels of the
image at each step. Rather, one need only store
the current corner point of the shifted image and
use that as an offset index when computing the
required Hamming distance. Hence, this
algorithm can be implemented very efficiently.
In fact, the computational burden of computing
the optimal path consists of computing 4
Hamming distances at each step (on the 72x72
images), along with finding the minimum of
those 4 values.
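The corner-point idea can be illustrated with array slicing, which in NumPy returns a view on the stored data rather than a copy. The sketch below assumes a stored image somewhat larger than 72x72; the helper names and the evaluation count (one initial comparison plus four per step for each participating image) are an illustrative reading of the cost described above.

```python
import numpy as np

SIZE = 72

def window_at(stored, corner):
    """Return a view (not a copy) of the SIZE x SIZE window at the given corner."""
    top, left = corner
    return stored[top:top + SIZE, left:left + SIZE]

def max_evaluations(num_images, s):
    """Upper bound on Hamming distance evaluations: one initial comparison
    plus 4 per step, for each participating image."""
    return num_images * (1 + 4 * s)

stored = np.zeros((82, 82), dtype=np.uint8)   # hypothetical uncropped stored image
corner = (5, 5)
window = window_at(stored, corner)
print(np.shares_memory(window, stored))       # True: no pixels were moved

corner = (corner[0] + 1, corner[1])           # "shift" one pixel south
window = window_at(stored, corner)            # still a view on the same data

print(max_evaluations(40, 5))                 # 840
```

Only the pair of integers in `corner` changes from step to step, which is why the per-step cost is dominated by the four distance computations.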
Many recognition systems use several prototypes
of patterns to be recognized. For example, in the
context of face recognition, it is possible to store
several images of each person. We have
constructed such a data set, containing 400
images of 100 different people. Each person has
4 images showing different facial expressions:
blank, smiling, angry, and surprised.
In this case, it is possible to select the top 10
winners (k = 10) and then also include all 4
expressions of each person as part of the shifting
process. Hence, 40 images participate in the
shifting post-processing algorithm. In the results
below, this method is called Shift-10.
Alternatively, we can simply select the top 40
images (k = 40) and apply the shift to these
images. This method is referred to as Shift-40.
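The two ways of choosing the participating images can be sketched as follows. The image-to-person layout (image i shows person i // 4) and the function names are hypothetical. Note that, read literally, the Shift-10 selection yields fewer than 40 images whenever two of the top 10 winners show the same person.

```python
def select_with_expressions(ranked, person_of, k):
    """Top-k winners plus every stored image of the people they depict (Shift-10 style)."""
    people = {person_of[i] for i in ranked[:k]}
    return [i for i in range(len(person_of)) if person_of[i] in people]

def select_top(ranked, k):
    """Simply the k best-ranked images (Shift-40 style)."""
    return ranked[:k]

# Hypothetical layout: 400 images, 4 expressions per person.
person_of = [i // 4 for i in range(400)]

# Toy ranking in which the first 100 entries show 100 different people.
ranked = list(range(0, 400, 4)) + [i for i in range(400) if i % 4]

shift10 = select_with_expressions(ranked, person_of, 10)
shift40 = select_top(ranked, 40)
print(len(shift10), len(shift40))   # 40 40
```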
3. Results
We tested the performance of this post-processing algorithm on a data set consisting of
400 72x72-dimensional face images (100
different people), as described in the previous
section. Note that an additional test image was
taken of each person in the data set (a blank
expression).
Figure 7 shows the results of the simulations
before and after the post-processing algorithm on
binary face images using two different
classification methods: a Hamming distance
method and a two-level decoupled Hamming
network (Watta, Akkal, and Hassoun, 1997;
Ikeda, Watta, and Hassoun, 1998). The numbers
in column 1 indicate the percentage that the
correct person was ranked first in the list, and
columns 2 and 3 indicate what percentage of the
time the correct person was ranked second and
third, respectively.
For these results, s was set at 5, and 40 images
took part in the shifting process (k = 10 for
Shift-10 and k = 40 for Shift-40, as described in
the previous section).

Post-processing  Algorithm             1    2    3
None             Hamming Classifier   94   73   45
None             2-Level Hamming      79   58   41
Shift-10         Hamming Classifier   97   87   47
Shift-10         2-Level Hamming      97   88   53
Shift-40         Hamming Classifier   97   87   46
Shift-40         2-Level Hamming      98   87   46

Figure 7. Results of the post-processing
algorithms for binary face images.

Figure 8 shows the results of the post-processing
algorithms when the gray-scale images are used.
Here, 3 different recognition schemes are used: a
nearest neighbor classifier (using the city block
distance as the similarity measure), the two-level
decoupled Hamming network, and a wavelet face
recognition algorithm (Stollnitz, DeRose, and
Salesin, 1995; Jacobs, Finkelstein, and Salesin,
1995).
The results indicate that the proposed
post-processing shifting algorithm improves overall
system performance. Most notable is the
improvement for the patterns which appear
second and third in the ordered list. For example,
for binary images using the Hamming distance,
the correct person was present second on the list
only 76% of the time, but after applying the
shifting algorithm, the correct person appeared
90% of the time. Hence this post-processing
algorithm could be used in the context of a
sensor fusion scheme whereby final
classification is made on the basis of the
information present in several of the top
matching patterns, rather than just the best
matching pattern.
Future publications will explore the use of this
shifting algorithm in a more practical recognition
system, which includes a mechanism for
rejecting an image when there is an insufficient
match between the input and one of the memory
patterns.

Post-processing  Algorithm             1    2    3
None             Hamming Classifier   94   82   52
None             2-Level Hamming      94   75   52
None             Wavelet Classifier   97   80   62
Shift-10         Hamming Classifier   97   89   65
Shift-10         2-Level Hamming      95   91   70
Shift-10         Wavelet Classifier   98   93   76
Shift-40         Hamming Classifier   98   89   65
Shift-40         2-Level Hamming      95   85   61
Shift-40         Wavelet Classifier   97   90   75

Figure 8. Results of the post-processing
algorithms for gray-scale images.

Acknowledgments
This work was supported by the National
Science Foundation (NSF) under contract
ECS-9618597.
References
1. Ikeda, N., Watta, P., and Hassoun, M. (1998).
"Capacity Analysis of the Two-Level Decoupled
Hamming Associative Memory," Proceedings of
the IEEE International Conference on Neural
Networks, ICNN'98, May 4-9, 1998, Anchorage,
Alaska, pp. 486-491.
2. Jacobs, C., Finkelstein, A., and Salesin, D.
(1995). "Fast Multiresolution Image Querying,"
Proceedings of SIGGRAPH 95, in Computer
Graphics Proceedings, Annual Conference
Series, pp. 277-286, August 1995, Los Angeles,
CA.
3. Stollnitz, E., DeRose, T., and Salesin, D. (1995).
"Wavelets for Computer Graphics: A Primer, Part
1," IEEE Computer Graphics and Applications,
15(3), pp. 76-84.
4. Watta, P., Akkal, M., and Hassoun, M. (1997).
"Decoupled Voting Hamming Associative
Memory Networks," Proceedings of the IEEE
International Conference on Neural Networks,
ICNN'97, June 9-12, 1997, Houston, Texas, pp.
1188-1193.