Application of a Post-processing Algorithm for Improved Human Face Recognition

Metin Artiklar and Mohamad H. Hassoun
Dept. Electrical and Computer Engineering
Wayne State University
Detroit, MI 48202
metin@tarek.eng.wayne.edu

Paul Watta
Dept. Electrical and Computer Engineering
University of Michigan-Dearborn
Dearborn, MI 48128
watta@umich.edu

Abstract

This paper presents a shifting algorithm which can be used in a pattern recognition system to improve the system's performance in the presence of shifted input patterns. The algorithm is outlined and simulation results are presented for some human face recognition experiments. It is shown that the shifting algorithm improves recognition performance for several different face recognition algorithms.

1. Introduction

The problems of shift, rotation, and scaling are troublesome for image processing applications, such as automatic recognition systems. This paper presents an algorithm which can be used to improve the classification performance of such a system in the presence of shifted images.

Figure 1(a) shows an example of an 82x115-dimensional face image. To facilitate the shifting process, the image is cropped to a size of 72x72, as shown in (b). The image is shown in gray scale, but it can be made binary by simply thresholding the gray levels at 127.

Figure 1. (a) An 82x115 face image and (b) the same image cropped to 72x72.

Figure 2 shows average Hamming distances (on the binary images) computed over 100 face images such as (b). The Hamming distance is computed between the image and itself, but with various amounts of shift and various directions of the shift. In the middle of the figure, since the left/right and up/down shifts are 0, the Hamming distance is 0 (no shift is applied). As the image is moved 1 pixel to the right, the Hamming distance is 10.4% (on average). If it is moved 2 pixels up, the average Hamming distance is 14.6%. Figure 2 shows that even a 1-pixel shift in the image can yield a large change in Hamming distance.

Figure 2. The average Hamming distance (in percent and averaged over 100 faces) between a face image and shifted versions of itself, plotted against the up/down shift and the left/right shift. The 0 in the center represents the case where no shift is applied.

Note that all the images discussed here were obtained in a laboratory setting using an apparatus which constrains the amount of shift, rotation, scale, and tilt of the face. In particular, the experimental setup consists of a frame attached to a tripod. The subject puts his or her head in the frame, and the picture is snapped. This setup eliminates the need for segmenting the face from the rest of the image.

Even with this constrained method of snapping the images, there can still be an appreciable amount of shift present in different images of the same person. Figure 3 shows two different pictures of the same individual. The images were snapped within minutes of each other. Figure 4 shows the resulting Hamming distances as image (b) is shifted and compared to image (a).

Figure 3. (a), (b) Two different 72x72 images of the same individual.

29.4  28.5  27.5  27.3  27.3  28.5  29.4  30.5  31.5
27.9  27.0  25.8  25.4  25.6  26.4  27.8  29.1  30.6
27.0  25.4  24.2  23.5  23.7  25.0  25.9  27.6  29.9
25.8  23.7  22.0  21.4  21.4  22.8  24.5  26.8  29.2
25.0  22.3  19.6  18.0  17.7  20.1  22.5  25.4  27.6
24.0  21.2  17.2  14.2  13.9  17.0  21.0  24.4  27.1
23.7  19.7  15.7  11.3  10.4  15.8  20.4  23.2  26.2
24.2  20.8  16.9  12.9  12.8  16.6  20.4  23.5  26.9
25.4  22.8  19.8  17.5  17.0  19.2  22.0  25.2  27.9

Figure 4. This diagram shows the Hamming distance (in percent) between image 3(a) and shifted versions of image 3(b).

At the center value in the table (where neither image is shifted), the Hamming distance is 17.7%. Starting from this position, a path can be outlined (17.7, then 13.9, then 10.4) which seeks to minimize the Hamming distance between the two images. This is accomplished by examining the Hamming distance in the four nearest neighbors: north, south, east, and west. In this case, the Hamming distance is smallest in the south direction, hence the path moves in that direction. This process is continued until no further improvement can be made, at which point the Hamming distance is 10.4%. Here, by applying this shifting process, the distance between the two images decreases by 41%.

Increasing the match between a test image of a person and a stored prototype is desirable in many face recognition systems. Unfortunately, this process can also reduce the Hamming distance for images of different people. For example, Figure 5 shows two images of different people, and Figure 6 shows the corresponding Hamming distances as image (b) is shifted and compared to image (a). Again, a path can be taken which lowers the Hamming distance, but notice that the improvement in Hamming distance here (28%) is smaller than in the case of the same individual.

Figure 5. (a), (b) Images of two different people.

25.3  26.4  28.2  30.7  31.8  32.8  33.7  35.4  37.5
23.7  25.3  27.3  29.5  31.3  32.5  33.6  35.2  37.4
23.1  24.5  27.1  29.2  31.2  32.9  33.7  35.2  37.5
23.6  25.5  27.3  29.8  31.9  33.3  34.2  35.7  38.0
24.3  26.2  28.3  30.5  32.6  34.0  34.8  36.3  38.5
25.6  27.1  29.3  31.4  33.5  34.6  35.1  36.6  38.7
26.8  28.7  30.9  32.7  34.3  35.2  35.7  37.0  38.9
28.0  29.5  32.1  33.6  34.9  36.1  36.7  37.8  39.5
30.0  31.8  33.5  34.9  35.8  37.2  38.0  39.1  40.8

Figure 6. This diagram shows the Hamming distance (in percent) between the two images shown in Figure 5 with various amounts of shift.

Of course, for recognition purposes, it is desired that the distance between images of different people be as large as possible.
The conjecture here is that even though the shifting process decreases the distance between different people, it tends to do so by a lesser amount than the distance improvement for images of the same person. Hence, there is an overall increase in the separation ability of the classifier. The process of shifting the input image to obtain a better match between images (of the same individual taken at different times) is the basis for the post-processing algorithm which is described in the next section.

2. The Shifting Algorithm

This method requires the use of a pattern recognition scheme which can produce an ordered list of outputs ranked according to similarity to an input pattern. In our case, we use a simple nearest neighbor classification scheme and rank the outputs on the basis of Hamming distance (in the case of binary images) and city block distance (in the case of gray-scale images).

In this post-processing algorithm, we select two parameters: k, the number of patterns from the ordered list which will participate in the shifting process (i.e., the k patterns at the top of the list), and s, the maximum path length in the shifting process. To minimize computational time, k should be chosen as small as possible, but sufficiently large so that there is a high probability that the correct image is contained in this subset of patterns.

As illustrated in the previous section, each of the k candidate output patterns is shifted by 1 pixel in 4 possible directions: north, east, south, and west. After computing a similarity measure for each of these directions, we select the direction which yields the lowest distance and move the pattern in that direction. This process is repeated at most s times, or until there is no improvement in the distance measure in any direction. The final distance is recorded for each of the k patterns, and the image which has the least distance after this shifting process is taken as the output of the system (the best match).
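The greedy search described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names shifted_hamming, greedy_shift, and shift_rerank are hypothetical, images are assumed to be equal-sized lists of 0/1 rows, and the distance is assumed to be computed only over the region where the two images overlap. The shift is applied as an index offset, so no pixels are moved.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # binary image stored as rows of 0/1 pixels

# One-pixel moves as (row, column) offsets: north, south, west, east.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def shifted_hamming(a: Image, b: Image, dy: int, dx: int) -> float:
    """Hamming distance in percent between a and b offset by (dy, dx).

    The offset is used as an index into b, so b itself is never modified.
    Assumption: only the overlapping region is compared, and the offset is
    smaller than the image so that this region is not empty.
    """
    rows, cols = len(a), len(a[0])
    differing = compared = 0
    for r in range(max(0, dy), min(rows, rows + dy)):
        for c in range(max(0, dx), min(cols, cols + dx)):
            differing += a[r][c] != b[r - dy][c - dx]
            compared += 1
    return 100.0 * differing / compared


def greedy_shift(dist: Callable[[int, int], float],
                 s: int) -> Tuple[float, Tuple[int, int]]:
    """Follow the lowest-distance neighbor for at most s one-pixel moves.

    dist(dy, dx) returns the distance at a given offset. Four distances are
    evaluated per step; the search stops early when no neighbor improves.
    """
    pos = (0, 0)
    best = dist(0, 0)
    for _ in range(s):
        neighbors = [(dist(pos[0] + dy, pos[1] + dx),
                      (pos[0] + dy, pos[1] + dx)) for dy, dx in MOVES]
        d, p = min(neighbors)
        if d >= best:
            break
        best, pos = d, p
    return best, pos


def shift_rerank(test: Image, candidates: List[Image], s: int) -> int:
    """Return the index of the candidate with the lowest post-shift distance."""
    finals = [
        greedy_shift(lambda dy, dx, p=p: shifted_hamming(test, p, dy, dx), s)[0]
        for p in candidates
    ]
    return finals.index(min(finals))
```

Passing the distance as a function keeps the search independent of the similarity measure, so a city block distance could be substituted for gray-scale images. Applied to the values in Figure 4, treated as a lookup from offset to distance with rows running north to south (an assumption about the table layout), greedy_shift with s = 5 moves south twice and stops at 10.4%.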
Note that when computing the optimal path, it is not necessary to generate an entire grid of all possibilities, as shown in Figures 4 and 6. Rather, we need only compute those Hamming distances along the path (and those distances surrounding the optimal path). In addition, from a computational point of view, there is no need to actually shift the pixels of the image at each step. Rather, one need only store the current corner point of the shifted image and use that as an offset index when computing the required Hamming distance. Hence, this algorithm can be implemented very efficiently. In fact, the computational burden of computing the optimal path consists of computing 4 Hamming distances at each step (on the 72x72 images), along with finding the minimum of those 4 values.

Many recognition systems use several prototypes of the patterns to be recognized. For example, in the context of face recognition, it is possible to store several images of each person. We have constructed such a data set, containing 400 images of 100 different people. Each person has 4 images showing different facial expressions: blank, smiling, angry, and surprised. In this case, it is possible to select the top 10 winners (k = 10) and then also include all 4 expressions of each person as part of the shifting process. Hence, 40 images participate in the shifting post-processing algorithm. In the results below, this method is called Shift-10. Alternatively, we can simply select the top 40 images (k = 40) and apply the shift to these images. This method is referred to as Shift-40.

3. Results

We tested the performance of this post-processing algorithm on a data set consisting of 400 72x72-dimensional face images (100 different people), as described in the previous section. Note that an additional test image was taken of each person in the data set (a blank expression).
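The two candidate-selection variants evaluated here, Shift-10 and Shift-40, can be sketched as follows. The function names and the person_of lookup are hypothetical, and the sketch assumes that the "top 10 winners" are the first 10 distinct people in the ranked list, so that exactly 10 x 4 = 40 images are shifted; the text does not state how repeated people among the winners are handled.

```python
from typing import List, Sequence


def shift_40_candidates(ranked: Sequence[int], k: int = 40) -> List[int]:
    """Shift-40: the k best-ranked stored images take part in the shifting."""
    return list(ranked[:k])


def shift_10_candidates(ranked: Sequence[int], person_of: Sequence[int],
                        k: int = 10) -> List[int]:
    """Shift-10: every stored image of the k best-ranked people takes part.

    ranked    -- indices of stored images, best match first
    person_of -- person_of[i] is the identity of stored image i
    Assumption: winners are counted as distinct people.
    """
    winners: List[int] = []
    for idx in ranked:
        person = person_of[idx]
        if person not in winners:
            winners.append(person)
        if len(winners) == k:
            break
    return [i for i, person in enumerate(person_of) if person in winners]


# Toy data set: 3 people with 2 stored images each (the paper uses 100 x 4).
person_of = [0, 0, 1, 1, 2, 2]
ranked = [3, 2, 0, 5, 1, 4]
print(shift_40_candidates(ranked, k=3))             # [3, 2, 0]
print(shift_10_candidates(ranked, person_of, k=2))  # [0, 1, 2, 3]
```

Both variants hand the same number of images to the shifting step in the paper's setting. They differ in whether those images are the 40 closest individual images or all stored expressions of the 10 closest people.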
Figure 7 shows the results of the simulations before and after the post-processing algorithm on binary face images using two different classification methods: a Hamming distance method and a two-level decoupled Hamming network (Watta, Akkal, and Hassoun, 1997; Ikeda, Watta, and Hassoun, 1998). The numbers in column 1 indicate the percentage of the time that the correct person was ranked first in the list, and columns 2 and 3 indicate the percentage of the time that the correct person was ranked second and third, respectively. For these results, s was set at 5 and k was set at 40 (using the Shift-10 and Shift-40 approaches as described in the previous section).

Post Processing   Algorithm            1    2    3
None              Hamming Classifier   94   73   45
None              2-Level Hamming      79   58   41
Shift-10          Hamming Classifier   97   87   47
Shift-10          2-Level Hamming      97   88   53
Shift-40          Hamming Classifier   97   87   46
Shift-40          2-Level Hamming      98   87   46

Figure 7. Results of the post-processing algorithms for binary face images.

Figure 8 shows the results of the post-processing algorithms when the gray-scale images are used. Here, 3 different recognition schemes are used: a nearest neighbor classifier (using the city block distance as the similarity measure), the two-level decoupled Hamming network, and a wavelet face recognition algorithm (Stollnitz, DeRose, and Salesin, 1995; Jacobs, Finkelstein, and Salesin, 1995).

The results indicate that the proposed post-processing shifting algorithm improves overall system performance. Most notable is the improvement for the patterns which appear second and third in the ordered list. For example, for binary images using the Hamming distance, the correct person was present second on the list only 76% of the time, but after applying the shifting algorithm, the correct person appeared there 90% of the time.
Hence, this post-processing algorithm could be used in the context of a sensor fusion scheme whereby final classification is made on the basis of the information present in several of the top matching patterns, rather than just the best matching pattern. Future publications will explore the use of this shifting algorithm in a more practical recognition system, which includes a mechanism for rejecting an image when there is an insufficient match between the input and one of the memory patterns.

Post Processing   Algorithm            1    2    3
None              Hamming Classifier   94   82   52
None              2-Level Hamming      94   75   52
None              Wavelet Classifier   97   80   62
Shift-10          Hamming Classifier   97   89   65
Shift-10          2-Level Hamming      95   91   70
Shift-10          Wavelet Classifier   98   93   76
Shift-40          Hamming Classifier   98   89   65
Shift-40          2-Level Hamming      95   85   61
Shift-40          Wavelet Classifier   97   90   75

Figure 8. Results of the post-processing algorithms for gray-scale images.

Acknowledgments

This work was supported by the National Science Foundation (NSF) under contract ECS9618597.

References

1. Ikeda, N., Watta, P., and Hassoun, M. (1998). "Capacity Analysis of the Two-Level Decoupled Hamming Associative Memory," Proceedings of the IEEE International Conference on Neural Networks, ICNN'98, May 4-9, 1998, Anchorage, Alaska, pp. 486-491.
2. Jacobs, C., Finkelstein, A., and Salesin, D. (1995). "Fast Multiresolution Image Querying," Proceedings of SIGGRAPH 95, in Computer Graphics Proceedings, Annual Conference Series, pp. 277-286, August 1995, Los Angeles, CA.
3. Stollnitz, E., DeRose, T., and Salesin, D. (1995). "Wavelets for Computer Graphics: A Primer, Part 1," IEEE Computer Graphics and Applications, 15(3), pp. 76-84.
4. Watta, P., Akkal, M., and Hassoun, M. (1997). "Decoupled Voting Hamming Associative Memory Networks," Proceedings of the IEEE International Conference on Neural Networks, ICNN'97, June 9-12, 1997, Houston, Texas, pp. 1188-1193.