Tamkang Journal of Science and Engineering, Vol. 11, No. 1, pp. 65-72 (2008)

A Real-Time Driver Fatigue Detection System Based on Eye Tracking and Dynamic Template Matching

Wen-Bing Horng* and Chih-Yuan Chen
Department of Computer Science and Information Engineering, Tamkang University, Tamsui, Taiwan 251, R.O.C.
*Corresponding author. E-mail: horng@mail.tku.edu.tw

Abstract

This paper presents a vision-based real-time driver fatigue detection system for driving safety. Based on skin colors, the driver's face is located in a color video captured in a car. Edge detection is then employed to locate the regions of the driver's eyes, which are used as the templates for eye tracking in subsequent frames. Finally, the tracked eye images are used for fatigue detection in order to generate warning alarms for driving safety. The proposed system was tested on a Pentium M 1.4 GHz notebook with 512 MB RAM. The experimental results were quite encouraging and promising: the system could reach more than 30 frames per second for eye tracking, and the average correct rate of eye location and tracking achieved 96.0% on five test videos. Although the average precision rate of fatigue detection was 89.3%, the correct detection rate achieved 100% on the test videos.

Key Words: Intelligent Transportation System, Driver Fatigue Detection, Eye Detection, Face Detection, Template Matching

1. Introduction

As the standard of living has improved over the past few decades, many families possess their own vehicles, and people often drive for business or leisure trips. However, long-distance driving usually makes drivers exhausted, so driver fatigue has become an important factor in traffic accidents. In order to reduce such accidents, many countries have begun to pay attention to the driving safety problem. Some psychologists [1,2] have investigated drivers' mental states in relation to driving safety, and many other researchers [3-6,10-19] have proposed various auxiliary mechanisms to improve driving safety.

Cho et al. [3] designed a special car seat that improves the alertness of the driver by adjusting the temperature and humidity around the driver's thighs and hips to reduce drowsiness. Chieh et al. [4] monitored changes in the driver's grip force on the steering wheel to detect fatigue. Wilson and Bracewell [5] analyzed EEG (electroencephalogram) recordings with neural networks to monitor alertness; similar research was performed by Vuckovic et al. [6] to detect drowsiness. However, these methods require special sensors attached to the human body, which seems impractical during driving.

Other researchers have applied image processing [7] and computer vision [8,9] techniques to detect driver fatigue from images captured by a camera. Some of them [10-14] utilized infrared CCD cameras to locate the driver's face and eye positions easily. However, infrared cameras are very expensive, which limits their use, and the influence of infrared light on human eyes is not yet known. To be more practical, other researchers [15-19] employed ordinary CCD cameras to perform fatigue detection. However, most of those driver fatigue detection methods suffer from the illumination change problem. In addition, they might not be suitable for real-time applications due to their complicated computations.
In this paper, we propose a vision-based real-time driver fatigue detection system based on an ordinary CCD camera that copes with the above drawbacks for practical use and improved driving safety. For the illumination change problem, we adopt the HSI color model to represent face skin, not only because the model separates the intensity from the hue of a color, but also because face colors have a fixed distribution range on the hue component. Since eye regions are rich in edges, we use the Sobel edge operator and the projection technique to locate the eyes' positions. The obtained eye images are then used as the templates for eye tracking in subsequent frames, instead of the static eye images from a database as used in [16]; these dynamic templates better reflect the actual driver's eyes and thus increase the accuracy of eye tracking. Finally, the tracked eye images are used to determine whether the eyes are open or closed, which is the basis of the driver fatigue detection used in this paper.

The rest of the paper is organized as follows. In Section 2, we briefly review the related driver fatigue detection methods based on ordinary CCD cameras. In Section 3, we present a new vision-based real-time driver fatigue detection system, including face detection, eye detection, eye tracking, and fatigue detection. The experimental results of the system are analyzed in Section 4. Finally, the paper is concluded in the last section.

2. Related Driver Fatigue Detection Methods Based on Ordinary CCDs

In this section, we briefly review the related vision-based driver fatigue detection methods using ordinary CCD cameras.

In 1997, Eriksson and Papanikolopoulos [15] proposed a method to detect driver fatigue from ordinary videos. In this method, the symmetry of the face is used to detect the facial area of the driver in an image. Then, pixel differencing is performed to find the edges of the facial region, and horizontal projection is used to locate the vertical position of the eyes. Finally, a concentric circle template is used to locate and track the exact eye positions. Although facial symmetry is an obvious feature for an upright face, it usually fails to locate the correct face position when the face tilts, rotates, or is shadowed.

Instead of using the symmetric central-line method, Singh and Papanikolopoulos [16] later (1999) proposed another method to monitor driver fatigue. In this method, a Gaussian distribution of skin colors in the RGB color model is used to distinguish skin and non-skin pixels for face detection. Then, the Sobel vertical edge operator is used for eye detection on the driver's face, and a set of eye images stored as templates in a database is used for eye detection and tracking. Although the Gaussian skin-color distribution based on the RGB color model can model skin quite well, the method cannot get rid of the influence of illumination changes. In addition, the eye images in the database may be quite different from the driver's eyes, which reduces the accuracy of eye location.

In 2003, Wang et al. [17] proposed a method for monitoring driver fatigue behavior. As in [16], a Gaussian distribution of skin colors in the RGB color model is used to locate the face region. Then, the face is binarized with a fixed threshold to locate the eyes' positions under some geometric restrictions, and the eyes are tracked by Hausdorff-distance template matching.
Finally, the features of the eyes are analyzed by Gabor filters, and the states of the driver's eyes are classified by neural networks. Since the eyes are detected with a fixed threshold, this system might not resist illumination changes.

In 2004, Wang et al. [18] proposed another method for driver fatigue detection by monitoring mouth movement. Similarly, a Gaussian distribution of skin colors is used to locate the face region. Then, a Fisher linear classifier is used to detect the mouth of the driver, and a Kalman filter is used to track the mouth movement. However, mouth movement may not be a reliable feature for detecting drowsiness, because the mouth shape also changes when talking, eating, laughing, and so on; it is not easy to classify mouth movement into the correct drowsy state.

In 2005, Dong and Wu [19] presented a driver fatigue detection algorithm based on the distance between the eyelids. Instead of the RGB color model, this method uses a Gaussian distribution of skin colors in the YCbCr color model to detect the face region. Then, horizontal projection is used to locate the eyes' positions, and gray-scale correlation is used for eye tracking. Finally, the eyelid distance is calculated to detect fatigue. Although the method can resist the influence of illumination changes, it is not easy to distinguish closed eyes from open eyes by the small difference in eyelid distance, especially for persons with long eyelashes.

However, most of the above vision-based driver fatigue detection methods using ordinary CCD cameras cannot resolve the illumination change problem. Besides, they might not be practical for real-time applications due to their complicated and intensive computations.

3. The Proposed Driver Fatigue Detection System

The proposed system uses an ordinary color CCD camera mounted on the dashboard of a car to capture images of the driver for fatigue detection. The flow chart of the proposed fatigue detection system is depicted in Figure 1. The first image is used for initial face location and eye detection. If either of these detection procedures fails, the system proceeds to the next frame and restarts the detection processes. Otherwise, the current eye images, serving as the dynamic templates, are used for eye tracking on subsequent images. If eye tracking fails, the face location and eye detection processes restart on the present image. These procedures continue until there are no more frames. The detailed steps are described in the following subsections.

Figure 1. Flow chart of the proposed driver fatigue detection system.
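To make this control flow concrete, the following is a minimal Python sketch of the loop in Figure 1. The callables detect_face, detect_eyes, track_eyes, eyes_closed, and alarm are hypothetical stand-ins for the procedures of Sections 3.1 to 3.4; the closed-eye threshold of more than 5 consecutive frames anticipates the rule given in Section 3.4.

```python
def fatigue_detection_loop(frames, detect_face, detect_eyes,
                           track_eyes, eyes_closed, alarm):
    """Sketch of the control flow of Figure 1 (helper callables are
    hypothetical stand-ins for Sections 3.1-3.4)."""
    templates = None      # current dynamic eye templates (None = not found)
    closed_run = 0        # consecutive frames with closed eyes
    for frame in frames:
        if templates is None:
            face = detect_face(frame)                   # Section 3.1
            templates = detect_eyes(frame, face) if face is not None else None
        else:
            new = track_eyes(frame, templates)          # Section 3.3
            if new is None:                             # tracking failed:
                face = detect_face(frame)               # redetect on the
                new = (detect_eyes(frame, face)         # present image
                       if face is not None else None)
            templates = new
        if templates is None:
            continue                                    # try the next frame
        # Section 3.4: more than 5 consecutive closed-eye frames => dozing
        closed_run = closed_run + 1 if eyes_closed(templates) else 0
        if closed_run > 5:
            alarm()
```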
3.1 Face Detection

Digital images usually adopt the RGB color space to represent colors. However, a color in the RGB space encodes not only its hue but also its brightness, so two colors with the same hue but different intensities are viewed as two different colors by the human visual system. In order to accurately distinguish skin and non-skin pixels without being affected by shadows or lighting changes, the brightness factor must be excluded from colors. In the HSI color model, hue is independent of brightness, so this model is well suited for distinguishing skin and non-skin colors whether the face is shadowed or not; thus, it is used for face detection in this paper. The HSI color model has three components: hue, saturation, and intensity (or brightness).

The detailed conversion between the RGB space and the HSI model is given in [7]. Equations (1) to (3) show the conversion of a color from the RGB model to the HSI model. For any color with three components, R, G, and B, normalized to the range between 0 and 1, the corresponding H, S, and I components are computed as follows:

H = cos^-1 { [(R - G) + (R - B)] / [2 sqrt((R - G)^2 + (R - B)(G - B))] }  (1)

S = 1 - 3 min(R, G, B) / (R + G + B)  (2)

I = (R + G + B) / 3  (3)

When (B/I) > (G/I), H = 360° - H.

From the face images of the test videos, it is observed that the H values of skin colors fall in the range between 10° and 45°. Using this condition, it is easy to separate the face region, as shown in Figure 2(b). By performing a vertical projection on the skin pixels, the left and right boundaries of the face can be found where the projection values are greater than some threshold; in order to eliminate the ear regions on both sides of the face, this threshold is set to 50 pixels through experiments. Similarly, by performing a horizontal projection on the skin pixels within the left and right boundaries, the top and bottom boundaries can also be found, as shown in Figure 2(c). According to the normal position of the eyes in a face, it is reasonable to assume that the eyes lie in the block of the upper two fifths of the face region, as shown in Figure 2(d).

Figure 2. Result of face detection.

Figures 3(a) and 3(c) illustrate that even under uneven illumination on the face during driving, the proposed method based on the HSI color model can still correctly segment the face region.

Figure 3. Results of face detection under uneven illumination on the face during driving.

3.2 Eye Detection

After locating the eye region, the original color eye region is converted into gray scale by Equation (4) (refer to [7]):

Y = 0.299R + 0.587G + 0.114B  (4)

where Y is the gray value of a color (R, G, B). Then, the Sobel edge operator is used to compute the edge magnitude in the eye region in order to find the vertical position of the eyes. To save computation time, the edge magnitude G(x, y) is approximated by Equation (5):

G(x, y) ≈ |Sx| + |Sy|  (5)

where Sx and Sy are the horizontal and vertical gradient values obtained from the Sobel horizontal and vertical edge operators, respectively, and |Sx| denotes the absolute value of Sx; for the detailed calculation, refer to [7]. After computing the edge magnitude of each pixel in the eye region, an edge map E(x, y) is obtained by marking a pixel (x, y) as an edge pixel if its edge magnitude G(x, y) is greater than some threshold T, as defined in Equation (6):

E(x, y) = black, if G(x, y) > T; white, otherwise  (6)

where black and white stand for the black and white pixel values, respectively. The threshold T is set to 150 via several tests. The left part of Figure 4(b) illustrates the edge map of Figure 4(a). By performing a horizontal projection on the edge map, the exact vertical position of the eyes can be located at the peak, as shown in the right part of Figure 4(b). Sometimes the horizontal projection needs to be smoothed before locating the peak.

Figure 4. Results of edge detection and horizontal projection.
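To make Sections 3.1 and 3.2 concrete, here is a short NumPy/OpenCV sketch of the skin-color face localization and the edge-based vertical eye positioning, using the parameters given above (hue in [10°, 45°], projection threshold of 50 pixels, edge threshold T = 150). The function names and the small epsilon guard against division by zero are illustrative choices, not part of the original system.

```python
import numpy as np
import cv2

def hue_degrees(rgb):
    """Hue component of the HSI model, Eq. (1); rgb in [0, 1], shape (H, W, 3)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8  # illustrative guard
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(b > g, 360.0 - h, h)                    # Eq. (1) flip rule

def face_box(rgb, lo=10.0, hi=45.0, thresh=50):
    """Face boundaries via projections of the skin mask (Section 3.1);
    thresh = 50 is the paper's projection threshold in pixels."""
    hue = hue_degrees(rgb)
    skin = (hue >= lo) & (hue <= hi)
    xs = np.flatnonzero(skin.sum(axis=0) > thresh)          # vertical projection
    if xs.size == 0:
        return None
    left, right = xs[0], xs[-1]
    ys = np.flatnonzero(skin[:, left:right + 1].sum(axis=1) > thresh)
    if ys.size == 0:
        return None
    top, bottom = ys[0], ys[-1]
    # Eyes are assumed to lie in the upper two fifths of the face box.
    eye_bottom = top + 2 * (bottom - top) // 5
    return (left, right, top, bottom), (left, right, top, eye_bottom)

def eye_row(gray_eye_region, T=150):
    """Vertical eye position (Section 3.2): Sobel magnitude per Eq. (5),
    thresholded at T per Eq. (6), then the peak of the horizontal projection.
    gray_eye_region is an 8-bit grayscale image."""
    sx = cv2.Sobel(gray_eye_region, cv2.CV_32F, 1, 0, ksize=3)
    sy = cv2.Sobel(gray_eye_region, cv2.CV_32F, 0, 1, ksize=3)
    edge_map = (np.abs(sx) + np.abs(sy)) > T
    return int(np.argmax(edge_map.sum(axis=1)))             # row of the peak
```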
After locating the vertical position of the eyes, as shown in Figure 4(b), the left and right eye positions can be located by finding the largest connected components, starting from the center of the eye region, along the horizontal line at the vertical position of the eyes. In order to avoid marking noise, the total number of pixels in a connected component must exceed some threshold before it is regarded as a reasonable eye. After finding the connected component of an eye, a bounding box of the connected component on the original image is used to enclose the eye image, which is then used as the template for eye tracking in the subsequent frames. Figure 5 illustrates the bounding boxes of the detected eye positions.

Figure 5. Results of eye detection with bounding boxes.

3.3 Eye Tracking

After the eye templates are found, they are used for tracking the driver's eyes by template matching. The search area for matching is the original bounding box of an eye expanded by a reasonable number of pixels, t, say t = 10, in each of the four directions. Consider an eye template g(x, y) located at position (a, b) of the face image f(x, y). Equation (7) is used for the eye template matching:

M(p, q) = Σx Σy |f(p + x, q + y) - g(x, y)|  (7)

where the sums run over 0 ≤ x < w and 0 ≤ y < h, w and h are the width and height of the eye template g(x, y), and p and q are offsets along the x-axis and y-axis with (a - t) ≤ p ≤ (a + t) and (b - t) ≤ q ≤ (b + t). If M(p*, q*) is the minimum value within the search area, the point (p*, q*) is defined as the best matching position of the eye, and (a, b) is updated to the new position (p*, q*).

The newly updated eye templates must satisfy the following two constraints to confirm successful eye tracking:
(i) The ratio of width to height of each new eye template is between 1 and 4.
(ii) Since the movement of the driver's head cannot be very large between a pair of consecutive frames (1/30 second), the distance between the newly obtained templates and the previous templates must be less than 10 pixels.
Otherwise, the eye tracking is regarded as a failure, and the system restarts the face and eye location procedures.
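The following is a minimal sketch of the tracking step, assuming the sum-of-absolute-differences form of Eq. (7) as written above; the paper's two validity constraints are checked afterwards. Function names and the bounds guards are illustrative.

```python
import numpy as np

def match_eye(gray, template, a, b, t=10):
    """Eq. (7): search a +/- t pixel window around (a, b) for the offset
    that minimizes the sum of absolute differences with the eye template."""
    h, w = template.shape
    tpl = template.astype(np.int32)
    best_m, best_pq = None, (a, b)
    for p in range(a - t, a + t + 1):              # x offset
        for q in range(b - t, b + t + 1):          # y offset
            if p < 0 or q < 0:
                continue                           # window outside the image
            win = gray[q:q + h, p:p + w]
            if win.shape != tpl.shape:
                continue                           # window outside the image
            m = int(np.abs(win.astype(np.int32) - tpl).sum())
            if best_m is None or m < best_m:
                best_m, best_pq = m, (p, q)
    return best_pq

def tracking_valid(tpl_w, tpl_h, old_pq, new_pq):
    """Section 3.3 constraints: aspect ratio in [1, 4] and movement
    below 10 pixels between consecutive frames."""
    ratio_ok = 1.0 <= tpl_w / tpl_h <= 4.0
    dist_ok = np.hypot(new_pq[0] - old_pq[0], new_pq[1] - old_pq[1]) < 10
    return ratio_ok and dist_ok
```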
3.4 Fatigue Detection

At this stage, the colors of the eyeballs in the eye templates are used directly for fatigue detection. Since the much darker color of the eyeballs is a quite stable feature, the eye templates are inverted (negated) and then converted to the HSI color model; the originally darker eyeballs become brighter in the inverted image. It is observed that the saturation values of eyeball pixels normally fall in the range between 0.00 and 0.14. This observation is used to decide whether a pixel in an eye template is an eyeball pixel, as shown in Figure 6. When the eyes are open, there are some eyeball pixels, as shown in Figures 6(a) and 6(b); when the eyes are closed, there are none, as shown in Figures 6(c) and 6(d). By checking for eyeball pixels, it is easy to detect whether the eyes are open or closed.

Figure 6. Eyeball detection of open and closed eyes.

According to the experimental results shown in Figure 7, the duration of a normal eye blink (the small peak) is usually less than 4 frames at the capture rate of 30 frames per second. If the eyes stay closed for more than 5 frames, the car will travel more than 5.5 meters at a speed of 100 kilometers per hour on a highway; it is very dangerous for the driver to ignore the status of the road for such a period. For this reason, our system regards the driver as dozing off when the eyes stay closed for more than 5 consecutive frames. Of course, this parameter can be adjusted by users according to their needs.

Figure 7. Analysis of eye blinks.

4. Experimental Results

The proposed driver fatigue detection system used a Sony PC115 color DV camera to capture the driver's images. The system was tested on a Pentium M 1.4 GHz notebook with 512 MB RAM. The format of the input video is 320 × 240 true color. After the system starts, initial face location and eye detection take less than 40 milliseconds. Once the eye templates are found, eye tracking can achieve more than 30 frames per second. Recall that in this system, if the driver closes his/her eyes for more than 5 consecutive frames, the driver is regarded as dozing off.

Table 1 lists the results of eye tracking on five test videos; snapshots are shown in Figure 8. The first four videos were captured under different illumination conditions with different people and backgrounds. The last video, video 5, was captured while driving around a parking lot at nightfall with large illumination changes. The field Total Frames is the total number of frames in each video, and Tracking Failure is the number of frames on which eye tracking failed. The Correct Rate of eye tracking, defined in Equation (8), is the ratio of (Total Frames - Tracking Failure) to Total Frames:

Correct Rate = (Total Frames - Tracking Failure) / Total Frames × 100%  (8)

Table 1. Result of eye tracking

          Total Frames   Tracking Failure   Correct Rate
Video 1   2634           8                  99.7%
Video 2   1524           6                  99.6%
Video 3   2717           46                 98.3%
Video 4   433            7                  98.4%
Video 5   1399           280                80.0%
Total     8707           347                96.0%

Figure 8. Snapshots during tracking with different people under different illumination.

From Table 1, we can see that the correct rate of eye tracking is higher than 98.3% on the first four videos, and it still reaches 80.0% in the very demanding environment of video 5. The average correct rate of the proposed system on these test videos is 96.0%.

Table 2 shows the results of driver fatigue detection on the five test videos. The field Close Eyes is the number of times the driver closes and then opens the eyes in the video. Real Dozing is the number of dozing-off events as judged by humans. Generate Warning is the number of warning alarms generated by the fatigue detection. False Positive is the number of wrongly generated warning alarms, and False Negative is the number of undetected fatigue events. Correct Warning is the number of correctly detected fatigue events, which equals (Generate Warning - False Positive - False Negative). The Correct Rate of fatigue detection is defined in Equation (9), and the Precision Rate of fatigue detection in Equation (10):

Correct Rate = Correct Warning / Real Dozing × 100%  (9)

Precision Rate = Correct Warning / Generate Warning × 100%  (10)

Table 2. Result of fatigue detection

          Close Eyes   Real Dozing   Generate Warning   False Positive   False Negative   Correct Warning   Correct Rate   Precision Rate
Video 1   22           3             3                  0                0                3                 100%           100%
Video 2   18           4             4                  0                0                4                 100%           100%
Video 3   43           15            18                 3                0                15                100%           83.3%
Video 4   6            2             2                  0                0                2                 100%           100%
Video 5   3            1             1                  0                0                1                 100%           100%
Total     92           25            28                 3                0                25                100%           89.3%
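As a sanity check on Equations (8) to (10) as written above (the rate definitions are reconstructed here from the table values), the following snippet recomputes the video 3 entries of Tables 1 and 2.

```python
def tracking_correct_rate(total_frames, tracking_failure):
    """Eq. (8): correct rate of eye tracking."""
    return (total_frames - tracking_failure) / total_frames

def fatigue_rates(real_dozing, warnings, false_pos, false_neg):
    """Eqs. (9) and (10); Correct Warning = warnings - FP - FN."""
    correct = warnings - false_pos - false_neg
    return correct / real_dozing, correct / warnings  # correct rate, precision

# Video 3 of Tables 1 and 2:
assert round(100 * tracking_correct_rate(2717, 46), 1) == 98.3
correct_rate, precision = fatigue_rates(15, 18, 3, 0)
assert (correct_rate, round(100 * precision, 1)) == (1.0, 83.3)
```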
As can be seen from Table 2, the system could correctly detect all fatigue states in all the test videos. In video 3, the three false positives made by the system are due to the interference of the driver's hand, as shown in Figure 8(c). Even with the uneven illumination on the face in a moving car, as shown in Figures 8(e) to 8(h), the proposed system could still correctly detect and track the driver's eyes. In spite of this kind of interference and the illumination changes, the average precision rate of fatigue detection over all videos still reaches 89.3%.

5. Conclusion

This paper has presented a vision-based real-time driver fatigue detection system for driving safety. The system uses an ordinary CCD camera to capture the driver's images, which makes the method practical. Tested on a notebook with a Pentium M 1.4 GHz CPU and 512 MB RAM, the system reached an average correct rate of 96.0% for eye location and tracking on five videos under various illumination conditions, and achieved a 100% correct rate of fatigue detection with a precision rate of 89.3%. The system could process more than 30 frames per second for eye tracking, with each color frame of size 320 × 240. In summary, the proposed driver fatigue detection system copes with the illumination change problem and is suitable for real-time applications to improve driving safety.

Acknowledgement

The authors would like to thank the anonymous reviewers for their valuable comments. This research was partially supported by the National Science Council, Taiwan, R.O.C., under contract no. NSC 92-2213-E-032-031.

References

[1] Shinar, D., Psychology on the Road, John Wiley & Sons, Danvers, MA, U.S.A. (1979).
[2] Recarte, M. A. and Nunes, L. M., "Effects of Verbal and Spatial-Imagery Tasks on Eye Fixations While Driving," Journal of Experimental Psychology: Applied, Vol. 6, pp. 31-43 (2000).
[3] Cho, K. J., Roy, B., Mascaro, S. and Asada, H. H., "A Vast DOF Robotic Car Seat Using SMA Actuators with a Matrix Drive System," Proc. IEEE Robotics and Automation, New Orleans, LA, U.S.A., Vol. 4, pp. 3647-3652 (2004).
[4] Chieh, T. C., Mustafa, M. M., Hussain, A., Zahedi, E. and Majlis, B. Y., "Driver Fatigue Detection Using Steering Grip Force," Proc. IEEE Student Conference on Research and Development, Putrajaya, Malaysia, pp. 45-48 (2003).
[5] Wilson, B. J. and Bracewell, T. D., "Alertness Monitor Using Neural Networks for EEG Analysis," Proc. IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, Sydney, Australia, Vol. 2, pp. 814-820 (2000).
[6] Vuckovic, A., Popovic, D. and Radivojevic, V., "Artificial Neural Network for Detecting Drowsiness from EEG Recordings," Proc. IEEE Seminar on Neural Network Applications in Electrical Engineering, Belgrade, Yugoslavia, pp. 155-158 (2002).
[7] Gonzalez, R. C. and Woods, R. E., Digital Image Processing, Second Edition, Prentice Hall, Upper Saddle River, NJ, U.S.A. (2002).
[8] Davies, E. R., Machine Vision: Theory, Algorithms, Practicalities, Third Edition, Elsevier, Oxford, UK (2005).
[9] Forsyth, D. A. and Ponce, J., Computer Vision: A Modern Approach, Prentice Hall, Upper Saddle River, NJ, U.S.A. (2003).
[10] Gu, H., Ji, Q. and Zhu, Z., "Active Facial Tracking for Fatigue Detection," Proc. IEEE Workshop on Applications of Computer Vision, Orlando, FL, U.S.A., pp. 137-142 (2002).
[11] Ji, Q., Zhu, Z. and Lan, P., "Real-Time Nonintrusive Monitoring and Prediction of Driver Fatigue," IEEE Transactions on Vehicular Technology, Vol. 53, pp. 1052-1068 (2004).
[12] Zhu, Z. and Ji, Q., "Real-Time and Non-intrusive Driver Fatigue Monitoring," Proc. IEEE Intelligent Transportation Systems, Washington, D.C., U.S.A., pp. 657-662 (2004).
[13] Gu, H. and Ji, Q., "An Automated Face Reader for Fatigue Detection," Proc. IEEE Automatic Face and Gesture Recognition, Seoul, Korea, pp. 111-116 (2004).
[14] Ito, T., Mita, S., Kozuka, K., Nakano, T. and Yamamoto, S., "Driver Blink Measurement by the Motion Picture Processing and Its Application to Drowsiness Detection," Proc. IEEE Intelligent Transportation Systems, Singapore, pp. 168-173 (2002).
[15] Eriksson, M. and Papanikolopoulos, N. P., "Eye-Tracking for Detection of Driver Fatigue," Proc. IEEE Intelligent Transportation Systems, Boston, MA, U.S.A., pp. 314-319 (1997).
[16] Singh, S. and Papanikolopoulos, N. P., "Monitoring Driver Fatigue Using Facial Analysis Techniques," Proc. IEEE Intelligent Transportation Systems, Tokyo, Japan, pp. 314-318 (1999).
[17] Wang, R., Guo, K., Shi, S. and Chu, J., "A Monitoring Method of Driver Fatigue Behavior Based on Machine Vision," Proc. IEEE Intelligent Vehicles Symposium, Columbus, OH, U.S.A., pp. 110-113 (2003).
[18] Wang, R., Guo, L., Tong, B. and Jin, L., "Monitoring Mouth Movement for Driver Fatigue or Distraction with One Camera," Proc. IEEE Intelligent Transportation Systems, Washington, D.C., U.S.A., pp. 314-319 (2004).
[19] Dong, W. and Wu, X., "Driver Fatigue Detection Based on the Distance of Eyelid," Proc. IEEE VLSI Design and Video Technology, Suzhou, China, pp. 365-368 (2005).

Manuscript Received: Dec. 29, 2006
Accepted: Jul. 25, 2007