Tamkang Journal of Science and Engineering, Vol. 11, No. 1, pp. 65-72 (2008)
A Real-Time Driver Fatigue Detection System Based
on Eye Tracking and Dynamic Template Matching
Wen-Bing Horng* and Chih-Yuan Chen
Department of Computer Science and Information Engineering, Tamkang University,
Tamsui, Taiwan 251, R.O.C.
*Corresponding author. E-mail: horng@mail.tku.edu.tw
Abstract
This paper presents a vision-based real-time driver fatigue detection system for driving safety.
Based on skin colors, the driver’s face is located from a color video captured in a car. Then, edge
detection is employed to locate the regions of the driver’s eyes, which are used as the templates for eye
tracking in subsequent frames. Finally, the tracked eyes’ images are used for fatigue detection in order
to generate warning alarms for driving safety. The proposed system was tested on a Pentium M 1.4G
notebook with 512 MB RAM. The experimental results seemed quite encouraging and promising. The
system could reach more than 30 frames per second for eye tracking, and the average correct rate for
eye location and tracking could achieve 96.0% on five test videos. Though the average precision rate
of fatigue detection was 89.3%, the correct detection rate could achieve 100% on the test videos.
Key Words: Intelligent Transportation System, Driver Fatigue Detection, Eye Detection, Face Detection, Template Matching
1. Introduction
As the standard of living has improved over the past few decades, many families now own their own vehicles, and people often drive for business or leisure. However, long-distance driving easily exhausts drivers, and driver fatigue has become an important factor in traffic accidents. In order to reduce traffic accidents, many countries have begun to pay attention to driving safety.
Some psychologists [1,2] have investigated the drivers’
mental states relating to driving safety. Furthermore,
many other researchers [3-6,10-19] have proposed various auxiliary mechanisms to improve driving safety.
Cho et al. [3] designed a special car seat to improve
the alertness of the driver by adjusting the temperature
and humidity of the driver’s thighs and hips to reduce the
degree of drowsiness. Chieh et al. [4] monitored the grip
force change of the driver on the steering wheel to detect
fatigue. Wilson and Bracewell [5] analyzed EEG (electroencephalogram) recordings with neural networks to monitor alertness. Similar research was also performed by
Vuckovic et al. [6] to detect drowsiness. However, these
methods require special sensors attached to the human body, which is impractical during driving.
Other researchers have applied image processing [7] and computer vision [8,9] techniques to detect driver fatigue from images captured by a camera. Some of them [10-14] utilized infrared CCD cameras to easily locate the positions of the driver's face and eyes. However, infrared cameras are very expensive, which limits their use. In addition, the influence of infrared light on human eyes is not yet known. To be more practical, other researchers [15-19] employed ordinary CCD cameras for fatigue detection. However, most of these methods suffer from the illumination change problem. In addition, they might not be suitable for real-time applications due to their complicated computations.
In this paper, we propose a vision-based real-time
driver fatigue detection system based on ordinary CCD
cameras to cope with the above drawbacks for practical
usage to improve driving safety. To handle the illumination change problem, we adopt the HSI color model to represent facial skin, both because the model separates intensity from hue and because face colors fall in a fixed range of the hue component. Since the eye regions are rich in edges, we use the Sobel edge operator and a projection technique to locate the eyes' positions. The obtained eye images are then used as templates for eye tracking in subsequent frames, instead of the static eye images from a database as used in [16]; these dynamic templates are more faithful to the actual driver and thus increase the accuracy of eye tracking. Finally, the tracked eye images are used to determine whether the eyes are open or closed, which is the basis of the driver fatigue detection in this paper.
The rest of the paper is organized as follows. In Section 2, we briefly review the related driver fatigue detection methods based on ordinary CCD cameras. In Section 3, we present a new vision-based real-time driver fatigue detection system, including face detection, eye detection, eye tracking, and fatigue detection. The experimental results of the system are analyzed in Section 4.
Finally, the paper is concluded in the last section.
2. Related Driver Fatigue Detection Methods
Based on Ordinary CCDs
In this section, we briefly review the related vision-based driver fatigue detection methods using ordinary
CCD cameras. In 1997, Eriksson and Papanikolopoulos
[15] proposed a method to detect driver fatigue from ordinary videos. In this method, the symmetric property of
the face is used to detect the facial area of the driver on
an image. Then, pixel difference is performed to find the
edges on the facial region, and horizontal projection is
used to locate the vertical position of the eyes. Finally, a
concentric circle template is used to locate and track the
exact eyes' positions. Although face symmetry is an obvious feature of an upright face, this approach usually fails to locate the correct face position when the face tilts, rotates, or is shadowed.
Instead of using the symmetric central line method, Singh
and Papanikolopoulos (1999) [16] later proposed another method to monitor driver fatigue. In this method,
the Gaussian distribution of skin colors in the RGB color
model is used to distinguish skin and non-skin pixels for
face detection. Then, the Sobel vertical edge operator is
used for eye detection on the driver's face. In addition, a set of eye images stored in a database is used as templates for eye detection and tracking. Although the Gaussian distribution of skin colors in the RGB color model predicts skin quite well, the method cannot eliminate the effect of illumination changes. Moreover, the eye images in the database may differ considerably from the driver's actual eyes, which reduces the accuracy of locating eye positions.
In 2003, Wang et al. [17] proposed a method for monitoring driver fatigue behavior. Like [16], the Gaussian
distribution of skin colors in the RGB color model is
used to locate a face region. Then, the face is binarized
with a fixed threshold to locate the eyes’ positions with
some geometric restrictions and to track the eyes by using the Hausdorff distance template matching. Finally,
the features of the eyes are analyzed by the Gabor filters,
and the states of the driver's eyes are classified by neural networks. Since the eyes are detected with a fixed threshold, this system might not resist illumination changes.
In 2004, Wang et al. [18] proposed another method
for driver fatigue detection by monitoring mouth movement. Similarly, the Gaussian distribution of skin colors
is used to locate a face region. Then, a Fisher linear classifier is used to detect the mouth of the driver, and a
Kalman filter is used to track the mouth movement. However, mouth movement may not be a reliable feature for detecting drowsiness, because the mouth shape also changes when talking, eating, laughing, and so on. It is not easy to correctly classify mouth movement into a drowsy state.
In 2005, Dong and Wu [19] also presented a driver
fatigue detection algorithm based on the distance of eyelids. Instead of using the RGB color model, this method
utilizes the Gaussian distribution of skin colors based on
the YCbCr color model to detect the face region. Then,
horizontal projection is used to locate eyes’ positions,
and gray-scale correlation is used for eye tracking. Finally, the eyelid distance is calculated to detect fatigue. Although the method can resist the influence of illumination changes, it is not easy to distinguish closed eyes from open eyes by the small difference in eyelid distance, especially for persons with long eyelashes.
However, most of the above vision-based driver fatigue detection methods using ordinary CCD cameras
cannot resolve the illumination change problem. Besides, they might not be practical for real-time applications
due to their complicated and intensive computations.
3. The Proposed Driver Fatigue Detection
System
The proposed system uses an ordinary color CCD camera mounted on the dashboard of a car to capture images of the driver for fatigue detection. The flow chart of the proposed system is depicted in Figure 1. The first image is used for initial face location and eye detection. If either of these detection procedures fails, the system proceeds to the next frame and restarts the detection process. Otherwise, the current eye images are used as dynamic templates for eye tracking in subsequent images. If eye tracking fails, the face location and eye detection processes restart on the present image. These procedures continue until there are no more frames. The detailed steps are described in the following subsections.
3.1 Face Detection
Digital images usually adopt the RGB color space to represent colors. However, a color in the RGB space encodes not only its hue but also its brightness. Two colors with the same hue but different intensities would be viewed as different colors by the human visual system. In order to distinguish skin from non-skin pixels without being affected by shadows or lighting changes, the brightness factor must be excluded. Since hue is independent of brightness in the HSI color model, this model is well suited for distinguishing skin from non-skin colors whether or not the face is shadowed. Thus, it is used for face detection in this paper.
The HSI color model has three components: hue, saturation, and intensity (or brightness). For the detailed method of converting a color between the RGB space and the HSI model, refer to [7]. Equations (1) to (3) below show the result of converting a color from the RGB model to the HSI model. For any color with three components R, G, and B, normalized to the range between 0 and 1, the corresponding H, S, and I components are given as follows.

Figure 1. Flow chart of the proposed driver fatigue detection system.

$$H = \cos^{-1}\left\{\frac{\tfrac{1}{2}[(R-G)+(R-B)]}{[(R-G)^2+(R-B)(G-B)]^{1/2}}\right\} \quad (1)$$

$$S = 1 - \frac{3}{R+G+B}\min(R, G, B) \quad (2)$$

$$I = \frac{R+G+B}{3} \quad (3)$$

When (B/I) > (G/I), H = 360° − H.
From the face images of the test videos, it is observed that the H values of skin colors fall in the range between 10° and 45°. Using this condition, it is easy to separate the face region, as shown in Figure 2(b). By performing a vertical projection on the skin pixels, the right and left boundaries can be found where the projection values are greater than some threshold. In order to eliminate the ear regions on both sides of the face, the threshold is set to 50 pixels through experiments. Similarly, by performing a horizontal projection on the skin pixels within the left and right boundaries, the top and bottom boundaries can also be found, as shown in Figure 2(c). According to the normal position of the eyes in a face, it is reasonable to assume that the eyes will be located in the upper two fifths of the face region, as shown in Figure 2(d). Figures 3(a) and 3(c) illustrate that even under uneven illumination on the face during driving, the proposed method based on the HSI color model can still correctly segment the face region.

Figure 2. Result of face detection.
Figure 3. Results of face detection under uneven illumination on the face during driving.
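To make these steps concrete, the following Python/NumPy sketch implements the hue computation of Equations (1)-(3) and the projection-based face localization. The 10°-45° hue range, the 50-pixel projection threshold, and the upper-two-fifths eye block come from this section; the function names and the normalized-RGB input convention are our own assumptions, not the authors' code.

```python
import numpy as np

def rgb_to_hue(img):
    """HSI hue in degrees for an RGB image normalized to [0, 1],
    following Equations (1)-(3)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8  # avoid divide-by-zero
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(b > g, 360.0 - h, h)  # H = 360 - H when (B/I) > (G/I)

def face_and_eye_region(img, proj_thresh=50):
    """Locate the face via skin-hue projections; return the face box
    and the upper-two-fifths block where the eyes are assumed to lie."""
    hue = rgb_to_hue(img)
    skin = (hue >= 10) & (hue <= 45)             # skin hue range from the paper
    cols = skin.sum(axis=0)                      # vertical projection
    xs = np.where(cols > proj_thresh)[0]
    if xs.size == 0:
        return None
    left, right = xs[0], xs[-1]
    rows = skin[:, left:right + 1].sum(axis=1)   # horizontal projection
    ys = np.where(rows > proj_thresh)[0]
    if ys.size == 0:
        return None
    top, bottom = ys[0], ys[-1]
    eye_bottom = top + 2 * (bottom - top) // 5   # upper two fifths of the face
    return (left, top, right, bottom), (left, top, right, eye_bottom)
```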
3.2 Eye Detection
After locating the eye region, the original color eye region is converted to gray scale by Equation (4) (refer to [7]):

$$Y = 0.299R + 0.587G + 0.114B \quad (4)$$

where Y is the gray value of a color (R, G, B). Then, the Sobel edge operator is used to compute the edge magnitude in the eye region to find the vertical position of the eyes. In order to save computation time, an approximation of the edge magnitude G(x, y) is computed by Equation (5):

$$G(x, y) \approx |S_x| + |S_y| \quad (5)$$

where $S_x$ and $S_y$ are the horizontal and vertical gradient values obtained, respectively, from the Sobel horizontal and vertical edge operators, and $|S_x|$ denotes the absolute value of $S_x$. For the detailed calculation, refer to [7].
After computing the edge magnitude of each pixel in the eye region, an edge map E(x, y) is obtained by defining a pixel (x, y) as an edge pixel if its edge magnitude G(x, y) is greater than some threshold T, as defined in Equation (6):

$$E(x, y) = \begin{cases} \text{black}, & \text{if } G(x, y) > T \\ \text{white}, & \text{otherwise} \end{cases} \quad (6)$$

where black and white stand for the black pixel value and the white pixel value, respectively. The threshold T is set to 150 via several tests. The left part of Figure 4(b) illustrates the edge map of Figure 4(a).
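A minimal sketch of Equations (4)-(6) follows, using NumPy with SciPy's convolution as a stand-in for whatever convolution routine the authors used. It assumes 8-bit RGB input (values 0-255) so that the paper's threshold T = 150 applies directly.

```python
import numpy as np
from scipy.ndimage import convolve

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def edge_map(rgb, T=150):
    """Return a boolean edge map of the eye region (True = edge pixel)."""
    # Eq. (4): grayscale conversion
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    sx = convolve(gray, SOBEL_X)
    sy = convolve(gray, SOBEL_Y)
    g = np.abs(sx) + np.abs(sy)   # Eq. (5): |Sx| + |Sy| approximates the magnitude
    return g > T                  # Eq. (6): threshold into an edge map
```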
Figure 4. Results of edge detection and horizontal projection.

By performing a horizontal projection on the edge map, the exact vertical position of the eyes can be located at the peak, as shown in the right part of Figure 4(b). Sometimes the horizontal projection needs to be smoothed before locating the peak. After locating the vertical position of the eyes, the left and right eye positions can be found by locating the largest connected components along the horizontal line at that vertical position, starting from the center of the eye region. In order to avoid marking noise, the total number of pixels in a connected component must exceed some threshold for it to be regarded as a reasonable eye. After finding the connected component of an eye, a bounding box around the component on the original image is used to enclose the eye image, which serves as the template for eye tracking in subsequent frames. Figure 5 illustrates the bounding boxes of the detected eye positions.

Figure 5. Results of eye detection with bounding boxes.
3.3 Eye Tracking
After the eye templates are found, they are used for tracking the driver's eyes by template matching. The search area for matching is the original bounding box of an eye expanded by a reasonable number of pixels, t, say t = 10, in each of the four directions. Consider an eye template g(x, y) located at position (a, b) of the face image f(x, y). Equation (7) is used for eye template matching:

$$M(p, q) = \sum_{x=0}^{w-1}\sum_{y=0}^{h-1} \left|f(p + x, q + y) - g(x, y)\right| \quad (7)$$

where w and h are the width and height of the eye template g(x, y), and p and q are offsets along the x-axis and y-axis with (a − t) ≤ p ≤ (a + t) and (b − t) ≤ q ≤ (b + t). If M(p*, q*) is the minimum value within the search area, the point (p*, q*) is defined as the best matching position of the eye, and (a, b) is updated to the new position (p*, q*).
The newly updated eye templates must satisfy the following two constraints to confirm successful eye tracking (a code sketch of this tracking step follows the list):
(i) The ratio of width to height of each new eye template is between 1 and 4.
(ii) Since the movement of the driver's head cannot be very large between two consecutive frames (1/30 second), the distance between the newly obtained templates and the previous templates must be less than 10 pixels.
Otherwise, eye tracking is regarded as a failure, and the system restarts the face and eye location procedures.
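The matching step of Equation (7) can be sketched as an exhaustive sum-of-absolute-differences search over the ±t window; the SAD form and the function interface are our reading of the text, not the authors' code. The aspect-ratio and 10-pixel displacement checks from the constraints above would be applied to the returned position before accepting it.

```python
import numpy as np

def track_eye(frame, template, a, b, t=10):
    """Return the best-matching top-left position (p, q) of the gray-scale
    `template` inside `frame`, searching within t pixels of (a, b).
    (a, p) index columns (x-axis); (b, q) index rows (y-axis)."""
    h, w = template.shape
    best, best_pos = np.inf, (a, b)
    for p in range(max(0, a - t), min(frame.shape[1] - w, a + t) + 1):
        for q in range(max(0, b - t), min(frame.shape[0] - h, b + t) + 1):
            window = frame[q:q + h, p:p + w]
            # Eq. (7): sum of absolute differences (cast avoids uint8 overflow)
            m = np.abs(window.astype(int) - template.astype(int)).sum()
            if m < best:
                best, best_pos = m, (p, q)
    return best_pos
```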
3.4 Fatigue Detection
At this stage, the colors of the eyeballs in the eye templates are used directly for fatigue detection. Since the much darker color of the eyeballs is a quite stable feature, the eye templates are inverted (negated) and then converted to the HSI color model. The originally darker eyeballs become brighter in the inverted image. By observation, the saturation values of eyeball pixels normally fall in the range between 0.00 and 0.14. This observation is used to decide whether a pixel in an eye template is an eyeball pixel, as shown in Figure 6. When the eyes are open, there are some eyeball pixels, as shown in Figures 6(a) and 6(b). When the eyes are closed, there are none, as shown in Figures 6(c) and 6(d). By checking for eyeball pixels, it is easy to detect whether the eyes are open or closed.

Figure 6. Eyeball detection of open eyes and closed eyes.
Figure 7. Analysis of eye blinks.

According to the experimental results shown in Figure 7, the duration of a normal eye blink (the small peak) is usually less than 4 frames at a capture rate of 30 frames per second. If the eyes stay closed for more than 5 frames, the car will travel more than 5.5 meters at a speed of 100 kilometers per hour on a highway. It is very dangerous for the driver to ignore the road under such a condition. For this reason, our system regards the driver as dozing off when the eyes remain closed for over 5 consecutive frames. Of course, this parameter can be adjusted by users according to their needs.
4. Experimental Results
The proposed driver fatigue detection system used a Sony PC115 color DV camera to capture the driver's images. The system was tested on a Pentium M 1.4 GHz notebook with 512 MB RAM. The input video format is 320 × 240 true color. After starting the system, it took less than 40 milliseconds to perform initial face location and eye detection. Once the eye templates were found, eye tracking could run at more than 30 frames per second. Recall that in this system, if the driver closes his/her eyes for over 5 consecutive frames, the driver is regarded as dozing off.
Table 1 lists the results of eye tracking from five test
videos, as shown in Figure 8. The first four videos were
captured under different illumination conditions with different people and backgrounds. The last video, video 5,
was captured when driving around a parking lot at nightfall with large illumination change. The field Total Frames means the total number of frames in each video.
Tracking Failure is the number of frames in which eye tracking failed. Correct Rate of eye tracking, defined in Equation (8), is the ratio of (Total Frames − Tracking Failure) to Total Frames:

$$\text{Correct Rate} = \frac{\text{Total Frames} - \text{Tracking Failure}}{\text{Total Frames}} \times 100\% \quad (8)$$
From Table 1, we can see that the correct rate of eye
tracking is higher than 98.3% in the first four videos, and
it still can reach 80.0% in the very strict environment of
video 5. The average correct rate achieves 96.0% in the
proposed system on these test videos.
Table 2 shows the results of driver fatigue detection on the five test videos. The field Close Eyes represents the number of times the driver closes and then opens his/her eyes in the video. Real Dozing is the number of dozing-off events determined by human observers. Generate Warning is the number of fatigue detections that generate a warning alarm.
Table 1. Result of eye tracking

            Total Frames   Tracking Failure   Correct Rate
Video 1         2634               8              99.7%
Video 2         1524               6              99.6%
Video 3         2717              46              98.3%
Video 4          433               7              98.4%
Video 5         1399             280              80.0%
Total           8707             347              96.0%
Figure 8. Snapshots during tracking with different people under different illumination.
Table 2. Result of fatigue detection

           Close  Real    Generate  False     False     Correct  Correct  Precision
           Eyes   Dozing  Warning   Positive  Negative  Warning  Rate     Rate
Video 1     22      3        3         0         0         3      100%     100%
Video 2     18      4        4         0         0         4      100%     100%
Video 3     43     15       18         3         0        15      100%     83.3%
Video 4      6      2        2         0         0         2      100%     100%
Video 5      3      1        1         0         0         1      100%     100%
Total       92     25       28         3         0        25      100%     89.3%
False Positive is the number of wrongly generated warning alarms. False Negative is the number of fatigue events that were not detected. Correct Warning is the number of correctly detected fatigue events, which equals (Generate Warning − False Positive − False Negative). Correct Rate of fatigue detection is defined in Equation (9).
$$\text{Correct Rate} = \frac{\text{Correct Warning}}{\text{Real Dozing}} \times 100\% \quad (9)$$

Precision Rate of fatigue detection is defined in Equation (10).

$$\text{Precision Rate} = \frac{\text{Correct Warning}}{\text{Generate Warning}} \times 100\% \quad (10)$$

As can be seen from Table 2, the system correctly detected all fatigue states in all the test videos. In video 3, the three mistakes made by the system are due to interference from the driver's hand, as shown in Figure 8(c). Even with uneven illumination on the face in a moving car, as shown in Figures 8(e) to 8(h), the proposed system could still correctly detect and track the driver's eyes. In spite of this kind of interference and the illumination changes, the average precision rate of fatigue detection over all videos could still achieve 89.3%.

5. Conclusion
This paper has presented a vision-based real-time driver fatigue detection system for driving safety. The system uses an ordinary CCD camera to capture the driver's images, which makes the method practical. Tested on a notebook with a Pentium M 1.4 GHz CPU and 512 MB RAM, the system reached an average correct rate of 96.0% for eye detection on five videos under various illumination conditions, and achieved a 100% correct rate of fatigue detection with a precision rate of 89.3%. The system could run at more than 30 frames per second for eye tracking, with each color frame of size 320 × 240. In summary, the proposed driver fatigue detection system can cope with the illumination change problem and is suitable for real-time applications to improve driving safety.
Acknowledgement
The authors would like to thank the anonymous reviewers for their valuable comments. This research was partially supported by the National Science Council, Taiwan, R.O.C., under contract no. NSC 92-2213-E-032-031.
References
[1] Shinar, D., Psychology on the Road, John Wiley &
Sons, Danvers, MA, U.S.A. (1979).
[2] Recarte, M. A. and Nunes, L. M., “Effects of Verbal
and Spatial-Imagery Tasks on Eye Fixations While
Driving,” Journal of Experimental Psychology: Applied, Vol. 6, pp. 31-43 (2000).
[3] Cho, K. J., Roy, B., Mascaro, S. and Asada, H. H., “A
Vast DOF Robotic Car Seat Using SMA Actuators
with a Matrix Drive System,” Proc. IEEE Robotics
and Automation, New Orleans, LA, U.S.A., Vol. 4, pp.
3647-3652 (2004).
[4] Chieh, T. C., Mustafa, M. M., Hussain, A., Zahedi, E.
and Majlis, B. Y., “Driver Fatigue Detection Using
Steering Grip Force,” Proc. IEEE Student Conference
on Research and Development, Putrajaya, Malaysia,
pp. 45-48 (2003).
[5] Wilson, B. J. and Bracewell, T. D., “Alertness Monitor
Using Neural Networks for EEG Analysis,” Proc. IEEE
Signal Processing Society Workshop on Neural Networks for Signal Processing, Sydney, Australia, Vol. 2,
pp. 814-820 (2000).
[6] Vuckovic, A., Popovic, D. and Radivojevic, V., “Artificial Neural Network for Detecting Drowsiness from
EEG Recordings,” Proc. IEEE Seminar on Neural Network Applications in Electrical Engineering, Belgrade, Yugoslavia, pp. 155-158 (2002).
[7] Gonzalez, R. C. and Woods, R. E., Digital Image Processing, Second Edition, Prentice Hall, Upper Saddle
River, NJ, U.S.A. (2002).
[8] Davies, E. R., Machine Vision: Theory, Algorithms,
Practicalities, Third Edition, Elsevier, Oxford, UK
(2005).
[9] Forsyth, D. A. and Ponce, J., Computer Vision: A Modern Approach, Prentice Hall, Upper Saddle River, NJ, U.S.A. (2003).
[10] Gu, H., Ji, Q. and Zhu, Z., “Active Facial Tracking for
Fatigue Detection,” Proc. IEEE Workshop on Applications of Computer Vision, Orlando, FL., U.S.A., pp.
137-142 (2002).
[11] Ji, Q., Zhu, Z. and Lan, P., “Real-Time Nonintrusive
Monitoring and Prediction of Driver Fatigue,” IEEE
Transactions on Vehicular Technology, Vol. 53, pp.
1052-1068 (2004).
[12] Zhu, Z. and Ji, Q., “Real-Time and Non-intrusive Driver Fatigue Monitoring,” Proc. IEEE Intelligent Transportation Systems, Washington, D.C., U.S.A., pp.
657-662 (2004).
[13] Gu, H. and Ji, Q., “An Automated Face Reader for Fatigue Detection,” Proc. IEEE Automatic Face and Gesture Recognition, Seoul, Korea, pp. 111-116 (2004).
[14] Ito, T., Mita, S., Kozuka, K., Nakano, T. and Yamamoto, S., “Driver Blink Measurement by the Motion
Picture Processing and its Application to Drowsiness
Detection,” Proc. IEEE Intelligent Transportation
Systems, Singapore, pp. 168-173 (2002).
[15] Eriksson, M. and Papanikolopoulos, N. P., “Eye-Tracking for Detection of Driver Fatigue,” Proc. IEEE
Intelligent Transportation Systems, Boston, MA, U.S.A.,
pp. 314-319 (1997).
[16] Singh, S. and Papanikolopoulos, N. P., “Monitoring
Driver Fatigue Using Facial Analysis Techniques,”
Proc. IEEE Intelligent Transportation Systems, Tokyo, Japan, pp. 314-318 (1999).
[17] Wang, R., Guo, K., Shi, S. and Chu, J., “A Monitoring
Method of Driver Fatigue Behavior Based on Machine
Vision,” Proc. IEEE Intelligent Vehicles Symposium,
Columbus, Ohio, U.S.A., pp. 110-113 (2003).
[18] Wang, R., Guo, L., Tong, B. and Jin, L., “Monitoring Mouth Movement for Driver Fatigue or Distraction with One Camera,” Proc. IEEE Intelligent Transportation Systems, Washington, D.C., U.S.A., pp. 314-319 (2004).
[19] Dong, W. and Wu, X., “Driver Fatigue Detection Based on the Distance of Eyelid,” Proc. IEEE VLSI Design and Video Technology, Suzhou, China, pp. 365-368 (2005).
Manuscript Received: Dec. 29, 2006
Accepted: Jul. 25, 2007