J Intell Robot Syst
DOI 10.1007/s10846-009-9391-1
Real-Time Warning System for Driver Drowsiness
Detection Using Visual Information
Marco Javier Flores · José María Armingol · Arturo de la Escalera
Received: 3 November 2008 / Accepted: 18 November 2009
© Springer Science + Business Media B.V. 2009
Abstract Traffic accidents due to human error cause many deaths and injuries around the world. To help reduce these fatalities, this research presents a new module for Advanced Driver Assistance Systems (ADAS) that automatically detects driver drowsiness using visual information and Artificial Intelligence. The system locates, tracks and analyzes the face and the eyes to compute a drowsiness index, working under varying light conditions and in real time. Examples of images of drivers taken in a real vehicle are shown to validate the algorithm.
Keywords Driver’s drowsiness · Neural networks · Support vector machine · Gabor filter · Artificial intelligence · ADAS · Computer vision
1 Introduction
ADAS is part of the active safety systems that interact closely with drivers to help them avoid traffic accidents. Its goal is to reduce traffic accidents by incorporating new technologies that increase vehicle safety and, at the same time, reduce the dangerous situations that may arise during driving due to human error. In this scenario, vehicular safety research focuses on driver analysis; in particular, drowsiness and distraction are studied most intensely [4].
M. J. Flores · J. M. Armingol (B) · A. de la Escalera
Intelligent Systems Laboratory, Universidad Carlos III de Madrid,
C/. Butarque 15, 28991, Leganés, Madrid, Spain
Drowsiness appears in situations of stress and fatigue in an unexpected and inopportune way. It may be produced by sleep disorders, certain types of medication and even boredom, for example when driving for a long time. Sleepiness diminishes the level of vigilance, which produces dangerous situations and increases the probability of an accident.
It has been estimated that drowsiness causes between 10% and 20% of traffic accidents in which drivers are killed [31] or injured [11], whereas in the trucking industry 57% of fatal truck accidents are attributed to it [2,22]. Fletcher et al. [12] go further and state that 30% of all traffic accidents are caused by drowsiness, and Brandt et al. [4] present statistics in which 20% of all accidents are caused by fatigue and inattention. In the USA, drowsiness is responsible for 100,000 traffic accidents whose costs are about $12,000 million [28]. In Germany, one in four traffic accidents originates in drowsiness, in England 20% of all traffic accidents are produced by drowsiness [16], and in Australia 1,500 million dollars has been spent on this problem [24].
In this context, it is important to use new technologies to design and build systems that monitor drivers and measure their level of attention throughout the whole driving process. Fortunately, people in a state of drowsiness produce several visual cues that can be detected on the human face:
Yawn frequency,
Eye-blinking frequency,
Eye-gaze movement,
Head movement, and
Facial expressions.
Taking advantage of these visual characteristics, computer vision is a feasible and appropriate technology for this problem. This article presents the drowsiness detection system of the IVVI (Intelligent Vehicle based on Visual Information) vehicle [1]. The goal of this system is to estimate driver drowsiness automatically and to prevent drivers from falling asleep while driving.
The paper is organized as follows. Section 2 presents an extended state of the art divided by light conditions. Section 3 introduces the proposed method, which consists of face and eye detection, face and eye tracking, and a drowsiness index based on a support vector machine. Finally, Section 4 presents the results and conclusions.
2 Related Work
To increase traffic safety and reduce the number of traffic accidents, numerous universities, research centers, automotive companies (Toyota, Daimler Chrysler, Mitsubishi, etc.) and governments (European Union, etc.) are contributing to the development of ADAS for driver analysis [2], using different technologies. In this sense, the use of visual information to determine the driver’s drowsiness state and to understand his/her behavior is an active research field.
This problem requires recognizing human behavior in a state of sleepiness through the analysis of the eyes and the face (head). This is a difficult task, even for humans, because many factors are involved, for instance changing illumination conditions and a variety of possible face poses. With regard to illumination, the state of the art has been divided into two parts: systems that work with natural daylight, and systems that work with the help of near-infrared (NIR) illumination.
2.1 Systems with Daylight Illumination
Several systems have been built in recent years to analyze driver drowsiness. They usually simplify the problem so that it works only partially or under special environments. For example, D’Orazio et al. [10] have proposed an eye detection algorithm that searches for the eyes in the whole image, assuming that the iris is always darker than the sclera. Eye candidates are located with the Hough transform for circles and geometrical constraints, and are then passed to a neural network that classifies them as eyes or non-eyes. This system is able to classify the eyes as open or closed. The main limitations of this algorithm are that it is applicable only when the eyes are visible in the image and that it is not robust to changing illumination. Horng et al. [17] have shown a system that uses a skin color model over HSI space for face detection, edge information for eye localization and dynamical template matching for eye tracking. Using eyeball color information, it identifies the eye state and computes the driver’s state, i.e., asleep or alert; if the eyes are closed over five consecutive frames, the driver is considered to be dozing. Brandt et al. [4] have shown a system that monitors driver fatigue and inattention. It uses the Viola & Jones (VJ) method [34] to detect the driver’s face, and computes the driver state by applying an optical flow algorithm to the eyes and head. Tian and Qin [31] have built a system for verifying the driver’s eye state. Their system uses the Cb and Cr components of the YCbCr color space; it localizes the face region with a vertical projection function and the eye region with a horizontal projection function. Once the eyes are localized, the system computes the eye state using a complexity function. Dong and Wu [11] have presented a system for driver fatigue detection that uses a skin color model based on a bivariate Normal distribution and the Cb and Cr components of the YCbCr color space. After localizing the eyes, it computes a fatigue index from the eyelid distance to classify open and closed eyes; if the eyes are closed over five consecutive frames, the driver is regarded as dozing, as in the work of Horng et al. Branzan et al. [5] also present a system for drowsiness monitoring that uses template matching to analyze the eye state.
2.2 Systems with Infrared Illumination
For night-time light conditions, Ji et al. [21] and Ji and Yang [22] have presented a drowsiness detection system based on NIR illumination and stereo vision. This system localizes the eye position by using image differences based on the bright pupil effect. Afterwards, it computes the eyelid blink frequency and eye gaze to build two drowsiness indices: PERCLOS (percentage of eye closure over time) [28] and AECS (average eye closure speed). Bergasa et al. [2] have also developed a non-intrusive system using infrared illumination. This system computes the driver vigilance level using a finite state machine (FSM) [3] with six eye states that computes several indices, among them PERCLOS; it is also able to detect inattention through face pose analysis. Another work using this type of illumination is presented by Grace [14] for measuring slow eyelid closure. Systems using NIR illumination work well under stable lighting conditions [2,9]; however, this is a shortcoming for applications in real vehicles, where the light changes all the time. In this scenario, if the bright pupils disappear, it becomes difficult to detect the eyes.
3 System Design to Drowsiness Detection
This paper presents a system to detect the driver’s drowsiness that works on grayscale images. The scheme of the system is shown in Fig. 1, in which six modules are presented:
Face detection,
Eye detection,
Face tracking,
Eye tracking,
Drowsiness detection, and
Distraction detection.
Each of these parts is explained in the following subsections.
3.1 Face Detection
To localize the face, this system uses the VJ object detector, which is a machine learning approach for visual object detection. It relies on three elements to make an efficient object detector: the integral image, the AdaBoost technique and a cascade classifier [34]. Each of these elements is important for processing the images efficiently and in near real time with 90% correct detection. A further important aspect of this method is its robustness under changing light conditions. Its principal disadvantage, however, is that it cannot extrapolate and does not work appropriately when the face is not frontal to the camera axis, as happens when the driver moves his/her head; this shortcoming is analyzed later on.
Fig. 1 Algorithm scheme
Continuing with the algorithm description, when the driver’s face is detected, it is enclosed within a rectangle RI (region of interest), which is defined by its top-left corner coordinates $P_0 = (x_0, y_0)$ and bottom-right corner coordinates $P_1 = (x_1, y_1)$, as can be observed in Fig. 2a–c. The rectangle size comes from experimental analysis of the face database that has been created for this task.
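Of the three elements of the VJ detector named above, the integral image is the easiest to illustrate. The following is a minimal sketch in plain Python, not the implementation used in this work: the function names and the toy 4 by 4 image are assumptions made for illustration. It shows how the sum of any rectangle, and therefore a Haar-like feature, is obtained from four table lookups.

```python
def integral_image(img):
    """Return the summed-area table of a 2D list of numbers.

    The table has one extra row and column of zeros, so that
    ii[r][c] is the sum of all pixels above row r and left of column c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, in four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


def two_rect_feature(ii, x0, y0, x1, y1):
    """Haar-like feature: upper half of the rectangle minus lower half."""
    ym = (y0 + y1) // 2
    return rect_sum(ii, x0, y0, x1, ym) - rect_sum(ii, x0, ym, x1, y1)


# Toy image (hypothetical values), indexed as image[y][x].
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(image)
center_sum = rect_sum(ii, 1, 1, 3, 3)          # pixels 6, 7, 10 and 11
top_minus_bottom = two_rect_feature(ii, 0, 0, 4, 4)
```

Because every rectangle costs the same four lookups regardless of its size, a cascade can evaluate many such features per window, which is what makes near real-time detection plausible.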
3.2 Eye Detection
Localizing the eye position is a difficult task because different features define the same eye depending on, for example, the area of the image where it appears or its iris color; the main problem during driving, however, is the changing ambient light conditions.
Once the face has been located through the rectangle RI in the previous section, two rectangles containing the eyes are obtained using the face anthropometric properties [13], which come from face database analysis. Preliminarily, this system uses $RI_L$ for the left eye rectangle and $RI_R$ for the right eye rectangle, as can be seen in the following four equations and in Fig. 3.
where $w = x_1 - x_0$ and $h = y_1 - y_0$.
Fig. 2 Viola & Jones method
Fig. 3 Eye rectangles $RI_R$ and $RI_L$
After the previous step, the exact position of each eye is searched for by incorporating information from grey-level pixels. The main idea is to obtain a random sample from the pixels that belong to the eye area and then to fit a parametric model. Figure 4 shows this procedure, in which a random sample is extracted in (a) and an elliptic model is fitted in (b). The procedure is independent of the eye state, i.e., the eye can be open or closed.
Fig. 4 a random sample, b eye parametric model
To extract the random sample, the following algorithm is proposed. Let $I(x,y) \in [0,255]$ be the pixel value at position (x, y); then:
Generate the image J by means of the following equation:
where m and σ are the mean and the standard deviation, respectively. These parameters are computed over the eye rectangles located previously.
Generate the image K using Eq. 6:
$$K(x,y) = \begin{cases} J(x,y) - 256\,\delta_1 & \text{if } J(x,y) \ge 0 \\ 256\,\delta_2 + J(x,y) & \text{if } J(x,y) < 0 \end{cases} \qquad (6)$$
where $\delta_1 = \max(0, \mathrm{ceil}(J(x,y)/256) - 1)$, $\delta_2 = \max(1, \mathrm{ceil}(|J(x,y)|/256))$ and ceil(x) is the function that returns the smallest integer not less than x.
Obtain the binary image B from image K through Eq. 7, namely,
$$B(x,y) = \begin{cases} 255 & \text{if } K(x,y) \ge \kappa \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$
where κ is an automatic threshold computed by Otsu’s method [29], Fig. 5b.
Compute the gradient image G using the Sobel horizontal (Sx) and vertical (Sy) edge operators, followed by an image contrast enhancement [20], Fig. 5c.
Compute the logarithm image L [35] in order to enhance the iris pixels, which are the central part of the eye, Fig. 5d:
$$L(x,y) = \log(1 + I(x,y)) \qquad (9)$$
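The steps of Eqs. 6, 7 and 9 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the comparison in Eq. 7 is taken as "greater than or equal", the logarithm is taken as natural, and the Otsu routine returns the smallest gray level assigned to the bright class so that it can be used directly as κ. The gradient image G and the image J are not covered.

```python
import math


def wrap_to_byte_range(j):
    """Eq. 6: fold an arbitrary value J(x, y) back towards the 8-bit range."""
    if j >= 0:
        delta1 = max(0, math.ceil(j / 256) - 1)
        return j - 256 * delta1
    delta2 = max(1, math.ceil(abs(j) / 256))
    return 256 * delta2 + j


def otsu_kappa(values, levels=256):
    """Otsu's method on integer gray levels in [0, levels - 1].

    Returns kappa such that values >= kappa form the bright class and the
    between-class variance is maximal.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # intensity sum of those pixels
    for t in range(levels):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t + 1


def binarize(values, kappa):
    """Eq. 7, assuming the bright class is K >= kappa."""
    return [255 if v >= kappa else 0 for v in values]


def log_image(values):
    """Eq. 9 with the natural logarithm (the base is an assumption)."""
    return [math.log(1 + v) for v in values]


k_values = [wrap_to_byte_range(j) for j in (300, -10, -300, 0)]
pixels = [10, 12, 11, 200, 210, 205]   # hypothetical gray levels
kappa = otsu_kappa(pixels)
binary = binarize(pixels, kappa)
```

In this toy sample the three dark and three bright pixels are separated cleanly, which is the situation in which an automatic threshold is most useful for isolating the iris region.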
From the pixels extracted from the images B, G and L, it is possible to obtain the random sample mentioned previously. This sample has an elliptical shape, and an elliptic model has been fitted to it by using the expectation maximization (EM) algorithm [26]. Special attention has been given to the ellipse center because it provides the exact position of the eye center. The ellipse axes determine the width and height of the eye. The result is shown in Fig. 6b.
Fig. 5 Eye location through $R_L$ and $R_R$, a grayscale image, b binary image (B), c gradient image (G), and d logarithm image (L)
Fig. 6 Expectation maximization algorithm over the spatial distribution of the eye pixels, a eye image, b ellipse parameters: center, axes and inclination angle, c–f other examples of this procedure
The main reason for using the pixel information through a random sample is that head movement, illumination changes, etc. do not allow complete eye pixel information to be obtained; only partial information about the eye is available in the images B, G and L, in which an elliptic shape prevails. This random information makes it feasible to use an algorithm that computes the parameters of a function approximating the elliptical shape of the eye. EM computes the mean, the variance and the correlation of the X and Y coordinates that belong to the eye. The initial parameters for EM are obtained from a regression model fitted by the least squares method. The number of EM iterations is fixed at 10, and the sample size is at least one third of the area of the rectangle $RI_R$. These parameters are used in the eye state analysis below.
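The link between the estimated moments and the ellipse can be sketched as follows. This is a simplified illustration, not the EM procedure itself: for a single Gaussian component the maximum-likelihood estimates are the sample mean and covariance, which are computed here in closed form. The function name, the `n_std` scaling of the axes and the sample points are assumptions.

```python
import math


def ellipse_from_points(points, n_std=2.0):
    """Fit a 2D Gaussian to (x, y) points and return the matching ellipse.

    Returns (center, semi_axes, angle), with the major semi-axis first and
    the angle of the major axis in radians measured from the x axis.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n

    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)

    # Orientation of the eigenvector of the larger eigenvalue.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    semi_axes = (n_std * math.sqrt(lam_major), n_std * math.sqrt(lam_minor))
    return (mx, my), semi_axes, angle


# Hypothetical eye pixels, wider than tall.
sample = [(0, 0), (4, 0), (2, 1), (2, -1), (1, 0), (3, 0)]
center, axes, angle = ellipse_from_points(sample)
```

The center gives the eye position, and the two semi-axes give quantities comparable to the eye width and height that are reused in the eye state analysis.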
3.3 Tracking
There are several reasons for tracking. The first is the problems found with the VJ method during this research. The second is the need to track the face and the eyes continuously from frame to frame. The third is to satisfy the real-time requirement by reducing the search space. The tracking process has been developed using the Condensation algorithm (CA) in conjunction with neural networks (NN) for face tracking and with template matching for eye tracking.
3.3.1 The Condensation Algorithm
This contribution implements the Condensation algorithm, which was proposed by Isard and Blake [18,19] for tracking active contours using a stochastic approach. CA combines factored sampling (a Monte Carlo sampling method) with a dynamical model that is governed by the state equation, Eq. 10,
where $X_t$ is the state at instant t and f(·) is a nonlinear function that depends on the previous state plus white noise. The goal is to estimate the state vector $X_t$ with the help of system observations, which are realizations of the stochastic process $Z_t$ governed by the measurement equation:
where $Z_t$ is the system measurement at time t and h(·) is another nonlinear function that depends on the present state plus white noise. The processes $\xi_t$ and $\eta_t$ are both white noise and are independent of each other. In general, these processes are non-Gaussian and multi-modal. It must be pointed out that $X_t$ is an unobservable underlying stochastic process.
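One iteration of the Condensation scheme can be sketched for a scalar state as follows. This is a toy illustration and not the tracker of this work: it assumes the additive forms $X_t = X_{t-1} + \xi_t$ and $Z_t = X_t + \eta_t$ with Gaussian noise, and the function name, noise levels and particle count are hypothetical.

```python
import math
import random


def condensation_step(particles, weights, measurement, rng,
                      process_std=1.0, meas_std=2.0):
    """One Condensation iteration for a scalar state.

    1. Factored sampling: draw particles in proportion to their weights.
    2. Prediction: push each sample through the assumed dynamics.
    3. Measurement: re-weight each sample by the observation density.
    """
    n = len(particles)
    resampled = rng.choices(particles, weights=weights, k=n)
    predicted = [x + rng.gauss(0.0, process_std) for x in resampled]
    likelihood = [math.exp(-0.5 * ((measurement - x) / meas_std) ** 2)
                  for x in predicted]
    total = sum(likelihood)
    new_weights = [w / total for w in likelihood]
    estimate = sum(x * w for x, w in zip(predicted, new_weights))
    return predicted, new_weights, estimate


rng = random.Random(0)
n = 500
particles = [rng.uniform(0.0, 100.0) for _ in range(n)]  # diffuse prior
weights = [1.0 / n] * n
true_position = 40.0
for _ in range(10):
    z = true_position + rng.gauss(0.0, 2.0)  # simulated noisy observation
    particles, weights, estimate = condensation_step(particles, weights, z, rng)
```

Because the posterior is carried by weighted samples rather than by a mean and covariance, the same loop works when the noise is non-Gaussian or the density is multi-modal; only the likelihood line would change.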
3.3.2 Neural Networks
McCulloch and Pitts proposed the first model of an artificial neuron in 1943, based on the corresponding biological neuron [25]. Since then, neural networks have evolved and have been used in a wide variety of pattern recognition and classification problems in engineering and the social sciences [27,33]. Figure 7 shows several face examples used for training a backpropagation neural network.
Before training the neural network, a preprocessing step that consists of two parts is necessary:
Contrast modification using gamma correction, given by Eq. 12, with γ = 0.8, which has been determined experimentally [30].
Removal of the contour points through an AND operation with the mask of Fig. 8a.
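The two preprocessing parts can be sketched as follows. The power-law form used for the gamma correction is the common normalized one and is an assumption about Eq. 12; the function names, the toy image and the toy mask are hypothetical. Only γ = 0.8 comes from the text.

```python
GAMMA = 0.8  # value reported in the text


def gamma_correct(pixel, gamma=GAMMA, max_value=255):
    """Power-law contrast change of a gray level in [0, max_value]."""
    return round(max_value * (pixel / max_value) ** gamma)


def apply_mask(image, mask):
    """Bitwise AND with a mask whose entries are 0 (discard) or 255 (keep)."""
    return [[p & m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Hypothetical 2 x 3 face patch and mask.
face = [[10, 128, 250],
        [64, 200, 32]]
mask = [[0, 255, 0],
        [255, 255, 255]]

corrected = [[gamma_correct(p) for p in row] for row in face]
masked = apply_mask(corrected, mask)
# Characteristic vector: the gray levels of the masked image, row by row.
features = [p for row in masked for p in row]
```

With γ below 1 the mid-range gray levels are brightened while 0 and 255 are left unchanged, and the masked pixels contribute zeros to the characteristic vector.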
After that, the characteristic vector that consists of the gray-level values of the
pixels coming from the face image is extracted. The rate of classification subsequent
Fig. 7 Examples of a face database which contains faces in different orientations: a left profile, b front view, c right profile, d down profile, and e up profile
Fig. 8 Mask for face training and its result
3.3.3 Face Tracking
As mentioned previously, the VJ method has problems detecting faces when they deviate from the nominal position and orientation; face tracking has been developed to correct this disadvantage. To illustrate this shortcoming, Fig. 9 shows several time instants where the VJ method does not find the driver’s face, and Fig. 10 presents an extended example in which the true position and the VJ position are plotted over a frame sequence. The true position has been obtained manually.
The chief problem of the VJ method is that it is only able to localize the human face when it is in a frontal position with respect to the camera. This drawback leads to unreliable driver analysis throughout the driving process, which is highly dynamic, for example when the driver looks at the mirror. To correct this problem, an efficient tracker has been implemented using CA in conjunction with a backpropagation neural network.
Through recursive probabilistic filtering of the incoming image stream, the state of the driver’s face is estimated for each time step t. It is characterized by its position, velocity and size. Let $(x_c, y_c)$ represent the position of its center, $(u_c, v_c)$ its velocity in the x and y directions, and (w, h) its size in pixels. In the same way, the measurement vector is given by Eq. 14.
Fig. 9 The driver’s face is not found by the Viola & Jones method at several time instants
Fig. 10 Example where the VJ method does not find the driver’s face in a 100-frame sequence
The dynamics of the driver’s face are modeled as a second-order autoregressive process AR(2), according to Eq. 15,
where A is the transition matrix proposed in [19] and $\xi_t$ represents the system perturbation at time t. The most difficult part of CA is evaluating the observation density function. In this contribution, the weight $\pi_t^{(j)}$ for j = 1, ..., N at time t is computed from a neural network value in the range [0,1], which gives an approximation of face versus non-face, in conjunction with the distance with respect to the face being tracked. This is similar to the work of Satake and Shakunaga [32], who used sparse template matching to compute the weight $\pi_t^{(j)}$ of the sample $s_t^{(j)}$ for j = 1, ..., N. In this contribution, the neural network value is used as an approximate value for the weights.
Fig. 11 One time step of the Condensation algorithm, a predicted region, b particle regions
Fig. 12 Trajectory of the real and estimated face center in a 100-frame sequence using the proposed tracker
The density function of the initial state is $p(x_0) = N(z_0, \Sigma_0)$, where $z_0$ is computed by the VJ method and $\Sigma_0$ is given in [22]. Figure 11b depicts a particle representation and Fig. 12 shows the tracking process, in which the green circle is the true position and a red cross represents a particle or hypothesis, whereas Fig. 13 shows the probability over time. This tracker is highly flexible because the neural network is trained with faces and non-faces in different head orientations and under various illumination conditions. Table 1 presents more results over several sequences of drivers’ faces. The sequences come from the drivers’ database, which was recorded for these experiments. The true position of the faces has been obtained manually.
Fig. 13 Estimated value of the a posteriori density of the face center in a 100-frame sequence using the proposed tracker; the face is detected in the fourth frame
Table 1 Result of face tracking
Driver   Total frames   Tracking failure   Correct rate (%)
D1       960            60                 93.75
D2       900            22                 97.55
D3       500            45                 91.00
D4       330            15                 95.45
D5       1400           50                 96.42
3.3.4 Eye Tracking
For this task, the state of the eye is characterized by its position and velocity in the image. Let (x, y) represent the eye pixel position at time t and (u, v) its velocity at time t in the x and y directions, respectively. The state vector at time t can therefore be represented by Eq. 16.
The transition model is given by Eq. 17, which is a first-order autoregressive model.
The observation density function is evaluated by a template matching strategy [32] that was truncated to reduce false detections. CA is initialized when the eyes are detected with the method from the previous section plus white noise, similarly to the case of face tracking. Figure 14 depicts the eye trajectory tracking and Fig. 15 shows the computed values of the a posteriori density function of each eye, both on a sequence of 100 images, whereas Table 2 shows the eye tracking results obtained on several image sequences.
Fig. 14 Trajectory of the real and estimated eye centers in a 100-frame sequence
Fig. 15 Estimated value of the a posteriori density of the eye center in a 100-frame sequence for the right and left eyes; the eyes are detected in the fourth frame
3.4 Eye State Detection
To identify drowsiness through eye analysis, it is necessary to know the eye state, open or closed, and to analyze it over time, i.e., to measure the time spent in each state. Classifying the open and closed states is complex because of the changing shape of the eye, the changing position and rotation of the face, and variations in blinking and illumination, among other factors. All this makes it difficult to analyze the eye reliably. For these reasons, a supervised classification method, namely a support vector machine (SVM), has been used for this challenging task. Figure 16 presents the scheme proposed for eye state verification.
Table 2 Result of eye tracking
Driver   Total frames   Tracking failure   Correct rate (%)
D1       960            20                 97.91
D2       900            30                 96.60
D3       500            8                  98.40
D4       330            14                 95.75
D5       1400           90                 93.57
Fig. 16 SVM schema for eye state verification
3.4.1 Support Vector Machine
SVM classification [6,8,15] is rooted in statistical learning theory and pattern classifiers. It uses a training set $S = \{(x_i, y_i) : i = 1, \dots, m\}$, where $x_i$ is the characteristic vector in $\mathbb{R}^n$, $y_i \in \{1, 2\}$ represents the class (in this case 1 for open eyes and 2 for closed eyes), and m is the number of elements of S. From the training set, a hyperplane is built that separates the two classes and minimizes the empirical risk function [15].
Mathematically, SVM consists of finding the best solution to the following optimization problem:
where e is an m-by-1 vector, C is an upper bound, Q is an m-by-m matrix with $Q_{ij} = y_i y_j K(x_i, x_j)$, and $K(x_i, x_j)$ is the kernel function. By solving this quadratic programming problem, SVM tries to maximize the margin between the data points of the two classes and to minimize the training errors simultaneously. Figure 17 depicts the mapping of the input space to a high-dimensional feature space through a nonlinear transformation and its maximization process.
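The symbols e, C and Q used above coincide with those of the dual C-SVC problem stated in the LIBSVM documentation [6]. Assuming that this is the formulation intended here, it reads:

```latex
\[
\min_{\alpha}\; \frac{1}{2}\,\alpha^{T} Q\,\alpha - e^{T}\alpha
\qquad \text{subject to} \qquad
y^{T}\alpha = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \dots, m
\]
```

In that formulation e is the vector of all ones and the class labels are coded as +1 and -1, so the labels 1 and 2 used in the text would have to be mapped to that coding before Q is built.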
Fig. 17 SVM representation
3.4.2 Eye Characteristic Extraction Using Gabor Filter
The Gabor filter was used by Daugman for image analysis, varying the orientation and scale [9,23]. Gabor filters are multi-scale and multi-orientation kernels, and can be defined by the complex function of Eq. 19:
$$g(x, y, \theta, \phi) = \exp\left(-\frac{x^2 + y^2}{\sigma^2}\right) \exp\left(i\, 2\pi\theta\,(x\cos\phi + y\sin\phi)\right) \qquad (19)$$
where θ and φ are the scale and orientation parameters and σ is the standard deviation of the Gaussian kernel, which depends on the spatial frequency to be measured, i.e., θ. The response of the Gabor filter to an image is obtained by a 2D convolution operation. Let I(x, y) denote the image and G(x, y, θ, φ) denote the response of a Gabor filter with scale θ and orientation φ at point (x, y) on the image plane. G(·) is obtained by Eq. 20:
$$G(x, y, \theta, \phi) = \iint I(p, q)\, g(x - p,\, y - q,\, \theta, \phi)\, dp\, dq \qquad (20)$$
Some combinations of scales and orientations are more robust for the classification between open and closed eyes. Three scales and four orientations have been used to generate Fig. 18; they are {1, 2, 3} and {0, π/4, π/2, 3π/4}, and were obtained experimentally over an image of size 30 by 20.
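Eqs. 19 and 20 can be sketched in plain Python as follows, with the integral replaced by a sum over a finite window. This is an illustration only: the function names, the window radius, σ = 4 and the frequency of 0.1 cycles per pixel are hypothetical values chosen for the synthetic stripe image, and are not the scales used in this work.

```python
import cmath
import math


def gabor(x, y, theta, phi, sigma):
    """Complex Gabor value of Eq. 19 at offset (x, y)."""
    envelope = math.exp(-(x * x + y * y) / sigma ** 2)
    carrier = cmath.exp(1j * 2 * math.pi * theta
                        * (x * math.cos(phi) + y * math.sin(phi)))
    return envelope * carrier


def gabor_response(image, cx, cy, theta, phi, sigma, radius):
    """Discrete version of Eq. 20 at pixel (cx, cy), truncated to a window."""
    rows, cols = len(image), len(image[0])
    total = 0j
    for q in range(max(0, cy - radius), min(rows, cy + radius + 1)):
        for p in range(max(0, cx - radius), min(cols, cx + radius + 1)):
            total += image[q][p] * gabor(cx - p, cy - q, theta, phi, sigma)
    return total


# Synthetic 30 x 20 image with vertical stripes of period 10 pixels.
FREQ = 0.1
stripes = [[100.0 * math.cos(2 * math.pi * FREQ * p) for p in range(30)]
           for _ in range(20)]

along = gabor_response(stripes, 15, 10, FREQ, 0.0, 4.0, 10)
across = gabor_response(stripes, 15, 10, FREQ, math.pi / 2, 4.0, 10)
```

The filter tuned to the stripe direction (φ = 0) responds much more strongly than the one rotated by π/2, which is the orientation selectivity that makes some scale and orientation combinations more discriminative than others.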
Fig. 18 Gabor filter for θ = {0, 1, 2} and φ = {0, π/4, π/2, 3π/4}
Fig. 19 Sub-window images from the Gabor filter
Once the response of a Gabor filter is obtained, the eye characteristic vector is extracted by a sub-window procedure described by Chen and Kubo [7] and denoted by $d \in \mathbb{R}^{360}$. This vector is computed by Eq. 21 over each sub-window of size 5 by 6. Figure 19 shows the sub-window diagram.
)i=1,...,20 (21)
For this work, a training set consisting of open and closed eyes has been built. The images come from diverse sources, were taken under several illumination conditions and show subjects of different races. A further important aspect of this eye database is that it contains images of different eye colors, i.e., blue, black and green; see Fig. 20.
Prior to SVM training, each image is preprocessed by histogram equalization, followed by a median filter and a sharpening filter. The median filter is used to reduce the image noise, whereas the sharpening filter is used to enhance the borders.
The main objective of SVM training is to find the parameters and the kernel that best minimize Eq. 18. After several training experiments, it was decided to use the RBF kernel, i.e., $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, with C = 30 and γ = 0.0128; these parameters reach a high training classification rate of about 93%.
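For concreteness, the RBF kernel with the reported γ can be written as a short function. This is a sketch with a hypothetical function name and toy vectors; it evaluates the kernel only and does not train an SVM.

```python
import math

C = 30          # upper bound reported in the text
GAMMA = 0.0128  # RBF parameter reported in the text


def rbf_kernel(xi, xj, gamma=GAMMA):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


same = rbf_kernel([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # identical vectors
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])             # squared distance 25
```

The kernel equals 1 for identical vectors and decays with squared distance at a rate set by γ, so a small γ such as 0.0128 keeps distant characteristic vectors from being treated as completely dissimilar.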
Fig. 20 Examples of eye database
Table 3 Result of eye state analysis
Driver   Total frames   Eyes open   Eyes closed   Correct rate (%)
D1       960            690/700     258/260       98.90
D2       900            520/560     339/340       96.27
D3       500            388/400     99/100        98.00
D4       330            150/170     152/160       91.61
D5       1,400          891/980     401/420       93.19
Table 3 presents results of this method computed over several sequences of drivers. It shows a high rate of correct classification.
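The correct rates in Table 3 agree, to within rounding, with the average of the open-eye and closed-eye recognition rates rather than with the pooled fraction of correctly classified frames. This is an inference from the tabulated numbers, not a definition given in the text; the function name below is hypothetical.

```python
def mean_class_rate(open_ok, open_total, closed_ok, closed_total):
    """Average of the two per-class recognition rates, in percent."""
    open_rate = 100.0 * open_ok / open_total
    closed_rate = 100.0 * closed_ok / closed_total
    return (open_rate + closed_rate) / 2.0


# (open correct, open total, closed correct, closed total, reported rate)
table3 = {
    "D1": (690, 700, 258, 260, 98.90),
    "D2": (520, 560, 339, 340, 96.27),
    "D3": (388, 400, 99, 100, 98.00),
    "D4": (150, 170, 152, 160, 91.61),
    "D5": (891, 980, 401, 420, 93.19),
}

recomputed = {driver: mean_class_rate(*row[:4]) for driver, row in table3.items()}
```

Averaging per class keeps the more frequent open-eye frames from dominating the score, which matters because closed-eye frames are the ones that trigger the drowsiness alarm.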
3.5 Drowsiness Index
The eye-blinking frequency is an indicator that allows the driver’s drowsiness (fatigue) level to be measured. As in the works of Horng et al. [17] and Dong and Wu [11], if five consecutive frames (0.25 s) are identified as eye-closed, the system issues an alarm cue; PERCLOS [28] is also implemented in this system.
Figure 21 presents an instantaneous result of this system on a driver’s image, whereas Fig. 22 shows the evolution of the drowsiness index for a sequence of a drowsy driver.
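Both indices can be sketched from a per-frame sequence of eye states. This is a minimal illustration with hypothetical function names; the five-frame run comes from the text, whereas the PERCLOS window length is an assumption, since it is not stated here.

```python
def consecutive_closed_alarm(closed_flags, min_run=5):
    """True if at least `min_run` consecutive frames are eye-closed."""
    run = 0
    for closed in closed_flags:
        run = run + 1 if closed else 0
        if run >= min_run:
            return True
    return False


def perclos(closed_flags, window):
    """Percentage of eye-closed frames within the last `window` frames."""
    recent = closed_flags[-window:]
    return 100.0 * sum(recent) / len(recent)


# 1 = eye classified as closed, 0 = open (hypothetical sequences).
dozing = [0] * 10 + [1] * 5 + [0] * 5
blinking = [0, 1, 1, 1, 1, 0, 1, 1, 1, 1]

alarm_dozing = consecutive_closed_alarm(dozing)
alarm_blinking = consecutive_closed_alarm(blinking)
perclos_dozing = perclos(dozing, window=20)
perclos_blinking = perclos(blinking, window=10)
```

The second sequence never reaches five closed frames in a row and so raises no alarm, yet its PERCLOS is high; this is why the two indices are complementary.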
3.6 Distraction
Distraction may also cause traffic accidents; it is estimated to be the cause of about 20% of them [4]. To detect distraction, the driver’s face should be studied because the pose of the face contains information about one’s attention, gaze and level of fatigue [22]. To verify the driver’s distraction, this contribution has implemented the following procedure.
Fig. 21 System instantaneous result
Fig. 22 Drowsiness index graph in a 900-frame sequence of a drowsy driver, a PERCLOS, b Horng-Dong and Wu index
Fig. 23 Face orientation
Fig. 24 Head-orientation monitoring over time in a 100-frame sequence
3.6.1 Face Orientation
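The driver’s face orientation is estimated from the eye positions through Eq. 22,
where $\Delta x = x_2 - x_1$, $\Delta y = y_2 - y_1$, and $(x_1, y_1)$ and $(x_2, y_2)$ correspond to the left and right eye positions. Equation 23 presents the classification limits. Figure 23 depicts an example of face orientation, whereas Fig. 24 shows an extended example of the driver’s face orientation estimated from the eyes over a sequence of images.
$$\text{Orientation} = \begin{cases} \text{Left} & \text{if } \theta > 8^{\circ} \\ \text{Front} & \text{if } |\theta| \le 8^{\circ} \\ \text{Right} & \text{if } \theta < -8^{\circ} \end{cases} \qquad (23)$$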
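The classification of Eq. 23 can be sketched as follows. The angle is assumed here to be the arctangent of Δy/Δx, i.e., the inclination of the line joining the two eyes; that form, the function names and the sample eye coordinates are assumptions. Only the 8-unit limits come from Eq. 23, read as degrees.

```python
import math

LIMIT_DEG = 8.0  # classification limit of Eq. 23, read as degrees


def face_angle_deg(left_eye, right_eye):
    """Inclination of the line joining the eyes, in degrees."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def classify_orientation(theta_deg, limit=LIMIT_DEG):
    """Apply the three cases of Eq. 23 to an angle in degrees."""
    if theta_deg > limit:
        return "Left"
    if theta_deg < -limit:
        return "Right"
    return "Front"


# Hypothetical (x, y) eye centers in pixels.
front = classify_orientation(face_angle_deg((100, 120), (160, 122)))
left = classify_orientation(face_angle_deg((100, 120), (160, 140)))
right = classify_orientation(face_angle_deg((100, 140), (160, 120)))
```

Using `atan2` rather than a plain division avoids a division by zero when the two eye centers share the same x coordinate.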
Fig. 25 a IVVI vehicle, b processing system, c driver’s camera
Fig. 26 Different stages of the proposed algorithm at several time instants, driving conditions and different drivers
3.6.2 Head Tilt
The above method has a problem when using a monocular camera. To correct this drawback, this contribution implements head-tilt detection based on neural networks. Recall that the driver’s face database is made up of examples of faces in five orientations, so the face is passed to the neural network to determine its orientation, especially for the up and down cases. If the system detects that the face position is not frontal, an alarm cue is issued to alert the driver to a dangerous situation.
4 Conclusions
In this paper, a research project to develop a non-intrusive driver drowsiness detection system based on Computer Vision and Artificial Intelligence has been presented. This system uses advanced technologies to analyze and monitor the driver’s eye state in real time and in real driving conditions. According to the results presented in Tables 1, 2 and 3, the proposed algorithms for face tracking, eye detection and eye tracking are robust and accurate under varying light, external illumination interference, vibrations, changing backgrounds and facial orientations.
To acquire data for developing and testing the algorithms, several drivers were recruited and exposed to a variety of difficult situations commonly encountered on the roadway. These experiments support the robustness and efficiency of the system in real traffic scenes. The images were taken with the camera inside the IVVI vehicle, Fig. 25c. IVVI is an experimental platform used to develop driver assistance systems in real driving conditions. It is a Renault Twingo vehicle, Fig. 25a, equipped with a processing system, Fig. 25b, which processes the information coming from the cameras. Finally, Fig. 26 shows an example that validates this system.
For future work, the objective is to reduce the error rate, i.e., to reduce false alarms; to this end, further experiments will be carried out with additional drivers and new modules.
Acknowledgements This work was supported in part by the Spanish Government through the CICYT projects VISVIA (Grant TRA2007-67786-C02-02) and POCIMA (Grant TRA2007-67374-
References
1. Armingol, J.M., de la Escalera, A., Hilario, C., Collado, J., Carrasco, J., Flores, M., Pastor, J.,
Rodriguez, F.: IVVI: intelligent vehicle based on visual information. Robot. Auton. Syst. 55,
904–916 (2007). doi:10.1016/j.robot.2007.09.004
2. Bergasa, L., Nuevo, J., Sotelo, M., Vazquez, M.: Real time system for monitoring driver vigilance.
In: IEEE Intelligent Vehicles Symposium, Parma, 14–17 June 2004
3. Brookshear, J.G.: Theory of Computation: Formal Languages, Automata and Complexity.
Addison Wesley Iberoamericana, Reading (1993)
4. Brandt, T., Stemmer, R., Mertsching, B., Rakotomirainy, A.: Affordable visual driver monitoring
system for fatigue and monotony. IEEE Int. Conf. Syst. Man Cybern. 7, 6451–6456 (2004)
5. Branzan, A., Widsten, B., Wang, T., Lan, J., Mah, J.: A computer vision-based system for real-
time detection of sleep onset in fatigued drivers. In: IEEE Intelligent Vehicles Symposium,
pp. 25–30 (2008)
6. Chang, C., Lin, C.: LIBSVM: a library for support vector machine (2001). www.csie.ntu.
7. Chen, Y.W., Kubo, K.: A robust eye detection and tracking technique using Gabor filters.
In: Third International Conference on Intelligent Information Hiding and Multimedia Signal
Processing, IEEE, vol. 1, pp. 109–112 (2007)
8. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and other Kernel-
Based Learning Methods. Cambridge University Press, Cambridge (2000)
9. Daugman, J.G.: Uncertainty relation for resolution in space, spatial frequency and orientation
optimized by two-dimensional cortial filters. J. Opt. Soc. Am. 2(7), 1160–1169 (1985)
10. D’Orazio, T., Leo, M., Distante, A.: Eye detection in faces images for a driver vigilante system.
IEEE Intelligent Vehicles Symposium University of Parma, Italy, 14–17 June (2004)
11. Dong, W., Wu, X.: Driver fatigue detection based on the distance of eyelid. In: IEEE Int.
Workshop VLSI Design & Video Tech., Suzhou, China (2005)
12. Fletcher, L., Petersson, L., Zelinsky, A.: Driver assistance systems based on vision in and out of
vehicles. In: IEEE Proceedings of Intelligent Vehicles Symposium, pp. 322–327 (2003)
13. Gejgus, P., Sparka, M.: Face Tracking in Color Video Sequences. The Association for Computing
Machinery Inc., New York (2003)
14. Grace, R.: Drowsy driver monitor and warning system. International Driving Symposium on
Human Factors in Driver Assessment, Training and Vehicle Design (2001)
15. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.A.: Feature Extraction: Foundations and Applica-
tions. Springer, Berlin (2006)
16. Hagenmeyer, L.: Development of a multimodal, universal human–machine-interface for
hypovigilance-management-systems. Ph.D. thesis, University of Stuttgart (2007)
17. Horng, W., Chen, C., Chang, Y.: Driver fatigue detection based on eye tracking and dynamic
template matching. In: Proceedings of the IEEE International Conference on Networking,
Sensing & Control (2004)
18. Isard, M., Blake, A.: Condensation: conditional density propagation for visual tracking. Int. J.
Comput. Vis. 29(1), 5–28 (1998). doi:10.1023/A:1008078328650
J Intell Robot Syst
19. Isard, M.A.: Visual motion analysis by probabilistic propagation of conditional density. Ph.D.
thesis, Oxford University (1998)
20. Jafar, I., Ying, H.: A new method for image contrast enhancement based on automatic specifica-
tion of local histograms. IJCSNS Int. J. Computer Sci. Netw. Secur. 7(7), 1–10 (2007)
21. Ji, Q., Zhu, Z., Lan, P.: Real time nonintrusive monitoring and prediction of driver fatigue. IEEE
Trans. Veh. Technol. 53(4), 1052–1068 (2004). doi:10.1109/TVT.2004.830974
22. Ji, Q., Yang, X.: Real-time eye, gaze, and face pose tracking for monitoring driver vigilance.
Real-Time Imaging 8, 357–377 (2002)
23. Liu, C.: Gabor-based kernel PCA with fractional power polynomial models for face recognition.
IEEE Trans. Pattern Anal. Mach. Intell. 26(5), 572–581 (2004)
24. Longhurst G.: Understanding Driver Visual Behaviour. Seeing Machine Pty Limited, Acton
25. Looney, C.G.: Pattern Recognition Using Neural Networks, Theory and Algorithms for
Engineers and Scientists. Oxford University Press, Oxford (1997)
26. McLachlan, G.J.: The EM Algorithm and Extensions. Wiley, New York (1997)
27. Mujtaba I.M.: Application of Neural Networks and Other Learning Technologies in Process
Engineering. Imperial College Press, London (2001)
28. NHTSA: evaluation of techniques for ocular measurement as an index of fatigue and the ba-
sis for alertness management. Final report DOT HS 808762, National Highway Traffic Safety
Administration, Virginia 22161, USA (1998)
29. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man
Cybern. 9, 62–66 (1979). doi:10.1109/TSMC.1979.4310076
30. Parker, J.R.: Practical Computer Vision Using C. Wiley, New York (1994)
31. Tian, Z., Qin, H.: Real-time driver’s eye state detection. In: IEEE International Conference on
Vehicular Electronics and Safety, pp. 285–289 (2005)
32. Satake, J., Shakunaga, T.: Multiple target tracking by appearance-based condensation tracker
using structure information. In: Proceedings of the 17th International Conference on Patter
Recognition (ICPR’04), vol. 3, pp. 294–297 (2004)
33. Swingler, K.: Applying Neural Networks: A Practical Guide. Academic, New York (1996)
34. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In:
Conference on Computer Vision and Pattern Recognition (2001)
35. Wu, Y., Liu, H., Zha, H.: A new method of detecting human eyelids based on deformable
templates. In: IEEE International Conference on Systems, Man and Cybernetics, pp. 604–609