Abstract—In an overnight driving simulation study, three
commercially available fatigue monitoring technology (FMT)
devices were tested for accuracy. Sixteen volunteers
performed driving tasks during eight sessions (40 min each)
separated by 15-minute breaks. The main output variable of
the FMT devices, the percentage of eye closure (PERCLOS),
the driving performance (standard deviation of lateral position
in lane, SDL), the electroencephalogram (EEG), and the
electrooculogram (EOG) were recorded during driving. In addition,
subjective sleepiness was self-rated on the Karolinska
sleepiness scale (KSS) every 2 min. As expected, the Pearson
product-moment correlation coefficient (PMCC) indicated a
significant linear dependence between KSS and PERCLOS as well
as between SDL and PERCLOS. However, when the PMCC was
estimated within smaller data segments (3 min) and without
averaging across subjects, the correlation coefficients
decreased strongly. To further validate PERCLOS at higher
temporal resolution, its ability to discriminate between mild
and strong fatigue was investigated and compared with the
results of the same analysis for EEG/EOG. Spectral-domain
features of both types of signals were classified using
support vector machines (SVM). The results suggest that
EEG/EOG indicate driver fatigue much better than PERCLOS.
Therefore, current FMT devices perform acceptably if temporal
resolution is low (> 20 min). However, even under laboratory
conditions, large errors have to be expected if fatigue is
estimated at the individual level and with high temporal resolution.
Both distracted and fatigued drivers cause crashes. It is
believed that both causations are underreported, since in
many cases there is no evidence of driver distraction or
fatigue at the scene of a crash. Moreover, it was observed
that drivers are reluctant to admit such impairments and that
their sense of responsibility decreases during fatigue.
Therefore, the detection of driver fatigue and distraction by
in-car FMT systems is urgently needed and will help to
overcome these problems. Hypovigilance is a deficit in
vigilance and leads to a decreased ability to maintain
long-term attention. Fatigued drivers sometimes fail to perceive
and interpret unexpected, relevant changes in the environment
and lack the ability to make effective decisions as well
as perform precise motor movements.
The two major causes of hypovigilance are central fatigue
and task monotony. However, it is well known that several other
factors influence hypovigilance. It is a complex issue with
several facets. It depends, for example, on time-of-day due to
the circadian rhythm, on time-since-sleep (long duration of
wakefulness), on time-on-task (prolonged work), on inadequate
sleep, and on accumulated lack of sleep. The last two factors
may be caused by pathological sleepiness due to diseases,
like sleep apnea or narcolepsy, or may be caused by sleep
loss due to prolonged time awake. Moreover, there are also
psychological factors influencing the actual level of vigilance,
e.g. motivation, stress, and monotony. The latter is believed
to play a major role during driving, because long-distance
highway driving in particular is in most situations a simple
lane-tracking task with a low event rate, so that deactivation
may occur. Moreover, vigilance is considered a
psychophysiological variable that does not always decrease
monotonically during driving. It shows slow waxing and waning
patterns, which can be observed in driving performance and in
repeatedly self-reported sleepiness.
David Sommer and Martin Golz are with the University of Applied Sciences
Schmalkalden, Department of Computer Science, Blechhammer 4-9,
98574 Schmalkalden, Germany (e-mail: firstname.lastname@example.org)
Different FMT have been proposed to detect driver fatigue
automatically, using a variety of fatigue-related measures. In
this paper, three video-based warning systems as well as
electrophysiological signals were investigated.
To compare these two approaches, two procedures were applied.
First, the output of the video-based FMT systems was compared
with two fatigue measures that are independent of both
approaches. The self-report of sleepiness on the Karolinska
sleepiness scale (KSS) is a subjective measure, and the
driving performance in terms of standard deviations of lateral
position in lane (SDL) is an objective measure.
Secondly, the video-based and electrophysiology-based
approaches were compared directly using discriminant analysis.
This was realized by recording both types of signals
simultaneously during overnight driving simulations and by
performing the same kind of signal processing. The independent
fatigue measures (KSS, SDL) were used to divide the data
set into two classes: mild and strong fatigue. As a last step,
discriminant analysis was performed, and the mean error rates
of the video-based and electrophysiology-based
approaches were compared.
Evaluation of PERCLOS based
Current Fatigue Monitoring Technologies
D. Sommer and M. Golz
Several authors who have investigated driving variables
and/or driver behaviour variables under different levels of
fatigue utilized descriptive statistics or linear discriminant
analysis of time-domain features. Because drivers perceive a
lot of heterogeneous information and experience changing
road scenes, their driving behaviour is extremely situation
specific. Moreover, the internal state of the driver is
influenced by complex regulatory processes. This leads to
non-unimodal distribution densities of the measured variables in the
feature space. Modern computational intelligence methods
like SVM are known to deal with this. They allow adapting
nonlinear discriminant functions and regularizing them over
a wide range from strong locality to complete globality.
Another advantage is that they are non-parametric methods.
Therefore, they overcome limitations of parametric statistics.
This is important if effects are expected to vary from
individual to individual.
The investigated FMT devices were all video-based. The
data file stored on the host computer contained several
variables, such as time series of lid
gap, blink duration, and pupil diameter. However, only the
PERCLOS variable was utilized to measure driver fatigue. It
is defined as the percentage of time during which the pupils are
covered by the eyelid by more than 80% of their area.
Thus, it is an integral measure of prolonged eye closures.
Typical integration intervals for calculating PERCLOS are
between 1 and 5 minutes. The vendors of the three FMT devices
agreed to our testing protocol and provided integration assistance.
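The definition above can be illustrated with a minimal sketch. It assumes that a per-frame time series of pupil coverage (fraction of pupil area covered by the eyelid, 0 to 1) is available at a fixed frame rate; the function names, the frame-based approximation of "percentage of time", and the use of non-overlapping integration windows are assumptions for illustration, not the vendors' implementations.

```python
from typing import List, Sequence


def perclos(coverage: Sequence[float], closed_above: float = 0.8) -> float:
    """Percentage of frames in which pupil coverage exceeds `closed_above`."""
    if not coverage:
        raise ValueError("coverage series is empty")
    closed = sum(1 for c in coverage if c > closed_above)
    return 100.0 * closed / len(coverage)


def perclos_series(coverage: Sequence[float], frame_rate_hz: float,
                   window_s: float) -> List[float]:
    """PERCLOS for consecutive, non-overlapping integration windows.

    A trailing window that is not completely filled is dropped (assumption).
    """
    n = int(round(frame_rate_hz * window_s))  # frames per integration window
    if n < 1:
        raise ValueError("window shorter than one frame")
    return [perclos(coverage[i:i + n])
            for i in range(0, len(coverage) - n + 1, n)]


# Hypothetical example: 10 frames, 2 of them with more than 80% coverage
frames = [0.1] * 8 + [0.95] * 2
print(perclos(frames))  # 20.0
```

With an integration window of 1 to 5 minutes and data rates of 30 to 120 Hz, each PERCLOS value would summarize several thousand frames, which is why short, fast eyelid dynamics are not visible in this measure.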
16 participants drove two nights (11:30 p.m. – 8:30 a.m.)
in our real car driving simulation lab. One overnight
experiment comprised 8 x 40 min of driving. EEG (FP1,
FP2, C3, Cz, C4, O1, O2, A1, A2) and EOG (vertical,
horizontal) were recorded at a sampling rate of 256 Hz.
PERCLOS as another oculomotoric measure was recorded
independently utilizing the three FMT devices (PA, PB, PC).
As it turned out, there were no large differences in their
technical principles. All were based on video analysis and
utilized modulated infrared light, at slightly different data
rates (range 30 – 120 Hz). They all output several variables
of eyelid and head movements, but utilized only PERCLOS
as a measure of fatigue.
Several variables of the driving simulation, e.g. steering
angle and lane deviation, were also recorded at a sampling rate of
50 Hz. Driving performance in terms of standard deviations
of lateral position in lane (SDL) served as an objective
fatigue measure. The KSS, as mentioned above, is a standardized,
subjective, and independent measure of hypovigilance on
a numeric scale between 1 and 10. Subjects rated the KSS at the
beginning and after finishing driving. During driving, only
relative changes in percent of the full range were asked for, because
subjects are more aware of relative than of absolute changes.
B. Data Analysis
For this study, we analyzed the recorded data in two different
ways. First, correlation analysis was performed between the
measured PERCLOS data of the three devices and the two
independent fatigue measures (KSS, SDL). The PMCC was
estimated. Secondly, to assess PERCLOS data as
a fatigue measure in comparison with EEG/EOG, we performed
nonlinear discriminant analysis. For this purpose, all
signals under consideration (PA, PB, PC, EEG/EOG) were
transformed to the spectral domain, and the logarithmic power
spectral densities (logPSD) were estimated by the modified
periodogram method. LogPSD values of all signals were
averaged in spectral bands. For EEG/EOG signals, 1.0
Hz wide bands and a range of 1 to 23 Hz turned out to be
optimal, whereas for PERCLOS signals 0.2 Hz wide
bands and a range of 0 to 4 Hz were optimal.
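The feature extraction described above can be sketched as follows. This is an illustration under stated assumptions, not the processing chain used in the study: the Hann window, the one-sided density scaling, the floor applied before taking the logarithm, and the function names are all assumptions. A direct O(N²) DFT is used only to keep the sketch dependency-free; for real segment lengths an FFT would be used instead.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def modified_periodogram(x: Sequence[float], fs: float) -> Tuple[List[float], List[float]]:
    """One-sided PSD estimate from a Hann-windowed (modified) periodogram."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    xw = [xi * wi for xi, wi in zip(x, window)]
    norm = fs * sum(wi * wi for wi in window)  # window power normalization
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(xw))
        p = abs(s) ** 2 / norm
        if 0 < k < n / 2:  # fold negative frequencies (not DC, not Nyquist)
            p *= 2.0
        freqs.append(k * fs / n)
        psd.append(p)
    return freqs, psd


def band_log_psd(freqs: Sequence[float], psd: Sequence[float], f_lo: float,
                 f_hi: float, width: float, floor: float = 1e-20) -> List[float]:
    """Mean log10 PSD in consecutive bands [f, f + width) between f_lo and f_hi."""
    features = []
    for b in range(int(round((f_hi - f_lo) / width))):
        lo = f_lo + b * width
        hi = lo + width
        vals = [math.log10(max(p, floor)) for f, p in zip(freqs, psd) if lo <= f < hi]
        features.append(sum(vals) / len(vals) if vals else float("nan"))
    return features


# Hypothetical test signal: 2 s of a 10 Hz sine sampled at 64 Hz
fs = 64.0
x = [math.sin(2.0 * math.pi * 10.0 * i / fs) for i in range(128)]
freqs, psd = modified_periodogram(x, fs)
features = band_log_psd(freqs, psd, f_lo=1.0, f_hi=23.0, width=1.0)  # EEG/EOG setting
```

For the EEG/EOG setting (1.0 Hz bands from 1 to 23 Hz) this yields 22 features per channel; the PERCLOS setting (0.2 Hz bands from 0 to 4 Hz) would yield 20 features per device signal.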
The independent fatigue measures (KSS, SDL) were both
used to divide the whole data set into the two classes ‘mild
fatigue’ (class 1) and ‘strong fatigue’ (class 2). This was
necessary to obtain labels for discriminant analysis (classification).
For this, histograms of all measures were computed and
divided into 2 subsets of equal size using 3-minute data segments.
Fig. 1. Histogram of the subjective (KSS, left) and the objective (SDL,
right) measure of driver fatigue. An arbitrary threshold (vertical line) was
defined to have two classes (class 1: mild fatigue, class 2: strong fatigue).
The threshold parameter for the subjective measure was
selected at KSS = 7.0 (Fig. 1a). The histogram indicates a
slightly right-skewed distribution which confirms the success
of the experimental design in inducing strong sleepiness. The
same procedure was applied to the objective fatigue measure
SDL (Fig. 1b). Subjects who performed lane tracking with
standard deviations lower than 13 % with respect to the lane
width were defined to be in the state of mild fatigue. In our
driving simulation system, the lane width was set to 5.0 m.
This means that subjects in the mild fatigue state performed
with standard deviations lower than 0.60 m. Lane tracking
with standard deviations greater than 0.60 m was assigned to
the state of strong fatigue. Lateral positions in lane greater
than 100 % with respect to the lane width also occurred,
but were handled by the simulation system as an accident
and led to a re-start procedure of the driving simulation.
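The labelling step can be sketched as below, using the thresholds stated above (KSS = 7.0, SDL = 0.60 m). The function names are hypothetical. Two details are assumptions because the text does not specify them: the SDL is computed as the population standard deviation within a segment, and a value exactly at the threshold is assigned to class 1.

```python
import statistics
from typing import Sequence

KSS_THRESHOLD = 7.0      # threshold on the subjective measure
SDL_THRESHOLD_M = 0.60   # threshold on the objective measure, in metres


def sdl(lateral_position_m: Sequence[float]) -> float:
    """Standard deviation of lateral position in lane for one data segment."""
    return statistics.pstdev(lateral_position_m)


def label_by_kss(kss: float) -> int:
    """Class 1 (mild fatigue) or class 2 (strong fatigue) from a KSS rating."""
    return 2 if kss > KSS_THRESHOLD else 1


def label_by_sdl(sdl_m: float) -> int:
    """Class 1 (mild fatigue) or class 2 (strong fatigue) from a segment SDL."""
    return 2 if sdl_m > SDL_THRESHOLD_M else 1


# Hypothetical segment of lateral positions (metres from lane centre)
segment = [0.0, 1.0, 0.0, 1.0]
print(sdl(segment), label_by_sdl(sdl(segment)))  # 0.5 1
```

Because the two labels are derived independently, the same 3-minute segment may fall into class 1 under one criterion and class 2 under the other, which is why the discriminant analysis is run separately for each label.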
First, the PERCLOS measures of the three FMT devices
(PA, PB, PC) were considered at relatively high temporal
resolution: all data were analyzed within each session (Fig. 2). In
non-overlapping 3 min intervals, the mean and standard deviation
of all 5 variables were calculated. Averaging across
subjects was performed. This resulted in 13 mean values for
each variable in each session. For comparison, all measures
were scaled equally by setting the minimum to zero and the
maximum to one for each measured time series (min-max
scaling). The mean values of all measures, especially KSS,
were influenced by time-on-task effects during all driving
sessions. The calculations revealed fluctuations of the FMT
variables that are not easily explained by the performance
(SDL) or sleepiness (KSS) measures. As a consequence, the
correlation coefficients dropped considerably. Even in
sessions 1 to 4, where mild and mid-range sleepiness was
rated, correlations diminished. Only sessions 5, 6, and 7
displayed high correlations. Overall, the study showed large
inter-individual differences in all measures (Fig. 2, std. errors).
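The per-session analysis described above combines three simple operations: segment means over non-overlapping intervals, min-max scaling, and the PMCC. A minimal sketch follows; the function names and the example values are hypothetical, and dropping an incomplete trailing segment is an assumption.

```python
import math
from typing import List, Sequence


def segment_means(samples: Sequence[float], samples_per_segment: int) -> List[float]:
    """Mean of each non-overlapping segment; an incomplete tail is dropped."""
    n = samples_per_segment
    return [sum(samples[i:i + n]) / n for i in range(0, len(samples) - n + 1, n)]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale a series so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # constant series: no range to scale
    return [(v - lo) / (hi - lo) for v in values]


def pmcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("PMCC is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical segment means of PERCLOS and KSS within one session
perclos_means = [2.0, 4.0, 6.0]
kss_means = [5.0, 6.0, 7.0]
print(pmcc(min_max_scale(perclos_means), min_max_scale(kss_means)))  # 1.0
```

Min-max scaling is a positive linear transformation, so it does not change the PMCC; it only makes the curves comparable when plotted on one axis. With 13 segment means per session, each per-session coefficient rests on few points, which is consistent with the large variation between sessions.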
Fig. 2. Mean and standard errors of the driver fatigue (KSS, green), driving
performance measure (SDL, blue) and FMT output variables (PA, PB, PC).
All variables were averaged over 3 min intervals and across all subjects.
PMCC between KSS/SDL and PA, PB, PC for each session are listed.
To explain these large inter- and intra-individual differences,
3-minute segments of the PERCLOS time series PA(t),
PB(t), and PC(t) were analyzed on a more general level using
pattern recognition techniques combined with methods of
computational intelligence. For simplicity, we asked only
how well they discriminate between mild and strong
self-rated sleepiness and, alternatively, how well they
discriminate between high and low lane tracking performance. In
addition, the same time series analysis was performed on the
EEG/EOG recorded simultaneously with PERCLOS. Under the
assumption that sleepiness, as well as performance decrements,
originate in the central nervous system, the EEG is a
more direct measure than EOG and PERCLOS. Despite this,
all signals were contaminated by noise and by many other signal
components not tightly related to sleepiness and performance
decrements. Therefore, it is an open question as to which
signal provides the best ability to detect driver fatigue.
The discriminant analysis between mild and strong fatigue
of the three PERCLOS signals yielded relatively high errors
between 26% and 34% (Fig. 3). The errors remained in this
range independent of the segment length. The same analysis
of EEG/EOG resulted in substantially lower errors of about
13% when the segment length exceeds 50 seconds. Note that
with increasing segment length, the number of segments and
consequently the number of feature vectors decreases.
Therefore, with larger segment length the discriminant
analysis becomes less complex and the statistical accuracy
decreases. The former should lead to a decreasing error rate,
whereas the latter should lead to larger standard errors. Fig.
3 shows only the latter effect. Apparently, the reduced
complexity due to data reduction is compensated by increasing
complexity of the sampling distribution of the feature
vectors, at least for segment lengths greater than 50 seconds.
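The dependence of the number of feature vectors on segment length can be made concrete with a short calculation. It assumes non-overlapping segments and that no data were excluded, neither of which is stated for the discriminant analysis, so the numbers are upper bounds for illustration only; the function names are hypothetical.

```python
def segments_per_session(session_s: int, segment_s: int) -> int:
    """Number of complete, non-overlapping segments in one driving session."""
    return session_s // segment_s


def total_feature_vectors(segment_s: int, subjects: int = 16, nights: int = 2,
                          sessions: int = 8, session_s: int = 40 * 60) -> int:
    """Upper bound on feature vectors for the study design described above."""
    return subjects * nights * sessions * segments_per_session(session_s, segment_s)


for length_s in (50, 150, 180, 300):
    print(length_s, total_feature_vectors(length_s))
# 50 12288
# 150 4096
# 180 3328
# 300 2048
```

Under these assumptions, going from 50 s to 300 s segments reduces the available feature vectors by a factor of six, which fits the larger standard errors observed at long segment lengths.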
Fig. 3. Results of nonlinear discriminant analysis for the case of subjective
ratings of sleepiness (KSS) as an external criterion of driver fatigue. Mean
and standard deviation of test errors of classification versus width of signal
analysis interval (length of data segments) are shown. EEG/EOG based data
analysis performed much better than the PERCLOS based ones.
When the same discriminant analysis was performed using driving
performance (SDL) instead of sleepiness ratings (KSS)
as the class label, higher errors (Fig. 4) for all PERCLOS
variables (range 27 - 41%) and lower errors for EEG/EOG
(11% at 150 s) were found. For all signals, errors decreased
with increasing segment length. For both labels (KSS, SDL),
the differences between the FMT devices are roughly the same.
Device A performs slightly better for small segment lengths,
whereas device C performs slightly better for large segment
lengths. The results of the EEG/EOG analysis confirm our
previous findings.
Fig. 4. Results of nonlinear discriminant analysis for the case of driving
performance (SDL) as an external criterion of driver fatigue. Mean and
standard deviation of test errors of classification versus width of signal
analysis interval (length of data segments) are shown. EEG/EOG based data
analysis performed much better than the PERCLOS based ones.
IV. CONCLUSIONS
The aim of the present study was to evaluate commercially
available devices for driver fatigue monitoring. FMT devices
were investigated in an overnight driving simulation study to
evaluate their accuracy. One may object to the use of a
lab study instead of a field study. However, in the field it is
not possible to investigate such high levels of sleepiness and
performance decrements. Furthermore, lab studies should be
a first step in practical FMT evaluation. If devices fail under
lab conditions, then it is likely that they will also fail under the
much more complex influences of field conditions.
Subjectively rated sleepiness (KSS) and objectively assessed
driving performance in terms of standard deviation of
lateral position in lane (SDL) indicated increasing deterioration
throughout the night. These two measures were used as
independent labels to evaluate the PERCLOS based fatigue
measures of each of the 3 video-based FMT devices. Correlations
between KSS, SDL and the PERCLOS measures PA,
PB, PC varied largely between different sessions (range 0.12 -
0.92) as well as between the 3 FMT devices. Clearly, this is
not sufficient for FMT as tools to prevent accidents in the
Therefore, the general ability of the PERCLOS measure of
all three FMT devices to distinguish between mild and strong
fatigue was of interest. As a benchmark, EEG/EOG instead of
PERCLOS were used as input signals in the same framework
of data analysis. It has been shown recently that the
combination of EEG and EOG is an excellent measure of strong
fatigue when modern computational intelligence methods are
utilized. The resulting mean error of discrimination between
mild and strong fatigue was 13% and 10% for KSS and
SDL labels, respectively, when EEG/EOG were analyzed.
On the other hand, discriminant analysis of all 3 PERCLOS
signals yielded mean errors between 26% and 32% for KSS
and between 26% and 42% for SDL labels. In
contrast to the results of Dinges, this shows that
PERCLOS seems to carry less information on fatigue than
EEG combined with EOG.
Similar conclusions were drawn by Johns. He pointed
out that under demands of sustained attention, some
sleep-deprived subjects fall asleep while their eyes remain open.
Unfortunately, PERCLOS does not include any assessment
of eye and eyelid movements. Important dynamic characteristics
which are widely accepted, such as slow rolling eye
movements, reductions in maximal saccadic speed, or in the
velocity of eyelid reopening, are ignored. Their spectral
characteristics were picked up in our study through EOG and may
account for the far better results of EEG/EOG data fusion
presented here. Note that highly dynamic alterations are
better reflected by EOG and by eye-tracking signals than by
integral measures like PERCLOS. Nevertheless, adaptive
signal analysis of EEG/EOG in combination with computational
intelligence methods resulted in highest detection per-
We want to express our gratitude to the product managers
as well as to the R&D engineers of the manufacturers of the
FMT devices for supporting our study and for giving numerous
and valuable pieces of advice. The study was supported by
Caterpillar Inc., which is gratefully acknowledged.
Knipling RR (1998) Three Fatigue Management Revolutions for the
21st Century. In: Hartley (ed) Managing Fatigue in Transportation.
Proc 3rd Int Conf, Fremantle, Australia. Pergamon Press: 355-378
Åkerstedt T, Gillberg M (1990) Subjective and objective sleepiness
in the active individual. Int J Neurosci 52:29-37
Ingre M, Åkerstedt T, Peters B et al (2006) Subjective sleepiness,
simulated driving performance and blink duration: examining
individual differences. J Sleep Res 15:47-53
Bishop CM (2007) Pattern Recognition and Machine Learning. 2nd
ed, Springer, New York
Wierwille W et al (1994) Research on vehicle-based driver status /
performance monitoring: development, validation, and refinement of
algorithms for detection of driver drowsiness. Report No DOT HS
808 247. NHTSA, Washington D.C.
Golz M, Sommer D, Chen M, Mandic DP, Trutschel U (2007)
Feature Fusion for the Detection of Microsleep Events. J VLSI Signal
Dinges DF, Mallis M, Maislin G, Powell JW (1998) Final report:
Evaluation of techniques for ocular measurement as an index of
fatigue and as the basis for alertness management. Report No DOT
HS 808 762. NHTSA, Washington D.C.
Johns M (2003) The amplitude-velocity ratio of blinks: a new method
for monitoring drowsiness. Sleep 26:A51-52