Chapter

Open Your Eyes: Eyelid Aperture Estimation in Driver Monitoring Systems


Abstract

Driver Monitoring Systems (DMS) operate by measuring the driver's state during driving activities. With SAE-L3 autonomous driving vehicles at the gates of arrival, DMS are called to play a major role in guaranteeing or, at least, supporting safer mode transfer transitions (between manual and automated driving modes). Drowsiness and fatigue detection with cameras is still one of the major targets of DMS research and investment. In this work we present our eyelid aperture estimation method, as an enabling technique for estimating such physiological states, in the context of two main use cases. First, we show how the technique can be integrated into a DMS, along with other exterior-sensing components, to showcase SAE-L3 demonstrations. Second, we adopt the DMD (Driver Monitoring Dataset) open dataset project with a twofold purpose: to evaluate the quality of our method compared with other state-of-the-art techniques, and to contribute ground-truth labels on drowsiness concepts to the DMD.
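The chapter's own estimator is not reproduced here; as a minimal, hypothetical sketch of the kind of signal it produces, the following computes a per-driver calibrated eyelid aperture from matched upper/lower eyelid landmarks (the landmark source and the open-eye calibration step are assumptions, not the authors' pipeline):

```python
# Illustrative sketch only: eyelid aperture normalised by a per-driver
# open-eye baseline. Landmark extraction is assumed to happen upstream
# (e.g. with any face-alignment model providing eyelid points).
import numpy as np

def lid_distance(upper: np.ndarray, lower: np.ndarray) -> float:
    """Mean distance between matched upper/lower eyelid points (pixels)."""
    return float(np.mean(np.linalg.norm(upper - lower, axis=1)))

class ApertureEstimator:
    def __init__(self, baseline_open: float):
        # baseline_open: lid distance measured during a calibration
        # phase in which the driver is alert with eyes fully open.
        self.baseline = baseline_open

    def aperture(self, upper: np.ndarray, lower: np.ndarray) -> float:
        """Returns aperture in [0, 1]: 0 = closed, 1 = baseline open."""
        return float(np.clip(lid_distance(upper, lower) / self.baseline,
                             0.0, 1.0))
```

A time series of this normalised aperture is what downstream drowsiness measures such as PERCLOS or blink-duration statistics consume.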


... Although there are several initiatives that provide vision-based datasets for DMS, the lack of consolidated datasets has motivated our efforts to contribute by creating the DMD; Figure 1 shows example images of some activities included in this dataset. Driver monitoring requires the interpretation of the driver's features regarding the attention and arousal state, the direction of gaze [40], head pose [41], the position of the hands [42], blink dynamics [43], facial expressions [44], body posture [45] and drowsiness state [46]. The currently available datasets tackle these DMS dimensions individually and do not provide a general description of different driver conditions. ...
Article
Full-text available
Tremendous advances in advanced driver assistance systems (ADAS) have been possible thanks to the emergence of deep neural networks (DNN) and Big Data (BD) technologies. Huge volumes of data can be managed and consumed as training material to create DNN models which feed functions such as lane keeping systems (LKS), automated emergency braking (AEB), lane change assistance (LCA), etc. In the ADAS/AD domain, these advances are only possible thanks to the creation and publication of large and complex datasets, which can be used by the scientific community to benchmark and leverage research and development activities. In particular, multi-modal datasets have the potential to feed DNNs that fuse information from different sensors or input modalities, producing optimised models that exploit modality redundancy, correlation, complementariness and association. Creating such datasets poses a scientific and engineering challenge. The BD dimensions to cover are volume (large datasets), variety (wide range of scenarios and context), veracity (data labels are verified), visualization (data can be interpreted) and value (data is useful). In this paper, we explore the requirements and technical approach to build a multi-sensor, multi-modal dataset for video-based applications in the ADAS/AD domain. The Driver Monitoring Dataset (DMD) was created and partially released to foster research and development on driver monitoring systems (DMS), a particular sub-case which receives less attention than exterior perception. Details on the preparation, construction, post-processing, labelling and publication of the dataset are presented in this paper, along with the announcement of a subsequent release of DMD material publicly available to the community.
Article
Full-text available
Driver attention prediction is becoming an essential research problem in human-like driving systems. This work attempts to predict driver attention in driving accident scenarios (DADA), a challenging task because of the dynamic traffic scene and the intricate, imbalanced accident categories. In this work, we design a semantic context induced attentive fusion network (SCAFNet). We first segment the RGB video frames into images with different semantic regions (i.e., semantic images), where each region denotes one semantic category of the scene (e.g., road, trees, etc.), and learn the spatio-temporal features of RGB frames and semantic images in two parallel paths simultaneously. Then, the learned features are fused by an attentive fusion network to find the semantic-induced scene variation in driver attention prediction. The contributions are threefold. 1) With the semantic images, we introduce their semantic context features and verify their clear beneficial effect for driver attention prediction, where the semantic context features are modeled by a graph convolution network (GCN) on semantic images; 2) We fuse the semantic context features of semantic images and the features of RGB frames in an attentive strategy, and the fused details are transferred over frames by a convolutional LSTM module to obtain the attention map of each video frame with consideration of historical scene variation in driving situations; 3) The superiority of the proposed method is evaluated on our previously collected dataset (named DADA-2000) and two other challenging datasets against state-of-the-art methods.
Article
Full-text available
Research on driver status recognition has been actively conducted to reduce fatal crashes caused by the driver's distraction and drowsiness. As in many other research areas, deep-learning-based algorithms are showing excellent performance for driver status recognition. However, despite decades of research in the driver status recognition area, the visual image-based driver monitoring system has not been widely used in the automobile industry. This is because the system requires high-performance processors and has a hierarchical structure in which each procedure is affected by inaccuracy from the previous procedure. To avoid using a hierarchical structure, we propose a method using Mobilenets without the functions of face detection and tracking, and show that this method can recognize facial behaviors that indicate the driver's distraction. However, the frame rate achieved by Mobilenets on a Raspberry Pi, a single-board computer, is not sufficient for recognizing the driver status. To alleviate this problem, we propose a lightweight driver monitoring system using a resource-sharing device in a vehicle (e.g., a driver's mobile phone). The proposed system is based on Multi-Task Mobilenets (MT-Mobilenets), which consist of the Mobilenets base and a multi-task classifier. The three Softmax regressions of the multi-task classifier help one Mobilenets base recognize facial behaviors related to the driver status, such as distraction, fatigue, and drowsiness. The proposed system based on MT-Mobilenets improved the accuracy of driver status recognition with a Raspberry Pi by using one additional device.
Chapter
Full-text available
Accurate point detection on image data is an important task for many applications, such as robot perception, scene understanding, gaze point regression in eye tracking, head pose estimation, or object outline estimation. In addition, it can be beneficial for various object detection tasks where minimal bounding boxes are searched, since the method can be applied to each corner. We propose a novel self-training method, Multiple Annotation Maturation (MAM), that enables fully automatic labeling of large amounts of image data. Moreover, MAM produces detectors which can be used online afterward. We evaluated our algorithm on data from different detection tasks for the eye, pupil center (head-mounted and remote), and eyelid outline points, and compared the performance to the state-of-the-art. The evaluation was done on over 300,000 images, and our method shows outstanding adaptability and robustness. In addition, we contribute a new dataset with more than 16,200 accurately manually labeled images for remote eyelid, pupil center, and pupil outline detection. This dataset was recorded in a prototype car interior equipped with all standard tools, posing various challenges to object detection such as reflections, occlusion from steering wheel movement, or large head movements. The dataset and library are available for download at http://ti.uni-tuebingen.de/Projekte.1801.0.html.
Article
Full-text available
Driver fatigue is a contributing factor in traffic accidents, and fatigue-related traffic accidents have a higher fatality rate and cause more damage to the surroundings than accidents where the drivers are alert. Recently, many automobile companies have installed driver assistance technologies in their vehicles. Third-party companies are also manufacturing fatigue detection devices; however, much research is still required for improvement. In the field of driver fatigue detection, continuous research is being performed, and although several articles report promising results in constrained environments, much progress is still required. This paper presents a state-of-the-art review of recent advancements in the field of driver fatigue detection. Methods are categorized into five groups, i.e., subjective reporting, driver biological features, driver physical features, vehicular features while driving, and hybrid features, depending on the features used for driver fatigue detection. Various approaches have been compared for fatigue detection, and areas open for improvement are deduced.
Conference Paper
Full-text available
Cognitive load has been shown, over hundreds of validated studies, to be an important variable for understanding human performance. However, establishing practical, non-contact approaches for automated estimation of cognitive load under real-world conditions is far from a solved problem. Toward the goal of designing such a system, we propose two novel vision-based methods for cognitive load estimation, and evaluate them on a large-scale dataset collected under real-world driving conditions. Cognitive load is defined by which of 3 levels of a validated reference task the observed subject was performing. On this 3-class problem, our best proposed method of using 3D convolutional neural networks achieves 86.1% accuracy at predicting task-induced cognitive load in a sample of 92 subjects from video alone. This work uses the driving context as a training and evaluation dataset, but the trained network is not constrained to the driving environment as it requires no calibration and makes no assumptions about the subject's visual appearance, activity, head pose, scale, and perspective.
Article
Full-text available
Depth cameras allow setting up reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination conditions render common RGB sensors unusable. Therefore, we propose a complete framework for the estimation of the head and shoulder pose based on depth images only. A head detection and localization module is also included, in order to develop a complete end-to-end system. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives as input three types of images and provides the 3D angles of the pose as output. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image, and we empirically demonstrate that this positively impacts the system performance. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Experimental results show that our method overcomes all recent state-of-the-art works based on both intensity and depth input data, running in real time at more than 30 frames per second.
Article
Full-text available
Not just detecting but also predicting impairment of a car driver's operational state is a challenge. This study aims to determine whether the standard sources of information used to detect drowsiness can also be used to predict when a given drowsiness level will be reached. Moreover, we explore whether adding data such as driving time and participant information improves the accuracy of detection and prediction of drowsiness. Twenty-one participants drove a car simulator for 110 min under conditions optimized to induce drowsiness. We measured physiological and behavioral indicators such as heart rate and variability, respiration rate, head and eyelid movements (blink duration, frequency and PERCLOS), and recorded driving behavior such as time-to-lane-crossing, speed, steering wheel angle, and position in the lane. Different combinations of this information were tested against the real state of the driver, namely the ground truth, as defined from video recordings via the Trained Observer Rating. Two models using artificial neural networks were developed, one to detect the degree of drowsiness every minute, and the other to predict every minute the time required to reach a particular drowsiness level (moderately drowsy). The best performance in both detection and prediction is obtained with behavioral indicators and additional information. The model can detect the drowsiness level with a mean square error of 0.22 and can predict when a given drowsiness level will be reached with a mean square error of 4.18 min. This study shows that, in a controlled and very monotonous environment conducive to drowsiness in a driving simulator, the dynamics of driver impairment can be predicted.
Article
Full-text available
The necessity for the classification of open and closed eyes is increasing in various fields, including analysis of eye fatigue in 3D TVs, analysis of the psychological states of test subjects, and eye status tracking-based driver drowsiness detection. Previous studies have used various methods to distinguish between open and closed eyes, such as classifiers based on the features obtained from image binarization, edge operators, or texture analysis. However, when it comes to eye images with different lighting conditions and resolutions, it can be difficult to find an optimal threshold for image binarization or optimal filters for edge and texture extraction. In order to address this issue, we propose a method to classify open and closed eye images with different conditions, acquired by a visible light camera, using a deep residual convolutional neural network. After conducting performance analysis on both self-collected and open databases, we have determined that the classification accuracy of the proposed method is superior to that of existing methods.
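As a rough illustration of this kind of classifier (not the paper's architecture; PyTorch is assumed as the framework, and the input size is arbitrary), a tiny residual CNN for binary open/closed eye classification could look like:

```python
# Minimal sketch of a small residual CNN for open/closed eye
# classification on greyscale eye crops; architecture is illustrative.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(ch)
        self.bn2 = nn.BatchNorm2d(ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)          # identity shortcut

class EyeStateNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            ResBlock(16),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            ResBlock(32),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2),              # logits: [closed, open]
        )

    def forward(self, x):                  # x: (N, 1, H, W) eye crops
        return self.net(x)
```

Trained with cross-entropy, the residual shortcut is what the abstract's "deep residual" refers to, here reduced to one block per stage.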
Conference Paper
Full-text available
Distracted driving is a worldwide problem leading to an astounding number of accidents and deaths. Existing work is concerned with a very small set of distractions (mostly, cell phone usage). Also, for the most part, it uses unreliable ad-hoc methods to detect those distractions. In this paper, we present the first publicly available dataset for "distracted driver" posture estimation with more distraction postures than existing alternatives. In addition, we propose a reliable system that achieves a 95.98% driving posture classification accuracy. The system consists of a genetically-weighted ensemble of Convolutional Neural Networks (CNNs). We show that a weighted ensemble of classifiers using a genetic algorithm yields better classification confidence. We also study the effect of different visual elements (i.e., hands and face) in distraction detection by means of face and hand localization. Finally, we present a thinned version of our ensemble that can achieve a 94.29% classification accuracy and operate in a real-time environment.
Article
Full-text available
In this work we aim to predict the driver's focus of attention. The goal is to estimate what a person would pay attention to while driving, and which part of the scene around the vehicle is more critical for the task. To this end we propose a new computer vision model based on a multi-path deep architecture that integrates three sources of information: visual cues, motion and scene semantics. We also introduce DR(eye)VE, the largest dataset of driving scenes enriched with eye-tracking annotations and other sensors' measurements. This dataset features more than 500,000 registered frames, matching ego-centric views (from glasses worn by drivers) and car-centric views (from a roof-mounted camera). Results highlight that several attentional patterns are shared across drivers and can be reproduced to some extent. This may benefit an ADAS system by providing an indication of which elements in the scene are likely to capture the driver's attention.
Article
Full-text available
Fast and accurate upper-body and head pose estimation is a key task for automatic monitoring of driver attention, a challenging context characterized by severe illumination changes, occlusions and extreme poses. In this work, we present a new deep learning framework for head localization and pose estimation on depth images. The core of the proposal is a regression neural network, called POSEidon, which is composed of three independent convolutional nets followed by a fusion layer, specially conceived for understanding the pose from depth. In addition, to recover the intrinsic value of face appearance for understanding head position and orientation, we propose a new Face-from-Depth approach for learning image faces from depth. Results in face reconstruction are qualitatively impressive. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Results show that our method overcomes all recent state-of-the-art works, running in real time at more than 30 frames per second.
Conference Paper
Full-text available
The purpose of this paper is to bring together multiple literature sources which present innovative methodologies for the assessment of driver state, driving context and performance by means of technology within a vehicle and consumer electronic devices. It also provides an overview of ongoing research and trends in the area of driver state monitoring. As part of this review, a model of a hybrid driver state monitoring system is proposed. The model incorporates technology within a vehicle and multiple brought-in devices for enhanced validity and reliability of recorded data. Additionally, the model draws upon the requirement of data fusion in order to generate unified driver state indicator(s) that could be used to modify in-vehicle information and safety systems and hence make them driver-state adaptable. Such modification could help to reach optimal driving performance in a particular driving situation. To conclude, we discuss the advantages of integrating a hybrid driver state monitoring system into a vehicle and suggest future areas of research.
Article
Full-text available
Accurate, robust, inexpensive gaze tracking in the car can help keep a driver safe by facilitating the more effective study of how to improve (1) vehicle interfaces and (2) the design of future Advanced Driver Assistance Systems. In this paper, we estimate head pose and eye pose from monocular video using methods developed extensively in prior work and ask two new interesting questions. First, how much better can we classify driver gaze using head and eye pose versus just using head pose? Second, are there individual-specific gaze strategies that strongly correlate with how much gaze classification improves with the addition of eye pose information? We answer these questions by evaluating data drawn from an on-road study of 40 drivers. The main insight of the paper is conveyed through the analogy of an "owl" and "lizard" which describes the degree to which the eyes and the head move when shifting gaze. When the head moves a lot ("owl"), not much classification improvement is attained by estimating eye pose on top of head pose. On the other hand, when the head stays still and only the eyes move ("lizard"), classification accuracy increases significantly from adding in eye pose. We characterize how that accuracy varies between people, gaze strategies, and gaze regions.
Conference Paper
Full-text available
This paper addresses the problem of Face Alignment for a single image. We show how an ensemble of regression trees can be used to estimate the face's landmark positions directly from a sparse subset of pixel intensities, achieving super-realtime performance with high quality predictions. We present a general framework based on gradient boosting for learning an ensemble of regression trees that optimizes the sum of square error loss and naturally handles missing or partially labelled data. We show how using appropriate priors exploiting the structure of image data helps with efficient feature selection. Different regularization strategies and their importance in combating overfitting are also investigated. In addition, we analyse the effect of the quantity of training data on the accuracy of the predictions and explore the effect of data augmentation using synthesized data.
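This ensemble-of-regression-trees aligner is the algorithm behind dlib's shape_predictor (the dlib-ml library is described further down this list); a brief usage sketch, with the standard pre-trained 68-point model file and input frame name as placeholders:

```python
# Sketch: face alignment with dlib's ensemble-of-regression-trees
# shape predictor. "shape_predictor_68_face_landmarks.dat" is the
# publicly distributed pre-trained model, downloaded separately.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("driver_frame.png")          # placeholder input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for face in detector(gray):
    shape = predictor(gray, face)             # 68 facial landmarks
    # Points 36-47 are the two six-point eye contours in the 68-point
    # markup: the raw inputs for eyelid aperture and blink measures.
    eyes = [[(shape.part(i).x, shape.part(i).y) for i in range(36, 42)],
            [(shape.part(i).x, shape.part(i).y) for i in range(42, 48)]]
```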
Conference Paper
Full-text available
This paper presents a non-intrusive approach for drowsiness detection, based on computer vision. It is installed in a car and is able to work under real operating conditions. An IR camera is placed in front of the driver, in the dashboard, in order to detect the driver's face and obtain drowsiness cues from eye closure. It works in a robust and automatic way, without prior calibration. The presented system is composed of 3 stages. The first is preprocessing, which includes face and eye detection and normalization. The second stage performs pupil position detection and characterization, combining it with adaptive lighting filtering to make the system capable of dealing with outdoor illumination conditions. The final stage computes PERCLOS from eye closure information. In order to evaluate this system, an outdoor database was generated, consisting of several experiments carried out during more than 25 driving hours. A study of the performance of this proposal, showing results from this testbench, is presented.
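PERCLOS itself reduces to a sliding-window statistic over a per-frame closure flag. A minimal sketch (the window length and the conventional 80%-closed threshold are parameters; this is not the paper's implementation):

```python
# PERCLOS sketch: proportion of frames, within a sliding window, in
# which the eye is at least 80% closed (aperture <= 0.2 of baseline).
from collections import deque

class Perclos:
    def __init__(self, window_frames: int, closed_thresh: float = 0.2):
        self.flags = deque(maxlen=window_frames)
        self.closed_thresh = closed_thresh

    def update(self, aperture: float) -> float:
        """aperture: normalised eyelid opening, 0 = closed, 1 = open."""
        self.flags.append(aperture <= self.closed_thresh)
        return sum(self.flags) / len(self.flags)
```

At 30 fps, the one-minute windows common in the PERCLOS literature correspond to window_frames = 1800.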
Article
Full-text available
This paper presents an automatic drowsy driver monitoring and accident prevention system that is based on monitoring changes in the eye blink duration. Our proposed method detects visual changes in eye locations using a horizontal symmetry feature of the eyes. Our new method detects eye blinks via a standard webcam in real time at 110 fps for a 320×240 resolution. Experimental results on the ZJU eye-blink database showed that the proposed system detects eye blinks with 94% accuracy and a 1% false positive rate.
Article
Full-text available
We develop a non-intrusive system for monitoring fatigue by tracking eyelids with a single web camera. Tracking slow eyelid closures is one of the most reliable ways to monitor fatigue during critical performance tasks. The challenges come from arbitrary head movement, occlusion, reflections from glasses, motion blur, etc. We model the shape of the eyes using a pair of parameterized parabolic curves, and fit the model in each frame to maximize the total likelihood of the eye regions. Our system is able to track face movement and fit eyelids reliably in real time. We test our system with videos captured from both alert and drowsy subjects. The experimental results demonstrate the effectiveness of our system.
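The paper fits its two parabolic lid curves by maximizing the likelihood of the eye regions; as a simplified stand-in, a least-squares fit of y = ax² + bx + c to sampled lid points conveys the model (inputs are assumed to be in roll-corrected eye coordinates):

```python
# Sketch of the two-parabola eyelid model: fit one parabola per lid
# to sampled contour points, then read the opening at the eye centre.
import numpy as np

def fit_lid(points: np.ndarray) -> np.poly1d:
    """points: (K, 2) array of (x, y) samples along one eyelid."""
    a, b, c = np.polyfit(points[:, 0], points[:, 1], deg=2)
    return np.poly1d([a, b, c])

def lid_opening(upper_pts: np.ndarray, lower_pts: np.ndarray) -> float:
    upper, lower = fit_lid(upper_pts), fit_lid(lower_pts)
    xc = 0.5 * (upper_pts[:, 0].mean() + lower_pts[:, 0].mean())
    return float(lower(xc) - upper(xc))   # image y grows downwards
```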
Conference Paper
Full-text available
Eye blink detection is one of the important problems in computer vision. It has many applications, such as face liveness detection and driver fatigue analysis. Existing methods for eye blink detection can be roughly divided into two categories: contour-template-based and appearance-based methods. The former can usually extract eye contours accurately; however, separate templates are needed for closed and open eyes, and these methods are also sensitive to illumination changes. In the appearance-based methods, image patches of open and closed eyes are collected as positive and negative samples to learn a classifier, but eye contours cannot be accurately extracted. To overcome the drawbacks of existing methods, this paper proposes an effective eye blink detection method based on an improved eye contour extraction technique. In our method, the eye contour model is represented by 16 landmarks and can therefore describe both open and closed eyes. Each landmark is accurately recognized by a fast classifier trained on the appearance around that landmark. Experiments have been conducted on YALE and another large dataset consisting of frontal face images to extract the eye contour. The experimental results show that the proposed method affords accurate eye location and is robust in the closed-eye condition. It also performs well under illumination variations. The average time cost of our method is about 140 ms on a Pentium IV 2.8 GHz PC with 1 GB RAM, which satisfies the real-time requirement for face video sequences. The method has also been applied in a face liveness detection system, with promising results.
Article
Full-text available
This study examined if individuals who are at increased risk for drowsy-driving because of obstructive sleep apnea syndrome (OSAS), have impairments in driving performance in the moments during microsleep episodes as opposed to during periods of wakefulness. Twenty-four licensed drivers diagnosed with OSAS based on standard clinical and polysomnographic criteria, participated in an hour-long drive in a high-fidelity driving simulator with synchronous electroencephalographic (EEG) recordings for identification of microsleeps. The drivers showed significant deterioration in vehicle control during the microsleep episodes compared to driving performance in the absence of microsleeps on equivalent segments of roadway. The degree of performance decrement correlated with microsleep duration, particularly on curved roads. Results indicate that driving performance deteriorates during microsleep episodes. Detecting microsleeps in real-time and identifying how these episodes of transition between wakefulness and sleep impair driver performance is relevant to the design and implementation of countermeasures such as drowsy driver detection and alerting systems that use EEG technology.
Chapter
Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS), especially after the recent success of Deep Learning (DL) methods. The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development, crucial for the transition of automated driving from SAE Level-2 to SAE Level-3. In this paper, we introduce the Driver Monitoring Dataset (DMD), an extensive dataset which includes real and simulated driving scenarios: distraction, gaze allocation, drowsiness, hands-wheel interaction and context data, in 41 h of RGB, depth and IR videos from 3 cameras capturing the face, body and hands of 37 drivers. A comparison with existing similar datasets is included, which shows the DMD is more extensive, diverse, and multi-purpose. The usage of the DMD is illustrated by extracting a subset of it, the dBehaviourMD dataset, containing 13 distraction activities, prepared to be used in DL training processes. Furthermore, we propose a robust and real-time driver behaviour recognition system targeting a real-world application that can run on cost-efficient CPU-only platforms, based on the dBehaviourMD. Its performance is evaluated with different types of fusion strategies, all of which reach enhanced accuracy while still providing real-time response.
Article
This paper presents a methodology and mobile application for driver monitoring, analysis, and recommendations based on detected unsafe driving behavior, for accident prevention using a personal smartphone. For driver behavior monitoring, the smartphone's cameras and built-in sensors (accelerometer, gyroscope, GPS, and microphone) are used. The developed methodology includes dangerous state classification, dangerous state detection, and a reference model. The methodology covers the following online dangerous states of the driver: distraction and drowsiness; it also covers an offline dangerous state related to a high pulse rate. We implemented the system for Android smartphones and evaluated it with ten volunteers.
Article
In this article, we examine the performance of different eye blink detection algorithms under various constraints. The goal of the present study was to evaluate the performance of an electrooculogram- and camera-based blink detection process in both manually and conditionally automated driving phases. A further comparison between alert and drowsy drivers was performed in order to evaluate the impact of drowsiness on the performance of blink detection algorithms in both driving modes. Data snippets from 14 monotonous manually driven sessions (mean 2 h 46 min) and 16 monotonous conditionally automated driven sessions (mean 2 h 45 min) were used. In addition to comparing two data-sampling frequencies for the electrooculogram measures (50 vs. 25 Hz) and four different signal-processing algorithms for the camera videos, we compared the blink detection performance of 24 reference groups. The analysis of the videos was based on very detailed definitions of eyelid closure events. The correct detection rates for the alert and manual driving phases (maximum 94%) decreased significantly in the drowsy (minus 2% or more) and conditionally automated (minus 9% or more) phases. Blinking behavior is therefore significantly impacted by drowsiness as well as by automated driving, resulting in less accurate blink detection.
Article
Eye detection and eye state (closed/open) estimation are important for a wide range of applications, including iris recognition, visual interaction and driver fatigue detection. Current work typically performs eye detection first, followed by eye state estimation with a separate classifier. Such an approach fails to capture the interactions between the eye location and its state. In this paper, we propose a method for simultaneous eye detection and eye state estimation. Based on a cascade regression framework, our method iteratively estimates the location of the eye and the probability of the eye being occluded by the eyelid. At each iteration of the cascaded regression, image features from the eye center as well as contextual image features from the eyelid and eye corners are jointly used to estimate the eye position and openness probability. Using the eye openness probability, the most likely eye state can be estimated. Since training requires a large number of facial images with labeled eye-related landmarks, we propose combining real and synthetic images; this learning-by-synthesis approach further improves performance. Evaluations of our method on benchmark databases such as the BioID and Gi4E databases, as well as on real-world driving videos, demonstrate its superior performance compared with state-of-the-art methods for both eye detection and eye state estimation.
Article
Understanding intent and relevance of surrounding agents from video is an essential task for many applications in robotics and computer vision. The modeling and evaluation of contextual, spatio-temporal situation awareness is particularly important in the domain of intelligent vehicles, where a robot is required to smoothly navigate in a complex environment while also interacting with humans. In this paper, we address these issues by studying the task of on-road object importance ranking from video. First, human-centric object importance annotations are employed in order to analyze the relevance of a variety of multi-modal cues for the importance prediction task. A deep convolutional neural network model is used for capturing video-based contextual spatial and temporal cues of scene type, driving task, and object properties related to intent. Second, the proposed importance annotations are used for producing novel analysis of error types in image-based object detectors. Specifically, we demonstrate how cost-sensitive training, informed by the object importance annotations, results in improved detection performance on objects of higher importance. This insight is essential for an application where navigation mistakes are safety-critical, and the quality of automation and human-robot interaction is key.
Conference Paper
Hands are used by drivers to perform primary and secondary tasks in the car. Hence, the study of driver hands has several potential applications, from studying driver behavior and alertness analysis to infotainment and human-machine interaction features. The problem is also relevant to other domains of robotics and engineering which involve cooperation with humans. In order to study this challenging computer vision and machine learning task, our paper introduces an extensive, public, naturalistic video-based hand detection dataset in the automotive environment. The dataset highlights the challenges that may be observed in naturalistic driving settings, from different background complexities, illumination settings, users, and viewpoints. In each frame, hand bounding boxes are provided, as well as left/right, driver/passenger, and number-of-hands-on-the-wheel annotations. Comparison with existing hand detection datasets highlights the novel characteristics of the proposed dataset.
Article
Driver fatigue is one of the major causes of traffic accidents, particularly for drivers of large vehicles (such as buses and heavy trucks) due to prolonged driving periods and boredom in working conditions. In this paper, we propose a vision-based fatigue detection system for bus driver monitoring, which is easy and flexible to deploy in buses and large vehicles. The system consists of modules for head-shoulder detection, face detection, eye detection, eye openness estimation, fusion, estimation of the drowsiness measure percentage of eyelid closure (PERCLOS), and fatigue level classification. The core innovative techniques are as follows: 1) an approach to estimate the continuous level of eye openness based on spectral regression; and 2) a fusion algorithm to estimate the eye state based on adaptive integration of the multi-model detections of both eyes. A robust measure of PERCLOS on the continuous level of eye openness is defined, and the driver states are classified on it. In experiments, systematic evaluations and analysis of the proposed algorithms, as well as comparison with ground truth on PERCLOS measurements, are performed. The experimental results show the advantages of the system in accuracy and robustness for the challenging situations in which a camera with an oblique viewing angle to the driver's face is used for driving state monitoring.
Conference Paper
A new eye blink detection algorithm is proposed. It is based on analyzing the variance of the vertical motions in the eye region. The face and eyes are detected with a Viola-Jones type algorithm. Next, a flock of KLT trackers is placed over the eye region. For each eye, the region is divided into 3×3 cells, and for each cell an average "cell" motion is calculated. Simple state machines analyse the variances for each eye. The proposed method has a lower false positive rate compared to other methods based on tracking. We introduce a new challenging dataset, Eyeblink8. Our method achieves the best reported mean accuracy of 99% on the Talking dataset and state-of-the-art results on the ZJU dataset.
Article
A new eye blink detection algorithm is proposed. Motion vectors obtained by a Gunnar-Farneback tracker in the eye region are analyzed using a state machine for each eye. The normalized average motion vector, with standard deviation and a time constraint, is the input to the state machine. Motion vectors are normalized by the intraocular distance to achieve invariance to the eye region size. The proposed method outperforms related work on the majority of available datasets. We extend the methodology for evaluating eye blink detection algorithms so that it is unaffected by the algorithms used for face and eye detection. We also introduce a new challenging dataset, Researcher's night, which contains more than 100 unique individuals with 1849 annotated eye blinks. It is currently the largest dataset available.
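Both motion-based detectors above reduce, at their core, to a normalised vertical-motion statistic over the eye region that a per-eye state machine then thresholds. A sketch of that statistic using OpenCV's Farneback flow (crop extraction and the state machine are omitted; the parameter values are typical defaults, not the authors'):

```python
# Sketch: mean vertical motion in an eye-region crop, normalised by
# the intraocular distance for invariance to eye-region size.
import cv2
import numpy as np

def vertical_eye_motion(prev_eye: np.ndarray, cur_eye: np.ndarray,
                        intraocular_dist: float) -> float:
    """prev_eye, cur_eye: equal-sized greyscale eye crops (uint8)."""
    # args: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
    flow = cv2.calcOpticalFlowFarneback(
        prev_eye, cur_eye, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    return float(flow[..., 1].mean()) / intraocular_dist
```

A blink then appears as a short downward spike followed by an upward one, which the state machine matches under the papers' time constraints.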
Conference Paper
The development of facial databases with an abundance of annotated facial data captured under unconstrained 'in-the-wild' conditions has made discriminative facial deformable models the de facto choice for generic facial landmark localization. Even though many recently proposed discriminative techniques have shown very good facial landmark localization performance, for applications that require excellent accuracy, such as facial behaviour analysis and facial motion capture, semi-automatic person-specific or even tedious manual tracking is still the preferred choice. One way to construct a person-specific model automatically is through incremental updating of the generic model. This paper deals with the problem of updating a discriminative facial deformable model, a problem that has not been thoroughly studied in the literature. In particular, we study, for the first time to the best of our knowledge, strategies to update a discriminative model that is trained by a cascade of regressors. We propose very efficient strategies to update the model and show that it is possible to automatically construct robust discriminative person- and imaging-condition-specific models 'in-the-wild' that outperform state-of-the-art generic face alignment strategies.
Article
Driver drowsiness and distraction are two main reasons for traffic accidents and the related financial losses. Therefore, researchers have been working for more than a decade on designing driver inattention monitoring systems. As a result, several detection techniques for both drowsiness and distraction have been proposed in the literature. Some of these techniques were successfully adopted and implemented by the leading car companies. This paper discusses and provides a comprehensive insight into the well-established techniques for driver inattention monitoring and introduces the most recent and futuristic solutions exploiting mobile technologies such as smartphones and wearable devices. Then, a proposal is made for the integration of such systems into car-to-car communication to support the primary aim of safe driving in vehicular ad hoc networks (VANETs). We call this approach the dissemination of driver behavior via C2C communication. Throughout this paper, the most remarkable studies of the last five years were examined thoroughly in order to reveal the recent driver monitoring techniques and demonstrate their basic pros and cons. In addition, the studies were categorized into two groups: driver drowsiness and distraction. Research on driver drowsiness was further divided into two main subgroups based on the exploitation of either visual or non-visual features. A comprehensive compilation, including the features used, classification methods, accuracy rates, system parameters, and environmental details, is presented in tables to highlight the (dis)advantages and/or limitations of the aforementioned categories. A similar approach was also taken for the methods used for the detection of driver distraction.
Conference Paper
In this paper, we propose a driver drowsiness detection method for which only eyelid movement information is required. The proposed method consists of two major parts. 1) In order to obtain an accurate eye openness estimation, a vision-based eye openness recognition method is proposed to obtain a regression model that directly gives the degree of eye openness from a low-resolution eye image without complex geometric modeling, which is efficient and robust to degraded image quality. 2) A novel feature extraction method based on unsupervised learning is also proposed to reveal hidden patterns in eyelid movements as well as reduce the feature dimension. The proposed method was evaluated and showed good performance. Index Terms: driver drowsiness detection, degree of eye openness, eyelid movements, OLPP, unsupervised feature learning.
Article
There is little agreement in the scientific literature about what the terms "driver distraction" and "driver inattention" mean, and what the relationship is between them. In 2011, Regan, Hallett and Gordon proposed a taxonomy of driver inattention in which driver distraction is conceptualized as just one of several processes that give rise to driver inattention. Since publication of that paper, two other papers have emerged that bear on the taxonomy. In one, the Regan et al. taxonomy was used, for the first time, to classify data from an in-depth crash investigation in Australia. In the other, another taxonomy of driver inattention was proposed and described. In this paper we revisit the original taxonomy proposed by Regan et al. in light of these developments, and make recommendations for how the original taxonomy might be improved to make it more useful as a tool for classifying and coding crash and critical incident data. In addition, we attempt to characterize, theoretically, the processes within each category of the original taxonomy that are assumed to give rise to driver inattention. Recommendations are made for several lines of research: to further validate the original taxonomy; to understand the impact of each category of inattention in the taxonomy on driving performance, crash type and crash risk; and to revise existing crash and incident investigation protocols and align them with the original taxonomy, so that they provide more comprehensive, reliable and consistent information regarding the contribution of inattention to crashes of all types.
Conference Paper
We study natural human activity under difficult settings of cluttered background, volatile illumination, and frequent occlusion. To that end, a two-stage method for hand and hand-object interaction detection is developed. First, activity proposals are generated from multiple sub-regions in the scene. Then, these are integrated using a second-stage classifier. We study a set of descriptors for detection and activity recognition in terms of performance and speed. With the overarching goal of reducing 'lab setting bias', a case study is introduced with a publicly available annotated RGB and depth dataset. The dataset was captured using a Kinect under real-world driving settings. The approach is motivated by studying actions, as well as semantic elements in the scene and the driver's interaction with them, which may be used to infer driver inattentiveness. The proposed framework significantly outperforms a state-of-the-art baseline on our dataset for hand detection.
Article
This paper presents a real-time vision-based system to detect the eye state. The system is implemented with a consumer-grade computer and an uncalibrated web camera with passive illumination. Previously established similarity measures between image regions, feature selection algorithms, and classifiers have been applied to achieve vision-based eye state detection without introducing a new methodology. From many different types of data extracted from 1,293 pair-of-eyes images and 2,322 individual eye images, such as histograms, projections, and contours, 186 similarity measures against three eye templates were computed. Two feature selection algorithms, the J5(ξ) criterion and sequential forward selection, and two classifiers, a multi-layer perceptron and a support vector machine, were studied intensively to select the best scheme for pair-of-eyes and individual eye state detection. The outputs of both selected classifiers were combined to optimize eye state monitoring in video sequences. We tested the system on videos with different users, environments, and illumination. It achieved an overall accuracy of 96.22%, which outperforms previously published approaches. The system runs at 40 fps and can be used to monitor driver alertness robustly.
Article
This paper reviews published work related to neurophysiological measurements (electroencephalography: EEG; electrooculography: EOG; heart rate: HR) in pilots and car drivers during driving tasks. The aim is to summarise the main neurophysiological findings related to measurements of the pilot's or driver's brain activity during driving performance and how particular aspects of this brain activity can be connected with the important concepts of "mental workload", "mental fatigue" and "situational awareness". The review of the literature suggests that a coherent sequence of changes in EEG, EOG and HR variables exists during the transition from normal driving through high mental workload and eventually to mental fatigue and drowsiness. In particular, increased EEG power in the theta band and a decrease in the alpha band occur under high mental workload. Subsequently, increased EEG power in the theta as well as the delta and alpha bands characterizes the transition between mental workload and mental fatigue. Drowsiness is also characterized by an increased blink rate and decreased HR values. The detection of such mental states is currently performed "off-line" with accuracy around 90%, but not on-line. A discussion of possible future applications of the findings provided by these neurophysiological measurements in order to improve vehicle safety is also presented.
Conference Paper
Experts assume that accidents caused by drowsiness are significantly under-reported in police crash investigations (1-3%). They estimate that about 24-33% of severe accidents are related to drowsiness. In order to develop warning systems that detect reduced vigilance based on driving behavior, a reliable and accurate drowsiness reference is needed. Studies have shown that measures of the driver's eyes are capable of detecting drowsiness under simulator or experimental conditions. In this study, the performance of the latest eye-tracking-based in-vehicle fatigue prediction measures is evaluated. These measures are assessed statistically and by a classification method based on a large dataset of 90 hours of real road drives. The results show that eye-tracking drowsiness detection works well for some drivers, as long as blink detection works properly. Even with some proposed improvements, however, there are still problems in bad lighting conditions and for persons wearing glasses. In summary, camera-based sleepiness measures provide a valuable contribution to a drowsiness reference, but are not reliable enough to be the only reference.
Conference Paper
Several studies have related the alertness of an individual to their eye-blinking patterns. Accurate and automatic quantification of eye-blinks can be of much use in monitoring people at jobs that require high degree of alertness, such as that of a driver of a vehicle. This paper presents a non-intrusive system based on facial biometrics techniques, to accurately detect and quantify eye-blinks. Given a video sequence from a standard camera, the proposed procedure can output blink frequencies and durations, as well as the PERCLOS metric, which is the percentage of the time the eyes are at least 80% closed. The proposed algorithm was tested on 360 videos of the AV@CAR database, which amount to approximately 95,000 frames of 20 different people. Validation of the results against manual annotations yielded very high accuracy in the estimation of blink frequency with encouraging results in the estimation of PERCLOS (average error of 0.39%) and blink duration (average error within 2 frames).
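Blink frequency and duration follow mechanically once per-frame closure flags exist; a sketch of the event segmentation step (the frame rate and the flag source are assumptions):

```python
# Sketch: segment per-frame closure flags into blink events to obtain
# blink count and per-blink durations in seconds.
import numpy as np

def blink_events(closed: np.ndarray, fps: float):
    """closed: boolean array, one flag per frame.

    Returns a list of (start_frame, duration_seconds) per blink.
    """
    padded = np.concatenate(([False], closed, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[::2], edges[1::2]   # rising/falling edge pairs
    return list(zip(starts, (ends - starts) / fps))

# Blink frequency over a clip is then
# len(blink_events(closed, fps)) / (len(closed) / fps) blinks per second.
```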
Article
There are many excellent toolkits which provide support for developing machine learning software in Python, R, Matlab, and similar environments. Dlib-ml is an open source library, targeted at both engineers and research scientists, which aims to provide a similarly rich environment for developing machine learning software in the C++ language. Towards this end, dlib-ml contains an extensible linear algebra toolkit with built-in BLAS support. It also houses implementations of algorithms for performing inference in Bayesian networks and kernel-based methods for classification, regression, clustering, anomaly detection, and feature ranking. To enable easy use of these tools, the entire library has been developed with contract programming, which provides complete and precise documentation as well as powerful debugging tools.
Article
There is accumulating evidence that driver distraction and driver inattention are leading causes of vehicle crashes and incidents. However, as applied psychological constructs, they have been inconsistently defined and the relationship between them remains unclear. In this paper, driver distraction and driver inattention are defined and a taxonomy is presented in which driver distraction is distinguished from other forms of driver inattention. The taxonomy and the definitions provided are intended (a) to provide a common framework for coding different forms of driver inattention as contributing factors in crashes and incidents, so that comparable estimates of their role as contributing factors can be made across different studies, and (b) to make it possible to more accurately interpret and compare, across studies, the research findings for a given form of driver inattention.
Article
The Karolinska sleepiness scale (KSS) is frequently used for evaluating subjective sleepiness. The main aim of the present study was to investigate the validity and reliability of the KSS against electroencephalographic, behavioral and other subjective indicators of sleepiness. Participants were 16 healthy females aged 33-43 (38.1 ± 2.68) years. The experiment involved 8 measurement sessions per day for 3 consecutive days. Each session contained the psychomotor vigilance task (PVT), the Karolinska drowsiness test (KDT; EEG alpha and theta power), the alpha attenuation test (AAT; alpha power ratio with open/closed eyes) and the KSS. Median reaction time, number of lapses, alpha and theta power density and the alpha attenuation coefficients (AAC) showed a highly significant increase with increasing KSS. The same variables were also significantly correlated with the KSS, with a mean value for lapses of r = 0.56. The KSS was closely related to the EEG and behavioral variables, indicating high validity in measuring sleepiness. KSS ratings may be a useful proxy for EEG or behavioral indicators of sleepiness.
Efficient monocular point-of-gaze estimation on multiple screens and 3D face tracking for driver behaviour analysis
  • J Goenetxea
  • L Unzueta
  • U Elordi
  • J D Ortega
  • O Otaegui
Real-time eye blink detection using facial landmarks
  • T Soukupová
  • J Cech
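This last reference is the origin of the widely used eye aspect ratio (EAR): for six eye landmarks p1..p6 it is defined as EAR = (‖p2−p6‖ + ‖p3−p5‖) / (2‖p1−p4‖), nearly constant while the eye is open and dropping towards zero during a blink. A direct transcription:

```python
# Eye aspect ratio (EAR) of Soukupová & Čech, computed from the
# six-point eye contour (0-indexed here: p[0]/p[3] are the corners,
# the remaining pairs lie on the upper and lower eyelids).
import numpy as np

def eye_aspect_ratio(p: np.ndarray) -> float:
    """p: (6, 2) array of eye landmarks ordered p1..p6."""
    return (np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])) \
        / (2.0 * np.linalg.norm(p[0] - p[3]))
```

Thresholding the EAR over a few consecutive frames is the blink detector that paper proposes on top of real-time facial landmarks.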