Available via license: CC BY 4.0
Received 10 January 2023, accepted 2 February 2023, date of publication 9 February 2023, date of current version 15 February 2023.
Digital Object Identifier 10.1109/ACCESS.2023.3243865
Driver Behavior Classification: A Systematic
Literature Review
SOUKAINA BOUHSISSIN, NAWAL SAEL, AND FAOUZIA BENABBOU
Laboratory of Information Technology and Modeling, Faculty of Sciences Ben M’Sick, Hassan II University of Casablanca, Casablanca 20000, Morocco
Corresponding author: Soukaina Bouhsissin (bouhsissin.soukaina@gmail.com)
ABSTRACT Driver behavior is receiving increasing attention because of the staggering number of road
accidents. Many road safety reports regard human behavior as the most important factor in the likelihood
of accidents. The detection and classification of aggressive or abnormal driver behavior is an essential
requirement in the real world to avoid deadly road accidents and to protect road users. The automatic
detection of a driver’s behavior aids in the prevention of dangerous situations for the driver and all
other participants in the driving environment, as well as the implementation of corrective measures. This
paper presents a systematic literature review (SLR) of driver behavior classification. This study aims
to highlight and analyze the different types of driver behavior, types of studies, data sources, datasets,
features, preprocessing techniques, and artificial intelligence algorithms used to classify driver behavior, together with
their performance. Based on the analysis of the selected works, we identify the
key contributions and challenges of driver behavior classification and propose potential
directions for future work by practitioners and researchers.
INDEX TERMS Driver behavior, intelligent transport system, systematic literature review, machine learning,
deep learning.
I. INTRODUCTION
The intelligent transport system (ITS) concept emerged first
in 1991, when transportation experts realized that electronic
technologies could start to play a crucial role in optimiz-
ing surface transportation. The National ITS Program was
also legally established by the US Congress [1]. Intelligent
Transportation Systems are technology-based systems that
aim to solve various road traffic problems [2], such as traffic
accidents, congestion, and conflicts, by analyzing data col-
lected from sensors or digital technology [3], [4]. ITS is a
general term that refers to the application of communication,
control, and information processing technologies to vehicle
networks [5]. An ITS covers everything related to a transport
system, including the vehicle, infrastructure, driver, and road
users. It assists the driver in making the best decisions
in real time to avoid dangerous situations [5]. ITS has
three objectives: mobility, sustainable transport, and
convenience [6], [7]. Mobility concerns the transportation
system’s effectiveness and capacity, sustainable transport
focuses on road safety and environmental respect, and
convenience ensures service accessibility.
The associate editor coordinating the review of this manuscript and
approving it for publication was Wai-Keung Fung.
ITS is now used not only for traffic control and information
but also for road and vehicle safety, efficient infrastructure
utilization, and the reduction of accidents, injuries,
and fatalities [8].
Every year, road safety officials set up campaigns to raise
awareness of road accidents through interventions in schools
and visual and audiovisual information; additionally, public
policies take measures to reduce the rate of road accidents,
for example by lowering the speed limit on the roads [9].
Despite these efforts, the number of road deaths worldwide remains too high.
The World Health Organization (WHO), through the Global
Status Report on Road Safety, found that an average of 3,700
people died every day in traffic-related incidents in 2016,
which corresponds to about 1.35 million traffic-related deaths worldwide each
year [10]. It is therefore essential to analyze the various
factors and driver behavior that have a direct impact on road
accidents.
Advanced Driver Assistance Systems (ADAS) are an
example of in-vehicle ITS that are designed to assist drivers
in a variety of ways, including improving safety, reducing
14128 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ VOLUME 11, 2023
S. Bouhsissin et al.: Driver Behavior Classification: A Systematic Literature Review
the risk of accidents, and making driving more convenient.
ADAS technologies can include features such as lane depar-
ture warning systems, adaptive cruise control, automatic
emergency braking, reversing cameras, and more. These sys-
tems typically use a combination of sensors, cameras, radar,
and other technologies to gather information about the vehicle
and its surroundings, and to provide the driver with warnings,
alerts, or other forms of assistance as needed [11], [12]. How-
ever, the accuracy of these sensors can sometimes decrease
considerably, and the ADAS system cannot predict potential
danger at the right time [13].
Artificial intelligence techniques play a crucial role in
research. Machine Learning (ML) and Deep Learning (DL)
are powerful technologies that are widely used in differ-
ent contexts and applications, such as in robotics [14],
natural language processing (NLP) [15], image classification
[16], [17], and disease diagnosis, for example of COVID-19
[18]. In recent years, machine learning and deep learning
techniques have accelerated the development of intelligent
transport systems (ITS) while making them more efficient,
especially for traffic anomaly prediction and detection [19],
accident detection and classification [16], accident severity
prediction [20], and other traffic problems. According to [21],
the human factor plays a primary role in 95%
of road accidents. As a result, driver behavior analysis has
become a necessity to ensure the safety of all road users.
Understanding and analyzing driver behavior are essential
to identifying and addressing potential safety concerns and
developing strategies to reduce the likelihood of accidents or
other road incidents. Driver behavior is a complex concept
that is influenced by a range of factors, which can make it
challenging to accurately describe and analyze. These factors
may include psychological, social, cultural, and environmen-
tal influences, as well as the individual characteristics of the
driver.
Therefore, in parallel to ADAS and artificial intelligence,
researchers are working hard to understand, detect, identify,
and predict driving styles and behaviors. This knowledge is
crucial because driving is a common daily activity for many
people [22]. By analyzing driver behavior with ML and DL
techniques, we can better understand and address the factors
that contribute to risky driving and work towards improving
safety on the road. Driver behavior (DB) describes the driver’s actions as they
relate to the driving scene and the general environment [23]
(see Fig. 1). DB is generally evaluated in terms of environmental
variables such as traffic signs, road geometry, and
pedestrians as well as vehicle variables such as distance,
speed, acceleration, and other related variables [24]. The
connections between the driver, the car, and the environment
must be investigated to understand driver behavior. There-
fore, three contexts have to be considered [25]:
• Driver context such as driver status, facial expression,
and distraction.
• Car context like speed, acceleration, and orientation.
• Environmental context like traffic conditions, geometry,
obstacles, and weather conditions.
FIGURE 1. Driver behavior processes.
In recent years, various commercial and research sys-
tems have been proposed to analyze driver behavior and
develop systems to evaluate driver performance and assist
drivers [26]. A common infrastructure shared by all these
systems is the driver surveillance system [27].
This paper provides a systematic literature review
(SLR) on driver behavior classification. It covers research
published between 2015 and 2022 and provides a comprehensive
overview of existing approaches to studying and analyzing
driver behavior using artificial intelligence techniques. The
objective is also to offer researchers a guide to the proposed
approaches, classification techniques and their performance, datasets,
and features selected to classify driver behavior. The
contributions of this research are as follows:
1) We provide an overview of the most recent academic
papers on classifying driver behavior.
2) We propose a new taxonomy of the different types of driver
behavior, which classifies behaviors in a systematic way and can be
useful for identifying trends or patterns in behavior and
for designing interventions. In addition, we carry out an in-depth
analysis of the various approaches that have
been used in driver behavior studies.
3) Finally, we identify the data sources, datasets, features,
and preprocessing techniques, such as feature extraction
and selection methods, and we provide a complete
analysis of the different ML and DL techniques involved
and the performance obtained.
The remainder of this paper is structured as follows. Section II
presents a literature review of works related to our context.
Next, Section III presents the research methodology that we
used to select and study the papers analyzed. Section IV
presents the main results of this study in a comprehensive
overview. Section V provides an overall discussion of the
main results according to the research questions. Finally,
Section VI presents the conclusion of the study and its
limitations.
II. RELATED WORKS
In general, any behavior while driving that could endanger
the car or its occupants, pedestrians, other drivers, or roadside
facilities may be regarded as dangerous behavior. This review of the state
of the art is organized according to the type of driver behavior,
which includes five categories: abnormal driver behavior, which is
defined as unsafe behavior on the road (risky and negative);
aggressive driving, which is defined as being
aggressive and intolerant; line deviation, which describes
driver deviations on the road; vehicle stopping, which describes
how drivers behave at stop-controlled intersections;
and driver status, which describes driver conditions such as
distraction. In addition, we review studies that analyze the impact
of context on driver behavior.
A. ABNORMAL DRIVER BEHAVIOR
Monitoring of abnormal driver behavior is the cornerstone
for improving driving safety. The paper [28] proposed a
real-time intelligent system that can detect abnormal vehi-
cle behaviors using traffic cameras and the You Only Look
Once (YOLO) algorithm for object detection in video images.
Then the Kalman filter tracks the location of the vehicle
through successive images, and anomaly detection is carried
out using the images in which a vehicle appears (depending
on its speed). Article [29] presents a framework based on the
Strategic Highway Research Program 2 (SHRP 2) and Natu-
ralistic Driving Study (NDS) datasets to calculate a driver’s
risk profile (normal or abnormal) using a Random Forest
(RF) algorithm. The framework achieved an accuracy of 90%.
In paper [30], the Serial-Feature Network (SF-Net) algorithm
was proposed for normal and abnormal driver behavior recognition
based on smartphone sensors such as GPS and the
gyroscope. The approach reached an accuracy of 97.10% and
a recall rate of 98.4%. The system proposed in [31] classifies
driver behaviors and road anomalies as normal, abnormal,
or bump. Data were collected from smartphone sensors,
and the k-Nearest Neighbor (KNN) and Dynamic Time Warping
(DTW) algorithms achieved classification accuracies of 78.06% and
96.75%, respectively.
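
To illustrate the kind of template matching that DTW enables on sensor sequences, the following is a minimal sketch that uses DTW as the distance inside a 1-nearest-neighbor classifier. Combining the two in this way is our assumption for illustration; [31] evaluates KNN and DTW as separate classifiers, and the acceleration values, labels, and function names below are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def classify_1nn(query, templates):
    """Return the label of the template closest to `query` under DTW."""
    return min(templates, key=lambda t: dtw_distance(query, t[0]))[1]


# Hypothetical vertical-acceleration windows (not data from [31]).
templates = [
    ([0.0, 0.1, 0.0, -0.1, 0.0, 0.1], "normal"),
    ([0.0, 0.2, 2.5, -2.0, 0.3, 0.0], "bump"),
    ([0.0, 1.5, 1.6, 1.4, 1.5, 0.0], "abnormal"),
]
query = [0.0, 0.1, 0.3, 2.4, -1.9, 0.2, 0.0]  # longer than the templates
label = classify_1nn(query, templates)
```

DTW tolerates sequences of different lengths and speeds, which is why it suits events such as bumps that last longer or shorter depending on vehicle speed.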
Papers [32], [33], and [34] classified driver behavior as
positive or negative. The paper [32] presents a mobile application
called “Project Drive” that bridges the gap between
detecting negative driver behavior and motivating users toward
safer driving. The authors applied the k-means clustering algorithm
to GPS data. In [33], the study examines how the
presence of road signs affects young drivers’ behavior in
nighttime conditions using simulator and camera data. The
authors of [34] used simulator data and the MANOVA statis-
tical technique to investigate the effects of optical circles and
chevron patterns on driver behavior and speed when entering
a bend on a rural two-lane road.
Papers [35], [36], and [37] distinguished between safe and
unsafe driver behavior. In [35], the authors classified driver behavior
using smartphone sensors via an optimal path detection
algorithm and Bayesian classification. The method correctly classified
93.3% of instances. For safe driving, [36]
suggests a system that uses two traffic datasets called Local
and LARA to give drivers advice based on traffic light condi-
tions. The system obtained 95.52% precision. The paper [37]
proposes a smartphone-based system to provide important
information for the analysis of driver behavior at intersections
using camera, accelerometer, and gyroscope data. The authors
proposed Long Short-Term Memory (LSTM) and Convolutional
Neural Network (CNN) models and reached a mean percentage error (MPE) of
0.36.
The paper [38] used the Next Generation Simulation
(NGSIM) dataset to propose an LSTM-based car tracking
model that captures realistic traffic flow features and detects
asymmetric driver behavior (a critical feature of
human driver behavior). The effectiveness of road signs on
driver safety is studied in [39] using GPS and video data.
The Logistic Regression (LR) algorithm was trained to detect
visible and non-visible driver activity.
Some articles go beyond abnormal behavior and
assess how much risk the behavior poses to road safety.
Road safety is one of the main concerns of mobility and urban
planning, so it is often important to recognize risky driver
behaviors. A Support Vector Machine (SVM) and Artificial
Neural Networks (ANNs) are used to recognize safe and
dangerous driver behaviors using in-vehicle sensor data [40].
The classification results indicate an average accuracy
above 90% for both classifiers. The paper [41] used the SHRP
2 dataset and the SVM algorithm to study the probability that
left-to-right lane changes are dangerous. The method achieved
an accuracy of 90%. In [42], a method for detecting risky
driver behaviors by analyzing vehicle speed over time in real-world
driving is proposed. The k-means and SVM algorithms
correctly classified 95% of instances. Papers [43]
and [44] focused on the detection of driving risk levels based
on data collected by mobile sensors. The SVM model used
in [44] performed with an accuracy higher than 70%, while
in [43], SVM was combined with an auto-encoder algorithm
and achieved 83.03% accuracy. In [45], the authors classified
the driver’s risk level as low, medium, or high using a
decision tree (the CART algorithm) and simulator data with the
Statistical Package for the Social Sciences (SPSS). The characteristics
of driver behavior were used to assess the risk of
a vehicle-pedestrian collision based on video data [46]. This
method achieved a discrimination accuracy of over 85%.
The paper [47] proposed an ensemble learning
system for evaluating normal, low, high, and very high-risk
driving styles on smartphone data using a combination
of the following algorithms: SVM, Multi-Layer Perceptron
(MLP), and KNN. The system achieves at least
94% accuracy in evaluating the driver’s style. Based on GPS
and ADAS data, the authors of [48] suggested a system for categorizing
driving risks into low and high risks and obtained
80% accuracy. They focused on the unbalanced time series
sample problem when evaluating driving behavior, which can
be alleviated by MeanShift clustering. The authors of article
[49] proposed a real-time classification of driving behavior
based on k-means clustering, hierarchical clustering, and
model-based clustering algorithms to identify the number
of behavioral classes, namely normal, high, and low risk. Then
SVM, Decision Tree (DT), and Naive Bayes (NB) algorithms
were applied to evaluate these risk behaviors and test the
performance of the clustering methods. The algorithms achieved
95.3%, 99.6%, and 84.3% accuracy, respectively. In another
paper [50], the authors analyzed the high or low skills of drivers
using sensor data from a driving simulator with an SVM
algorithm, which achieved an accuracy of
95.7%. Additionally, [51] classifies driver behavior as skilled
or non-skilled using the hidden Markov model (HMM), and
the model achieved 80.37% accuracy.
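
Several of the works above first cluster unlabeled driving data into risk groups and then assign new observations to those groups. The following is a minimal sketch of that idea using one-dimensional k-means (Lloyd's algorithm) followed by a nearest-centroid rule. It is not the pipeline of any cited paper: the feature (harsh-braking events per 100 km), the data, the label names, and the nearest-centroid rule standing in for SVM, DT, or NB are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar features; returns sorted centroids."""
    ordered = sorted(values)
    # Deterministic start: spread initial centroids over the sorted data
    # (an assumption for reproducibility; random restarts are more common).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


# Hypothetical label names, ordered from the lowest to the highest centroid.
LABELS = ("low", "medium", "high")


def risk_level(value, centroids):
    """Assign the label of the nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))
    return LABELS[nearest]


# Hypothetical harsh-braking events per 100 km for ten drivers.
harsh_events = [0.5, 0.8, 1.0, 1.2, 4.0, 4.5, 5.1, 9.0, 9.6, 10.2]
centroids = kmeans_1d(harsh_events, k=3)
new_driver_risk = risk_level(4.8, centroids)
```

In the reviewed studies the clustering output typically serves as training labels for a supervised classifier; the nearest-centroid rule here only keeps the sketch short.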
B. AGGRESSIVE DRIVER BEHAVIOR
Aggressive behavior is among the most well-known driver behaviors in the
literature. In [52], the authors determined
whether the driving style is safe or aggressive on the basis of
sign, speed, and maneuver estimation. They used the
CNN algorithm for detection, which achieved 88.02%
accuracy. A system based on SVM was presented in [53]
to classify the various types of aggressive drivers using a
few annotated simulator data points and a semi-supervised
approach. The classification accuracy of the method was
about 86.6% using a semi-supervised SVM (S3VM). The paper [54] proposed a system
for normal and aggressive driver behavior classification based
on a combination of Fully Convolutional Network (FCN)
and LSTM algorithms. Using the UAH-DriveSet dataset, the
system reached an F1-score of 95.88%. In [55], the authors used
the Random Forest method to find motion-based factors that
can predict aggressive driving. The model achieves a 97.10%
Area Under the Curve (AUC). In [56], the authors
presented a significantly improved anomaly detection mechanism
using Recurrent Neural Networks (RNNs) based on
simulator data. This method achieves 78.6% precision and
36.4% recall.
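
The studies in this section report different metrics (accuracy, precision, recall, F1-score, AUC), which makes direct comparison difficult. As a minimal sketch, the F1-score is the harmonic mean of precision and recall; applying it to the precision and recall reported for [56] gives roughly 0.50. This derived value is our own arithmetic, not a figure reported in [56], and the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the RNN-based detector in [56].
f1_rnn = f1_score(0.786, 0.364)  # about 0.50
```

The low recall dominates the result, which shows why a high precision alone can overstate how well a detector finds aggressive driving events.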
In order to identify driving maneuvers and classify aggres-
sive, normal, and cautious driving styles, the research [57]
examines how driving habits change depending on the task
performed for online car-hailing services using k-means clus-
tering. The paper [58] proposed a system for aggressive or
smooth driving style detection using the Gaussian mixture
model (GMM) and data from the gyroscope. The experiment
analyzed the driving habits of older and younger people under
the same environmental tests and requirements. A super-
vised method based on Labeled Latent Dirichlet Allocation
(LLDA) is proposed in [59] to understand driver behavior
and latent driving styles. It integrates prior knowledge via the
Safety Pilot Model Deployment (SPMD) dataset to classify
drivers into three categories: aggressive driving, moderate
driving, and careful driving. The average accuracy of this
model was 60.5%, outperforming SVM, NB, and KNN. The
study in [60] collected real data from the vehicle accelerometer
and gyroscope to identify aggressive driver behaviors
using statistical regression, time series analysis, and the following
machine learning algorithms: GMM, Partial Least
Squares Regression (PLSR), wavelet transformation, and
Support Vector Regression (SVR). The method achieved an F1-score of 77% using the
PLSR model. The paper [61] examined dangerous driving
events and how they are connected to traffic accidents using
video and GPS data with correlation analysis. In [62], the authors
predicted the driving style of drivers based on driver activities
and environmental data. Driver physiological data before and
during the start of driving, car door opening and closing data,
and acceleration data were collected. This approach used both
Bayesian Network (BN) and Sequential Minimal Optimization
(SMO) algorithms, with accuracy values ranging from
72.7% to 90.9% for the aggressive driving recognition rate.
Other researchers categorize aggressive driving behavior
by the type of aggressiveness. For example,
[63] and [64] classified seven driver behaviors: aggressive
braking, acceleration, left turn, right turn, left lane
change, and right lane change, plus non-aggressive driving. Data were
collected from an accelerometer sensor on an Android smartphone,
and various models were investigated, including
RNN, LSTM, and Gated Recurrent Unit (GRU) models. The
experiments showed that the GRU model produced the best
accuracy, at 95%. In [65], five driver states are classified
as driver behaviors (aggressive-stable, non-aggressive-stable,
non-aggressive-instable, aggressive-instable, and normal)
using driver behavior and EEG data. The authors combined k-means,
SVM, and KNN to achieve an accuracy of 83.5%,
with an average accuracy of 69.5% across all tested traffic
states.
On the other hand, some researchers were interested in
classifying the level of aggressiveness of drivers, such as
[66], [67], [68], and [69]. They classified drivers’ behavior
with scores that express the level of aggressiveness from
lowest to highest using SVM and LSTM algorithms, which
achieved 86.67% and 92.8% accuracy, respectively.
Tracking the driving style of each driver without classifica-
tion was also an objective, as in the case of [70] where the
authors illustrate car-following behaviors in various driving
situations using the Next Generation Simulation (NGSIM)
dataset and genetic algorithm (GA).
C. LINE DEVIATION
Real-time monitoring of driver events or driving style is
the cornerstone of improved driving safety. In this part,
we summarize research on line deviation detection. Papers
[71], [72], and [73] classified six types of driver behavior:
weaving, swerving, side slipping, fast U-turn, turning with
a wide radius, and sudden braking. Acceleration and orientation
data were collected, and SVM, Neural
Networks (NN), and a combination of SVM and NN
were applied for classification. The models achieved 95.36%,
96.88%, and 95.7% accuracy, respectively. In [74], the authors
presented a system to identify risky driving actions, such
as illegal lane occupation, abrupt double lane changes, ille-
gal U-turns, and others. Based on video surveillance data,
the suggested method obtains an average detection accu-
racy of 88.62% using the hybrid algorithm Particle Swarm
Optimization-Support Vector Machine (PSO-SVM). In
[75], a behavior analysis technique based on the Hidden
Markov Model (HMM) was proposed. The aim was to assess
the driving behavior of moving cars and identify unusual driving
events like approaching, braking, lane keeping, and lane
changing. The Conditional Monte Carlo Dense Occupancy
Tracker (CMCDOT) framework was used to determine the
speed and location of nearby vehicles in real time. The results
show that the proposed method successfully detects moments
of risk. In [76], the authors used traffic data to classify
behaviors into free lane changing, non-free lane changing, successful
lane changing, and unsuccessful lane changing. This
approach used SVM, and the prediction accuracy reached
nearly 90%.
D. VEHICLE STOPPING
The behavior of the driver in certain critical areas, such
as stop zones, is considered one of the most crucial issues
in road safety. When a yellow indication is triggered, the
dilemma zone is investigated and modeled as a binary decision
problem: stop or go [77]. Many papers conducted
binary classification of driver behavior in the dilemma zone,
such as [78], [79], [80], [81], [82], and [83]. In [78], [79], [80], and
[81], the behavior was classified using SVM, BN, Stochastic
Model Predictive Control (SMPC), and a combination of
DT and Mixed Logit panel model algorithms, respectively,
all trained with simulator data. The
SVM model achieved 92.9% accuracy, while BN achieved 82.9%
precision. In [82], an SVM model was proposed and
trained with data collected from GPS, accelerometers, and
sensors and achieved an accuracy of 90.02%. Video data
is used in [83] with a Binary Logistic Regression model, whose
prediction accuracy is 83.3%.
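
The stop-or-go decision in the dilemma zone can be sketched as a binary logistic regression. The example below is a minimal illustration fitted by batch gradient descent; the single feature (time to the stop line at yellow onset, in seconds), the observations, and all function names are hypothetical and are not taken from [83] or any other cited study.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(stop | x) = sigmoid(w * (x - mean_x) + b) by gradient descent."""
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]  # centering speeds up convergence
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(centered, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, centered)) / len(xs)
        b -= lr * sum(errors) / len(xs)
    return w, b, mean_x


def predict(time_to_line, model):
    """Return 'stop' if the estimated stopping probability is at least 0.5."""
    w, b, mean_x = model
    p_stop = sigmoid(w * (time_to_line - mean_x) + b)
    return "stop" if p_stop >= 0.5 else "go"


# Hypothetical observations: time to stop line (s) and decision (1 = stopped).
time_to_line = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
stopped = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
model = fit_logistic(time_to_line, stopped)
decision_near = predict(1.0, model)  # close to the line at yellow onset
decision_far = predict(5.5, model)   # far from the line at yellow onset
```

The overlapping decisions around 3 s mimic the dilemma zone itself: drivers at similar distances make different choices, so the model outputs a probability rather than a hard threshold.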
Understanding how drivers behave at stop-controlled inter-
sections is of crucial importance for the control and man-
agement of an urban traffic system. Based on real data, the
paper [84] classified driver behavior at minor street stop sign
intersections as no-stop, rolling stop, or complete stop using
binary and ordinal LR classifiers. In [85], the behavior is
classified into full stop, slight rolling stop, rolling stop, slow
down without stopping, and running through at stop-controlled
intersections using k-means and camera data. On the other
hand, a statistical analysis (Chi-squared test) was used in [86] to
divide driver behavior into complete stops, rolling
stops, and non-compliant stops at rail level crossings (RLX).
E. DRIVERS’ STATUS
When a driver is distracted, drowsy, or in a particular emotional state,
the driving behavior changes and affects the driving style.
In [87], two non-linear regression methods, ANN and Adap-
tive Neuro-Fuzzy Inference System (ANFIS) were indepen-
dently designed to predict the driver’s ability to maintain
the middle lane and speed limits from simulator data. The
average error between predicted speed limit maintenance and
real vehicle speed is 2.72, and the average error between
predicted and real middle line-keeping ability is 0.27. The
paper [88] examines the effect of passenger presence and
driver distraction on young drivers’ behavior using simulator
data and an analysis of covariance (ANCOVA). In [89], the
authors proposed a system based on a Fully Convolutional Network
(FCN) to find effective features for real-time cognitive
distraction detection at the wheel, and the model achieved
91% accuracy. In [90], the authors presented a review to
separate and analyze the two primary categories of inattentive
driving behaviors: driver distraction and driver weariness or
drowsiness. Further, [91], [92], and [93] classified the drivers’
conditions as drowsy using simulator data via RF, Dynamic
Bayesian Network, and ANN algorithms, respectively. The
reported results are an accuracy of 84.8% for RF
and a mean squared error (MSE) of 0.22 for ANN. The paper [94] exploited driving
signals to analyze normal, aggressive, distracted, drowsy,
and drunk driver behavior using the CNN algorithm. The
model achieved 99.76% accuracy. Drowsiness behavior is
also detected in [95] and [96] using the UAH-DriveSet
dataset and LSTM algorithm, with F1-scores of 91%
and 99.49%, respectively. In [97],
the authors established a new model based on autoencoders for
the detection of abnormal driving: drunkenness or fatigue,
recklessness, and phone use while driving. The accuracy of
the model was 98.33%. In [98], driver
fatigue was detected from the perspective of traffic psychology. Using
the SHRP 2 dataset and RF algorithms, the paper [99] categorizes
driver behaviors as using cell phones, moving, adjusting,
monitoring objects, passenger interaction, talking, drinking
or eating, and personal hygiene. The classifier obtains 98.5%
concordance and a 6.5% MSE. Eating and drinking, talking,
phone use while driving, and preparing are the behaviors
classified in [100] with the interCNN algorithm. The
model achieved 81.66% accuracy. Distracted behavior
is classified in papers [101] and [102] as texting with the right
or left hand, talking on the phone using the right or
left hand, drinking, reaching, applying makeup, and talking
to passengers using CNN-based algorithms. The algorithm
achieved 99% accuracy. In the paper [103], eating and texting
are the two distracting behaviors classified with the RF
algorithm, which achieved 85.38% accuracy. In [104], the authors
proposed to identify the different types of secondary tasks
in which drivers are engaged, such as hand-held cellphone calling, cellphone
texting, and interaction, using the SHRP 2 NDS dataset.
Using a combination of DT and
RF algorithms, the method achieved 99.2% accuracy at level
1 and 82.2% accuracy at level 2.
To further clarify the status of driving behavior, the paper
[105] applied several different types of classifiers, such as
LR, RNN, LSTM, and Deep Neural Network (DNN), for
detecting driver confusion using data collected from sensors,
GPS, and video. In [106], a technique is proposed to robustly
classify driving styles into defensive, aggressive, and normal
using the Support Vector Clustering approach. Another
driver state is detected in [107] via a wearable glove system,
which detects driver stress events in real time using SVM, with
95% model accuracy. Moreover, in [108], the authors detected
driving anger using Linear Mixed Models (LMM) and trained
the model with the SHRP 2 NDS dataset. The paper [109]
proposed a model to measure driver behavior via the driver
behavior questionnaire (DBQ) and social media, as well as
drug and alcohol use.
F. ANALYSIS OF THE IMPACT OF CONTEXT ON DRIVER
BEHAVIOR
Driving behavior analysis is a very complex
issue that depends on several divergent parameters and is
not restricted to a classification of driver type or driving
style. Other studies focus more on analyzing the impact of
different factors or attributes of the driving environment
on driver behavior. These factors can influence driver
behavior, cause incorrect reactions, and lead to accidents.
For example, the authors of [110] studied explicit and implicit attitudes
toward traffic climate and their relationship with drivers’ self-reported
behaviors. Additionally, the authors of [111] examined
how driving skills affected the association between traffic
conditions and drivers’ behavior. To determine the extent
to which cognitive functioning contributes to the previously
identified connections between driving attitudes and person-
ality traits, the article [112] proposed an analysis of variables
related to the high-risk driving behavior of young people in
the early stages of their study. Speed control is investigated
in [113] to analyze the different influence factors on the
functioning and observation of right-turn drivers. In order to
determine the best strategy for slowing down, [114] investi-
gates how various perceptual treatments affect driving speed.
Another factor highlighted in [115] is the effect of longitu-
dinal pavement markings with varying levels and widths of
deterioration on a driver’s ability to maintain lane position.
Reference [116] analyzed the impact of the color contrast of
a waistcoat worn by cyclists on their visibility to drivers. Reference
[117] examined whether motor vehicle drivers’ behavior
changes when there are more bicycles on the road. In addition,
an important factor in the analysis of driver behavior is the
weather condition, such as clear or foggy, which is studied in
[118]. Other measures can be used to detect driver behavior
or even help avoid collisions. In [119], the authors seek to detect
the driver’s intentions with respect to surrounding vehicles.
In [120], the authors explored whether it is possible to identify
traffic congestion based on several parameters, including
delay constraints and available speed, via the GPS vehicle
trajectory, while in [121] the authors analyzed conflicts between
vehicles and pedestrians as well as driver behavior. To classify
conflicts as potential, mild, or severe, a model for
driver yielding behavior was developed using binary logistic
regression.
III. RESEARCH METHODOLOGY
A literature review is an essential component of academic
research. In this review, we use the Systematic Literature
Review (SLR) technique to study and analyze driver behav-
ior at the levels of behavioral classification types, types of
studies, data sources, features, preprocessing, and algorithms.
An SLR identifies, selects, and critically assesses research to
respond to a formulated question [122]. We organized, carried
out, and reported the review following the SLR method described in [123].
A. RESEARCH QUESTIONS
The main research questions that this SLR sets out to answer
are as follows:
RQ1: What are the objectives and types of driver behaviors
classified in the selected research studies?
RQ2: What are the types of driver behavior studies?
RQ3: What data sources, datasets, and features are used?
RQ4: What data preprocessing techniques are applied to
improve data processing?
RQ5: What feature selection and extraction techniques are
implemented to support the training and precision of the
model?
RQ6: What types of models are used to classify DB and
what is their performance?
B. SEARCH STRATEGY
We used five digital databases: ScienceDirect, the IEEE Xplore
digital library, SpringerLink, the DBLP database, and Google
Scholar. We defined the following set of keywords for the search
process: “driver behavior”, “driver behavior classification”,
“classification of driver behavior”, “types of studies for driver
behavior”, “datasets for driver behavior classification”,
“algorithms for driver behavior”, and “data preprocessing for
driver behavior”. We repeated the search with the same keywords,
replacing “driver behavior” with the British spelling “driver
behaviour”. The search was then performed in the above databases,
combining the predefined keywords with Boolean operators, to
identify articles relevant to the research questions.
C. STUDY SELECTION AND QUALITY ASSESSMENT
To choose pertinent studies, we used inclusion and exclusion
criteria to evaluate candidate articles that might contain
information that could be used to address the research questions.
Inclusion criteria:
• The article was published between 2015 and 2022.
• The research paper comes from a journal or a conference.
• The work is published in IEEE Xplore, DBLP, ScienceDirect,
Springer, or Google Scholar.
• The article is written in English.
• The research focuses on the classification of driver behavior.
• The article uses machine learning or deep learning algorithms,
or descriptive or preliminary statistical techniques, in the study.
Exclusion criteria:
• Articles published outside the range of 2015 to 2022.
• Articles irrelevant to our topic and research questions.
• Duplicate articles.
• Articles with only a tenuous connection to the study's questions.
Next, the selected articles were first screened using the
title, abstract, conclusion, and keywords. Articles unrelated
to the present study were excluded. We then applied the
quality assessment criteria to the remaining articles to assess
their reliability, integrity, and relevance. Quality assessment
plays an important role in the SLR protocol. The retained
articles address the predefined key research questions and
satisfy the inclusion and exclusion criteria. They were
evaluated by all authors after the analysis and evaluation of
their abstracts and conclusions (see Fig. 2).
FIGURE 2. PRISMA flow diagram.
D. DATA ANALYSIS
We identified 93 studies (Table 1) in the field of driver
behavior classification that were published during the period
2015-2022. Of these, 26 (28%) were published in conference
proceedings, and 67 (72%) were published in journals.
Table 1 presents the publication venues of the chosen
studies, and Fig. 3 displays the distribution by year.
FIGURE 3. Distribution of the studies over publication year.
IV. DRIVER BEHAVIOR: A COMPREHENSIVE OVERVIEW
In this section, we present the answers to the research
questions and the findings drawn from the state of the art.
A. THE OBJECTIVES AND TYPES OF DRIVER BEHAVIORS
CLASSIFIED IN STUDIES (RQ1)
To fully understand the types of driver behavior in the literature,
we grouped them into five categories: abnormal driver behavior,
aggressive driver behavior, vehicle stopping behavior, line
deviation, and the driver's status, as shown in Table 2 and Fig. 4.
• Abnormal driving behavior includes behaviors labeled as
abnormal, dangerous, negative, risky, high-risk, and low-risk.
• Aggressive driving behavior includes aggressive behaviors,
their types, and their levels of aggressiveness.
• Vehicle stopping behavior includes behaviors at intersections
and in dilemma zones, such as stop and go.
• Line deviation includes behaviors such as weaving,
swerving, side slipping, quick U-turns, turning with a
large radius, abrupt braking, and so on.
• Driver's status includes distraction, drowsiness, stress,
use of cell phones, talking, eating, and drinking.
We note that abnormal driver behavior is studied in 30% of the
selected papers and the driver's status in 28%. In contrast,
driver behaviors that depend on what happens outside the car
are less studied and appear more difficult to study, such as
line deviation (8%) and vehicle stopping (12%) (see Table 2),
as they are related to the vehicle and the driving environment,
such as road signs and pedestrian crossings.
TABLE 1. Publication venues.
This work aims to understand how drivers behave on
the roads, which is of crucial importance for the control,
minimization, and even avoidance of accidents; the management
of an urban traffic system; and many other benefits.
In general, each paper in Table 2 identifies and classifies
types of driving behavior. Some types of driver behavior
are very common, and others are specific to the study itself.
To facilitate the analysis of driver behavior types, we regroup
these types on three levels, as explained in Fig. 4. First, we
categorized driver behavior types into five categories:
abnormal, aggressive, line deviation, vehicle stopping, and
driver status. Then, for each category, we divided these types
(targets) into normal or not normal. Finally, we added all
targets used in the articles studied.
TABLE 2. Articles related to the type of driver behavior.
FIGURE 4. Driver behavior categorization objectives.
Abnormal DB, which includes, for example, abnormal, negative,
unsafe, and risky behavior, accounts for 19.08%. Aggressive
driver behavior and its types account for 18.42%, drivers'
status for 35.53%, line deviation for 19.74%, and vehicle
stopping for 7.24%, as shown in Fig. 5. Overall, 63 types of
driver behavior are identified for the not normal targets and
18 types of driver behavior for the normal targets.
B. DRIVER BEHAVIOR STUDIES TYPES (RQ2)
Research on driver behavior focuses on two aspects: objective
and subjective measurement.
Typically, subjective measurements of conduct are derived
from individual viewpoints and beliefs. They represent the
driver's unique experience, described from the driver's point
of view. These subjective measures can be used to evaluate
driver behavior through a questionnaire study. The
questionnaire survey is simple to administer and analyze, but
it may introduce subjectivity because respondents occasionally
fail to recognize when they engage in risky
behavior [124], [125].
Objective driving ratings are based on quantifiable data
obtained from experiments with simulators or a limited sample
of real cars. The principal sources of acknowledged objective
driving data are driving simulator studies, field driving
studies, and naturalistic driving studies. Among objective
measures, the naturalistic driving study (NDS) uses unobtrusive
measurements to record detailed information about drivers,
vehicles, and their surroundings in order to study driver
behavior on the road [126]. Field driving studies (FDS) monitor
driver behavior using instrumented vehicles; even with the use
of monitoring equipment, instructors are frequently present in
the vehicle to record measurements and code driving
performance [127], [128]. Driver behavior data can also be
obtained under precisely controlled experimental conditions in
a driving simulator study [129].
From the reviewed studies, we concluded that the field driving
study, whether based on cameras, sensors, or other equipment,
is the most commonly used approach for studying driver
behavior, accounting for 51.55% of the studies included in the
state of the art. This is because human intervention is always
present in field studies to determine the types of driver
behavior and to collect data from all data sources. It is
followed by the driving simulator study with 26.80%, the
naturalistic driving study with 14.43%, and finally the
questionnaire study with 7.22% (see Fig. 6).
The main problem with the first method (subjective measure)
is that the questionnaire usually reflects the subjective
opinions of the driver rather than the driver's actual
performance on the road. The second method (objective measure)
usually involves manually controlling the driving environment
to induce more aggressive driver behavior. In contrast to the
first method, it avoids the deviation caused by the drivers'
attitudes.
At the same time, in research involving naturalistic driving,
the cars of the test subjects are fitted with equipment that
continuously and unobtrusively monitors various elements of
their driving behavior over time, in the absence of a test
supervisor. This strategy offers data that is challenging or
impossible to gather using other existing research techniques.
FIGURE 5. Driver behavior categorization.
FIGURE 6. Type of studies for driver behavior.
C. DATA SOURCES, DATASET AND FEATURES (RQ3)
1) DATA SOURCES
To classify driver behavior, different data sources (see
Table 3) are used to collect data. A data source is simply the
origin of the data. It can be a file, a particular database, or even
a live data stream from sensors, cameras, etc. In our selected
studies, the data source types used are smartphone sensors,
GPS, accelerometers, cameras, simulators, etc. The most used
data source type is the simulator, used in 25% of the
selected studies, followed by cameras with 17%, GPS with
13%, smartphone sensors with 10%, and the other sources as
shown in Table 3.
From the various data sources, several different types of
behavior can be extracted, depending on the researcher's
objective and the type of data source. In general, the types of
driver behavior classification are associated with different
data sources. Table 4 shows that abnormal behavior is
classified from camera, GPS, simulator, accelerometer, and
gyroscope data. Aggressive behavior is classified from
simulator, smartphone sensor, gyroscope, and GPS data, among
others. Vehicle stopping behavior is studied mainly from camera
and simulator data. Driving events are classified mostly from
sensor data (accelerometer and orientation sensors). The risk
level is classified mainly from sensor, simulator, and GPS data.
TABLE 3. Data source for driver behavior classification.
2) DATASETS
To classify the driver's behavior, several datasets are used in
the selected articles. These datasets contain a combination
of smartphone sensor data, monitoring system data, transport
agency data, and other data sources. Below, we describe
examples of the datasets used in the selected papers.
TABLE 4. Data source for driver behavior classification with target.
The Strategic Highway Research Program 2 (SHRP 2)
NDS database includes data from 50 million vehicle miles
and 5.4 million trips. SHRP 2 data was collected from 3,100
volunteers at six different sites in the United States: Tampa,
Florida; central Indiana; Durham, North Carolina; Erie County,
New York; central Pennsylvania; and Seattle, Washington [130].
The UAH-DriveSet is a dataset used for the analysis and
classification of driving behavior. It was collected from six
distinct drivers and cars. Three separate driving behaviors
(normal, drowsy, and aggressive) were included in the data
produced [131].
The Next Generation Simulation (NGSIM) Vehicle Trajectories
and Supporting Data [132] datasets were gathered
on Peachtree Street in Atlanta, Georgia; eastbound I-80 in
Emeryville, California; and U.S. Highway 101 in Los Angeles,
California. A network of synchronized digital video
cameras was used to gather the data.
The Safety Pilot Model Deployment (SPMD) [133]
includes data on driver-vehicle interactions, vehicle trajecto-
ries, basic safety messages (BSMs), and contextual data that
specifies the environment in which the model deployment
data was gathered. Over 2,700 automobiles outfitted with CV
technology were used to collect the data between October
2012 and April 2013.
The 100-Car Naturalistic Driving Study database [134]
contains several examples of extreme driver behavior and
performance, such as extreme weariness, impairment, mistakes
of judgment, risk-taking, aggressive driving, and traffic
violations [135].
TABLE 5. Datasets for driver behavior classification.
TABLE 6. Datasets for driver behavior classification with target.
The driver behavior dataset [63] was gathered across four
car trips that last, on average, 13 minutes each. The types of
driving events and behaviors in this dataset are aggressive
acceleration, aggressive left and right turns, aggressive left
and right lane changes, aggressive braking, and non-aggressive
events.
According to the selected papers, eight datasets are used to
classify driver behavior, as shown in Table 4. The most used
dataset is SHRP2 with 28%, followed by UAH-DriveSet with
17%, as presented in Table 5.
In addition, Table 6 presents the targets related to each
dataset. From UAH-DriveSet, we can extract normal, aggres-
sive, and drowsy driver behavior. Normal, abnormal, and
drivers’ status (eating and drinking, personal hygiene, phone
use while driving, reaching, talking, talking to passengers,
and texting) are extracted from SHRP2. Safe and dangerous
situations are extracted from the driving dataset. More spe-
cific types of aggressiveness are found in the driver behavior
dataset.
3) FEATURES
Every data source or dataset contains a set of features. Each
feature, or column, in the dataset represents measurable data
that can be used for analysis. In this study, we categorize
the variables or features according to the different
data sources and datasets. We have grouped these features
into 39 general characteristics, presented in Table 7.
TABLE 7. Features for driver behavior classification.
The ten features most commonly used to classify driver
behavior are: vehicle speed; acceleration/deceleration;
rotation rate and rotation angle; pedal, throttle, and
accelerator; acceleration (lateral and longitudinal); time
(including time to an intersection, time to a stop line, and
time to a lane crossing); traffic condition; personal
information; steering; and physiological and psychological
signals. Other features are also used in DB classification;
Table 7 presents these different features.
We find that several features describing acceleration are
used: lateral, longitudinal, vertical, and linear acceleration.
Lateral acceleration represents driving events such as left
turns and lane changes; longitudinal acceleration corresponds
to braking and acceleration of the vehicle; vertical
acceleration represents road anomalies such as bumps and
potholes; and linear acceleration quantifies the acceleration
applied to a vehicle in all three dimensions (x, y, and z),
excluding the force of gravity. Gravity and acceleration inputs
from all three axes are required to calculate linear
acceleration. In addition to acceleration features, several
others are extracted from data sources and used many times in
the articles studied. Fig. 7 and Fig. 8 show the most-used
features that can be extracted from each data source and
dataset. From these figures, we can identify features that
describe and characterize driver behavior; furthermore, these
features can be exploited to create classification models of
driver behavior.
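As a minimal sketch of the linear acceleration computation described above, the following Python example subtracts a gravity vector from a raw three-axis accelerometer reading. The function names and the sample values are hypothetical and chosen only for illustration; how the gravity vector is estimated (for example, by a low-pass filter or a dedicated gravity sensor) is left open here and is not specified by the reviewed papers.

```python
import math


def linear_acceleration(raw_accel, gravity):
    """Remove gravity from a raw (x, y, z) accelerometer reading.

    Both arguments are 3-tuples in m/s^2. The gravity vector is assumed
    to be supplied by the caller (hypothetical input for this sketch).
    """
    return tuple(a - g for a, g in zip(raw_accel, gravity))


def magnitude(vector):
    """Euclidean norm of a 3-axis vector."""
    return math.sqrt(sum(component ** 2 for component in vector))


# Illustrative reading: the vehicle accelerates in the x-y plane while
# gravity acts along the z axis of the sensor.
sample = linear_acceleration((3.0, 4.0, 9.81), (0.0, 0.0, 9.81))
```

The result keeps only the motion-induced component on each axis, which is the quantity the text refers to as linear acceleration.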
Fig. 7 allows us to deduce the following. The simulator helps
extract a lot of information, such as acceleration,
deceleration, acceleration (lateral and longitudinal),
throttle, direction, vehicle position, and vehicle speed. GPS
can be used to measure the speed and acceleration of a vehicle.
The accelerometer can provide acceleration on three axes, which
gives us an accurate indication of driver behavior and road
anomalies. The gyroscope can provide the angular velocity
(speed of rotation) on the three axes (x, y, and z). Moreover,
the combination of vehicle speed and throttle opening can
capture the acceleration characteristics of the driver. With
the camera, we can capture images of vehicles, drivers, and
surroundings and detect the angle of rotation, vehicle speed,
acceleration, deceleration, gap, pedestrians, traffic signs, etc.
Regarding the datasets in Fig. 8, SHRP2 contains well-
known feature sets used to study driver behavior such as
acceleration (lateral and longitudinal), personal information
of drivers (gender and age), passengers, turn angle, traffic
conditions (construction zone, environmental factors, inter-
section influence, road geometry, secondary tasks, surface
condition, traffic density, traffic flow), velocity, turn signals,
vehicle speed, and other features. The 100-CAR Naturalistic
Driving Data Set contains acceleration (lateral and longi-
tudinal), vehicle heading (GPS), accelerator pedal position,
rotation rate, vehicle speed, velocity, and other features. The
UAH-DriveSet includes acceleration, distance to the vehicle
ahead, distance to the vehicle ahead in the current lane, turn
rate and turn angle, vehicle position (angle of car relative to
lane curvature, position of car relative to lane center), vehicle
speed, and other characteristics.
To provide additional orientation for future work and
experimentation, this SLR lists, for each type of driver
behavior, the features most commonly used to describe that
type. This helps researchers choose the most important features
to achieve a better classification and, at the same time, link
each type of driver behavior with the features that describe it
most efficiently. Table 8 presents the most commonly used
features to describe the main types of driver behavior.
Acceleration and deceleration, vehicle speed, rotation
angle, and acceleration (lateral and longitudinal) are the
features that should be used to classify driver behavior as
aggressive and abnormal. For vehicle stopping behavior, we need
personal information (age, driving experience, education,
gender, license type), vehicle speed, time to an intersection,
time to the stop line, and time of day. To study and classify
drivers' status, we need traffic conditions, acceleration,
vehicle speed, acceleration (lateral and longitudinal), the
image of the driver, rotation angle, and steering. Finally, to
recognize line deviation, the features commonly utilized are
rotation angle, acceleration and deceleration, acceleration
(lateral and longitudinal), orientation, speed, gap, and velocity.
FIGURE 7. Data source vs features for driver behavior classification.
FIGURE 8. Dataset vs features for driver behavior classification.
D. DATA PREPROCESSING TECHNIQUES (RQ4)
One of the keys to a performant machine learning or deep
learning model, and one of the most challenging, is the quality
of the training data. Therefore, data preprocessing is a
necessary phase in the data mining and analysis process that
converts raw data into a format that can be understood and
processed by algorithms. In fact, to obtain good results, the
data must be preprocessed to remove any impurities, solve the
problems inherent in the data, and improve its quality.
TABLE 8. Features related to driver behavior categorization.
To study more efficiently the various research proposals
undertaken to preprocess the data and improve its quality,
we analyze separately the data preprocessing techniques used,
the image and video preprocessing techniques adopted, the
imbalanced data problems, and the data labeling techniques.
1) DATA PREPROCESSING
The data preprocessing techniques used in the state of the art
are cleaning and noise removal, resampling or synchronization,
normalization, rolling window or data augmentation or
segmentation, and imbalanced data handling. Fig. 9 shows that
normalization is the most common step in data preparation,
followed by cleaning and noise elimination and by data
augmentation to improve the performance of the models.
Data resampling is an important step in this problem, where
we have different sources of data, as is the processing of
data imbalance.
• Data cleaning and noise removal
The cleaning process is required to eliminate redundant data
from the raw data, such as correlated and identical features
and duplicate rows with the same timestamp, and then to
substitute missing values.
Some researchers use machine learning techniques such as KNN
to impute missing values. The KNN algorithm replaces a missing
value by approximating it from the nearest points, based on the
other features. In another study, missing data is approximated
by linear interpolation. Other research uses statistical
measures such as the mean, median, and standard deviation to
replace missing values. Because temporal data frequently
suffers from noise, which skews measurements of quantities such
as speed, features cannot be extracted directly from it.
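The two imputation strategies mentioned above can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure of any specific reviewed paper: the function names, the unweighted Euclidean distance, the choice of k, and the sample rows are all hypothetical, and in practice features would usually be scaled before computing distances.

```python
import math


def interpolate_missing(series):
    """Fill interior None gaps of an evenly sampled series by linear interpolation.

    Leading or trailing None values are left untouched.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            step = (series[right] - series[left]) * (i - left) / (right - left)
            filled[i] = series[left] + step
    return filled


def knn_impute(rows, target_col, k=3):
    """Replace None in column target_col by the mean of the k nearest complete rows.

    Distance is Euclidean over all other columns (an assumption of this sketch).
    """
    complete = [row for row in rows if row[target_col] is not None]
    imputed = []
    for row in rows:
        filled = list(row)
        if row[target_col] is None:
            def distance(other, current=row):
                return math.sqrt(sum(
                    (current[j] - other[j]) ** 2
                    for j in range(len(current)) if j != target_col
                ))
            nearest = sorted(complete, key=distance)[:k]
            filled[target_col] = sum(n[target_col] for n in nearest) / len(nearest)
        imputed.append(filled)
    return imputed
```

Linear interpolation suits short gaps in a single time series, whereas the KNN variant uses the other features of the same record to estimate the missing one.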
Position and speed errors in Controller Area Network
(CAN) buses, GPS receivers, and inertial sensors all impact
the driver behavior classification model in addition to the
measured data. Some researchers preprocess the raw
speed time series before training the model, while
others apply a smoothing filter to the data
to remove the effect of noise. Moreover, the Kalman filter
is adopted to remove noise in some papers. It is also used
to reduce measurement errors in abnormal acceleration and
deceleration values in the NGSIM dataset [38].
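To illustrate how a Kalman filter can suppress measurement noise in a speed or acceleration signal, the sketch below implements a scalar Kalman filter with a random-walk state model. The state model and the variance parameters `process_var` and `meas_var` are assumptions made for this example; the configuration used in [38] is not described in this review.

```python
def kalman_smooth(measurements, process_var=1e-3, meas_var=0.5):
    """Filter a 1-D signal with a scalar Kalman filter (random-walk model).

    process_var: assumed variance of the state change between samples.
    meas_var: assumed variance of the sensor noise.
    """
    if not measurements:
        return []
    estimate = measurements[0]   # initial state taken from the first sample
    error_var = 1.0              # initial uncertainty of the estimate
    filtered = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed constant, so only uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered
```

A larger `meas_var` relative to `process_var` makes the filter trust the sensor less and smooth more strongly, at the cost of a slower response to genuine changes such as hard braking.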
• Resampling stage
The sampling rates at which the traffic features are collected
are not uniform: for example, a 10 Hz camera, a 10 Hz sensor,
and a 1 Hz GPS sensor. To overcome this problem, the
higher-rate features can be downsampled to a lower sampling
rate, or the lower-rate features can be upsampled to a higher
sampling rate. Some researchers applied an upsampling technique
based on a finite impulse response (FIR) filter, well known
in the field of signal processing for its ability to perform
upsampling (interpolation). Others solve the problem by
upsampling the data sampled below the highest sampling
frequency via linear interpolation filtering.
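The linear-interpolation variant of this resampling step can be sketched as follows, for instance to bring a 1 Hz GPS speed signal to the 10 Hz rate of another sensor. The function name and the integer upsampling factor are assumptions of this example, which also assumes evenly spaced samples.

```python
def upsample_linear(samples, factor):
    """Upsample an evenly spaced signal by an integer factor.

    Between each pair of consecutive samples, factor - 1 new points are
    inserted by linear interpolation; the last original sample is kept.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if len(samples) < 2:
        return list(samples)
    upsampled = []
    for current, following in zip(samples, samples[1:]):
        for k in range(factor):
            upsampled.append(current + (following - current) * k / factor)
    upsampled.append(samples[-1])
    return upsampled


# Two seconds of 1 Hz GPS speed (hypothetical values) brought to 10 Hz.
gps_speed_10hz = upsample_linear([0.0, 10.0, 20.0], 10)
```

An FIR-based interpolator, as used in some of the reviewed studies, would replace the straight-line segments with a filtered reconstruction but follows the same insert-then-estimate idea.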
• Data normalization
Data normalization is an important step for model performance,
as the features have different scales. For data
normalization, researchers generally used the linear scaling
technique to normalize resampled time series sensor data.
Another method normalizes the data by performing the FTP-72
(speed) driving test in advance to guarantee the results.
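One common form of linear scaling is min-max normalization, sketched below. Mapping to the range [0, 1] is an assumption of this example; the reviewed papers may use other target ranges or scaling constants.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high.

    A constant series has no spread, so every element is mapped to low.
    """
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Hypothetical vehicle speeds in km/h scaled to the unit interval.
scaled_speed = min_max_scale([0, 50, 100])
```

In a training pipeline, the minimum and maximum would normally be computed on the training split only and then reused for validation and test data.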
FIGURE 9. Preprocessing techniques used to classify driver behavior.
• Data augmentation
Data augmentation is frequently used as a preprocessing
method to increase the size of the dataset, which in
turn improves the accuracy of the model by providing a
better classification of the driver's behavior. This technique
allows more information to be extracted from the raw data than
picking random data segments. Using a sliding window with
a set length, the data is separated into overlapping parts.
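The sliding-window segmentation described above can be sketched in a few lines. The window length and step used here are illustrative values, not those of any reviewed study; a step smaller than the window length produces the overlapping segments.

```python
def sliding_windows(series, window, step):
    """Split a sequence into fixed-length windows taken every step samples.

    Windows overlap whenever step < window. Trailing samples that do not
    fill a complete window are dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, step)
    ]


# Six samples, windows of four samples, 50% overlap.
segments = sliding_windows([0, 1, 2, 3, 4, 5], window=4, step=2)
```

Each segment can then be labeled and fed to the classifier as a separate training example, which is how the technique enlarges the dataset.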
• Imbalanced data
The distribution of abnormal and normal driver behavior in
the data is generally imbalanced, and because the datasets
have time series characteristics, high-performing data
balancing techniques are needed. Furthermore, additional data
filtering cannot be used when processing the data, as this step
would not only result in the loss of temporal information but
would also not reflect the conditions under which the
algorithm has to operate. This problem is challenging, and
because of the data imbalance in most datasets, the evaluation
metrics precision, recall, and F1-score are introduced;
nevertheless, the problem is not sufficiently addressed in
the selected papers. To tackle it, some researchers
consider that the normalization of temporal data is sufficient
and that the problem of unbalanced classes can be resolved
by the algorithm itself. The MeanShift method is also used to
expand the samples of low-risk factors and solve the problem
of unbalanced samples.
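To make the role of these metrics concrete, the sketch below computes precision, recall, and F1-score for one class of interest from true and predicted labels. The function name and the label strings are hypothetical, and returning zero when a denominator is empty is a convention chosen for this example.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1-score for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical imbalanced sample: 8 normal and 2 abnormal driving segments.
y_true = ["normal"] * 8 + ["abnormal"] * 2
y_pred = ["normal"] * 7 + ["abnormal", "abnormal", "normal"]
metrics = precision_recall_f1(y_true, y_pred, positive="abnormal")
```

In this sample, 8 of 10 predictions are correct, yet only one of the two abnormal segments is detected, which is why accuracy alone is a weak indicator on imbalanced data.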
2) PREPROCESSING OF VIDEOS AND IMAGES
Images and videos are important sources of driver behavior
data. The preprocessing of an image or video is the set of
operations carried out on the image that consists, on the one
hand, of modifying the appearance of the image to extract
information more easily and, on the other hand, of removing
useless information such as image noise to improve the data
quality [136].
Video can be analyzed frame by frame or by paths divided
into segments of one minute or more. In general, all of the
videos were cropped using fixed selection frames, the
backgrounds were removed, and the videos were then resized to
produce lower-definition copies. This is because employing
high-resolution images inherently adds storage, computing,
and data transmission overheads, which would make the
model's design more challenging. This method also lessens
the confounding effects of background and lighting changes on
the classification task, which can frequently occur in actual
driving situations. In addition, adding the optical flow
(OF) to the input of an image-trained model can effectively
improve accuracy. The cubic B-spline method is used to
represent the vehicle motion in the video.
Image augmentation is also an important task that not only
helps provide a diversified dataset and prevent the classifier
from overfitting but also assists in the development of a more
robust classifier that could classify much more effectively.
In most cases, manual observation models were used to
identify traffic and environmental conditions and to extract
the features used to train the model. The mean RGB value is
also subtracted from the images to center the data. The
trajectories can be extracted from the video to solve some
problems using the shared source traffic vision analysis
platform TvaLib.
For the imbalanced data problem, the collected video can be
downsampled by storing only every third frame, which reduces
data redundancy.
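Two of the operations above, keeping every third frame and subtracting the mean RGB value, can be sketched without any imaging library. Images are represented here as nested lists of (r, g, b) tuples purely for illustration, and computing the mean per image rather than over the whole training set is an assumption of this example.

```python
def keep_every_nth(frames, n=3):
    """Downsample a video by keeping one frame out of every n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return frames[::n]


def subtract_mean_rgb(image):
    """Center an image by subtracting its per-channel mean.

    image: list of rows, each row a list of (r, g, b) tuples.
    Returns the centered image and the mean that was subtracted.
    """
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)
    mean = tuple(sum(pixel[c] for pixel in pixels) / count for c in range(3))
    centered = [
        [tuple(pixel[c] - mean[c] for c in range(3)) for pixel in row]
        for row in image
    ]
    return centered, mean
```

For real video data, an array library would perform the same operations far more efficiently; the logic of slicing frames and subtracting a channel mean stays the same.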
3) DATA LABELING
The first problem encountered in the driver behavior classifi-
cation study is the labeling of the data. The data labels must
be very accurate in order to teach the model to make correct
predictions. There are many methods used to label the data:
manually, automatically, and based on indicators in the data
that allow us to directly label and understand the type of driver
behavior.
In most cases, the data labeling is done manually, but
the problem becomes complex as the volume of data increases:
the method requires much time and many human annotators to
label the data.
Automatic labeling refers to any data labeling technique
that is not done by humans. This could mean labeling by
machine learning models, heuristic approaches, or a combination
of both. The most commonly used technique in the
selected studies is clustering, and the K-means clustering
algorithm is the most used one. It can rely on several
similarity measures, such as Euclidean distance and
log-likelihood distance. For example, in [53], speed was the
key to labeling driver behavior; in [42], the speed range was
used; and in [65], driving characteristics were the clustering
base. Generally, clustering is used only for continuous and
categorical variables [45]. In [49], three clustering
algorithms (k-means clustering, hierarchical clustering, and
model-based clustering) are proposed; the optimal number of
clusters is chosen for each, and the data is then labeled
based on the speed data.
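As a hedged illustration of clustering-based labeling, the sketch below runs k-means on one-dimensional speed values and returns a cluster index per sample. The deterministic initialization, the number of clusters, and the sample speeds are assumptions of this example; mapping cluster indices to behavior names is left to the analyst and is not prescribed by the reviewed papers.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster 1-D values with k-means using absolute distance.

    Initial centroids are spread evenly over the sorted data so that the
    result is deterministic (an assumption of this sketch).
    Returns (labels, centroids).
    """
    ordered = sorted(values)
    centroids = [
        float(ordered[(len(ordered) - 1) * i // max(k - 1, 1)])
        for i in range(k)
    ]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical mean trip speeds in km/h grouped into three clusters.
speed_labels, speed_centroids = kmeans_1d([30, 32, 31, 60, 62, 61, 95, 97], k=3)
```

Because the centroids come out ordered by speed in this setup, the cluster with the highest centroid could, for example, be inspected as a candidate for a high-speed behavior label.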
The other method used in the studied papers is based on
the analysis of the datasets. Drivers who prefer greater
longitudinal acceleration and deceleration are labeled as
aggressive. Also, for the level of risk, behavior is
categorized by the minimum acceleration, average
acceleration, and kinetic energy reduction ratio, which are
connected to the accident rate. The driver's behavior at
intersections is detected based on the probability of
stopping, which decreases as the speed of the approaching
vehicle and the yellow duration at the signal increase.
E. FEATURE SELECTION AND FEATURES EXTRACTION
TECHNIQUES (RQ5)
The objective of feature selection is to find the best set
of features to build performant models of the phenomena
studied. According to previous research, relevant features are
selected either automatically or manually using context
knowledge. To select the most relevant features, ML techniques
such as SVM [65], LR [105], and RF [48] were widely used.
According to [96], acceleration and jerk, derived from the
timestamp and speed provided by GPS sensors, are the most
often used features in the classification of driver behavior.
For video data, the authors of [74] used minimum redundancy
and maximum relevance (mRMR). This approach performs better
than conditional mutual information maximization (CMIM),
mutual information maximization (MIM), and ReliefF when
evaluating the best representative trajectory histograms.
A sensitivity metric, mean square error (MSE), was employed to
determine the minimum number of events or features required to
study driver behavior. In general, few papers have carried
out a feature selection step.
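The derivation of acceleration and jerk from timestamped speed can be sketched with simple finite differences. The differencing scheme and the sample values are assumptions of this example; [96] may compute these quantities differently.

```python
def derive_acceleration_and_jerk(timestamps, speeds):
    """Derive acceleration and jerk from timestamped speed samples.

    timestamps: strictly increasing times in seconds.
    speeds: speeds in m/s measured at those times.
    Acceleration is the forward difference of speed; jerk is the forward
    difference of acceleration.
    """
    acceleration = [
        (speeds[i + 1] - speeds[i]) / (timestamps[i + 1] - timestamps[i])
        for i in range(len(speeds) - 1)
    ]
    jerk = [
        (acceleration[i + 1] - acceleration[i]) / (timestamps[i + 2] - timestamps[i + 1])
        for i in range(len(acceleration) - 1)
    ]
    return acceleration, jerk


# Hypothetical 1 Hz GPS samples.
accel, jerk = derive_acceleration_and_jerk([0, 1, 2, 3], [0, 2, 6, 12])
```

Since differentiation amplifies noise, such derived features are usually computed after the smoothing and resampling steps discussed in the preprocessing section.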
Feature extraction refers to the process of constructing
derived values (features) that are more informative and can
facilitate subsequent learning steps. Two techniques were
applied: feature engineering to create a new feature from
existing ones and dimension reduction.
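As one illustration of feature engineering, a window of a raw signal can be summarized by a handful of statistical descriptors that then serve as new features. The sketch below uses Python's standard `statistics` module; the function name, the use of the population standard deviation, and the sample window are assumptions of this example.

```python
import statistics


def window_statistics(window):
    """Summarize a signal window with basic statistical descriptors."""
    lowest, highest = min(window), max(window)
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),   # population standard deviation
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
        "median": statistics.median(window),
    }


# Hypothetical throttle-position samples from one window.
features = window_statistics([2, 4, 4, 4, 5, 5, 7, 9])
```

Applied to each window of a speed, acceleration, or throttle signal, these descriptors turn a variable-length time series into a fixed-length feature vector suitable for a classifier.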
TABLE 9. Features extraction methods.
It is very important to extract statistical characteristics
from the data to study driver behavior. These characteristics
can be used to make straightforward behavioral classifications.
The methods used for feature extraction include the mean,
maximum, standard deviation, minimum, PCA, and others, as
shown in Table 9. In [51] and [99], time series data are used
to extract statistical parameters of the attributes (mean,
standard deviation, maximum value, and minimum value) to
represent the maneuvers. The minimum and maximum values of
speed and acceleration were used in [55]. In [71] and [72],
statistical parameters such as the maximum, minimum, range of
values, mean, and standard deviation were used, and they
capture the main differences between the different driver
behaviors. In addition, in [40], feature extraction was based
on the following basic statistical descriptors: mean, maximum
value, standard deviation, and median value. In [103], the
mean and variation are used to find effective trends, and the
predicted error (PE) is then calculated using a second-order
Taylor expansion. Article [57] uses a threshold-based
algorithm to extract driving maneuvers from the path. In [35],
a point detection algorithm is used to estimate the time range
of the signal in search of important events. Papers [65] and
[73] use SVM to extract stability features. In [94], the
authors proposed converting the signals into images using the
recurrence plot technique. A consistent misuse of the controls
or a driving tendency can be identified from the mean and
median values. The maximum value, for instance, denotes
abnormal or aggressive behavior when it exceeds the posted
speed limit. Furthermore, a high standard deviation may
indicate a fast rate of change, which frequently denotes
aggressive behavior. For example, a high standard deviation of
the engine speed or throttle position suggests that the engine
is being driven aggressively. The major properties of a
signal in the frequency domain can be captured using Hjorth
parameters, which are frequently employed in feature
extraction. Activity reflects the signal’s strength (its
variance), mobility reflects its average frequency, and
complexity reflects the variation of its frequency [40].
Kurtosis and skewness describe the degree of dispersion and
the symmetry of the data, respectively. In particular,
skewness measures how much the data deviate from a perfectly
symmetric distribution, whereas kurtosis assesses how heavy-
or light-tailed the data are in comparison to a normal
distribution [40].
FIGURE 10. Type of algorithms.
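The statistical descriptors and Hjorth parameters discussed above can be sketched as follows for a single window of a uniformly sampled driving signal. This is an illustrative sketch rather than the implementation of [40]: the function names and sample values are assumptions, population (not sample) moments are used, and kurtosis is reported as excess kurtosis, so a normal distribution scores 0.

```python
import math
import statistics


def window_features(signal):
    """Basic statistical descriptors of one window of a driving signal."""
    n = len(signal)
    mean = statistics.fmean(signal)
    centered = [x - mean for x in signal]
    m2 = sum(c ** 2 for c in centered) / n  # population variance
    m3 = sum(c ** 3 for c in centered) / n  # third central moment
    m4 = sum(c ** 4 for c in centered) / n  # fourth central moment
    return {
        "mean": mean,
        "median": statistics.median(signal),
        "min": min(signal),
        "max": max(signal),
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }


def hjorth_parameters(signal):
    """Hjorth activity, mobility, and complexity (needs at least 3 samples)."""
    first_diff = [b - a for a, b in zip(signal, signal[1:])]
    second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]
    activity = statistics.pvariance(signal)
    var_d1 = statistics.pvariance(first_diff)
    var_d2 = statistics.pvariance(second_diff)
    if activity == 0 or var_d1 == 0:
        return activity, 0.0, 0.0  # constant or purely linear signal
    mobility = math.sqrt(var_d1 / activity)
    complexity = math.sqrt(var_d2 / var_d1) / mobility
    return activity, mobility, complexity


# Hypothetical windows: a steady ramp and an oscillating signal.
ramp_stats = window_features([1.0, 2.0, 3.0, 4.0, 5.0])
activity, mobility, complexity = hjorth_parameters(
    [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
)
```

A classifier would typically receive one such feature vector per window and per signal (speed, acceleration, throttle position, and so on).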
A feature extraction procedure is also required to reduce
the dimensionality of the feature space and obtain the most
representative parameters from a large number of modified
features. For dimension reduction, papers [29], [57], [103],
[104], and [106] use principal component analysis (PCA) to
reduce dimensionality, mitigate overfitting, and reduce
computational complexity. For video data, the length of the
vectors was fixed to reduce the data dimension.
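To make the PCA step concrete, the sketch below finds only the leading principal component, using power iteration on the sample covariance matrix and nothing beyond the Python standard library. This is a simplification chosen for brevity, not the method of the cited papers, which would normally rely on a full eigendecomposition or SVD from a numerical library. The function name and data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, projections) for the leading principal component.

    rows is a list of equally long feature vectors. The direction is a unit
    vector; projections are the centered rows projected onto it.
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]

    # Power iteration. Caveat: an all-ones start fails if it happens to be
    # orthogonal to the leading eigenvector; a random start avoids this.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical two-feature data lying exactly on the line y = 2x.
direction, projections = first_principal_component(
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
)
```

Here the two correlated features collapse onto a single axis, which is the sense in which PCA reduces the dimensionality of the feature space.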
F. APPROACHES FOR DRIVER BEHAVIOR ANALYSIS (RQ6)
In this section, we identify and analyze the applications
of machine learning, deep learning, and statistical techniques
and algorithms in the field of driver behavior assessment and
classification. Fig. 10 presents the percentage of papers
that have developed ML, DL, or statistical analysis.
It shows that machine learning (ML) algorithms are the most
used; they are present in 60% of previous studies. Deep
learning (DL) algorithms follow with 34.87%. Finally,
statistical methods are the least used, with 5.13%. In this
SLR, we extracted twenty ML algorithms used to classify driver
behavior, including SVM, LR, RF, KNN, k-means clustering, BN,
DT, AdaBoost, and other algorithms presented in Fig. 11.
In addition, we extracted twenty DL algorithms, including
LSTM, CNN, ANN, RNN, DNN, SF-Net, FCN, autoencoders, and other
algorithms presented in Fig. 11. Six statistical techniques
were also used, such as the t-test, ANOVA, and ANCOVA. The
SVM, LR, LSTM, ANN, KNN, RF, and CNN algorithms are the most
commonly used; together they account for 49.72% of the
articles studied. Fig. 11 shows the interest that each
algorithm received in previous studies.
1) DRIVER BEHAVIOR CLASSIFICATION ALGORITHMS
We have analyzed the algorithms used in the selected studies
according to the dataset, data source, type of algorithm, and
type of driver behavior study.
At the dataset level, Fig. 12 shows, for each dataset,
the classification algorithms used to predict and classify
driver behavior. LSTM is the most commonly used algorithm
for UAH-DriveSet, with 11.42%. The SVM, RF, and DT
algorithms are often used with the SHRP2 dataset, with 5.71%
each. SVM and ANN are used with 2.87% in the Driving dataset,
and RNN, LSTM, and GRU are used with the Driver Behavior
Dataset, with 2.85% each. In general, LSTM is the most used
algorithm for driver behavior classification on the extracted
datasets, with 17.14%, followed by SVM with 14.29% and RF with
11.43%; the percentages of the other algorithms are shown in
Fig. 12. We can also conclude that ML algorithms are the most
commonly used to classify driver behavior from datasets,
accounting for 51.43% of the total, while DL algorithms
account for 42.86%.
In addition, Fig. 13 shows that the SVM and LR algorithms
are mostly used to analyze smartphone sensor data. For
simulator data, the SVM algorithm is also recommended. In
general, SVM is the most used algorithm across data sources
for the classification of driver behavior, with 19.60%,
followed by LR and LSTM with 9.49% and 8.23%, respectively;
the percentages of the other algorithms are shown in Fig. 13.
ML algorithms account for 62.66% of the algorithms used for
the classification of driver behavior based on different data
sources, followed by DL algorithms with 32.91% and statistical
methods with 4.43%.
2) DRIVER BEHAVIOR CLASSIFICATION PERFORMANCE
MEASURES
Measuring model performance is crucial to understanding
and quantifying its effectiveness. Through this process,
we can determine which model is the best to use for a
classification or regression task. Fig. 14 shows that the most
popular performance metrics considered in the chosen studies
are accuracy, F1-score, recall, and precision.
In fact, accuracy, the most common evaluation metric in
classification modeling, is adopted in 48.75% of the selected
studies, followed by the F1-score in 12.50% of papers; this
metric is the harmonic mean of the recall and precision
values. Recall (sensitivity) is used in 11.25% of studies to
measure the fraction of actual positive patterns that are
correctly classified, while precision is used in 8.75% to
measure the fraction of patterns predicted as positive that
are actually positive. False-positive rate, MSE, ROC curve,
and AUC are each used at 3.75%, followed by MPE at 2.5% and
specificity, MCC, MAE, and concordance at 1.25% each.
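The most frequently reported metrics above can be computed from the four cells of a binary confusion matrix, as in the following sketch. The labels and predictions are hypothetical and only illustrate the definitions; multi-class studies would average these quantities over classes.

```python
def binary_classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, F1, specificity, and false-positive rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(numerator, denominator):
        # Convention used here: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Hypothetical ground truth and predictions for eight driving segments.
y_true = ["aggressive", "aggressive", "aggressive", "normal",
          "normal", "normal", "normal", "aggressive"]
y_pred = ["aggressive", "aggressive", "normal", "normal",
          "normal", "aggressive", "normal", "aggressive"]
metrics = binary_classification_metrics(y_true, y_pred, positive="aggressive")
```

Accuracy alone can be misleading when one class dominates, which is one reason the F1-score, recall, and precision are reported alongside it.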
3) DRIVER BEHAVIOR CLASSIFICATION PERFORMANCE
In terms of the best-performing algorithm, we cannot decide
exactly which of the algorithms illustrated above (see Fig. 11)
performs better than the others because, in most of the
selected studies, the data source, features, sample size, study
environment, number of participants, meteorological data, and
other factors are all different. To analyze model performance,
we list in Table 10 the algorithms used on the datasets
detailed in Table 5 and their measured performance.
FIGURE 11. Distribution of algorithms.
FIGURE 12. Distribution of driver behavior algorithms used in the dataset extracted from selected studies.
We note that in UAH-DriveSet the highest score obtained
is a 99.49% F1-score with the LSTM algorithm and three
output classes: aggressive, drowsy, and normal. On SPMD, an
accuracy of 60.5% is achieved with the LLDA algorithm, and the
extracted behavior types are aggressive driving, cautious
driving, and moderate driving. A concordance of 98.5% is
obtained on the SHRP2 dataset with the RF algorithm and the
following outputs: eating and drinking, personal hygiene,
phone use while driving, reaching, talking, and talking to
passengers. In addition, 90% accuracy is found on SHRP2 with
the RF algorithm for abnormal and normal behaviors.
Furthermore, the SVM algorithm obtained 90% accuracy on the
Driving dataset for dangerous and safe driver behavior. With
the GRU algorithm and the Driver Behavior Dataset, 95%
accuracy is attained, and the driver behavior is classified as
aggressive acceleration, braking, left and right turns, and
left and right lane changes, or as non-aggressive events.
Finally, on the 100-CAR Naturalistic Driving Data Set, the SVM
algorithm achieved 90% accuracy for dangerous and normal
driver behavior.
FIGURE 13. Distribution of driver behavior algorithms used in the data source extracted from selected studies.
FIGURE 14. Performance metrics.
FIGURE 15. Type of models.
4) DRIVER BEHAVIOR APPROACHES
The algorithms identified earlier were generally used to
assess driver behavior in three forms: (1) detecting driver
behavior; (2) predicting and classifying driver behavior; and
(3) using statistics to study driver behavior. Fig. 15
describes the driver behavior study techniques and models:
49.15% of studies use algorithms to predict driver behavior,
followed by 32.20% that use detection models and 18.64%
that use statistical models.
In addition, in detection models, the authors use ML
algorithms in 54% of cases and DL algorithms in 44%. The same
pattern holds in prediction models: ML algorithms account for
64.29%, DL algorithms for 33.67%, and statistical methods
for 2.04%. In statistical models, researchers also use more
ML algorithms, with 57.14%, than statistical algorithms such
as ARIMA, with 42.86% (see Fig. 16).
TABLE 10. The results of the algorithms based on the targets for each dataset.
For each type of driver behavior study, we identified
the most commonly chosen type of algorithm (Fig. 17).
Machine learning algorithms are the most frequently used
algorithms in driving simulator studies, field driving studies,
naturalistic driving studies, and questionnaire studies, with
60.98%, 62.82%, 54.17%, and 64.29%, respectively. Deep
learning algorithms follow, with 37.50% in naturalistic
driving studies and 34.62% in field driving studies. In
questionnaire studies, 35.71% used statistical techniques.
V. DISCUSSION
The SLR protocol used in this article is based on the
following objectives:
RQ1: The classification of driver behavior is a complex
and difficult objective. It has not yet been addressed in
detail in previous studies, in part because the types of driver
behavior (targets) are not unified and many overlaps and
interdependencies can be detected between the terms used to
describe the same target. To work toward this goal, we
extracted driver behaviors (DB) from the papers studied in
this SLR, analyzed the dependencies and relations between
them, and categorized them into abnormal DB, aggressive DB,
vehicle stopping for DB, line deviation, and driver’s status.
Each category includes several types of DB: abnormal DB
(normal, abnormal, safe, dangerous, positive, negative, high
risk, etc.), aggressive DB (aggressive behaviors and their
types and levels of aggression), vehicle stopping (stop, run,
go, rolling stop, etc.), line deviation (swerving, side
slipping, fast U-turn, successful lane changing, etc.), and
driver’s status (drowsiness, stress, use of cell phone,
talking, eating, etc.). An accurate study allows us to group
and categorize the types of driver behavior, which is a
necessary goal.
FIGURE 16. Classification of model types according to algorithm types.
FIGURE 17. Classification of types of studies according to algorithm types.
RQ2: In the selected papers, four types of studies were
identified for classifying driver behavior: questionnaire
study, naturalistic driving study (NDS), field driving study
(FDS), and driving simulator study. The questionnaire reflects
the subjective opinions of the driver rather than the driver’s
actual performance on the road, and the other types involve
manual control of the driving environment to induce a specific
driver behavior. In addition, field driving studies are the
most commonly used to study driver behavior, using cameras,
sensors, and other devices.
RQ3: From this SLR, we identified the different data
sources used to extract data and classify driver behavior.
These data sources, classified in Table 3, include simulators,
cameras, GPS, smartphone sensors, accelerometers, gyroscopes,
and other data sources. In addition, we have
synthesized the datasets used in several articles, such as
SHRP2 (Strategic Highway Research Program 2), UAH-DriveSet,
SPMD (Safety Pilot Model Deployment), NGSIM (Next-Generation
Simulation), Driver Behavior Dataset, 100-CAR
Naturalistic Driving Data Set, and other datasets (Table 5).
From these sources and datasets, many different features
are extracted. Due to the complexity of the problem and
the variety of sources and datasets, the number of extracted
features is large (39 large feature categories and 225
sub-features). Among the most important features (the most
frequent ones in the selected articles) are vehicle speed,
acceleration/deceleration, rotation rate, pedal, acceleration
(lateral and longitudinal), time (time to the intersection,
time to a stop line, and time to a lane crossing), traffic
condition, personal information, steering, and physiological
and psychological signals. Because of the different data
sources, the volume of data, and the different traffic
characteristics, the problem of classifying driving behavior
is increasingly complex. However, because these data are large
and of different types, the research potential in this area is
promising, especially with the development of ML, DL, and
statistical techniques.
RQ4 and RQ5: The data preprocessing techniques discussed in
the selected articles in the field of driver behavior are
cleaning, noise removal, resampling, normalization, data
augmentation, and the handling of imbalanced data. We find
that these techniques are not used extensively enough to
achieve the objective of driver behavior classification. For
example, the number of features extracted in each article is
too high to allow us to specify the features that are
important for classifying driver behavior. Furthermore,
because the data in this topic are typically temporal, they
require a complex structure and preprocessing to yield
significant results. The problem of imbalanced data should
also not be overlooked, as it is not sufficiently covered in
the selected papers. In addition, data labeling is a topic
that most researchers do not address and that others do not
discuss clearly and adequately. From the papers studied, we
found a set of feature extraction techniques based on
mathematical forms such as the minimum, maximum, and variance,
as well as on artificial intelligence algorithms such as SVM,
RF, and PCA. At the level of feature selection, we found that
most papers rely on domain knowledge to choose features.
However, ML and DL techniques offer enormous potential for
feature selection: many techniques are available for this
purpose and allow for attribute selection that improves the
performance of the models.
RQ6: The SVM, LR, and LSTM algorithms are the most
commonly used across the various datasets to classify driver
behavior. In general, ML algorithms are applied more often in
the different types of studies, while DL algorithms are less
used. The question that arises is therefore why machine
learning algorithms are more widely used than deep learning
algorithms, even though the data change over time, have a
large number of features, and come from heterogeneous data
sources. We can say that the potential of DL techniques is
not sufficiently exploited for DB study.
In general, we have synthesized from the literature on
driver behavior a set of factors related to the external
environment of the car that can influence driver behavior,
such as road conditions, traffic conditions, weather
conditions, and the presence of pedestrians, vehicles,
motorcycles, and cyclists. Beyond that, we need a global
system that classifies driver behavior according to the
external environment of the car.
The use of real-time driver behavior classification in road
vehicles has many implications for road safety. First, in-car
warning systems can alert drivers to unusual driving behavior
and encourage them to be cautious. Second, it can be used to
create a police warning system for abnormally driven vehicles
by identifying, for example, vehicles that are likely to be
dangerous or unsafe. This is achieved by analyzing the
vehicle’s distance traveled, speed, and other measures. Such
a system makes it possible to stop a driver before an accident
occurs and allows governments to impose penalties on reckless
drivers to maintain road safety and traffic control. Third, it
helps companies make better decisions about hiring drivers and
predict the behavior of their employees. Finally, it can
detect areas of abnormal behavior to assist the government in
making decisions to improve road safety in those areas. All of
these applications can help prevent accidents and reduce the
cost of repairs.
VI. CONCLUSION
The aim of this review is to identify existing classification
systems for driver behavior. Using SLR guidelines and
procedures, we analyzed and evaluated past research in the
field of driver behavior. We followed the systematic
literature review approach in this study and used digital
databases such as ScienceDirect, the IEEE Xplore digital
library, SpringerLink, the DBLP database, and Google Scholar
to extract the information. This systematic review examined
the literature on driver behavior classification from 2015 to
2022. In total, we identified 93 primary empirical studies
that are relevant to the research questions (RQs) posed in
this review. The results showed that field driving studies are
the most widely used to study driver behavior classification.
In addition, this SLR shows that there are many types of
driver behavior classifications. We have classified driver
behavior as abnormal DB, aggressive DB, vehicle stopping for
DB, line deviation, and driver status. We then identified the
data sources and datasets utilized to analyze and predict
driver behavior. The Strategic Highway Research Program 2 Data
Set (SHRP2) and the UAH-DriveSet are the most commonly used
datasets in driver behavior classification. The simulator and
the camera are the most popular data sources for this problem.
We explored the preprocessing, feature selection, and feature
extraction techniques used in the papers studied.
Additionally, the SVM, LR, and LSTM algorithms are widely used
to classify driver behavior. In general, machine learning
algorithms are the most present in this problem of driver
behavior classification, with 60%, followed by deep learning
algorithms with 32.73%.
This study has some limitations in its article search,
as only journals and conferences indexed in Scopus
between 2015 and 2022 were used. Additionally, only articles
based on a system, method, or machine and deep learning
algorithms for driver behavior classification were
selected and analyzed. We concentrated mostly on articles
dealing with the study of driver behavior from the outside of
the car, i.e., the external environment.
For future research, we will first focus on studying the
different data sources that allow us to extract driver
features in order to propose a semantic categorization of
these features that provides a more accurate description of
the driver’s behavior. Furthermore, we will attempt to propose
a DB classification system capable of classifying driving
behavior using multiple data sources and leveraging the
potential of deep learning algorithms for time series.
Finally, we need to reduce the problems of driver behavior and
study how this behavior is related to the environment, such as
intersections, pedestrians, and weather.
REFERENCES
[1] S. A. Shaheen and R. Finson, ‘‘Intelligent transportation systems?’’
in Reference Module in Earth Systems and Environmental
Sciences. Amsterdam, The Netherlands: Elsevier, 2016,
Art. no. B9780124095489012000, doi: 10.1016/B978-0-12-409548-
9.01108-8.
[2] N. Kumar, K. Ankit, and A. Pathani, ‘‘Intelligent transportation system:
Review paper,’’ Int. J. Sci. Eng. Res., vol. 8, no. 12, p. 5, 2017.
[3] J.-P. Rodrigue, C. Comtois, and B. Slack, The Geography of
Transport Systems, 3rd ed. London, U.K.: Routledge, 2013, doi:
10.4324/9780203371183.
[4] B. Singh and A. Gupta, ‘‘Recent trends in intelligent transportation sys-
tems: A review,’’ J. Transp. Literature, vol. 9, no. 2, pp. 30–34, Apr. 2015,
doi: 10.1590/2238-1031.jtl.v9n2a6.
[5] A. Paul, N. Chilamkurti, A. Daniel, and S. Rho, ‘‘Intelligent transporta-
tion systems,’’ in Intelligent Vehicular Networks and Communications,
2017, pp. 21–41, doi: 10.1016/B978-0-12-809266-8.00002-8.
[6] Y. Lin, P. Wang, and M. Ma, ‘‘Intelligent transportation system (ITS):
Concept, challenge and opportunity,’’ in Proc. IEEE 3rd Int. Conf.
Big Data Secur. Cloud (BigDataSecurity) Int. Conf. High Perform. Smart
Comput., (HPSC) IEEE Int. Conf. Intell. Data Secur. (IDS), May 2017,
pp. 167–172, doi: 10.1109/BigDataSecurity.2017.50.
[7] J. Oskarbski, T. Marcinkowski, and M. Zawisza, ‘‘Impact of intelligent
transport systems services on the level of safety and improvement of
traffic conditions,’’ in Smart Solutions in Today’s Transport, vol. 715,
J. Mikulski, Ed. Cham, Switzerland: Springer, 2017, pp. 142–154, doi:
10.1007/978-3-319-66251-0_12.
[8] M. Boltze and V. A. Tuan, ‘‘Approaches to achieve sustainability in
traffic management,’’ Proc. Eng., vol. 142, pp. 205–212, Jan. 2016, doi:
10.1016/j.proeng.2016.02.033.
[9] F. B. Aghdam, H. Sadeghi-Bazargani, S. Azami-Aghdash, A. Esmaeili,
H. Panahi, M. Khazaee-Pool, and M. Golestani, ‘‘Developing a national
road traffic safety education program in Iran,’’ BMC Public Health,
vol. 20, no. 1, p. 1064, Jul. 2020, doi: 10.1186/s12889-020-09142-1.
[10] Global Status Report on Road Safety 2018, World Health Organization,
Geneva, Switzerland, 2018. Accessed: Dec. 26, 2021. [Online]. Available:
https://apps.who.int/iris/handle/10665/324835
[11] P. C. Jain, ‘‘Trends in next generation intelligent transportation sys-
tems,’’ in Artificial Intelligence, vol. 6, M. Găiceanu, Ed. London, U.K.:
IntechOpen, 2021, doi: 10.5772/intechopen.97690.
[12] I. Harris, ‘‘Embedded software for automotive applications,’’ in Software
Engineering for Embedded Systems, R. Oshana and M. Kraeling, Eds.
Oxford, U.K.: Newnes, 2013, ch. 22, pp. 767–816, doi: 10.1016/B978-0-
12-415917-4.00022-0.
[13] S. M. Kouchak and A. Gaffar, ‘‘Estimating the driver status using long
short term memory,’’ in Machine Learning and Knowledge Extraction,
vol. 11713, A. Holzinger, P. Kieseberg, A. M. Tjoa, and E. Weippl, Eds.
Cham, Switzerland: Springer, 2019, pp. 67–77, doi: 10.1007/978-3-030-
29726-8_5.
[14] Y. H. Bhosale and K. S. Patnaik, ‘‘IoT deployable lightweight deep learn-
ing application for COVID-19 detection with lung diseases using Rasp-
berryPi,’’ in Proc. Int. Conf. IoT Blockchain Technol. (ICIBT), May 2022,
pp. 1–6, doi: 10.1109/ICIBT52874.2022.9807725.
[15] Y. Matrane, F. Benabbou, and N. Sael, ‘‘Sentiment analysis through word
embedding using AraBERT: Moroccan dialect use case,’’ in Proc. Int.
Conf. Digit. Age Technol. Adv. Sustain. Develop. (ICDATA), Jun. 2021,
pp. 80–87, doi: 10.1109/ICDATA52997.2021.00024.
[16] S. Bouhsissin, N. Sael, and F. Benabbou, ‘‘Enhanced VGG19 model
for accident detection and classification from video,’’ in Proc. Int.
Conf. Digit. Age Technol. Adv. Sustain. Develop. (ICDATA), Jun. 2021,
pp. 39–46, doi: 10.1109/ICDATA52997.2021.00017.
[17] Y. H. Bhosale, S. Zanwar, Z. Ahmed, M. Nakrani, D. Bhuyar, and
U. Shinde, ‘‘Deep convolutional neural network based COVID-19 classi-
fication from radiology X-ray images for IoT enabled devices,’’ in Proc.
8th Int. Conf. Adv. Comput. Commun. Syst. (ICACCS), vol. 1, Mar. 2022,
pp. 1398–1402, doi: 10.1109/ICACCS54159.2022.9785113.
[18] Y. H. Bhosale and K. S. Patnaik, ‘‘Application of deep learning techniques
in diagnosis of COVID-19 (coronavirus): A systematic review,’’ Neural
Process Lett., Sep. 2022, doi: 10.1007/s11063-022-11023-0.
[19] Y. Peng, C. Li, K. Wang, Z. Gao, and R. Yu, ‘‘Examining imbal-
anced classification algorithms in predicting real-time traffic crash risk,’’
Accident Anal. Prevention, vol. 144, Sep. 2020, Art. no. 105610, doi:
10.1016/j.aap.2020.105610.
[20] S. Bouhsissin, N. Sael, and F. Benabbou, ‘‘Prediction of risks in intel-
ligent transport systems,’’ in Proc. Int. Conf. Big Data Internet Things,
vol. 489, M. Lazaar, C. Duvallet, A. Touhafi, and M. A. Achhab, Eds.
Cham, Switzerland: Springer, 2022, pp. 303–316, doi: 10.1007/978-3-
031-07969-6_23.
[21] L. Evans, ‘‘The dominant role of driver behavior in traffic safety,’’
Amer. J. Public Health, vol. 86, no. 6, pp. 784–786, Jun. 1996, doi:
10.2105/AJPH.86.6.784.
[22] T. K. Chan, C. S. Chin, H. Chen, and X. Zhong, ‘‘A comprehensive
review of driver behavior analysis utilizing smartphones,’’ IEEE Trans.
Intell. Transp. Syst., vol. 21, no. 10, pp. 4444–4475, Oct. 2020, doi:
10.1109/TITS.2019.2940481.
[23] C. M. Martinez, M. Heucke, B. Gao, D. Cao, and F.-Y. Wang, ‘‘Driv-
ing style recognition for intelligent vehicle control and advanced driver
assistance: A survey,’’ IEEE Trans. Intell. Transp. Syst., vol. 19, no. 3,
pp. 666–676, Mar. 2018, doi: 10.1109/TITS.2017.2706978.
[24] M. Q. Khan and S. Lee, ‘‘A comprehensive survey of driving monitoring
and assistance systems,’’ Sensors, vol. 19, no. 11, p. 2574, Jun. 2019, doi:
10.3390/s19112574.
[25] A. Soultana, F. Benabbou, and N. Sael, ‘‘Context-awareness in the smart
car: Study and analysis,’’ in PervasiveHealth: Pervasive Computing Tech-
nologies for Healthcare, Oct. 2019, doi: 10.1145/3368756.3369019.
[26] H.-B. Kang, ‘‘Various approaches for driver and driving behavior mon-
itoring: A review,’’ in Proc. IEEE Int. Conf. Comput. Vis. Workshops,
Dec. 2013, pp. 616–623, doi: 10.1109/ICCVW.2013.85.
[27] L. M. Bergasa, J. Nuevo, M. A. Sotelo, R. Barea, and M. E. Lopez,
‘‘Real-time system for monitoring driver vigilance,’’ IEEE Trans.
Intell. Transp. Syst., vol. 7, no. 1, pp. 63–77, Mar. 2006, doi:
10.1109/TITS.2006.869598.
[28] C. Wang, A. Musaev, P. Sheinidashtegol, and T. Atkison, ‘‘Towards detec-
tion of abnormal vehicle behavior using traffic cameras,’’ in Big Data—
BigData 2019, vol. 11514, K. Chen, S. Seshadri, and L.-J. Zhang, Eds.
Cham, Switzerland: Springer, 2019, pp. 125–136, doi: 10.1007/978-3-
030-23551-2_9.
[29] A. E. Abdelrahman, H. S. Hassanein, and N. Abu-Ali, ‘‘Robust
data-driven framework for driver behavior profiling using supervised
machine learning,’’ IEEE Trans. Intell. Transport. Syst., vol. 23, no. 4,
pp. 3336–3350, Apr. 2022, doi: 10.1109/TITS.2020.3035700.
[30] R. Wang, F. Xie, J. Zhao, B. Zhang, R. Sun, and J. Yang, ‘‘Smartphone
sensors-based abnormal driving behaviors detection: Serial-feature net-
work,’’ IEEE Sensors J., vol. 21, no. 14, pp. 15719–15728, Jul. 2021, doi:
10.1109/JSEN.2020.3036862.
[31] A. H. Ali, A. Atia, and M.-S.-M. Mostafa, ‘‘Recognizing driv-
ing behavior and road anomaly using smartphone sensors,’’ Int. J.
Ambient Comput. Intell., vol. 8, no. 3, pp. 22–37, Jul. 2017, doi:
10.4018/IJACI.2017070102.
[32] K. Bahadoor and P. Hosein, ‘‘Application for the detection of dangerous
driving and an associated gamification framework,’’ in Proc. IEEE 4th
Int. Conf. Future Internet Things Cloud Workshops (FiCloudW), Vienna,
Austria, Aug. 2016, pp. 276–281, doi: 10.1109/W-FiCloud.2016.63.
[33] D. Babić, D. Babić, H. Cajner, A. Sruk, and M. Fiolić, ‘‘Effect
of road markings and traffic signs presence on young driver stress
level, eye movement and behaviour in night-time conditions: A driv-
ing simulator study,’’ Safety, vol. 6, no. 2, p. 24, May 2020, doi:
10.3390/safety6020024.
[34] H. H. Awan, A. Pirdavani, A. Houben, S. Westhof, M. Adnan, and
T. Brijs, ‘‘Impact of perceptual countermeasures on driving behavior at
curves using driving simulator,’’ Traffic Injury Prevention, vol. 20, no. 1,
pp. 93–99, Jan. 2019, doi: 10.1080/15389588.2018.1532568.
[35] H. Eren, S. Makinist, E. Akin, and A. Yilmaz, ‘‘Estimating driving
behavior by a smartphone,’’ in Proc. IEEE Intell. Vehicles Symp., Madrid,
Spain, Jun. 2012, pp. 234–239, doi: 10.1109/IVS.2012.6232298.
[36] C. Bao, C. Chen, H. Kui, and X. Wang, ‘‘Safe driving at traffic lights:
An image recognition based approach,’’ in Proc. 20th IEEE Int. Conf.
Mobile Data Manage. (MDM), Hong Kong, Jun. 2019, pp. 112–117, doi:
10.1109/MDM.2019.00-67.
[37] Q. Wang, Y. Liu, J. Liu, Y. Gu, and S. Kamijo, ‘‘Critical areas
detection and vehicle speed estimation system towards intersection-
related driving behavior analysis,’’ in Proc. IEEE Int. Conf. Con-
sum. Electron. (ICCE), Las Vegas, NV, USA, Jan. 2018, pp. 1–6, doi:
10.1109/ICCE.2018.8326122.
[38] X. Huang, J. Sun, and J. Sun, ‘‘A car-following model considering asym-
metric driving behavior based on long short-term memory neural net-
works,’’ Transp. Res. C, Emerg. Technol., vol. 95, pp. 346–362, Oct. 2018,
doi: 10.1016/j.trc.2018.07.022.
[39] V. Vignali, A. Bichicchi, A. Simone, C. Lantieri, G. Dondi, and M. Costa,
‘‘Road sign vision and driver behaviour in work zones,’’ Transp.
Res. F, Traffic Psychol. Behav., vol. 60, pp. 474–484, Jan. 2019, doi:
10.1016/j.trf.2018.11.005.
[40] E. Lattanzi and V. Freschi, ‘‘Machine learning techniques to
identify unsafe driving behavior by means of in-vehicle sensor
data,’’ Expert Syst. Appl., vol. 176, Aug. 2021, Art. no. 114818, doi:
10.1016/j.eswa.2021.114818.
[41] S. Ramyar, A. Homaifar, A. Karimoddini, and E. Tunstel, ‘‘Identifica-
tion of anomalies in lane change behavior using one-class SVM,’’ in
Proc. IEEE Int. Conf. Syst., Man, Cybern. (SMC), Budapest, Hungary,
Oct. 2016, pp. 4405–4410, doi: 10.1109/SMC.2016.7844924.
[42] D. Wang, X. Pei, L. Li, and D. Yao, ‘‘Risky driver recognition based
on vehicle speed time series,’’ IEEE Trans. Human-Mach. Syst., vol. 48,
no. 1, pp. 63–71, Feb. 2018, doi: 10.1109/THMS.2017.2776605.
[43] M. Jeon, E. Yang, E. Oh, J. Park, and C.-H. Youn, ‘‘A deterministic
feedback model for safe driving based on nonlinear principal analysis
scheme,’’ Proc. Comput. Sci., vol. 113, pp. 454–459, Jan. 2017, doi:
10.1016/j.procs.2017.08.301.
[4