Received 29 October 2024, accepted 10 November 2024, date of publication 14 November 2024,
date of current version 25 November 2024.
Digital Object Identifier 10.1109/ACCESS.2024.3498596
SafeSmartDrive: Real-Time Traffic Environment
Detection and Driver Behavior Monitoring
With Machine and Deep Learning
SOUKAINA BOUHSISSIN, NAWAL SAEL, FAOUZIA BENABBOU,
ABDELFETTAH SOULTANA, AND AYOUB JANNANI
Laboratory of Information Technology and Modeling, Faculty of Sciences Ben M’Sick, Hassan II University of Casablanca, Casablanca 20000, Morocco
Corresponding author: Soukaina Bouhsissin (bouhsissin.soukaina@gmail.com)
ABSTRACT The advancement of intelligent transportation systems is crucial for improving road safety and
optimizing traffic flow. In this paper, we present SafeSmartDrive, an integrated transportation monitoring
system designed to detect and assess critical elements in the driving environment while simultaneously mon-
itoring driver behavior. The system is structured into four key layers: perception, filtering and preparation,
detection and classification, and alert. SafeSmartDrive focuses on two primary objectives: (1) detecting
and assessing essential traffic elements, including vehicles (buses, cars, motorcycles, trucks, bicycles),
traffic signs and lights, pedestrians, animals, infrastructure damage, accident classification, and traffic risk
assessment, and (2) evaluating driver behavior across various road types, such as highways, secondary roads,
and intersections. Machine learning and deep learning algorithms are employed throughout the system’s
components. For traffic element detection, we utilize YOLOv9 in this paper, which outperforms previous
versions like YOLOv7 and YOLOv8, achieving a precision of 83.1%. Finally, we present the evaluation
of the SafeSmartDrive system’s real-time detection capabilities in a specific scenario in Casablanca.
SafeSmartDrive’s comprehensive architecture offers a novel approach to improving road safety through the
integration of advanced detection, classification, and risk assessment capabilities.
INDEX TERMS Road safety, driver behavior, real-time monitoring, environment detection, vehicle detec-
tion, traffic signs, deep learning, sustainability, ESG goals.
I. INTRODUCTION
Intelligent Transportation Systems (ITS) have emerged as a
solution to address the issue of road insecurity, particularly
concerning driver behavior and interaction with the environ-
ment. ITS utilizes technology and connectivity to enhance
road safety by providing real-time information, warnings,
and guidance to drivers [1]. By integrating various sensors,
cameras, and communication systems, ITS can monitor and
analyze driver behavior, including speeding, sudden lane
changes, and distracted driving, and can issue speed limit notifications.
Moreover, the visual perception technology of intelligent
vehicles can assist automatic driving systems in accurately
perceiving complex environments in traffic scenarios, which
is essential for safe driving and collision avoidance.
Through the utilization of advanced algorithms and arti-
ficial intelligence, ITS can alert drivers of potential dangers
and assist in making safer decisions while on the road. Over-
all, ITS plays a crucial role in mitigating road insecurity
by focusing on improving driver behavior and promoting
a safer and more efficient driving experience. Furthermore,
sustainability has become a key consideration in modern
transportation systems. By optimizing traffic flow, reducing
congestion, and minimizing accidents, ITS can significantly
lower fuel consumption and emissions, contributing to a
greener, more efficient transportation infrastructure. Early
detection of infrastructure issues also helps prevent costly and
resource-intensive repairs, promoting the long-term sustain-
ability of road networks.
In general, a system that leverages the evolution of AI
applications in transportation and takes into account the
three contexts of the driver, the car, and the environment
will be a significant innovation and solution in the trans-
portation system. To the best of our knowledge, no existing
system comprehensively addresses
these problems while accounting for technological advance-
ments and all factors that contribute to the relationship
between drivers, cars, and the environment. The objective
is to propose a system that supports drivers in monitoring
their behavior on the road and assists them in safe driving
practices.
While current ITS technologies have made significant
advancements, key challenges persist in accurately detecting
and interpreting complex driving environments and driver
behaviors in real time. Most existing systems focus solely on
either vehicle detection or driver behavior monitoring but lack
a comprehensive solution that integrates critical factors such
as traffic elements, road conditions, damage detection, acci-
dent detection, and driver behaviors. This research aims to
address this gap by proposing SafeSmartDrive, an integrated
intelligent transportation monitoring system that delivers a
real-time, multi-layered approach to enhance both driver
safety and environmental detection.
By providing a comprehensive solution that simulta-
neously monitors and processes data from the vehicle,
the driver, and the external environment, SafeSmartDrive
addresses the limitations of current systems. Unlike exist-
ing systems that focus on isolated aspects, SafeSmartDrive
combines advanced machine learning (ML) and deep learn-
ing (DL) techniques to analyze data from multiple sensors
and cameras, thus providing more accurate detection and
classification of traffic environments and driver behaviors.
This system offers a novel four-layer architecture: perception,
filtering and preparation, detection and classification, and
alert, designed to continuously update models and provide
real-time feedback to drivers.
In practical terms, the system is designed to improve
road safety by delivering real-time alerts to drivers based
on detected risks, such as hazardous road conditions, unsafe
driving behaviors, and potential collisions. By incorporating
this multi-layered approach, SafeSmartDrive offers a solu-
tion that not only detects traffic elements but also provides
insights into how driver behavior interacts with these ele-
ments, offering a significant advancement in the field of
intelligent transportation systems.
The key innovation in this system lies in the use of
the YOLOv9 model for traffic element detection, which
improves upon previous models (YOLOv7, YOLOv8) by
achieving higher accuracy in detecting vehicles, traffic signs,
and other road elements. Additionally, SafeSmartDrive evalu-
ates driver behavior in various contexts, including highways,
intersections, and secondary roads, classifying it into normal,
aggressive, and drowsy driving patterns. This combination
of environmental monitoring and driver behavior analysis
distinguishes SafeSmartDrive from other ITS solutions by
offering a comprehensive approach to traffic risk assessment
and accident prevention.
The contributions of this research are summarized as
follows:
1. We conducted a review of the latest research in traffic
element detection and driver behavior monitoring.
2. We introduce the conceptual SafeSmartDrive system,
integrating multiple layers for real-time transportation
monitoring.
3. This study presents a novel application of the YOLOv9
model for traffic element detection, which includes
vehicles such as buses, cars, motorcycles, trucks, and
bicycles, as well as traffic infrastructure such as traffic
lights and signs. The analysis compares the model’s
performance with earlier models, demonstrating its
superior accuracy.
The paper is organized as follows: Section II provides
background on intelligent transportation system (ITS) chal-
lenges and driver behavior analysis. Section III exposes
the related works related to the elements that constitute an
intelligent transportation system, focusing on the driving
environment and the driver behavior. Section IV presents a
conception of a novel intelligent transportation monitoring
system (SafeSmartDrive). In Section V, we introduce our
methodology, which utilizes transfer learning algorithms to
detect vehicles, traffic signs, and lights, and we present our
findings. In Section VI, we evaluate the real-time detec-
tion capabilities of the SafeSmartDrive system in a specific
scenario set in Casablanca. Section VII discusses the pro-
posed system, its advantages, and its challenges. Finally,
in Section VIII, we present the conclusion of the paper and
discuss future perspectives.
II. BACKGROUND
A. INTELLIGENT TRANSPORTATION SYSTEMS CONTEXTS
To comprehensively understand and analyze issues in Intelli-
gent Transportation Systems (ITS), it is essential to consider
the interactions among three key elements: the driver, the
surrounding environment (environmental context), and the
vehicle (car context) [2]. These three contexts interconnect
and significantly influence driver behavior on the road:
Driver Context: This context focuses on the individual
behind the wheel, encompassing factors such as the
driver’s physical and mental state (e.g., fatigue,
distraction, impairment), and even emotional states that
may be evident through facial expressions. Understand-
ing the driver’s context is crucial because it directly
affects how they perceive and react to both the vehicle
and the environmental conditions.
Car Context: The car context includes all aspects
related to the vehicle itself, such as speed, acceleration,
orientation (position and direction), vehicle condition
(e.g., mechanical issues), and the performance of in-
car technologies. The vehicle’s response to the driver’s
inputs and the external environment significantly influ-
ences driver behavior.
Environmental Context: This context encompasses
everything external to the vehicle, including traffic con-
ditions (e.g., congestion, traffic flow), road geometry
(e.g., curves, intersections), the presence of obstacles
(e.g., potholes, speed bumps, cracks, manhole covers),
detection of road users (e.g., cars, buses, pedestrians),
traffic signs, signals, and weather conditions (e.g., rain,
fog, snow). Environmental factors have a substantial
impact on driver behavior, requiring drivers to adapt to
changing conditions and respond to potential hazards.
The study and analysis of these contexts require careful
consideration of the challenges and complexities associated
with intelligent transportation systems (ITS) and the factors
that influence driver behavior in these contexts:
Complexity of Driver Behavior: Despite advance-
ments in ITS technology, understanding and predicting
driver behavior remains challenging due to the multi-
faceted nature of human behavior [3].
Complexity of Environment: The dynamic nature of
the surrounding environment presents challenges for
ITS in accurately perceiving and responding to changing
road conditions. Factors such as obstacles, traffic ele-
ments [4], [5], accidents [6], [7], and other road hazards
necessitate robust sensing and decision-making capabil-
ities to ensure safe driving.
B. TRANSFER LEARNING
YOLO (You Only Look Once) is a succession of real-time
object detection models that have progressed through various
iterations, each designed to improve accuracy, speed, and
efficiency. These models employ a singular neural network
architecture that divides images into a grid, simultaneously
predicting bounding boxes and class probabilities for each
grid cell. This architecture allows YOLO models to achieve
real-time performance, making them highly suitable for
applications like traffic monitoring, where speed and accu-
racy are critical. We delineate the progression of essential
YOLO models utilized in our system below:
YOLOv7 is acknowledged for its efficiency and superior
detection capabilities. In comparison to its predecessor,
YOLOv5, YOLOv7 demonstrates improvements in both
speed and accuracy. YOLOv7’s architecture incorpo-
rates innovative methods, including Extended Efficient
Layer Aggregation Networks (E-ELAN), facilitating
enhanced feature learning without elevating computing
expenses [8]. Furthermore, YOLOv7 integrates model
pruning and dynamic label assignment, which enhance
the efficiency of the detection process. These attributes
allow YOLOv7 to sustain real-time processing speeds
while markedly enhancing detection efficacy, render-
ing it particularly effective in contexts containing small
objects or complex environments.
YOLOv8 offers additional improvements in perfor-
mance, accuracy, and adaptability, building on
YOLOv7 [9]. Cross-stage partial networks (CSPNet),
introduced in YOLOv8, enhance gradient flow and
lessen computing load while maintaining a greater level
of object detection precision. Additionally, it makes
use of Path Aggregation Networks (PANet) to improve
feature fusion across multiple layers, strengthening the
model’s ability to handle complex visual situations.
These enhancements render YOLOv8 more versatile
for diverse real-world situations, particularly in intricate
traffic environments where simultaneous identification
of many objects is essential.
YOLOv9, the most recent version, presents signifi-
cant improvements in real-time object detection [10].
YOLOv9 integrates the advantages of its predeces-
sors, featuring enhancements in precision, velocity, and
efficacy. It utilizes dual-path networks (DPNs) and
optimized spatial pyramid pooling (SPP), allowing the
model to efficiently concentrate on both global and
local input. YOLOv9 employs sophisticated approaches,
including attention mechanisms and transformer-based
layers, to improve detection accuracy, particularly in
difficult settings characterized by object occlusion or
varying lighting conditions.
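To make the single-pass detection paradigm concrete, the following minimal sketch runs inference with a pretrained YOLOv9 checkpoint through the ultralytics Python package; the package choice, weight file name, and image path are illustrative assumptions rather than details of our experimental setup.

```python
# Minimal YOLO inference sketch (assumes the ultralytics package and a
# pretrained YOLOv9 checkpoint; file names are illustrative).
from ultralytics import YOLO

model = YOLO("yolov9c.pt")                       # load pretrained weights
results = model("traffic_scene.jpg", conf=0.25)  # one forward pass per image

for r in results:
    for box in r.boxes:
        label = model.names[int(box.cls)]         # predicted class
        score = float(box.conf)                   # confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()     # bounding box corners
        print(f"{label}: {score:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```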
III. RELATED WORKS
In this section, we review the papers that inform the design
of our monitoring system. Based on these challenges, we have
categorized the state of the art into two primary areas: envi-
ronment and driver behavior. In the environment category,
we focus on articles related to traffic element detection,
damage detection, and traffic risk assessment. In the driver
behavior category, we concentrate on articles concerning the
classification of driver behaviors in connection with the chal-
lenges posed by the vehicle and the drivers themselves.
A. ENVIRONMENT
1) TRAFFIC ELEMENTS DETECTION
Recent years have witnessed an increase in research dedi-
cated to vehicle identification and classification, motivated
by its various uses in intelligent traffic management systems.
Most applications in Intelligent Transportation Systems (ITS)
linked to vehicle identification and classification focus on
traffic accident investigation, traffic flow monitoring, fleet
and transport management, autonomous driving, and sim-
ilar domains. The research presented in [11] proposes an
enhanced method utilizing YOLOv4 for real-time object
detection under challenging weather circumstances. The
method seeks to improve accuracy and efficiency in target
detection by employing the BDD100K dataset. The approach
attains a mean average precision of 60.3%. The paper [12]
introduces an enhanced variant of YOLOv5 to showcase
its efficacy in identifying distant, diminutive, or rotating
objects in images and videos. Utilizing the MS COCO
2017 dataset, the model attains a remarkable mean average
precision of 90% with a training loss of 0.025. The paper [13]
utilizes YOLOv5s for detecting small objects in traffic sce-
narios, improving the architecture via attention feature fusion
for autonomous driving systems. Employing the BDD100k
dataset, the model attains a mean average precision of 51.5%.
The publication [4] presents a methodology for the iden-
tification and classification of automobiles in aerial image
sequences. The work achieves vehicle detection using the
YOLOv8 algorithm and the VEDAI dataset. Upon detection,
characteristics are retrieved from the identified vehicles and
employed to train a Deep Belief Network (DBN) classifier
for vehicle categorization. The suggested method attains an
accuracy of 95.6% on the VEDAI dataset.
The rapid expansion of road networks and the grow-
ing intricacy of traffic situations highlight the essential
requirement for effective and reliable traffic sign recogni-
tion systems. The paper [14] discusses the categorization of
traffic signals utilizing multiple models, such as VGG16,
Optimized LeNet, AlexNet, ResNet, and GoogleNet. These
models exhibit remarkable classification accuracy, varying
from 94.9% for VGG16 to 99.6% for Optimized LeNet.
A notable study is detailed in [15], whereby the authors seek
to enhance deep residual networks by refining loss functions
and epochs for traffic sign categorization utilizing the Ara-
bic Traffic Signs dataset. ResNet V1 attains an accuracy of
91.8%, while ResNet V2 reaches a remarkable accuracy of
97.55%. The study in [16] concentrates on the classifica-
tion of traffic signs with the German Traffic Sign Detection
Benchmark (GTSDB) dataset, implementing Convolutional
Neural Networks (CNN) for feature extraction and model
construction, resulting in an impressive accuracy of 93% for
the CNN model. In addition, the paper [5] presents a CNN
model for traffic sign categorization with the GTSDB dataset,
attaining an accuracy of 99.25%. Furthermore, ResNet50
attains an accuracy of 99.5% on the same task.
The work [17] addresses developments in the YOLO (You
Only Look Once) model, specifically tailored for the recog-
nition of small traffic sign images within the domain of
object detection. The work uses the TT100K dataset and
utilizes the YOLOv5 backbone for feature extraction. The
proposed model, named YOLOv5 SPD + Head + CAM,
exhibits remarkable performance metrics, with an accuracy of
95%, precision of 91.6%, and recall of 95.4%. Research [18]
similarly investigates detection tasks utilizing the CCTSDB
dataset, implementing modifications to the YOLOv5 archi-
tecture to augment the identification of minor traffic signs,
even under adverse weather circumstances.
2) ROAD DAMAGE DETECTION
In addition, severe road damage can greatly increase the
likelihood of accidents. This study [19] proposes a precise
pothole detection system for autonomous vehicles leverag-
ing the YOLOv8 deep learning model. To train the model,
a dataset of 665 pothole images sourced from the Kaggle
public library is utilized. In a paper [20], a convolutional
neural network (CNN) was deployed to detect and navi-
gate around obstacles in autonomous vehicles. The model
achieved an accuracy of 88.6% when trained on the dataset
gathered by the authors. A further study develops a system for detecting
speed bumps and potholes from images captured
by a camera (714 images) using a CNN model, which
achieves a precision of 98.13%.
3) ACCIDENT
Road traffic accidents pose a significant global challenge,
leading to numerous fatalities, injuries, and considerable eco-
nomic repercussions annually. In pursuit of accident detection
solutions, [21] aims to develop a video-based system. Each
frame undergoes classification using a CNN model trained
to distinguish between accident and non-accident scenarios,
achieving an impressive accuracy exceeding 95%. In the
same way, [22] describes a framework for classifying traffic
accidents that uses VGG19 and a spatial attention mecha-
nism. It achieves an average accuracy of 93.72%, which is
higher than current methods. Additionally, [7] proposes an
adapted VGG19 model supplemented with additional layers
and parameters for accident image classification, achieving a
commendable accuracy rate of 96%.
On the other hand, developing precise models to predict
accident risks, including the severity of traffic incidents,
is imperative for transportation systems. In [23], the authors
utilize data from simulation experiments to forecast acci-
dent risks on highways, introducing a backpropagation neural
network for this purpose. They achieve values of 82.33%
for true positive rate (TPR), 0.93 for area under the curve
(AUC), and 86.62% for accuracy. Additionally, [24] presents
a BN-RF model designed to predict accident risks using
highway-collected data. The Random Forest (RF) algorithm
is employed to prioritize explanatory variables based on the
Gini index, while a Bayesian network is constructed for crash
prediction, resulting in an area under the ROC curve of 88.35%. Furthermore,
[25] employs the AdaBoost algorithm, while [26] utilizes an
LSTM-CNN to tackle the challenge of predicting vehicle
accident risks. Finally, [6] is a thorough study of car accident
severity classification using a wide
range of machine learning (DT, LR, RF, XGBoost, and Gaussian
Naive Bayes) and deep learning algorithms (LSTMs,
GRUs, Bi-LSTM, and Bi-GRU). Its Random Forest model
achieved the best performance with an accuracy of 90.46%.
4) TRAFFIC RISK ASSESSMENT
Traffic risk assessment, which involves evaluating the poten-
tial risks and hazards present on the road, can be an effective
solution for reducing road risk. In [27], authors propose
a two-stream recurrent convolutional neural network with
dynamic attention for image-based risk assessment. Separate
CNN and LSTM architectures process raw images and flow
information, with late fusion combining their outputs for risk
level prediction. The model achieves 84.89% accuracy using
seven frames. Additionally, the paper [28] conducts traffic
risk assessment using a series of weighted RGB and optical
flow images to recognize driving risk, employing a driver
attention model and LSTM. This model achieves an overall
accuracy of 84.55% using 20 frames.
B. DRIVER BEHAVIOR
Driver behavior refers to the actions and habits of drivers
on the road. It is a multifaceted concept influenced by
many factors, rendering its precise description and analysis
challenging. It includes different types such as speeding, dis-
tracted driving, aggressive driving, drowsy driving, and other
categories described in this systematic literature review [3].
There are many research interests in the classification of
driver behavior using the UAH-DriveSet dataset. In [29],
the authors defined aggressive driving and categorized spe-
cific behaviors into four distinct classes: aggressive driving
on highways, aggressive driving on secondary roads, nor-
mal driving on highways, and normal driving on secondary
roads. They utilized machine learning models such as ran-
dom forest (RF), AdaBoost, ResNet, and a combination of
fully convolutional network (FCN) and LSTM algorithms.
These models achieved impressive F1-scores ranging from
88.29% to 95.88%, indicating their effectiveness in accu-
rately categorizing driver behavior. Similarly, [30] focused
on categorizing driving behavior into three specific classes:
normal, aggressive, or drowsy driving. Employing machine
learning classifiers such as multi-layer perceptron (MLP)
and decision tree (DT) models, they achieved F1-scores of
48% and 80%, respectively. The detection of drowsiness
behavior was further investigated in [30] and [31] using
LSTM algorithms, achieving notable F1-scores of 91% and
99.49%, respectively. Lastly, in [32], the authors employed
the UAH-DriveSet dataset to categorize driver behavior. They
utilized filter, embedding, and wrapper techniques for fea-
ture selection, in conjunction with several machine learning
algorithms, including Logistic Regression, Support Vector
Machine, K-Nearest Neighbors, Naive Bayes, Decision Tree,
Random Forest, and XGBoost. The Random Forest approach
attained a remarkable accuracy of 96.4% and an F1-score of
96.36% using backward feature selection, with a runtime of
7.43 seconds.
Driver behavior in critical areas, such as stop zones, is a
pivotal concern in ensuring road safety. In a study by [33], the
dilemma zone triggered by a yellow indication is scrutinized
and framed as a binary decision problem: to stop or pro-
ceed. Further investigations, as detailed in papers [34], [35],
involve the classification of driver behavior into stop-and-go
actions using simulator data. Various algorithms, includ-
ing AdaBoost, ANN, and SVM, are employed. Notably,
the SVM model exhibits a notable predictive accuracy of
92.9%. Moreover, research by [36] focuses on distinguishing
between proceeding and stopping behaviors using simulators,
while [37] delves into drivers’ decision-making processes
when encountering yellow traffic lights, particularly when
provided with advanced information about signal changes.
Another notable study by [38] categorizes driver behavior at
intersections, differentiating between stopping and proceed-
ing actions. This analysis further categorizes drivers based on
their stopping behaviors before and beyond the intersection
line. Impressively, XGBoost demonstrates superior perfor-
mance with 92.19% accuracy in the initial experiment, while
Random Forest (RF) achieves an outstanding accuracy of
99.38% in the subsequent analysis.
C. SYNTHESES
Although previous research has made significant strides in
the areas of traffic element detection, road damage identi-
fication, accident classification, traffic risk assessment, and
driver behavior classification, it frequently addresses these
challenges in isolation, which restricts their overall impact.
Several existing systems concentrate exclusively on specific
tasks, such as the recognition of traffic signs or the detection
of vehicles, without incorporating these components into a
comprehensive framework. This results in deficiencies in the
real-time monitoring and decision-making of traffic. Further-
more, the significance of robust preprocessing stages, which
are essential for optimizing model performance, is frequently
disregarded in previous research.
By providing a fully integrated system that employs
advanced machine learning and deep learning algorithms to
detect multiple traffic elements, assess road damage, conduct
traffic risk assessments, monitor driver behavior, and predict
accidents in real time, SafeSmartDrive addresses these defi-
ciencies. The implementation of state-of-the-art algorithms
and transfer learning models, including YOLOv9, is a critical
aspect of our system. Additionally, we employ exhaustive
preprocessing techniques to enhance the quality of the data
and improve the system’s overall accuracy.
SafeSmartDrive addresses these deficiencies by offering
a comprehensive and scalable approach to intelligent trans-
portation monitoring, which substantially improves traffic
efficiency and road safety by means of real-time, comprehen-
sive analysis and feedback.
In the following section, we will introduce our advanced
intelligent transport monitoring system (SafeSmartDrive).
This innovative solution is designed to enhance road safety
and promote safe driving behavior among all road users.
IV. A NOVEL INTELLIGENT TRANSPORTATION
MONITORING SYSTEM (SAFESMARTDRIVE)
In this section, we propose a new intelligent transport mon-
itoring system named SafeSmartDrive (see Fig. 1). The
SafeSmartDrive system is currently a conceptual model
that provides a detailed framework for how an integrated,
multi-layered traffic monitoring and driver behavior analysis
system would function. This conceptual model focuses on
the design and interaction of four key layers: perception,
filtering and preparation, detection and classification, and
alert. While each layer is designed to work in harmony to
deliver real-time monitoring and improve road safety, the
system is still in the design phase, with some aspects having
been evaluated through preliminary experiments, particularly
the detection and classification layer. The purpose of this
conceptual model is to illustrate how these components can
be integrated into a cohesive system, providing a blueprint
for future development, testing, and implementation. While
the current research lays the foundation for understanding the
potential of SafeSmartDrive, further empirical evaluations are
planned to validate the complete system and assess the unique
contributions of each layer in real-world driving conditions.

FIGURE 1. The flowchart of the proposed SafeSmartDrive monitoring system.
A. LAYERS DESCRIPTION
1) PERCEPTION LAYER
The input layer in the SafeSmartDrive system plays a crit-
ical role as the initial point of data acquisition. This layer
integrates various sensors, such as the front camera, GPS,
accelerometer, gyroscope, and orientation sensors, into the
vehicle to gather environmental and behavioral data. The
front camera captures real-time video data of the road
and surrounding elements, such as vehicles, pedestrians,
road signs, and infrastructure. Simultaneously, other sensors
gather detailed information about the vehicle’s motion and
the driver’s behavior, such as speed, acceleration, orientation,
and location.
This diverse range of data is essential for the subse-
quent layers of the system. The input layer’s ability to
simultaneously collect data from multiple sources ensures a
comprehensive understanding of both the driving environ-
ment and driver behavior. The collected raw data from this
input layer is then passed on to the filtering and preparation
layer for synchronization, cleaning, and preprocessing before
being fed into the models for detection and classification.
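As an illustration of the kind of record this layer produces, the sketch below groups one time-stamped reading from each sensor into a single structure; the field names and units are our assumptions, not the system's actual schema.

```python
# Illustrative perception-layer sample; field names/units are assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class PerceptionSample:
    timestamp: float      # shared capture time in seconds
    frame: np.ndarray     # front-camera image, shape (H, W, 3)
    latitude: float       # GPS position
    longitude: float
    speed_kmh: float      # vehicle speed from GPS
    accel: tuple          # accelerometer (ax, ay, az) in m/s^2
    gyro: tuple           # gyroscope angular rates (roll, pitch, yaw)
```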
2) FILTERING AND PREPARATION LAYER
The filtering and preparation layer plays a crucial role in
ensuring that raw data collected from various sources is
properly organized and processed before it is passed to the
detection and classification layers, which utilize machine
learning and deep learning models. This layer is divided into
two segments: one handling data from the vehicle’s front
camera and another managing outputs from multiple sensors.
For the front camera, we extract images, for which image
preprocessing is essential for accurate environmental detection.
This includes steps such as resizing the images to the required
dimensions and applying data normalization techniques. In the
second segment, data
from sensors such as GPS, accelerometers, gyroscopes, and
cameras are synchronized and temporally aligned to ensure
that all data points are coherent and correspond to the same
moment in time. This synchronization is crucial for effective
real-time analysis of driver behavior. Fig. 2 outlines the
key features extracted from these sensors.

FIGURE 2. Sensors and extracted features.
Following synchronization, the data undergoes several
preprocessing steps, including removing duplicate entries,
handling missing values, eliminating redundant information,
transforming the data into a usable format, and normalizing it
to maintain consistency across all inputs. Once preprocessing
is complete, the prepared data is forwarded to the detection
models for traffic element analysis and the driver behavior
classification models for real-time monitoring and decision-
making.
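The two preparation paths can be sketched as follows, assuming OpenCV for the image stream and pandas for nearest-timestamp alignment of the numeric sensor streams; the 624-pixel input size and z-score normalization are assumptions for illustration.

```python
# Sketch of the filtering and preparation layer (assumed parameters).
import cv2
import numpy as np
import pandas as pd

def prepare_frame(frame: np.ndarray, size: int = 624) -> np.ndarray:
    """Resize a camera frame to the model input size and scale to [0, 1]."""
    resized = cv2.resize(frame, (size, size))
    return resized.astype(np.float32) / 255.0

def synchronize(gps: pd.DataFrame, imu: pd.DataFrame) -> pd.DataFrame:
    """Align GPS and IMU rows on the nearest timestamp, then clean/normalize."""
    merged = pd.merge_asof(gps.sort_values("timestamp"),
                           imu.sort_values("timestamp"),
                           on="timestamp", direction="nearest")
    merged = merged.drop_duplicates().interpolate()        # duplicates, gaps
    feats = merged.drop(columns=["timestamp"])
    merged[feats.columns] = (feats - feats.mean()) / feats.std()  # z-score
    return merged
```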
3) DETECTION AND CLASSIFICATION LAYER
The detection and classification layer is central to the driv-
ing assistance capabilities of the SafeSmartDrive system,
focusing on two primary objectives: analyzing the driv-
ing environment and classifying driver behavior. These two
elements work together to provide real-time feedback and
warnings to drivers, ultimately enhancing road safety and
reducing the risk of accidents.
The first objective is the detection of environmental com-
ponents, where the system identifies and analyzes all relevant
road elements such as vehicles, traffic signs, road conditions,
obstacles, and accidents, and assesses the risks associated with
traffic. The second objective is the classification of driver
behavior, where the system monitors how the driver interacts
with the environment across different road types, specifically
on secondary roads, highways, and at intersections.
a: TRAFFIC ELEMENTS DETECTION
The first component of our system’s environmental analysis
is traffic element detection, which is categorized into four
main types: vehicle detection, road sign and light detec-
tion, pedestrian and animal detection, and damage detection,
as illustrated in Fig. 1. The system employs advanced detec-
tion algorithms to accurately identify key elements in each of
these categories.
Fig. 3 presents the workflow for traffic element detec-
tion using the vehicle’s front camera. The process begins
with image capture, followed by the application of the data
filtering and preparation layer. This layer plays a crucial
role in preprocessing the captured data, involving essential
steps such as normalization, resizing, and other preparatory
techniques to ensure the data is optimally structured for the
subsequent detection and classification layer. At this stage,
deep learning models based on transfer learning, including
various CNN architectures, are employed to accurately detect
and classify the traffic elements in real time.

FIGURE 3. The flowchart of traffic elements detection.
For vehicle detection, the system can identify five specific
types of vehicles present in the environment: cars, buses,
motorcycles, bicycles, and trucks. In addition to vehicle
detection, the system also detects traffic signs and lights,
enabling it to interpret real-time road regulations and pro-
vide context-aware feedback to the driver. Beyond vehicle
and road sign detection, the system is capable of detecting
pedestrians and animals, ensuring non-vehicular road users
are accounted for in traffic analysis. This capability helps
avoid potential collisions, contributing to overall road safety.
Finally, the system incorporates a damage detection compo-
nent, which monitors for critical road infrastructure issues
such as potholes, speed bumps, cracks, and manhole covers.
Upon detecting these hazards, the system provides real-time
alerts to the driver, allowing for timely corrective actions and
improving safety on damaged roads.
In this paper (Section V), we specifically address the
detection of vehicles, traffic signs, and traffic lights using
YOLOv9. Further details regarding pedestrian and animal
detection, as well as road damage detection, will be explored
in subsequent papers.
b: ACCIDENT CLASSIFICATION
The second major component of our system for detecting the
driving environment is accident classification, as illustrated
in Fig. 4. This process involves both the detection and clas-
sification of road accidents using advanced computer vision
algorithms. By accurately identifying accidents in real time,
the system helps reduce traffic flow disruptions, minimize
congestion, and lower the likelihood of subsequent accidents.
FIGURE 4. The flowchart of accident classification.

Fig. 4 provides a detailed workflow of the acci-
dent classification process. The workflow begins with image
data from the vehicle’s front camera, which undergoes data
preparation and preprocessing, including resizing and nor-
malization. The preprocessed data is then passed through
detection and classification models, such as computer vision
models like VGG19 and CNN. Finally, the data is categorized
as either an accident or non-accident.
In our system, we can utilize several computer vision
algorithms that can effectively classify road incidents. Specif-
ically, we proposed an enhanced VGG19 model for accident
detection and classification in [7], which has demonstrated
remarkable performance when applied to real-world data.
This model processes images sourced from platforms like
YouTube to identify and categorize accident scenes. Our
enhanced model achieved an impressive accuracy rate of
96%, with an F1-score of 96.15% and an AUC (Area Under
the Curve) value of 0.99. These results exceed those of our
comparative models: the baseline VGG19 approach,
VGG19 with attention mechanisms, and the CNN model.
These metrics underscore the model’s reliability and accu-
racy in distinguishing between accident and non-accident
scenarios.
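For orientation, the snippet below sketches a plain VGG19 transfer-learning binary classifier in Keras for the accident/non-accident task; it is a baseline illustration only, not the enhanced architecture of [7], whose additional layers and parameters are described in that paper.

```python
# Baseline VGG19 transfer-learning sketch (Keras); not the enhanced
# model of [7]; head layer sizes here are illustrative assumptions.
from tensorflow.keras.applications import VGG19
from tensorflow.keras import layers, models

base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                        # keep pretrained features frozen

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),    # accident vs. non-accident
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```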
c: TRAFFIC RISK ASSESSMENT
The third major component of our system’s environmental
analysis is traffic risk assessment, as illustrated in Fig. 5.
This component enables the classification of the driving
environment into four risk categories: critical, high, low,
and moderate, based on visual data captured from the front
camera of the vehicle.
FIGURE 5. The flowchart of traffic risk assessment.

Fig. 5 provides a detailed workflow for the traffic
risk assessment process. After the front camera captures the
images, the data preparation begins by segmenting each video
into 20 image frames, followed by resizing and normalizing
the images. Finally, each segment is evaluated using
transfer learning algorithms or the proposed CNN model to
classify the associated risk level.
In our paper titled “Enhanced CNN-Based Model for
Traffic Risk Assessment,” which will be presented at The
9th SMART CITY APPLICATIONS International Confer-
ence [39], we propose an efficient Convolutional Neural
Network (CNN) model tailored specifically for traffic risk
classification. The model analyzes dashcam-recorded data
and classifies it into the aforementioned risk categories. Our
CNN model has demonstrated exceptional performance, out-
performing all previously proposed models in the literature
for this specific task. The model achieved an impressive
accuracy of 99%, a loss function value of 0.03, and a ROC
AUC score of 99.25%.
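The video-to-frames preparation step described above can be sketched as follows; the evenly spaced sampling strategy and the 224-pixel frame size are assumptions for illustration.

```python
# Segment a dashcam clip into 20 resized, normalized frames (OpenCV).
import cv2
import numpy as np

def video_to_segment(path: str, n_frames: int = 20, size: int = 224) -> np.ndarray:
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, total - 1, n_frames).astype(int)  # even spacing
    frames = []
    for i in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(i))
        ok, frame = cap.read()
        if ok:
            frame = cv2.resize(frame, (size, size)).astype(np.float32) / 255.0
            frames.append(frame)
    cap.release()
    return np.stack(frames)      # shape: (n_frames, size, size, 3)
```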
d: DRIVER BEHAVIOR: AT SECONDARY ROAD AND
HIGHWAY
The first key aspect we address in the driver behavior component
is the detection of aggressive driving behavior, as illustrated
in Fig. 1. The goal of this segment is to continuously mon-
itor and assess driver behavior in real-time during transit,
enabling the system to issue alerts when unsafe or abnormal
driving patterns are detected.
FIGURE 6. The flowchart of driver behavior at secondary road and highway.

As shown in Fig. 6, our system focuses on three pri-
mary categories of driver behavior: aggressive, drowsy, and
normal driving. Data from multiple sensors such as GPS,
accelerometers, gyroscopes, and cameras is synchronized,
normalized, and processed before classification, followed by
the application of machine learning algorithms to classify
driver behavior into these categories.
In our research, detailed in [32], we classified these
behaviors using the UAH-DriveSet dataset. To enhance clas-
sification accuracy, we employed feature selection methods,
including filter, embedded, and wrapper techniques, with
a total of 10 distinct feature selection approaches. These
techniques were applied alongside various machine learning
algorithms, including Logistic Regression (LR), Support
Vector Machines (SVM), K-Nearest Neighbors (K-NN),
Decision Trees (DT), Naive Bayes (NB), and ensemble meth-
ods such as Random Forest (RF), XGBoost, and AdaBoost.
Among these, the RF algorithm demonstrated the best per-
formance, achieving a remarkable accuracy of 96.4% and
an F1-score of 96.36%. The feature selection was optimized
using backward feature selection, achieving these results in a
computation time of 7.43 seconds.
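A compact sketch of this setup, Random Forest with backward (sequential) feature selection in scikit-learn, is shown below; synthetic data stands in for the UAH-DriveSet features, so the shapes and parameters are illustrative.

```python
# Random Forest with backward feature selection (scikit-learn sketch);
# make_classification stands in for UAH-DriveSet sensor features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           n_classes=3, random_state=42)  # 3 behavior classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
selector = SequentialFeatureSelector(rf, direction="backward",
                                     n_features_to_select=10, scoring="f1_macro")
selector.fit(X_tr, y_tr)                      # drop least useful features

rf.fit(selector.transform(X_tr), y_tr)
print("test accuracy:", rf.score(selector.transform(X_te), y_te))
```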
e: DRIVER BEHAVIOR: AT INTERSECTION
The second aspect of driver behavior detection focuses on
stopping behavior at intersections, specifically during the
yellow-light phase. In this scenario, we categorize driver
actions into two primary responses: stopping at the intersec-
tion or proceeding (going) through it. For those who choose
to stop, we further classify their behavior into two subgroups:
drivers who halt beyond the designated stop line (considered
unsafe or dangerous) and those who stop before reaching the
stop line (considered safe), as depicted in Fig. 7.
FIGURE 7. The flowchart of driver behavior at intersection.

Fig. 7 outlines a structured process for collecting and
processing sensor data to detect stopping behavior at inter-
sections. Multiple machine learning algorithms can be applied
to accurately classify driver actions as either stop or go, fol-
lowed by further classification of stopping actions as safe or
unsafe. This workflow ensures precise and timely detection,
enhancing road safety.
In our study [6], we applied various machine learning algo-
rithms (DT, LR, Adaboost, XGBoost, SVM, NB, RF) to clas-
sify these behaviors at intersections. Among the algorithms
tested, XGBoost emerged as the most effective for catego-
rizing driver behavior in stop-and-go situations, achieving an
accuracy of 92.19% and a precision of 94.38%. Additionally,
the Random Forest (RF) algorithm demonstrated superior
performance in distinguishing between safe and unsafe stop-
ping behaviors, achieving an impressive accuracy of 99.38%.
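The two-stage decision described here can be sketched as a pair of classifiers: one for stop versus go (XGBoost) and one, trained only on stopping cases, for safe versus unsafe stops (Random Forest); the synthetic features below are placeholders for the simulator data.

```python
# Two-stage intersection-behavior sketch with synthetic placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))          # stands in for approach features
y_stop_go = rng.integers(0, 2, 500)     # 1 = stop, 0 = go
y_safe = rng.integers(0, 2, 500)        # 1 = stopped before the line (safe)

stage1 = XGBClassifier(n_estimators=200, max_depth=4).fit(X, y_stop_go)

stopped = y_stop_go == 1                # stage 2 sees stopping cases only
stage2 = RandomForestClassifier(n_estimators=100).fit(X[stopped], y_safe[stopped])

pred_stop = stage1.predict(X)
pred_safe = stage2.predict(X[pred_stop == 1])   # safe/unsafe for predicted stops
```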
4) ALERT LAYER
The output layer of the SafeSmartDrive system is where the
processed and classified data from the previous layers culmi-
nates to produce actionable insights. After the detection and
classification layer has analyzed the driving environment and
driver behavior, the output layer generates real-time results in
the form of warnings and alerts for the driver. These outputs
are designed to enhance road safety and assist drivers in
making informed decisions.
For example, the system might issue an alert if it detects
critical risks such as an imminent collision, unsafe driver
behavior (e.g., drowsiness or aggressive driving), or haz-
ardous road conditions (e.g., potholes or cracks). Similarly,
the system can notify the driver of safe or unsafe stopping
behaviors at intersections. These alerts are tailored to provide
instant feedback, helping the driver respond appropriately to
the current driving environment and reducing the likelihood
of accidents.
The output layer also ensures that the system operates in
real-time, providing drivers with immediate feedback and
recommendations based on the current conditions. In this
way, the SafeSmartDrive system not only monitors and ana-
lyzes the driving situation but also offers practical solutions
to mitigate risks and promote safe driving practices.
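A minimal sketch of this mapping from classified outputs to driver-facing alerts is given below; the thresholds, labels, and message wording are all illustrative assumptions.

```python
# Illustrative alert-layer mapping; labels and wording are assumptions.
RISK_MESSAGES = {
    "critical": "Immediate hazard ahead: slow down now",
    "high": "High traffic risk: increase following distance",
}
DAMAGE_LABELS = {"pothole", "crack", "speed bump", "manhole cover"}

def build_alerts(detections, behavior, risk_level):
    alerts = []
    if risk_level in RISK_MESSAGES:
        alerts.append(RISK_MESSAGES[risk_level])
    if behavior in ("aggressive", "drowsy"):
        alerts.append(f"Unsafe driving detected: {behavior}")
    if any(d["label"] in DAMAGE_LABELS for d in detections):
        alerts.append("Road damage ahead: adjust speed")
    return alerts

print(build_alerts([{"label": "pothole"}], "drowsy", "high"))
```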
B. LAYERS CONNECTION
The connection between the layers in the SafeSmartDrive
system is essential for ensuring the seamless flow of data
and efficient monitoring of both the driving environment and
driver behavior. The system begins with the perception layer,
where various sensors such as cameras, accelerometers, gyro-
scopes, and GPS collect data from the vehicle’s surroundings
and the driver’s actions. Data collected from the front camera
focuses on the external environment (e.g., traffic elements),
while data from other sensors captures internal factors such
as driver behavior.
This raw data is then passed to the filtering and preparation
layer, where it undergoes preprocessing. For image data from
the front camera, preprocessing includes resizing and aug-
menting the images to make them suitable for deep learning
and transfer learning models used for environment detection.
Sensor data, on the other hand, is synchronized and aligned to
ensure temporal coherence, allowing for accurate assessment
of driver behavior in real time. Redundant and irrelevant data
are eliminated, and the remaining data is transformed and
normalized for further analysis.
Once the data is prepared, it moves into the detection and
classification layer, where deep learning and transfer learning
models (such as YOLO and CNN for environment detection)
and machine learning models (like RF and XGBoost for driver
behavior classification) process the data to identify objects,
classify behavior, and assess risks. The outputs from this
layer are real-time classifications of driving conditions and
behaviors.
Finally, the alert layer uses these outputs to generate action-
able alerts for the driver, providing warnings about potential
dangers in the environment or unsafe driving practices. This
ensures the system is not only reactive but also proactive in
maintaining road safety.
V. TRANSFER LEARNING FOR DETECTION OF ROAD
VEHICLES, TRAFFIC SIGNS, AND LIGHTS
In this section, we present our methodology for traffic ele-
ment detection, specifically designed for robust identification
of vehicles, road signs, and traffic lights under challenging
conditions. We first provide a high-level overview of the
methodology, followed by a detailed explanation of each
component: dataset, preprocessing, transfer learning model,
performance metrics, and software and hardware used. This
comprehensive approach allows readers to understand the
overall framework and the specific functionalities of each
element.
A. METHODOLOGY
Our research methodology involves an experimental
approach, as illustrated in Fig. 8. We start by using the MS
COCO dataset to extract targeted objects pertinent to vehicle
detection and traffic sign and light detection. We then employ a
transfer learning approach with YOLOv9 for object detection
and evaluate its performance, comparing it against YOLOv7
and YOLOv8.

FIGURE 8. Research methodology.
1) MS COCO DATASET
We utilize the MS COCO dataset [40] for our traffic ele-
ment detection tasks. This dataset contains a wide range of
objects, and we focus on vehicle detection and traffic scene
elements like traffic signs and lights. The MS COCO dataset’s
diversity provides a strong foundation for evaluating the
effectiveness of object detection models in real-world traffic
environments.
2) DATASET PREPARATION AND PREPROCESSING
Our analysis centers on seven key objects within this dataset:
bus, car, motorcycle, truck, bicycle, traffic light, and stop
sign. By extracting these specific objects from the origi-
nal dataset, we have compiled a total of 22,677 images.
To ensure uniformity and meet the input criteria of our
models, each image was resized to 624 × 624 pixels. This
preprocessing step is essential to preserving consistency
throughout the dataset, which improves our detection algo-
rithms’ effectiveness and precision. Subsequently, the dataset
was segmented into training (used to train the model), testing
(used to evaluate the model’s performance on unseen data),
and validation sets (used to fine-tune hyperparameters during
training). As detailed in Table 1, the training set comprises
15,872 images featuring a total of 66,372 objects. The test-
ing set includes 4,537 images with an aggregate of 19,010
objects. The validation set contains 2,268 images encompass-
ing 9,263 objects. This split allowed us to train the model,
validate hyperparameters, and test its performance on unseen
data.

TABLE 1. Dataset size.
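The class-extraction step can be sketched with pycocotools as below; the annotation file path is illustrative, and the seven category names follow the MS COCO label set (which has "stop sign" and "traffic light" but no generic traffic-sign category).

```python
# Select MS COCO images containing any of the seven target categories
# (pycocotools sketch; the annotation path is illustrative).
from pycocotools.coco import COCO

TARGETS = ["bus", "car", "motorcycle", "truck", "bicycle",
           "traffic light", "stop sign"]

coco = COCO("annotations/instances_train2017.json")
cat_ids = coco.getCatIds(catNms=TARGETS)

img_ids = set()
for cid in cat_ids:                 # union of images with any target object
    img_ids.update(coco.getImgIds(catIds=[cid]))

print(f"{len(img_ids)} images contain at least one target object")
```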
3) TRANSFER LEARNING ALGORITHMS
We used three transfer learning algorithms from the YOLO
family for traffic element detection: YOLOv7, YOLOv8,
and YOLOv9. While YOLOv9 offers incremental improve-
ments over earlier versions, we chose it based on its specific
advancements in detection accuracy, speed, and its abil-
ity to handle smaller objects more effectively. YOLOv9
also integrates optimized architecture and better anchor-
free detection, which helps in improving generalization for
real-time applications. Table 2 presents the hyperparameters
for each model, including learning rates, batch sizes, and
confidence thresholds, along with a justification for their
selection. All models were trained with a batch size of 16 and
an initial learning rate of 0.01. YOLOv9 was trained for
30 epochs, while YOLOv8 and YOLOv7 were trained for
40 epochs. YOLOv9 is the most complex, with 467 layers
and 102.5 GFLOPs, whereas YOLOv8 has the fewest layers
but a similar parameter count. YOLOv7 uses 106 layers and
a higher final learning rate (lrf) of 0.1.

TABLE 2. Models hyperparameters.
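As a sketch of how such a run can be configured, the snippet below uses the ultralytics training interface with the reported settings (batch size 16, initial learning rate 0.01, 30 epochs for YOLOv9); the package, checkpoint name, and dataset YAML are assumptions, and note that YOLO implementations typically round the input size to a multiple of the model stride.

```python
# YOLOv9 training sketch with the reported hyperparameters (ultralytics
# assumed; dataset YAML path is illustrative).
from ultralytics import YOLO

model = YOLO("yolov9c.pt")            # pretrained YOLOv9 checkpoint
model.train(
    data="traffic_elements.yaml",     # 7-class MS COCO subset
    epochs=30,
    batch=16,
    imgsz=624,                        # may be rounded to a stride multiple
    lr0=0.01,                         # initial learning rate
)
metrics = model.val()                 # precision, recall, mAP@0.5, mAP@0.5:0.95
```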
4) PERFORMANCE MEASURE
To evaluate the performance of the models, we used several
common object detection metrics:
Precision: Measures the accuracy of the model’s posi-
tive detections (how many true positives were identified
compared to all predicted positives).
Recall: Measures how well the model finds all relevant
objects (how many true positives were identified com-
pared to all actual positives in the image).
mAP@0.5: A common metric that considers both preci-
sion and recall, averaging the precision across different
object categories for a specific overlap threshold (IoU)
of 0.5 between predicted and ground truth bounding
boxes.
mAP@0.5:0.95: A more comprehensive metric calculat-
ing the average precision across multiple IoU thresholds,
ranging from low (50% overlap) to high (95% over-
lap). This provides a broader picture of the model’s
performance under varying degrees of overlap between
detections and actual objects.
These evaluation criteria help assess the overall perfor-
mance of YOLO models in detecting objects accurately and
reliably in images or videos. By analyzing precision, recall,
mAP@0.5, and mAP@0.5:0.95, researchers and practitioners
can gain insights into the strengths and weaknesses of the
models and make informed decisions about their suitability
for specific applications.
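For reference, the core quantities behind these metrics can be computed directly, as in the short sketch below (the counts used are illustrative).

```python
# IoU between two boxes, plus precision/recall from TP/FP/FN counts.
def iou(a, b):
    """Boxes as (x1, y1, x2, y2); returns intersection-over-union."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection is a true positive only when IoU >= the threshold (0.5 here).
print(f"IoU = {iou((10, 10, 60, 60), (20, 20, 70, 70)):.2f}")  # ~0.47 -> miss

tp, fp, fn = 75, 15, 27
print(f"precision = {tp / (tp + fp):.3f}, recall = {tp / (tp + fn):.3f}")
```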
5) SOFTWARE AND HARDWARE
Our implementation leveraged the Python programming lan-
guage for code development. To accelerate the training
process for our complex deep learning models, we utilized the
computational power of NVIDIA Tesla P100 GPUs available
on Kaggle Notebooks (or Kernels). These high-performance
GPUs boast 16 GB of high-bandwidth memory (HBM2),
significantly improving the efficiency of training our models
by reducing processing time and enabling large-scale data
handling.
B. EXPERIMENT RESULTS
To evaluate the effectiveness of our YOLOv9-based approach
for traffic element detection, we used a Kaggle notebook
equipped with a P100 GPU to expedite the training process.
The detailed results, including confusion matrices (presented
in Fig. 9 and Fig. 10) and object classification results (shown
in Table 3), illustrate the performance of YOLOv9 in com-
parison to YOLOv7 and YOLOv8.

FIGURE 9. Confusion matrix of YOLOv9-based approach.

FIGURE 10. Confusion matrix of YOLOv7 and YOLOv8.

TABLE 3. Objects classification results.
As presented in Table 3, the performance comparison
across various object categories shows that YOLOv9 con-
sistently outperforms its predecessors. YOLOv9 achieved an
accuracy of 88% for buses, compared to 84% for YOLOv7
and 77% for YOLOv8. For cars, both YOLOv9 and YOLOv7
achieved an accuracy of 75%, while YOLOv8 achieved 69%.
YOLOv9 achieved 80% accuracy for motorcycles, com-
pared to 78% for YOLOv7 and 66% for YOLOv8. For
trucks, YOLOv9 achieved 66% accuracy, YOLOv7 achieved
57%, and YOLOv8 achieved 53%. In the case of bicy-
cles, YOLOv9 achieved 71%, YOLOv7 achieved 70%, and
YOLOv8 achieved 62%. For traffic lights, YOLOv7 achieved
69%, YOLOv9 achieved 66%, and YOLOv8 achieved 59%.
Lastly, for stop signs, YOLOv7 achieved 87%, YOLOv9
achieved 86%, and YOLOv8 achieved 76%. These results
demonstrate that our YOLOv9-based proposition consis-
tently outperformed YOLOv7 and YOLOv8 in most cate-
gories, highlighting its superior accuracy in object detection
tasks.
In addition to object classification, we assessed the models
based on common evaluation metrics like precision, recall,
and mAP (mean Average Precision) at IoU (Intersection over
Union) thresholds of 0.5 and 0.5:0.95. As shown in Table 4
and Fig. 11, YOLOv9 consistently outperformed YOLOv7
and YOLOv8 across all metrics.
In terms of precision, which measures the accuracy of the
positive predictions made by the model, YOLOv9 achieves
the highest value of 0.831, followed by YOLOv7 with
0.804 and YOLOv8 with 0.771. This suggests that YOLOv9
exhibits a higher proportion of correctly identified objects
among all the objects predicted by the model compared to the
other two versions. Similarly, in Recall, which quantifies the
ability of the model to correctly identify all relevant instances
of objects, YOLOv9 achieves the highest score of 0.735,
followed by YOLOv7 with 0.681 and YOLOv8 with 0.639.
This implies that YOLOv9 is more effective at capturing the
entirety of objects present in the images compared to the other
models. Additionally, the mean Average Precision (mAP) at
an IoU threshold of 0.5, which provides an overall measure
of the model’s accuracy across different object categories,
further reinforces the superior performance of YOLOv9.
With an mAP@0.5 score of 0.821, YOLOv9 surpasses both
YOLOv7 (0.734) and YOLOv8 (0.718). For mAP at IoU
= 0.5:0.95, YOLOv9 achieved the highest value of 0.627,
compared with 0.528 for YOLOv7 and 0.518 for YOLOv8. These results indi-
cate that YOLOv9 consistently provides higher precision and
accuracy across different IoU thresholds, demonstrating its
superiority in object detection tasks.
Overall, the results indicate that the YOLOv9-based propo-
sition demonstrates notable advancements over its predeces-
sors, YOLOv7 and YOLOv8, in terms of precision, recall,
and mAP, making it the preferred choice for object detection
tasks. These findings suggest that YOLOv9 offers improved
accuracy and reliability, potentially leading to more effective
and robust applications in real-world scenarios.
TABLE 4. Models results.
C. COMPARATIVE ANALYSIS WITH EXISTING MODELS
To further validate the model’s performance, we compare our
results with existing model results in the literature. Table 5
presents a comparison based on mAP at IoU =0.5, the
number of detectable classes, and dataset size. The model
in [11] achieved an mAP of 0.603 with seven classes on a
dataset of 16,626 images. While the model in [12] reached a
higher mAP of 0.9, it was trained on only four classes with
a smaller dataset of 1,200 images, limiting its generalizabil-
ity. The model in [13] had a lower mAP of 0.515 with ten classes on a much larger dataset of 80,000 images. In contrast, our YOLOv9 model achieved an mAP of
0.821 with seven classes on a dataset of 22,677 images. This
demonstrates our model’s effectiveness in balancing dataset
size, number of classes, and detection performance. While
the model in [12] shows a higher mAP, its smaller dataset and fewer classes restrict its comparability. Overall, our YOLOv9
model is effective for vehicle detection and traffic sign/light
identification tasks, showcasing its competitiveness.
TABLE 5. A comparison between the proposed method and existing
research.
VI. USE CASE: REAL-TIME URBAN AND ROAD
CONDITION DETECTION IN CASABLANCA
Before presenting the use case example of real-time envi-
ronment detection, it is important to clarify that while the
SafeSmartDrive system is fully conceptualized to include
both environment detection and driver behavior analysis, the
current evaluation focuses solely on the environment detec-
tion capabilities. The driver behavior monitoring component
remains to be evaluated in future studies, which may involve
collaborations with research partners or additional hardware
installations for a thorough assessment. These future evalu-
ations will help validate the system’s ability to monitor and
classify driver behaviors in real-time, providing a more com-
prehensive safety solution alongside the existing environment
detection features.
The SafeSmartDrive system was evaluated in the urban
and suburban streets of Casablanca, where complex traffic
patterns, multiple road hazards, and diverse conditions make
for an ideal testing environment (see Fig. 12).
The process from data capture to alert delivery involves
multiple layers. In the Perception Layer, cameras collect
images of surrounding traffic elements and road conditions,
forming the input for subsequent layers. This data then
enters the Filtering and Preparation Layer, where it undergoes
resizing, normalization, and other refinement steps. This pre-
processing ensures that images are optimized for detection
and classification, enhancing the quality and consistency of
data that reaches the machine learning models. Once opti-
mized, the data is passed to the Detection and Classification
Layer, where machine learning models accurately identify
and classify objects, such as vehicles, traffic signs, and
road surface hazards. This detection is performed efficiently,
as demonstrated in scenarios where the system identifies
multiple objects and surface conditions with rapid inference
times, maintaining performance even in complex environ-
ments. Finally, in the Alert Layer, the system generates timely
notifications based on the detected elements, allowing the
driver to respond to road conditions and traffic patterns safely.
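As a rough illustration of this layered flow, the following sketch chains capture, preprocessing, detection, and a simple alert rule using the open-source Ultralytics API; the weights file, the hazard list, and the alert logic are assumptions made for illustration, not the exact SafeSmartDrive implementation.

```python
import cv2
from ultralytics import YOLO  # assumes an Ultralytics release with YOLOv9 support

model = YOLO("yolov9c.pt")  # placeholder weights; ours was fine-tuned on 7 classes
HAZARDS = {"stop sign", "traffic light"}  # illustrative alert triggers

def process_frame(frame):
    # Filtering and Preparation Layer: resize to the detector's input size
    # (Ultralytics also letterboxes and normalizes internally).
    frame = cv2.resize(frame, (640, 640))
    # Detection and Classification Layer: run the detector on the frame.
    result = model(frame, verbose=False)[0]
    labels = [result.names[int(c)] for c in result.boxes.cls]
    # Alert Layer: a naive rule; a production system would debounce,
    # rank by severity, and render on the dashboard.
    alerts = [label for label in labels if label in HAZARDS]
    return labels, alerts

# Perception Layer: grab one frame from an onboard camera.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    labels, alerts = process_frame(frame)
    print("detected:", labels, "| alerts:", alerts)
cap.release()
```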
As the driver navigates through busy intersections and
crowded areas, the system efficiently detects and classifies
various traffic elements and road conditions in real-time.
In high-traffic areas, the system identifies multiple objects,
such as four cars, one truck, and two stop signs in just 18.2 ms,
allowing the driver to approach intersections safely. In a
more crowded scene with 16 cars, three motorcycles, two
trucks, and two traffic lights, the system processes this data
within 17.6 ms, maintaining real-time responsiveness even
under heavy traffic conditions. For more intricate situations
involving five cars, one motorcycle, and two traffic lights,
the system processes the image in 60.7 ms, demonstrating its
adaptability across varying traffic densities.
As the driver transitions to Casablanca’s suburban areas,
the SafeSmartDrive system seamlessly switches focus to
detect potential road surface hazards. It identifies an uneven
manhole cover in 14.7 ms, alerting the driver to avoid it.
FIGURE 11. Model results.
Similarly, a speed bump is detected in 14.8 ms, prompting the
driver to slow down. In another scenario, the system detects
two road cracks along with a nearby car in 71.8 ms, ensuring
driver awareness for navigating uneven surfaces. Addition-
ally, the system spots four potholes in 14.8 ms, allowing the
driver to reduce speed or steer clear to minimize impact.
Overall, the system’s evaluation highlights its efficient
processing capabilities, with an average pre-processing time
of 3.6 ms, inference time of 50.7 ms, post-processing time
of 1.1 ms, and 24.2 ms for Non-Maximum Suppression
(NMS) per image at an input tensor shape of (1, 3, 640, 640). This
real-time performance enables the SafeSmartDrive system
to provide timely alerts and accurate information on both
traffic conditions and road hazards, thereby enhancing driver
safety and decision-making throughout the varied and chal-
lenging streets of Casablanca. The integrated layer-by-layer
approach in SafeSmartDrive provides comprehensive situ-
ational awareness, with future enhancements planned for
driver behavior analysis to create an even more holistic safety
solution.
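These stage timings can be reproduced in outline with simple wall-clock measurements, as in the hedged sketch below; the weights and image paths are placeholders, and recent Ultralytics releases also expose per-stage latencies directly through the result's `speed` attribute, with NMS time folded into post-processing.

```python
import time
import cv2
from ultralytics import YOLO

model = YOLO("yolov9c.pt")                 # placeholder weights file
img = cv2.imread("casablanca_street.jpg")  # placeholder image path

t0 = time.perf_counter()
result = model(img, imgsz=640, verbose=False)[0]  # (1, 3, 640, 640) input tensor
t1 = time.perf_counter()

print(f"end-to-end: {(t1 - t0) * 1e3:.1f} ms")
# Per-stage breakdown in milliseconds, e.g.
# {'preprocess': ..., 'inference': ..., 'postprocess': ...}
print(result.speed)
```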
VII. DISCUSSION
Although current Intelligent Transportation Systems (ITS)
technologies have made substantial strides in improving road
safety and traffic management, they continue to face sig-
nificant challenges in real-time detection and interpretation
of complex transportation environments and diverse driver
behaviors. Many existing systems remain limited in scope,
focusing narrowly on either vehicle detection or driver behav-
ior analysis, thus failing to provide a comprehensive solution
for real-world traffic scenarios. The development of SafeS-
martDrive addresses this gap by offering a multi-layered,
integrated approach that combines the analysis of driver
behavior with real-time detection of critical traffic elements.
This integration not only advances road safety but also aligns
with broader Environmental, Social, and Governance (ESG)
goals, offering a holistic perspective on modern traffic man-
agement.
In this paper, we presented SafeSmartDrive, a novel intel-
ligent transport monitoring system designed to detect and
classify various traffic elements, assess driver behavior, and
provide real-time alerts to enhance road safety. The system
is built with four key layers: perception, filtering and prepa-
ration, detection and classification, and alert, enabling it to
be flexible and scalable for a range of traffic management
scenarios.
FIGURE 12. Detection and classification of traffic elements by the SafeSmartDrive system.

The SafeSmartDrive system employs a range of vehicle sensors, including cameras, accelerometers, GPS, and gyroscopes, to gather comprehensive data on the driver's
environment and behavior. These sensors capture various
aspects of the driving experience, such as surrounding vehicles, road damage, and driver actions. The collected
data undergoes preprocessing in the filtering and prepara-
tion layer, where it is organized and formatted according to
the system’s objectives. This preprocessing step ensures that
the dataset is ready for analysis and modeling. Following
the preparation layer, the system incorporates a detection
and classification layer, which encompasses the algorithms
and models used to detect and classify the collected data.
This includes vehicle detection (buses, cars, motorcycles,
trucks, and bicycles), traffic sign and light recognition,
pedestrian and animal detection, infrastructure damage iden-
tification, accident classification, and traffic risk assessment.
The system also evaluates driver behavior across various road
types, such as highways, secondary roads, and intersections.
By leveraging advanced technologies such as computer vision
and machine learning, the system can deliver alerts to drivers directly on vehicle dashboards.
A key contribution of this research is the application of
transfer learning with pre-trained detectors, such as YOLOv9, for traffic element detection. The use of pre-trained models significantly
improves detection accuracy, particularly when adapting to
new datasets. Our results demonstrate that YOLOv9 outper-
forms previous versions (YOLOv7 and YOLOv8) and other
models in the literature, particularly in handling complex,
real-world scenarios. This highlights the advancements made
in object detection techniques and the potential of transfer
learning to enhance ITS systems.
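A minimal sketch of this transfer-learning step, assuming the Ultralytics training API and a placeholder dataset description file, looks as follows; the file names and hyperparameters are illustrative rather than our exact configuration.

```python
from ultralytics import YOLO

# Start from pre-trained YOLOv9 weights so low-level features learned on a
# large corpus transfer to the traffic dataset.
model = YOLO("yolov9c.pt")  # placeholder pre-trained checkpoint

# 'traffic.yaml' is a placeholder dataset file listing image paths and the seven
# classes (bus, car, motorcycle, truck, bicycle, traffic light, stop sign).
model.train(data="traffic.yaml", epochs=100, imgsz=640, batch=16)

metrics = model.val()      # precision, recall, mAP@.5, mAP@.5:.95
print(metrics.box.map50)   # mAP at IoU = 0.5
```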
The implementation of SafeSmartDrive offers several
practical implications when analyzed through the lens of
Environmental, Social, and Governance (ESG) standards.
These implications underscore how the system can impact
sustainable development, societal well-being, and regulatory
compliance, reflecting its potential to modernize transporta-
tion infrastructure while meeting global ESG goals.
The deployment of our SafeSmartDrive system contributes
significantly to environmental sustainability, offering several
advantages in optimizing traffic management. By reducing
congestion and facilitating smoother traffic flow, the sys-
tem plays an important role in minimizing vehicle idling
times, which is a major source of unnecessary fuel consump-
tion and greenhouse gas emissions in urban environments.
The reduction of traffic jams through improved traffic
monitoring directly lowers the carbon footprint of transporta-
tion, aligning with global efforts to combat climate change
and promote greener cities. This environmentally conscious
design makes the SafeSmartDrive system a valuable asset
in advancing eco-friendly transportation systems, supporting
the transition toward more sustainable urban infrastruc-
ture. Furthermore, SafeSmartDrive includes capabilities for
detecting road damage such as potholes and cracks, enabling
timely maintenance. This proactive approach prevents small
issues from escalating into major repairs, which are often
resource-intensive and environmentally costly. In this way,
the system contributes to sustainable urban planning by
extending the lifespan of road infrastructure, reducing mate-
rial waste, and minimizing the environmental footprint of
road maintenance projects.
Linking seamlessly to these environmental benefits, SafeS-
martDrive also delivers substantial social value by enhancing
road safety and accessibility, which benefits communities and
promotes equity in transportation. The system’s capacity to
provide real-time alerts based on driver behavior analysis
directly addresses common causes of traffic accidents, such
as drowsy or aggressive driving. By offering timely warn-
ings to drivers, it can intervene before dangerous behaviors
result in accidents, thereby protecting not only the drivers
themselves but also passengers and other road users. This
capacity for early intervention is particularly crucial in high-
risk scenarios, such as highways and intersections, where
the consequences of unsafe driving can be most severe.
Moreover, the adaptability of SafeSmartDrive allows it to
be implemented across various geographic contexts, ranging
from densely populated urban areas to remote rural roads.
This flexibility ensures that improvements in road safety
are not confined to well-resourced regions but can extend
to underserved areas that may lack advanced traffic man-
agement systems. By promoting safer driving environments
and reducing accident rates, the system contributes to social
equity in road safety, ensuring that the benefits of modern ITS
technologies are accessible to all communities. Additionally,
by reducing the frequency and severity of road accidents,
SafeSmartDrive alleviates the strain on emergency services
and healthcare systems, enabling these resources to be used
more effectively for other community needs.
These social benefits are underpinned by strong gov-
ernance practices, as SafeSmartDrive fosters regulatory
compliance, transparency, and data-driven decision-making
within the transportation sector. By monitoring driver behav-
ior and detecting traffic violations, such as speeding or
improper stops, the system supports effective enforcement
of road safety standards, thereby enhancing accountabil-
ity. For fleet operators, SafeSmartDrive offers detailed
reporting on driver performance, helping companies uphold
safety and regulatory compliance standards across commer-
cial fleets. Additionally, the system generates anonymized
data that provides valuable insights for policymakers and
urban planners, enabling evidence-based decisions for infras-
tructure improvements and traffic management strategies.
Importantly, SafeSmartDrive is designed with robust privacy
measures, ensuring that driver information is protected and
that data is used responsibly, reinforcing its alignment with
governance standards around ethical data handling and trans-
parency.
Despite the promising results, implementing SafeSmart-
Drive presents several challenges that must be addressed
to optimize its performance and accessibility. One of the
primary challenges lies in managing adverse and stochas-
tic environmental conditions, such as low-light settings,
inclement weather, and sudden, unpredictable changes in
lighting. These factors introduce variability that can reduce
detection accuracy and complicate the design of a robust
detection system capable of consistently high performance.
Another challenge involves driver alert management; alerts
need to be designed in a way that does not distract or
disrupt drivers, requiring thoughtful development of alert
mechanisms to avoid unnecessary disturbance while still
providing critical information. Additionally, the system’s
reliance on high-quality sensors and advanced detection
algorithms may increase implementation costs, potentially
limiting accessibility, especially in lower-resourced regions.
Privacy concerns also arise due to the collection and pro-
cessing of personal data through the system’s sensors,
necessitating strong data security and anonymization mea-
sures to protect user information and ensure compliance with
privacy regulations. Despite these challenges, the potential
benefits of SafeSmartDrive, including enhanced road safety
and improved traffic efficiency, make it a promising tool
for modernizing transportation infrastructure and advancing
overall mobility.
Future work for the SafeSmartDrive system will explore
several directions to enhance its capabilities and real-world
applicability further. One key area is the integration
of multi-modal data fusion, combining video data with
accelerometer, GPS, and gyroscope inputs to provide a
richer understanding of complex driving scenarios. This
will improve the system’s ability to detect unsafe driving
behaviors under various conditions. Additionally, imple-
menting real-time edge computing on embedded GPUs or
AI-accelerated hardware could significantly reduce latency,
enabling faster response times for alerts in scenarios with
limited network connectivity. Moreover, developing an adap-
tive alert system that customizes notifications based on driver
habits and real-time conditions could minimize driver distrac-
tions. Additionally, integrating health monitoring features to
detect signs of driver fatigue or stress could further prevent
accidents caused by impaired driving conditions. Finally,
conducting long-term studies on the impact of SafeSmart-
Drive on traffic flow, accident rates, and driver behavior
would provide valuable insights into its effectiveness, sup-
porting broader adoption in smart cities.
VIII. CONCLUSION
In this study, we introduced SafeSmartDrive, an innova-
tive intelligent transport monitoring system that enhances
road safety and traffic management through advanced func-
tionalities. These include comprehensive vehicle detection,
traffic sign and light recognition, accident classification, and
traffic risk assessment, along with driver behavior evalua-
tion across various road types. A key contribution of this
research is the use of transfer learning models, particularly
YOLOv9, for traffic element detection, which demonstrated
superior performance compared to previous YOLO versions
and existing models in the literature. Our results show a
significant improvement in precision and mean average pre-
cision (mAP), with YOLOv9 achieving a precision rate of
83.1% and an mAP score of 82.1% at IoU = 0.5. These findings underscore
the system’s ability to effectively detect vehicles, signs, and
traffic lights in complex, real-world scenarios.
The novel contribution of SafeSmartDrive lies in its
layered structure, which enables flexibility and scalability,
making it adaptable for different traffic monitoring tasks.
This system not only advances the current state of intelligent
transport monitoring but also offers practical solutions for
improving road safety and traffic efficiency through real-time
analysis and alerts.
In the future, we aim to further enhance the system
by implementing additional detection features, including
infrastructure damage and pedestrian and animal detection.
Moreover, future work will focus on improving detection
algorithms to increase the system’s robustness and accuracy
in handling multi-object images under diverse conditions.
We plan to conduct real-world testing to assess the system’s
scalability and performance. Understanding the specific
conditions under which the models succeed or fail will be
critical for further refinement and improvement.
REFERENCES
[1] B. Singh and A. Gupta, "Recent trends in intelligent transportation systems: A review," J. Transp. Literature, vol. 9, no. 2, pp. 30–34, Apr. 2015, doi: 10.1590/2238-1031.jtl.v9n2a6.
[2] A. Soultana, F. Benabbou, and N. Sael, "Context-awareness in the smart car: Study and analysis," in Proc. 4th Int. Conf. Smart City Appl. (SCA), Oct. 2019, doi: 10.1145/3368756.3369019.
[3] S. Bouhsissin, N. Sael, and F. Benabbou, "Driver behavior classification: A systematic literature review," IEEE Access, vol. 11, pp. 14128–14153, 2023, doi: 10.1109/ACCESS.2023.3243865.
[4] N. Al Mudawi, A. M. Qureshi, M. Abdelhaq, A. Alshahrani, A. Alazeb, M. Alonazi, and A. Algarni, "Vehicle detection and classification via YOLOv8 and deep belief network over aerial image sequences," Sustainability, vol. 15, no. 19, p. 14597, Oct. 2023, doi: 10.3390/su151914597.
[5] A. Hamza and S. Nawal, "Traffic sign classification using deep learning comparative study," Proc. Comput. Sci., vol. 233, pp. 939–949, Jan. 2024, doi: 10.1016/j.procs.2024.03.283.
[6] S. Bouhsissin, N. Sael, and F. Benabbou, "Prediction of risks in intelligent transport systems," in Lecture Notes in Networks and Systems. Cham, Switzerland: Springer, 2022, pp. 303–316.
[7] S. Bouhsissin, N. Sael, and F. Benabbou, "Enhanced VGG19 model for accident detection and classification from video," in Proc. Int. Conf. Digit. Age Technological Adv. Sustain. Develop. (ICDATA), Jun. 2021, pp. 39–46, doi: 10.1109/ICDATA52997.2021.00017.
[8] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 7464–7475, doi: 10.1109/cvpr52729.2023.00721.
[9] D. Reis, J. Kupec, J. Hong, and A. Daoudi, "Real-time flying object detection with YOLOv8," 2023, arXiv:2305.09972.
[10] C.-Y. Wang, I.-H. Yeh, and H.-Y. M. Liao, "YOLOv9: Learning what you want to learn using programmable gradient information," 2024, arXiv:2402.13616.
[11] R. Wang, H. Zhao, Z. Xu, Y. Ding, G. Li, Y. Zhang, and H. Li, "Real-time vehicle target detection in inclement weather conditions based on YOLOv4," Frontiers Neurorobotics, vol. 17, Mar. 2023, Art. no. 1058723, doi: 10.3389/fnbot.2023.1058723.
[12] N. Ibrahim Nife and M. Chtourou, "Improved detection and tracking of objects based on a modified deep learning model (YOLOv5)," Int. J. Interact. Mobile Technol., vol. 17, no. 21, pp. 145–160, Nov. 2023.
[13] J. Lian, Y. Yin, L. Li, Z. Wang, and Y. Zhou, "Small object detection in traffic scenes based on attention feature fusion," Sensors, vol. 21, no. 9, p. 3031, Apr. 2021, doi: 10.3390/s21093031.
[14] H. B. Fredj, A. Chabbah, J. Baili, H. Faiedh, and C. Souani, "An efficient implementation of traffic signs recognition system using CNN," Microprocessors Microsystems, vol. 98, Apr. 2023, Art. no. 104791, doi: 10.1016/j.micpro.2023.104791.
[15] G. Latif, D. A. Alghmgham, R. Maheswar, J. Alghazo, F. Sibai, and M. H. Aly, "Deep learning in transportation: Optimized driven deep residual networks for Arabic traffic sign recognition," Alexandria Eng. J., vol. 80, pp. 134–143, Oct. 2023, doi: 10.1016/j.aej.2023.08.047.
[16] P. Bailke, "Traffic sign classification using CNN," Int. J. Res. Appl. Sci. Eng. Technol., vol. 10, no. 2, pp. 198–206, Feb. 2022.
[17] T. Han, L. Sun, and Q. Dong, "An improved YOLO model for traffic signs small target image detection," Appl. Sci., vol. 13, no. 15, p. 8754, Jul. 2023, doi: 10.3390/app13158754.
[18] S. Qu, X. Yang, H. Zhou, and Y. Xie, "Improved YOLOv5-based for small traffic sign detection under complex weather," Sci. Rep., vol. 13, no. 1, p. 16219, Sep. 2023, doi: 10.1038/s41598-023-42753-3.
[19] M. Khan, M. A. Raza, G. Abbas, S. Othmen, A. Yousef, and T. A. Jumani, "Pothole detection for autonomous vehicles using deep learning: A robust and efficient solution," Frontiers Built Environ., vol. 9, pp. 1–28, Jan. 2024.
[20] N. Sanil, P. A. N. Venkat, V. Rakesh, R. Mallapur, and M. R. Ahmed, "Deep learning techniques for obstacle detection and avoidance in driverless cars," in Proc. Int. Conf. Artif. Intell. Signal Process. (AISP), Jan. 2020, pp. 1–4, doi: 10.1109/AISP48273.2020.9073155.
[21] S. Ghosh, S. J. Sunny, and R. Roney, "Accident detection using convolutional neural networks," in Proc. Int. Conf. Data Sci. Commun. (IconDSC), Mar. 2019, pp. 1–6.
[22] M. Moniruzzaman, Z. Yin, and R. Qin, "Spatial attention mechanism for weakly supervised fire and traffic accident scene classification," in Proc. IEEE Int. Conf. Smart Comput. (SMARTCOMP), Jun. 2019, pp. 258–265, doi: 10.1109/SMARTCOMP.2019.00061.
[23] J. Wang, Y. Kong, and T. Fu, "Expressway crash risk prediction using back propagation neural network: A brief investigation on safety resilience," Accident Anal. Prevention, vol. 124, pp. 180–192, Mar. 2019, doi: 10.1016/j.aap.2019.01.007.
[24] M. Wu, D. Shan, Z. Wang, X. Sun, J. Liu, and M. Sun, "A Bayesian network model for real-time crash prediction based on selected variables by random forest," in Proc. 5th Int. Conf. Transp. Inf. Safety, Jul. 2019, pp. 670–677, doi: 10.1109/ICTIS.2019.8883694.
[25] H. Zhao, H. Yu, D. Li, T. Mao, and H. Zhu, "Vehicle accident risk prediction based on AdaBoost-SO in VANETs," IEEE Access, vol. 7, pp. 14549–14557, 2019, doi: 10.1109/ACCESS.2019.2894176.
[26] P. Li, M. Abdel-Aty, and J. Yuan, "Real-time crash risk prediction on arterials based on LSTM-CNN," Accident Anal. Prevention, vol. 135, Feb. 2020, Art. no. 105371, doi: 10.1016/j.aap.2019.105371.
[27] G.-P. Corcoran and J. Clark, "Traffic risk assessment: A two-stream approach using dynamic-attention," in Proc. 16th Conf. Comput. Robot Vision, May 2019, pp. 166–173, doi: 10.1109/CRV.2019.00030.
[28] W.-J. Wu and H.-Y. Lin, "Traffic risk assessment from driving scene images," in Proc. IEEE Int. Conf. Syst., Man, Cybern. (SMC), Oct. 2023, pp. 3015–3020, doi: 10.1109/SMC53992.2023.10393985.
[29] Y. Moukafih, H. Hafidi, and M. Ghogho, "Aggressive driving detection using deep learning-based time series classification," in Proc. IEEE Int. Symp. Innov. Intell. Syst. Appl. (INISTA), Jul. 2019, pp. 1–5, doi: 10.1109/INISTA.2019.8778416.
[30] K. Saleh, M. Hossny, and S. Nahavandi, "Driving behavior classification based on sensor data fusion using LSTM recurrent neural networks," in Proc. IEEE 20th Int. Conf. Intell. Transp. Syst. (ITSC), Oct. 2017, pp. 1–6.
[31] M. A. Khodairy and G. Abosamra, "Driving behavior classification based on oversampled signals of smartphone embedded sensors using an optimized stacked-LSTM neural networks," IEEE Access, vol. 9, pp. 4957–4972, 2021, doi: 10.1109/ACCESS.2020.3048915.
[32] S. Bouhsissin, N. Sael, F. Benabbou, and A. Soultana, "Enhancing machine learning algorithm performance through feature selection for driver behavior classification," Indonesian J. Electr. Eng. Comput. Sci., vol. 35, no. 1, p. 354, Jul. 2024, doi: 10.11591/ijeecs.v35.i1.pp354-365.
[33] D. Gazis, R. Herman, and A. Maradudin, "The problem of the amber signal light in traffic flow," Operations Res., vol. 8, no. 1, pp. 112–132, Feb. 1960, doi: 10.1287/opre.8.1.112.
[34] M. Elhenawy, A. Jahangiri, H. A. Rakha, and I. El-Shawarby, "Classification of driver stop/run behavior at the onset of a yellow indication for different vehicles and roadway surface conditions using historical behavior," Proc. Manuf., vol. 3, pp. 858–865, Apr. 2015.
[35] M. Elhenawy, A. Jahangiri, H. A. Rakha, and I. El-Shawarby, "Modeling driver stop/run behavior at the onset of a yellow indication considering driver run tendency and roadway surface conditions," Accident Anal. Prevention, vol. 83, pp. 90–100, Oct. 2015, doi: 10.1016/j.aap.2015.06.016.
[36] C. Chen, Y. Chen, J. Ma, G. Zhang, and C. M. Walton, "Driver behavior formulation in intersection dilemma zones with phone use distraction via a logit-Bayesian network hybrid approach," J. Intell. Transp. Syst., vol. 22, no. 4, pp. 311–324, Jul. 2018, doi: 10.1080/15472450.2017.1350921.
[37] Y. Ali, M. M. Haque, Z. Zheng, and M. C. J. Bliemer, "Stop or go decisions at the onset of yellow light in a connected environment: A hybrid approach of decision tree and panel mixed logit model," Analytic Methods Accident Res., vol. 31, Sep. 2021, Art. no. 100165, doi: 10.1016/j.amar.2021.100165.
[38] S. Bouhsissin, N. Sael, and F. Benabbou, "Classification and modeling of driver behavior during yellow intervals at intersections," Int. Arch. Photogramm., Remote Sens. Spatial Inf. Sci., vol. 4, pp. 33–40, Dec. 2022.
[39] S. Bouhsissin, A. Jannani, N. Sael, and F. Benabbou, "Enhanced CNN-based model for traffic risk assessment," in Innovations in Smart Cities and Applications, vol. 8. Springer, 2024.
[40] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in Computer Vision—ECCV. Springer, 2014, pp. 740–755.
SOUKAINA BOUHSISSIN received the B.Sc.
degree in mathematical sciences and in computer
science from the Faculty of Sciences Ben M’Sick,
Hassan II University of Casablanca, Morocco,
in 2018, and the M.Sc. degree in data science
and big data from the Hassan II University of
Casablanca, in 2020. She is currently pursuing the
Ph.D. degree in computer science. Her research
interests include driver behavior classification,
intelligent transport systems, machine learning,
deep learning, computer vision, and time series.
NAWAL SAEL received the Engineering degree
in software engineering from ENSIAS, Morocco,
in 2002. She has been a Teacher-Researcher, since
2012, an Authorized Professor, since 2014, and a
Professor of Higher Education with the Depart-
ment of Mathematics and Computer Science, Ben
M’Sick Faculty of Sciences, Casablanca, since
2020. Her research interests include data min-
ing, educational data mining, machine learning,
deep learning, computer vision, and the Internet of
Things.
FAOUZIA BENABBOU has been a Teacher-
Researcher, since 1994, an Authorized Professor,
since 2008, and a Professor of Higher Education
with the Department of Mathematics and Com-
puter Science, Ben M’Sick Faculty of Sciences,
Casablanca, since 2015. She is currently a mem-
ber of the Information Technology and Modeling
Laboratory and the Leader of the ICCNSE (Cloud
Computing, Network and Systems Engineering)
Team. Her research interests include cloud com-
puting, data mining, machine learning, and natural language processing.
ABDELFETTAH SOULTANA received the mas-
ter’s degree in software quality from the Hassan
II University of Casablanca, Morocco, in 2015.
He is currently pursuing the Ph.D. degree with the
Laboratory of Information Processing and Model-
ing, Ben M’Sik Faculty of Science. His research
interests include machine learning, deep learning,
and the Internet of Things.
AYOUB JANNANI received the Bachelor of Sci-
ence and Techniques degree in computer science
from the Faculty of Sciences and Techniques,
Sultan Moulay Slimane University, Beni Mellal,
Morocco, in 2018, and the Master of Science
degree in data science and big data from the Fac-
ulty of Sciences Ben M’Sick, Hassan II University
of Casablanca, Morocco, in 2020. He is currently
pursuing the Ph.D. degree in computer science.
His research interests include leveraging artificial
intelligence to enhance quality of life and wellbeing, specifically through the
use of natural language processing, machine learning, deep learning, time
series analysis, and computer vision.