Article · PDF Available

Optimized YOLOv3 Algorithm and Its Application in Traffic Flow Detections

MDPI
Applied Sciences

Abstract and Figures

In intelligent traffic systems, real-time and accurate detection of vehicles in images and video data is important and challenging work, particularly in scenes with complex backgrounds, diverse vehicle models, and high traffic density, where vehicles are difficult to locate and classify during traffic flows. We therefore propose YOLOv3-DL, a single-stage deep neural network based on the TensorFlow framework, to address this problem. The network structure is optimized by introducing the idea of spatial pyramid pooling, the loss function is redefined, and a weight regularization method is introduced, so that real-time detection and traffic-flow statistics can be implemented effectively. We train the optimized algorithm end-to-end on the DL-CAR data set and run experiments on data sets covering different scenes and weather conditions. The analyses of the experimental data show that the optimized algorithm improves vehicle detection accuracy on the test set by 3.86%, and by 4.53% on test sets from different environments, indicating that the algorithm is highly robust. At the same time, the detection accuracy and speed of the investigated algorithm exceed those of the compared algorithms, indicating higher overall detection performance.
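The spatial pyramid pooling idea mentioned in the abstract can be illustrated with a minimal NumPy sketch (not the authors' implementation): a feature map is max-pooled over several fixed grids and the results are concatenated into one fixed-length vector, regardless of the input's spatial size.

```python
import numpy as np

def spp_pool(feature, levels=(1, 2, 4)):
    """Spatial pyramid pooling: max-pool a (H, W, C) feature map on
    several fixed grids and concatenate into one fixed-length vector."""
    H, W, C = feature.shape
    parts = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                # integer cell bounds; max() keeps each cell non-empty
                h0, h1 = i * H // n, max((i + 1) * H // n, i * H // n + 1)
                w0, w1 = j * W // n, max((j + 1) * W // n, j * W // n + 1)
                # max over the cell, one value per channel
                parts.append(feature[h0:h1, w0:w1, :].max(axis=(0, 1)))
    return np.concatenate(parts)  # length = C * sum(n * n for n in levels)

fmap = np.random.rand(8, 8, 3)
vec = spp_pool(fmap)
print(vec.shape)  # (63,) = 3 * (1 + 4 + 16)
```

Because the output length depends only on the channel count and the grid levels, layers after the pooling stage see a fixed-size input even when detection images vary in resolution.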
... Most methods for recognizing traffic signals in an image involve two primary steps: detection and classification. Many academics are using well-established or specially created computer vision algorithms to tackle this difficult issue (Huang et al. 2020). ...
Article
Full-text available
Traffic accidents remain a pressing public safety concern, with a substantial number of incidents resulting from drivers' lack of attentiveness to road signs. Automated road sign recognition has emerged as a promising technology for enhancing driving assistance systems. This study explores the application of Convolutional Neural Networks (CNNs) in automatically recognizing road signs. CNNs, as deep learning algorithms, possess the ability to process and classify visual data, making them well-suited for image-based tasks such as road sign recognition. The research focuses on the data collection process for training the CNN, incorporating a diverse dataset of road sign images to improve recognition accuracy across various scenarios. A mobile application was developed as the user interface, with the output of the system displayed on the app. The results show that the system is capable of recognizing signs in real time, with average accuracy for sign recognition from a distance of 10 meters: i) daytime = 89.8%, ii) nighttime = 75.6%, and iii) rainy conditions = 76.4%. In conclusion, the integration of CNNs in automated road sign recognition, as demonstrated in this study, presents a promising avenue for enhancing driving safety by addressing drivers' attentiveness to road signs in real-time scenarios.
... Implementation of YOLOv7 on the COCO dataset shows good mAP. The network structure of the latest version of YOLO is displayed in Fig. 3. The Darknet network extracts and obtains deeper feature information (Huang et al. 2020). It produces results three times faster (Qian et al. 2017). ...
Article
Full-text available
Due to rapid population growth and the continuing rise in the number of vehicles on the road, transportation congestion has become a pressing issue, and IoT-aided smart transportation systems show promise in this area. In smart-city scenarios combining artificial intelligence (AI) and the Internet of Things (IoT), a massive amount of video streaming data is produced at high speed by distributed mobile IoT devices and video cameras, and real-time applications demand efficient analysis of these data. The key focus of this work is improving cloud-based traffic video analytics through a two-step approach: first, edge-based pre-processing of the video stream to reduce data transmission time before cloud-based analytics; second, video analytics and sensor fusion (VA/SF) are studied and examined to guarantee that the data the algorithms are trained on sufficiently cover the range of conditions, making the system efficient enough to provide high-accuracy or low-latency modes of service. We propose a YOLO-based deep learning video analytics system on the cloud to perform real-time object detection for traffic surveillance video. The proposed VA/SF model reduces detection time while improving object detection accuracy by 1.8% compared to no IoT sensor fusion. The experiments show that our traffic analytics model achieves higher accuracy and better detection under extreme weather conditions.
... One of the more popular computer vision algorithms is You Only Look Once (YOLO), which keeps advancing with each version. With v3 of YOLO, both the original and YOLOv3-DL meet the requirements of high precision and fast speed required to monitor traffic flows, achieving accuracy rates of approximately 95% and 99% [23]. A similar study was also conducted by Impedovo et al. using YOLOv3, which gave an accuracy of 98% in traffic state classification [24]. ...
Article
Full-text available
The escalating prevalence of traffic congestion in Malang, which has worsened over time, has resulted in increased inconvenience and traffic-related issues. As a solution, we propose the development of an intelligent transportation system based on traffic flow estimation. While this endeavor presents challenges, advancements in traffic data accessibility and deep learning algorithms, including You Only Look Once (YOLO), R-CNN based ResNet101, Inception V2, and others, with YOLOv7 demonstrating superior performance, continue to enhance traffic flow estimation. Consequently, this study aims to describe and assess the YOLOv7 model's suitability for estimating traffic flow. To achieve this, we propose a traffic monitoring system utilizing the YOLOv7 model in conjunction with a SORT filter, capable of tracking vehicle count and average speed using a custom dataset comprising car images and CCTV footage of Malang's traffic. Subsequently, the number and average speed of cars passing on the road were computed by delineating two virtual boxes. According to the findings, our YOLOv7 system exhibited higher average precision, achieving 61.3%, 78.6%, and 62.1% for each test, and boasted a faster average inference time of 90.67 seconds, meeting the requirements of an intelligent transportation system. Nevertheless, with an average percent deviation in vehicle count exceeding 30% compared to the pre-trained model, vehicle detection is suboptimal, consequently impacting average vehicle speed detection adversely. Hence, our custom model necessitates further optimization and training before practical implementation in real-world traffic estimation.
... The prediction results for objects of various sizes consist of bounding box coordinates, object presence probability, and class probability. This allows for the rapid and accurate detection of objects of varying sizes [14,15]. ...
Article
Wildfires cause human casualties, property damage, and significant damage to ecosystems, so prompt fire detection and response are essential. In this study, training and evaluation experiments were performed on the YOLOv3, Faster R-CNN, and Cascade R-CNN models to analyze their real-time wildfire detection performance. The performance of each model was evaluated using precision, recall, mAP 50, mAP 50-95, the model parameter, and frames per second (FPS). The experiment results showed that YOLOv3 had slightly lower overall performance than other models, but the difference was not significant in terms of mAP 50. YOLOv3 had the highest inference speed (48.7 FPS) and an overall mAP 50 of 0.832. On the contrary, Faster R-CNN's overall mAP 50 was 0.835, while Cascade R-CNN showed higher performance with 0.840. Compared to YOLOv3, both models showed relatively low inference speeds (36.2 and 26.3 FPS, respectively). Therefore, the YOLOv3 model is deemed appropriate for real-time fire detection owing to its high inference speed and insignificant wildfire detection performance difference in terms of mAP 50. Future wildfire detection models are expected to improve by optimizing the network structures (e.g., FPN and Head) of 1-Stage models, such as YOLOv3.
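The mAP 50 figures compared above come from averaging per-class average precision (AP), where each detection counts as a true positive if it matches a ground-truth box at IoU ≥ 0.5. A minimal sketch of AP from score-ranked detections, using the area under the interpolated precision-recall curve (an illustrative helper, not any of the cited implementations):

```python
def average_precision(detections, num_gt):
    """AP from a list of (score, is_true_positive) detections, where each
    ground-truth box may be matched by at most one detection."""
    dets = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in dets:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # make precision monotonically non-increasing (right-to-left envelope)
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # area under the interpolated precision-recall curve
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

# a perfect detector: every ground-truth box found, no false positives
print(average_precision([(0.9, True), (0.8, True)], num_gt=2))  # 1.0
```

mAP 50 is then the mean of this quantity over all classes at the 0.5 IoU threshold, while mAP 50-95 additionally averages over IoU thresholds from 0.5 to 0.95.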
Article
Full-text available
Numerous design cases of abandoning auxiliary lanes for freeway dual-lane ramps with low traffic volumes exist, adapting to complex engineering conditions and reducing construction costs, but the national specifications have not posed specific setup conditions for auxiliary lanes. Thus, this paper uses traffic flow theory and simulation tools to study the critical traffic conditions applicable to auxiliary lanes on dual-lane exit ramps of freeways. Initially, the vehicle operation data in the UAV (unmanned aerial vehicle) aerial video were extracted using an object detection algorithm. Subsequently, the VISSIM simulation calibration procedure was developed based on traffic flow theory and the orthogonal experimental method. The impact of auxiliary lanes on the capacity of the freeway diverging area was analyzed through the simulation results based on traffic flow theory. Eventually, the critical traffic conditions applicable to auxiliary lanes were proposed. The results show that the maximum traffic volume applicable to non-auxiliary lane designs decreases with increasing diverging ratios. The research findings define the application conditions for auxiliary lanes on dual-lane ramp exits, contributing to the sustainable development of transportation design and operations. The VISSIM simulation calibration procedure based on data collection and traffic flow theory developed in this paper also provides an innovative and sustainable approach to road design issues.
Article
Full-text available
The importance of deep learning has heralded transformative changes across different technological domains, not least in enhancing robotic arm capabilities for object detection and grasping. This paper reviews recent and past studies to give comprehensive insight into cutting-edge deep learning methodologies for surmounting the persistent challenges of object detection and precise manipulation by robotic arms. By integrating iterations of the You Only Look Once (YOLO) algorithm with deep learning models, our study not only advances innovations in robotic perception but also significantly improves the accuracy of robotic grasping in dynamic environments. Through a comprehensive exploration of various deep learning techniques, we introduce many approaches that enable robotic arms to identify and grasp objects with unprecedented precision, thereby bridging a critical gap in robotic automation. Our findings demonstrate a marked enhancement in the robotic arm's ability to adapt to and interact with its surroundings, opening new avenues for automation in industrial, medical, and domestic applications. This research lays the groundwork for future developments in robotic autonomy, offering insights into the integration of deep learning algorithms with robotic systems, and serves as a beacon for future research aimed at fully unleashing the potential of robots as autonomous agents in complex, real-world settings.
Article
The paper discusses new approaches to the arrangement of non-stop traffic flows at signaled crossings using coordinated traffic management. The study objective is to develop a mathematical model for determining the recommended traffic speed on the stretches of the urban road network using computer vision to ensure the non-stop traffic of a group of vehicles when crossing a signaled intersection. The model is unique because it takes into account the queue parameters of out-of-group vehicles, as well as the condition of the road surface. The study presents a method for calculating the time of non-stop passage of cars over crossings of the road network using AIMS-Eco monitoring system. The system uses real-time video stream analysis technology based on YOLOv4 neural network to obtain data on traffic parameters. The coefficients of influence of the traffic flow structure and the condition of the road surface on the time of queuing outside the group vehicles are characterized, which makes it possible to more accurately assess the impact of these factors on traffic dynamics. The dependences studied include those of the recommended speed of the leading car and the capacity of the road network section on the number of extra-group vehicles, taking into account the travel time of the queue of extra-group vehicles. The developed mathematical model makes it possible to increase the average flow velocity in the section of the stop line by 10-15% due to the arrangement of non-stop passage of signaled intersections by group vehicles. The results achieved are of practical importance for increasing the traffic capacity of the road network, improving road and environmental safety of automobile traffic.
Conference Paper
Full-text available
Bounding box regression is the crucial step in object detection. In existing methods, while ℓn-norm loss is widely adopted for bounding box regression, it is not tailored to the evaluation metric, i.e., Intersection over Union (IoU). Recently, IoU loss and generalized IoU (GIoU) loss have been proposed to benefit the IoU metric, but still suffer from the problems of slow convergence and inaccurate regression. In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. By incorporating DIoU and CIoU losses into state-of-the-art object detection algorithms, e.g., YOLO v3, SSD and Faster R-CNN, we achieve notable performance gains in terms of not only IoU metric but also GIoU metric. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance improvement. The source code and trained models are available at https://github.com/Zzh-tju/DIoU
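The DIoU loss defined above has a compact closed form: 1 − IoU plus the squared distance between box centers, normalized by the squared diagonal of the smallest box enclosing both. A minimal sketch for axis-aligned boxes (illustrative only, not the paper's released code):

```python
def diou_loss(box_a, box_b):
    """Distance-IoU loss for boxes given as (x1, y1, x2, y2):
    1 - IoU + (squared center distance) / (squared enclosing diagonal)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # squared distance between the two box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
         ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # squared diagonal of the smallest box enclosing both
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + \
         (max(ay2, by2) - min(ay1, by1)) ** 2
    return 1.0 - iou + d2 / c2

print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0 for identical boxes
```

Unlike plain IoU loss, the distance term stays non-zero (and differentiable toward the target) even when the boxes do not overlap, which is what gives the faster convergence the abstract describes; CIoU adds a further aspect-ratio consistency term.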
Article
Full-text available
We propose a high-performance algorithm using a promoted and modified form of the You Only Look Once (YOLO) model, based on the TensorFlow framework, to enhance the real-time monitoring of traffic-flow problems by an intelligent transportation system. Real-time detection and traffic-flow statistics were realized by adjusting the network structure, optimizing the loss function, and introducing weight regularization. This model, which we call YOLO-UA, was initialized with the weights of a YOLO model pre-trained on the VOC2007 data set. The UA-CAR data set, which covers complex weather conditions, was used for training, and better model parameters were selected through tests and subsequent adjustments. The experimental results showed that, across different weather scenarios, the accuracy of YOLO-UA was ~22% greater than that of the YOLO model before optimization, and the recall rate increased by about 21%. On both cloudy and sunny days, the accuracy, precision, and recall rate of the YOLO-UA model all exceeded 94% with little fluctuation, suggesting that precision and recall achieved a good balance. When used for video testing, the YOLO-UA model yielded traffic statistics with an accuracy of up to 100%; the time to count the vehicles in each frame was less than 30 ms, and the model was highly robust to changes in scenario and weather.
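The precision/recall balance reported above follows directly from the detector's confusion counts. As a quick reminder of the two definitions (illustrative numbers, not the paper's results):

```python
def detection_metrics(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and
    false-negative counts of a detector on a test set."""
    precision = tp / (tp + fp)  # fraction of detections that are correct
    recall = tp / (tp + fn)     # fraction of real vehicles that were found
    return precision, recall

# e.g. 950 correct detections, 30 false alarms, 20 missed vehicles
p, r = detection_metrics(950, 30, 20)
print(round(p, 3), round(r, 3))  # 0.969 0.979
```

A "good balance" means neither metric is traded away for the other: lowering the confidence threshold raises recall at the cost of precision, and vice versa.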
Article
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at https://pjreddie.com/yolo/
Conference Paper
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.