Applied Intelligence (2021) 51:237–247
https://doi.org/10.1007/s10489-020-01827-9
Published online: 8 August 2020
Deep learning and control algorithms of direct perception
for autonomous driving
Der-Hau Lee1 · Kuan-Lin Chen2 · Kuan-Han Liou2 · Chang-Lun Liu2 · Jinn-Liang Liu2
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
We propose an end-to-end machine learning model that integrates multi-task (MT) learning, convolutional neural networks
(CNNs), and control algorithms to achieve efficient inference and stable driving for self-driving cars. The CNN-MT model
can simultaneously perform regression and classification tasks for estimating perception indicators and driving decisions,
respectively, based on the direct perception paradigm of autonomous driving. The model can also be used to evaluate the
inference efficiency and driving stability of different CNNs using the metrics of network size, complexity, accuracy, processing
speed, and collision number in dynamic traffic. We also propose new algorithms for controllers to drive a car
using the indicators and its short-range sensory data to avoid collisions in real-time testing. We collect a set of images from
a camera of The Open Racing Car Simulator (TORCS) in various driving scenarios, train the model on this dataset, test it in unseen
traffic, and find that it outperforms earlier models in highway traffic. The stability of end-to-end learning and self-driving
depends crucially on the dynamic interplay between CNN and control algorithms. The source code and data of this work are
available on our website, which can be used as a simulation platform to evaluate different learning models on equal footing
and quantify collisions precisely for further studies on autonomous driving.
Keywords Self-driving cars · Autonomous driving · Deep learning · Image perception · Control algorithms
Jinn-Liang Liu
jinnliu@mail.nd.nthu.edu.tw; http://www.nhcue.edu.tw/jinnliu/

Der-Hau Lee
derhaulee@yahoo.com.tw

Kuan-Lin Chen
mark0101tw@gmail.com

Kuan-Han Liou
jason839262002@yahoo.com.tw

Chang-Lun Liu
leo28833705@yahoo.com.tw

1 Department of Electrophysics, National Chiao Tung University, Hsinchu, Taiwan
2 Institute of Computational and Modeling Science, National Tsing Hua University, Hsinchu, Taiwan

1 Introduction

The direct perception model proposed by Chen et al. [1] maps an input image (high-dimensional pixels) from a
sensory device of a vehicle to fourteen affordance indicators
(a low-dimensional representation) by a convolutional
neural network (CNN). Controllers then drive the vehicle
autonomously using these indicators in an end-to-end (E2E)
and real-time manner. This paradigm falls between and
displays the merits [1–3] of the mediated perception [4–8]
and behavior reflex [9–13] paradigms. We refer to these
papers, some recent review articles [14–18], and references
therein for more thorough discussions of these three major
paradigms in the state-of-the-art machine learning algorithms of
autonomous driving.
We instead study the interplay between CNNs and
controllers and its effects on the overall performance of self-
driving cars in training and testing phases, which are not
addressed in earlier studies. A CNN is a perception mapping
from sensory input to affordance output. Controllers
then map key affordances to driving actions, namely, to
accelerate, brake, or steer [1].
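To make these two mappings concrete, the sketch below (in PyTorch) shows what a CNN-MT architecture of this kind could look like: a shared convolutional backbone feeding a regression head for affordance indicators and a classification head for driving decisions, trained with a weighted joint loss. The layer sizes, the three-class decision head, and the loss weighting are illustrative assumptions rather than the authors' exact configuration.

# Minimal PyTorch sketch of a multi-task CNN in the spirit of the CNN-MT model:
# a shared backbone with a regression head (affordance indicators) and a
# classification head (driving decision). Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class CNNMT(nn.Module):
    def __init__(self, num_indicators=14, num_decisions=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.regressor = nn.Linear(128, num_indicators)   # affordance indicators
        self.classifier = nn.Linear(128, num_decisions)   # e.g., accelerate/brake/steer

    def forward(self, image):
        features = self.backbone(image)
        return self.regressor(features), self.classifier(features)

def mt_loss(pred_ind, pred_dec, true_ind, true_dec, alpha=0.5):
    # Joint objective: MSE on the indicators plus cross-entropy on the decisions,
    # balanced by an assumed weighting coefficient alpha.
    return (alpha * nn.functional.mse_loss(pred_ind, true_ind)
            + (1 - alpha) * nn.functional.cross_entropy(pred_dec, true_dec))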
These two mapping algorithms are generally proposed
and verified separately since automotive control systems
are very complex, varying with vehicle types and levels
of automation [14–19]. A great variety of simulators have
been developed for simulation testing of autonomous cars in
... Lane detection has progressed from edge detection and Hough transforms to deep learning-based segmentation [13]. CNN-based methods reliably segment lanes [31], often integrating post-processing like polynomial curve fitting for smoother lane boundaries [21]. Multi-task learning incorporating temporal information further improves detection stability [20]. ...
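As an illustration of the polynomial post-processing step mentioned in the snippet above, the generic sketch below fits a quadratic to the pixel coordinates of one segmented lane with NumPy; it is not the exact procedure of the cited works.

# Generic post-processing sketch: fit a quadratic to lane pixels produced by a
# segmentation network to obtain a smooth lane boundary (illustrative only).
import numpy as np

def fit_lane(lane_pixels_yx, degree=2):
    """lane_pixels_yx: (N, 2) array of (row, col) coordinates of one lane."""
    y, x = lane_pixels_yx[:, 0], lane_pixels_yx[:, 1]
    coeffs = np.polyfit(y, x, degree)   # x = a*y^2 + b*y + c
    return np.poly1d(coeffs)            # callable boundary x(y)

# Example: evaluate the fitted boundary on a range of image rows.
lane = np.array([[480, 300], [400, 320], [320, 345], [240, 372]])
curve = fit_lane(lane)
print(curve(np.arange(240, 481, 60)))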
... Deep networks now infer lanes even when markings are degraded by leveraging contextual road cues [19]. State-of-the-art models like LaneNet achieve 96.4% accuracy on TuSimple [46], while SCNN achieves 71.6% F1-score on CULane [21], setting benchmarks for comparison. ...
... All experiments were conducted on a PC equipped with an Intel i9-9900K CPU, 64 GB of RAM, and an NVIDIA RTX 2080 Ti GPU with 11 GB of VRAM. The TORCS software provides sophisticated visualization and physics engines and accurately simulates both visual effects and vehicle dynamics for a self-driving vehicle and its surrounding environment [27], [28], [29]. Due to lag in the TORCS simulator, the actuation delay was approximately 6.66 ms on our computer for vision-based experiments; this lag was zero in numerical simulations. ...
Preprint
Full-text available
A robust control strategy for autonomous vehicles can improve system stability, enhance riding comfort, and prevent driving accidents. This paper presents a novel interpolation tube-based constrained iterative linear quadratic regulator (itube-CILQR) algorithm for autonomous computer-vision-based vehicle lane-keeping. The goal of the algorithm is to enhance robustness during high-speed cornering on tight turns. The advantages of itube-CILQR over the standard tube approach include reduced system conservatism and increased computational speed. Numerical and vision-based experiments were conducted to examine the feasibility of the proposed algorithm. The proposed itube-CILQR algorithm is better suited to vehicle lane-keeping than variational CILQR-based methods and model predictive control (MPC) approaches using a classical interior-point solver. Specifically, in evaluation experiments, itube-CILQR achieved an average runtime of 3.16 ms to generate a control signal to guide a self-driving vehicle; itube-MPC typically required approximately 4.67 times longer to complete the same task. Moreover, the influence of conservatism on system behavior was investigated by exploring the interpolation variable trajectories derived from the proposed itube-CILQR algorithm during lane-keeping maneuvers. Index Terms: Autonomous vehicles, constrained iterative linear quadratic regulator, tube model predictive control, deep neural networks.
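For context, CILQR-type methods iteratively solve a finite-horizon constrained optimal control problem of the standard form below; the specific itube-CILQR cost, vehicle model, and tube construction are not given in this abstract, so only the generic formulation is shown.

\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \;& \sum_{k=0}^{N-1} \ell(x_k, u_k) \;+\; \ell_f(x_N) \\
\text{subject to}\;& x_{k+1} = f(x_k, u_k), \quad k = 0,\dots,N-1, \\
& g(x_k, u_k) \le 0 \quad \text{(e.g., lane-boundary and actuator limits).}
\end{aligned}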
... For example, in medicine, Rana M and Bhushan M used CNN and RF algorithms to diagnose and classify diseases with nearly 100% accuracy [4]. For autonomous driving, Lee D H, Chen K L, Liou K H and others modified a CNN using a large number of perception images and designed controllers that achieved zero collisions for all agent cars [5]. Machine learning has also made significant progress in loan prediction [6]. ...
Article
Full-text available
With economic development and social progress, loans have become one of the most common businesses of modern banks. To reduce loan risk, it is particularly important to develop an effective method for forecasting loan defaults. This paper therefore proposes a loan forecasting method based on XGBoost and feature importance evaluation. Specifically, after cleaning the original dataset, we evaluate the classification performance of several machine learning models and find that XGBoost performs best. In addition, to provide guidance for bank staff, we go beyond predicting outcomes and use SHAP and feature importance methods to analyze the relevant feature variables and explain the influence of each feature on the model output. The results show that the principal paid by the borrower, the amount of financing, and the amount recovered after the borrower defaults are the main factors affecting default risk. The results and methods of this study can provide important references for financial institutions in the loan approval process and help improve the effectiveness and accuracy of risk management.
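A minimal sketch of the kind of pipeline this abstract describes, using the public XGBoost and SHAP libraries on synthetic stand-in data (the real loan features, preprocessing, and model settings are not specified above and are assumed here for illustration):

# Train an XGBoost classifier and explain its predictions with SHAP.
import xgboost as xgb
import shap
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a cleaned loan table; real features would include the
# principal paid, the financing amount, post-default recoveries, and so on.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# SHAP values quantify each feature's contribution to the predicted default risk.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)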
Article
Full-text available
Recent advancements in digital technologies, particularly artificial intelligence (AI), have resulted in remarkable transformations in the automobile industry. AI plays a key role in the development of autonomous vehicles. In this paper, the role of AI in autonomous vehicle (AV) platform layers is studied. The focus is on papers indexed in the Scopus database. The most relevant keywords were selected and searched, and 628 articles published between 2014 and 2024 were chosen for analysis and review. Articles were analysed by source type, topic, and AI algorithm. Text mining and content analysis revealed that 233 journals published the 628 articles, and the top 185 were selected for assessment. The topics are classified into perception, localization and mapping, planning, decision making, control, communication, security, data management, and general topics. Each of these areas comprises many roles or tasks that use AI. Convolutional neural networks are used most in the perception, control, and localization and mapping layers, while deep reinforcement learning has the most applications in planning and decision-making. The main results of this paper are a classification of AV platform layers, a data-driven, digital-twin, AI-based model of autonomous vehicle architecture comprising the physical world, virtual world, and communication space, and a mapping of the AI algorithms applied in each layer, which can aid researchers in choosing suitable methods in the field of autonomous vehicles. The study also provides a comprehensive map of related research projects from 1985 to 2022. Finally, some research directions are suggested.
Article
Full-text available
The last decade witnessed increasingly rapid progress in self‐driving vehicle technology, mainly backed up by advances in the area of deep learning and artificial intelligence (AI). The objective of this paper is to survey the current state‐of‐the‐art on deep learning technologies used in autonomous driving. We start by presenting AI‐based self‐driving architectures, convolutional and recurrent neural networks, as well as the deep reinforcement learning paradigm. These methodologies form a base for the surveyed driving scene perception, path planning, behavior arbitration, and motion control algorithms. We investigate both the modular perception‐planning‐action pipeline, where each module is built using deep learning methods, as well as End2End systems, which directly map sensory information to steering commands. Additionally, we tackle current challenges encountered in designing AI architectures for autonomous driving, such as their safety, training data sources, and computational hardware. The comparison presented in this survey helps gain insight into the strengths and limitations of deep learning and AI approaches for autonomous driving and assist with design choices.
Article
Full-text available
This paper presents a systematic review of the perception systems and simulators for autonomous vehicles (AV). This work has been divided into three parts. In the first part, perception systems are categorized as environment perception systems and positioning estimation systems. The paper presents the physical fundamentals, operating principles, and electromagnetic spectrum of the most common sensors used in perception systems (ultrasonic, RADAR, LiDAR, cameras, IMU, GNSS, RTK, etc.). Furthermore, their strengths and weaknesses are shown, and the quantification of their features using spider charts allows proper selection of different sensors depending on 11 features. In the second part, the main elements to be taken into account in the simulation of a perception system of an AV are presented. For this purpose, the paper describes simulators for model-based development, the main game engines that can be used for simulation, simulators from the robotics field, and lastly simulators used specifically for AVs. Finally, the current state of regulations being applied in different countries around the world on issues concerning the implementation of autonomous vehicles is presented.
Chapter
Full-text available
For human drivers, having rear and side-view mirrors is vital for safe driving. They deliver a more complete view of what is happening around the car. Human drivers also heavily exploit their mental map for navigation. Nonetheless, several methods have been published that learn driving models with only a front-facing camera and without a route planner. This lack of information renders the self-driving task quite intractable. We investigate the problem in a more realistic setting, which consists of a surround-view camera system with eight cameras, a route planner, and a CAN bus reader. In particular, we develop a sensor setup that provides data for a 360-degree view of the area surrounding the vehicle, the driving route to the destination, and low-level driving maneuvers (e.g. steering angle and speed) by human drivers. With such a sensor setup we collect a new driving dataset, covering diverse driving scenarios and varying weather/illumination conditions. Finally, we learn a novel driving model by integrating information from the surround-view cameras and the route planner. Two route planners are exploited: (1) by representing the planned routes on OpenStreetMap as a stack of GPS coordinates, and (2) by rendering the planned routes on TomTom Go Mobile and recording the progression into a video. Our experiments show that: (1) 360-degree surround-view cameras help avoid failures made with a single front-view camera, in particular for city driving and intersection scenarios; and (2) route planners help the driving task significantly, especially for steering angle prediction. Code, data and more visual results will be made available at http://www.vision.ee.ethz.ch/~heckers/Drive360.
Article
Full-text available
For people, having a rear-view mirror and side-view mirrors is vital for safe driving. They deliver a better view of what happens around the car. Human drivers also heavily exploit their mental map for navigation. Nonetheless, several methods have been published that learn driving models with only a front-facing camera and without a route planner. This lack of information renders the self-driving task quite intractable. Hence, we investigate the problem in a more realistic setting, which consists of a surround-view camera system with eight cameras, a route planner, and a CAN bus reader. In particular, we develop a sensor setup that provides data for a 360-degree view of the area surrounding the vehicle, the driving route to the destination, and the low-level driving maneuvers (e.g. steering angle and speed) by human drivers. With such a sensor setup, we collect a new driving dataset, covering diverse driving scenarios and varying weather/illumination conditions. Finally, we learn a novel driving model by integrating information from the surround-view cameras and the route planner. Two route planners are exploited: one based on OpenStreetMap and the other on TomTom Maps. The route planners are exploited in two ways: 1) by representing the planned routes as a stack of GPS coordinates, and 2) by rendering the planned routes on a map and recording the progression into a video. Our experiments show that: 1) 360-degree surround-view cameras help avoid failures made with a single front-view camera for the driving task; and 2) a route planner helps the driving task significantly. We acknowledge that our method is not the best-ever driving model, but that is not our focus. Rather, it provides a strong basis for further academic research, especially on driving-relevant tasks that integrate information from street-view images and the planned driving routes. Code and data will be made available.
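As a rough illustration of the fusion such a driving model requires, the hypothetical PyTorch sketch below concatenates per-camera CNN features with a route-planner representation (a stack of GPS coordinates) to predict steering and speed; it is not the authors' published architecture, and all sizes are assumptions.

# Hypothetical fusion of surround-view camera features and a route-planner input.
import torch
import torch.nn as nn

class SurroundViewDriver(nn.Module):
    def __init__(self, num_cameras=8, route_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(                    # shared per-camera encoder
            nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.route_mlp = nn.Sequential(nn.Linear(route_dim, 64), nn.ReLU())
        self.head = nn.Linear(num_cameras * 32 + 64, 2)  # outputs [steering, speed]

    def forward(self, images, route):
        # images: (B, num_cameras, 3, H, W); route: (B, route_dim) stacked GPS coords
        b = images.shape[0]
        cam_features = self.encoder(images.flatten(0, 1)).view(b, -1)
        return self.head(torch.cat([cam_features, self.route_mlp(route)], dim=1))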
Conference Paper
Full-text available
Steering a car through traffic is a complex task that is difficult to cast into algorithms. Therefore, researchers train artificial neural networks on front-facing camera data streams along with the associated steering angles. Nevertheless, most existing solutions consider only the visual camera frames as input, ignoring the temporal relationship between frames. In this work, we propose a Convolutional Long Short-Term Memory Recurrent Neural Network (C-LSTM), which is end-to-end trainable, to learn both the visual and the dynamic temporal dependencies of driving. Additionally, we pose the steering angle regression problem as classification while imposing a spatial relationship between the output layer neurons; this method is based on learning a sinusoidal function that encodes steering angles. To train and validate the proposed methods, we used the publicly available Comma.ai dataset. Our solution reduced the steering root mean square error by 35% relative to recent methods and improved steering stability by 87%.
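The sketch below shows one plausible way to pose steering regression as classification with a smooth target over neighboring angle bins; the exact sinusoidal encoding used in the cited C-LSTM work is not given here, so this particular formulation is an assumption made for illustration.

# Encode a steering angle as a soft distribution over angle bins, with a
# cosine-shaped bump so that spatially neighboring output neurons share mass.
import numpy as np

def encode_steering(angle, num_bins=181, angle_range=(-90.0, 90.0), width=10.0):
    """Soft label vector peaked at the bin closest to `angle` (degrees)."""
    centers = np.linspace(angle_range[0], angle_range[1], num_bins)
    d = np.clip(np.abs(centers - angle), 0.0, width)
    target = np.cos(np.pi * d / (2.0 * width))   # 1 at the true angle, 0 beyond `width`
    return target / target.sum()                 # normalize to a distribution

def decode_steering(probabilities, angle_range=(-90.0, 90.0)):
    """Expected angle under a predicted bin distribution."""
    centers = np.linspace(angle_range[0], angle_range[1], probabilities.shape[-1])
    return float(np.dot(probabilities, centers))

print(decode_steering(encode_steering(12.5)))   # close to 12.5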
Article
Throughout the last century, the automobile industry achieved remarkable milestones in manufacturing reliable, safe, and affordable vehicles. Because of significant recent advances in computation and communication technologies, autonomous cars are becoming a reality. Already autonomous car prototype models have covered millions of miles in test driving. Leading technical companies and car manufacturers have invested a staggering amount of resources in autonomous car technology, as they prepare for autonomous cars’ full commercialization in the coming years. However, to achieve this goal, several technical and non-technical issues remain: software complexity, real-time data analytics, and testing and verification are among the greater technical challenges; and consumer stimulation, insurance management, and ethical/moral concerns rank high among the non-technical issues. Tackling these challenges requires thoughtful solutions that satisfy consumers, industry, and governmental requirements, regulations, and policies. Thus, here we present a comprehensive review of state-of-the-art results for autonomous car technology. We discuss current issues that hinder autonomous cars’ development and deployment on a large scale. We also highlight autonomous car applications that will benefit consumers and many other sectors. Finally, to enable cost-effective, safe, and efficient autonomous cars, we discuss several challenges that must be addressed (and provide helpful suggestions for adoption) by designers, implementers, policymakers, regulatory organizations, and car manufacturers.
Article
We introduce CARLA, an open-source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous urban driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely. The simulation platform supports flexible specification of sensor suites and environmental conditions. We use CARLA to study the performance of three approaches to autonomous driving: a classic modular pipeline, an end-to-end model trained via imitation learning, and an end-to-end model trained via reinforcement learning. The approaches are evaluated in controlled scenarios of increasing difficulty, and their performance is examined via metrics provided by CARLA, illustrating the platform's utility for autonomous driving research. The supplementary video can be viewed at https://youtu.be/Hp8Dz-Zek2E
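For readers who want to run comparable experiments, the minimal sketch below shows typical usage of the CARLA Python client (it assumes a CARLA server is already running on localhost:2000; API details vary across CARLA versions).

# Connect to a running CARLA server, spawn a vehicle, and send one control command.
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(5.0)
world = client.get_world()

blueprint = world.get_blueprint_library().filter("vehicle.*")[0]
spawn_point = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(blueprint, spawn_point)

try:
    # Simple open-loop command: moderate throttle, no steering, no braking.
    vehicle.apply_control(carla.VehicleControl(throttle=0.4, steer=0.0, brake=0.0))
finally:
    vehicle.destroy()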