... Since each sensor responds to signal interference differently, the combination of their encoded information allows for a clearer signal to be extracted. Recent research has demonstrated the efficacy of this combination for both terrestrial autonomous navigation [11] and dust de-filtering [12]. ...
... Figure 2 outlines the proposed architecture for ARC-LIGHT, demonstrating how multiple techniques may be combined. As lidar-camera fusion is an emerging technology, there are few established algorithmic conventions to compare to; the separate pre-processing pathways before CNN-based fusion are comparable to other proposed architectures for dust filtering [12], but the comparison to previous-iteration outputs is a unique processing stage that takes advantage of the known physics of the spacecraft descent geometry. CNNs are selected at present for their flexibility and simplicity of implementation, but future refinement of this algorithm, including the use of more specialized neural network architectures, will be informed by trade studies and mission requirements for computational cost and performance metrics. ...
... The camera data is fed into a CNN, which estimates the optical depth of the lofted regolith in different sectors of the image (e.g., a 16 × 16 grid) to accommodate spatial inhomogeneity of the PSI cloud, allowing for the lidar data from each region to be processed according to the local dust density. The division of the visual field into different sectors has demonstrated success in similar dust detection algorithms [12]. The optical depth τ is a dimensionless measure of light extinction, which is related to the number density of lofted regolith n, regolith cross-section σ, and light path length s: ...
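The equation itself is cut off in the snippet; the standard extinction relation implied by these definitions is τ = ∫ n σ ds, i.e., the optical depth is the integral of the product of particle number density and cross-section along the light path, so the transmitted signal is attenuated by a factor of e^(−τ).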
Safe and reliable lunar landings are crucial for future exploration of the Moon. The regolith ejected by a lander’s rocket exhaust plume represents a significant obstacle in achieving this goal. It prevents spacecraft from reliably utilizing their navigation sensors to monitor their trajectory and spot emerging surface hazards as they near the surface. As part of NASA’s 2024 Human Lander Challenge (HuLC), the team at the University of Michigan developed an innovative concept to help mitigate this issue. We developed and implemented a machine learning (ML)-based sensor fusion system, ARC-LIGHT, that integrates sensor data from the cameras, lidars, or radars that landers already carry but disable during the final landing phase. Using these data streams, ARC-LIGHT will remove erroneous signals and recover a useful detection of the surface features to then be used by the spacecraft to correct its descent profile. It also offers a layer of redundancy for other key sensors, like inertial measurement units. The feasibility of this technology was validated through development of a prototype algorithm, which was trained on data from a purpose-built testbed that simulates imaging through a dusty environment. Based on these findings, a development timeline, risk analysis, and budget for deploying ARC-LIGHT on a lunar landing were created.
Light detection and ranging (LiDAR) sensors can create high-quality scans of an environment. However, LiDAR point clouds are affected by harsh weather conditions since airborne particles are easily detected. In literature, conventional filtering and artificial intelligence (AI) filtering methods have been used to detect, and remove, airborne particles. In this paper, a convolutional neural network (CNN) model was used to classify airborne dust particles through a voxel-based approach. The CNN model was compared to several conventional filtering methods, where the results show that the CNN filter can achieve up to 5.39 % F1 score improvement when compared to the best conventional filter. All the filtering methods were tested in dynamic environments where the sensor was attached to a mobile platform, the environment had several moving obstacles, and there were multiple dust cloud sources.
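To make the voxel-based formulation concrete, the following is a minimal sketch (not the paper's model) of a small 3-D CNN that classifies an occupancy patch around each candidate voxel as dust or not dust; the patch size, channel counts, and layer choices are illustrative assumptions.

```python
# Minimal sketch of a voxel-patch dust classifier (illustrative, not the paper's model).
import torch
import torch.nn as nn

class VoxelDustCNN(nn.Module):
    def __init__(self, patch: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),                              # patch -> patch/2
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),                              # patch/2 -> patch/4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (patch // 4) ** 3, 64), nn.ReLU(),
            nn.Linear(64, 2),                             # logits: [not dust, dust]
        )

    def forward(self, x):
        # x: (B, 1, patch, patch, patch) occupancy grid built around a candidate voxel
        return self.classifier(self.features(x))

model = VoxelDustCNN()
logits = model(torch.zeros(4, 1, 16, 16, 16))             # -> shape (4, 2)
```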
Stress can be considered a mental/physiological reaction in conditions of high discomfort and challenging situations. The levels of stress can be reflected in both the physiological responses and speech signals of a person. Therefore, the study of the fusion of the two modalities is of great interest. To this end, public datasets are necessary so that the different proposed solutions can be compared. In this work, a publicly available multimodal dataset for stress detection is introduced, including physiological signals and speech cues data. The physiological signals include electrocardiograph (ECG), respiration (RSP), and inertial measurement unit (IMU) sensors equipped in a smart vest. A data collection protocol was introduced to receive physiological and audio data based on alterations between well-known stressors and relaxation moments. Five subjects participated in the data collection, where both their physiological and audio signals were recorded by utilizing the developed smart vest and audio recording application. In addition, an analysis of the data and a decision-level fusion scheme is proposed. The analysis of physiological signals includes a massive feature extraction along with various fusion and feature selection methods. The audio analysis comprises a state-of-the-art feature extraction fed to a classifier to predict stress levels. Results from the analysis of audio and physiological signals are fused at a decision level for the final stress level detection, utilizing a machine learning algorithm. The whole framework was also tested in a real-life pilot scenario of disaster management, where users were acting as first responders while their stress was monitored in real time.
This paper presents a theoretical comparison of early and late fusion methods. An initial discussion on the conditions to apply early or late (soft or hard) fusion is introduced. The analysis shows that, if large training sets are available, early fusion will be the best option. If training sets are limited, we must use late fusion, either soft or hard. In this latter case, the complications inherent in optimally estimating the fusion function could be avoided in exchange for lower performance. The paper also includes a comparative review of the state-of-the-art fusion methods with the following divisions: early sensor-level fusion; early feature-level fusion; late score-level fusion (late soft fusion); and late decision-level fusion (late hard fusion). The main strengths and weaknesses of the methods are discussed.
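The distinction between the fusion levels discussed above can be illustrated with a small, self-contained sketch; the synthetic data and logistic-regression base classifiers are assumptions for illustration only.

```python
# Illustrative early vs. late (soft/hard) fusion for two sensor feature sets Xa, Xb.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
Xa = rng.normal(size=(200, 5)) + y[:, None]          # modality A features
Xb = rng.normal(size=(200, 3)) + 0.5 * y[:, None]    # modality B features

# Early (feature-level) fusion: concatenate features, train one classifier.
early = LogisticRegression().fit(np.hstack([Xa, Xb]), y)

# Late fusion: train one classifier per modality.
clf_a = LogisticRegression().fit(Xa, y)
clf_b = LogisticRegression().fit(Xb, y)

# Late soft (score-level) fusion: combine posterior probabilities (here, an average).
p_soft = 0.5 * (clf_a.predict_proba(Xa)[:, 1] + clf_b.predict_proba(Xb)[:, 1])
y_soft = (p_soft > 0.5).astype(int)

# Late hard (decision-level) fusion: combine the hard labels (here, a simple OR rule).
y_hard = ((clf_a.predict(Xa) + clf_b.predict(Xb)) >= 1).astype(int)
```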
Dust storms are natural disasters that have a serious impact on various aspects of human life and physical infrastructure, particularly in urban areas causing health risks, reducing visibility, impairing the transportation sector, and interfering with communication systems. The ability to predict the movement patterns of dust storms is crucial for effective disaster prevention and management. By understanding how these phenomena travel, it is possible to identify the areas that are most at risk and take appropriate measures to mitigate their impact on urban environments. Deep learning methods have been demonstrated to be efficient tools for predicting moving processes while considering multiple geographic information sources. By developing a convolutional neural network (CNN) method, this study aimed to predict the pathway of dust storms that occur in arid regions in central and southern Asia. A total of 54 dust-storm events were extracted from the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) product to train the CNN model and evaluate the prediction results. In addition to dust-storm data (aerosol optical depth (AOD) data), geographic context information including relative humidity, surface air temperature, surface wind direction, surface skin temperature, and surface wind speed was considered. These features were chosen using the random forest feature importance method and had feature importance values of 0.2, 0.1, 0.06, 0.03, and 0.02, respectively. The results show that the CNN model can promisingly predict the dust-transport pathway, such that for the 6, 12, 18, and 24-h time steps, the overall accuracy values were 0.9746, 0.975, 0.9751, and 0.9699, respectively; the F1 score values were 0.7497, 0.7525, 0.7476, and 0.6769, respectively; and the values of the kappa coefficient were 0.7369, 0.74, 0.7351, and 0.6625, respectively.
Automated Driving Systems (ADS) open up a new domain for the automotive industry and offer new possibilities for future transportation with higher efficiency and comfortable experiences. However, perception and sensing for autonomous driving under adverse weather conditions have been the problem that keeps autonomous vehicles (AVs) from going to higher autonomy for a long time. This paper assesses the influences and challenges that weather brings to ADS sensors in a systematic way, and surveys the solutions against inclement weather conditions. State-of-the-art algorithms and deep learning methods on perception enhancement with regard to each kind of weather, weather status classification, and remote sensing are thoroughly reported. Sensor fusion solutions, weather conditions coverage in currently available datasets, simulators, and experimental facilities are categorized. Additionally, potential ADS sensor candidates and developing research directions such as V2X (Vehicle to Everything) technologies are discussed. By looking into all kinds of major weather problems, and reviewing both sensor and computer science solutions in recent years, this survey points out the main moving trends of adverse weather problems in perception and sensing, i.e., advanced sensor fusion and more sophisticated machine learning techniques; and also the limitations brought by emerging 1550 nm LiDARs. In general, this work contributes a holistic overview of the obstacles and directions of perception and sensing research development in terms of adverse weather conditions.
The most well-known research on driverless vehicles at the moment concerns connected autonomous vehicles (CAVs), which reflect the future path for the self-driving field. The development of CAVs is not only improving logistics operations but is also opening up new possibilities for the industry’s sustainable growth. In this review, we explore a cloud-controlled, wireless-network-based model of the cyber-physical aspect of the autonomous vehicle, coupled with unmanned aerial vehicles (UAVs). This model is Internet of Things (IoT)-managed and AI-based, with a blockchain-based security mechanism. Additionally, we focus on lateral control in autonomous driving, particularly the lane-change maneuver, taking social behavior into account. We briefly review Vehicle-to-Everything (V2X) communication, which is carried out by on-board sensors and a connected wireless medium that enhance the lane departure processes while retaining human driver behavior in obstacle avoidance.
Although the LiDAR sensor provides high-resolution point cloud data, its performance degrades when exposed to dust environments, which may cause a failure in perception for robotics applications. To address this issue, our study designed an intensity-based filter that can remove dust particles from LiDAR data in two steps. In the first step, it identifies potential points that are likely to be dust by using intensity information. The second step involves analyzing the point density around selected points and removing them if they do not meet the threshold criterion. To test the proposed filter, we collected experimental data sets in the presence of dust and manually labeled them. Using these data, the de-dusting performance of the designed filter was evaluated and compared to several types of conventional filters. The proposed filter outperforms the conventional ones, achieving the highest F1 score and removing dust without sacrificing the original surrounding data.
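The two-step logic described in this abstract (an intensity gate followed by a local point-density check) can be sketched as follows; the threshold values and search radius are illustrative assumptions, not the authors' tuned parameters.

```python
# Sketch of a two-step intensity + local-density dust filter for a LiDAR scan.
import numpy as np
from scipy.spatial import cKDTree

def dedust(points, intensity, int_thresh=20.0, radius=0.3, min_neighbors=5):
    """points: (N, 3) xyz, intensity: (N,). Returns a boolean keep-mask."""
    # Step 1: low-intensity returns are flagged as potential dust.
    candidate = intensity < int_thresh
    # Step 2: count neighbours within `radius`; sparse candidates are removed.
    tree = cKDTree(points)
    counts = np.asarray(tree.query_ball_point(points[candidate], r=radius,
                                              return_length=True))
    keep = np.ones(len(points), dtype=bool)
    keep[np.flatnonzero(candidate)[counts < min_neighbors]] = False
    return keep
```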
Autonomous driving technology relies on high-resolution, high-performance vision sensing technology. As the two most popular sensors in the industry, Lidar and stereo cameras play an important role in sensing, detection, control and planning decisions. At this stage, the Lidar can obtain more accurate depth map information, but as its resolution increases its cost rises sharply, so it cannot be applied at large scale in the market. At the same time, the depth detection scheme based on computer vision has high resolution, but its data accuracy is very low. In order to solve the practical application defects of the above two sensors, this paper proposes a new sensor fusion method for a stereo camera and low-resolution Lidar, which has high resolution, high performance and low cost. This paper adopts a new sensor design, multi-sensor calibration, classification and selection of Lidar point cloud features, large-scale and efficient stereo image matching and depth map calculation, and a method of filling missing information based on point cloud segmentation. To verify the effectiveness of the proposed method, high-resolution Lidar was used as ground truth for comparison. The results show that the fusion method can achieve an average accuracy improvement of 30% within a close range of 30 meters, while achieving 98% resolution. In addition, this paper presents a scheme for visualizing multi-sensor image fusion, and encapsulates five modules: multi-sensor calibration, large-scale stereo depth calculation, low-resolution Lidar simulation, sensor data fusion, and fusion image and error visualization, in order to facilitate future secondary development.
LiDAR sensors have played an important role in a variety of related applications due to their merits of providing high-resolution and accurate information about the environment. However, their detection performance significantly degrades under dusty conditions, thereby making the whole perception of the vehicles prone to failure. To deal with this problem, we designed a de-dust filter using a LIOR filtering technique that offers a viable method of eliminating dust particles from the measurement data. Experimental results confirm that the proposed method is robust in the face of dust particles by successfully removing them from the measured point cloud with good filtering accuracy while maintaining rich information about the environment.
The role of sensors such as cameras or LiDAR (Light Detection and Ranging) is crucial for the environmental awareness of self-driving cars. However, the data collected from these sensors are subject to distortions in extreme weather conditions such as fog, rain, and snow. This issue could lead to many safety problems while operating a self-driving vehicle. The purpose of this study is to analyze the effects of fog on the detection of objects in driving scenes and then to propose methods for improvement. Collecting and processing data in adverse weather conditions is often more difficult than in good weather conditions. Hence, a synthetic dataset that can simulate bad weather conditions is a good choice to validate a method, as it is simpler and more economical, before working with a real dataset. In this paper, we apply fog synthesis on the public KITTI dataset to generate the Multifog KITTI dataset for both images and point clouds. In terms of processing tasks, we test our previous 3D object detector based on LiDAR and camera, named the Sparse LiDAR and Stereo Fusion Network (SLS-Fusion), to see how it is affected by foggy weather conditions. We propose to train using both the original dataset and the augmented dataset to improve performance in foggy weather conditions while keeping good performance under normal conditions. We conducted experiments on the KITTI and the proposed Multifog KITTI datasets which show that, before any improvement, performance is reduced by 42.67% in 3D object detection for Moderate objects in foggy weather conditions. By using a specific training strategy, the results improved significantly by 26.72%, while performance on the original dataset dropped by only 8.23%. In summary, fog often causes the failure of 3D detection on driving scenes. By additional training with the augmented dataset, we significantly improve the performance of the proposed 3D object detection algorithm for self-driving cars in foggy weather conditions.
Autonomous vehicles (AVs) rely on various types of sensor technologies to perceive the environment and to make logical decisions based on the gathered information similar to humans. Under ideal operating conditions, the perception systems (sensors onboard AVs) provide enough information to enable autonomous transportation and mobility. In practice, there are still several challenges that can impede the AV sensors’ operability and, in turn, degrade their performance under more realistic conditions that actually occur in the physical world. This paper specifically addresses the effects of different weather conditions (precipitation, fog, lightning, etc.) on the perception systems of AVs. In this work, the most common types of AV sensors and communication modules are included, namely: RADAR, LiDAR, ultrasonic, camera, and global navigation satellite system (GNSS). A comprehensive overview of their physical fundamentals, electromagnetic spectrum, and principle of operation is used to quantify the effects of various weather conditions on the performance of the selected AV sensors. This quantification will lead to several advantages in the simulation world by creating more realistic scenarios and by properly fusing responses from AV sensors in any object identification model used in AVs in the physical world. Moreover, it will assist in selecting the appropriate fading or attenuation models to be used in any X-in-the-loop (XIL, e.g., hardware-in-the-loop, software-in-the-loop, etc.) type of experiments to test and validate the manner AVs perceive the surrounding environment under certain conditions.
Visible camera-based semantic segmentation and semantic forecasting are important perception tasks in autonomous driving. In semantic segmentation, the current frame's pixel-level labels are estimated using the current visible frame. In semantic forecasting, the future frame's pixel-level labels are predicted using the current and the past visible frames and pixel-level labels. While reporting state-of-the-art accuracy, both of these tasks are limited by the visible camera's susceptibility to varying illumination, adverse weather conditions, sunlight and headlight glare etc. In this work, we propose to address these limitations using the deep sensor fusion of the visible and the thermal camera. The proposed sensor fusion framework performs both semantic forecasting as well as an optimal semantic segmentation within a multi-step iterative framework. In the first or forecasting step, the framework predicts the semantic map for the next frame. The predicted semantic map is updated in the second step, when the next visible and thermal frame is observed. The updated semantic map is considered as the optimal semantic map for the given visible-thermal frame. The semantic map forecasting and updating are iteratively performed over time. The estimated semantic maps contain the pedestrian behavior, the free space and the pedestrian crossing labels. The pedestrian behavior is categorized based on their spatial, motion and dynamic orientation information. The proposed framework is validated using the public KAIST dataset. A detailed comparative analysis and ablation study is performed using pixel-level classification and IOU error metrics. The results show that the proposed framework can not only accurately forecast the semantic segmentation map but also accurately update them.
In road environments, real-time knowledge of local weather conditions is an essential prerequisite for addressing the twin challenges of enhancing road safety and avoiding congestion. Currently, the main means of quantifying weather conditions along a road network requires the installation of meteorological stations. Such stations are costly and must be maintained; however, large numbers of cameras are already installed on the roadside. A new artificial intelligence method that uses road traffic cameras and a convolutional neural network to detect weather conditions has, therefore, been proposed. It addresses a clearly defined set of constraints relating to the ability to operate in real time and to classify the full spectrum of meteorological conditions and order them according to their intensity. The method can differentiate between five weather conditions: normal (no precipitation), heavy rain, light rain, heavy fog and light fog. The deep-learning method’s training and testing phases were conducted using a new database called the Cerema-AWH (Adverse Weather Highway) database. After several optimisation steps, the proposed method obtained a classification accuracy of 0.99.
In the last few years, the deep learning (DL) computing paradigm has been deemed the gold standard in the machine learning (ML) community. Moreover, it has gradually become the most widely used computational approach in the field of ML, achieving outstanding results on several complex cognitive tasks, matching or even beating human performance. One of the benefits of DL is the ability to learn from massive amounts of data. The DL field has grown fast in the last few years and has been extensively used to successfully address a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, e.g., cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others. Although several works have reviewed the state of the art in DL, each of them has tackled only one aspect of it, which leads to an overall lack of knowledge about the field. Therefore, in this contribution, we propose a more holistic approach in order to provide a more suitable starting point from which to develop a full understanding of DL. Specifically, this review attempts to provide a more comprehensive survey of the most important aspects of DL, including the enhancements recently added to the field. In particular, this paper outlines the importance of DL and presents the types of DL techniques and networks. It then presents convolutional neural networks (CNNs), which are the most utilized DL network type, and describes the development of CNN architectures together with their main features, starting with the AlexNet network and closing with the High-Resolution Network (HRNet). Finally, we present the challenges and suggested solutions to help researchers understand the existing research gaps. This is followed by a list of the major DL applications. Computational tools including FPGAs, GPUs, and CPUs are summarized, along with a description of their influence on DL. The paper ends with the evolution matrix, benchmark datasets, and a summary and conclusion.
With the significant advancement of sensor and communication technology and the reliable application of obstacle detection techniques and algorithms, automated driving is becoming a pivotal technology that can revolutionize the future of transportation and mobility. Sensors are fundamental to the perception of vehicle surroundings in an automated driving system, and the use and performance of multiple integrated sensors can directly determine the safety and feasibility of automated driving vehicles. Sensor calibration is the foundation block of any autonomous system and its constituent sensors and must be performed correctly before sensor fusion and obstacle detection processes may be implemented. This paper evaluates the capabilities and the technical performance of sensors which are commonly employed in autonomous vehicles, primarily focusing on a large selection of vision cameras, LiDAR sensors, and radar sensors and the various conditions in which such sensors may operate in practice. We present an overview of the three primary categories of sensor calibration and review existing open-source calibration packages for multi-sensor calibration and their compatibility with numerous commercial sensors. We also summarize the three main approaches to sensor fusion and review current state-of-the-art multi-sensor fusion techniques and algorithms for object detection in autonomous driving applications. The current paper, therefore, provides an end-to-end review of the hardware and software methods required for sensor fusion object detection. We conclude by highlighting some of the challenges in the sensor fusion field and propose possible future research directions for automated driving systems.
LiDAR sensors have the advantage of being able to generate high-resolution imaging quickly during both day and night; however, their performance is severely limited in adverse weather conditions such as snow, rain, and dense fog. Consequently, many researchers are actively working to overcome these limitations by applying sensor fusion with radar and optical cameras to LiDAR. While studies on the denoising of point clouds acquired by LiDAR in adverse weather have been conducted recently, the results are still insufficient for application to autonomous vehicles because of speed and accuracy performance limitations. Therefore, we propose a new intensity-based filter that differs from the existing distance-based filter, which limits the speed. The proposed method showed overwhelming performance advantages in terms of both speed and accuracy by removing only snow particles while leaving important environmental features. The intensity criteria for snow removal were derived based on an analysis of the properties of laser light and snow particles.
Recently, the advancement of deep learning (DL) in discriminative feature learning from 3-D LiDAR data has led to rapid development in the field of autonomous driving. However, the automated processing of uneven, unstructured, noisy, and massive 3-D point clouds is a challenging and tedious task. In this article, we provide a systematic review of existing compelling DL architectures applied to LiDAR point clouds, detailing specific tasks in autonomous driving such as segmentation, detection, and classification. Although several published research articles focus on specific topics in computer vision for autonomous vehicles, to date, no general survey on DL applied to LiDAR point clouds for autonomous vehicles exists. Thus, the goal of this article is to narrow the gap in this topic. More than 140 key contributions from the recent five years are summarized in this survey, including the milestone 3-D deep architectures, the remarkable DL applications in 3-D semantic segmentation, object detection, and classification; specific datasets, evaluation metrics, and the state-of-the-art performance. Finally, we conclude with the remaining challenges and future research.
Road weather conditions are closely related to transportation safety and traffic capacity. With the development of road surveillance systems, weather conditions can be recognized from video; however, they are hard for a machine to detect. To address this, a deeply supervised convolutional neural network (DS-CNN) is designed and trained on a self-established dataset. The traffic image dataset includes five groups labeled “sunny”, “overcast”, “rainy”, “snowy” and “foggy”, each with more than 2500 manually labeled and selected images. The DS-CNN achieves a precision rate of 0.9681 and a recall rate of 0.9681. This practical weather detection method has been built into a surveillance system covering five freeways. The first detection results obtained in practice were much worse, owing to disturbances from different scenarios, worn cameras, transmission failures, and so on. With further improvement of the DS-CNN, we found that it was much more effective than hand-crafted features for this task, and that a deeper neural architecture could derive more powerful features. Moreover, the results show that dense weather information has more details at a small scale. To rapidly report regional weather detection results, a visualization method in the spatiotemporal dimension is also proposed to fuse with the currently used system. The high accuracy and fast detection speed, together with friendly visualization, lead to more precise traffic management and promote road weather traffic control toward more intelligent levels.
With the significant development of deep learning for practical use, and with the ultra-high-speed transmission rates of 5G communication technology set to overcome the data transmission barrier on the Internet of Vehicles, automated driving is becoming a pivotal technology affecting the future industry. Sensors are the key to perceiving the outside world in an automated driving system, and their cooperative performance directly determines the safety of automated driving vehicles. In this survey, we mainly discuss the different strategies of multi-sensor fusion in automated driving in recent years. The performance of conventional sensors and the necessity of multi-sensor fusion are analyzed, including radar, LiDAR, camera, ultrasonic, GPS, IMU, and V2X. According to the differences in the latest studies, we divide the fusion strategies into four categories and point out some shortcomings. Sensor fusion is mainly applied for multi-target tracking and environment reconstruction. We discuss the methods of establishing a motion model and data association in multi-target tracking. At the end of the paper, we analyze the deficiencies in the current studies and put forward some suggestions for further improvement in the future. Through this investigation, we hope to analyze the current situation of multi-sensor fusion in the automated driving process and provide more efficient and reliable fusion strategies.
In this paper, we present a short overview of the sensors and sensor fusion in autonomous vehicles. We focus on the fusion of the key sensors in autonomous vehicles: camera, radar, and lidar. The current state of the art in this area is presented, such as a 3D object detection method leveraging both image and 3D point cloud information, a moving object detection and tracking system, and occupancy grid mapping used for navigation and localization in dynamic environments. It is shown that including more sensors in the sensor fusion system yields better performance and a more robust solution. Moreover, the use of camera data in localization and mapping, which is traditionally solved with radar and lidar data, improves the perceived model of the environment. Sensor fusion plays a crucial role in autonomous systems overall; therefore, this is one of the fastest developing areas in the autonomous vehicles domain.
Autonomous driving is expected to revolutionize road traffic by attenuating current externalities, especially accidents and congestion. Carmakers, researchers and administrations have been working on autonomous driving for years and significant progress has been made. However, the doubts and challenges to overcome are still huge, as the implementation of an autonomous driving environment encompasses not only complex automotive technology, but also human behavior, ethics, traffic management strategies, policies, liability, etc. As a result, carmakers do not expect to commercially launch fully driverless vehicles in the short term. From the technical perspective, the unequivocal detection of obstacles at high speeds and long distances is one of the greatest difficulties to face. Regarding traffic management strategies, all approaches share the vision that vehicles should behave cooperatively. General V2V cooperation and platooning are options being discussed, both with multiple variants. Various strategies, built from different standpoints, are being designed and validated using simulation. Besides, legal issues have already arisen in the context of highly automated driving. They range from the need for special driving licenses to much more intricate topics like liability in the event of an accident or privacy issues. All these legal and ethical concerns could hinder the spread of autonomous vehicles once they are technologically feasible. This paper provides an overview of the current state of the art in the key aspects of autonomous driving. Based on the information received in situ from top research centers in the field and on a literature review, the authors highlight the most important advances and findings reached so far, discuss different approaches regarding autonomous traffic and propose a framework for future research.
This paper describes the behavior of a commercial light detection and ranging (LiDAR) sensor in the presence of dust. This work is motivated by the need to develop perception systems that must operate where dust is present. This paper shows that the behavior of measurements from the sensor is systematic and predictable. LiDAR sensors exhibit four behaviors that are articulated and understood from the perspective of the shape-of-return signals from emitted light pulses. We subject the commercial sensor to a series of tests that measure the return pulses and show that they are consistent with theoretical predictions of behavior. Several important conclusions emerge: (i) where LiDAR measures dust, it does so to the leading edge of a dust cloud rather than as a random noise; (ii) dust starts to affect measurements when the atmospheric transmittance is less than 71%–74%, but this is quite variable with conditions; (iii) LiDAR is capable of ranging to a target in dust clouds with transmittance as low as 2% if the target is retroreflective and 6% if it is of low reflectivity; (iv) the effects of airborne particulates such as dust are less evident in the far field. The significance of this paper lies in providing insight into how better to use measurements from off-the-shelf LiDAR sensors in solving perception problems.
Three-dimensional (3D) point cloud data in fog-filled environments were measured using light detection and ranging (LIDAR). Disaster response robots cannot easily navigate through such environments because these data contain false points and distance errors caused by fog. We propose a method for recognizing and removing fog based on 3D point cloud features and a distance correction method for reducing measurement errors. Laser intensity and geometrical features are used to recognize false data. However, these features are not sufficient to measure a 3D point cloud in fog-filled environments with 6 and 2 m visibility, as misjudgments occur. To reduce misjudgment, laser beam penetration features were added. Support vector machine (SVM) and K-nearest neighbor (KNN) classifiers are used to classify point cloud data into ‘fog’ and ‘objects.’ We evaluated our method in heavy fog (6 and 2 m visibility). SVM has a better F-measure than KNN; it is higher than 90% in heavy fog (6 and 2 m visibility). The distance error correction method reduces distance errors in 3D point cloud data by a maximum of 4.6%. A 3D point cloud was successfully measured using LIDAR in a fog-filled environment. Our method’s recall (90.1%) and F-measure (79.4%) confirmed its robustness.
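As a rough illustration of the SVM route described above, the sketch below builds per-point features from intensity, local density, and range, and feeds them to an off-the-shelf classifier; the feature set is an assumption and not the paper's exact combination (which also includes laser-beam penetration features).

```python
# Sketch: per-point 'fog' vs 'object' classification from simple LiDAR features.
import numpy as np
from scipy.spatial import cKDTree
from sklearn.svm import SVC

def point_features(points, intensity, radius=0.5):
    """points: (N, 3) xyz in the sensor frame, intensity: (N,). Returns (N, 3) features."""
    tree = cKDTree(points)
    density = np.asarray(tree.query_ball_point(points, r=radius, return_length=True))
    range_to_sensor = np.linalg.norm(points, axis=1)
    return np.column_stack([intensity, density, range_to_sensor])

# Illustrative usage on labeled scans (0 = object, 1 = fog):
# clf = SVC(kernel="rbf").fit(point_features(train_pts, train_intensity), train_labels)
# fog_mask = clf.predict(point_features(scan_pts, scan_intensity)) == 1
```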
Reliable detection and tracking of surrounding objects are indispensable for comprehensive motion prediction and planning of autonomous vehicles. Due to the limitations of individual sensors, the fusion of multiple sensor modalities is required to improve the overall detection capabilities. Additionally, robust motion tracking is essential for reducing the effect of sensor noise and improving state estimation accuracy. The reliability of the autonomous vehicle software becomes even more relevant in complex, adversarial high-speed scenarios at the vehicle handling limits in autonomous racing. In this paper, we present a modular multi-modal sensor fusion and tracking method for high-speed applications. The method is based on the Extended Kalman Filter (EKF) and is capable of fusing heterogeneous detection inputs to track surrounding objects consistently. A novel delay compensation approach makes it possible to reduce the influence of the perception software latency and to output an updated object list. It is the first fusion and tracking method validated in high-speed real-world scenarios at the Indy Autonomous Challenge 2021 and the Autonomous Challenge at CES (AC@CES) 2022, proving its robustness and computational efficiency on embedded systems. It does not require any labeled data and achieves position tracking residuals below 0.1 m. The related code is available as open-source software at https://github.com/TUMFTM/FusionTracking.
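For readers unfamiliar with the filter at the core of this method, the following is a minimal constant-velocity predict/update sketch in the spirit of the EKF-based tracking described above; linear models are shown for brevity, and the matrices are illustrative assumptions, not the authors' tuning.

```python
# Minimal constant-velocity Kalman predict/update (the EKF generalizes this with Jacobians).
import numpy as np

def predict(x, P, dt, q=1.0):
    """x: state [px, py, vx, vy], P: 4x4 covariance."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)
    Q = q * np.diag([dt**4 / 4, dt**4 / 4, dt**2, dt**2])   # simple process noise
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, r=0.5):
    """z: position measurement [px, py] from one detection source."""
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)
    R = r * np.eye(2)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Heterogeneous detections (LiDAR, radar, camera) would each call `update` with their
# own H and R; delayed measurements could be handled by re-predicting from a buffered
# past state, in line with the delay compensation idea mentioned above.
```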
Significant advancements in automated driving technology and its deployment have noticeably increased the safety of humans in transportation, industry, and the construction field over the last several decades. As a result, the automated driving technology of autonomous excavators has become a hot topic among researchers. This study provides the findings of a systematic review of the literature on automated driving and working systems of autonomous excavators published in the last two decades. This paper divides the autonomous system into five groups, namely sensing, perception, decision, planning, and action, and presents research gaps that are not yet well studied for each group. Furthermore, the review of publications presented in this paper is conducted with the aim of highlighting key challenges and contributions of the previous and ongoing research work in this field. Finally, the paper concludes and provides a direction for future work in this area.
Recent advances in robotics and deep learning demonstrate promising 3-D perception performances via fusing the light detection and ranging (LiDAR) sensor and camera data, where both spatial calibration and temporal synchronization are generally required. While the LiDAR–camera calibration problem has been actively studied during the past few years, LiDAR–camera synchronization has been less studied and mainly addressed by employing a conventional pipeline consisting of clock synchronization and temporal synchronization. The conventional pipeline has certain potential limitations, which have not been sufficiently addressed and could be a bottleneck for the potential wide adoption of low-cost LiDAR–camera platforms. Different from the conventional pipeline, in this article, we propose LiCaS3, the first deep-learning-based framework for the LiDAR–camera synchronization task via self-supervised learning. The proposed LiCaS3 does not require hardware synchronization or extra annotations and can be deployed both online and offline. Evaluated on both the KITTI and Newer College datasets, the proposed method shows promising performances. The code will be publicly available at https://github.com/KleinYuan/LiCaS3.
Quantitative identification of dust storm weather is key to forecasting and early warning of dust storm disasters. However, traditional visibility ground-based measurements cannot be extended to regional observations. Remote sensing of dust storms is associated with large uncertainties in dust thresholds. For accurate quantification of the dust storm region, this study proposes a dust storm mask algorithm to identify dust storms in the Tarim Basin. The dust storm mask includes two algorithms to identify the dust storm outbreak and the spatial extent by using the Advanced Geostationary Radiation Imager (AGRI) on board the FY-4A satellite. A deep learning convolutional neural network (CNN) is employed for the dust storm mask, and the AGRI bands 1-3, 5-6, and 11-13 are used as model parameters. A physical algorithm (PA) is adopted to construct a dust storm mask using three physical dust indices: the Normalized Difference Dust Index (NDDI, (2.25 μm − 0.47 μm)/(2.25 μm + 0.47 μm)), the Dust Ratio Index (DRI, 7.10 μm/3.75 μm), and the Brightness Temperature Difference (BTD, 3.75 μm − 13.50 μm). The results show that the CNN algorithm has a higher classification accuracy on dust storm detection compared to the PA. This advantage suggests that the CNN can effectively monitor large-scale dust storms. The dust storm identification results were compared and analyzed with the AGRI true-color images, Aerosol Optical Depth products, and Ultra Violet Aerosol Index products.
Autonomous mobile systems such as vehicles or robots are equipped with multiple sensor modalities including Lidar, RGB cameras, and Radar. The fusion of multi-modal information can enhance task accuracy, but indiscriminate sensing and fusion across all modalities increases demand on available system resources. This paper presents a task-driven approach to input fusion that minimizes utilization of resource-heavy sensors and demonstrates its application to Visual-Lidar fusion for object tracking and path planning. The proposed spatiotemporal sampling algorithm activates Lidar only at regions of interest identified by analyzing visual input and reduces the Lidar ‘base frame rate’ according to the kinematic state of the system. This significantly reduces Lidar usage, in terms of data sensed/transferred and potentially power consumed, without severe reduction in performance compared to both a baseline decision-level fusion and state-of-the-art deep multi-modal fusion.
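A hedged sketch of the scheduling idea described above, with hypothetical rates and thresholds:

```python
# Sketch of task-driven spatiotemporal LiDAR sampling: scan only at camera-derived
# regions of interest, and lower the base frame rate when the platform moves slowly.
def lidar_schedule(rois, speed_mps, t_since_last_scan,
                   fast_rate_hz=10.0, slow_rate_hz=2.0, speed_thresh=5.0):
    """rois: list of camera-detected regions of interest. Returns a scan request or None."""
    base_rate = fast_rate_hz if speed_mps > speed_thresh else slow_rate_hz
    base_due = t_since_last_scan >= 1.0 / base_rate
    if not rois and not base_due:
        return None                                  # keep the LiDAR idle this cycle
    if rois:
        return {"mode": "roi", "regions": rois}      # scan only around detections
    return {"mode": "full"}                          # periodic full sweep at the base rate
```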
Due to the great achievements in artificial intelligence, it is predicted that autonomous vehicles with little or even no human involvement will come to market in the near future. Autonomous vehicles are equipped with multiple types of sensors. An autonomous vehicle relies on its sensors to perceive its environment, and this sensory information plays a key role in the vehicle's driving decisions. Hence, ensuring the trustworthiness of the sensor data is crucial for drivers' safety. In this paper, we discuss the impact of perception error attacks (PEAs) on autonomous vehicles, and propose a countermeasure called LIFE (LIDAR and Image data Fusion for detecting perception Errors). LIFE detects PEAs by analyzing the consistency between camera image data and LIDAR data using novel machine learning and computer vision algorithms. The performance of LIFE has been evaluated extensively using the KITTI dataset.
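A minimal sketch of the kind of camera/LiDAR consistency check that LIFE relies on, assuming a known projection matrix and an illustrative point-count threshold (the actual LIFE algorithms are more involved):

```python
# Sketch: flag a camera-detected bounding box as suspicious if too few LiDAR
# returns project into it, hinting at an inconsistency between the two sensors.
import numpy as np

def box_supported_by_lidar(points_xyz, P, box, min_points=10):
    """points_xyz: (N, 3) in the camera frame; P: (3, 4) projection; box: (u1, v1, u2, v2)."""
    pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    uvw = pts_h @ P.T
    in_front = uvw[:, 2] > 0                          # keep points in front of the camera
    uv = uvw[in_front, :2] / uvw[in_front, 2:3]       # pixel coordinates
    u1, v1, u2, v2 = box
    inside = (uv[:, 0] >= u1) & (uv[:, 0] <= u2) & (uv[:, 1] >= v1) & (uv[:, 1] <= v2)
    return int(inside.sum()) >= min_points            # False -> possible perception error
```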
Radar and camera information fusion sensing methods are used to overcome the inherent shortcomings of a single sensor in severe weather. Our fusion scheme uses radar as the main sensor and the camera as the auxiliary sensor. The Mahalanobis distance is used to match the observed values of the target sequence, and data fusion is performed with a joint probability function method. Moreover, the algorithm was tested using actual sensor data collected from a vehicle, performing real-time environment perception. The test results show that the radar and camera fusion algorithm performs better than single-sensor environmental perception in severe weather, which can effectively reduce the missed detection rate of autonomous vehicle environment perception in severe weather. The fusion algorithm improves the robustness of the environment perception system and provides accurate environment perception information for the decision-making and control systems of autonomous vehicles.
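The Mahalanobis-distance matching step mentioned above can be sketched as a simple gated nearest-neighbour association; the gate value (the chi-square 95% point for two degrees of freedom) is an illustrative assumption:

```python
# Sketch: associate a predicted radar track with the closest camera detection
# in the Mahalanobis sense, rejecting matches outside a chi-square gate.
import numpy as np

def associate(track_pos, S, detections, gate=5.99):
    """track_pos: predicted 2-D position; S: 2x2 innovation covariance;
    detections: list of 2-D measurements. Returns the best detection index or None."""
    S_inv = np.linalg.inv(S)
    best, best_d2 = None, gate
    for j, z in enumerate(detections):
        innov = np.asarray(z, float) - np.asarray(track_pos, float)
        d2 = float(innov @ S_inv @ innov)             # squared Mahalanobis distance
        if d2 < best_d2:
            best, best_d2 = j, d2
    return best
```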
Recently, a significant amount of research has been investigated on interpretation of deep neural networks (DNNs) which are normally processed as black box models. Among the methods that have been developed, local interpretation methods stand out which have the features of clear expression in interpretation and low computation complexity. Different from existing surveys which cover a broad range of methods on interpretation of DNNs, this survey focuses on local interpretation methods with an in-depth analysis of the representative works including the newly proposed approaches. From the perspective of principles, we first divide local interpretation methods into two main categories: model-driven methods and data-driven methods. Then we make a fine-grained distinction between the two types of these methods, and highlight the latest ideas and principles. We further demonstrate the effects of a number of interpretation methods by reproducing the results through open source software plugins. Finally, we point out research directions in this rapidly evolving field.
We tackle the problem of exploiting Radar for perception in the context of self-driving as Radar provides complementary information to other sensors such as LiDAR or cameras in the form of Doppler velocity. The main challenges of using Radar are the noise and measurement ambiguities which have been a struggle for existing simple input or output fusion methods. To better address this, we propose a new solution that exploits both LiDAR and Radar sensors for perception. Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion, which learn from data to exploit both geometric and dynamic information of Radar data. RadarNet achieves state-of-the-art results on two large-scale real-world datasets in the tasks of object detection and velocity estimation. We further show that exploiting Radar improves the perception capabilities of detecting faraway objects and understanding the motion of dynamic objects.
Lidar sensors are frequently used in environment perception for autonomous vehicles and mobile robotics to complement camera, radar, and ultrasonic sensors. Adverse weather conditions significantly impact the performance of lidar-based scene understanding by causing undesired measurement points that in turn lead to missed detections and false positives. In heavy rain or dense fog, water drops could be misinterpreted as objects in front of the vehicle, which brings a mobile robot to a full stop. In this letter, we present the first CNN-based approach to understand and filter out such adverse weather effects in point cloud data. Using a large data set obtained in controlled weather environments, we demonstrate a significant performance improvement of our method over the state of the art involving geometric filtering. Data is available at https://github.com/rheinzler/PointCloudDeNoising.
We estimate weather information from single images, as an important clue to unveil real-world characteristics available in the cyberspace, and as a complementary feature to facilitate computer vision applications. Based on an image collection with geotags, we crawl the associated weather and elevation properties from the web. With this large-scale and rich image dataset, various correlations between weather properties and metadata are observed, and are used to construct computational models based on random forests to estimate weather information for any given image. We describe interesting statistics linking weather properties with human behaviors, and show that image’s weather information can potentially benefit computer vision tasks such as landmark classification. Overall, this work proposes a large-scale image dataset with rich weather properties, and provides comprehensive studies on using cameras as weather sensors.
Given a single outdoor image, we propose a collaborative learning approach using novel weather features to label the image as either sunny or cloudy. Though limited, this two-class classification problem is by no means trivial given the great variety of outdoor images captured by different cameras, where the images may have been edited after capture. Our overall weather feature combines the data-driven convolutional neural network (CNN) feature and well-chosen weather-specific features. They work collaboratively within a unified optimization framework that is aware of the presence (or absence) of a given weather cue during learning and classification. In this paper we propose a new data augmentation scheme to substantially enrich the training data, which is used to train a latent SVM framework to make our solution insensitive to global intensity transfer. Extensive experiments are performed to verify our method. Compared with our previous work and the sole use of a CNN classifier, this paper improves the accuracy by up to 7-8%. Our weather image dataset is available together with the executable of our classifier.
Multi-class weather classification is a fundamental and significant technique which has many potential applications, such as video surveillance and intelligent transportation. However, it is a challenging task due to the diversity of weather conditions and the lack of discriminative features. Most existing weather classification methods only consider two-class weather conditions such as sunny-rainy or sunny-cloudy weather. Moreover, they predominantly focus on a fixed scene such as a popular tourism site or traffic scenario. In this paper, we propose a novel method for scene-free multi-class weather classification from single images based on multiple category-specific dictionary learning and multiple kernel learning. To improve the discrimination of the image representation and enhance the performance of multi-class weather classification, our approach extracts multiple weather features and learns dictionaries based on these features. To select a good subset of features, we utilize a multiple kernel learning algorithm to learn an optimal linear combination of feature kernels. In addition, to evaluate the proposed approach, we collect an outdoor image set that contains 20 K images, called the MWI (Multi-class Weather Image) set. Experimental results show the effectiveness of the proposed method.
This paper describes a robust vision-based relative-localization approach for a moving target based on an RGB-depth (RGB-D) camera and sensor measurements from two-dimensional (2-D) light detection and ranging (LiDAR). With the proposed approach, a target's three-dimensional (3-D) and 2-D position information is measured with an RGB-D camera and LiDAR sensor, respectively, to find the location of a target by incorporating visual-tracking algorithms, depth information of the structured light sensor, and a low-level vision-LiDAR fusion algorithm, e.g., extrinsic calibration. To produce 2-D location measurements, both visual-and depth-tracking approaches are introduced, utilizing an adaptive color-based particle filter (ACPF) (for visual tracking) and an interacting multiple-model (IMM) estimator with intermittent observations from depth-image segmentation (for depth image tracking). The 2-D LiDAR data enhance location measurements by replacing results from both visual and depth tracking; through this procedure, multiple LiDAR location measurements for a target are generated. To deal with these multiple-location measurements, we propose a modified track-to-track fusion scheme. The proposed approach shows robust localization results, even when one of the trackers fails. The proposed approach was compared to position data from a Vicon motion-capture system as the ground truth. The results of this evaluation demonstrate the superiority and robustness of the proposed approach.
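For context, the textbook covariance-weighted track-to-track combination that the authors modify can be sketched as follows; this assumes independent track errors (no cross-covariance term) and is not the paper's modified scheme.

```python
# Sketch: fuse two location estimates (e.g., a vision track and a LiDAR track)
# by inverse-covariance weighting, the classical track-to-track fusion form
# under the independence assumption.
import numpy as np

def fuse_tracks(x1, P1, x2, P2):
    """x1, x2: state estimates; P1, P2: their covariances. Returns fused (x, P)."""
    P1i, P2i = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(P1i + P2i)                                  # fused covariance
    x = P @ (P1i @ np.asarray(x1, float) + P2i @ np.asarray(x2, float))  # fused estimate
    return x, P
```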