Article

Vision Transformer for Detecting Critical Situations And Extracting Functional Scenario for Automated Vehicle Safety Assessment

... Recently, deep learning techniques have seen significant advancements in various fields, as evidenced by numerous studies [32][33][34]. For network threat detection, in [35], anomalous traffic was detected using spatio-temporal information modeling based on the C-LSTM method. ...
Article
Full-text available
Using traditional methods based on detection rules written by human security experts presents significant challenges for the accurate detection of network threats, which are becoming increasingly sophisticated. In order to deal with the limitations of traditional methods, network threat detection techniques utilizing artificial intelligence technologies such as machine learning are being extensively studied. Research has also been conducted on analyzing various string patterns in network packet payloads through natural language processing techniques to detect attack intent. However, due to the nature of packet payloads that contain binary and text data, a new approach is needed that goes beyond typical natural language processing techniques. In this paper, we study a token extraction method optimized for payloads using n-gram and byte-pair encoding techniques. Furthermore, we generate embedding vectors that can understand the context of the packet payload using algorithms such as Word2Vec and FastText. We also compute the embedding of various header data associated with packets such as IP addresses and ports. Given these features, we combine a text 1D CNN and a multi-head attention network in a novel fashion. We validated the effectiveness of our classification technique on the CICIDS2017 open dataset and over half a million data collected by The Education Cyber Security Center (ECSC), currently operating in South Korea. The proposed model showed remarkable performance compared to previous studies, achieving highly accurate classification with an F1-score of 0.998. Our model can also preprocess and classify 150,000 network threats per minute, helping security agents in the field maximize their time and analyze more complex attack patterns.
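The pipeline described above (payload tokens embedded, then passed through a 1D CNN branch and a multi-head attention branch, and concatenated with header features) can be illustrated with a minimal PyTorch sketch. The layer sizes, vocabulary size, and header-feature dimension below are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of a payload classifier combining token embeddings,
# a 1D CNN branch, and multi-head attention (illustrative sizes only).
import torch
import torch.nn as nn

class PayloadClassifier(nn.Module):
    def __init__(self, vocab_size=8192, emb_dim=128, header_dim=16, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # 1D CNN branch over the token sequence
        self.conv = nn.Sequential(
            nn.Conv1d(emb_dim, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        # Multi-head self-attention branch
        self.attn = nn.MultiheadAttention(emb_dim, num_heads=4, batch_first=True)
        self.fc = nn.Linear(128 + emb_dim + header_dim, n_classes)

    def forward(self, tokens, header_feats):
        x = self.embed(tokens)                                 # (B, L, E)
        conv_out = self.conv(x.transpose(1, 2)).squeeze(-1)    # (B, 128)
        attn_out, _ = self.attn(x, x, x)                       # (B, L, E)
        attn_out = attn_out.mean(dim=1)                        # (B, E)
        fused = torch.cat([conv_out, attn_out, header_feats], dim=1)
        return self.fc(fused)

model = PayloadClassifier()
logits = model(torch.randint(1, 8192, (4, 256)), torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 2])
```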
... Artificial intelligence (AI) recently has advanced unprecedentedly with the introduction of some foundation models that were proven to be effective in various problem domains [2][3][4][5]. Some AI technologies have been employed for small-scale counterfeit examining systems [6][7][8]. However, the proposed neural networks still have several limitations in building an automated system for examining counterfeit copies at a large scale. ...
Preprint
Full-text available
This paper presents a two-stage hierarchical neural network using image classification and object detection algorithms as key building blocks for a system that automatically detects a potential design right infringement. This neural network is trained to return the Top-N original design right records that highly resemble the input image of a counterfeit. Design rights specify the unique aesthetic characteristics of a product. Due to the rapid change of trends, new design rights are continuously generated. This work proposes an Ensemble Neural Network (ENN), an artificial neural network model that aims to deal with a large amount of counterfeit data and design right records that are frequently added and deleted. First, we performed image classification and object detection learning per design right using existing models with a proven track record of high accuracy. The distributed models form the backbone of the ENN and yield intermediate results aggregated at a master neural network. This master neural network is a deep residual network paired with a fully connected network. This ensemble layer is trained to determine the sub-models that return the best result for a given input image of a product. In the final stage, the ENN model multiplies the inferred similarity coefficients with the weighted input vectors produced by the individual sub-models to assess the similarity between the test input image and the existing product design rights to see any sign of violation. Given 84 design rights and the sample product images taken meticulously under various conditions, our ENN model achieved average Top-1 and Top-3 accuracies of 98.409% and 99.460%, respectively. Upon introducing new design rights data, a partial update of the inference model was done an order of magnitude faster than the single model. ENN maintained a high level of accuracy as it scaled out to handle more design rights. Therefore, the ENN model is expected to offer practical help to the inspectors in the field, such as the customs at the border that deal with a swarm of products.
Article
Full-text available
This study aims to improve the accuracy of predicting the severity of traffic accidents by developing an innovative traffic accident risk prediction model—StackTrafficRiskPrediction. The model combines multidimensional data analysis including environmental factors, human factors, roadway characteristics, and accident-related meta-features. In the model comparison, the StackTrafficRiskPrediction model achieves an accuracy of 0.9613, 0.9069, and 0.7508 in predicting fatal, serious, and minor accidents, respectively, which significantly outperforms the traditional logistic regression model. In the experimental part, we analyzed the severity of traffic accidents under different age groups of drivers, driving experience, road conditions, light and weather conditions. The results showed that drivers between 31 and 50 years of age with 2 to 5 years of driving experience were more likely to be involved in serious crashes. In addition, it was found that drivers tend to adopt a more cautious driving style in poor road and weather conditions, which increases the margin of safety. In terms of model evaluation, the StackTrafficRiskPrediction model performs best in terms of accuracy, recall, and ROC–AUC values, but performs poorly in predicting small-sample categories. Our study also revealed limitations of the current methodology, such as the sample imbalance problem and the limitations of environmental and human factors in the study. Future research can overcome these limitations by collecting more diverse data, exploring a wider range of influencing factors, and applying more advanced data analysis techniques.
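A stacking ensemble of the kind described can be sketched with scikit-learn's StackingClassifier; the base learners, features, and synthetic data below are assumptions for illustration only, not the StackTrafficRiskPrediction configuration.

```python
# Minimal sketch of a stacking ensemble for accident-severity prediction
# (base learners and features are illustrative assumptions).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))        # environment, human, roadway, meta-features
y = rng.integers(0, 3, size=1000)     # 0 = minor, 1 = serious, 2 = fatal

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
stack.fit(X_tr, y_tr)
print("accuracy:", stack.score(X_te, y_te))
```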
Article
Full-text available
This paper presents a two-stage hierarchical neural network using image classification and object detection algorithms as key building blocks for a system that automatically detects a potential design right infringement. This neural network is trained to return the Top-N original design right records that highly resemble the input image of a counterfeit. This work proposes an ensemble neural network (ENN), an artificial neural network model that aims to deal with a large amount of counterfeit data and design right records that are frequently added and deleted. First, we performed image classification and object detection learning per design right using acclaimed existing models with high accuracy. The distributed models form the backbone of the ENN and yield intermediate results aggregated at a master neural network. This master neural network is a deep residual network paired with a fully connected network. This ensemble layer is trained to determine the sub-models that return the best result for a given input image of a product. In the final stage, the ENN model multiplies the inferred similarity coefficients with the weighted input vectors produced by the individual sub-models to assess the similarity between the test input image and the existing product design rights to see any sign of violation. Given 84 design rights and the sample product images taken meticulously under various conditions, our ENN model achieved average Top-1 and Top-3 accuracies of 98.409% and 99.460%, respectively. Upon introducing new design rights data, a partial update of the inference model was performed an order of magnitude faster than the single model. The ENN maintained a high level of accuracy as it was scaled out to handle more design rights. Therefore, the ENN model is expected to offer practical help to the inspectors in the field, such as customs at the border that deal with a swarm of products.
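A minimal sketch of the ensemble-layer idea shared by the two ENN entries above: a master network infers per-sub-model coefficients and multiplies them with the sub-models' similarity vectors before Top-N retrieval. The sub-model count and vector sizes are assumptions.

```python
# Sketch of an ensemble layer that weights sub-model similarity vectors
# and aggregates them for Top-N retrieval (sizes are illustrative).
import torch
import torch.nn as nn

class EnsembleLayer(nn.Module):
    def __init__(self, n_submodels=4, n_design_rights=84, hidden=256):
        super().__init__()
        # Master network scores how much to trust each sub-model for this input.
        self.master = nn.Sequential(
            nn.Linear(n_submodels * n_design_rights, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_submodels),
            nn.Softmax(dim=1),
        )

    def forward(self, sub_scores):
        # sub_scores: (B, n_submodels, n_design_rights) similarity vectors
        weights = self.master(sub_scores.flatten(1))             # (B, n_submodels)
        # Multiply inferred coefficients with sub-model vectors and sum.
        fused = (weights.unsqueeze(-1) * sub_scores).sum(dim=1)  # (B, n_design_rights)
        return fused

layer = EnsembleLayer()
fused = layer(torch.rand(2, 4, 84))
top3 = fused.topk(3, dim=1).indices   # Top-3 candidate design rights
print(top3)
```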
Article
Full-text available
A thorough safety assessment of an automated driving system (ADS) is necessary before its introduction into the market and practical application. Scenario-based assessments have received significant attention in research. However, identifying sufficient critical scenarios for ADSs is a major challenge, especially for complex urban environments with a variety of road geometries, traffic rules, and traffic participants. To identify the critical scenarios in this complex environment, it is essential to understand the environmental factors that lead to safety-critical events (e.g., accidents and near-miss incidents). Thus, this study proposes a method for identification of critical scenario components by analyzing near-miss incident data and extracting environmental factors that induce driver errors. In this study, we applied the proposed method to a scenario, in which an ego vehicle makes a right turn at a signalized intersection with an oncoming vehicle approaching the intersection in left-hand traffic, as a case study. The proposed method identified two components (dynamic occlusion caused by oncoming right-turn vehicles and change in traffic lights) that were both critical and challenging for ADSs. The case study results showed the usefulness of the identified components and the validity of the proposed method, which can extract critical scenario components explicitly.
Article
Full-text available
Dashcams are considered video sensors, and the number of dashcams installed in vehicles is increasing. Native dashcam video players can be used to view evidence during investigations, but these players are not accepted in court and cannot be used to extract metadata. Digital forensic tools, such as FTK, Autopsy and Encase, are specifically designed for functions and scripts and do not perform well in extracting metadata. Therefore, this paper proposes a dashcam forensics framework for extracting evidential text including time, date, speed, GPS coordinates and speed units using accurate optical character recognition methods. The framework also transcribes evidential speech related to lane departure and collision warning for enabling automatic analysis. The proposed framework associates the spatial and temporal evidential data with a map, enabling investigators to review the evidence along the vehicle’s trip. The framework was evaluated using real-life videos, and different optical character recognition (OCR) methods and speech-to-text conversion methods were tested. This paper identifies that Tesseract is the most accurate OCR method that can be used to extract text from dashcam videos. Also, the Google speech-to-text API is the most accurate, while Mozilla’s DeepSpeech is more acceptable because it works offline. The framework was compared with other digital forensic tools, such as Belkasoft, and the framework was found to be more effective as it allows automatic analysis of dashcam evidence and generates digital forensic reports associated with a map displaying the evidence along the trip.
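A hedged sketch of the OCR step: cropping the dashcam overlay region and extracting speed and GPS text with Tesseract, as the framework does. The crop region, preprocessing, and regular expressions are assumptions, not the paper's exact pipeline.

```python
# Sketch of extracting overlay text (date, time, speed, GPS) from a dashcam
# frame with Tesseract OCR; crop region and regex patterns are assumptions.
import re
import cv2
import pytesseract

frame = cv2.imread("dashcam_frame.png")            # hypothetical frame exported from the video
h, w = frame.shape[:2]
overlay = frame[int(0.92 * h):, :]                 # assume the text banner sits at the bottom
gray = cv2.cvtColor(overlay, cv2.COLOR_BGR2GRAY)
_, binarized = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

text = pytesseract.image_to_string(binarized)
speed = re.search(r"(\d{1,3})\s*(KM/H|MPH)", text, re.IGNORECASE)
gps = re.search(r"([NS])\s*([\d.]+)\s*([EW])\s*([\d.]+)", text)
print(text.strip())
print("speed:", speed.groups() if speed else None, "gps:", gps.groups() if gps else None)
```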
Article
Full-text available
Artificial intelligence plays a significant role in traffic-accident detection. Traffic accidents involve a cascade of inadvertent events, making traditional detection approaches challenging. For instance, Convolutional Neural Network (CNN)-based approaches cannot analyze temporal relationships among objects, and Recurrent Neural Network (RNN)-based approaches suffer from low processing speeds and cannot detect traffic accidents simultaneously across multiple frames. Furthermore, these networks dismiss background interference in input video frames. This paper proposes a framework that begins by subtracting the background based on You Only Look Once (YOLOv5), which adaptively reduces background interference when detecting objects. Subsequently, the CNN encoder and Transformer decoder are combined into an end-to-end model to extract the spatial and temporal features between different time points, allowing for a parallel analysis between input video frames. The proposed framework was evaluated on the Car Crash Dataset through a series of comparison and ablation experiments. Our framework was benchmarked against three accident-detection models to evaluate its effectiveness, and the proposed framework demonstrated a superior accuracy of approximately 96%. The results of the ablation experiments indicate that when background subtraction was not incorporated into the proposed framework, the values of all evaluation indicators decreased by approximately 3%.
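The encoder-decoder idea (per-frame CNN features analyzed jointly by a Transformer decoder) can be sketched as follows in PyTorch; the tiny backbone, query-token design, and sizes are assumptions rather than the paper's architecture.

```python
# Sketch of a frame-sequence accident classifier: a CNN encoder per frame
# feeds a Transformer decoder with a learned query token (sizes illustrative).
import torch
import torch.nn as nn

class AccidentDetector(nn.Module):
    def __init__(self, d_model=256, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(            # tiny CNN applied to each frame
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, d_model),
        )
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
        self.query = nn.Parameter(torch.randn(1, 1, d_model))
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, clips):                    # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.encoder(clips.flatten(0, 1)).view(b, t, -1)   # (B, T, d_model)
        out = self.decoder(self.query.expand(b, -1, -1), feats)    # (B, 1, d_model)
        return self.head(out.squeeze(1))

model = AccidentDetector()
print(model(torch.randn(2, 16, 3, 112, 112)).shape)  # torch.Size([2, 2])
```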
Article
Full-text available
Traffic closed-circuit television (CCTV) devices can be used to detect and track objects on roads by designing and applying artificial intelligence and deep learning models. However, extracting useful information from the detected objects and determining the occurrence of traffic accidents are usually difficult. This paper proposes a CCTV frame-based hybrid traffic accident classification model that enables the identification of whether a frame includes accidents by generating object trajectories. The proposed model utilizes a Vision Transformer (ViT) and a Convolutional Neural Network (CNN) to extract latent representations from each frame and corresponding trajectories. The fusion of frame and trajectory features was performed to improve the traffic accident classification ability of the proposed hybrid method. In the experiments, the Car Accident Detection and Prediction (CADP) dataset was used to train the hybrid model, and the accuracy of the model was approximately 97%. The experimental results indicate that the proposed hybrid method demonstrates an improved classification performance compared to traditional models.
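A minimal sketch of the frame/trajectory fusion step: a ViT frame embedding (assumed to be precomputed by any ViT backbone) is concatenated with trajectory features from a small 1D CNN before the accident/no-accident head. Dimensions and the trajectory encoding are assumptions.

```python
# Sketch of fusing a ViT frame embedding with object-trajectory features
# for binary accident classification (sizes are illustrative assumptions).
import torch
import torch.nn as nn

class FrameTrajectoryFusion(nn.Module):
    def __init__(self, vit_dim=768, traj_channels=4, n_classes=2):
        super().__init__()
        # traj_channels could be (x, y, w, h) of a tracked object per time step
        self.traj_net = nn.Sequential(
            nn.Conv1d(traj_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(vit_dim + 64, 256), nn.ReLU(), nn.Linear(256, n_classes),
        )

    def forward(self, vit_feat, trajectories):
        # vit_feat: (B, vit_dim) frame embedding; trajectories: (B, traj_channels, T)
        traj_feat = self.traj_net(trajectories)
        return self.classifier(torch.cat([vit_feat, traj_feat], dim=1))

model = FrameTrajectoryFusion()
print(model(torch.randn(2, 768), torch.randn(2, 4, 30)).shape)  # torch.Size([2, 2])
```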
Article
Full-text available
The controller design of vehicle systems depends on accurate reference index input. Considering information fusion and feature extraction based on existing data settings in the time domain, if reasonable input is selected for prediction to obtain accurate information about the future state, it is of great significance for control decision-making, system response, and the driver's active intervention. In this paper, the nonlinear dynamic model of the four-wheel steering vehicle system was built, and the Long Short-Term Memory (LSTM) network architecture was established. On this basis, according to the real-time data under different working conditions, the information correction calculation of variable time-domain length was carried out to obtain the real-time state input length. At the same time, the historical state data of coupled road information was adopted to train the LSTM network offline, and the acquired real-time data state satisfying the accuracy was used as the LSTM network input to carry out online prediction of future confidence information. To address the mixed-sensitivity problem of the system, a robust controller for vehicle active steering was designed with the target center-of-mass sideslip angle set to zero, and the predicted results were used as reference inputs for corresponding numerical calculation verification. Finally, according to the calculated results, the robust controller with information prediction can realize system stability control under coupling conditions on the premise of knowing the vehicle state information in advance, which provides an effective reference for controller response and the driver's active manipulation.
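The offline-trained LSTM that maps a window of past vehicle states to future states can be sketched as below; the state dimension, window length, and prediction horizon are assumptions.

```python
# Minimal sketch of an LSTM that predicts future vehicle states from a
# window of past states (state dimension and horizon are assumptions).
import torch
import torch.nn as nn

class StatePredictor(nn.Module):
    def __init__(self, state_dim=6, hidden=64, horizon=10):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, state_dim * horizon)
        self.state_dim, self.horizon = state_dim, horizon

    def forward(self, history):                  # history: (B, T, state_dim)
        _, (h_n, _) = self.lstm(history)
        out = self.head(h_n[-1])                 # last layer's final hidden state
        return out.view(-1, self.horizon, self.state_dim)

model = StatePredictor()
future = model(torch.randn(8, 50, 6))            # 50 past steps -> 10 future steps
print(future.shape)                              # torch.Size([8, 10, 6])
```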
Article
Full-text available
As a new form of mobility, automated vehicles (AVs), appears on the road, positive effects are expected, but unexpected adverse effects may arise from the mixed traffic situation with human-driven vehicles (HVs). Prior to the commercialization of AVs, a preliminary review and preventive measures are required, and among them, the interaction between existing vehicles and the new mobility and the interaction with the infrastructure must be considered. Therefore, we examine (i) the positive and negative effects of introducing AVs in a mixed traffic situation and (ii) the optimal operation plan for an AV-dedicated lane. First, the effect of introducing AVs, considering the interaction between vehicles in the mixed traffic situation, was mostly positive, with increased speed, reduced delay time, and increased capacity. However, in a 75% Market Penetration Rate (MPR) environment across all Levels of Service (LOS), the effect was diminished compared to the previous MPR. This is considered to result from conflicts caused by some HVs (including heavy vehicles) behaving as obstacles when most of the vehicles on the road are AVs. Based on this result, we deployed a dedicated lane to resolve the negative effect in the 75% MPR environment and proposed an optimal operation strategy for the AV-dedicated lane from the perspective of operational efficiency for a more feasible operation. Given the 75% MPR, the mixed-use operation strategy of High-Occupancy Vehicles (HOV) and AVs was found to be the most suitable operation strategy. This implies that even in the era of AVs, the influence of other vehicles (e.g., heavy vehicles, other mobility) must be considered. This study is significant in that it considers the negative effects of introducing AVs and presents an optimal operation strategy for dedicated lanes; it is expected to serve as a new strategy as part of the Free/Expressway Traffic Management System (FTMS) applicable in the era of autonomous driving.
Article
Full-text available
Currently, the stage of technological development for commercialization of autonomous driving level 3 has been achieved. However, the legal and institutional bases and traffic safety facilities for safe driving on actual roads in autonomous driving mode are insufficient. Therefore, in this study, a measurement method using a camera (monocular or dual) was used to evaluate autonomous vehicles. In addition, integrated scenarios were proposed wherein the individual scenarios proceed continuously. The precision of the camera-based autonomous vehicle safety evaluation method was verified via comparisons and analyses with the results of real vehicle tests. In the tests, the difference in the average error rate of inter-vehicle distance between the monocular camera and the dual camera was 0.34%, and the difference in the average error rate of the distance to the lane was 0.3 to 0.5%, showing similar results. It is judged that using monocular and dual cameras together, rather than independently, would allow them to compensate for each other's shortcomings.
Article
Full-text available
As automated vehicles are considered one of the important trends in intelligent transportation systems, various research is being conducted to enhance their safety. In particular, technologies for the design of preventive automated driving systems, such as detection of surrounding objects and estimation of the distance between vehicles, are increasingly important. Object detection is mainly performed through cameras and LiDAR, but due to the cost and the limits of LiDAR's recognition distance, there is a growing need to improve camera-based recognition techniques, which are relatively convenient for commercialization. This study trained the convolutional neural network (CNN)-based Faster Regions with CNN (Faster R-CNN) and You Only Look Once (YOLO) V2 models to improve the recognition capability of vehicle-mounted monocular cameras for the design of preventive automated driving systems, recognizing surrounding vehicles in black-box highway driving videos and estimating distances to surrounding vehicles with the model more suitable for automated driving systems. Moreover, we trained on the PASCAL Visual Object Classes (VOC) dataset for model comparison. Faster R-CNN showed accuracy similar to YOLO V2, with a mean average precision (mAP) of 76.4 versus 78.6, but at 5 frames per second (FPS) its processing was much slower than YOLO V2 at 40 FPS, and Faster R-CNN also had difficulty with some detections. As a result, YOLO V2, which shows better performance in accuracy and processing speed, was determined to be the more suitable model for automated driving systems, and we further progressed to estimating the distance between vehicles. For distance estimation, we converted coordinate values through camera calibration and a perspective transform, set the detection threshold to 0.7, and performed object detection and distance estimation, showing more than 80% accuracy for near-distance vehicles. This study is believed to help prevent accidents involving automated vehicles, and it is expected that additional research will provide various accident prevention alternatives, such as calculating and securing appropriate safety distances depending on the vehicle type.
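The distance-estimation step (camera calibration plus a perspective transform applied to the detector's bounding boxes) can be sketched with OpenCV as follows; the four image/ground calibration point pairs and the bounding-box format are assumptions.

```python
# Sketch of estimating distance to a detected vehicle by mapping the bottom
# center of its bounding box onto the road plane with a perspective transform.
# The image/ground calibration point pairs are illustrative assumptions.
import cv2
import numpy as np

# Image-plane points (pixels) and the corresponding ground-plane points (meters)
img_pts = np.float32([[520, 430], [760, 430], [1180, 700], [100, 700]])
ground_pts = np.float32([[-1.8, 40.0], [1.8, 40.0], [1.8, 5.0], [-1.8, 5.0]])
H = cv2.getPerspectiveTransform(img_pts, ground_pts)

def distance_to_box(box):
    """box = (x1, y1, x2, y2) from the detector; returns distance in meters."""
    x1, y1, x2, y2 = box
    bottom_center = np.float32([[[(x1 + x2) / 2.0, y2]]])       # shape (1, 1, 2)
    gx, gy = cv2.perspectiveTransform(bottom_center, H)[0, 0]
    return float(np.hypot(gx, gy))

print(distance_to_box((600, 380, 700, 460)))
```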
Article
Full-text available
As the research and development activities of automated vehicles have been active in recent years, developing test scenarios and methods has become necessary to evaluate and ensure their safety. Based on the current context, this study developed an automated vehicle test scenario derivation methodology using traffic accident data and a natural language processing technique. The natural language processing technique-based test scenario mining methodology generated 16 functional test scenarios for urban arterials and 38 scenarios for intersections in urban areas. The proposed methodology was validated by determining the number of traffic accident records that can be explained by the resulting test scenarios. That is, the validity of the resulting test scenarios is indicated by the matching rate between the test scenarios and the traffic accident records they can explain. The resulting functional scenarios generated by the proposed methodology account for 43.69% and 27.63% of the actual traffic accidents for urban arterial and intersection scenarios, respectively.
Conference Paper
Full-text available
Traffic accident detection is an important topic in traffic video analysis, and this paper discusses single-vehicle traffic accident detection. Specifically, a novel real-time traffic accident detection framework, which consists of an automated traffic region detection method, a new traffic direction estimation method, and a first-order logic traffic accident detection method, is presented in this paper. First, the traffic region detection method applies the general flow of traffic to detect the location and boundaries of the roads. Second, the traffic direction estimation method estimates the moving direction of the traffic. The rationale for estimating the traffic direction is that the crashed vehicles often make rapid changes of directions. Third, traffic accidents are detected using the first-order logic decision-making system. Experimental results using the real traffic video data show the feasibility of the proposed method. In particular, traffic accidents are detected in real-time in traffic videos without any false alarms.
Conference Paper
Full-text available
Large-scale application of autonomous vehicles requires rigorous testing on actual roads, and the quality and reliability of the test can be guaranteed when complex traffic risk scenarios are dealt with. Based on over 500 traffic accidents that occurred within the Yizhuang Development Zone in Beijing from 2009 to 2017, a risk scenario database was constructed, and accident scenarios were analyzed by deconstructing and reconstructing the variables of each. Nine field variables with factors such as time, space environment, and accident participants were deconstructed; 4,320 effective risk scenarios were reconstructed by an artificial neural network; and five typical test scenarios of autonomous vehicles within the Yizhuang Development Zone were extracted. The results show the types of accidents that account for the highest accident probability for AV testing. The conclusions offer new ideas for the construction of the risk scenario database and also provide a valuable reference for the road testing of autonomous vehicles.
Article
Full-text available
When will automated vehicles come onto the market? This question has puzzled the automotive industry and society for years. The technology and its implementation have made rapid progress over the last decade, but the challenge of how to prove the safety of these systems has not yet been solved. Since a market launch without proof of safety would neither be accepted by society nor by legislators, much time and many resources have been invested into safety assessment in recent years in order to develop new approaches for an efficient assessment. This paper therefore provides an overview of various approaches, and gives a comprehensive survey of the so-called scenario-based approach. The scenario-based approach is a promising method, in which individual traffic situations are typically tested by means of virtual simulation. Since an infinite number of different scenarios can theoretically occur in real-world traffic, even the scenario-based approach leaves the question unanswered as to how to break these down into a finite set of scenarios, and find those which are representative in order to render testing more manageable. This paper provides a comprehensive literature review of related safety-assessment publications that deal precisely with this question. Therefore, this paper develops a novel taxonomy for the scenario-based approach, and classifies all literature sources. Based on this, the existing methods will be compared with each other and, as one conclusion, the alternative concept of formal verification will be combined with the scenario-based approach. Finally, future research priorities are derived.
Article
Full-text available
Connected cars and vehicle-to-everything (V2X) communication scenarios are attracting more researchers. V2X will offer numerous possibilities in the future. For instance, vehicles moving in a column could react to the braking of those in front of them through rapid information exchanges, and most chain collisions could be avoided. V2X will be needed for route optimization, travel time reduction, and accident rate reduction in cases such as communication with infrastructure, traffic information exchanges, the functioning of traffic lights, possible situations of danger, and the presence of construction sites or traffic jams. Furthermore, there could be massive numbers of real-time exchanges between smartphones and vehicles. It is reasonable to expect a connection system in which a pedestrian can report their position to all surrounding vehicles. In this regard, the positive effects of drivers being aware of the presence of pedestrians while vehicles are moving on the road are compelling. This paper introduces the concepts for the development of a solution based on V2X communications aimed at vehicle and pedestrian safety. A potential system architecture for the development of a real system concerning the safety of vehicles and pedestrians is suggested in order to draft guidelines that could be followed in new applications.
Chapter
Full-text available
Assuring the safety of self-driving cars and other fully autonomous vehicles presents significant challenges to traditional software safety standards both in terms of content and approach. We propose a safety standard approach for fully autonomous vehicles based on setting scope requirements for an overarching safety case. A viable approach requires feedback paths to ensure that both the safety case and the standard itself co-evolve with the technology and accumulated experience. An external assessment process must be part of this approach to ensure lessons learned are captured, as well as to ensure transparency. This approach forms the underlying basis for the UL 4600 initial draft standard.
Conference Paper
Full-text available
Neural end-to-end architectures have been recently proposed for spoken language translation (SLT), following the state-of-the-art results obtained in machine translation (MT) and speech recognition (ASR). Motivated by this contiguity, we propose an SLT adaptation of Transformer (the state-of-the-art architecture in MT), which exploits the integration of ASR solutions to cope with long input sequences featuring low information density. Long audio representations hinder the training of large models due to Transformer's quadratic memory complexity. Moreover, for the sake of translation quality, handling such sequences requires capturing both short- and long-range dependencies between bi-dimensional features. Focusing on Transformer's encoder, our adaptation is based on: i) downsampling the input with convolutional neural networks, which enables model training on non-cutting-edge GPUs, ii) modeling the bidimensional nature of the audio spectrogram with 2D components, and iii) adding a distance penalty to the attention, which is able to bias it towards short-range dependencies. Our experiments show that our SLT-adapted Transformer outperforms the RNN-based baseline both in translation quality and training time, setting the state-of-the-art performance on six language directions.
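Two of the listed ingredients, convolutional downsampling of the audio features and a distance penalty on the attention logits, can be sketched as below; the penalty form and dimensions are assumptions, not the paper's exact formulation.

```python
# Sketch of two ideas from the abstract: strided 1D convolutions that downsample
# a long audio feature sequence, and a distance penalty subtracted from the
# attention logits to bias them toward short-range context (form is an assumption).
import torch
import torch.nn as nn
import torch.nn.functional as F

def downsample(features, in_dim=80, d_model=256):
    # features: (B, T, in_dim) log-Mel frames -> (B, T/4, d_model)
    conv = nn.Sequential(
        nn.Conv1d(in_dim, d_model, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    )
    return conv(features.transpose(1, 2)).transpose(1, 2)

def penalized_self_attention(x, penalty=0.05):
    # x: (B, T, d); subtract a term proportional to |i - j| from the logits.
    d = x.size(-1)
    logits = x @ x.transpose(1, 2) / d ** 0.5                    # (B, T, T)
    idx = torch.arange(x.size(1))
    dist = (idx[None, :] - idx[:, None]).abs().float()           # (T, T)
    weights = F.softmax(logits - penalty * dist, dim=-1)
    return weights @ x

x = downsample(torch.randn(2, 400, 80))
print(penalized_self_attention(x).shape)   # torch.Size([2, 100, 256])
```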
Article
Full-text available
Two-wheeled vehicles (motorized and non-motorized, referred to as TWs) are an important part of the transport system in China. They also represent an important challenge for road safety, with many TW user fatalities and injuries every year. Recently, active safety systems for cars, such as Automated Emergency Braking (AEB), promise to reduce road traffic fatalities and injuries. For these systems to work effectively, it is necessary to understand and define the complex traffic scenarios to be addressed. The aim of this study is to contribute to the development of test procedures for AEB specifically, drawing on the China In-Depth Accident Study (CIDAS) data from July 2011 to February 2016 to describe typical scenarios for crashes between cars and TWs by means of cluster analysis. In total, 672 car-to-TW crashes were extracted. The data was clustered according to five main crash characteristics: time of crash, view obstruction, pre-crash driving behavior of the car driver and the TW driver, and relative moving direction. The analysis resulted in six car-to-TW crash scenarios typical of China. In three scenarios the car and the TW travel perpendicularly to each other before the crash, in two they travel in the same direction, and in one they travel in opposite directions. Further, each scenario can be described with three characteristics (the road speed limit, the TW’s first contact point on the car, and the car’s first contact point on the TW) that can be included in an AEB test suite. Some scenarios were similar to those in the Euro New Car Assessment Program (Euro NCAP). For example, in one, a TW moving straight ahead was hit by a car moving perpendicularly, and in the other the car hit a TW traveling in the same direction. Both occurred in daytime, without a visual obstruction. However, in contrast to the Euro NCAP, typical scenarios in China included night-time scenarios, scenarios where the car or the TW was turning, and those in which the TW was hidden from the car by an obstruction. The results contribute to a proposed novel AEB test suite with realistic scenarios specific to China.
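The clustering step can be approximated with a simple stand-in: one-hot encode the five categorical crash characteristics and cluster them. The study's exact clustering algorithm may differ; the column names, sample rows, and number of clusters below are assumptions.

```python
# Simple stand-in for clustering crashes by categorical characteristics:
# one-hot encode the five variables and run k-means (illustrative data only).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import OneHotEncoder

crashes = pd.DataFrame({
    "time_of_crash":      ["day", "night", "day", "night"],
    "view_obstruction":   ["none", "obstructed", "none", "none"],
    "car_behavior":       ["straight", "turning", "straight", "straight"],
    "tw_behavior":        ["straight", "straight", "turning", "straight"],
    "relative_direction": ["perpendicular", "same", "opposite", "perpendicular"],
})
X = OneHotEncoder().fit_transform(crashes).toarray()
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```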
Article
Full-text available
In this paper, we present techniques for automatically classifying players and tracking ball movements in basketball game video clips under poor conditions where the camera angle dynamically shifts and changes. At the core of our system lies Yolo, a real-time object detection system. Given the ground truth boxes collected by our data specialists, Yolo is trained to detect the presence of objects in every video frame. In addition, Yolo uses Darknet, which implements Convolutional Neural Networks, to classify a detected object as a player and to recognize its jersey number for specific movements. By identifying players and ball possessions, we can automatically compute ball distributions that are reflected on complex networks. With the original Yolo system, player tracking can be interrupted when players move out of the frame due to camera shifts and when players overlap each other on a two-dimensional frame. We have adapted Yolo to keep track of players even under such poor conditions by taking into account contextual information available from the frames preceding and/or succeeding problematic video frames. In addition to the novel movement inference method, we provide a framework for analyzing the pass networks from various perspectives to help the managing staff reveal critical determinants of team performance and design better game strategies. We assess the performance of our system in terms of accuracy by making a comparison with the analytical reports generated by human experts.
Chapter
Full-text available
The testing of Autonomous Vehicles (AVs) requires driving the AV billions of miles under varied scenarios in order to find bugs, accidents and otherwise inappropriate behavior. Because driving a real AV that many miles is too slow and costly, this motivates the use of sophisticated ‘world simulators’, which present the AV’s perception pipeline with realistic input scenes, and present the AV’s control stack with realistic traffic and physics to which to react. Thus the simulator is a crucial piece of any CAD toolchain for AV testing. In this work, we build a test harness for driving an arbitrary AV’s code in a simulated world. We demonstrate this harness by using the game Grand Theft Auto V (GTA) as world simulator for AV testing. Namely, our AV code, for both perception and control, interacts in real-time with the game engine to drive our AV in the GTA world, and we search for weather conditions and AV operating conditions that lead to dangerous situations. This goes beyond the current state-of-the-art where AVs are tested under ideal weather conditions, and lays the ground work for a more comprehensive testing effort. We also propose and demonstrate necessary analyses to validate the simulation results relative to the real world. The results of such analyses allow the designers and verification engineers to weigh the results of simulation-based testing.
Article
Full-text available
The testing of Autonomous Vehicles (AVs) requires driving the AV billions of miles under varied scenarios in order to find bugs, accidents and otherwise inappropriate behavior. Because driving a real AV that many miles is too slow and costly, this motivates the use of sophisticated 'world simulators', which present the AV's perception pipeline with realistic input scenes, and present the AV's control stack with realistic traffic and physics to which to react. Thus the simulator is a crucial piece of any CAD toolchain for AV testing. In this work, we build a test harness for driving an arbitrary AV's code in a simulated world. We demonstrate this harness by using the game Grand Theft Auto V (GTA) as world simulator for AV testing. Namely, our AV code, for both perception and control, interacts in real-time with the game engine to drive our AV in the GTA world, and we search for weather conditions and AV operating conditions that lead to dangerous situations. This goes beyond the current state-of-the-art where AVs are tested under ideal weather conditions, and lays the groundwork for a more comprehensive testing effort. We also propose and demonstrate necessary analyses to validate the simulation results relative to the real world. The results of such analyses allow the designers and verification engineers to weigh the results of simulation-based testing.
Conference Paper
Full-text available
The latest version of the ISO 26262 standard from 2016 represents the state of the art for a safety-guided development of safety-critical electric/electronic vehicle systems. These vehicle systems include advanced driver assistance systems and vehicle guidance systems. The development process proposed in the ISO 26262 standard is based upon multiple V-models, and defines activities and work products for each process step. In many of these process steps, scenario based approaches can be applied to achieve the defined work products for the development of automated driving functions. To accomplish the work products of different process steps, scenarios have to focus on various aspects like a human understandable notation or a description via state variables. This leads to contradictory requirements regarding the level of detail and way of notation for the representation of scenarios. In this paper, the authors discuss requirements for the representation of scenarios in different process steps defined by the ISO 26262 standard, propose a consistent terminology based on prior publications for the identified levels of abstraction, and demonstrate how scenarios can be systematically evolved along the phases of the development process outlined in the ISO 26262 standard.
Article
Full-text available
SAE Technical Paper 2018-01-1066 presented at WCX World Congress Experience 2018 in Detroit: One of the major challenges for the automotive industry will be the release and validation of cooperative and automated vehicles. The immense driving distance that needs to be covered for a conventional validation process requires the development of new testing procedures. Further, due to limited market penetration in the beginning, the driving behavior of other human traffic participants in a mixed traffic environment will have a significant impact on the functionality of these vehicles. In this paper, a generic simulation-based toolchain for the model-in-the-loop identification of critical scenarios will be introduced. The proposed methodology allows the identification of critical scenarios with respect to the vehicle development process. The current development status of cooperative and automated vehicles determines the availability of testable simulation models, software, and components. The identification process is realized by a coupled simulation framework. A combination of a vehicle dynamics simulation that includes a digital prototype of the cooperative and automated vehicle, a traffic simulation that provides the surrounding environment, and a cooperation simulation including cooperative features, is used to establish a suitable comprehensive simulation environment. The behavior of other traffic participants is considered in the traffic simulation environment. The criticality of the scenarios is determined by appropriate metrics. Within the context of this paper, both standard safety metrics and newly developed traffic quality metrics are used for evaluation. Furthermore, we will show how the use of these new metrics allows for investigating the impact of cooperative and automated vehicles on traffic. The identified critical scenarios are used as an input for X-in-the-Loop methods, test benches, and proving ground tests to achieve an even more precise comparison to real-world situations. As soon as the vehicle development process is in a mature state, the digital prototype becomes a "digital twin" of the cooperative and automated vehicle. --- SAE Technical Paper 2018-01-1066 was recommended by the WCX 2018 topical chair for publication as a journal article under the original DOI in the SAE International Journal of Connected and Automated Vehicles: SAE Int. J. of CAV 1(2):93–106, 2018, doi:10.4271/2018-01-1066.
Article
Full-text available
Autonomous Vehicle technology is quickly expanding its market and has found in Silicon Valley, California, a strong foothold for preliminary testing on public roads. In an effort to promote safety and transparency to consumers, the California Department of Motor Vehicles has mandated that reports of accidents involving autonomous vehicles be drafted and made available to the public. The present work shows an in-depth analysis of the accident reports filed by different manufacturers that are testing autonomous vehicles in California (testing data from September 2014 to March 2017). The data provides important information on autonomous vehicles accidents’ dynamics, related to the most frequent types of collisions and impacts, accident frequencies, and other contributing factors. The study also explores important implications related to future testing and validation of semi-autonomous vehicles, tracing the investigation back to current literature as well as to the current regulatory panorama.
Article
Full-text available
Driving through dynamically changing traffic scenarios is a highly challenging task for autonomous vehicles, especially on urban roadways. Prediction of surrounding vehicles' driving behaviors plays a crucial role in autonomous vehicles. Most traditional driving behavior prediction models work only for a specific traffic scenario and cannot be adapted to different scenarios. In addition, a priori driving knowledge has rarely been considered sufficiently. This study proposes a novel scenario-adaptive approach to solve these problems. A novel ontology model was developed to model traffic scenarios. Continuous features of driving behavior were learned by Hidden Markov Models (HMMs). Then, a knowledge base was constructed to specify the model adaptation strategies and store a priori probabilities based on the scenario's characteristics. Finally, the target vehicle's future behavior was predicted considering both a posteriori probabilities and a priori probabilities. The proposed approach was sufficiently evaluated with a real autonomous vehicle. The application scope of traditional models can be extended to a variety of scenarios, while the prediction performance can be improved by the consideration of a priori knowledge. For lane-changing behaviors, the prediction time horizon can be extended by up to 56% (0.76 s) on average. Meanwhile, long-term prediction precision can be enhanced by over 26%.
Conference Paper
Full-text available
Driving tests are critical to the deployment of autonomous vehicles. It is necessary to review the related works, since summaries of the methodologies are rare; such a review helps set up an integrated method for autonomous driving tests at different development stages and helps provide a reliable, quick, safe, low-cost, and reproducible method that accelerates the development of autonomous vehicles. In this paper, we review the related autonomous driving test works, including autonomous vehicle functional verification, vehicle integrated testing, and system validation in different architectures. This review work will be helpful for autonomous vehicle development.
Article
Objective Automatic emergency braking (AEB) that detects pedestrians has great potential to reduce pedestrian crashes. The objective of this study was to examine its effects on real-world police-reported crashes. Methods Two methods were used to assess the effects of pedestrian-detecting AEB on pedestrian crash risk. Vehicles with and without the system were examined on models where it was an optional feature. Poisson regression was used to estimate the effects of AEB on pedestrian crash rates per insured vehicle year, and quasi-induced exposure using logistic regression compared involvement in pedestrian crashes to a system-irrelevant crash type. Results AEB with pedestrian detection was associated with significant reductions of 25%–27% in pedestrian crash risk and 29%–30% in pedestrian injury crash risk. However, there was no evidence that the system was effective in dark conditions without street lighting, at speed limits of 50 mph or greater, or while the AEB-equipped vehicle was turning. Conclusions Pedestrian-detecting AEB is reducing pedestrian crashes, but its effectiveness could be even greater. For the system to make meaningful reductions in pedestrian fatalities, it is crucial for it to work well in dark and high-speed conditions. Other proven interventions to reduce pedestrian crashes under challenging circumstances, such as improved headlights and roadway-based countermeasures, should continue to be implemented in conjunction with use of AEB to prevent pedestrian crashes most effectively.
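The first method, Poisson regression of pedestrian crash counts with insured vehicle years as exposure, can be sketched with statsmodels; the data below are synthetic and the single AEB indicator is a simplification of the study's model.

```python
# Sketch of a Poisson rate model: pedestrian crash counts per insured vehicle
# year, with an indicator for pedestrian-detecting AEB (data are synthetic).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "aeb": rng.integers(0, 2, n),               # 1 = AEB with pedestrian detection
    "vehicle_years": rng.uniform(0.5, 5.0, n),  # exposure
})
rate = 0.02 * np.exp(-0.3 * df["aeb"])          # assumed ~26% lower rate with AEB
df["crashes"] = rng.poisson(rate * df["vehicle_years"])

model = sm.GLM(
    df["crashes"],
    sm.add_constant(df[["aeb"]]),
    family=sm.families.Poisson(),
    offset=np.log(df["vehicle_years"]),
).fit()
print(model.summary().tables[1])
print("estimated crash-risk change:", np.exp(model.params["aeb"]) - 1)
```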
Article
Connected and automated vehicles (CAVs) enabled by wireless communication and vehicle automation are believed to revolutionize the form and operation of road transport in the next decades. This paper addresses traffic flow effects of CAVs, and focuses on their lane-changing impacts on the mixed traffic flow of CAVs and human-driven vehicles (HVs). At present technical paths towards the development and deployment of CAVs are still uncertain. With CAV technologies getting matured, CAVs are supposed to provide rides of higher efficiency than HVs, beyond improved safety and comfort. In heavy traffic, this would only be achievable via agile and flexible lane changes of CAVs, because longitudinal acceleration would be unhelpful or even impossible. Such lane changes are expected to be ego-efficient in that they serve solely CAVs’ interests without much considering surrounding vehicles, as long as safety constraints are not violated. As road resources are limited, the growth of the CAV population adopting such ego-efficient lane-changing strategies would probably lead to renowned “Tragedy of the Commons”. In this context, this paper considers three important prospective questions: A: How to determine an ego-efficient lane-changing strategy for CAVs? B: With more CAVs introduced each adopting the ego-efficient lane-changing strategy, what is the impact on traffic flow? C: How to determine a system-efficient lane-changing strategy for CAVs? These forward-looking issues are addressed from the perspectives of microscopic traffic simulation and reinforcement learning. Without any constraint on the lane-changing incentive, the developed lane-changing strategy was found to be beneficial for CAVs and the entire traffic flow only if the market penetration rate (MPR) of CAVs is less than 50%. With an appropriate constraint placed, however, the lane-changing strategy was found to become consistently beneficial for the entire traffic flow at any MPR. These findings suggest that CAVs may not simply be a magic cure for traffic problems that the society is currently facing, unless some upper-level coordination may be proposed for CAVs to benefit not only themselves but also the entire traffic. This is also consistent with what “Tragedy of the Commons” suggests.
Article
Driving safety is one of the most important concerns on the road. Vehicles constantly generate messages under vehicle-to-everything (V2X) assisted driving. Especially in dense urban environments, the massive volume of messages carrying precise data can help us to improve road safety. However, vehicles do not always provide accurate data for a variety of reasons, such as defective vehicle sensors or selfish behavior. It is critical to check and analyze the data supplied by vehicles in real time and correct the possible errors to eliminate unsafe situations. In this article, we introduce a cOllaborative vehiClE dAta correctioN method (OCEAN) based on rationality and Q-learning techniques to correct erroneous V2X data and ensure the driving safety of vehicles on the road; it can be deployed on both vehicles and roadside units. Extensive experimental results show that OCEAN can detect up to 80% of erroneous V2X data and reduce the average error distance by 60% for most attributes in vehicle data.
Article
The rapid advancement of sensor technologies and artificial intelligence are creating new opportunities for traffic safety enhancement. Dashboard cameras (dashcams) have been widely deployed on both human driving vehicles and automated driving vehicles. A computational intelligence model that can accurately and promptly predict accidents from the dashcam video will enhance the preparedness for accident prevention. The spatial-temporal interaction of traffic agents is complex. Visual cues for predicting a future accident are embedded deeply in dashcam video data. Therefore, the early anticipation of traffic accidents remains a challenge. Inspired by the attention behavior of humans in visually perceiving accident risks, this paper proposes a Dynamic Spatial-Temporal Attention (DSTA) network for the early accident anticipation from dashcam videos. The DSTA-network learns to select discriminative temporal segments of a video sequence with a Dynamic Temporal Attention (DTA) module. It also learns to focus on the informative spatial regions of frames with a Dynamic Spatial Attention (DSA) module. A Gated Recurrent Unit (GRU) is trained jointly with the attention modules to predict the probability of a future accident. The evaluation of the DSTA-network on two benchmark datasets confirms that it has exceeded the state-of-the-art performance. A thorough ablation study that assesses the DSTA-network at the component level reveals how the network achieves such performance. Furthermore, this paper proposes a method to fuse the prediction scores from two complementary models and verifies its effectiveness in further boosting the performance of early accident anticipation.
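A minimal sketch of the attention-plus-recurrence idea: per-frame region features are re-weighted by a learned spatial attention score, and a GRU emits a per-step accident probability. The dimensions and attention form are assumptions, not the DSTA-network's exact modules.

```python
# Sketch of spatial attention over per-frame region features feeding a GRU
# that outputs a per-time-step accident probability (sizes are illustrative).
import torch
import torch.nn as nn

class AttentionGRUAnticipator(nn.Module):
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.spatial_score = nn.Linear(feat_dim, 1)   # scores candidate regions in a frame
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, region_feats):
        # region_feats: (B, T, R, feat_dim) features of R regions per frame
        attn = torch.softmax(self.spatial_score(region_feats), dim=2)   # (B, T, R, 1)
        frame_feats = (attn * region_feats).sum(dim=2)                  # (B, T, feat_dim)
        out, _ = self.gru(frame_feats)
        return torch.sigmoid(self.head(out)).squeeze(-1)                # (B, T) accident prob.

model = AttentionGRUAnticipator()
probs = model(torch.randn(2, 50, 9, 512))
print(probs.shape)   # torch.Size([2, 50])
```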
Article
Explainable artificial intelligence (XAI) aims to reduce the opacity of AI-based decision-making systems, allowing humans to scrutinize and trust them. Unlike prior work that attributes the responsibility for an algorithm's decisions to its inputs as a purely associational concept, we propose a principled causality-based approach for explaining black-box decision-making systems. We present the demonstration of Lewis, a system that generates explanations for black-box algorithms at the global, contextual, and local levels, and provides actionable recourse for individuals negatively affected by an algorithm's decision. Lewis makes no assumptions about the internals of the algorithm except for the availability of its input-output data. The explanations generated by Lewis are based on probabilistic contrastive counterfactuals, a concept that can be traced back to philosophical, cognitive, and social foundations of theories on how humans generate and select explanations. We describe the system layout of Lewis wherein an end-user specifies the underlying causal model and Lewis generates explanations for particular use-cases, compares them with explanations generated by state-of-the-art approaches in XAI, and provides actionable recourse when applicable. Lewis has been developed as open-source software; the code and the demonstration video are available at lewis-system.github.io.
Article
Deep learning algorithms for anomaly detection, such as autoencoders, point out the outliers, saving experts the time-consuming task of examining normal cases in order to find anomalies. Most outlier detection algorithms output a score for each instance in the database. The top-k most intense outliers are returned to the user for further inspection; however, the manual validation of results becomes challenging without justification or additional clues. An explanation of why an instance is anomalous enables the experts to focus their investigation on the most important anomalies and may increase their trust in the algorithm. Recently, a game theory-based framework known as SHapley Additive exPlanations (SHAP) was shown to be effective in explaining various supervised learning models. In this paper, we propose a method that uses Kernel SHAP to explain anomalies detected by an autoencoder, which is an unsupervised model. The proposed explanation method aims to provide a comprehensive explanation to the experts by focusing on the connection between the features with high reconstruction error and the features that are most important in terms of their effect on the reconstruction error. We propose a black-box explanation method, because it has the advantage of being able to explain any autoencoder without being aware of the exact architecture of the autoencoder model. The proposed explanation method extracts and visually depicts both features that contribute the most to the anomaly and those that offset it. An expert evaluation using real-world data demonstrates the usefulness of the proposed method in helping domain experts better understand the anomalies. Our evaluation of the explanation method, in which a "perfect" autoencoder is used as the ground truth, shows that the proposed method explains anomalies correctly, using the exact features, and evaluation on real data demonstrates that (1) our explanation model, which uses SHAP, is more robust than the Local Interpretable Model-agnostic Explanations (LIME) method, and (2) the explanations our method provides are more effective at reducing the anomaly score than other methods.
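The core recipe, handing the autoencoder's per-sample reconstruction error to Kernel SHAP as the function to be explained, can be sketched as below; the tiny MLP autoencoder and synthetic data are stand-ins for illustration.

```python
# Sketch of explaining an autoencoder anomaly score with Kernel SHAP: the
# function handed to the explainer returns the per-sample reconstruction error.
import numpy as np
import shap
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 10))
autoencoder = MLPRegressor(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
autoencoder.fit(X_train, X_train)                 # train to reconstruct the input

def anomaly_score(X):
    return ((autoencoder.predict(X) - X) ** 2).mean(axis=1)   # reconstruction error

background = shap.sample(X_train, 50)
explainer = shap.KernelExplainer(anomaly_score, background)

x_anomalous = rng.normal(size=(1, 10))
x_anomalous[0, 3] += 6.0                          # inject an outlying feature
shap_values = explainer.shap_values(x_anomalous, nsamples=200)
print(shap_values)                                # feature 3 should dominate
```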
Article
With safety being one of the primary motivations for developing automated vehicles (AVs), extensive field and simulation tests are being carried out to ensure AVs can operate safely on roadways. Since 2014, the California Department of Motor Vehicles (DMV) has been collecting AV collision and disengagement reports, which are valuable data sources for studying AV crash patterns. A crash sequence of events describes the AV’s interactions with other road users before a collision in a temporal manner. In this study, sequence of events data extracted from California AV collision reports were used to investigate patterns and how they may be used to develop AV test scenarios. Employing sequence analysis methods and clustering, this study evaluated 168 AV crashes (with AV in automatic driving mode before disengagement or collision) reported to the California DMV from 2015 to 2019. Analysis of subsequences showed that the most representative pattern in AV crashes was “collision following AV stop”. Analysis of event transition showed that disengagement, as an event in 24% of all studied AV crash sequences, had a transition probability of 68% to an immediate collision. Cluster analysis characterized AV crash sequences into seven groups with distinctive crash dynamic features. Cross-tabulation analysis showed that sequence groups were significantly associated with variables measuring crash outcomes and describing environmental conditions. Crash sequences are useful for developing AV test scenarios. Based on the findings, a scenario-based AV safety testing framework was proposed with sequence of events embedded as a core component.
Article
Quantitative microstructural interpretations were carried out without human involvement through an integrated combination of deep learning and focused ion beam-scanning electron microscopy (FIB-SEM) analytics on Ni/Y2O3-stabilized ZrO2 (Ni/YSZ) cermets. The Ni/YSZ/pore composites were analyzed for the automated extraction of microstructural parameters to prevent the subjective analysis problems and unavoidable artifacts frequently encountered in lengthy image processing tasks and eliminate biased evaluations. Considering the high volume of image data and future expectations for electron microscopy usage, FIB-SEM was efficiently combined with semantic segmentation. Traditional image processing analysis tools are combined with phase separation predictions by semantic segmentation algorithms, leading to a quantitative evaluation of microstructural parameters. The combined strategy enables one to significantly enhance poor image quality originating from artifacts in electron microscopy, including charging effects, curtain effects, out-of-focus problems, and unclear phase boundaries encountered in searching for high-efficiency solid oxide fuel cells (SOFCs).
Article
Automated semantic segmentation is applied to the quantification of microstructural features in three-phase composite cathode materials of solid oxide fuel cells (SOFCs), i.e., GDC/LSC/Pore where GDC stands for Gd2O3-doped CeO2 and LSC for La0.6Sr0.4CoO3-δ. Our aim is to eliminate the tedious involvement of human experts and the associated errors. The high volume of image information sets is generated using automatic acquisition systems involving focused-ion beam scanning electron microscopy through a so-called slice-view procedure. Through the integration of semantic segmentation with image processing-assisted stereography tools, the following detailed microstructural features are quantitatively extracted automatically and objectively without any human involvement: size distribution, surface (or equivalently, volume) fraction, lengths of two-phase boundaries, and density of triple-phase boundaries based on two-dimensional images. The extracted two-dimensional information is connected with three-dimensional reconstruction analysis. The implications of semantic segmentation in SOFCs are discussed considering efficient analysis and design of high-performance electrode structures in energy-oriented devices.
Article
Automated vehicles are emerging on transportation networks as manufacturers test their automated driving system (ADS) capabilities in complex real-world environments through testing operations such as California's Autonomous Vehicle Tester Program. A more comprehensive understanding of ADS safety performance can be established through the California Department of Motor Vehicles disengagement and crash reports. This study comprehensively examines the safety performance (159,840 disengagements, 124 crashes, and 3,669,472 automated vehicle miles traveled by the manufacturers) documented since the inauguration of the testing program. The reported disengagements were categorized as control discrepancy; environmental conditions and other road users; hardware and software discrepancy; perception discrepancy; planning discrepancy; and operator takeover. An applicable subset of disengagements was then used to identify and quantify the 5 W's of these safety-critical events: who (the disengagement initiator), when (the maturity of the ADS), where (the location of the disengagement), and what/why (the factors causing the disengagement). The disengagement initiator, whether the ADS or the human operator, is linked with contributing factors such as location, disengagement cause, and ADS testing maturity through a random parameter binary logit model that captures unobserved heterogeneity. Results reveal that, compared with freeways and interstates, disengagements on streets and roads are less likely to be initiated by the ADS than by the human operator. Likewise, software and hardware discrepancies and planning discrepancies are associated with the ADS initiating the disengagement. As ADS testing maturity advances in months, the probability of the disengagement being initiated by the ADS marginally increases relative to human-initiated disengagements. Overall, the study contributes to understanding the factors associated with disengagements and explores their implications for automated systems.
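The paper links the disengagement initiator to contributing factors through a random parameter binary logit model; a full random-parameters formulation is beyond a short example, so the hedged sketch below fits a plain binary logit on synthetic data with statsmodels instead. The covariate names, coefficients, and data are illustrative assumptions only.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
# Synthetic covariates standing in for the study's factors.
df = pd.DataFrame({
    "street_or_road": rng.integers(0, 2, n),      # 1 = street/road, 0 = freeway/interstate
    "planning_discrepancy": rng.integers(0, 2, n),
    "testing_month": rng.integers(1, 60, n),      # ADS testing maturity in months
})
# Synthetic outcome: 1 = ADS-initiated disengagement, 0 = human-initiated.
linpred = -0.5 - 0.8 * df["street_or_road"] + 0.9 * df["planning_discrepancy"] + 0.01 * df["testing_month"]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-linpred)))

X = sm.add_constant(df)
fit = sm.Logit(y, X).fit(disp=False)   # plain logit, not the random parameter model
print(fit.summary())
```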
Article
A research study conducted to evaluate the efficacy of flashing amber signal phasing is reported. Flashing amber, which is set to overlap with the green indication for a few seconds before the onset of solid amber, is a form of time reference aid used to warn drivers of an impending onset of amber. The time reference aid is a concept predicated on the principle that driver decisions will be easier and more predictable if drivers have advance information that allows them to predict the onset of amber. The evaluation of the flashing amber method showed that its implementation has the potential to reduce red-light violations, the severity of maximum decelerations, and kinematically defined inappropriate stop-or-cross decisions. However, the data also showed that, compared with regular signal phasing, the flashing amber phasing increased the size of the indecision zone, a mechanism usually responsible for increased rear-end collisions. A measure not previously used in the literature was introduced to compare the regular and the experimental signal phasing. This measure, which analyzes first-response-time variability in relation to the indecision zone, predicted that increased rear-end collisions might be expected as a result of implementing the flashing amber signal phasing. Overall, the results suggest that implementing flashing amber signal phasing would not significantly improve intersection safety, despite the notion that it would improve driver anticipation of the onset of solid amber.
Article
In recent years, obstacles have been appearing on roads with increasing frequency. Timely obstacle detection is crucial in driver-assistance systems to prevent traffic incidents. Artificial vision has been used to design advanced driver-assistance systems, which help avoid collisions or fatal accidents by offering technologies that alert the driver to potential problems. Timely obstacle detection remains an open problem in dynamic environments; it requires identifying both static and moving objects, collectively referred to as obstacles, while the vehicle is being driven. The object identification process is mainly affected by lighting conditions. In this paper, we present an on-road obstacle detection system based on video analysis. The system extracts areas of interest from a video scene by using a rectangular window of observation and carrying out a sample analysis to separate the road from possible obstacles and the horizon, a step known as the segmentation process. In addition, the system estimates the obstacle trajectory by using monocular vision and an extended Kalman filter. The mechanism has been tested under several surface and lighting conditions, showing a significant improvement in robustness to real-world driving conditions compared with other state-of-the-art methods, which are designed to work in controlled environments.
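The abstract does not detail the filter, so the following is only a generic extended Kalman filter sketch: a constant-velocity state model with a bearing-only measurement, loosely standing in for monocular observations of an obstacle. The state layout, noise values, and measurement model are assumptions, not the authors' design.

```python
import numpy as np

dt = 0.1
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)   # constant-velocity motion model
Q = 0.01 * np.eye(4)                          # process noise covariance (assumed)
R = np.array([[0.005]])                       # bearing measurement noise (assumed)

def h(x):
    """Nonlinear measurement: bearing from the camera (at the origin) to the obstacle."""
    return np.array([np.arctan2(x[1], x[0])])

def H_jacobian(x):
    px, py = x[0], x[1]
    d2 = px ** 2 + py ** 2
    return np.array([[-py / d2, px / d2, 0.0, 0.0]])

def ekf_step(x, P, z):
    # Predict with the linear motion model.
    x_pred, P_pred = F @ x, F @ P @ F.T + Q
    # Update with the nonlinear bearing measurement, linearized at the prediction.
    H = H_jacobian(x_pred)
    residual = z - h(x_pred)
    residual[0] = (residual[0] + np.pi) % (2 * np.pi) - np.pi  # wrap angle residual
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    return x_pred + K @ residual, (np.eye(4) - K @ H) @ P_pred

# State [px, py, vx, vy] of an obstacle; one predict/update cycle.
x, P = np.array([5.0, 1.0, -0.5, 0.0]), np.eye(4)
x, P = ekf_step(x, P, z=np.array([0.22]))
print(x)
```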
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
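The central operation of the Transformer named here is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The minimal NumPy sketch below implements just that single-head, unmasked operation; it omits multi-head projection, masking, and everything else in the full architecture.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head, unmasked attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n_queries, n_keys) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted sum of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # 3 query vectors of dimension d_k = 8
K = rng.normal(size=(5, 8))   # 5 key vectors
V = rng.normal(size=(5, 8))   # 5 value vectors
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```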
Conference Paper
We propose a Dynamic-Spatial-Attention (DSA) Recurrent Neural Network (RNN) for anticipating accidents in dashcam videos (Fig. 1). Our DSA-RNN learns to (1) distribute soft attention to candidate objects dynamically to gather subtle cues and (2) model the temporal dependencies of all cues to robustly anticipate an accident. Anticipating accidents is much less studied than anticipating events such as changing lanes or making a turn, since accidents are rarely observed and can happen suddenly in many different ways. To overcome these challenges, we (1) utilize a state-of-the-art object detector [3] to detect candidate objects, and (2) incorporate full-frame and object-based appearance and motion features in our model. We also harvest a diverse dataset of 678 dashcam accident videos from the web (Fig. 3). The dataset is unique in that every video contains an accident, and the accidents vary widely (e.g., a motorbike hits a car, a car hits another car). We manually annotate the time and location of each accident and use these annotations as supervision to train and evaluate our method. We show that our method anticipates accidents about 2 s before they occur, with 80% recall and 56.14% precision. Most importantly, it achieves the highest mean average precision (74.35%), outperforming baselines without attention or an RNN.
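The soft-attention idea described above can be sketched as weighting candidate-object features by their relevance to the current RNN hidden state. The additive scoring form, dimensions, and weights below are illustrative assumptions and not the authors' exact DSA-RNN.

```python
import numpy as np

def soft_attention(hidden, object_feats, W_h, W_o, w):
    """Score each candidate object against the RNN hidden state and return the
    attention-weighted sum of object features plus the attention weights."""
    scores = np.tanh(object_feats @ W_o.T + hidden @ W_h.T) @ w   # additive attention scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                       # softmax over objects
    return weights @ object_feats, weights

rng = np.random.default_rng(0)
d_h, d_o, d_a, n_obj = 16, 32, 20, 6
hidden = rng.normal(size=d_h)                  # RNN hidden state at the current frame
object_feats = rng.normal(size=(n_obj, d_o))   # features of detected candidate objects
W_h = rng.normal(size=(d_a, d_h))
W_o = rng.normal(size=(d_a, d_o))
w = rng.normal(size=d_a)
context, weights = soft_attention(hidden, object_feats, W_h, W_o, w)
print(weights.round(3), context.shape)
```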
Article
The objective of this study was to evaluate the effectiveness, in reducing front-to-rear crashes and injuries, of forward collision warning (FCW) alone, of a low-speed autonomous emergency braking (AEB) system operational at speeds up to 19 mph that does not warn the driver prior to braking, and of FCW combined with AEB operating at higher speeds. Poisson regression was used to compare rates of police-reported crash involvements per insured vehicle year in 22 U.S. states during 2010–2014 between passenger vehicle models with FCW alone or with AEB and the same models where the optional systems were not purchased, controlling for other factors affecting crash risk. Similar analyses compared the rates of Volvo 2011–2012 model S60 and 2010–2012 model XC60 vehicles equipped with a standard low-speed AEB system to those of other luxury midsize cars and SUVs, respectively, without the system.
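The rate comparison described here is a Poisson regression of crash counts with insured vehicle years as exposure. The hedged sketch below fits such a model on synthetic data with statsmodels; the covariates, coefficients, and data are illustrative assumptions, not the study's dataset or full specification.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 400
# Synthetic vehicle-series records; covariates are simplified stand-ins.
df = pd.DataFrame({
    "fcw_aeb": rng.integers(0, 2, n),       # 1 = FCW with AEB purchased, 0 = not purchased
    "vehicle_age": rng.integers(0, 5, n),
})
exposure = rng.uniform(50.0, 500.0, n)       # insured vehicle years per record
true_rate = np.exp(-4.0 - 0.4 * df["fcw_aeb"] + 0.05 * df["vehicle_age"])
y = rng.poisson(true_rate * exposure)        # front-to-rear crash counts

X = sm.add_constant(df)
fit = sm.GLM(y, X, family=sm.families.Poisson(), exposure=exposure).fit()
print(np.exp(fit.params["fcw_aeb"]))         # estimated crash rate ratio for FCW with AEB
```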