Conference Paper

Driver Identification Using Vehicle Telematics Data

... It has high training efficiency and low generalization error. Some scholars use it to identify drivers [28], [55]-[59]. ... The RF model processes a large amount of high-dimensional data by randomly selecting input samples and features. ...
... It has high training efficiency and low generalization error. Some scholars use it to identify drivers [28], [55]-[59]. For example, Hallac et al. [28] proposed to identify drivers using the RF model for CAN bus data collected during turning and driving, with an accuracy rate of 76.9%. ...
... For example, Hallac et al. [28] proposed to identify drivers using the RF model for CAN bus data collected during turning and driving, with an accuracy rate of 76.9%. Wang et al. [55] fed statistical features extracted from the CAN bus data into an RF model composed of 1000 classification trees, and the recognition accuracy was up to 100%. Luo et al. [56] used the RF model to conduct ensemble learning on samples composed of driving data, and the accuracy of identifying 4 and 15 drivers was 89.14% and 60.36%, respectively. ...
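The random selection of input samples and features mentioned in these excerpts can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited paper: the function name `draw_tree_inputs` and all sizes are hypothetical, and the feature subset is drawn once per tree for brevity, whereas common random forest implementations re-draw it at every split.

```python
import random


def draw_tree_inputs(n_samples, n_features, max_features, rng):
    """Pick the rows and columns one tree of the forest would be grown on."""
    # Bootstrap sample: row indices drawn with replacement, same size as the data.
    rows = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Feature subsample: column indices drawn without replacement.
    cols = sorted(rng.sample(range(n_features), max_features))
    return rows, cols


# Hypothetical sizes: 8 feature vectors, 6 statistical features, 3 trees.
rng = random.Random(0)
plans = [draw_tree_inputs(8, 6, 2, rng) for _ in range(3)]
for rows, cols in plans:
    print("rows:", rows, "features:", cols)
```

Because each tree sees a different resampled view of the data, the trees make partly independent errors, which is what the final vote across trees relies on.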
Article
Driver identification is very important to realizing customized service for drivers and road traffic safety for electric vehicles and has become a research hotspot in the field of modern automobile development and intelligent transportation. This paper presents a comprehensive review of driver identification methods. The basic process of driver identification task is proposed as four steps, the advantages and disadvantages of different data sources for driver identification are analyzed, driver identification models are divided into three categories, and the characteristics and research progress of driver identification models are summarized, which can provide a reference for further research on driver identification. It is concluded that on-board sensor data in the natural driving state is objective and accurate and could be the main data source for driver identification. Emerging technologies such as big data, artificial intelligence, and the internet of things have contributed to building a deep learning hybrid model with high accuracy and robustness and representing an important gradual development trend of driver identification methods. Developing a driver identification method with high accuracy, real-time performance, and robustness is an important development goal in the future.
... Even in deep learning, the use of batch normalization has become popular for accelerating the training process and, in some cases, improving performance [8]. Data scaling has become a standard step in data pre-processing, although it is still applied without clear guidelines or further exploration of possible improvements. The two prominent methods are the z-score method, also known as the Standard Scaler [9], [10], and the Min-Max Scaler [11]-[14]. These methods are used in almost every study that needs data scaling in pre-processing, and the same is true for data extracted from sensors such as CAN-Bus/OBD-II, which is the object of our study, and for various other time series data. ...
... Moreira-Matias et al. (2017) [15] used the Min-Max scaler for normalization before passing the vehicle data to an identification framework based on stacked generalization of Support Vector Machine (SVM), Random Forest (RF), Decision Tree and K-Nearest Neighbors (KNN) models. Wang et al. (2017) [9] adopted standardization (Z-score) for CAN-Bus signal classification with Random Forest for driver identification. Hallac et al. (2018) [16] also scaled their CAN-bus data with the Z-score before exploiting a Gated Recurrent Unit (GRU) embedding for driver identification. ...
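The two scaling methods named in these excerpts can be written out directly. This is a minimal sketch on a single signal; the sample values are made up, and using the population standard deviation for the z-score is an assumption made here for illustration.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize: subtract the mean, divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant signal: nothing to scale
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant signal: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical vehicle-speed readings.
speed = [10.0, 20.0, 30.0, 40.0]
standardized = z_score(speed)
normalized = min_max(speed)
print(standardized)
print(normalized)
```

The z-score output has zero mean and unit variance but no fixed range, while the min-max output is bounded to [0, 1] and therefore more sensitive to a single outlier.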
... The driving process is a continuous activity, and the information captured in a single sample, even with the 51 different features, cannot describe the identity of the driver or the activity to the ML algorithm. Hence, a sliding window is needed; this process is used by most of the state-of-the-art works already mentioned, such as those described in [9]-[14]. Figure 1 describes an example of the sliding window that we chose for our data, based on the work presented in [10] with the same dataset and profiting from their search for the best parameters. ...
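The sliding window described in this excerpt can be sketched as follows. The window size and step used here are hypothetical and are not the parameters chosen in the cited works; the signal is a stand-in for one feature column.

```python
from statistics import mean


def sliding_windows(samples, window_size, step):
    """Return consecutive, possibly overlapping, windows over a sequence."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(samples) - window_size
    return [samples[start:start + window_size]
            for start in range(0, last_start + 1, step)]


# Ten consecutive samples of one signal (placeholder values).
signal = list(range(10))
windows = sliding_windows(signal, window_size=4, step=2)
# One summary feature per window; real pipelines usually compute several.
window_means = [mean(w) for w in windows]
print(windows)
print(window_means)
```

A step smaller than the window size makes neighbouring windows overlap, which yields more training examples from the same trip at the cost of correlated samples.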
... Wang et al. [8] identified 30 drivers by using the voting strategy and Random Forest algorithm. Authors split data and tests into different window sizes. ...
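A voting strategy over windows can be sketched as a plurality vote on per-window predictions. This is an assumption about how such a strategy might be organized, not the exact scheme of the cited work; the driver labels are hypothetical.

```python
from collections import Counter


def vote(window_predictions):
    """Return the label predicted for the largest number of windows."""
    if not window_predictions:
        raise ValueError("need at least one prediction")
    # most_common(1) gives the (label, count) pair with the highest count.
    return Counter(window_predictions).most_common(1)[0][0]


# Hypothetical per-window outputs of a classifier for one trip.
trip_predictions = ["driver_A", "driver_B", "driver_A", "driver_C", "driver_A"]
trip_label = vote(trip_predictions)
print(trip_label)
```

Voting lets a trip be labelled correctly even when some individual windows are misclassified.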
... Compared to our paper, [6], [7], [8], [12], [13] and [14] do not look for frequency. Also, LSTM in [9] obtained lower scores in comparison with a Decision Tree (DT) algorithm ([12], [13] and [14]) on the same dataset. ...
Preprint
The introduction of Information and Communication Technology (ICT) in transportation systems leads to several advantages (efficiency of transport, mobility, traffic management). However, it may bring some drawbacks in terms of increasing security challenges, also related to human behaviour. As an example, in the last decades, attempts to characterize drivers' behaviour have been mostly targeted. This paper presents Secure Routine, a paradigm that uses the driver's habits for driver identification and, in particular, to distinguish the vehicle's owner from other drivers. We evaluate Secure Routine in combination with three other existing research works based on machine learning techniques. Results are measured using well-known metrics and show that Secure Routine outperforms the compared works.
... However, the battery data is included amongst many other data traces describing other active processes in the electrified product, and the data is recorded at varying frequencies, or out-of-band. This results in exorbitantly large data volumes, 1-5 mb/min of data generated per device in the automotive industry (Wang et al., 2017). The types of data included in in-field monitoring can be as descriptive as cell-level or module-level data but can also only include pack-level readings, depending on the configuration of the monitoring system. ...
Article
Batteries have enabled modernization of society through portability of electricity. Batteries are also a crucial component to enabling clean technologies of the future such as grid storage and electrified transportation. Because of their ubiquity in modern society, global organizations develop and commercialize batteries for their electrified products. Across the field of battery development, in both commercial and academic settings, there is broad utility in standardization of data formats amongst disparate data sources, labs, equipment, organizations, industries, and lifecycle phases. Due to the way the nascent industry developed, there is a lack of standardization for how performance data is recorded, which is now hindering the industry’s ability to learn from data and accelerate growth. Herein, we describe the different types of data, formats, conventions, and standardization for each phase in the battery lifecycle. Next, we provide a standard data format and conventions for the community to either utilize in their data collection practices or map their existing data into: the Voltaiq Data Format (VDF). This standard data format provides the flexibility needed to capture the variability in data formats and conventions along the battery lifecycle. The utility of this standard format aids in collaboration within and across organizations, accelerating innovation across the industry, and paves the way for the battery community to start utilizing the power of machine learning and data science.
Article
Controller Area Network (CAN) is a masterless serial bus designed and widely used for the exchange of mission and time-critical information within commercial vehicles. In-vehicle communication is based on messages sent and received by Electronic Control Units (ECUs) connected to this serial bus network. Although unencrypted, CAN messages are not easy to interpret. In fact, Original Equipment Manufacturers (OEMs) attempt to achieve security through obscurity by encoding the data in their proprietary format, which is kept secret from the general public. As a result, the only way to obtain clear data is to reverse engineer CAN messages. Driven by the need for in-vehicle message interpretation, which is highly valuable in the automotive industry, researchers and companies have been working to make this process automated, fast, and standardized. In this paper, we provide a comprehensive review of the state of the art and summarize the major advances in CAN bus reverse engineering. We are the first to provide a taxonomy of CAN tokenization and translation techniques. Based on the reviewed literature, we highlight an important issue: the lack of a public and standardized dataset for the quantitative evaluation of translation algorithms. In response, we define a complete set of requirements for standardizing the data collection process. We also investigate the risks associated with the automation of CAN reverse engineering, in particular with respect to the security network and the safety and privacy of drivers and passengers. Finally, we discuss future research directions in CAN reverse engineering.
Chapter
Modern vehicles more and more often have a specific life cycle, presented in years or with the possibility of going through a certain mileage without failure. For this purpose, manufacturers perform a series of tests, to confirm the reliability of their products. One of such tests is an accelerated durability test, with the use of modern simulation stands that imitate road conditions. This article presents the methodology of durability tests for commercial vehicles, using the eight poster, inertia reacted, road simulator MTS 320. The first step is to collect reference road data. For this purpose, the vehicle is equipped with a set of sensors. Gathered data was recreated on the road simulator. Authors prove the convergence of the obtained results, in comparison to the data collected from the road, at the level of up to 99% of root mean square value of the reference signals to the reconstructed signals. Tests are performed for loaded and unloaded vehicles. The result of the test is to confirm the reliability of the manufactured vehicle. During the test, the supporting structure of the vehicle with suspension as well as the functional reliability of the vehicle are assessed. Moreover, the key issue is to correctly determine the duration of the test and to set test parameter values. The development of this test technology is response to the need, to quickly and reliably check vehicles or their components, before implementing them into serial production.
Chapter
The air transport industry is characterized by the consequences of the speedy and momentous impacts of surrounding actions and economic and social changes. An airport hub serves as a center point that connects everyone and everything. Because they protect the financial interests of airlines and satisfy the connectivity requirements of both passengers and cargo, hubs continue to play a significant role in aviation. An efficient hub airport with enough extra space will increase passenger options and encourage airline competition by allowing additional competitors, routes, and frequencies. The airport networks need to apply the developed management and use its features to try to arrive at the optimum results and used its facilities such as: (geographical, capital, the ability for multi-model transports, ready to apply with the future technology, ready to welcome the companies such as FedEx and DHL, applying e-airport with e-airline with e- freight with e-AWB). This helps to achieve maximum growth and to be able to face rapid growth. Future airport networks need to reach all sites, so they should employ several different transmission technologies. Accordingly, this paper aims to compare the competitiveness of the busiest hub in each continent or region in 2019 for cargo and passengers which will be Hong Kong airport (HKIA) for Asia, Frankfurt hub (Fraport) for Europe, in Africa Addis Ababa air hub (ADD) and Cairo international airport hub (CAI), while for middle east will be Dubai airport (DXB).
Article
Characterizing human driver's driving behaviors from GPS trajectories is an important yet challenging trajectory mining task. Previous works heavily rely on high-quality GPS data to learn such driving style representations through deep neural networks. However, they have overlooked the driving contexts that greatly govern drivers' driving activities and the data sparsity issue of practical GPS trajectories collected at a low-sampling rate. Besides, existing works omit the cold start problem, where the newly joined drivers usually have insufficient data to learn accurate driving style representations. To address these limitations, we present an adversarial driving style representation learning approach, named Radar. In addition to summarizing statistic features from raw GPS data, Radar also extracts contextual features from three aspects of road condition, geographic semantic, and traffic condition. We exploit the advanced semi-supervised generative adversarial networks to construct our learning model. By jointly considering statistic features and contextual features, the trained model is able to efficiently learn driving style representations from practical GPS trajectory data. Furthermore, we enhance Radar's representation learning for drivers owning limited training data with some basic data augmentation strategies and a novel auxiliary driver based data augmentation method. Experiments on two benchmark applications, i.e., driver identification and driver number estimation, with a large real-world GPS trajectory dataset demonstrate that Radar can outperform the state-of-the-art approaches by learning more effective and accurate driving style representations.
Article
Driver identification in connected transportation is useful for usage-based insurance, personalized assisted driving, fleet management etc. Capturing the driving style from data behind the wheel benefits such applications without requiring extra costs and offending drivers’ biometric fingerprint privacy (e.g., facial recognition). However, the driver group to be identified may change, which leads to poor learning ability in a method based on depth representations of driving styles for new drivers and produces a sharp decline in generalization ability. Meta-learning is an exciting subfield of machine learning that equips deep learning models with the ability to learn, especially when the given data sample is very limited. This paper addresses a distinctively novel driver identification problem, where the prior model is supposed to quickly adapt to varying numbers of drivers, especially when few examples are available. First, based on a public driving dataset, a set of training and testing tasks is specifically designed for few-shot driver identification. Then, based on the popular model-agnostic meta-learning (MAML) framework, a feature autoencoder regularized learner is proposed to avoid the commonly encountered memorization problem and improve the generalization ability of the identification model. Three versions of meta-models are derived concerning computation and classification effectiveness. Finally, the experimental results show that the proposed method is superior to the previous baselines.
Article
The autonomous driving industry has mushroomed over the past decade. Although autonomous driving has undoubtedly become one of the most promising technologies of this century, its development faces multiple challenges, of which security is the major concern. In this paper, we present a thorough analysis of autonomous driving security. At first, the attack surface of autonomous driving is presented. After an analysis of the operation of autonomous driving in terms of key components and technologies, the security of autonomous driving is elaborated in four dimensions: sensors, operating system, control system, and vehicle-to-everything communication. Sensor security is examined from five components which are mainly responsible for self-positioning and environmental perception. The analysis of operating system security, the second dimension, is concentrated on the robot operating system. Concerning the control system security, controller area network is approached mainly from vulnerabilities and protection measures. The fourth dimension, vehicle-to-everything communication security, is probed from four categories of attacks: authenticity/identification, availability, data integrity, and confidentiality with corresponding solutions. Moreover, the drawbacks of existing methods adopted in the four dimensions are also provided. Finally, a conceptual multi-layer defense framework is proposed to secure the information flow from external communication to the physical autonomous vehicle.
Article
A vast increase in automotive electronic systems, coupled with related demands on power and design, has created an array of new engineering opportunities and challenges. Today's high-end vehicles may have more than 4 kilometers of wiring, compared to 45 meters in vehicles manufactured in 1955. Reducing wiring mass through in-vehicle networks will bring an explosion of new functionality and innovation. Our vehicles will become more like PCs, creating the potential for a host of plug-and-play devices. On average, US commuters spend 9 percent of their day in an automobile. Thus, introducing multimedia and telematics to vehicles will increase productivity and provide entertainment for millions. Further, X-by-wire solutions will make computer diagnostics a standard part of mechanics' work and may even create an electronic chauffeur.
Article
As automotive electronics continue to advance, cars are becoming more and more reliant on sensors to perform everyday driving operations. These sensors are omnipresent and help the car navigate, reduce accidents, and provide comfortable rides. However, they can also be used to learn about the drivers themselves. In this paper, we propose a method to predict, from sensor data collected at a single turn, the identity of a driver out of a given set of individuals. We cast the problem in terms of time series classification, where our dataset contains sensor readings at one turn, repeated several times by multiple drivers. We build a classifier to find unique patterns in each individual's driving style, which are visible in the data even on such a short road segment. To test our approach, we analyze a new dataset collected by AUDI AG and Audi Electronics Venture, where a fleet of test vehicles was equipped with automotive data loggers storing all sensor readings on real roads. We show that turns are particularly well-suited for detecting variations across drivers, especially when compared to straightaways. We then focus on the 12 most frequently made turns in the dataset, which include rural, urban, highway on-ramps, and more, obtaining accurate identification results and learning useful insights about driver behavior in a variety of settings.
Conference Paper
In this work, we introduce a real-time driver activity recognition method which takes a sequence of depth images as input and outputs an activity class among a predetermined set of driver activities. A classification algorithm called Random Forests is employed and further enhanced by a unique state based inference system to reduce initial classifier errors. For example, frequent changes in driver activities are penalized so as to stabilize the output. The cost of activity change is decided by a state inference system which takes both temporal and spatial coherence into account. The paper will introduce the training system, explain the state inference system and the cost based penalty calculation. Finally we will discuss the results and future work.
Article
Individual differences among drivers are one of the key factors influencing the accuracy of driving behavior models, so an accurate model should capture the effect of individual differences on driving behavior. The overtaking process was chosen as the research object for studying the individual characteristics of driving behavior. Each driver's accelerator and steering wheel operation data were analyzed as time series. Based on both types of operation data, a hidden Markov model (HMM) was employed to model the individual characteristics of driving behavior. Two individual models were built for each driver, one trained on accelerator data and one on steering wheel angle data. The models can be used to identify different drivers, with an accuracy of up to 85%. The results show that individual difference is a factor that cannot be ignored in driving behavior models and that the HMM is effective in modeling it.
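Identification with per-driver HMMs can be sketched as scoring an observation sequence under each driver's model and picking the best score. Everything concrete below is an assumption for illustration: the abstract does not say how signals are discretized or how the two models per driver are combined, so this sketch uses one hand-set discrete-observation model per hypothetical driver and the scaled forward algorithm.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = sum(alpha)
    log_prob = math.log(total)
    alpha = [a / total for a in alpha]  # rescale to avoid underflow
    for symbol in obs[1:]:
        alpha = [emit[j][symbol] * sum(alpha[i] * trans[i][j] for i in range(n_states))
                 for j in range(n_states)]
        total = sum(alpha)
        log_prob += math.log(total)
        alpha = [a / total for a in alpha]
    return log_prob


# Hypothetical models: (start, transition, emission). Symbol 0 is a small
# pedal change and symbol 1 a large one. The numbers are invented, not trained.
models = {
    "smooth": ([0.8, 0.2], [[0.9, 0.1], [0.5, 0.5]], [[0.9, 0.1], [0.4, 0.6]]),
    "abrupt": ([0.2, 0.8], [[0.5, 0.5], [0.1, 0.9]], [[0.6, 0.4], [0.1, 0.9]]),
}


def identify(obs):
    """Return the driver whose model assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


observed = [0, 0, 1, 0, 0, 0]
print(identify(observed))
```

In practice the model parameters would be estimated from each driver's data, and emission probabilities of exactly zero would need smoothing so that the logarithm stays defined.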
Chapter
In this chapter, driver characteristics under driving conditions are extracted through spectral analysis of driving signals. We assume that characteristics of drivers while accelerating or decelerating can be represented by “cepstral features” obtained through spectral analysis of gas and brake pedal pressure readings. Cepstral features of individual drivers can be modeled with a Gaussian mixture model (GMM). Driver models are evaluated in driver identification experiments using driving signals of 276 drivers collected in a real vehicle on city roads. Experimental results show that the driver model based on cepstral features achieves a 76.8% driver identification rate, resulting in a 55% error reduction over a conventional driver model that uses raw gas and brake pedal operation signals. Key words: driving behavior, driver identification, pedal pressure, spectral analysis, Gaussian mixture model
Article
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
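The procedure in this abstract (bootstrap replicates of the learning set, one predictor per replicate, then averaging or a plurality vote) can be sketched as follows. The base learner is a one-split decision stump chosen here only for brevity, and the data set, function names and number of versions are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap(learning_set, rng):
    """Draw a replicate of the learning set: same size, sampled with replacement."""
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]


def fit_stump(train):
    """Fit a one-split classifier on (x, label) pairs by minimizing training errors."""
    best = None
    for t in sorted({x for x, _ in train}):
        left = [y for x, y in train if x <= t]
        right = [y for x, y in train if x > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label


def aggregate(predictions):
    """Average numerical outcomes; take a plurality vote over class labels."""
    if all(isinstance(p, (int, float)) for p in predictions):
        return mean(predictions)
    return Counter(predictions).most_common(1)[0][0]


def bagged_predict(learning_set, fit, x, n_versions, rng):
    """Train one predictor per bootstrap replicate and aggregate their outputs."""
    return aggregate([fit(bootstrap(learning_set, rng))(x) for _ in range(n_versions)])


data = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (1.6, "b"), (1.8, "b"), (2.0, "b")]
rng = random.Random(1)
low = bagged_predict(data, fit_stump, 0.1, n_versions=51, rng=rng)
high = bagged_predict(data, fit_stump, 1.9, n_versions=51, rng=rng)
print(low, high)
```

A stump's threshold shifts noticeably from one replicate to the next, which is the kind of instability the abstract identifies as the condition under which aggregation helps.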
Article
All drivers have habits behind the wheel. Different drivers vary in how they hit the gas and brake pedals, how they turn the steering wheel, and how much following distance they keep to follow a vehicle safely and comfortably. In this paper, we model such driving behaviors as car-following and pedal operation patterns. The relationship between following distance and velocity mapped into a two-dimensional space is modeled for each driver with an optimal velocity model approximated by a nonlinear function or with a statistical method of a Gaussian mixture model (GMM). Pedal operation patterns are also modeled with GMMs that represent the distributions of raw pedal operation signals or spectral features extracted through spectral analysis of the raw pedal operation signals. The driver models are evaluated in driver identification experiments using driving signals collected in a driving simulator and in a real vehicle. Experimental results show that the driver model based on the spectral features of pedal operation signals efficiently models driver individual differences and achieves an identification rate of 76.8 % for a field test with 276 drivers, resulting in a relative error reduction of 55 % over driver models that use raw pedal operation signals without spectral analysis. Proceedings of the IEEE. v.95, n.2, 2007, p.427-437
Varghese, J.Z., and Boone, R.G., "Overview of Autonomous Vehicle Sensors and Systems," Proceedings of the 2015 International Conference on Operations Excellence and Service Engineering.
T. Wakita, K. Ozawa, C. Miyajima, K. Igarashi, I. Katunobu, K. Takeda, and F. Itakura. Driver identification using driving behavior signals. IEICE TRANSACTIONS on Information and Systems, 89(3):1188-1194, 2006.