A method for simplifying the analysis of traffic accidents injury severity on two-lane highways using Bayesian networks

TRYSE Research Group, Department of Civil Engineering, University of Granada, Spain.
Journal of safety research (Impact Factor: 1.34). 10/2011; 42(5):317-26. DOI: 10.1016/j.jsr.2011.06.010
Source: PubMed


This study describes a method for reducing the number of variables frequently considered in modeling the severity of traffic accidents. The method's efficiency is assessed by constructing Bayesian networks (BN).
The method is based on a two-stage selection process. Several variable selection algorithms commonly used in data mining are applied to select subsets of variables. BNs are built from the selected subsets, and their performance is compared with that of the original BN (built with all the variables) using five indicators. The BNs that improve the indicators' values are analyzed further to identify the most significant variables (accident type, age, atmospheric factors, gender, lighting, number of injured, and occupant involved). A new BN is then built using these variables; in most cases, the indicators show a statistically significant improvement over the original BN.
It is possible to reduce the number of variables used to model traffic accident injury severity with BNs without reducing the performance of the model.
The study provides safety analysts with a methodology for minimizing the number of variables needed to determine the injury severity of traffic accidents efficiently, without reducing the performance of the model.
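The two-stage selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the most significant variables are extracted from the improving networks, so the frequency rule used here (keep variables that recur in the improving subsets) is an assumption, as are the function name `two_stage_select`, the `min_share` parameter, and the toy scores. The `evaluate` callable stands in for building a BN on a subset and returning one performance indicator.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Sequence


def two_stage_select(
    candidate_subsets: Sequence[frozenset[str]],
    evaluate: Callable[[frozenset[str]], float],
    all_variables: frozenset[str],
    min_share: float = 0.5,
) -> list[str]:
    """Return variables that recur in subsets scoring above the full model."""
    baseline = evaluate(all_variables)  # model built with every variable

    # Stage 1: keep only the subsets whose model beats the full-variable model.
    improving = [s for s in candidate_subsets if evaluate(s) > baseline]
    if not improving:
        return sorted(all_variables)  # nothing improved: keep the original set

    # Stage 2 (assumed rule): keep variables present in at least
    # `min_share` of the improving subsets.
    counts = Counter(v for s in improving for v in s)
    threshold = min_share * len(improving)
    return sorted(v for v, n in counts.items() if n >= threshold)


# Hypothetical data: variable names and scores are made up for illustration.
ALL_VARS = frozenset(
    {"accident_type", "age", "gender", "lighting", "month", "pavement_width"}
)
SUBSETS = [
    frozenset({"accident_type", "age", "lighting"}),
    frozenset({"accident_type", "gender", "lighting"}),
    frozenset({"month", "pavement_width"}),
]
TOY_SCORES = {ALL_VARS: 0.58, SUBSETS[0]: 0.61, SUBSETS[1]: 0.60, SUBSETS[2]: 0.52}


def toy_score(subset: frozenset[str]) -> float:
    """Stand-in for 'build a BN on this subset and measure one indicator'."""
    return TOY_SCORES[subset]


selected = two_stage_select(SUBSETS, toy_score, ALL_VARS, min_share=1.0)
print(selected)
```

In practice the comparison in stage 1 would use all five indicators and a statistical test rather than a single score, as the abstract states; the single-score comparison here only keeps the sketch short.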

    • "One of the main problems of the accident data was considered as the heterogeneity [4] in the previous studies. A great many methods had been proposed to solve this problem, such as latent class clustering [5] [6] [7] [8], Bayesian networks (BNs) [9] [10] [11], and continuous risk profile (CRP) [12] [13]. One of the commonalities between these methods is that historical accident data, more specifically, recorded historical accident data, were used as the basic data. "
    ABSTRACT: Automatic traffic accident detection, especially of accidents not recorded by traffic police, is crucial to accident black spot identification and traffic safety. A new method of detecting traffic accidents is proposed based on temporal data mining, which can identify accidents that are unknown to, and unrecorded by, traffic police. A time series model was constructed using ternary numbers to reflect the state of traffic flow based on the cell transmission model. In order to deal with the aftereffects of linear drift between time series and to reduce the computational cost, the discrete Fourier transform was used to convert the time series from the time domain to the frequency domain. The pattern of the time series when an accident happened could be recognized using the historical crash data. Then, taking Euclidean distance as the similarity evaluation function, similarity data mining of the transformed time series was carried out. If the result was less than the given threshold, the two time series were considered similar and an accident had probably happened. A numerical example was carried out and the results verified the effectiveness of the proposed method.
    Full-text · Article · Jun 2014 · Mathematical Problems in Engineering
    • "The values of accuracy range from 64.0% in C1 to 55.1% in C4. These values are within the same range found in previous studies (Abdelwahab and Abdel-Aty, 2001; De Oña et al., 2011; Mujalli and De Oña, 2011) that used classification techniques for similar objectives. Table 4 shows that only C1 (64.0%) achieved a statistically significant improvement of accuracy as compared with results obtained for the EDB (59.5%). "
    ABSTRACT: One of the principal objectives of traffic accident analyses is to identify key factors that affect the severity of an accident. However, with the presence of heterogeneity in the raw data used, the analysis of traffic accidents becomes difficult. In this paper, Latent Class Cluster (LCC) is used as a preliminary tool for segmentation of 3229 accidents on rural highways in Granada (Spain) between 2005 and 2008. Next, Bayesian Networks (BNs) are used to identify the main factors involved in accident severity for both the entire database (EDB) and the clusters previously obtained by LCC. The results of these cluster-based analyses are compared with the results of a full-data analysis. The results show that the combined use of both techniques is very interesting, as it reveals further information that would not have been obtained without prior segmentation of the data. BN inference is used to obtain the variables that best identify accidents with killed or seriously injured. Accident type and sight distance have been identified in all the cases analysed; other variables such as time, occupant involved or age are identified in the EDB and in only one cluster; whereas the variables vehicles involved, number of injuries, atmospheric factors, pavement markings and pavement width are identified in only one cluster.
    Full-text · Article · Nov 2012 · Accident; analysis and prevention
    • "Taking into consideration the indicators used to evaluate the goodness of a classification method in de Oña et al. (2011) and Mujalli and de Oña (2011), and that the variable class used shows 2 possible response categories (state A and state B), the parameters that can be defined are described below:
        • Accuracy - The method's precision, defined as the percentage of cases correctly classified by the classifier.
        • Sensitivity - The proportion of cases correctly classified as state A among all the observed as state A.
        • Specificity - The proportion of cases correctly classified as state B among all the observed as state B.
        • Receiver Operating Characteristic Curve (ROC) Area - This indicator represents the curve of positive cases correctly classified (sensitivity), as opposed to the cases of false positives (1-specificity), in such a way that a value 1 describes a perfect adjustment. "
    ABSTRACT: Given the current number of road accidents, the aim of many road safety analysts is to identify the main factors that contribute to crash severity. To pinpoint those factors, this paper shows an application that applies some of the methods most commonly used to build decision trees (DTs), which have not been applied to the road safety field before. An analysis of accidents on rural highways in the province of Granada (Spain) between 2003 and 2009 (both inclusive) showed that the methods used to build DTs serve our purpose and may even be complementary. Applying these methods has enabled potentially useful decision rules to be extracted that could be used by road safety analysts. For instance, some of the rules may indicate that women, contrary to men, increase their risk of severity under bad lighting conditions. The rules could be used in road safety campaigns to mitigate specific problems. This would enable managers to implement priority actions based on a classification of accidents by types (depending on their severity). However, the primary importance of this proposal is that other databases not used here (i.e. other infrastructure, roads and countries) could be used to identify unconventional problems in a manner easy for road safety managers to understand, as decision rules.
    Full-text · Article · Sep 2012 · Accident; analysis and prevention
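
The four indicators defined in the last citing excerpt above (accuracy, sensitivity, specificity, ROC area) can be computed for a two-state class variable as in the following minimal Python sketch. The function name `classification_indicators`, the 0.5 cut-off, and the sample data are assumptions made for illustration. State A is encoded as `True`, state B as `False`, and `y_score` is the predicted probability of state A. The ROC area is computed through its standard pairwise interpretation: the share of (A, B) pairs in which the A case receives the higher score, with ties counted as one half.

```python
def classification_indicators(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity and ROC area for a binary classifier.

    Assumes both states occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_score))

    # Confusion-matrix counts at the chosen cut-off (state A is the positive class).
    tp = sum(1 for t, s in pairs if t and s >= threshold)
    fn = sum(1 for t, s in pairs if t and s < threshold)
    tn = sum(1 for t, s in pairs if not t and s < threshold)
    fp = sum(1 for t, s in pairs if not t and s >= threshold)

    # ROC area as the fraction of (A, B) pairs ranked correctly by score.
    pos = [s for t, s in pairs if t]
    neg = [s for t, s in pairs if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "roc_area": wins / (len(pos) * len(neg)),
    }


# Hypothetical observations: three state-A cases, two state-B cases.
y_true = [True, True, True, False, False]
y_score = [0.9, 0.6, 0.4, 0.5, 0.2]
indicators = classification_indicators(y_true, y_score)
print(indicators)
```

Accuracy, sensitivity and specificity depend on the chosen cut-off, whereas the ROC area summarises performance over all cut-offs, which is why the excerpt lists it separately.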