All content in this area was uploaded by Haniyeh Ghomi on Apr 27, 2016
•Train speed: The most important factor in all situations. Previous
research reports the same result.
•Illumination: Lighting reduces severity only when the train speed
is below 28 mph. Previous research reports the opposite.
•Age: In all situations and at any train speed, people older than 66
suffer more severe injuries than younger people. Only one previous
study supports this finding; other studies claim the opposite.
•Gender: At low train speeds, the association rules show no
difference in injury severity between males and females. At high
train speeds and at night, however, females suffer more severe
injuries. All previous studies, by contrast, claim that females
suffer more severe injuries in all situations.
•Time: Because of lower alertness and reduced visibility at night,
higher injury severity is expected for night-time collisions. This
result is confirmed by the only previous article that considered
this factor.
•Weather: The presence of snow or rain increases severity because
of the reduced reaction time. A similar result on the influence of
snow on driver injury severity has been reported earlier in the
safety literature.
A data mining approach for identifying key factors affecting the severity of crossing collisions
involving vulnerable road users
Haniyeh Ghomi1, Morteza Bagheri*1, Liping Fu2 & Luis F. Miranda-Moreno3 (Paper No. 15-4543)
(1) School of Railway Engineering, Iran University of Science and Technology (2) Department of Civil & Environmental Engineering, University of Waterloo (3) Department of Civil and Applied Mechanics, McGill University
*morteza.bagheri@iust.ac.ir
INTRODUCTION
CONCLUSION
ACKNOWLEDGMENTS
Data mining is a relatively new approach to analyzing collision
severity and identifying the interaction effects of multiple factors.
In this study, two popular data mining techniques were examined for
identifying variables that influence the injury severity of
vulnerable road users (VRU) in highway-railroad grade crossing
(HRGC) collisions. CART was the primary technique, and Association
Rules served as a supporting technique for identifying interactions
between variables. The classification tree and Association Rules
results consistently show that the most influential collision
factors are train speed, collision time, VRU age and gender,
illumination, weather conditions, and several interactions among
these factors.
In highway-railroad grade crossing (HRGC) collisions, all types of
road users are at risk of being injured or killed; however, fatality
rates differ notably between road user groups. In particular,
vulnerable road users (VRU) such as pedestrians and cyclists are at
greater risk than vehicle occupants. The major proportion of
fatalities and serious injuries at HRGCs involves pedestrians or
cyclists who choose to walk or ride through the crossing. People
living close to a railway line and children who play around HRGCs
are more exposed to the hazards associated with these collisions.
Table 1 VRU share at HRGC collisions (FRA)
To the best of our knowledge this is the first work that
incorporates the detailed analysis of VRU injury severity factors
including the interaction of multiple factors.
The VRU injury severity distribution in the database (797 collisions)
is as follows: fatality (56.9%), injury (31.8%), and no injury
(11.1%). Train speed, time of collision, illumination at night, VRU
age and gender, and weather conditions were associated with fatal
collisions, as shown by both the CART algorithm and Association Rules.
Table 2 Output of Association Rules
For this purpose, we used the U.S. Federal Railroad Administration
(FRA) database from 2007 to 2013.
Classification and Regression Tree (CART)
A decision tree resembles an inverted tree: the root node is at the top
and the leaves are at the bottom. At each node a test is applied, and,
based on its result, each record is directed to one of the two lower
branches.
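To illustrate how a single binary split can be scored, the following minimal sketch computes the Gini impurity decrease of the first train-speed split, using the class counts reported for Node 0, Node 1, and Node 2 of the CART output. The poster does not state which splitting criterion was used; Gini impurity is assumed here because it is the usual CART choice, and the function names are hypothetical. Node 1 is assumed to be the <=25 mph branch.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_decrease(parent, children):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


# Class counts (injury, fatality, no injury) as reported for the tree nodes.
node0 = (254, 454, 89)  # root: all 797 collisions
node1 = (110, 50, 18)   # assumed: train speed <= 25 mph
node2 = (144, 404, 71)  # assumed: train speed > 25 mph

decrease = impurity_decrease(node0, [node1, node2])
print(f"Gini(root) = {gini(node0):.3f}")
print(f"Impurity decrease of the train-speed split = {decrease:.3f}")
```

A tree-growing procedure of this kind would evaluate such a score for every candidate split and keep the one with the largest decrease.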
Association Rules
Association Rules employ if/then statements to discover relationships
in a database. They focus on evaluating features that occur together.
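As a rough illustration of the two measures usually attached to an if/then rule, the sketch below computes support and confidence under their common textbook definitions: support is the fraction of records containing both antecedent and consequence, and confidence is that fraction divided by the support of the antecedent alone. The records are hypothetical and are not FRA data; the item labels and function names are likewise assumptions made for the example.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)


# Hypothetical collision records (not FRA data); each is a set of attribute values.
records = [
    {"speed>60", "night", "fatality"},
    {"speed>60", "day", "fatality"},
    {"speed>60", "night", "injury"},
    {"speed<=25", "day", "injury"},
    {"speed<=25", "night", "injury"},
]

# Rule: if train speed > 60 mph, then fatality.
rule_support = support(records, {"speed>60", "fatality"})
rule_confidence = confidence(records, {"speed>60"}, {"fatality"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```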
RESULTS
The first and second authors are members of the Transportation
Systems and Logistics (TSL) laboratory at Iran University of
Science and Technology and acknowledge Rail Pardaz Seir
company. The third and fourth authors are from University of
Waterloo and McGill University respectively, and acknowledge
Transport Canada for supporting this article.
Node statistics of the CART tree (I = injury, K = fatality, N = no injury)
Node  I            K            N           Total
0     254 (31.8%)  454 (56.9%)  89 (11.1%)  797 (100%)
5     76 (29.3%)   155 (59.8%)  28 (10.8%)  259 (32.4%)
6     68 (18.8%)   249 (69.1%)  43 (11.9%)  360 (45.1%)
2     144 (23.2%)  404 (65.2%)  71 (11.4%)  619 (77.6%)
1     110 (61.7%)  50 (28%)     18 (10.1%)  178 (22.3%)
3     55 (70.5%)   18 (23%)     5 (6.4%)    78 (9.3%)
7     110 (61.7%)  50 (28%)     18 (10.1%)  100 (12.5%)
8     110 (61.7%)  50 (28%)     18 (10.1%)  100 (12.5%)
4     55 (55%)     32 (32%)     13 (13%)    100 (12.5%)
9     21 (42%)     21 (42%)     8 (16%)     50 (6.2%)
10    34 (68%)     11 (22%)     5 (10%)     50 (6.2%)
11    71 (31%)     131 (57.2%)  27 (11.7%)  229 (28.7%)
12    5 (16.6%)    24 (80%)     1 (3.3%)    30 (3.7%)
13    43 (20%)     145 (67.7%)  26 (12.1%)  214 (26.8%)
14    25 (17.1%)   104 (71.2%)  17 (11.6%)  146 (18.3%)
27    43 (20%)     145 (67.7%)  26 (12.1%)  214 (26.8%)
28    25 (17.1%)   104 (71.2%)  17 (11.6%)  146 (18.3%)
23    43 (20%)     145 (67.7%)  26 (12.1%)  214 (26.8%)
24    25 (17.1%)   104 (71.2%)  17 (11.6%)  146 (18.3%)
Split variables and branch labels, in the order they appear in the tree
Train Speed: <=25 mph / >25 mph
Train Speed: 25-40 mph / >40 mph
Age: <=48 / >48
Time: 6 AM-7 PM / 7 PM-6 AM
Age: <=66 / >66
Time: 12 AM-7 PM / 7 PM-12 AM
Age: <=48 / >48
Lights: no / yes
Gender: male / female
Year  Total HRGC collisions  VRU collisions  Total HRGC fatalities/injuries  VRU fatalities/injuries
2007  2778                   110             339                             59
2008  2429                   131             291                             63
2009  1933                   112             249                             60
2010  2051                   143             262                             81
2011  2059                   130             269                             82
2012  1970                   132             272                             85
2013  1870                   140             258                             96
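The VRU share implied by these yearly counts can be worked out directly. The sketch below assumes the columns mean what their headers suggest (total HRGC collisions, VRU collisions, total fatalities/injuries, VRU fatalities/injuries); the variable and function names are hypothetical. Under that reading, VRUs account for roughly 4% of collisions but about 17% of fatalities/injuries in 2007.

```python
# (year, total HRGC collisions, VRU collisions,
#  total HRGC fatalities/injuries, VRU fatalities/injuries)
table1 = [
    (2007, 2778, 110, 339, 59),
    (2008, 2429, 131, 291, 63),
    (2009, 1933, 112, 249, 60),
    (2010, 2051, 143, 262, 81),
    (2011, 2059, 130, 269, 82),
    (2012, 1970, 132, 272, 85),
    (2013, 1870, 140, 258, 96),
]


def vru_shares(rows):
    """Return {year: (VRU % of collisions, VRU % of fatalities/injuries)}."""
    return {
        year: (100 * vru_coll / total_coll, 100 * vru_cas / total_cas)
        for year, total_coll, vru_coll, total_cas, vru_cas in rows
    }


shares = vru_shares(table1)
for year, (coll_pct, cas_pct) in shares.items():
    print(f"{year}: {coll_pct:.1f}% of collisions, "
          f"{cas_pct:.1f}% of fatalities/injuries")
```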
OBJECTIVE
The objective of this paper is to identify VRU injury severity factors at
HRGCs using CART and Association Rules algorithms.
Three basic steps were adopted in this study:
1) Identifying the most significant factors affecting the severity of
VRU collisions
2) Uncovering hidden relationships among the significant factors
3) Identifying dangerous situations for VRUs
SIGNIFICANT FACTORS
Figure 1 Output of CART algorithm
METHODOLOGY
Rule ID  Consequence  Support (%)  Confidence (%)  Antecedent
1        Injury       7.6          63.9            train speed (10-25 mph) & illumination (yes)
2        Injury       7.2          56.8            train speed (10-25 mph), gender (male) & weather (Sunny)
3        Injury       6.5          50              train speed (10-25 mph) & illumination (no)
4        Injury       5.3          67.4            train speed < 25 mph & weather (Sunny)
5        Fatality     6.2          76              train speed > 60 mph & weather (not Sunny)
6        Fatality     5.5          75              train speed > 60 mph & illumination (yes)
7        Fatality     5.5          75              train speed (40-60 mph) & age (older than 66)
8        Fatality     6.7          74              train speed (40-60 mph) & gender (female)
9        Fatality     8.5          73.5            train speed (40-60 mph) & weather (not Sunny)
10       Fatality     19.9         70.4            train speed (40-60 mph) & gender (male)
11       Fatality     5            70              train speed (40-60 mph) & age (48-66)
12       Fatality     5.2          69.7            gender (female) & time of collision (7:00 PM-12:00 AM)
13       Fatality     6.7          68.5            time of collision (7:00 PM-12:00 AM) & weather (not Sunny)
14       Fatality     14.6         68.3            train speed (40-60 mph) & illumination (yes)
15       Fatality     12.9         66.9            train speed (40-60 mph) & illumination (no)
16       Fatality     6.6          66              train speed (25-40 mph) & age (older than 66)
17       Fatality     8            65.6            train speed (40-60 mph) & time of collision (7:00 PM-12:00 AM)
18       Fatality     21.4         64.9            train speed (40-60 mph) & weather (Sunny)
19       Fatality     13.9         64.8            gender (male) & age (older than 66)
20       Fatality     8.6          63.7            train speed (25-40 mph) & weather (not Sunny)
21       Fatality     21.8         61.4            train speed (25-40 mph) & gender (male)
22       Fatality     16.1         60.4            train speed (25-40 mph) & illumination (yes)
23       Fatality     19           59.8            gender (male) & time of collision (7:00 PM-12:00 AM)
24       Fatality     5.8          59.5            train speed (25-40 mph) & gender (female)
25       Fatality     5.7          58.6            train speed (40-60 mph) & age (30-48)
26       Fatality     14.5         58.6            train speed (25-40 mph) & illumination (no)
27       Fatality     5.1          58.5            gender (male) & age (older than 66)
28       Fatality     23.8         58.4            train speed (25-40 mph) & weather (Sunny)
29       Fatality     9.7          57.6            train speed (25-40 mph) & time of collision (7:00 PM-12:00 AM)
30       Fatality     20.9         57.4            time of collision (7:00 PM-12:00 AM) & weather (Sunny)
31       Fatality     6.7          57.4            train speed (25-40 mph) & age (48-66)
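One way to read the confidence column is to compare it with how often the consequence occurs overall. The sketch below computes lift (confidence divided by the base rate of the consequence) for three of the rules above. Lift is not reported on the poster; the base rates are taken from the root node of the CART output (454 fatalities and 254 injuries among 797 collisions), on the assumption that the rules were mined from the same records. All names in the code are hypothetical.

```python
# Base rates (%) of each consequence, from the CART root node counts.
TOTAL = 797
base_rate = {
    "Fatality": 100 * 454 / TOTAL,
    "Injury": 100 * 254 / TOTAL,
}

# (rule ID, consequence, confidence %) for three of the rules listed above.
rules = [
    (1, "Injury", 63.9),
    (5, "Fatality", 76.0),
    (30, "Fatality", 57.4),
]


def lift(consequence, confidence_pct):
    """Confidence relative to the overall frequency of the consequence."""
    return confidence_pct / base_rate[consequence]


lifts = {rule_id: lift(cons, conf) for rule_id, cons, conf in rules}
for rule_id, value in lifts.items():
    print(f"rule {rule_id}: lift = {value:.2f}")
```

A lift close to 1, as for rule 30 under these assumptions, would indicate that the antecedent changes the likelihood of the consequence very little, while larger values point to more informative rules.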