Back to School: Impact of Training on Driver Behavior and State in Autonomous Vehicles

Srinath Sibi
Stanford University
Stanford, CA, USA
ssibi@stanford.edu
Stephanie Balters
Stanford University
Stanford, CA, USA
balters@stanford.edu
Ernestine Fu
Stanford University
Stanford, CA, USA
ernestinefu@stanford.edu
Ella G. Strack
Zoox Inc.
(previously Bosch LLC)
Foster City, CA, USA
estrack@zoox.com
Martin Steinert
Norwegian University of
Science And Technology
Trondheim, Norway
martin.steinert@ntnu.no
Wendy Ju
Cornell Tech
New York, NY
wendyju@cornell.edu
Abstract: Many producers of automated vehicle systems
have begun testing autonomous vehicles on the road. In order
to ensure safety and prevent crashes, human drivers are
enlisted to monitor autonomous vehicles. However, operators
of autonomous systems exhibit negative behavior adaptations
in response to prolonged supervision of automation. To prevent
the onset of undesirable behaviors in safety drivers, we must
investigate driver state and behavior changes during the operation
of highly automated vehicles. In the study presented here,
we examine the effects of theoretical and practical training
on the drivers’ response to potentially critical situations in
a longitudinal driving simulator study. We also present the
effects of encountering a failure of the automated vehicle on
driver state and behavior. We conducted a two-part panel
driving simulator study (N=28), with an interval of 20-30 days
between the training and testing sessions. We found that while
participants with training are better prepared for a potential
failure of the automation, participants in both conditions show
a rise in sleepy or drowsy behavior before a potential failure
of automation.
I. INTRODUCTION
Autonomous vehicles (AVs) for widespread public use
bring the promise of safer and more efficient roads, while
simultaneously reducing overall travel time by decreasing
traffic [1]. Currently, vehicle manufacturers are testing
vehicles with automation features in Levels 3 to 5 of the
SAE classification [2], with the goal of eventually reaching
Level 5, or full automation. During on-road testing of AVs,
Safety Drivers (also referred to as Fallback Drivers) are
needed to ensure safety in the event the AV is unable to
operate or experiences a failure; in such an event, the driver
can safely take over and navigate the vehicle. Safety drivers
[2] must monitor the AV’s operations and its surroundings
and, if possible, anticipate failures of the AV system. This
makes understanding and analyzing safety driver training,
behavior and state vital [3].
However, it has been noted that drivers supervising automated
vehicles are imperfect themselves. The simultaneous failure
of the vehicle automation systems and the safety driver can
have disastrous consequences (footnotes 1 and 2). Moreover,
prior research indicates that with the resumption of manual
driving from lower levels of automation, drivers experience
an increase in response time [4] and in secondary task
involvement [5]. In higher levels of automation, drivers
experience increased sleepy and drowsy behavior [6] caused by
reduced cognitive activity [7] associated with underload.
However, this body of AV research is based on studies
conducted with participants who have driving experience, not
with drivers given specific training on the fallibility of
AVs. Hence, it is important to understand whether training of
drivers or increased experience with automation failure can
alleviate the detrimental driver behaviors that occur when
supervising AVs.
Footnote 1: https://www.nbcnews.com/tech/innovation/video-shows-moment-fatal-crash-involving-self-driving-uber-n858921
Footnote 2: https://qz.com/1410928/waymos-self-driving-car-crashed-because-its-human-driver-fell-asleep/

Fig. 1. Participant in the Testing Phase

In this study, we analyze the impact of (1) failure training
and (2) the experience of a near-catastrophic failure on
the AV driver’s ability to recognize a potentially critical
situation and prepare himself or herself. We employed a
1 x 2 (with Failure Training and without Failure Training)
panel study design (N=28). The study was run in two phases,
separated by 20-30 days: the first session was presented as
a training session and the second session as a test session.
In the training session, participants were given theoretical
and practical training using a training document packet and
a driving simulation, respectively. Those who underwent
failure training received an additional section in their
document packet that detailed the likely ways the AV driving
system could fail or suddenly disengage, and how best to
prepare for such an event. Participants who received failure
training were also given a demonstration of sudden silent
automation failure in the training driving simulation. In the
second phase (i.e., the testing session), participants drove
the AV through a 40 minute simulated course. All participants
experienced sudden silent automation failure in the second
driving simulation session. We analyzed the differences in
driver state and driving behavior between the two groups and
within each group over time. The key findings of this study
are stated below.
- Participants who received theoretical and practical failure
  training were more likely to anticipate potential AV
  failure.
- Participants in both conditions showed a decreasing trend
  in their preparedness and vigilance over time.
- The experience of a near-failure of automation caused
  a temporary increase in preparedness and decrease in
  sleepy behavior, but the prior trends resumed soon after.
II. BACKGROUND AND PRIOR WORK
A. Driver Behavior Changes
As more and more automated driving features are included
in cars, driver behaviors change in ways that are not always
expected. Researchers have noted that driver behavior can
take a turn for the worse with lower levels of automation
(between Levels 1-2.5). Rudin-Brown et al. [4] find that
the prolonged use of adaptive cruise control (ACC) corresponds
to increases in the response time to hazard detection tasks.
Similarly, a study conducted by Strand et al. reveals that
drivers using highly automated driving systems maintain lower
minimum time to collision, lower minimum time headway and a
longer response time to a failure of the automation as
compared to drivers who use the lower levels of automation
(ACC) [8]. In contrast, longitudinal research on supervision
of automation by Erika Miller et al., investigating behavioral
adaptations to Level 2 lane-keeping assistance features across
multiple drives on multiple days within a week, indicates
that drivers perform better with automation over time but then
suffer performance losses when assistance is not available
[9], [10]. Several studies also report that automation
correlates with decreased driver arousal, increased driver
drowsiness, slower reaction times and reduced eyes-on-road
gazes [11], [12], [13].
Similar adverse behavioral adaptations are also observed
in the higher levels of automation. Past studies indicate
that drivers who monitor the automated driving system for
lengthy time periods experience drowsy behavior [6], [14].
These incidences of drowsy behavior were investigated using
cortical activity measures [7] and were found to be the result
of prolonged periods of low cortical activity. Research on
vigilance indicates that uncertainty increases the workload
associated with supervision of automation, and the associated
fatigue degrades performance [15], particularly over time [16].
These studies point to a pattern of detrimental driver behaviors
emerging with the use of automated driving systems, making
it vital that any training program used for safety drivers
addresses these issues.
B. Safety Driver Training
Driver training programs have long been an essential tool
in ensuring the safety of drivers. Driver training is ingrained
into our driving culture and is a rite of passage for many
teenagers. The design of an appropriate and effective driver
training program can reduce the risk of crashes and traffic
violations [17]. New and improved driver training programs are
being developed that focus on bettering drivers’ anticipation
of risk, rather than the drivers’ skill in driving at the
limits of tire adhesion. With the advent of automated driving
and the use of safety drivers to supervise the automation to
ensure safety, it is now imperative to study the effectiveness
of training programs for safety drivers.
While OEMs and tech companies commonly release the
number of miles logged by the AVs in simulation and on-road,
there is little to no information provided on the training
given to safety drivers who supervise the automation during
the testing [18]. In the aftermath of a fatal mishap, Uber
announced that it would improve safety driver training. The
report announced the addition of a co-pilot and the inclusion
of, among other aspects, “Fault Injection Training” and
“Software Limitation Training.” In other words, the proposed
training would include training about the failure modes of
the AV. Recently, a new set of guidelines released by the
Automated Vehicle Safety Consortium called for the inclusion
of both theoretical and practical training and, like the Uber
report, called for the inclusion of training on the failure
modes of the AV. Aside from the recommendations mentioned
above, some federal [19] and state policies allude to the need
for safety driver vigilance to ensure safety during AV
testing, making the design of safety driver programs and
investigation of safety driver behavior vital.
C. Measurement of Driver State
Gonçalves et al. define Driver State Monitoring Systems
(DSM) as systems that collect observable information about
the human driver in order to assess the driver’s capability to
perform the driving task in a safe manner [20]. Past research
suggests myriad ways of inferring driver state, such as
video-based detection [21], odometric data [22], and
physiological tools [23], [24]. These techniques have been
successfully applied in the detection of stress, fatigue and
drowsiness [24], and are being increasingly incorporated into
AVs to monitor drivers and ensure safety [3] and comfort
(footnote 3). Physiological sensors offer researchers the
ability to monitor driver state over long time periods with
minimal intrusion, making them ideal tools for driver state
monitoring in AVs. In the current study, we employ
physiological measurement tools to characterize driver
state ahead of latent critical events. We employ galvanic skin
response (GSR) and heart rate (HR) measures to analyze the
preparedness and vigilance displayed by participants ahead
of latent critical events.

Fig. 2. Panel Study Design with Training Phase (left) and Testing Phase (right). Training Phase consists of theoretical and practical training (smaller driving
simulator). Testing phase conducted in full-cab driving simulator
III. GOALS OF THE STUDY
The goal of the study is to answer the following research
questions:
- RQ1: Does prior knowledge from theoretical and practical
  training have an impact on the driver behavior?
- RQ2: How does the experience of a near-failure impact
  the safety driver behavior?
IV. METHOD
A. Overview
The study had a between-subject design: one group was given
failure training and the control group was not. Participants
of each group were required to attend two sessions: the first
session we call the Training Phase, and the second, the
Testing Phase. These sessions were separated in time by 20-30
days to model the attrition of knowledge over time. The
practical training phase was conducted with a small driving
simulator consisting of a driver seat and one small-sized
screen (see Figure 2, left), similar to what might be used in
a driver training class. The testing phase, which is intended
to predict how drivers will perform in subsequent on-road
driving, used a large immersive driving simulator consisting
of a 270-degree wrap-around screen and full-cab vehicle (see
Figure 2, right). A total of N=28 participants, between 18
and 60 years old, completed both phases of the study. All
participants had a minimum of 2 years of on-road driving
experience. Participants were paid $5 for the completion of
the training phase and $25 for the testing phase in Amazon
gift cards. The overall study design is shown in Figure 3.

Footnote 3: https://www.faurecia.com/en/innovation/smart-life-board/comfort-wellness
B. Training Phase
Upon entering the study facility, participants fill out a
consent form and a questionnaire. The questionnaire collects
preliminary demographic information. The training phase
consists of two sub-sections: the theoretical and the practical
training sections. Both groups experience both training
sections. The failure training group is additionally briefed
about potential failures during theoretical training, and further
experiences a silent automation failure during the practical
training phase.
1) Theoretical Training: In the theoretical training phase,
all participants are presented with a document packet
containing information detailing the functions of the AV
subsystems. It contains introductory content on perception,
localization and data fusion subsystems in the AV, along with
elementary information regarding the functionalities of the
RADAR, LIDAR and camera subsystems in the AV. The theoretical
training packet instructs participants on how to engage and
disengage automation using the buttons on the steering wheel,
the brake and the steering. These methods are employed in a
driving simulator in the practical training segment.
The participants who receive theoretical failure training
receive an additional set of documents in the theoretical
training session on the topic of “safety in autonomous
vehicles.” This section contains the reasons why the automated
driving system (used in this experiment) would fail or
disengage. For example, participants are instructed that
missing lane markings or large debris covering the road could
cause the system to disengage automation and suddenly transfer
control to the driver. To prepare for a possible control
transition and to avoid crashes due to sudden automation
disengagement, participants are instructed that they should
hold the steering wheel and place their foot on the brake
pedal if they suspect that the automation will disengage.

Fig. 3. (Left) Course Design with practice and baseline segments, as well as test events, shown along a timeline. (Right) Temporal structure for each critical
event. (Figures not to scale)

Fig. 4. Examples of information provided in the theoretical training packet.
The full training document packet is available in the repository indicated in
the theoretical training sub-section.
After participants finish reading the theoretical training
packet, they are given a multiple-answer multiple-choice test
to ensure that they have read and familiarized themselves with
the contents of the training packet. The test contains a total
of 8 questions, 3 of which test the participants on the safety
section. We ensure that the participants who receive
theoretical failure training are well versed in the contents
of the safety training section by making them re-read the
training packet until they are able to answer all the
questions in the test about the safety section accurately
(i.e., they have to redo the test until all questions are
answered correctly). This marks the end of the theoretical
training section; participants then begin the practical
training section. The material used in the theoretical
training section, along with the test administered to
participants, is included in a GitHub repository (footnote 4).
2) Practical Training: In the practical training, all
participants practice operating the automated vehicle using a
driving simulator (Figure 2) with automated driving
capabilities. Participants are given a short introduction on
how to engage and disengage the automation. In order to engage
automation, they press a green button on the steering wheel.
To disengage automation, they can press a red button on the
steering wheel, step on the brake, or turn the steering wheel
by a minimum of 15 degrees. (The minimum steering wheel angle
for disabling the automation is set to 15 degrees to avoid any
accidental disengagements by participants.) All participants
practice engaging and disengaging the automation at least
twice using the methods listed above, until they feel
familiar with the transfers of control. While the practical
training phase uses a small driving simulator and the testing
phase uses an immersive large driving simulator, both
simulators are based on the same simulation software and
automated system capabilities (see Figure 2). The methods to
engage and disengage automation are consistent across the two
driving simulators. This provides the participants the
requisite practical training to operate the automated vehicle
in the testing phase.
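
The engage and disengage rules described above can be sketched as a small state update. This is an illustrative sketch only, not the simulator's actual code: the `DriverInput` type, its field names, and the choice to treat the 15 degree threshold as inclusive and symmetric for left and right turns are assumptions.

```python
from dataclasses import dataclass

# Threshold stated in the text; treating it as inclusive is an assumption.
STEERING_DISENGAGE_DEG = 15.0


@dataclass
class DriverInput:
    """Hypothetical snapshot of driver controls at one simulation step."""
    green_button: bool = False        # engage button on the steering wheel
    red_button: bool = False          # disengage button on the steering wheel
    brake_pressed: bool = False
    steering_angle_deg: float = 0.0   # signed steering wheel angle


def next_automation_state(engaged: bool, inp: DriverInput) -> bool:
    """Return True if automation is engaged after applying the driver input."""
    if engaged:
        # Any one of the three disengagement methods hands control back.
        disengage = (
            inp.red_button
            or inp.brake_pressed
            or abs(inp.steering_angle_deg) >= STEERING_DISENGAGE_DEG
        )
        return not disengage
    # In manual mode, only the green button engages automation.
    return inp.green_button
```

A steering input of 10 degrees leaves automation engaged under this sketch, which mirrors the stated purpose of the threshold: small, accidental wheel movements should not trigger a disengagement.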
During the practical training, the AV encounters a section
of road where the lane markers are faded. The participants
who receive practical failure training experience a
sudden disengagement of the automation due to the absence
of lane markers at this location (shown in Figure 5), in a
manner similar to how the automated driving system will
fail in the subsequent testing phase. Participants who do not
receive the practical failure training do not experience this
disengagement in automation.
C. Testing Phase
1) Measurements: During the study, participants’ ECG
signals, galvanic skin response and video data are gathered.
ECG data are collected using the Shimmer3 ECG device
with Ag/AgCl electrodes placed over the chest. The galvanic
skin response is recorded using the Shimmer3 GSR unit with
a Photoplethysmograph (PPG) attachment. The Ag/AgCl
electrodes are placed on the forearm and the PPG sensor is
attached to the ear lobe, as seen in Figure 1. The
physiological data is set to record at a 512 Hz sampling
frequency.

Footnote 4: https://github.com/srinathsibi/Safety-Driver-Failure-Training-Materials.git

Fig. 5. Critical Event in Practical Training with faded lane markings
2) Testing Course Design: At the start of the testing
session drive, participants are told that they are driving to San
Francisco in their AV. Participants are instructed to closely
follow signs along the side of the road guiding them to their
destination and other pertinent driving information such as
designated lanes for AVs. They are informed that the AV is
programmed to follow directions to the destination and obey
the road signs, but participants are instructed that they can
take over control if they feel it necessary, using the methods
learned in the training phase.
The total drive time in the testing phase is approximately
45 minutes depending on participant driving speeds.
The drive begins with a short, 3-4 minute practice
segment for the participants to acclimate to driving in the
larger full-cab simulator. During this segment, participants are
asked to operate the vehicle in manual and automated driving
modes by taking over and handing off control from and to the
automated driving system using the methods covered during
the training phase.
3) Event Design: During the main portion of the testing
phase drive, participants encounter eight events in the
simulator test course. In all eight events, the lane markings
are either occluded or not available. The design of the course
with the practice segment and the eight individual events is
shown in Figure 3. Each event is separated from the previous
event by at least two minutes. In each event, the AV
encounters missing/faded lane markings. While the cause for
the absence or occlusion of the lane markings is different
across the events, the overall temporal structure of all the
events is the same (Figure 3). Every event is preceded by a
road sign on the side of the road, indicating a potential
critical event in the road ahead. Then, the lane markers fade
or disappear completely, and the AV encounters an obstacle in
the road. The time interval between the appearance of a road
sign and the lane markers’ disappearance is denoted as T1. The
time interval between the disappearance of the lane markers
and the appearance of an obstacle on the road is denoted as
T2. T1 and T2 are intentionally varied between the eight
events to avoid any learning effects between events based on
timing alone.
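
The temporal structure of an event (road sign, then T1, then lane marker loss, then T2, then obstacle) can be written down as a small data structure. This is a hedged sketch: the class name, field names, and the reading of "separated by at least two minutes" as the gap from one event's obstacle to the next event's sign are assumptions, and no timings from the actual course are used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CriticalEvent:
    """Hypothetical timeline of one event, in seconds from the start of the drive."""
    index: int
    sign_time_s: float   # road sign appears
    t1_s: float          # T1: sign appearance -> lane markers disappear
    t2_s: float          # T2: lane markers disappear -> obstacle appears

    @property
    def markers_lost_time_s(self) -> float:
        return self.sign_time_s + self.t1_s

    @property
    def obstacle_time_s(self) -> float:
        return self.markers_lost_time_s + self.t2_s

    def analysis_window(self) -> Tuple[float, float]:
        """The T1 + T2 interval: from the road sign to the obstacle."""
        return (self.sign_time_s, self.obstacle_time_s)


def min_gap_s(events: List[CriticalEvent]) -> float:
    """Smallest gap between one event's obstacle and the next event's sign."""
    ordered = sorted(events, key=lambda e: e.sign_time_s)
    return min(b.sign_time_s - a.obstacle_time_s
               for a, b in zip(ordered, ordered[1:]))
```

With such a structure, a course layout could be checked against the two-minute separation by asserting `min_gap_s(events) >= 120.0`.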
As seen in Figure 3, events 3 and 8 are denoted differently
from the other 6 events. In events 1, 2, 4, 5, 6 and 7, the AV
self-corrects and maneuvers to safety without any intervention
from the driver (participant), thereby successfully navigating a
situation where there were absent or unreadable lane markers.
For example, in event 1, the AV first encounters a warning
sign. On the other side of the hill, the AV’s lane and the lane
markers are covered by rocks due to a landslide. To avoid
hitting the rocks, the AV slowly self-corrects its course by
driving to the left and around the rocks and then back on
to the lane. Events 2, 4, 5, 6 and 7 have a similar structure
where the AV avoids any traffic incidents by identifying and
correcting its path.
Unlike the aforementioned events, events 3 and 8 feature
automation failure. In event 3, the participant encounters a
construction zone in which the lane markers are missing
and a section of the road is closed off for construction with
pylons. The AV begins braking at the last possible moment
to avoid crashing into the pylons and automation disengages
after the vehicle comes to a stop. Participants then have to
manually drive for 30 seconds to a different location to
engage automation, marking the end of event 3. In event 8,
the AV encounters missing lane markers while driving along
the highway. When the highway lanes diverge due to the
presence of a highway island, the AV is unable to follow the
curve in the road and continues to drive straight due to the
absence of lane markers. As a result, without intervention
the AV crashes into trees on the highway island.
This test course design was chosen to enable comparisons
of the participants’ behavior and state during the T1 and T2
intervals across events, and of driver behavior and driver
state before and after event 3. This allows us to compare
participants who received failure training with those who did
not, and to assess the impact of experiencing a near-failure
of the AV on driver behavior and state.
D. Other Events
It is important to note that, in between the eight critical
events, the AV navigates several events such as highway
exits and merges, lane changes, stop signs and traffic lights
with no errors in driving. We chose this setup as the future
paradigm of AVs is one where the safety driver supervises
the automation for long periods of successful driving with
the automation encountering disengagements or failures that
are further and further apart (footnote 5). We built common
traffic events that the AV should navigate easily so that the
narrative of the study was in keeping with the emerging
paradigm in automated driving.
V. RESULTS
A. Preparedness for Critical Event
Each participant’s video was coded for preparedness for
failure, that is, whether the participant prepared for a
possible failure of automation as instructed in the training
phase of the study. In other words, did participants hold the
steering wheel and prepare themselves for a possible
automation disengagement or failure of automation?
Preparedness was coded over the interval T1 + T2 in the study
design figure (Figure 3), i.e., the time interval from the
appearance of the road sign to the appearance of the obstacle
on the road.

Footnote 5: https://www.forbes.com/sites/alanohnsman/2019/02/13/waymo-tops-self-driving-car-disengagement-stats-as-gm-cruise-gains-and-tesla-is-awol/
The fraction of participants in both study conditions who
displayed preparedness for a possible failure of automation
in the interval before an event is shown in Figure 6.
Three interesting results can be observed and are noted
as 1a, 1b and 1c in Figure 6. First, in 1a, we observe
that a greater fraction of participants who received failure
training exhibit increased preparedness ahead of latent critical
events across all events. Next, in 1b, we also observe an
increase in preparedness after event 3 where the automation
experiences a disengagement through a silent failure. Lastly,
in 1c, we observe a marked downward trend in the fraction
of participants who showed preparedness in both study
conditions. This downward trend is temporarily interrupted by
the third critical event. However, the downward trend appears
once again after event 4.
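
The per-event fractions described above amount to a grouped proportion over the video codes. The sketch below assumes the codes are available as `(group, event, prepared)` tuples, one per participant and event; that record layout and the group labels are hypothetical, not the study's data format.

```python
from collections import defaultdict


def preparedness_fraction(codes):
    """Fraction of participants coded as prepared, per (group, event).

    `codes` is an iterable of (group, event, prepared) tuples, with one
    entry per participant and event (assumed layout).
    """
    prepared = defaultdict(int)
    total = defaultdict(int)
    for group, event, is_prepared in codes:
        total[(group, event)] += 1
        prepared[(group, event)] += int(is_prepared)
    return {key: prepared[key] / total[key] for key in total}
```

Plotting these fractions per event, with one line per group, would give a chart of the kind the text attributes to Figure 6.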
B. Sleepy/Drowsy Behavior Before a Critical Event
In the T1 + T2 interval for all events, we coded the
videos for sleepy or drowsy behavior. If participants exhibited
prolonged (> 5 sec) eye closure, yawning or other sleep-related
behavior (e.g., head nodding), they were classified as
displaying sleepy behavior. The fraction of participants who
exhibit sleepy behavior in the T1 + T2 interval for each event
in both conditions is shown in the lower part of Figure 7.
Here we observe two interesting results: First, the fraction
of participants who display drowsy behavior ahead of an event
is not significantly different between both study conditions.
However, both conditions towards event 8 show a marked
increase in drowsy behavior. Approximately a third of
participants in both cases display sleepy or drowsy behavior
before the last critical event. This is indicated as result 2a
in Figure 7. Another interesting observation in the results is
the decrease in the observed incidences of sleepy behavior
immediately following event 3 in both conditions, as denoted
by result 2b in Figure 7.
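
The coding rule for sleepy behavior (eye closure longer than 5 seconds, yawning, or other sleep-related behavior such as head nodding within the T1 + T2 interval) can be sketched as a simple classifier over annotated video segments. The annotation type, the label strings, and the rule that an annotation counts if it overlaps the window are assumptions made for illustration.

```python
from dataclasses import dataclass

EYE_CLOSURE_THRESHOLD_S = 5.0  # "prolonged (> 5 sec) eye closure" from the text


@dataclass
class CodedBehavior:
    """Hypothetical video annotation with start and end times in seconds."""
    kind: str        # assumed labels: "eye_closure", "yawn", "head_nod"
    start_s: float
    end_s: float


def is_sleepy(behaviors, window):
    """Return True if any qualifying behavior overlaps the (start, end) window."""
    start, end = window
    for b in behaviors:
        # Skip annotations that lie entirely outside the T1 + T2 window.
        if b.end_s < start or b.start_s > end:
            continue
        if b.kind == "eye_closure":
            if b.end_s - b.start_s > EYE_CLOSURE_THRESHOLD_S:
                return True
        elif b.kind in {"yawn", "head_nod"}:
            return True
    return False
```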
C. Physiological Data
We re-sampled the GSR and ECG data to 512 Hz to
compensate for the loss of data over the Bluetooth connection
for the Shimmer device, and to ensure that the recording rate
of 512 Hz is retained. Once the data were re-sampled, the
Python BioSPPy package [25] was employed to extract the
GSR peaks and amplitude information and the heart rate from
the ECG data.
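
The text does not state which resampling method was used. As one possible illustration, the sketch below fills gaps left by dropped samples by linearly interpolating timestamped samples onto a uniform 512 Hz grid; the function name, the use of linear interpolation, and the requirement of strictly increasing timestamps are all assumptions.

```python
import math


def resample_uniform(times_s, values, rate_hz=512.0):
    """Linearly interpolate (time, value) samples onto a uniform grid.

    Assumes `times_s` is strictly increasing; gaps (e.g. from dropped
    packets) are bridged by straight lines between neighboring samples.
    """
    if len(times_s) != len(values) or len(times_s) < 2:
        raise ValueError("need at least two samples with matching lengths")
    duration = times_s[-1] - times_s[0]
    # Small epsilon guards against floating-point error at the last grid point.
    n_out = int(math.floor(duration * rate_hz + 1e-9)) + 1
    out_t, out_v = [], []
    j = 0
    for i in range(n_out):
        t = times_s[0] + i / rate_hz
        # Advance to the input segment [times_s[j], times_s[j + 1]] containing t.
        while j < len(times_s) - 2 and times_s[j + 1] < t:
            j += 1
        t0, t1 = times_s[j], times_s[j + 1]
        w = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v
```

Linear interpolation is the simplest choice and can smooth over sharp features such as ECG R-peaks when gaps are long, so a real pipeline might prefer a different interpolation scheme or might exclude long gaps instead.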
The baseline for the physiological data was set to the interval
after the practice segment, before event 1. In this baseline
interval (shown in Figure 3), participants monitored the
automation while the AV drove on a straight road with no
traffic or critical events. For the GSR data, the number of
peaks and the amplitude of the peaks were extracted for
the (T1 + T2) interval. To analyze the increase in driver
arousal and vigilance ahead of the event, the number of peaks
over 0.5 µS (microsiemens) and the increase in heart rate from
the baseline value for the (T1 + T2) interval were calculated.
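
The two per-event measures described above can be sketched as follows. The study extracted peaks and heart rate with BioSPPy; this sketch does not reproduce that step and instead assumes the peak times, peak amplitudes, and a heart rate time series are already available as plain lists. The function names and the use of window means for heart rate are assumptions.

```python
from statistics import mean

GSR_AMPLITUDE_THRESHOLD_US = 0.5  # microsiemens, threshold stated in the text


def count_gsr_peaks(peak_times_s, peak_amplitudes_us, window,
                    threshold_us=GSR_AMPLITUDE_THRESHOLD_US):
    """Count GSR peaks above the amplitude threshold inside (start, end)."""
    start, end = window
    return sum(
        1
        for t, amp in zip(peak_times_s, peak_amplitudes_us)
        if start <= t <= end and amp > threshold_us
    )


def heart_rate_change(hr_times_s, hr_bpm, window, baseline_window):
    """Mean heart rate in the window minus mean heart rate in the baseline."""
    def window_mean(w):
        vals = [hr for t, hr in zip(hr_times_s, hr_bpm) if w[0] <= t <= w[1]]
        if not vals:
            raise ValueError("no heart rate samples in window")
        return mean(vals)

    return window_mean(window) - window_mean(baseline_window)
```

Here `window` would be the T1 + T2 interval of an event and `baseline_window` the straight-road interval between the practice segment and event 1.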
Some participants’ physiological data had to be excluded
from further analysis due to excessive motion artefacts in the
data, resulting in N= 11 (FT group) and N= 13 (Control
group) participants for the heart rate measure, and N= 13 (FT
group) and N= 11 (Control group) participants for the GSR
measure. For each group, we further calculated the average
change in heart rate (compared to baseline), as well as the
average number of GSR peaks before the event, across all
events [26]. As the data were not normally distributed, we
present boxplot diagrams (see Figure 8).
Non-parametric Mann-Whitney U tests between the groups
reveal a statistically significantly higher average number of
GSR peaks before the event for the failure training group
compared to the control group (U = 3.000, z = 3.046, p =
.001), with a large effect size of r = .762. The tests show no
differences in heart rate.
In summary, results 1a, 1b, and 1c show that participants
who received failure training display an increased level of
driver preparedness in the event of a potential failure of
automation. These findings are bolstered by the results of the
GSR analysis. Participants who receive failure training show
an increase in the GSR response before the critical events.
These findings demonstrate that safety drivers benefit from
receiving training about the failure modes of the AV.
Both driver groups, with and without failure training,
show a decrease in preparedness, indicating that the effect
of training on driver alertness and vigilance fades over time.
This downward trend is interrupted by the disengagements
of automation or potentially hazardous events, but the trend
resumes as drivers continue to monitor the automation after
the event. On the one hand, these results imply that training and
knowledge of the AV’s capabilities and limits improve safety
driver awareness and preparedness. On the other
hand, the results suggest that this preparedness decreases with
time as drivers monitor the AV navigating other (non-critical)
traffic events successfully.
Results 2a and 2b show that a prolonged duration of
monitoring the automated driving system results in increased
sleepy or drowsy behavior. This result is in keeping with past
research, which shows that the cortical activity of drivers is
lowest when monitoring automation and that this sustained interval
of low cognitive activity often leads to sleepy or drowsy
behavior [7]. As with driver preparedness, the increasing
trend in sleepy or drowsy behavior is interrupted by a critical
event, but resumes a short while after this critical event.
These results provide some much-needed insight into the
behavior of safety drivers and their training. It is important
to understand how we can effectively train safety drivers.
To this end, these results suggest that training and limited
time spans performing supervision are necessary for safety
driver performance. The newly released AVSC safety driver
training recommendations [3] advocate periodic safety driver
evaluation and retraining. These recommendations are in
keeping with our findings. Moreover, these recommendations
also suggest 5 to 20 minute breaks after 2 hours of AV testing
and operation. These findings, too, are aligned with our own
results and observations.
[Figure 6: fraction of participants (0 to 1) prepared for the critical event, by event (Event 1 to Event 8), for the FT group and the Control group; Results 1a, 1b, and 1c and the near-failure event are marked.]
Fig. 6. Preparedness of participants across all events
[Figure 7: fraction of participants (0 to 1) showing sleepy/drowsy behavior prior to the event, by event (Event 1 to Event 8), for the FT group and the Control group; Results 2a and 2b and the near-failure event are marked.]
Fig. 7. Drowsy behavior of participants across all events
[Figure 8: boxplots for the FT group and the Control group of the average change in heart rate compared to baseline and of the average number of GSR peaks before the event.]
Fig. 8. (Left) Average change in heart rate in the (T1 + T2) interval and
(Right) average number of peaks greater than 0.5 µS in the (T1 + T2) interval.
D. Automation Take-Overs
In the T1 + T2 interval for all events, we coded for take-overs
of automation initiated by participants using the three
modalities introduced in the training phase. Figure 9 shows
the fraction of participants in each
condition who disengaged automation in the T1 + T2 interval
ahead of each event.
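The per-event fraction plotted in Figure 9 can be computed from the coded take-overs as in the minimal sketch below. The data layout (one list of booleans per participant, one entry per event) and the participant identifiers are hypothetical and only illustrate the calculation.

```python
def takeover_fraction(coded_takeovers):
    """Return, per event, the fraction of participants who took over.

    coded_takeovers maps a participant id to a list of booleans, one per
    event, where True means a take-over was coded in the T1 + T2 interval.
    """
    n_participants = len(coded_takeovers)
    n_events = len(next(iter(coded_takeovers.values())))
    return [
        sum(flags[event] for flags in coded_takeovers.values()) / n_participants
        for event in range(n_events)
    ]


# Hypothetical coding for two participants of one group over two events.
control_group = {
    "p1": [True, False],
    "p2": [True, True],
}
fractions = takeover_fraction(control_group)
```

Running the function once per condition gives the two series that are compared across events.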
No significant differences in take-over behavior were
observed between the conditions across the events, except
for event 6. In event 6, some participants in the control
group disengaged the automation ahead of the obstacle
without closely monitoring the automated driving system.
This deviation from the trend arose because event 3 and event
6 share the same road sign (a construction zone sign) to
indicate that there might be a potentially dangerous road
condition ahead. Some participants in the control condition
misconstrued the reason for the AV failure as the construction
zone rather than the absence of lane markers and prematurely
disengaged automation. This points to a potentially dangerous
learning trend: safety drivers who are not accurately trained
on the AV’s failure modes may incorrectly learn and interpret
its behavior during operation.
[Figure 9: fraction of participants (0 to 1) who took over before the event, by event, for the FT group and the Control group; Result 3 and the near-failure event are marked.]
Fig. 9. Participants who initiate a take-over before a Critical Event
E. Choice of Failure Mode
In this study, the absence of lane markers was chosen as
the failure mode for the AV. The experimenters constructed an
overarching narrative in which the simulated automated driving
software would disengage or fail when it was no longer able
to detect lane markers. This failure mode was chosen as a way
to test the impact of failure training on the participants who
underwent safety driver training. While AVs currently being
tested and deployed might employ localization algorithms
that rule out failures like these, the findings of this study
would still be applicable to any other failure mode on which
safety drivers may be instructed.
F. Time Interval Between Phases of the Panel Study
The length of longitudinal studies of driver behavior
varies widely. Past studies use time intervals between phases
that range from a few days to a few years [27], [28]. In this
panel study, we chose an interval of 20-30 days between the
two phases of the study for pragmatic reasons. This
interval was chosen to push the boundaries on the
learning and retention of the information and training given to
the participants in the training phase without compromising
the researchers’ ability to conduct the study effectively. It
must be noted that while this study analyzes the impact of
the failure training on safety drivers, further research on
the effects of the time interval between training and testing
may still be needed. Testing different time intervals between
training and testing phases may also cast much-needed light
on the effectiveness of the theoretical and practical aspects
of the training and aid in the development of more effective
training programs for AV safety drivers.
G. Participant Mental Models
Using the test provided after the theoretical training section,
the authors ensured that participants were familiar with the
theoretical failure training material. However, the study does
not investigate in depth the mental models that participants
develop as a result of the failure training. Drivers will tend
to view the automated driving system in this study and its
limitations through the lens of their past experiences. This
in turn could impact their trust in the automated system [].
A detailed analysis of the participant mental models may
be required in future investigations of the impact of failure
training on AV operator behavior.
VI. CONCLUSION
This study investigated the impact of failure training on
safety driver preparedness and behavior (RQ1) and of the
experience of a near failure of the automation during a
critical event on safety driver behavior (RQ2) using a
2-phase panel study design with theoretical and practical
training. The results clearly show that theoretical and practical
failure training have a positive impact on safety driver
preparedness; however, there is an overall trend indicating a
loss in driver preparedness and a loss in vigilance, signalled
by increasing incidences of sleepy/drowsy behavior over time.
While experiencing critical events in which the automation
may fail increases driver preparedness and alleviates sleepy
behavior, the effect is only transient. In other words, the
longer the AV drives successfully, the less prepared the safety
drivers are for a failure of the automated driving system.
VII. ACKNOWLEDGMENTS
This research was conducted under Stanford IRB Protocol
30016, with support from Robert Bosch LLC.
REFERENCES
[1] C. Wu, A. M. Bayen, and A. Mehta, “Stabilizing traffic with autonomous vehicles,” in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 1–7.
[2] Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles, Jun. 2018. [Online]. Available: https://doi.org/10.4271/J3016_201806
[3] AVSC Best Practice for In-Vehicle Fallback Test Driver Selection, Training, and Oversight Procedures for Automated Vehicles Under Test, Nov. 2019.
[4] C. M. Rudin-Brown and H. A. Parker, “Behavioural adaptation to adaptive cruise control (ACC): implications for preventive strategies,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 7, no. 2, pp. 59–76, 2004.
[5] J. C. de Winter, N. A. Stanton, J. S. Price, and H. Mistry, “The effects of driving with different levels of unreliable automation on self-reported workload and secondary task performance,” International Journal of Vehicle Design, vol. 70, no. 4, pp. 297–324, 2016.
[6] D. Miller, A. Sun, M. Johns, H. Ive, D. Sirkin, S. Aich, and W. Ju, “Distraction becomes engagement in automated driving,” in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 59, no. 1. SAGE Publications Sage CA: Los Angeles, CA, 2015.
[7] S. Sibi, H. Ayaz, D. P. Kuhns, D. M. Sirkin, and W. Ju, “Monitoring driver cognitive load using functional near infrared spectroscopy in partially autonomous cars,” in 2016 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2016, pp. 419–425.
[8] N. Strand, J. Nilsson, I. M. Karlsson, and L. Nilsson, “Semi-automated versus highly automated driving in critical situations caused by automation failures,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 27, pp. 218–228, 2014.
[9] E. E. Miller and L. N. Boyle, “Behavioral adaptations to lane keeping systems: Effects of exposure and withdrawal,” Human Factors, vol. 61, no. 1, pp. 152–164, 2019.
[10] E. E. Miller, “Behavioral adaptations of drivers to autonomous systems: Evaluating intermediate and carryover effects,” Ph.D. dissertation, 2018.
[11] A. H. Jamson, N. Merat, O. M. Carsten, and F. C. Lai, “Behavioural changes in drivers experiencing highly-automated vehicle control in varying traffic conditions,” Transportation Research Part C: Emerging Technologies, vol. 30, pp. 116–125, 2013.
[12] O. Carsten, F. C. Lai, Y. Barnard, A. H. Jamson, and N. Merat, “Control task substitution in semiautomated driving: Does it matter what aspects are automated?” Human Factors, vol. 54, no. 5, pp. 747–761, 2012.
[13] J. C. De Winter, R. Happee, M. H. Martens, and N. A. Stanton, “Effects of adaptive cruise control and highly automated driving on workload and situation awareness: A review of the empirical evidence,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 27, pp. 196–217, 2014.
[14] F. Naujoks, S. Höfling, C. Purucker, and K. Zeeb, “From partial and high automation to manual driving: relationship between non-driving related tasks, drowsiness and take-over performance,” Accident Analysis & Prevention, vol. 121, pp. 28–42, 2018.
[15] J. S. Warm, W. N. Dember, and P. A. Hancock, “Vigilance and workload in automated systems,” Automation and Human Performance: Theory and Applications, pp. 183–200, 1996.
[16] E. T. Greenlee, P. R. DeLucia, and D. C. Newton, “Driver vigilance in automated vehicles: Hazard detection failures are a matter of time,” Human Factors, vol. 60, no. 4, pp. 465–476, 2018.
[17] R. C. Peck, “Do driver training programs reduce crashes and traffic violations? A critical examination of the literature,” IATSS Research, vol. 34, no. 2, pp. 63–71, 2011.
[18] P. Koopman and B. Osyk, “Safety argument considerations for public road testing of autonomous vehicles,” SAE Technical Paper, Tech. Rep., 2019.
[19] D. NHTSA, “Automated driving systems 2.0: A vision for safety,” 2017.
[20] J. Gonçalves and K. Bengler, “Driver state monitoring systems–transferable knowledge manual driving to HAD,” Procedia Manufacturing, vol. 3, pp. 3011–3016, 2015.
[21] B. Kisacanin, “Method of detecting vehicle-operator state,” Sept. 9, 2008, US Patent 7,423,540.
[22] F. Friedrichs, M. Miksch, and B. Yang, “Estimation of lane data-based features by odometric vehicle data for driver state monitoring,” in 13th International IEEE Conference on Intelligent Transportation Systems. IEEE, 2010, pp. 611–616.
[23] J. A. Healey and R. W. Picard, “Detecting stress during real-world driving tasks using physiological sensors,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 156–166, 2005.
[24] S. Begum, “Intelligent driver monitoring systems based on physiological sensor signals: A review,” in 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013). IEEE, 2013, pp. 282–289.
[25] C. Carreiras, A. Alves, A. Lourenço, F. Canento, H. Silva, A. Fred, et al., “BioSPPy: Biosignal processing in Python (2015–),” URL https://github.com/PIA-Group/BioSPPy, 2018.
[26] J. A. Healey, “Wearable and automotive systems for affect recognition from physiology,” Ph.D. dissertation, Massachusetts Institute of Technology, 2000.
[27] J. G. Hull, A. M. Draghici, and J. D. Sargent, “A longitudinal study of risk-glorifying video games and reckless driving,” Psychology of Popular Media Culture, vol. 1, no. 4, p. 244, 2012.
[28] E. E. Miller and L. N. Boyle, “Driver adaptation to lane keeping assistance systems: Do drivers become less vigilant?” in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 61, no. 1. SAGE Publications Sage CA: Los Angeles, CA, 2017.