AUDITORY WARNINGS AND BACKGROUND NOISE 1
Published version is available at: https://journals.sagepub.com/doi/metrics/10.1177/0018720819879311

Topic Choice: Surface Transportation

Toward a Better Understanding of In-Vehicle Auditory Warnings and Background Noise

Edin Šabić1, Jing Chen2, and Justin A. MacDonald1
New Mexico State University1, Old Dominion University2

Author Note

Edin Šabić, Department of Psychology, New Mexico State University
Jing Chen, Department of Psychology, Old Dominion University
Justin A. MacDonald, Department of Psychology, New Mexico State University

Manuscript type: Revision, Extended Multi-Phase Study; Word count: 7484

This research was based on the first author’s master’s thesis, and supported by the National Science Foundation under Grant Nos. 1566173 and 1760347. The authors would like to thank Dr. Igor Dolgov for his comments on an earlier version of this manuscript, and Jaymison Miller, Steven Archuleta, Gabrielle Campbell, Akuadasuo Ezenyilimba, Lorenzo Lopez, Graham Strom, and Ashley Ruiz for their help in data collection.

Correspondence concerning this article should be addressed to Edin Šabić (sabic@nmsu.edu), Department of Psychology, New Mexico State University, PO Box 30001/MSC 3452, Las Cruces, NM 88003, or Jing Chen (j1chen@odu.edu), Department of Psychology, Old Dominion University, 250 Mills Godwin Life Sciences Bldg, Norfolk, VA 23529.
Objective. The effectiveness of three types of in-vehicle warnings was assessed in a driving simulator across different noise conditions.

Background. Although there has been much research comparing different types of warnings in auditory displays and interfaces, many of these investigations have been conducted in quiet laboratory environments with little to no consideration of background noise. Further, the suitability of some auditory warning types, such as spearcons, as car warnings has not been investigated.

Method. Two experiments were conducted to assess the effectiveness of three different auditory warnings (spearcons, text-to-speech, auditory icons) with different types of background noise while participants performed a simulated driving task.

Results. Our results showed that both the nature of the background noise and the type of auditory warning influenced warning recognition accuracy and reaction time. Spearcons outperformed text-to-speech warnings in relatively quiet environments, such as the baseline noise condition in which no music or talk radio was played. However, spearcons were not better than text-to-speech warnings with other background noises. Similarly, the effectiveness of auditory icons as warnings fluctuated across background noise types, but, overall, auditory icons were the least efficient of the three warning types.

Conclusion. Our results support the conclusion that background noise can have an idiosyncratic effect on a warning’s effectiveness and highlight the need for future research into ameliorating the effects of background noise.

Application. This research can be applied to better pair warnings with the anticipated auditory environment in which they will be communicated.

Keywords: warning systems, auditory displays, noise/acoustics, audition, vehicle design
Précis: The presence of noise can affect recognition of auditory warnings during human-machine interaction. Two experiments were conducted to investigate the efficiency of three different warning types across different types of background noise. Results showed that the efficiency of a warning was determined by both the warning and the background noise.
Toward a Better Understanding of In-Vehicle Auditory Warnings and Background Noise

As in-vehicle technology improves and the level of automation increases, effective communication with the driver becomes paramount. As the driver’s role becomes less physical, keeping the driver in the loop is integral to ensuring safety in the event that they need to intervene. One way to do this is to alert the driver to the current vehicle status (Koo et al., 2014) and to communicate take-over requests when necessary. Much research has been conducted comparing vehicle warnings across different modalities. When the vehicle needs to communicate with the driver, is it better to provide a visual, auditory, or tactile warning (e.g., rumbling the seat)? While research is still tackling this question (Dettman & Bullinger, 2017; Mohebbi, Gray, & Tan, 2009; Murata, Kuroda, & Karwowski, 2017; Scott & Gray, 2008), presenting auditory warnings rather than visual or tactile warnings has its share of advantages.
Auditory Interfaces and Driving

Unlike visual warnings, which must be located in the individual’s visual field to be of benefit, auditory warnings only require that the individual can hear and understand the message. As the level of automation in vehicles increases, operators will also be less likely to monitor the areas where visual warnings may appear. Further, tactile warnings cannot take advantage of written or spoken speech, which limits the amount of information that can be effectively conveyed. Indeed, a crucial advantage of auditory interfaces is that they can use language to communicate and, unlike visual interfaces, can do so without the user having to alter their gaze.
While auditory interfaces can stand alone apart from any other devices, it is important to explore their unique potential in vehicles. In-vehicle auditory interfaces can make use of already existing sound sources, such as internal speakers, to direct attention or deliver a warning. For instance, drivers can be notified of a change in the environment by manipulating the radio’s sound level or by panning audio towards one side of the vehicle (Fagerlönn, Lindberg, & Sirkka, 2012; Sabic & Chen, 2017). Because driving is a visually demanding task, using auditory alerts can allow the driver to focus more on the road. Wickens’ (2002) multiple resources model posits that different modalities may draw on different cognitive resource pools across the stages of perception, cognition, and response. The model predicts that there will be less interference when two tasks, in this case driving and interacting with a device, do not draw on the same resources within each individual stage and modality.
Importantly, the acoustic parameters of auditory warnings influence not only detection and response time, but also how the warnings are perceived in terms of urgency and annoyance (Edworthy, 1991; Fagerlönn, 2011). Investigating how warnings are perceived is critical, because annoying or low-urgency warnings can be rejected or ignored. Melodic and temporal parameters of sound, such as speed, rhythm, and pitch, have consistent effects on the perceived urgency of the audio (Edworthy, 1991). For instance, as auditory stimuli increase in speed and pitch, urgency ratings tend to increase. Fagerlönn (2011) evaluated two pedestrian warnings in terms of brake behavior, perceived annoyance, and perceived urgency. While higher-urgency warnings did not decrease brake times, they were perceived as more annoying.
Auditory displays have been researched in many environments, including unmanned aerial systems (Graham & Cummings, 2007), medical devices (Barrass, 2014; Gionfrida, Roginska, Keary, & Mohanraj, 2016), and biofeedback settings (Harris, Vance, Fernandes, Parnandi, & Gutierrez-Osuna, 2014). Tardieu et al. (2015) compared the sonification (i.e., the communication of data and information using non-speech audio; Kramer et al., 1999) of an in-vehicle global positioning system (GPS) to a purely visual version of the same GPS. They found that the group that interacted with the GPS augmented with auditory cues spent more time looking at the road. Similarly, Jeon et al. (2015) found that augmenting a menu-navigation task with auditory cues during driving improved driving performance and facilitated menu navigation.
While interfaces can be augmented with many types of auditory cues, this paper focuses on the following three: text-to-speech (TTS), speech-based earcons (spearcons), and auditory icons. TTS, as the name suggests, is speech produced by a program that reads text in a synthesized voice. The fact that natural speech recordings need not be made for every possible cue or message is especially useful in many systems, such as GPS devices that communicate street or town names. Auditory icons are not speech-based, but rather are sounds that directly represent the icon they accompany on a symbolic, metaphorical, or nomic level (Gaver, 1986). An example of an auditory icon is the shutter sound that may accompany a camera application. The sound is not language-based, but nevertheless informs the user of the construct that it represents (see McKeown & Isherwood, 2007, for a discussion of signal-to-referent mappings). Lastly, it has been shown that spearcons can increase efficiency in auditory menus (Walker et al., 2006). Spearcons are created by increasing the tempo of TTS while maintaining a constant pitch. This can be done with a MATLAB script (Walker et al., 2013), resulting in a quick version of TTS that, while not necessarily perceived as speech, still carries linguistic information.
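The MATLAB script the spearcon literature refers to is not reproduced here, but the underlying idea (compressing duration while holding pitch constant) can be sketched with a naive overlap-add (OLA) time stretcher in NumPy. This is an illustration only: the function name, frame sizes, and the 2.5x rate are assumptions, and a production spearcon generator would use a proper phase vocoder rather than plain OLA.

```python
import numpy as np

def time_stretch_ola(y, rate, frame=1024, hop_out=256):
    """Naive overlap-add time stretch: shortens audio by `rate` while
    roughly preserving pitch (each grain plays at its original speed;
    only the spacing between grains changes)."""
    hop_in = int(round(hop_out * rate))  # read grains farther apart than we write them
    window = np.hanning(frame)
    n_frames = 1 + (len(y) - frame) // hop_in
    out = np.zeros(frame + (n_frames - 1) * hop_out)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        grain = y[i * hop_in : i * hop_in + frame] * window
        out[i * hop_out : i * hop_out + frame] += grain
        norm[i * hop_out : i * hop_out + frame] += window
    norm[norm < 1e-8] = 1.0  # avoid division by zero at the edges
    return out / norm

# A 2.5x rate compresses a 1 s clip to roughly 40% of its length,
# mirroring the spearcon-to-TTS duration ratio used in this study.
sr = 22050
tts_like = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # 1 s stand-in for a TTS clip
spearcon_like = time_stretch_ola(tts_like, rate=2.5)
print(len(spearcon_like) / len(tts_like))  # roughly 0.4
```

The grain spacing on the read side (`hop_in`) versus the write side (`hop_out`) is what changes duration without resampling, which is why pitch is approximately preserved.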
These auditory cue types were chosen because of their ability to effectively convey relationships between the signal and the referent and, in the case of TTS and auditory icons, their prevalence in GPS and media systems. For spearcons, however, while research has supported that they can facilitate menu navigation (Walker et al., 2013), improve pedestrian navigation (Hussain et al., 2016), and enhance medical devices (Li et al., 2017), their potential in driving or warning systems has been largely unassessed.
Auditory Warnings and Background Noise

It is important to consider the role of background noise as in-vehicle interfaces are further refined. For instance, Murata, Kuroda, and Kanbayashi (2014) found that noise led to slower responses to auditory warnings in a simulated driving task. Suied, Susini, and McAdams (2008) showed that manipulating the temporal parameters of warnings, such as the inter-onset interval (in this case, the amount of time between sound pulses), can influence response time. Moreover, Singer, Lerner, Baldwin, and Traube (2015) took drivers on the road and manipulated the sound environment through physical actions such as opening the window. They found that background noise influenced not only warning recognition, but also perceived urgency. Research has demonstrated the importance of calibrating urgency properly to the warning, because highly urgent warnings for trivial information are likely to be seen as especially annoying (Marshall, Lee, & Austria, 2007).
Clearly, loudness relative to background noise predicts whether an individual can hear a sound. It might be tempting to conclude that simply making warnings or alerts very loud is an ideal solution to dealing with background noise. However, this approach can lead to a slew of unintended consequences, such as increased disturbance or annoyance, startle responses, or interruption of important communication (Edworthy, 1994).

Loudness is not the only factor influencing perception of auditory stimuli in noise. The masking of an auditory signal by another sound can also be partly explained by its frequency content (Wegel & Lane, 1924), sound duration (Fay & Coombs, 1983), and the human physical structures involved in detecting an auditory target in noise (Karunarathne, Wang, So, Kam, & Meddis, 2018). Furthermore, one’s understanding of the influence of background noise on auditory perception would be lacking if the content of the background noise itself were disregarded. In the case of speech, there is more to consider than just the physical aspects of the sounds: all the nuances of language come into play. Indeed, Sperry, Wiley, and Chial (1997) found that meaningful speech stimuli (e.g., background conversations) had a more detrimental influence on speech recognition than non-meaningful speech noise (e.g., speech played backwards).
The Current Study

Much research has supported the great potential for auditory cues and messages to enable effective communication between user and system. However, apart from a few exceptions (e.g., Lerner et al., 2015; Murata et al., 2014), much of this research has been conducted in quiet environments. The present research aimed to fill this gap by assessing drivers’ responses to different types of auditory warnings (TTS, spearcons, and auditory icons) in various background-noise conditions.

We assessed individuals’ performance in responding to car warnings during simulated driving with four different types of background noise: a baseline noise condition in which the car windows were rolled up (windows-up), a windows-down noise condition in which the windows were rolled down, a music condition in which jazz music played, and a talk-radio condition in which a talk show looped in the background. These noise types were chosen because drivers might encounter them during a routine drive.
Experiment 1

Many studies have compared the effectiveness of different auditory cue types (Bonebright & Nees, 2007; Bussemakers & de Haan, 2000; Dingler, Lindsay, & Walker, 2008). The goal of the present research, however, was to assess how different auditory warnings functioned in a variety of in-vehicle noise environments. Evaluating how different sonification and auditory display systems perform in noisy environments is integral to the practical implementation of these systems. To this end, Experiment 1 was designed to address how well individuals responded to auditory icons, spearcons, and TTS warnings while performing a simulated driving task within various noise conditions (windows-down, windows-up, music, and talk-radio).
Hypotheses

Hypothesis 1.1: Spearcons will be responded to faster than TTS and auditory icons. This is based on previous findings of better navigation performance and decreased RTs to spearcons compared to TTS (Sabic & Chen, 2016; Suh, Jeon, & Walker, 2012; Walker et al., 2006; Walker et al., 2013).

Hypothesis 1.2: Lane deviation, as measured by deviation from the lane center, will be greater for participants in the spearcon and auditory icon conditions than in the TTS condition. This is based on the rationale that spearcon and auditory icon warnings may be more difficult to understand than simple TTS, which may influence driving performance.

Hypothesis 1.3: Recognition rates for warnings will be lower in the windows-down, music, and talk-radio conditions than in the windows-up condition. This is based on research showing that background noise, compared to baseline, led to decreased discriminability of car warnings (Lerner et al., 2015).

Hypothesis 1.4: Background noise will influence the perceived urgency of the auditory warnings. This hypothesis stems from previous research showing that background noise influenced the perceived urgency of warnings (Lerner et al., 2015).

Hypothesis 1.5: There will be an interaction between background noise and warning type on recognition accuracy, such that spearcons will be more difficult to recognize under some noise conditions, such as windows-down noise, compared to the other auditory warning types. This hypothesis is based on the brief and speech-based characteristics of spearcons.
Method

Experiment Design. The experiment used a mixed-factor design with auditory warning type (spearcons, TTS, and auditory icons) as a between-subjects factor and background noise type (windows-up, music, talk-radio, and windows-down) as a within-subjects factor. Details of each condition are included in the Stimuli subsection.

Participants. Sixty participants (34 female, 26 male; mean age = 20.19, SD = 3.13) were recruited from New Mexico State University (NMSU). All participants reported normal or corrected-to-normal vision and hearing. Six participants reported being left-handed, and 54 right-handed. Participants received partial course credit towards their psychology courses. This and the following experiment were approved by the Institutional Review Board at NMSU.
Apparatus. Participants were seated in front of two monitors, with their hands placed on a steering wheel (Logitech G920 Driving Force) attached to the desk in front of them. The monitor in front (HP S270C) showed the driving scene through a STISIM driving simulator (http://stisimdrive.com). The monitor to the left (Dell S2415H) played the background noise and the auditory warnings through the E-Prime software (https://www.pstnet.com/eprime.cfm). Participants wore a set of Audio-Technica ATH-M30x headphones throughout the experiment. A finger mouse (EasySMX 2.4G) was placed on the participant’s index finger to log RTs to the auditory warnings; it was used so that participants did not have to take their hands off the steering wheel to respond to warnings. Participants were asked to place the finger mouse on whichever hand they preferred. Three pedals were placed on the floor in front of the participant, with the left and right pedals being the brake and gas pedals, respectively.
Stimuli. In total, eight different car warnings were used (see Table 1). Five of these warnings were used in previous research (McKeown & Isherwood, 2007) that assessed signal-to-referent relationships. We gathered the auditory icons for these five warnings from the British Broadcasting Corporation Sound Effects Library (CDs 1, 5, 12, 13, and 19). We created three new warnings (battery alert, engine temperature high, and brake failure) and obtained the original audio files from freesound.org. TTS warnings were created using NaturalReader 14.0 software, which reads entered text in a synthesized voice. Audacity (www.audacityteam.org) was used to capture and clip this audio. Spearcon warnings were then produced by increasing the tempo of the corresponding TTS warnings, while keeping pitch constant, in Audacity. The resulting spearcon warnings were approximately 40% of the length of the corresponding TTS warnings (Sabic & Chen, 2016); that is, if a TTS warning was 1 s long, the resulting spearcon would be approximately 400 ms. See the Appendix for the length of the sound files for each warning across groups.
Table 1

Text-to-Speech (TTS) and Descriptions of Auditory Icon Stimuli for Experiments 1 and 2

TTS                        Auditory Icons
Car in Blind Spot          Honk
Tire Pressure Low          Air Hiss
Engine Brake is On         Creaking Noise
Gas is Low                 Sipping Liquid
Oil is Low                 Steam + Water Sounds
Battery Alert              Spark Noise
Engine Temperature High    Kettle Boiling
Brake Failure              Screeching Tires
Background-noise stimuli for the windows-up and windows-down conditions were obtained from free samples at freesound.org. Windows-up noise consisted of constant muffled street sounds during driving, and windows-down noise included the sounds of nearby vehicles and air rushing into the cab. In the music condition, the background music consisted of the same smooth jazz song (“Café Amore” by Spyro Gyra) used by Singer et al. (2015) to study the perceptibility of alerts in ambient noise conditions. In the talk-radio condition, a sample from a National Public Radio “Hidden Brain” podcast on internet language was used. Both the talk-radio and music noise stimuli were overlaid on top of the windows-up noise. Each noise stimulus was set to loop throughout each block. See Figure 1 for spectrograms of each noise type.
Figure 1. Spectrograms for each noise type, where darker regions represent increased energy.
The background noise and warnings were set to be equally loud at approximately 65 dB. This setting was chosen to mirror real-world situations, in which noise can mask warning perception, and to control for intensity differences across stimuli. In this way, any impact of background noise on warnings shown in the experiment would be based on the spectral characteristics of the target and background audio, not on intensity differences. We measured sound output at the headphone. We chose the 65 dB level for the auditory stimuli because it was substantially greater than the intensity of ambient noise within the testing room (39 dB). The sound stimuli were measured with a sound-level meter that took A-weighted dB sound pressure level (SPL) measurements; A-weighting was used because it provides a more accurate representation of how the sounds are perceived by the human ear.
We used a MATLAB script (hosted at https://github.com/Edin-Sabic) to standardize all the warning and background-noise files to an identical average root-mean-square (RMS) amplitude. Although the signal-to-noise ratio could fluctuate from moment to moment, this made the warnings comparable in that loudness was approximately identical across the entirety of each audio clip.
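The authors’ MATLAB normalization script is not reproduced in the text, but the core operation (scaling every file to the same average RMS amplitude) can be sketched in NumPy as follows. The target level here is a hypothetical full-scale fraction chosen for illustration, not the paper’s 65 dB(A) calibration, which was measured at the headphone.

```python
import numpy as np

def rms(y):
    """Average root-mean-square amplitude of a mono signal."""
    return np.sqrt(np.mean(np.square(y)))

def rms_normalize(y, target_rms=0.1):
    """Scale a signal so its overall RMS equals target_rms."""
    return y * (target_rms / rms(y))

# Two clips at very different levels end up with identical average RMS,
# so any remaining masking differences come from spectral content alone.
rng = np.random.default_rng(1)
quiet = 0.01 * rng.standard_normal(22050)
loud = 0.5 * rng.standard_normal(22050)
print(rms(rms_normalize(quiet)), rms(rms_normalize(loud)))  # both approximately 0.1
```

As the paper notes, this equalizes the average level of each whole clip; the instantaneous signal-to-noise ratio can still fluctuate from moment to moment.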
Procedure. Participants were first randomized into one of the three warning groups. To familiarize participants with the warnings, they were presented with a slideshow that paired each auditory warning with a corresponding picture indicating the warning’s meaning. During the following practice session, participants responded by pressing the finger mouse as soon as they heard a warning, and then identified the warning through a number pad by referencing an on-screen image in which all warnings were listed and numbered. The practice session was completed once a participant achieved accuracy above 85% in any set of nine successive trials; thus, the practice session lasted between 9 and 45 trials depending on performance. Participants then operated the virtual vehicle through a series of curves to gain an understanding of the steering-wheel sensitivity. After this familiarization, participants began the experimental task and drove through the virtual environment while attending to auditory stimuli presented through headphones.
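The stopping rule for the practice session (accuracy above 85% over any nine successive trials) amounts to a sliding-window check. The sketch below is a hypothetical reconstruction of that criterion, not the authors’ E-Prime code; the function name is invented for illustration.

```python
def practice_complete_at(responses, window=9, threshold=0.85):
    """Return the trial count at which accuracy first exceeds `threshold`
    over any `window` successive trials, or None if never reached.
    `responses` is a sequence of 1 (correct) / 0 (incorrect)."""
    for end in range(window, len(responses) + 1):
        if sum(responses[end - window:end]) / window > threshold:
            return end
    return None

print(practice_complete_at([1] * 9))           # 9: a perfect first window ends practice
print(practice_complete_at([0, 0] + [1] * 9))  # 10: the window sliding past one error (8/9) suffices
print(practice_complete_at([1, 0] * 5))        # None: criterion never met
```

Note that with a window of nine, 8/9 correct (about 88.9%) already exceeds the 85% threshold, so a single early error need not restart the count.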
Figure 2. The driving scenario that participants navigated.

Each trial began with the participant driving on a two-lane road in a desert environment (see Figure 2). Participants were instructed to remain as close to the middle of their lane as possible. About 25 to 35 s after the start of the trial, the first auditory warning was presented. On average, participants drove 30 s before hearing a warning; the intervals between warnings were varied to prevent participants from anticipating the warnings. Participants were instructed to first press the finger mouse as soon as they heard and recognized a warning, and then to pause the drive using a hand pedal attached to the back of the steering wheel on the right side. RT was logged from the onset of the auditory warning until the finger mouse was pressed; this approach was chosen because the time it takes to communicate a warning, starting from its onset, can influence its effectiveness. As soon as the drive was paused, the monitor adjacent to the driving-simulator monitor displayed a figure of all warnings labeled by number, and participants reported which warning they had heard through a number pad. Their responses were logged for calculating accuracy. This screen remained completely blank during the entirety of each trial, so that participants were unable to obtain any information regarding the onset of the stimuli. After the response, participants categorized the warning, based on its meaning, into one of three categories used in Lerner et al.’s (2015) study: (a) urgent crash warning, (b) safety information (not urgent crash warnings), and (c) information not related to safety. After this categorization, the next trial began with the participant pressing the hand pedal again to continue driving.
Throughout the experiment, participants performed four experimental blocks with 16 trials in each block (a total of 64 trials). The four blocks differed only in the background noise (windows-up, windows-down, music, or talk-radio) that was present. Both the order of the blocks and the warning type were fully randomized among participants. Each of the eight warnings was played twice in each block, with the order of the warnings randomized. At the end of each block, participants rated each warning on perceived urgency using a 1-9 scale in response to the question “How urgent did you find the warning?”, with 1 representing “not urgent” and 9 “very urgent”. Upon conclusion of the study, participants completed the NASA-TLX inventory (Hart & Staveland, 1988). The raw TLX, which does not include weighted pairwise comparisons, served as the cognitive workload measure for the experiment.
Results

For convenience, test statistics for RT, accuracy, and inverse-efficiency scores (IES) have been combined and are displayed in Table 2. Across all tables, bolded terms indicate significance at the .05 level.
Reaction Time. A mixed analysis of variance (ANOVA) was conducted on RT with auditory warning type as a between-subjects factor and background noise type as a within-subjects factor (see Table 2). The main effect of warning type was significant (TTS: M = 1202 ms, SD = 148 ms; spearcons: M = 937 ms, SD = 290 ms; auditory icons: M = 1359 ms, SD = 279 ms). The main effect of background noise type was not significant, and no significant interaction was found between the two factors (see Figure 3).
Table 2

ANOVA and Šidák Results for Response Time and Inverse Efficiency Score in Experiment 1

                                      Response Time                Inverse Efficiency Score
Factor                                df      F      p      ηp²    df      F      p      ηp²
ANOVA
Auditory Warning Type (AWT)           2, 57   14.97  .001   .34    2, 57   19.57  <.001  .41
Background Noise Type (BNT)           3, 171  1.30   .278   .02    3, 171  2.99   .033   .05
AWT x BNT                             6, 171  1.34   .240   .05    6, 171  3.02   .008   .10
Pairwise Comparisons (Šidák) for AWT
TTS vs. Spearcons                             .004                         .075
TTS vs. Auditory Icons                        .140                         .001
Spearcons vs. Auditory Icons                  <.001                        <.001
Pairwise Comparisons (Šidák) for BNT
Windows-up vs. windows-down                                                .025
Windows-up vs. music                                                       .175
Windows-down vs. music                                                     .999
Windows-down vs. talk-radio                                                .533
Music vs. talk-radio                                                       .950
Talk-radio vs. windows-up                                                  .388

Note. Values in bold indicate significance at the p < .05 level.
Figure 3. Reaction time (ms) as a function of warning type and background noise type.
Accuracy. Because an arcsine transformation did not resolve a violation of the ANOVA normality assumption, we conducted separate non-parametric tests for the within-subjects factor (background noise type; Friedman test) and the between-subjects factor (auditory warning type; Kruskal-Wallis H test). To investigate differences across warning groups within each noise type, we conducted Mann-Whitney U tests and corrected for multiple comparisons using the Šidák correction, α_adjusted = 1 − (1 − α)^(1/n).
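The non-parametric pipeline described above can be sketched with SciPy’s standard tests. The data below are randomly generated placeholders, and the specific comparison counts passed to the Šidák correction are assumptions for illustration; only the correction formula itself comes from the text.

```python
import numpy as np
from scipy.stats import friedmanchisquare, kruskal, mannwhitneyu

def sidak_alpha(alpha, n_comparisons):
    """Šidák-corrected per-comparison alpha: 1 - (1 - alpha)**(1/n)."""
    return 1 - (1 - alpha) ** (1 / n_comparisons)

# Hypothetical recognition-accuracy data: 20 participants per warning group,
# one column per background-noise block (windows-up, -down, music, talk-radio).
rng = np.random.default_rng(0)
tts, spear, icon = rng.random((3, 20, 4))

# Within-subjects factor: Friedman test across the four noise blocks (one group).
chi2, p_noise = friedmanchisquare(*spear.T)

# Between-subjects factor: Kruskal-Wallis H across the three warning groups
# (here on the first block's accuracy).
h, p_group = kruskal(tts[:, 0], spear[:, 0], icon[:, 0])

# Pairwise follow-up, judged against a Šidák-corrected alpha.
u, p_pair = mannwhitneyu(tts[:, 0], spear[:, 0])
print(round(sidak_alpha(0.05, 4), 4))   # 0.0127
print(round(sidak_alpha(0.05, 12), 4))  # 0.0043
```

With a family-wise alpha of .05, four comparisons give a per-comparison threshold near .013 and twelve give one near .004, which is the general shape of the corrected thresholds reported in the tables that follow.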
Results of the Friedman tests (see Table 3) showed that accuracy for spearcons differed significantly across the background noise types, as did accuracy for auditory icons; accuracy for TTS warnings did not vary significantly across background noise types. Results of the Kruskal-Wallis H tests (see Table 4) showed a significant difference across the three warning groups at each level of noise: the windows-up, windows-down, music, and talk-radio blocks.
Table 3

Friedman Tests for Accuracy of Warning Groups across Noise Type in Experiment 1

Factor            df   χ²      p       W
Noise across Warning Groups
TTS               3    4.33    .228    .072
Spearcons         3    19.40   <.001   .323
Auditory Icons    3    15.90   .001    .265

Note. Values in bold indicate significance at the p < .019 level.
Table 4

Kruskal-Wallis H and Mann-Whitney U Tests for Accuracy in Experiment 1

Factor                                  df   H       p       ε²
Warning Groups across Noise
Windows-Up                              2    15.76   <.001   .83
Windows-Down                            2    33.27   <.001   1.75
Music                                   2    29.34   <.001   1.54
Talk-Radio                              2    12.69   .002    .67
Pairwise Comparisons (Mann-Whitney U) for Windows-Up Accuracy
TTS vs. Spearcons                                    .435
TTS vs. Auditory Icons                               <.001
Spearcons vs. Auditory Icons                         .001
Pairwise Comparisons (Mann-Whitney U) for Windows-Down Accuracy
TTS vs. Spearcons                                    <.001
TTS vs. Auditory Icons                               <.001
Spearcons vs. Auditory Icons                         .002
Pairwise Comparisons (Mann-Whitney U) for Music Accuracy
TTS vs. Spearcons                                    .250
TTS vs. Auditory Icons                               <.001
Spearcons vs. Auditory Icons                         <.001
Pairwise Comparisons (Mann-Whitney U) for Talk-Radio Accuracy
TTS vs. Spearcons                                    .001
TTS vs. Auditory Icons                               <.001
Spearcons vs. Auditory Icons                         .989

Note. Kruskal-Wallis H test results in bold indicate significance at the p < .012 level, while Mann-Whitney U results in bold indicate significance at the p < .004 level.
407
408
Figure 4. Accuracy as a function of warning type and background noise type.
409
410
Table 5

Reaction Time (ms) and Accuracy for Spearcons, Auditory Icons, and TTS as a Function of Background Noise

Background Noise       Spearcons      Auditory Icons   TTS
Accuracy
  Windows-Down         92.8% (6.9%)   84.1% (9.6%)     99.4% (1.9%)
  Windows-Up           99.1% (2.3%)   90.3% (9.8%)     98.4% (2.8%)
  Music                95.9% (5.1%)   83.0% (10.5%)    97.8% (3.1%)
  Talk-Radio           91.6% (9.6%)   91.9% (8.4%)     98.8% (2.6%)
Reaction Time
  Windows-Down         1009 (390)     1366 (365)       1204 (142)
  Windows-Up           917 (287)      1300 (271)       1173 (205)
  Music                869 (269)      1422 (434)       1207 (252)
  Talk-Radio           953 (326)      1348 (303)       1224 (155)

Note. Numbers in parentheses represent standard deviations.
Inverse-Efficiency Scores. To account for a potential speed-accuracy tradeoff, a mixed ANOVA was conducted with warning type as a between-subjects factor and background noise type as a within-subjects factor on IES (Townsend & Ashby, 1978). IES is calculated as IES = RT / Accuracy. As was mentioned in the Introduction, lower IES denote higher efficiency. The main effect of warning type was significant (M = 1220, SD = 177 for TTS warnings; M = 998, SD = 357 for spearcons; M = 1597, SD = 500 for auditory icons). The main effect of background noise type was also significant (M = 1197, SD = 379 for windows-up; M = 1325, SD = 480 for windows-down; M = 1304, SD = 551 for music; M = 1261, SD = 369 for talk-radio). A significant interaction was found between background noise type and warning type (see Figure 5). Overall, spearcons were as efficient as TTS in the music, windows-down, and talk-radio noise blocks, and more efficient than TTS warnings in the windows-up noise block. Similar to trends seen in the RT and accuracy data, auditory icons were, in general, the least efficient of the warnings. See Table 6 for analyses related to the interaction.
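As a minimal sketch, the IES for a condition is simply mean RT divided by proportion correct. The values plugged in below are cell means from Table 5, used for illustration only; the reported analyses computed IES per participant before averaging:

```python
def inverse_efficiency(rt_ms: float, accuracy: float) -> float:
    """IES = RT / accuracy; lower scores indicate higher efficiency."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be a proportion in (0, 1]")
    return rt_ms / accuracy

# Spearcons in the windows-up block (Table 5): 917 ms RT, 99.1% accuracy.
print(round(inverse_efficiency(917, 0.991)))   # 925
# TTS in the windows-up block: 1173 ms RT, 98.4% accuracy.
print(round(inverse_efficiency(1173, 0.984)))  # 1192
```

Dividing by accuracy penalizes fast-but-inaccurate responding, which is why spearcons' RT advantage can shrink in blocks where their recognition accuracy drops.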
Table 6

Follow-up Interaction Analyses of Inverse Efficiency Scores (IES) across Noise Condition

Factor          df      F      p     ηp²
Noise Type
  Windows-Up    2, 57  15.36  <.001  .35
  Windows-Down  2, 57  9.687  <.001  .25
  Music         2, 57  20.07  <.001  .41
  Talk-Radio    2, 57   9.29  <.001  .25
Pairwise Comparisons (Šidák) for Windows-Up IES
  TTS vs. Spearcons             .027
  TTS vs. Auditory Icons        .019
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Windows-Down IES
  TTS vs. Spearcons             .812
  TTS vs. Auditory Icons        .004
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Music IES
  TTS vs. Spearcons             .061
  TTS vs. Auditory Icons        .001
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Talk-Radio IES
  TTS vs. Spearcons             .191
  TTS vs. Auditory Icons        .053
  Spearcons vs. Auditory Icons <.001

Note. Values in bold indicate significance at the p < .05 level.
Figure 5. IES as a function of warning type and background noise type.
Lane Deviation. To compare lane deviation across warnings, a one-way ANOVA was conducted with warning type as the between-subjects factor. Lane deviation data were collapsed across all four noise blocks. The main effect of warning type on lane deviation was not significant, F(2, 57) = 1.96, p = .150, ηp² = .06.

Urgency. A mixed ANOVA was conducted with warning type as a between-subjects factor and background noise type as a within-subjects factor on subjective urgency ratings. The main effect of warning type was not significant, F(2, 57) = .14, p = .873, ηp² = .01, and neither was the main effect of background noise type, F(3, 171) = 1.24, p = .297, ηp² = .02. However, the interaction between the two factors was significant, F(6, 171) = 3.44, p = .003, ηp² = .11. To analyze this interaction, one-way within-subjects ANOVAs were conducted for each warning type across all background noise types. Only the ANOVA on perceived urgency for spearcons was significant, F(2, 57) = 4.89, p = .004, ηp² = .20. See Figure 6 for urgency ratings across all noise blocks for the three warning groups.

Figure 6. Urgency ratings as a function of warning type and background noise type.
Categorization of Warnings. To compare the categorization of warnings, a Kruskal-Wallis test was conducted with warning group as the grouping variable. No significant difference was found across the groups, χ2(2) = .114, p = .945.

NASA-TLX. Separate between-subjects one-way ANOVAs were conducted on each NASA-TLX subscale. No significant differences were found for any of the groups across any subscale (see Figure 7), ps > .05.

Figure 7. Mean ratings for the NASA-TLX subscales.
Discussion

Experiment 1 assessed the efficacy of TTS, spearcon, and auditory icon warnings in multiple background noise conditions while participants drove in a driving simulator. Pairwise comparisons on the RT data showed that spearcon warnings were responded to more quickly than both TTS and auditory icon warnings, which supported Hypothesis 1.1. Overall, participants responded to auditory icons more slowly and identified them less accurately than the other warnings; pairwise comparisons on the accuracy data showed that TTS warnings were identified accurately more often than both auditory icons and spearcons. No significant differences were found when analyzing lane deviation data, failing to support Hypothesis 1.2.

A main effect of background noise type was found within both the IES and accuracy data, but not the RT data, suggesting that noise did not influence how quickly participants responded when collapsed across warning types. This result supported Hypothesis 1.3, namely that background noise would decrease the accuracy of warning identification. An interaction within the urgency ratings showed that the perceived urgency of the warnings fluctuated across background noise types, supporting Hypothesis 1.4. Follow-up analyses demonstrated that one warning type in particular, spearcons, drove this interaction by fluctuating in perceived urgency. This finding replicates prior research (Lerner et al., 2015) and demonstrates that background noise affects warning perception beyond recognition alone.
Interactions in the accuracy and IES data allowed for more in-depth analyses of the efficacy of spearcon and TTS warnings across the noise blocks. Participants were better at recognizing TTS warnings than spearcons in the windows-down and talk-radio noise blocks, but they accurately identified TTS and spearcon warnings at equal rates in the music and windows-up noise blocks. These results highlight the impact of the type of noise on warning recognition, supporting Hypothesis 1.5. The IES data allowed for a direct comparison of warning types that takes both RT and accuracy into consideration. Pairwise comparisons across the warning types for each background-noise condition demonstrated that spearcon and TTS warnings were equally efficient in the windows-down, music, and talk-radio noise blocks. However, spearcons were more efficient than TTS warnings in the baseline windows-up noise condition. These findings illustrate that spearcons may be more efficient than TTS in certain noise conditions.
Experiment 1 provided information on the interaction between warning type and background noise. However, there is a need to explore how a system designer might increase the efficiency of a warning in background noise. To this end, Experiment 2 was designed to replicate the results of Experiment 1 and examine whether the inclusion of an alerting tone could help increase accurate recognition of the warnings.
Experiment 2

In noisy environments, the sudden presence of auditory warnings may not be effective because people may miss the onset of the warning. For example, Wang, Lyckvi, Hong, Dahlsted, and Chen (2017) used an alerting tone to signal a transfer of control and found that the alerting tone led to improved performance and situation awareness compared to no alerting tone. Experiment 2 investigated whether presenting an alerting tone immediately prior to warning presentation would allow for better recognition of and response to car warnings. This tone may be especially beneficial for auditory warning types of shorter durations, such as spearcons. Being informed about when a stimulus is about to occur may allow users to focus their attention on the upcoming event.
Hypotheses

New hypotheses included:

Hypothesis 2.1: An alerting tone will increase recognition of the auditory warnings, as reflected in accuracy data.

Hypothesis 2.2: An alerting tone will decrease participants' efficiency (as measured through IES) in recognizing warnings in the baseline windows-up noise condition. The alerting tone may be redundant in scenarios with little background noise where warning perceptibility is already high.

Hypothesis 2.3: An alerting tone will increase participants' efficiency (as measured through IES) in recognizing warnings in the experimental noise conditions through the added temporal information about the upcoming warning.
Method

Participants. A total of 60 new participants (46 female, 14 male; 4 left-handed, 56 right-handed; mean age = 23.53, SD = 10.58) from NMSU and the Las Cruces community participated in this experiment for class credit or $10 monetary compensation. All participants reported normal or corrected-to-normal vision and hearing.
Apparatus, stimuli, experiment design, and procedure. Experiment 2 utilized the same apparatus, experiment design, and procedure as Experiment 1. All stimuli were identical to Experiment 1, except that an alerting tone was presented prior to each of the warnings. The alert tone had a duration of 220 ms and a constant frequency of 1100 Hz. The tone was played at 65 dB, as measured by a sound level meter that took A-weighted dB SPL measurements at the output of the headphones. We chose a constant frequency of 1100 Hz because we expected it to be perceived as relatively high in pitch (promoting urgency), but not excessively high, as the hearing threshold for individuals with hearing impairment has been shown to increase with frequency (Wendt, Kollmeier, & Brand, 2015). The duration of 220 ms was chosen because it was long enough to be perceived by participants but still relatively brief.
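A tone with these parameters can be synthesized as follows. This is a sketch, not the authors' stimulus-generation code: the sample rate, amplitude, and onset/offset ramps are assumptions, and calibrating the output to 65 dB(A) SPL depends on the playback hardware and a sound level meter:

```python
import numpy as np

FS = 44100            # sample rate in Hz (assumed)
DUR_S = 0.220         # 220 ms, as reported for Experiment 2
FREQ_HZ = 1100.0      # constant tone frequency, as reported

# Pure sine tone; the 0.5 amplitude is a placeholder, not a calibrated level.
t = np.arange(int(FS * DUR_S)) / FS
tone = 0.5 * np.sin(2 * np.pi * FREQ_HZ * t)

# Short raised-cosine ramps (5 ms) to avoid audible onset/offset clicks --
# a common practice in stimulus synthesis, not something the article states.
ramp_n = int(0.005 * FS)
ramp = 0.5 * (1 - np.cos(np.pi * np.arange(ramp_n) / ramp_n))
tone[:ramp_n] *= ramp
tone[-ramp_n:] *= ramp[::-1]
```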
Results

The same statistical tests for reaction time, accuracy, and IES were conducted as in Experiment 1, and the results are displayed in Table 7.

Reaction Time. Similar result patterns were found to those in Experiment 1. The main effect of warning type on RT was significant (M = 1424 ms, SD = 427 ms for TTS warnings; M = 1057 ms, SD = 255 ms for spearcons; M = 1563 ms, SD = 385 ms for auditory icons). The main effect of background noise type on RT was not significant. No significant interaction was found between background noise type and warning type. See Figure 8 for reaction time data across all noise blocks for the three warning groups.
Table 7

ANOVA and Šidák Results for Response Time and Inverse Efficiency Score in Experiment 2

                                 Response Time               Inverse Efficiency Score
Factor                           df      F      p     ηp²    df      F      p     ηp²
ANOVA
  Auditory Warning Type (AWT)    2, 57   12.31  <.001  .30   2, 57   16.03  <.001  .36
  Background Noise Type (BNT)    3, 171   1.84   .142  .03   3, 171   7.00  <.001  .11
  AWT × BNT                      6, 171    .86   .528  .03   6, 171   4.96  <.001  .15
Pairwise Comparisons (Šidák) for AWT
  TTS vs. Spearcons                      .003                        .105
  TTS vs. Auditory Icons                 .471                        .003
  Spearcons vs. Auditory Icons          <.001                       <.001
Pairwise Comparisons (Šidák) for BNT
  Windows-up vs. windows-down                                        .006
  Windows-up vs. music                                               .987
  Windows-down vs. music                                             .011
  Windows-down vs. talk-radio                                        .018
  Music vs. talk-radio                                              1.000
  Talk-radio vs. windows-up                                          .989

Note. Values in bold indicate significance at the p < .05 level.
Figure 8. Reaction time (ms) as a function of warning type and background noise type.

Accuracy. Results were nearly identical to those of Experiment 1 (see Figure 9), except for the result of one pairwise comparison between spearcons and auditory icons in windows-up noise. Results of the Friedman tests (see Table 8) showed that spearcons differed significantly across the background noise types, as did auditory icons. However, TTS was not found to vary significantly across background noise types. Results of the Kruskal-Wallis H tests (see Table 9) showed that there was a significant difference across the three warning groups at each level of noise: the windows-up, windows-down, music, and talk-radio blocks.
Table 8

Friedman Tests for Accuracy of Warning Groups across Noise Type in Experiment 2

Factor            df   χ2      p     W
Noise across Warning Groups
  TTS              3   7.06   .070  .118
  Spearcons        3  18.19  <.001  .303
  Auditory Icons   3  18.48  <.001  .308

Note. Values in bold indicate significance at the p < .019 level.
Table 9

Kruskal-Wallis H and Mann-Whitney U Tests for Accuracy in Experiment 2

Factor            df    H      p     ε²
Warning Groups across Noise
  Windows-Up       2  13.24   .001   .70
  Windows-Down     2  38.38  <.001  2.02
  Music            2  27.39  <.001  1.44
  Talk-Radio       2  12.04   .002   .63
Pairwise Comparisons (Mann-Whitney U) for Windows-Up Accuracy
  TTS vs. Spearcons             .429
  TTS vs. Auditory Icons        .001
  Spearcons vs. Auditory Icons  .008
Pairwise Comparisons (Mann-Whitney U) for Windows-Down Accuracy
  TTS vs. Spearcons            <.001
  TTS vs. Auditory Icons       <.001
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Mann-Whitney U) for Music Accuracy
  TTS vs. Spearcons             .004
  TTS vs. Auditory Icons       <.001
  Spearcons vs. Auditory Icons  .001
Pairwise Comparisons (Mann-Whitney U) for Talk-Radio Accuracy
  TTS vs. Spearcons             .003
  TTS vs. Auditory Icons        .002
  Spearcons vs. Auditory Icons  .922

Note. Kruskal-Wallis H test results in bold indicate significance at the p < .012 level, while Mann-Whitney U results indicate significance at the p < .004 level.
Table 10

Reaction Time (ms) and Accuracy for Spearcons, Auditory Icons, and TTS as a Function of Background Noise

Background Noise       Spearcons      Auditory Icons   TTS
Accuracy
  Windows-Down         91.6% (8.7%)   73.4% (14.7%)    99.4% (1.9%)
  Windows-Up           97.5% (4.7%)   90.0% (10.2%)    98.8% (2.6%)
  Music                96.9% (5.2%)   87.5% (12.5%)    100% (0%)
  Talk-Radio           89.1% (11.6%)  89.4% (10.2%)    97.5% (4.3%)
Reaction Time
  Windows-Down         1100 (297)     1617 (483)       1460 (424)
  Windows-Up           1019 (222)     1556 (324)       1444 (384)
  Music                1035 (230)     1599 (442)       1402 (452)
  Talk-Radio           1072 (270)     1481 (292)       1390 (447)

Note. Numbers in parentheses represent one standard deviation.

Figure 9. Accuracy as a function of warning type and background noise type.
Inverse-Efficiency Scores. Similar patterns as in Experiment 1 were shown. The main effect of warning type was significant (M = 1443, SD = 444 for TTS warnings; M = 1147, SD = 346 for spearcons; M = 1921, SD = 724 for auditory icons). The main effect of background noise type was significant (M = 1430, SD = 506 for windows-up; M = 1667, SD = 767 for windows-down; M = 1463, SD = 649 for music; M = 1454, SD = 487 for talk-radio). The interaction between the two factors was significant (see Figure 10). To explore this interaction, separate ANOVAs with warning type serving as the between-subjects factor were conducted on IES at each level of background noise type. Overall, IES in response to auditory icons were greater than IES in response to TTS or spearcon warnings. IES in response to spearcon and TTS warnings were not significantly different except in the windows-up noise block, where IES in response to TTS warnings were significantly greater than IES in response to spearcon warnings. See Table 11 for analyses related to the interaction.
Table 11

Follow-up Interaction Analyses of Inverse Efficiency Scores (IES) across Noise Condition

Factor          df      F      p     ηp²
Noise Type
  Windows-Up    2, 57  15.10  <.001  .35
  Windows-Down  2, 57  17.33  <.001  .38
  Music         2, 57  11.51  <.001  .29
  Talk-Radio    2, 57   4.75  <.001  .14
Pairwise Comparisons (Šidák) for Windows-Up IES
  TTS vs. Spearcons             .007
  TTS vs. Auditory Icons        .078
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Windows-Down IES
  TTS vs. Spearcons             .500
  TTS vs. Auditory Icons       <.001
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Music IES
  TTS vs. Spearcons             .189
  TTS vs. Auditory Icons        .016
  Spearcons vs. Auditory Icons <.001
Pairwise Comparisons (Šidák) for Talk-Radio IES
  TTS vs. Spearcons             .493
  TTS vs. Auditory Icons        .222
  Spearcons vs. Auditory Icons  .010

Note. Values in bold indicate significance at the p < .05 level.

Figure 10. IES as a function of warning type and background noise type.
Lane Deviation. One participant's data was excluded due to a data compilation error. Similar to Experiment 1, the main effect of warning group was not significant, F(2, 56) = .46, p = .633, ηp² = .02.

Urgency. There was no significant main effect of warning type, F(2, 57) = 1.11, p = .336, ηp² = .04, or of background noise type, F(3, 171) = .38, p = .766, ηp² = .01. Further, the interaction between the two factors was not significant, F(6, 171) = 1.33, p = .246, ηp² = .05.

Categorization of Warnings. Similar to Experiment 1, no significant difference was found across the groups, χ2(2) = 2.749, p = .253.

NASA-TLX. Similar to Experiment 1, no significant differences were found for any of the groups across any subscale (see Figure 11), ps > .05.

Figure 11. Mean ratings across the six NASA-TLX subscales.
Discussion

Experiment 2 assessed the efficacy of presenting an attention-capturing tone immediately prior to each warning. Overall, results were very similar to those of Experiment 1. In terms of the RT data, the findings from Experiment 1 were replicated in Experiment 2: RTs to spearcon warnings were faster than to both auditory icons and TTS. Again, no difference was found between RTs to auditory icon and TTS warnings. The RT data from Experiment 2 further supported that individuals with brief training with spearcons can recognize and respond to them quickly. The data also supported that background noise did not have a significant effect on how quickly participants responded to the warnings.
Accuracy data trends were nearly identical to those found in Experiment 1, barring a pairwise comparison including spearcons and auditory icons in windows-up noise. In the windows-up and music noise blocks, accuracy to spearcon and TTS warnings did not differ. However, in the talk-radio and windows-down noise blocks, accuracy to spearcons was lower than accuracy to TTS warnings. Accuracy to TTS warnings was not impacted much across any noise environment. Accuracy to auditory icons was relatively constant across the music, talk-radio, and windows-up noise blocks, but decreased heavily in the windows-down condition.
Experiment 2 also utilized the IES measure to gain an understanding of warning efficiency. Overall, auditory icons had the highest IES scores (i.e., the lowest efficiency). In the windows-up noise block, IES for spearcons were significantly lower than for both TTS and auditory icons. In the talk-radio, windows-down, and music noise blocks, however, IES to TTS and spearcon warnings were similar. In terms of accuracy and efficiency, the mean values showed that the attention-capturing alert introduced in Experiment 2 did not increase accuracy for spearcons or auditory icons in the heavier noise blocks. This lack of an effect also led to no increase in efficiency as measured by IES.
Presenting an alert prior to warning presentation was not very useful in this study. It was predicted that IES scores would be lower across all noise blocks, excluding baseline, in Experiment 2 compared to Experiment 1. The inclusion of the alert did not facilitate increased recognition of the warnings, even in the noisier blocks such as the windows-down noise block. It may be the case that the tone was too short, or that the training was inadequate for participants to optimally use the alert. It is worth noting that the RTs in Experiment 2 included the length of the alert tone. Regardless, Experiment 2 added value to this research by replicating the data trends of Experiment 1 and by evaluating how designers and researchers can better prepare operators for alerts.
General Discussion

The present research evaluated three auditory warning types in noisy driving environments. Results provided an understanding of how these auditory warning types relate to each other and function in different auditory environments, and highlighted the importance of matching the warning to the auditory environment. Significant interactions in terms of IES across both experiments demonstrated that the efficiency of the warnings significantly fluctuated across different noise conditions: spearcons outperformed TTS with windows-up noise, but had similar efficiency with windows-down, music, and talk-radio noise. Further, accuracy to some warning types, such as spearcons, was heavily impacted by certain types of noise, such as windows-down and talk-radio noise, compared to the other noise types. While spearcons can convey important information quickly, it is important not to forsake accurate recognition.
The significant interaction in the accuracy data between warning and background noise type in both experiments showed that participants' performance did not depend solely on either background noise type or warning group, but rather on a combination of the two. This result stresses the impact of the auditory environment in the design of auditory alerts and warnings. Warnings that are similar to the background noise, in terms of frequency or content (Sperry et al., 1997; Wegel & Lane, 1924), are harder to recognize. The finding that accuracy to spearcons was impacted in the talk-radio noise block could exemplify the semantic content of the background noise influencing recognition of concurrent linguistic warnings. Interestingly, this reduction in accuracy was not found for the TTS group.
Across both experiments, a number of patterns emerged in terms of recognition of the warnings across the different noise conditions. When collapsed across all warning groups, accuracy was lowest in the windows-down noise block compared to the other noise blocks. Recognition of auditory icons was worst when the background consisted of music or, in the case of the windows-down block, noise from outside the cab. However, in the talk-radio noise block, where the background noise was solely linguistic in nature, recognition of auditory icons was not impacted much relative to baseline.
While spearcons were found to be more efficient than TTS in the windows-up noise block, there was no difference between the two across any other noise blocks. These results suggest that while spearcons can be responded to more quickly, accurate identification of spearcons may be limited to auditory environments that are not particularly distracting and do not consist of speech. The shortened nature of spearcons may make them more difficult to recognize in these noise conditions.
When assessing perceived urgency, an important construct concerning warnings, a significant interaction emerged in Experiment 1 but not Experiment 2. In Experiment 1, perceived urgency fluctuated for spearcons, but not for auditory icons or TTS warnings. These results suggest that the perceived urgency of a warning is influenced by both the auditory environment and the form in which the warning is communicated.
Unexpectedly, the introduction of an attention-capturing alert did not improve performance in recognizing the auditory warnings. To determine whether the alert was easily detected given the background noise, we estimated the signal-to-noise ratio (SNR) for each of our background noises around 1100 Hz. We filtered our background noise files using a 1000–1200 Hz bandpass filter and calculated the SNR for each file. The SNRs were 35.14, 4.48, 26.19, and 21.94 for the windows-up, windows-down, music, and talk-radio noise types, respectively. These positive SNRs provided strong evidence that our alert tone was well above the detection threshold. The lack of improvement from the tone may instead be a result of insufficient training, as participants were merely informed of the tone and not trained to use it. Further, the design of the alert may have been insufficient. Although the auditory alert was designed to be short, so as not to impact efficiency, it might have been too short. The alert also consisted of a flat 1100 Hz tone, and future research is needed to determine whether changes to this alert, potentially in terms of increasing urgency through acoustic parameter manipulation (Edworthy, 1991; Edworthy, 1994), would influence this result.
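The band-limited SNR estimate described above can be sketched as follows. This is an illustration, not the authors' analysis code: the filter order, sample rate, and synthetic demo signals are assumptions, since the article does not report its exact filtering procedure:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def band_snr_db(tone, noise, fs=44100, lo=1000.0, hi=1200.0, order=4):
    """SNR (dB) of a tone against background noise restricted to lo-hi Hz."""
    # Band-limit the noise around the alert frequency, then compare powers.
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    in_band_noise = sosfiltfilt(sos, noise)
    p_signal = np.mean(tone ** 2)
    p_noise = np.mean(in_band_noise ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

# Demo with synthetic audio: a 1100 Hz tone against weak white noise.
fs = 44100
t = np.arange(fs) / fs
demo_tone = 0.5 * np.sin(2 * np.pi * 1100.0 * t)
demo_noise = np.random.default_rng(0).normal(0.0, 0.01, size=fs)
snr = band_snr_db(demo_tone, demo_noise, fs=fs)  # strongly positive here
```

Restricting the noise to the band around the tone is what makes the estimate meaningful: broadband noise power outside 1000–1200 Hz does not mask a narrowband 1100 Hz alert.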
These results have important implications in the context of the existing literature. While other studies have shown that spearcons can expedite menu navigation (Palladino & Walker, 2008; Walker et al., 2013) or improve patient monitoring (Li et al., 2017), this study is the first to evaluate the efficacy of using spearcons as car warnings and to highlight their efficiency in some noise conditions, such as windows-up noise. Interestingly, urgency ratings in Experiment 1 did not show that spearcons were perceived as more urgent than TTS warnings, even though spearcons are a faster version of each TTS utterance. This goes against previous findings that showed an increase in perceived urgency as a function of speed (Edworthy, 1991), and points to a potential difference in how the speed of a stimulus impacts perceived urgency across speech and non-speech auditory warnings. Urgency ratings from Experiment 1 also supported recent research showing that background noise can influence perceived urgency (Singer et al., 2015). While other researchers have found that background noise can influence RTs (Murata et al., 2014), the present research found no such effect. Rather, background noise was seen to impact recognition, efficiency, and urgency, but not RTs to the warnings themselves.
Overall, the present research investigated the ability of different warnings to provide safety-critical information. Experiments 1 and 2 had fair ecological validity by incorporating a primary task, driving, and by manipulating the auditory environment in which the warnings were presented. This research is novel in that the comparison of different auditory warnings in this context has not been done to this extent. While the attempt to buffer the noise by utilizing an attention-capturing alert was not successful, the results still provide a useful starting point from which to further understand how best to present auditory warnings with consideration of both the auditory environment and the warnings themselves.
Limitations

While a driving task was included, this context certainly did not entirely reflect the dangers and experiences of actual driving. Moreover, the responses made across both experiments did not reflect the responses that would be made in vehicles; in the real world, reactions are complex and vary based on the warning being communicated. Another limitation is that this study assessed the ability of relatively young participants to recognize these warnings. As research has shown, auditory perception degrades with age, and future research needs to investigate the role of hearing impairment in the perception of these warnings in noise. Also, no formal hearing test was administered, so we cannot be certain about participants' normal hearing, although this is somewhat mitigated by the replication of results in Experiment 2 and by the low prevalence of hearing loss in a student population. Importantly, the present research also only investigated the perception of warnings at 0 SNR, and future research is needed to assess differences at other levels of SNR. Lastly, it should be noted that participants were much more likely to have interacted with auditory icons and TTS than with spearcons before their participation. As a result, the novelty of spearcons may have contributed to the advantage of the spearcon warnings.
Implications for Design

Unexpected noise in an environment, especially in vehicles and other types of machinery, is not just a possibility but rather a guarantee. When background noise is predictable, there exist many standards to follow, both in terms of the level at which warnings are presented relative to background noise and in terms of sources of warning confusion (DoD, 2012; Edworthy, 1994). However, although it is possible for systems to control some sound in the environment, much background noise is unpredictable. Thus, designers and researchers should further consider and assess the impact of noise on potential interaction with auditory displays. Furthering the understanding of how to facilitate intelligibility of auditory warnings of all kinds will lead to better auditory interface interaction. In the case of in-vehicle auditory interfaces, this facilitated interaction may lead to less time interacting with infotainment systems and more time spent looking at the road.
Although the results need further replication, our data suggest that spearcons are a good
845
candidate for systems that need quick responses, such as warning systems, only in environments
846
where speech is unlikely, and the level of background noise is relatively low. However, it is also
847
important to note that participants were able to recognize spearcons at a comparable rate to TTS
848
in the music and windows-up noise blocks. In the case where an interface is most likely to be
849
interacted with in an environment with speech or heavy continuous noise, we recommend the use
850
of TTS. Overall, our results support that large sets of auditory icons should not be used as safety-
851
critical warnings in noisy environments.
852
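Spearcons are produced by time-compressing a spoken (typically TTS) phrase; in Table 1, the average spearcon is roughly 40% of the length of its TTS counterpart. The sketch below shows the simplest possible compression, plain resampling, as an illustration only: production spearcon pipelines use pitch-preserving compression (e.g., a phase vocoder), whereas naive resampling like this also raises the pitch. The function name and ratio are our own.

```python
import numpy as np

def compress_to_spearcon(speech, ratio=0.4):
    """Naively time-compress a mono float waveform to `ratio` of its length.

    Illustrative only: real spearcon generation preserves pitch while
    compressing duration; linear resampling shifts pitch upward as a
    side effect.
    """
    n_out = max(1, int(len(speech) * ratio))
    # Sample the original waveform at evenly spaced fractional positions
    positions = np.linspace(0, len(speech) - 1, n_out)
    return np.interp(positions, np.arange(len(speech)), speech)
```

The brevity this buys is the practical appeal of spearcons for quick-response systems; the present results qualify that appeal by showing it holds mainly in low-noise, speech-free environments.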
The results demonstrate that the recognition of warnings depends not solely on the warning content or the sound environment, but rather on their combination. When designing the auditory components of systems or devices, consideration of possible background noise will lead to more recognizable, and as a result safer, auditory interfaces. Importantly, these results generalize not only to warning systems but also to any system that presents auditory alerts or sounds within a potentially noisy environment. For example, navigational systems, which are increasingly common in vehicles today, could benefit from more nuanced presentation of directions, whether by notifying the driver and passengers of an upcoming message before presenting it or by manipulating the acoustic parameters of the message directly.
Conclusion
The results of the present research reaffirm the importance of designing auditory warnings with background noise in mind, as background noise can mask warnings and impair recognition. This impaired recognition is more likely when interacting with systems in machinery, automobiles, or loud working environments. For spearcons, the detrimental background noises were the talk-radio and windows-down noise blocks; the reduced recognition of spearcons with the talk-radio noise could potentially be due to the competing speech streams. Additional research is needed to further investigate how best to ensure recognition of warnings in noisy environments, as well as how to implement systems that sample the auditory environment in order to present warnings for maximal discernibility.
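One way a system could sample the auditory environment, as proposed above, is to measure the ambient level over a short window and set the warning's playback gain to keep it a fixed margin above that level, with a cap to avoid startling the driver. The following is a speculative sketch of that idea under stated assumptions (warning normalized to unit RMS, mono ambient capture); the function, parameters, and margin are illustrative, not a tested design.

```python
import numpy as np

def adaptive_warning_gain(ambient, target_snr_db=10.0, max_gain=4.0):
    """Gain for a unit-RMS warning so it sits `target_snr_db` above ambient.

    `ambient` is a recent mono float capture of cabin audio. The result
    is capped at `max_gain` so loud environments do not drive the
    warning to startling or distorting levels.
    """
    rms_ambient = np.sqrt(np.mean(ambient ** 2))
    # RMS the warning should reach to clear the ambient level by the margin
    desired_rms = rms_ambient * 10 ** (target_snr_db / 20)
    return float(min(desired_rms, max_gain))
```

Level alone is only part of the problem: the present results show that the type of noise (e.g., competing speech) matters beyond its level, so an environment-sampling system might also switch warning type, not just gain.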
Key Points
- Two experiments investigated the efficacy of three warning types during a simulated driving task within a variety of noisy environments. Experiment 2 differed from Experiment 1 by presenting an attention-capturing alert immediately prior to warning presentation.
- Spearcons were more efficient than text-to-speech warnings in the baseline windows-up noise condition, but no difference in efficiency was found in the other three noise conditions.
- While the attention-capturing alert did not improve the recognition or efficiency of warnings in Experiment 2, results nevertheless replicated the findings of Experiment 1.
- The results demonstrate that noise can idiosyncratically influence recognition of different types of warnings and support that spearcons may be more efficient than text-to-speech warnings in ideal noise conditions.
References
Barrass, S. (2014, September). Acoustic sonification of blood pressure in the form of a singing bowl. In Conference on Sonification of Health and Environmental Data (SoniHED 2014) (pp. 16-21). York, England.
Bonebright, T. L., & Nees, M. A. (2007). Memory for auditory icons and earcons with localization cues. In Proceedings of the International Conference on Auditory Display (ICAD 07) (pp. 419-422). Montreal, Canada.
Bussemakers, M. P., & De Haan, A. (2000, April). When it sounds like a duck and it looks like a dog... auditory icons vs. earcons in multimedia environments. In Proceedings of the International Conference on Auditory Display (ICAD 00) (pp. 184-189). Atlanta, Georgia.
Dettmann, A., & Bullinger, A. C. (2017). Spatially distributed visual, auditory and multimodal warning signals - A comparison. In Proceedings of the Human Factors and Ergonomics Society Europe (pp. 185-199). Rome, Italy.
Dingler, T., Lindsay, J., & Walker, B. N. (2008). Learnability of sound cues for environmental features: Auditory icons, earcons, spearcons, and speech. In Proceedings of the International Conference on Auditory Display (ICAD 2008) (pp. 1-6). Paris, France.
DoD. (2012). Department of Defense design criteria standard: Human engineering (MIL-STD-1472G). Washington, DC: Department of Defense.
Edworthy, J. (1994). The design and implementation of non-verbal auditory warnings. Applied Ergonomics, 25(4), 202-210.
Edworthy, J., Loxley, S., & Dennis, I. (1991). Improving auditory warning design: Relationship between warning sound parameters and perceived urgency. Human Factors, 33(2), 205-231.
Fagerlönn, J. (2011). Urgent alarms in trucks: Effects on annoyance and subsequent driving performance. IET Intelligent Transport Systems, 5(4), 252-258.
Fagerlönn, J., Lindberg, S., & Sirkka, A. (2012, October). Graded auditory warnings during in-vehicle use: Using sound to guide drivers without additional noise. In Proceedings of the 4th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (pp. 85-91).
Fay, R. R., & Coombs, S. (1983). Neural mechanisms in sound detection and temporal summation. Hearing Research, 10, 69-92.
Gaver, W. W. (1986). Auditory icons: Using sound in computer interfaces. Human-Computer Interaction, 2, 167-177.
Gionfrida, L., Roginska, A., Keary, J., Mohanraj, H., & Friedman, K. P. (2016). The triple tone sonification method to enhance the diagnosis of Alzheimer's dementia. In Proceedings of the International Conference on Auditory Display (ICAD 2016).
Graham, H. D., & Cummings, M. L. (2007). Assessing the impact of auditory peripheral displays for UAV operators (Report No. HAL2007-09). Cambridge, MA: MIT Humans and Automation Laboratory.
Harris, J., Vance, S., Fernandes, O., Parnandi, A., & Gutierrez-Osuna, R. (2014, April). Sonic respiration: Controlling respiration rate through auditory biofeedback. In CHI '14 Extended Abstracts on Human Factors in Computing Systems (pp. 2383-2388).
Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. Advances in Psychology, 52, 139-183.
Hussain, I., Chen, L., Mirza, H. T., Wang, L., Chen, G., & Memon, I. (2016). Chinese-based spearcons: Improving pedestrian navigation performance in eyes-free environment. International Journal of Human-Computer Interaction, 32, 460-469.
Jeon, M., Gable, T. M., Davison, B. K., Nees, M. A., Wilson, J., & Walker, B. N. (2015). Menu navigation with in-vehicle technologies: Auditory menu cues improve dual task performance, preference, and workload. International Journal of Human-Computer Interaction, 31, 1-16.
Karunarathne, B., Wang, T., So, R. H., Kam, A. C., & Meddis, R. (2018). Adversarial relationship between combined medial olivocochlear (MOC) and middle-ear-muscle (MEM) reflexes and alarm-in-noise detection thresholds under negative signal-to-noise ratios (SNRs). Hearing Research, 367, 124-128.
Koo, J., Kwac, J., Ju, W., Steinert, M., Leifer, L., & Nass, C. (2015). Why did my car just do that? Explaining semi-autonomous driving actions to improve driver understanding, trust, and performance. International Journal on Interactive Design and Manufacturing (IJIDeM), 9, 269-275.
Kramer, G., Walker, B. N., Bonebright, T., Cook, P., Flowers, J., Miner, N., ... Evreinov, G. (1999). The sonification report: Status of the field and research agenda. Report prepared for the National Science Foundation by members of the International Community for Auditory Display. Santa Fe, NM: International Community for Auditory Display (ICAD).
Lerner, N., Singer, J., Kellman, D., & Traube, E. (2015). In-vehicle noise alters the perceived meaning of auditory signals. In Proceedings of the Eighth International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design (pp. 401-407). Iowa City, IA: Public Policy Center, University of Iowa.
Li, S. Y., Tang, T. L., Hickling, A., Yau, S., Brecknell, B., & Sanderson, P. M. (2017). Spearcons for patient monitoring: Laboratory investigation comparing earcons and spearcons. Human Factors, 59, 765-781.
Marshall, D. C., Lee, J. D., & Albert Austria, P. (2007). Alerts for in-vehicle information systems: Annoyance, urgency and appropriateness. Human Factors, 49(1), 145-157.
McKeown, D., & Isherwood, S. (2007). Mapping candidate within-vehicle auditory displays to their referents. Human Factors, 49, 417-428.
Mohebbi, R., Gray, R., & Tan, H. Z. (2009). Driver reaction time to tactile and auditory rear-end collision warnings while talking on a cell phone. Human Factors, 51, 102-110.
Murata, A., Kuroda, T., & Kanbayashi, M. (2014). Effectiveness of auditory and vibrotactile cuing for driver's enhanced attention under noisy environment. Advances in Ergonomics in Design, Usability & Special Populations: Part II, 17, 155-164.
Murata, A., Kuroda, T., & Karwowski, W. (2017). Effects of auditory and tactile warning on response to visual hazards under a noisy environment. Applied Ergonomics, 60, 58-67.
Palladino, D. K., & Walker, B. N. (2008, September). Navigation efficiency of two dimensional auditory menus using spearcon enhancements. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 52, No. 18, pp. 1262-1266). Los Angeles, CA: SAGE Publications.
Sabic, E., & Chen, J. (2016, September). Threshold of spearcon recognition for auditory menus. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 1539-1543). Los Angeles, CA: SAGE Publications.
Sabic, E., & Chen, J. (2017, September). Left or right: Auditory collision warnings for driving assistance systems. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 61, No. 1, pp. 1551-1551). Los Angeles, CA: SAGE Publications.
Scott, J. J., & Gray, R. (2008). A comparison of tactile, visual, and auditory warnings for rear-end collision prevention in simulated driving. Human Factors, 50, 264-275.
Singer, J., Lerner, N., Baldwin, C., & Traube, E. (2015). Auditory alerts in vehicles: Effects of alert characteristics and ambient noise conditions on perceived meaning and detectability. In 24th International Technical Conference on the Enhanced Safety of Vehicles (ESV), 1, 1-14.
Sperry, J. L., Wiley, T. L., & Chial, M. R. (1997). Word recognition performance in various background competitors. Journal of the American Academy of Audiology, 8, 71-80.
Suh, H., Jeon, M., & Walker, B. N. (2012, September). Spearcons improve navigation performance and perceived speediness in Korean auditory menus. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 1361-1365). Los Angeles, CA: SAGE Publications.
Suied, C., Susini, P., & McAdams, S. (2008). Evaluating warning sound urgency with reaction times. Journal of Experimental Psychology: Applied, 14(3), 201-212.
Tardieu, J., Misdariis, N., Langlois, S., Gaillard, P., & Lemercier, C. (2015). Sonification of in-vehicle interface reduces gaze movements under dual-task condition. Applied Ergonomics, 50, 41-49.
Townsend, J. T., & Ashby, F. G. (1978). Methods of modeling capacity in simple processing systems. Cognitive Theory, 3, 200-239.
Walker, B. N., Nance, A., & Lindsay, J. (2006). Spearcons: Speech-based earcons improve navigation performance in auditory menus. In Proceedings of the International Conference on Auditory Display (ICAD 2006) (pp. 63-68). London, England.
Walker, B. N., Lindsay, J., Nance, A., Nakano, Y., Palladino, D. K., Dingler, T., & Jeon, M. (2013). Spearcons (speech-based earcons) improve navigation performance in advanced auditory menus. Human Factors, 55, 157-182.
Wang, M., Lyckvi, S. L., Chen, C., Dahlstedt, P., & Chen, F. (2017, May). Using advisory 3D sound cues to improve drivers' performance and situation awareness. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 2814-2825). Denver, CO: ACM.
Wegel, R., & Lane, C. E. (1924). The auditory masking of one pure tone by another and its probable relation to the dynamics of the inner ear. Physical Review, 23, 266-285.
Wendt, D., Kollmeier, B., & Brand, T. (2015). How hearing impairment affects sentence comprehension: Using eye fixations to investigate the duration of speech processing. Trends in Hearing, 19, 1-18.
Wickens, C. D. (2002). Multiple resources and performance prediction. Theoretical Issues in Ergonomics Science, 3, 159-177.
Appendix
Table 1

Length of sound files for each warning across groups

Warning              Text-to-Speech   Spearcons   Auditory Icons
Battery Alert              765 ms       310 ms         655 ms
Brake Failure              888 ms       360 ms        1060 ms
Car in Blind Spot         1370 ms       547 ms         355 ms
Engine Temp High          1233 ms       488 ms        1383 ms
Gas is Low                 920 ms       448 ms        1486 ms
Hand Brake is On          1135 ms       448 ms         685 ms
Oil is Low                 822 ms       328 ms        1740 ms
Tire Pressure Low         1338 ms       535 ms        1740 ms
Average                   1059 ms       422 ms        1139 ms
Biographies
Edin Šabić is a doctoral student enrolled in the Experimental Psychology program at New Mexico State University. He received a bachelor's degree in German and Psychology from The University of Iowa in 2014, and a master's degree in Experimental Psychology from New Mexico State University in 2017.
Jing Chen is an Assistant Professor of Human Factors Psychology at Old Dominion University. Jing received her Ph.D. in Cognitive Psychology and M.S. in Industrial Engineering at Purdue University in 2015.
Justin A. MacDonald is an Associate Professor in the Psychology Department at New Mexico State University. Justin received his Ph.D. in Quantitative-Mathematical Psychology from Purdue University.
i We evaluated whether the number of training trials participants completed was a covariate for accuracy and RT in both experiments. No significant effects were found in any of the tests; therefore, we excluded this covariate from the analyses.
... Compared to visual TORs, auditory and vibrotactile TORs are both more effective and result in significantly shorter takeover time (Petermeijer et al., 2017); however, vibrotactile TORs may cause drivers to perceive a higher level of annoyance and have poor user experience (Politis et al., 2015). Accordingly, auditory feedback is increasingly recommended and applied in current vehicles (e.g., Tesla, 2020), and it can take advantage of speech to convey detailed information effectively and has no problem with intrusiveness (Campbell et al., 2016;Edin et al., 2019). ...
... Considering the engagement in NDRTs and the loss of situational awareness in conditionally AD, a combination of non-verbal warnings and speech could provide a possible solution, which was also mentioned by some participants. However, in a study exploring auditory interfaces in manual driving, Edin et al. (2019) found that presenting an alerting tone prior to the speech or auditory icons did not improve the participant recognition of warnings. They thought that this might be a result of insufficient design of the alerting tone or inadequate training of the participants. ...
Article
This study explored the possibility of applying personalized takeover requests (TORs) in an automated driving system (ADS), which required drivers to regain control when the system reached its limits. A driving simulator experiment was conducted to investigate how speech-based TOR voices impacted driver performance in takeover scenarios with two lead time conditions in conditionally automated driving (level 3). Eighteen participants drove in three sessions, with each session having a different TOR voice (a synthesized male voice, a synthesized female voice, and a significant other voice). Two scenarios with a lead time of 5 s and two scenarios with a lead time of 12 s were provided per session. The driver takeover time and quality data were collected. A follow-up interview was conducted to gain a clearer understanding of the drivers’ psychological feelings about each TOR voice during takeovers. Changes in takeover time and takeover quality caused by TOR voices were similar in both lead time conditions, except for the lateral acceleration. The synthesized male voice led to a larger maximum lateral acceleration than the other two voices in the 5 s condition. Interestingly, most drivers preferred choosing the synthesized female voice for future takeovers and showed negative attitudes toward the significant other voice. Our results implied that choosing TOR voices should consider the drivers’ daily voice-usage habits as well as specific context of use, and personalized TOR voices should be incorporated into the ADS prudently.
... This was because drivers' heads and eyes were not oriented toward the visual displays during automated driving. Sabi c et al. (2021) also revealed that the background noise in the vehicle could impair the effectiveness of auditory warnings. By contrast, the tactile TOR could handle these issues. ...
Article
The vibrotactile modality has great potential for presenting takeover requests (TORs) to get distracted drivers back into the control loop. However, few studies investigate the effectiveness of directional vibrotactile TORs. Whether TORs should be directed toward the direction of hazard (stimulus-response incompatibility) or the direction of avoidance action (stimulus-response compatibility) remains inconclusive. The present study explored the impact of directional vibrotactile TORs (toward-hazard, toward-action, and non-directional) on takeover performance. The influences of TORs lead time (3 s, 4 s, 6 s, and 8 s) and non-driving related tasks (NDRTs) (playing Tetris games and monitoring the road) on the effect of directional TORs were also probed. A total of 48 participants were recruited for our simulated driving study. Results showed that when drivers were engaged in NDRTs during automated driving, directional TORs were more effective than non-directional TORs. Specifically, at the lead times of 6 s and 8 s, both toward-hazard and toward-action TORs could shorten steering response times, compared with the non-directional TORs. At the lead times of 3 s and 4 s, toward-action TORs were more beneficial, as the maximum lateral acceleration was smaller than toward-hazard and non-directional TORs. However, when drivers monitored the road during automated driving, no obvious difference existed between directional and non-directional TORs, regardless of how long the lead time was. The findings in the present study shed light on the design and implementation of the tactile takeover system for automobile designers.
... Spearcons, speech-based earcons, were introduced by Walker et al. [61] for menu-based interfaces and are generated by "speeding up a spoken phrase until it is no longer recognized as speech"; spearcons have the advantage to be acoustically unique. Šabic et al. [44] found out (based on two experiments on in-vehicle warnings, N 1 =60, N 2 =60) that spearcons performed better than text-to-speech warnings in quiet environments and similar in noisier environments, but recommended using text-to-speech for safety-critical warnings in potentially noisier environments. ...
Article
Autonomous driving will still use human-machine co-driving to handle complex situations for a long term, which requires the driver to control the vehicle and avoid hazards by executing appropriate behavioral sequences after takeover prompts. Previous studies focused on the division of static behavioral indicators and major phases in the initial phase of takeover, while lacking the construction of behavioral sequences based on the dynamic changes of behavioral characteristics during the takeover process. This study divides the takeover process in a detailed manner and investigates the impact of audio types on the behavioral sequence at each phase. 20 professional drivers performed the NDRT in autonomous driving mode on real roads, and after receiving audio prompts, they took over the vehicle and performed hazard avoidance maneuvers. The results show that the behavioral characteristics could construct the behavioral sequence of different phases, with the dynamic characteristics of the takeover operation change. In addition, different types of audio prompts will affect the timing of the takeover operation and its driving performance. Choosing different audio prompts or combinations can help improve the effect of taking over the vehicle. This study helps to provide guidance on the design of human-machine interaction for behavior optimization at different phases, so that guiding the driver to take over the vehicle safely and effectively.
Article
Level crossing safety is a well-researched safety issue worldwide, but little attention has been placed on the safety benefits of using train horns when a train approaches a level crossing. Given train horns' adverse effects on the health and well-being of residents living near rail tracks, the use of train horns must be beneficial to safety. The current study sought to determine in a laboratory environment whether road users (N = 31) can detect the range of train horns observed in Australia in terms of loudness and duration, using high-definition audio recordings from railway crossings. A repeated measures design was used to evaluate the effects of key factors likely to influence the detectability of train horns, including, visual and auditory distractive tasks, hearing loss and environmental noise (crossing bells). Train horn detectability was assessed based on participants' accuracy and reaction times. Results indicated the duration of the train horn had the most influential effect on the detectability of train horns, with short-duration train horns less likely to be detected. The presence of bells at a crossing was the second most important factor that limited train horn detection. Train horn loudness also affected detectability: faint blasts were less likely to be noticed, while loudest blasts were more likely to be noticed. However, loud horns reduced the ability to detect the side from which the train was approaching and may result in longer times to detect the train, in the field. The auditory distractive task reduced the train horn detection accuracy and increased reaction time. However, the visual distractive task and medium to severe hearing loss were not found to affect train horn detection. This laboratory study is the first to provide a broad understanding of the factors that affect the detectability of Australian train horns by road users. 
The findings from this study provide important insights into ways to reduce the use and modify the practice to mitigate the negative effects of train horns while maintaining the safety of road users.
Chapter
Autonomous vehicles (AVs) are expected to play an increasingly important role in future transportation systems as a promising means of improving road safety and efficiency by eventually replacing human-driven vehicles. Semi-autonomous vehicles (semi-AVs; SAE Level 2 and Level 3) feature automatic lateral and longitudinal control of the vehicle with human drivers required to supervise the system at all times (Level 2) or prepared to resume control when requested (Level 3). As these definitions reveal, semi-AVs still require human oversight and intervention to fully ensure safety. Humans are required to monitor and be ready to take over control when the vehicle fails to recognize or respond to hazardous events. Thus, it is essential to ensure effective human-automation interaction and collaboration for semi-AVs. This book chapter will discuss the critical challenges for effective human-automation interaction for semi-autonomous driving, including communicating potential risks to human drivers and maintaining proper driver trust in the semi-AV. Risks in the current context are moving or stationary objects and road environments that impose imminent threats to drivers, including overt hazards such as road obstacles, a pedestrian crossing the road, and an intruding vehicle, or covert hazards such as a pedestrian that is about to cross but is occluded by a parked truck or a roadway structure. We discuss the design of effective risk communication mechanisms to convey these risks to the human driver, which helps maintain the driver’s situation awareness and facilitate the driver’s actions when needed. In addition, the effectiveness of this risk communication can be influenced not only by the characteristics of the driver and the semi-AV, but also their interaction. Finally, we will discuss factors that affect drivers’ trust in semi-AVs and subsequently how it affects effective risk communication in semi-AV driving.KeywordsSemi-autonomous drivingTrustRisk communication
Article
Use of ride-hailing mobile apps has surged and reshaped the taxi industry. These apps allow real-time taxi-customer matching of taxi dispatch system. However, there are also increasing concerns for driver distractions as a result of these ride-hailing systems. This study aims to investigate the effects of distractions by different ride-hailing systems on the driving performance of taxi drivers using the driving simulator experiment. In this investigation, fifty-one male taxi drivers were recruited. During the experiment, the road environment (urban street versus motorway), driving task (free-flow driving versus car-following), and distraction type (no distraction, auditory distraction by radio system, and visual-manual distraction by mobile app) were varied. Repeated measures ANOVA and random parameter generalized linear models were adopted to evaluate the distracted driving performance accounting for correlations among different observations of a same driver. Results indicate that distraction by mobile app impairs driving performance to a larger extent than traditional radio systems, in terms of the lateral control in the free-flow motorway condition and the speed control in the free-flow urban condition. In addition, for car-following task on urban street, compensatory behaviour (speed reduction) is more prevalent when distracted by mobile app while driving, compared to that of radio system. Additionally, no significant difference in subjective workload between distractions by mobile app and radio system were found. Several driver characteristics such as experience, driving records, and perception variables also influence driving performances. The findings are expected to facilitate the development of safer ride-hailing systems, as well as driver training and road safety policy.
Preprint
Full-text available
Auditory displays are commonly used in safety-critical domains and are a vital component of universal and inclusive design practices. Despite several decades of research on brief auditory alerts for representing status and processes in user interfaces, there is no clear heuristic guidance for which type(s) of auditory alerts should be preferred for designing interfaces. We used evidence synthesis (systematic review and meta-analysis) to examine the effectiveness of different types of brief audio alerts. We identified articles comparing auditory icons (real-world sounds with an ecological relationship to their referent), earcons (abstract sounds with no ecological relationship to their referent), spearcons (accelerated/compressed speech), and speech alerts. We used meta-analysis to compare alerts across five different outcomes: accuracy, reaction time, subjective ratings, workload, and dual-task interference. For accuracy and reaction time, results indicated speech, spearcons, and other types of alerts (usually hybrid, e.g., spearcons plus speech) were superior to auditory icons, which in turn were superior to earcons. Earcons also were inferior to all other options with respect to subjective ratings. Analyses generally suggested parity among alert types for workload and dual-task interference. Although high heterogeneity in our analyses cannot rule out a wide range of possible effects, based on currently available evidence, it appears that speech, spearcons, and hybrid (e.g., spearcons plus speech) auditory alerts should be preferred over auditory icons and especially earcons, all other considerations being equal. These findings can guide the selection of brief audio alerts in interface design.
Article
With the development of connected vehicles, in-vehicle auditory alerts enable drivers to effectively avoid hazards by quickly presenting critical information in advance. Auditory icons can be understood quickly, evoking a better user experience. However, as collision warnings, the design and application of auditory icons still need further exploration. Thus, this study aims to investigate the effects of internal semantic mapping and external acoustic characteristics (compression and dynamics design) on driver performance and subjective experience. Thirty-two participants (17 females) experienced 15 types of warnings — (3 dynamics: mapping 0 vs. 1 vs. 2) × (5 warning types: original iconic vs. original metaphorical vs. compressed iconic vs. compressed metaphorical auditory icon vs. earcon) — in a simulator. We found that compression design was effective for rapid risk avoidance, which was more effective in iconic and highly pitch-dynamic sounds. This study provides additional ideas and principles for the design of auditory icon warnings.
Conference Paper
Full-text available
Spatially distributed warning signals are able to increase the effectiveness of Advanced Driver Assistance Systems. They provide a better performance regarding attention shifts towards critical objects, and thus, lower a driver´s reaction time and increase traffic safety. The question which modality is used best, however, remains open. We present three driving simulator studies (30 participants each) with spatially distributed warnings, whereby two focused on spatial-visual as well as auditory warnings respectively. The third study, which combined the most promising approaches from the previous studies, depicts a multimodal spatial warning system. All studies included a baseline without secondary tasks and warnings. Afterwards, subjects were confronted with multiple (30+) critical objects while performing a secondary task. The chronological order of warnings was randomly mixed between spatial, non-spatial and no warning during the first two studies. Data from reaction times, eye tracking data, and questionnaires were collected. Results show that spatial-visual directed warnings are more effective than non-spatial warnings in large distances, but subjects do have difficulties in detecting objects in peripheral regions when they are distracted. While auditory spatial warnings are not as efficient as literature implies, it still performed best in this particular situation. Results of the multimodal warning study, discussion and implications on Advanced Driver Assistance Systems (ADAS) conclude the paper.
Article
The role of auditory efferent feedback from the medial olivocochlear system (MOCS) and the middle-ear-muscle (MEM) reflex in tonal detection tasks for humans in the presence of noise is not clearly understood. Past studies have yielded inconsistent results on the relationship between efferent feedback and tonal detection thresholds. This study attempts to address this inconsistency. Fifteen human subjects with normal hearing participated in an experiment where they were asked to identify an alarm signal in the presence of 80 dBA background (pink) noise. Masked detection thresholds were estimated using the method of two-interval forced choice (2IFC). Contralateral suppression of transient-evoked otoacoustic emissions (TEOAEs) was measured to estimate the strength of auditory efferent feedback. Subsequent correlation analysis revealed that the contralateral suppression of TEOAEs was significantly negatively correlated (r = -0.526, n = 15, p = 0.0438) with alarm-in-noise (AIN) detection thresholds under negative signal-to-noise conditions. The result implies that the stronger the auditory efferent feedback, the worse the detection thresholds and thus the poorer the tonal detection performance in the presence of loud noise.
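As a quick arithmetic check on the statistic reported above, the significance of a Pearson correlation of r = -0.526 with n = 15 can be verified by converting r to a t statistic with n - 2 degrees of freedom. A minimal Python sketch (the critical value 2.160 for df = 13, two-tailed alpha = .05, is a standard t-table entry; the function name is illustrative):

```python
import math

def r_to_t(r: float, n: int) -> float:
    """Convert a Pearson correlation r (sample size n) to a t statistic
    with n - 2 degrees of freedom: t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t = r_to_t(-0.526, 15)       # about -2.23
T_CRIT_DF13 = 2.160          # two-tailed critical value, alpha = .05, df = 13

print(round(t, 2))           # -2.23
print(abs(t) > T_CRIT_DF13)  # True: consistent with the reported p = .0438
```

The |t| value of about 2.23 just exceeds the 2.160 critical value, matching the borderline p = .0438 reported in the abstract.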
Article
Driver assistance systems aim to support human behavior and increase safety on the road. Common examples include forward collision warning systems, lane deviation warning systems, and park assistance systems. Warning systems can communicate with the driver through various modalities, but auditory warnings have the advantage of not further tasking visual resources that are primarily used for driving. Auditory warnings can also be presented from a specific location within the cab environment so the driver can use them as a spatial cue. Beattie, Baillie, Halvey, and McCall (2014) assessed presenting warnings in stereo configuration, coming from one source, and bilateral configuration, panned fully from left or right, and found that drivers felt more in control with lateral warnings than stereo warnings when the car was in self-driving mode. Straughn, Gray, and Tan (2009) examined laterally presented auditory warnings to signal potential collisions. They found that the ideal presentation of warnings in either the avoidance direction, in which the driver should direct the car to avoid a collision, or the collision direction, in which the potential collision is located, was dependent on time to collision. Wang, Proctor, and Pick (2003) applied the stimulus-response compatibility principle to auditory warning design by using a steering wheel in a non-driving scenario and found that a tone presented monaurally in the avoidance direction led to the fastest steering response. However, the reverse finding occurred when similar experiments utilized a driving simulator in a driving scenario (Straughn et al., 2009; Wang, Pick, Proctor, & Ye, 2007).
Article
Objective We compared the effectiveness of single-tone earcons versus spearcons in conveying information about two commonly monitored vital signs: oxygen saturation and heart rate. Background The uninformative nature of many medical alarms, and clinicians' lack of response to them, is a widespread problem that can compromise patient safety. Auditory displays, such as earcons and spearcons (speech-based earcons), may help clinicians maintain awareness of patients' well-being and reduce their reliance on alarms. Earcons are short abstract sounds whose properties represent different types and levels of information, whereas spearcons are time-compressed spoken phrases that directly state their meaning. Listeners might identify patient vital signs more accurately with spearcons than with earcons. Method In Experiment 1 we compared how accurately 40 nonclinician participants using either (a) single-tone earcons differentiated by timbre and tremolo or (b) Cantonese spearcons recorded using a female Cantonese voice could identify both oxygen saturation and heart rate levels. In Experiment 2 we tested the identification performance of six further nonclinician participants with spearcons recorded using a male Cantonese voice. Results In Experiment 1, participants using spearcons identified both vital signs together more accurately than did participants using earcons. Participants using Cantonese spearcons also learned faster, completed trials faster, identified individual vital signs more accurately, and felt greater ease and more confident when identifying oxygen saturation levels. Experiment 2 verified the previous findings with male-voice Cantonese spearcons. Conclusion Participants identified vital signs more accurately using spearcons than with the single-tone earcons. Application Spearcons may be useful for patient monitoring in situations in which intermittently presented information is desirable.
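Spearcons, as described above, are time-compressed spoken phrases. A minimal pure-Python sketch of overlap-add (OLA) time compression illustrates the underlying idea; the function name and parameters are illustrative, and production spearcon pipelines typically use more sophisticated pitch-preserving algorithms (e.g. SOLA or a phase vocoder) rather than this naive version:

```python
import math

def ola_time_compress(x, ratio, win=256):
    """Naive overlap-add time compression: output is ~len(x)/ratio samples.

    x     : list of float samples
    ratio : compression factor (e.g. 2.0 halves the duration)
    win   : analysis window length (50% synthesis overlap)
    """
    hop_syn = win // 2                    # synthesis hop in the output
    hop_ana = int(hop_syn * ratio)        # analysis hop skips through the input
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / win) for n in range(win)]
    out_len = int(len(x) / ratio) + win
    out, norm = [0.0] * out_len, [0.0] * out_len
    frame = 0
    while frame * hop_ana + win <= len(x):
        for n in range(win):
            out[frame * hop_syn + n] += x[frame * hop_ana + n] * hann[n]
            norm[frame * hop_syn + n] += hann[n]
        frame += 1
    # Normalize where windows overlapped; leave silence elsewhere.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]

# Compress a 1 s, 440 Hz tone (8 kHz sample rate) to roughly half duration.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
short = ola_time_compress(tone, 2.0)
```

Because frames are re-aligned rather than simply resampled, the local waveform (and thus the perceived pitch of speech) is largely retained while duration shrinks, which is the property that keeps heavily compressed spearcons recognizable.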
Conference Paper
Within in-vehicle Human-Machine Interface design, visual displays are predominant, taking up more and more of the visual channel with each new system added to the car, e.g. navigation systems, blind spot information, and forward collision warnings. Sounds, however, are mainly used to alert or warn drivers together with visual information. In this study we investigated the design of auditory displays for advisory information by designing a 3D auditory advisory traffic information system (3DAATIS), which was evaluated in a driving simulator study with 30 participants. Our findings indicate that overall, drivers' performance and situation awareness improved when using this system. But, more importantly, the results also point toward the advantages and limitations of the use of advisory 3D sounds in cars, e.g. attention capture vs. limited auditory resolution. These findings are discussed and expressed as design implications.
Article
A warning signal presented via a visual or an auditory cue might interfere with auditory or visual information inside and outside a vehicle. Such interference, on the other hand, would certainly be reduced if a tactile cue were used. Therefore, tactile cues are expected to be promising as warning signals, especially in noisy environments. In order to determine the most suitable modality of cue (warning) to a visual hazard in noisy environments, auditory and tactile cues were examined in this study. The stimulus onset asynchrony (SOA) was set to 0 ms, 500 ms, and 1000 ms. Two types of noise were used: white noise and noise outside a vehicle recorded in a real-world driving environment. The noise level LAeq (equivalent continuous A-weighted sound pressure level) inside the experimental chamber was adjusted to approximately 60 dB(A), 70 dB(A), and 80 dB(A) for each type of noise. As a result, the tactile warning was verified to be more effective than the auditory warning. When the recorded outside-vehicle noise from the real driving environment was used inside the experimental chamber, the reaction time to the auditory warning was not affected by the noise level.
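The LAeq measure used above is defined as the level of a steady sound carrying the same energy as the fluctuating sound over the measurement interval: LAeq = 10 log10((1/T) ∫ pA²(t)/p0² dt), with reference pressure p0 = 20 µPa. A minimal sketch computing it from already A-weighted pressure samples; the sample values are illustrative:

```python
import math

P_REF = 20e-6  # reference pressure p0, 20 micropascals

def laeq_db(pressures):
    """Equivalent continuous A-weighted level from A-weighted pressure
    samples in pascals: 10 * log10(mean(p^2) / p_ref^2)."""
    mean_sq = sum(p * p for p in pressures) / len(pressures)
    return 10 * math.log10(mean_sq / (P_REF ** 2))

# A constant 0.02 Pa A-weighted pressure is 1000x the reference,
# i.e. 20 * log10(1000) = 60 dB(A).
level = laeq_db([0.02] * 1000)
print(round(level, 1))  # 60.0
```

Because the mean is taken over squared pressure (i.e. energy), short loud bursts and long quiet stretches trade off against each other, which is why a single LAeq figure can summarize fluctuating in-vehicle noise.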
Conference Paper
For the current diagnosis of Alzheimer’s dementia (AD), physicians and neuroscientists primarily call upon visual and statistical analysis methods of large, multi-dimensional positron emission tomography (PET) brain scan data sets. As these data sets are complex in nature, the assessment of disease severity proves challenging and is susceptible to cognitive and perceptual errors causing intra- and inter-reader variability among doctors. The Triple-Tone Sonification method, first presented and evaluated by Roginska et al., adds an audible element to the diagnosis process, offering doctors another tool to gain certainty and clarification of disease stages. Based on audible beating patterns resulting from three interacting frequencies extracted from PET brain scan data, the Triple-Tone method underwent a second round of subjective listening tests and evaluation, this time with radiologists from NYU Langone Medical Center. Results show the method is effective at evaluating PET brain scan data.
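The beating patterns described above arise whenever tones of nearby frequency are summed: two tones at f1 and f2 interfere at a beat rate of |f1 - f2| Hz. A minimal sketch of mixing three such tones; the frequencies and function name are illustrative, not those used by the Triple-Tone method:

```python
import math

def triple_tone(freqs, sr=8000, dur=1.0):
    """Sum three sinusoids at the given frequencies (Hz); tones that are
    close in frequency produce audible amplitude beats in the mixture."""
    n = int(sr * dur)
    return [sum(math.sin(2 * math.pi * f * t / sr) for f in freqs) / len(freqs)
            for t in range(n)]

# Tones at 440, 443, and 446 Hz beat against each other at ~3 Hz,
# a rate slow enough to be heard as a pulsing loudness pattern.
sig = triple_tone([440.0, 443.0, 446.0])
```

Mapping data-derived frequency differences to beat rates in this way is what lets a listener judge relative values by ear rather than by reading numbers.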