Using EEG to Understand why Behavior to Auditory
In-vehicle Notifications Differs Across Test Environments
Lewis L. Chuang
Perception, Cognition, Action,
Max Planck Institute for
Biological Cybernetics
Tübingen, Germany
lewis.chuang@tuebingen.mpg.de
Christiane Glatz
Perception, Cognition, Action,
Max Planck Institute for
Biological Cybernetics
Tübingen, Germany
christiane.glatz@tuebingen.mpg.de
Stas Krupenia
Styling and Vehicle
Ergonomics, Scania CV AB
Södertälje, Sweden
stas.krupenia@scania.com
ABSTRACT
In this study, we employ EEG methods to clarify why auditory
notifications, which were designed for task management in
highly automated trucks, resulted in different performance
behavior when deployed in two different test settings: (a)
student volunteers in a lab environment, (b) professional truck
drivers in a realistic vehicle simulator. Behavioral data showed
that the professional drivers were slower and less sensitive in
identifying notifications than their student counterparts. Such
differences can be difficult to interpret and frustrate the
deployment of implementations from the laboratory to more
realistic settings. Our EEG recordings of brain activity reveal
that these differences were not due to differences in the
detection and recognition of the notifications. Instead, they were
due to differences in EEG activity associated with response
generation. Thus, we show how measuring brain activity can
deliver insights into how notifications are processed, at a finer
granularity than behavior alone affords.
CCS Concepts
• Human-centered computing ~ Human computer interaction (HCI); • Human-centered computing ~ User studies
Author Keywords
Driving simulator; auditory notifications;
electroencephalography; event-related potential; MMN; P3;
Bereitschaftspotential.
INTRODUCTION
Notifications are a fixture of in-vehicle environments. They
are designed to direct users, who would otherwise be engaged,
to aspects of the environment that require a response (e.g., fuel
indicator lights). Recent advances in automated driving will
increase the importance of notifications, especially when the
duties of the human operator transition from vehicle control
to vehicle supervision [1, 7]. Indeed, research on the design
of in-vehicle interactions has rapidly shifted, in recent years,
towards addressing the anticipated user requirements of
automated vehicles [22]. Given the rapid pace of innovation
in technology and design, it is surprising that we continue to
have few tools at our disposal that allow us to truly appreciate
how users process and act upon in-vehicle notifications.
Here, we combine the analysis of behavioral responses with
electroencephalography (EEG) recordings to better understand
how auditory notifications were processed for information and
responded to across different test environments and participant
groups.
The design space of notifications is large. This gives rise to
countless variations of how in-vehicle notifications ought to be
designed and for which purpose. While guidelines have been
proposed for the design of in-vehicle displays (e.g., [16, 39,
44]), they tend to be based on studies that focus on
safety-critical behavior. Ultimately, notifications are implemented
according to whether or not they are effective in safely
eliciting the desired responses in the environment for which
they were designed. Unfortunately, evaluating notifications
by performance measures alone can only reveal whether a
given implementation is better or worse than its control
comparison. To understand why a given notification results
in better or worse performance than originally anticipated, it
is necessary to evaluate the extent to which the notification
is perceived, processed, and acted upon. For this, it is
necessary to inspect how the brain responds to notifications.
Methods for neuroimaging are becoming more accessible. Re-
cent developments in neuroimaging technology, especially
with regards to EEG, have focused on the ease of application
and user mobility [14]. In spite of this, valid concerns persist
with regards to their suitability for use in realistic test envi-
ronments, especially since the presence of electronic devices
can introduce substantial noise into EEG recordings. On a
more practical note, EEG measurements are often expected to
impose implementational costs on the researcher that might
be deemed unnecessary, especially when behavioral and self-
report measures are expected to suffice.
Nonetheless, the time-varying EEG signal offers a detailed
inspection of how information is processed by the human user,
which cannot be achieved with performance measurements
alone. With regards to the evaluation of notifications, EEG
measurements can reveal how the brain automatically detects
and consciously identifies notifications. They can also indi-
cate how the brain prepares itself to generate an appropriate
response, pending the identification of the notification. Per-
formance measurements implicitly treat the human operator
as a single stimulus-response unit and do not, in themselves,
distinguish between perception, cognition, and action.
Our research aim was to demonstrate that employing EEG
methods can allow us to account for why behavioral responses
to auditory notifications might differ across different instances
of testing: specifically, between professional drivers tested in
a high-fidelity driving simulator and students tested in a
psychophysical laboratory. This is an experience that is common
to many researchers when evaluating novel designs of
notification interfaces for deployment in the "real world". Often,
interfaces are first prototyped and evaluated under highly
controlled conditions before they are deployed in more realistic
environments and tested with their intended users. When the
behavioral results of a highly controlled test do not generalize
to a more realistic one, it is often difficult to establish the
reasons for the discrepancy.
Here, we show that EEG measurements can complement be-
havioral results to provide a better resolution for understand-
ing how notifications are processed by users across different
settings. Unlike behavioral performance, the appropriate
application of EEG measurements allows the researcher to dis-
criminate how notifications are processed by the brain across at
least three different stages of information processing, namely
perceptual, cognitive, and response stages [46, 47]. In this re-
gard, it offers researchers the ability to investigate the outcome
of information processing at the various stages that lie in be-
tween the presentation of a stimulus (i.e., auditory notification)
and a behavioral response (i.e., keypress).
We report two experiments that presented participants with
identical tasks under two different test environments. All
of our participants were required to respond to auditory
notifications, which were previously designed to direct the attention
of commercial truck drivers to the occurrence of task
requirements during a long-distance, automated vehicle mission [13].
They were also presented with irrelevant distractor sounds,
which they had to ignore, and a dynamic visual scene, which
varied in its realism according to the test environment. The first
experiment was performed as a highly controlled psychophysical
experiment (N=15), with low mission fidelity, on young
and untrained student participants. The second experiment
was performed on professional truck drivers in a high-fidelity
driving simulator (N=14). Our findings are as follows:
1. Professional drivers in a high-fidelity simulator were slower
and less sensitive in discriminating target notifications from
distractor sounds.
2. The EEG activity for notification detection and identification
did not discriminate between the two test environments.
3. The EEG activity for correct responses to the notifications
(i.e., Bereitschaftspotential; BP) discriminated between the
two test environments.
4. Thus, we attribute the observed behavioral differences to
differences in the sample demographics and not to differences
in the test environment.
RELATED WORK
In-vehicle notifications are often designed to shift user atten-
tion from the primary task of engagement (e.g. driving) to
a critical event (e.g. low fuel). With advances in vehicle
sensing, notifications can also be designed to direct a user’s
attention to safety-critical aspects of the driving task (e.g.
pedestrian detection [45, 49]) or to support decision making
(e.g. lane-changing [25]). With the increased adoption of
automated vehicles, we expect the role of notifications to grow
in prominence. Besides takeover notifications that prompt
users to resume vehicle handling (e.g. [5, 36]), we also ex-
pect task-management notifications to pervade the in-vehicle
environment as the scope of permissible in-vehicle activities
grows ([34]). In particular, commercial vehicles (i.e., trucks)
stand to benefit from the effective introduction of task-
management notifications. This is because commercial drivers
might be expected to assume additional responsibilities, such as
delivery logistics, as their responsibility for vehicle handling
diminishes with increasing automation.
Auditory notifications
Auditory displays tend to be a favored delivery medium for
notifications, given that they are not in direct conflict with
the visuomotor demands of vehicle handling [33]. Auditory
information that is presented during driving has been claimed
to be more deeply processed than visual information, given
that it is more likely to be recalled post testing [32]. However,
it can also be perceived as being more distracting. In any case,
auditory notifications inhabit a large design space with ma-
nipulable parameters, which include their formant, duration,
interpulse interval, onset/offset latency and more. This allows
them to be flexibly tuned in order to communicate informa-
tion, such as urgency [12], even whilst being moderated for
undesirable side-effects, such as perceived annoyance [29].
Some sounds are more effective than others, particularly when
operational concerns are taken into consideration. For exam-
ple, a driving simulator study compared different auditory
warnings for signalling potential headway collision and found
that the sound of a car horn and a tone with a looming time-
to-contact intensity resulted in the best braking latencies [15].
Nonetheless, the sound of a car horn also resulted in more
unnecessary braking responses than the looming tone. A sep-
arate study compared four classes of auditory notifications—
namely, abstract sounds, auditory icons, nonspecific environ-
ment sounds, and speech messages—for their efficacy in cuing
for driving situations such as low tire pressure, low oil level,
engaged handbrake whilst driving, and others [31]. This study
found that speech messages and auditory icons generated faster
and more accurate responses than either environment sounds
or abstract sounds.
In the examples mentioned above, auditory displays were often
evaluated on the basis of behavioral responses—namely, accu-
racy and response latencies. Although brain responses offer
a finer granularity of information processing, they are rarely
employed in the evaluation of auditory notifications.
Event-related potentials (ERPs)
EEG refers to the electrical activity measured by surface
electrodes placed on the human scalp (typically in the range of
10–100 µV), which can be attributed (in part) to brain activity
[41]. Event-related potentials (ERPs) are changes within a
pre-specified time window of EEG activity that are generated
after or prior to a known event. In this work, we focus on
two physical events: (i) the presentation of a target notification,
and (ii) the participants' self-generated response to target
notifications. Respectively, these allow us to understand, first,
our participants' capacity to detect and recognize the
presented notification and, second, their decision to generate a
behavioral response, relative to their recorded responses. This
corresponds to three stages of information processing that lie
between presenting a notification and generating a response.
Stimulus ERPs for auditory events
Auditory stimuli are frequently employed in ERP studies as
they elicit waveforms with identifiable components that are
associated with cognitive mechanisms. Of current interest
are the slow transient responses that arise from the auditory
and associated cortices (i.e., from about 50 msec after sound onset). A
popular paradigm (i.e., oddball paradigm) presents two dis-
criminable sounds, one more frequently than the other (e.g.,
80% to 20%). Participants are only required to respond to the
infrequent sound, which is termed the target. This corresponds
to a real-world scenario where auditory notifications have to
be detected and identified against an auditory background of
distractors. Subtracting the EEG activity generated by the fre-
quent distractor sounds from that generated by targets results
in a difference waveform with two interpretable components.
The first is the mismatch negativity (MMN), an early negative
deflection with a typical latency of about 140 msec. The
MMN is associated with an automatic process that responds
in proportion to the perceived deviance of the targets from the
distractors. It is generated even when there is no task involved.
The second is the P3, a positive late deflection with a latency
between 450–600 msec. The P3 is only generated if
the subject is attending to the stimuli in a way that demands
a response (but, see [40]). Working memory processes that
underlie context updating are claimed to be represented by the
P3 component [11]. Indeed, fMRI studies have localized brain
regions, which are typically implicated with conscious effort
and working memory processes (i.e., insular cortex, inferior
parietal and frontal lobes), as generators of the P3 (e.g. [24]).
In the current study, we evaluated MMN and P3 responses
to target auditory notifications in order to determine whether
behavioral differences across different test environments re-
flected changes in their automatic detection and/or voluntary
identification.
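To make this logic concrete, the following Python sketch derives a difference waveform from stimulus-locked epochs and quantifies the MMN and P3 as mean amplitudes within their typical latency windows. The arrays target_epochs and distractor_epochs are hypothetical inputs (trials × samples), the 250 Hz sampling rate is an assumption, and the windows simply restate the latencies cited above; this is an illustration, not the authors' analysis code.

```python
import numpy as np

# Assumed inputs: stimulus-locked epochs (n_trials x n_samples) at 250 Hz,
# spanning -500 ms to +1000 ms around sound onset (hypothetical layout).
fs = 250
times = np.arange(-0.5, 1.0, 1.0 / fs)

def mean_amplitude(wave, t_start, t_end):
    """Mean amplitude of a waveform within a latency window (seconds)."""
    mask = (times >= t_start) & (times < t_end)
    return wave[mask].mean()

# ERPs: average across trials, then subtract the frequent distractors from
# the infrequent targets to obtain the difference waveform.
diff_wave = target_epochs.mean(axis=0) - distractor_epochs.mean(axis=0)

# MMN: early negativity around 140 ms; P3: late positivity at 450-600 ms.
mmn_amplitude = mean_amplitude(diff_wave, 0.10, 0.20)
p3_amplitude = mean_amplitude(diff_wave, 0.45, 0.60)
```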
Response ERPs for motoric responses
Brain responses can also be measured prior to response
actuation. When EEG activity is temporally aligned to an
individual's generated responses, it is possible to observe a slow
change in potentiation leading up to the response onset. This
observation was first reported in 1964 and was termed the
Bereitschaftspotential (i.e., readiness potential; BP) [20, 43].
The BP is maximal at the midline centro-parietal area (i.e.,
CPz). The adoption of a common average reference, such as
in the current work, means that it is observed as a positive
potential shift in the parietal and occipital electrodes and as a
negative potential shift in the frontal electrodes. In an
interesting series of experiments, participants were asked to
report the time at which they decided to generate volitional
keypress responses [23]. Here, the BP was found to be initiated
approximately 350 msec prior to the participants' reported times,
which raised philosophical doubts about the nature of free will.
Putting such existentialist concerns aside, most researchers
agree that the initiation of BP has its physiological origins in
the supplementary motor area (SMA), a brain region that is
implicated in the generation of motor responses, at least in the
case of hand movements [38]. Thus, BP could be regarded
as a cortical decision to initiate a motor action (prior to the
conscious realization of the decision itself!). More recently, it
has been claimed that this cortical decision can be volitionally
cancelled up to a point (i.e., 200 msec prior to the response),
after which the generation of a motor response is inevitable [42].
For our current purposes, we treat BP as an indicator for the
cortical decision to respond, which is initiated earlier than the
recorded response itself. This allows us to determine the la-
tency between the cortical decision and the actuated response,
as well as the amplitude of the cortical decision itself, which is
not evident in a binary key- or button-press response.
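A minimal sketch of how such a BP latency and amplitude could be read off a response-locked average is given below. The input cpz_average is a hypothetical response-locked waveform at electrode CPz (t = 0 at the registered response), and the peak search over the pre-response second is an illustrative assumption, not the exact procedure used in this study.

```python
import numpy as np

# Assumed input: response-locked average at CPz, sampled at 250 Hz,
# spanning the 1000 ms before the registered response (t = 0 at response).
fs = 250
times = np.arange(-1.0, 0.0, 1.0 / fs)

def bp_peak(response_locked_avg):
    """Latency (s; negative = before the response) and amplitude of the
    largest pre-response potential shift, taken here as the BP peak."""
    idx = np.argmax(np.abs(response_locked_avg))
    return times[idx], response_locked_avg[idx]

bp_latency, bp_amplitude = bp_peak(cpz_average)  # cpz_average: assumed input
```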
STUDY
This study used a between-groups design that compared behav-
ioral performance and brain responses to auditory notifications
across two different experimental settings: (i) a highly con-
trolled psychophysical laboratory, (ii) a high fidelity driving
simulator environment. The whole experiment took approxi-
mately 2.5 hours to complete, including preparation time and
debriefing.
Participants
Thirty participants performed the task reported here: fifteen
undergraduate students (mean age (std) = 26.1 (4.0) years;
9 males) from the University of Tübingen, Germany, and fifteen
professional commercial drivers (mean age (std) = 41.4 (12.1)
years; 13 males) who were employees of Scania CV AB, Sweden.
The data of one professional driver, from
the driving simulator testing, had to be excluded from further
analysis because only one response was recorded throughout
the entire experiment. Besides demographic differences, the
primary difference between these two groups was the test
environment that they experienced (see 3.2). All participants
reported no known hearing defects and provided signed
informed consent.
Apparatus and Stimuli
Psychophysics laboratory
The psychophysics laboratory had black walls, was insulated
against external sounds, and had an ambient sound level of
approximately 40 dB (Figure 1, left). Visual stimuli were presented
on a desktop display (60° field-of-view; 45 cm distance to fixed
Figure 1. Left: A student participant in a psychophysical laboratory (Department for Human Perception, Cognition, and Action, MPI for Biological
Cybernetics, Tübingen, Germany). Right: A professional commercial driver in a truck driving simulator (Styling and Vehicle Ergonomics, Scania CV
AB, Södertälje, Sweden)
chin-rest). The visualization was rendered by customized soft-
ware written in Matlab R2013b (The Mathworks, Natick, MA).
The visualization consisted of a cursor that drifted horizon-
tally between two vertical lines, which represented a vehicle’s
position in a single lane. Participant responses were collected
via keypresses on a standard USB keyboard. Sound presen-
tation was controlled by an ASIO 2.0 compatible sound card
(SoundBlaster ZxR; Creative Labs) and displayed via stereo
speakers, each placed on either side of the display.
High fidelity driving simulator
The driving simulator was designed to simulate the operation
of commercial truck vehicles. Participants sat in a realistic
cabin interior, based on an existing truck seating buck that con-
sisted of a pneumatic seat, a steering column complete with
wheel and shaft, instrument cluster, and the remaining dash-
board (Figure 1, right). Visualization was presented on a
front-projection three-wall display (approx. 150° field-of-view;
450 cm distance to head) and via two pairs of vertically
aligned displays, attached to either side of the cabin, that
simulated the side rear-view mirrors. The visualization was rendered
by a customized graphical engine (i.e., VISIR) that created
3D environments from OpenDrive road network files (.xodr)
and from an additional landscape description file (.xml). Here, we
presented a highway scene between Linköping and Norrköping,
with two lanes of same-direction traffic and two lanes of
opposing traffic. The participant's vehicle occupied the far-right lane. The
highway was populated with low traffic density, including the
occasional headway vehicle. Experimental responses were
collected via dedicated buttons that were located on the steer-
ing wheel. Sound presentation was controlled by an ASIO
2.0 compatible sound card (RME HDSP 9632; RME Intelli-
gent Audio Solutions) and displayed via a 5.1 surround sound
system, installed around the driver’s seat.
Experimental Stimuli
Twelve target and ninety distractor sounds were used in this
experiment. These were modified from sounds that were de-
signed as part of a project (MODAS: Methods for Designing
Future Autonomous Systems) to cue truck drivers to attend to
possible non-driving tasks [13, 21]. All sounds had a duration
of 500 msec.
There were two notifications for each of six non-driving tasks.
They were a verbal command in Swedish and an auditory icon
(in brackets): (i) system (synthetic tone), (ii) convoy (train
whistle), (iii) driver (human whistle), (iv) weather (raindrop),
(v) road (rumbling), (vi) traffic (car horn). The distractor
sounds were random permutations of four sounds, two verbal
commands and two auditory icons, played simultaneously in
reverse.
EEG recording
EEG was recorded on a dedicated PC using a 59-channel active
electrode array that was affixed to the scalp using an elastic
whole-head cap, which specified pre-determined sites, including
those corresponding to the international 10-20 system
(ActiCap System, Brain Products GmbH, Munich, Germany).
The horizontal and vertical electrooculogram (HEOG/VEOG)
were recorded with four electrodes attached to the right and
left canthi as well as above and below the left eye. FCz was
used as an online reference for all channels. Prior to testing,
electrode gel was applied to ensure that the electrode impedance
was below 20 kΩ for each channel. EEG signals were digitized
at a rate of 1000 Hz. EEG recordings were synchronized
with experimental events via a parallel port connection to the
experimental PC.
Task
All participants performed the same task, regardless of the
experiment setting. They were required to attend to the visual
scene throughout the experiment. Participants were informed
that this was a simulated automated driving scenario and that
no steering was necessary. Whenever they heard a sound, they
were required to respond if it was a target notification and
to ignore it if it was a distractor sound. The inter-stimulus
Figure 2. Six examples of clusters of dipoles (blue) and their mean position (red), their projected scalp activity, and power spectral density (inset: left to
right), derived from EEG recordings in the driving simulator. First row: Cortical dipoles that are likely to be associated with auditory processing (left)
and motor response generation (right) respectively. Second row: Non-cortical dipoles that are associated with muscle activity (left) and eye-movements
and -blinks (right). Third row: Non-cortical dipoles that are due to electrical line noise (left) and unresolved variance in EEG recording (right).
interval was randomly selected from a uniform distribution
across 1800-2000 msec. Participants could respond within
2000 msec of the onset of the target notification. Failures to
do so were considered misses. Responses to distractor sounds
were treated as false alarms. Each experiment presented
approximately 980 sounds in total; 20% of these were target
notifications.
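For concreteness, the trial schedule just described can be sketched as follows. The parameters restate the text; the data structure itself is an illustrative assumption (the original experiments were driven by Matlab software), not the study's actual code.

```python
import random

# Parameters restated from the text.
N_SOUNDS = 980             # approximate total sounds per experiment
P_TARGET = 0.20            # proportion of target notifications
RESPONSE_WINDOW_MS = 2000  # responses accepted within 2000 ms of target onset

# Each trial: target or distractor, followed by a random inter-stimulus
# interval drawn uniformly from 1800-2000 msec.
trials = [{
    "is_target": random.random() < P_TARGET,
    "isi_ms": random.uniform(1800, 2000),
    "response_window_ms": RESPONSE_WINDOW_MS,
} for _ in range(N_SOUNDS)]
```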
Results: Behavioral performance
Our participants' performance was assessed in terms of the
median of their correct response times (RT) and the
discriminability index (i.e., d'). The discriminability index is calculated
as the difference between the z-scores of the hit and false-alarm
rates, whereby hits and false alarms were responses to target
and distractor sounds respectively [28]. Welch's t-tests for
independent samples were performed to compare behavioral
performance in the high-fidelity driving simulator and the
psychophysical laboratory [8]. The adopted criterion for statistical
significance was α=0.05.
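A minimal sketch of both measures in Python is given below. The trial counts passed to d_prime and the per-participant median RT arrays (rt_simulator, rt_lab) are hypothetical inputs; the formulas follow the definitions above.

```python
import numpy as np
from scipy import stats

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Discriminability index: z(hit rate) - z(false-alarm rate).
    Rates of exactly 0 or 1 would need a correction in practice."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return stats.norm.ppf(hit_rate) - stats.norm.ppf(fa_rate)

# Welch's t-test (unequal variances) comparing the two groups;
# rt_simulator and rt_lab hold one median RT per participant.
t_stat, p_value = stats.ttest_ind(rt_simulator, rt_lab, equal_var=False)
```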
Participants in the high fidelity driving simulator were slower
(mean=1262 msec; std.=120 msec) in their correct responses
than those in the psychophysics laboratory (mean=1062 msec;
std.=139 msec). This difference (mean=200 msec) is statistically
significant (t(26.8)=4.18, p<0.001, Cohen's d=1.54) and
has a 95% confidence interval from 102 to 299 msec.
Participants in the high fidelity driving simulator were less
sensitive (mean=4.37; std.=1.03) in discriminating the au-
ditory target notifications from their distractor counterparts,
than those in the psychophysics laboratory (mean=5.25;
std.=0.60). This difference (mean=0.88) is statistically
significant (t(20.6)=2.81, p<0.05, Cohen's d=1.06) and has a 95%
confidence interval from 0.23 to 1.54.
To summarize, the behavioral results indicated slower responses
and lower discrimination sensitivity for auditory notifications
in the driving simulator, compared to the psychophysics laboratory.
Results: EEG/ERP responses
Data collection, signal processing, and statistical analysis
Data pre-processing and analysis were performed offline with
Matlab (The Mathworks, Natick, MA) scripts based on
EEGLAB v.14 (https://sccn.ucsd.edu/eeglab/), an open-source
environment for processing electrophysiological data [9]. The
following steps were performed on the EEG data prior to
analyzing the ERPs of stimuli and responses [3]. First, the data
was downsampled to 250 Hz to reduce computational costs.
Next, a high-pass filter (cut-off = 0.5 Hz) was applied to remove
slow drifts, 50 Hz electrical line noise from the environment
was removed using the CleanLine algorithm, and bad channels
were removed using the ASR algorithm. Next, all electrodes
were re-referenced to their common average, and each participant's
dataset was separately submitted to an Adaptive Mixture ICA
(AMICA) to decompose the continuous data into source-resolved
activity [10]. On these learned independent components (ICs), equivalent
Figure 3. LEFT: Stimulus ERPs are illustrated by and labelled in three difference waveforms that depict the averaged EEG activity of electrode groups
from anterior, central, and posterior regions. RIGHT: MUA results plot statistically significant t-values between the two participant groups for every
electrode and time-point. The analysis reveals no statistically significant electrode-time regions following the auditory notification.
current dipole model estimation was performed by using an
MNI Boundary Element Method head model to fit an equiva-
lent dipole to the scalp projection pattern of each independent
component. ICs whose dipoles were located outside the brain
were excluded as well as those that had a residual variance of
over 15%. Within each participant group, ICs were clustered
into 30 clusters using k-means based on their mean power
spectra, topography, and equivalent dipole location.
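A rough equivalent of this pipeline can be sketched in Python with MNE, as below. This is not the authors' EEGLAB/Matlab code: the file name is hypothetical, a notch filter stands in for CleanLine, bad-channel removal via ASR is omitted, and infomax ICA stands in for AMICA.

```python
import mne

# Hypothetical recording file; the study used BrainVision (ActiCap) hardware.
raw = mne.io.read_raw_brainvision("recording.vhdr", preload=True)

raw.resample(250)                    # downsample to reduce computational cost
raw.filter(l_freq=0.5, h_freq=None)  # high-pass filter to remove slow drifts
raw.notch_filter(freqs=50)           # suppress 50 Hz line noise (vs. CleanLine)
raw.set_eeg_reference("average")     # re-reference to the common average

# Decompose into independent components; non-brain components (eye, muscle,
# line noise) would then be identified and excluded before ERP analysis.
ica = mne.preprocessing.ICA(n_components=30, method="infomax")
ica.fit(raw)
raw_clean = ica.apply(raw.copy())
```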
Figure 2 provides examples of dipole clusters with either cor-
tical (first row) or non-cortical origins (e.g., muscle and eye
activity (second row), electrical activity from environment
sources (third row)). Non-cortical components were identified
on the basis of their power spectral density, scalp topology,
and location in a volumetric brain model [19]. As might be
expected, more non-cortical dipole components were found
in participants who performed the experiment in the
driving simulator (N=14) than in those from the psychophysical
experiment (N=15). In other words, EEG recordings were con-
taminated by the activity of more non-cortical components in
the driving simulator environment than in the psychophysical
laboratory. Non-cortical dipole clusters were removed from
the EEG recording and the remaining EEG activity was sub-
jected to comparative analysis for the two participant groups.
Specifically, we derived a stimulus ERP and a response ERP for
each participant. This was achieved by mean-averaging the
EEG activity within a time-window (also termed an epoch)
across all epochs: 1000 msec after the stimulus onset, baselined
to the 500 msec before it, and 1000 msec before the response,
baselined to the 500 msec after it. Stimulus ERPs were defined
by the differences in EEG responses to target notifications and
distractor sounds; their prominent components are the MMN and
P3, which are respectively associated with notification
detection and identification [35]. Response ERPs were defined
by EEG activity prior to the registration of our participants’ re-
sponses. It is defined by a single component, which manifests
itself as a single peak that changes from negative to positive
polarity, from the anterior to posterior electrodes. The left
panels of Figures 3 and 4 illustrate the averaged activity of
anterior (Fz, F1, F3, F5, F2, F4, F6, FC1, FC3, FC5, FC2, FC4,
FC6), central (Cz, C1, C3, C5, C2, C4, C6), and posterior (Pz,
POz, Oz, P1, P3, P5, P2, P4, P6, PO3, PO7, PO4, PO8, O1,
O2) electrodes for stimulus and response ERPs respectively.
The profiles of the derived waveforms were consistent with our
expectations for both test environments.
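Continuing the MNE-based sketch from above, the epoching just described could look as follows; the trigger codes and the use of mne.find_events are hypothetical stand-ins for the experiment's actual event markers.

```python
import mne

# Hypothetical trigger codes: 1 = target, 2 = distractor, 3 = keypress.
events = mne.find_events(raw_clean)

# Stimulus-locked epochs: 1000 ms post-onset, 500 ms pre-stimulus baseline.
stim_epochs = mne.Epochs(raw_clean, events,
                         event_id={"target": 1, "distractor": 2},
                         tmin=-0.5, tmax=1.0, baseline=(-0.5, 0.0),
                         preload=True)

# Response-locked epochs: 1000 ms pre-response, baselined after the response.
resp_epochs = mne.Epochs(raw_clean, events, event_id={"response": 3},
                         tmin=-1.0, tmax=0.5, baseline=(0.0, 0.5),
                         preload=True)

# Stimulus ERP as a difference wave: target minus distractor.
diff_wave = mne.combine_evoked(
    [stim_epochs["target"].average(), stim_epochs["distractor"].average()],
    weights=[1, -1])
```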
To evaluate our EEG recordings for differences across the two
test settings, we performed separate mass-univariate analyses
(MUA) for the stimuli and response ERPs [17]. This allowed
us to determine the time points of individual electrodes that
were statistically significant for waveform differences between
the EEG activity recorded across the two test environments.
False discovery rate control was applied (i.e., FDR-BH) [2].
We illustrate MUA results as raster plots of electrode chan-
nels across time (see Figures 3 and 4, right), whereby statis-
tically significant t-values are represented by color intensity.
The raster plots consist of three panels, whereby the top and
bottom panels indicate right- and left-hemispheric electrodes
respectively, and the middle panel indicates mid-line electrodes.
Within each panel, electrodes are ordered vertically from
anterior to central to posterior sites.
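In outline, the MUA reduces to a t-test at every electrode-time cell followed by FDR correction, as in the sketch below. group_a and group_b are assumed per-participant ERP arrays of shape (subjects, channels, time-points), and the Welch variant of the t-test is an assumption for illustration.

```python
import numpy as np
from scipy import stats
from mne.stats import fdr_correction

# Independent-samples t-test at every (electrode, time-point) cell.
t_vals, p_vals = stats.ttest_ind(group_a, group_b, axis=0, equal_var=False)

# Benjamini-Hochberg false discovery rate control across all cells.
reject, p_fdr = fdr_correction(p_vals, alpha=0.05, method="indep")

# Keep t-values only where the corrected test is significant; these cells
# are what the raster plots color-code.
sig_t = np.where(reject, t_vals, np.nan)
```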
Figure 4. LEFT: Response ERPs are illustrated by three waveforms that depict the averaged EEG activity of electrode groups from anterior, central, and
posterior regions. The BP peaks are indicated by arrows. RIGHT: MUA results reveal statistically significant differences between driving simulator and
laboratory recordings in two time periods (i.e., 600-430 and 220-0 msec before responses).
Stimulus ERPs
The MUA results (Figure 3, right) reveal no statistically
significant differences between the difference waveforms (Figure
3, left) of participants from the laboratory and driving
simulator test settings. This suggests that the notifications
elicited equivalent brain responses for detection and
identification in both groups of participants, regardless of their test
environments.
Response ERPs
The MUA results (Figure 4, right) reveal statistically
significant differences, particularly in the frontal (e.g., Fpz, Fz) and
posterior (e.g., Pz, POz, Oz) electrodes. To understand these
differences, the reader should recall from the behavioral results
that truck drivers in the driving simulator generated slower
responses than students in the psychophysical laboratory, of
about 200 msec. We note that the time periods of significantly
different EEG activity are of similar duration (i.e., 600-430
and 220-0 msec before the response event). This means that
the time that it took for truck drivers in the driving simulator
to generate a behavioral response after the initiation of corre-
sponding brain activity (i.e., BP) was approximately 220 to
230 msec longer than it took for the undergraduate students
in the psychophysical laboratory. In addition, the peak ampli-
tudes of the BP in the frontal and posterior electrodes were
smaller in the truck drivers than the undergraduate students
(Figure 4, left).
DISCUSSION
What inferences can we draw when auditory notifications,
which were designed in the confines of a well-controlled labo-
ratory, elicit different behavioral responses in more realistic
settings? The current study demonstrates that EEG measure-
ments can provide some clarity when the explanatory power
of behavioral responses are limited.
In this work, we found statistically significant behavioral dif-
ferences across two test settings. Professional drivers were
slower and less sensitive in detecting target notifications in
a high fidelity driving simulator compared to student partic-
ipants tested in a psychophysical laboratory. To begin, this
is surprising for at least two reasons. First, the verbal com-
mands were in the native language of the professional drivers.
Second, the professional drivers understood what these notifi-
cations represented in the context of their jobs. Thus, we might
have expected the professional drivers to respond faster and
more accurately. Nevertheless, although the professional drivers
were slower than our student volunteers by approximately 200
msec, they continued to respond in an acceptable time range
(i.e., less than 2 secs, the recommended time headway for
preventing rear-end collisions). From this, the current auditory
notifications might be considered suitable for fulfilling
their intended function of task management.
Based on behavioral data alone, the worse performance of the
professional drivers could be attributed to several reasons. For
example, the driving simulator could have provided a more
immersive environment that reduced the perceptual saliency of
the notifications. Alternatively, it could have been due to age
or motivational differences between the professional drivers
and student participants. Last, but not least, performance
differences could have resulted from technical differences in
the auditory displays or input devices. This host of plausible
interpretations often plagues comparative user studies that rely
solely on behavioral measurements.
With EEG measurements, we were able to infer that the
behavioral differences that we observed were due to differences
between the professional drivers and the student participants:
specifically, in how their brains prepared themselves prior to
responding. Our reasoning is as follows. First, notifications
were unlikely to have been detected or identified differently,
given that the brain responses associated with these processing
stages (i.e., MMN and P3 respectively) were similar across
the two participant groups. In other words, the auditory notifi-
cations were robustly perceived across the two different test
settings. Next, differences were found in the response ERPs.
More specifically, the response ERPs for professional drivers
had a longer latency than student participants between the BP
peak and the recorded response. In other words, more time
elapsed for the professional drivers between cortical decision-
making to respond and the response itself. This difference
in the latencies from BP initation to the motor response was
about 220 msec, and gave rise to the statistical differences that
the MUA analysis revealed. This difference in EEG activity
between the two groups converges with the difference that we
found with behavioral response times (i.e., 200 msec). Finally,
the ERPs for response generation in professional drivers had
smaller peak amplitudes than in student participants. This
suggests that the reduced cortical activity observed in the
professional drivers prior to responding could have resulted in their
later responses. This possibility weighs against alternative
explanations for slow responding, such as the sub-optimal physical
ergonomics of the truck cabin or the physical layout of the input device.
Taken together, the combination of EEG and behavioral
measurements shows that the current auditory notifications were
sufficiently salient and robust across different test environ-
ments. Although professional drivers exhibited slower and
less sensitive discrimination performance, it was not likely to
be due to the notification design or the physical environment.
Instead, it was due to differences in the sample demographics.
Thus, subsequent effort in this scenario ought to be invested
in understanding and mitigating human-factor limitations,
rather than in further refining the design of the notifications
or the physical interface.
The current work is restricted to the presentation of auditory
notifications. There are other channels of notification delivery
that remain to be considered, such as tactile notifications (e.g.,
vibrations), which have been claimed to be more easily
discriminable and less interfering with the task of driving compared
to auditory notifications [6].
To reiterate our main point, comparisons between different
types of notifications and across different deployment settings
can result in conflicting evidence, across independent stud-
ies and even within the same study. For example, while [6]
reported that auditory notifications elicited shorter response
times, a meta-analysis showed that tactile notifications elicited
faster responses, at least for low-urgency messages [27]. The
same meta-analysis emphasized that moderating factors play
a critical role in determining the suitability of notification
delivery and design. For example, tactile notifications might be
more discriminable, but only if they have low complexity (cf.
[6]); responses to auditory notifications are more accurate for
high-complexity information, such as in the current study. This
variability of empirical evidence across the diverse design and
deployment space means that it is insufficient to simply focus
on behavioral responses to user interfaces.
CONCLUSION AND OUTLOOK
The current paper contributes by demonstrating how EEG
methods could allow us, as researchers, to identify the stage of
information processing that results in differences in behavior.
Such an approach will allow us to target the limitations of our
designs for user interfaces more selectively and emphasize the
aspects that are more deserving of our attention.
While EEG measurement is not a panacea, the current work
shows that it can provide insight into how information is pro-
cessed, at a finer granularity than behavioral responses alone.
Furthermore, it can help to deliver insight when the deploy-
ment settings of our designs change across test phases. Here,
it assured us that the designed notifications were sufficiently
robust to be processed by the brains of their users, regardless
of differences in the test environments.
This level of understanding, namely of how the information
communicated by notification interfaces is processed by the
brain, will be increasingly important especially as we attempt
to increase the design space and functional diversity of noti-
fications. Nowadays, notifications are designed not only to
capture the user's attention at all costs, but also to be sensitive to
the user's goals and requirements [30]. Ambient notifications
represent a particular class of notifications that will be difficult
to evaluate if behavioral measurements are all that we have
to rely on. This is because ambient notifications are, by defi-
nition, designed to inform the user without eliciting behavior
that would interrupt existing activity [26, 37]. With such noti-
fications, responses from the brain could be measured instead
of behavioral responses.
The current work relied on high density EEG recording equip-
ment. While the use of medical grade equipment can be fea-
sibly implemented in a real car, and even for the actuation of
emergency braking [18], doing so might not be expedient for
many researchers. A recent evaluation suggests that simpler
and more convenient EEG setups, for example those that use
dry electrodes, could be implemented in a vehicle environ-
ment at a reasonable signal-to-noise ratio [48]. Innovations in
electrode designs, such as an around-the-ear EEG placement
[4], could further allow for brain responses to be measured
without imposing on users the inconvenience of donning an
unsightly EEG cap.
As we continue to innovate in-vehicle interfaces to keep up
with the demands of user expectations, it is only appropriate
that we also innovate our means for evaluating these interfaces
to keep up with the demands of inferential rigor. The current
work demonstrates the viability of one approach that should
be employed more often than it currently is.
ACKNOWLEDGMENTS
This work was partially supported by the German Research
Foundation (DFG) through financial support within project C03
of SFB/Transregio 161. We would like to thank K-Marie
Lahmer and Rickard Leandertz for their assistance in data
collection, Johan Fagerlonn for sharing his original stimuli,
and BrainProducts GmbH (Munich, Germany) for loaning us
the necessary equipment for this study.
REFERENCES
1. L. Bainbridge. 1983. Ironies of automation. Automatica
19, 6 (1983), 775–779. DOI:
http://dx.doi.org/10.1016/0005-1098(83)90046-8
2. Y. Benjamini and Y. Hochberg. 1995. Controlling the
false discovery rate: a practical and powerful approach to
multiple testing. Journal of the royal statistical society.
Series B (Methodological) (1995), 289–300.
http://www.jstor.org/stable/2346101
3. N. Bigdely-Shamlo, T. Mullen, C. Kothe, K-M. Su, and
K. A. Robbins. 2015. The PREP pipeline: standardized
preprocessing for large-scale EEG analysis. Frontiers in
neuroinformatics 9 (2015), 16. DOI:
http://dx.doi.org/10.3389/fninf.2015.00016
4. M. G Bleichner, B. Mirkovic, and S. Debener. 2016.
Identifying auditory attention with ear-EEG: cEEGrid
versus high-density cap-EEG comparison. Journal of
Neural Engineering 13, 6 (2016), 066004.
http://stacks.iop.org/1741-2552/13/i=6/a=066004
5. S. Borojeni, L. Chuang, W. Heuten, and S. Boll. 2016.
Assisting Drivers with Ambient Take-Over Requests in
Highly Automated Driving. In Proceedings of the 8th
International Conference on Automotive User Interfaces
and Interactive Vehicular Applications (AutomotiveUI '16).
ACM, New York, NY, USA, 237–244. DOI:
http://dx.doi.org/10.1145/3003715.3005409
6. Y. Cao, F. van der Sluis, M. Theune, R. op den Akker,
and A. Nijholt. 2010. Evaluating Informative Auditory
and Tactile Cues for In-vehicle Information Systems. In
Proceedings of the 2nd International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’10). ACM, New York, NY,
USA, 102–109. DOI:
http://dx.doi.org/10.1145/1969773.1969791
7. S.M. Casner, E.L. Hutchins, and D. Norman. 2016. The
Challenges of Partially Automated Driving. Commun.
ACM 59, 5 (April 2016), 70–77. DOI:
http://dx.doi.org/10.1145/2830565
8. M. Delacre, D. Lakens, and C. Leys. in press. Why
Psychologists Should by Default Use Welch’s t-test
Instead of Student’s t-test. International Review of Social
Psychology (in press).
9. A. Delorme and S. Makeig. 2004. EEGLAB: an open
source toolbox for analysis of single-trial EEG dynamics
including independent component analysis. Journal of
neuroscience methods 134, 1 (2004), 9–21. DOI:
http://dx.doi.org/10.1016/j.jneumeth.2003.10.009
10. A. Delorme, J. Palmer, J. Onton, R. Oostenveld, and S.
Makeig. 2012. Independent EEG sources are dipolar.
PloS one 7, 2 (2012), e30135.
11. E. Donchin and M.G.H. Coles. 1988. Is the P300
component a manifestation of context updating?
Behavioral and brain sciences 11, 3 (1988), 357–427.
DOI:http://dx.doi.org/10.1017/S0140525X00058027
12. J. Edworthy, S. Loxley, and I. Dennis. 1991. Improving
auditory warning design: relationship between warning
sound parameters and perceived urgency. Human factors
33, 2 (1991), 205–231. DOI:
http://dx.doi.org/10.1177/001872089103300206
13. Johan Fagerlönn, Stefan Lindberg, and Anna Sirkka.
2015. Combined Auditory Warnings For Driving-Related
Information. In Proceedings of the Audio Mostly 2015 on
Interaction With Sound (AM ’15). ACM, New York, NY,
USA, Article 11, 5 pages. DOI:
http://dx.doi.org/10.1145/2814895.2814924
14. K. Gramann, D.P. Ferris, J. Gwin, and S. Makeig. 2014.
Imaging natural cognition in action. International
Journal of Psychophysiology 91, 1 (2014), 22–29. DOI:
http://dx.doi.org/10.1016/j.ijpsycho.2013.09.003
15. R. Gray. 2011. Looming Auditory Collision Warnings for
Driving. Human Factors 53, 1 (2011), 63–74. DOI:
http://dx.doi.org/10.1177/0018720810397833
16. P. Green, W. Levison, G. Paelke, and C. Serafin. 1995.
Preliminary Human Factors Design Guidelines for
Driver Information Systems. Technical Report
FHWA-RD-94-087. US Department of Transportation,
Federal Highway Administration. 1–111 pages.
17. D. M. Groppe, T. P. Urbach, and M. Kutas. 2011. Mass
univariate analysis of event-related brain potentials/fields
I: A critical tutorial review. Psychophysiology 48, 12
(2011), 1711–1725. DOI:
http://dx.doi.org/10.1111/j.1469-8986.2011.01273.x
18. S. Haufe, J-W. Kim, I-H. Kim, A. Sonnleitner, M.
Schrauf, G. Curio, and B. Blankertz. 2014.
Electrophysiology-based detection of emergency braking
intention in real-world driving. Journal of neural
engineering 11, 5 (2014), 056011. DOI:
http://dx.doi.org/10.1088/1741-2560/11/5/056011
19. T-P. Jung, S. Makeig, C. Humphries, T-W. Lee, M. J.
Mckeown, V. Iragui, and T. J. Sejnowski. 2000.
Removing electroencephalographic artifacts by blind
source separation. Psychophysiology 37, 2 (2000),
163–178. DOI:
http://dx.doi.org/10.1111/1469-8986.3720163
20. H.H. Kornhuber and L. Deecke. 1964.
Hirnpotentialänderungen beim Menschen vor und nach
Willkürbewegungen dargestellt mit
Magnetbandspeicherung und Rückwärtsanalyse. In
Pflugers Archiv, Vol. 281. 52.
21. S. Krupenia, A. Selmarker, J. Fagerlönn, K. Delsing, A.
Jansson, B. Sandblad, and C. Grane. 2014. The Methods
for Designing Future Autonomous Systems’ (MODAS)
project: Developing the cab for a highly autonomous
truck. In Proceedings of the 5th International Conference
on Applied Human Factors and Ergonomics (AHFE2014)
(Krakow, Poland). 19–23.
22. A. L. Kun, S. Boll, and A. Schmidt. 2016. Shifting Gears:
User Interfaces in the Age of Autonomous Driving. IEEE
Pervasive Computing 15, 1 (Jan 2016), 32–38. DOI:
http://dx.doi.org/10.1109/MPRV.2016.14
23. B. Libet. 1985. Unconscious cerebral initiative and the
role of conscious will in voluntary action. Behavioral and
Brain Sciences 8, 4 (1985), 529–539. DOI:
http://dx.doi.org/10.1017/S0140525X00044903
24. D.E.J. Linden, D. Prvulovic, E. Formisano, M. Völlinger,
F. E. Zanella, R. Goebel, and T. Dierks. 1999. The
functional neuroanatomy of target detection: an fMRI
study of visual and auditory oddball tasks. Cerebral
Cortex 9, 8 (1999), 815–823. DOI:
http://dx.doi.org/10.1093/cercor/9.8.815
25. A. Löcken, W. Heuten, and S. Boll. 2015. Supporting
Lane Change Decisions with Ambient Light. In
Proceedings of the 7th International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’15). ACM, New York, NY,
USA, 204–211. DOI:
http://dx.doi.org/10.1145/2799250.2799259
26. A. Löcken, S. Sadeghian Borojeni, H. Müller, T.M.
Gable, S. Triberti, C. Diels, C. Glatz, I. Alvarez, L.
Chuang, and S. Boll. 2017. Towards Adaptive Ambient
In-Vehicle Displays and Interactions: Insights and Design
Guidelines from the 2015 AutomotiveUI Dedicated
Workshop. In Automotive User Interfaces: Creating
Interactive Experiences in the Car, G. Meixner and
C. Müller (Eds.). Springer International Publishing,
Cham, 325–348. DOI:
http://dx.doi.org/10.1007/978-3-319-49448-7_12
27. S.A. Lu, C.D. Wickens, J.C. Prinet, S.D. Hutchins, N.
Sarter, and A. Sebok. 2013. Supporting interruption
management and multimodal interface design: three
meta-analyses of task performance as a function of
interrupting task modality. Human Factors 55, 4 (2013),
697–724. DOI:
http://dx.doi.org/10.1177/0018720813476298
28. N. A. Macmillan and C. D. Creelman. 1991. Detection
theory: A user’s guide (1st ed.). Lawrence Erlbaum
Associates Inc., New Jersey.
29. D.C. Marshall, J.D. Lee, and R.A. Austria. 2007. Alerts
for in-vehicle information systems: annoyance, urgency,
and appropriateness. Human factors 49, 1 (2007), 145–57.
DOI:http://dx.doi.org/10.1518/001872007779598145
30. D. Scott McCrickard and C. M. Chewar. 2003. Attuning
Notification Design to User Goals and Attention Costs.
Commun. ACM 46, 3 (March 2003), 67–72. DOI:
http://dx.doi.org/10.1145/636772.636800
31. D. McKeown and S. Isherwood. 2007. Mapping
Candidate Within-Vehicle Auditory Displays to Their
Referents. Human Factors 49, 3 (2007), 417–428. DOI:
http://dx.doi.org/10.1518/001872007X200067
32. M. A. Mollenhauer, J. Lee, K. Cho, M. C. Hulse, and
T. A. Dingus. 1994. The Effects of Sensory Modality and
Information Priority on In-Vehicle Signing and
Information Systems. In Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, Vol. 38.
1072–1076. DOI:
http://dx.doi.org/10.1177/154193129403801617
33.
M.A. Nees and B.N. Walker. 2011. Auditory displays for
in-vehicle technologies. Reviews of human factors and
ergonomics 7, 1 (2011), 58–99. DOI:
http://dx.doi.org/10.1177/1557234X11410396
34. C.L. Paul, A. Komlodi, and W. Lutters. 2015. Interruptive
notifications in support of task management.
International Journal of Human-Computer Studies 79
(2015), 20–34. DOI:
http://dx.doi.org/10.1016/j.ijhcs.2015.02.001
35. T. W. Picton. 2010. Human auditory evoked potentials.
Plural Publishing.
36. Ioannis Politis, Stephen Brewster, and Frank Pollick.
2015. Language-based Multimodal Displays for the
Handover of Control in Autonomous Cars. In
Proceedings of the 7th International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’15). ACM, New York, NY,
USA, 3–10. DOI:
http://dx.doi.org/10.1145/2799250.2799262
37. Z. Pousman and J. Stasko. 2006. A Taxonomy of
Ambient Information Systems: Four Patterns of Design.
In Proceedings of the Working Conference on Advanced
Visual Interfaces (AVI ’06). ACM, New York, NY, USA,
67–74. DOI:http://dx.doi.org/10.1145/1133265.1133277
38. P. Praamstra, D.F. Stegeman, M.W.I.M. Horstink, and
A.R. Cools. 1996. Dipole source analysis suggests
selective modulation of the supplementary motor area
contribution to the readiness potential.
Electroencephalography and clinical neurophysiology 98,
6 (1996), 468–477. DOI:
http://dx.doi.org/10.1016/0013-4694(96)95643-6
39. T. Ross, K. Midtland, M. Fuchs, A. Pauzié, A. Engert, B.
Duncan, G. Vaughan, M. Vernet, H. Peters, G. Burnett,
and A. May. 1996. HARDIE Design Guidelines
Handbook: Human Factors Guidelines for Information
Presentation by ATT Systems. Technical Report V2008
HARDIE, Deliverable No. 20. European Commission
Host Organization, CORDIS, Luxembourg. 1–562
pages.
40. M. Scheer, H. H. Bülthoff, and L. L Chuang. 2016.
Steering Demands Diminish the Early-P3, Late-P3 and
RON Components of the Event-Related Potential of
Task-Irrelevant Environmental Sounds. Frontiers in
human neuroscience 10, March (2016), 73. DOI:
http://dx.doi.org/10.3389/fnhum.2016.00073
41. D. L. Schomer and F. L. Da Silva. 2012. Niedermeyer’s
electroencephalography: basic principles, clinical
applications, and related fields. Lippincott Williams &
Wilkins.
42. M. Schultze-Kraft, D. Birman, M. Rusconi, C. Allefeld,
K. Görgen, S. Dähne, B. Blankertz, and J-D. Haynes.
2016. The point of no return in vetoing self-initiated
movements. Proceedings of the National Academy of
Sciences 113, 4 (2016), 1080–1085. DOI:
http://dx.doi.org/10.1073/pnas.1513569112
43. H. Shibasaki and M. Hallett. 2006. What is the
Bereitschaftspotential? Clinical Neurophysiology 117, 11
(2006), 2341–2356. DOI:
http://dx.doi.org/10.1016/j.clinph.2006.04.025
44. A. Stevens, A. Quimby, A. Board, T. Kersloot, and P.
Burns. 2002. Design guidelines for safety of in-vehicle
information systems. Technical Report TRL-PA-3721/01.
Department of Transport, Local Government and the
Regions. 1–55 pages.
http://www.transport-research.info/Upload/Documents/200607/20060728
45. O. Tsimhoni and M. Flannagan. 2006. Pedestrian
detection with night vision systems enhanced by
automatic warnings. In Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, Vol. 50.
Sage Publications, 2443–2447. DOI:
http://dx.doi.org/10.1177/154193120605002220
46. C. D. Wickens. 2002. Multiple resources and
performance prediction. Theoretical Issues in
Ergonomics Science 3, 2 (2002), 159–177. DOI:
http://dx.doi.org/10.1080/14639220210123806
47. C. D. Wickens. 2008. Applied attention theory.
Ergonomics 1, 2 (2008), 222. DOI:
http://dx.doi.org/10.1080/00140130802295564
48. T.O. Zander, L.M. Andreessen, A. Berg, M. Bleuel, J.
Pawlitzki, L. Zawallich, L.R. Krol, and K. Gramann.
2017. Evaluation of a Dry EEG System for Application
of Passive Brain-Computer Interfaces in Autonomous
Driving. Frontiers in Human Neuroscience 11 (2017).
DOI:http://dx.doi.org/10.3389/fnhum.2017.00078
49. S. Zhang, R. Benenson, M. Omran, J. Hosang, and B.
Schiele. 2016. How Far are We from Solving Pedestrian
Detection?. In 2016 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR). 1259–1267.
DOI:http://dx.doi.org/10.1109/CVPR.2016.141