Using EEG to Understand why Behavior to Auditory
In-vehicle Notiﬁcations Differs Across Test Environments
Lewis L. Chuang
Perception, Cognition, Action,
Max Planck Institute for
Perception, Cognition, Action,
Max Planck Institute for
Styling and Vehicle
Ergonomics, Scania CV AB
In this study, we employ EEG methods to clarify why auditory
notiﬁcations, which were designed for task management in
highly automated trucks, resulted in different performance
behavior, when deployed in two different test settings: (a)
student volunteers in a lab environment, (b) professional truck
drivers in a realistic vehicle simulator. Behavioral data showed
that professional drivers were slower and less sensitive in identifying
notifications compared to their counterparts. Such
differences can be difficult to interpret and frustrate the deployment
of implementations from the laboratory to more
realistic settings. Our EEG recordings of brain activity reveal
that these differences were not due to differences in the detection
and recognition of the notifications. Instead, they were
due to differences in EEG activity associated with response
generation. Thus, we show how measuring brain activity can
deliver insights into how notiﬁcations are processed, at a ﬁner
granularity than can be afforded by behavior alone.
Human computer interaction
(HCI); Human-centered computing∼User studies
Driving simulator; auditory notiﬁcations;
electroencephalography; event-related potential; MMN; P3;
Notiﬁcations are a ﬁxture of in-vehicle environments. They
are designed to direct users, who would be engaged otherwise,
to aspects of the environment that require a response (e.g., fuel
indicator lights). Recent advances in automated driving will
increase the importance of notiﬁcations, especially when the
duties of the human operator transition from vehicle control
to vehicle supervision [1, 7]. Indeed, research on the design
AutomotiveUI ’17, September 24–27, 2017, Oldenburg, Germany
© 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ISBN 978-1-4503-5150-8/17/09. . . $15.00
of in-vehicle interactions has rapidly shifted, in recent years,
towards addressing the anticipated user requirements of automated
vehicles. Given the rapid pace of innovation
in technology and design, it is surprising that we continue to
have few tools at our disposal that allow us to truly appreci-
ate how users process and act upon in-vehicle notiﬁcations.
Here, we combine the analysis of behavioral responses with
electroencephalography (EEG) recordings to better understand
how auditory notiﬁcations were processed for information and
responded to across different test environments and participant groups.
The design space of notiﬁcations is large. This gives rise to
inﬁnite variations of how in-vehicle notiﬁcations ought to be
designed and for which purpose. While guidelines have been
proposed for the design of in-vehicle displays (e.g., [16, 39,
44]), they tend to be based on studies with a focus on safety-critical
behavior. Ultimately, notifications are implemented
according to whether or not they will be effective in safely
eliciting the desired responses, in the environment for which
they were designed. Unfortunately, evaluating notifications
by performance measures alone can only reveal whether a
given implementation is better or worse than its control com-
parison. In order to understand why a given notiﬁcation results
in better or worse performance than originally anticipated, it
is necessary to evaluate the extent to which the notiﬁcation
is perceived, processed, and elicits a response. For this, it is
necessary to inspect how the brain responds to notiﬁcations.
Methods for neuroimaging are becoming more accessible. Re-
cent developments in neuroimaging technology, especially
with regard to EEG, have focused on the ease of application
and user mobility. In spite of this, valid concerns persist
with regard to their suitability for use in realistic test environments,
especially since the presence of electronic devices
can introduce substantial noise into EEG recordings. On a
more practical note, EEG measurements are often expected to
impose implementational costs on the researcher that might
be deemed unnecessary, especially when behavioral and self-
report measures are expected to sufﬁce.
Nonetheless, the time-varying EEG signal offers a detailed
inspection of how information is processed by the human user,
which cannot be achieved with performance measurements
alone. With regard to the evaluation of notifications, EEG
measurements can reveal how the brain automatically detects
and consciously identiﬁes notiﬁcations. They can also indi-
cate how the brain prepares itself to generate an appropriate
response, pending the identiﬁcation of the notiﬁcation. Per-
formance measurements implicitly treat the human operator
as a single stimulus-response unit and do not, in themselves,
distinguish between perception, cognition, and action.
Our research aim was to demonstrate that employing EEG
methods can allow us to account for why behavioral responses
to auditory notifications might differ across different instances
of testing, specifically between professional drivers tested in
a high fidelity driving simulator and students tested in a psychophysical
laboratory. Such discrepancies are a common experience
for many researchers when evaluating novel designs of notification
interfaces for deployment in the "real world". Often,
interfaces are ﬁrst prototyped and evaluated under highly con-
trolled conditions before they are deployed in more realistic
environments and tested with their intended users. When the
behavioral results of a highly controlled test do not generalize
to a more realistic one, it is often difﬁcult to establish the
reasons that might have caused this.
Here, we show that EEG measurements can complement be-
havioral results to provide a better resolution for understand-
ing how notiﬁcations are processed by users across different
settings. Unlike behavioral performances, the appropriate
application of EEG measurements allows the researcher to dis-
criminate how notiﬁcations are processed by the brain across at
least three different stages of information processing, namely
perceptual, cognitive, and response stages [46, 47]. In this re-
gard, it offers researchers the ability to investigate the outcome
of information processing at the various stages that lie in be-
tween the presentation of a stimulus (i.e., auditory notiﬁcation)
and a behavioral response (i.e., keypress).
We report two experiments that presented participants with
identical tasks but under two different test environments. All
of our participants were required to respond to auditory notiﬁ-
cations, which were previously designed to direct the attention
of commercial truck drivers to the occurrence of task requirements
during a long distance, automated vehicle mission.
They were also presented with irrelevant distractor sounds, which
they had to ignore, and a dynamic visual scene, which varied
in its realism according to the test environment. The first experiment
was performed as a highly-controlled psychophysical
experiment (N=15), with low mission fidelity, and on young
and untrained student participants. The second experiment
was performed on professional truck drivers in a high-ﬁdelity
driving simulator (N=14). Our ﬁndings are as follows:
• professional drivers in a high fidelity simulator were slower
and less sensitive in discriminating target notifications from
distractor sounds
• the EEG activity for notification detection and identification
did not discriminate between the two test environments
• the EEG activity for correct responses to the notifications
(i.e., Bereitschaftspotential; BP) discriminated between the
two test environments
• thus, we attributed observed behavioral differences to differences
in the sample demographic and not to differences
in the test environment.
In-vehicle notiﬁcations are often designed to shift user atten-
tion from the primary task of engagement (e.g. driving) to
a critical event (e.g. low fuel). With advances in vehicle
sensing, notiﬁcations can also be designed to direct a user’s
attention to safety-critical aspects of the driving task (e.g.
pedestrian detection [45, 49]) or to support decision making
(e.g. lane-changing ). With the increased adoption of
automated vehicles, we expect the role of notiﬁcations to grow
in prominence. Besides takeover notiﬁcations that prompt
users to resume vehicle handling (e.g. [5, 36]), we also ex-
pect task-management notiﬁcations to pervade the in-vehicle
environment as the scope of permissible in-vehicle activities
grows. In particular, commercial vehicles (i.e., trucks)
will stand to benefit from the effective introduction of task-management
notifications. This is because commercial drivers
might be expected to assume additional responsibilities such as
delivery logistics, as their responsibility for vehicle handling diminishes
with increasing automation.
Auditory displays tend to be a favored delivery medium for
notiﬁcations, given that they are not in direct conﬂict with
the visuomotor demands of vehicle handling . Auditory
information that is presented during driving has been claimed
to be more deeply processed than visual information, given
that it is more likely to be recalled post testing . However,
it can also be perceived as being more distracting. In any case,
auditory notiﬁcations inhabit a large design space with ma-
nipulable parameters, which include their formant, duration,
interpulse interval, onset/offset latency and more. This allows
them to be ﬂexibly tuned in order to communicate informa-
tion, such as urgency , even whilst being moderated for
undesirable side-effects, such as perceived annoyance .
Some sounds are more effective than others, particularly when
operational concerns are taken into consideration. For exam-
ple, a driving simulator study compared different auditory
warnings for signalling potential headway collision and found
that the sound of a car horn and a tone with a looming time-
to-contact intensity resulted in the best braking latencies .
Nonetheless, the sound of a car horn also resulted in more
unnecessary braking responses than the looming tone. A sep-
arate study compared four classes of auditory notiﬁcations—
namely, abstract sounds, auditory icons, nonspeciﬁc environ-
ment sounds, and speech messages—for their efﬁcacy in cuing
for driving situations such as low tire pressure, low oil level,
engaged handbrake whilst driving, and others . This study
found that speech messages and auditory icons generated faster
and more accurate responses than either environment sounds
or abstract sounds.
In these mentioned examples, auditory displays were often
evaluated on the basis of behavioral responses—namely, accu-
racy and response latencies. Although brain responses offer
a ﬁner granularity of information processing, they are rarely
employed in the evaluation of auditory notiﬁcations.
Event-related potentials (ERPs)
EEG refers to the measured electrical activity of surface elec-
trodes placed on the human scalp (typically in the range of
), which can be attributed (in part) to brain activity
. Event-related potentials (ERPs) are changes within a
pre-speciﬁed time window of EEG activity that are generated
after or prior to a known event. In this work, we focus on
two physical events: (i) the presentation of a target notiﬁca-
tion, (ii) the participants’ self-generated response to target
notifications. Respectively, this allows us to understand, first, how
our participants' brains detect and recognize the
presented notification and, next, how they decide to generate a behavioral
response, relative to their self-generated responses. This
corresponds to three stages of information processing that lie
between presenting a notiﬁcation and generating a response.
Stimulus ERPs for auditory events
Auditory stimuli are frequently employed in ERP studies as
they elicit waveforms with identiﬁable components that are
associated with cognitive mechanisms. Of current interest
are the slow transient responses that arise from the auditory
and associated cortices (i.e., 50 msec after sound onset). A
popular paradigm (i.e., oddball paradigm) presents two dis-
criminable sounds, one more frequently than the other (e.g.,
80% to 20%). Participants are only required to respond to the
infrequent sounds, which are termed targets. This corresponds
to a real-world scenario where auditory notiﬁcations have to
be detected and identiﬁed against an auditory background of
distractors. Subtracting the EEG activity generated by the fre-
quent distractor sounds from that generated by targets results
in a difference waveform with two interpretable components.
First, the mismatch negativity (MMN), which is an early nega-
tive deﬂection with a typical latency of about 140 msec. The
MMN is associated with an automatic process that responds
in proportion to the perceived deviance of the targets from the
distractors. It is generated even when there is no task involved.
Second, the P3, which is a late positive deflection with a latency
between 450 and 600 msec. The P3 is only generated if
the subject is attending to the stimuli in a way that demands
a response (but, see ). Working memory processes that
underlie context updating are claimed to be represented by the
P3 component . Indeed, fMRI studies have localized brain
regions, which are typically implicated with conscious effort
and working memory processes (i.e., insular cortex, inferior
parietal and frontal lobes), as generators of the P3 (e.g. ).
In the current study, we evaluated MMN and P3 responses
to target auditory notiﬁcations in order to determine whether
behavioral differences across different test environments re-
ﬂected changes in their automatic detection and/or voluntary
Response ERPs for motoric responses
Brain responses can also be measured prior to response actuation.
When EEG activity is temporally aligned to the
generated responses of measured individuals, it is possible to
observe a slow change in potential, leading up to the response
onset. This observation was first reported in 1964 and
was termed the Bereitschaftspotential (i.e., readiness potential;
BP) [20, 43].
The BP is maximal at the midline centro-parietal area (i.e.,
CPz). The adoption of a common average reference, such as
in the current work, means that it is observed as a positive
potential shift in the parietal and occipital electrodes and as a
negative potential shift in the frontal electrodes. In an interest-
ing series of experiments, recorded participants were asked to
report the time when they decided to generate volitional key-
press responses . Here, the BP was found to be initiated
approximately 350 msec prior to the participants' reported times,
which raised philosophical doubts on the nature of free will.
Putting such existentialist concerns aside, most researchers
agree that the initiation of BP has its physiological origins in
the supplementary motor area (SMA), a brain region that is
implicated in the generation of motor responses, at least in the
case of hand movements . Thus, BP could be regarded
as a cortical decision to initiate a motor action (prior to the
conscious realization of the decision itself!). More recently, it
has been claimed that this cortical decision can be volitionally
cancelled up till a point (i.e., 200 msec prior to response),
after which the generation of a motor response is inevitable.
For our current purposes, we treat BP as an indicator for the
cortical decision to respond, which is initiated earlier than the
recorded response itself. This allows us to determine the la-
tency between the cortical decision and the actuated response,
as well as the amplitude of the cortical decision itself which is
not evident in a binary key- or button-press response.
This study was a between groups design that compared behav-
ioral performance and brain responses to auditory notiﬁcations
across two different experimental settings: (i) a highly con-
trolled psychophysical laboratory, (ii) a high ﬁdelity driving
simulator environment. The whole experiment took approxi-
mately 2.5 hours to complete, including preparation time and
Thirty participants performed the task reported here: fifteen undergraduate students
(mean age(std)=26.1(4.0) years; 9 males) from the University
of Tübingen, Germany, and fifteen professional commercial
drivers (mean age(std)=41.4(12.1) years; 13 males) who
were employees of Scania CV AB, Sweden.
The data of one professional driver, from
the driving simulator testing, had to be excluded from further
analysis because only one response was recorded throughout
the entire experiment. Besides demographic differences, the
primary difference between these two groups was the test en-
vironment that they experienced (see 3.2). All participants
reported no known hearing defects and provided signed informed consent.
Apparatus and Stimuli
The psychophysics laboratory had black walls, was insulated
for external sounds, and had an ambient sound level of approxi-
mately 40 dB (Figure 1, left). Visual stimuli were presented on
a desktop display (
ﬁeld-of-view; 45 cm distance to ﬁxed
Figure 1. Left: A student participant in a psychophysical laboratory (Department for Human Perception, Cognition, and Action, MPI for Biological
Cybernetics, Tübingen, Germany). Right: A professional commercial driver in a truck driving simulator (Styling and Vehicle Ergonomics, Scania CV
AB, Södertälje, Sweden)
chin-rest). The visualization was rendered by customized soft-
ware written in Matlab R2013b (The Mathworks, Natick, MA).
The visualization consisted of a cursor that drifted horizon-
tally between two vertical lines, which represented a vehicle’s
position in a single lane. Participant responses were collected
via keypresses on a standard USB keyboard. Sound presen-
tation was controlled by an ASIO 2.0 compatible sound card
(SoundBlaster ZxR; Creative Labs) and displayed via stereo
speakers, each placed on either side of the display.
High ﬁdelity driving simulator
The driving simulator was designed to simulate the operation
of commercial truck vehicles. Participants sat in a realistic
cabin interior, based on an existing truck seating buck that con-
sisted of a pneumatic seat, a steering column complete with
wheel and shaft, instrument cluster, and the remaining dash-
board (Figure 1, right). Visualization was presented on a front-
projection three wall display (approx.
cm distance to head), and via two pairs of vertically-aligned
displays, attached to either side of the cabin, that simulated
side rear-view mirrors. The visualization was rendered
by a customized graphical engine (i.e., VISIR) that created
3D environments from OpenDrive road network files (xodr)
and from an additional landscape description file (xml). Here, we
presented a highway scene from Linköping and Norrköping,
with two lanes for congruent trafﬁc and two lanes for oppos-
ing trafﬁc. The participant inhabited the far-right lane. The
highway was populated with low trafﬁc density, including the
occasional headway vehicle. Experimental responses were
collected via dedicated buttons that were located on the steer-
ing wheel. Sound presentation was controlled by an ASIO
2.0 compatible sound card (RME HDSP 9632; RME Intelli-
gent Audio Solutions) and displayed via a 5.1 surround sound
system, installed around the driver’s seat.
Twelve target and ninety distractor sounds were used in this
experiment. These were modiﬁed from sounds that were de-
signed as part of a project (MODAS: Methods for Designing
Future Autonomous Systems) to cue truck drivers to attend to
possible non-driving tasks [13, 21]. All sounds had a duration
of 500 msec.
There were two notiﬁcations for each of six non-driving tasks.
They were a verbal command in Swedish and an auditory icon
(in brackets): (i) system (synthetic tone), (ii) convoy (train
whistle), (iii) driver (human whistle), (iv) weather (raindrop),
(v) road (rumbling), (vi) trafﬁc (car horn). The distractor
sounds were random permutations of four sounds, two verbal
commands and two auditory icons, played simultaneously in
EEG was recorded on a dedicated PC using a 59-channel active
electrode array that was afﬁxed to the scalp using an elastic
whole head cap, which specified pre-determined sites including
those corresponding to the international 10-20 system
(ActiCap System, Brain Products GmbH, Munich, Germany).
The horizontal and vertical electrooculogram (HEOG/VEOG)
were recorded with four electrodes attached to the right and
left canthi as well as above and below the left eye. FCz was
used as an online reference for all channels. Prior to testing,
electrode gel was applied to ensure that electrode impedance
for each channel. EEG signals were digitized
at a rate of 1000 Hz. EEG recordings were synchronized
with experimental events via a parallel port connection to the
All participants performed the same task, regardless of the
experiment setting. They were required to attend to the visual
scene throughout the experiment. Participants were informed
that this was a simulated automated driving scenario and that
no steering was necessary. Whenever they heard a sound, they
were required to respond if it was a target notiﬁcation and
to ignore it if it was a distractor sound. The inter-stimulus
Figure 2. Six examples of clusters of dipoles (blue) and their mean position (red), their projected scalp activity, and power spectral density (inset: left to
right), derived from EEG recordings in the driving simulator. First row: Cortical dipoles that are likely to be associated with auditory processing (left)
and motor response generation (right) respectively. Second row: Non-cortical dipoles that are associated with muscle activity (left) and eye-movements
and -blinks (right). Third row: Non-cortical dipoles that are due to electrical line noise (left) and unresolved variance in EEG recording (right).
interval was randomly selected from a uniform distribution
across 1800-2000 msec. Participants could respond within
2000 msec of the onset of the target notiﬁcation. Failures to
do so were considered misses. Responses to distractor sounds
were treated as false alarms. Each experiment presented ap-
proximately 980 sounds in total. Of these,
Results: Behavioral performance
Our participants' performance was assessed in terms of the
median of their correct response times (RT) and discriminability
index (i.e., d'). The discriminability index is calculated
as the difference between the z-scores of hit and false-alarm
rates, whereby hits and false-alarms were responses to target
and distractor sounds respectively. Welch's t-tests for
independent samples were performed to compare behavioral
performance in the high fidelity driving simulator and the psychophysical
laboratory. The adopted criterion for statistical
significance was α=0.05.
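As an illustration of the discriminability index described above, the following is a minimal Python sketch rather than the analysis code used in the study. The function name `d_prime`, the example counts, and the log-linear correction (adding 0.5 to each count so that rates of exactly 0 or 1 do not produce infinite z-scores) are assumptions made for this example; the text does not state how extreme rates were handled.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Discriminability index: z(hit rate) - z(false-alarm rate)."""
    # Assumed log-linear correction (not stated in the text): keeps both
    # rates strictly between 0 and 1 so the inverse normal CDF is finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for a single participant (not data from the study).
print(round(d_prime(hits=95, misses=5, false_alarms=10, correct_rejections=790), 2))
```

A participant who responds to targets and distractors at the same rate obtains a value of zero, and the index grows as hits increase or false alarms decrease.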
Participants in the high ﬁdelity driving simulator were slower
(mean=1262 msec; std.=120 msec) in their correct responses
than those in the psychophysics laboratory (mean=1062 msec;
std.=139 msec). This difference (mean=200 msec) is statistically
significant (p<0.001, Cohen's d=1.54) and
has a 95% confidence interval from 102 to 299 msec.
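The reported effect size can be cross-checked from the summary statistics alone. The sketch below is an illustration, not the authors' analysis script: the helper name `welch_from_summary` is hypothetical, the group sizes (14 drivers, 15 students) are taken from the sample description, and Cohen's d is assumed to use the pooled standard deviation.

```python
from math import sqrt


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t, Welch-Satterthwaite df, and Cohen's d from group summaries."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    diff = mean1 - mean2
    t = diff / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Assumption: Cohen's d is computed with the pooled standard deviation.
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return t, df, diff / pooled_sd


# Response times in msec: driving simulator (n=14) vs. laboratory (n=15).
t, df, d = welch_from_summary(1262, 120, 14, 1062, 139, 15)
print(f"t={t:.2f}, df={df:.1f}, d={d:.2f}")
```

Under these assumptions the effect size comes out close to the reported value of 1.54. Small deviations are expected because the inputs are rounded summary values.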
Participants in the high ﬁdelity driving simulator were less
sensitive (mean=4.37; std.=1.03) in discriminating the au-
ditory target notiﬁcations from their distractor counterparts,
than those in the psychophysics laboratory (mean=5.25;
std.=0.60). This difference (mean=0.88) is statistically significant
(p<0.05, Cohen's d=1.06) and has a 95%
confidence interval from 0.23 to 1.54.
To summarize, behavioral results indicated slower responses and lower discrimination
sensitivity for auditory notifications in the driving
simulator, compared to the psychophysics laboratory.
Results: EEG/ERP responses
Data collection, signal processing, and statistical analysis
Data pre-processing and analysis was performed ofﬂine with
Matlab (The Mathworks, Natick, MA) scripts based on
, an open source environment for processing
electrophysiological data . The following steps were per-
formed on EEG data prior to analyzing the ERPs of stimuli
and responses . First, the data was downsampled to
Hz to reduce computational costs. Next, a high-pass ﬁlter
Hz) was applied to remove slow drifts,
electrical line noise from the environment was removed using
the CleanLine algorithm, and bad channels were removed us-
ing the ASR algorithm. Next, all electrodes were re-referenced
to their common average, and each participant’s dataset was
separately submitted to an Adaptive Mixture ICA to decom-
pose the continuous data into source-resolved activity .
On these learned independent components (IC), equivalent
Figure 3. LEFT: Stimulus ERPs are illustrated by three labelled difference waveforms that depict averaged EEG activity of electrode groups
from anterior, central, and posterior regions. RIGHT: MUA results plot statistically significant t-values between the two participant groups for every
electrode and time-point. The analysis reveals that there are no statistically significant electrode-time regions that follow the auditory notification.
current dipole model estimation was performed by using an
MNI Boundary Element Method head model to ﬁt an equiva-
lent dipole to the scalp projection pattern of each independent
component. ICs whose dipoles were located outside the brain
were excluded as well as those that had a residual variance of
over 15%. Within each participant group, ICs were clustered
into 30 clusters using k-means based on their mean power
spectra, topography, and equivalent dipole location.
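The common average re-referencing step mentioned in the pipeline above can be sketched as follows. This is a plain-Python illustration on nested lists, not the Matlab toolbox routine used in the study; the function name and the toy data are hypothetical.

```python
def common_average_reference(data):
    """Re-reference EEG to the common average.

    data: list of channels, each a list of samples of equal length.
    Returns a new list in which, at every sample, the mean across all
    channels has been subtracted from each channel.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    means = [sum(ch[t] for ch in data) / n_channels for t in range(n_samples)]
    return [[ch[t] - means[t] for t in range(n_samples)] for ch in data]


# Toy recording with three channels and two samples (hypothetical values).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
print(common_average_reference(toy))
```

After this step the channels sum to zero at every sample, which is why a potential such as the BP appears with opposite polarity over frontal and posterior electrodes.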
Figure 2 provides examples of dipole clusters with either cor-
tical (ﬁrst row) or non-cortical origins (e.g., muscle and eye
activity (second row), electrical activity from environment
sources (third row)). Non-cortical components were identiﬁed
on the basis of their power spectral density, scalp topology,
and location in a volumetric brain model . As might be
expected, there were more non-cortical dipole components
found in participants who performed the experiment in the
driving simulator (N=14) than those from the psychophysical
experiment (N=15). In other words, EEG recordings were contaminated
by the activity of more non-cortical components in
the driving simulator environment than in the psychophysical
laboratory. Non-cortical dipole clusters were removed from
the EEG recording and the remaining EEG activity was sub-
jected to comparative analysis for the two participant groups.
Speciﬁcally, we derived a stimulus and a response ERP for
each participant. This was achieved by mean-averaging the
EEG activity of a time-window (also termed an epoch; 1000
msec after/before the relevant trigger event, baselined to 500
msec before/after the stimulus/response), across all epochs.
Stimulus ERPs were defined by the differences in EEG responses
to target notifications and distractor sounds. Their prominent
components are the MMN and P3, which are respectively associated
with the information processing aspects of notification
detection and identification. Response ERPs were defined
by EEG activity prior to the registration of our participants' responses.
They are characterized by a single component, which manifests
itself as a single peak that changes from negative to positive
polarity, from the anterior to posterior electrodes. The left
panels of Figures 3 and 4 illustrate the averaged activity of
anterior (Fz, F1, F3, F5, F2, F4, F6, FC1, FC3, FC5, FC2, FC4,
FC6), central (Cz, C1, C3, C5, C2, C4, C6), and posterior (Pz,
POz, Oz, P1, P3, P5, P2, P4, P6, PO3, PO7, PO4, PO8, O1,
O2) electrodes for stimulus and response ERPs respectively.
The profile of the derived waveforms was consistent with our
expectations for both test environments.
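The derivation of a stimulus ERP described above (epoching, baseline correction, mean-averaging, and subtracting the distractor response from the target response) can be sketched for a single channel as follows. This is an illustrative Python sketch with hypothetical function names and toy numbers, not the study's processing code; epoch lengths are given in samples rather than milliseconds.

```python
def average_erp(signal, events, pre, post):
    """Mean of baseline-corrected epochs around each event.

    signal: single-channel samples; events: sample indices of triggers;
    pre/post: number of samples kept before/after each trigger.
    The pre-trigger samples serve as the baseline.
    """
    epochs = []
    for ev in events:
        if ev - pre < 0 or ev + post > len(signal):
            continue  # skip triggers too close to the recording edges
        epoch = signal[ev - pre:ev + post]
        baseline = sum(epoch[:pre]) / pre
        epochs.append([s - baseline for s in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    return [sum(ep[i] for ep in epochs) / len(epochs) for i in range(pre + post)]


def difference_wave(target_erp, distractor_erp):
    """Target minus distractor ERP, sample by sample."""
    return [t - d for t, d in zip(target_erp, distractor_erp)]


# Toy single-channel signal with two triggers (hypothetical values).
signal = [0, 0, 1, 3, 0, 0, 1, 5, 0, 0]
target_erp = average_erp(signal, events=[2, 6], pre=2, post=2)
print(difference_wave(target_erp, [0, 0, 1, 1]))
```

For a response ERP, the same averaging would be applied to windows that precede the response trigger, with the baseline taken from the period stated in the text.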
To evaluate our EEG recordings for differences across the two
test settings, we performed separate mass-univariate analyses
(MUA) for the stimuli and response ERPs . This allowed
us to determine the time points of individual electrodes that
were statistically signiﬁcant for waveform differences between
the EEG activity recorded across the two test environments.
False discovery rate control was applied (i.e., FDR-BH) .
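For readers unfamiliar with the Benjamini-Hochberg procedure, the sketch below shows the step-up rule applied to a list of p-values, such as one p-value per electrode and time point. It is a generic textbook implementation in Python, not the toolbox routine used for the analysis; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests survive FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k whose p-value is <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values from ten electrode-time comparisons.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p))
```

Because the rule is step-up, a p-value that fails its own rank threshold is still accepted when a larger p-value further down the ranking passes.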
We illustrate MUA results as raster plots of electrode chan-
nels across time (see Figures 3 and 4, right), whereby statis-
tically signiﬁcant t-values are represented by color intensity.
The raster plots consist of three panels whereby the top and
bottom panels indicate right- and left-hemispheric electrodes
respectively, and the middle panel indicate mid-line electrodes.
Within each panel, electrodes are vertically ordered from ante-
rior to central to posterior electrodes.
Figure 4. LEFT: Response ERPs are illustrated by three waveforms that depict averaged EEG activity of electrode groups from anterior, central, and
posterior regions. The BP peaks are indicated by arrows. RIGHT: MUA results reveal statistically significant differences between driving simulator and
laboratory recordings in two time periods (i.e., 600-430 and 220-0 msec before responses).
The MUA results (Figure 3, right) reveal no statistically sig-
niﬁcant differences between the difference waveforms (Figure
3, left) of participants from the laboratory (–) and driving
simulator (–) test settings. This suggests that the notiﬁcations
elicited equivalent brain responses for detection and identification
in both groups of participants, regardless of their test environment.
The MUA results (Figure 4, right) reveal statistically significant
differences, particularly in the frontal (e.g., Fpz, Fz) and
posterior (e.g., Pz, POz, Oz) electrodes. To understand these
differences, the reader should recall from the behavioral results
that truck drivers in the driving simulator generated slower
responses than students in the psychophysical laboratory, of
about 200 msec. We note that the time periods of signiﬁcantly
different EEG activity are of similar duration (i.e., 600-430
and 220-0 msec before the response event). This means that
the time that it took for truck drivers in the driving simulator
to generate a behavioral response after the initiation of corre-
sponding brain activity (i.e., BP) was approximately 220 to
230 msec longer than it took for the undergraduate students
in the psychophysical laboratory. In addition, the peak amplitudes
of the BP in the frontal and posterior electrodes were
smaller in the truck drivers than in the undergraduate students
(Figure 4, left).
What inferences can we draw when auditory notiﬁcations,
which were designed in the conﬁnes of a well-controlled labo-
ratory, elicit different behavioral responses in more realistic
settings? The current study demonstrates that EEG measurements
can provide some clarity when the explanatory power
of behavioral responses is limited.
In this work, we found statistically signiﬁcant behavioral dif-
ferences across two test settings. Professional drivers were
slower and less sensitive in detecting target notiﬁcations in
a high ﬁdelity driving simulator compared to student partic-
ipants tested in a psychophysical laboratory. To begin, this
is surprising for at least two reasons. First, the verbal com-
mands were in the native language of the professional drivers.
Second, the professional drivers understood what these notifications
represented in the context of their jobs. Thus, we might
have expected professional drivers to respond
faster and more accurately. Although the professional drivers
were slower than our student volunteers by approximately 200
msec, they continued to respond in an acceptable time range
(i.e., less than 2 secs, the recommended time headway for
preventing rear-end collisions). From this, the current auditory
notiﬁcations might be considered to be suitable for fulﬁlling
their intended function of task management.
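Sensitivity in detection tasks is commonly summarized as d' from signal detection theory (see Macmillan and Creelman in the reference list). The following is a minimal sketch under that assumption. The log-linear correction, the function name `d_prime`, and the response counts are illustrative choices and are not taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Estimate d' = z(hit rate) - z(false alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) keeps
    the rates away from exactly 0 or 1, where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for two groups (NOT the data reported in this paper).
group_a = d_prime(hits=90, misses=10, false_alarms=10, correct_rejections=90)
group_b = d_prime(hits=70, misses=30, false_alarms=30, correct_rejections=70)
print(f"group A d' = {group_a:.2f}, group B d' = {group_b:.2f}")
```

A larger d' indicates that targets are better separated from non-targets, independently of how liberal or conservative the responder is.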
Based on behavioral data alone, the worse performance of the
professional drivers could be attributed to several reasons. For
example, the driving simulator could have provided a more
immersive environment that reduced the perceptual saliency of
the notiﬁcations. Alternatively, it could have been due to age
or motivational differences between the professional drivers
and student participants. Last, but not least, performance
differences could have resulted from technical differences in
the auditory displays or input devices. This host of plausible
interpretations often plagues comparative user studies that rely
solely on behavioral measurements.
With EEG measurements, we were able to infer that the behavioral
differences that we observed were due to differences between the
professional drivers and the student participants, specifically
in how their brains prepared prior to responding. Our reasoning
is as follows. First, notifications
were unlikely to have been detected or identiﬁed differently,
given that the brain responses associated with these processing
stages (i.e., MMN and P3 respectively) were similar across
the two participant groups. In other words, the auditory notiﬁ-
cations were robustly perceived across the two different test
settings. Next, differences were found in the response ERPs.
More specifically, the response ERPs for professional drivers
showed a longer latency between the BP peak and the recorded
response than those for student participants. In other words,
more time elapsed for the professional drivers between the
cortical decision to respond and the response itself. This
difference in the latencies from BP initiation to the motor
response was about 220 msec, and gave rise to the statistical
differences that the MUA revealed. This difference in EEG activity
between the two groups converges with the difference that we
found with behavioral response times (i.e., 200 msec). Finally,
the ERPs for response generation in professional drivers had
smaller peak amplitudes than in student participants. This
suggests that reduced cortical activity observed in the profes-
sional drivers prior to responding could have resulted in later
responses. This possibility rules out alternative reasons for
slow responding, such as the sub-optimal physical ergonomics
of the truck cabin or the physical layout of the input device.
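The convergence argument in this paragraph rests on simple arithmetic: the durations of the significantly different EEG windows are compared against the behavioral slowing. A minimal sketch of that comparison follows, assuming a per-time-point significance mask on a hypothetical 10 msec grid. Only the window boundaries (600-430 and 220-0 msec before the response) and the 200 msec behavioral difference come from the text. The helper name `significant_windows` is invented for illustration.

```python
def significant_windows(times_ms, significant):
    """Return (start, end) pairs for each contiguous run of True values."""
    windows = []
    start = None
    previous = None
    for t, is_sig in zip(times_ms, significant):
        if is_sig and start is None:
            start = t  # a new run begins
        elif not is_sig and start is not None:
            windows.append((start, previous))  # the run ended one sample ago
            start = None
        previous = t
    if start is not None:
        windows.append((start, previous))  # run extends to the last sample
    return windows


# Hypothetical mask: time is in msec relative to the response at 0.
times = list(range(-800, 1, 10))
mask = [(-600 <= t <= -430) or (-220 <= t <= 0) for t in times]

windows = significant_windows(times, mask)
durations = [end - start for start, end in windows]
behavioral_difference_ms = 200  # approximate value reported in the text

for (start, end), duration in zip(windows, durations):
    gap = duration - behavioral_difference_ms
    print(f"window {start}..{end} msec lasts {duration} msec "
          f"({gap:+d} msec relative to the behavioral difference)")
```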
Taken together, the combination of EEG and behavioral measurements
shows that the current auditory notifications were
sufﬁciently salient and robust across different test environ-
ments. Although professional drivers exhibited slower and
less sensitive discrimination performance, it was not likely to
be due to the notiﬁcation design or the physical environment.
Instead, it was due to differences in the sample demographics.
Thus, subsequent effort in this scenario ought to be invested
in understanding and mitigating human factors limitations,
rather than in further reﬁnements in the design of notiﬁcations
or the physical interface.
The current work is restricted to the presentation of auditory
notiﬁcations. There are other channels of notiﬁcation delivery
that remain to be considered, such as tactile notifications (e.g.,
vibrations), which have been claimed to be more easily
discriminable and to interfere less with the task of driving
compared to auditory notifications.
To reiterate our main point, comparisons between different
types of notiﬁcations and across different deployment settings
can result in conflicting evidence, across independent studies
and even within the same study. For example, while one study
reported that auditory notifications elicited shorter response
times, a meta-analysis showed that tactile notifications elicited
faster responses, at least for low-urgency messages. The
same meta-analysis emphasized that moderating factors play
a critical role in determining the suitability of notification
delivery and design. For example, tactile notifications might be
more discriminable, but only if they have low complexity;
responses to auditory notifications are more accurate for
high-complexity information such as in the current study. This
variability of empirical evidence across the diverse design and
deployment space means that it is insufﬁcient to simply focus
on behavioral responses to user interfaces.
CONCLUSION AND OUTLOOK
The current paper contributes by demonstrating how EEG
methods could allow us, as researchers, to identify the stage of
information processing that results in differences in behavior.
Such an approach will allow us to target the limitations of our
designs for user interfaces more selectively and emphasize the
aspects that are more deserving of our attention.
While EEG measurement is not a panacea, the current work
shows that it can provide insight into how information is pro-
cessed, at a ﬁner granularity than behavioral responses alone.
Furthermore, it can help to deliver insight when the deploy-
ment settings of our designs change across test phases. Here,
it assured us that the designed notiﬁcations were sufﬁciently
robust to be processed by the brains of their users, regardless
of differences in the test environments.
This level of understanding, namely of how the information
communicated by notiﬁcation interfaces is processed by the
brain, will be increasingly important especially as we attempt
to increase the design space and functional diversity of
notifications. Nowadays, notifications are designed not only to
capture the user's attention at all costs, but also to be sensitive
to the user's goals and requirements. Ambient notifications
represent a particular class of notiﬁcations that will be difﬁcult
to evaluate if behavioral measurements are all that we have
to rely on. This is because ambient notiﬁcations are, by deﬁ-
nition, designed to inform the user without eliciting behavior
that would interrupt existing activity [26, 37]. With such noti-
ﬁcations, responses from the brain could be measured instead
of behavioral responses.
The current work relied on high-density EEG recording equipment.
While medical-grade equipment can feasibly be used in a real car,
and even for the actuation of emergency braking, doing so might
not be expedient for many researchers. A recent evaluation
suggests that simpler and more convenient EEG setups, for example
those that use dry electrodes, could be implemented in a vehicle
environment at a reasonable signal-to-noise ratio. Innovations in
electrode designs, such as an around-the-ear EEG placement, could
further allow for brain responses to be measured without imposing
on users the inconvenience of donning an unsightly EEG cap.
As we continue to innovate in-vehicle interfaces to keep up
with the demands of user expectations, it is only appropriate
that we also innovate our means for evaluating these interfaces
to keep up with the demands of inferential rigor. The current
work demonstrates the viability of one approach that should
be employed more often than it currently is.
This work was partially supported by the German Research
Foundation (DFG) within project C03
of SFB/Transregio 161. We would like to thank K-Marie
Lahmer and Rickard Leandertz for their assistance in data
collection, Johan Fagerlönn for sharing his original stimuli,
and BrainProducts GmbH (Munich, Germany) for loaning us
the necessary equipment for this study.
1. L. Bainbridge. 1983. Ironies of automation. Automatica
19, 6 (1983), 775–779. DOI:
2. Y. Benjamini and Y. Hochberg. 1995. Controlling the
false discovery rate: a practical and powerful approach to
multiple testing. Journal of the royal statistical society.
Series B (Methodological) (1995), 289–300.
3. N. Bigdely-Shamlo, T. Mullen, C. Kothe, K-M. Su, and
K. A. Robbins. 2015. The PREP pipeline: standardized
preprocessing for large-scale EEG analysis. Frontiers in
neuroinformatics 9 (2015), 16. DOI:
4. M. G Bleichner, B. Mirkovic, and S. Debener. 2016.
Identifying auditory attention with ear-EEG: cEEGrid
versus high-density cap-EEG comparison. Journal of
Neural Engineering 13, 6 (2016), 066004.
5. S. Borojeni, L. Chuang, W. Heuten, and S. Boll. 2016.
Assisting Drivers with Ambient Take-Over Requests in
Highly Automated Driving. In Proceedings of the 8th
International Conference on Automotive User Interfaces
and Interactive Vehicular Applications (AutomotiveUI
’16). ACM, New York, NY, USA, 237–244. DOI:
6. Y. Cao, F. van der Sluis, M. Theune, R. op den Akker,
and A. Nijholt. 2010. Evaluating Informative Auditory
and Tactile Cues for In-vehicle Information Systems. In
Proceedings of the 2nd International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’10). ACM, New York, NY,
USA, 102–109. DOI:
7. S.M. Casner, E.L. Hutchins, and D. Norman. 2016. The
Challenges of Partially Automated Driving. Commun.
ACM 59, 5 (April 2016), 70–77. DOI:
8. M. Delacre, D. Lakens, and C. Leys. in press. Why
Psychologists Should by Default Use Welch’s t-test
Instead of Student’s t-test. International Review of Social
Psychology (in press).
9. A. Delorme and S. Makeig. 2004. EEGLAB: an open
source toolbox for analysis of single-trial EEG dynamics
including independent component analysis. Journal of
neuroscience methods 134, 1 (2004), 9–21. DOI:
10. A. Delorme, J. Palmer, J. Onton, R. Oostenveld, and S.
Makeig. 2012. Independent EEG sources are dipolar.
PloS one 7, 2 (2012), e30135.
11. E. Donchin and M.G.H. Coles. 1988. Is the P300
component a manifestation of context updating.
Behavioral and brain sciences 11, 3 (1988), 357–427.
12. J. Edworthy, S. Loxley, and I. Dennis. 1991. Improving
auditory warning design: relationship between warning
sound parameters and perceived urgency. Human factors
33, 2 (1991), 205–231. DOI:
13. Johan Fagerlönn, Stefan Lindberg, and Anna Sirkka.
2015. Combined Auditory Warnings For Driving-Related
Information. In Proceedings of the Audio Mostly 2015 on
Interaction With Sound (AM ’15). ACM, New York, NY,
USA, Article 11, 5 pages. DOI:
14. K. Gramann, D.P. Ferris, J. Gwin, and S. Makeig. 2014.
Imaging natural cognition in action. International
Journal of Psychophysiology 91, 1 (2014), 22–29. DOI:
15. R. Gray. 2011. Looming Auditory Collision Warnings for
Driving. Human Factors 53, 1 (2011), 63–74. DOI:
16. P. Green, W. Levison, G. Paelke, and C. Seraﬁn. 1995.
Preliminary Human Factors Design Guidelines for
Driver Information Systems. Technical Report
FHWA-RD-94-087. US Department of Transportation,
Federal Highway Administration. 1–111 pages.
17. D. M. Groppe, T. P. Urbach, and M. Kutas. 2011. Mass
univariate analysis of event-related brain potentials/ﬁelds
I: A critical tutorial review. Psychophysiology 48, 12
(2011), 1711–1725. DOI:
18. S. Haufe, J-W. Kim, I-H. Kim, A. Sonnleitner, M.
Schrauf, G. Curio, and B. Blankertz. 2014.
Electrophysiology-based detection of emergency braking
intention in real-world driving. Journal of neural
engineering 11, 5 (2014), 056011. DOI:
19. T-P. Jung, S. Makeig, C. Humphries, T-W. Lee, M. J.
Mckeown, V. Iragui, and T. J. Sejnowski. 2000.
Removing electroencephalographic artifacts by blind
source separation. Psychophysiology 37, 2 (2000),
20. H.H. Kornhuber and L. Deecke. 1964.
Hirnpotentialanderungen beim Menschen vor und nach
Willkurbewegungen dargestellt mit
Magnetbandspeicherung und Ruckwartsanalyse. In
Pﬂugers Archiv, Vol. 281. 52.
21. S. Krupenia, A. Selmarker, J. Fagerlönn, K. Delsing, A.
Jansson, B. Sandblad, and C. Grane. 2014. The Methods
for Designing Future Autonomous Systems’ (MODAS)
project: Developing the cab for a highly autonomous
truck. In Proceedings of the 5th International Conference
on Applied Human Factors and Ergonomics (AHFE2014)
(Krakow, Poland). 19–23.
22. A. L. Kun, S. Boll, and A. Schmidt. 2016. Shifting Gears:
User Interfaces in the Age of Autonomous Driving. IEEE
Pervasive Computing 15, 1 (Jan 2016), 32–38. DOI:
23. B. Libet. 1985. Unconscious cerebral initiative and the
role of conscious will in voluntary action. Behavioral and
Brain Sciences 8, 4 (1985), 529–539. DOI:
24. D.E.J. Linden, D. Prvulovic, E. Formisano, M. Völlinger,
F. E. Zanella, R. Goebel, and T. Dierks. 1999. The
functional neuroanatomy of target detection: an fMRI
study of visual and auditory oddball tasks. Cerebral
Cortex 9, 8 (1999), 815–823. DOI:
25. A. Löcken, W. Heuten, and S. Boll. 2015. Supporting
Lane Change Decisions with Ambient Light. In
Proceedings of the 7th International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’15). ACM, New York, NY,
USA, 204–211. DOI:
26. A. Löcken, S. Sadeghian Borojeni, H. Müller, T.M.
Gable, S. Triberti, C. Diels, C. Glatz, I. Alvarez, L.
Chuang, and S. Boll. 2017. Towards Adaptive Ambient
In-Vehicle Displays and Interactions: Insights and Design
Guidelines from the 2015 AutomotiveUI Dedicated
Workshop. In Automotive User Interfaces: Creating
Interactive Experiences in the Car, G. Meixner and
C. Müller (Eds.). Springer International Publishing,
Cham, 325–348. DOI:
http://dx.doi.org/10.1007/978-3-319-49448-7_12
27. S.A. Lu, C.D. Wickens, J.C. Prinet, S.D. Hutchins, N.
Sarter, and A. Sebok. 2013. Supporting interruption
management and multimodal interface design: three
meta-analyses of task performance as a function of
interrupting task modality. Human Factors 55, 4 (2013),
28. N. A. Macmillan and C. D. Creelman. 1991. Detection
theory: A user’s guide (1st ed.). Lawrence Erlbaum
Associates Inc., New Jersey.
29. D.C. Marshall, J.D. Lee, and R.A. Austria. 2007. Alerts
for in-vehicle information systems: annoyance, urgency,
and appropriateness. Human factors 49, 1 (2007), 145–57.
30. D. Scott McCrickard and C. M. Chewar. 2003. Attuning
Notiﬁcation Design to User Goals and Attention Costs.
Commun. ACM 46, 3 (March 2003), 67–72. DOI:
31. D. McKeown and S. Isherwood. 2007. Mapping
Candidate Within-Vehicle Auditory Displays to Their
Referents. Human Factors 49, 3 (2007), 417–428. DOI:
32. M. A. Mollenhauer, J. Lee, K. Cho, M. C. Hulse, and
T. A. Dingus. 1994. The Effects of Sensory Modality and
Information Priority on In-Vehicle Signing and
Information Systems. In Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, Vol. 38.
33. M.A. Nees and B.N. Walker. 2011. Auditory displays for
in-vehicle technologies. Reviews of human factors and
ergonomics 7, 1 (2011), 58–99. DOI:
34. C.L. Paul, A. Komlodi, and W. Lutters. 2015. Interruptive
notiﬁcations in support of task management.
International Journal of Human-Computer Studies 79
(2015), 20–34. DOI:
35. T. W. Picton. 2010. Human auditory evoked potentials.
36. Ioannis Politis, Stephen Brewster, and Frank Pollick.
2015. Language-based Multimodal Displays for the
Handover of Control in Autonomous Cars. In
Proceedings of the 7th International Conference on
Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI ’15). ACM, New York, NY,
USA, 3–10. DOI:
37. Z. Pousman and J. Stasko. 2006. A Taxonomy of
Ambient Information Systems: Four Patterns of Design.
In Proceedings of the Working Conference on Advanced
Visual Interfaces (AVI ’06). ACM, New York, NY, USA,
38. P. Praamstra, D.F. Stegeman, M.W.I.M. Horstink, and
A.R. Cools. 1996. Dipole source analysis suggests
selective modulation of the supplementary motor area
contribution to the readiness potential.
Electroencephalography and clinical neurophysiology 98,
6 (1996), 468–477. DOI:
39. T. Ross, K. Midtland, M. Fuchs, A. Pauzié, A. Engert, B.
Duncan, G. Vaughan, M. Vernet, H. Peters, G. Burnett,
and A. May. 1996. HARDIE Design Guidelines
Handbook: Human Factors Guidelines for Information
Presentation by ATT Systems. Technical Report V2008
HARDIE, Deliverable No. 20. European Commission
Host Organization, CORDIS, Luxembourg. 1–562
40. M. Scheer, H. H. Bülthoff, and L. L Chuang. 2016.
Steering Demands Diminish the Early-P3, Late-P3 and
RON Components of the Event-Related Potential of
Task-Irrelevant Environmental Sounds. Frontiers in
human neuroscience 10, March (2016), 73. DOI:
41. D. L. Schomer and F. L. Da Silva. 2012. Niedermeyer’s
electroencephalography: basic principles, clinical
applications, and related ﬁelds. Lippincott Williams &
42. M. Schultze-Kraft, D. Birman, M. Rusconi, C. Allefeld,
K. Görgen, S. Dähne, B. Blankertz, and J-D. Haynes.
2016. The point of no return in vetoing self-initiated
movements. Proceedings of the National Academy of
Sciences 113, 4 (2016), 1080–1085. DOI:
43. H. Shibasaki and M. Hallett. 2006. What is the
Bereitschaftspotential? Clinical Neurophysiology 117, 11
(2006), 2341–2356. DOI:
44. A. Stevens, A. Quimby, A. Board, T. Kersloot, and P.
Burns. 2002. Design guidelines for safety of in-vehicle
information systems. Technical Report TRL-PA-3721/01.
Department of Transport, Local Government and the
Regions. 1–55 pages. http://www.transport-research.
45. O. Tsimhoni and M. Flannagan. 2006. Pedestrian
detection with night vision systems enhanced by
automatic warnings. In Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, Vol. 50.
Sage Publications, 2443–2447. DOI:
46. C. D. Wickens. 2002. Multiple resources and
performance prediction. Theoretical Issues in
Ergonomics Science 3, 2 (2002), 159–177. DOI:
47. C. D. Wickens. 2008. Applied attention theory.
Ergonomics 1, 2 (2008), 222. DOI:
48. T.O. Zander, L.M. Andreessen, A. Berg, M. Bleuel, J.
Pawlitzki, L. Zawallich, L.R. Krol, and K. Gramann.
2017. Evaluation of a Dry EEG System for Application
of Passive Brain-Computer Interfaces in Autonomous
Driving. Frontiers in Human Neuroscience 11 (2017).
49. S. Zhang, R. Benenson, M. Omran, J. Hosang, and B.
Schiele. 2016. How Far are We from Solving Pedestrian
Detection?. In 2016 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR). 1259–1267.