Neural attention filters do not predict behavioral success in a large cohort of aging listeners

Sarah Tune*, Mohsen Alavash, Lorenz Fiedler, & Jonas Obleser*

Department of Psychology, University of Lübeck, 23562 Lübeck, Germany

* Author correspondence:
Sarah Tune, Jonas Obleser
Department of Psychology
University of Lübeck
Maria-Goeppert-Str. 9a
23562 Lübeck
Email: sarah.tune@uni-luebeck.de; jonas.obleser@uni-luebeck.de

Running title: Modelling neural filters and behavioral outcome in attentive listening

Number of words in Abstract: 193
Number of words in Introduction: 791
Number of words in Discussion: 1882

41 pages, 6 Figures, 0 Tables, Includes Supplemental Information

Conflict of Interest: The authors declare no competing financial interests.

Keywords: EEG, alpha power lateralization, neural speech tracking, speech processing, auditory selective attention

Author contributions: S.T., M.A., and J.O. designed the experiment; S.T. and M.A. oversaw data collection and preprocessing of the data; S.T., M.A., and L.F. analyzed the data; S.T., L.F., M.A., and J.O. wrote the paper.
Acknowledgments: Research was funded by the European Research Council (grant no. ERC-CoG-2014-646696 "Audadapt" awarded to J.O.). The authors are grateful for the help of Franziska Scharata in acquiring the data.
Abstract

Successful speech comprehension requires the listener to differentiate relevant from irrelevant sounds. Most studies on this problem address one of two candidate neural auditory filter solutions: selective neural tracking of attended versus ignored speech in auditory cortex ("speech tracking"), and the lateralized modulation of alpha power in wider temporoparietal cortex. Using an unprecedentedly large, age-varying sample (N=155; age=39–80 years) in a difficult listening task, we here demonstrate their limited potency to predict behavioral listening success at the single-trial level. Within auditory cortex, we observed attentional-cue-driven modulation of both speech tracking and alpha power. Both filter mechanisms clearly operate functionally independently of each other and appear undiminished with chronological age. Importantly, single-trial models and cross-validated predictive analyses in this large sample challenge the immediate functional relevance of these neural attentional filters to overt behavior: the direct impact of spatial attentional cues, as well as of the typical confounders age and hearing loss, on behavior reliably outweighed the relatively minor predictive influence of speech tracking and alpha power. Our results emphasize the importance of large-scale, trial-by-trial analyses and caution against simplified accounts of neural filtering strategies for behavior often derived from younger, smaller samples.
Introduction

Real-life listening is characterized by the concurrence of sound sources that compete for our attention [1]. Successful speech comprehension thus relies on the selective filtering of relevant and irrelevant inputs. How does the brain instantiate attentional filter mechanisms that allow for the controlled inhibition and amplification of speech? Recent neuroscientific approaches to this question have focused on two potential neural filter strategies originating from distinct research traditions:
From the visual domain stems an influential line of research that supports a role of alpha-band (~8–12 Hz) oscillatory activity in the implementation of controlled, top-down suppression of behaviorally irrelevant information [2-5]. Importantly, across modalities, it was shown that tasks requiring spatially directed attention are neurally supported by a hemispheric lateralization of alpha power over occipital and parietal but also the respective sensory cortices [6-15]. This suggests that asymmetric alpha modulation could act as a filter mechanism by modulating sensory gain already at early processing stages.
In addition, a prominent line of research focuses on the role of low-frequency (1–8 Hz) neural activity in auditory and, broadly speaking, perisylvian cortex in the selective representation of speech input ("speech tracking"). It is assumed that slow cortical dynamics temporally align with (or "track") auditory input signals to allow for a prioritized neural representation of behaviorally relevant sensory information [16-19]. In human speech comprehension, a key finding is the preferential neural tracking of attended compared to ignored speech in superior temporal brain areas close to auditory cortex [20-24].
However, with few exceptions [6], these two proposed auditory filter strategies have been studied independently of one another [but see 25,26 for recent results on visual attention]. Also, they have often been studied using tasks that are difficult to relate to natural, conversation-related listening situations [27,28].
We thus lack understanding of whether or how modulations in lateralized alpha power and the neural tracking of attended and ignored speech in wider auditory cortex interact in the service of successful listening behavior. At the same time, few studies using more real-life listening and speech-tracking measures were able to explicitly address the functional relevance of the discussed neural filter strategies, that is, their potency to explain and predict behavioral listening success [22].
In the present EEG study, we aim at closing these gaps by leveraging the statistical power and representativeness of a large, age-varying participant sample. We use a novel dichotic listening paradigm to enable a synoptic look at concurrent changes in auditory alpha power and neural speech tracking at the single-trial level. More specifically, our linguistic variant of a classic Posner paradigm [29] emulates a challenging dual-talker listening situation in which speech comprehension is supported by two different listening cues [30]. These cues encourage the use of two complementary cognitive strategies to improve comprehension: A spatial-attention cue guides auditory attention in space, whereas a semantic cue affords more specific semantic predictions of upcoming speech. Previous research has shown that the sensory analysis of speech and, to a lesser degree, the modulation of alpha power are influenced by the availability of higher-order linguistic information [31-35].
Varying from trial to trial, both cues were presented either in an informative or an uninformative version, allowing us to examine relative changes in listening success and in the modulation of neural measures thought to enable auditory selective attention. Based on the neural and behavioral results, we focused on four research questions (see Fig. 1). Note that in pursuing these, we take into account additional notorious influences on listening success and its supporting neural strategies. These include age, hearing loss, or hemispheric asymmetries in speech processing due to the well-known right-ear advantage [36,37].
First, we predicted that informative listening cues should increase listening success: These cues allow the listener to deploy auditory selective attention (compared to divided attention), and to generate more specific (compared to only general) semantic predictions, respectively.
Second, we asked how the different cue–cue combinations would modulate the two key neural measures: alpha power lateralization and neural speech tracking. We expected to replicate previous findings of increased alpha power lateralization and stronger tracking of the to-be-attended speech signal under selective (compared to divided) spatial attention.
Third, an important and often neglected research question pertains to a direct, trial-by-trial relationship of these two candidate neural measures: Do changes in alpha power lateralization impact the degree to which attended and ignored speech signals are neurally tracked by low-frequency cortical responses?

Figure 1. Schematic illustration of the research questions addressed in the present study. The dichotic listening task manipulated the attentional focus and semantic predictability of upcoming input using two separate visual cues. We investigated whether informative cues would enhance behavioral performance (Q1). In line with Q2, we also examined the degree to which listening cues modulated the two auditory neural measures of interest: neural tracking and lateralization of auditory alpha power. Finally, we assessed (Q3) the co-variation of neural measures, and (Q4) their potency in predicting behavioral performance. Furthermore, we controlled for additional factors that may challenge listening success and its underlying neural strategies.
Our final and overarching research question is arguably the most relevant one for all translational aspects of auditory attention: Would alpha power and neural speech tracking of attended and ignored speech allow us at all to explain and predict single-trial behavioral success in this challenging listening situation? While tacitly assumed in most studies that deem these filter mechanisms "attentional", this has remained a surprisingly open empirical question.
Results

We recorded EEG from an age-varying sample (N=155) of healthy middle-aged and older adults (age=39–80 years) who performed a challenging dichotic listening task [30]. In this linguistic variant of a classic Posner paradigm, participants listened to two competing sentences spoken by the same female talker, and were asked to identify the final word in one of the two sentences. Importantly, sentence presentation was preceded by two visual cues. First, a spatial-attention cue encouraged the use of either selective or divided attention by providing informative or uninformative instructions about the to-be-attended, and thus later probed, ear. The second cue indicated the semantic category that applied to both final target words. The provided category could represent a general or a specific level, thus allowing for more or less precise prediction of the upcoming speech signal (Fig. 2a, b). While this listening task does not tap into the most naturalistic forms of speech comprehension, it provides an ideal scenario to probe the neural underpinnings of successful selective listening.
To address our research questions outlined above, we employed two complementary analysis strategies: First, we aimed at explaining behavioral task performance in a challenging listening situation, and the degree to which it is leveraged by two key neural measures of auditory attention: the lateralization of 8–12 Hz alpha power, and the neural tracking of attended and ignored speech by slow cortical responses. Using generalized linear mixed-effects models on single-trial data, we investigated the cue-driven modulation of behavior and neural measures, the interaction of neural measures, and their (joint) influence on selective listening success.
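To make the single-trial modelling approach concrete, the sketch below shows what such a mixed-effects specification could look like. This is an illustration rather than the authors' code: the data file and column names are hypothetical, and the original models were presumably fit with dedicated mixed-model software (e.g., lme4 in R); the Python/statsmodels version here only mirrors the general model structure.

```python
# Minimal sketch (not the authors' code): a single-trial linear mixed-effects model
# of response speed with by-participant random intercepts, assuming a long-format
# trial table with hypothetical column names.
import pandas as pd
import statsmodels.formula.api as smf

trials = pd.read_csv("single_trial_data.csv")  # hypothetical file: one row per trial

# Fixed effects: cue conditions plus the covariates age and hearing loss (PTA);
# random effects: participant-specific intercepts.
model = smf.mixedlm(
    "response_speed ~ spatial_cue * semantic_cue + age + pta",
    data=trials,
    groups=trials["subject_id"],
)
result = model.fit()
print(result.summary())

# Accuracy, being binary, would instead call for a generalized (logistic) mixed model,
# e.g., lme4::glmer(accuracy ~ ... + (1 | subject), family = binomial) in R.
```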
Second, we went beyond common explanatory statistical analyses and used cross-validated regularized regression to explicitly probe the potency of cue-driven modulation of behavior and neural measures to predict single-trial task performance [38].
Figure 2. Experimental design and behavioral benefit from informative cues.
(a) Visualization of the employed 2×2 design [39]. The level of the spatial and semantic cue differed on a trial-by-trial basis. Top row shows the informative [+] cue levels, bottom row the uninformative [−] cue levels.
(b) Schematic representation of the temporal order of events for a given trial. Successive display of the two visual cues precedes the dichotic presentation of two sentences spoken by the same female talker. After sentence presentation, participants had to select the final word from four alternative words.
(c) Grand average and individual results of accuracy shown per cue–cue combination. Colored dots are individual (N=155) trial-averages, black dots and vertical lines show group means ± bootstrapped 95 % confidence intervals (left side). Individual cue benefits are displayed separately for the two cues (top: spatial cue, bottom: semantic cue). Black dots indicate individual trial-averages ± bootstrapped 95 % confidence intervals. Histograms show the distribution of the difference for informative vs. uninformative levels. OR: odds ratio parameter estimate from generalized linear mixed-effects models (right side); *** = p < .001.
(d) As in (c) but shown for response speed; β: slope parameter estimate from general linear mixed-effects models.

The following figure supplements are available for figure 2:
Figure supplement 1. (a) Histogram showing the age distribution of N=155 participants across 2-year age bins. (b) Individual (N=155) and mean air-conduction thresholds (PTA) averaged across the left and right ear, and four age groups.
Figure supplement 2. Analysis of response types split into correct responses, spatial stream confusions and random errors.
Informative spatial cue improves listening success
For behavioral performance, we tested the impact of informative versus uninformative listening cues on listening success. Overall, participants achieved a mean accuracy of 87.8 % ± 9.1 % with a mean reaction time of 1742 ms ± 525 ms; expressed as response speed: 0.62 s⁻¹ ± 0.17.
As shown in Figure 2c/d, the behavioral results varied between the different combinations of listening cues. As expected, the analyses revealed a strong behavioral benefit of informative compared to uninformative spatial-attention cues. In selective-attention trials, participants responded more accurately and faster (accuracy: generalized linear mixed-effects model (GLMM); odds ratio (OR)=3.45, std. error (SE)=.12, p<.001; response speed: linear mixed-effects model (LMM); β=.57, SE=.04, p<.001). That is, when cued to one of the two sides, participants responded on average 261 ms faster and their probability of giving a correct answer increased by 6 %.
Participants also responded generally faster in trials in which they were given a specific and thus informative semantic cue (LMM; β=.2, SE=.03, p<.001), most likely reflecting a semantic priming effect that led to faster word recognition.
As in a previous fMRI implementation of this task [30], we did not find evidence for any interactive effects of the two listening cues on either accuracy (GLMM; OR=1.3, SE=.22, p=.67) or response speed (LMM; β=.09, SE=.06, p=.41). Moreover, the breakdown of error trials revealed a significantly higher proportion of spatial stream confusions (6 % ± 8.3 %) compared to random errors (3 % ± 3.4 %; paired t-test on logit-transformed proportions: t(155)=6.53, p<.001; see also Fig. 2–supplement 2). The increased rate of spatial stream confusions (i.e., responses in which the last word of the to-be-ignored sentence was chosen) attests to the distracting nature of dichotic sentence presentation and thus heightened task difficulty.
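As a small illustration of the error-type comparison described above, the following sketch runs a paired t-test on logit-transformed proportions; the per-participant proportions are simulated placeholders, not the study data.

```python
# Minimal sketch (simulated data): paired comparison of per-participant error
# proportions after a logit transform, mirroring the analysis described above.
import numpy as np
from scipy import stats
from scipy.special import logit

rng = np.random.default_rng(0)
# Hypothetical per-participant proportions of the two error types (N = 155).
p_stream_confusion = rng.beta(2, 30, size=155)   # placeholder values
p_random_error = rng.beta(1.5, 45, size=155)     # placeholder values

eps = 1e-3  # guard against logit(0) or logit(1)
t, p = stats.ttest_rel(logit(np.clip(p_stream_confusion, eps, 1 - eps)),
                       logit(np.clip(p_random_error, eps, 1 - eps)))
print(f"paired t = {t:.2f}, p = {p:.3g}")
```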
Spatial attention modulates both alpha power lateralization and neural speech tracking in auditory cortex
In line with our second research question, following source projection of the EEG data, we tested the hypothesis that the presence of an informative spatial-attention cue would lead to reliable modulation of both alpha power and neural speech tracking in and around auditory cortex. To this end, we restricted our analyses to an a priori defined auditory region of interest (ROI; see Fig. 3–supplement 1 and Methods for details).
For alpha power, we expected an attention-modulated lateralization caused by the concurrent decrease in power contralateral and increase in power ipsilateral to the focus of attention. For the selective neural tracking of speech, we expected an increase in the strength of neural tracking of attended but potentially also of ignored speech [cf. 40] in selective-attention compared to divided-attention trials.
For the lateralization of alpha power we relied on an established neural index that compares alpha power ipsi- and contralateral to the to-be-attended ear (in divided-attention trials: probed ear) to provide a temporally resolved single-trial measure of alpha power lateralization [ALI = (α-power_ipsi − α-power_contra) / (α-power_ipsi + α-power_contra)] [12,22].
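A minimal sketch of the ALI computation defined above, assuming trial-wise alpha power has already been averaged within the ipsi- and contralateral auditory ROIs; the example values are arbitrary.

```python
# Minimal sketch (assumed inputs): single-trial alpha lateralization index (ALI)
# from trial-wise 8-12 Hz power in the ipsi- and contralateral auditory ROIs.
import numpy as np

def alpha_lateralization_index(power_ipsi: np.ndarray, power_contra: np.ndarray) -> np.ndarray:
    """ALI = (ipsi - contra) / (ipsi + contra), per trial (and per time point, if given)."""
    return (power_ipsi - power_contra) / (power_ipsi + power_contra)

# Hypothetical alpha-power values for three trials (arbitrary units).
ipsi = np.array([1.8, 1.2, 2.0])
contra = np.array([1.1, 1.3, 1.4])
print(alpha_lateralization_index(ipsi, contra))  # positive = relatively more ipsilateral alpha
```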
As shown in Figure 3a, the instruction to pay attention to one of the two sides elicited a pronounced lateralization of 8–12 Hz alpha power within the auditory ROI: We found a strong but transient increase in lateralization in response to an informative spatial-attention cue. After a subsequent breakdown of the lateralization during semantic-cue presentation, it re-emerged in time for the start of dichotic sentence presentation and peaked again during the presentation of the task-relevant sentence-final words.
As expected, the statistical analysis of alpha power lateralization during sentence presentation (time window: 3.5–6.5 s) revealed a significant modulation by attention that was additionally influenced by the probed ear (LMM; spatial cue × probed ear: β=.13, SE=.02, p<.001). Follow-up analysis showed a significant increase in alpha power lateralization in selective-attention compared to divided-attention trials when participants were instructed to pay attention to the right ear (LMM, simple effect of spatial cue: β=.12, SE=.01, p<.001). No such difference was found for attend-left/ignore-right trials (LMM, simple effect of spatial cue: β=.01, SE=.01, p=.63; see Table S3 for full model details).
Notably, we did not find any evidence for a modulation of alpha lateralization during sentence presentation by the semantic cue, nor any joint influence of the spatial and semantic cue (LMM; semantic cue main effect: β=.007, SE=.01, p=.57; interaction spatial × semantic cue: β=.02, SE=.02, p=.42).
We ran additional control analyses for the time windows covering the spatial cue (0–1 s) and the sentence-final target word (5.5–6.5 s), respectively. For the target word, results mirrored those observed for the entire duration of sentence presentation (see Table S8). Alpha power lateralization during spatial-cue presentation was modulated by independent effects of the spatial-attention cue and probed ear (see Table S9 for details). None of our analyses revealed an effect of age or hearing ability on the strength of alpha power lateralization (all p-values > .14).
Figure 3. Informative spatial cue elicits increased alpha power lateralization before and during speech presentation.
(a) Attentional modulation of 8–12 Hz auditory alpha power shown throughout the trial for all N=155 participants. Brain models indicate the spatial extent of the auditory region of interest (shown in blue). Purple traces show the temporally resolved alpha lateralization index (ALI) for the informative (dark color) and the uninformative spatial cue (light color), each collapsed across the semantic cue levels. Positive values indicate relatively higher alpha power in the hemisphere ipsilateral to the attended/probed sentence compared to the contralateral hemisphere. The shaded grey area shows the temporal window of sentence presentation.
(b) Strength of the alpha lateralization index (ALI) during sentence presentation (3.5–6.5 s) shown separately per spatial-cue condition and probed ear, as indicated by a significant interaction in the corresponding mixed-effects model (left plot). Colored dots represent trial-averaged individual results, black dots and error bars indicate the grand average and bootstrapped 95 % confidence intervals. For attend-right/ignore-left trials we observed a significant increase in alpha power lateralization (right plot). Black dots represent individual mean ALI values ± bootstrapped 95 % confidence intervals. Histogram shows the distribution of differences in ALI for informative vs. uninformative spatial-cue levels. β: slope parameter estimate from the corresponding general linear mixed-effects model; ***=p<.001.
In close correspondence to the alpha-power analysis, we investigated whether changes in attention or semantic predictability would modulate the neural tracking of attended and ignored speech. We used linear backward ('decoding') models to reconstruct the onset envelopes of the to-be-attended and to-be-ignored sentences (for simplicity hereafter referred to as attended and ignored) from neural activity in the auditory ROI. We compared the reconstructed envelopes to the envelopes of the actually presented sentences and used the resulting Pearson's correlation coefficients as a fine-grained, single-trial measure of neural tracking strength. Reconstruction models were trained on selective-attention trials only, but then utilized to reconstruct attended (probed) and ignored (unprobed) envelopes for both attention conditions (see Methods and Fig. 4a along with supplement 1 for details).
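The sketch below illustrates the general logic of such a backward ('decoding') model on toy data: ridge-regularized regression maps time-lagged EEG onto the onset envelope, and the reconstruction is scored with Pearson's r. It is not the authors' pipeline (their models were trained on selective-attention trials and applied within functional parcels of the auditory ROI); all signals and parameters here are made up for illustration.

```python
# Minimal sketch (toy data): ridge-regularized linear backward model that
# reconstructs a speech onset envelope from multichannel EEG via time-lagged
# features, scored with Pearson's r as a single-trial tracking measure.
import numpy as np

def lagged_design(eeg: np.ndarray, lags: np.ndarray) -> np.ndarray:
    """Time x (channels*lags) design; lag k pairs envelope[t] with eeg[t + k]
    (the neural response follows the stimulus)."""
    n_t, n_ch = eeg.shape
    X = np.zeros((n_t, n_ch * len(lags)))
    for i, lag in enumerate(lags):
        src = np.zeros_like(eeg)
        if lag >= 0:
            src[:n_t - lag] = eeg[lag:]
        else:
            src[-lag:] = eeg[:n_t + lag]
        X[:, i * n_ch:(i + 1) * n_ch] = src
    return X

def fit_backward_model(eeg, envelope, lags, alpha=1.0):
    X = lagged_design(eeg, lags)
    # Ridge solution: w = (X'X + alpha*I)^-1 X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ envelope)

def reconstruct(eeg, weights, lags):
    return lagged_design(eeg, lags) @ weights

# Hypothetical single trial: 5 s of 100 Hz EEG (16 channels) weakly driven by the envelope.
rng = np.random.default_rng(1)
fs, n_ch = 100, 16
envelope = np.abs(rng.standard_normal(5 * fs))                  # toy onset envelope
eeg = rng.standard_normal((5 * fs, n_ch)) + 0.1 * envelope[:, None]
lags = np.arange(0, int(0.25 * fs))                             # EEG 0-250 ms after each envelope sample
w = fit_backward_model(eeg, envelope, lags)
r = np.corrcoef(reconstruct(eeg, w, lags), envelope)[0, 1]
print(f"in-sample tracking strength (Pearson's r) = {r:.2f}")   # real analyses would cross-validate
```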
In line with previous studies, but here observed particularly for right-ear inputs processed in the left auditory ROI, we found increased encoding of attended compared to ignored speech in the time window covering the N1_TRF and P2_TRF components (see Fig. 4b). Notably, within both left and right auditory cortex, for later processing stages (>300 ms), deflections in the forward-transformed temporal response functions (TRFs) for attended and ignored speech follow trajectories of opposite polarity, with overall stronger deflections for the ignored sentence.
Further attesting to the validity of our reconstruction models, we found that envelopes reconstructed by the attended decoder model were more similar to the envelope of the actually presented to-be-attended sentence than to that of the to-be-ignored sentence (see Fig. 4b, bottom left plot). As reported in previous studies, we also found significantly stronger neural tracking of attended (mean r=.06, range: .02–.15) compared to ignored speech (ignored: mean r=.03, range: .03–.1; LMM, β=.022, SE=.002, p<.001; see Fig. 4b, bottom right plot).
As shown in Figure 4c/d, the neural tracking of attended and ignored envelopes was modulated by attention. Note that, for the sake of consistency, for divided-attention trials we likewise refer to the reconstruction of the probed envelope as the attended envelope, and to the reconstruction of the unprobed envelope as the ignored envelope, despite the absence of corresponding instructions.
Giving credence to the suggested role of selective neural speech tracking as a neural filter strategy, we found stronger neural tracking of the attended envelope following an informative spatial-attention cue as compared to an uninformative one (LMM; β=.06, SE=.01, p<.001; see Table S4 for full model details). The semantic predictability of the sentence-final words, however, did not modulate the neural tracking of the attended envelope (LMM; β=.02, SE=.01, p=.26). We also found stronger tracking in trials in which the later-probed sentence started ahead of the distractor sentence (LMM: β=.13, SE=.01, p<.001; see Methods and Supplements for details). The latter effect may indicate that the earlier onset of a sentence increased its bottom-up salience and attentional capture. The observed effect may also suggest that participants focused their attention on one of the two sentences already during early parts of auditory presentation.
The analysis of ignored-speech tracking revealed a qualitatively similar pattern of results: we found stronger tracking of ignored speech following an informative compared to an uninformative spatial-attention cue (LMM; spatial cue: β=.04, SE=.01, p=.005). We additionally observed stronger tracking of the ignored (under divided attention: unprobed) sentence when it was played to the left ear (LMM; main effect of probed ear: β=.17, SE=.02, p<.001; see Table S5 for full model details).
360
(a) Schematic representation of the linear backward model approach used to reconstruct onset
361
envelopes on the basis of recorded EEG responses. Following the estimation of linear backward
362
models using only selective-attention trials (see Fig. 4supplement 1 for details), envelopes of
363
attended and ignored sentences were reconstructed via convolution of EEG responses with the
364
estimated linear backward models in all functional parcels within the auditory ROI. Reconstructed
365
envelopes were compared to the envelopes of presented sentences to assess neural tracking strength
366
and reconstruction accuracy (see Supplements).
367
(b) Forward-transformed temporal response functions (TRFs) for attended (green) and ignored (red)
368
speech in the selective-attention condition in the auditory ROI. Models are averaged across all N=155
369
participants, and shown separately for per hemisphere. Left panel compares the TRFs in the left
370
auditory ROI to right-ear input, right panel compares the TRFs in the right auditory ROI left-ear input.
371
Error bands indicate 95 % confidence intervals. Bottom row: Left plot shows the Pearson’s correlation
372
of the reconstructed attended envelope to the envelopes of the actually presented sentences, middle
373
plot analogously for the reconstructed ignored envelope. Right plot compares the neural tracking
374
strength of the attended and ignored speech.
375
(c) Attentional modulation in the neural tracking of attended (under divided attention: probed)
376
speech. Colored dots represent trial-averaged individual results, black dots and error bars indicate
377
the grand-average and bootstrapped 95 % confidence intervals. Single-trial analysis reveals that
378
neural tracking of attended speech is stronger under selective attention (left plot and bottom right
379
plot). Colored dots in 45° plot represent trial-averaged correlation coefficients per participant along
380
with bootstrapped 95 % confidence intervals; ***= p<.001.
381
(d) as in (c ) but for the neural tracking of ignored/unprobed speech; **=p<.01.
382
383
The following figure supplements are available for figure 4:
384
Figure supplement 1. Training and testing of envelope reconstruction models.
385
Figure supplement 2. Neural tracking strength per reconstruction model for divided attention
386
condition (N=155).
387
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
14
Figure supplement 3. Decoding accuracy of attended and ignored speech shown per attention
388
condition (N=155).
389
390
Changes in neural speech tracking are independent of concurrent changes in auditory alpha power lateralization
Our third major goal was to investigate whether the modulation of alpha power and neural tracking strength reflect two dependent neural strategies of auditory attention at all. To this end, using two single-trial linear mixed-effects models, we tested whether the neural tracking of the attended and ignored sentence could be explained by auditory alpha power dynamics (here, alpha power ipsi- and contralateral to the focus of attention/probed ear, that is, the relative signal modulations that constitute the single-trial alpha lateralization index). If modulations of alpha power over auditory cortices indeed act as a neural filter mechanism to selectively gate processing during early stages of sensory analysis, then this should be reflected as follows: decreases in contralateral alpha power should lead to stronger neural tracking of the attended envelope, whereas increases in ipsilateral alpha power should lead to weaker tracking of the ignored envelope (cf. Fig. 5a).
However, neither the tracking of the attended envelope (LMM; main effect alpha ipsi: β=−.007, SE=.006, p=.63, Bayes factor (BF)=.009; main effect alpha contra: β=.01, SE=.006, p=.43, BF=.02; interaction ipsi × contra: β=−.003, SE=.004, p=.72, BF=.006) nor the tracking of the ignored envelope was correlated with changes in ipsi- or contralateral alpha power or their interaction (LMM; main effect alpha ipsi: β=.01, SE=.006, p=.35, BF=.02; main effect alpha contra: β=.003, SE=.006, p=.85, BF=.006; interaction ipsi × contra: β=−.006, SE=.004, p=.53, BF=.01; see Tables S6 and S7 for full model details). This notable absence of an alpha lateralization–speech tracking relationship was independent of the respective spatial-attention condition (LMM; all corresponding interaction terms with p-values > .38, BFs < .0076).
A control analysis relating alpha power modulation during spatial-cue presentation to neural speech tracking showed qualitatively similar results (see Table S10 for details).
Alternatively, selective neural tracking of speech might be driven not primarily by alpha-power modulations emerging in auditory cortices, but rather by those generated in domain-general attention networks in parietal cortex [41]. Additional control models included changes in alpha power within the inferior parietal lobule (see Table S11 and Fig. 5–supplement 1 for details) to test this hypothesis: We found a trend towards a negative relationship between contralateral alpha power and the neural tracking of the attended envelope (LMM; β=−.017, SE=.007, p=.082, BF=.1). In addition, we observed a significant joint effect of ipsi- and contralateral alpha power in inferior parietal cortex on the tracking of the ignored envelope (LMM; interaction alpha power ipsi × alpha power contra: β=−.011, SE=.004, p=.022, BF=.17): the strength of the positive correlation of ipsilateral alpha power and ignored-speech tracking increased with decreasing levels of contralateral alpha power.
A final control analysis included changes in visual alpha power and did not find any significant effects of alpha power on neural speech tracking (see Table S12).
Figure 5. Independence of auditory alpha power dynamics and neural tracking.
(a) Hypothesized inverse relationship of changes in alpha power and neural tracking within the auditory ROI. Changes in alpha power are assumed to drive changes in neural tracking strength. Schematic representation of the expected co-variation in the two neural measures during an attend-left trial.
(b) Independence of changes in neural tracking and alpha power in the auditory ROI during sentence processing, as revealed by two separate linear mixed-effects models predicting the neural tracking strength of attended and ignored speech, respectively. Error band denotes the 95 % confidence interval. Black dots represent single-trial observed values from N=155 participants (i.e., a total of k=35675 trials).
448
Alpha power lateralization and neural tracking fail to explain single-trial
449
listening success
450
Having established the functional independence of alpha power lateralization and
451
speech tracking, the final and most important piece of our investigation becomes
452
in fact statistically more tractable: If alpha power dynamics in auditory cortex (but
453
also in inferior parietal and visual cortex), and neural speech tracking essentially act
454
as two largely independent neural filter strategies, we can proceed to probe their
455
relative functional relevance for behavior in a factorial-design fashion.
456
We employed two complementary analysis strategies to quantify the
457
relative importance of neural filters for listening success: (i) using the same
458
(generalized) linear mixed-effects models as in testing our first research question
459
(Q1 in Fig.1), we investigated whether changes in task performance could be
460
explained by the independent (i.e., as main effects) or joint influence (i.e., as an
461
interaction) of neural measures; (ii) using cross-validated regularized regression we
462
directly assessed the ability of neural filter dynamics to predict out-of-sample
463
single-trial listening success.
464
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
16
With respect to response accuracy, our most important indicator of listening success, we did not find evidence for any direct effects of alpha power lateralization and the neural tracking of attended or ignored speech (GLMM; main effect alpha power lateralization (ALI): OR=1.0, SE=.02, p=.93; main effect attended speech tracking: OR=1.01, SE=.02, p=.93; main effect ignored speech tracking: OR=1.01, SE=.02, p=.82; see Fig. 6), nor for any joint effects (GLMM; ALI × attended speech tracking: OR=1.01, SE=.02, p=.82; ALI × ignored speech tracking: OR=.98, SE=.02, p=.72; see Fig. 6b for a comparison of Bayes factors). Importantly, the absence of an effect did not hinge on differences across spatial-cue, semantic-cue or probed-ear levels, as relevant interactions of the neural measures with these predictors were included in the model (all p-values > .05, see Table S1 for full details).
The analysis of response speed did not reveal any meaningful brain–behavior relationships, either. We did not find evidence for any direct effects of neural measures on response speed (LMM; alpha power lateralization (ALI): β=0.002, SE=.005, p=.9; attended speech tracking: β=.003, SE=.005, p=.73; ignored speech tracking: β=.00001, SE=.005, p=.99) nor for their joint influence (LMM; ALI × attended speech tracking: β=0.005, SE=.004, p=.61; ALI × ignored speech tracking: β=.004, SE=.004, p=.61; see Table S2 for full details).
We ran two sets of control analyses: first, we tested for the influence of changes in auditory alpha power during the presentation of the sentence-final target word and the spatial-attention cue, respectively (see Tables S13–16). Next, we investigated whether modulation of alpha power lateralization during sentence presentation extracted from inferior parietal or visual cortex would influence accuracy or response speed (see Tables S17–20). However, all results were qualitatively in line with those of our main analyses.
In contrast to the missing brain–behavior link, but in line with the literature on listening behavior in aging adults [e.g. 42-44], the behavioral outcome was instead reliably predicted by age, hearing loss, and probed ear. We observed that participants' performance varied in line with the well-attested right-ear advantage (REA, also referred to as left-ear disadvantage) in the processing of linguistic materials [36,37]. More specifically, participants responded both faster and more accurately (response speed: LMM; β=.08, SE=.013, p<.001; accuracy: GLMM; OR=1.23, SE=.07, p=.017; see also Fig. 6–supplement 2) when they were probed on the last word presented to the right compared to the left ear.
Increased age led to less accurate and slower responses (accuracy: GLMM; OR=.81, SE=.08, p=.033; response speed: LMM; β=.15, SE=.03, p<.001). In contrast, increased hearing loss led to less accurate (GLMM; OR=.75, SE=.08, p=.003) but not slower responses (LMM; β=.05, SE=.08, p=.28).
To test whether alpha power lateralization and neural speech tracking would be related to behavior on a between-subjects level, where their effects could also be more readily compared to the effects of age or PTA, we included separate within- and between-subjects effects in simplified control models ([45]; see Fig. 6b and Methods for details).
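For illustration, a minimal sketch of such a within/between decomposition, assuming a hypothetical trial table: the between-subjects part of a predictor is its participant mean, the within-subjects part is the trial-wise deviation from that mean.

```python
# Minimal sketch (assumed data frame): splitting a trial-level predictor into a
# between-subjects part (the participant mean) and a within-subjects part
# (trial-wise deviation from that mean), as used in the simplified control models.
import pandas as pd

trials = pd.read_csv("single_trial_data.csv")  # hypothetical file: one row per trial

subj_mean = trials.groupby("subject_id")["attended_tracking"].transform("mean")
trials["tracking_between"] = subj_mean                                # differs across listeners only
trials["tracking_within"] = trials["attended_tracking"] - subj_mean   # varies trial by trial

# Both terms can then enter the mixed model as separate fixed effects, e.g.:
# accuracy ~ tracking_between + tracking_within + age + pta + (1 | subject_id)
```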
We did not observe any between-subjects effects of alpha power lateralization (GLMM; OR=1.14, SE=.07, p=.11) or neural speech tracking (GLMM; attended speech tracking: OR=1.12, SE=.08, p=.35; ignored speech tracking: OR=1.05, SE=.09, p=.74) on accuracy (see Table S21 and Fig. 6–supplement 1). For response speed, we found a significant between-subjects effect of attended speech tracking (LMM; attended speech tracking: β=.013, SE=.005, p=.039) but no comparable effects for the other two neural measures (LMM; ignored speech tracking: β=.002, SE=.006, p=.89; ALI: β=.001, SE=.005, p=.89; see Table S22).
Figure 6. Results of explanatory and predictive analyses.
(a) Path diagram summarizing the results of our four research questions. Solid black arrows indicate significant effects, grey dashed lines the absence of an effect. An informative spatial cue led to a boost in both accuracy and response speed, as well as in alpha power lateralization and neural speech tracking. Neural dynamics varied independently of one another and of changes in behavior. Age, hearing loss and probed ear additionally influenced behavioral listening success.
(b) Bayes factors (BF) for individual predictors of accuracy (left) and response speed (right). BFs are derived by comparing the full model with a reduced model in which only the respective regressor was removed. Log-transformed BFs larger than zero provide evidence for the presence of a particular effect, whereas log-transformed BFs below zero provide evidence for the absence of an effect.
(c) Predictive performance of models of accuracy (left) and response speed (right) as estimated by the area under the receiver operating characteristic curve (AUC) and the mean-squared error (MSE) metric, respectively. Colored dots show the performance in each of k=5 outer folds in nested cross-validation, along with the mean and 95 % confidence interval shown on the right side of each panel. Color gradient indicates the amount of regularization applied.
(d) Variable selection in the k=5 final models, shown here only for main effects. Horizontal bars indicate in how many of the final models a given effect was included (i.e., had a non-zero coefficient). All main effects were included in the winning model of accuracy; for response speed only the main effects of spatial and semantic cue, age and PTA were included (see figure supplement 4 for the full model).
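The caption above notes that BFs were derived by comparing the full model against a reduced model lacking the regressor of interest. As a hedged illustration of that general idea (not necessarily the authors' exact procedure), the sketch below approximates a Bayes factor from the BIC difference of two nested models fit to toy data.

```python
# Illustrative sketch only: approximating a Bayes factor for a single regressor by
# comparing a full model with a reduced model that omits it, using the common BIC
# approximation BF10 ~ exp((BIC_reduced - BIC_full) / 2). Whether this matches the
# authors' exact BF computation is an assumption; the data below are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
toy = pd.DataFrame({
    "speed": rng.normal(0.6, 0.15, 1000),
    "ali": rng.normal(0, 1, 1000),            # alpha lateralization index (no true effect here)
    "age": rng.uniform(39, 80, 1000),
})
toy["speed"] -= 0.002 * toy["age"]            # build in an age effect only

full = smf.ols("speed ~ ali + age", data=toy).fit()
reduced = smf.ols("speed ~ age", data=toy).fit()   # regressor of interest removed

log_bf10 = (reduced.bic - full.bic) / 2
print(f"log BF10 for ALI ~ {log_bf10:.2f}  (values below zero favor its absence)")
```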
Neural filter mechanisms make limited contributions to overall good model performance in predicting behavior
Lastly, to complement the analysis of the explanatory power of neural filters for listening success, we explicitly scrutinized their potency to predict out-of-sample single-trial behavioral outcomes. To this end, based on the final explanatory models for accuracy and response speed, we performed variable selection via cross-validated LASSO regression to test which subset of variables would allow for optimal prediction of single-trial task performance (see Fig. 6–supplement 3 and Methods for details).
This approach yields two important results: On the one hand, based on the k=5 cross-validated final models, it quantifies how well, on average, we can predict single-trial accuracy (expressed via the area under the receiver operating characteristic curve, AUC) and response speed (expressed via the mean-squared error, MSE). On the other hand, the comparison of terms included in the final models allows us to assess how consistently a given effect was deemed important for prediction.
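A minimal sketch of a nested cross-validated L1-regularized (LASSO-type) prediction of single-trial accuracy, scored with the AUC. The feature names and data file are hypothetical, and for brevity the sketch ignores the nesting of trials within participants, so it illustrates the general procedure rather than reproducing the authors' pipeline.

```python
# Minimal sketch (not the authors' pipeline): nested cross-validated L1 logistic
# regression predicting single-trial accuracy, scored with the AUC.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

trials = pd.read_csv("single_trial_data.csv")  # hypothetical file; cues assumed coded numerically
X = trials[["spatial_cue", "semantic_cue", "age", "pta", "ali",
            "attended_tracking", "ignored_tracking"]]
y = trials["accuracy"]  # 1 = correct, 0 = incorrect

# Inner loop tunes the regularization strength C (inverse of lambda);
# the outer k=5 loop estimates out-of-sample predictive performance.
inner = GridSearchCV(
    make_pipeline(StandardScaler(),
                  LogisticRegression(penalty="l1", solver="liblinear", max_iter=5000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=StratifiedKFold(5, shuffle=True, random_state=0),
)
outer_auc = cross_val_score(inner, X, y, scoring="roc_auc",
                            cv=StratifiedKFold(5, shuffle=True, random_state=1))
print("outer-fold AUCs:", outer_auc.round(3))

# Refitting on all data with the selected C then shows which coefficients are
# shrunk to zero, i.e., which predictors are dropped by the variable selection.
```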
As shown in Figure 6c, the models of single-trial accuracy and response speed yielded overall high levels of predictive performance. We achieved a classification of accuracy with a mean AUC of 73.5 ± 0.019 %. This means that, in classifying correct and incorrect responses, a random trial with a correct response has a 73.5 % chance of being ranked higher by the prediction algorithm than a random incorrect trial. Single-trial prediction of response speed had an average prediction error (MSE) of 0.027 s⁻¹ ± 0.003 that was thus approximately within one (squared) standard deviation of the average response speed.
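A small toy check of the AUC interpretation given above: the AUC equals the probability that a randomly drawn correct trial is ranked above a randomly drawn incorrect trial. All values here are simulated.

```python
# Toy illustration: AUC as the probability that a random correct trial receives a
# higher predicted score than a random incorrect trial.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
y = rng.integers(0, 2, 2000)                  # 1 = correct, 0 = incorrect (toy labels)
scores = y * 0.8 + rng.normal(0, 1, 2000)     # toy predicted scores

auc = roc_auc_score(y, scores)
# Empirical check: proportion of (correct, incorrect) pairs ranked correctly.
pos, neg = scores[y == 1], scores[y == 0]
pairwise = (pos[:, None] > neg[None, :]).mean()
print(f"AUC = {auc:.3f}, pairwise ranking probability = {pairwise:.3f}")
```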
For each predicted behavioral outcome, we estimated one winning model based on the full dataset by performing LASSO regression with the regularization parameter λ value of the best-performing final model. Across the k=5 final models, overall higher degrees of regularization were applied to the regression models predicting response speed than to those predicting accuracy, indicating that predictive performance hinged on the effects of fewer variables (see Fig. 6c along with supplement 4 and Tables S23 and S24 for details).
As expected, the applied regularization resulted in the removal of several regressors from the winning predictive model. In line with the results of our explanatory analyses, we observed that predictors strongly correlated with changes in behavior were also consistently included in the final predictive models. As shown in Figure 6d for all direct effects (i.e., main effects), the spatial and semantic cue, as well as age and PTA, contributed most consistently to the prediction of behavioral outcomes. While the direct effects of neural filter dynamics were retained in almost all final models of accuracy, many of their associated higher-order interactive terms were removed (see Table S23). However, even though the direct (and some of the joint) effects of neural measures were included by the variable selection procedure, in the winning model fit to all available data they were still not significantly correlated with the behavioral outcome.
For response speed, the direct effects of neural filter dynamics were not included in any of the final models, that is, they did not contribute at all to the overall good predictive performance.
In sum, we conclude that in this large, age-varying sample of middle-aged to older adults, single-trial fluctuations in neural filtering make limited contributions to the prediction of accuracy and do not have any meaningful role at all in the prediction of single-trial response speed. Instead, the behavioral outcome was reliably predicted by the effects of the spatial cue and age, and to a lesser degree by the semantic cue, probed ear, and hearing loss.
Discussion

The present study has utilized the power of a large, representative sample of middle-aged and older listeners to explicitly address the question of how two different neural filter strategies, typically studied in isolation, jointly shape listening success.
In our dichotic listening task, we source-localized the electroencephalogram and observed systematic cue-driven changes within auditory cortex in alpha power lateralization and the neural tracking of attended and ignored speech.
These results do provide unprecedented large-sample support for their suggested roles as neural signatures of selective attention. Additionally, they address two important but still insufficiently answered questions: How do these two neural filter strategies relate to one another, and how do they impact listening success in a demanding real-life listening situation?
First, when related at the single-trial, single-subject level, we found the modulation of the two neural attentional filter mechanisms to be statistically independent, underlining their functional segregation and speaking to two distinct neurobiological implementations.
Second, an informative spatial-attention cue not only boosted both neural measures but also consistently boosted behavioral performance. However, we found changes at the neural and at the behavioral level to be generally unrelated. In fact, the effect of neural measures on single-trial task performance was reliably outweighed by the impact of age, hearing loss, or probed ear.
Finally, a cross-validated predictive model showed that, despite their demonstrated validity as measures of deployed attention, trial-by-trial neural filter states have a surprisingly limited role to play in predicting out-of-sample listening success.

What does the missing direct link from brain to behavior teach us?
With the present study, we have explicitly addressed the often overlooked question of how trial-by-trial neural dynamics impact behavior, here single-trial listening success [46-49]. The observed cue-driven modulation of behavior, alpha lateralization, and the tracking of attended and ignored speech allowed us to investigate this question.
Notably, despite an unprecedentedly large sample of almost 160 listeners; an adequate number of within-subject trials; and a sophisticated linear-model approach, we did not find any evidence for a direct or indirect (i.e., moderating) influence of alpha power lateralization or of the neural tracking of the attended or ignored sentence on behavior (see Fig. 6).
At first glance, the glaring absence of any brain–behavior relationship may seem puzzling given the reliable spatial-cue effects on both behavior and neural measures. However, previous studies have provided only tentative support for the behavioral relevance of the two neural filter solutions, and the observed brain–behavior relations do not always converge.
The available evidence for the behavioral relevance of selective speech
638
tracking is generally sparse given the emphasis on more complex naturalistic
639
language stimuli [50,51]. While these stimuli provide a window onto the most
640
natural forms of speech comprehension, they do not afford fine-grained measures
641
of listening behavior. This makes it particularly challenging to establish a direct link
642
of differential speech tracking with behavior [20,21,23,52-55]. Nevertheless, there
643
is preliminary evidence for the behavioral benefit of a more pronounced cortical
644
representation of to-be-attended speech [22].
645
Previous studies on lateralized alpha oscillations suggest that their
646
functional relevance is complicated by a number of different factors.
647
First of all, most of the evidence associating increased levels of alpha power
648
lateralization with better task performance in spatial-attention tasks stems from
649
studies on younger adults [9,12,56,57]. By contrast, the establishment of such a
650
consistent relationship in middle-age and older adults is obscured by considerable
651
variability in age-related changes at the neural (and to some extent also behavioral)
652
level [11,58-61]. We here found the fidelity of alpha power lateralization unchanged
653
with age (see also [11]. Other studies on auditory attention have observed
654
diminished and less sustained alpha power lateralization for older compared to
655
younger adults that were to some extend predictive of behavior [58,62].
656
Second, previous findings and their potential interpretation differ along (at
657
least) two dimensions: (i) whether the functional role of alpha power lateralization
658
during attention cueing, stimulus anticipation, or stimulus presentation is studied
659
[6,58,63], and (ii) whether the overall strength of power lateralization or its
660
stimulus-driven rhythmic modulation is related to behavior [cf. 9,11]. Depending
661
on these factors, the observed brainbehavior relations may relate to different top-
662
down and bottom-up processes of selective auditory attention.
663
Third, as shown in a recent study by Wöstmann et al. [63] on younger adults, the neural processing of target and distractor is supported by two uncorrelated lateralized alpha responses emerging from different neural networks. Notably, their results provide initial evidence for the differential behavioral relevance of neural responses related to target selection and distractor suppression.
Lastly, the present study deliberately aimed at establishing a brain–behavior relationship at the most fine-grained trial-by-trial, single-subject level. However, many of the previous studies relied on coarsely binned data (e.g., correct vs. incorrect trials) or examined relationships at the between-subjects level, that is, from one individual to another. By contrast, effects observed at the single-trial level are generally considered the most compelling evidence that cognitive neuroscience is able to provide [64].
In sum, the current large sample with a demonstrably good signal-to-noise ratio also at the trial-to-trial or within-subjects level is unable to discern either form of relationship. This result should give some pause to the translational ambition of manipulating neural filter mechanisms as a way to improve speech comprehension.
How important to auditory attention is the neural representation of ignored speech?
Early studies of neural speech tracking in multi-talker scenarios have particularly emphasized the prioritized cortical representation of attended speech as a means to selectively encode behaviorally relevant information [20–23,65]. Based on this view, selective attention to one of our two dichotically presented sentences should primarily lead to an increase in the neural tracking of attended speech and, if anything, a decrease in the neural tracking of ignored speech.
However, our findings only partially agree with this hypothesis: in line with previous results, under selective attention, we observed a significantly stronger tracking of attended compared to ignored speech. At the same time, however, cortical responses during later processing stages (> 300 ms) suggest a heightened and diverging neural representation of to-be-ignored speech (see Fig. 4b). This was mirrored by an increase in the neural tracking strength of both attended and ignored speech in selective-attention (compared to divided-attention) trials.
How might the increased neural representation of ignored sounds impact the fidelity of auditory selective attention? At least two scenarios seem possible: on the one hand, stronger neural responses to irrelevant sounds could reflect a leaky attentional filter that fails to sufficiently suppress distracting information. On the other hand, it may reflect the implementation of a better-calibrated, “late-selection” filter for ignored speech that selectively gates information flow [66,67].
Largely compatible with our findings, a recent source-localized EEG study [40] provided evidence for active suppression of to-be-ignored speech via its enhanced cortical representation. Importantly, this amplification in the neural representation of to-be-ignored speech was found for the most challenging condition, in which the distracting talker was presented at a higher sound level than the attended talker and was thus perceptually more dominant.
This finding allows for the testable prediction that the engagement of additional neural filter strategies supporting the suppression of irrelevant sounds is triggered by particularly challenging listening situations. Several key characteristics of our study may have instantiated such a scenario: (i) both sentences were spoken by the same female talker, (ii) had a similar sentence structure and thus prosodic contour, (iii) were presented against the background of speech-shaped noise at 0 dB SNR, and (iv) were of relatively short duration (~2.5 s). It thus seems plausible that the successful deployment of auditory selective attention in our task may have relied on the recruitment of additional filter strategies that depend on the neural fate of both attended and ignored speech [68–72].
Is the modulation of lateralized alpha power functionally connected to the selective neural tracking of speech?
In our study, we investigated attention-related changes in alpha power dynamics and in the neural tracking of speech signals that (i) involve neurophysiological signals operating in different frequency regimes, (ii) are assumed to support auditory attention by different neural strategies, and (iii) are typically studied in isolation [3,19]. Here, when focusing on changes in neural activity within auditory areas, we found both neural filter strategies to be impacted by the same spatial-attention cue.
This raises the question of whether changes in alpha power, assumed to reflect a neural mechanism of controlled inhibition, and changes in neural speech tracking, assumed to reflect a sensory-gain mechanism, are functionally connected signals [16,17]. Indeed, there is preliminary evidence suggesting that these two neural filter strategies may exhibit a systematic relationship [6,11,27,28,73]. However, our fine-grained trial-by-trial analysis revealed independent modulation of alpha power and neural speech tracking. Therefore, the present results speak against a consistent, linear relationship of the two neural filter strategies. We see instead compelling evidence for the coexistence of two complementary but independent neural solutions to the implementation of auditory selective attention for the purpose of speech comprehension.
Our results are well in line with recent reports of independent variation in alpha-band activity and steady-state (SSR) or frequency-following responses (FFR) in studies of visual spatial attention [25,26,74]. Additionally, the inclusion of single-trial alpha power lateralization as an additional training feature in a recent speech-tracking study failed to improve the decoding of attention [75].
The observed independence of neural filters may be explained by our focus on their covariation within auditory cortex, while the controlled inhibition of irrelevant information via alpha power modulation typically engages areas in parietal-occipital cortex. However, our control analyses relating neural speech tracking to the modulation of alpha power in those areas did not find strong support for an interplay of neural filters, either.
These findings also agree with results from our recent fMRI study on the same listening task and a subsample of the present participant cohort [30]. The graph-theoretic analysis of cortex-wide networks revealed a network-level adaptation to the listening task that was characterized by increased local processing and higher segregation. Crucially, we observed a reconfigured composition of brain networks that involved auditory and superior temporal areas generally associated with the neural tracking of speech, but not those typically engaged in the controlled inhibition of irrelevant information via top-down regulation of alpha power [76–79].
Taken together with findings from previous electrophysiological studies [6,27,28], our results provide strong evidence in favor of a general functional trade-off between attentional filter mechanisms [80]. However, we suggest that the precise nature of this interplay of attentional filter mechanisms hinges on a number of factors, such as the particular task demands, and potentially on the level of temporal and/or spatial resolution at which the two neural signatures are examined [28].
Conclusion
In a large, representative sample of adult listeners, we have provided evidence that single-trial listening success in a challenging, dual-talker acoustic environment cannot be easily explained by the individual (or joint) influence of two independent neurobiological filter solutions: alpha power lateralization and neural speech tracking. Supporting their interpretation as neural signatures of spatial attention, we find both neural filter strategies engaged following a spatial-attention cue. However, there was no direct link between neural and ensuing behavioral effects. Our findings highlight the level of complexity associated with establishing a stable brain–behavior relationship in the deployment of auditory attention. They should also temper simplified accounts of the predictive power of neural filter strategies for behavior, as well as translational neurotechnological advances that build on them.
Materials and Methods
Participants and procedure
A total of N = 155 right-handed German native speakers (median age 61 years; age range 39–80 years; 62 males; see Fig. 1–supplement 1 for the age distribution) were included in the analysis. All participants are part of a large-scale study on the neural and cognitive mechanisms supporting adaptive listening behavior in healthy middle-aged and older adults (“The listening challenge: How ageing brains adapt (AUDADAPT)”; https://cordis.europa.eu/project/rcn/197855_en.html). Handedness was assessed using a translated version of the Edinburgh Handedness Inventory [81]. All participants had normal or corrected-to-normal vision, did not report any neurological, psychiatric, or other disorders, and were screened for mild cognitive impairment using the German version of the 6-Item Cognitive Impairment Test (6CIT [82]).
As part of our large-scale study on adaptive listening behavior in healthy aging adults, participants also underwent a session consisting of a general screening procedure, detailed audiometric measurements, and a battery of cognitive tests and personality profiling (see ref. [11] for details). This session always preceded the EEG recording session. Only participants with normal hearing or age-adequate mild-to-moderate hearing loss were included (see Fig. 1–supplement 2 for individual audiograms). As part of this screening procedure, an additional 17 participants were excluded prior to EEG recording due to non-age-related hearing loss or a medical history. Three participants dropped out of the study prior to EEG recording, and an additional nine participants were excluded from analyses after EEG recording: three due to incidental findings after structural MR acquisition, and six due to technical problems during EEG recording or overall poor EEG data quality. Participants gave written informed consent and received financial compensation (8€ per hour). Procedures were approved by the ethics committee of the University of Lübeck and were in accordance with the Declaration of Helsinki.
Dichotic listening task
In a recently established [30] linguistic variant of a classic Posner paradigm [29], participants listened to two competing, dichotically presented sentences. They were probed on the sentence-final word in one of the two sentences. Critically, two visual cues preceded auditory presentation. First, a spatial-attention cue either indicated the to-be-probed ear, thus invoking selective attention, or did not provide any information about the to-be-probed ear, thus invoking divided attention. Second, a semantic cue specified a general or a specific semantic category for the final word of both sentences, thus allowing listeners to utilize a semantic prediction. Cue levels were fully crossed in a 2×2 design and the presentation of cue combinations varied on a trial-by-trial level (Fig. 2a). The trial structure is exemplified in Figure 2b. Details on stimulus construction and recording can be found in the Supplemental Information.
Each trial started with the presentation of a fixation cross in the middle of the screen (jittered duration: mean 1.5 s, range 0.5–3.5 s). Next, a blank screen was shown for 500 ms, followed by the presentation of the spatial cue in the form of a circle segmented equally into two lateral halves. In selective-attention trials, one half was black, indicating the to-be-attended side, while the other half was white, indicating the to-be-ignored side. In divided-attention trials, both halves appeared in grey. After a blank screen of 500 ms duration, the semantic cue was presented in the form of a single word that specified the semantic category of both sentence-final words. The semantic category could either be given at a general (natural vs. man-made) or specific level (e.g., instruments, fruits, furniture) and thus provided different degrees of semantic predictability. Each cue was presented for 1000 ms. After a 500 ms blank-screen period, the two sentences were presented dichotically along with a fixation cross displayed in the middle of the screen. Finally, after a jittered retention period, a visual response array appeared on the left or right side of the screen, presenting four word choices. The location of the response array indicated which ear (left or right) was probed. Participants were instructed to select the final word presented on the to-be-attended side using the touch screen. Among the four alternatives were the two actually presented nouns as well as two distractor nouns from the same cued semantic category. Note that because the semantic cue applied to all four alternative words, it could not be used to infer the to-be-attended sentence-final word post hoc.
Stimulus presentation was controlled by PsychoPy [83]. The visual scene was displayed using a 24” touch screen (ViewSonic TD2420) positioned within an arm’s length. Auditory stimulation was delivered using in-ear headphones (EARTONE 3A) at a sampling rate of 44.1 kHz. Following instructions, participants performed a few practice trials to familiarize themselves with the listening task. To account for differences in hearing acuity within our group of participants, individual hearing thresholds for a 500-ms fragment of the dichotic stimuli were measured using the method of limits. All stimuli were presented 50 dB above the individual sensation level. During the experiment, each participant completed 60 trials per cue–cue combination, resulting in 240 trials in total. The cue conditions were equally distributed across six blocks of 40 trials each (~10 min) and were presented in random order. Participants took short breaks between blocks.
Behavioral data analysis
We evaluated participants’ behavioral performance in the listening task with respect to accuracy and response speed. For the binary measure of accuracy, we excluded trials in which participants failed to answer within the given 4-s response window (‘timeouts’). Spatial stream confusions, that is, trials in which the sentence-final word of the to-be-ignored speech stream was selected, and random errors were jointly classified as incorrect answers. The analysis of response speed, defined as the inverse of reaction time, was based on correct trials only. Single-trial behavioral measures were subjected to (generalized) linear mixed-effects analysis and regularized regression (see Statistical analysis).
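For illustration only, the minimal R sketch below mirrors this scoring logic; it is not the authors' analysis code, and the trial table, column names, and example word stimuli are hypothetical.

# Illustrative R sketch: single-trial accuracy and response speed
trials <- data.frame(
  rt       = c(1.2, 0.9, NA, 2.1, 1.4, 0.7),                    # reaction time in s
  response = c("Geige", "Apfel", NA, "Tisch", "Birne", "Harfe"), # chosen word
  target   = c("Geige", "Birne", "Tisch", "Tisch", "Birne", "Harfe"),
  timeout  = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
)
behav <- subset(trials, !timeout)                                # drop 'timeouts'
behav$accuracy <- as.integer(behav$response == behav$target)     # stream confusions and random errors both count as incorrect
behav$speed    <- ifelse(behav$accuracy == 1, 1 / behav$rt, NA)  # response speed = inverse RT, correct trials only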
EEG data acquisition and preprocessing
Participants were seated comfortably in a dimly lit, sound-attenuated recording booth where we recorded their EEG from 64 active electrodes mounted to an elastic cap (Ag/AgCl; ActiCap / ActiChamp, Brain Products, Gilching, Germany). Electrode impedances were kept below 30 kΩ. The signal was digitized at a sampling rate of 1000 Hz and referenced on-line to the left mastoid electrode (TP9; ground: AFz). Before task instruction, 5 min of eyes-open and 5 min of eyes-closed resting-state EEG were recorded from each participant.
For subsequent off-line EEG data analyses, we used the EEGlab [84] (version 14_1_1b) and FieldTrip [85] (version 2016-06-13) toolboxes, together with customized Matlab scripts. Independent component analysis (ICA) using EEGlab’s default runica algorithm was used to remove all non-brain signal components, including eye blinks and lateral eye movements, muscle activity, heartbeats, and single-channel noise. Prior to ICA, EEG data were re-referenced to the average of all EEG channels (average reference). Following ICA, trials during which the amplitude of any individual EEG channel exceeded a range of 200 microvolts were removed.
EEG data analysis
Sensor-level analysis
The preprocessed continuous EEG data were high-pass-filtered at 0.3 Hz (finite impulse response (FIR) filter, zero-phase lag, order 5574, Hann window) and low-pass-filtered at 180 Hz (FIR filter, zero-phase lag, order 100, Hamming window). The EEG was cut into epochs of −2 to 8 s relative to the onset of the spatial-attention cue to capture cue presentation as well as the entire auditory stimulation interval.
For the analysis of changes in alpha power, EEG data were down-sampled to fs = 250 Hz. Spectro-temporal estimates of the single-trial data were then obtained for a time window of −0.5 to 6.5 s (relative to the onset of the spatial-attention cue) at frequencies ranging from 8 to 12 Hz (Morlet’s wavelets; number of cycles = 6).
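As an illustration of the underlying principle only (the actual computation used FieldTrip), the following R sketch estimates single-trial alpha power for one channel or parcel by convolution with a 6-cycle complex Morlet wavelet; the toy signal and all settings beyond those stated above are assumptions.

# Illustrative R sketch: alpha power via a complex Morlet wavelet (6 cycles)
morlet_power <- function(x, fs, freq, n_cycles = 6) {
  sigma_t <- n_cycles / (2 * pi * freq)        # temporal SD of the Gaussian envelope
  half    <- round(3 * sigma_t * fs)           # wavelet support: +/- 3 SD
  t       <- (-half:half) / fs
  wav     <- exp(2i * pi * freq * t) * exp(-t^2 / (2 * sigma_t^2))
  wav     <- wav / sum(Mod(wav))               # simple amplitude normalization
  m <- -half:half
  pow <- numeric(length(x))
  for (k in seq_along(x)) {                    # 'same'-length wavelet transform at each sample
    idx  <- k + m
    keep <- idx >= 1 & idx <= length(x)
    pow[k] <- Mod(sum(x[idx[keep]] * Conj(wav[m[keep] + half + 1])))^2
  }
  pow
}

# toy example: noisy 10-Hz oscillation sampled at 250 Hz, averaged over 8-12 Hz
fs   <- 250
time <- seq(0, 7, by = 1 / fs)
sig  <- sin(2 * pi * 10 * time) + rnorm(length(time), sd = 0.5)
alpha_pow <- rowMeans(sapply(8:12, function(f) morlet_power(sig, fs, f)))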
For the analysis of the neural encoding of speech by low-frequency activity, the continuous preprocessed EEG data were down-sampled to fs = 125 Hz and filtered between fc = 1 and 8 Hz (FIR filters, zero-phase lag, order: 8fs/fc and 2fs/fc, Hamming window). The EEG was cut to yield individual epochs covering the presentation of the auditory stimuli, beginning at noise onset and lasting until the end of auditory presentation. Individual epochs were z-scored prior to submitting them to regression analysis (see below).
EEG source and forward model construction
Twenty-seven participants had completed functional and structural magnetic resonance imaging (MRI) while performing the same experiment in a separate session. For these participants, individual EEG source and forward models were created on the basis of each participant’s T1-weighted MRI image (Siemens MAGNETOM Skyra 3T scanner; 1-mm isotropic voxels). The T1 images of one female and one male participant were used as templates for the remaining participants.
First, anatomical images were resliced (256 × 256 × 256 voxels) and the CTF convention was imposed as their coordinate system. Next, the cortical surface of each individual (or template) T1 image was constructed using the FreeSurfer function recon-all. The result was used to generate a cortical mesh in accordance with the Human Connectome Project (HCP) standard atlas template (available at https://github.com/Washington-University/HCPpipelines). We used FieldTrip (ft_postfreesurferscript.sh) to create the individual cortical mesh encompassing 4002 grid points per hemisphere. This representation provided the individual EEG source model geometry. To generate individual head and forward models, we segmented T1 images into three tissue types (skull, scalp, brain; FieldTrip function ft_volumesegment).
Subsequently, the forward model (volume conduction) was estimated using the boundary element method in FieldTrip (‘dipoli’ implementation). Next, we optimized the fit of the digitized EEG channel locations (xensor digitizer, ANT Neuro) to each individual’s head model in the CTF coordinate system using rigid-body transformation and additional interactive alignment. Finally, the lead-field matrix for each channel × source pair was computed using the source and forward model per individual.
Beamforming
Source reconstruction (inverse solution) was achieved using a frequency-domain beamforming approach, namely partial canonical coherence (PCC) [86]. To this end, we concatenated 5-min segments of eyes-open resting-state data and a random 5-min segment of task data and calculated the cross-spectral density using a broad-band filter centered at 15 Hz (bandwidth = 14 Hz). The result was then used together with the source and forward model to obtain a common spatial filter per individual in FieldTrip (regularization parameter: 5 %; dipole orientation: axis of most variance using singular value decomposition).
Source projection of sensor data
For the source-level analysis, sensor-level single-trial data in each of our two analysis routines were projected to source space by matrix multiplication of the spatial filter weights. To increase the signal-to-noise ratio in source estimates and to computationally facilitate source-level analyses, source-projected data were averaged across grid points per cortical area defined according to the HCP functional parcellation template ([87]; similar to [88]). This parcellation provides a symmetrical delineation of each hemisphere into 180 parcels, for a total of 360 parcels.
Regions of interest
We constrained the analysis of neural measures to an a priori defined auditory region of interest (ROI) as well as two control ROIs in the inferior parietal lobule and visual cortex, respectively. Regions of interest were constructed by selecting a bilaterally symmetric subset of functional parcels (see Figure 3–supplement 1). Following the notation used in ref. [87], the auditory ROI encompassed eight parcels per hemisphere covering the primary auditory cortex, lateral, medial, and parabelt complex, A4 and A5 complex that extended laterally into the posterior portion of the superior temporal gyrus (Brodmann area (BA) 22). It also included parts of the retroinsular and parainsular cortex (BA52) that lie at the boundary of temporal lobe and insula. The inferior parietal (IPL) ROI consisted of six parcels per hemisphere that included the angular gyrus (BA39), supramarginal gyrus (BA40), as well as the anterior lateral bank of the intraparietal sulcus. The visual ROI consisted of four parcels that encompassed the areas V1 (BA17), V2 (BA18), as well as V3 and V4 (BA19).
Attentional modulation of alpha power
Absolute source power was calculated as the squared amplitude of the spectro-temporal estimates. Since oscillatory power values typically follow a highly skewed, non-normal distribution, we applied a nonlinear transformation of the Box-Cox family (power_trans = (power^p − 1)/p with p = 0.5) to minimize skewness and to satisfy the assumption of normality for parametric statistical tests involving oscillatory power values [89]. To quantify attention-related changes in 8–12 Hz alpha power, per ROI, we calculated the single-trial, temporally resolved alpha lateralization index as follows [9,11,12]:

ALI = (α-power_ipsi − α-power_contra) / (α-power_ipsi + α-power_contra)    (1)

To account for overall hemispheric power differences that were independent of attention modulation, we first normalized single-trial power by calculating, per parcel and frequency, the whole-trial (0.5–6.5 s) power averaged across all trials and subtracting it from single trials. We then used a robust variant of the index that applies the inverse logit transform [1 / (1 + exp(−x))] to both inputs to scale them into a common, positive-only [0;1]-bound space prior to index calculation.
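The following R sketch illustrates this index computation under simplifying assumptions; it is not the original code, the trial-average normalization is collapsed across the whole matrix rather than computed per parcel and frequency, and the input power matrices are hypothetical.

# Illustrative R sketch: robust single-trial alpha lateralization index (ALI)
boxcox_p  <- function(x, p = 0.5) (x^p - 1) / p      # skewness-reducing Box-Cox transform
inv_logit <- function(x) 1 / (1 + exp(-x))           # maps inputs into (0, 1)

ali_single_trial <- function(pow_ipsi, pow_contra, p = 0.5) {
  ti <- boxcox_p(pow_ipsi, p); tc <- boxcox_p(pow_contra, p)
  ti <- ti - mean(ti); tc <- tc - mean(tc)           # simplified across-trial normalization
  li <- inv_logit(ti); lc <- inv_logit(tc)
  (li - lc) / (li + lc)                              # Eq. (1), robust variant
}

# toy example: 100 trials x 500 time points of positive 'power' values
set.seed(1)
pow_i <- matrix(rexp(100 * 500, rate = 1.0), 100, 500)
pow_c <- matrix(rexp(100 * 500, rate = 1.2), 100, 500)
ali   <- ali_single_trial(pow_i, pow_c)              # trial x time ALI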
For visualization and statistical analysis of cue-driven neural modulation, we then averaged the ALI across all parcels within the auditory ROI and extracted single-trial mean values for the time window of sentence presentation (3.5–6.5 s), and treated them as the dependent measure in linear mixed-effects analysis. In addition, they served as continuous predictors in the statistical analysis of brain–behavior relationships (see below). We performed additional control analyses that focused on the ALI during sentence presentation in the inferior parietal and visual ROIs, as well as on the auditory ALI during the presentation of the spatial cue (0–1 s) and the sentence-final word (5.5–6.5 s).
Extraction of envelope onsets
From the presented speech signals, we derived a temporal representation of the acoustic onsets in the form of the onset envelope [40]. To this end, using the NSL toolbox [90], we first extracted an auditory spectrogram of the auditory stimuli (128 spectrally resolved sub-band envelopes logarithmically spaced between 90 and 4000 Hz), which were then summed across frequencies to yield a broad-band temporal envelope. Next, the output was down-sampled and low-pass filtered to match the specifics of the EEG. To derive the final onset envelope to be used in linear regression, we first obtained the first derivative of the envelope and set negative values to zero (half-wave rectification) to yield a temporal signal with positive-only values reflecting the acoustic onsets (see Fig. 4A and figure supplements).
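A minimal R sketch of this onset-envelope derivation is given below for illustration; it assumes a precomputed broad-band envelope and uses crude decimation in place of the proper low-pass filtering and resampling described above.

# Illustrative R sketch: onset envelope from a broad-band amplitude envelope
onset_envelope <- function(env, fs_in, fs_out = 125) {
  dec   <- round(fs_in / fs_out)
  env_d <- env[seq(1, length(env), by = dec)]   # crude down-sampling to the EEG rate
  d     <- c(0, diff(env_d))                    # first derivative
  pmax(d, 0)                                    # half-wave rectification: keep onsets only
}

# toy example: an amplitude-modulated stand-in envelope sampled at 1000 Hz
fs  <- 1000
t   <- seq(0, 2.5, by = 1 / fs)
env <- abs(sin(2 * pi * 3 * t)) + 0.1
ons <- onset_envelope(env, fs_in = fs)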
Estimation of envelope reconstruction models
To investigate how low-frequency (i.e., < 8 Hz) fluctuations in EEG activity relate to the encoding of attended and ignored speech, we used a linear regression approach to describe the mapping from the presented speech signal to the resulting neural response [91,92]. More specifically, we trained stimulus reconstruction models (also termed decoding or backward models) to predict the onset envelope of the attended and ignored speech stream from the EEG. In this analysis framework, a linear reconstruction model g is assumed to represent the linear mapping from the recorded EEG signal, r(t, n), to the stimulus features, s(t):

ŝ(t) = Σ_n Σ_τ g(τ, n) r(t + τ, n)    (2)
where ŝ(t) is the reconstructed onset envelope at time point t. We used all parcels within the bilateral auditory ROI and time lags τ in the range of −100 ms to 500 ms to compute envelope reconstruction models using ridge regression [93]:

g = (R^T R + λ m I)^−1 R^T s    (3)
where R is a matrix containing the sample-wise time-lagged replication of the neural response matrix r, λ is the ridge parameter for regularization, m is a scalar representing the mean of the trace of R^T R [94], and I is the identity matrix. We followed the procedures described in ref. [95] to estimate the optimal ridge parameter.
Compared to linear forward (‘encoding’) models that derive temporal response functions (TRFs) independently for each EEG channel or source, stimulus reconstruction models represent multivariate impulse response functions that exploit information from all time lags and EEG channels/sources simultaneously. To allow for a neurophysiological interpretation of backward model coefficients, we additionally transformed them into linear forward model coefficients following the inversion procedure described in ref. [96]. All analyses were performed using the multivariate temporal response function (mTRF) toolbox [91] (version 1.5) for Matlab.
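For illustration, the following R sketch implements the backward-model logic of Eqs. (2) and (3) directly; the actual analysis used the mTRF toolbox in Matlab. The lag range follows the text, but the regularization value and the toy data are assumptions.

# Illustrative R sketch: ridge-regularized stimulus reconstruction (backward model)
lag_design <- function(eeg, lags) {
  # build R: one time-shifted copy of every source channel per lag (Eq. 2)
  n <- nrow(eeg)
  do.call(cbind, lapply(lags, function(l) {
    shifted <- matrix(0, n, ncol(eeg))
    if (l >= 0) shifted[1:(n - l), ] <- eeg[(1 + l):n, ]
    else        shifted[(1 - l):n, ] <- eeg[1:(n + l), ]
    shifted
  }))
}

train_decoder <- function(eeg, stim, fs = 125, lambda = 1) {
  lags <- seq(round(-0.1 * fs), round(0.5 * fs))     # -100 ... 500 ms in samples
  R <- lag_design(eeg, lags)
  m <- mean(diag(crossprod(R)))                      # m: mean of the diagonal of R'R (cf. Eq. 3)
  g <- solve(crossprod(R) + lambda * m * diag(ncol(R)), crossprod(R, stim))
  list(g = g, lags = lags)
}

reconstruct <- function(decoder, eeg) {
  drop(lag_design(eeg, decoder$lags) %*% decoder$g)  # s_hat(t), Eq. (2)
}

# toy usage (training and testing on the same data, purely for illustration)
eeg  <- matrix(rnorm(500 * 16), 500, 16)             # 16 pseudo-sources, z-scored in practice
stim <- as.numeric(scale(rnorm(500)))                # stand-in onset envelope
dec  <- train_decoder(scale(eeg), stim)
r_toy <- cor(reconstruct(dec, scale(eeg)), stim)     # reconstruction accuracy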
Prior to model estimation, we split the data based on the two spatial-attention conditions (selective vs. divided), resulting in 120 trials per condition. Envelope reconstruction models were trained on selective-attention single-trial data only. Two different backward models were estimated for a given trial: an envelope reconstruction model for the to-be-attended speech stream (short: attended reconstruction model), and one for the to-be-ignored speech stream (short: ignored reconstruction model). Reconstruction models for attended and ignored speech signals were trained separately for attend-left and attend-right trials, which yielded 120 single-trial decoders (60 attended, 60 ignored) per attentional setting. For illustrative purposes, we averaged the forward transformations of attended and ignored speech per side across all participants (Fig. 4B).
Evaluation of envelope reconstruction accuracy
To quantify how strongly the attended and ignored sentences were tracked by slow cortical dynamics, at the single-subject level, we reconstructed the attended and ignored envelope of a given trial using a leave-one-out cross-validation procedure. Following this approach, the envelopes of each trial were reconstructed using the averaged reconstruction models trained on all but the tested trial. For a given trial, we only used the trained models that corresponded to the respective cue condition (i.e., for an attend-left/ignore-right trial we only used the reconstruction models trained on the respective trials). The reconstructed onset envelope obtained from each model was then compared to the two onset envelopes of the actually presented speech signals. The resulting Pearson correlation coefficients, r_attended and r_ignored, reflect the single-trial neural tracking strength or reconstruction accuracy [23] (see Figure 4–figure supplements 1 and 2).
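The leave-one-out logic can be sketched in R as follows; this is illustrative only, and 'decoders', 'eeg_lagged', and the envelope lists are hypothetical stand-ins for the single-trial models and data of one attentional condition.

# Illustrative R sketch: leave-one-out evaluation of one decoder type
loo_tracking <- function(decoders, eeg_lagged, env_att, env_ign) {
  n <- length(decoders)
  out <- data.frame(r_attended = numeric(n), r_ignored = numeric(n))
  for (k in seq_len(n)) {
    g_avg <- Reduce(`+`, decoders[-k]) / (n - 1)   # average the models trained on all but trial k
    s_hat <- drop(eeg_lagged[[k]] %*% g_avg)       # reconstructed onset envelope for trial k
    out$r_attended[k] <- cor(s_hat, env_att[[k]])  # compare with both presented envelopes
    out$r_ignored[k]  <- cor(s_hat, env_ign[[k]])
  }
  out
}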
We proceeded in a similar fashion for divided-attention trials. Since these trials could not be categorized based on the to-be-attended and -ignored side, we split them based on the ear that was probed at the end of the trial. Given that even in the absence of a valid attention cue, participants might still (randomly) focus their attention on one of the two streams, we wanted to quantify how strongly the probed and unprobed envelopes were tracked neurally. To this end, we used the reconstruction models trained on selective-attention trials to reconstruct the onset envelopes of divided-attention trials. Sentences presented in probed-left/unprobed-right trials were reconstructed using the attend-left/ignore-right reconstruction models, while probed-right/unprobed-left trials used the reconstruction models trained on attend-right/ignore-left trials.
One of the goals of the present study was to investigate the degree to which the neural tracking of the attended and ignored envelopes would be influenced by spatial attention and semantic predictability, and how these changes would relate to behavior. We thus used the neural tracking strength of the attended envelope (when reconstructed with the attended reconstruction model) and that of the ignored envelope (when reconstructed with the ignored reconstruction model) as variables in our linear mixed-effects analyses (see below).
Decoding accuracy
To evaluate how accurately we could decode an individual’s focus of attention, we separately evaluated the performance of the reconstruction models for the attended and the ignored envelope. When the reconstruction was performed with the attended reconstruction models, the focus of attention was correctly decoded if the correlation with the attended envelope was greater than that with the ignored envelope. Conversely, for reconstructions performed with the ignored reconstruction models, a correct identification of attention required the correlation with the ignored envelope to be greater than that with the attended envelope (see Fig. 4–supplement 1). To quantify whether the single-subject decoding accuracy was significantly above chance, we used a binomial test with α = 0.05.
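In R, such a single-subject test could look as follows; the trial counts are hypothetical and the one-sided alternative is an assumption, not a detail given above.

# Illustrative R sketch: decoding accuracy against chance (50 %)
n_trials  <- 120                      # selective-attention trials per participant
n_correct <- 74                       # trials on which attention was decoded correctly (hypothetical)
binom.test(n_correct, n_trials, p = 0.5, alternative = "greater")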
Statistical analysis
We used (generalized) linear mixed-effects models to investigate the influence of the experimental cue conditions (spatial cue: divided vs. selective; semantic cue: general vs. specific) on behavior (see Q1 in Fig. 1) as well as on our neural measures of interest (Q2). Finally, we investigated the relationship of the neural measures (Q3) and the (joint) influence of the different cue conditions and neural measures on behavior (Q4). Using linear mixed-effects models allowed us to model and control for the impact of various additional covariates known to influence behavior and/or neural measures. These included the probed ear (left/right), whether the later-probed sentence had the earlier onset (yes/no), as well as participants’ age and hearing acuity (pure-tone average across both ears).
Model selection
To avoid known problems associated with a largely data-driven stepwise model selection, which include the overestimation of coefficients [97,98] or the selection of irrelevant predictors [99], our selection procedure was largely constrained by a priori defined statistical hypotheses. The influence of visual cues and of neural measures was tested in the same brain–behavior model.
At the fixed-effects level, these models included the main effects of both cue types, auditory alpha lateralization during sentence presentation, separate regressors for the neural tracking of attended and ignored speech, as well as of probed ear, participants’ age, PTA, and the difference in the onset of the attended and ignored sentence. In line with the specifics of the experimental design and derived hypotheses, we included additional 2-way and 3-way interactions to model how behavioral performance was affected by the joint influence of visual cues, and by cue-driven changes in neural measures (see Tables S1 and S2 for the full list of included parameters). The brain–behavior models of accuracy and response speed included random intercepts by subject and item. In a data-driven manner, we then tested whether model fit could be improved by the inclusion of by-subject random slopes for the effect of the spatial-attention cue, semantic cue, or probed ear. The change in model fit was assessed using likelihood ratio tests on nested models.
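A schematic lme4 sketch of this nested-model comparison is shown below; the data are simulated and the variable names are hypothetical, not the actual model terms listed in Tables S1–S7.

# Illustrative R sketch: likelihood ratio test for a by-subject random slope
library(lme4)
set.seed(2)
d <- data.frame(
  subject      = factor(rep(1:20, each = 24)),
  item         = factor(rep(1:24, times = 20)),
  spatial_cue  = factor(rep(c("divided", "selective"), 240)),
  semantic_cue = factor(rep(c("general", "specific"), each = 2, times = 120)),
  accuracy     = rbinom(480, 1, 0.8)
)
m0 <- glmer(accuracy ~ spatial_cue * semantic_cue + (1 | subject) + (1 | item),
            data = d, family = binomial)
m1 <- update(m0, . ~ . - (1 | subject) + (1 + spatial_cue | subject))
anova(m0, m1)   # likelihood ratio test: does the by-subject random slope improve fit?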
The set of models that analyzed how neural measures changed as a function of visual cue information included main effects of visual cues and probed ear, as well as the same set of additional covariates. We also included the 2-way interactions of spatial cue and semantic cue, as well as of spatial cue and probed ear (see Tables S3–S5 for the full list of included parameters). Due to convergence issues, we did not include a random intercept by item but otherwise followed the same procedure for the inclusion of random slopes as described above.
Lastly, models that probed the interdependence of neural filter mechanisms included the neural tracking of attended or ignored speech as the dependent variable and alpha power, spatial cue, and their interactions as predictors (see Tables S6 and S7 for details). The selection of the random-effects structure followed the same principles as for the cue-driven neural modulation models described in the paragraph above.
Deviation coding was used for categorical predictors. All continuous variables were z-scored. For the dependent measure of accuracy, we used generalized linear mixed-effects models (binomial distribution, logit link function). For response speed, we used general linear mixed-effects models (Gaussian distribution, identity link function). P-values for individual model terms in the linear models are based on t-values and use the Satterthwaite approximation for degrees of freedom [100]. P-values for generalized linear models are based on z-values and asymptotic Wald tests. In lieu of a standardized measure of effect size for mixed-effects models, we report odds ratios (OR) for generalized linear models and standardized regression coefficients (β) for linear models along with their respective standard errors (SE). Given the large number of individual terms included in the models, all reported p-values are corrected to control for the false discovery rate [101].
All analyses were performed in R ([102]; version 3.6.1) using the packages lme4 [103] and sjPlot [104].
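The following sketch illustrates the general structure of such a brain–behavior model in lme4, building on the simulated data frame d from the sketch above; the predictors, coding, and FDR step mirror the description, but all variables are hypothetical and the added neural regressors are random numbers.

# Illustrative R sketch: brain-behavior GLMM with deviation coding and z-scored covariates
library(lme4)
d$ali       <- rnorm(nrow(d))                  # single-trial alpha lateralization index (hypothetical)
d$track_att <- rnorm(nrow(d))                  # neural tracking of attended speech (hypothetical)
d$track_ign <- rnorm(nrow(d))                  # neural tracking of ignored speech (hypothetical)
d$age       <- rep(rnorm(20, 60, 10), each = 24)

contrasts(d$spatial_cue)  <- contr.sum(2)      # deviation (sum) coding
contrasts(d$semantic_cue) <- contr.sum(2)
d[c("ali", "track_att", "track_ign", "age")] <- scale(d[c("ali", "track_att", "track_ign", "age")])

m_acc <- glmer(accuracy ~ spatial_cue * semantic_cue +
                 ali + track_att + track_ign + age +
                 (1 | subject) + (1 | item),
               data = d, family = binomial)

or_acc <- exp(fixef(m_acc))                                            # odds ratios
p_fdr  <- p.adjust(coef(summary(m_acc))[, "Pr(>|z|)"], method = "fdr") # FDR-corrected p-values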
Bayes factor estimation
To facilitate the interpretability of significant and non-significant effects, we also calculated the Bayes factor (BF) based on the comparison of Bayesian information criterion (BIC) values, as proposed by Wagenmakers [105].
For the brain–behavior models, BF calculation was based on simplified brain–behavior models that included all main effects but no interaction terms, to avoid problems associated with the dependence of lower- and higher-order effects. These models included separate regressors for each of our key neural measures to arbitrate between within- and between-subject effects on behavior. Between-subject effect regressors consisted of neural measures that were averaged across all trials at the single-subject level, whereas the within-subject effect was modelled by the trial-by-trial deviation from the subject-level mean [cf. 45].
To calculate the BF for a given term, we compared the BIC values of the full model to that of a reduced model in which the respective term was removed [BF10 = exp((BIC(H0) − BIC(H1))/2)]. When log-transformed, BFs larger than 0 provide evidence for the presence of an effect (i.e., the observed data are more likely under the more complex model), whereas BFs smaller than 0 provide evidence for the absence of an effect (i.e., the observed data are more likely under the simpler model).
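In R, this BIC-based approximation can be written as a small helper; this is illustrative, with m_full and m_reduced standing for any pair of nested (g)lmer fits such as the models sketched above.

# Illustrative R sketch: BIC-based Bayes factor [105]
bf_from_bic <- function(m_full, m_reduced) {
  # BF10 = exp((BIC(H0) - BIC(H1)) / 2); log(BF10) > 0 favors the fuller model
  exp((BIC(m_reduced) - BIC(m_full)) / 2)
}
# e.g.: log(bf_from_bic(m_acc, update(m_acc, . ~ . - track_att)))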
Predictive analysis
To quantify how well single-trial behavioral performance of new participants could be predicted on the basis of our final explanatory brain–behavior models, we performed a predictive analysis including variable selection using LASSO regression [98]. All computations were carried out using the glmmLasso package in R [106,107]. As part of the implemented procedure, the design matrix was centered and orthonormalized prior to model fitting. The resulting coefficients were then transformed back to the original units.
Given the imbalance of correct and incorrect trials, for accuracy, we relied on the area under the receiver operating characteristic curve (AUC) as a measure of classification accuracy, and on the mean squared error (MSE) as a measure of prediction error for the analysis of response speed.
We used a nested cross-validation scheme in which hyperparameter tuning and the evaluation of predictive performance were carried out in separate inner and outer loops (see Figure 6–supplement 3). For the outer loop, we split the full dataset at random into five folds for training and testing. Each set of training data was subsequently split into a further five folds for the purpose of hyperparameter tuning. For both the outer and inner loop, the data were split at the subject level, that is, all trials from one participant were assigned to either training or testing, hence avoiding data leakage and associated overfitting [108]. LASSO regression models for accuracy and response speed relied on the same data splits.
For the selection of the optimal tuning parameter λ that controls the amount of shrinkage applied to the coefficients, we used a predefined grid of 100 logarithmically spaced λ values. The highest λ value (λmax) caused all coefficients to be shrunk to zero, while λmin was defined as λmin = λmax × 0.001. To determine the regression path, we computed models with decreasing λ values and passed the resulting coefficients to the next iteration, thus improving the efficiency of the optimization algorithm. For each inner-loop cross-validation cycle, we determined the optimal λ value (λopt) according to the 1-SE rule [109]. These values were then passed on to the corresponding outer cross-validation folds for training and testing.
We quantified the average predictive performance of our variable selection procedure by averaging the respective performance metric (AUC / MSE) across all k = 5 outer-fold models.
Finally, we determined which of these final models yielded the best predictive performance and used the associated λ value to fit this winning model to all of the available data [110]. As implemented in glmmLasso, we computed p-values by re-estimating the model including only the terms with non-zero coefficients and applying Fisher scoring.
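The skeleton below sketches how such a subject-level nested cross-validation could be set up around glmmLasso in R. It is a rough illustration only: the glmmLasso call follows the package's documented interface (fix, rnd, data, lambda, family), but the formula, the placeholder λ, the rank-based AUC helper, and the prediction from fixed-effects coefficients are assumptions rather than the authors' implementation, and the inner tuning loop is reduced to a comment.

# Illustrative R skeleton: subject-level nested CV around glmmLasso (accuracy / AUC)
library(glmmLasso)

auc <- function(y, score) {                      # rank-based AUC (Mann-Whitney)
  r <- rank(score); n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

lambdas  <- exp(seq(log(100), log(0.1), length.out = 100))   # decreasing lambda grid (assumed range)
subjects <- unique(d$subject)
folds    <- split(subjects, rep(1:5, length.out = length(subjects)))  # outer folds at the subject level

outer_auc <- sapply(folds, function(test_subj) {
  train <- droplevels(d[!d$subject %in% test_subj, ])
  test  <- d[d$subject %in% test_subj, ]
  lambda_opt <- 10   # placeholder: would be chosen from 'lambdas' in an inner 5-fold loop via the 1-SE rule
  fit <- glmmLasso(accuracy ~ spatial_cue + semantic_cue + ali + track_att + track_ign,
                   rnd = list(subject = ~1), data = train,
                   lambda = lambda_opt, family = binomial())
  X <- model.matrix(~ spatial_cue + semantic_cue + ali + track_att + track_ign, test)
  auc(test$accuracy, plogis(drop(X %*% fit$coefficients)))   # fixed-effects prediction for new subjects
})
mean(outer_auc)                                              # average predictive performance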
Data availability
The complete dataset associated with this work, including raw data, EEG data analysis results, and corresponding code, will be publicly available at https://osf.io/28r57/.
References
1223
1. Cherry EC. Some experiments on the recognition of speech, with one
1224
and with two ears. J Acoust Soc Am. 1953;25: 975979.
1225
doi:10.1121/1.1907229
1226
2. Jensen O, Mazaheri A. Shaping functional architecture by oscillatory
1227
alpha activity: gating by inhibition. Front Hum Neurosci. 2010;4: 186.
1228
doi:10.3389/fnhum.2010.00186
1229
3. Foxe JJ, Snyder AC. The Role of Alpha-Band Brain Oscillations as a
1230
Sensory Suppression Mechanism during Selective Attention. Front
1231
Psychology. 2011;2: 154. doi:10.3389/fpsyg.2011.00154
1232
4. Händel BF, Haarmeier T, Jensen O. Alpha oscillations correlate with
1233
the successful inhibition of unattended stimuli. J Cogn Neurosci.
1234
2011;23: 24942502. doi:10.1162/jocn.2010.21557
1235
5. Rihs TA, Michel CM, Thut G. Mechanisms of selective inhibition in
1236
visual spatial attention are indexed by α-band EEG synchronization.
1237
Eur J Neurosci. 2007;25: 603610. doi:10.1111/j.1460-
1238
9568.2007.05278.x
1239
6. Kerlin JR, Shahin AJ, Miller LM. Attentional gain control of ongoing
1240
cortical speech representations in a “cocktail party.” J Neurosci.
1241
2010;30: 620628. doi:10.1523/JNEUROSCI.3631-09.2010
1242
7. Ahveninen J, Huang S, Belliveau JW, Chang W-T, Hämäläinen M.
1243
Dynamic oscillatory processes governing cued orienting and
1244
allocation of auditory attention. J Cogn Neurosci. 2013;25: 1926
1245
1943. doi:10.1162/jocn_a_00452
1246
8. Müller N, Weisz N. Lateralized auditory cortical alpha band activity
1247
and interregional connectivity pattern reflect anticipation of target
1248
sounds. Cereb Cortex. 2011;22: 16041613.
1249
doi:10.1093/cercor/bhr232
1250
9. Wöstmann M, Herrmann B, Maess B, Obleser J. Spatiotemporal
1251
dynamics of auditory attention synchronize with speech. Proc Natl
1252
Acad Sci USA. 2016;113: 38733878. doi:10.1073/pnas.1523357113
1253
10. Wöstmann M, Vosskuhl J, Obleser J, Herrmann CS. Opposite effects
1254
of lateralised transcranial alpha versus gamma stimulation on
1255
auditory spatial attention. Brain Stimulation. 2018;11: 752758.
1256
doi:10.1016/j.brs.2018.04.006
1257
11. Tune S, Wöstmann M, Obleser J. Probing the limits of alpha power
1258
lateralisation as a neural marker of selective attention in middle-aged
1259
and older listeners. Eur J Neurosci. 2018;48: 25372550.
1260
doi:10.1111/ejn.13862
1261
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
37
12. Haegens S, Handel BF, Jensen O. Top-Down Controlled Alpha Band
1262
Activity in Somatosensory Areas Determines Behavioral Performance
1263
in a Discrimination Task. J Neurosci. 2011;31: 51975204.
1264
doi:10.1523/JNEUROSCI.5199-10.2011
1265
13. Bauer M, Kennett S, Driver J. Attentional selection of location and
1266
modality in vision and touch modulates low-frequency activity in
1267
associated sensory cortices. J Neurophysiol. 2012;107: 23422351.
1268
doi:10.1152/jn.00973.2011
1269
14. Worden MS, Foxe JJ, Wang N, Simpson GV. Anticipatory biasing of
1270
visuospatial attention indexed by retinotopically specific alpha-band
1271
electroencephalography increases over occipital cortex. J Neurosci.
1272
2000;20: 16. doi:https://doi.org/10.1523/JNEUROSCI.20-06-
1273
j0002.2000
1274
15. Kelly SP, Lalor EC, Reilly RB, Foxe JJ. Increases in alpha oscillatory
1275
power reflect an active retinotopic mechanism for distracter
1276
suppression during sustained visuospatial attention. J Neurophysiol.
1277
2006;95: 38443851. doi:10.1152/jn.01234.2005
1278
16. Schroeder CE, Lakatos P. Low-frequency neuronal oscillations as
1279
instruments of sensory selection. Trends Neurosci. 2009;32: 918.
1280
doi:10.1016/j.tins.2008.09.012
1281
17. Schroeder CE, Wilson DA, Radman T, Scharfman H, Lakatos P.
1282
Dynamics of Active Sensing and perceptual selection. Current
1283
Opinion in Neurobiology. 2010;20: 172176.
1284
doi:10.1016/j.conb.2010.02.010
1285
18. Henry MJ, Obleser J. Frequency modulation entrains slow neural
1286
oscillations and optimizes human listening behavior. Proc Natl Acad
1287
Sci USA. 2012;109: 2009520100. doi:10.1073/pnas.1213390109
1288
19. Obleser J, Kayser C. Neural Entrainment and Attentional Selection in
1289
the Listening Brain. Trends Cogn Sci. 2019;23: 913926.
1290
doi:10.1016/j.tics.2019.08.004
1291
20. Zion Golumbic EMZ, Ding N, Bickel S, Lakatos P, Schevon CA,
1292
McKhann GM, et al. Mechanisms Underlying Selective Neuronal
1293
Tracking of Attended Speech at a ‘‘Cocktail Party’’. Neuron. 2013;77:
1294
980991. doi:10.1016/j.neuron.2012.12.037
1295
21. Ding N, Simon JZ. Emergence of neural encoding of auditory objects
1296
while listening to competing speakers. Proc Natl Acad Sci USA.
1297
2012;109: 1185411859. doi:10.1073/pnas.1205381109
1298
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
38
22. Mesgarani N, Chang EF. Selective cortical representation of attended
1299
speaker in multi-talker speech perception. Nature. 2012;485: 233
1300
236. doi:10.1038/nature11020
1301
23. O’Sullivan JA, Power AJ, Mesgarani N, Rajaram S, Foxe JJ, Shinn-
1302
Cunningham BG, et al. Attentional Selection in a Cocktail Party
1303
Environment Can Be Decoded from Single-Trial EEG. Cereb Cortex.
1304
2014;25: 16971706. doi:10.1093/cercor/bht355
1305
24. Horton C, D'Zmura M, Srinivasan R. Suppression of competing
1306
speech through entrainment of cortical oscillations. J Neurophysiol.
1307
2013;109: 30823093. doi:10.1152/jn.01026.2012
1308
25. Keitel C, Keitel A, Benwell CSY, Daube C, Thut G, Gross J. Stimulus-
1309
Driven Brain Rhythms within the Alpha Band: The Attentional-
1310
Modulation Conundrum. J Neurosci. 2019;39: 31193129.
1311
doi:10.1523/JNEUROSCI.1633-18.2019
1312
26. Gundlach C, Moratti S, Forschack N, Müller MM. Spatial Attentional
1313
Selection Modulates Early Visual Stimulus Processing Independently
1314
of Visual Alpha Modulations. Cereb Cortex. 2020;16: 63718.
1315
doi:10.1093/cercor/bhz335
1316
27. Henry MJ, Herrmann B, Kunke D, Obleser J. Aging affects the balance
1317
of neural entrainment and top-down neural modulation in the
1318
listening brain. Nat Commun. 2017;8: 15801.
1319
doi:10.1038/ncomms15801
1320
28. Lakatos P, Barczak A, Neymotin SA, McGinnis T, Ross D, Javitt DC, et
1321
al. Global dynamics of selective attention and its lapses in primary
1322
auditory cortex. Nat Neurosci. 2016;19: 17071717.
1323
doi:10.1038/nn.4386
1324
29. Posner MI. Orienting of attention. Quarterly Journal of Experimental
1325
Psychology. 1980;32: 325. doi: 10.1080/00335558008248231
1326
30. Alavash M, Tune S, Obleser J. Modular reconfiguration of an auditory
1327
control brain network supports adaptive listening behavior. Proc Natl
1328
Acad Sci USA. 2019;116: 660669. doi:10.1073/pnas.1815321116
1329
31. Presacco A, Simon JZ, Anderson S. Effect of informational content of
1330
noise on speech representation in the aging midbrain and cortex. J
1331
Neurophysiol. 2016;116: 23562367. doi:10.1152/jn.00373.2016
1332
32. Sohoglu E, Peelle JE, Carlyon RP, Davis MH. Predictive Top-Down
1333
Integration of Prior Knowledge during Speech Perception. J Neurosci.
1334
2012;32: 84438453. doi:10.1523/JNEUROSCI.5069-11.2012
1335
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
39
33. Peelle JE, Gross J, Davis MH. Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension. Cereb Cortex. 2013;23: 1378–1387. doi:10.1093/cercor/bhs118
34. Obleser J, Weisz N. Suppressed Alpha Oscillations Predict Intelligibility of Speech and its Acoustic Details. Cereb Cortex. 2012;22: 2466–2477. doi:10.1093/cercor/bhr325
35. Wöstmann M, Lim S-J, Obleser J. The human neural alpha response to speech is a proxy of attentional control. Cereb Cortex. 2017;27: 3307–3317. doi:10.1093/cercor/bhx074
36. Broadbent DE, Gregory M. Accuracy of recognition for speech presented to the right and left ears. Quarterly Journal of Experimental Psychology. 1964;16: 359–360. doi:10.1080/17470216408416392
37. Kimura D. Cerebral dominance and the perception of verbal stimuli. Canadian Journal of Psychology/Revue canadienne de psychologie. 1961;15: 166–171. doi:10.1037/h0083219
38. Shmueli G. To Explain or to Predict? Statist Sci. 2010;25: 289–310. doi:10.1214/10-STS330
39. Alavash M, Tune S, Obleser J. Modular reconfiguration of an auditory control brain network supports adaptive listening behavior. Proc Natl Acad Sci USA. 2019;116: 660–669. doi:10.1073/pnas.1815321116
40. Fiedler L, Wöstmann M, Herbst SK, Obleser J. Late cortical tracking of ignored speech facilitates neural selectivity in acoustically challenging conditions. NeuroImage. 2019;186: 33–42. doi:10.1016/j.neuroimage.2018.10.057
41. Banerjee S, Snyder AC, Molholm S, Foxe JJ. Oscillatory alpha-band mechanisms and the deployment of spatial attention to anticipated auditory and visual target locations: supramodal or sensory-specific control mechanisms? J Neurosci. 2011;31: 9923–9932. doi:10.1523/JNEUROSCI.4660-10.2011
42. Schneider BA, Daneman M, Pichora-Fuller MK. Listening in aging adults: from discourse comprehension to psychoacoustics. Can J Exp Psychol. 2002;56: 139–152. doi:10.1037/h0087392
43. Passow S, Westerhausen R, Wartenburger I, Hugdahl K, Heekeren HR, Lindenberger U, et al. Human aging compromises attentional control of auditory perception. Psychol Aging. 2012;27: 99–105. doi:10.1037/a0025667
44. Anderson S, White-Schwoch T, Parbery-Clark A, Kraus N. A dynamic auditory-cognitive system supports speech-in-noise perception in older adults. Hearing Research. 2013;300: 18–32. doi:10.1016/j.heares.2013.03.006
45. Bell A, Fairbrother M, Jones K. Fixed and random effects models: making an informed choice. Qual Quant. 2018;53: 1051–1074. doi:10.1007/s11135-018-0802-x
46. Waschke L, Tune S, Obleser J. Local cortical desynchronization and pupil-linked arousal differentially shape brain states for optimal sensory performance. eLife. 2019;8: e51501. doi:10.7554/eLife.51501
47. Krakauer JW, Ghazanfar AA, Gomez-Marin A, MacIver MA, Poeppel D. Neuroscience needs behavior: correcting a reductionist bias. Neuron. 2017;93: 480–490. doi:10.1016/j.neuron.2016.12.041
48. van Ede F, Köster M, Maris E. Beyond establishing involvement: quantifying the contribution of anticipatory α- and β-band suppression to perceptual improvement with attention. J Neurophysiol. 2012;108: 2352–2362. doi:10.1152/jn.00347.2012
49. Ding N, Simon JZ. Cortical entrainment to continuous speech: functional roles and interpretations. Front Hum Neurosci. 2014;8: 311. doi:10.3389/fnhum.2014.00311
50. Hamilton LS, Huth AG. The revolution will not be controlled: natural stimuli in speech neuroscience. Language, Cognition and Neuroscience. 2018;27: 1–10. doi:10.1080/23273798.2018.1499946
51. Sassenhagen J. How to analyse electrophysiological responses to naturalistic language with time-resolved multiple regression. Language, Cognition and Neuroscience. 2018;35: 474–490. doi:10.1080/23273798.2018.1502458
52. Ding N, Simon JZ. Neural coding of continuous speech in auditory cortex during monaural and dichotic listening. J Neurophysiol. 2012;107: 78–89. doi:10.1152/jn.00297.2011
53. Broderick MP, Anderson AJ, Di Liberto GM, Crosse MJ, Lalor EC. Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech. Curr Biol. 2018;28: 1–11. doi:10.1016/j.cub.2018.01.080
54. Brodbeck C, Hong LE, Simon JZ. Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech. Curr Biol. 2018;28: 3976–3982.e6. doi:10.1016/j.cub.2018.10.042
55. Fiedler L, Wöstmann M, Herbst SK, Obleser J. Late cortical tracking of ignored speech facilitates neural selectivity in acoustically challenging conditions. NeuroImage. 2018;186: 33–42. doi:10.1016/j.neuroimage.2018.10.057
56. Thut G. α-band electroencephalographic activity over occipital cortex indexes visuospatial attention bias and predicts visual target detection. J Neurosci. 2006;26: 9494–9502. doi:10.1523/JNEUROSCI.0875-06.2006
57. Bengson JJ, Mangun GR, Mazaheri A. The neural markers of an imminent failure of response inhibition. NeuroImage. 2012;59: 1534–1539. doi:10.1016/j.neuroimage.2011.08.034
58. Dahl MJ, Ilg L, Li S-C, Passow S, Werkle-Bergner M. Diminished pre-stimulus alpha-lateralization suggests compromised self-initiated attentional control of auditory processing in old age. NeuroImage. 2019;197: 414–424. doi:10.1016/j.neuroimage.2019.04.080
59. Hong X, Sun J, Bengson JJ, Mangun GR, Tong S. Normal aging selectively diminishes alpha lateralization in visual spatial attention. NeuroImage. 2015;106: 353–363. doi:10.1016/j.neuroimage.2014.11.019
60. Leenders MP, Lozano-Soldevilla D, Roberts MJ, Jensen O, De Weerd P. Diminished Alpha Lateralization During Working Memory but Not During Attentional Cueing in Older Adults. Cereb Cortex. 2018;28: 21–32. doi:10.1093/cercor/bhw345
61. Mok RM, Myers NE, Wallis G, Nobre AC. Behavioral and neural markers of flexible attention over working memory in aging. Cereb Cortex. 2016;26: 1831–1842. doi:10.1093/cercor/bhw011
62. Rogers CS, Payne L, Maharjan S, Wingfield A, Sekuler R. Older adults show impaired modulation of attentional alpha oscillations: Evidence from dichotic listening. Psychol Aging. 2018;33: 246–258. doi:10.1037/pag0000238
63. Wöstmann M, Alavash M, Obleser J. Alpha Oscillations in the Human Brain Implement Distractor Suppression Independent of Target Selection. J Neurosci. 2019;39: 9797–9805. doi:10.1523/JNEUROSCI.1954-19.2019
64. Pernet CR, Sajda P, Rousselet GA. Single-trial analyses: why bother? Front Psychology. 2011;2: 322. doi:10.3389/fpsyg.2011.00322
65. O'Sullivan J, Herrero J, Smith E, Schevon C, McKhann GM, Sheth SA, et al. Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception. Neuron. 2019;104: 1195–1204. doi:10.1016/j.neuron.2019.09.007
66. Serences JT, Kastner S. A multi-level account of selective attention. In: Nobre AC, Kastner S (Eds.). The Oxford Handbook of Attention. Oxford University Press. 2014; pp. 76–104. doi:10.1093/oxfordhb/9780199675111.013.022
67. Puvvada KC, Simon JZ. Cortical Representations of Speech in a Multitalker Auditory Scene. J Neurosci. 2017;37: 9189–9196. doi:10.1523/JNEUROSCI.0938-17.2017
68. Melara RD, Rao A, Tong Y. The duality of selection: Excitatory and inhibitory processes in auditory selective attention. Journal of Experimental Psychology: Human Perception and Performance. 2002;28: 279–306. doi:10.1037//0096-1523.28.2.279
69. Chait M, de Cheveigné A, Poeppel D, Simon JZ. Neural dynamics of attending and ignoring in human auditory cortex. Neuropsychologia. 2010;48: 3262–3271. doi:10.1016/j.neuropsychologia.2010.07.007
70. Shinn-Cunningham BG, Best V. Selective Attention in Normal and Impaired Hearing. Trends in Amplification. 2008;12: 283–299. doi:10.1177/1084713808325306
71. Shinn-Cunningham BG. Object-based auditory and visual attention. Trends Cogn Sci. 2008;12: 182–186. doi:10.1016/j.tics.2008.02.003
72. Kaya EM, Elhilali M. Modelling auditory attention. Philos Trans R Soc Lond, B, Biol Sci. 2017;372. doi:10.1098/rstb.2016.0101
73. Wöstmann M, Obleser J. Acoustic detail but not predictability of task-irrelevant speech disrupts working memory. Front Hum Neurosci. 2016;10: 538. doi:10.3389/fnhum.2016.00538
74. Zhigalov A, Herring JD, Herpers J, Bergmann TO, Jensen O. Probing cortical excitability using rapid frequency tagging. NeuroImage. 2019;195: 59–66. doi:10.1016/j.neuroimage.2019.03.056
75. Teoh ES, Lalor EC. EEG decoding of the target speaker in a cocktail party scenario: considerations regarding dynamic switching of talker location. J Neural Eng. 2019;16: 036017. doi:10.1088/1741-2552/ab0cf1
76. Nourski KV, Brugge JF. Representation of temporal sound features in the human auditory cortex. Rev Neurosci. 2011;22: 187–203. doi:10.1515/RNS.2011.016
77. Giraud AL, Poeppel D. Cortical oscillations and speech processing: emerging computational principles and operations. Nat Neurosci. 2012;15: 511–517. doi:10.1038/nn.3063
78. Sadaghiani S, Kleinschmidt A. Brain Networks and α-Oscillations: Structural and Functional Foundations of Cognitive Control. Trends Cogn Sci. 2016;20: 805–817. doi:10.1016/j.tics.2016.09.004
79. Banerjee S, Snyder AC, Molholm S, Foxe JJ. Oscillatory Alpha-Band Mechanisms and the Deployment of Spatial Attention to Anticipated Auditory and Visual Target Locations: Supramodal or Sensory-Specific Control Mechanisms? J Neurosci. 2011;31: 9923–9932. doi:10.1523/JNEUROSCI.4660-10.2011
80. Zoefel B, VanRullen R. Oscillatory Mechanisms of Stimulus Processing and Selection in the Visual and Auditory Systems: State-of-the-Art, Speculations and Suggestions. Front Neurosci. 2017;11: 296. doi:10.3389/fnins.2017.00296
81. Oldfield RC. The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia. 1971;9: 97–113. doi:10.1016/0028-3932(71)90067-4
82. Jefferies K, Gale TM. 6-CIT: Six-Item Cognitive Impairment Test. Cognitive Screening Instruments. London: Springer, London; 2013. pp. 209–218. doi:10.1007/978-1-4471-2452-8_11
83. Peirce JW. PsychoPy – Psychophysics software in Python. Journal of Neuroscience Methods. 2007;162: 8–13. doi:10.1016/j.jneumeth.2006.11.017
84. Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods. 2004;134: 9–21. doi:10.1016/j.jneumeth.2003.10.009
85. Oostenveld R, Fries P, Maris E, Schoffelen J-M. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intell Neurosci. 2011;2011: 1–9. doi:10.1155/2011/156869
86. Schoffelen J-M, Gross J. Source connectivity analysis with MEG and EEG. Salmelin R, Baillet S, editors. Hum Brain Mapp. 2009;30: 1857–1865. doi:10.1002/hbm.20745
87. Glasser MF, Coalson TS, Robinson EC, Hacker CD, Harwell J, Yacoub E, et al. A multi-modal parcellation of human cerebral cortex. Nature. 2016;536: 171–178. doi:10.1038/nature18933
88. Keitel A, Gross J. Individual Human Brain Areas Can Be Identified from Their Characteristic Spectral Activation Fingerprints. PLoS Biol. 2016;14: e1002498. doi:10.1371/journal.pbio.1002498
89. Smulders FTY, Oever ten S, Donkers FCL, Quaedflieg CWEM, van de
1524
Ven V. Single-trial log transformation is optimal in frequency analysis
1525
of resting EEG alpha. Eur J Neurosci. 2018;44: 9414.
1526
doi:10.1111/ejn.13854
1527
90. Chi T, Ru P, Shamma SA. Multiresolution spectrotemporal analysis of
1528
complex sounds. J Acoust Soc Am. 2005;118: 887906.
1529
doi:10.1121/1.1945807
1530
91. Crosse MJ, Di Liberto GM, Bednar A, Lalor EC. The Multivariate
1531
Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for
1532
Relating Neural Signals to Continuous Stimuli. Front Hum Neurosci.
1533
2016;10: 604. doi:10.3389/fnhum.2016.00604
1534
92. Lalor EC, Foxe JJ. Neural responses to uninterrupted natural speech
1535
can be extracted with precise temporal resolution. Eur J Neurosci.
1536
2010;31: 189193. doi:10.1111/j.1460-9568.2009.07055.x
1537
93. Hoerl AE, Kennard RW. Ridge Regression: Biased Estimation for
1538
Nonorthogonal Problems. Technometrics. 1970;12: 5567.
1539
doi:10.1080/00401706.1970.10488634
1540
94. Biesmans W, Das N, Francart T, Bertrand A. Auditory-Inspired Speech
1541
Envelope Extraction Methods for Improved EEG-Based Auditory
1542
Attention Detection in a Cocktail Party Scenario. IEEE Trans Neural
1543
Syst Rehabil Eng. 2017;25: 402412.
1544
doi:10.1109/TNSRE.2016.2571900
1545
95. Fiedler L, Wöstmann M, Graversen C, Brandmeyer A, Lunner T,
1546
Obleser J. Single-channel in-ear-EEG detects the focus of auditory
1547
attention to concurrent tone streams and mixed speech. J Neural
1548
Eng. 2017;14: 036020. doi:10.1088/1741-2552/aa66dd
1549
96. Haufe S, Meinecke F, Görgen K, Dähne S, Haynes J-D, Blankertz B, et
1550
al. On the interpretation of weight vectors of linear models in
1551
multivariate neuroimaging. NeuroImage. 2014;87: 96110.
1552
doi:10.1016/j.neuroimage.2013.10.067
1553
97. Chatfield C. Model Uncertainty, Data Mining and Statistical Inference.
1554
Journal of the Royal Statistical Society: Series A (Statistics in Society).
1555
John Wiley & Sons, Ltd; 1995;158: 419444. doi:10.2307/2983440
1556
98. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat
1557
Soc Series B Stat Methodol. 1996;56: 267288.
1558
99. Derksen S, Keselman HJ. Backward, forward and stepwise automated
1559
subset selection algorithms: Frequency of obtaining authentic and
1560
noise variables. British Journal of Mathematical and Statistical
1561
.CC-BY-NC-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted May 22, 2020. . https://doi.org/10.1101/2020.05.20.105874doi: bioRxiv preprint
Tune et al. · Modelling neural filters and behavioral outcome in attentive listening
45
Psychology. 1992;45: 265282. doi:10.1111/j.2044-
1562
8317.1992.tb00992.x
1563
100. Luke SG. Evaluating significance in linear mixed-effects models in R.
1564
Behav Res. 2017;49: 14941502. doi:10.3758/s13428-016-0809-y
1565
101. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a
1566
practical and powerful approach to multiple testing. J R Stat Soc
1567
Series B Stat Methodol. 1995;57: 289300. doi:10.2307/2346101
1568
102. R Core Team. R: A language and environment for statistical
1569
computing. R Foundation for Statistical Computing. Vienna, Austria.
1570
2019. doi:https://www.R-project.org/
1571
103. Bates D, Mächler M, Bolker B, Walker S. Fitting Linear Mixed-Effects
1572
Models Using lme4. J Stat Soft. 2015;67: 148.
1573
doi:10.18637/jss.v067.i01
1574
104. Lüdecke D, Lüdecke D. Data Visualization for Statistics in Social
1575
Science [R package sjPlot version 2.6.1]. Comprehensive R Archive
1576
Network (CRAN); 2018. doi:10.5281/zenodo.1308157
1577
105. Wagenmakers E-J. A practical solution to the pervasive problems of p
1578
values. Psychonomic Bulletin & Review. 2007;14: 779804.
1579
doi:10.3758/bf03194105
1580
106. Groll A. glmmLasso: Variable selection for generalized linear mixed
1581
models by L1-penalized estimation. https://CRAN.R-
1582
project.org/package=glmmLasso
1583
107. Groll A, Tutz G. Variable selection for generalized linear mixed