Title

Automatic and feature-specific (anticipatory) prediction-related neural activity in the human auditory system

Authors

Gianpaolo Demarchi*¹, Gaëtan Sanchez*¹,² and Nathan Weisz¹

*shared first authorship
Abstract
Prior experience shapes sensory perception by enabling the formation of expectations regarding the occurrence of upcoming sensory events. Especially in the visual modality, an increasing number of studies show that prediction-related neural signals carry feature-specific information about the stimulus. This is less established in the auditory modality, in particular in the absence of bottom-up signals driving neural activity. We studied whether auditory predictions are tuned sharply enough to even carry tonotopically specific information. For this purpose, we conducted a magnetoencephalography (MEG) experiment in which participants passively listened to sound sequences that varied in their regularity (i.e. entropy). Sound presentations were temporally predictable (3 Hz rate), but sounds were occasionally omitted. Training classifiers on the random (high entropy) sound sequence and applying them to all conditions in a time-generalized manner allowed us to assess whether, and how, carrier-frequency specific information in the MEG signal is modulated according to the entropy level. We show that, especially in an ordered (most predictable) sensory context, neural activity during the anticipatory and omission periods contains carrier-frequency specific information.
bioRxiv preprint first posted online Feb. 18, 2018; doi: http://dx.doi.org/10.1101/266841. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Overall, our results illustrate that prediction-related neural activity in the human auditory system can be tuned in a tonotopically specific manner.
Introduction
Our capacity to predict incoming sensory inputs based on past experiences is fundamental to adapting our behavior in complex environments. A core enabling process is the identification of statistical regularities in the sensory input, which does not require any voluntary allocation of processing resources (e.g. selective attention) and occurs more or less automatically in healthy brains¹. Analogous to other sensory modalities²,³, auditory cortical information processing takes place in hierarchically organized streams along putative ventral and dorsal pathways⁴. These streams reciprocally connect different portions of the auditory cortex with frontal and parietal regions⁴,⁵. This hierarchical anatomical architecture renders auditory cortical processing regions sensitive to top-down modulations, thereby enabling modulatory effects of predictions. Building upon this integrated feedforward and top-down architecture, cortical and subcortical regions appear to be involved in auditory prediction-error generation⁶,⁷. A relevant question in this context is to what extent prediction-related top-down modulations (pre-)activate the same feature-specific neural ensembles as established for genuine sensory stimulation.

Such fine-tuning of neural activity would be suggested by frameworks that propose the existence of internal generative models⁸⁻¹¹, which infer the causal structure of sensory events in our environment and the sensory consequences of our actions. A relevant process for validating and optimizing these internal models is the prediction of incoming stimulus events, by influencing the activity of corresponding neural ensembles in respective sensory areas. Deviations from these predictions putatively lead to (prediction) error signals, which are passed on in a bottom-up manner to adapt the internal model, thereby continuously improving predictions⁹ (for an alternative predictive coding architecture see¹²). According to this line of reasoning, predicted input should lead to weaker neural activation than input that was not predicted, which has been illustrated previously in the visual¹³ and the auditory modality¹⁴. Support for the idea that predictions engage neurons specifically tuned to (expected) stimulus features has been more challenging to address and has come mainly from the visual modality (for review see¹⁵). In an fMRI study, Smith and Muckli showed that early visual cortical regions (V1 and V2) processing occluded parts of a scene carry sufficient information to decode different visual scenes above chance¹⁶. Intriguingly, activity patterns in the occlusion condition generalized to a non-occlusion control condition, implying that context-related top-down feedback, or input via lateral connections, modulates visual cortex in a feature-specific manner. In line with these results, it has been shown that mental preplay of a visual stimulus sequence is accompanied by V1 activity that resembles activity patterns driven in a feedforward manner by the real sequence¹. Beyond more or less automatically generated predictions, explicit attentional focus on specific visual stimulus categories also goes along with similar feature-specific modifications in early and higher visual cortices, even in the absence of visual stimulation¹⁷. Indeed, it has been proposed that expectations increase the baseline activity of sensory neurons tuned to a specific stimulus¹⁸,¹⁹. Moreover, a study using magnetoencephalography (MEG) and multivariate decoding analyses revealed how expectation can induce a preactivation of a stimulus template in visual sensory cortex, suggesting a mechanism for anticipatory predictive perception²⁰. Overall, for the visual modality, these studies underline that top-down processes sharpen the tuning of neural activity so that it contains more information about the predicted and/or attended stimulus (feature).
Studies as to whether predictions in the auditory domain (pre-)activate specific sensory representations in a sharply tuned manner are scarce, especially in humans (for animal work see e.g.²¹,²²). Sharpened tuning curves of neurons in A1 during selective auditory attention have been established in animal experiments²³, even though this does not necessarily generalize to automatically formed predictions. A line of evidence comes from research in marmoset monkeys, in which a reduction of auditory activity is seen during vocalization (²⁴; for suppression of neural activity to movement-related sounds in rats see²⁵). This effect is abolished when fed-back vocal utterances are pitch shifted²⁶, thereby violating predictions. Such an action-based dynamic (i.e. adaptive) sensory filter likely involves motor cortical inputs to the auditory cortex that selectively suppress predictable acoustic consequences of movement²⁷. Interestingly, even inner speech may be sufficient to produce reduced neural activity, but only when the presented sounds match those internally verbalized²⁸. Using invasive recordings in a small set of human epilepsy patients, it was shown that masked speech is restored by specific activity patterns in bilateral auditory cortices²⁹, an effect reminiscent of a report in the visual modality (¹; for other studies investigating similar auditory continuity-illusion phenomena see³⁰⁻³²). Albeit feature specific, this "filling-in" type of activity pattern observed during phoneme restoration cannot conclusively clarify whether the mechanism requires top-down input. In principle, these results could also be largely generated via bottom-up thalamocortical input driving feature-relevant neural ensembles via lateral or feedforward connections. To resolve this issue, putative feature-specific predictions need to be shown without confounding feedforward input (i.e. during silence).
It has recently been shown, in a high-resolution fMRI experiment, that predictive responses to omissions follow a tonotopic organization in the auditory cortex³³. However, following the notion that predictive processes are also proactive in an anticipatory sense, the exact timing of the effects provides important evidence on whether predictions in the auditory system come along with feature-specific preactivations of relevant neural ensembles¹⁵. To this end, techniques with higher temporal resolution (e.g., EEG/MEG) are needed.

The goal of the present MEG study was to investigate, in healthy human participants, whether predictions in the auditory modality are exerted in a carrier-frequency (i.e. tonotopically) specific manner and, more importantly, whether those predictions are accompanied by anticipatory effects. For this purpose, we merged an omission paradigm with a regularity-modulation paradigm (for an overview see³⁴; see Figure 1 for details). So-called omission responses occur when an expected tone is replaced by silence. Frequently, this response has been investigated in the context of Mismatch Negativity (MMN³⁵) paradigms, which have undoubtedly been the most common approach to studying the processing of statistical regularities in human auditory processing³⁶⁻³⁹. This evoked response occurs upon a deviance from a "standard" stimulus sequence, that is, a sequence characterized by a regularity violation regarding stimulation order. For omission responses (e.g.⁴⁰), this order is usually established in a temporal sense, that is, allowing precise predictions of when a tone will occur (⁴¹; for a study using a repetition-suppression design see⁴²). The neural responses during these silent periods are of outstanding interest since they cannot be explained by any feedforward propagation of activity elicited by a physical stimulus. Thus, omission of an acoustic stimulation will lead to a neural response as long as the omission violates a regular sequence of acoustic stimuli, i.e. when it occurs unexpectedly. Previous work has identified auditory cortical contributions to the omission response (e.g.⁴²). Interestingly, and underlining the importance of top-down input driving the omission response, a recent DCM study by Chennu et al.⁴³ illustrates that it is best explained when assuming top-down driving inputs into higher-order cortical areas (e.g. frontal cortex). While establishing temporal predictions via a constant stimulation rate, we varied the regularity of the sound sequence by parametrically modulating its entropy level (see e.g.⁴⁴,⁴⁵). Using different carrier frequencies, sound sequences varied between random (high entropy; transition probabilities from one sound to all others at chance level) and ordered (low entropy; transition probability from one sound to another above chance). Our reasoning was that omission-related neural responses should contain carrier-frequency specific information that is modulated by the entropy level of the contextual sound sequence. Using a time-generalization decoding approach⁴⁶, we find strong evidence that, particularly during the low-entropy (i.e. highly ordered) sequence, neural activity prior to and during the omission period contains carrier-frequency specific information similar to the activity observed during real sound presentation. Our work reveals how, even in passive listening situations, the auditory system extracts the statistical regularities of the sound input, continuously casting feature-specific (anticipatory) predictions as (pre-)activations of carrier-frequency specific neural ensembles.
Figure 1: Experimental design. A) Transition matrices used to generate sound sequences according to the different conditions (random (RD), mid-minus (MM), mid-plus (MP), and ordered (OR)). B) Schematic examples of different sound sequences generated across time. In each context, 10% of sound stimuli were randomly replaced by omission trials (absence of sound).
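The generative logic of panel A can be sketched as a first-order Markov process over the four carrier frequencies. The sketch below is illustrative only: the interpolation weights per condition and the cyclic "ordered" matrix are assumptions, not the exact transition probabilities of Figure 1A.

```python
import numpy as np

def make_transition_matrix(n_tones=4, order_strength=0.0):
    """Interpolate between a uniform (random, high-entropy) and a
    deterministic cyclic (ordered, low-entropy) transition matrix."""
    uniform = np.full((n_tones, n_tones), 1.0 / n_tones)
    cyclic = np.roll(np.eye(n_tones), 1, axis=1)  # tone i -> tone i+1
    return (1 - order_strength) * uniform + order_strength * cyclic

def generate_sequence(trans, n_trials, p_omission=0.1, seed=0):
    """Draw a tone sequence from the transition matrix, then replace a
    random 10% of trials with omissions (coded as -1)."""
    rng = np.random.default_rng(seed)
    n_tones = trans.shape[0]
    seq = np.empty(n_trials, dtype=int)
    seq[0] = rng.integers(n_tones)
    for t in range(1, n_trials):
        seq[t] = rng.choice(n_tones, p=trans[seq[t - 1]])
    seq[rng.random(n_trials) < p_omission] = -1
    return seq

# Hypothetical ordering strengths for the four conditions:
conditions = {"RD": 0.0, "MM": 0.33, "MP": 0.66, "OR": 1.0}
sequences = {name: generate_sequence(make_transition_matrix(order_strength=s), 3000)
             for name, s in conditions.items()}
```

With this parameterization, the RD rows are flat (maximal entropy) while OR transitions are fully determined, matching the random-to-ordered continuum described in the text.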
Results
Single-trial neural activity contains carrier-frequency information
A crucial first step, and the foundation for addressing the question of whether carrier-frequency specific neural activity patterns are modulated by predictions even in anticipatory or omission periods, is to establish that we can actually decode carrier frequencies during actual sound presentation. To address this issue, we used the single-trial MEG time-series data from the random (high entropy) condition and trained a classifier (LDA) to distinguish the four different carrier frequencies. Robust above-chance (p < .05, Bonferroni corrected, grey line) classification accuracy was observed commencing ~30 ms after stimulus onset, peaking (~35%) at around 100 ms, and gradually declining until 350 ms. Interestingly, however, carrier-frequency specific information remained above chance at least until 700 ms post-stimulus onset (i.e. the entire period tested), meaning that this information was still contained in the neural data when new sounds were presented. The pattern was overall similar for the other entropy conditions (not shown); however, in this study we focus on the data from the random sequence, since this decoding was the basis (i.e. training data set) for all subsequent analyses using a time-generalization approach, in which carrier frequency had to be decoded.
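The time-resolved decoding step can be sketched as fitting and cross-validating an LDA separately at each sample of the epoch. This is a minimal illustration with synthetic data standing in for the single-trial MEG epochs; the dimensions, effect size, and "onset" sample are arbitrary assumptions, not the study's parameters.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Synthetic stand-in for single-trial MEG data: 200 trials x 30 sensors
# x 50 samples; labels are the four carrier frequencies (0..3).
n_trials, n_sensors, n_times = 200, 30, 50
y = rng.integers(4, size=n_trials)
X = rng.standard_normal((n_trials, n_sensors, n_times))
# Inject a frequency-specific sensor pattern from sample 10 onward,
# mimicking a post-onset evoked effect.
patterns = rng.standard_normal((4, n_sensors))
X[:, :, 10:] += 0.5 * patterns[y][:, :, None]

# Time-resolved decoding: cross-validate an LDA separately at each
# sample, yielding an accuracy time course (chance = 0.25).
accuracy = np.array([
    cross_val_score(LinearDiscriminantAnalysis(), X[:, :, t], y, cv=5).mean()
    for t in range(n_times)
])
```

The resulting `accuracy` curve hovers at chance before the injected pattern and rises afterwards, which is the shape of effect reported for real sounds.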
To identify the brain regions that provided informative activity, we adapted a previously reported approach⁴⁷, which projects the classifier weights obtained for the decoding of carrier frequency from sensor to source space (Figure 2B). Since later analyses using the time-generalization approach pointed to a differential entropy-level effect for early (50-125 ms; W1) and late (125-333 ms; W2) periods of the training time period (described below and in Figure 3), the projected weights are displayed separately for these periods. For both time periods it is evident that bilateral auditory cortical regions contribute informative activity, albeit with a right-hemispheric dominance. While this appears similar for both time windows, informative activity also spreads to non-auditory cortical regions, such as frontal areas, during the later (W2) period.
Overall, the analyses so far show that carrier-frequency specific information can be robustly decoded from MEG, with informative activity originating (as expected) mainly from auditory cortex. Interestingly, and going beyond what can be shown by conventional evoked response analysis, carrier-frequency specific information is temporally persistent, potentially reflecting a memory trace of the sound that is (re)activated when new information arrives.
Figure 2: Decoding carrier frequencies from random sound sequences. A) A robust increase of decoding accuracy is obtained rapidly, peaking ~100 ms, after which it slowly wanes. Note, however, that significant decoding accuracy is observed even after 700 ms, likely representing a memory trace that is (re)activated even when new tones (with other carrier frequencies) are processed. B) Source projection of classifier weights for an early (W1) and a late (W2) window reveals informative activity to originate mainly from auditory cortices, with a right-hemispheric dominance. During the later (W2) period, informative activity spreads to also encompass e.g. frontal regions.
Modulations of entropy lead to clear anticipatory prediction effects
An anticipatory effect of predictions should be seen as a relative increase of carrier-frequency specific information prior to the onset of the expected sound with increasing regularity of the sound sequence. To tackle this issue, it is important to avoid potential carry-over effects of decoding, which would be present, for example, when training and testing on ordered sound sequences. We circumvented this problem by training our classifier only on data from the random sound sequence (see Figure 2A) and testing it for carrier-frequency specific information in all other conditions using time generalization. Applying this approach to pre- and post-sound (Figure 3A) or omission (Figure 3B) periods respectively, a nonparametric cluster permutation test yields a clear prestimulus effect in both cases, indicating a linear increase of decoding accuracy with decreasing entropy level (pre-sound: p_cluster < 0.001; pre-omission: p_cluster < 0.001). In both cases, these anticipatory effects appear to involve features that are relevant at later training time periods (~125-333 ms, W2; for informative activity in source space see also Figure 2B). The time courses of averaged accuracy for this training time interval (shown only for pre-omissions in the bottom panel of Figure 3B) visualize this effect, with a clear relative increase of decoding accuracy prior to expected sound onset, in particular for the ordered sequence. This analysis clearly underlines that predictions evolving during a regular sound sequence contain carrier-frequency specific information that is preactivated in an anticipatory manner, akin to the prestimulus stimulus templates reported in the visual modality²⁰.
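The time-generalization procedure (train at every time point of one condition, test at every time point of another) can be sketched as follows. The data are synthetic stand-ins with a label-specific pattern present throughout the epoch; all dimensions and effect sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
n_trials, n_sensors, n_times = 150, 20, 30
patterns = rng.standard_normal((4, n_sensors))

def synth_condition():
    """Synthetic trials whose label-specific sensor pattern is present
    at every sample (a deliberately simple stand-in for real MEG)."""
    y = rng.integers(4, size=n_trials)
    X = rng.standard_normal((n_trials, n_sensors, n_times))
    X += 0.7 * patterns[y][:, :, None]
    return X, y

X_train, y_train = synth_condition()  # stands in for the random condition
X_test, y_test = synth_condition()    # stands in for e.g. the ordered one

# Time generalization: train an LDA at every training sample and score it
# at every testing sample of the other condition, giving a
# train-time x test-time accuracy matrix.
gen = np.empty((n_times, n_times))
for t_train in range(n_times):
    clf = LinearDiscriminantAnalysis().fit(X_train[:, :, t_train], y_train)
    for t_test in range(n_times):
        gen[t_train, t_test] = clf.score(X_test[:, :, t_test], y_test)
```

Because training and testing come from different conditions, above-chance off-diagonal entries of `gen` cannot be explained by condition-specific overfitting, which is the rationale for training only on the random sequence.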
Figure 3: Regression analysis (using entropy level as independent variable; red colors indicate increased decoding accuracy for more regular sequences) for pre- and post-stimulus decoding, using time generalization of a classifier trained on random sound sequences (used to avoid potential carry-over effects as soon as regularity is added to the sound sequence). Upper panels display t-values thresholded at uncorrected p < .05. The areas framed in black are clusters significant at p_cluster < .05. The lower panels show the decoding accuracy for individual conditions averaged over training times between the dashed lines. A) Effects pre- and post-sound, showing a clear anticipation effect and a late effect commencing after ~400 ms. The latter effect is more clearly visualized in the lower panel. Interestingly, different training times appear to dominate the anticipation and post-stimulus effects. B) Effects pre- and post-omission, showing a single continuous positive cluster. However, the actual t-values suggest temporally distinct maxima within this cluster, underlining the dynamics around this event. Analogous to sounds, a clear anticipation effect can be observed, driven by increased pre-omission decoding accuracy for events embedded in regular sequences (see lower panel). A similar increase can be seen immediately following the onset of the omission, which cannot be observed following actual sound onset. Interestingly, this increase is long-lasting, with further peaks emerging approximately at 330 ms and 580 ms.
Entropy-dependent classification accuracy of sounds
According to some predictive processing frameworks¹⁵, predicted sounds should lead to reduced activation as compared to cases in which the sound was unpredicted. If this activation stems mainly from carrier-frequency specific neural ensembles, a decrease of decoding accuracy with increasing regularity (i.e. lower entropy) could be expected. Using a time-generalization approach, we applied the classifier trained on the carrier frequencies from the random sound sequence (Figure 2) to the post-sound periods of the individual entropy conditions. Our regression approach yielded a negative relationship between decoding accuracy and regularity at approximately 100-200 ms post-sound onset; however, this effect did not survive correction for multiple comparisons (p_cluster = 0.12). Also on a descriptive level, this effect appeared not to be strictly linear, meaning that our study does not provide strong evidence that predictions reduce carrier-frequency specific information of expected tones. Interestingly, a significant positive association (p_cluster < 0.001) emerged at a later time interval, beginning around 370 ms post-sound onset and lasting at least until 700 ms (Figure 3A). While, analogous to the aforementioned anticipation effect (and the subsequently described omission effect), decoding accuracy increased the more regular the sound sequence was, this effect was most strongly driven by classifier information stemming from earlier training time intervals (50-125 ms, W1; see Figure 2B). This effect is in line with the previously described temporally extended effect for the decoding of carrier frequency from random sound sequences, and suggests that carrier-frequency specific information is more strongly reactivated by subsequent sounds when embedded in a more regular sound sequence.
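The regression logic (entropy level as the independent variable, evaluated at every training/testing time bin) can be sketched at the group level as below. All accuracies are synthetic, and a parametric one-sample t-test on per-subject slopes stands in for the nonparametric cluster permutation statistics actually used in the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subjects, n_conditions, n_train, n_test = 24, 4, 20, 20
regularity = np.array([0.0, 1.0, 2.0, 3.0])  # RD < MM < MP < OR (assumed coding)

# Synthetic decoding accuracies: chance (0.25) plus noise, with one
# "cluster" of bins gaining accuracy linearly with regularity.
acc = 0.25 + 0.02 * rng.standard_normal((n_subjects, n_conditions, n_train, n_test))
acc[:, :, 5:12, 3:9] += 0.02 * regularity[None, :, None, None]

# First level: per-subject OLS slope of accuracy over regularity at each
# (training time, testing time) bin.
x = regularity - regularity.mean()
slopes = np.einsum('c,scij->sij', x, acc) / (x ** 2).sum()

# Second level: one-sample t-test of the slopes against zero across
# subjects (the study used cluster-based permutation inference instead).
tvals, pvals = stats.ttest_1samp(slopes, 0.0, axis=0)
```

Bins where accuracy rises with regularity yield large positive t-values, i.e. the red regions of Figure 3; cluster permutation then controls for multiple comparisons across the train x test plane.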
Entropy-dependent classification accuracy of sound omissions
Our sound sequences were designed such that the onset of a sound could be precisely predicted (following the invariant 3 Hz rhythm), while the predictability of the carrier frequencies was parametrically modulated according to the entropy level. Following the illustration of anticipation-related prediction effects, a core question of our study was whether we could identify carrier-frequency specific information following an expected but omitted sound onset. As depicted in Figure 3B, showing the outcome of our regression analysis, this is clearly the case. The nonparametric cluster permutation test yielded a single cluster that also comprises the anticipation effect described above (p_cluster < 0.001); however, in contrast to the analysis locked to sounds (Figure 3A), a post-omission-onset increase is also clearly identifiable. Interestingly, the post-omission effect was also long-lasting, reaching far beyond the short interval of the omitted sound. The lower panel of Figure 3B shows the decoding accuracy averaged over the 125-333 ms training time interval (W2; see Figure 2B), illustrating the enhanced decoding accuracy especially for the ordered sequence. Even though the entropy-driven effect is clearly continuous, local peaks at ~90, 330 and 580 ms following the (omitted) stimulation onset can be identified at a descriptive level, thus tracking the stimulation rate of the sound sequence. This shows that carrier-frequency specific information about an anticipated sound is enduring, not only encompassing prestimulus periods but also persisting when the prediction of a sound presentation is violated.
Inter-individual neural representations of statistical regularities and their correlation with feature-specific predictions
Our previous analyses establish a clear relationship between (anticipatory) predictions of sounds and the carrier-frequency specific information contained in the neural signal. This was derived from a regression analysis across conditions using the entropy level as independent variable. An interesting follow-up question is whether inter-individual variability in deriving the statistical regularity from the sound sequences is correlated with carrier-frequency specific information for predicted sounds.

To pursue this question, we first tested whether information pertaining to the level of statistical regularity of the sequence is contained in the single-trial signal. Using all magnetometers (i.e. discarding the spatial pattern) and a temporal decoding approach showed that the entropy level of the condition in which a sound was embedded could be decoded above chance from virtually any time point (Figure 4A). Given the blockwise presentation of the conditions, this temporally extended effect (p < .05, Bonferroni corrected, green line) is not surprising. On top of this effect, a transient increase of decoding accuracy ~50-200 ms following stimulus onset can be observed. In order to identify potential neural generators driving the described effect, the trained classifier weights were projected into source space (see inset of Figure 4A), analogous to the approach described above. Since the sensor-level analysis suggested a temporally stable (in the sense of almost always significant) neural pattern, the entire 0-330 ms time period was used for this purpose. The peak informative activity was strongly right-lateralized to temporal as well as parietal regions. Based on this analysis, we can state that information about the regularity of the sound sequence is contained also at the single-trial level, and that temporal and parietal regions may play an important role in this process. We then applied the classifier trained on the sound presentations to the omission periods, in order to uncover whether the statistical pattern information is also present following an unexpected omission (Figure 4A). Interestingly, in this case a decrease of decoding accuracy was observed, commencing ~120 ms after omission onset and lasting for ~200 ms. During this brief time period the entropy level could not be decoded above chance. This effect illustrates that an unexpected omission transiently interrupts the processing of the statistical regularity of the sound sequence.
To test whether the interindividual variability in neurally representing the entropy level
is associated with carrierfrequency specific predictions, we correlated average entropy
decoding accuracy in a 0330 ms timewindow following sound onset with timegeneralized
decoding accuracy for carrierfrequency around sound or omission onset separately for the
early (W1) and late (W2) training timewindows. The analysis (Figure 4B) shows for early
training timewindow (W1) a negative relationship (p
cluster = 0.02), meaning that participants
whose neural activity suggested a stronger representation of the level of statistical regularity
preactivate carrierfrequency specific neural patterns to a lesser extent. It should be noted
that the overall entropyrelated effect was driven more by the later trainingtime window
15
.CC-BY-NC-ND 4.0 International licenseIt is made available under a
(which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint. http://dx.doi.org/10.1101/266841doi: bioRxiv preprint first posted online Feb. 18, 2018;
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
(W2), for which no correlation was observed in the present analysis. Also, the correlation effect was not found when locking the analysis to the omission onset, which could either suggest a spurious finding or reflect suboptimal statistical power for this effect, given the far lower number of trials for the omission-centered time-generalized decoding. On a descriptive (uncorrected) level, positive correlations can be seen following sound onset that are sequentially pronounced for the early (W1) and late (W2) training-time windows (p_cluster = 0.062 and p_cluster = 0.096, respectively). Following the omission onset, a late positive correlation (p_cluster = 0.033) was observed at ~500–600 ms for the late training time-window (W2), meaning that the carrier-frequency specific pattern of the omitted sound was reactivated more during the ordered sequence for participants who showed stronger encoding of the statistical regularity. Altogether, this analysis demonstrates that the brain maintains a continuous representation of the magnitude of regularity of the sequence and that this representation is modulated by the presence or (unexpected) absence of a sound. Furthermore, the results suggest that inter-individual variability in this more global representation of input regularity could influence how strongly carrier-frequency specific neural patterns are expressed preceding and following a sound.
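The across-participant correlation described above can be sketched as follows: each participant contributes one scalar entropy-decoding score, which is (Spearman) correlated, across participants, with that participant's time-generalized carrier-frequency decoding accuracy at every training x testing time point. This is an illustrative Python sketch with random placeholder data and grid sizes, not values from the study.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_subj, n_train, n_test = 33, 20, 30  # hypothetical grid sizes

# Per-participant average entropy-decoding accuracy (e.g. 0-330 ms window)
entropy_acc = rng.uniform(0.25, 0.45, n_subj)

# Per-participant time-generalized carrier-frequency decoding accuracy
# (training time x testing time) for one condition
timegen_acc = rng.uniform(0.23, 0.30, (n_subj, n_train, n_test))

# Spearman rho across participants at every (train, test) point
rho = np.empty((n_train, n_test))
pval = np.empty((n_train, n_test))
for i in range(n_train):
    for j in range(n_test):
        rho[i, j], pval[i, j] = spearmanr(entropy_acc,
                                          timegen_acc[:, i, j])
```

The resulting rho map can then be submitted to a cluster permutation test, as done in the paper.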
Figure 4: Decoding of the entropy level of the sound sequence and correlation with the main prediction effects obtained from the time-generalization analysis (see Figure 3). A) A classifier was trained to decode the entropy level from MEG activity elicited by sounds and tested around relevant events, i.e. sound or omission onset. Robust increases of decoding accuracy can be observed following sound onsets. Right temporal and parietal regions appear to contribute the most informative activity (small inset). While overall decoding accuracy is above chance level throughout most of the period, this pattern breaks down briefly following an omission. B) Average entropy-level decoding following sound onset (0–330 ms) was (Spearman) correlated with the time-generalized decoding accuracy of the low-entropy condition. A significant negative correlation, especially with early training time-window patterns (W1), can be seen in the anticipation period towards a sound; it was, however, not observed prior to omissions. Following the onset of omissions, non-parametric cluster permutation testing pointed to a late positive correlation with the late activation patterns (W2).
Discussion
In this study, we investigated neural activity during passive listening to auditory tone sequences whose entropy levels, and thereby the predictability of an upcoming sound, were manipulated. We used MVPA applied to MEG data to first show that neural responses contain sufficient information to decode the carrier-frequency of tones. Using classifiers trained on random sound sequences in a condition- and time-generalized manner, our main result is that carrier-frequency specific information increases the more regular (i.e.
predictable) the sound sequence becomes, especially in the anticipatory and post-omission periods. This study provides strong support that prediction-related processes in the human auditory system are sharply tuned to contain tonotopically specific information. While the finding of sharp tuning of neural activity is not surprising, given in particular invasive recordings from the animal auditory cortex (e.g. during vocalizations, see 26; or shifts of tuning curves following explicit manipulations of attention to specific tone frequencies, see 23,48), our work is a critical extension of previous human studies, for which tonotopically tuned effects of predictions had not been shown so far. Critically, given that omission responses have been considered as pure prediction signals 34,49, our work illustrates that sharp tuning via predictions does not require bottom-up thalamocortical drive.
Carrier-frequency specific information following sound onset is contained in single-trial MEG data
To pursue our main research question, we relied on MVPA applied to MEG data 46,50. Prior to addressing whether (anticipatory) prediction-related neural activity contains carrier-frequency specific information, an important sanity check was first to establish the decoding performance when a sound was presented in the random sequence. A priori, this is not a trivial undertaking, given that the small spatial extent of the auditory cortex 51 likely produces highly correlated topographical patterns for different pure tones, and that mapping tonotopic organization using non-invasive electrophysiological tools has had mixed success (for a critical overview see e.g. 52). Considering this challenging background, it is remarkable that all participants showed a stable pattern with marked post-stimulus decoding increases after ~50 ms. While a peak is reached around 100 ms post-sound onset, after which decoding accuracy slowly declines, it remains above chance for at least 700 ms. This observation is remarkable given the passive setting for the participant (i.e. no task involved with regard to the sound input) and the very transient nature of evoked responses to
sounds that are commonly used in cognitive neuroscience. Our analysis shows that neural activity patterns containing carrier-frequency specific information remain present for an extended time, putatively representing a memory trace of the sound that is available when new acoustic information impinges on the auditory system. This capacity is of key importance for forming associations across events, thereby enabling the encoding of the statistical regularity of the input stream 53,54.
The previous result underlines that non-invasive electrophysiological methods such as MEG can be used to decode low-level auditory features such as the carrier-frequency of tones. This corroborates and extends findings from the visual modality, for which successful decoding of low-level stimulus features such as contrast edge orientation has been demonstrated previously 55. However, the analysis leading to this conclusion included all sensors and was therefore spatially agnostic. We used an approach introduced by Marti and Dehaene 47 to describe informative activity at the source level. Based on the subsequent entropy-related effects identified in the time-generalization approach, we focussed on an earlier (W1, 50–125 ms) and a later time-window (W2, 125–333 ms). While informative activity was observed bilaterally, especially in auditory regions along the Superior Temporal Gyrus in both hemispheres, the pattern was stronger on the right side. Furthermore, on a descriptive level, informative activity spread more frontally in the later time-window, implying an involvement of hierarchically downstream regions. Overall, this analysis suggests that carrier-frequency specific information mainly originates from auditory cortical regions, but that regions not classically considered auditory processing regions may contain feature-specific information as well 5.
This analysis was relevant not only as a sanity check, but also because the trained classifiers were used and time-generalized to all entropy levels. While this approach yielded highly significant effects (see also below for discussion), decoding accuracy was not high in absolute terms, especially for the anticipatory and omission periods. However, it should be
noted that we refrained from the widespread practice of sub-averaging trials 50,56, which boosts classification accuracies significantly. When compared to cognitive neuroscientific M/EEG studies that perform decoding on genuine single trials and focus on group-level effects (rather than feature-optimizing at the individual level, as in BCI applications), the strength of our effects is comparable (e.g. 47,57).
(Anticipatory) predictions are formed in a spectrally sharply tuned manner
Using an MVPA approach with time-generalization allowed us to assess whether carrier-frequency related neural activity prior to or during a sound / omission is systematically modulated by the entropy level. When training and testing within regular (i.e. low-entropy) sound sequences, carry-over effects could be artificially introduced that would be erroneously interpreted as anticipation effects: in these conditions the preceding sound (with its carrier-frequency) already contains information about the upcoming sound. To circumvent this problem, we consistently used a classifier trained on sounds from the random sequence, i.e. where neural activity following a sound is not predictive of an upcoming sound, and applied it to all conditions. Using a regression analysis, we could derive in a time-generalized manner to what extent carrier-frequency specific decoding accuracy of the (omitted) sound was modulated by the entropy of the sound sequence in which the event was embedded. The act of predicting usually contains a notion of pre-activating relevant neural ensembles, a pattern that has been previously illustrated in the visual modality (e.g. 1,20). For the omission response, this was put forward by Bendixen et al. 49, even though the reported evoked response effects cannot be directly seen as signatures of pre-activation. Chouiter et al. 58 also used a decoding approach and showed an effect of frequency/pitch after expected but omitted sounds, but did not look for an anticipatory effect. In line with this pre-activation view, our main observation was that carrier-frequency specific information increased with increasing regularity of the tone sequence already in the pre-sound/omission period, clearly showing
sharp tuning of neural activity in anticipation of the expected sound. This effect was particularly pronounced for later training time periods (W2; see Figure 3), which contained informative activity also beyond auditory regions (e.g. frontal cortex; see Figure 2B). This finding critically extends the aforementioned previous research done in the visual modality, clearly establishing feature-specific tuning of anticipatory prediction processes in the auditory system. Our finding also supports and extends a recent high-field fMRI study in the auditory domain 33, where the lower temporal resolution of the technique did not permit the separation of pure prediction and pre-activation effects.
According to most predictive coding accounts, expected stimuli should lead to attenuated neural responses, which has been confirmed for auditory processing using evoked M/EEG (e.g. 14) or BOLD responses (e.g. 13). Thus, a reduction of carrier-frequency specific information could also be expected when sounds were embedded in a more ordered sound sequence. While such an association was descriptively observed in early training (W1) and testing-time (< 200 ms) intervals, it was statistically not significant, and the individual conditions displayed in Figure 3A also suggest that such a relationship is not strictly linear. This observation may be reconciled by dissociating the strength of neural responses (e.g. averaged evoked M/EEG or BOLD response) from the feature-specific information in the signal, as has been described in the visual modality: there, reduced neural responses in the visual cortex have been reported while, at the same time, representational information is enhanced 19. An enhancement of representational, i.e. carrier-frequency specific, information was observed in our study at late testing-time intervals, following ~500 ms after sound onset. This effect was broad in terms of the training-time window; however, it was largest for early intervals (W1; see Figure 3A). The late onset suggests this effect to be a consequence of the subsequent sound presentation, i.e. a reactivation of the carrier-frequency specific information within more ordered sound sequences, which would be crucial in establishing and maintaining neural representations of the statistical regularity of
the sound sequence. While this effect derived from the time-generalization analysis is not identical to the temporal decoding result described above, it is fully in line with it, in the sense that feature-specific information is temporally persistent, even in this passive setting, far beyond the typical time-windows examined in cognitive neuroscience.
Next to the anticipatory and post-sound periods, we were also interested in the strength of carrier-frequency specific information following an omission. Since no feedforward activity along the ascending auditory pathway can account for omission-related activity, such responses have been considered to reflect pure prediction signals 59. Suggestive evidence comes from an EEG experiment (36, but see also 37) in which sounds following button presses with a fixed delay could either be random or of a single identity. The authors showed evoked omission responses to be sensitive not only to the timing but also to the predicted identity of the stimulus. However, from the evoked response it is not possible to infer which feature-specific information the signal carries (see comment above). Our result significantly extends this research by illustrating that carrier-frequency specific information increases following omission onset the more regular the sound sequence is. Descriptively, this occurs rapidly following omission onset, with a peak at ~100 ms; however, further peaks can be observed approximately following the stimulation rate up to at least 600 ms. The late reactivations of the carrier-frequency specific information of the missing sound are in line with the temporally persisting effects described above, pointing to an enhanced association of past and upcoming stimulus information in the ordered sequence. However, in contrast to the sound-centered analysis, the post-omission entropy-related effects are mainly driven by the late training time-window (W2), analogous to the anticipatory effect. Thus, a speculation based on this temporal effect could be that, whereas reactivation of carrier-frequency specific information following sounds by further incoming sounds in an ordered sequence engages hierarchically upstream regions in the auditory system, omission-related carrier-frequency
specific activity engages downstream areas, including some conventionally considered non-auditory (e.g. frontal cortex).
Inter-individual variability of entropy encoding influences carrier-frequency related prediction effects
The overall individual ability to represent statistical regularities could have profound implications for various behavioural domains 54. While the main analysis pertained to decoding of carrier-frequency specific (low-level) information, we also addressed whether a representation of a more abstract feature, such as the sequence's entropy level, could also be decoded from the non-invasive data. Functionally, extracting regularities requires integration over a longer time period, and previous MEG works focussing on evoked responses have identified in particular slow (DC) shifts to reflect transitions from random to regular sound 60. This fits with our result showing that the entropy level of a sound sequence can be decoded above chance at virtually any time point, implying an ongoing (slow) process tracking regularities that is transiently increased following the presentation of a sound. Our across-participant correlation approach suggests that, indeed, the individual's disposition to correctly represent the level of regularity is linked to pre- and post-sound/omission engagement of carrier-frequency specific neural activity patterns. While some open questions remain (e.g. the pre-stimulus discrepancy between sound and omission correlation patterns), we consider this line of research very promising.
Taken together, the successful decoding of low- and high-level auditory information underlines the significant potential of applying MVPA tools to non-invasive electrophysiological data to address research questions in auditory cognitive neuroscience that would be difficult to pursue using conventional approaches. In particular, our approach may be a reliable and easy avenue to parametrize an individual's ability to represent statistical regularities, without the need to invoke behavioural responses that may be influenced by multiple non-specific factors. This could be especially valuable when studying
different groups for which conventional paradigms relying on overt behavior may be problematic, such as children or various patient groups (e.g. disorders of consciousness).
Conclusion
Predictive processes should (pre-)engage feature-specific neural assemblies in a top-down manner. However, only little direct evidence exists for this notion in the human auditory system 29,33. We significantly advance this state by introducing a hybrid regularity-modulation and omission paradigm, in which expectations of the upcoming carrier-frequency of tones were controlled in a parametric manner. Using MVPA, our results unequivocally show an increase of carrier-frequency specific information during anticipatory as well as (silent) omission periods the more regular the sequence becomes. Our findings and the outlined approach hold significant potential to address in depth further questions surrounding the role of internal models in auditory perception.
Methods and Materials
Participants
A total of 34 volunteers (16 females) took part in the experiment, giving written informed consent. At the time of the experiment, the average age was 26.6 ± 5.6 SD years. All participants reported no previous neurological or psychiatric disorder, and normal or corrected-to-normal vision. One subject was discarded from further analysis, since a first screening revealed that she had been exposed to a wrong entropy sequence (twice MP and no OR). The experimental protocol was approved by the ethics
committee of the University of Salzburg and was carried out in accordance with the Declaration of Helsinki.
Stimuli and experimental procedure
Before entering the magnetoencephalography (MEG) cabin, five head position indicator (HPI) coils were applied to the scalp. Anatomical landmarks (nasion and left/right pre-auricular points), the HPI locations, and around 300 headshape points were sampled using a Polhemus FASTTRAK digitizer. After a 5 min resting-state session (not reported in this study), the actual experimental paradigm started. The subjects watched a movie ("Cirque du Soleil: Worlds Away") while passively listening to tone sequences. Auditory stimuli were presented binaurally using MEG-compatible pneumatic in-ear headphones (SOUNDPixx, VPixx Technologies, Canada). This particular movie was chosen for its absence of speech and dialogue, and the soundtrack was substituted with the sound stimulation sequences. These sequences were composed of four different pure (sinusoidal) tones, ranging from 200 to 2000 Hz, logarithmically spaced (that is: 200 Hz, 431 Hz, 928 Hz, 2000 Hz), each lasting 100 ms (5 ms linear fade in / out). Tones were presented at a rate of 3 Hz. Overall, the participants were exposed to four blocks, each containing 4000 stimuli, with every block lasting about 22 min. Each block was balanced with respect to the number of presentations per tone frequency. Within each block, 10% of the stimuli were omitted, thus yielding 400 omission trials (100 per omitted sound frequency). While within each block the overall number of trials per sound frequency was set to be equal, blocks differed in the order of the tones, which was parametrically modulated in its entropy level using different transition matrices 61. In more detail, the random condition (RD; see Figure 1) was characterized by equal transition probabilities from one sound to another, thereby preventing any possibility of accurately predicting an upcoming stimulus (high entropy). In the ordered
condition (OR), presentation of one sound was followed with high (75%) probability by another specific sound (low entropy). Furthermore, two intermediate conditions were included (midminus and midplus, labelled MM and MP respectively 61). The probability on the diagonal was set to be equiprobable (25%) across all entropy conditions, thereby controlling for the influence of self-repetitions. The experiment was programmed in MATLAB 9.1 (The MathWorks, Natick, Massachusetts, U.S.A.) using the open-source Psychophysics Toolbox 62.
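The transition-matrix manipulation described above can be sketched as follows. This is an illustrative Python/NumPy sketch, not the original MATLAB stimulation code; in particular, which tone receives the 75% transition probability in the ordered condition is an assumption made here for illustration (a cyclic successor).

```python
import numpy as np

rng = np.random.default_rng(42)
freqs = [200, 431, 928, 2000]  # Hz, log-spaced carrier frequencies

# Random (RD) condition: all transitions equiprobable (high entropy)
T_rd = np.full((4, 4), 0.25)

# Ordered (OR) condition: self-repetition stays at 25% (as in all
# conditions) and the remaining 75% goes to one specific successor
# tone (low entropy). The cyclic successor is an assumption.
T_or = np.zeros((4, 4))
for i in range(4):
    T_or[i, i] = 0.25
    T_or[i, (i + 1) % 4] = 0.75

def generate_sequence(T, n_tones, p_omit=0.10, rng=rng):
    """Sample a tone-index sequence from transition matrix T,
    silencing ~10% of positions as omissions (None). The underlying
    Markov chain keeps running through omitted positions."""
    seq, state = [], int(rng.integers(4))
    for _ in range(n_tones):
        state = rng.choice(4, p=T[state])
        seq.append(None if rng.random() < p_omit else state)
    return seq

seq = generate_sequence(T_or, 4000)  # one block of 4000 stimuli
```

In the ordered sequence a sound thus strongly constrains the identity of the next sound, while in the random sequence it carries no predictive information.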
MEG data acquisition and preprocessing
The brain magnetic signal was recorded at 1000 Hz (hardware filters: 0.1–330 Hz) in a standard passive magnetically shielded room (AK3b, Vacuumschmelze, Germany) using a whole-head MEG system (Elekta Neuromag Triux, Elekta Oy, Finland). Signals were captured by 102 magnetometers and 204 orthogonally placed planar gradiometers at 102 different positions. We used the signal space separation algorithm implemented in the Maxfilter program (version 2.2.15), provided by the MEG manufacturer, to remove external noise from the MEG signal (mainly 16.6 Hz, and 50 Hz plus harmonics) and to realign the data to a common standard head position (trans default Maxfilter parameter) across the different blocks, based on the head position measured at the beginning of each block 63.
Data analysis was done using the Fieldtrip toolbox 64 (git version 20170919) and in-house built scripts. First, a high-pass filter at 0.1 Hz (6th-order zero-phase Butterworth filter) was applied to the continuous data. Subsequently, for independent component analysis, the continuous data were chunked into 10 s blocks and down-sampled to 256 Hz. The resulting components were manually scrutinized to identify eye blinks, eye movements, heartbeat, and 16⅔ Hz train power supply artifacts. Finally, the continuous data were segmented from 1000 ms before to 1000 ms after target stimulation onset and the artifactual components projected out (3.0 ± 1.5 SD components removed on average per
subject). The resulting data epochs were down-sampled to 100 Hz for the decoding analysis. Finally, the epoched data were low-pass filtered at 30 Hz (6th-order zero-phase Butterworth filter) prior to further analysis. Following these preprocessing steps, no trials had to be rejected 65.
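The filtering and resampling chain can be sketched as below. This is a Python/SciPy illustration of the stated parameters (0.1 Hz high-pass, down-sampling to 100 Hz, 30 Hz low-pass, 6th-order zero-phase Butterworth); the original pipeline used Fieldtrip in MATLAB, and the ICA-based artifact removal and epoching steps are omitted here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly

fs = 1000.0  # acquisition rate (Hz)

def zero_phase_butter(data, cutoff, fs, btype, order=6):
    """6th-order Butterworth applied forward and backward
    (zero-phase; note that two passes double the attenuation)."""
    sos = butter(order, cutoff, btype=btype, fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

# Toy continuous signal: one channel, 10 s at 1000 Hz
x = np.random.default_rng(1).standard_normal(int(10 * fs))

x_hp = zero_phase_butter(x, 0.1, fs, "highpass")         # 0.1 Hz high-pass
x_ds = resample_poly(x_hp, up=1, down=10)                # 1000 -> 100 Hz
x_lp = zero_phase_butter(x_ds, 30.0, 100.0, "lowpass")   # 30 Hz low-pass
```

Epochs from -1000 ms to +1000 ms around each event would then simply be slices of the filtered, down-sampled array.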
Multivariate Pattern Analysis (MVPA)
We used multivariate pattern analysis as implemented in the MVPA-Light package (https://github.com/treder/MVPA-Light, commit 003a7c) 66,67, forked and modified in order to extract the classifier weights (https://github.com/gdemarchi/MVPA-Light/tree/devel) 47. MVPA decoding was performed on single-trial sensor-level (102 magnetometers) data using a time-generalization analysis 46. Overall, three decoding approaches were taken:
1. Entropy-level decoding: To investigate how brain activity is modulated by the different entropy levels, we kept only trials with sound presentation (removing omission trials), or only the omissions (discarding the sounds). We defined four decoding targets (classes) based on block type (4 contexts: RD, MM, MP, OR). Sounds that were preceded by a rare (10% of the time) omission were discarded, whereas all omissions were kept in the omission entropy decoding.
2. Sound-to-sound decoding: To test whether we could classify carrier frequency in general, we defined four targets (classes) for the decoding, related to the carrier frequency of the sound presented on each trial (4 carrier frequencies). In order to avoid any potential carry-over effect from the previous sound, the classifier was trained only on the random (RD) sounds. Here, too, sounds preceded by an omission were discarded. To avoid further imbalance, the numbers of trials preceding a target sound were equalized, e.g. the 928 Hz sound trials were preceded by the same number of 200 Hz, 431 Hz, 928 Hz and 2000 Hz trials (N−1, N balancing).
3. Sound-to-omission decoding: To test whether omission periods contain carrier-frequency specific neural activity, omission trials were labeled according to the carrier frequency of the sound that would have been presented. As with the sound-to-sound decoding, the random sound trials were used to train the classifier, which was subsequently applied to a test set of trials where sounds were not presented, i.e. omissions, using the same balancing schemes as before.
Using a multiclass linear discriminant analysis (LDA) classifier, we performed a decoding analysis at each time point around stimulus / omission onset. A five-fold cross-validation scheme, repeated five times, was applied for the entropy-level and the random sound-to-sound decoding, whereas for training on the RD sounds and testing on MM, MP, and OR sounds, as well as for the sound-to-omission decoding, no cross-validation was performed, given the cross-decoding nature of the latter testing of the classifier. For the sound-to-omission decoding analysis, the training set was restricted to random sound trials and the testing set contained only omissions. In all cases, training and testing partitions always contained different sets of data.
Classification accuracy for each subject was averaged at the group level and reported to depict the classifier's ability to decode over time (i.e. time-generalization analysis at the sensor level). The time-generalization method was used to study the ability of each LDA classifier, trained at a given time point in the training set, to generalize to every time point in the testing set 46. For the sound-to-sound and sound-to-omission decoding, time generalization was calculated for each entropy level separately, resulting in four generalization matrices, one per entropy level. This was necessary to assess whether the contextual sound sequence influences classification accuracy in a systematic manner.
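The time-generalization scheme (train a classifier at each training time point, test it at every testing time point) can be sketched as follows. The study used multiclass LDA from the MATLAB MVPA-Light toolbox; the scikit-learn sketch below uses random placeholder data and reduced dimensions, so accuracies hover around the 25% chance level for four classes.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_train_trials, n_test_trials, n_chan, n_times = 200, 80, 102, 20

# Training set: e.g. sounds from the random (RD) sequence
X_train = rng.standard_normal((n_train_trials, n_chan, n_times))
y_train = rng.integers(4, size=n_train_trials)  # 4 carrier frequencies

# Test set: e.g. omission trials labelled by the omitted sound
X_test = rng.standard_normal((n_test_trials, n_chan, n_times))
y_test = rng.integers(4, size=n_test_trials)

# Time generalization: one classifier per training time point,
# evaluated at every testing time point
acc = np.empty((n_times, n_times))
for t_train in range(n_times):
    clf = LinearDiscriminantAnalysis()
    clf.fit(X_train[:, :, t_train], y_train)
    for t_test in range(n_times):
        acc[t_train, t_test] = clf.score(X_test[:, :, t_test], y_test)
```

Because training and test sets come from different trial pools (cross-decoding), no cross-validation is needed here; for within-condition decoding a repeated five-fold scheme would wrap this loop.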
Decoding weights projection analysis
For relevant time frames, the training decoder weights were extracted and projected into source space in the following way. For each participant, realistically shaped, single-shell headmodels 68 were computed by coregistering the participants' headshapes either with their structural MRI (15 participants) or, when no individual MRI was available (18 participants), with a standard brain from the Montreal Neurological Institute (MNI, Montreal, Canada), warped to the individual headshape. A grid with 1.5 cm resolution based on an MNI template brain was morphed to fit the brain volume of each participant. LCMV spatial filters were computed from the preprocessed data of the training random sounds (for the sound-to-sound and sound-to-omission decoding), or from all the sound or omission data (for the entropy decoding) 69. Instead of multiplying the sensor-level single-trial time series by the filters obtained above to create so-called virtual sensors, as is commonly done in the field, we multiplied the sensor-level decoding-weight time series, corrected by the covariance matrix 70, by the spatial filters to obtain an "informative activity" pattern 47. Baseline normalization was performed only for visualization purposes (subtraction of 50 ms pre-stimulus activity).
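The core of this projection can be sketched in two matrix products: the extraction weights are first converted into an interpretable activation pattern by multiplication with the sensor-data covariance matrix (the covariance "correction" of ref. 70), and the sensor-level pattern is then passed through the LCMV spatial filters. All values below are random placeholders standing in for real MEG data and filters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chan, n_src = 102, 500  # magnetometers, source-grid points

# Sensor-level data covariance and one decoder weight vector
X = rng.standard_normal((1000, n_chan))  # samples x channels
cov = np.cov(X, rowvar=False)
w = rng.standard_normal(n_chan)          # LDA weights at one time point

# Covariance correction: extraction weights -> activation pattern
pattern = cov @ w

# Project the sensor pattern through (hypothetical) LCMV spatial
# filters to obtain "informative activity" per source point
lcmv_filters = rng.standard_normal((n_src, n_chan))
informative = lcmv_filters @ pattern
```

Repeating this over time points yields the informative-activity time series that is then baseline-normalized for display.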
Statistical analysis
For the sound and omission decoding part, we tested the dependence on entropy level using a regression test (depsamplesregT in Fieldtrip). Results for sounds and omissions were sorted from random to ordered, respectively. In order to account for multiple comparisons, we used a non-parametric cluster permutation test 71, as implemented in Fieldtrip, with 10000 permutations and p < .025 to threshold the clusters, on a pseudo time-frequency structure (testing time vs. training time accuracy, 2D). Moreover, to
investigate the temporal dynamics of the single entropy levels in the regression we ran the
statistics above averaging across two different time windows ("avgoverfreq' in fieldtrip),
namely an early time window "W1" (from 75 ms to 125 ms post random sound onset in the
training set) for the soundtosound decoding, and a later window "W2" (from 125 ms to 333
ms).
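The regression-plus-cluster-permutation logic can be sketched as follows. This is a simplified NumPy stand-in for FieldTrip's dependent-samples regression statistic with cluster correction, not the authors' pipeline: the accuracies are simulated noise, the subject count, matrix size, cluster-forming threshold and 1000 permutations are illustrative placeholders, and clusters are defined by simple 4-connectivity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical accuracies: subjects x entropy levels x training time x testing time
n_subj, n_levels, n_train, n_test = 33, 4, 12, 12
acc = rng.normal(0.25, 0.02, (n_subj, n_levels, n_train, n_test))
levels = np.arange(n_levels)  # random ... ordered, coded 0..3

# Dependent-samples regression t: per subject, regress accuracy on entropy
# level in every (train, test) cell, then t-test the slopes against zero.
x = levels - levels.mean()
slopes = np.einsum('l,slij->sij', x, acc) / (x @ x)  # per-subject slopes
t_obs = slopes.mean(0) / (slopes.std(0, ddof=1) / np.sqrt(n_subj))

def max_cluster_mass(tmap, thresh):
    # Sum |t| within suprathreshold 4-connected clusters; return the largest mass.
    mask = np.abs(tmap) > thresh
    seen = np.zeros_like(mask)
    best = 0.0
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and not seen[i, j]:
                stack, mass = [(i, j)], 0.0
                seen[i, j] = True
                while stack:
                    a, b = stack.pop()
                    mass += abs(tmap[a, b])
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        u, v = a + da, b + db
                        if (0 <= u < mask.shape[0] and 0 <= v < mask.shape[1]
                                and mask[u, v] and not seen[u, v]):
                            seen[u, v] = True
                            stack.append((u, v))
                best = max(best, mass)
    return best

thresh = 2.04                  # roughly a two-tailed p < .05 t-threshold for df = 32
null = np.empty(1000)
for p in range(1000):
    flips = rng.choice([-1, 1], n_subj)[:, None, None]
    s = slopes * flips         # sign-flip subject slopes to build the null
    t_perm = s.mean(0) / (s.std(0, ddof=1) / np.sqrt(n_subj))
    null[p] = max_cluster_mass(t_perm, thresh)

p_cluster = (null >= max_cluster_mass(t_obs, thresh)).mean()
```

Comparing only the maximal cluster mass against its permutation distribution is what controls the family-wise error over the full training-time by testing-time matrix.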
Acknowledgments

We thank Dr. Anne Hauswald, Dr. Anne Weise and Mrs. Marta Partyka for helpful comments on earlier versions of the manuscript, and Miss Hayley Prins for proofreading it. We thank Mr. David Opferkuch and Mr. Manfred Seifter for their help with the measurements.
Author information

Affiliations

1 Centre for Cognitive Neuroscience and Division of Physiological Psychology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria
2 Lyon Neuroscience Research Center, Brain Dynamics and Cognition team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, F-69000, Lyon, France

Contributions

G.D. and N.W. designed the study; G.D. performed the experiments; G.D., G.S. and N.W. designed and performed the analyses; G.D., G.S. and N.W. wrote the manuscript.
Competing interests

The authors declare no competing interests.

Corresponding author

gianpaolo.demarchi@sbg.ac.at (G.D.)

Correspondence and requests for materials

Further information and requests for resources should be directed to and will be fulfilled by Gianpaolo Demarchi (gianpaolo.demarchi@sbg.ac.at).
References

1. Ekman, M., Kok, P. & de Lange, F. P. Time-compressed preplay of anticipated events in human primary visual cortex. Nat. Commun. 8, 15276 (2017).
2. Felleman, D. J. & Van Essen, D. C. Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex 1, 1–47 (1991).
3. Ungerleider, L. G. & Haxby, J. V. ‘What’ and ‘where’ in the human brain. Curr. Opin. Neurobiol. 4, 157–165 (1994).
4. Rauschecker, J. P. & Tian, B. Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc. Natl. Acad. Sci. USA 97, 11800–11806 (2000).
5. Plakke, B. & Romanski, L. M. Auditory connections and functions of prefrontal cortex. Front. Neurosci. 8, 199 (2014).
6. Fontolan, L., Morillon, B., Liegeois-Chauvel, C. & Giraud, A.-L. The contribution of frequency-specific activity to hierarchical information processing in the human auditory cortex. Nat. Commun. 5, 4694 (2014).
7. Recasens, M., Gross, J. & Uhlhaas, P. J. Low-Frequency Oscillatory Correlates of Auditory Predictive Processing in Cortical–Subcortical Networks: A MEG Study. Sci. Rep. 8, 14007 (2018).
8. Rao, R. P. N. & Ballard, D. H. Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2, 79–87 (1999).
9. Friston, K. A theory of cortical responses. Philos. Trans. R. Soc. B Biol. Sci. 360, 815–836 (2005).
10. Wolpert, D. M., Ghahramani, Z. & Jordan, M. I. An internal model for sensorimotor integration. Science 269, 1880–1882 (1995).
11. Wolpert, D. M. & Miall, R. C. Forward Models for Physiological Motor Control. Neural Netw. 9, 1265–1279 (1996).
12. Spratling, M. W. A review of predictive coding algorithms. Brain Cogn. 112, 92–97 (2017).
13. Alink, A., Schwiedrzik, C. M., Kohler, A., Singer, W. & Muckli, L. Stimulus predictability reduces responses in primary visual cortex. J. Neurosci. 30, 2960–2966 (2010).
14. Todorovic, A. & de Lange, F. P. Repetition Suppression and Expectation Suppression Are Dissociable in Time in Early Auditory Evoked Fields. J. Neurosci. 32, 13389–13395 (2012).
15. Summerfield, C. & de Lange, F. P. Expectation in perceptual decision making: neural and computational mechanisms. Nat. Rev. Neurosci. 15, 745–756 (2014).
16. Smith, F. W. & Muckli, L. Nonstimulated early visual areas carry information about surrounding context. Proc. Natl. Acad. Sci. USA 107, 20099–20103 (2010).
17. Peelen, M. V. & Kastner, S. A neural basis for real-world visual search in human occipitotemporal cortex. Proc. Natl. Acad. Sci. USA 108, 12125–12130 (2011).
18. Kok, P., Failing, M. F. & de Lange, F. P. Prior Expectations Evoke Stimulus Templates in the Primary Visual Cortex. J. Cogn. Neurosci. 26, 1546–1554 (2014).
19. Kok, P., Jehee, J. F. M. & de Lange, F. P. Less Is More: Expectation Sharpens Representations in the Primary Visual Cortex. Neuron 75, 265–270 (2012).
20. Kok, P., Mostert, P. & de Lange, F. P. Prior expectations induce prestimulus sensory templates. Proc. Natl. Acad. Sci. 201705652 (2017). doi:10.1073/pnas.1705652114
21. Jaramillo, S. & Zador, A. M. The auditory cortex mediates the perceptual effects of acoustic temporal expectation. Nat. Neurosci. 14, 246–251 (2011).
22. Rubin, J., Ulanovsky, N., Nelken, I. & Tishby, N. The Representation of Prediction Error in Auditory Cortex. PLoS Comput. Biol. 12, e1005058 (2016).
23. Fritz, J. B., Elhilali, M., David, S. V. & Shamma, S. A. Does attention play a role in dynamic receptive field adaptation to changing acoustic salience in A1? Hear. Res. 229, 186–203 (2007).
24. Eliades, S. J. & Wang, X. Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations. J. Neurophysiol. 89, 2194–2207 (2003).
25. Schneider, D. M. & Mooney, R. Motor-related signals in the auditory system for listening and learning. Curr. Opin. Neurobiol. 33, 78–84 (2015).
26. Eliades, S. J. & Wang, X. Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature 453, 1102–1106 (2008).
27. Schneider, D. M., Sundararajan, J. & Mooney, R. A cortical filter that learns to suppress the acoustic consequences of movement. Nature 561, 391 (2018).
28. Whitford, T. J. et al. Neurophysiological evidence of efference copies to inner speech. Elife 6, (2017).
29. Leonard, M. K., Baud, M. O., Sjerps, M. J. & Chang, E. F. Perceptual restoration of masked speech in human cortex. Nat. Commun. 7, 13619 (2016).
30. Kraemer, D. J. M., Macrae, C. N., Green, A. E. & Kelley, W. M. Musical imagery: sound of silence activates auditory cortex. Nature 434, 158 (2005).
31. Müller, N. et al. You can’t stop the music: reduced auditory alpha power and coupling between auditory and memory regions facilitate the illusory perception of music during noise. Neuroimage 79, 383–393 (2013).
32. Voisin, J., Bidet-Caulet, A., Bertrand, O. & Fonlupt, P. Listening in silence activates auditory areas: a functional magnetic resonance imaging study. J. Neurosci. 26, 273–278 (2006).
33. Berlot, E., Formisano, E. & De Martino, F. Mapping Frequency-Specific Tone Predictions in the Human Auditory Cortex at High Spatial Resolution. J. Neurosci. 38, 4934–4942 (2018).
34. Heilbron, M. & Chait, M. Great expectations: Is there evidence for predictive coding in auditory cortex? Neuroscience (2017). doi:10.1016/j.neuroscience.2017.07.061
35. Näätänen, R., Paavilainen, P., Rinne, T. & Alho, K. The mismatch negativity (MMN) in basic research of central auditory processing: A review. Clin. Neurophysiol. 118, 2544–2590 (2007).
36. SanMiguel, I., Saupe, K. & Schröger, E. I know what is missing here: electrophysiological prediction error signals elicited by omissions of predicted “what” but not “when”. Front. Hum. Neurosci. 7, (2013).
37. SanMiguel, I., Widmann, A., Bendixen, A., Trujillo-Barreto, N. & Schröger, E. Hearing Silences: Human Auditory Processing Relies on Preactivation of Sound-Specific Brain Activity Patterns. J. Neurosci. 33, 8633–8639 (2013).
38. Recasens, M. & Uhlhaas, P. J. Test-retest reliability of the magnetic mismatch negativity response to sound duration and omission deviants. Neuroimage 157, 184–195 (2017).
39. Bendixen, A., Scharinger, M., Strauß, A. & Obleser, J. Prediction in the service of comprehension: Modulated early brain responses to omitted speech segments. Cortex 53, 9–26 (2014).
40. Raij, T., McEvoy, L., Mäkelä, J. P. & Hari, R. Human auditory cortex is activated by omission of auditory stimuli. Brain Res. 745, 134–143 (1997).
41. Wacongne, C. et al. Evidence for a hierarchy of predictions and prediction errors in human cortex. Proc. Natl. Acad. Sci. 108, 20754–20759 (2011).
42. Todorovic, A., van Ede, F., Maris, E. & de Lange, F. P. Prior Expectation Mediates Neural Adaptation to Repeated Sounds in the Auditory Cortex: An MEG Study. J. Neurosci. 31, 9118–9123 (2011).
43. Chennu, S. et al. Silent Expectations: Dynamic Causal Modeling of Cortical Prediction and Attention to Sounds That Weren’t. J. Neurosci. 36, 8305–8316 (2016).
44. Auksztulewicz, R. et al. The Cumulative Effects of Predictability on Synaptic Gain in the Auditory Processing Stream. J. Neurosci. 37, 6751–6760 (2017).
45. Barascud, N., Pearce, M. T., Griffiths, T. D., Friston, K. J. & Chait, M. Brain responses in humans reveal ideal observer-like sensitivity to complex acoustic patterns. Proc. Natl. Acad. Sci. USA 113, E616–E625 (2016).
46. King, J.-R. & Dehaene, S. Characterizing the dynamics of mental representations: The temporal generalization method. Trends Cogn. Sci. 18, 203–210 (2014).
47. Marti, S. & Dehaene, S. Discrete and continuous mechanisms of temporal selection in rapid visual streams. Nat. Commun. 8, (2017).
48. Fritz, J. B., Elhilali, M. & Shamma, S. A. Adaptive changes in cortical receptive fields induced by attention to complex sounds. J. Neurophysiol. 98, 2337–2346 (2007).
49. Bendixen, A., Schröger, E. & Winkler, I. I Heard That Coming: Event-Related Potential Evidence for Stimulus-Driven Prediction in the Auditory System. J. Neurosci. 29, 8447–8451 (2009).
50. Grootswagers, T., Wardle, S. G. & Carlson, T. A. Decoding Dynamic Brain Patterns from Evoked Responses: A Tutorial on Multivariate Pattern Analysis Applied to Time Series Neuroimaging Data. J. Cogn. Neurosci. 29, 677–697 (2017).
51. Saenz, M. & Langers, D. R. M. Tonotopic mapping of human auditory cortex. Hear. Res. 307, 42–52 (2014).
52. Lütkenhöner, B., Krumbholz, K. & Seither-Preisler, A. Studies of tonotopy based on wave N100 of the auditory evoked field are problematic. Neuroimage 19, 935–949 (2003).
53. Santolin, C. & Saffran, J. R. Constraints on Statistical Learning Across Species. Trends Cogn. Sci. 22, 52–63 (2018).
54. Frost, R., Armstrong, B. C., Siegelman, N. & Christiansen, M. H. Domain generality versus modality specificity: the paradox of statistical learning. Trends Cogn. Sci. 19, 117–125 (2015).
55. Cichy, R. M., Ramirez, F. M. & Pantazis, D. Can visual information encoded in cortical columns be decoded from magnetoencephalography data in humans? NeuroImage 121, 193–204 (2015).
56. Cichy, R. M., Ramirez, F. M. & Pantazis, D. Can visual information encoded in cortical columns be decoded from magnetoencephalography data in humans? NeuroImage 121, 193–204 (2015).
57. Kaiser, D., Oosterhof, N. N. & Peelen, M. V. The Neural Dynamics of Attentional Selection in Natural Scenes. J. Neurosci. 36, 10522–10528 (2016).
58. Chouiter, L. et al. Experience-based Auditory Predictions Modulate Brain Activity to Silence as Do Real Sounds. J. Cogn. Neurosci. 27, 1968–1980 (2015).
59. Wacongne, C., Changeux, J.-P. & Dehaene, S. A Neuronal Model of Predictive Coding Accounting for the Mismatch Negativity. J. Neurosci. 32, 3665–3678 (2012).
60. Barascud, N., Pearce, M. T., Griffiths, T. D., Friston, K. J. & Chait, M. Brain responses in humans reveal ideal observer-like sensitivity to complex acoustic patterns. Proc. Natl. Acad. Sci. 113, E616–E625 (2016).
61. Nastase, S., Iacovella, V. & Hasson, U. Uncertainty in visual and auditory series is coded by modality-general and modality-specific neural systems. Hum. Brain Mapp. 35, 1111–1128 (2014).
62. Brainard, D. H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997).
63. Cichy, R. M. & Pantazis, D. Multivariate pattern analysis of MEG and EEG: A comparison of representational structure in time and space. NeuroImage 158, 441–454 (2017).
64. Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J.-M. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, (2011).
65. Gross, J. et al. Good practice for conducting and reporting MEG research. NeuroImage 65, 349–363 (2013).
66. Blankertz, B., Lemm, S., Treder, M., Haufe, S. & Müller, K.-R. Single-trial analysis and classification of ERP components: A tutorial. NeuroImage 56, 814–825 (2011).
67. Treder, M. S., Porbadnigk, A. K., Shahbazi Avarvand, F., Müller, K.-R. & Blankertz, B. The LDA beamformer: Optimal estimation of ERP source time series using linear discriminant analysis. NeuroImage 129, 279–291 (2016).
68. Nolte, G. The magnetic lead field theorem in the quasi-static approximation and its use for magnetoencephalography forward calculation in realistic volume conductors. Phys. Med. Biol. 48, 3637–3652 (2003).
69. Van Veen, B. D., van Drongelen, W., Yuchtman, M. & Suzuki, A. Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans. Biomed. Eng. 44, 867–880 (1997).
70. Haufe, S. et al. On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage 87, 96–110 (2014).
71. Maris, E. & Oostenveld, R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods 164, 177–190 (2007).