Early neuronal responses in right limbic structures mediate harmony incongruity processing in musical experts
Clara E. James a,b,⁎, Juliane Britz a,c, Patrik Vuilleumier a,c, Claude-Alain Hauert a,b, Christoph M. Michel a,c
a Geneva Neuroscience Center, University of Geneva, Switzerland
b Faculty of Psychology and Educational Sciences, University of Geneva, Switzerland
c Department of Fundamental and Clinical Neurosciences, University of Geneva, Switzerland
Article info
Article history:
Received 27 November 2007
Revised 3 June 2008
Accepted 15 June 2008
Available online 1 July 2008
Abstract
In western tonal music, musical phrases end with an explicit harmonic consequent which is highly expected.
As such expectation is a consequence of musical background, cerebral processing of incongruities of musical
grammar might be a function of expertise. We hypothesized that a subtle incongruity of standard closure
should evoke a profound and rapid reaction in an expert's brain. If such a reaction is due to neuroplasticity as
a consequence of musical training, it should be correlated with distinctive activations in sensory, motor and/
or cognitive function related brain areas in response to the incongruent closure. Using event related potential
(ERP) source imaging, we determined the temporal dynamics of neuronal activity in highly trained pianists
and musical laymen in response to syntactic harmonic incongruities in expressive music, which were easily
detected by the experts but not by the laymen. Our results revealed that closure incongruity evokes a
selective early response in musical experts, characterized by a strong, right lateralized negative ERP
component. Statistical source analysis could demonstrate putative contribution to the generation of this
component in right temporal–limbic areas, encompassing hippocampal complex and amygdala, and in right
insula. Its early onset (~200 ms) preceded responses in frontal areas that may reflect more conscious
processing. These results go beyond previous work demonstrating that musical training can change activity
of sensory and motor areas during musical or audio-motor tasks, and suggest that functional plasticity in
right medial–temporal structures and insula also modulates processing of subtle harmonic incongruities.
© 2008 Elsevier Inc. All rights reserved.
Introduction
The special allotment of high level musical training and its ensuing
abilities confer a unique role to musical experts in the research on
experience-dependent changes in the brain (Munte et al., 2002;
Schlaug, 2001). Such research has hitherto been focused on effects
arising in auditory and motor areas of the cerebral cortex, and on
audio-motor coordination, showing enhanced activation of these
brain structures in trained musicians during musical or audio-motor
tasks (Haueisen and Knosche, 2001; Lotze et al., 2003; Schneider et al.,
2002; Zatorre et al., 2007). However, musical activities are not
confined to perception and motor skills. Cognitive and emotive
aspects also play a crucial role in music perception and expression,
and neuronal systems associated with these aspects may thus also be
shaped by experience-driven plasticity. Although it is supposed that
musical tonal contexts are maintained in brain regions that integrate
sensory, cognitive and affective functions (Janata et al., 2002a), it is not
clear which of these functions are shaped by musical training and
what the time-course of these experience-dependent responses in the
brain is.
Tonality is a system of composing music according to hierarchical
pitch relationships around a key “center” or tonic. Whether the basic
foundations of tonality, namely the search for stability or consonance
(pleasantness), are biological or learned is a matter of debate
(McDermott and Hauser, 2005; Zentner and Kagan, 1996). In western
tonal music, strong hierarchical relationships, governed by syntactical
rules, exist within and between chords (i.e. when musical pitches
sound simultaneously). A listener's expectation is based on the most
common sequences and composition of chords within a certain
context. Tonality rules were most concise in the classical period
(1730–1820). At present, however, most popular music is still based on
classical tonality, and therefore tonality rules are well known by the
general population exposed to these stimuli, even without any explicit
training (Tillmann et al., 2000). Nine-month-old infants of western
culture already show a preference for stimuli based on the western
diatonic scale (Trehub et al., 1999), and adults and children without
any formal training are able to detect music-syntactically irregular
chords (Bigand et al., 1999; Bigand, 2003; Koelsch et al., 2000, 2005).
In some contexts, musical laymen, as a consequence of mere exposure,
demonstrated sophisticated musical knowledge (Bigand, 2003).
No consensus exists concerning the musical capacities of untrained
listeners; the choice of experimental methods and of musical stimuli
might explain this (Bigand, 2003). We hypothesized that subtle
syntactical transgressions of musical grammar within a complex
expressive musical context might be distinctively apprehended by
musical experts, who have incorporated the syntactical rule system
more extensively due to intensive training. We also anticipated that
the presentation of polyphonic piano pieces to expert pianists would
enhance expertise-specific responses. Such differential sensitivity can
best be observed at musical closure, where a very specific harmonic
consequent is expected (Meyer, 1956).
NeuroImage 42 (2008) 1597–1608
⁎ Corresponding author. FPSE, University of Geneva, Uni Mail, 40 Bd du Pont-d'Arve,
CH-1211, Genève 4, Switzerland. Fax: +41 22 3799229.
E-mail address: Clara.James@pse.unige.ch (C.E. James).
1053-8119/$ – see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.neuroimage.2008.06.025
Tonal music formulas that signify the end of a phrase involve
acknowledged conventions, especially of harmonic nature, conveying
a sense of completion. Such end formulas are called cadences and
consist of a particular series of chords. At the very end of a musical
piece, any cadence other than an authentic cadence, which comprises a
dominant chord (scale degree V, chord built on the fifth note from the key
center) followed by a tonic chord (scale degree I, chord built on the
key center), will fail to provide full release from previously generated
harmonic tensions.
The rationale of studies on tonal expectancy violations in music is
based on the hypothesis that frustration of expectation is a main root
of cognitive, affective and aesthetic responses to music (Meyer, 1956).
But in order for these responses to arise, one should be able to
appraise the transgression. Moreover, the neural processes involved in
these putative cognitive and affective responses to musical violations
remain unknown. We hypothesized that a subtle incongruity of
standard closure in an expressive and complex musical context should
evoke a profound inevitable reaction in an expert, whereas an auditor
lacking musical training may hardly detect it.
Most research on violation of syntax has been done on highly
controlled chord sequences (Koelsch et al., 2001, 2007; Poulin-
Charronnat et al., 2006; Regnault et al., 2001; Tillmann et al., 2006),
and demonstrated specific physiological responses to irregularities in
musical syntax in musicians and non-musicians. Some studies using
chord sequences as stimuli found clear behavioral and/or early
electrophysiological differences in detecting subtle syntactic incongruities
as a function of expertise (Koelsch et al., 2002, 2007); others
found smaller or no such differences between musicians and non-musicians
(Bigand et al., 1999; Bigand, 2003; Regnault et al., 2001). We
hypothesized that offering subjects a rich and expressive musical
context might evoke more natural and complete responses, compared
to chord sequences, and could therefore induce stronger cognitive and
affective reactions in case of expectancy violations. To test this
hypothesis, a series of expressive polyphonic piano pieces of different
character and length were composed by a professional composer for
our purposes, according to the rules of classical style. Due to the
relatively long duration of these expressive compositions, a strong and
close to “concert hall” musical expectancy for the terminal chord could
be built up. All pieces were presented both with a regular terminal
chord and with a subtle syntactic harmonic incongruity thereof. We
use the term syntactic, because the different harmonic incongruities
applied to the terminal chord resulted in closely related, relatively
consonant and thus acoustically pleasant chords. However, they did
not consist of a tonic chord, which constitutes the only possible
regular ending according to classical musical grammar.
We expected behavioral and cerebral responses to incongruous
musical stimuli to change with increased expertise as a consequence
of experience and functional brain plasticity. At the behavioral level
we predicted increased accuracy as a function of expertise. At the
cerebral level shorter latencies and increased amplitudes of specific
ERPs were anticipated in experts compared to laymen. We expected
to obtain an ERAN-like (early right-anterior negativity; Koelsch
et al., 2001) ERP component with higher amplitude and possibly
earlier latency in experts, as has been demonstrated in response to
chord sequences (Koelsch et al., 2002). The ERAN has been repeatedly
elicited in response to syntactically irregular chords in musicians and
non-musicians (Koelsch et al., 2001, 2002, 2007). We speculated that a
modulation of the ERAN's scalp configuration might occur due to the
rich musical material used here and that this modulation might be
stronger or distinct for the professional musicians. As a sustained
difference in scalp voltage topography indicates that different
neuronal generators are active, we expected distinctive neuronal
correlates for these effects in experts involving not only motor and
sensory areas, as described in the literature for expert instrumentalists
(Haueisen and Knosche, 2001; Lotze et al., 2003; Schneider et al.,
2002; Zatorre et al., 2007), but also areas associated with cognitive
and emotive processing, possibly involving medial–temporal regions
that are known to be involved in higher-order pitch processing,
memory and affect (Blood et al., 1999; Borchgrevink, 1982; Gosselin et
al., 2006; Janata et al., 2002a; Martin and Morris, 2002; Wieser, 2003).
Here we studied thirteen professional pianists from top-rank Swiss
and French conservatories and thirteen musical laymen. Multichannel
EEG was continuously recorded while randomized series of regular
and incongruent musical stimuli were presented. Event-related
potentials (ERPs) to the terminal chord were analyzed in terms of
scalp potentials as well as in terms of estimated intracranial sources, by
applying statistical parametric mapping to distributed inverse
solutions.
Methods
Participants
Twenty-six right-handed male volunteers gave written informed
consent to participate in this experiment and received monetary
compensation. We only recruited men because sex is known to
influence neurophysiological responses (Ortigue et al., 2004, 2005),
and specifically so for music processing (Koelsch et al., 2003). The
group consisted of thirteen professional pianists (27.5 ± 4.8 years) and
thirteen musical laymen (27.7 ± 6.3 years). The laymen had little (5
subjects, <2 years) or no musical education (8 subjects) and rarely
listened to classical music intentionally (22 ± 14 min/week). Expert
subjects started studying the piano at 7.4 ± 3.0 years, and peak value of
daily training was 6.3 ± 1.3 h per day. These pianists were advanced
conservatory students, established artists or teachers and received
training at the Conservatoires Supérieurs de Genève, Lausanne,
Neuchâtel and Paris. All participants reported normal hearing and
presented no history of neurological illnesses. The protocol was
approved by the local ethical committee.
Materials
A series of 30 different polyphonic piano pieces of diverse character
and length (13.1 ± 4.8 s) was composed for this experiment by a
professional composer (Nicolaas Ravenstijn, cf. Acknowledgements).
Examples of stimuli are shown in Fig. 1 (corresponding sound-files can
be found in the Supplementary information, together with two
additional example sound-files and musical scores). Fifteen different
tonalities were used. Seventeen pieces were composed in major mode,
thirteen in minor. The incongruous endings consisted of deceptive
cadences, of which about one third (11 terminal chords) were subdominant
endings (scale degree IV) and two thirds (19 terminal chords) were submediant
endings (scale degree VI). Of the pieces containing subdominant
endings, 7 were written in major mode, 4 in minor; of those containing
submediant endings, 10 were in major mode, 9 in minor.
In two out of the 30 pieces, an identical chord to the terminal chord
occurred in the penultimate measure in the regular terminal chord
condition; the influence of such a sensory “repetition effect” is
supposed to be overruled by cognitive priming (Bigand et al., 2003,
2005).
We presented each piece in two experimental conditions: 1) with a
regular and 2) with an incongruous terminal chord. All stimuli were
repeated 3 times (n=90 per condition). Stimuli were presented
binaurally via headphones in 4 predetermined series of regular and
incongruent musical stimuli of which items were presented in
random order to each subject.
The stimuli were recorded in stereo at a frequency of 44100 Hz and
a bit-depth of 16 bits. They were executed by a professional pianist on
an acoustical grand piano as a whole, without editing, in order to
present the most naturally structured musical stimuli possible. Up to
40 repetitions for each piece were necessary in order to create stimuli
with no effects in the execution that would reveal an upcoming
incongruent ending, as judged by 2 independent professional
musicians. All terminal chords were cut off at 1220 ms from onset
and faded linearly over the last 70 ms. All stimuli were normalized to a
maximal peak amplitude of minus 9 dB.
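The truncation, fade and peak normalization described above can be sketched as follows. This is only an illustration of the stated numbers, not the authors' processing chain: the function names are hypothetical, and the sketch assumes that "minus 9 dB" is relative to digital full scale and that the linear fade reaches zero at the last sample.

```python
def fade_envelope(sr_hz=44100, total_ms=1220, fade_ms=70):
    """Gain envelope for a terminal chord cut at total_ms with a linear fade-out.

    Assumption: the fade ends exactly at zero on the final sample.
    """
    n_total = sr_hz * total_ms // 1000   # 53802 samples at 44.1 kHz
    n_fade = sr_hz * fade_ms // 1000     # 3087 samples at 44.1 kHz
    n_flat = n_total - n_fade
    envelope = [1.0] * n_flat
    # Linear ramp from just below 1.0 down to 0.0.
    envelope += [1.0 - (i + 1) / n_fade for i in range(n_fade)]
    return envelope


def normalize_peak(samples, peak_db=-9.0):
    """Scale samples so the largest absolute value sits at peak_db.

    Assumption: peak_db is expressed relative to full scale (1.0).
    """
    target = 10.0 ** (peak_db / 20.0)
    peak = max(abs(s) for s in samples)
    return [s * target / peak for s in samples]
```

At 44100 Hz the 1220 ms chord corresponds to 53802 samples, of which the last 3087 carry the fade.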
Validation of musical stimuli
In order to assure that the incongruent endings of these
compositions were distinctively recognized by expert pianists, we
ran an extensive behavioral pilot study, prior to the current ERP
experiment, which we will briefly discuss here. We used a subset of
the same stimuli (n=14, 5 subdominant, 9 submediant incongruous
endings) that contained interspersed regular and incongruous end-
ings, which were appraised by 4 experimental groups of different
expertise level. These 4 levels consisted of: musical laymen (n = 8),
music lovers (n = 7), amateur pianists (n = 9) and a group of expert
pianists (n = 8). Music lovers were individuals who listened intensively
to classical music (3.1 ± 1.2 h/week) but had little or no practical
experience. In this study subjects classified endings from absolutely
not satisfactory (1) to completely satisfactory (5) on a 5-point Likert
scale (Table 1). Multiple comparisons within a repeated measures
ANOVA Condition (2, within) × Group (4, between) showed that
ratings of the harmonic incongruities did not dissociate the three
non-professional groups from each other (F(2,21) = 0.05, p = 0.9548).
In contrast, experts rated the incongruities as more unsatisfactory than
the 3 non-professional groups (F(1,28) = 17.97, p < 0.0003). All 3
non-professional groups did rate the harmonic incongruities significantly
lower than the regular endings (F(1,28) = 200.84, p < 0.0001). The interaction
between professional and non-professional groups for both
conditions was significant (F(1,28) = 25.13, p < 0.0001). For regular endings,
professionals and amateurs did not rate differently; both
musician groups rated significantly higher than both non-musician
groups (F(1,28) = 23.58, p < 0.0001).
Behavioral task
Subjects were requested to indicate whether a musical piece
provided a satisfactory ending by means of right hand button presses
on a response box, using a button labeled with “no” (middle finger) for
non-satisfactory endings, and with “yes” (index) for satisfactory
endings. Subjects were instructed to withhold their response after
the onset of the final target chord, until a prompt (“please respond”)
was presented on the screen, after 1720 ms, in order to prevent
contamination of the stimulus-related EEG signal with motor activity.
This is why only accuracy rates but no reaction times are reported.
Fig. 1. Examples of stimuli. Regular (R) and harmonically syntactically Incongruous (I) terminal chords for two out of 30 polyphonic piano pieces (composer: Nicolaas Ravenstijn). The
incongruous terminal chord used in example (a) in F minor is a scale degree VI or submediant chord that was used in two thirds of the stimuli; the one in example (b) in F major is a
scale degree IV or subdominant chord that was used in one third of the stimuli (cf. sound-files in Supplementary information).
EEG acquisition and raw data processing
EEG was continuously recorded at 128 electrode sites (BioSemi
Active-Two, V.O.F., Amsterdam, the Netherlands), equally distributed
across the scalp. Data were digitized at a sampling rate of 1024 Hz with a
bandwidth of 0–268 Hz. Prior to analysis, data were offline
recomputed against average reference, band-pass filtered (1–30 Hz),
and down-sampled to 256 Hz. Average evoked potentials were
calculated from the onset of the terminal chord to 500 ms post
stimulus. A DC shift correction was applied by subtracting for each
electrode at each time point the mean voltage calculated over the
entire period of analysis. Only artefact-free epochs selected by visual
inspection of each single epoch were included in the analyses.
Channels exhibiting substantial noise were interpolated using a 3D
spherical spline interpolation (Perrin et al., 1987).
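Two of the steps above, recomputation against the average reference and the DC shift correction, are simple enough to sketch with NumPy. The function names and array layout (electrodes × samples) are assumptions made for illustration; the band-pass filtering, down-sampling, artefact rejection and spline interpolation steps are not reproduced here.

```python
import numpy as np


def average_reference(eeg):
    """Re-reference to the average: subtract, at every time point,
    the mean voltage across all electrodes.

    eeg: array of shape (n_electrodes, n_samples).
    """
    return eeg - eeg.mean(axis=0, keepdims=True)


def dc_correct(epoch):
    """DC shift correction: subtract, for each electrode, its mean
    voltage over the entire analysis period.

    epoch: array of shape (n_electrodes, n_samples).
    """
    return epoch - epoch.mean(axis=1, keepdims=True)


def samples_per_epoch(sfreq_hz, epoch_ms):
    """Number of samples in an epoch of epoch_ms at sfreq_hz."""
    return int(round(sfreq_hz * epoch_ms / 1000.0))
```

With the reported values, a 500 ms epoch at 256 Hz holds 128 samples per electrode.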
Procedure of ERP analyses
The evoked potentials were analyzed in four stages.
In a first step, in order to allow comparison to the ERP component
ERAN (Koelsch et al., 2001, 2002, 2007), we determined whether the
mean voltage amplitude at 5 central electrodes (FC3, FC4, Cz, CP3 and
CP4) differed between the two groups for both experimental
conditions (harmonic incongruous versus regular terminal chord)
within a 200–260 ms window that corresponded to an early right
lateralized component we found solely in experts. The time window
was centered on the peak amplitude of this early negative component
(~230 ms). A repeated measures ANOVA was performed on mean
amplitude over 200–260 ms for the factors Condition (2, within) ×
Electrode (5, within) × Group (2, between).
The second stage investigated whether the differences identified in
the first stage between the two subject groups were due to
topographical changes of the whole scalp potential configuration or to
local amplitude changes only. By physical laws, potential configuration
differences must be due to changes in the localization of the distribution
of active generators in the brain during this period (Vaughan, 1982). In
order to check for such topographic modulations, statistical comparisons
between the two groups of the normalized ERP topographies at each
time point were performed using a nonparametric bootstrapping
method on the global map dissimilarity values (Kondakor et al., 1997;
Michel et al., 2004b; Murray et al., 2006; Srebro, 1996). Global
dissimilarity is an index of configuration differences between 2 electric
fields that is independent of their strength (normalized data are
compared; Lehmann and Skrandies, 1980). Periods for which this
topographic test exceeded a 0.01 alpha criterion for at least 40
consecutive ms (supplementary time constraint) were considered.
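As a rough illustration of this second stage, global map dissimilarity can be computed as the root mean square difference between two average-referenced maps that have each been scaled to unit global field power. The randomization scheme below (shuffling group labels across subjects) and all function names are assumptions of this sketch, standing in for the bootstrapping procedure cited above rather than reproducing it; the last function applies the consecutive-duration constraint to a series of p-values.

```python
import numpy as np


def normalize_map(v):
    """Average-reference a scalp map and scale it to unit global field power."""
    v = v - v.mean()
    return v / np.sqrt(np.mean(v ** 2))


def global_map_dissimilarity(u, v):
    """Strength-independent configuration difference (0 = same, 2 = inverted)."""
    return float(np.sqrt(np.mean((normalize_map(u) - normalize_map(v)) ** 2)))


def permutation_p(group_a, group_b, n_perm=1000, seed=0):
    """Randomization test at one time point.

    group_a, group_b: arrays of shape (n_subjects, n_electrodes).
    Assumption: subjects are exchanged between groups by label shuffling.
    """
    rng = np.random.default_rng(seed)
    observed = global_map_dissimilarity(group_a.mean(axis=0), group_b.mean(axis=0))
    pooled = np.vstack([group_a, group_b])
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        d = global_map_dissimilarity(pooled[idx[:n_a]].mean(axis=0),
                                     pooled[idx[n_a:]].mean(axis=0))
        count += int(d >= observed)
    return observed, (count + 1) / (n_perm + 1)


def sustained(p_values, alpha, min_len):
    """Keep only runs of at least min_len consecutive samples with p < alpha."""
    keep = [False] * len(p_values)
    start = None
    for i, p in enumerate(list(p_values) + [1.0]):  # sentinel closes the last run
        if p < alpha:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                keep[start:i] = [True] * (i - start)
            start = None
    return keep
```

At 256 Hz one sample spans about 3.9 ms, so a 40 ms constraint would correspond to a `min_len` of roughly 11 samples; the exact rounding used in the study is not stated.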
The third stage consisted of a spatio-temporal ERP analysis
procedure. This method, known as microstate segmentation (Michel
et al., 2004a; Michel et al., 1999; Michel et al., 2001; Murray et al.,
2004; Pascual-Marqui et al., 1995; Thierry et al., 2007), is based on
the observation that the topography of the electric field at the scalp
does not vary randomly as a function of time, but rather remains in a
stable configuration for brief time periods or components (Lehmann
et al., 1987). First, a k-means cluster analysis defines the most
dominant scalp topographies appearing in the group-averaged ERPs
over time. On the basis of cross validation criteria, this pattern
analysis reduces ERP data for one or more conditions or groups to an
optimal number of scalp configurations or microstate maps, of which
each represents a “functional microstate” of information processing
in the brain (Lehmann et al., 1987; Michel et al., 1999). The second
step consists of statistical analysis across the individual subjects in
order to define the significance of each microstate map for a given
group or condition. Therefore the microstate maps are fitted to the
ERPs of each individual subject on the basis of spatial correlation.
This results in information on goodness of fit and also on the
duration of presence of each microstate map in a given condition for
each subject. Statistical comparison then allows us to determine which
microstates are significantly more present in one group or condition
than in another, or whether they are equally present but appear at
different latencies (Michel et al., 1999, 2004a). Because of multiple
testing, although results from the group level analysis can be
considered a priori hypotheses, we used an alpha criterion of 0.005.
Recent detailed description and discussion of this topographic ERP
analysis method can be found in Murray et al. (2008) and Pourtois et
al. (2008).
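The fitting step of this procedure can be sketched as a winner-take-all labeling: each time point of a subject's ERP is assigned to the microstate map with which it correlates best spatially, and the number of labeled samples gives each map's duration of presence. The function names are hypothetical, the template maps are assumed to come from a prior cluster analysis (not shown), and any temporal smoothing used in the actual analysis is omitted.

```python
import numpy as np


def spatial_correlation(u, v):
    """Pearson correlation between two scalp maps across electrodes."""
    u = u - u.mean()
    v = v - v.mean()
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def fit_maps(erp, templates, sfreq_hz=256.0):
    """Label each time point with the best-correlating template map.

    erp: array (n_samples, n_electrodes) for one subject and condition.
    templates: array (n_maps, n_electrodes) of microstate maps.
    Returns the label per sample and the duration of presence (ms) per map.
    """
    corr = np.array([[spatial_correlation(x, t) for t in templates] for x in erp])
    labels = corr.argmax(axis=1)
    ms_per_sample = 1000.0 / sfreq_hz
    durations = np.bincount(labels, minlength=len(templates)) * ms_per_sample
    return labels, durations
```

The per-subject durations returned by `fit_maps` are the kind of values that would then enter the between-group comparison.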
The fourth stage consisted of the localization and statistical
comparison of the putative sources in the brain that differed between
the two groups across time. Since scalp EEG and MEG recordings both
suffer from the fact that the inverse problem is ill-posed, we applied
conservative statistical analysis and only interpreted results that were
concordant with the findings of the above described analysis of the
scalp evoked potentials. We used a depth-weighted minimum norm
distributed linear inverse solution (Hamalainen and Ilmoniemi, 1994;
Michel et al., 2004a) to estimate the intracranial current distribution
at each moment in time for the evoked potential of each subject. The
current distribution was calculated within the grey matter of the
average brain provided by the Montreal Neurological Institute. A
discrete grid of 3005 solution points was regularly distributed within
this volume. After applying a homogeneous transformation operation
to the volume that rendered it to the best fitting sphere (SMAC model;
Spinelli et al., 2000), a 3-shell spherical head model was used to
calculate the lead field for the 128 electrodes and the inverse solution
based on the weighted minimum norm (WMN) constraint. Based on
this approach, a current distribution was calculated for each subject's
ERP at each time point. The solution space was then spatially
smoothed by averaging the solution points within 50 regions of
interest (ROI) that were defined according to the macroscopic
anatomical parcellation of the MNI template, conforming to the
Automated Anatomical Labeling (AAL) map (Tzourio-Mazoyer et al.,
2002) available from the MRIcro software (Rorden and Brett, 2000).
The complete list of these regions is available in the Supplementary
information. Similar to the statistical parametric mapping used in
fMRI analysis, unpaired t-tests (two-tailed) between experimental
groups were then computed for the 50 regions of interest. Because of
multiple statistical testing (50 regions of interest), only periods for
which these t-tests exceeded a 0.005 alpha criterion for at
least 40 consecutive ms (supplementary time constraint) were
considered. Results of these tests will be enhanced activations
characterizing one group for a certain condition, because these two-
tailed t-tests identify differences and not commonalities.
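For orientation, a generic Tikhonov-regularized weighted minimum norm operator has the form G = W Lᵀ (L W Lᵀ + λI)⁻¹, where L is the lead field and W a diagonal weighting matrix. The sketch below implements that textbook form only. It assumes one scalar source per solution point, a depth weighting based on lead field column norms, and a hypothetical regularization value `lam`; none of these choices are taken from the study, and the SMAC head model is not reproduced.

```python
import numpy as np


def depth_weights(leadfield, exponent=1.0):
    """One common depth-weighting choice (an assumption here):
    the inverse of each source's lead field column norm."""
    return 1.0 / np.linalg.norm(leadfield, axis=0) ** exponent


def weighted_minimum_norm_operator(leadfield, weights, lam):
    """Inverse operator G = W L^T (L W L^T + lam * I)^-1.

    leadfield: array (n_electrodes, n_sources).
    weights: positive array (n_sources,), the diagonal of W.
    Returns G with shape (n_sources, n_electrodes).
    """
    lw = leadfield * weights                       # L @ diag(weights)
    gram = lw @ leadfield.T + lam * np.eye(leadfield.shape[0])
    return np.linalg.solve(gram, lw).T             # gram is symmetric


def roi_average(current, roi_labels, n_roi):
    """Average estimated currents over the solution points of each ROI."""
    return np.array([current[roi_labels == r].mean(axis=0) for r in range(n_roi)])
```

Applying `G` to a scalp map gives one current estimate per solution point, and `roi_average` then reduces these to one value per region before any group statistics.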
Results
Behavioral results
Behavioral data were analyzed statistically by a repeated measures
ANOVA with the factors Condition (2, within) ×Group (2, between).
Table 1
Likert ratings in response to regular and harmonically incongruous chords as a function of expertise level
                Likert rating (1–5) ± sd
                Regular      Harmonic incongruity
Laymen          4.2 ± 0.4    2.5 ± 0.4
Music lovers    4.1 ± 0.4    2.5 ± 0.3
Amateurs        4.6 ± 0.3    2.4 ± 0.6
Experts         4.7 ± 0.2    1.6 ± 0.6
Ratings varied from 1 “absolutely not satisfactory” to 5 “completely satisfactory”.
Professionals' performance was significantly superior in comparison
to laymen for both experimental conditions (i.e. regular terminal
chord and harmonically incongruous terminal chord) and strikingly
more so for the harmonic incongruity condition (significant interaction
between conditions and groups, F(1,24) = 4.92; p = 0.0362). Regular
terminal chords were almost perfectly identified by the professionals,
who performed correctly in 96.3 ± 2.2% of trials, versus 80.7 ± 16.2% for
the laymen (F(1,24) = 11.91; p < 0.0021). In the harmonic incongruity
condition, professional musicians again performed practically at ceiling
(96.2 ± 6.1%), whereas the musical laymen responded slightly above
chance in correctly detecting the subtle harmonic incongruities (66.2 ±
14.8%; F(1,24) = 46.19; p < 0.0001).
In fact, 5 of the 13 laymen did not respond differently from
chance level (normal approximation of binomial test, z-score < 1.64).
However, the layman group as a whole responded significantly
differently from chance (logistic multilevel model with subjects on
level 2 and items on level 1 (R Development Core Team, 2007), test
conducted on the intercept (unique parameter), z = 3.62, p < 0.0003).
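To make the per-subject chance criterion concrete, the normal approximation to the binomial test can be sketched as below. The sketch assumes 90 scored incongruous trials per subject (30 pieces × 3 repetitions), a chance level of 0.5 and no continuity correction; the paper does not state these details, and the function names are hypothetical.

```python
import math


def chance_z(n_correct, n_trials, p_chance=0.5):
    """z-score of n_correct under the normal approximation to the binomial."""
    expected = n_trials * p_chance
    sd = math.sqrt(n_trials * p_chance * (1.0 - p_chance))
    return (n_correct - expected) / sd


def min_correct_above_chance(n_trials, z_crit=1.64, p_chance=0.5):
    """Smallest number of correct responses whose z-score reaches z_crit."""
    k = 0
    while chance_z(k, n_trials, p_chance) < z_crit:
        k += 1
    return k
```

Under these assumptions a subject would need at least 53 correct responses out of 90 (about 59%) to exceed the z = 1.64 criterion.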
Percentages of correct responses did not differ between regular
and incongruous chords for professionals, whereas for laymen this
difference was significant (F(1,24) = 9.97; p < 0.0043).
ERP results: waveform and topography analyses
ERP waveform analyses
Visual inspection of Grand-average ERP waveforms (Figs. 2a and b)
for experts and laymen at 5 exemplar central electrode sites revealed
the occurrence of an early right-sided negative component peaking
shortly after 200 ms that arose solely in experts in response to
harmonic incongruous terminal chords. Such a component did not
occur in laymen at any time point (0–500 ms) as the later stages of
analyses will confirm. A repeated measures ANOVA for the factors
Condition (2, within) × Electrode (5, within) × Group (2, between) on
mean amplitude in the 200–260 ms window yielded significant main
effects for Condition (F(1,24) = 13.75; p < 0.0011) and Electrode
(F(4,96) = 22.91; p < 0.0001) as well as significant interactions between
Condition × Group (F(1,24) = 23.63; p < 0.0001) and Electrode × Group
(F(4,96) = 6.75; p < 0.0001). Post-hoc linear contrasts (Bonferroni corrected
for multiple comparisons) revealed that differences between
groups were significant only for the 2 right-sided electrodes (FC4,
F(1,24) = 24.92; p < 0.0001; CP4, F(1,24) = 18.91; p < 0.0001), exclusively for
the incongruity condition.
In the 2 lower panels of Fig. 2 (c and d) difference waves are
depicted that were calculated by subtracting ERPs to the regular
endings from those to the incongruous endings as is classically done
with the ERAN (Koelsch et al., 2001, 2007). Fig. 2f shows the difference
voltage maps (incongruous minus regular) for both experimental
groups over the 200–260 ms period; a strong right lateralized and
largely distributed negative potential can be observed in the expert
group only. Amplitudes at FC4 and CP4, and also at Cz, again
differed significantly within the expert group when
contrasting the regular and incongruous conditions (Fig. 2d). No
significant differences between regular and incongruous endings were
found for laymen (Fig. 2c). Taken together, these results confirm a
right-sided negative component in the 200–260 ms time window
solely in experts.
Topographic dissimilarity analysis
The second stage of our analysis, using a topographic dissimilarity
measure, demonstrated a relatively sustained difference between
groups in response to the harmonically incongruous chords, from 200
to 500 ms (with a short period of non-significant difference between
250 and 285 ms; Fig. 3a). This analysis did not show any difference
between groups in response to the regular terminal chords (Fig. 4a).
Spatio-temporal ERP analysis
Next, in a third stage, we performed 2 spatio-temporal ERP
analyses that yielded a solution with 7 stable microstate map
configurations over time for the regular terminal chord condition
(Fig. 4b), and a solution with 8 maps for the harmonic incongruity
condition (Fig. 3b). These two series of microstate maps explained
respectively 85.0% (regular) and 87.4% (incongruous) of variance in the
ERP data. The succession of these maps clearly confirmed a similar
sequence of processing stages for both groups and both conditions
before 200 ms, and suggested three distinct consecutive maps during
this period. These components evolved from (1) a centro-parietal
negativity (Figs. 3b and 4b, “microstate maps”, Map 1, peaking around
65 ms), followed by (2) a fronto-central negativity (Figs. 3b and 4b,
Map 2, peaking around 110 ms), and finally (3) a fronto-central
positivity (Figs. 3b and 4b, Map 3, peaking around 180 ms), with high
field strength.
Fig. 2. Grand-average ERP waveforms for 5 central electrode sites. ERPs elicited by (a) Regular and (b) Incongruous terminal chords for Experts and Laymen are displayed at 5 central
electrodes. The time interval used for the statistical analysis of the early negative component is indicated by the grey-shaded areas (200–260 ms). Red arrows indicate significant
differences (p < 0.05, Bonferroni corrected for multiple comparisons) between Experts and Laymen for incongruous chords; ERPs for regular chords did not differ. In the lower panels
(c and d) difference waves (ERPs in response to regular chords subtracted from incongruous ones) and original ERPs for both conditions are plotted for (c) Laymen and (d) Experts. Red
arrows indicate significant differences. (e) Head positions of electrodes depicted in a–d. (f) Difference scalp configuration maps (regular subtracted from incongruous: I−R) for the
time period of 200–260 ms around the peak latency (230 ms) of the component. These maps are 2-D projections of the 3-D electrode configuration (view from above, nasion on top).
Experts on the left, Laymen on the right panel.
Then, for the regular terminal chord condition, a short period with
concurrent different topography maps occurred for both groups, with
Map 4 (bilateral frontal negativity) being significantly more present
(duration in ms) in the expert group as compared to Map 5 (bilateral
frontal positivity) in the layman group (Fig. 4b, Maps 4 and 5). The
statistical analysis based on fitting the maps to the individual data
confirmed this difference (Fig. 4c). This was followed again by similar
maps for both groups with frontal negativity and then posterior
positivity (Fig. 4b, Maps 6 and 7).
For the incongruous terminal chords, a single stable map with
symmetrical central posterior negativity (Fig. 3b, Map 5) was
identified for the layman group after the common processing
sequence whereas in sharp contrast, the experts exhibited a cascade
of successive components that evolved towards a centro-parietal
positivity (Fig. 3b, Maps 4, 3, 6, 7 and 8 respectively). The onset of
differences between groups (~200 ms) was initially manifested by a
markedly asymmetrical right-sided negative component (Fig. 3b, Map
4), present in experts only. This divergence in successive topographies
marked the beginning of a different processing sequence in the two
groups. The corresponding fitting procedure confirmed simultaneous
occurrence of different maps in both groups for 3 consecutive time
periods (Fig. 3c). First, Map 4 (right negativity) was significantly more
present (duration in ms) in experts compared to Map 5 (posterior
negativity) for laymen (Fig. 3c left panel). Second, Map 6 (frontal
positivity) was significantly more present in experts compared to Map
5 for laymen (Fig. 3c middle panel). Finally, Maps 7 (centro-parietal
positivity) and 8 (more posterior centro-parietal positivity) were
significantly more present in experts compared to Map 5 for laymen
(Fig. 3c right panel). In all cases, Map 5 characterized the layman
group.
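The fitting step reported above can be illustrated with a minimal sketch. It assumes that each time frame of an individual ERP is assigned to the template map with the highest spatial correlation, and that duration of presence is the number of assigned frames multiplied by the sampling interval. The function names, the toy four-electrode maps and the 2 ms sampling interval are hypothetical and are not taken from the analysis software used in the study.

```python
import math


def spatial_correlation(map_a, map_b):
    """Pearson correlation across electrodes between two scalp maps."""
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    centered_a = [v - mean_a for v in map_a]
    centered_b = [v - mean_b for v in map_b]
    num = sum(x * y for x, y in zip(centered_a, centered_b))
    den = math.sqrt(sum(x * x for x in centered_a) * sum(y * y for y in centered_b))
    return num / den if den > 0 else 0.0


def fit_maps(erp, templates, sample_ms):
    """Assign every time frame to its best-correlating template map and
    return the duration of presence (ms) of each template.

    erp       -- list of time frames, each a list of electrode potentials
    templates -- dict: map name -> list of electrode potentials
    sample_ms -- duration of one time frame in ms (assumed value)
    """
    duration = {name: 0.0 for name in templates}
    for frame in erp:
        best = max(templates, key=lambda name: spatial_correlation(frame, templates[name]))
        duration[best] += sample_ms
    return duration


# Toy data: two hypothetical template maps over four electrodes.
templates = {
    "Map 4": [-1.0, -1.0, 1.0, 1.0],
    "Map 5": [1.0, -1.0, 1.0, -1.0],
}
erp = [
    [-2.0, -2.0, 2.0, 2.0],   # same topography as Map 4, stronger field
    [-1.0, -0.5, 0.5, 1.0],   # closer to Map 4 than to Map 5
    [1.0, -1.0, 1.0, -1.0],   # identical to Map 5
]
durations = fit_maps(erp, templates, sample_ms=2.0)
print(durations)  # {'Map 4': 4.0, 'Map 5': 2.0}
```

Per-subject durations obtained in this way could then be compared between groups, which is the kind of comparison summarized in Fig. 3c.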
Fig. 3. Analyses of ERP responses to incongruous terminal chords. (a) Results of the
topographic dissimilarity analysis comparing ERP responses of Experts and Laymen. The
significant results are displayed in black as 1 minus p-values over time periods
exceeding 40 ms. (b) Results of the spatio-temporal ERP analysis. The microstates are
marked in colour on the superimposed Grand-average ERP waveforms of all 128
channels for Experts (upper panel) and Laymen (lower panel), in dark and light grey
when common to both groups and in colour when unique for one group. Beginning and
end of each topographic change are indicated in ms underneath. Scalp configurations of
the microstate maps are displayed in the middle panel, framed in corresponding
colours. These maps are 2-D projections of the 3-D electrode configurations (view from
above, nasion on top). (c) Mean duration of presence (in ms) of microstate Maps 4–8 for
3 consecutive time periods. From left to right: 160–310 ms (Maps 4 versus 5); 270–
380 ms (Maps 5 versus 6); 385–500 ms (Maps 5 versus 7, 8). Significant results of the
statistical fitting procedure on duration of presence of microstate maps for all individual
subjects are shown by means of asterisks. ⁎⁎⁎p<0.0005, ⁎⁎p<0.005. Vertical bars depict
95% confidence intervals.
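As an illustration of the quantity underlying panel (a), global map dissimilarity between two scalp maps is commonly computed as the root mean square difference between the average-referenced, strength-normalized maps. The sketch below assumes this standard definition; the function names and toy maps are hypothetical, and the statistical test that yields the p-values is not shown.

```python
import math


def normalize_map(potentials):
    """Re-reference a scalp map to the average and scale it to unit
    global field power (GFP)."""
    n = len(potentials)
    mean = sum(potentials) / n
    centered = [v - mean for v in potentials]
    gfp = math.sqrt(sum(v * v for v in centered) / n)
    if gfp == 0:
        raise ValueError("flat map: global field power is zero")
    return [v / gfp for v in centered]


def global_dissimilarity(map_a, map_b):
    """Root mean square difference between two normalized maps.
    0 means identical topography, 2 means inverted topography."""
    a = normalize_map(map_a)
    b = normalize_map(map_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Same topography at a different strength: dissimilarity close to 0.
same = global_dissimilarity([1.0, -1.0, 2.0, -2.0], [2.0, -2.0, 4.0, -4.0])
# Inverted topography: dissimilarity close to 2.
inverted = global_dissimilarity([1.0, -1.0, 2.0, -2.0], [-1.0, 1.0, -2.0, 2.0])
```

Because both maps are normalized first, the measure is insensitive to pure differences in field strength and reflects topography only.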
Fig. 4. Analyses of ERP responses to regular terminal chords. (a) Topographic
dissimilarity analysis. (b) Results of the spatio-temporal ERP analysis illustrated in the
same way as in Fig. 2. (c) Results of the Statistical Fitting procedure. Significant
differences between the two groups were found for one time period: 245–310 ms (Maps
4 versus 5). ⁎⁎p<0.005.
For the incongruous terminal chords we also performed a spatio-
temporal ERP analysis comprising group-averaged ERPs of exclu-
sively correct responses (thus successful incongruity detection) for
the laymen. We fitted the 8 microstate maps found in response to
incongruity to three group-averaged ERPs: correct responses of
laymen, all responses of laymen, and all responses of experts. For the
laymen, the microstate maps for correct responses were identical to
those for all responses. Map
4, the asymmetrical right-sided negative ERP component, did not
occur for correct responses in the layman group, but occurred again
exclusively in the expert group. The statistical analysis based on
fitting the maps to the individual data confirmed that no differences
in duration of presence of Map 4 existed between correct and all
responses for laymen.
Statistical source analyses of ERPs
The fourth stage of our analysis employed a statistical source lo-
calization method on 50 regions of interest and identified a sequence of
distinctive activations to harmony incongruity that reflected a selective
enhancement of neural responses in experts, from ~200 ms to ~450 ms
after the terminal chord onset (Fig. 5; the complete list of 50 regions is
available in the Supplementary information). Differences initiated in
right temporal–limbic structures, encompassing hippocampal complex
(ROI 25) and amygdala (ROI 23), and in right insula (ROI 39), persisting
from ~210 ms until ~285 ms. These 3 differentially activated areas
putatively contributed to the generation of the negative ERP component
in experts occurring at the same time. They are depicted in Fig. 6
superimposed on slices of MNI 152 template brain together with region
of interest current density for these 3 regions in experts and laymen,
illustrating the differences in activation of these areas starting at around
200 ms. These activations were then followed by increases in several
frontal areas, including bilateral superior frontal (right ~290–370 ms,
Fig. 5. Statistical parametric mapping (SPM) of estimated sources. Results of SPM of
cerebral sources within 50 Regions of interest (ROI, on the y-axis) estimated by a
distributed linear inverse solution (WMN). The complete list of these 50 regions is
available in the Supplementary information. Time course of current density differences
between groups is shown in graduated grey shades that indicate when current density
significantly differed between the two groups for more than 40 ms; significant p-values
vary from p<0.005 (dark grey) up to p<0.0001 (light grey).
Fig. 6. Current density in 3 Regions of Interest (ROI). These regions of interest were defined according to the MRIcro macroscopic anatomical parcellation of the MNI template
(Tzourio-Mazoyer et al., 2002). The 3 ROIs are respectively (a) Right Insula (b) Right Hippocampal complex and (c) Right Amygdala, and are superimposed in red on coronal slices of
MNI 152 template brain (left panels). Current density (μA/mm³) for these ROIs is plotted on the right panels as a function of time; periods of significant difference (p<0.005 for at least
40 ms) are shown by grey transparent blocks projected on the current density curves. Vertical bars depict 95% confidence intervals.
ROI 49; left ~320–370 ms, ROI 50), right midfrontal (~290–350 ms, ROI
45), bilateral mid cingulate (~320–365 ms, ROI 15 and 16), and bilateral
supplementary motor areas (~310–365 ms, ROI 41 and 42). Partially
overlapping with activation in the left anterior cingulate cortex (ACC)
(~330–395 ms, ROI 38), a recurrent activation of the initially active
limbic areas was observed, now in bilateral hippocampal structures
(right ~350–405 ms, ROI 25; left ~360–400 ms, ROI 26) and right
amygdala (~360–400 ms, ROI 23). Finally, activations were found in left
occipital (calcarine sulcus) (~390–430 ms, ROI 12) and right
orbitofrontal areas (~420–460 ms, ROI 47).
All significant differences were due to enhanced activity in the
expert group. No increase was found for laymen relative to musicians.
Finally, no significant differences in source estimations were found
between the two groups for compositions with regular endings.
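The temporal criterion applied to the between-group comparisons (a difference counts only if it stays below the significance threshold for a sustained period, here 40 ms) can be sketched as follows. This is a minimal illustration under stated assumptions: the per-frame p-values are taken as given, the function name and the 10 ms sampling interval are hypothetical, and "at least 40 ms" is used as the cut-off.

```python
def significant_periods(p_values, alpha, sample_ms, min_duration_ms):
    """Return (start_ms, end_ms) runs in which p < alpha holds for at
    least min_duration_ms; shorter runs are discarded."""
    periods = []
    run_start = None
    # A trailing sentinel of 1.0 closes a run that reaches the last frame.
    for i, p in enumerate(list(p_values) + [1.0]):
        if p < alpha and run_start is None:
            run_start = i
        elif p >= alpha and run_start is not None:
            if (i - run_start) * sample_ms >= min_duration_ms:
                periods.append((run_start * sample_ms, i * sample_ms))
            run_start = None
    return periods


# Toy p-values for one region of interest, one value per 10 ms frame.
p_roi = [0.5, 0.001, 0.001, 0.5, 0.001, 0.001, 0.001, 0.001, 0.5]
periods = significant_periods(p_roi, alpha=0.005, sample_ms=10, min_duration_ms=40)
print(periods)  # [(40, 80)]
```

The first run of two significant frames (20 ms) is rejected, while the second run of four frames (40 ms) is retained, which guards against isolated significant time points.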
Discussion
The key finding of this study is a selective rapid neuronal response
(starting as early as ~200 ms) to subtle chord violations specifically for
trained pianists, but not for musical laymen. This rapid response
manifested as a right lateralized negative ERP component. Right
temporal–limbic areas, encompassing hippocampal complex and
amygdala, and right insula, brain areas critically associated with
cognitive, memory and emotive processing, contributed to the
generation of this early negative ERP.
All levels of analyses converged to reveal striking differences in
brain responses between professionals and laymen to harmonic
incongruities in our time window of analysis (0–500 ms), arising
between 200 and 500 ms after the onset of a harmonically anomalous
terminal chord. Regular endings did not evoke such differences. Two
distinct phases, an early (~200–300 ms) and a later one (300–500 ms)
could be identified in the dissimilarities between experts and laymen
in response to the incongruous chords. The early phase was
characterized by a novel right-sided negative ERP component maximal
over temporal leads in the expert group. Neuronal sources contributing
to this functional microstate could be localized in right medial–
temporal structures and in the right insula. Later differences involved
ERP positivities evolving over time from frontal to parietal sites in
experts, for which contributing putative sources were localized in more
frontal areas, although right medial–temporal activation also reoc-
curred. Meanwhile the layman group showed a single microstate
characterized by posterior negativity and low field strength.
Because our time window of analysis was limited to 500 ms, we
cannot exclude that some of the observed later differences between
300 and 500 ms might be partially due to latency differences between
experts and laymen.
Early differences: ERP analysis
A right lateralized, negative potential field was evoked over
temporal sites by the incongruent harmonic endings for the expert
group only, starting ~200 ms after the incongruous chord onset (Fig.
3b, microstate Map 4). This early effect was evidenced by the
traditional waveform analysis (Fig. 2), as well as by the topographic
dissimilarity analysis (Fig. 3a), and the spatio-temporal ERP analysis
(Figs. 3b and c). In contrast, laymen showed a symmetrical posterior
negativity during this period (Fig. 3b, Maps 3 and 5).
Comparisons of the early negative component found in experts with the ERAN
The timing and polarity of the early difference are reminiscent of
the ERP component ERAN (peak latency ~200 ms) that has been
reported in response to syntactically irregular chords (Koelsch et al.,
2001, 2002, 2007). The ERAN is characterized by bilateral negative
potentials at frontal electrodes and shows right hemisphere dom-
inance that becomes weaker with more subtle irregularities (Koelsch
et al., 2007). An MEG study suggested bilateral sources for the ERAN in
the inferior pars opercularis of the inferior frontal gyrus (Maess et al.,
2001), part of Broca's area and its right side homologue. Functional
imaging confirmed the role of the inferior frontal cortex in the
processing of syntactic harmonic incongruities (Tillmann et al., 2006).
The ERAN shares some features with two other ERP components that
are elicited by incongruent auditory stimuli: the Mismatch Negativity
(MMN) and the early left anterior negativity (ELAN; Koelsch et al.,
2007), the latter responding selectively to syntactic violations in
language contexts (for a recent review see Friederici, 2002).
Similarities in topography, source localization, and sensitivity to
experimental variables for these two components led to the proposal
that the ERAN may belong to a “family of peri-sylvian negativities that
mediate the processing of irregularities of auditory input” (Koelsch et
al., 2001).
The strong ERP component (peak latency ~230 ms) found for experts
in our study also shares some of these properties, but differs from the
ERAN (peak latency ~200 ms) in several crucial aspects: the negativity
we found was maximal over temporal leads, completely right
lateralized, and suggested key sources could be localized in right
medial–temporal structures. Most importantly, this component was
found in the expert group but was completely absent in the laymen. This
is in sharp contrast to the ERAN, which can also be elicited in subjects
without musical training in response to subtle syntactic incongruities,
and shows a modulation in amplitude with expertise (Koelsch et al.,
2007). The ERAN is classically depicted as a difference wave (e.g. Koelsch
et al., 2001, 2007). In order to allow comparison with this difference
ERAN we also computed difference waves for our two groups by
subtracting ERPs to the regular endings from those to the incongruous
endings. Only in the expert group did we observe a strong right lateralized
and widely distributed negative potential (200–260 ms; Fig. 2f). At no
point in time within our window of analysis (0–500 ms) did such a right
negative component occur in the layman group.
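The difference-wave computation described above (ERPs to regular endings subtracted from ERPs to incongruous endings) can be sketched for a single electrode as follows. The amplitudes, the 50 ms sampling interval and the function names are hypothetical values chosen for illustration only.

```python
def difference_wave(incongruous, regular):
    """Subtract the ERP to regular endings from the ERP to incongruous
    endings, sample by sample (I - R)."""
    if len(incongruous) != len(regular):
        raise ValueError("both ERPs must have the same number of samples")
    return [i - r for i, r in zip(incongruous, regular)]


def mean_amplitude(wave, sample_ms, start_ms, end_ms):
    """Mean amplitude over samples with start_ms <= time < end_ms."""
    window = [v for k, v in enumerate(wave) if start_ms <= k * sample_ms < end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


# Toy single-electrode ERPs (microvolts), sampled at 0, 50, ..., 300 ms.
erp_incongruous = [0.0, 1.0, -1.0, 2.0, -3.0, -4.0, 1.0]
erp_regular = [0.0, 1.0, -1.0, 2.0, -1.0, -1.0, 1.0]

diff = difference_wave(erp_incongruous, erp_regular)
mean_200_260 = mean_amplitude(diff, sample_ms=50, start_ms=200, end_ms=260)
print(diff)          # [0.0, 0.0, 0.0, 0.0, -2.0, -3.0, 0.0]
print(mean_200_260)  # -2.5
```

In this toy case the difference wave is flat outside the 200–260 ms window and negative inside it, which is the pattern a selective early negativity to incongruous endings would produce.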
Several reasons may account for these differences between ERAN
responses and the present early responses to similar subtle harmonic
syntactic incongruity, which arose exclusively in expert musicians. In the
study by Koelsch et al. (2007), all irregularities were (1) identical within
blocks of testing and (2) embedded in short chord sequences, and (3) the
musicians were amateurs. In the current experiment, we maximized
effects of expertise by presenting polyphonic piano pieces to expert
pianists. Interestingly, the behavioral accuracy of non-musicians in
explicitly detecting the subtly incongruous chords was practically
identical in the current study and in the study by Koelsch et al. (2007).
In Experiment 3 of that study, the mean percentage of correct responses
to supertonics (scale degree II incongruities), which do not
contain any out of key notes, was 65.5%, and to double dominants,
which do contain one out of key note, 63.4%; the difference between
these 2 ratings was not significant. We found 66.2%. Thus the level of
incongruence appears similar.
Another recent study used subtle irregularities in chord series
(Poulin-Charronnat et al., 2006), comparing expert musicians with
non-musicians, but no ERAN was found in either professional musicians
or non-musicians. In that study, subjects judged the timbre of the
stimuli and were thus not directly asked to appraise harmonic
appropriateness. A frontal N5 dissociated musicians from non-
musicians; we cannot fully compare these results with ours because
we focused our analysis on earlier effects (our window of analyses was
limited to 500 ms), although we did also find stronger frontal
negativities (Fig. 3b, microstate Maps 7 and 8) in musicians that
occurred in combination with posterior positivities between 320 and
500 ms. In the Poulin-Charronnat et al. study (2006), the authors
proposed that the absence of an ERAN might be due to the use of the
subdominant (4th degree) as incongruity; this chord does not contain
any out of key notes, as opposed to the Neapolitan sixth chord that
contains 2 out of key notes and that was used in previous studies
(Koelsch et al., 2000, 2002). They suggested that under preattentive
conditions, the ERAN may be elicited only by more salient harmonic
irregularities and not by very subtle ones. However, an ERAN response
was recently demonstrated in amateur musicians and non-musicians
in response to unattended subtle syntactic in-key incongruities using a
supertonic (scale degree II) as irregular ending (Koelsch et al., 2007,
Experiment 2).
Here we used interspersed 4th and 6th degrees as syntactically
incongruous terminal chords, in highly variable expressive musical
stimuli, rendering anticipation difficult. The 6th degree is most often
used as a deceptive cadence in musical compositions (e.g. “Trugschluss”),
and is close to the tonic in function, because it may share two
pitches with it. Nevertheless, an early negative component was very
clearly generated in experts only. The exact nature of the subtle
incongruities thus appears to have contributed less to the obtained
ERP results than the fact that they were embedded in a complex,
expressive and diverse musical context and also that they were task
relevant. Recently no difference could be established between ERPs in
response to harmonic incongruities to double dominant chords that
contain one out of key note and supertonics that contain solely in-key
notes (Koelsch et al., 2007, Experiment 2), and the same held for the
behavioral data (Experiment 3).
Patel et al. (1998), who used materials consisting of musical
mini-phrases, richer than chord series but not as complex as the musical
pieces used in the present study, also found a right lateralized negative
component, albeit at 350 ms (“right antero-temporal negativity” or
RATN), only in moderately trained musicians and not in non-musicians.
One could argue that differences in relative consonance between
major versus minor chords could be perceived differently by experts
compared to laymen (Mazzola et al., 1989; Parncutt, 1988). Musicians
are more sensitive to consonance-dissonance than non-musicians
(Schön et al., 2005), and a minor chord is slightly less consonant than
a major chord. However, this possibility is rather unlikely because the
influence of such a sensory effect is overruled by cognitive priming
(Bigand et al., 2003, 2005). Psychoacoustical laws cannot explain the
qualitative culturally coded characteristics of musical chord function
(Mazzola et al., 1989; Parncutt, 2006).
Finally, the maximized harmonic skill of pianist experts appraising
syntactically incongruent piano stimuli may have enhanced the
distinctive early response.
Early differences: statistical analysis of putative sources
The major finding of our study is the localization of putative sources
contributing to the generation of this early negative ERP response to
irregular endings within right limbic areas in the medial–temporal lobe,
encompassing amygdala and hippocampal complex, as well as in right
insula exclusively in the experts (Fig. 6). Studies using simultaneous
recordings of intracranial and scalp EEG have shown that medial–
temporal activity can be reliably retrieved from scalp EEG, using
distributed source reconstruction techniques similar to those used in
the current study (Lantz et al., 1997; Zumsteg et al., 2005).
Limbic areas are implicated in different types of auditory sensory,
cognitive and also emotive operations. At the auditory sensory level, a
role of the hippocampus in orienting response to unexpected stimuli
has been demonstrated (Halgren et al., 1998; Knight, 1997; Kropotov
et al., 2000; Rosburg et al., 2007). However, these mismatch and
novelty responses reflecting echoic memory were related to activity in
bilateral hippocampi and arose at later latencies. At a more cognitive
level, right medial–temporal lobe structures may play an important
role in the perception of higher-order pitch dimensions and tonality.
Patients who underwent right-sided amygdalohippocampectomy
performed poorly for higher-order pitch discriminations and memory
for tone sequences in a musical aptitude test (Seashore Test; Wieser,
2003). Recordings from implanted electrodes in the medial–temporal
lobe of an epileptic patient revealed that responses in the left
hippocampus were modulated by traditional dissonances and in the
right hippocampus by higher-order tonality aspects of auditory
stimuli (Wieser and Mazzola, 1986). When healthy experienced
musical listeners were exposed to ongoing changes in tonality (Janata
et al., 2002a), right hippocampal activations were also observed. In
addition, responses to subtle harmonic incongruities were associated
with right insula activation in functional imaging (Tillmann et al.,
2006). We therefore hypothesize that functional plasticity in musical
experts within medial–temporal and insular structures may subserve
the formation of enhanced higher-order pitch and tonality processing
together with a highly specific auditory memory for musical syntax.
Compelling evidence for such experience-dependent plasticity in the
neocortex and hippocampus exists (Brecht and Schmitz, 2008;
Maguire et al., 2000, 2003; Martin and Morris, 2002).
Additionally, emotional aspects might also be involved in the
differences observed in right temporal–limbic structures between
groups, because for expert musicians even a subtle expectancy
violation could evoke some frustration or disliking (Meyer, 1956).
Several features intrinsic to musical structure, such as consonance and
complexity of harmony, correlate with physiological measures of
emotion induced by music (Gomez and Danuser, 2007). Previous
functional imaging work has linked emotional responses evoked by
unpleasant musical stimuli to the right limbic system (Blood et al.,
1999; Flores-Gutierrez et al., 2007; Gosselin et al., 2006, 2007). More
specifically, Steinbeis et al. (2005, 2006) reported that increase in
electrodermal activity and changes in subjective emotional experience
coincided with elicitation of specific ERP responses to harmonically
unexpected events in Bach chorales, thus within a complex musical
context. Electrodermal activity is known to be a good indicator of
emotional arousal, also for acoustical stimuli (Bradley and Lang, 2000).
The authors concluded that harmonically unexpected events can elicit
emotional effects.
This early affective response could be a by-product of the detection
of grammatical incongruity, but may also have a utilitarian function
for an expert musician, allowing rapid adaptation and error correction
(Katahira et al., 2008).
More generally, the early right limbic activation found in experts
might thus reflect an acquired automatic response to violations of
harmonic syntactic rules, preceding the activation of frontal and
cingulate areas, presumably associated with more explicit conscious
cognitive processes and error detection. Such rapid, automatic res-
ponses may result from experience-driven plasticity in neural path-
ways recruited by music perception.
Early differences: hemispheric asymmetry
With respect to auditory or music processing, it has been argued
that auditory cortices in the two hemispheres are relatively specia-
lized (Zatorre et al., 2002): right auditory cortex may subserve fine
grained pitch (frequency) processing, including musical stimuli;
whereas left auditory cortex may process rapidly changing broad-
band stimuli, including many aspects of speech. Clinical studies
confirm this right hemisphere dominance in auditory and medial–
temporal areas for processing of higher-order pitch cues (tonality) in
musical stimuli (Borchgrevink, 1982; Gosselin et al., 2006; Janata et al.,
2002a; Mazzola et al., 1989; Wieser, 2003). Our results, showing more
pronounced right lateralization in musicians relative to non-musi-
cians, suggest enhanced hemispheric specialization as a consequence
of musical training.
Later differences: ERP analysis
Brain responses to harmonic incongruity also differed between the
two groups during a subsequent period from 300 to 500 ms. Spatio-
temporal ERP analysis identified a single microstate over this period
in the layman group (Fig. 3b, Map 5, ~250–500 ms), a symmetrical
posterior negativity with low field strength. In contrast, experts
showed several successive microstates culminating in a centro-
parietal positivity. These differential responses in experts started
with a positivity at frontal electrodes (~320–385 ms) (Fig. 3b, Map 6)
reminiscent of the P3a or novelty P3 (Polich and Criado, 2006) that
then spread to a more centro-parietal positivity (~385–500 ms) (Fig.
3b, Maps 7 and 8) reminiscent of the P3b component, thought to
reflect context-based memory updating (Polich and Criado, 2006).
This P3b component has also been observed in response to
expectation violation in an artificial grammar learning task and
increased with training (Carrion and Bly, 2007).
Later differences: statistical analysis of putative sources
Source solutions indicated that these differential brain responses
for experts involved bilateral frontal regions in pre-motor and
supplementary motor areas, probably reflecting superior audio-
motor coordination in musicians, who can apply a translation of
auditory inputs into motor patterns to support their perceptual
analysis (Haueisen and Knosche, 2001). Since subjects were not
familiar with the stimuli, such motor involvement may be automatic
in professional musicians due to training and plasticity (Haueisen and
Knosche, 2001; Zatorre et al., 2007).
In addition, activations in ACC are consistent with a role in error
detection and decision making (Ridderinkhof et al., 2004), which
occurred along with reactivation of temporal–limbic areas. The ACC is
also part of the limbic system, and has been suggested to regulate
emotional processing (Bush et al., 2000). Other sources in right
orbitofrontal areas further support a selective involvement of limbic
areas in experts, possibly related to cognitive and affective processing
of musical stimuli (Blood et al., 1999; Janata et al., 2002a). Moreover,
orbitofrontal activation has been demonstrated previously in
response to syntactical incongruities in music (Tillmann et al.,
2006). Concomitant activation of visual areas during the same time
window may be interpreted as a by-product of imagery during music
listening, as often reported in the literature (Janata et al., 2002a,b).
Moreover, in previous work, musicians showed stronger activations in
visual areas than non-musicians during harmony processing
(Schmithorst and Holland, 2003).
Musical laymen
Although one might expect that non-musicians may show
distinctive patterns of brain activation, we found no specific effects
of harmony incongruity in laymen compared to experts in our time
window of analysis. The statistical source analyses consisted of
comparisons between groups by means of two-tailed t-tests; these
tests revealed selective enhancement of certain neural
responses in experts, but not in laymen. Consequently, all areas
activated in non-musicians were also implicated in the expert group.
This further corroborates the hypothetical development of specialized
processes from a common network as a consequence of training.
Moreover, early auditory responses and activation to regular ter-
minal chords evoked almost identical activation patterns in both
groups, supporting this interpretation. But if incongruities had been
more salient, similar activations might have occurred in the layman
group.
Critically, however, the asymmetrical right-sided negative ERP
response to incongruous chords, distinctively generated in right
temporal–limbic areas and insula in experts, did not arise in the
layman group at any time point within our window of analysis; nor
could it be observed when only successful incongruity detection
by the laymen was analyzed.
In contrast, for the later differences after 300 ms, it is possible that
laymen would exhibit certain similar or idiosyncratic activation patterns
as compared to experts after our time window of analysis (500 ms),
including possible differential responses between regular and
incongruous chords. In this case the observed “later” differences after 300 ms
could be partially due to latency differences and not to idiosyncratic
microstates of information processing. This holds particularly for the
cognitive activations in the ACC and also for the orbitofrontal activations
that have been observed previously in response to syntactical irre-
gularities in music in non-musicians (Tillmann et al., 2006).
Concerning behavioral responses, we consider that the forced binary
responses demanded here represented a disadvantage for the laymen.
In our behavioral pilot study, where we used a 5-point Likert scale,
ratings differed more clearly between regular and incongruous chords
for non-musicians. However, this most likely concerns later conscious
processing; the initial distinctive ERP response in experts appeared in a
more preconscious time window (Dehaene et al., 2006).
Conclusion
Our results demonstrate a highly specific and early response of right
temporal–limbic regions, encompassing hippocampal complex and
amygdala and of right insula, to subtle harmonic incongruities in
expressive music for expert pianists. These results suggest a key role of
these brain regions in processing the significance of musical chord
function, modulated by experience. These findings extend previous
studies that showed neural changes as a result of musical training for
primary sensory and motor areas of the cerebral cortex, rather than in
the limbic system and insula as here. Since no such difference between
groups was found for early auditory processing nor for regular endings,
we suggest that this particular pattern of limbic response of experts to
fine harmony incongruity might subserve early cognitive and emotive
analysis of musical chord function, and that this reflects functional
plasticity within these structures due to intensive and enduring musical
training. We speculate that in the dynamic context of stage perfor-
mance, these rapid responses would allow a professional to quickly
adapt in incongruous musical situations due to individual or contextual
errors. We therefore suggest that such rapid responses evoked by
musical expectation violation might also have utilitarian values.
Acknowledgments
We thank Emmanuel Bigand for his advice on the nature of the
musical incongruencies and Nicolaas Ravenstijn
(nico.ravenstijn@filiuscorvi.nl / www.filiuscorvi.com) for the compositions. We also
thank Armin Schnider for allowing us to use the EEG facilities at
Hôpital Beau-Séjour and our colleagues Sandra Lehmann and
Christian Camen for their help with the EEG recordings. Then we
thank Olivier Renaud for his advice on statistics. The Cartool software
(http://brainmapping.unige.ch/Cartool.htm) has been programmed by
Denis Brunet, from the Functional Brain Mapping Laboratory, Geneva,
Switzerland, and is supported by the Center for Biomedical Imaging
(CIBM) of Geneva and Lausanne.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in
the online version, at doi:10.1016/j.neuroimage.2008.06.025.
References
Bigand, E., 2003. More about the musical expertise of musically untrained listeners.
Ann. N. Y. Acad. Sci. 999, 304–312.
Bigand, E., Madurell, F., Tillmann, B., Pineau, M., 1999. Effect of global structure and
temporal organization on chord processing. J. Exp. Psychol. Hum. Percept. Perform.
25, 184–197.
Bigand, E., Poulin, B., Tillmann, B., Madurell, F., D'Adamo, D.A., 2003. Sensory versus
cognitive components in harmonic priming. J. Exp. Psychol. Hum. Percept. Perform.
29, 159–171.
Bigand, E., Tillmann, B., Poulin-Charronnat, B., Manderlier, D., 2005. Repetition priming:
is music special? Q. J. Exp. Psychol. A 58, 1347–1375.
Blood, A.J., Zatorre, R.J., Bermudez, P., Evans, A.C., 1999. Emotional responses to pleasant
and unpleasant music correlate with activity in paralimbic brain regions. Nat.
Neurosci. 2, 382–387.
Borchgrevink, H.M., 1982. Prosody and musical rhythm are controlled by the speech
hemisphere. In: Clynes, M. (Ed.), Music, Mind, and Brain. Plenum Press, New York,
pp. 151–157.
Bradley, M.M., Lang, P.J., 2000. Affective reactions to acoustic stimuli. Psychophysiology
37, 204–215.
Brecht, M., Schmitz, D., 2008. Neuroscience. Rules of plasticity. Science 319, 39–40.
Bush, G., Luu, P., Posner, M.I., 2000. Cognitive and emotional influences in anterior
cingulate cortex. Trends Cogn. Sci. 4, 215–222.
Carrion, R.E., Bly, B.M., 2007. Event-related potential markers of expectation violation in
an artificial grammar learning task. Neuroreport 18, 191–195.
Dehaene, S., Changeux, J.P., Naccache, L., Sackur, J., Sergent, C., 2006. Conscious,
preconscious, and subliminal processing: a testable taxonomy. Trends Cogn. Sci. 10,
204–211.
Flores-Gutierrez, E.O., Diaz, J.L., Barrios, F.A., Favila-Humara, R., Guevara, M.A., del Rio-
Portilla, Y., Corsi-Cabrera, M., 2007. Metabolic and electric brain patterns during
pleasant and unpleasant emotions induced by music masterpieces. Int. J.
Psychophysiol. 65, 69–84.
Friederici, A.D., 2002. Towards a neural basis of auditory sentence processing. Trends
Cogn. Sci. 6, 78–84.
Gomez, P., Danuser, B., 2007. Relationships between musical structure and psychophy-
siological measures of emotion. Emotion 7, 377–387.
Gosselin, N., Samson, S., Adolphs, R., Noulhiane, M., Roy, M., Hasboun, D., Baulac, M.,
Peretz, I., 2006. Emotional responses to unpleasant music correlates with damage
to the parahippocampal cortex. Brain 129, 2585–2592.
Gosselin, N., Peretz, I., Johnsen, E., Adolphs, R., 2007. Amygdala damage impairs emotion
recognition from music. Neuropsychologia 45, 236–244.
Halgren, E., Marinkovic, K., Chauvel, P., 1998. Generators of the late cognitive potentials
in auditory and visual oddball tasks. Electroencephalogr. Clin. Neurophysiol. 106,
156–164.
Hamalainen, M.S., Ilmoniemi, R.J., 1994. Interpreting magnetic fields of the brain:
minimum norm estimates. Med. Biol. Eng. Comput. 32, 35–42.
Haueisen, J., Knosche, T.R., 2001. Involuntary motor activity in pianists evoked by music
perception. J. Cogn. Neurosci. 13, 786–792.
Janata, P., Birk, J.L., Van Horn, J.D., Leman, M., Tillmann, B., Bharucha, J.J., 2002a. The
cortical topography of tonal structures underlying Western music. Science 298,
2167–2170.
Janata, P., Tillmann, B., Bharucha, J.J., 2002b. Listening to polyphonic music recruits
domain-general attention and working memory circuits. Cogn. Affect. Behav.
Neurosci. 2, 121–140.
Katahira, K., Abla, D., Masuda, S., Okanoya, K., 2008. Feedback-based error monitoring
processes during musical performance: an ERP study. Neurosci. Res. 61, 120–128.
Knight, R.T., 1997. Distributed cortical network for visual attention. J. Cogn. Neurosci. 9,
75–91.
Koelsch, S., Gunter, T., Friederici, A.D., Schroger, E., 2000. Brain indices of music
processing: “nonmusicians” are musical. J. Cogn. Neurosci. 12, 520–541.
Koelsch, S., Gunter, T.C., Schroger, E., Tervaniemi, M., Sammler, D., Friederici, A.D., 2001.
Differentiating ERAN and MMN: an ERP study. Neuroreport 12, 1385–1389.
Koelsch, S., Schmidt, B.H., Kansok, J., 2002. Effects of musical expertise on the early right
anterior negativity: an event-related brain potential study. Psychophysiology 39,
657–663.
Koelsch, S., Maess, B., Grossmann, T., Friederici, A.D., 2003. Electric brain responses
reveal gender differences in music processing. Neuroreport 14, 709–713.
Koelsch, S., Fritz, T., Schulze, K., Alsop, D., Schlaug, G., 2005. Adults and children
processing music: an fMRI study. Neuroimage 25, 1068–1076.
Koelsch, S., Jentschke, S., Sammler, D., Mietchen, D., 2007. Untangling syntactic and
sensory processing: an ERP study of music perception. Psychophysiology 44,
476–490.
Kondakor, I., Lehmann, D., Michel, C.M., Brandeis, D., Kochi, K., Koenig, T., 1997.
Prestimulus EEG microstates influence visual event-related potential microstates in
field maps with 47 channels. J. Neural. Transm. 104, 161–173.
Kropotov, J.D., Alho, K., Naatanen, R., Ponomarev, V.A., Kropotova, O.V., Anichkov, A.D.,
Nechaev, V.B., 2000. Human auditory–cortex mechanisms of preattentive sound
discrimination. Neurosci. Lett. 280, 87–90.
Lantz, G., Michel, C.M., Pascual-Marqui, R.D., Spinelli, L., Seeck, M., Seri, S., Landis, T.,
Rosen, I., 1997. Extracranial localization of intracranial interictal epileptiform
activity using LORETA (low resolution electromagnetic tomography). Electroence-
phalogr. Clin. Neurophysiol. 102, 414–422.
Lehmann, D., Ozaki, H., Pal, I., 1987. EEG alpha map series: brain micro-states by space-oriented
adaptive segmentation. Electroencephalogr. Clin. Neurophysiol. 67, 271–288.
Lehmann, D., Skrandies, W., 1980. Reference-free identification of components of
checkerboard-evoked multichannel potential fields. Electroencephalogr. Clin.
Neurophysiol. 48, 609–621.
Lotze, M., Scheler, G., Tan, H.R., Braun, C., Birbaumer, N., 2003. The musician's brain:
functional imaging of amateurs and professionals during performance and imagery.
Neuroimage 20, 1817–1829.
Maess, B., Koelsch, S., Gunter, T.C., Friederici, A.D., 2001. Musical syntax is processed in
Broca's area: an MEG study. Nat. Neurosci. 4, 540–545.
Maguire, E.A., Gadian, D.G., Johnsrude, I.S., Good, C.D., Ashburner, J., Frackowiak, R.S.,
Frith, C.D., 2000. Navigation-related structural change in the hippocampi of taxi
drivers. Proc. Natl. Acad. Sci. U. S. A. 97, 4398–4403.
Maguire, E.A., Spiers, H.J., Good, C.D., Hartley, T., Frackowiak, R.S., Burgess, N., 2003.
Navigation expertise and the human hippocampus: a structural brain imaging
analysis. Hippocampus 13, 250–259.
Martin, S.J., Morris, R.G., 2002. New life in an old idea: the synaptic plasticity and
memory hypothesis revisited. Hippocampus 12, 609–636.
Mazzola, G., Wieser, H.G., Brunner, V., Muzzulini, D., 1989. A symmetry-oriented
mathematical model of classical counterpoint and related neurophysiological
investigations by depth EEG. An International Journal Computers and Mathematics.
Symmetry: Unifying Human Understanding, II, 17. Pergamon Press, pp. 539–594.
McDermott, J., Hauser, M.D., 2005. Probing the evolutionary origins of music perception.
Ann. N. Y. Acad. Sci. 1060, 6–16.
Meyer, L.B., 1956. Emotion and Meaning in Music. The University of Chicago Press,
Chicago.
Michel, C.M., Seeck, M., Landis, T., 1999. Spatiotemporal dynamics of human cognition.
News Physiol. Sci. 14, 206–214.
Michel, C.M., Thut, G., Morand, S., Khateb, A., Pegna, A.J., Grave de Peralta, R., Gonzalez,
S., Seeck, M., Landis, T., 2001. Electric source imaging of human brain functions.
Brain Res. Brain Res. Rev. 36, 108–118.
Michel, C.M., Murray, M.M., Lantz, G., Gonzalez, S., Spinelli, L., Grave de Peralta, R.,
2004a. EEG source imaging. Clin. Neurophysiol. 115, 2195–2222.
Michel, C.M., Seeck, M., Murray, M.M., 2004b. The speed of visual cognition. Suppl. Clin.
Neurophysiol. 57, 617–627.
Munte, T.F., Altenmuller, E., Jancke, L., 2002. The musician's brain as a model of
neuroplasticity. Nat. Rev. Neurosci. 3, 473–478.
Murray, M.M., Michel, C.M., Grave de Peralta, R., Ortigue, S., Brunet, D., Gonzalez Andino,
S., Schnider, A., 2004. Rapid discrimination of visual and multisensory memories
revealed by electrical neuroimaging. Neuroimage 21, 125–135.
Murray, M.M., Camen, C., Gonzalez Andino, S.L., Bovet, P., Clarke, S., 2006. Rapid brain
discrimination of sounds of objects. J. Neurosci. 26, 1293–1302.
Murray, M.M., Brunet, D., Michel, C.M., 2008. Topographic ERP analyses: a step-by-step
tutorial review. Brain Topogr. 20, 249–264.
Ortigue, S., Michel, C.M., Murray, M.M., Mohr, C., Carbonnel, S., Landis, T., 2004. Electrical
neuroimaging reveals early generator modulation to emotional words. Neuroimage
21, 1242–1251.
Ortigue, S., Thut, G., Landis, T., Michel, C.M., 2005. Time-resolved sex differences in
language lateralization. Brain 128, E28 author reply E29.
Parncutt, R., 1988. Revision of Terhardt's psychoacoustical model of the root(s) of a
musical chord. Music Perception 6, 65–94.
Parncutt, R., 2006. Commentary on Cook and Fujisawa's “the psychophysics of harmony
perception: harmony is a three-tone phenomenon”. Empirical Musicology Review
1, 4, 204–209.
Pascual-Marqui, R.D., Michel, C.M., Lehmann, D., 1995. Segmentation of brain electrical activity
into microstates: model estimation and validation. IEEE Trans. Biomed. Eng. 42, 658–665.
Patel, A.D., Gibson, E., Ratner, J., Besson, M., Holcomb, P.J., 1998. Processing syntactic relations in
language and music: an event-related potential study. J. Cogn. Neurosci. 10, 717–733.
Perrin, F., Pernier, J., Bertrand, O., Giard, M.H., Echallier, J.F., 1987. Mapping of scalp
potentials by surface spline interpolation. Electroencephalogr. Clin. Neurophysiol.
66, 75–81.
Polich, J., Criado, J.R., 2006. Neuropsychology and neuropharmacology of P3a and P3b.
Int. J. Psychophysiol. 60, 172–185.
Poulin-Charronnat, B., Bigand, E., Koelsch, S., 2006. Processing of musical syntax tonic
versus subdominant: an event-related potential study. J. Cogn. Neurosci. 18, 1545–1554.
Pourtois, G., Delplanque, S., Michel, C., Vuilleumier, P., 2008. Beyond conventional event-related
brain potential (ERP): exploring the time-course of visual emotion processing
using topographic and principal component analyses. Brain Topogr. 20, 265–277.
R Development Core Team, 2007. R: a language and environment for statistical computing.
R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
http://www.R-project.org.
Regnault, P., Bigand, E., Besson, M., 2001. Different brain mechanisms mediate
sensitivity to sensory consonance and harmonic context: evidence from auditory
event-related brain potentials. J. Cogn. Neurosci. 13, 241–255.
Ridderinkhof, K.R., Ullsperger, M., Crone, E.A., Nieuwenhuis, S., 2004. The role of the
medial frontal cortex in cognitive control. Science 306, 443–447.
Rorden, C., Brett, M., 2000. Stereotaxic display of brain lesions. Behav. Neurol. 12,
191–200.
Rosburg, T., Trautner, P., Ludowig, E., Schaller, C., Kurthen, M., Elger, C.E., Boutros, N.N.,
2007. Hippocampal event-related potentials to tone duration deviance in a passive
oddball paradigm in humans. Neuroimage 37, 274–281.
Schlaug, G., 2001. The brain of musicians. A model for functional and structural
adaptation. Ann. N. Y. Acad. Sci. 930, 281–299.
Schmithorst, V.J., Holland, S.K., 2003. The effect of musical training on music processing:
a functional magnetic resonance imaging study in humans. Neurosci. Lett. 348,
65–68.
Schneider, P., Scherg, M., Dosch, H.G., Specht, H.J., Gutschalk, A., Rupp, A., 2002.
Morphology of Heschl's gyrus reflects enhanced activation in the auditory cortex of
musicians. Nat. Neurosci. 5, 688–694.
Schön, D., Regnault, P., Ystad, S., Besson, M., 2005. Sensory consonance: an event-related
brain potential study. Music Perception 23, 105–117.
Spinelli, L., Andino, S.G., Lantz, G., Seeck, M., Michel, C.M., 2000. Electromagnetic inverse
solutions in anatomically constrained spherical head models. Brain Topogr. 13, 115–125.
Srebro, R., 1996. A bootstrap method to compare the shapes of two scalp fields.
Electroencephalogr. Clin. Neurophysiol. 100, 25–32.
Steinbeis, N., Koelsch, S., Sloboda, J.A., 2005. Emotional processing of harmonic
expectancy violations. Ann. N. Y. Acad. Sci. 1060, 457–461.
Steinbeis, N., Koelsch, S., Sloboda, J.A., 2006. The role of harmonic expectancy violations
in musical emotions: evidence from subjective, physiological, and neural responses.
J. Cogn. Neurosci. 18, 1380–1393.
Thierry, G., Martin, C.D., Downing, P., Pegna, A.J., 2007. Controlling for interstimulus
perceptual variance abolishes N170 face selectivity. Nat. Neurosci. 10, 505–511.
Tillmann, B., Bharucha, J.J., Bigand, E., 2000. Implicit learning of tonality: a self-
organizing approach. Psychol. Rev. 107, 885–913.
Tillmann, B., Koelsch, S., Escoffier, N., Bigand, E., Lalitte, P., Friederici, A.D., von Cramon,
D.Y., 2006. Cognitive priming in sung and instrumental music: activation of inferior
frontal cortex. Neuroimage 31, 1771–1782.
Trehub, S.E., Schellenberg, E.G., Kamenetsky, S.B., 1999. Infants' and adults' perception of
scale structure. J. Exp. Psychol. Hum. Percept. Perform. 25, 965–975.
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N.,
Mazoyer, B., Joliot, M., 2002. Automated anatomical labeling of activations in SPM
using a macroscopic anatomical parcellation of the MNI MRI single-subject brain.
Neuroimage 15, 273–289.
Vaughan Jr., H.G., 1982. The neural origins of human event-related potentials. Ann. N. Y.
Acad. Sci. 388, 125–138.
Wieser, H.G., 2003. Music and the brain. Lessons from brain diseases and some
reflections on the “emotional” brain. Ann. N. Y. Acad. Sci. 999, 76–94.
Wieser, H.G., Mazzola, G., 1986. Musical consonances and dissonances: are they
distinguished independently by the right and left hippocampi? Neuropsychologia
24, 805–812.
Zatorre, R.J., Belin, P., Penhune, V.B., 2002. Structure and function of auditory cortex:
music and speech. Trends Cogn. Sci. 6, 37–46.
Zatorre, R.J., Chen, J.L., Penhune, V.B., 2007. When the brain plays music: auditory–motor
interactions in music perception and production. Nat. Rev. Neurosci. 8, 547–558.
Zentner, M.R., Kagan, J., 1996. Perception of music by infants. Nature 383, 29.
Zumsteg, D., Friedman, A., Wennberg, R.A., Wieser, H.G., 2005. Source localization of
mesial temporal interictal epileptiform discharges: correlation with intracranial
foramen ovale electrode recordings. Clin. Neurophysiol. 116, 2810–2818.
1608 C.E. James et al. / NeuroImage 42 (2008) 1597–1608