When That Tune Runs Through Your Head: A PET Investigation of Auditory Imagery for Familiar Melodies


The present study used positron emission tomography (PET) to examine the cerebral activity pattern associated with auditory imagery for familiar tunes. Subjects either imagined the continuation of nonverbal tunes cued by their first few notes, listened to a short sequence of notes as a control task, or listened and then reimagined that short sequence. Subtraction of the activation in the control task from that in the real-tune imagery task revealed primarily right-sided activation in frontal and superior temporal regions, plus supplementary motor area (SMA). Isolating retrieval of the real tunes by subtracting activation in the reimagine task from that in the real-tune imagery task revealed activation primarily in right frontal areas and right superior temporal gyrus. Subtraction of activation in the control condition from that in the reimagine condition, intended to capture imagery of unfamiliar sequences, revealed activation in SMA, plus some left frontal regions. We conclude that areas of right auditory association cortex, together with right and left frontal cortices, are implicated in imagery for familiar tunes, in accord with previous behavioral, lesion and PET data. Retrieval from musical semantic memory is mediated by structures in the right frontal lobe, in contrast to results from previous studies implicating left frontal areas for all semantic retrieval. The SMA seems to be involved specifically in image generation, implicating a motor code in this process.
Figure 2. Merged PET/MRI images illustrating selected regions of significant increase in CBF in each of three comparisons. (Top panel) Areas of CBF increase in the Cue/Image – Control subtraction (keyed to Table 2). The leftmost image corresponds to a horizontal section (z = 7) and shows the activity in the right and left inferior frontal gyrus, corresponding to foci 2 and 5 respectively in Table 2. Also visible in the horizontal section is activation in the right superior temporal gyrus (focus 9 in Table 2). The right frontal and temporal areas of activity are also shown in the two parasagittal sections through the right hemisphere shown in the middle of the top panel (x = 46 and 55). The positions of the dotted lines indicate corresponding planes of section in the figures. The rightmost image in the top panel shows the SMA activity in a midsagittal section (focus 11 in Table 2). (Middle panel) Areas of CBF increase in the Cue/Image – Control/Image subtraction (keyed to Table 3). The horizontal section (z = 6) and the two parasagittal sections (x = 40 and 54) illustrate the activation in right and left frontal areas (foci 2 and 4 of Table 3), and in the right superior temporal cortex (focus 6). Note the similarity of activation in right and left inferior frontal areas and right superior temporal cortex to the upper panel. Note also the absence of SMA activity in the midsagittal section (far right). (Bottom panel) Images associated with the Control/Image – Control subtraction (keyed to Table 4). The horizontal section (z = 7) is included to demonstrate the absence of activation in frontal or temporal areas comparable to those shown in the other two panels. The middle image is a coronal section (y = 4) to show the region of left midfrontal activity (focus 3 in Table 4). Also visible in this section and in the midsagittal section (far right) is activity within the SMA (focus 7 in Table 4).
Cognitive scientists are interested in the mental structures that
underlie the experience of imagery, or mental acts in which we
seem to re-enact the experience of perceiving an object when
the object is no longer available. Cognitive psychologists have
wondered whether this experience is fundamentally different to
the more abstract mentation used in recalling facts or solving
arithmetic problems. Purely behavioral methods have shown
intriguing similarities between performance on perceptual and
imaginal versions of the same task (Farah, 1989) or facilitation by
an imagined stimulus on performance of a task involving a
perceived stimulus (Hubbard and Stoeckig, 1988). However,
these behavioral methods have their limitations in helping us
decide whether perception and imagery share similar mental
structures. For instance, although similarity of response patterns
in perceived and imagined tasks may indicate shared mental
structures, the similarity may be coincidental or epiphenomenal.
To investigate the nature of mental imagery further, researchers
have turned to physiological evidence that imagery
and perception may share actual neural structures. Farah
reviewed evidence from brain-damaged patients who show
parallel deficits in visual imagery and perception skills after
damage to particular brain areas (Farah, 1988). More directly, a
number of researchers have employed brain-imaging technology
to observe the brain areas that are active while participants
perceive or imagine stimuli. To the extent that brain areas
known to be associated with sensory processing are active
during imagery tasks, we may conclude that the brain efficiently
uses similar areas both to process information initially and
to reactivate it for further processing.
For the case of visual imagery, several studies have indeed
found the hypothesized activation in visual cortical areas during
imagery tasks (Kosslyn et al., 1993) [reviewed by Farah (Farah,
1995) and Mellet et al. (Mellet et al., 1998)]. Many studies
investigating visual imagery have found evidence that visual
association areas (and sometimes primary areas) are engaged
during visual imagery tasks. This pattern obtains over different
kinds of imagery tasks and different brain-imaging techniques,
including SPECT (Goldenberg et al., 1989) and ERP (Farah et al.,
1989) as well as positron emission tomography (PET) and functional
magnetic resonance imaging (fMRI) mentioned above.
In addition to the localization of visual areas active in imagery,
researchers have investigated the lateralization of imagery
processes. The findings related to this question are mixed. But as
Mellet and co-workers have pointed out, the degree of lateralization
may depend on both the complexity of the task (leading
to more right-sided activation) and nameability of the stimuli
(leading to more left-sided activation) (Mellet et al., 1998).
All these studies have examined visual imagery exclusively. To
what extent can we extend conclusions made in the visual
domain to other domains, specifically audition? Behavioral
investigations of auditory imagery have suggested that it can be
manipulated and measured in ways similar to those used for visual imagery.
For instance, Halpern asked people to mentally compare the
pitches corresponding to two words drawn from familiar tunes
(Halpern, 1988). She found that the time taken to respond was
systematically related to the number of beats separating the
words in the real tune. The interpretation here was that as visual
images represent space, auditory images may represent time or
other auditory properties, such as loudness (Farah and Smith,
1983). This certainly accords with subjective reports that people
can ‘hear’ sounds in their heads.
In work previous to the current investigation, we asked
whether auditory imagery for music might be mediated by the
same brain areas as auditory perception. Our first study (Zatorre
and Halpern, 1993) presented the mental pitch comparison task
described above to patients who had undergone right or left
temporal-lobe excisions for the relief of epilepsy, plus normal
controls. We also presented a perceptual version in which the
pitches were to be compared while the song in question was
actually being presented. Our results were straightforward: all
groups found the imagery task more difficult, as expected, but
only the right temporal lobectomy group showed a performance
deficit. They were impaired relative to the other groups by the
same amount on both imagery and perception tasks.
This result seemed to implicate the right temporal lobe as
being necessary to perform both auditory imagery and perception
tasks, but by this approach we could not see what structures
would normally be active during performance of such a task.
Cerebral Cortex Oct/Nov 1999;9:697–704; 1047–3211/99/$4.00
Andrea R. Halpern and Robert J. Zatorre
Psychology Department, Bucknell University, Lewisburg, PA 17837, USA and
Montreal Neurological Institute, McGill University, Montreal, Canada
© Oxford University Press 1999
Consequently, our next study (Zatorre et al., 1996) utilized
PET to look at cerebral blood flow (CBF) in normal volunteers
during essentially the same imagery and perception tasks
described above. Each of these experimental conditions was
subtracted from a visual baseline, in which the words from the
songs were randomly paired and presented on a screen for a
visual length judgment. The subtractions revealed remarkable
similarity in CBF patterns in the perception and imagery
conditions. Notably, both tasks activated auditory association
regions in the superior temporal gyrus (STG) bilaterally, as well
as several areas bilaterally in the frontal lobe and one parietal
area. The supplementary motor area (SMA) was also activated in
both tasks. Outside of activation unique to primary auditory
cortex in the perception condition (the only one with actual
auditory input), only four brain regions showed statistically
significant differences between imagery and perception tasks,
including frontopolar areas bilaterally, the subcallosal gyrus
and, in the only lateralized effect we found, right thalamus.
These two experiments confirmed the importance of auditory
association areas in mediating auditory imagery, analogously
to the visual imagery results described previously. However, we
were left with several open questions. First, it is evident that our
imagery and perception tasks were complex. They involved
retrieval of a tune from musical semantic memory (in the
imagery task), the rehearsal of the first pitch while the second
was retrieved (a working memory task in both), and finally the
pitch comparison and decision. We could not separate these
components and thus link them to various areas in the frontal
lobes that have been implicated in the literature on brain areas
active in memory tasks. Second, the two experiments seemed to
be at variance with one another concerning lateralized effects.
The lesion study suggested that the right temporal lobe is
essential for processing heard and imagined musical stimuli. This
is consistent with numerous other studies showing an important
role for the right temporal neocortex in tonal perception tasks
(Milner, 1962; Zatorre, 1985, 1988; Divenyi and Robinson, 1989;
Robin et al., 1990; Zatorre and Samson, 1991; Zatorre et al.,
1994; Samson and Zatorre, 1994; Liégeois-Chauvel et al., 1998).
However, the PET study of musical imagery showed a bilateral
pattern of activation in the STG.
Regarding the first set of concerns, several groups of
researchers have been investigating the involvement of the
frontal lobes in working memory tasks. For example, Petrides et
al. have found using PET that the dorsolateral frontal cortex
bilaterally (Brodmann areas 46 and 9) is important in mediating
self-generated working memory tasks, such as randomly
generating numbers from 1 to 10 without repeating any (Petrides
et al., 1993). Smith and Jonides have also reported dorsolateral
frontal activity in a working memory task in which subjects have
to compare the current stimulus with the stimulus two or three
positions back in a list for a same–different judgment (Smith and
Jonides, 1997). Braver et al. parametrically varied working
memory load (from zero-back to three-back comparisons) and
found that activation in areas 46 and 9 increased monotonically
with increasing working memory load (Braver et al., 1997).
The other major component of our original imagery task,
retrieval from semantic memory, has been somewhat less
studied. Nyberg et al. have reviewed the possible differences in
brain areas active during retrieval from episodic and semantic
memory (Nyberg et al., 1996). In their HERA model (hemispheric
encoding/retrieval asymmetry), they implicate the left
prefrontal area for semantic retrieval (which they also identify
with episodic encoding, on the assumption that retrieval of a
semantic fact will be newly encoded as an episodic memory). In
contrast, they implicate the right prefrontal area in episodic
retrieval. Gabrieli et al. similarly found that judging words as
being abstract or concrete (a semantic memory task)
selectively activated the left inferior frontal gyrus (areas 45, 46
and 47), as measured by fMRI (Gabrieli et al., 1996). However,
the role of these regions of the left inferior frontal cortex may be
more general, since numerous studies have consistently found
CBF increases in this location during tasks that require lexical
search and retrieval, including noun–verb generation (Petersen
et al., 1988), synonym generation or translation (Klein et al.,
1995), or stem completion (Buckner et al., 1995).
Most of these studies have used verbal stimuli. Petrides
and colleagues usually find bilateral activation in their verbal
working memory tasks, although Smith and Jonides have
evidence that verbal working memory is left lateralized whereas
spatial working memory is right lateralized (Smith et al., 1996).
Tulving and colleagues, however, assert that the episodic/semantic
distinction in retrieval cuts across materials. Nyberg et
al. cite numerous cases of object and face memory that seem to
follow the left = semantic and right = episodic retrieval scheme
(Nyberg et al., 1996). None of the studies in this literature have
used music, which, as noted above, is known to be heavily
dependent on right hemisphere structures. This brings us to the
second concern remaining from our prior PET study, the lack of
asymmetrical right-sided activations in either the imagery or
perception task. Recall, however, that in both of our previous
studies the tunes we used had lyrics associated with them. It is
possible that the bilateral findings of our first PET study were
due to the fact that the task required processing of both words
and music.
In view of both the memory literature cited and our own
previous research, we decided to construct a musical imagery
task that did not require processing of lyrics. By presenting
stimuli that were exclusively musical, we minimized the involvement
of verbal processing structures in the left hemisphere. This
would, we hoped, reveal music-specific structures in the right
hemisphere in the temporal lobe (as predicted from both our
previous studies), as well as the frontal lobe. In addition, we
modified our task to try to better separate the retrieval of
information from musical semantic memory versus the working
memory task of keeping the memory traces active for a period of
time. The main imagery task (Cue/Imagery) involved presentation
of the first few notes of familiar, but nonverbal tunes, such
as movie themes and excerpts from classical music. Participants
then had to imagine the rest of the tune, and press a button when
they completed the task. This involved imagery and working
memory processes, as well as retrieval of the tune from semantic
memory. In the Control task, a novel tone sequence derived from
the real tune fragment was presented, and subjects simply
pressed a button at the end of each one. In a second control task
called Control/Imagery, the same novel tone sequence was
presented and subjects had to reimagine the sequence, then
press a button. This task involves working memory (because the
sequence had to be remembered) and imagery (because it had to
be rehearsed), but does not require retrieval from semantic
memory, as the sequence was novel. Subtracting the Control task
from the Control/Imagery task should remove the effects of
hearing a note sequence and thereby isolate imagery and
working memory processes. Subtracting the Control/Imagery
task from the Cue/Imagery task should remove the effects of
auditory input, working memory and imagery, thereby isolating
musical semantic retrieval. A summary of these tasks is presented
in Table 1.
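The subtraction logic described above can be made concrete with a small sketch. The component labels are informal shorthand for the processes named in the text, not terms defined by the authors, and treating each task as a set of additive components is the usual simplifying assumption of the subtraction method rather than something the data guarantee.

```python
# Hypothetical process labels for each task, based on the task
# descriptions in the text. Additivity of components is assumed.
TASKS = {
    "Control": {"auditory input", "motor response"},
    "Control/Imagery": {"auditory input", "motor response",
                        "working memory", "imagery"},
    "Cue/Imagery": {"auditory input", "motor response",
                    "working memory", "imagery", "semantic retrieval"},
}


def isolate(task, baseline):
    """Return the processes left after subtracting baseline from task."""
    return TASKS[task] - TASKS[baseline]


if __name__ == "__main__":
    comparisons = [("Control/Imagery", "Control"),
                   ("Cue/Imagery", "Control/Imagery"),
                   ("Cue/Imagery", "Control")]
    for task, baseline in comparisons:
        print(f"{task} - {baseline}: {sorted(isolate(task, baseline))}")
```

Under these assumptions, only the Cue/Imagery minus Control/Imagery comparison leaves semantic retrieval on its own, which is why that subtraction carries the test of the retrieval hypothesis.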
To summarize, we had four goals for this experiment. First,
we were interested in testing the generality of our results from
our first PET study (Zatorre et al., 1996) with new sets of
materials and a new task. Specifically, we hoped to confirm the
activation of the auditory association areas in the STG during
silent auditory imagery tasks. We also sought to confirm the
activation of SMA and frontal areas during our tasks. Second, we
hypothesized that eliminating all words from our materials
would show asymmetrical activation in right temporal and
frontal areas. Third, we were interested in seeing whether
working memory associated with musical tasks would activate
areas in the dorsolateral frontal lobe as found by previous
researchers. Although most previous work has found bilateral or
left-sided activation during verbal working memory tasks, Smith
et al. have suggested that spatial working memory is mediated by
right prefrontal structures (Smith et al., 1996). Music provides
an interesting test of the generality of this finding. Thus we were
particularly interested to see if we found any asymmetry in
dorsolateral frontal cortex activation when listeners had to
imagine a novel sequence just presented. We also wondered
whether SMA activation would be associated with the working
memory aspect of our task. Our thinking here was that activation
of motor codes, with which the SMA has been associated
(Rao et al., 1997), may assist listeners in preserving memory
traces of tunes.
Finally, the aspect of our task requiring retrieval from musical
semantic memory provides a test of the HERA model of Tulving
and colleagues, in which semantic retrieval is associated
exclusively with left frontal areas. If our task elicits substantial
right frontal activation when the familiar tune fragments are
presented for mental continuation, we would have to question
the generality of the HERA formulation in favor of a material-
specific scheme.
Materials and Methods
Eight right-handed volunteers (five women, three men, mean age 24)
participated after giving informed consent in accord with ethical
guidelines in place at the Montreal Neurological Institute. Subjects had
received varying amounts of musical training, with a range of 3–16 years
of formal music lessons, and an average of 9.5 years.
Three types of stimuli were prepared: melodic themes, cue sequences
and control sequences (Fig. 1). The melodic themes were only used for
familiarization; during scanning only cue and control sequences were
used. Melodies were initially selected based on two criteria: that they not
be associated with lyrics, and that they be rated as familiar in pilot
testing. Fifteen melodies were chosen from among classical music (e.g.
dances from the Nutcracker Suite, opening theme from Beethoven’s Fifth
Symphony), television shows or movies (e.g. themes from Dallas, Star
Wars), and from other popular sources (e.g. chimes of Big Ben, Scott
Joplin’s ‘The Entertainer’). The initial few bars of each of these melodies
were then selected as the melodic themes to be used for the study. One
additional feature of these themes is that they were selected to fall into
one of three categories based on their duration, with average durations of
short, medium and long themes being 2.2, 4.8 and 6.2 s respectively.
This manipulation enabled us to measure the time taken to imagine the
theme. The end point of each theme coincided with phrasal boundaries.
Fifteen cue sequences were then created by taking the first few notes
from each theme (Fig. 1). Pilot testing ensured that the cues uniquely
specified one of the target melodies, and that subjects had no difficulty
generating an internal image from each cue. Finally, 15 control
sequences were created by randomly permuting the tones within each
cue sequence, to create a set of control sequences that were matched for
number of tones, total duration and types of rhythmic and tonal intervals.
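A minimal sketch of how such control sequences could be generated is given here. The note representation (pitch name plus duration in seconds) and the function name are hypothetical, and the sketch assumes that each tone moves as a unit, with its duration attached; the text does not say whether pitches and durations were permuted together. The example cue is made up and is not one of the study's stimuli.

```python
import random


def make_control_sequence(cue, rng=None, max_tries=100):
    """Return a random reordering of cue that differs from the original.

    cue is a list of (pitch, duration_s) tuples, a hypothetical stand-in
    for the notated stimuli.
    """
    rng = rng or random.Random()
    if len(set(cue)) < 2:
        raise ValueError("need at least two distinct tones to reorder")
    for _ in range(max_tries):
        control = list(cue)
        rng.shuffle(control)
        if control != cue:  # reject the identity permutation
            return control
    raise RuntimeError("could not find a reordering")


# Made-up five-note cue, not a real stimulus from the study.
example_cue = [("G4", 0.25), ("C5", 0.5), ("G4", 0.25),
               ("E5", 0.5), ("D5", 0.75)]
example_control = make_control_sequence(example_cue, random.Random(0))
```

Permuting whole tones keeps the number of tones and the total duration identical by construction; matching the types of intervals, as the text describes, would need an additional check that this sketch does not attempt.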
Subjects were screened to ensure familiarity with the tunes to be used,
and to indicate to them the endpoints selected for each melody. Subjects
rated each melody for familiarity; all subjects who were retained for
scanning rated the stimuli as familiar or very familiar (average rating of
1.2 on a scale of 1–5). Subjects were also instructed to pay close attention
to the endpoint of each melody. On the day of scanning, subjects were
once again presented with each of the themes to remind them of the
stimulus materials to be used, and to help them recall the endpoints.
Three conditions were tested during scanning: Control, Cue/Image
and Control/Image, in that order (see Table 1). In the Control condition,
subjects were presented with each of the control sequences described
above; they were instructed simply to listen to each sequence, and to
press a mouse key after each stimulus. In the Cue/Image condition,
subjects heard the cues associated with the themes (i.e. the first few notes
of the tune), and were asked to imagine the continuation of each melody
as it followed from the cue. They were further instructed to stop
imagining the melody at the same point as had been demonstrated during
screening, and to press the mouse key at that point. In the Control/Image
condition subjects were told to listen to the control sequences, and were
then instructed to imagine the same sequence just heard, and to press the
key when this was accomplished.
Practice trials were given prior to each scan condition using the same
stimuli but in a different randomization. Intertrial intervals were 5, 6 or
7 s for the short, medium and long trials respectively (to allow sufficient
time for subjects to image the appropriate duration for each trial). The
tasks were begun prior to the onset of scanning, which typically started
between the third and fifth trials. Stimuli were presented binaurally over
insert earphones (EAR Tone Type 3A), which had been calibrated for an
average intensity of 72 dB SPL (A).
Table 1
Summary of experimental paradigm
Condition      Stimulus               Task                           Imagery  Semantic retrieval
Control        control tone sequence  listen                         no       no
Cue/Image      cue sequence           listen and image rest of tune  yes      yes
Control/Image  control tone sequence  listen and image control       yes      no
Figure 1. Illustration of the stimuli used, in musical notation. The first line illustrates a
melodic theme taken from the television show Dallas which was played to subjects
during screening sessions and prior to scanning. The second line illustrates the cue
sequence for this item (consisting of the first five notes of the theme), which was used
during the Cue/Image condition. Subjects were instructed to imagine the rest of the
theme continuing from the cue sequence. The third line shows a control sequence
(consisting of a random permutation of the five notes), which was used during the
Control and Control/Image conditions.
PET Scanning
PET scans were obtained with a Siemens Exact HR+ tomograph operating
in three-dimensional acquisition mode. The distribution of CBF was
measured during each 60 s scan using the H₂¹⁵O water bolus method
(Raichle et al., 1983). MRI scans (160 slices, 1 mm thick) were also
obtained for each subject with a 1.5 T Philips ACS system to provide
anatomical detail. CBF images were reconstructed using a 14 mm
Hanning filter, normalized for differences in global CBF, and coregistered
with the individual MRI data (Evans et al., 1992). Each matched MRI/PET
data set was then linearly resampled into the standardized stereotaxic
coordinate system of Talairach and Tournoux (Talairach and Tournoux,
1988) via an automated feature-matching algorithm (Collins et al., 1994).
PET images were averaged across subjects for each condition, and the
mean change image volume obtained for each comparison; this volume
was converted to a t-statistic map, and the significance of focal CBF
changes was assessed by a method based on three-dimensional Gaussian
random-field theory (Worsley et al., 1992). The presence of significant
changes in CBF was first established on the basis of an exploratory search,
for which the t-value criterion was set at 3.53 or greater. This value
corresponds to an uncorrected P-value of 0.0004 (two-tailed), and results
in an average of 0.58 false positives per search volume of 182 resolution
elements (dimensions of 14 × 14 × 14 mm), corresponding approximately
to the volume of gray matter scanned. For the superior temporal region,
where activity had been predicted based on previous findings, the
threshold was lowered to t = 3.0.
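As a rough check on the thresholds quoted above, the sketch below converts a threshold into a two-tailed tail probability. It assumes the statistic can be treated as approximately standard normal, which is an assumption made here for illustration; the random-field calculation behind the figure of 0.58 false positives per search volume is not reproduced.

```python
import math


def two_tailed_p(z):
    """Two-tailed tail probability of a standard normal at |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


EXPLORATORY_T = 3.53  # whole-volume search threshold from the text
PREDICTED_T = 3.0     # relaxed threshold for the predicted temporal region

if __name__ == "__main__":
    for label, t in [("exploratory", EXPLORATORY_T),
                     ("predicted", PREDICTED_T)]:
        print(f"{label}: t = {t}, two-tailed p = {two_tailed_p(t):.4f}")
```

Under the normal approximation the exploratory threshold comes out at roughly 0.0004, in line with the uncorrected value stated in the text, and the relaxed threshold at roughly 0.0027.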
Behavioral Data
All subjects indicated that they had been able to generate the
musical images required during the Cue/Image and Control/
Image tasks, and subjectively felt that they had a strong sense of
vivid auditory imagery (‘hearing the music in my head’). Mean
latencies for seven of the eight subjects (data for the eighth
subject were lost due to computer error) to key press, measured
from the onset of the cue in the Cue/Image condition for the
short, medium and long trials were 2.72, 3.72 and 4.55 s respectively.
These values were entered into an analysis of variance,
which indicated a significant difference among them [F(2,12) =
32.36, P < 0.001]; this provides behavioral evidence that subjects
were generating an auditory image in conformity with the
desired stimulus duration. All seven subjects showed the pattern
of increasing latency with increasing length of theme.
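The degrees of freedom reported above, F(2,12), are what a one-way repeated-measures analysis with three trial lengths and seven subjects would give: (3 − 1) = 2 for the effect and (3 − 1)(7 − 1) = 12 for the error term. The text does not state which ANOVA variant was run, so the sketch below assumes a repeated-measures design. The function name is hypothetical, and the small data matrix is a made-up placeholder used only to exercise the code; it is not the study's latency data.

```python
def rm_anova_f(data):
    """One-way repeated-measures ANOVA.

    data: list of per-subject lists, one value per condition.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-subject residual
    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f = (ss_cond / df_effect) / (ss_error / df_error)
    return f, df_effect, df_error


# Placeholder values (3 subjects x 3 conditions), NOT data from the study.
toy = [[1.0, 2.0, 4.0], [2.0, 3.0, 4.0], [1.0, 3.0, 5.0]]
toy_f, toy_df_effect, toy_df_error = rm_anova_f(toy)
```

Removing the between-subject sum of squares before forming the error term is what distinguishes this from a between-groups ANOVA, and it is why consistent within-subject increases can yield a large F even with only seven subjects.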
Analysis of CBF data
Comparisons were performed by subtracting the Control task
from each of the other two, and also by comparing the
Cue/Image and the Control/Image tasks to one another. Table 2
shows the stereotaxic coordinates and t-values for foci in the
Cue/Image – Control subtraction. In addition, the two rightmost
columns show the foci from the other two subtractions that
correspond in location to those in Table 2.
The Cue/Image – Control subtraction was intended to capture
all the processes involved in musical imagery, controlling for
physical stimulus input and response output. One question of
interest was whether greater activity would be detected in the
right frontal region than in the left in this subtraction. The
Cue/Image – Control comparison did yield significant activation
within the right inferior frontal gyrus (focus 1, area 10/47), with
no homologous activation on the left. A second focus in right
frontal cortex (focus 2, area 45; see Fig. 2, top), was matched by
a similar region of activity on the left (focus 5), but at a much
lower level of significance. A third region in the middle frontal
gyrus was approximately equally active in the two hemispheres
(foci 3 and 6, area 46). The only region that was uniquely active
in the left frontal lobe was located approximately within area 44
(focus 7).
In addition, significant CBF increases were noted in the
predicted area of the right superior temporal cortex (focus 9,
visible in Fig. 2, top panel; the t-value of 3.41 in this case falls just
below the t-threshold for exploratory search set a priori, but is
well above the threshold for predicted activation sites), and also
in the right inferior temporal cortex (focus 10). No significant
activity was detected in the left temporal lobe, even using the
lower t-threshold value. Finally, and also in keeping with
the predictions, this subtraction yielded activity within the SMA
(focus 11; Fig. 2).
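The two-threshold rule applied in this paragraph can be sketched as follows. The coordinates and t-values are copied from Table 2; the tuple layout, the flag name and the function are hypothetical, and the relaxed threshold is applied only to the superior temporal focus because that is the only region for which the text states it was lowered.

```python
EXPLORATORY_T = 3.53  # threshold for the whole-volume exploratory search
PREDICTED_T = 3.0     # threshold for the predicted superior temporal region

# (label, x, y, z, t, relaxed_threshold_applies) for selected Table 2 foci.
FOCI = [
    ("Right inferior frontal gyrus (45)", 46, 13, 6, 6.22, False),
    ("Right superior temporal gyrus (22)", 56, -30, 8, 3.41, True),
    ("Supplementary motor area (6)", -1, 8, 62, 7.47, False),
]


def is_significant(t, relaxed):
    """Apply the relaxed threshold only where activity was predicted."""
    threshold = PREDICTED_T if relaxed else EXPLORATORY_T
    return t >= threshold


if __name__ == "__main__":
    for label, x, y, z, t, relaxed in FOCI:
        verdict = "significant" if is_significant(t, relaxed) else "n.s."
        print(f"{label} at ({x}, {y}, {z}), t = {t}: {verdict}")
```

The superior temporal focus passes only because of the a priori prediction; the same t-value in an unpredicted region would not survive the exploratory criterion.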
The Cue/Image – Control/Image subtraction (Table 3 and
Fig. 2, middle) was intended to isolate processing components
associated with retrieving real tunes from semantic memory.
This comparison yielded a number of foci that were very similar
in location to those elicited by the Cue/Image – Control
subtraction. In fact, all nine foci in Table 3 are within regions also
shown in Table 2, as indicated. The inferior frontal gyrus, area
10/47 (focus 1), again showed a significant CBF increase in the
right hemisphere only, as in the previous subtraction. Areas 45
and 46 (foci 2–5) showed bilateral activation, but with much
higher activity on the right. Importantly, this subtraction also
yielded significant activity within the right STG (focus 6), as
well as the right inferior temporal gyrus (focus 7), without any
corresponding activation on the left. It is notable, however, that
this comparison did not yield any detectable activity within the
SMA, even applying the less stringent statistical cutoff for
predicted areas.
Table 2
Stereotaxic coordinates and significance levels of activation foci in the Cue/Image – Control
subtraction. The two far right columns indicate foci in corresponding locations for the Cue/Image
– Control/Image subtraction (Table 3) and the Control/Image – Control subtraction (Table 4)
                                                          Corresponding foci
                                                          in other conditions
Region                                x    y    z     t   Table 3  Table 4
Right frontal cortex
 1. Inferior frontal gyrus (10/47)   34   53   –9  4.57      1
                                     38   42   –3  4.13
 2. Inferior frontal gyrus (45)      46   13    6  6.22      2
 3. Middle frontal gyrus (46)        36   46   23  3.63      3
 4. Precentral gyrus (6)             51   –1   47  4.20
Left frontal cortex
 5. Inferior frontal gyrus (45)     –42   12    2  3.70      4
                                    –35   22    5  4.28
 6. Middle frontal gyrus (46/9)     –34   44   18  3.84      5
 7. Middle frontal gyrus (44)       –48   10   23  5.49               3
 8. Precentral gyrus (6)            –48   –6   41  4.78               6
Right temporal cortex
 9. Superior temporal gyrus (22)     56  –30    8  3.41      6
10. Inferior temporal gyrus (37)     62  –42   –8  4.40      7
Other regions
11. Supplementary motor area (6)     –1    8   62  7.47               7
12. Anterior cingulate (32)          –4   29   30  3.53      8
13. Right parietal cortex (40)       46  –49   51  4.87      9
14. Precuneus (7)                    –4  –69   36  3.64
700 PET Study of Auditory Imagery Halpern and Zatorre
Figure 2. Merged PET/MRI images illustrating selected regions of significant increase in CBF in each of three comparisons. (Top panel) Areas of CBF increase in the Cue/Image –
Control subtraction (keyed to Table 2). The leftmost image corresponds to a horizontal section (z = 7) and shows the activity in the right and left inferior frontal gyrus, corresponding
to foci 2 and 5 respectively in Table 2. Also visible in the horizontal section is activation in the right superior temporal gyrus (focus 9 in Table 2). The right frontal and temporal areas
of activity are also shown in the two parasagittal sections through the right hemisphere shown in the middle of the top panel (x = 46 and 55). The positions of the dotted lines indicate
corresponding planes of section in the figures. The rightmost image in the top panel shows the SMA activity in a midsagittal section (focus 11 in Table 2). (Middle panel) Areas of
CBF increase in the Cue/Image – Control/Image subtraction (keyed to Table 3). The horizontal section (z = 6) and the two parasagittal sections (x = 40 and 54) illustrate the activation
in right and left frontal areas (foci 2 and 4 of Table 3), and in the right superior temporal cortex (focus 6). Note the similarity of activation in right and left inferior frontal areas and right
superior temporal cortex to the upper panel. Note also the absence of SMA activity in the midsagittal section (far right). (Bottom panel) Images associated with the Control/Image
– Control subtraction (keyed to Table 4). The horizontal section (z = 7) is included to demonstrate the absence of activation in frontal or temporal areas comparable to those shown
in the other two panels. The middle image is a coronal section (y = 4) to show the region of left midfrontal activity (focus 3 in Table 4). Also visible in this section and in the midsagittal
section (far right) is activity within the SMA (focus 7 in Table 4).
Cerebral Cortex Oct/Nov 1999, V 9 N 7 701
The final comparison, Control/Image – Control (Table 4; Fig.
2, bottom), was designed to demonstrate the activity associated
with imagery in the absence of any semantic components. This
subtraction did not show activation in the inferior frontal areas
that were detected in the other subtractions, nor in the temporal
cortex, even with a less stringent statistical threshold value. It
did, however, demonstrate a clear CBF increase in the SMA
(similar in location to that shown in Table 2), along with predominantly
left-sided frontal cortical sites. The left dorsolateral
frontal region, area 44 (focus 3), was similar in location to that
seen in Table 2.
Discussion
The principal findings of this study confirm predictions that
activity in the right auditory association cortex, together with
the SMA, accompanies musical imagery (Cue/Image – Control;
see Fig. 2). Breaking the task down into its components, we
found that when imagery entails retrieval from musical semantic
memory (Cue/Image – Control/Image), activity ensued in a right
inferior frontal region and bilaterally in middle frontal areas
(more significant on the right side), together with right auditory
association areas in STG. When imagery does not require
semantic retrieval (Control/Image – Control), left frontal areas
and SMA are recruited.
Auditory Areas Active in Imagery
The present study extends previous findings implicating auditory
cortical regions in musical imagery (Zatorre and Halpern,
1993; Zatorre et al., 1996). Although we used a different behavioral
paradigm and different stimuli from the previous studies, several
commonalities emerge. The most salient point is that auditory
association areas are involved in processing imagined familiar
melodies. Because the task design involved similar auditory
input in each scan condition, the activation in auditory cortex
during imagery must be due to processing beyond that elicited
by the auditory stimulation. This pattern of activation supports
the hypothesis that cortical perceptual areas can mediate
internally generated information. This conclusion is consistent
with findings from the visual domain (Kosslyn et al., 1993; Farah,
Also consistent with prior PET data (Zatorre et al., 1996), only
associative cortical regions, not primary, were active in the
imagery task. Comparing the locus of activity in the superior
temporal region with anatomical probability maps derived from
MR scans in stereotaxic space indicates that the activation is well
posterior to Heschl’s gyrus (Penhune et al., 1996), and is located
within or just posteroventral to the planum temporale (Westbury
et al., 1999). These areas are known from physiological and
anatomical studies to constitute unimodal auditory association
cortex (Celesia, 1976; Galaburda and Sanides, 1980). To date,
therefore, auditory imagery paradigms for music, as well as for
simpler tonal stimuli (Rao et al., 1997; Penhune et al., 1998)
have not revealed activation in primary auditory cortex, in
contrast to at least some visual imagery tasks which have found
activation in primary visual areas (Kosslyn et al., 1993). Whether
this reflects differences related to how information is processed
in different modalities, or to task design, or other factors remains
to be determined.
Laterality of Effects
Another important point concerns the laterality of the STG
activity. Our hypothesis was that nonverbal melodies would
activate primarily right temporal cortex, in contrast to the
bilateral activity observed previously with verbal melodies. The
present finding of right auditory cortical activation therefore
supports the broader hypothesis that mechanisms within the
right hemisphere are specialized for processing tonal patterns, as
predicted based on previous lesion (Milner, 1962; Zatorre and
Samson, 1991; Liégeois-Chauvel et al., 1998) and imaging
(Démonet et al., 1994; Zatorre et al., 1994; Binder et al., 1997)
studies. Specifically, the present PET data are in good accord
with the results of our behavioral lesion study (Zatorre and
Halpern, 1993), in which we observed that right temporal-lobe
resection resulted in decrements on perceptual and imaginal
music tasks, whereas similar damage to the left temporal region
had no effect.
An important conclusion to be drawn from the present results
is that the right-hemisphere specialization extends beyond perceptual
analysis to encompass complex tonal imagery processes.
Supporting evidence for this conclusion comes from two studies
in which subjects were asked to listen to a simple tonal sequence
and then either continue tapping in the same rhythm (Rao et al.,
1997) or tap in imitation of the sequence (Penhune et al., 1998).
Both studies reported activation within right but not left
posterior STG, which the authors interpreted as reflecting an
auditory imagery process that accompanied the tapping. In
support of this, Janata found that the scalp topography of the
electrical activity elicited by imaging the continuation of a
melody is similar to the N100 component elicited by a real note
(Janata, 1999).
Table 3
Stereotaxic coordinates and significance levels of activation foci in the Cue/Image – Control/Image subtraction

| Region | x | y | z | t |
|---|---|---|---|---|
| Right frontal cortex | | | | |
| 1. Inferior frontal gyrus (10/47) | 29 | 55 | –8 | 4.28 |
| 2. Inferior frontal gyrus (45) | 40 | 17 | 6 | 7.41 |
| 3. Middle frontal gyrus (46) | 36 | 46 | 23 | 4.29 |
| Left frontal cortex | | | | |
| 4. Inferior frontal gyrus (45) | –42 | 15 | 2 | 4.65 |
| 5. Middle frontal gyrus (46/9) | –32 | 46 | 26 | 3.91 |
| Right temporal cortex | | | | |
| 6. Superior temporal gyrus (22) | 55 | –40 | 11 | 4.53 |
| | 61 | –42 | 3 | 4.40 |
| 7. Inferior temporal gyrus (37) | 60 | –41 | –11 | 4.61 |
| Other regions | | | | |
| 8. Anterior cingulate (32) | 4 | 24 | 41 | 4.48 |
| 9. Right parietal cortex (40) | 48 | –49 | 50 | 6.31 |

Table 4
Stereotaxic coordinates and significance levels of activation foci in the Control/Image – Control subtraction

| Region | x | y | z | t |
|---|---|---|---|---|
| Right frontal cortex | | | | |
| 1. Precentral gyrus (6/4) | 26 | –30 | 53 | 3.89 |
| Left frontal cortex | | | | |
| 2. Inferior frontal gyrus (47) | –48 | 30 | –17 | 3.81 |
| 3. Middle frontal gyrus (44) | –51 | 6 | 20 | 3.81 |
| 4. Frontal pole (10) | –17 | 53 | 26 | 3.54 |
| 5. Superior frontal gyrus (8) | –21 | 12 | 50 | 5.04 |
| 6. Precentral gyrus (6) | –46 | –7 | 39 | 4.63 |
| Other regions | | | | |
| 7. Supplementary motor area (6) | –1 | 1 | 65 | 6.95 |
| 8. Right lingual gyrus (19) | 20 | –54 | 6 | 3.73 |
| 9. Right superior occipital gyrus (19) | 17 | –83 | 33 | 3.73 |
Image Generation versus Retrieval
The experimental design of the present study allowed us to
dissociate some of the processing components associated with
retrieving and imagining a familiar tune. The Cue/Image –
Control/Image subtraction isolated processes related to retrieval
of the tune from semantic memory, whereas the Control/Image –
Control subtraction focused on image generation without a
retrieval component. These two subtractions revealed complementary
areas of activation (see Table 2). The former
comparison showed the right STG activation referred to above,
together with predominantly right inferior frontal cortical
activation. The latter subtraction did not show activation in
these areas, but instead showed activity in SMA and in several
left frontal regions. Taken together, these two subtractions yield
an activity pattern similar to that seen in the Cue/Image –
Control subtraction, suggesting that our design successfully
captured the decomposition of the complex imagery task into
separate retrieval and generation components.
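The additive logic behind this decomposition can be written out explicitly. The following is only an illustrative sketch: the symbols A_CI, A_CtI and A_C (mean CBF at a given location in the Cue/Image, Control/Image and Control conditions) are notation introduced here, not taken from the study's analysis.

```latex
\[
\underbrace{A_{\mathrm{CI}} - A_{\mathrm{C}}}_{\text{Cue/Image -- Control}}
=
\underbrace{\left(A_{\mathrm{CI}} - A_{\mathrm{CtI}}\right)}_{\text{Cue/Image -- Control/Image (retrieval)}}
+
\underbrace{\left(A_{\mathrm{CtI}} - A_{\mathrm{C}}\right)}_{\text{Control/Image -- Control (generation)}}
\]
```

The identity holds exactly for raw CBF differences. It need not hold for thresholded t-maps, which is presumably why the combined pattern is described as similar to, rather than identical with, the Cue/Image – Control result.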
In this context, the frontal-lobe activity in the Cue/Image –
Control/Image subtraction may be interpreted as reflecting
retrieval from musical semantic memory. Of the frontal regions
activated (see Table 3), the most inferior one (area 10/47) was
exclusively seen on the right, and the other two (areas 45 and
46) were active bilaterally but with a higher t-value on the right.
The right inferior frontal focus observed in this subtraction is
comparable to one found in our previous study (Zatorre et al.,
1996), in the comparison of imagery to perceptual conditions
(the coordinates of the activity observed in that study, 34, 53,
–11, are within 2 mm of focus 1 in Table 2). That subtraction
was meant to isolate image retrieval and generation from perceptual
processes. Although the paradigms in these studies were
different, both had in common the necessity to retrieve a stored
representation of a tune based on a cue. It is of interest to note
that the STG areas identified in the present study are probably
homologous to regions which in the macaque have been shown
to be topographically interconnected with inferior frontal
cortical areas (Petrides and Pandya, 1988; Romanski et al.,
1999). The CBF changes observed in the inferior frontal regions
may therefore be interpreted as reflecting activity within this
functional network.
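The "within 2 mm" comparison above can be checked with a short calculation. This is a minimal sketch, assuming that the separation between foci is measured as the straight-line (Euclidean) distance between their stereotaxic coordinates in millimetres; the function name `focus_distance_mm` is hypothetical and the coordinates are those quoted in the text and in Table 2.

```python
import math


def focus_distance_mm(a, b):
    """Euclidean distance between two foci given as (x, y, z) tuples in mm.

    Assumption: separation between foci is taken as straight-line distance
    in stereotaxic space. The function name is illustrative only.
    """
    return math.dist(a, b)


# Coordinates quoted in the text for the earlier study (Zatorre et al., 1996)
previous_study_focus = (34, 53, -11)
# Focus 1 of Table 2 (right inferior frontal gyrus, area 10/47)
table2_focus1 = (34, 53, -9)

# The two foci differ only in z, by 2 mm
print(focus_distance_mm(previous_study_focus, table2_focus1))  # 2.0
```

The same function could be applied to any pair of rows in Tables 2–4 to see how closely the "corresponding foci" agree across subtractions.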
Another region implicated in musical semantic retrieval in our
earlier study was the right thalamus. In the current study a
similar region was activated in the Cue/Image – Control/Image
subtraction, albeit just below our relatively stringent level of
significance for an exploratory search (coordinates: 8, –11, 8; t =
3.24). These convergent findings therefore further implicate a
right inferior frontal/thalamic network in melodic semantic
The frontal cortical areas just described are approximately
homologous to the areas in the left hemisphere that have been
proposed as mediating retrieval from verbal semantic memory in
the HERA model (Nyberg et al., 1996). We propose that the
neural substrate of semantic memory retrieval may depend on
the type of material to be retrieved. Retrieval of familiar musical
information may involve the right hemisphere predominantly,
as has already been established for the perception and discrimination
of many musical materials. These findings indicate
the importance of using music to extend the generality of
processing models.
An alternative interpretation of our findings concerning the
retrieval component is that the imagery task may have entailed
some degree of episodic retrieval. This could have occurred
since subjects were asked to image the melodic theme to a
specific endpoint, as demonstrated in screening and pre-scanning
sessions. Thus, it is possible that some of the activity in
the right frontal region may reflect subjects’ retrieval of an
episodic memory trace associated with recalling at what point in
the melody they were supposed to stop. Nonetheless, the major
aspect of retrieval elicited by the task should be the semantic
component, since the cue sequence only presented the first few
notes, and the rest of the tune is stored in long-term memory.
Another brain area active in both of our auditory imagery
studies was the SMA. In the current study, the SMA was active in
the Cue/Image – Control subtraction (Fig. 2, top), and in the
subtraction related to generation (Control/Image – Control; Fig.
2, bottom), but not in the subtraction related to retrieval. We
infer, therefore, that the SMA may be important in the generation
of the auditory image. SMA is thought to be important in
organization of motor codes, implying a close relationship
between auditory and motor memory systems. In our previous
paper (Zatorre et al., 1996) we raised the possibility that the
activation of SMA may imply a ‘singing to oneself’ strategy
during auditory imagery tasks. Because that study used songs
with words, we could not tell if the motor component was
related to verbalization or vocalization planning, or both. The
tunes in the current study had no lyrics, thus the SMA activation
cannot solely reflect preparation of words associated with
retrieved tunes. SMA activation seems to reflect motor planning
associated with a subvocal singing or humming strategy during
the generation process.
We note that SMA activation was found by Rao et al. in several
of their conditions (Rao et al., 1997). They distinguished
between activation of the pre-SMA (positive y coordinates) and
SMA proper (negative y coordinates). The former was active
during a pitch discrimination task and the latter was active during
the simpler continuation task. In our task, SMA coordinates
corresponded to pre-SMA, which Rao et al. claim is associated
with more complex processing. This conclusion is consistent
with the fact that our image-generation task was more complex
than their task of imagining an isochronous single tone.
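The pre-SMA/SMA proper distinction described above can be illustrated with the SMA foci reported in Tables 2 and 4. This is a minimal sketch of the rule as summarized in the text (positive y for pre-SMA, negative y for SMA proper); the function name `classify_sma_focus` and the handling of y = 0 are assumptions made here for illustration, not part of Rao et al.'s method.

```python
def classify_sma_focus(y_mm):
    """Label a medial area-6 focus by the sign of its y coordinate (mm).

    Rule as summarized in the text: positive y -> pre-SMA,
    negative y -> SMA proper. Treating y == 0 as a separate
    "boundary" case is an assumption of this sketch.
    """
    if y_mm > 0:
        return "pre-SMA"
    if y_mm < 0:
        return "SMA proper"
    return "boundary"


# SMA foci reported in the present study (x, y, z in mm)
sma_foci = {
    "Table 2, focus 11 (Cue/Image - Control)": (-1, 8, 62),
    "Table 4, focus 7 (Control/Image - Control)": (-1, 1, 65),
}

for label, (x, y, z) in sma_foci.items():
    print(label, "->", classify_sma_focus(y))
```

Both foci have positive y coordinates, so under this rule both fall in pre-SMA, although the Table 4 focus (y = 1) lies close to the dividing line.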
Remaining Issues
One puzzling aspect of our data pertains to the Control/Image
– Control subtraction. Contrary to our expectations, we did not
find activation of the right STG in this condition, which we
had assumed would be associated with the phenomenological
aspect of seeming to hear an auditory image. Even re-evoking an
unfamiliar short sequence of tones just heard ought to require
imagery, although perhaps with less vivid imagery than the task
of completing a familiar tune given its first few notes. Thus, it is
possible that the task may not have elicited a sufficiently lengthy
or strong imagery process to be detected by our methods.
A second puzzling aspect of this comparison was the activation
in several left frontal sites. The task would appear to require
auditory working memory, as the patterns to be imagined were
all novel. Previous studies of verbal or figural working memory
have implicated dorsolateral regions of the frontal lobe bilaterally
in working memory (Petrides et al., 1993; Braver et al.,
1997); studies in which tonal working memory was specifically
examined have also reported activity within left frontal cortex,
but have generally found much more extensive activity in right
frontal sites (Binder et al., 1997; Zatorre et al., 1994). We speculate
that some of the areas activated in the left frontal lobe in
the present study are related to working memory, and that the
SMA is specifically involved in a motor process relevant for
auditory image generation, irrespective of the familiarity of the
imagined stimulus.
Notes
We gratefully acknowledge the assistance of Dr A.C. Evans and the staff of
the McConnell Brain Imaging Center, and of the MNI Cyclotron Unit. We
thank Stefan Köhler for helpful comments. This research was supported
by grants from the Medical Research Council of Canada (MT11541) and
by the McDonnell–Pew Cognitive Neuroscience Program.
Address correspondence to A.R. Halpern, Psychology Department,
Bucknell University, Lewisburg, PA 17837, USA. Email: ahalpern@
References
Binder J, Frost J, Hammeke T, Cox R, Rao S, Prieto T (1997) Human brain
language areas identified by functional magnetic resonance imaging. J
Neurosci 17:353–362.
Buckner R, Raichle M, Petersen S (1995) Dissociation of human prefrontal
cortical areas across different speech production tasks and gender
groups. J Neurophysiol 74:2163–2173.
Braver TS, Cohen JD, Nystrom LE, Jonides J, Smith EE, Noll DC (1997) A
parametric study of prefrontal cortex involvement in human working
memory. NeuroImage 5:49–62.
Celesia G (1976) Organization of auditory cortical areas in man. Brain
Collins D, Neelin P, Peters T, Evans AC (1994) Automatic 3D intersubject
registration of MR volumetric data in standardized Talairach space. J
Comput Assist Tomogr 18:192–205.
Démonet JF, Price C, Wise R, Frackowiak RSJ (1994) A PET study of
cognitive strategies in normal subjects during language tasks. Brain
Divenyi P, Robinson A (1989) Nonlinguistic auditory capabilities
in aphasia. Brain Lang 37:290–326.
Evans A, Marrett S, Neelin P, Collins L, Worsley K, Dai W, Milot S, Meyer E,
Bub D (1992) Anatomical mapping of functional activation in
stereotactic coordinate space. NeuroImage 1:43–53.
Farah MJ (1988) Is visual imagery really visual? Overlooked evidence from
neuropsychology. Psychol Rev 95:307–317.
Farah MJ (1989) Mechanisms of imagery–perception interaction. J Exp
Psychol: Hum Percept Perform 15:203–211.
Farah MJ (1995) The neural bases of mental imagery. In: The cognitive
neurosciences (Gazzaniga MS, ed.), pp. 963–975. Cambridge, MA: MIT
Farah MJ, Smith AF (1983) Perceptual interference and facilitation with
auditory imagery. Percept Psychophys 33:475–478.
Farah MJ, Weisberg LL, Monheit M, Peronnet F (1989) Brain activity
underlying mental imagery: event-related potentials during mental
image generation. J Cogn Neurosci 1:302–316.
Gabrieli JDE, Desmond JE, Demb JB, Wagner AD, Stone MV, Vaidya CJ,
Glover GH (1996) Functional magnetic resonance imaging of
semantic memory processes in the frontal lobes. Psychol Sci
Galaburda AM, Sanides F (1980) Cytoarchitectonic organization of
the human auditory cortex. J Comp Neurol 190:597–610.
Goldenberg G, Podreka I, Steiner M, Willmes K, Suess E, Deecke L
(1989) Regional cerebral blood flow patterns in visual imagery.
Neuropsychologia 27:641–664.
Hubbard TL, Stoeckig K (1988) Musical imagery: generation of tones and
chords. J Exp Psychol: Learn Mem Cog 14:656–667.
Halpern AR (1988) Mental scanning in auditory imagery for tunes. J Exp
Psychol: Learn Mem Cog 14:434–443.
Janata P (1999) Brain electrical activity evoked by imagined musical
events (in press).
Klein D, Milner B, Zatorre RJ, Evans AC, Meyer E (1995) The neural
substrates underlying word generation: a bilingual functional imaging
study. Proc Natl Acad Sci USA 92:2899–2903.
Kosslyn SM, Alpert NM, Thompson WL, Maljkovic V, Weise SB, Chabris
CF, Hamilton SE, Rauch SL, Buonanno FS (1993) Visual mental
imagery activates topographically organized visual cortex: PET
investigations. J Cog Neurosci 5:263–287.
Liégeois-Chauvel C, Peretz I, Babaï M, Laguitton V, Chauvel P (1998)
Contribution of different cortical areas in the temporal lobes to music
processing. Brain 121:1853–1867.
Mellet E, Petit L, Mazoyer B, Denis M, Tzourio N (1998) Reopening the
mental imagery debate: lessons from functional anatomy.
NeuroImage 8:129–139.
Milner BA (1962) Laterality effects in audition. In: Interhemispheric
relations and cerebral dominance (Mountcastle V, ed.), pp. 177–195.
Baltimore, MD: Johns Hopkins Press.
Nyberg L, Cabeza R, Tulving E (1996) PET studies of encoding and
retrieval. Psychon Bull Rev 3:135–148.
Penhune VB, Zatorre RJ, MacDonald JD, Evans AC (1996) Interhemispheric
anatomical differences in human primary auditory cortex:
probabilistic mapping and volume measurement from magnetic
resonance scans. Cereb Cortex 6:661–672.
Penhune VB, Zatorre RJ, Evans AC (1998) Cerebellar contributions to
motor timing: a PET study of auditory and visual rhythm
reproduction. J Cog Neurosci 10:752–765.
Petersen S, Fox P, Posner M, Mintun M, Raichle M (1988) Positron
emission tomographic studies of the cortical anatomy of single-word
processing. Nature 331:585–589.
Petrides M, Pandya DN (1988) Association fiber pathways to the frontal
cortex from the superior temporal region in the Rhesus monkey. J
Comp Neurol 273:52–66.
Petrides M, Alivisatos B, Meyer E, Evans AC (1993) Functional activation
of the human frontal cortex during the performance of verbal
working memory tasks. Proc Natl Acad Sci USA 90:878–882.
Raichle M, Martin W, Herscovitch P, Mintun M, Markham J (1983) Brain
blood flow measured with intravenous O15 H2O. 1. Theory and error
analysis. J Nucl Med 24:790–798.
Rao SM, Harrington DL, Haaland KY, Bobholz JA, Cox RW, Binder JR
(1997) Distributed neural systems underlying the timing of
movements. J Neurosci 17:5528–5535.
Robin DA, Tranel D, Damasio H (1990) Auditory perception of temporal
and spectral events in patients with focal left and right cerebral
lesions. Brain Lang 39:539–555.
Romanski LM, Bates JF, Goldman-Rakic PS (1999) Auditory belt and
parabelt projections to the prefrontal cortex in the Rhesus monkey. J
Comp Neurol 403:141–157.
Samson S, Zatorre RJ (1994) Contribution of the right temporal lobe to
musical timbre discrimination. Neuropsychologia 32:231–240.
Smith EE, Jonides J (1997) Working memory: a view from neuroimaging.
Cog Psychol 33:5–42.
Smith EE, Jonides J, Koeppe RA (1996) Dissociating verbal and spatial
working memory using PET. Cereb Cortex 6:11–20.
Talairach J, Tournoux P (1988) Co-planar stereotaxic atlas of the human
brain. New York: Thieme.
Westbury C, Zatorre RJ, Evans A (1999) Quantifying variability in the
planum temporale: a probability map. Cereb Cortex (in press).
Worsley K, Evans A, Marrett S, Neelin P (1992) A three-dimensional
statistical analysis for CBF activation studies in human brain. J Cereb
Blood Flow Metab 12:900–918.
Zatorre RJ (1985) Discrimination and recognition of tonal melodies after
unilateral cerebral excisions. Neuropsychologia 23:31–41.
Zatorre RJ (1988) Pitch perception of complex tones and human
temporal-lobe function. J Acoust Soc Am 84:566–572.
Zatorre RJ, Halpern AR (1993) Effect of unilateral temporal-lobe excision
on perception and imagery of songs. Neuropsychologia 31:221–232.
Zatorre RJ, Samson S (1991) Role of the right temporal neocortex in
retention of pitch in auditory short-term memory. Brain
Zatorre RJ, Evans AC, Meyer E (1994) Neural mechanisms underlying
melodic perception and memory for pitch. J Neurosci 14:1908–1919.
Zatorre RJ, Halpern AR, Perry DW, Meyer E, Evans AC (1996) Hearing in
the mind’s ear: a PET investigation of musical imagery and perception.
J Cog Neurosci 8:29–46.
704 PET Study of Auditory Imagery Halpern and Zatorre
... There has been a significant overlap between neural substrates for the generation of visuo-spatial tasks and music perception. For example, occipital and frontoparietal cortical areas, traditionally involved in visuo-spatial tasks, including the precuneus in medial parietal lobes called the "mind's eye" for being crucially involved in generation of visuo-spatial imagery (Mellet et al., 2002;Mellet et al., 1996), are among the most frequently and significantly activated regions in perception of music either in naïve listeners or in musicians (Mazziotta et al., 1982;Nakamura et al., 1999;Platel et al., 1997;Satoh, Takeda, Nagata, Hatazawa & Kuzuhara, 2001;Satoh, Takeda, Nagata, Hatazawa & Kuzuhara, 2003;Zatorre, Evans & Meyer, 1994;Zatorre, Perry, Beckett, Westbury & Evans, 1998), and even in subjects mentally imagining a melody (Halpern & Zatorre, 1999). These imaging data are consistent with three well documented cases of amusia associated to deficits in spatialtemporal tasks in both visual and auditory modalities (Griffiths et al., 1997;Steinke et al., 2001;Wilson & Pressing, 1999), as well as associated deficits in pith perception and visuospatial tasks in left prefrontal damage patients (Harrington et al., 1998). ...
... Other imaging studies have frequently reported superior parietal lobes (Zatorre et al., 1998) and precuneus activation also in perceptual tasks (Satoh et al., 2001). Halpern & Zatorre (1999) found that musicians when asked to mentally reproduce a just heard unfamiliar melody yield significant activations in several left inferior prefrontal (BA 44, 47, 10) and middle/superior frontal gyrus (BA 8, 6), as well as in premotor (BA6), visual association cortex (BA19), and parietal cortex and precuneus (BA40/7), but no activations in temporal lobes were found. Musicians with both absolute pitch (AP) and relative pitch (RP) both showed bilateral activations of parietal cortices in judging a musical interval, but with stronger left parietal activations in RP musicians whose cognitive strategies rely more on the maintaneance of pitch information on auditory working memory for pitch comparison on a mental stave (Zatorre et al., 1998), whereas AP musicians rely more on longterm memory (Zatorre et al., 1998). ...
Full-text available
Here we review the most important psychological aspects of music, its neural substrates, its universality and likely biological origins and, finally, how the study of neurocognition and emotions of music can provide one of the most important windows to the comprehension of the higher brain functioning and human mind. We begin with the main aspects of the theory of modularity, its behavioral and neuropsychological evidences. Then, we discuss basic psychology and neuropsychology of music and show how music and language are cognitively and neurofunctionally related. Subsequently we briefly present the evidences against the view of a high degree of specification and encapsulation of the putative language module, and how the ethnomusicological, pscychological and neurocognitive studies on music help to shed light on the issues of modularity and evolution, and appear to give further support for a cross-modal, interactive view of neurocognitive processes. Finally, we will argue that the notion of large modules do not adequately describe the organization of complex brain functions such as language, math or music, and propose a less radical view of modularity, in which the modular systems are specified not at the level of culturally determined cognitive domains but more at the level of perceptual and sensorimotor representations. © Cien. Resumo Aqui revisamos os aspectos psicológicos mais importantes da música, seus substratos neurais, sua universalidade e prováveis origens biológicas e, finalmente, como o estudo da neurocognição e das emoções associadas à música pode fornecer uma das mais importantes janelas para a compreensão das funções cerebrais superiores e da mente humana. Iniciamos com os principais aspectos da teoria da modularidade, suas evidências comportamentais e neuropsicológicas. Então discutimos a psicologia e a neuropsicologia básicas da música e mostramos como a música e a linguagem estão cognitiva e neurofuncionalmente
... In this regard, Schaefer [18] has stated that the commonalities between music imagery and movement may be more related to internal timing mechanisms. Although this study did not examine the temporal adaptation of muscle activity in the conditions of imagery and real performance, various researchers in the field of music have suggested that auditory imagery maintains the temporal characteristics of the auditory stimulus [31,60]. Keller, Bella and Koch [61] and Debarnot and Guillot [35] also showed that auditory imagery can accommodate movement timing and it is assumed that auditory imagery can at least create a similarity between the temporal structure of the visualized action and the actual movement. ...
... Such a development may be based partly on auditory and audiovisual mirror neurons that form a "listen-act" system as part of the perceptual system [64]. On the other hand, various researchers have confirmed that in auditory imagery, the structural characteristics of auditory stimuli and, in particular, the sound frequency changes (used in this research), are preserved [31,60,65,66]. Therefore, it seems that auditory imagery can not only independently affect the timing of movement execution, but also develop one's overall understanding by affecting visual perception. ...
Full-text available
The purpose of this research was to study the effect of AudioVisual pattern on the muscle activity amplitude during mental imagery. For this purpose, 25 female students (20.73 ± 1.56 years old) engaged in mental imagery (internal, external, and kinesthetic) in three conditions: No pattern, Visual pattern, and AudioVisual pattern. The angular velocity of the elbow joint in the basketball jump shot skill was sonified and presented to the subjects as an auditory pattern. The results showed that the muscle activity amplitude in AudioVisual-kinesthetic and AudioVisual-internal (and not external) conditions is higher than for other conditions. Additionally, a positive correlation was observed between Visual-kinesthetic imagery ability and muscle activity amplitude in the AudioVisual pattern condition and in kinesthetic and internal imagery. In addition, the muscle activity amplitude of high and low Visual-kinesthetic imagery ability conditions were only different in the AudioVisual pattern. The superiority of the AudioVisual condition is most likely due to the auditory information presented in this research being closely related to the kinesthetic sense of movement .
... Together, these results align with the Multi-Modal Imagery Association (MMIA) model, which suggests that auditory imagery depends on a sensorimotor mechanism involving both auditory and motor processes . Evidence for sensorimotor processing in auditory imagery as posited by the MMIA model is further supported by neuroimaging research that has found that the act of imagining sound recruits both perceptual and motor planning areas of the brain (Halpern et al., 1999;Herholz et al., 2012;Lima et al., 2016;Zatorre & Halpern, 1993). ...
... Given this pattern of results, we suggest that the role of subvocal muscle movements of the sternohyoid muscles may reflect motor processing pertaining to a sensorimotor image of a pitch sequence, which supports the role of sensorimotor processing during auditory imagery described in the MMIA model . Furthermore, the current study highlights the role of motor processes in the peripheral nervous system during auditory imagery and complements neuroimaging research that has reported that imagining pitch recruits cortical motor planning processes at the level of the central nervous system (Halpern et al., 1999;Herholz et al., 2012;Lima et al., 2016;Zatorre & Halpern, 1993). The upper lip muscles may be more strongly recruited for vocal articulation than pitch control, a claim that aligns with previous work on auditory imagery for verbal information (Aleman & Van't Wout, 2004;Smith et al., 1995). ...
Given previous results showing that auditory imagery is associated with subvocal muscle movements related to pitch control, the present study addressed whether subvocalization of pitch is differentially involved during imagery that precedes the execution of an imagined action as compared to non-preparatory imagery. We examined subvocal activity using surface electromyography (sEMG) during auditory imagery that preceded sung reproduction of a pitch sequence (preparatory) or recognition of a pitch sequence (non-preparatory). On different trials, participants either imagined the sequence as presented, or imagined a mental transformation of that sequence. Behavioral results replicated previous findings of poorer reproduction and recognition of transformed sequences compared to sequences in their original form. Physiological results indicated that subvocal activity was significantly above baseline for all conditions, greater than activity observed for the bicep control site, and greater for longer sequences, but did not reliably scale with transformation type. Furthermore, greater subvocal activity during preparatory imagery was associated with greater subvocal activity during non-preparatory imagery for muscles involved in pitch control and articulation. Muscle activity involved in pitch control was similarly recruited for both preparatory and non-preparatory auditory imagery. In contrast, muscle activity involved in vocal articulation was most strongly recruited during motor preparation. Our findings suggest that pitch imagery recruits subvocal muscle activity regardless of whether the imagined action is intended to be effected.
... The effects of low-uncertainty and high-uncertainty SL were pronounced in the right and left hemispheres, respectively. We also found that the amplitudes were larger in the right hemisphere than in the left hemisphere regardless of TP, which may be consistent with a large amount of evidence indicating that the right auditory cortex is specialised for spectral processing, such as pitches, whereas the left auditory cortex is specialised for temporal processing (Binder et al., 1997; Halpern and Zatorre, 1999; Zatorre et al., 1994). A previous study (Furl et al., 2011) suggested that the neural basis of the SL of auditory sequences with TP ratios of 10:90, which is the same as the low-uncertainty sequences in the present study, is associated with source activity in the right Brodmann area 39, posterior to the planum temporale, near the temporoparietal junction, and the superior temporal sulcus including parts of the middle and superior temporal gyri and the inferior angular gyrus. ...
Statistical learning (SL) is an innate mechanism by which the brain automatically encodes the n-th order transition probability (TP) of a sequence and grasps the uncertainty of the TP distribution. Through SL, the brain predicts a subsequent event (eₙ₊₁) based on the preceding events (eₙ) that have a length of “n”. It is now known that uncertainty modulates prediction in top-down processing by the human predictive brain. However, the manner in which the human brain modulates the order of SL strategies based on the degree of uncertainty remains an open question. The present study examined how uncertainty modulates the neural effects of SL and whether differences in uncertainty alter the order of SL strategies. It used auditory sequences in which the uncertainty of sequential information is manipulated based on the conditional entropy. Three sequences with different TP ratios of 90:10, 80:20, and 67:33 were prepared as low-, intermediate-, and high-uncertainty sequences, respectively (conditional entropy: 0.47, 0.72, and 0.92 bits, respectively). Neural responses were recorded when the participants listened to the three sequences. The results showed that stimuli with lower TPs elicited a stronger neural response than those with higher TPs, as demonstrated by a number of previous studies. Furthermore, we found that participants adopted higher-order SL strategies in the high-uncertainty sequence. These results may indicate that the human brain has an ability to flexibly alter the order based on the uncertainty. This uncertainty may be an important factor that determines the order of SL strategies. Particularly, considering that a higher-order SL strategy mathematically allows the reduction of uncertainty in information, we assumed that the brain may take higher-order SL strategies when encountering highly uncertain information in order to reduce the uncertainty. The present study may shed new light on understanding individual differences in SL performance across different uncertain situations.
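The conditional entropy values quoted for the three sequences can be checked with a short sketch. It assumes that every context has exactly two possible continuations sharing the same TP split, so that the conditional entropy reduces to the binary Shannon entropy, and that the 67:33 ratio stands for 2/3 : 1/3. Neither assumption is stated in the abstract, and the function name `binary_entropy` is illustrative only.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# TP splits for the three sequences; 2/3 is an assumed reading of "67:33".
sequences = [("low", 0.90), ("intermediate", 0.80), ("high", 2.0 / 3.0)]

for label, p in sequences:
    print(f"{label}-uncertainty: {binary_entropy(p):.2f} bits")
```

Under these assumptions the three values round to 0.47, 0.72, and 0.92 bits, matching the abstract. Using 0.67 literally instead of 2/3 gives roughly 0.915 bits.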
... Once it is translated into neuronal activity, widely distributed brain areas participate in the neuronal encoding of music [63]. Acoustic aspects and musical structure such as rhythm, tone, melody, and harmony are processed in the frontal, temporal, and parietal regions [8, 11, 64, 65]. The amygdala, ventral striatum, hippocampus, hypothalamus, and interaction with arousal control systems, based on norepinephrine and serotonin concentrations, have effects on emotional responses and the autonomic nervous system, inducing behavioral and organic responses [7, 62, 66, 67]. ...
Music is a complex stimulus, with various spectro-temporal acoustic elements determining one of the most important attributes of music, the ability to elicit emotions. Effects of various musical acoustic elements on emotions in non-human animals have not been studied with an integrated approach. However, this knowledge is important to design music to provide environmental enrichment for non-human species. Thirty-nine instrumental musical pieces were composed and used to determine effects of various acoustic parameters on emotional responses in farm pigs. Video recordings (n = 50) of pigs in the nursery phase (7–9 week old) were gathered and emotional responses induced by stimuli were evaluated with Qualitative Behavioral Assessment (QBA). Non-parametric statistical models (Generalized Additive Models, Decision Trees, Random Forests, and XGBoost) were applied and compared to evaluate relationships between acoustic parameters and pigs’ observed emotional responses. We concluded that musical structure affected emotional responses of pigs. The valence of modulated emotions depended on integrated and simultaneous interactions of various spectral and temporal structural components of music that can be readily modified. This new knowledge supports design of musical stimuli to be used as environmental enrichment for non-human animals.
... Primary auditory cortex was activated during timbre perception but was not activated during timbre imagery. Other PET studies investigating imagery of music melody found an involvement of the right superior temporal gyrus, of the posterior parietal cortex (Zatorre et al., 2010) as well as the right frontal lobe and the supplementary motor area (Halpern & Zatorre, 1999). Other studies have investigated the ability to imagine music key (minor, vs major: Meyer et al., 2007), melodies (Halpern, 1988a), and tempo (Halpern, 1988b). ...
This study aimed to investigate the psychophysiological markers of imagery processes through EEG/ERP recordings. Visual and auditory stimuli representing 10 different semantic categories were shown to 30 healthy participants. After a given interval and prompted by a light signal, participants were asked to activate a mental image corresponding to the semantic category for recording synchronized electrical potentials. Unprecedented electrophysiological markers of imagination were recorded in the absence of sensory stimulation. The following peaks were identified at specific scalp sites and latencies, during imagination of infants (centroparietal positivity, CPP, and late CPP), human faces (anterior negativity, AN), animals (anterior positivity, AP), music (P300-like), speech (N400-like), affective vocalizations (P2-like) and sensory (visual vs auditory) modality (PN300). Overall, perception and imagery conditions shared some common electro/cortical markers, but during imagery the category-dependent modulation of ERPs was long latency and more anterior, with respect to the perceptual condition. These ERP markers might be precious tools for BCI systems (pattern recognition, classification, or A.I. algorithms) applied to patients affected by consciousness disorders (e.g., in a vegetative or comatose state) or locked-in-patients (e.g., spinal or SLA patients).
This study examines whether the modality effect can be used to improve visual time perception. In Experiment 1, we used a time-reproduction task to explore the accuracy (i.e., deviation of reproduced time from veridical time) and precision (i.e., variability of reproduced time) of time perception under auditory, visual, or audiovisual conditions. Results confirmed the existence of a modality effect. Experiments 2a and 2b and Experiment 3 examined whether adding auditory stimuli improves visual time perception. In Experiments 2a and 2b, participants were required to sound when the visual stimuli appeared. Results showed that the addition of sound to visual stimuli perception is associated with higher time perception accuracy than viewing visual stimuli alone. Given that sounding is not always applicable, we conducted Experiment 3, with participants asked to imagine sounds instead of sounding. Results showed that imaginary sounds improved accuracy. However, in Experiments 2a, 2b, and 3, neither sounding nor imagining sounds changed the precision of time perception. The findings of this study indicate that adding auditory stimuli reliably improves the accuracy of visual-time perception, irrespective of whether the sound is real or imagined.
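The distinction between accuracy and precision drawn in this abstract can be illustrated with a minimal sketch. It assumes accuracy is summarized as the mean signed deviation of reproduced time from veridical time and precision as the sample standard deviation of the reproductions. The study's exact measures are not given here, and the function name and sample values are hypothetical.

```python
import statistics


def accuracy_and_precision(reproduced, veridical):
    """Return (mean signed error, sample SD) for reproduced durations in seconds."""
    # Accuracy: how far reproductions sit from the true duration on average.
    errors = [r - veridical for r in reproduced]
    accuracy = statistics.mean(errors)
    # Precision: how much the reproductions vary among themselves.
    precision = statistics.stdev(reproduced)
    return accuracy, precision


# Hypothetical reproductions of a 2.0 s target interval.
acc, prec = accuracy_and_precision([1.8, 2.1, 1.9, 2.2], veridical=2.0)
print(f"mean error: {abs(acc):.3f} s, SD: {prec:.3f} s")
```

A manipulation can shift the first quantity without changing the second, which is the pattern the abstract reports for real and imagined sounds.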
Subjective tinnitus is a prevalent, though heterogeneous, condition whose pathophysiological mechanisms are still under investigation. Based on animal models, changes in neurotransmission along the auditory pathway have been suggested as co-occurring with tinnitus. It has not, however, been studied whether such effects can also be found in other sites beyond the auditory cortex. Our MR spectroscopy study is the first one to measure composite levels of glutamate and glutamine (Glx; and other central nervous system metabolites) in bilateral medial frontal and non-primary auditory temporal brain areas in tinnitus. We studied two groups of participants with unilateral and bilateral tinnitus and a control group without tinnitus, all three with a similar hearing profile. We found no metabolite level changes related to tinnitus status in either region of interest, except for a tendency toward an increased concentration of Glx in the left frontal lobe in people with bilateral vs unilateral tinnitus. Slightly elevated depressive and anxiety symptoms are also shown in participants with tinnitus, as compared to healthy individuals, with the bilateral tinnitus group marginally more affected by the condition. We discuss the null effect in the temporal lobes, as well as the role of frontal brain areas in chronic tinnitus, with respect to hearing loss, attention mechanisms and psychological well-being. We furthermore elaborate on the design-related and technical obstacles when using MR spectroscopy to elucidate the role of neurometabolites in tinnitus.
Humans excel at constructing mental representations of speech streams in the absence of external auditory input: the internal experience of speech imagery. Elucidating the neural processes underlying speech imagery is critical to understanding this higher-order brain function in humans. Here, using functional magnetic resonance imaging, we investigated the shared and distinct neural correlates of imagined and perceived speech by asking participants to listen to poems articulated by a male voice (perception condition) and to imagine hearing poems spoken by that same voice (imagery condition). We found that compared to baseline, speech imagery and perception activated overlapping brain regions, including the bilateral superior temporal gyri and supplementary motor areas. The left inferior frontal gyrus was more strongly activated by speech imagery than by speech perception, suggesting functional specialization for generating speech imagery. Although more research with a larger sample size and a direct behavioral indicator is needed to clarify the neural systems underlying the construction of complex speech imagery, this study provides valuable insights into the neural mechanisms of the closely associated but functionally distinct processes of speech imagery and perception.
We review positron emission tomography (PET) studies whose results converge on the hemispheric encoding/retrieval asymmetry (HERA) model of the involvement of prefrontal cortical regions in the processes of human memory. The model holds that the left prefrontal cortex is differentially more involved in retrieval of information from semantic memory, and in simultaneously encoding novel aspects of the retrieved information into episodic memory, than is the right prefrontal cortex. The right prefrontal cortex, on the other hand, is differentially more involved in episodic memory retrieval than is the left prefrontal cortex. This general pattern holds for different kinds of information (e.g., verbal materials, pictures, faces) and a variety of conditions of encoding and retrieval.
Neuropsychological studies have suggested that imagery processes may be mediated by neuronal mechanisms similar to those used in perception. To test this hypothesis, and to explore the neural basis for song imagery, 12 normal subjects were scanned using the water bolus method to measure cerebral blood flow (CBF) during the performance of three tasks. In the control condition subjects saw pairs of words on each trial and judged which word was longer. In the perceptual condition subjects also viewed pairs of words, this time drawn from a familiar song; simultaneously they heard the corresponding song, and their task was to judge the change in pitch of the two cued words within the song. In the imagery condition, subjects performed precisely the same judgment as in the perceptual condition, but with no auditory input. Thus, to perform the imagery task correctly an internal auditory representation must be accessed. Paired-image subtraction of the resulting pattern of CBF, together with matched MRI for anatomical localization, revealed that both perceptual and imagery tasks produced similar patterns of CBF changes, as compared to the control condition, in keeping with the hypothesis. More specifically, both perceiving and imagining songs are associated with bilateral neuronal activity in the secondary auditory cortices, suggesting that processes within these regions underlie the phenomenological impression of imagined sounds. Other CBF foci elicited in both tasks include areas in the left and right frontal lobes and in the left parietal lobe, as well as the supplementary motor area. This latter region implicates covert vocalization as one component of musical imagery. Direct comparison of imagery and perceptual tasks revealed CBF increases in the inferior frontal polar cortex and right thalamus. We speculate that this network of regions may be specifically associated with retrieval and/or generation of auditory information from memory.
Four experiments are reported that examine Ss' ability to form and use images of tones and chords. In Exps 1 and 3, Ss heard a cue tone or chord and formed an image of a tone or chord one whole step in pitch above the cue. This image was then compared to a probe tone or chord that was either the same as the image in pitch, different from the image in pitch and harmonically closely related, or different and harmonically distantly related. In Exp 3, a random-tone mask was used to control for possible contributions of the cue in echoic memory. In both experiments tone images were formed faster than chord images, a result consistent with the idea of structural complexity as a determinant of image formation time. Response times and accuracy rates were found to parallel results found in music perception studies, results consistent with the idea of shared mechanisms in the processing of musical images and percepts. Exps 2 and 4 were control experiments examining the possible influence of demand characteristics and Ss' knowledge. Findings rule out the possibility that demand characteristics and Ss' knowledge were solely responsible for the results of Exps 1 and 3 and support the role of imagery.
Cerebral blood flow was measured using positron emission tomography (PET) in three experiments while subjects performed mental imagery or analogous perceptual tasks. In Experiment 1, the subjects either visualized letters in grids and decided whether an X mark would have fallen on each letter if it were actually in the grid, or they saw letters in grids and decided whether an X mark fell on each letter. A region identified as part of area 17 by the Talairach and Tournoux (1988) atlas, in addition to other areas involved in vision, was activated more in the mental imagery task than in the perception task. In Experiment 2, the identical stimuli were presented in imagery and baseline conditions, but subjects were asked to form images only in the imagery condition; the portion of area 17 that was more active in the imagery condition of Experiment 1 was also more activated in imagery than in the baseline condition, as was part of area 18. Subjects also were tested with degraded perceptual stimuli, which caused visual cortex to be activated to the same degree in imagery and perception. In both Experiments 1 and 2, however, imagery selectively activated the extreme anterior part of what was identified as area 17, which is inconsistent with the relatively small size of the imaged stimuli. These results, then, suggest that imagery may have activated another region just anterior to area 17. In Experiment 3, subjects were instructed to close their eyes and evaluate visual mental images of upper case letters that were formed at a small size or large size. The small mental images engendered more activation in the posterior portion of visual cortex, and the large mental images engendered more activation in anterior portions of visual cortex. This finding is strong evidence that imagery activates topographically mapped cortex. The activated regions were also consistent with their being localized in area 17. Finally, additional results were consistent with the existence of two types of imagery, one that rests on allocating attention to form a pattern and one that rests on activating stored visual memories.
This article addresses two issues about the neural bases of mental imagery. The first issue concerns the modality-specificity of mental images, that is, whether or not they involve activity in visual areas of the brain. The second issue concerns hemispheric specialization for the generation of mental images. We compared event-related potentials recorded under two conditions: one in which subjects were shown words and asked to read them and one in which subjects were shown words and asked to read them and generate visual mental images of the words' referents. Imagery caused a slow, late positivity, maximal at the occipital and posterior temporal regions of the scalp, relative to the comparison condition, and consistent with the involvement of modality-specific visual cortex in mental imagery. Also noted was an asymmetry in the imagery-related ERP, consistent with left-hemisphere specialization for mental image generation. Similar results were obtained when subjects listened to auditorily presented words with and without instructions to generate mental images. To assess the specificity of the relation between these ERP effects and mental imagery, we compared the ERP changes brought about by imaging with those brought about by another effortful task using the same stimulus words: proofreading the words for occasional misspellings. This produced changes that differed in polarity, time course, and scalp distribution from the imagery-related changes.
Frontal-lobe activation during semantic memory performance was examined using functional magnetic resonance imaging (fMRI), a noninvasive technique for localizing neural activity associated with cognitive function. Left inferior prefrontal cortex was more activated for semantic than for perceptual encoding of words, and for initial than for repeated semantic encoding of words. Decreased activation for semantic encoding of repeated words reflects repetition priming, that is, implicit retrieval of memory gained in the initial semantic encoding of a word. The left inferior prefrontal region may subserve semantic working memory processes that participate in semantic encoding and that have decreased demands when such encoding can be facilitated by recent semantic experience. These results demonstrate that fMRI can visualize changes in an individual's brain function associated with the encoding and retrieval of new memories.
What neural events underlie the generation of a visual mental image? This chapter reviews evidence from patients with brain damage and from measurements of regional brain activity in normal subjects. The answer emerging from these studies is that many of the same modality-specific cortical areas used in visual perception are also used in imagery. These areas include spatially mapped regions of occipital cortex. There is also evidence for a distinct imagery mechanism, not used under normal circumstances for perception, which is required for the generation of images from memory. Evidence concerning the localization of this process is not entirely consistent, but shows a trend toward regions of the posterior left hemisphere.