Object-based attention is multisensory: co-activation of an
object’s representations in ignored sensory modalities
Sophie Molholm,1,2 Antigona Martinez,1 Marina Shpaner1,2 and John J. Foxe1,2
1The Cognitive Neurophysiology Laboratory, Program in Cognitive Neuroscience and Schizophrenia, Nathan Kline Institute for
Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, New York 10962, USA
2Program in Cognitive Neuroscience, Department of Psychology, The City College of the City University of New York, 138th Street
and Convent Avenue, New York, NY 10031, USA
Keywords: auditory, ERPs, human, visual
Abstract
Within the visual modality, it has been shown that attention to a single visual feature of an object, such as speed of motion, results in
an automatic transfer of attention to other task-irrelevant features (e.g. colour). An extension of this logic might lead one to predict
that such mechanisms also operate across sensory systems. However, connectivity patterns between feature modules across sensory
systems are thought to be sparser than those within a given sensory system, where interareal connectivity is extensive. It is therefore not clear
that transfer of attention between sensory systems will operate as it does within a sensory system. Using high-density electrical
mapping of the event-related potential (ERP) in humans, we tested whether attending to objects in one sensory modality resulted in
the preferential processing of that object’s features within another task-irrelevant sensory modality. Clear evidence for cross-sensory
attention effects was seen, such that for multisensory stimuli, responses to ignored task-irrelevant information in the auditory and
visual domains were selectively enhanced when they were features of the explicitly attended object presented in the attended
sensory modality. We conclude that attending to an object within one sensory modality results in coactivation of that object’s
representations in ignored sensory modalities. The data further suggest that transfer of attention from visual-to-auditory features
operates in a fundamentally different manner than transfer from auditory-to-visual features, and indicate that visual-object
representations have a greater influence on their auditory counterparts than vice-versa. These data are discussed in terms of
‘priming’ vs. ‘spreading’ accounts of attentional transfer.
Introduction
Attention to a single visual feature of an object, such as speed of
motion, results in an automatic transfer of attention to other task-irrelevant
features within the visual sensory modality (e.g. colour;
see for example Schoenfeld et al., 2003). The question remains,
however, whether object-based selective attention extends across the
sensory systems, such that an object’s features in other sensory
modalities are also automatically coactivated. The coactivation of
features within a sensory system (e.g. colour and motion) is clearly
predicated upon extensive anatomical connectivity between the
respective processing modules within the visual system (i.e. V4
and MT; Felleman & VanEssen, 1991), whereas similar levels of
connectivity between processing units across sensory systems may
not exist and have only recently begun to be studied (e.g. Falchier
et al., 2002). Further, a number of intersensory attention studies (that
is, studies in which attention is directed toward one of two sensory
modalities) have shown a diminution of sensory processing for
stimuli presented in the ‘ignored’ sensory modality (Foxe et al.,
2005; Johnson & Zatorre, 2005).
There are reasons to expect that object-based attention will extend
across sensory modalities, however. Spatially directed attention is
multisensory (e.g. Spence & Driver, 1996; Eimer & Driver, 2001). The
co-occurrence of visual and auditory elements of an attended object
results in enhanced visual selective attention processes (Molholm et al.,
2004), indicating a connection between visual and auditory object
representations. Moreover, a central tenet of the influential biased-competition
model of visual attention (see Duncan, 2006) is that when an
object’s representation is primed in one part of a representational
network (e.g. its colour), its representations in other parts of the
network will be as well (e.g. its shape), even when these related
representations are irrelevant to the task at hand (e.g. O’Craven et al.,
1999; Schoenfeld et al., 2003). An extension of this logic is that such
mechanisms will also operate across sensory systems in much the
same way as they do within vision.
In a pair of studies by Woldorff and colleagues (Busse et al., 2005;
Talsma et al., 2007) in which simple and unrelated auditory and visual
stimuli were presented, attention was shown to ‘spread’ to an
unattended auditory stimulus when it was paired with an attended
visual stimulus. In the present study we were interested more
specifically in how attention operates on familiar objects with well-
known multisensory attributes, a situation far more typical of everyday
experience (searching for the dog, tracking a nearby moving car,
etc.). We expected that this might function differently than attention
Correspondence: Dr Sophie Molholm, The Cognitive Neurophysiology Laboratory, as
above.
E-mail: molholm@nki.rfmh.org
Received 17 February 2007, revised 8 May 2007, accepted 30 May 2007
European Journal of Neuroscience, Vol. 26, pp. 499–509, 2007
doi:10.1111/j.1460-9568.2007.05668.x
© The Authors (2007). Journal Compilation © Federation of European Neuroscience Societies and Blackwell Publishing Ltd
to simple and arbitrarily paired multisensory stimuli. That is, the
neural representations of features of common objects are likely bound
together, so that in highlighting an object representation in one sensory
system via voluntarily directed selective attention, representations of
that object in other sensory systems would be likewise highlighted.
For example, will selectively attending to the image of a dog result
in enhanced processing of a corresponding dog bark? Here, attention
was explicitly directed at a specific object within the visual or auditory
sensory modality, and attended and unattended objects were presented
to the same location. High-density scalp-recorded event-related
potentials (ERPs) were recorded and the presence of responses known
to index visual and auditory selective attention processes was assessed
for features of attended objects that were presented in the ignored
sensory modality. To presage our results, the data provide unambig-
uous evidence for the multisensory nature of object-based attention.
Further, asymmetries in the data are consistent with an object
recognition system in which visual representations tend to dominate.
Materials and methods
Subjects
Twelve neurologically normal, paid volunteers participated (mean age
26 ± 5.5 years; seven female; three left-handed). All reported normal
hearing and normal or corrected-to-normal vision. The Institutional
Review Board of the Nathan Kline Institute for Psychiatric Research
approved the experimental procedures. Each subject provided written
informed consent in line with the Declaration of Helsinki.
Stimuli
There were three stimulus types: sounds alone, images alone, and
paired images and sounds belonging to the same object.
Images
Three line drawings were presented: a guitar, a dog, and a hammer.
These came from the Snodgrass and Vanderwart set (Snodgrass &
Vanderwart, 1980) and were standardized on familiarity and com-
plexity. They were presented on a 21-inch computer monitor located
143 cm in front of the subject, and were black on a grey background.
The pictures subtended an average of 4.8° of visual angle in the
vertical plane and 4.4° of visual angle in the horizontal plane. These
were presented for 400 ms.
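As a worked check on the stimulus geometry, the physical extent of an image can be recovered from the reported visual angle and viewing distance using the standard relation size = 2 d tan(θ/2). The sketch below is illustrative only; the function name is hypothetical, and the resulting sizes (roughly 12 cm by 11 cm) are derived here rather than reported in the text.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 143.0  # viewing distance given in the text

# Visual angles given in the text: 4.8 deg vertical, 4.4 deg horizontal.
height_cm = size_from_visual_angle(4.8, VIEWING_DISTANCE_CM)
width_cm = size_from_visual_angle(4.4, VIEWING_DISTANCE_CM)

print(f"height = {height_cm:.1f} cm, width = {width_cm:.1f} cm")
```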
Sounds
Three complementary sounds, adapted from Fabiani et al. (1996),
were presented: the strum of a guitar, the bark of a dog, and the bang
of a hammer. These were 400 ms in duration, and were presented at a
comfortable listening level of approximately 75 dB SPL over two JBL
speakers placed at either side of the monitor.
Procedure
Participants were seated in a comfortable chair in a dimly lit and
electrically shielded (Braden Shielding Systems) room and asked to
keep head and eye movements to a minimum, while maintaining
central fixation. Eye position was monitored with horizontal and
vertical electro-oculogram (EOG) recordings. The auditory, visual,
and auditory-visual stimuli were presented equiprobably and in
pseudo-random order. Stimulus onset asynchrony varied randomly
between 800 and 1100 ms. A total of 596 stimuli were presented
within a block of approximately 10 min, which was further divided
into three subblocks to allow for frequent breaks to maintain high
concentration and prevent fatigue.
There were six blocked attention conditions: attend visual stimuli,
either to the dog image or to the guitar image in different blocks;
attend auditory stimuli, either to the dog sound or to the guitar sound
in different blocks; and attend both auditory and visual stimuli, either
to the dog image and sound or the guitar image and sound in different
blocks. Subjects were instructed to make a button press response with
their right index finger to consecutive repetitions of the attended object
(dog or guitar) within the attended sensory modality (auditory, visual,
or both). The probability of a target was 6% in each of the auditory
and visual attention conditions.
In the attend visual condition, when dog was the attended object,
the target was the second of two dog-images presented in a row; in the
attend auditory condition, when dog was the attended object, the target
was the second of two dog-barks presented in a row; and in the attend
both auditory and visual condition, when dog was the attended target,
the target could be the second of two dog-barks presented in a row, the
second of two dog-images presented in a row, a dog-image that
followed a dog-bark, or a dog-bark that followed a dog-image. Over
the course of the experiment the dog and the guitar each served as the
target object in each of the three sensory attention conditions at least
twice (for a total of 12 blocks), and at most three times (for a total of
18 blocks). The order of blocks of sensory-modality attended and
object attended (dog or guitar; nested within sensory-modality
attended) was counterbalanced across subjects. The hammer stimuli
were never attended, serving only as neutral fillers to prevent an
excess of object-repetition trials (i.e. targets). Only data from the
visual and auditory attention conditions are considered here, as the
auditory-visual attention condition is not germane to the current
question. See Fig. 1A for a schematic of the stimulus paradigm during
visual (top) and auditory (bottom) attention conditions, with guitar
serving as the target.
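The target rule for the two unisensory attention conditions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' presentation code: each stimulus is represented as a hypothetical (image, sound) pair with None marking an absent modality, and a target is taken to be any stimulus that contains the attended object in the attended modality immediately after another stimulus that also contains it.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (image, sound); None means that modality is absent.
Stimulus = Tuple[Optional[str], Optional[str]]


def find_targets(stream: List[Stimulus], modality: str, obj: str) -> List[int]:
    """Indices of stimuli that repeat the attended object in the attended modality.

    Assumption: only immediately consecutive stimuli count as a repetition,
    so an intervening stimulus without the attended feature breaks the pair.
    """
    slot = {"visual": 0, "auditory": 1}[modality]
    return [
        i
        for i in range(1, len(stream))
        if stream[i][slot] == obj and stream[i - 1][slot] == obj
    ]


stream: List[Stimulus] = [
    ("guitar", None),      # guitar image alone
    ("guitar", "guitar"),  # paired guitar image and strum
    (None, "dog"),         # dog bark alone
    ("dog", None),         # dog image alone
    (None, "guitar"),      # guitar strum alone
    (None, "guitar"),      # guitar strum alone
    ("hammer", "hammer"),  # neutral filler
    ("guitar", None),      # guitar image alone
]

visual_targets = find_targets(stream, "visual", "guitar")
auditory_targets = find_targets(stream, "auditory", "guitar")
```

In this toy sequence the paired guitar stimulus at index 1 is a target only when the guitar image is attended, because the preceding stimulus contained no guitar sound.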
Data acquisition and analysis
High-density continuous EEG recordings were obtained from the
BioSemi ActiveTwo 168-channel system. Recordings were initially
referenced online relative to a common mode active electrode and
digitally sampled at 512 Hz. The continuous EEG was divided into
epochs (−100 ms pre-stimulus to 500 ms post-stimulus onset) and
baseline corrected over the full 600 ms for the purpose of artifact
rejection. Trials with blinks and eye movements were automatically
rejected off-line on the basis of the horizontal electro-oculogram recorded from
electrodes placed on the left and right canthi. An artifact criterion of ±60 µV was used at
all other scalp sites to reject trials with excessive EMG or other noise
transients. EEG epochs were sorted according to stimulus and
attention condition, and averaged from each subject to compute the
ERP. The baseline for the ERP was defined as the epoch from
−100 ms to stimulus onset. Only non-target responses were analysed,
due to the temporally and spatially overlapping target- and response-related
componentry typically associated with target responses. For all
electrophysiological analyses, the responses to the two objects (dog
and guitar) were collapsed, to increase the signal-to-noise ratio. The
average number of accepted sweeps per stimulus-type within an
attention condition was 157 (± 25; collapsed across objects). For the
analysis of the data, the waveforms were algebraically re-referenced to
the nasion. Separate group-averaged ERPs for each of the stimulus
types in each of the attention conditions were calculated for display
purposes and for identification of the short-latency sensory-evoked
components (e.g. P1, N1, P2), and the well-characterized selective
attention components the visual ‘selection negativity’ (SN) and the
auditory ‘negative difference’ wave (Nd; e.g. Hansen & Hillyard,
1980). Button press responses to the target trials were acquired during
the recording of the EEG and processed offline. Responses falling
between 250 and 950 ms post-stimulus onset were considered valid.
This window was used so that a response could only be associated
with a single stimulus presentation.
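The epoching, artifact rejection and averaging steps described above can be sketched for a single channel as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the signal is a plain list of microvolt values, and the ±60 µV criterion is applied to one channel only, whereas the text applies it across scalp sites and uses the electro-oculogram for blinks and eye movements.

```python
from statistics import fmean

SAMPLE_RATE_HZ = 512         # sampling rate given in the text
PRE_MS, POST_MS = 100, 500   # epoch limits given in the text
ARTIFACT_UV = 60.0           # rejection criterion given in the text


def ms_to_samples(ms):
    """Convert a duration in ms to a whole number of samples."""
    return round(ms * SAMPLE_RATE_HZ / 1000)


def extract_epoch(signal, onset):
    """Samples from 100 ms before to 500 ms after onset, or None if out of range."""
    start = onset - ms_to_samples(PRE_MS)
    stop = onset + ms_to_samples(POST_MS)
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]


def remove_mean(epoch, n_samples):
    """Subtract the mean of the first n_samples from every sample."""
    base = fmean(epoch[:n_samples])
    return [v - base for v in epoch]


def average_erp(signal, onsets):
    """Average the accepted, baseline-corrected epochs (single channel)."""
    n_pre = ms_to_samples(PRE_MS)
    kept = []
    for onset in onsets:
        epoch = extract_epoch(signal, onset)
        if epoch is None:
            continue
        # Artifact check uses a baseline over the full epoch, as in the text.
        centred = remove_mean(epoch, len(epoch))
        if max(abs(v) for v in centred) > ARTIFACT_UV:
            continue
        # Accepted epochs are re-baselined to the pre-stimulus interval.
        kept.append(remove_mean(epoch, n_pre))
    return [fmean(column) for column in zip(*kept)] if kept else []
```

With these settings each epoch spans 307 samples (51 before onset and 256 after), and an epoch containing a large transient is dropped before averaging.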
Statistical analyses
Behaviour
To determine if attention was directed as intended, performance during
both auditory and visual attention conditions was examined. Perform-
ance between auditory and visual attention conditions was compared
to assess if there were significant differences in the successful
allocation of attention. Error patterns were also examined, to assess
whether subjects effectively ignored stimuli presented in the unatten-
ded sensory modality.
For individual subjects, accuracy and average reaction time (RT)
were calculated for each of the objects (dog and guitar), for each of the
target types (auditory, visual, or visual-auditory) and for each of the
unisensory-attention conditions (attend auditory and attend visual).
Target type was determined by the second of the repeating stimulus
pair. S1 of the target pair was counterbalanced such that an equal
number of auditory, visual, and visual-auditory stimuli preceded S2.
For the auditory attention condition targets could be auditory (A) or
visual-auditory (VA; when the attended auditory element was paired
with its irrelevant visual counterpart); for the visual attention condition
targets could be visual (V) or visual-auditory (VA; when the attended
visual object was paired with its irrelevant auditory counterpart).
Separate three-way ANOVAs with factors of Object (dog or guitar),
Target type (unisensory or multisensory), and Unisensory-attention
condition (visual or auditory) were performed on the reaction-time and
accuracy data.
Electrophysiology
Baseline unisensory selective attention effects
Characterization of selective attention effects elicited under typical
unisensory conditions provided a baseline against which to compare
selective attention effects for the attended objects when presented in
ignored sensory modalities.
Visual selective attention effects were examined by comparing the
response to the visual objects when attended to the response to
the same when unattended (e.g. during visual attention, the response to
the dog image when dog was attended compared to the response to the
dog image when guitar was attended). For individual subjects,
amplitude data were averaged over a 100-ms latency window centred
on the peak of the grand-averaged SN. This was performed for the
three electrodes over each of right and left occipital scalp where
the SN was largest in the grand-mean difference waveform, and over
the central occipital scalp. The SN was expected to be most
prominent over bilateral occipital scalp sites, and to peak 250–350 ms
following stimulus onset, in line with the extant literature (e.g. Hillyard &
Anllo-Vento, 1998; Molholm et al., 2004).
Auditory selective attention effects were examined by comparing
the response to the auditory objects when attended to the response to
the same when unattended (e.g. during auditory attention, the response
to the dog bark when dog was attended compared to the response to
the dog bark when guitar was attended). For individual subjects,
amplitude data were averaged over a 100-ms latency window that was
centred on the peak of the grand-averaged Nd. This was performed for
nine electrodes over fronto-central scalp where the Nd was largest in
the grand-mean difference waveform, three from left, three from right,
Fig. 1. (A) Schematic of experimental paradigm. A sequence of stimuli (‘–’,
task-irrelevant object; ‘+’, task relevant object) when visual is attended (top)
and when auditory is attended (bottom), where a repetition of ‘guitar’ in the
attended sensory modality comprises a target trial (‘target’). The hammer
stimuli (bang sounds and hammer images) were neutral filler stimuli, which
were never attended. (B) Back (left) and top (right) views of the 168 channel
recording montage. Electrode sites used in the statistical analysis of the visual
selective attention effects (back view; filled circles) and the auditory selective
attention effects (top view; filled circles). Data were analysed from electrodes
representing activity over left (three), central (three), and right (three) scalp
regions where the baseline selective attention effects were greatest.
and three from central regions. The Nd was expected to be most
prominent over fronto-central scalp, and to peak 150–250 ms
following stimulus onset, in line with the extant literature (e.g.
Hansen & Hillyard, 1980).
Cross-sensory object-based selective attention effects
Two approaches were taken in assessing whether object-based
selective attention is multisensory. The first examined evidence in
the responses to the unisensory stimuli and the second did so in the
responses to the multisensory stimuli. These selective attention effects
are referred to as ‘uni cross-sensory’ SN or Nd and ‘multi cross-sensory’
SN or Nd, respectively.
For the unisensory visual stimuli, the response to the image when its
auditory counterpart was attended (e.g. guitar image when the guitar
strum was attended) was compared to the response to the image when
its auditory counterpart was not attended (e.g. guitar image when the
dog bark was attended). In this case the presence of the SN would
signify that, even though presented in an ignored sensory modality, the
visual element of the attended object was preferentially processed.
Likewise, the response to the sound when its visual counterpart was
attended was compared to the response to the sound when its visual
counterpart was not attended and the Nd was assessed.
For the multisensory stimuli the same logic was applied. However,
for the multisensory comparison we predicted the occurrence of both
visual and auditory selective attention components (the SN and the
Nd), one reflecting selective attention to the attended object in the
attended sensory modality and the other selective attention to its
counterpart in the ignored sensory modality. As the goal was to
examine the latter, we first subtracted out the response to the
unisensory stimuli corresponding to the attended sensory modality.
For example, in the visual attention condition for which the question
of interest is whether visual selective attention processes also affect
processing of auditory features of the same object (as reflected by
the presence of the Nd), the response to the attended unisensory
visual stimuli was subtracted from the response to the attended visual-
auditory stimuli, and the response to the unattended unisensory
visual stimuli was subtracted from the response to the unattended
visual-auditory stimuli (see Fig. 2 for illustration). This left the
auditory response along with any possible auditory cross-sensory
selective attention effects (any additional multisensory interactions in
the AV response would also be contained in these responses; if these
differed for the attended vs. unattended conditions they would be
reflected in the resulting comparison), and ensured that observed
electrical activity over the region of interest was not due to volume
conduction of the ‘primary’ selective attention effect.
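The subtraction logic for the visual attention condition can be written out as a short sketch. The function and variable names are hypothetical and the waveforms are toy values chosen for illustration; the derived auditory response A′ is the visual-auditory response minus the corresponding unisensory visual response, computed separately for attended and unattended objects before the two are compared.

```python
def subtract(a, b):
    """Point-by-point difference between two waveforms of equal length."""
    return [x - y for x, y in zip(a, b)]


def derived_cross_sensory_effect(av_att, v_att, av_unatt, v_unatt):
    """Attended minus unattended A' for the visual attention condition."""
    a_prime_att = subtract(av_att, v_att)        # A' for the attended object
    a_prime_unatt = subtract(av_unatt, v_unatt)  # A' for the unattended object
    return subtract(a_prime_att, a_prime_unatt)


# Toy waveforms (microvolts, three samples each), illustrative values only.
v_att, v_unatt = [0.0, -2.0, -4.0], [0.0, -1.0, -2.0]
av_att, av_unatt = [1.0, -5.0, -4.0], [1.0, -2.0, -2.0]

effect = derived_cross_sensory_effect(av_att, v_att, av_unatt, v_unatt)
```

In this toy case the visual attention effect present in the unisensory responses cancels in the subtraction, leaving only the residual difference at the second sample.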
The dependent measure used to assess selective attention effects for
stimuli presented in the unattended sensory modality was the mean
amplitude over a 100-ms latency window centred on the peak of the
grand-mean selective attention component of interest (the SN in
the case of the attend-auditory condition or the Nd in the case of the
attend-visual condition). Data were taken from the same electrodes as
used to test baseline unisensory selective attention effects (as
described above).
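The dependent measure can be sketched as follows: locate the peak of the grand-mean difference wave, centre a 100-ms window on it, and average each subject's amplitude within that window. This is a minimal illustration with hypothetical function names; the search range used to find the peak is an assumption introduced here, and the peak is taken as the most negative point because the SN and Nd are negativities.

```python
from statistics import fmean


def peak_centred_window(grand_diff, times_ms, search_ms, width_ms=100.0):
    """Window of width_ms centred on the most negative point within search_ms."""
    in_range = [i for i, t in enumerate(times_ms) if search_ms[0] <= t <= search_ms[1]]
    peak = min(in_range, key=lambda i: grand_diff[i])
    centre = times_ms[peak]
    return (centre - width_ms / 2, centre + width_ms / 2)


def mean_amplitude(wave, times_ms, window):
    """Mean of the samples whose latency falls inside the window (inclusive)."""
    return fmean([v for v, t in zip(wave, times_ms) if window[0] <= t <= window[1]])


# Toy grand-mean difference wave with a negative peak at 270 ms (illustrative only).
times = list(range(0, 500, 10))
grand = [-max(0.0, 50.0 - abs(t - 270)) for t in times]
window = peak_centred_window(grand, times, search_ms=(150, 400))

# One subject's difference wave, here a constant -2 microvolts.
subject_value = mean_amplitude([-2.0] * len(times), times, window)
```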
Statistical tests
Presence of selective attention effects. Omnibus four-way ANOVAs
for each of the visual and auditory selective attention effects were
conducted. These each had factors of Condition [three levels –
baseline attention effect (baseline); multisensory object attention
effect in unisensory stimuli (uni cross-sensory); multisensory object
attention effect in multisensory stimuli (multi cross-sensory)],
Attention (two levels – attended, unattended), Region (three levels
– right, centre, and left), and Electrode (3). A significant interaction
between Condition and Attention was followed up with three two-way
ANOVAs with factors of Attention and Region, to assess the
presence of selective attention effects individually for each of the
conditions. Where appropriate, Greenhouse-Geisser corrections were
made.
Onset and offset of selective attention effects. Running t-tests (two-
tailed) were used to assess the onset latency and duration of the visual
(SN) and auditory (Nd) selective attention effects for each of the three
comparison conditions. For each of these, the amplitude of the
attended condition was compared to the amplitude of the correspond-
ing unattended condition. The onset of the SN or Nd was defined as
the first significant time-point of at least ten consecutive significant
time-points, in data from a single electrode. Offset was defined as the
last of the following significant time-points. These tests were
performed on each data-point (~2 ms steps) from 150 to 500 ms
post-stimulus onset (150 ms being the lower limit at which the object-
based selective attention effects might onset). Performing tests on
multiple time-points increases the probability of a false positive. To
reduce the chances of a false positive, the tests were restricted to the
subset of nine electrodes for which the selective attention component
was originally tested (as defined above), and a criterion of ten
consecutive time points was used, as the likelihood of getting ten false
positives in a row is very low (Guthrie & Buchwald, 1991).
To provide a fuller description of the selective-attention effects, cluster
plots of t-tests on data from all electrodes across the examined epoch
(−100 to 500 ms) were also inspected. With the potential to reveal
Fig. 2. Derivation of the auditory response (A′) from the auditory-visual response. Responses are from a left fronto-central scalp site during the visual attention
condition, to attended (dashed trace) and unattended (solid trace) objects. Positive is plotted up.
Fig. 3. Voltage maps and waveforms of the selective attention effects. Voltage maps of the visual (A) and auditory (B) selective attention effects for the baseline
and two cross-sensory conditions, and the corresponding waveforms. Waveforms to the neutral stimuli are also included (these are to the hammer image in A and to
the hammer bang in B). Note that the waveforms for the multi cross-sensory conditions are derived, as illustrated in Fig. 2 (equivalent to V′ for the multi
cross-sensory SN and A′ for the multi cross-sensory Nd). Waveforms are from a left lateral occipital site for the visual selective attention effects, and from a left
fronto-central site for the auditory selective attention effects. Positive is plotted up. The scalp site from which the displayed waveforms were recorded is indicated by
a pink dot.
unpredicted effects, these cluster plots can serve as a hypothesis
generation tool for future work.
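The onset and offset criterion can be sketched as a paired t-test at each time-point followed by a search for the first run of at least ten consecutive significant points. The code below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the critical value of 2.201 corresponds to a two-tailed alpha of 0.05 with 11 degrees of freedom (twelve subjects), an alpha level that is assumed here because the text does not state it.

```python
import math
from statistics import fmean, stdev

T_CRIT_DF11 = 2.201  # two-tailed critical t, df = 11, ASSUMED alpha = 0.05


def paired_t(a, b):
    """Paired-samples t statistic for two equal-length lists of subject values."""
    diffs = [x - y for x, y in zip(a, b)]
    mean, sd = fmean(diffs), stdev(diffs)
    if sd == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(len(diffs)))


def running_significance(attended, unattended, t_crit=T_CRIT_DF11):
    """One flag per time-point; inputs are per-subject waveforms (lists of lists)."""
    n_times = len(attended[0])
    return [
        abs(paired_t([s[t] for s in attended], [s[t] for s in unattended])) > t_crit
        for t in range(n_times)
    ]


def onset_offset(flags, min_run=10):
    """(first, last) indices of the first run of >= min_run True flags, else None."""
    start = None
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                return start, i - 1
            start = None
    return None


# A run of four significant points is ignored; the run of twelve is reported.
flags = [False] * 5 + [True] * 4 + [False] * 3 + [True] * 12 + [False] * 6
run = onset_offset(flags)
```

At a sampling rate of 512 Hz each index step corresponds to 1000/512 ≈ 1.95 ms, so a run of ten points spans roughly 20 ms.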
Results
Behaviour
Mean reaction-time tended to be longer to auditory compared to visual
targets, with mean RTs of 559 ms and 523 ms, respectively. The
longer RTs to the auditory targets may reflect that a minimal segment
of the dynamic auditory signal is necessary before a target decision
can be made, whereas for the visual targets the entire signal is present
at target onset. This difference approached significance (main effect of
unisensory-attention condition, F1,11 = 4.2, P = 0.07). RTs to unisensory
targets were slightly faster than RTs to multisensory targets (536
vs. 545 ms), and this difference was significant (F1,11 = 8.4,
P = 0.01). There was no significant effect of Object on RTs
(F1,11 = 0.06, P = 0.80), and there were no significant interactions.
The analysis of the per cent hit data did not reveal any significant
main effects or interactions. Performance when attention was directed
at the auditory stimuli was 85% and performance when attention was
directed at the visual stimuli was 87% (F1,11 = 1.34, P = 0.27). There
was no significant effect of Target type (i.e. whether the stimuli were
unisensory or multisensory, F1,11 = 2.87, P = 0.12) or of Object (i.e.
whether the target object was guitar or dog, F1,11 = 0.97, P = 0.34).
Analysis of incorrect responses
In the auditory attention condition 7% of responses were false alarms
and in the visual attention condition 6% of responses were false
alarms, false alarms being instances where subjects responded to a
stimulus that was not a target. False alarms did not differ significantly
between these conditions (t1,11 = 0.83, P = 0.424). The majority of
false alarms were due to correct responses that fell just out of the set
response window (66%, thus classified if the response fell in the
response window of the following stimulus). Another 18% of false
alarms occurred in response to a repetition of the attended object in the
attended sensory modality where there was a single intervening task-irrelevant
stimulus (i.e. these responses would have been correct had
there been no intervening stimulus). A further 15% of the false alarms were
classified as ‘other’, not following a clear pattern. Finally, and perhaps
most importantly, across all the subjects only one false alarm was to an
object repetition where one of the two was in the ignored sensory
modality (e.g. a guitar strum followed by guitar image, during auditory
attention; 0.7% of false alarms), and there were no false alarms to
repetitions of the relevant object presented in the ignored sensory
modality. The absence of a meaningful number of these two latter
Fig. 4. Maps of the significant t-values from the comparison of the ERP to the ‘attended object’ vs. the ERP to the ‘unattended object’, across the 168 recording
channels (ordinate), from 0 to 500 ms post-stimulus onset (abscissa). The electrodes are arranged from most anterior (top of plot) to most posterior (bottom of plot),
sectioned by scalp region (FP, fronto-polar; F, frontal; FC, fronto-central; C, central; P, parietal; PO, parieto-occipital; O, occipital). Significant values are only
displayed when there are at least ten consecutive significant data-points. Maps are for visual (A, upper panels) and auditory (B, lower panels) stimuli in each of the
three comparison conditions (baseline, uni cross-sensory, and multi cross-sensory).
types of false alarms suggests that subjects effectively ignored stimuli
in the unattended sensory modality, as instructed.
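The way a late response comes to be classified as a false alarm follows from the 250 to 950 ms response window given in the methods, and can be sketched as below. This is an illustrative scoring routine with hypothetical names, not the authors' code; it assumes the window bounds are inclusive and that a button press falling in no window is simply left unassigned.

```python
RESPONSE_WINDOW_MS = (250, 950)  # valid response window given in the methods


def score_responses(onsets_ms, is_target, presses_ms):
    """Assign each press to the stimulus whose window contains it.

    Returns (hits, false_alarms) as lists of stimulus indices. Presses that
    fall in no window are ignored (an assumption of this sketch).
    """
    lo, hi = RESPONSE_WINDOW_MS
    hits, false_alarms = [], []
    for press in presses_ms:
        for i, onset in enumerate(onsets_ms):
            if lo <= press - onset <= hi:
                (hits if is_target[i] else false_alarms).append(i)
                break
    return hits, false_alarms


onsets = [0, 900, 1800, 2800]          # stimulus onsets in ms (toy values)
targets = [False, True, False, False]  # the second stimulus is the target

timely = score_responses(onsets, targets, [1450])  # 550 ms after the target
late = score_responses(onsets, targets, [2100])    # 1200 ms after the target
```

The late press misses the target's window and instead lands 300 ms after the following stimulus, so it is scored as a false alarm to that stimulus.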
Electrophysiology
Visual selective attention effects
A clear selection negativity (SN), our measure of visual selective
attention processes, was observed in the grand-mean data for the
baseline condition, in which responses to attended and unattended
images (i.e. unisensory visual stimuli) in the visual attention condition
were compared. The difference between these responses peaked at
270 ms and was of maximal amplitude bilaterally over the lateral
occipital scalp (Fig. 3A). To test for the presence of object-based
selective attention effects in the unattended sensory modality, the
response to the image that was a feature of the attended auditory
object was compared to the response to the image that was not a
feature of the attended auditory object. This revealed visual selective
attention effects for both the cross-sensory conditions (see Fig. 3A),
with scalp topographical distributions very similar to the baseline
condition. The SN peaked at ~270 ms for the multi cross-sensory
condition, and somewhat later at 310 ms for the uni cross-sensory
condition. The placement of the electrodes used in the statistical analyses
is displayed in Fig. 1B.
To test the reliability of the observed visual selective attention
effects, a four-way ANOVA with factors of Condition (baseline, uni
cross-sensory, multi cross-sensory), Attention (attended object,
unattended object), Region (RH, LH, midline), and Electrode (3, in
each region) was conducted. There was a main effect of Attention
(F1,11 = 56.10, P < 0.000). A significant Condition by Attention
interaction (F2,11 = 6.41, P = 0.01) indicated that the magnitude of
the SN differed across the conditions. There were no significant
higher-level interactions. Follow-up ANOVAs revealed significant
attention effects for the baseline SN and the multi cross-sensory SN
(main effect of Attention for the two conditions, respectively,
F1,11 = 29.00, P < 0.001; F1,11 = 27.96, P < 0.001). The uni
cross-sensory SN was not significant (F1,11 = 2.61, P = 0.14). There were
no significant higher-level interactions.
To determine the onset and offset of the visual selective attention
effects, statistical cluster-plots were generated (Fig. 4). These
revealed an onset of 196 ms and an offset of 346 ms for the
baseline SN and a later onset and offset of 237 ms and 366 ms for
the multi cross-sensory SN. In both cases onsets were slightly earlier
at the sites over left occipital scalp. For the uni cross-sensory
condition there was no corresponding significant activity over the
tested electrode sites.
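One common way to read onset and offset latencies from pointwise statistics is to run a paired t-test at every time point and to accept only runs of consecutive significant samples. The sketch below illustrates that idea for a single electrode. The run-length criterion, the critical t value and the toy data are assumptions for illustration and are not taken from the analysis reported here.

```python
import math


def paired_t(a, b):
    """Paired-samples t statistic for two equal-length lists (one value per subject)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / math.sqrt(var / n)


def onset_offset(attended, unattended, times_ms, t_crit, min_run):
    """First run of at least `min_run` consecutive significant samples.

    attended / unattended are indexed [time][subject].
    Returns (onset_ms, offset_ms), or None if no run qualifies.
    """
    sig = [abs(paired_t(a, u)) > t_crit for a, u in zip(attended, unattended)]
    start = None
    for i, flag in enumerate(sig + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                return times_ms[start], times_ms[i - 1]
            start = None
    return None


# Hypothetical toy data: 4 subjects, samples every 10 ms from 180 to 250 ms.
effect = [-1.0, -1.2, -0.8, -1.1]   # consistent negative difference
null = [0.5, -0.5, 0.4, -0.4]       # no consistent difference
pattern = [False, True, False, True, True, True, False, False]
times_ms = list(range(180, 260, 10))
attended = [effect if p else null for p in pattern]
unattended = [[0.0] * 4 for _ in pattern]

# 3.182 is the two-tailed 0.05 critical value of t for 3 degrees of freedom.
print(onset_offset(attended, unattended, times_ms, t_crit=3.182, min_run=3))
# prints (210, 230)
```

The isolated significant sample at 190 ms is ignored because it does not satisfy the run-length criterion, which is the purpose of requiring consecutive samples.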
Auditory selective attention effects
A clear negative difference wave (Nd), our measure of auditory
selective attention processes, was observed in the grand-mean data for
the baseline condition, in which responses to attended and unattended
sounds (i.e. unisensory auditory stimuli) in the auditory attention
condition were compared. The difference between these responses
peaked at 220 ms, and was focused over the fronto-central scalp
region, with a slightly leftward distribution (Fig. 3B). To test for the
presence of object-based selective attention effects in the unattended
sensory modality, the response to the sound that was a feature of the
attended visual object was compared to the response to the sound that
was not a feature of the attended visual object. This revealed auditory
selective attention effects for both the multi cross-sensory and uni
cross-sensory comparisons (see Fig. 3B). The Nd peaked at 220 ms
for the multi cross-sensory condition, and somewhat later at 300 ms
for the uni cross-sensory condition. The scalp distributions of both the
cross-sensory Nds were slightly more frontal and leftward compared
to the baseline Nd. The placement of the electrodes used in the
statistical analyses is displayed in Fig. 1B.
A four-way ANOVA with factors of Condition (baseline, uni
cross-sensory, multi cross-sensory), Attention (attended object, unattended
object), Region (RH, LH, midline), and Electrode was conducted to
test the reliability of the Nd across the three conditions. There was a
significant main effect of Attention (F1,11 = 40.97, P < 0.001). The
factors Attention and Condition did not interact (F2,22 = 1.03,
P = 0.37), indicating that the amplitude of the Nd did not significantly
differ across the three conditions. There were no significant higher-
level interactions.
To determine the onset and offset latencies of the auditory selective
attention effects, statistical cluster-plots were generated (Fig. 4). These
revealed an onset of 176 ms for the baseline Nd, with the differential
activity remaining significant throughout the analysed epoch (out to
500 ms). The multi cross-sensory Nd onset at approximately the same
latency (180 ms), but offset much earlier, at approximately 255 ms.
The uni cross-sensory Nd onset somewhat later at approximately
190 ms and offset at approximately 423 ms.
Comparison of the attention effects with a neutral condition
A post hoc analysis was performed in which the amplitudes of the
‘attended’ and ‘unattended’ responses were each compared to a neutral
condition. This allowed us to consider whether the attention effects
better fit a model in which there was ‘suppressed’ processing of the
unattended stimuli, or a model in which there was ‘enhanced’
processing of the attended stimuli. The responses to the hammer-
images and bang-sounds were considered neutral as throughout the
experimental session they were never attended and had no associated
task relevance. While these stimuli were not exactly matched for
sensory stimulation with the comparator attended and unattended
stimuli, the sensory responses were quite similar (see Fig. 3). At later
latencies (~200 ms onward) the neutral waveform followed the responses of
the unattended conditions (except in the multi cross-sensory Nd
condition, where it was more positive than both the attended and
unattended responses; see Fig. 3).
We first tested whether there were effects of stimulus-type when the
neutral stimulus and the attended stimuli were all compared, and when
the neutral stimulus and the unattended stimuli were all compared.
A total of four one-way ANOVAs were performed. In the first, the
unattended auditory responses and the neutral auditory response were
compared, giving four levels of the stimulus-condition factor:
baseline, uni cross-sensory, multi cross-sensory, and neutral. In the
second, the attended auditory responses and the neutral auditory
response were compared, with the same four levels of
stimulus-condition. In the third, the unattended visual responses and
the neutral visual response were compared, giving three levels of the
stimulus-condition factor: baseline, multi cross-sensory, and neutral.
In the fourth, the attended visual responses and the neutral visual
response were compared, with the same three levels of
stimulus-condition.
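For readers who want to see how the degrees of freedom of such one-way repeated-measures tests arise, the following is a minimal sketch. The function name and the toy data are hypothetical, and the sketch applies no sphericity correction, so it should not be read as the exact procedure used here. With 12 subjects (a number consistent with the F1,11 values reported above), three levels give 2 and 22 degrees of freedom and four levels give 3 and 33.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of rows, one per subject, each holding one value per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-subject residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical amplitudes for 12 subjects in 3 stimulus conditions.
toy = [[0.1 * s, 0.1 * s - 1.0 + 0.05 * (s % 3), 0.1 * s - 0.9 - 0.05 * (s % 2)]
       for s in range(12)]
f_value, df_cond, df_error = rm_anova_oneway(toy)
print(df_cond, df_error)  # prints 2 22
```

Between-subject variability is removed from the error term, which is why the error degrees of freedom are (k - 1)(n - 1) rather than those of a between-subjects design.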
Data were only included from conditions where there were
significant attention effects. To maximize the sensitivity of the tests,
data were pooled across the three electrode sites over the left
hemisphere, where the attention effects had appeared greatest in
amplitude (when considering the right, left, and central sites from
which data were taken for the original analyses). For the comparisons,
the dependent variable was the mean amplitude over a 100-ms
window that was centred on the peak of the selective attention effects
for the attended and unattended responses (the same as used in the
planned tests of the selective attention effects), and over the same
latency window for the neutral stimulus conditions (170–270 ms for
the auditory neutral stimulus and 220–320 ms for the visual neutral
stimulus).
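The dependent measure described above can be sketched as follows: a 100-ms window is centred on the peak latency and the waveform is averaged across the samples inside it. The helper names and the choice to include both window endpoints are assumptions for illustration. The two example calls reproduce the windows stated in the text from peak latencies of 220 ms (auditory) and 270 ms (visual).

```python
def window_around_peak(peak_ms, width_ms=100):
    """Start and end (ms) of a window of `width_ms` centred on `peak_ms`."""
    half = width_ms / 2
    return peak_ms - half, peak_ms + half


def mean_amplitude(waveform, times_ms, start_ms, end_ms):
    """Mean of the samples whose latency lies in [start_ms, end_ms]."""
    vals = [v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms]
    if not vals:
        raise ValueError("no samples fall inside the window")
    return sum(vals) / len(vals)


print(window_around_peak(220))  # prints (170.0, 270.0), the auditory window
print(window_around_peak(270))  # prints (220.0, 320.0), the visual window
```

Using the same window for the neutral stimulus keeps the comparison with the attended and unattended responses matched in latency.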
There were no significant effects of stimulus-condition when the
responses to the unattended and neutral stimuli were compared, for
either the visual or the auditory data (F2,22 = 0.51, P = 0.61 and
F3,33 = 1.09, P = 0.37, respectively). There were, in contrast,
significant effects of stimulus-condition when the responses to the
attended and neutral stimuli were compared, and this was the case for
both the visual and the auditory data (F2,22 = 14.52, P < 0.001
and F3,33 = 5.65, P = 0.003, respectively). These significant effects
were followed up by paired-samples t-tests to determine which of the
attended stimulus conditions differed significantly from the neutral
condition (three comparisons for the auditory stimuli and two
comparisons for the visual stimuli). The responses to the attended
stimuli were all significantly more negative going than the respective
neutral conditions, at P ≤ 0.01. These data are consistent with an
account in which the attention effects reflect ‘enhancement’ of
attended responses.
Discussion
The present study provides evidence that object-based attention is
multisensory. Selectively attending to an object in one sensory
modality resulted in facilitated processing of its features in the ignored
sensory modality. This was indexed by the presence of the visual and
auditory selective attention ERP responses, the selection negativity
and the negative difference wave, respectively (e.g. Hansen &
Hillyard, 1980; Hillyard & Anllo-Vento, 1998). A key question is
what mechanisms underlie such multisensory object-based
attention effects.
Underlying mechanisms
Selective attention to a stimulus feature can result in the amplification
of the sensory-driven response of the corresponding feature-specific
sensory processing network (e.g. Corbetta et al., 1990), a process that
has been conceptualized as ‘priming’ (e.g. Hillyard et al., 1998). The
neural representations of features of objects are likely strongly bound
together (e.g. Beauchamp et al., 2004; Amedi et al., 2005) such that in
setting an object representation in one sensory system at a lower
threshold via selective attention, the thresholds of representations of
that object in other sensory systems are likewise altered. Such a cross-
sensory object priming account provides an excellent fit to the
auditory cross-sensory attention data of the present study. Specifically,
we observed auditory cross-sensory attention effects even in the
absence of the visual stimulus that was being attended at the time.
Further, the onset of the auditory cross-sensory attention effects
approximated that of the baseline comparison condition (180–190 ms
during visual attention vs. 176 ms during auditory attention). This
pattern of data suggests that the sensory processing network was
primed for this input. Consistent with this explanation, cross-sensory
priming has been demonstrated using behavioural measures, for
auditory-visual as well as tactile-visual objects (Easton et al., 1997;
Greene et al., 2001). Also supporting a representational system in
which the multisensory features of an object are coactivated is a
neuroimaging study showing tactile to visual priming for novel visual
objects studied by touch (James et al., 2002).
The auditory cross-sensory attention effects of the current study
strongly suggest that the automatic coactivation of a visual object’s
unattended features, proposed in the highly influential
biased-competition model of visual attention (Desimone & Duncan, 1995; Duncan,
2006) and supported by experimental data (e.g. Schoenfeld et al.,
2003), also applies to a visual object’s representations in other
unisensory systems.
The priming explanation does not account as well, if at all, for the
visual cross-sensory attention effects. For one, a significant visual
cross-sensory attention effect was only seen when the ignored visual
stimulus was actually presented in conjunction with the attended
auditory stimulus. Thus, the presence of the explicitly attended
auditory stimulus was necessary for visual cross-sensory attention
effects to occur. What’s more, the observed cross-sensory
attention effect developed substantially later than that for the baseline
comparison condition. That is, the visual selective attention effect
onset at 196 ms during visual attention (the baseline condition),
whereas it did not onset until 237 ms during auditory attention. One
possibility is that the auditory-to-visual transfer effects were just
weaker than the visual-to-auditory transfer effects, and hence they
failed to reach significance in one case and only reached significance
at a relatively late latency in the other, an issue of signal-to-noise.
Consistent with this view, in Fig. 3 there is some suggestion of a
selection negativity for the uni cross-sensory condition (Fig. 3A,
middle panel), though this does not reach statistical significance in the
planned ANOVA, and the running t-test significance map for the visual
uni cross-sensory SN shows significant differences in just two
posterior electrodes in the timeframe of the SN (Fig. 4, top middle
panel). Weaker auditory to visual transfer of attention effects could be
due to the ‘dominance’ of visual object representations, a notion that is
discussed below.
However, an alternative explanation also fits the present visual
cross-sensory attention data, and indeed was suggested as the process
underlying the cross-sensory attention effect observed for simple and
unrelated visual and auditory stimuli by the Woldorff group (Busse
et al., 2005; Talsma et al., 2007). This is a spread of attention account,
in which attention spreads from attended to unattended aspects of an
object. Such spreading of attention was originally proposed to account
for the finding of better performance on targets presented to an
unattended location that fell within the same object boundaries as an
attended location (Egly et al., 1994; Davis et al., 2000). The spread of
spatial-attention within an object has received further support from
ERP and fMRI neuroimaging studies, with attention effects present for
stimuli presented to unattended locations that fell within the attended
object (e.g. Martinez et al., 2006). In a similar vein, attention may
spread within an object across sensory systems, such that when an
attended object is presented, its constituent parts also receive
preferential processing, even though presented in ignored sensory
modalities. By this explanation the so-called spread of attention should
depend upon the presence of the attended object, and the onset of
selective attention effects should be somewhat later than that of
standard within-modality selective attention effects, simply allowing
for the time for effects to transmit between visual and auditory cortical
regions. In fact, this is precisely what was observed here for visual
cross-sensory attention.
There is also the question of the mediating neural structures and
pathways of cross-sensory attentional transfer. One possibility is that
direct connections among an object’s feature-representations in the
different unisensory systems mediate the effects. Such a pathway is
feasible insofar as non-human primate anatomical tracer
studies reveal direct connections between auditory and visual sensory
cortices (Falchier et al., 2002; Rockland & Ojima, 2003; Cappe &
Barone, 2005), and human and non-human primate neurophysiology
has shown that multisensory processing can occur early in time in
what are considered unisensory cortical regions (Giard & Peronnet,
1999; Molholm et al., 2002); for a discussion of these data see Foxe &
Schroeder (2005) and Schroeder & Foxe (2005).
Equally plausible is that higher-order cortical regions mediate cross-
sensory attentional transfer. In this case a multisensory or supramodal
cortical region higher in the information processing hierarchy (e.g.
polysensory superior temporal sulcus) might send signals to the
unisensory cortices to modulate processing of the features of a
particular object, even for features that are not explicitly attended. This
could be initiated by the presentation of the attended stimulus, with
inputs from unisensory cortex to higher-order regions triggering the
top-down activation of the object’s features in unattended sensory
modalities (though this explanation is limited to instances where the
presence of the attended stimulus is necessary for cross-sensory
attention effects to occur). Alternatively, it could be initiated in a top-
down manner following task instructions. This general mechanism, in
which higher-order cortical regions are responsible for cross-sensory
attentional effects in unisensory cortex, bears some resemblance to the
way in which higher-order parietal and frontal regions of cortex
presumably mediate so-called supramodal spatial attention effects in
unisensory cortices (Eimer & Driver, 2002).
Attentional transfer between simple and unrelated auditory
and visual features
A pair of studies from the Woldorff laboratory addresses attentional
spread across unrelated simple multisensory stimuli, during spatial
(Busse et al., 2005) and intersensory (Talsma et al., 2007) attention.
Using event-related potentials (ERPs) and simple and unrelated visual
and auditory stimuli, Busse et al. (2005) showed an auditory selective
attention effect for centrally presented task-irrelevant auditory tonal
stimuli that were presented while subjects selectively attended to
lateralized visual checkerboard stimuli. Under their design, visual
stimuli were randomly presented to the right or left of fixation and
subjects were required to detect infrequent targets in either the right or
left stimulus stream in a given block. On 50% of trials the visual
stimuli, half attended and half unattended, were accompanied by a
centrally presented task-irrelevant tone. Although the tone was
presented to central auditory space, the co-occurrence of the visual
stimulus resulted in the so-called ventriloquist illusion, a powerful
illusion in which the perceived location of a sound is shifted toward a
concurrently presented visual stimulus (Bertelson & Radeau, 1981).
Auditory selective attention effects were then assessed by comparing
‘attended’ vs. ‘unattended’ trials. The authors argue that because the
auditory stimulus was presented to a different location than the visual
stimulus, the presence of the observed auditory attention effect could
not be attributed to well-established cross-sensory spatial attention
mechanisms (Hillyard et al., 1984; Eimer & Driver, 2001), but rather
was due to the spreading of attention within the object across
modalities and space. This effect is intriguing and consistent with the
multisensory spread of attention across features of an object.
Another ERP study from the same group provides further evidence
(Talsma et al., 2007) for the multisensory spread of attention within an
object. Again, simple and unrelated auditory and visual stimuli were
presented alone or together, but now in an intersensory attention
paradigm in which subjects detected targets in the visual or the auditory
sensory modality in separate blocks. ERP recordings revealed an
unpredicted late-onset negativity in response to the auditory-visual
stimulus for the visual attention condition. This had a topography
characteristic of the Nd, a response associated with selective processing
of auditory information. What’s more, during visual attention this
response was only observed when the auditory stimulus was presented
with the attended visual stimulus, and not when it was presented alone.
These data are thus consistent with attention automatically spreading
across sensory modalities, from attended to unattended dimensions of
the arbitrarily paired stimuli. Of note, the latency of the Nd response was
very late (only significant by 420 ms) with respect to that seen in
response to auditory stimuli during auditory attention conditions
(significant by 280 ms). Also, such multisensory attention appeared to
be unidirectional, with no report of selective processing of the visual
stimulus during auditory attention. Nevertheless, attention was shown to
spread across sensory modalities, from the attended to the unattended of
arbitrarily paired simple visual and auditory stimuli.
In these studies attention was directed at a location (Busse et al.,
2005) or at one of two sensory modalities (Talsma et al., 2007), rather
than at specific objects. When attention was directed in this manner,
the presence of the attended object was necessary for the occurrence of
cross-sensory attentional effects, whereas this was not the case in the
present study for transfer of attention within an object from the visual
to the auditory sensory modality. This strongly suggests that the cross-
sensory object priming that we observe may be invoked specifically
when attention is directed at objects.
The topography of the auditory selective attention effects that we
observed in the cross-sensory conditions was similar in its fronto-
central distribution to the cross-sensory auditory selective attention
effects in both Busse et al. (2005) and Talsma et al. (2007). In the
present study however, there was a tendency to a leftward distribution
not seen in Busse et al. (2005) or Talsma et al. (2007). This slight
lateralization, seen for all three auditory selective attention effects
including the baseline condition, is very likely a function of the
stimuli. Brain imaging studies using electroencephalography (Hauk
et al., 2006), magnetoencephalography (Pulvermuller et al., 2005) and
fMRI (Pulvermuller & Hauk, 2006) have revealed that processing of
object sounds and action words engages an object/event-specific
representational network relatively early in time (< 200 ms post-
stimulus onset). Future studies will be needed to assess the similarity
of the underlying neuronal generators as a function of how attention is
directed (e.g. toward objects, locations, or a sensory modality), as well
as assess how the latency of the effect may be related to the specific
attentional processes that are engaged.
Visual dominance in multisensory object recognition?
The present data would also seem to argue that visual-object
representations have a greater influence on their auditory counterparts
than vice-versa. That is, when attention was directed at visual stimuli,
there were greater cross-sensory attention effects than when attention
was directed at auditory stimuli. This asymmetry was seen both in the
current study for well-known multisensory objects, and in Talsma et al.
(2007) for simple and unrelated auditory and visual stimuli. Consistent
with this asymmetry, in a behavioural cross-sensory priming study,
visual to auditory priming was readily apparent while there was no
evidence for auditory to visual priming (Greene et al., 2001). The
authors attributed this to the lack of specificity of auditory objects. But
this explanation would not apply to the current design, in which the
stimulus set was small and subjects knew with certainty the object a
sound belonged to. As an alternative, we propose that visual
representations play a leading role in the representation of certain
types of objects (see also, e.g. James et al., 2002; Molholm et al.,
2004; Amedi et al., 2005; Gomez-Ramirez et al., 2007), perhaps due
to their generally greater specificity (see e.g. Ernst & Bulthoff, 2004),
as an image usually exactly specifies the object while a sound often
does not. This specificity of visual representations may have, through
phylogenetic or developmental means, led to the dominance of visual
object recognition areas in object processing.
One possible issue is that, due to our delimited stimulus set,
associative-links between the auditory and visual stimuli may have
formed over the course of the experiment, and that these transient
associations might account for the cross-sensory selective attention
effects seen here. This possibility does not cohere well with
differences seen between our findings and those of Talsma et al.
(2007). That is, one would expect such associative links to similarly
occur for the repeatedly presented but previously unrelated auditory
and visual stimuli in Talsma et al. (2007). However, Talsma et al.
(2007) only found cross-sensory attention effects for repeatedly
presented unrelated auditory and visual stimuli when (i) they were
presented together, and (ii) during visual attention. What’s more, this
auditory cross-sensory attention response was substantially delayed in
time with respect to a baseline condition whereas the effects seen here
were not.
Conclusions
Perceptual and cognitive processes have traditionally been studied in the
context of unisensory stimulation. This has served the important purpose
of reducing the number of variables and allowing the experimenter to
attribute specific effects to a single factor. Of course this approach
hugely simplifies how perceptual and cognitive processes truly operate
in a multisensory world. That is, practically all our experiences are
multisensory, with multisensory information about objects and events
highly related (often complementary, or even redundant). It is clear that
our nervous system has evolved to take advantage of such a multisensory
environment (Stein & Meredith, 1993). The current work contributes to
a growing exploration of the multisensory nature of perception,
establishing the interconnectedness of multisensory object representa-
tions during selective attention processes.
Acknowledgements
This work was supported by grants from the National Institute of Mental Health to
SM (MH68174) and JJF (MH65350). We would also like to express our sincere
thanks to Dr Simon Kelly for comments on an earlier version of this
manuscript, and to Jeannette Mahoney for her technical assistance in collecting
these data.
Abbreviations
ERP, event-related potential; Nd, negative difference wave; RT, reaction time;
SN, selection negativity wave.
References
Amedi, A., von Kriegstein, K., van Atteveldt, N.M., Beauchamp, M.S. & Naumer, M.J.
(2005) Functional imaging of human crossmodal identification and object
recognition. Exp. Brain Res., 166, 559–571.
Beauchamp, M.S., Lee, K.E., Argall, B.D. & Martin, A. (2004) Integration of
auditory and visual information about objects in superior temporal sulcus.
Neuron, 41, 809–823.
Bertelson, P. & Radeau, M. (1981) Cross-modal bias and perceptual fusion
with auditory-visual spatial discordance. Percept. Psychophys., 29, 578–
584.
Busse, L., Roberts, K.C., Crist, R.E., Weissman, D.H. & Woldorff, M.G. (2005)
The spread of attention across modalities and space in a multisensory object.
Proc. Natl Acad. Sci. USA, 102, 18751–18756.
Cappe, C. & Barone, P. (2005) Heteromodal connections supporting multisen-
sory integration at low levels of cortical processing in the monkey. Eur. J.
Neurosci., 22, 2886–2902.
Corbetta, M., Miezin, F.M., Dobmeyer, S., Shulman, G.L. & Petersen, S.E.
(1990) Attentional modulation of neural processing of shape, color, and
velocity in humans. Science, 248, 1556–1559.
Davis, G., Driver, J., Pavani, F. & Shepherd, A. (2000) Reappraising the
apparent costs of attending to two separate visual objects. Vis. Res., 40,
1323–1332.
Desimone, R. & Duncan, J. (1995) Neural mechanisms of selective visual
attention. Annu. Rev. Neurosci., 18, 193–222.
Duncan, J. (2006) EPS Mid-Career Award 2004: brain mechanisms of
attention. Q. J. Exp. Psychol. (Colchester), 59, 2–27.
Easton, R.D., Srinivas, K. & Greene, A.J. (1997) Do vision and haptics share
common representations? Implicit and explicit memory within and between
modalities. J. Exp. Psychol. Learn. Mem. Cogn., 23, 153–163.
Egly, R., Driver, J. & Rafal, R.D. (1994) Shifting visual attention between
objects and locations: evidence from normal and parietal lesion subjects.
J. Exp. Psychol. General, 123, 161–177.
Eimer, M. & Driver, J. (2001) Crossmodal links in endogenous and exogenous
spatial attention: evidence from event-related brain potential studies.
Neurosci. Biobehav. Rev., 25, 497–511.
Eimer, M., van Velzen, J. & Driver, J. (2002) Cross-modal interactions between
audition, touch, and vision in endogenous spatial attention: ERP evidence on
preparatory states and sensory modulations. J. Cogn. Neurosci., 14, 254–271.
Ernst, M.O. & Bulthoff, H.H. (2004) Merging the senses into a robust percept.
Trends Cogn. Sci., 8, 162–169.
Fabiani, M., Kazmerski, V.A., Cycowicz, Y.M. & Friedman, D. (1996) Naming
norms for brief environmental sounds: effects of age and dementia.
Psychophysiology, 33, 462–475.
Falchier, A., Clavagnier, S., Barone, P. & Kennedy, H. (2002) Anatomical
evidence of multimodal integration in primate striate cortex. J. Neurosci., 22,
5749–5759.
Felleman, D.J. & VanEssen, D.C. (1991) Distributed hierarchical processing in
the primate cerebral cortex. Cereb. Cortex, 1, 1–47.
Foxe, J.J. & Schroeder, C.E. (2005) The case for feedforward multisensory
convergence during early cortical processing. Neuroreport, 16, 419–423.
Foxe, J.J., Simpson, G.V., Ahlfors, S.P. & Saron, C.D. (2005) Biasing the
brain’s attentional set. I. Cue driven deployments of intersensory selective
attention. Exp. Brain Res., 166, 370–392.
Giard, M.H. & Peronnet, F. (1999) Auditory-visual integration during
multimodal object recognition in humans: a behavioral and electrophysio-
logical study. J. Cogn. Neurosci., 11, 473–490.
Gomez-Ramirez, M., Higgins, B.A., Rycroft, J.A., Owen, G.N., Mahoney, J.,
Shpaner, M. & Foxe, J.J. (2007) The deployment of intersensory selective
attention: a high-density electrical mapping study of the effects of theanine.
Clin. Neuropharmacol., 30, 25–38.
Greene, A.J., Easton, R.D. & LaShell, L.S. (2001) Visual-auditory events:
cross-modal perceptual priming and recognition memory. Conscious Cogn.,
10, 425–435.
Guthrie, D. & Buchwald, J.S. (1991) Significance testing of difference
potentials. Psychophysiology, 28, 240–244.
Hansen, J.C. & Hillyard, S.A. (1980) Endogenous brain potentials associated
with selective auditory attention. Electroencephalogr. Clin. Neurophysiol.,
49, 277–290.
Hauk, O., Shtyrov, Y. & Pulvermuller, F. (2006) The sound of actions as
reflected by mismatch negativity: rapid activation of cortical sensory-motor
networks by sounds associated with finger and tongue movements. Eur. J.
Neurosci., 23, 811–821.
Hillyard, S.A. & Anllo-Vento, L. (1998) Event-related brain potentials in the
study of visual selective attention. Proc. Natl Acad. Sci. USA, 95, 781–787.
Hillyard, S.A., Simpson, G.V., Woods, D.L., Van Voorhis, S.T. & Munte, T.F.
(1984) Event-related brain potentials and selective attention to different
modalities. In Reinoso-Suarez, F., (Ed), Cortical Integration. Raven Press,
New York, pp. 395–414.
Hillyard, S.A., Vogel, E.K. & Luck, S.J. (1998) Sensory gain control
(amplification) as a mechanism of selective attention: electrophysiological
and neuroimaging evidence. Philos. Trans. R. Soc. Lond. B Biol. Sci., 353,
1257–1270.
James, T.W., Humphrey, G.K., Gati, J.S., Servos, P., Menon, R.S. & Goodale,
M.A. (2002) Haptic study of three-dimensional objects activates extrastriate
visual areas. Neuropsychologia, 40, 1706–1714.
Johnson, J.A. & Zatorre, R.J. (2005) Attention to simultaneous unrelated
auditory and visual events: behavioral and neural correlates. Cereb. Cortex,
15, 1609–1620.
Martinez, A., Teder-Salejarvi, W., Vazquez, M., Molholm, S., Foxe, J.J.,
Javitt, D.C., Di Russo, F., Worden, M.S. & Hillyard, S.A. (2006) Objects
are highlighted by spatial attention. J. Cogn. Neurosci., 18, 298–310.
Molholm, S., Ritter, W., Javitt, D.C. & Foxe, J.J. (2004) Multisensory visual-
auditory object recognition in humans: a high-density electrical mapping
study. Cereb. Cortex, 14, 452–465.
Molholm, S., Ritter, W., Murray, M.M., Javitt, D.C., Schroeder, C.E. & Foxe,
J.J. (2002) Multisensory auditory–visual interactions during early sensory
processing in humans: a high-density electrical mapping study. Brain Res.
Cogn. Brain Res., 14, 115–128.
O’Craven, K.M., Downing, P.E. & Kanwisher, N. (1999) fMRI evidence for
objects as the units of attentional selection. Nature, 401, 584–587.
Pulvermuller, F. & Hauk, O. (2006) Category-specific conceptual processing of
color and form in left fronto-temporal cortex. Cereb. Cortex, 16, 1193–1201.
Pulvermuller, F., Shtyrov, Y. & Ilmoniemi, R. (2005) Brain signatures of
meaning access in action word recognition. J. Cogn. Neurosci., 17, 884–892.
Rockland, K.S. & Ojima, H. (2003) Multisensory convergence in
calcarine visual areas in macaque monkey. Int. J. Psychophysiol., 50,
19–26.
Schoenfeld, M.A., Tempelmann, C., Martinez, A., Hopf, J.M., Sattler, C.,
Heinze, H.J. & Hillyard, S.A. (2003) Dynamics of feature binding
during object-selective attention. Proc. Natl Acad. Sci. USA, 100,
11806–11811.
Schroeder, C.E. & Foxe, J. (2005) Multisensory contributions to low-level,
‘unisensory’ processing. Curr. Opin. Neurobiol., 15, 454–458.
Snodgrass, J.G. & Vanderwart, M. (1980) A standardized set of 260 pictures:
norms for name agreement, image agreement, familiarity, and visual
complexity. J. Exp. Psychol. [Hum. Learn.], 6, 174–215.
Spence, C. & Driver, J. (1996) Audiovisual links in endogenous covert spatial
attention. J. Exp. Psychol. Hum. Percept. Perform., 22, 1005–1030.
Stein, B.E. & Meredith, M.A. (1993) The Merging of the Senses. The MIT
Press, Cambridge, MA.
Talsma, D., Doty, T.J. & Woldorff, M.G. (2007) Selective attention and
audiovisual integration: is attending to both modalities a prerequisite for
early integration? Cereb. Cortex, 17, 679–690.