Impaired multisensory processing in schizophrenia: Deficits in the visual enhancement of speech comprehension under noisy environmental conditions
Lars A. Rossa,b,c, Dave Saint-Amourb,d, Victoria M. Leavittb,e, Sophie Molholma,b,
Daniel C. Javitta,b, John J. Foxea,b,e,⁎
aProgram in Cognitive Neuroscience, Department of Psychology, The City College of the City University of New York,
138th St. and Convent Avenue, New York, New York 10031, USA
bThe Cognitive Neurophysiology Laboratory, Nathan S. Kline Institute for Psychiatric Research, Program in Cognitive Neuroscience
and Schizophrenia, 140 Old Orangeburg Road, Orangeburg, New York 10962, USA
cRamapo College of New Jersey, 505 Ramapo Valley Road, Mahwah, New Jersey 07430, USA
dDépartement d'ophtalmologie, Université de Montréal, C.P. 6128 Succ. Centre Ville, Montréal, Québec, Canada H3C 3J7
eProgram in Neuropsychology, Department of Psychology, Queens College of the City University of New York,
65-30 Kissena Boulevard, Flushing, New York 11367, USA
Received 14 March 2007; received in revised form 7 August 2007; accepted 10 August 2007
Available online 24 October 2007
Abstract

Background: Viewing a speaker's articulatory movements substantially improves a listener's ability to understand spoken words, especially under noisy environmental conditions. In this study we investigated the ability of patients with schizophrenia to integrate auditory and visual speech and sought to determine under what listening conditions they might show the greatest impairments.
Methods: We assessed the ability to recognize auditory and audiovisual speech in different levels of noise in 18 patients with
schizophrenia and compared their performance with that of 18 healthy volunteers. We used a large set of monosyllabic words as our
stimuli in order to more closely approximate performance in everyday situations.
Results: Patients derived significantly less benefit from seeing the speaker's articulations than did controls, and this deficit was most pronounced at signal-to-noise levels where multisensory gain is known to be maximal in healthy control subjects. A
surprising finding was that despite known early auditory sensory processing deficits and reports of impairments in speech processing
in schizophrenia, patients' performance in unisensory auditory speech perception remained fully intact.
Conclusions: These results point to a specific deficit of multisensory speech integration in the face of intact unisensory speech processing and suggest that sensory integration dysfunction may be an important and, to date, rather overlooked
aspect of schizophrenia.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Multisensory; Crossmodal; Audiovisual; Speech; Schizophrenia; Sensory integration
Schizophrenia Research 97 (2007) 173–183
⁎Corresponding author. Cognitive Neurophysiology Laboratory, Nathan S. Kline Institute for Psychiatric Research, Program in Cognitive
Neuroscience and Schizophrenia, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA. Tel.: +1 845 398 6547; fax: +1 845 398 6545.
E-mail address: email@example.com (J.J. Foxe).
1. Introduction

The integration of heard speech signals with the seen
articulatory movements of a speaker's face and mouth is
essential for everyday communication, as seeing a
speaker's face substantially facilitates the recognition of
spoken words, especially under noisy listening condi-
tions (e.g. Erber, 1969; Grant and Seitz, 2000; Munhall
et al., 2004a; O'Neill, 1954; Ross et al., 2007; Sumby
and Pollack, 1954). The brain processes underlying this
multisensory speech integration are presently under in-
tense investigation (Bernstein et al., 2004; Callan et al.,
2003; Calvert, 2001; Calvert and Campbell, 2003;
Campbell and MacSweeney, 2004;
Munhall et al., 2002, 2004b) and investigators have now
begun to explore whether there is a specific role for
multisensory processes in some of the perceptual defi-
cits seen in disorders such as autism (Iarocci and Mc-
Donald, 2006; Kern, 2002) and schizophrenia (de
Gelder et al., 2003). In schizophrenia, past research
has established the existence of robust within-modality deficits, with early auditory and visual sensory processing shown to be impaired (e.g. Butler et al., 2006; Foxe et al., 2001, 2005; Schwartz et al., 2001). Given these unisensory deficits, it seems reasonable to predict that multisensory processes, which clearly rely
on the fidelity of early sensory inputs from the respective
unisensory systems, will show similar, if not greater im-
pairment. Given extensive physiological evidence that
multisensory integration can act as a non-linear gain
mechanism (Foxe and Schroeder, 2005; Meredith and Stein, 1986; Stein et al., 2001, 2002; Schroeder and Foxe, 2005), it seems a reasonable prediction that multisensory processing might well be especially impaired in this population.1
Indeed, recent evidence does suggest multisensory processing deficits in schizophrenia. In a cleverly constructed study, de
Gelder and colleagues (de Gelder et al., 2003) used a
variant of the so-called “McGurk illusion” (McGurk and
MacDonald, 1976; Saint-Amour et al., 2006) to assess
whether patients have deficits in integrating auditory and
visual speech. For the reader unfamiliar with the McGurk
illusion, the following example will be helpful. When
participants attend to a video of a speaker articulating the
syllable /ga/ while listening to the incongruent auditory
syllable /ba/, the listener typically reports the perception
of the fused syllable /da/, and this occurs despite the fact that the /da/ syllable was neither heard nor seen. There are numerous other examples of these phonemic fusions, and the illusion is very strong, such that even when the listener is fully apprised of the ‘trick,’ it is difficult or even impossible to suppress it (Massaro, 1998). In de Gelder's
experiment, patients were much less susceptible to these
illusory fusions than healthy participants, whereas perfor-
mance in an audiovisual control task involving spatial
localization of sounds remained unimpaired. The authors
hypothesized that if there was a general deficit in multi-
sensory integration, patients would show decrements in
both tasks. The results favored the notion of an isolated
deficit related to the integration of phonetic information.
However, somewhat contradictory evidence comes from
a study by Surguladze et al., where schizophrenia patients
and controls showed similar susceptibility to fusions in a
McGurk-type experiment (Surguladze et al., 2001).
In both of these previous studies, the premise was
that susceptibility to McGurk fusions would index an
intact audiovisual integration system, although it should
also be pointed out that a small proportion of healthy
control subjects do not experience McGurk fusions.
Nonetheless, the vast majority of normal observers do in
fact perceive these fusions, and so these studies took
advantage of this fact to assess whether, on average,
patients would experience lower levels of fusion. The
McGurk-task, however, where mostly simple syllables
are used, could be considered a rather indirect and non-naturalistic measure of audiovisual speech integration.2
Due in part to the rather artificial nature of the McGurk-
task, we reasoned that testing patients with schizophrenia
on an audiovisual task using real words as opposed to
syllables would provide a better test of their abilities for
audiovisual integration of speech in real-life situations.
More importantly, it has also been suggested that an im-
pairment in auditory speech recognition in general
(Hoffman et al., 1999; Lebib et al., 2003), and the inte-
gration of auditory and visual speech in particular (Sur-
guladze et al., 2001), is most likely to manifest itself
in situations where the auditory signal is degraded. We
would therefore expect a deficit in speech processing to
predominate when patients are asked to identify speech
under noisy environmental conditions that are more ty-
pical of normal everyday social situations.
Furthermore, we expected to find the most robust
deficit in multisensory speech perception under environ-
mental conditions where healthy control subjects usually experience the most benefit from seeing the speaker's articulations.
1 Note that the authors do not mean to imply that sensory
integration deficits are peculiar to schizophrenia. They have also
been implicated in a number of other clinical populations such as in
autism (see Iarocci and McDonald, 2006) and in certain neurological
patients (e.g. Rorden et al., 1999; Munhall et al., 2002).
2 It should be mentioned that the McGurk effect has also been
shown using words (Dekle et al., 1992) but that only syllables have
been used in patients.
In a recent experiment from our laboratory
(Ross et al., 2007), we showed that the gain derived from
viewing visual articulations is maximal at intermediate
signal-to-noise ratios (SNRs) in healthy volunteers. Here,
we investigated patients with schizophrenia to determine to what extent they experience benefit from visual articulation and to detail under what listening conditions they might show the greatest impairments. For that, we assessed their ability to recognize auditory
and audiovisual speech in different levels of noise and
compared their performance with that of healthy volun-
teers. We used a large, normed set of monosyllabic words
as our stimuli in order to more closely approximate per-
formance in everyday situations without delivering se-
mantic, grammatical or prosodic context.
2. Methods

Informed consent was obtained from 18 patients (1
woman, mean age: 39, SD: 10.6) meeting the DSM-IV
criteria for schizophrenia (n=15) or schizoaffective
disorder (n=3) and 18 healthy volunteers (7 women,
mean age: 35, SD: 11.6) at the Nathan Kline Institute
(NKI) for Psychiatric Research (Orangeburg, NY).
NKI's Institutional Review Board approved all proce-
dures. Please refer to Table 1 for the sample character-
istics of the patients with schizophrenia. All patients and
controls had normal or corrected-to-normal vision and
reported normal hearing. Patients' diagnoses were
obtained using the Structured Clinical Interview for
DSM-IV (First et al., 1997) and all available clinical
information. All patients were receiving antipsychotic
medications at the time of testing. Chlorpromazine
equivalents were on average 1194±435 mg per day.
Equivalents were calculated using conversion factors
described previously (Hyman et al., 1995; Peuskens and
Link, 1997; Jibson and Tandon, 1998; Woods, 2003).
None of the healthy volunteers had a history of Axis I
psychiatric disorder as defined by the DSM-IV.
Stimulus material consisted of 525 simple monosyl-
labic words (taken from the MRC Psycholinguistic
database). The words were selected from a well-cha-
racterized normed set (Kucera and Francis, 1967) based
on their written-word frequency.3 The face of a female
speaker was digitally recorded articulating the words.
These movies were digitally re-mastered, so that the
length of the movie (1.3 s) and the onset of the acoustic
signal were highly similar across all words. Average
voice onset occurred at 520 ms after movie onset (SD=
30 ms). The words were presented at 50 dBA FSPL.
Seven different levels of pink noise were presented
simultaneously with the presentation of the words at 50,
54, 58, 62, 66, 70 and 74 dBA FSPL. Noise onset occurred at the same time as movie onset, 520 ms before voicing began. The resulting signal-to-noise ratios (SNRs) were therefore 0, −4, −8, −12, −16, −20 and −24 dB. The
movies were presented on a 21-inch computer monitor
at a distance of 1.7 m from the participant. The whole
face of the speaker was visible and extended 6.3° hori-
zontally and 7.6° vertically. The words were presented
from a speaker situated in the center on top of the screen
and the noise was presented from speakers flanking both
sides of the screen.
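The relation between the fixed speech level and the seven noise levels reduces to a simple subtraction. The sketch below is illustrative only; the variable names are ours, and the levels are those given in the text above.

```python
# Illustrative sketch: deriving the seven signal-to-noise ratios (SNRs)
# from the fixed speech level and the seven pink-noise levels described above.
speech_level_db = 50                               # words presented at 50 dBA
noise_levels_db = [50, 54, 58, 62, 66, 70, 74]     # pink-noise levels (dBA)

snrs_db = [speech_level_db - noise for noise in noise_levels_db]
print(snrs_db)  # [0, -4, -8, -12, -16, -20, -24]
```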
The main experiment consisted of two conditions. In
the auditory-alone condition (A) 175 words (25 words
Table 1
Demographic and clinical characteristics of schizophrenia patients. Measures reported include patient socioeconomic status (a), parental socioeconomic status, Brief Psychiatric Rating Scale score, vocabulary (b), diagnosis (schizophrenia, no subtype, or schizoaffective disorder), and antipsychotic medication in chlorpromazine equivalents (c) (mean: 1196 mg/day; SD: 435; range: 500–2000 mg).
(a) Smaller numbers reflect higher socioeconomic status.
(b) Truncated version of the Peabody Picture Vocabulary (non-verbal).
(c) Note that 7 patients received 2 different antipsychotic medications.
3 It should be noted that common language usage will have changed
somewhat since this normed list was first established. The authors
have made every effort to select only words that are still commonly
used. In addition, words were distributed randomly over conditions
and were checked for equal distributions afterwards.
per noise level) were presented in conjunction with a
still image of the speaker's face; in the audiovisual
condition (AV) the speaker's face articulated another set
of 175 words. Words were randomly assigned to all of
the conditions and noise levels and reassigned several
times to all conditions across subjects during the course
of data collection. A and AV trials were also randomly intermixed. A subset (from the
same pool of subjects) of nine controls and three patients
received a third speechreading-alone condition (V), where
we used an additional 175 words. In this condition, the
speaker's face articulated the words but no auditory word-signal was present. Again, this condition occurred with all
seven levels of noise and V trials were randomly
intermixed with all other trials.
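The assignment scheme described above amounts to distributing the 525 words over 21 condition-by-noise-level cells of 25 words each. The sketch below is a minimal illustration of such a randomized assignment; it is not the authors' actual stimulus-presentation code, and the placeholder word list is hypothetical.

```python
import random

# Illustrative sketch (not the authors' actual code): randomly assign 525
# monosyllabic words to condition x noise-level cells, 25 words per cell.
words = [f"word{i:03d}" for i in range(525)]       # placeholder word list
conditions = ["A", "AV", "V"]                      # auditory, audiovisual, visual-alone
snrs_db = [0, -4, -8, -12, -16, -20, -24]

random.shuffle(words)
assignment = {}
word_iter = iter(words)
for condition in conditions:
    for snr in snrs_db:
        # 3 conditions x 7 SNRs x 25 words = 525 words in total
        assignment[(condition, snr)] = [next(word_iter) for _ in range(25)]

print(len(assignment[("AV", -12)]))  # 25 words in this cell
```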
Participants were instructed to watch the screen and
report which word they heard. The experimenter assured
that eye fixation was maintained by reminding partici-
pants, if necessary. If a word was not clearly understood
they were asked to guess what word was presented. The experimenter was seated off to the side, at a 45° angle to the participant-screen axis. The experimenter recorded a response that
exactly matched the target word presented as a correct
answer while any other response was recorded as an
incorrect answer. Pacing of the experiment was under
participant control by initiating the next trial with a
button press. The experiment consisted of five blocks
with 105 words per block for participants who received the speechreading-alone (V) condition; for the remaining participants, only the A and AV conditions were present.
After the experiment, participants of both groups
were presented with the full list of words used in the
experiment. Subjects were asked to characterize the
words in terms of familiarity and knowledge of
meaning. A list of pseudo-words was randomly inter-
mixed with the words in the list as catch trials to control
for the possible tendency of subjects to not want to
report words they didn't know. This test was run to
ensure that we had not inadvertently included any
subject with unusually low vocabulary size. Word
knowledge below 90% of the words in the list was used
as a cutoff criterion to exclude subjects from the
analysis. None of the subjects fell below that criterion
and no significant differences in word knowledge were
found between groups.
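As a rough illustration of this screening step, the following sketch applies the 90% word-knowledge cutoff to a hypothetical participant's responses; the function name and data are ours and are not part of the original procedure.

```python
# Illustrative sketch of the post-experiment vocabulary check: a participant is
# retained only if they report knowing at least 90% of the real words in the list.
def passes_vocabulary_check(known_flags, min_proportion=0.90):
    """known_flags: list of booleans, one per real word in the list."""
    proportion_known = sum(known_flags) / len(known_flags)
    return proportion_known >= min_proportion

example_responses = [True] * 500 + [False] * 25   # hypothetical participant (~95% known)
print(passes_vocabulary_check(example_responses))  # True
```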
3. Results

A 2 × 7 × 2 repeated measures analysis of variance (RM-ANOVA) with the factors of condition (A and AV) and SNR level (1–7) and the between-groups factor of patients (P) vs. controls (C) was employed to analyze the data. Overall, the level of noise affected recognition performance significantly in both conditions, F(1, 34) = 1871.4, p < 0.001, η2 = 0.98; the lower the SNR, the fewer words were recognized (see Fig. 1). In the auditory-alone (A) condition there was a monotonic increase in recognition accuracy, from essentially zero at the lowest SNR to high accuracy at the highest SNR, and patients and controls showed no differences in the A condition.
In both groups, speech recognition benefited substantially from the additional visual stimulation, F(1, 34) = 136.9, p < 0.001, η2 = 0.8. However, the AV curves differed between groups, which was reflected in a significant interaction between the factors of condition and group, F(1, 34) = 4.06, p = 0.05, η2 = 0.11.
Fig. 1. Top panel: Percentage of correctly identified words (%correct)
depending on the signal to noise ratio (SNR) for the auditory alone (A)
and the audiovisual conditions. The dashed lines represent the
performance of the patients with schizophrenia and the solid lines
the performance of healthy controls. Bottom panel: Multisensory gain
(AV-A) in speech-recognition accuracy as a function of SNR. Here, the
significant difference between groups is indexed with an asterisk.
A series of protected comparisons (two-tailed t-tests, α = 0.05) between groups at each SNR level revealed that this difference was of significant magnitude at the intermediate SNR (−12 dB), t(34) = 2.08, p < 0.05. At this SNR the difference between performance in the A condition and the AV condition was maximal, as the bottom panel in Fig. 1 shows. Looking at this panel, one can see that control subjects showed a gain of some
46% in recognition accuracy, going from approximately
25% to 71% performance. Patients on the other hand,
showed a more modest gain of 29%, improving to just
56.5% performance in the multisensory condition. Data
on speechreading performance were unfortunately only
obtained for 3 patients (mean over 7 noise levels: 8.7%,
SD: 5.67) and 9 controls (mean over 7 noise levels:
10.7%, SD: 3.32). No trend in differences was observed
between the groups. Note that the mean performance
value for the small sample of patients falls well within the
95% confidence interval around the mean speechreading
performance level found in controls, which ranges from
4.4% to 16.9%. In addition, a two-tailed Pearson
correlational analysis revealed no overall relationship
between speechreading ability and AV-gain in control subjects.
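For readers who wish to run this kind of analysis on their own data, the sketch below illustrates the general logic: multisensory gain is computed as AV minus A accuracy for each subject and SNR, and groups are compared with unpaired two-tailed t-tests at each SNR. The data here are randomly generated placeholders, not the study's scores.

```python
import numpy as np
from scipy import stats

# Illustrative sketch with hypothetical data: compute per-subject multisensory
# gain (AV minus A accuracy) and compare groups at each SNR with unpaired t-tests.
rng = np.random.default_rng(0)
n_per_group, n_snrs = 18, 7
a_acc = {"controls": rng.uniform(0, 1, (n_per_group, n_snrs)),
         "patients": rng.uniform(0, 1, (n_per_group, n_snrs))}
av_acc = {"controls": rng.uniform(0, 1, (n_per_group, n_snrs)),
          "patients": rng.uniform(0, 1, (n_per_group, n_snrs))}

gain = {group: av_acc[group] - a_acc[group] for group in ("controls", "patients")}

for snr_index in range(n_snrs):
    t, p = stats.ttest_ind(gain["controls"][:, snr_index],
                           gain["patients"][:, snr_index])
    print(f"SNR level {snr_index + 1}: t = {t:.2f}, p = {p:.3f}")
```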
In the control population the gain at −12 dB re-
presents a significant deviation from the gain at other
SNRs, representing a maximal window of integration
(Ross et al., 2007). In a post-hoc analysis, we tested
here whether the gain at this SNR was also significantly
more prominent in the patient population. For this, we
compared the gain at −12 dB with the average gain
at the two flanking levels (−8 dB and −16 dB)
where the next two highest gains were observed. A
simple 2-tailed paired sample t-test was employed
(α=0.05). As expected, the difference was significant
for control subjects (t(17) = 4.71, p < 0.001), but this
was also the case for patients, albeit to a lesser degree (t
(17)=2.331, p=0.032). Thus, although patients show a
significant impairment at this level relative to controls,
they do appear to show residual specialized tuning
at this intermediate SNR, a result that bears further consideration below.
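The post-hoc comparison described above can be illustrated as follows: within a group, the per-subject gain at −12 dB is tested against the mean of the gains at the two flanking SNRs with a two-tailed paired-samples t-test. The gain values below are hypothetical.

```python
import numpy as np
from scipy import stats

# Illustrative sketch with hypothetical gain scores: compare the gain at -12 dB
# with the average gain at the two flanking SNRs (-8 dB and -16 dB) within a group.
rng = np.random.default_rng(1)
gain_minus12 = rng.normal(0.45, 0.10, 18)          # hypothetical per-subject gain at -12 dB
gain_flankers = rng.normal(0.30, 0.10, (18, 2))    # hypothetical gains at -8 and -16 dB
flanker_mean = gain_flankers.mean(axis=1)

t, p = stats.ttest_rel(gain_minus12, flanker_mean)
print(f"t(17) = {t:.2f}, p = {p:.3f}")
```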
Finally, we assessed whether there was a relation-
ship between the strength of antipsychotic agents that
the patients received as measured in chlorpromazine
equivalents (range: 500–2000 mg; mean: 1196 mg) and
recognition performance in the AV-condition. Chlor-
promazine equivalents were available for 16 of the
patients. Pearson bivariate correlation coefficients
(two-tailed, α=0.05) did not reach significance at any
of the seven noise levels.

4. Discussion

In this study, we investigated audiovisual processing in patients with schizophrenia, asking whether they benefit similarly to healthy controls by seeing speakers' articulations while trying to recognize spoken words embedded in various levels of background noise. A
rather surprising finding was that despite very well-cha-
racterized deficits in early unisensory auditory proces-
sing (e.g. Javitt et al., 1993, 1995, 1997, 1998, 2000b;
Michieetal.,2002; Rosburgetal.,2004;Salisbury etal.,
2002) and quite a number of reports of deficits in aspects
of speech perception (e.g. Baltaxe and Simmons, 1995;
Bull and Venables, 1974; Cannon et al., 2002; Condray,
2005; Lee et al., 2004; Leitman et al., 2005; Titone and
Levy, 2004), patients in this study did not show any
deficits whatsoever in recognizing auditorily presented
words when they were embedded in noise. On the other hand, patients showed a clear deficit in the multisensory gain that is commonly observed when auditory information is accompanied by visual articulation, an impairment that was most pronounced at the signal-to-noise level where this gain is maximal in healthy control subjects (Ross et al., 2007). Thus, the results show
a specific deficit in multisensory speech processing in the face of fully intact unisensory (auditory-alone) speech processing and suggest that sensory inte-
gration dysfunction may be an important and, to date,
rather overlooked aspect of schizophrenia.
4.1. Possible neural substrates
There are several mechanisms involving different
brain structures that have been suggested to underlie
the successful integration of audiovisual speech and
possible reasons for its malfunction are therefore
manifold. One of them is the visual biological mo-
tion processing system, which has already been shown
to be impaired in schizophrenia (Kim et al., 2005).
In contrast to more basic motion stimuli, which are
mainly processed in area V5, biological motion
contains information about the identity of the moving
stimulus, actions, intentions and even emotions. As a
necessary sub-process for audiovisual integration, lip-
motion obviously represents such a biological motion
process. In fact, the anatomical substrate of biological
motion, including lip-motion, is found in the posterior
superior temporal cortex (STS/G) (Allison and Mc-
Carthy, 2000), and this region has also been shown to
contain abnormalities in schizophrenia (Shenton et al., 2001). In humans and primates this region is involved in the
analysis of complex biological stimuli, such as hand
motion and communication and their imitation (Iaco-
boni et al., 2001). It receives projections from motion
processing area medial superior temporal (MST) and is
therefore part of a functional network that uses motion
information provided by the dorsal visual stream for the
identification of complex motion, which would clearly
include articulatory movement of the lips. The dorsal
visual stream has extensive magnocellular inputs that
rapidly conduct low-resolution visual information to
cortex (e.g. V5 receives direct input from V1) providing
information about motion and spatial organization of
stimuli. Patients with schizophrenia have shown clear
dysfunction in tasks involving the magnocellular system
(see e.g. Kim et al., 2006; Schwartz et al., 2001) and
electrophysiological studies have shown early dorsal
visual stream processing deficits in schizophrenia (Do-
niger et al., 2002; Foxe et al., 2001, 2005; Yeap et al.,
2006; Butler et al., 2007) that may well be the origin for
impairment of “upstream” function involving motion
perception (Chen et al., 2005), biological motion (Kim
et al., 2005) and consequently audiovisual integration of
speech. As such, dysfunction of dorsal stream visual
processing, and by inference, of biological motion pro-
cessing, might well be a source of upstream deficits in
the multisensory integration of speech. However, such
links remain speculative and will need to be explicitly
tested in future studies.
It should also be noted that results from speechread-
ing studies in patients, including the small sample
tested here, are not entirely consonant with this
hypothesis, with research in this area producing
somewhat non-uniform results. de Gelder and collea-
gues did find reduced ability in their patients to identify
syllables that were speechread (de Gelder et al., 2003).
This deficit, however, did not correlate with audiovi-
sual integration as measured by fusions in the McGurk
illusion, and so the authors concluded that the multi-
sensory integration decrement found in their study was
not due to reduced speechreading ability. A more
detailed examination of their results is merited here. The
speechreading decrement for patients in their study
amounted to 14% accuracy in comparison to healthy
subjects. If a speechreading deficit of similar magnitude
existed in our patient group, we could clearly infer that
speechreading would only account for a minute propor-
tion of the observed audiovisual integration deficit we
find. In our study, healthy controls, and for that matter the
three patients we tested, recognized 8.7% of the words
when delivered in the visual modality alone (i.e. when
they were speech-read). A decrease in accuracy by 14%
(i.e. from 8% down to 7%) would be insufficient to explain the 17% difference in multisensory gain that we did find between patients and controls at the −12 dB
noise level. Further, if a deficit in speechreading was
the source of the patients' failure to experience normal multisensory gain, one would expect it to impact gain at all SNR levels to the same extent. The
gain curve in Fig. 1 shows otherwise, where the gain at
higher SNRs (i.e. −8 dB, −4 dB and 0 dB) shows no
differences between patients and controls. (It should be noted that at these higher SNRs, gain is in any case constrained by AV-performance reaching ceiling levels.) Further, our small sample of patients showed no obvious deficit in speechreading ability, which again would not be sufficient to explain a
17% loss in AV-gain.
It is also the case that other studies have found no
deficit in speechreading ability at all for patients with
schizophrenia. In a carefully screened set of patients,
Myslobodsky and colleagues found no difference
between patients and controls when they were asked
to speech read spoken words (Myslobodsky et al.,
1992). The authors attribute a mild deficiency in the
speechreading of spoken sentences in their patients to a
secondary coping strategy rather than to speechreading
itself. Similarly, only very small differences were
found between patients and controls in speechreading
(Schonauer et al., 1998). Although in the present study
only three patients were tested for speechreading, they
also did not show any obvious deficit. In a similar vein,
Cienkowski and Carney (2002) showed that spee-
chreading ability in both young and elderly subjects
was unrelated to audiovisual integration in a McGurk
task. Gagné et al. (1995) investigated the effect of
conversational versus “clear” speech on speech intel-
ligibility and reported no correlations between perfor-
mance in the visual-alone and audiovisual conditions.
In line with these findings, we also found no rela-
tionship between speechreading performance and AV-
gain in either patients or controls. Overall, it seems
highly unlikely in our opinion that an isolated deficit in
speechreading could be the major source of the multi-
sensory deficit observed here.
So, if we can largely rule out that the present deficit is
primarily driven by unisensory visual deficits, and since
it is also clear that there is no deficit at all in unisensory
auditory speech recognition under the present circum-
stances, the inevitable conclusion is that these data have
uncovered an isolated multisensory integration deficit
for audiovisual speech recognition. This in turn suggests
that the neural substrate of this deficit is likely to be a
higher-order speech integration region, of which the
superior temporal sulcus/gyrus (STS/G) would appear to
be the likeliest candidate (Calvert and Campbell, 2003;
Surguladze et al., 2001). One obvious avenue for further
study would be to assess STS/G activity as a function of
the extent of the deficit in audiovisual integration seen in individual patients.
A reviewer of an earlier version of this paper raised
the important issue of whether the decrements seen in
audiovisual gain might be due in part to a failure to
maintain proper fixation on the face during the task. The
experimenters closely monitored eye-fixation through-
out the experiment and no systematic deviations in eye
fixation between patients and controls were observed.
Fixation and attention during audiovisual speech pro-
cessing is a rather complex issue. In various listening
conditions, the perceiver's eye gaze moves over the
face, dwelling predominantly on the mouth, nose and eyes
(Vatikiotis-Bateson et al., 1998) with gaze patterns
changing depending on the listening conditions (Buchan
et al., 2007). It therefore would have been inappropriate
to instruct participants to focus on a specific location on
the face of the speaker. However, if gaze position were
at the root of the deficit seen here, it should also have
resulted in substantial deficits in speechreading, some-
thing that was not seen in the small sample tested here,
and also not seen in a number of previous studies as
discussed above. It should also be mentioned that a
difference in gaze patterns might not necessarily explain
AV gain deficits as AV gain is observed at varying gaze
locations on the face. For example, Paré et al. (2003)
have shown that oral foveation is not necessary for
processing visual speech information. Lastly, if gaze
were the cause of the deficits seen here, one would
clearly expect to see deficits more evenly distributed
across the various signal-to-noise ratios used rather than
concentrated at the −12 dB level.
4.2. The optimal integration window
What remains to be discussed is the fact that patients
with schizophrenia show the largest deficit at the “inter-
mediate” SNR (−12 dB). In an earlier study (Ross et al.,
2007) we showed that healthy volunteers experience the
largest gain from visual articulation at this SNR. We
contended that the multisensory speech system is spe-
cially tuned for SNRs between extremes, extremes
where the system relies on either the visual (speechread-
ing) or the auditory modality alone, forming a window
of maximal integration centered at intermediate SNRs.
The location of this window is likely determined in part by properties of the speech stimulus.
In spoken words, vowels are generally easier to identify
whereas consonants, due to their lower intensity, are
easier to mask with noise (Barnett, 1999; French and
Steinberg, 1947). Of course, consonant sounds play a
critical role in speech recognition, since many words,
especially monosyllabic words, share the same vowel sounds and are distinguished mainly by their consonants. At very low SNRs, often only vowels are intelligible and words can
therefore remain ambiguous. It is possible that in our
particular word identification task, critical consonant
information became available to the listener at −12 dB,
which together with visual cues facilitated the identifica-
tion of the whole word. That is, we posit that visual
articulation becomes maximally effective when accom-
panied by a certain, critical amount of acoustic consonant
information. At this point on the SNR function, the gain
increases until word recognition in the AV condition
becomes increasingly restricted by the performance
ceiling (100%). From then on, benefit decreases again,
thereby bracketing what we have termed the window of
maximal integration (Ross et al., 2007).
Development of this specially tuned window is
likely a function of repeated lifetime exposure to
environmental conditions that are approximated by
the −12 dB condition in our study. In an ecological
context we are often unable to eliminate the source of
noise. We have, however, a variety of options to adjust
or compensate: we can get closer to the sound source
(i.e. move closer to the speaker), adjust the volume (ask
the speaker to speak up) etc. Therefore, conditions
where the acoustic stimulus is masked to the extent that
we are forced to rely solely on speechreading are rather
rare. Thus, the exposure to intermediate SNRs
throughout development may have resulted in an
adaptation of multisensory mechanisms in the brain
to integrate at these SNRs more efficiently. Through
repeated exposure to these specific environmental
conditions, cells in multisensory regions integrating
auditory and visual speech may have the capacity to be
“tuned” or “sensitized” to audiovisual speech that is
delivered at intermediate SNR's. In this view the
integration window is therefore an emerging property
of a plastic multisensory system. This notion is
supported by evidence from electrophysiological expe-
riments in the superior colliculus (SC) of cats. Wallace
and Stein have shown that multisensory neurons are
present in the SC of newborn cats (Wallace and Stein,
1997), but that these neurons show no capacity yet to
integrate sensory inputs. These neurons, however, ac-
quire the capacity to integrate information from
different senses within the first few weeks after birth
(Wallace and Stein, 1997). Critically, these integrative
properties are gated by feedback inputs from neocortex
(Jiang et al., 2001; Wallace and Stein, 1997) and are
dependent on the animal's specific sensory environ-
ment (Wallace et al., 2004). In a recent study by
Wallace and Stein, the authors were able to show that
when cats were raised in an altered sensory environ-
ment where auditory and visual inputs were temporally
coupled but originated from different spatial locations,
multisensory cells in the SC developed adaptively by
integrating auditory and visual input originating from
different spatial locations (Wallace and Stein, 2007).
It also follows that the development of appropri-
ately functioning multisensory networks is reliant on
the integrity of the individual sensory systems them-
selves and is therefore vulnerable to abnormalities in
basic sensory processing, as has been shown in schi-
zophrenia. Consequently, we would predict differ-
ences in the attributes of this integration window
where sensory processing is impacted by a disorder
early in life, a notion that may well extend to other
childhood clinical conditions such as autism (e.g.
Iarocci and McDonald, 2006; Molholm and Foxe,
2005) and early hearing impairments (e.g. Schlum-
berger et al., 2004).
4.3. Why no unisensory auditory deficits in speech recognition?
A somewhat surprising finding in our study was that
patients with schizophrenia did not show any deficits in
the condition where the words were presented without
the aid of visual articulation. This is surprising, given
the large number of reports of impairment in speech and
language related functions (e.g. Baltaxe and Simmons,
1995; Bull and Venables, 1974; Cannon et al., 2002;
Condray, 2005; Lee et al., 2004; Leitman et al., 2005;
Titone and Levy, 2004). It also appears reasonable to
assume that lower level auditory deficits that have been
shown in schizophrenia (e.g. Alain et al., 2002; Javitt et
al., 1997, 2000a; Michie, 2001; Rosburg et al., 2004)
would impact higher-order functions such as speech
perception. Overall, however, receptive language does
not seem to be as impaired as speech production
(Weinstein et al., 2006). Thought disorder, the most
prominent symptom of schizophrenia, observably
manifests itself in language production. Functional
brain imaging studies have found thought disorder to
be associated with altered activation patterns in the left
and right STS/G (McGuire et al., 1998; Kircher et al.,
2002) during speech production. It has been hypothe-
sized that the relative preservation of receptive language
in schizophrenia is due to a compensatory process
(Weinstein et al., 2006) analogous to the mechanism
proposed for the preservation of performance in tasks
targeting working memory (Manoach et al., 1999;
Manoach, 2003). Here, normal performance in low
memory load conditions was associated with an increase
in prefrontal activity whereas high demands produced
low performance together with decreased activity in the
prefrontal cortex. This particular pattern of performance
and activation in schizophrenia is thought to be due to a
memory system operating at higher intensity to maintain
normal performance. In this way, patients with schizophrenia are able to compensate for deficits in less demanding memory tasks but reach the limit of their capacity when higher demands are
imposed. It is possible that language production
represents a higher demand on the system than receptive
processes such as word recognition. Consequently, one
might predict that more complex speech recognition
may also reveal a receptive deficit in schizophrenia. For
example, meaningful and syntactically correct sentences
provide semantic and syntactic context that benefit word
recognition. There is evidence that patients with
schizophrenia do not experience that benefit to the
same extent as controls (e.g. Kuperberg et al., 2006;
Ruchsow et al., 2003).
In line with the observed lack of impairment in the recognition of spoken words in schizophrenia are stu-
dies that failed to find strong associations between psy-
choacoustic measures of auditory acuity and auditory
speech recognition (CHABA Working Group 95, 1991).
It has been argued elsewhere (Watson et al., 1996) that
given the redundancy of the speech signal and the
complexity of speech recognition as a cognitive task,
listeners may be able to compensate for existing impairments. This argument is supported by
evidence showing that the elimination of fine spectral
detail has little impact on speech intelligibility (Dudley,
1939; Greenberg and Arai, 2004).
In conclusion, patients with schizophrenia showed
deficits in their ability to derive benefit from visual arti-
culatory motion while unisensory auditory speech perception remained intact. It is possible that this deficit in audio–visual speech integration is related to a well-
characterized dysfunction of the dorsal visual processing
stream but this remains to be explicitly examined.
Role of funding source
Support for this work was provided by grants to Professor Foxe
from the National Institute of Mental Health (MH65350) and the
National Institute on Aging (AG22696) and to Dr. Javitt from the
National Institute of Mental Health (MH49334 and MH01439). Ms.
Leavitt was supported by a Ruth L. Kirschstein pre-doctoral fellowship
(NRSA - MH074284) from the National Institute of Mental Health
(NIMH). Dr. Molholm was supported by a Ruth L. Kirschstein post-
doctoral fellowship (NRSA - MH068174) from the NIMH. Neither the
NIMH nor the NIA had any further role in study design; in the
collection, analysis and interpretation of data; in the writing of the
report; or in the decision to submit the paper for publication.
Contributors

Mr. Ross designed the stimulus sequences, programmed all para-
digms, analyzed all data and wrote the first draft of the manuscript. Dr.
Saint-Amour aided in the design and setup of the experimental para-
digm, provided statistical help and commented critically on multiple
drafts of the manuscript. Professor Foxe designed the experimental
protocol and edited multiple drafts of the manuscript. Ms. Leavitt
helped inthe collectionof data, the editingandpreparationof the video
and audio clip files, tabulated patient demographics, and aided in
analyses. Drs. Molholm and Javitt provided editorial comments on
multiple drafts of the manuscript. All authors contributed to and have
approved the final manuscript. The principal investigator, Dr. Foxe,
takes responsibility for the integrity of the data and the accuracy of the
data analysis, and attests that all authors had full access to all the data
in the study.
Conflict of interest
All authors declare no conflicts of interest, financial or otherwise.
Acknowledgements

We are deeply indebted to the team at the Cognitive Neurophys-
iology Laboratory for their dedication and hard work. Thanks also go
to Ms. Gail Silipo for her assistance in recruiting subjects and her
enduring dedication to the patients.
References

Alain, C., Bernstein, L.J., Cortese, F., Yu, H., Zipursky, R.B., 2002.
Deficits in automatically detecting changes in conjunction of
auditory features in patients with schizophrenia. Psychophysiology
Allison, T.P.A., McCarthy, G., 2000. Social perception from visual
cues: role of the STS region. Trends Cogn. Sci. 4, 267–278.
Baltaxe, C.A., Simmons III, J.Q., 1995. Speech and language disorders
in children and adolescents with schizophrenia. Schizophr. Bull.
Barnett, 1999. Overview of speech intelligibility. Proc. IOA 21, 1–15.
Bernstein, L.J., Auer, E.T., Moore, J.K., 2004. Audiovisual Speech
Binding: Convergence or Association? MIT Press, Cambridge, MA.
Buchan, J.N., Paré, M., Munhall, K.G., 2007. Spatial statistics of gaze
fixations during dynamic face processing. Social Neurosci. 2, 1–13.
Bull, H.C., Venables, P.H., 1974. Speech perception in schizophrenia.
Br. J. Psychiatry 125, 350–354.
Butler, P.D., Hoptman, M.J., Nierenberg, J., Foxe, J.J., Javitt, D.C.,
Lim, K.O., 2006. Visual white matter integrity in schizophrenia.
Am. J. Psychiatry 163, 2011–2013.
Butler, P.D., Martinez, A., Foxe, J.J., Kim, D., Silipo, G., Mahony, J.,
Shpaner, M., Jalbrikowski, M., Javitt, D.C., 2007. Subcortical
visual dysfunction in schizophrenia drives secondary cortical
impairments. Brain 130, 417–430.
Callan, D.E., Jones, J.A., Munhall, K., Callan, A.M., Kroos, C.,
Vatikiotis-Bateson, E., 2003. Neural processes underlying percep-
tual enhancement by visual speech gestures. NeuroReport 14,
Calvert, G.A., 2001. Crossmodal processing in the human brain:
insights from functional neuroimaging studies. Cereb. Cortex 11,
Calvert, G.A., Campbell, R., 2003. Reading speech from still and
moving faces: the neural substrates of visible speech. J. Cogn.
Neurosci. 15, 57–70.
Campbell, R., MacSweeney, M., 2004. Neuroimaging Studies of
Cross-Modal Plasticity and Language Processing in Deaf People.
MIT Press, Cambridge, MA.
Cannon, M., Caspi, A., Moffitt, T.E., 2002. Evidence for early-
childhood, pan-developmental impairment specific to schizophre-
niform disorder: results from a longitudinal birth cohort. Arch.
Gen. Psychiatry 59, 449–456.
CHABA, Working Group on Communication Aids for the Hearing-
Impaired, 1991. Speech perception aids forth the hearing impaired
people: current status and needed research. J. Acoust. Soc. Am. 90,
Chen, Y., Bidwell, L.C., Holzman, P.S., 2005. Visual motion
integration in schizophrenia patients, their first-degree relatives,
and patients with bipolar disorder. Schizophr. Res. 74, 271–281.
Cienkowski, K.M., Carney, A.E., 2002. Auditory–visual speech per-
ception and aging. Ear Hear. 23, 439–449.
Condray, R., 2005. Language disorder in schizophrenia as a
developmental learning disorder. Schizophr. Res. 73, 5–20.
de Gelder, B., Vroomen, J., Annen, L., Masthof, E., Hodiamont, P., 2003. Audio-visual integration in schizophrenia. Schizophr. Res. 59,
Dekle, D.J., Fowler, C.A., Funnell, M.G., 1992. Audiovisual
integration in perception of real words. Percept. Psychophys. 51,
Doniger, G.M., Foxe, J.J., Murray, M.M., Higgins, B.A., Javitt, D.C.,
2002. Impaired visual object recognition and dorsal/ventral stream
Dudley, H., 1939. The automatic synthesis of speech. Proc. Natl. Acad.
Sci. U.S.A. 25 (7), 377–383.
Erber, N.P., 1969. Interaction of audition and vision in the recognition
of oral speech stimuli. J. Speech Hear. Res. 12, 423–425.
First, M.B., Spitzer, R.L., Benjamin, L., Gibbon, M., Williams, J.B.W.,
1997. Structured Clinical Interview for DSM-IV. American
Psychiatric Publishers INC.
Foxe, J.J., Schroeder, C.E., 2005. The case for a feedforward com-
ponent in multisensory integration mechanisms. NeuroReport 16,
Foxe, J.J., Doniger, G.M., Javitt, D.C., 2001. Early visual processing
deficits in schizophrenia: impaired P1 generation revealed by high-
density electrical mapping. NeuroReport 12, 3815–3820.
Foxe, J.J., Murray, M.M., Javitt, D.C., 2005. Filling-in in schizophrenia:
a high-density electrical mapping and source-analysis investigation
of illusory contour processing. Cereb. Cortex 15, 1914–1927.
French, N.R., Steinberg, J.C., 1947. Factors governing the Intelligi-
bility of Speech Sounds. J. Acoust. Soc. Am. 19, 90–119.
Gagné, J.-P., Querengesser, C., Folkeard, P., Munhall, K., Masterson,
V.M., 1995. Auditory, visual, and audiovisual speech intelligibility
for sentence-length stimuli: an investigation of conversational and
clear speech. Volta Rev. 97, 33–51.
Grant, K., Seitz, P.F., 2000. The use of visible speech cues for impro-
ving auditory detection of spoken sentences. J. Acoust. Soc. Am.
Greenberg, S., Arai, T., 2004. What are the essential cues for understan-
ding spoken language? IEICE Trans. Inf. Syst., E 87, 1059–1070.
Hoffman, R.E., Rapaport, J., Mazure, C.M., Quinlan, D.M., 1999.
Selective speech perception alterations in schizophrenic patients
reporting hallucinated “voices”. Am. J. Psychiatry 156, 393–399.
Hyman, S.E., Arana, G.W., Rosenbaum, J.F., 1995. Handbook of
Psychiatric Drug Therapy. Little, Brown and Company, Boston.
Iacoboni, M., Koski, L.M., Brass, M., Bekkering, H., Woods, R.P.,
Dubeau, M.C., Mazziotta, J.C., Rizzolatti, G., 2001. Reafferent
copies of imitated actions in the right superior temporal cortex.
Proc. Natl. Acad. Sci. U. S. A. 98.
Iarocci, G., McDonald, J., 2006. Sensory integration and the perceptual
Javitt, D.C., Doneshka, P., Zylberman,I., Ritter, W., VaughanJr., H.G.,
1993. Impairment of early cortical processing in schizophrenia: an
event-related potential confirmation study. Biol. Psychiatry 33,
Javitt, D.C., Doneshka, P., Grochowski, S., Ritter, W., 1995. Impaired
mismatch negativity generation reflects widespread dysfunction of
working memory in schizophrenia. Arch. Gen. Psychiatry 52,
Javitt, D.C., Strous, R.D., Grochowski, S., Ritter, W., Cowan, N.,
1997.Impairedprecision,but normal retention,of auditory sensory
(“echoic”) memory information in schizophrenia. J. Abnorm.
Psychology 106, 315–324.
Javitt, D.C., Grochowski, S., Shelley, A.M., Ritter, W., 1998. Impaired
mismatch negativity (MMN) generation in schizophrenia as a
function of stimulus deviance, probability, and interstimulus/
interdeviant interval. Electroencephalogr. Clin. Neurophysiol. 108,
Javitt, D.C., Shelley, A., Ritter, W., 2000a. Associated deficits in
mismatch negativity generation and tone matching in schizophre-
nia. Clin. Neurophysiol. 111, 1733–1737.
Javitt, D.C., Shelley, A.M., Silipo, G., Lieberman, J.A., 2000b. Deficits
in auditory and visual context-dependent processing in schizophre-
nia: defining the pattern. Arch. Gen. Psychiatry 57, 1131–1137.
Jiang, W., Wallace, M., Jiang, H., Vaughan, J., Stein, B., 2001. Two
cortical areas mediate multisensory integration in superior
colliculus neurons. J. Neurophysiol. 85, 506–522.
Jibson, M.D., Tandon, R., 1998. New atypical antipsychotic medica-
tions. J. Psychiatr. Res. 32, 215–228.
Kern, J.K., 2002. The possible role of the cerebellum in autism/PDD:
disruption of a multisensory feedback loop. Med. Hypotheses 59,
Kim, J., Doop, M.L., Blake, R., Park, S., 2005. Impaired visual
recognition of biological motion in schizophrenia. Schizophr. Res.
Kim, D., Wylie, G., Pasternak, R., Butler, P.D., Javitt, D.C., 2006.
Magnocellular contributions to impaired motion processing in
schizophrenia. Schizophr. Res. 82, 1–8.
Kircher, T., Liddle, P.F., Brammer, M.J., Williams, S.C., Murray, R.M.,
McGuire, P.K., 2002. Reversed lateralization of temporal activa-
tion during speech production in thought disordered patients with
schizophrenia. Psychol. Med. 32, 439–449.
Kucera, H., Francis, W.N., 1967. Computational Analysis of Present-
Day American English. Brown University Press, Providence, RI.
Kuperberg, G.R., Sitnikova, T., Goff, D., Holcomb, P.J., 2006. Making
sense of sentences in schizophrenia: electrophysiological evidence
for abnormal interactions between semantic and syntactic proces-
sing. J. Abnorm. Psychology 115, 251–265.
Lebib, R., Papo, D., de Bode, S., Baudonnière, P.M., 2003. Evidence of a visual-to-auditory cross-modal sensory gating phenomenon as
reflected by the human P50 event-related brain potential modula-
tion. Neurosci. Lett. 341, 185–188.
Lee, S.H., Chung, Y.C., Kim, Y.K., Suh, K.Y., 2004. Abnormal speech
perception in schizophrenia with auditory hallucinations. Acta Neu-
ropsychiatr. 16, 154–159.
Leitman, D.I., Foxe, J.J., Butler, P.D., Saperstein, A., Revheim, N.,
Javitt, D.C., 2005. Sensory contributions to impaired prosodic
processing in schizophrenia. Biol. Psychiatry 58, 56–61.
Manoach, D.S., 2003. Prefrontal cortex dysfunction during working
memory performance in schizophrenia: reconciling discrepant find-
ings. Schizophr. Res. 60, 285–298.
Manoach, D.S., Press, D.Z., Thangaraj, V., et al., 1999. Schizophrenic subjects activate dorsolateral prefrontal cortex during a working memory task, as measured by fMRI. Biol. Psychiatry 45, 1128–1137.
Massaro, D.W., 1998. Perceiving Talking Faces: Insights into Auditory
Attention. MIT Press, Cambridge, MA.
McGuire, P., Quested, D.J., Spence, S.A., Murray, R.M., Frith, C.D.,
Liddle, P.F., 1998. Pathophysiology of ‘positive’ thought disorder
in schizophrenia. Br. J. Psychiatry 173, 231–235.
McGurk, H., MacDonald, J., 1976. Hearing lips and seeing voices.
Nature 264, 746–748.
Meredith, M.A., Stein, B.E., 1986. Visual, auditory, and somatosen-
sory convergence on cells in superior colliculus results in
multisensory integration. J. Neurophysiol. 56, 640–662.
Michie, P.T., 2001. What has MMN revealed about the auditory system
in schizophrenia? Int. J. Psychophysiol. 42, 177–194.
Michie, P.T., Innes-Brown, H., Todd, J., Jablensky, A.V., 2002. Duration
mismatch negativity in biological relatives of patients with
schizophrenia spectrum disorders. Biol. Psychiatry 52, 749–758.
Molholm, S., Foxe, J.J., 2005. Look ‘hear’, primary auditory cortex is
active during lip-reading. NeuroReport 16, 123–124.
Molholm, S., Ritter, W., Javitt, D.C., Foxe, J.J., 2004. Multisensory
visual–auditory object recognition in humans: a high-density elec-
trical mapping study. Cereb. Cortex 14, 452–465.
Molholm, S., Sehatpour, P., Mehta, A.D., 2006. Audio–visual multi-
sensory integration in superior parietal lobule revealed by human
intracranial recordings. J. Neurophysiol. 96, 721–729.
Munhall, K.G., Servos, P., Santi, A., Goodale, M.A., 2002. Dynamic
visual speech perception in a patient with visual form agnosia.
NeuroReport 13, 1793–1796.
Munhall, K.G., Jones, J.A., Callan, D.E., Kuratate, T., Vatikiotis-
Bateson, E., 2004a. Visual prosody and speech intelligibility: head
movement improves auditory speech perception. Psychol. Sci. 15,
Munhall, K.G., Jones, J.A., Callan, D.E., Kuratate, T., Vatikiotis-
Bateson, E., 2004b. Visual prosody and speech intelligibility: head
movement improves auditory speech perception. Psychol. Sci. 15,
Myslobodsky, M.S., Goldberg, T., Johnson, F., Hicks, L., Weinberger,
D.R., 1992. Lipreading in patients with schizophrenia. J. Nerv.
Ment. Dis. 180, 168–171.
O'Neill, J.J., 1954. Contributions of the visual component of oral symbols to speech comprehension.
Paré, M., Richler, R.C., ten Hove, M., Munhall, K.G., 2003. Gaze behavior in audiovisual speech perception: the influence of ocular fixations on the McGurk effect. Percept. Psychophys. 65, 553–567.
Peuskens, J., Link, C.G.G., 1997. A comparison of quetiapine and
chlorpromazine in the treatment of schizophrenia. Acta Psychiatr.
Scand. 96, 265–273.
Rorden, C., Heutnik, J., Greenfield, E., Robertson, I.H., 1999. When a
rubber hand ‘feels’ what a real hand cannot. NeuroReport 10 (1),
Rosburg, T., Kreitschmann-Andermahr, I., Sauer, H., 2004. Mismatch negativity: an indicator of early processing disorders of acoustic information. Nervenarzt 75, 633–641.
Ross, L.A., Saint-Amour, D., Leavitt, V.M., Javitt, D.C., Foxe, J.J.,
2007. Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments. Cereb. Cortex 17, 1147–1153.
Ruchsow, M., Trippel, N., Groen, G., Spitzer, M., Kiefer, M., 2003.
Semantic and syntactic processes during sentence comprehension
in patients with schizophrenia: evidence from event-related poten-
tials. Schizophr. Res. 64, 147–156.
Saint-Amour, D., De Sanctis, P., Molholm, S., Ritter, W., Foxe, J.J.,
2006. Seeing voices: High-density electrical mapping and source-
analysis of the multisensory mismatch negativity evoked during
the McGurk illusion. Neuropsychologia 45 (3), 587–597.
Salisbury, D.F., Shenton, M.E., Griggs, C.B., Bonner-Jackson, A.,
McCarley, R.W., 2002. Mismatch negativity in chronic schizo-
phrenia and first-episode schizophrenia. Arch. Gen. Psychiatry 59,
Schlumberger, E., Narbona, J., Manrique, M., 2004. Non-verbal
development of children with deafness with and without cochlear
implants. Dev. Med. Child Neurol. 46, 599–606.
Schonauer, K., Achtergarde, D., Reker, T., 1998. Lipreading in
prelingually deaf and hearing patients with schizophrenia. J. Nerv.
Ment. Dis. 186, 247–249.
Schroeder, C.E., Foxe, J., 2005. Multisensory contributions to low-level,
“Unisensory” processing. Curr. Opin. Neurobiol. 15, 454–458.
Schwartz, B.D., Tomlin, H.R., Evans, W.J., Ross, K.V., 2001.
Neurophysiologic mechanisms of attention: a selective review of
early information processing in schizophrenics. Front. Biosci. 6,
Shenton, M.E., Dickey, C.C., Frumin, M., McCarley, R.W., 2001. A
review of MRI findings in schizophrenia. Schizophr. Res. 49,
Stein, B.E., Jiang, W., Wallace, M.T., Stanford, T.R., 2001. Nonvisual
influences on visual-information processing in the superior collicu-
lus. Prog. Brain Res. 134, 143–156.
Stein, B.E., Wallace, M.W., Stanford, T.R., Jiang, W., 2002. Cortex
governs multisensory integration in the midbrain. Neuroscientist 8,
Sumby, W.H., Pollack, I., 1954. Visual contribution to speech
intelligibility in noise. J. Acoust. Soc. Am. 26, 212–215.
Surguladze, S.A., Calvert, G.A., Brammer, M.J., 2001. Audio–visual
speech perception in schizophrenia: an fMRI study. Psychiatry
Res. 106, 1–14.
Titone, D., Levy, D.L., 2004. Lexical competition and spoken word
identification in schizophrenia. Schizophr. Res. 68, 75–85.
Vatikiotis-Bateson, E., Eigsti, I.M., Yano, S., Munhall, K.G., 1998.
Eye movement of perceivers during audiovisual speech perception.
Percept. Psychophys. 60, 926–940.
Wallace, M.T., Stein, B.E., 1997. Development of multisensory
neurons and multisensory integration in cat superior colliculus.
J. Neurosci. 17, 2429–2444.
Wallace, M., Stein, B.E., 2007. Early experience determines how the
senses will interact. J. Neurophysiol. 97 (1), 921–926.
Wallace, M., Perrault II, T.J., Hairston, W.D., Stein, B.E., 2004. Visual
experience is necessary for the development of multisensory
integration. J. Neurosci. 24, 9580–9584.
Watson, C.S., Qiu, W.W., Chamberlain, M.M., Li, X., 1996. Auditory
and visual speech perception: confirmation of a modality-
independent source of individual differences in speechrecognition.
J. Acoust. Soc. Am. 100, 1153–1162.
Weinstein, S., Werker, J.F., Vouloumanos, A., Woodward, T.S., Ngan,
E.T., 2006. Do you hear what I hear? Neural correlates of thought
disorder during listening to speech in schizophrenia. Schizophr.
Res. 86, 130–137.
Woods, S.W., 2003. Chlorpromazine equivalent doses for the newer
atypical antipsychotics. J. Clin. Psychiatry 64, 663–667.
Yeap, S., Kelly, S.P., Sehatpour, P., Magno, E., Javitt, D.C., Garavan, H.,
Thakore, J.H., Foxe, J.J., 2006. Early visual sensory deficits as
endophenotypes for schizophrenia: high-density electrical mapping
in clinically unaffected first-degree relatives. Arch. Gen. Psychiatry