J Psycholinguist Res
Effects of Orthography on Speech Production in Chinese
Qingfang Zhang · Markus F. Damian
© Springer Science+Business Media, LLC 2011
The potential role of orthographic representations in spoken word production
was investigated with speakers of Chinese, a non-alphabetic and orthographically
non-transparent language. Using the response generation procedure, we obtained the well-known
facilitation from word-initial phonological overlap, but this effect was unaffected by whether
or not responses shared the initial character. In a study which manipulated the visual
similarity of the word-initial character, a significant inhibitory effect of orthography was found.
However, this effect disappeared when prompt stimuli were presented auditorily, suggesting
that the orthographic effect might be attributable to the memorization stage of the response
generation task, rather than reflecting processes genuine to speaking. By contrast, a reliable
orthographic effect was found in an oral reading task, suggesting that orthography plays a
role only when it is relevant to the word production task. Furthermore, the present findings
show that the orthographic effect is tied to the correspondence between orthography and
phonology of a language when orthography is relevant to the task used.
Speech production · Orthography · Response generation task · Word reading
The spoken production of a word involves the retrieval of semantic, syntactic and phono-
graphic information. This claim is somewhat counterintuitive as spelling codes would prima
facie appear irrelevant for phonological encoding. It is motivated by the observation that in
language comprehension rather than production, there is substantial and growing evidence
for the co-activation of orthographic codes (e.g., Chéreau et al. 2007; Dijkstra et al. 1995;
Donnenwerth-Nolan et al. 1981; Hallé et al. 2000; Jakimik et al. 1985; Muneaux and Ziegler
Q. Zhang (B )
Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences,
4A Datun Road, Chaoyang District, Beijing 100101, China
M. F. Damian
Department of Experimental Psychology, University of Bristol, Bristol, UK
2004; Racine and Grosjean 2005; Pattamadilok et al. 2007; Seidenberg and Tanenhaus 1979;
Taft and Hambly 1985; Ventura et al. 2004; Ziegler and Ferrand 1998; Ziegler et al. 2004,
findings imply a high degree of interconnectedness of linguistic codes in different formats
in the mental lexicon, with access to phonological representations entailing parallel access
to the orthographic format.
By contrast, the potential role of orthographic information in spoken language production
conflicting results. A number of studies have used the so-called “picture-word interference”
procedure in a way which may allow inferences about orthographic influences in spoken
production. Lupker (1982) compared distractors overlapping in spelling and sound with the
but still significant, facilitation in the two latter, compared to the former, condition (see also
Underwood and Briggs 1984). However, distractors were presented in printed format, and
the observed effect may be largely attributable to visual word recognition, rather than word
production. Damian and Bowers (2009) showed that with spoken distractors, phonologi-
cal facilitation was found, but there was no effect of orthographic overlap. Hence, findings
from the picture-word interference task do not provide unambiguous evidence concerning the
possibility of orthographic influences on spoken production.
Of particular relevance for the experiments reported below are studies using the so-called
“response generation” procedure. In this task (e.g., Meyer 1990, 1991), participants first
learn a small set of highly associated word pairs such as “fruit-melon”, “iron-metal”, and
produce the second word of each pair (“response”) in response to the visually presented first
of the response is measured. After completion, they are presented with a new set of words to
learn, and execute the corresponding block, etc. Critically, the presence or absence of form
overlap between the responses within a block is manipulated. Across the experiments, each
replicated previous studies (e.g., Meyer 1990, 1991, and many others) in showing a reliable
priming effect in the homogeneous condition in which all response words shared initial sound
and spelling (e.g., “camel”-“coffee”-“cushion”), compared to a heterogeneous condition in
which this was not the case (e.g., “camel”-“gypsy”-“cushion”). Crucially, no priming effect
was obtained in an inconsistent condition in which all response words shared initial sound,
but differed in spelling (e.g., “camel”-“kayak”-“kidney”). Hence, when retrieving the pho-
nological codes of the response words, orthographic codes may be activated and affect word
production, despite the fact that spelling is irrelevant to the speaking process.
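
The three set types contrasted above can be made concrete with a small sketch. It is an illustration only, not part of the original studies: the function name `classify_context` and the initial-phoneme codes attached to each example word are assumptions introduced here, and a set is treated as heterogeneous whenever its responses do not all share the initial sound.

```python
def classify_context(responses):
    """Classify a response set by word-initial sound and spelling overlap.

    `responses` is a list of (spelling, initial_phoneme) pairs. The phoneme
    codes are supplied by the caller; those used below are assumptions made
    for illustration, not transcriptions taken from the original studies.
    """
    sounds = {phoneme for _, phoneme in responses}
    letters = {spelling[0].lower() for spelling, _ in responses}
    if len(sounds) > 1:
        # Responses do not all share the initial sound.
        return "heterogeneous"
    # Shared initial sound: same letter -> homogeneous, else inconsistent.
    return "homogeneous" if len(letters) == 1 else "inconsistent"


homogeneous_set = [("camel", "k"), ("coffee", "k"), ("cushion", "k")]
heterogeneous_set = [("camel", "k"), ("gypsy", "dʒ"), ("cushion", "k")]
inconsistent_set = [("camel", "k"), ("kayak", "k"), ("kidney", "k")]

for name, word_set in [("homogeneous", homogeneous_set),
                       ("heterogeneous", heterogeneous_set),
                       ("inconsistent", inconsistent_set)]:
    print(name, "->", classify_context(word_set))
```

The critical comparison is between the homogeneous and the inconsistent sets, which share initial sound and differ only in whether the initial letter is shared.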
However, a few more recent studies suggested that effects of orthography in spoken
production are difficult to replicate. Roelofs (2006) carried out a study in Dutch with a task
similar to that of Damian and Bowers (2003), and demonstrated that, contrary to their findings,
response preparation was not disrupted when response words had the same initial phoneme
but differed in spelling. By contrast, when the task was to simply read aloud the visually
presented response words, Roelofs found the predicted effect of orthographic overlap: the
priming effect largely disappeared when targets were spelled with different onset letters. He
concluded that preparation in speech production takes place in terms of phonemes, and that
the spelling of a word constrains spoken production in Dutch only when it is relevant for
the word production task at hand, such as in reading aloud. Schiller (2007) investigated the
orthographic and phonological contribution of visually masked primes to reading aloud in
for visual word recognition, while suggesting that orthography plays a rather limited role
in reading Dutch words aloud. Schiller (2007) further suggests that the absence of an
orthographic effect might result from the shallow orthography of Dutch, with its relatively
clear correspondence between graphemes and phonemes.
Overall, despite the initial findings reported by Damian and Bowers (2003), subsequent
similar studies conducted in Dutch (Roelofs 2006) and French (Alario et al. 2007) failed to
replicate the originally reported effect of orthography, warranting caution about the claim
is more transparent in Dutch than in English (Seymour et al. 2003), hence Roelofs (2006)
and Schiller (2007) raised the possibility that the differences among studies might arise from
deviations in transparency between the languages in which the experiments were conducted.
This hypothesis predicts that the deeper the correspondence between orthography and
phonology in a language, the more likely orthographic activation is to be found in speech
production in that language, especially when orthography is relevant for the speech production task.
The summarized studies on this topic were conducted with speakers of languages using
an alphabetically based notation system (although with varying degrees of transparency).
An interesting property of the Chinese orthographic system is that it is very “deep”, i.e., the
correspondences between spelling and sound are less transparent than in English or French.
With a picture-word interference task, previous studies demonstrated that there is pure
orthographic activation with visually presented distractors in Chinese speech production,
although it is not necessary to activate orthographic information for spoken naming (Zhang
et al. 2009; Zhang and Weekes 2009). According to the hypothesis of Roelofs (2006) and
Schiller (2007), it is hence possible that studies conducted with Chinese speakers are
particularly likely to reveal an automatic orthographic effect in spoken word production.
To date little evidence exists concerning this issue in Chinese. Chen et al. (2002) inves-
tigated word form encoding of spoken word production with visually presented words, and
first experiment compared response generations in a condition when all response words in a
block shared the initial syllable and tone, to one in which response words additionally shared
the orthographic character. They reported a very similar degree of priming in the former case
(46 ms), compared to the latter (53 ms), suggesting a limited role of orthography in Chinese
production in the response generation task (see also Chen and Chen 2006). Most recently, Bi
et al. (2009) investigated the contribution of orthography to spoken word production in Man-
darin Chinese by manipulating orthographic and phonological overlap separately, in either
picture naming, a prompt-response generation task, or in word naming. They found no effect
of shared orthography in picture naming, a small (11 ms) and nonsignificant inhibitory effect
in prompt-response generation, and a larger (20 ms) and significant inhibitory effect in word
reading. Their findings suggest that a word’s orthographic properties influence spoken word
production only in tasks that rely heavily on orthographic information.
Bi et al.’s (2009) results offer prima facie rather strong evidence that character overlap is
irrelevant in the Chinese response generation task. However, Bi et al. used different response
words, and hence different overlapping syllables, across the different conditions (O+P+,
O+P−, O−P+). Bi et al. did not report each syllable’s effect in their study, but we observed
in Chen et al.’s (2002) study that different syllables produced varying degrees of priming. For
instance, in Chen et al.’s Experiment 2a, response words beginning with the syllable “fei”
produced a 29 ms effect, but those beginning with “ke” produced a −1 ms effect. Chen et al.
(2002) also found a significant syllable facilitation effect in their Experiment 3 with four
syllables, but a non-significant syllable effect in their Experiment 4 with a different set of
four syllables. In languages other than Chinese, it is also the case that effects sometimes
vary depending on specific target properties; e.g., in Meyer’s (1990) seminal study introducing
the response generation manipulation, the first experiment used five different sets of Dutch
Thus, for as yet unknown reasons, different syllables may produce priming effects of different
magnitude in speech production. Hence, it would clearly be advantageous to use identical
syllables when comparing across conditions, as was the case in most previous studies on the
topic (e.g., Roelofs 2006).
In summary, the existing evidence to some extent discounts the role of orthographic vari-
two objectives in the present study. First, in the series of response generation experiments
and its locus on Chinese spoken word production. Second, we aimed to test the
language-related hypothesis in oral word reading, an orthography-relevant task, with a language
(Chinese) whose orthography is deeper than that of Dutch or English. Experiment 1 aimed to
replicate Damian and Bowers’ (2003) findings in Chinese, by mimicking their particular design as
closely as possible. In typical response generation studies, each response in a heterogeneous
context is different from all others in terms of phonological properties (e.g., a set consisting
of the responses “crown”, “flood”, “grit”, “break”, in which each word starts with a different
item (e.g., “crown”, “cloud”, “cape”, “bear”; the “odd-man out” version; see Roelofs 1999;
Roelofs and Baayen 2002, for examples). With this version of the task, Damian and Bowers
(2003) found an orthographic facilitation effect. Furthermore, we compared a response gen-
eration to a word reading task with identical stimuli in order to determine whether or not
orthography plays a role when it is relevant to the task. Experiment 2 attempted to isolate the
effects of orthography by using visually similar, but non-identical, word-initial characters
in the homogeneous condition. Here we found a small but significant inhibitory effect of
orthographic similarity. However, in the otherwise identical Experiment 3, prompt words
were auditorily presented, and the orthographic effect disappeared, suggesting a strategic
origin of the effect found in Experiment 2. Overall, the results argue against the possibility
that the role of orthography in spoken production is particularly pronounced in languages
with “deep” orthographic systems such as Chinese in the response generation task, but favor
a role of orthography in word reading, which is an orthography-relevant task.
Participants. Forty-eight undergraduate students from the Chinese Agricultural University
and Beijing Forestry University were paid for their participation. All were native Chinese
speakers and had normal or corrected-to-normal vision. Twenty-four participants were randomly
assigned to each of the two tasks (response generation and word reading).
Materials and Design. The target stimuli consisted of 12 words, which were either the
responses in the response generation task, or were read aloud in the word reading task.
All words were two-character nouns or verbs. The two syllable-tone combinations /ci2/ and
/sheng4/ served as the first character’s pinyin of the response words.
The independent variable was context, with the following three levels: homogeneous (all
responses within an experimental block shared the initial phonological syllable, as well as
the character), heterogeneous (responses shared neither the initial syllable nor the character),
and inconsistent (responses shared the initial syllable, but differed in their first character).
For each syllable-tone combination /ci2/ and /sheng4/, six cue-response word pairs were
selected, yielding four homogeneous sets with three cue-response pairs each. Within each
set, all response words began with the same phonological syllable and were spelled with
the same initial character (e.g., sheng4-fu4 (victory and failure), sheng4-si1 (better
than), sheng4-ren4 (be capable of)). The mean frequency of occurrence for the 12
response words was 40 per million in the Modern Chinese Frequency Dictionary (Liu 1990).
ferent set, such that initial phonology and character differed. For instance, the homogeneous
triplet “sheng4-fu4, sheng4-si1, sheng4-ren4” became the heterogeneous
inconsistent sets were generated by again exchanging one item between sets, but now such
that, although the initial character of the words differed, the initial phonological syllable
remained identical. For instance, the homogeneous triplet “sheng4-fu4, sheng4-si1,
sheng4-ren4” became the inconsistent triplet “sheng4-xia4 (a very hot summer), sheng4-si1,
sheng4-ren4”. By this method, four heterogeneous sets
and four inconsistent sets were constructed, respectively. Across all 12 sets, each item was
selected exactly once; hence, the same response words were tested in all three conditions,
and only the context in which they appeared was varied (see Appendix A for a complete list
of all sets).
Each participant completed all 12 blocks. The order in which the sets were presented
was as follows: Conditions were rotated from block to block in a particular order (i.e., block 1:
homogeneous, block 2: inconsistent, block 3: heterogeneous or block 3, block 2, block 1),
which was determined by two Latin squares of size three such that four of the 24 participants
received blocks in each order. Furthermore, the order of the four sets for each type of context
received a particular order. In this way, each experimental block (i.e., set 1: homogeneous)
items were presented within each block was random, but repetitions of pairs were avoided.
Each block consisted of seven repetitions of each of the three stimuli, producing a total of
21 trials. Hence, each participant received 252 trials in total.
Apparatus. The experiment was programmed in E-Prime Professional Software (Schneider
et al. 2002) on a fast Pentium-compatible PC. The stimuli were presented on a CRT
phone, which was connected to the computer via a PST Serial Response Box (Schneider
Procedure. Participants were tested individually. For the response generation task, the
general procedure was similar to that used in Damian and Bowers (2003). Participants were
asked to memorize four word pairs one by one, and then to produce the response word when
the cue word was presented. Each trial was constructed as follows: first, a fixation cross was
presented for 500 ms; after a blank interval of 500 ms, the cue word appeared for 1,500 ms.
Participants’ task was to say the response word as accurately and quickly as possible.
Each response word was judged by the experimenter to be either correct or incorrect.
Table 1 Experiments 1–3: mean response times (RT, in ms) and mean error percentages (PE, in %, standard
deviation in parentheses), by context (heterogeneous, homogeneous, and inconsistent)
RT Effect PE Effect
* p < .05, **p < .01, *** p < .001
Each trial was followed by a 1,000 ms intertrial interval. The entire response generation
experiment lasted approximately 25 min.
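
The block structure and trial timing of the response generation task can be checked with a short calculation. The sketch below uses only the figures given in the text (12 blocks, three stimuli, seven repetitions, and the stated durations); the constant names are hypothetical, and it assumes that every trial runs for the full 1,500 ms cue window with no additional delays, so the resulting duration should be read as a rough lower bound on the test phase only.

```python
# Design figures taken from the Method section; names are illustrative.
BLOCKS = 12
STIMULI_PER_BLOCK = 3
REPETITIONS_PER_STIMULUS = 7

# Trial events in milliseconds, as described in the Procedure.
FIXATION_MS = 500
BLANK_MS = 500
CUE_MS = 1500            # assumption: the full cue window elapses on every trial
INTERTRIAL_MS = 1000

trials_per_block = STIMULI_PER_BLOCK * REPETITIONS_PER_STIMULUS
total_trials = BLOCKS * trials_per_block
trial_ms = FIXATION_MS + BLANK_MS + CUE_MS + INTERTRIAL_MS
test_phase_min = total_trials * trial_ms / 60_000

print(f"{trials_per_block} trials per block, {total_trials} trials in total")
print(f"{trial_ms} ms per trial, about {test_phase_min:.1f} min of test trials")
```

Under these assumptions the test trials account for somewhat less than the reported session length, which leaves room for the learning phase that precedes each block.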
For the word reading task, participants were instructed to read aloud, on each trial, the
written word presented on the screen. The structure of a trial was as follows. A fixation cross
was first presented for 500 ms; after a blank interval of between 500 ms and 1,200 ms,
the written word appeared for 1,000 ms. Participants’ task was to read the written
word aloud as quickly and accurately as possible. Each response was judged by the experimenter
to be either correct or incorrect. Each trial was followed by a 1,000 ms intertrial interval. The
entire word reading experiment lasted approximately 12 min.
and those exceeding 2.5 standard deviations (SD) from the mean were also excluded from
tively. For word reading, response latencies longer than 1,000 ms or shorter than 100 ms and
exceeding 2.5 SD from the mean were excluded from data analysis. The above three criteria
latencies and error percentages in different conditions.
ANOVAs were conducted on the response latencies and error percentages for each task
separately, in which context (homogeneous vs. heterogeneous vs. inconsistent) was treated
as a within-participants and within-items variable.
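
The latency trimming described for each task can be sketched as follows. This is a minimal illustration rather than the authors' analysis script: the function name `trim_latencies` and the sample latencies are hypothetical, and the sketch assumes that the 2.5 SD criterion is applied to the latencies that survive the absolute cutoffs, pooled over a single list. The text does not state whether the mean and SD were computed per participant or per condition.

```python
from statistics import mean, stdev


def trim_latencies(rts, lower_ms, upper_ms, sd_cutoff=2.5):
    """Drop latencies outside [lower_ms, upper_ms], then drop those more than
    `sd_cutoff` standard deviations from the mean of the remaining values."""
    in_range = [rt for rt in rts if lower_ms <= rt <= upper_ms]
    if len(in_range) < 2:
        # A standard deviation cannot be computed from fewer than two values.
        return in_range
    centre = mean(in_range)
    spread = stdev(in_range)
    return [rt for rt in in_range if abs(rt - centre) <= sd_cutoff * spread]


# Hypothetical word reading latencies (ms); cutoffs of 100 and 1,000 ms
# follow the word reading criteria given in the text.
sample = [50, 400, 420, 410, 430, 415, 405, 425, 395, 435, 900, 1200]
kept = trim_latencies(sample, lower_ms=100, upper_ms=1000)
print(len(sample) - len(kept), "of", len(sample), "latencies removed")
```

Applying the absolute cutoffs first keeps extreme values from inflating the standard deviation used in the second step; whether the original analysis followed this order is not stated.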
In the response generation task, the main effect of context was significant, F1(2,46) =
6.74, MSE = 35,920, p < .01; F2(2,22) = 25.34, MSE = 5,211, p < .01. Bonferroni-corrected
multiple comparisons between the three conditions showed that the 31 ms
.05). The heterogeneous and the inconsistent conditions differed significantly from each
other (p < .05). By contrast, the difference between the homogeneous and the inconsistent
condition was not significant. Parallel analyses conducted on error percentages yielded
no significant effect of context, F1(2,46) = 1.60, MSE = 32, p = .213; F2(2,22) =
1.94, MSE = 4, p = .17. Bonferroni-corrected multiple comparisons between the three
conditions were not significant (all ps ≥ .922).
In the word reading task, the main effect of context was significant, F1(2,46) =
35.81, MSE = 71,652, p < .001; F2(2,22) = 92.28, MSE = 9,101, p < .001.
Bonferroni-corrected multiple comparisons between the three conditions showed that the
55 ms difference between the homogeneous and the heterogeneous conditions was significant
(p < .001). The heterogeneous and the inconsistent conditions (34 ms difference) differed
significantly from each other (p < .001). Importantly, the 21 ms difference between the
homogeneous and the inconsistent conditions was significant (p < .01), suggesting an effect
fect of context, F1(2,46) = 21.508, MSE = 449, p < .001; F2(2,22) = 12.11, MSE =
56, p < .01. Bonferroni-corrected multiple comparisons between the three conditions showed a
significant difference between the heterogeneous and the homogeneous condition (p < .001),
and between the heterogeneous and the inconsistent condition (p < .001). By contrast, the
difference between the homogeneous and the inconsistent conditions was not significant
(p = 1.000).
In the response generation task, the results displayed very similar priming for the homoge-
neous and the inconsistent, relative to the heterogeneous, condition. That is, sets in which
response words shared syllable and tone, and additionally orthography, did not receive any
activated in word production.
For word reading, we found robust orthographic and phonological priming effects. The
priming effect in the inconsistent condition (34 ms) suggests clear response-related phono-
logical benefits when responses share word-initial form properties. More importantly, the
priming effect in the homogeneous condition (55 ms) was significantly larger, highlighting
the fact that word-initial orthographic overlap has a beneficial effect on response latencies if
the task renders spelling properties relevant. The combined results (a clear effect of
orthography in word reading, but none in response generation) make it clear that such effects
can in principle emerge in tasks of this type, but they evidently do not appear if the task does not
require access to orthographic properties (see also Bi et al. 2009; Roelofs 2006).
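
The relation between the three word reading effects can be spelled out in a short sketch. The condition means below are hypothetical placeholders chosen only so that the differences match those reported in the text (55 ms and 34 ms relative to the heterogeneous condition); they are not the values from Table 1, and the function name `priming_effects` is likewise an assumption made for illustration.

```python
def priming_effects(mean_rt):
    """Return priming effects (ms) relative to the heterogeneous baseline.

    Positive values indicate faster responses than in the heterogeneous
    condition. The orthographic contribution is the additional benefit of
    sharing the initial character on top of sharing the initial syllable.
    """
    baseline = mean_rt["heterogeneous"]
    homogeneous_effect = baseline - mean_rt["homogeneous"]
    inconsistent_effect = baseline - mean_rt["inconsistent"]
    return {
        "homogeneous": homogeneous_effect,
        "inconsistent": inconsistent_effect,
        "orthographic_contribution": homogeneous_effect - inconsistent_effect,
    }


# Placeholder means (ms): NOT the published values, only the differences are.
hypothetical_means = {"heterogeneous": 600, "inconsistent": 566, "homogeneous": 545}
effects = priming_effects(hypothetical_means)
print(effects)
```

The orthographic contribution computed this way corresponds to the 21 ms difference between the homogeneous and inconsistent conditions reported for word reading.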
It may be noted, however, that in the homogeneous condition (O+P+), a confound may
et al. (2002) Experiment 1, compared priming in a condition in which response words shared
initial syllable and tone, with priming in a condition in which response words additionally
imply the same underlying morpheme, and hence potential overlap in meaning. Semantic
relatedness between words in the response generation task tends to have an inhibitory influ-
orthography. On the other hand, Bi et al. (2009) found a small inhibitory effect in the ortho-
task. This finding possibly indicates weak orthographic activation in Chinese spoken word production.
However, because the observed inhibition was not significant, the data of Bi et al. are incon-
effect in Chinese spoken word production with the response generation task. Response words
in the homogeneous condition were chosen to have leftmost characters which were ortho-
graphically highly similar to each other, yet their corresponding phonological and semantic
properties were entirely dissimilar, hence orthographic similarity could be investigated in
isolation from other variables.
Participants. Thirty-two undergraduate students from the same population as in Experiment
1 were paid for their participation. None had been in the earlier experiment.
Materials and Design. Four sets of word pairs were prepared for this experiment. Each
set consisted of three word pairs, and each pair was made up of two two-character words.
In the homogeneous condition, the first character of all response words in a set contained a
similar visual pattern. For example, for lang2-bei4 (embarrassing), niang2-jia1 (a married
woman’s parents’ home), and hen3-du2 (acridity) in a set, all the initial characters
share orthographic properties. For the heterogeneous condition, the four sets were
constructed by interleaving the word pairs from the homogeneous sets so that the first char-
response words in a set were avoided. Across all eight sets, each item was selected once;
hence, the same cue-response word pairs were tested in both conditions, and only the context
in which they appeared was varied. Thus, the independent variable was context (homoge-
neous and heterogeneous). All materials for this experiment are shown in Appendix B. We
assessed the degree of orthographic relatedness among the initial characters of the response
age ranged from 18 to 22 years old) from Beijing Forestry University participated in this rat-
ing (they did not take part in the experiment reported below). Target word pairs were rated
on a 9-point scale, with 1 indicating characters which were totally different in orthographic
features and 9 indicating characters which were orthographically identical. The mean value
relatedness among initial characters of response words.
Each participant completed all 8 blocks. The order in which the sets were presented
was as follows: Conditions were rotated from block to block in a particular order (i.e., block
1: homogeneous, block 2: heterogeneous) which was determined by a Latin Square of size
two such that 16 of the 32 participants received a particular order. Furthermore, the order
of the four sets for each type of condition across the experiment was determined by a Latin
Square of size four such that eight participants received a particular order. Each block con-
sisted of 10 repetitions of each of the three stimuli, producing a total of 30 trials. Hence, each
participant received 240 trials in total. Word pairs in learning and cue words in testing were
Procedure and Apparatus. These were identical to Experiment 1.
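
The counterbalancing of set order in this experiment can be illustrated with a small sketch. The text specifies a Latin square of size four with eight participants per order, but not which square was used; the cyclic square below, the function names, and the rule assigning participants to rows are therefore assumptions made for illustration only.

```python
from collections import Counter


def cyclic_latin_square(n):
    """Build an n x n cyclic Latin square: row r is the sequence
    r, r+1, ..., r+n-1 (mod n), so each set index occurs once in
    every row and once in every column."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def assign_set_orders(n_participants, n_sets):
    """Assign each participant one row of the square, cycling through rows
    (an assumed assignment rule; the original procedure is not described)."""
    square = cyclic_latin_square(n_sets)
    return {p: tuple(square[p % n_sets]) for p in range(n_participants)}


orders = assign_set_orders(n_participants=32, n_sets=4)
participants_per_order = Counter(orders.values())
print(participants_per_order)

# Trial count implied by the design: 8 blocks x 3 stimuli x 10 repetitions.
total_trials = 8 * 3 * 10
print(total_trials)
```

With 32 participants and four rows, each set order is used by eight participants, in line with the description above.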
The same three criteria for data treatment as in Experiment 1 were applied, removing data
accounting for 2.60, 0.30, and 5.21% of the data, respectively. ANOVAs conducted on the
response latencies, with context (homogeneous vs. heterogeneous) as a within-participants
and within-items variable, showed a significant result, F1(1,31) = 6.27, MSE = 7,140, p <
.05; F2(1,11) = 7.77, MSE = 504, p < .05, suggesting slightly longer response times in
the homogeneous than in the heterogeneous condition. Parallel analyses conducted on error
percentages showed no significant effect of context, F1(1,31) = 1.25, MSE = 7, p =
.272; F2(1,11) = 0.7, MSE = 0.7, p = .42.
than in the heterogeneous condition—shared orthographic properties tended to delay re-
sponses, compared to the unrelated condition. Based on previous findings we would have
predicted either a facilitatory, or no effect, of orthographic overlap. Despite the fact that the
polarity of the effect is reversed compared to the typical effects of form overlap found in the
response generation task, this finding may indicate some degree of orthographic activation
in Chinese spoken word production.
with orthographic overlap are more difficult to distinguish from each other than dissimilar
An alternative account would attribute the orthographic effect to a specific property of
the response generation task, specifically that it might have its origin at the memorization
stage. Alario et al. (2007) previously suggested that the orthographic consistency effect
shown in Damian and Bowers (2003) may have its locus in word-pair learning, rather than
in word production proper, if participants use an orthographic code to improve memorization
while learning the prompt-response associations. Damian and Bowers tested, and to
some extent discounted, this possibility by conducting an additional study in which
prompt-response pairs were read aloud by the experimenter in the memorization phase (rather
than visually presented on the screen), and prompt words were presented in spoken format
in the experiment. Even here, an effect of orthography was found. Notwithstanding these
memorization phase, encourages orthographically based effects which would not otherwise
For this reason, the following and final experiment was identical in all relevant aspects
to Experiment 2, except that the stimuli were presented auditorily throughout. If the ortho-
graphic effect found in the earlier experiment is localized at the language production stage,
we should find it in this new experiment as well; by contrast, if it is mainly caused by visual
presentation of the stimuli (in the memorization phase, and/or during the experiment), then
we expect the effect to disappear.
Participants. Twenty-four undergraduate students from the same population as in the earlier
experiments were paid for their participation. None had been in the earlier experiments.
Materials, Design, and Apparatus. These were identical to Experiment 2, except for the
following difference: the associative pairs were recorded by a male speaker and digitized
with a sampling frequency of 22,050 Hz. During the experiment, prompt words were presented
to participants at a comfortable volume level via headphones, which were connected
to a computer.
response pairs in the learning stage, nor the prompt words in the experimental blocks. Before
each experimental block, the prompt-response word pairs were presented auditorily and
repeated until participants indicated that they had memorized them and were ready for the
Response latencies longer than 1,500 ms or shorter than 300 ms and those exceeding 2.5 SD
from the mean were also excluded from data analysis. The above three criteria account for
error percentages, suggesting very similar latencies in the homogeneous and heterogeneous
condition. ANOVAs conducted on the response latencies showed no significant main effect
of context, F1(1,23) < 1, MSE = 145, p = .829; F2(1,11) < 1, MSE = 0, p = 1.0, and
Compared to Experiment 2, response latencies were slowed in this experiment, which is
likely due to processing speed differences between the visually and auditorily presented
prompt words. More importantly, the inhibitory effect of orthographic similarity found in
the earlier experiment disappeared in the present study. The only difference between the two
experiments was in the presentation format of prompt-response pairs in the memorization
this change in presentation format discouraged participants from applying orthographically
based strategies which were present in Experiment 2. In any case, the results cast doubt on
Three experiments investigated the role of orthographic information in spoken word
production in Chinese, a language with a deep orthography. In an adaptation of the response
generation task, participants produced response words when they saw or heard associated
prompt words. The results can be summarized as follows: (1) a reliable orthographic effect
was found in an oral reading task, but none in a response generation task (Experiment 1),
inhibitory effect of matching orthography was found with visually presented prompt stimuli
when reducing the possible semantic confound with orthography among response words in
the homogeneous condition (Experiment 2); (3) the orthographic effect disappeared when
stimuli were presented auditorily (Experiment 3), indicating that the orthographic inhibition
words, rather than the stage of word production.
On a general level, our findings with the response generation task are consistent with those
transparent ones. In the studies reported here, the absence of an orthographic effect in Chi-
nese, a language with non-transparent mapping between orthography and phonology, makes
it rather unlikely that spelling-to-sound transparency is a relevant variable in the response
In Experiments 2 and 3, we compared visually and auditorily presented prompt stimuli.
In Experiment 2, orthographic similarity of response words generated a small but significant
orthographic effect, which contrary to our expectations was inhibitory. In contrast, no ortho-
graphic effect was present in Experiment 3. This implies that effects of orthography may
arise at the memorization stage, rather than the stage of speaking response words. In other
words, participants can use an orthographic code to improve recall during the memorization
stage of the response generation task. Alario et al. (2007) suggested that when participants
are asked to learn the association between cues and responses, they establish an episodic
subsequently delay memory retrieval, relative to heterogeneous sets in which no such cues
exist. Thus, an inhibitory orthographic effect is found with visual presentation. In
contrast, when the cue-response words were presented auditorily, the orthographic represen-
orthographic properties do not exert additional retrieval loads.
Damian and Bowers (2003) reported a facilitative effect even when the stimuli were
presented auditorily in their Experiment 3. Recent evidence suggests that orthographic infor-
et al. 2004; Pattamadilok et al. 2010, 2007), indicating that orthography and phonology may
be closely linked to each other in spoken word recognition. In addition, Pattamadilok et al.
(2007) found that orthographic information is activated in spoken word recognition in
it is possible that orthographic information is activated even when the words are presented
However, the situation in Chinese is different. There are two possibilities which could
explain the absence of an orthographic effect in the auditory context. One, Chinese spelling
is non-transparent, and multiple words can correspond to a single pinyin form; thus it is difficult to
lun4 as an example, there are two possible disyllabic words that can match it:
in English) and
(prodromes in English). In the auditory context, participants probably
memorized only the sound pairs and did not activate orthographic information.
Damian and Bowers (2009) demonstrated that, in the picture-word interference task,
orthographic activation is absent with auditory distractor words, although it is present
with visual distractor words. In contrast, a few behavioral studies show that orthographic information is
activated automatically if a Chinese word is presented visually (e.g., Wong and Chen 1999).
This may be one reason for orthographic activation in the visual context, but not in the
asymmetry in strength of connections from semantics (when the response word is selected)
to orthography versus semantics to phonology. In terms of an asymmetry in the strength of
connections, it is likely that the connections from orthography to phonology are much
stronger than those from phonology to orthography, which in turn is rooted in the difference
between learning to read and learning to write. When learning to read, one is required
for reading simply make use of the existing spoken word recognition system via the recoding
of orthography into phonology (Frost 1998; Van Orden 1991). When learning to write in
Mainland China, teachers ask students to write characters in elementary school first, and then
to learn the corresponding phonetic pinyin (Zhang and Weekes 2009). On the other hand,
learning to read usually occurs 3–4 years prior to learning to write during the period of
a child’s language acquisition. Therefore, the connection from orthography to phonology is
used more frequently than the one from phonology to orthography. Furthermore, Holcomb et al.
(2005) observed that the priming effect with visual primes and auditory targets is much stronger than
the one with auditory primes and visual targets, which provides evidence for this possibility.
In Experiment 1, the orthographic manipulation strongly affected a word reading task.
Because participants necessarily use orthographic information to recognize the character,
and subsequently to pronounce it, it is not overly surprising to find such effects in this type of
task. Indeed, our findings are generally consistent with previous studies in the literature on
Chinese word recognition. Several studies have demonstrated a reliable orthographic prim-
ing effect for compound words in Chinese (Liu and Peng 1997; Peng et al. 1999; Taft et al.
1999; Taft and Zhu 1995; Zhou and Marslen-Wilson 1995; Zhou et al. 1999). Roelofs (2006)
demonstrated that the spelling of a word plays a role when it is relevant for the task, i.e.,
orthographic information (i.e., picture naming or spoken word production with the response
generation paradigm), orthographic information is not activated.
The present findings in the word reading task suggest that orthography plays a role in Chinese
word reading. We compared an orthographically and phonologically related condition
(O+P+) with a phonologically related condition (O−P+), and found a facilitative effect of
orthographic consistency when the conditions also shared a phonological relationship, which is
generally consistent with Roelofs’s (2006) results in English. However, Schiller (2007) investigated
the orthographic and phonological contribution of visually masked primes to reading
aloud in Dutch. He compared O+P− and O−P− conditions and found that orthographic
overlap did not yield any facilitative or inhibitory effect. Schiller concluded that orthography
be a result of differences in transparency among languages (see also Schiller 2007): Dutch
has a relatively shallow orthography, whereas English and Chinese have relatively
deep orthographies.
On the other hand, the absence of an orthographic effect in Dutch word reading might be related to the
conditions Schiller (2007) used. Similar to Schiller’s study, Bi et al. (2009) compared an
orthographically related condition (O+P−) with a condition with neither orthographic
nor phonological overlap (O−P−), and found a purely orthographic inhibitory effect
in Chinese, which is consistent with the results of Glushko (1979) in English word reading,
but inconsistent with the results of Shen and Forster’s (1999) study in Chinese. Shen
and Forster (1999) found that orthographically similar primes produced strong and robust
facilitative effects in naming and lexical decision in Chinese despite the fact that they were
phonologically unrelated to the target characters. The purely orthographically based (null or
inhibitory) effect clearly needs to be further investigated.
In the Introduction it was mentioned that the picture-word interference task may provide
evidence regarding a potential role of orthography in spoken production. Indeed, a few stud-
ies found orthographic facilitation effects in Chinese spoken production with this technique
which were independent of phonological overlap (Weekes et al. 2002; Zhang et al. 2009;
Zhang and Weekes 2009). However, these studies used visually presented distractor words;
thus, it is possible that the occurrence of orthographic effects can be attributed to processes
taking place during the visual processing of the distractor (see Damian and Bowers 2009).
It is therefore urgent to conduct picture-word interference tasks with auditorily presented
stimuli in Chinese to further illuminate this issue.
Looking at the issue of the interplay between orthography and phonology from a larger
tion is large and growing (see “Introduction”) yet no equivalent exists in speech production,
with the retrieval of phonological codes in speaking seemingly unaffected by the spelling
closely shared, between language input and output tasks (e.g., Martin and Saffran 2002), it is
less than obvious why input tasks involving phonology should mandatorily activate orthog-
raphy, yet output tasks involving phonology apparently do not. A tentative account hinges on
the particular demand characteristics of the respective tasks: in speech perception, the aim
is to disambiguate a noisy input signal, and orthographic codes are apparently recruited in
order to accomplish the task. By contrast, in speech production there is little or no ambiguity
with regard to a word’s form property once its lexical-semantic identity has been established,
that perception and production are supported by the same or similar underlying representa-
To conclude, the present study shows that the orthographic effect is related to whether or
not orthography is relevant for the task used, and, in turn, to the correspondence between
orthography and phonology of a language. Using the response generation task, no orthographic
effect was obtained in Chinese, a language with a non-transparent and non-alphabetic
orthography. The orthographic inhibition effect found with the response generation procedure
in Experiment 2 may arise at the stage of cue-response word association. By
contrast, an orthographic effect was obtained in an oral reading task, suggesting that orthography
plays a role only when it is relevant for the task at hand. Compared to the earlier
studies in Dutch and English, the present study suggests that the orthographic effect is tied
to a language’s overall degree of spelling-to-sound consistency in the orthography relevant
This research was supported by Grants from the National Natural Science Foundation
of China (30870761, 31170977) and Excellent Associate Professor Foundation in Institute of Psychology
(09CX232023) to Qingfang Zhang, and an International Incoming Fellowship (IIF-2007/R1) from The Royal
Society to Qingfang Zhang and Markus Damian.
Stimuli Used in Experiment 1
Set 1: ci2-dai4 (
Set 2: ci2-cheng2 (
Set 3: sheng4-fu4 (
Set 4: sheng4-xia4 (
, tape), ci2-xian4 (, the name of a place), ci2-xin1 ( , magnetic
, resignation), ci2-dian3 (, dictionary), ci2-rang4 (, politely
, victory or defeat), sheng4-si1 (
, be competent for)
, a very hot summer), sheng4-dian3 (
, the name of a place)
, be better than), sheng4-ren4
, grand ceremony),
Set 1: ci2-dai4 (
Set 2: ci2-cheng2 (
Set 3: sheng4-fu4 (
Set 4: sheng4-xia4 (
, tape), sheng4-si1 (
, resignation), sheng4-dian3 (
, politely decline)
, victory or defeat), ci2-xian4 (
, be competent for)
, a very hot summer), ci2-dian3 (
, the name of a place)
, better than), ci2-xin1 ( , magnetic core)
, grand ceremony), ci2-rang4
, the name of a place), sheng4-
, dictionary), sheng4-jing1
Set 1: ci2-cheng2 (
Set 2: ci2-dai4 (
Set 3: sheng4-xia4 (
Set 4: sheng4-fu4 (
, resignation), ci2-xian4 (, the name of a place), ci2-xin1 (,
, tape), ci2-dian3 (
, a very hot summer), sheng4-si1 (
, be competent for)
, victory or defeat), sheng4-dian3 (
, the name of a place)
, dictionary), ci2-rang4 ( , politely decline)
, better than), sheng4-ren4
, grand ceremony), sheng4-
Stimuli Used in Experiments 2 and 3
Set 1: chuang2-pu1 (
Set 2: lang2-bei4(
Set 3: guo1-lu2(
Set 4: zhu1-rou4 (
, bed), qing4-he4 ( , congratulations), zhuang1-jia1 (,
, pork), du3-bo2 (
,a married woman’s parent’s home),
,an apple of discord),wo1-niu2(
, gambling), xu4-lun4 (
Set 1: lang2-bei4 (
, discomposure), huo4-gen1 (
, an apple of discord), xu4-lun4
Set 2: qing4-he4 (
Set 3: zhuang1-jia1 (
Set 4: chuang2-pu1 (
, congratulations), wo1-niu2 (
, emblements), hen3-du2 (
, bed), niang2-jia1 (
, snail), du3-bo2 (
, acridity), zhu1-rou4 (
, a married woman’s parent’s home),
Alario, F.-X., Perre, L., Castel, C., & Ziegler, J. C. (2007). The role of orthography in speech production
revisited. Cognition, 102, 464–475.
Bi, Y., Wei, T., Janssen, N., & Han, Z. (2009). The contribution of orthography to spoken word production:
Evidence from Mandarin Chinese. Psychonomic Bulletin & Review, 16(3), 555–560.
Chen, T. M., & Chen, J. Y. (2006). Morphological encoding in the production of compound words in
Mandarin Chinese. Journal of Memory and Language, 54, 491–514.
Chen, T. Y., Chen, T. M., & Dell, G. S. (2002). Word-form encoding in Mandarin as assessed by the
implicit priming task. Journal of Memory and Language, 46, 751–781.
Chéreau, C., Gaskell, M. G., & Dumay, N. (2007). Reading spoken words: Orthographic effects in auditory
priming. Cognition, 102, 341–360.
Damian, M. F., & Bowers, J. S. (2003). Effects of orthography on speech production in a form-preparation
paradigm. Journal of Memory and Language, 49, 119–132.
Evidence from picture-word interference tasks. European Journal of Cognitive Psychology, 22, 1–11.
Dijkstra, T., Roelofs, A., & Fieuws, S. (1995). Orthographic effects on phoneme monitoring. Canadian
Journal of Experimental Psychology, 49, 264–271.
Donnenwerth-Nolan, S., Tanenhaus, M. K., & Seidenberg, M. S. (1981). Multiple code activation in
word recognition: Evidence from rhyme monitoring. Journal of Experimental Psychology: Learning,
Memory and Cognition, 7, 170–180.
Frost, R. (1998). Toward a strong phonological theory of visual word recognition: True issues and false
trails. Psychological Bulletin, 123, 71–99.
Glushko, R. J. (1979). The organization and activation of orthographic knowledge in reading aloud. Journal
of Experimental Psychology: Human Perception & Performance, 5, 674–691.
Hallé, P. A., Chéreau, C., & Segui, J. (2000). Where is the /b/ in “absurde” [apsyrd]? It is in French
listeners’ minds. Journal of Memory and Language, 43, 618–639.
Holcomb, P. J., Anderson, J., & Grainger, J. (2005). An electrophysiological study of cross-modal repetition
priming. Psychophysiology, 42, 493–507.
Jakimik, J., Cole, R. A., & Rudnicky, A. I. (1985). Sound and spelling in spoken word recognition. Journal
of Memory and Language, 24, 165–178.
Johnston, M., McKague, M., & Pratt, C. (2004). Evidence for an automatic orthographic code in the
processing of visually novel word forms. Language and Cognitive Processes, 19, 273–317.
Liu, Y., & Peng, D.-L. (1997). Meaning access of Chinese compounds and its time course. In
H.-C. Chen (Ed.), Cognitive processing of Chinese and related Asian languages (pp. 219–232). Hong
Kong: Chinese University Press.
Liu, Y. (1990). Modern Chinese Frequency Dictionary of Common Words. Beijing: Yu Hang Press.
Lupker, S. J. (1982). The role of phonetic and orthographic similarity in picture-word interference. Canadian
Journal of Psychology, 36, 349–367.
Martin, N., & Saffran, E. M. (2002). The relationship of input and output phonological processing: An
evaluation of models and evidence to support them. Aphasiology, 16, 107–150.
Meyer, A. S. (1990). The time course of phonological encoding in language production: The encoding
of successive syllables. Journal of Memory and Language, 29, 524–545.
Meyer, A. S. (1991). The time course of phonological encoding in language production: Phonological
encoding inside a syllable. Journal of Memory and Language, 30, 69–89.
Muneaux, M., & Ziegler, J. C. (2004). Locus of orthographic effects in spoken word recognition: Novel
insights from the neighbor generation task. Language and Cognitive Processes, 19, 641–660.
Pattamadilok, C., Lafontaine, H., Morais, J., & Kolinsky, R. (2010). Auditory word serial recall benefits
from orthographic dissimilarity. Language and Speech, 53, 321–341.
Pattamadilok, C., Perre, L., Dufau, S., & Ziegler, J. C. (2009). On-line orthographic influences on spoken
language in a semantic task. Journal of Cognitive Neuroscience, 21, 169–179.