
Production of English interdental fricatives by Dutch, German, and English speakers

Adriana Hanulíková & Andrea Weber
Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
{A.Hanulikova; A.Weber}
Non-native (L2) speakers of English often experience difficulties in producing English interdental fricatives
(e.g. the voiceless [θ]), and this leads to frequent substitutions of these fricatives (e.g. with [t], [s], and [f]).
Differences in the choice of [θ]-substitutions across L2 speakers with different native (L1) language
backgrounds have been extensively explored. However, even within one foreign accent, more than one
substitution choice occurs, but this has been less systematically studied. Furthermore, little is known about
whether the substitutions of voiceless [θ] are phonetically clear instances of [t], [s], and [f], as they are often
labelled. In this study, we attempted a phonetic approach to examine language-specific preferences for [θ]-
substitutions by carrying out acoustic measurements of L1 and L2 realizations of these sounds. To this end,
we collected a corpus of spoken English with L1 speakers (UK-English), and Dutch and German L2
speakers. We show a) that the distribution of differential substitutions using identical materials differs
between Dutch and German L2 speakers, b) that [t,s,f]-substitutes differ acoustically from intended [t,s,f],
and c) that L2 productions of [θ] are acoustically comparable to L1 productions.
Keywords: segmental substitutions, interdental fricatives, Dutch, German, English.
One characteristic feature of speech produced by L2 speakers is its accent. Foreign accents result from a
combination of subphonemic, segmental, and suprasegmental deviations from the target language. A
common phenomenon at the segmental level is substitution, by which we mean the replacement of a specific
L2 phoneme by another phoneme, usually one that occurs in the native phoneme inventory of the speaker.
Substitutions can result, for example, from the lack of a native counterpart for a given L2 phoneme, and are
often subject to variation, such as in the L2 production of English interdental fricatives. Since phoneme
inventories of most European languages lack interdental fricatives, many L2 speakers of English have
difficulties producing them correctly and often substitute them. German and European-French learners of
English, for example, often replace the voiceless interdental fricative [θ] with [s], while Dutch and Canadian-
French speakers are reported to prefer [t] (for an overview, see Brannen 2002). Phoneme-identification
studies show that [θ] is perceptually most often confused with the acoustically similar [f] by native as well as
by various L2 listeners, and less frequently confused with [t] or [s] (Brannen 2002; Cutler et al. 2004;
Hancin-Bhatt 1994a; Miller and Nicely 1955; Tabain 1998). Given the acoustic similarity with [f], it is rather
surprising that [f] is not the most common substitution in English L2 speech, not even when [f] is available
in the L1 phoneme inventory of the L2 speakers. Note that substitutions of voiceless (and voiced) fricatives
are not restricted to L2 speech; they also occur in dialects of English, with reported instances of [f] in
Cockney (Wells 1982), and of [t] in Irish English (Hickey 2004). However, in contrast to L2 studies, the
production frequencies of these dialectal substitutions across L1 speakers are seldom systematically studied
or reported (see McGuire 2003).
Prior research has explored the causal relationship of variation in [θ]-substitutions across L2 learners with
different L1 backgrounds, and has focused on the dissociation between perception and production (e.g.
Brannen 2002; Hancin-Bhatt 1994b; Teasdale 1997), on phonological theories, universal factors, and
language acquisition models (e.g. Flege and Davidian 1984; Picard 2002; Weinberger 1994; Westers et al.
2007). However, there does not appear to be a simple answer to the question of why certain substitutes are
chosen. While the phonological structure of the L1 is certainly an important factor in explaining different
substitutions, other factors such as word-dependent characteristics, social factors, and varying teaching
curricula probably influence L2 production as well.
Interestingly, even within one foreign accent different substitution choices are made, but these have been
less systematically studied. Moreover, little is known about whether the substitutions are phonetically clear
instances of [t,s,f] as they are often labelled. The purpose of our study was therefore to answer the following
questions: a) what is the distribution of differential substitutions using comparable materials across L2 and
L1 speakers, b) how do [t,s,f]-substitutes differ acoustically from the intended [t,s,f], and c) how do L2
productions of [θ] compare acoustically to L1 productions. Here we attempt a phonetic approach to examine
language-specific preferences for [θ]-substitutions. A similar approach has been put forward by Teasdale
(1997), who proposed that articulatory properties of [s] in the L1 are the best predictor of whether [t] or [s]
would be chosen as the [θ]-substitute. Here we wish to elaborate on this idea by providing acoustic
measurements of the substitutions as well as a comparison of these measurements between L1 and L2
speakers. We chose Dutch and German L2 learners of English because the acoustic properties of both [s]
and [t] differ between their respective L1s. Dutch [s] is less articulatorily tense and has graver friction than
German [s] (Mees and Collins 1982), and [t] in initial position is aspirated in German but unaspirated in Dutch
(Keating 1984; Lisker and Abramson 1964). To this end, we collected a corpus of spoken English containing
UK-English L1 speakers, as well as Dutch and German L2 speakers of English. The two groups of learners
were selected because they not only differ in their predominant [θ]-substitutions (e.g. Westers et al. 2007 for
Dutch; Hancin-Bhatt 1994b for German), but also in fine-acoustic details in fricative and stop production
(Mees and Collins 1982; Rietveld and van Heuven 2001). The data obtained from the corpus were labelled
and categorized. Moreover, acoustic measurements were taken to compare L1 and L2 [θ]-realizations, and to
compare the L2 realizations of [θ]-substitutes [t,s,f] with intended [t,s,f]-realizations. In this study, only
word-initial sounds are considered.
Studies on the acoustics of English fricatives have shown mainly four parameters that can distinguish
fricatives: duration, spectral properties (e.g. centre of gravity, F2 onset, spectral peak location), amplitude
(overall and relative noise amplitude), and transitions from the fricative into a vowel (e.g. Hughes and Halle
1956; Strevens 1960; Jassem 1962). While these measures can distinguish [s] from [f] and [θ], it seems that
formant transitions and spectral peak location can provide additional information for less distinct fricatives
such as [f] and [θ] (e.g. Harris 1958, Jongman et al. 2000, Tabain 1998). Which measure is most suitable can
also depend on the use of real words versus syllables (e.g. Tabain 1998). The most informative cue for place
of articulation of plosives is the distribution of energy in the release burst (e.g. Stevens and Blumstein 1981),
but other cues such as formant transitions and spectral properties have also been reported (e.g. van Alphen
and Smits 2004, for Dutch). In the present paper, we restrict the analysis of fricative intervals and plosive
bursts to duration, centre of gravity (COG), and amplitude.
2.1. Materials and Procedure
A short story in English was constructed, containing numerous words with voiceless [θ] and words with [s],
[f], and [t]. Participants were asked to read the story aloud at a comfortable speaking rate. Stereo recordings
were made in a quiet room with a digital recorder at 44.1 kHz sampling rate with 16-bit resolution and were
later transferred to a computer. The left channel was extracted for further processing.
For the analysis, we selected 18 content words with the voiceless [θ] in word-initial position (13 different
words, occurring between 1 and 4 times in the story), and 10 [s]-, [f]-, and [t]-initial words each (in two cases
for [t] and [f], the phoneme occurred in a stressed syllable-initial position within a word; altogether 27
different words, occurring 1 to 2 times). These words and their target phonemes were manually annotated.
The spectrogram and the waveform were used to determine the onset and the cessation of the fricatives, and
the onset and the offset of the burst. All [θ]-instances were then categorized by two trained research
assistants (German learners’ data by two assistants with L1 German, and Dutch learners’ data by two
assistants with L1 Dutch). Whenever there was disagreement about the category of a particular token, the
categorization of a third coder (a trained phonetician) was decisive. English speakers’ data were labelled and
categorized by a trained phonetician and by one native speaker of English. In case of a disagreement, the
opinion of the English native speaker was decisive. Before carrying out the acoustic analysis, all critical
words were normalized for mean amplitude. Only [θ]-instances categorized as [t], [s], [f], or [θ] were
measured, excluding the few [θ]-instances that were not fully produced, unclear, or substituted with a sound
other than the above-mentioned substitutes. We used the PRAAT speech editor (Boersma 2001) to extract the
duration, the amplitude, and the COG for each token. The COG was weighted by the absolute spectrum
(p = 1). To calculate the average amplitude, the root-mean-square (RMS) method was used, that is, the
square root of the mean of the squared amplitudes of all sample points of the waveform.
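The two spectral and amplitude measures described above can be illustrated with a minimal sketch. It is not the PRAAT implementation used in the study: the function names, the naive discrete Fourier transform, and the synthetic 1000 Hz test tone are assumptions made purely for illustration. The sketch computes RMS as the square root of the mean squared sample value, and the COG as the mean frequency weighted by the absolute spectrum raised to the power p (here p = 1).

```python
import cmath
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def centre_of_gravity(samples, sample_rate, p=1.0):
    """Spectral centre of gravity in Hz, weighting each frequency by |X(f)|**p."""
    n = len(samples)
    weighted_sum = 0.0
    total_weight = 0.0
    # Naive DFT over the non-negative frequency bins (0 Hz up to Nyquist).
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        weight = abs(coeff) ** p
        weighted_sum += (k * sample_rate / n) * weight
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical test signal (not study data): a 1000 Hz sine sampled at 8000 Hz,
# covering exactly 8 full cycles so that all energy falls into one DFT bin.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(64)]

print(rms_amplitude(tone))
print(centre_of_gravity(tone, SAMPLE_RATE))
```

With p = 1 every frequency contributes in proportion to its spectral magnitude; a larger p would weight the strongest spectral components more heavily.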
2.2. Participants
The participants in the corpus study consisted of 37 native speakers of Dutch from the Radboud University in
Nijmegen in the Netherlands (mean age 21.5, SD 2.2), 37 native speakers of German from the University of
Cologne in Germany (mean age 22.5, SD 2.3; recordings of one participant were excluded due to technical
problems), and 31 native speakers of English from the University of Birmingham in England (mean age 19.4,
SD 1.1). All participants took part in exchange for payment. The L2 participants were highly proficient in
English. Dutch students had on average 7.6 years of formal English training, and German students had on
average 8.8 years of formal English training. In an English multiple-choice vocabulary test (including many
low-frequency words), Dutch students scored on average 83% correct, and German students scored on
average 79% correct (the difference in their scores did not reach significance). None of the German
participants had lived in the Netherlands, and none of the Dutch participants had lived in Germany.
3.1. Categorization results
The categorization results across all items in Table 1 show how often [θ] was produced correctly or
substituted with [s,t,f] or other phonemes (e.g. [tθ], [ts], [S], or unclear), listed for each speaker group
separately. The results show that all participants produced the English tokens with word-initial [θ] more
often correctly than with a substitution, and that, unsurprisingly, L1 speakers substituted less frequently than
L2 speakers. When comparing the two learner groups, German speakers produced significantly more words
with substitutions than Dutch speakers did.
Table 1: Percentages of [θ]-productions per speaker group (percentages rounded up; numbers of occurrences are in brackets).
Speakers   [s]         [t]         [f]        [θ]         others
Dutch      5% (30)     23% (155)   3% (17)    62% (412)   7% (47)
German     29% (187)   7% (43)     5% (34)    51% (323)   8% (49)
English    0% (1)      0% (0)      12% (63)   88% (463)   1% (4)
Within the substituted instances, a significant difference between German and Dutch speakers was found:
German learners predominantly substituted the English [θ] with [s] (71%, compared to 15% for Dutch
speakers), while Dutch speakers predominantly substituted [θ] with [t] (77%, compared to 16% for German
speakers). For both groups, the perceptually similar [f] occurred least frequently (13% for German speakers
and 8% for Dutch speakers). It is worth noting that, overall, substitutions did not seem to be word-dependent,
and that many participants produced more than one substitute type. For English speakers we found 12% of
[f]-substitutions, which were mainly driven by three speakers. When excluding these speakers, the number of
[f]-instances dropped to 5% and the number of [θ]-instances rose to 95%.
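The within-substitution percentages reported above can be recomputed from the occurrence counts in Table 1. The short sketch below assumes that the denominator contains only the [s]-, [t]-, and [f]-substitutes (the "others" column is left out), which is the reading under which the counts reproduce the reported shares; the variable and function names are illustrative.

```python
# Occurrence counts of the three substitutes, copied from Table 1.
SUBSTITUTION_COUNTS = {
    "Dutch": {"s": 30, "t": 155, "f": 17},
    "German": {"s": 187, "t": 43, "f": 34},
}


def substitution_shares(counts):
    """Return each substitute's share (in whole percent) of all [s,t,f]-substitutions."""
    total = sum(counts.values())
    return {segment: round(100 * n / total) for segment, n in counts.items()}


for group, counts in SUBSTITUTION_COUNTS.items():
    print(group, substitution_shares(counts))
```

Under this assumption the sketch gives 15%, 77%, and 8% for [s], [t], and [f] in the Dutch group, and 71%, 16%, and 13% in the German group, matching the figures in the text.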
3.2. Measurements
The figures below show the results of the acoustic measurements for the three speaker groups: duration
(Figure 1), RMS amplitude (Figure 2), and COG (Figure 3). To evaluate differences and similarities in the
obtained values, t-tests were conducted on the [θ]-realizations across the three speaker groups. Further t-tests
within each speaker group compared the accent-specific predominant substitutions ([t] for Dutch
speakers; [s] for German speakers) with the realizations of the intended [t], [f], and [s] within and
across L2 learners.
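As an illustration of this kind of group comparison, the following sketch computes a two-sample t statistic. The text does not state which t-test variant was used, so the choice of Welch's unequal-variance test is an assumption, and the duration values are hypothetical numbers invented for the example, not data from the corpus.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard errors of the two sample means.
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical [θ] durations in seconds (illustrative only, not study data).
l1_durations = [0.10, 0.12, 0.11, 0.13, 0.09]
l2_durations = [0.14, 0.15, 0.13, 0.16, 0.12]

t_value, dof = welch_t(l1_durations, l2_durations)
print(t_value, dof)
```

A p-value would then be read from the t distribution with the returned degrees of freedom, for example with a statistics package.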
Figure 1: Duration in seconds (s) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure: three panels (Dutch speakers, German speakers, English speakers). x-axis: segment type (f, f(th), s, s(th), t, t(th), th; no t(th) in the English panel). y-axis: Duration (s), 0.00 to 0.30.]
Figure 2: RMS in Pascal (Pa) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure: three panels (Dutch speakers, German speakers, English speakers). x-axis: segment type (f, f(th), s, s(th), t, t(th), th; no t(th) in the English panel). y-axis: RMS (Pa), 0.000 to 0.030.]
Figure 3: COG in Hertz (Hz) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure: three panels (Dutch speakers, German speakers, English speakers). x-axis: segment type (f, f(th), s, s(th), t, t(th), th; no t(th) in the English panel). y-axis: Centre of gravity (Hz), 2000 to 10000.]
The results showed that the [θ]-realizations of English L1 speakers differed from those of German L2
speakers in duration and RMS but not in COG. Dutch L2 speakers differed from English L1 speakers in
duration but not in RMS or COG. Differences in duration between L1 and L2 speakers are not surprising,
given that L2-speech rate is overall slower. Similarly, differences in amplitude could come about when L2
speakers encounter difficulties with a given speech sound and consequently lower their voice in amplitude.
Importantly, the COG values did not differ across the groups, suggesting some evidence for target-like
pronunciation of the English [θ] for L2 speakers. A comparison of the three measurements for [θ]-realization
between German and Dutch speakers did not show significant differences (however, COG showed a weak
tendency for a difference, p = .084). Further comparisons showed that [s]-realizations did not differ
between German and Dutch speakers, but the properties of [t] differed in all three measures. Given prior
studies, we expected a difference between German and Dutch realizations of [s]. Because of a weak tendency
for a difference in COG (p = .097), we examined this issue further by carrying out additional measurements
that can help distinguish small differences in articulation (see Jongman et al. 2000). We found that the
German [s] differed from the Dutch [s] in kurtosis, standard deviation, skewness, and central moments.
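The additional measures named above are spectral moments, which treat the normalized spectrum as a probability distribution over frequency. The following is a minimal sketch under that assumption; the function name, the use of excess kurtosis (fourth standardized moment minus 3), and the three-point toy spectrum are illustrative choices, not the procedure or data of the study.

```python
import math


def spectral_moments(freqs_hz, magnitudes):
    """Mean, standard deviation, skewness, and excess kurtosis of a spectrum."""
    total = sum(magnitudes)
    # Normalize magnitudes so that they sum to 1, like a probability distribution.
    probs = [m / total for m in magnitudes]
    mean = sum(f * p for f, p in zip(freqs_hz, probs))

    def central_moment(order):
        return sum(((f - mean) ** order) * p for f, p in zip(freqs_hz, probs))

    sd = math.sqrt(central_moment(2))
    skewness = central_moment(3) / sd ** 3
    excess_kurtosis = central_moment(4) / sd ** 4 - 3
    return mean, sd, skewness, excess_kurtosis


# Hypothetical symmetric toy spectrum (not study data).
toy_freqs = [1000.0, 2000.0, 3000.0]
toy_magnitudes = [1.0, 2.0, 1.0]

print(spectral_moments(toy_freqs, toy_magnitudes))
```

The first moment corresponds to the centre of gravity; the higher moments describe how widely the energy is spread around it, how asymmetric that spread is, and how peaked the spectrum is.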
A comparison of the intended [t,s,f] with the substitutes [t,s,f] was limited to the dominant substitutes
within an L2 group (this was due to an insufficient number of responses for less frequent substitutes). Within
the German group, [s]-substitutes differed from the intended [s]-realizations as well as from the correctly
pronounced [θ]-instances in all measures. Similarly, Dutch speakers’ [t]-substitutes differed from the
intended [t]-realizations in all measures, and from [θ]-realizations in duration and COG, but not in RMS.
This suggests that substitutions in L2 speech are on average not clear instances of [t,s,f], as they are often
labelled, nor are they clear instances of [θ].
The first question of this study concerned the distribution of substitution choices for the English voiceless
interdental fricative [θ] by L2 and L1 speakers. The categorization results confirmed previous findings for
L2 speakers: the dominant [θ]-substitute for German learners is clearly [s] while for Dutch learners it is [t]
(e.g. Westers et al. 2007; Hancin-Bhatt 1994b). However, all three substitutions [t,s,f] occurred in the L2
productions of both learner groups, and the substitutes were not word- or speaker-specific. Importantly, L2
speakers produced native-like realizations of the fricative [θ] more often than any of the dominant
substitutions. Since this probably depends strongly on the proficiency level of the L2 speakers, the numbers
could reverse with lower proficiency. In contrast to L2 speakers, L1 speakers of English substituted [θ] (if at
all) with [f]. It has been previously suggested that speakers of languages that articulate [s] further back
and/or have a dental [t] are very likely to substitute the English interdental fricative [θ] with [t] (Teasdale
1997). This would indeed support the Dutch preference for [t]-substitutes, because [s] is articulated further
back in Dutch than in German, whose speakers prefer [s]-substitutes. The present study further explored this
proposal and found acoustic differences in L2-production of both [t] and [s] between German and Dutch
speakers, supporting the phonetic explanation proposed by Teasdale (1997).
The second question concerned acoustic differences between [t,s,f]-substitutes and intended [t,s,f]. Given
the nature of the natural elicitation method, we restricted the analysis only to dominant substitutes within an
L2 group to ensure enough data points for a comparison. We found that not only did [t,s,f]-substitutes differ
from the intended [t,s,f], they also differed from the [θ]-realization within each L2 group. This suggests that
labelling conventions of [θ]-substitutions as [t,s,f] might not sufficiently characterize L2-productions, at least
concerning their acoustics. Rather, [θ]-substitutions seem to show gradient properties, exhibiting acoustic
properties that are often in between those of [θ]-realizations and [t,s,f]-realizations. However, perceptually
these substitutes could still be perceived as good exemplars of [t,s,f]. To answer this question, results from a
categorization experiment might be more telling, and we leave this issue to future studies.
The last question addressed an acoustic comparison of L2 and L1 [θ]-realizations. We found that German
speakers did not differ from Dutch speakers, but differed from the English speakers in RMS and duration.
Dutch speakers differed from the English speakers only in duration. Differences in amplitude and duration,
however, are not surprising when comparing non-native with native speakers. Importantly, both L2 groups
resembled the L1 group in the COG.
To conclude, this study showed that despite the difficulties that L2 speakers have with the English
fricative [θ], more than half of the produced instances were target-like. Acoustically, the [θ]-substitutions
were not clear instances of [t,s,f]. Articulatory differences between German and Dutch [t] and [s] were found
and show a promising (phonetic) approach to future investigations of differential substitutions in L2 speech.
This work was supported by the Max-Planck-Gesellschaft (MPG). We would like to thank Laurence
Bruggeman, Anne Blankenhorn, Sabrina Jung, Julia Lennertz, Simon Mack, Berit Meinert, Rachel Sheer,
and Karina Visser for their assistance, and Frank Eisner for comments on an earlier version of the paper.
Boersma, P. 2001. PRAAT, a system for doing phonetics by computer. Glot International 5. 341-345.
Brannen, K. 2002. The role of perception in differential substitution. Canadian Journal of Linguistics – Revue Canadienne de
Linguistique 47. 1–20.
Cutler, A., Weber, A., Smits, R., Cooper, N. 2004. Patterns of English phoneme confusions by native and non-native listeners.
Journal of the Acoustical Society of America 116(6). 3668-3678.
Flege, J.E., Davidian, R. 1984. Transfer and developmental processes in adult foreign language production. Journal of Applied
Psycholinguistic Research 5. 323-347.
Hancin-Bhatt, B.J. 1994a. Segment transfer: a consequence of a dynamic system. Second Language Research 10(3). 241-269.
Hancin-Bhatt, B.J. 1994b. Phonological Transfer in Second Language Perception and Production. Doctoral Dissertation, University
of Illinois.
Harris, K.S. 1958. Cues for the discrimination of American English fricatives in spoken syllables. Language and Speech 1. 1-7.
Hickey, R. (ed.). 2004. A Sound Atlas of Irish English. Berlin: Mouton de Gruyter.
Hughes, G.W., Halle, M. 1956. Spectral properties of fricative consonants. Journal of the Acoustical Society of America 28. 303–310.
Keating, P.A. 1984. Phonetic and phonological representation of stop consonant voicing. Language 60. 286-319.
Jassem, W. 1962. Noise spectra of Swedish, English, and Polish fricatives. Proceedings of the Speech Communication Seminar,
Stockholm, Royal Institute of Technology Speech Transmission Laboratory. 1–4.
Jongman, A., Wayland, R., Wong, S. 2000. Acoustic characteristics of English fricatives. Journal of the Acoustical Society of
America 108(3). 1252-1263.
Lisker, L., Abramson, A.S. 1964. A cross-language study of voicing in initial stops: acoustical measurements. Word 20. 384-422.
McGuire, G. 2003. The realization of interdental fricatives in Columbus, OH, AAVE. Paper presented at the Montreal-Ottawa-
Toronto Phonology Workshop. Retrieved February 2009 from]Handout.doc.
Mees, I., Collins, B. 1982. A phonetic description of the consonant system of Standard Dutch (ABN). Journal of the International
Phonetic Association 12. 2-12.
Miller, G.A., Nicely, P.E. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical
Society of America 27(2). 338-352.
Picard, M. 2002. The differential substitution of English /θ ð/ in French: The case against underspecification in L2 phonology.
Lingvisticæ Investigationes 25(1). 87–96.
Rietveld, A.C.M., van Heuven, V.J. 2001. Algemene Fonetiek. Bussum: Coutinho.
Stevens, K., Blumstein, S. 1981. The search for invariant acoustic correlates of phonetic features. In Eimas, P.D. and Miller, J.L.
(eds.), Perspectives on the study of speech. Hillsdale, NJ: Erlbaum. 1-38.
Strevens, P. 1960. Spectra of fricative noise in human speech. Language and Speech 3. 32–49.
Tabain, M. 1998. Non-sibilant fricatives in English: spectral information above 10 kHz. Phonetica 55. 107–130.
Teasdale, A.M. 1997. On the differential substitution of English [θ]: a phonetic approach. Calgary Working Papers in Linguistics 19.
van Alphen, P., Smits, R. 2004. Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: the role of
prevoicing. Journal of Phonetics 32. 455-491.
Weinberger, S.H. 1994. Theoretical Foundations of Second Language Phonology. Doctoral Dissertation, University of Washington.
Wells, J.C. 1982. Accents of English 2: The British Isles. Cambridge: Cambridge University Press.
Westers, F., Gilbers, D., Lowie, W. 2007. Substitution of dental fricatives in English by Dutch L2 speakers. Language Sciences 29.
... Thus, the substituted sound is typically acoustically and/or articulatorily most similar sound in the native language to one from the target language [1]. One of the most frequently analysed features in learners of English is the substitution of dental fricatives, which has been studied for a number of typologically different L1s [4,5,6,7,8,9,11,12,13,16,17]. For learners of English, this sound in particular is difficult as it is very rare: The World Atlas of Language Structures (WALS) indicates only 40 languages to have dental fricatives [2]. ...
... Figs. 1 (voiced) and 2 (voiceless) show the speakers' relative preferences, collapsed across phonological contexts. [9,16]. Thus, perhaps, Swiss German can be viewed as a *t+ languageor even as an *f+ language in line with [10]'s suggestion for a new category, this is particularly true for the voiced allophone. ...
... Thus, the substituted sound is typically acoustically and/or articulatorily most similar sound in the native language to one from the target language [1]. One of the most frequently analysed features in learners of English is the substitution of dental fricatives, which has been studied for a number of typologically different L1s [4,5,6,7,8,9,11,12,13,16,17]. For learners of English, this sound in particular is difficult as it is very rare: The World Atlas of Language Structures (WALS) indicates only 40 languages to have dental fricatives [2]. ...
... Figs. 1 (voiced) and 2 (voiceless) show the speakers' relative preferences, collapsed across phonological contexts. [9,16]. Thus, perhaps, Swiss German can be viewed as a *t+ languageor even as an *f+ language in line with [10]'s suggestion for a new category, this is particularly true for the voiced allophone. ...
Conference Paper
Full-text available
Anecdotally, it has been observed that Swiss Germans speaking English use a plethora of sounds for the dental fricatives /θ/ and /ð/. It is unsurprising that L2 speakers tend to substitute a sound not present in their native phoneme inventory with a sound that is present; however, there is wide intra-and inter-speaker variation in the sounds chosen to replace the dental fricatives. The present study is an initial examination of how speakers of Swiss German differ in their choice of sound substitution when speaking English. We recorded read speech from 45 high school students. Data was coded auditorily and acoustically. Findings confirm substantial variation between the learners, with the most common replacement being [d] for the voiced dental fricative and [f] for the unvoiced counterpart. We discuss potential reasons for the reported between-speaker variation.
... Comparable results have been found for other language pairs. For example, Mandarin speakers' productions of the English /θ/-/s/ and /ð/-/z/ contrasts are often misperceived by native English speakers (Hanulíková & Weber, 2010;Picard, 2002;Rau et al., 2009;Rogers & Dalby, 2005;Teasdale, 1997;Zhang & Xiao, 2014). Rogers and Dalby (2005) reported that /θ/ productions are misperceived as /s/ about 30% of the time, and /ð/ is misperceived as /z/ about 20% of the time. ...
The speech perception system adjusts its phoneme categories based on the current speech input and lexical context. This is known as lexically driven perceptual recalibration, and it is often assumed to underlie accommodation to non-native accented speech. However, recalibration studies have focused on maximally ambiguous sounds (e.g., a sound ambiguous between “sh” and “s” in a word like “superpower”), a scenario that does not represent the full range of variation present in accented speech. Indeed, non-native speakers sometimes completely substitute a phoneme for another, rather than produce an ambiguous segment (e.g., saying “shuperpower”). This has been called a “bad map” in the literature. In this study, we scale up the lexically driven recalibration paradigm to such cases. Because previous research suggests that the position of the critically accented phoneme modulates the success of recalibration, we include such a manipulation in our study. And to ensure that participants treat all critical items as words (an important point for successful recalibration), we use a new exposure task that incentivizes them to do so. Our findings suggest that while recalibration is most robust after exposure to ambiguous sounds, it also occurs after exposure to bad maps. But interestingly, positional effects may be reversed: recalibration was more likely for ambiguous sounds late in words, but more likely for bad maps occurring early in words. Finally, a comparison of an online versus in-lab version of these conditions shows that experimental setting may have a non-trivial effect on the results of recalibration studies.
... Some studies point towards greater and less systematic variation in foreign accents compared to regional accents, especially when a person uses some of the rules or sounds of another language. It is often acknowledged that the interaction between the segmental and suprasegmental systems results in higher and qualitatively different withinand between-speaker variation when the interacting languages/varieties have little or no phonological overlaps (e.g., Flege, 1988;Best, McRoberts, & Goodell, 2001;Hanulíková & Weber, 2010). Because of such differences, regional and foreign accents are usually treated in a separate fashion; nonetheless, they are not uniform entities. ...
Full-text available
This study assesses the effects of heterogeneous speech input on ratings of children's words and vowels. First, we examined whether exposure to a certain accent type or to different languages can predict accent categorization as standard, regional or foreign. Second, we examined how perceived accent strength of words and isolated vowels can be predicted by the amount of input children receive in distinct accents and languages and by lexical frequency. To this end, speech samples of 51 monolingual German children and simultaneous bilingual children speaking German and another language (mean age 9;9) were presented to 63 monolingual German adult raters. In Experiment 1, 31 raters assessed the category (standard, regional, foreign) and degree of accent of children's words, while in Experiment 2, 32 raters assessed vowels extracted from these words. The results show that an equal proportion of monolingual and bilingual children were categorized as having a Standard German accent. Children who were rated as foreign-accented were more likely to be bilingual, while children who were rated as regionally-accented were more likely to be monolingual. As predicted, foreign-accent strength increased with a greater amount of input in the other language of the bilinguals. Lexical frequency predicted accentedness ratings of vowels (but not of words). These findings show that not only other language input but also accented input matter when assessing perceived accent.
This study examines the L2 production of the Italian palatal lateral [ʎ] (e.g. ‹tovaglia›, ‘tablecloth’) and palatal nasal [ɲ] (e.g. ‹agnello›, ‘lamb’) by English-speaking learners. Four beginner English-native speakers, one advanced English-native speaker and two Italian-native speakers completed a picture-naming task, a reading task and a language background questionnaire. An auditory and an acoustic analysis were conducted, in which F1, F2, F3 and F4, and duration were measured. The results showed that both sounds are difficult for second language learners to acquire in a native-like manner. Moreover, each of these complex sounds may be produced as a sequence of two existing first-language sounds. Our findings have implications for L2 models of speech learning. We propose that a (marked) L2 sound may be produced as a sequence of existing L1 sounds.
Aims and Objectives: We compared speech accuracy and pronunciation patterns between early learners of English as a foreign language (EFL) with different language backgrounds. We asked (1) whether linguistic background predicts pronunciation outcomes, and (2) if error sources and substitution patterns differ between monolinguals and heterogeneous bilinguals. Methodology: Monolingual and bilingual 4th-graders (N = 183) at German public primary schools participated in an English picture-naming task. We further collected linguistic, cognitive and social background measures to control for individual differences. Data and Analysis: Productions were transcribed and rated for accuracy and error types by three independent raters. We compared monolingual and bilingual pronunciation accuracy in a linear mixed-effects regression analysis controlling for background factors at the individual and institutional level. We further categorized all error types and compared their relative frequency as well as substitution patterns between different language groups. Findings: After background factors were controlled for, bilinguals (irrespective of specific L1) significantly outperformed their monolingual peers on overall pronunciation accuracy. Irrespective of language background, the most frequent error sources overlapped, affecting English sounds which are considered marked, are absent from the German phoneme inventory, or differ phonetically from a German equivalent. Originality: This study extends previous work on bilingual advantages in other domains of EFL to less researched phonological skills. It focuses on overall productive skills in young FL learners with limited proficiency and provides an overview of the most common error sources and substitution patterns in connection to language background.
Significance/Implications: The study highlights that bilingual learners may deploy additional resources in the acquisition of target language phonology that should be addressed in the foreign language classroom.
Background/aims: Previous research has shown that exposure to multiple foreign accents facilitates adaptation to an untrained novel accent. One explanation is that L2 speech varies systematically such that there are commonalities in the productions of nonnative speakers, regardless of their language background. Methods: A systematic acoustic comparison was conducted between 3 native English speakers and 6 nonnative accents. Voice onset time, unstressed vowel duration, and formant values of stressed and unstressed vowels were analyzed, comparing each nonnative accent to the native English talkers. A subsequent perception experiment tested what effect training on regionally accented voices has on participants' comprehension of nonnative accented speech, in order to investigate the importance of within-speaker variation for attunement and generalization. Results: Data for each measure show substantial variability across speakers, reflecting phonetic transfer from individual L1s, as well as substantial inconsistency and variability in pronunciation, rather than commonalities in their productions. Training on native English varieties did not improve participants' accuracy in understanding nonnative speech. Conclusion: These findings are more consistent with a hypothesis of accent attunement wherein listeners track general patterns of nonnative speech rather than relying on overlapping acoustic signals between speakers.
This paper investigated factors influencing students’ pronunciation. Pronunciation is one of the important aspects of the learning of English. When mastering English pronunciation, many non-native English speakers have difficulty pronouncing certain words that contain phonemes not used in their native languages. This paper also reported several aspects that might influence pronunciation. The pronunciation aspect investigated in this research paper was the English interdental consonants [θ] and [ð], which are not available in the Indonesian language sound system. This qualitative research used interviews as a method for collecting primary data. The researchers interviewed twenty participants from the English Language Education Study Program (ELESP) of Sanata Dharma University, Yogyakarta, Indonesia, about the pronunciation of the two interdental consonant sounds. The findings showed that the mother tongue, age, and teacher instruction on target language exposures affected the ELESP students’ (mis)pronunciation. The pedagogical implications of the findings are that English teachers can assist their students in overcoming pronunciation challenges involving the two interdental consonants and that pronunciation textbook writers should provide more pronunciation practice focusing on the [θ] and [ð] sounds.
The alveolar fricative [s] in Mandarin ESL production, Hua Lin & Junyu Wu, Department of Linguistics, University of Victoria, Canada (Abstract) Most studies on English-as-a-second-language (ESL) fricatives have focused on the interdentals (Brannen, 2002; Brown, 1993; Chang & Rau, 2004; Dubois & Horvath, 1999; Deterding, Wong & Kirkpatrick, 2008; Gonet and Pietroń, 2006; Hanulíková & Weber, 2010; Rau, Chang & Tarone, 2009; Schmidt, 1987; Smith, 1997; Wong & Kirkpatrick, 2008; Wong, 2005; Wu & Lin, 2018, to name a few). The underlying assumption appears to be that other fricatives, such as the alveolar fricative [s], which has an equivalent counterpart in the L1s, do not pose an acquisition problem. While it may be true that the interdentals account for much of the fricative-induced English accent of non-native speakers, other English fricatives may be just as problematic even when they do have “(near) equivalent counterparts” in the L1s. Indeed, according to the Speech Learning Model (Flege, 2005), if the difference between a pair of L1-L2 counterpart segments is minute, chances are that the L2 segment is not going to be learned. In this study, we conducted an experiment on the ESL fricative [s] produced by speakers of Mandarin, which has an equivalent [s]. The goal was to see 1) if the ESL [s] is not produced the same as the native English [s], contributing therefore to the Mandarin speakers’ ESL accent, and 2) if it is not, whether the difference is due to L1 influence. Eleven Mandarin ESL speakers (7 male and 4 female) and a control group of 3 native English speakers (1 male and 2 female) were recruited. Their production of single CV syllables with [s] onset in three vowel environments [i, a, u] in both real and pseudo words was recorded. Among the stimuli were both English and Chinese words. The English words were read by both the Mandarin and English speakers, while the Chinese words were read by Mandarin speakers only.
Acoustic analysis was done on the recordings with the speech software Praat. The fricative [s] was segmented out of the syllable and measured on friction duration (FD) and center of gravity (CoG) averaged over the central three of five equally-spaced locations over the duration. The results show that the average CoG of the native English [s] was consistently higher than the [s]s of the Mandarin ESL and the native Mandarin in all environments and word types. Looking more closely, vowel environment was found to play a role in one context: a significant difference was found between the CoG values of native English and ESL [s]s in the [su] environment in pseudo word production (p=0.021). Like CoG, the mean FD of the native English [s] was found to be consistently larger than that of either the Mandarin ESL or the native Mandarin [s]s in all environments and word types. A significant difference was also found between native English [s] and ESL [s] for friction duration in real word production (p=0.008). Other significant differences concerning FD were found when the vowel context and the type of words were factored in. In the [a] context, a significant difference was found among the three groups in pseudo word (p=0.019) as well as real word production (p=0.048). In the [u] context, a significant difference was found between the native English and the ESL [s]s in real word production (p=0.034). The differences in both FD and CoG between the native and non-native English speakers are interpreted as showing that the ESL [s] is not identical to the native English [s] and is thus a contributing factor to the ESL accent of the Mandarin speakers. A high correlation has been found between the FD and CoG values of the ESL and the native Mandarin production, clearly implicating L1 transfer.
English voiceless interdental [θ] in words such as thank and method is among the most difficult consonants for ESL learners to acquire. This has been demonstrated in many studies with participants from a variety of first language backgrounds including German, Dutch, French, Polish, Arabic, Persian, Chinese, Korean, and Japanese (Brannen, 2011; Hanulíková & Weber, 2010; Rau, Chang & Tarone, 2009; Gonet and Pietroń, 2010; Schmidt, 1987; Dubois & Horvath, 1999). Even highly proficient ESL speakers have been found to regularly substitute [θ] with other sounds, most often [s], [t], or [f] (Brannen, 2011; Hanulíková & Weber, 2010). Preference for one or another of the three substitutions seems to vary from one L1 to another and sometimes from one speaker to another of the same L1. ESL learners of L1 Mandarin, for instance, have been found to strongly favor [s] as a substitute in word-initial or word-final position (Rau, Chang & Tarone, 2009), while L1 Cantonese speakers of Hong Kong English opt for [f] as a substitute (Peust, 1996). One gap in the literature on L1 Mandarin learners is [θ] in word-medial, intervocalic position: all previous studies have focused on the onset or coda position. This research aims to find out if the intervocalic environment plays a role in the substitution by examining the production and perception of the English voiceless interdental in word-medial intervocalic position by L1 Mandarin speakers.
This article examines differential substitution of the L2 English voiceless interdental fricative, [θ]. The L1s investigated in this study (European French, Quebec French, and Japanese) have been reported to substitute [s], [t], and [s] respectively in production. Two main hypotheses are explored: 1) Transfer is perceptually based; 2) Substitution involves an assessment of non-contrastive in addition to contrastive features. Results of an AXB task show that advanced learners are unable to perceive certain non-contrastive distinctions; however, unlike Japanese listeners, French listeners do perceive Strident and Mellow, features which are non-contrastive in their L1. Results indicate a clear perceptual basis for the Japanese substitute. The difference between Quebec and European French is less clear; however, there is a trend which suggests a perceptual basis for the European French substitute. Another finding is that confusion of [f] and [θ] is greater for French than it is for Japanese listeners. It is proposed that the composition of the L1 phonetic inventory influences which features listeners attend to during perception.
This paper argues for a more structured view of the relation between the phonological feature [voice] and its specific phonetic implementations. Under the theory of universal phonetics proposed here, the implementation of [voice] is sharply constrained: the opposition is defined relatively, as more or less voicing, along a dimension consisting of exactly three discrete, ordered categories, which can be shown to have clear articulatory and acoustic bases. While the phonological feature allows certain rule equivalences across languages to be expressed, the phonetic categories describe possible contrasts within languages, and express markedness relations.
This paper describes the results of a spectrographic analysis of a number of voiceless fricatives. The sounds are shown to be capable of description in terms of the frequencies of the lower and upper limits of energy present, the presence or absence of formant-like concentrations of energy, and the over-all relative intensity of the sounds. The sounds investigated fall into three groups: front, mid and back, corresponding to the regions of the vocal tract within which they are produced. Sounds in the front group have a long spectrum, with little patterning of peaks of energy; their relative intensity is low. Sounds in the mid group have a short spectrum, with the main region of energy at a higher frequency than in the other groups; their relative intensity is high. Sounds in the back group have a spectrum of medium length, exhibiting a formant-like patterning of energy; their relative intensity is intermediate between the other groups. Tentative criteria are advanced for distinguishing between members of each group. Combining this evidence with general phonetic knowledge it is possible to make a number of statements about other categories of sounds which include a component of fricative noise: i.e. voiced fricatives, stops, and affricates.
This article presents the foundations of the Feature Competition Model (FCM) of segment transfer. The FCM is a proposal to explain how L2 sounds are mapped on to L1 phonological categories. Like previous analyses on segment transfer, the FCM assumes that not all features are of the same prominence in a given phonemic inventory and that feature prominence can be determined through underspecification. Unlike previous analyses, the FCM adopts a dynamic approach to phonology, one which assumes that features do not have discrete values, rather ones which are continuous, of greater or lesser prominence in an inventory. A specific metric for calculating prominence is given, and hypotheses for three L1-L2 contexts are generated and tested. The results of an experiment suggest that the metric has predictive power, but that certain refinements of the formula are necessary. Finally, implications the FCM has for our understanding of developing L2 speech patterns are discussed.
It has sometimes been assumed that the identification of the fricatives of American English in CV syllables depends primarily on the characteristics of the noise (i.e., nonvocalic) portion of the speech sound. A second possibility is that characteristics of the vocalic portion, previously shown to be cues for the perception of other consonants, are important for the fricatives. These alternatives were tested by combining the noise from one spoken fricative-vowel syllable with the voiced portion of another. Results indicate that the important cues for the fricatives /s/ and /ʃ/ are given by the noise but that the differentiation of /f/ and /θ/ is accomplished primarily on the basis of cues contained in the vocalic part of the syllable. Similar results were obtained for the voiced counterparts of these sounds.
Equipment was assembled and a procedure was developed for the measurement of power spectra of consonants. Detailed power spectra as well as measurements of grosser spectral properties were made on a fairly large sample of English and Russian stops and fricatives. Special criteria were developed for the evaluation of the data obtained. Possibilities of utilizing the data for automatic recognition were considered.
Energy density spectra of gated segments of fricative consonants were measured. The spectral data were used as a basis for developing objective identification criteria which yielded fair results when tested. As a further check gated segments of fricatives were presented for identification to a group of listeners and their responses evaluated in terms of the objective identification criteria.