Production of English interdental fricatives by Dutch, German,
and English speakers
Adriana Hanulíková & Andrea Weber
Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
{A.Hanulikova; A.Weber}@mpi.nl
ABSTRACT
Non-native (L2) speakers of English often experience difficulties in producing English interdental fricatives
(e.g. the voiceless [θ]), and this leads to frequent substitutions of these fricatives (e.g. with [t], [s], and [f]).
Differences in the choice of [θ]-substitutions across L2 speakers with different native (L1) language
backgrounds have been extensively explored. However, even within one foreign accent, more than one
substitution choice occurs, but this has been less systematically studied. Furthermore, little is known about
whether the substitutions of voiceless [θ] are phonetically clear instances of [t], [s], and [f], as they are often
labelled. In this study, we attempted a phonetic approach to examine language-specific preferences for [θ]-
substitutions by carrying out acoustic measurements of L1 and L2 realizations of these sounds. To this end,
we collected a corpus of spoken English with L1 speakers (UK-English), and Dutch and German L2
speakers. We show a) that the distribution of differential substitutions using identical materials differs
between Dutch and German L2 speakers, b) that [t,s,f]-substitutes differ acoustically from intended [t,s,f],
and c) that L2 productions of [θ] are acoustically comparable to L1 productions.
Keywords: segmental substitutions, interdental fricatives, Dutch, German, English.
1. INTRODUCTION
One characteristic feature of speech produced by L2 speakers is its accent. Foreign accents result from a
combination of subphonemic, segmental, and suprasegmental deviations from the target language. A
common phenomenon at the segmental level is substitution, by which we mean the replacement of a specific
L2 phoneme by another phoneme, usually one that occurs in the native phoneme inventory of the speaker.
Substitutions can result, for example, from the lack of a native counterpart for a given L2 phoneme, and are
often subject to variation, such as in the L2 production of English interdental fricatives. Since phoneme
inventories of most European languages lack interdental fricatives, many L2 speakers of English have
difficulties producing them correctly and often substitute them. German and European-French learners of
English, for example, often replace the voiceless interdental fricative [θ] with [s], while Dutch and Canadian-
French speakers are reported to prefer [t] (for an overview, see Brannen 2002). Phoneme-identification
studies show that [θ] is perceptually most often confused with the acoustically similar [f] by native as well as
by various L2 listeners, and less frequently confused with [t] or [s] (Brannen 2002; Cutler et al. 2004;
Hancin-Bhatt 1994a; Miller and Nicely 1955; Tabain 1998). Given the acoustic similarity with [f], it is rather
surprising that [f] is not the most common substitution in English L2 speech, not even when [f] is available
in the L1 phoneme inventory of the L2 speakers. Note that substitutions of voiceless (and voiced) fricatives
are not restricted to L2 speech; they also occur in dialects of English, with reported instances of [f] in
Cockney (Wells 1982), and of [t] in Irish English (Hickey 2004). However, in contrast to L2 studies, the
production frequencies of these dialectal substitutions across L1 speakers are seldom systematically studied
or reported (see McGuire 2003).
Prior research has explored the causal relationship of variation in [θ]-substitutions across L2 learners with
different L1 backgrounds, and has focused on the dissociation between perception and production (e.g.
Brannen 2002; Hancin-Bhatt 1994b; Teasdale 1997), on phonological theories, universal factors, and
language acquisition models (e.g. Flege and Davidian 1984; Picard 2002; Weinberger 1994; Westers et al.
2007). However, there does not appear to be a simple answer to the question of why certain substitutes are
chosen. While the phonological structure of the L1 certainly is an important factor in explaining different
substitutions, other factors such as word-dependent characteristics, social factors, and varying teaching
curricula probably influence L2 production as well.
Interestingly, even within one foreign accent different substitution choices are made, but these have been
less systematically studied. Moreover, little is known about whether the substitutions are phonetically clear
instances of [t,s,f] as they are often labelled. The purpose of our study was therefore to answer the following
questions: a) what is the distribution of differential substitutions using comparable materials across L2 and
L1 speakers, b) how do [t,s,f]-substitutes differ acoustically from the intended [t,s,f], and c) how do L2
productions of [θ] compare acoustically to L1 productions. Here we attempt a phonetic approach to examine
language-specific preferences for [θ]-substitutions. A similar approach has been put forward by Teasdale
(1997), who proposed that articulatory properties of [s] in the L1 are the best predictor of whether [t] or [s]
would be chosen as the [θ]-substitute. Here we wish to elaborate on this idea by providing acoustic
measurements of the substitutions as well as a comparison of these measurements between L1 and L2
speakers. We chose Dutch and German L2 learners of English because the acoustic properties of both [s]
and [t] differ between their respective L1s. Dutch [s] is less articulatorily tense and has graver friction than
German [s] (Mees and Collins 1982), and [t] in initial position is aspirated in German but unaspirated in Dutch
(Keating 1984; Lisker and Abramson 1964). To this end, we collected a corpus of spoken English containing
UK-English L1 speakers, as well as Dutch and German L2 speakers of English. The two groups of learners
were selected because they not only differ in their predominant [θ]-substitutions (e.g. Westers et al. 2007 for
Dutch; Hancin-Bhatt 1994b for German), but also in fine-acoustic details in fricative and stop production
(Mees and Collins 1982; Rietveld and van Heuven 2001). The data obtained from the corpus were labelled
and categorized. Moreover, acoustic measurements were taken to compare L1 and L2 [θ]-realizations, and to
compare the L2 realizations of [θ]-substitutes [t,s,f] with intended [t,s,f]-realizations. In this study, only
word-initial sounds are considered.
Studies on the acoustics of English fricatives have identified four main parameters that can distinguish
fricatives: duration, spectral properties (e.g. centre of gravity, F2 onset, spectral peak location), amplitude
(overall and relative noise amplitude), and transitions from the fricative into a vowel (e.g. Hughes and Halle
1956; Strevens 1960; Jassem 1962). While these measures can distinguish [s] from [f] and [θ], it seems that
formant transitions and spectral peak location can provide additional information for less distinct fricatives
such as [f] and [θ] (e.g. Harris 1958; Jongman et al. 2000; Tabain 1998). Which measure is most suitable can
also depend on the use of real words versus syllables (e.g. Tabain 1998). The most informative cue for place
of articulation of plosives is the distribution of energy in the release burst (e.g. Stevens and Blumstein 1981),
but other cues such as formant transitions and spectral properties have also been reported (e.g. van Alphen
and Smits 2004, for Dutch). In the present paper, we restrict the analysis of fricative intervals and plosive
bursts to duration, centre of gravity (COG), and amplitude.
2. METHOD
2.1. Materials and Procedure
A short story in English was constructed, containing numerous words with voiceless [θ] and words with [s],
[f], and [t]. Participants were asked to read the story aloud at a comfortable speaking rate. Stereo recordings
were made in a quiet room with a digital recorder at 44.1 kHz sampling rate with 16-bit resolution and were
later transferred to a computer. The left channel was extracted for further processing.
For the analysis, we selected 18 content words with the voiceless [θ] in word-initial position (13 different
words, occurring between 1 and 4 times in the story), and 10 [s]-, [f]-, and [t]-initial words each (in two cases
for [t] and [f], the phoneme occurred in a stressed syllable-initial position within a word; altogether 27
different words, occurring 1 to 2 times). These words and their target phonemes were manually annotated.
The spectrogram and the waveform were used to determine the onset and the cessation of the fricatives, and
the onset and the offset of the burst. All [θ]-instances were then categorized by two trained research
assistants (German learners’ data by two assistants with L1 German, and Dutch learners’ data by two
assistants with L1 Dutch). Whenever there was disagreement about the category of a particular token, the
categorization of a third coder (a trained phonetician) was decisive. English speakers’ data were labelled and
categorized by a trained phonetician and by one native speaker of English. In case of a disagreement, the
opinion of the English native speaker was decisive. Before carrying out the acoustic analysis, all critical
words were normalized for mean amplitude. Only [θ]-instances categorized as [t], [s], [f], or [θ] were
measured, excluding a few [θ]-instances that were not fully produced, were unclear, or were substituted with
a sound other than the above-mentioned substitutes. We used the PRAAT speech editor (Boersma 2001) to extract the
duration, the amplitude, and the COG for each token. The COG was weighted by the absolute
spectrum (p=1). To calculate the average amplitude, the root-mean-squared (RMS) method
was used, that is, the square root of the mean of the squared amplitudes of all sample points of a waveform.
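As an illustration of the two measures just described, the following minimal Python sketch computes an RMS amplitude and a centre of gravity weighted by the absolute spectrum (p = 1). It is not the PRAAT implementation: the function names, the naive discrete Fourier transform, and the short test tone are assumptions made for this example only, and windowing and other pre-processing choices are left out.

```python
import cmath
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def centre_of_gravity(samples, sample_rate, p=1.0):
    """Spectral centre of gravity in Hz, weighting each frequency bin by |X(f)|**p.

    p=1 weights by the absolute spectrum; p=2 would weight by the power spectrum.
    A naive DFT is used, so this is only practical for short signals.
    """
    n = len(samples)
    weighted_sum = 0.0
    weight_total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequencies only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        weight = abs(coeff) ** p
        weighted_sum += (k * sample_rate / n) * weight
        weight_total += weight
    if weight_total == 0:
        raise ValueError("signal has no spectral energy")
    return weighted_sum / weight_total


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz (not corpus data).
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
print(rms_amplitude(tone), centre_of_gravity(tone, 8000))
```

For a pure tone, the centre of gravity should coincide with the tone frequency; for fricative noise it summarizes where the spectral energy is concentrated on average.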
2.2. Participants
The participants in the corpus study consisted of 37 native speakers of Dutch from the Radboud University in
Nijmegen in the Netherlands (mean age 21.5, SD 2.2), 37 native speakers of German from the University of
Cologne in Germany (mean age 22.5, SD 2.3; recordings of one participant were excluded due to technical
problems), and 31 native speakers of English from the University of Birmingham in England (mean age 19.4,
SD 1.1). All participants took part in exchange for payment. The L2 participants were highly proficient in
English. Dutch students had on average 7.6 years of formal English training, and German students had on
average 8.8 years of formal English training. In an English multiple-choice vocabulary test (including many
low frequency words), Dutch students scored on average 83% correct, and German students scored on
average 79% correct (the difference in their scores did not reach significance). None of the German
participants had lived in the Netherlands, and none of the Dutch participants had lived in Germany.
3. RESULTS
3.1. Categorization results
The categorization results across all items in Table 1 show how often [θ] was produced correctly or
substituted with [s,t,f] or other phonemes (e.g. [tθ], [ts], [S], or unclear), listed for each speaker group
separately. The results show that all participants produced the English tokens with word-initial [θ] more
often correctly than with a substitution, and that, unsurprisingly, L1 speakers substituted less frequently than
L2 speakers. When comparing the two learner groups, German speakers produced significantly more words
with substitutions than Dutch speakers did.
Table 1: Percentages of [θ]-productions per speaker group (percentages rounded up; numbers of occurrences are in brackets).
Speakers   [s]         [t]         [f]        [θ]         others
Dutch      5% (30)     23% (155)   3% (17)    62% (412)   7% (47)
German     29% (187)   7% (43)     5% (34)    51% (323)   8% (49)
English    0% (1)      0% (0)      12% (63)   88% (463)   1% (4)
Within the substituted instances, a significant difference between German and Dutch speakers was found:
German learners predominantly substituted the English [θ] with [s] (71%, compared to 15% for Dutch
speakers), while Dutch speakers predominantly substituted [θ] with [t] (77%, compared to 16% for German
speakers). For both groups, the perceptually similar [f] occurred least frequently (13% for German speakers
and 8% for Dutch speakers). It is worth noting that, overall, substitutions did not seem to be word-dependent,
and that many participants produced more than one substitute type. For English speakers we found 12% of
[f]-substitutions, which were mainly driven by three speakers. When excluding these speakers, the number of
[f]-instances dropped to 5% and the number of [θ]-instances rose to 95%.
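The within-substitution percentages reported above can be recomputed from the token counts in Table 1. The following Python sketch does so under the assumption that the denominator is the sum of the [s], [t], and [f] counts for a group (that is, excluding [θ] and "others"); the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Token counts copied from Table 1 ("th" stands for correctly produced [θ]).
TABLE_1 = {
    "Dutch": {"s": 30, "t": 155, "f": 17, "th": 412, "others": 47},
    "German": {"s": 187, "t": 43, "f": 34, "th": 323, "others": 49},
    "English": {"s": 1, "t": 0, "f": 63, "th": 463, "others": 4},
}


def substitute_shares(group_counts, substitutes=("s", "t", "f")):
    """Return each substitute's rounded percentage among the [s,t,f] substitutions."""
    total = sum(group_counts[name] for name in substitutes)
    return {name: round(100 * group_counts[name] / total) for name in substitutes}


for group in ("German", "Dutch"):
    print(group, substitute_shares(TABLE_1[group]))
```

With this denominator the sketch yields 71%, 16%, and 13% for the German group and 15%, 77%, and 8% for the Dutch group, matching the figures in the text.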
3.2. Measurements
Figures 1 to 3 show the results of the acoustic measurements for the three speaker groups: duration
(Figure 1), RMS amplitude (Figure 2), and COG (Figure 3). To evaluate differences and similarities in the
obtained values, t-tests compared the [θ]-realizations across the three speaker groups. Further t-tests
within each of the speaker groups compared the accent-specific predominant
substitutions ([t] for Dutch speakers; [s] for German speakers) with the realizations of the intended [t], [f], and [s] within and
across L2 learners.
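The kind of group comparison described here can be sketched as follows. The text does not state which t-test variant was used, so the sketch assumes Welch's unequal-variance two-sample test; the function name is hypothetical and the sample values are invented placeholders, not corpus measurements. Only the test statistic and the approximate degrees of freedom are computed; obtaining a p-value would additionally require the t distribution.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard-error contribution of each group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Invented placeholder durations in seconds, NOT values from the corpus.
group_a = [0.10, 0.12, 0.11, 0.13, 0.09]
group_b = [0.14, 0.15, 0.13, 0.16, 0.12]
t_value, degrees_of_freedom = welch_t(group_a, group_b)
print(t_value, degrees_of_freedom)
```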
Figure 1: Duration in seconds (s) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure not reproduced: three panels (Dutch speakers, German speakers, English speakers); x-axis: segment type (f, f(th), s, s(th), t, t(th), th; the English panel has no t(th)); y-axis: Duration (s), 0.00 to 0.30.]
Figure 2: RMS in Pascal (Pa) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure not reproduced: three panels (Dutch speakers, German speakers, English speakers); x-axis: segment type (f, f(th), s, s(th), t, t(th), th; the English panel has no t(th)); y-axis: RMS (Pa), 0.000 to 0.030.]
Figure 3: COG in Hertz (Hz) of the intended [t,s,f,θ], and of the [θ]-substitutions [t,s,f] (indicated by a following (th)).
[Figure not reproduced: three panels (Dutch speakers, German speakers, English speakers); x-axis: segment type (f, f(th), s, s(th), t, t(th), th; the English panel has no t(th)); y-axis: Centre of gravity (Hz), 2000 to 10000.]
The results showed that the [θ]-realizations of English L1 speakers differed from those of German L2
speakers in duration and RMS but not in COG. Dutch L2 speakers differed from English L1 speakers in
duration but not in RMS or COG. Differences in duration between L1 and L2 speakers are not surprising,
given that L2-speech rate is overall slower. Similarly, differences in amplitude could come about when L2
speakers encounter difficulties with a given speech sound and consequently lower their voice in amplitude.
Importantly, the COG values did not differ across the groups, suggesting some evidence for target-like
pronunciation of the English [θ] for L2 speakers. A comparison of the three measurements for [θ]-realization
between German and Dutch speakers did not show significant differences (however, COG showed a weak
tendency for a difference, p = .084). Further comparisons have shown that [s]-realizations did not differ
between German and Dutch speakers, but the properties of [t] differed in all three measures. Given prior
studies, we expected a difference between German and Dutch realizations of [s]. Because of a weak tendency
for a difference in COG (p = .097), we further examined this issue by carrying out additional measurements
that can help distinguish small differences in articulation (see Jongman et al. 2000). We found that the
German [s] differed from the Dutch [s] in the kurtosis, standard deviation, skewness, and central moments.
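The additional spectral measures mentioned above treat the spectrum as a distribution over frequency and summarize its shape by its moments. The Python sketch below shows one common way of computing them; the function name, the use of excess kurtosis (normalized fourth moment minus 3), and the toy three-bin spectrum are assumptions for illustration and do not reproduce the exact settings of this study.

```python
import math


def spectral_moments(frequencies, weights):
    """Return (centre of gravity, standard deviation, skewness, excess kurtosis).

    `weights` are non-negative spectral magnitudes, one per frequency bin.
    A spectrum concentrated in a single bin has zero variance and is not handled.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")
    mean = sum(f * w for f, w in zip(frequencies, weights)) / total

    def central_moment(order):
        # Weighted central moment of the given order around the centre of gravity.
        return sum(w * (f - mean) ** order
                   for f, w in zip(frequencies, weights)) / total

    variance = central_moment(2)
    sd = math.sqrt(variance)
    skewness = central_moment(3) / variance ** 1.5
    kurtosis = central_moment(4) / variance ** 2 - 3.0
    return mean, sd, skewness, kurtosis


# Toy symmetric spectrum (hypothetical values, not corpus data).
print(spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0]))
```

Skewness indicates whether energy is tilted towards lower or higher frequencies relative to the centre of gravity, and kurtosis indicates how peaked the spectrum is, which is why such measures can separate fricatives whose COG values are similar.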
A comparison of the intended [t,s,f] with the substitutes [t,s,f] was limited to the dominant substitutes
within an L2 group (this was due to an insufficient number of responses for less frequent substitutes). Within
the German group, [s]-substitutes differed from the intended [s]-realizations as well as from the correctly
pronounced [θ]-instances in all measures. Similarly, Dutch speakers’ [t]-substitutes differed from the
intended [t]-realizations in all measures, and from [θ]-realizations in duration and COG, but not in RMS.
This suggests that substitutions in L2 speech are on average not clear instances of [t,s,f], as they are often
labelled, nor are they clear instances of [θ].
4. DISCUSSION
The first question of this study concerned the distribution of substitution choices for the English voiceless
interdental fricative [θ] by L2 and L1 speakers. The categorization results confirmed previous findings for
L2 speakers: the dominant [θ]-substitute for German learners is clearly [s] while for Dutch learners it is [t]
(e.g. Westers et al. 2007; Hancin-Bhatt 1994b). However, all three substitutions [t,s,f] occurred in the L2
productions of both learner groups, and the substitutes were not word- or speaker-specific. Importantly, L2
speakers produced native-like realizations of the fricative [θ] more often than any of the dominant
substitutions. Since this probably depends strongly on the proficiency level of the L2 speakers, the numbers
could reverse with lower proficiency. In contrast to L2 speakers, L1 speakers of English substituted [θ] (if at
all) with [f]. It has been previously suggested that speakers of languages that articulate [s] further back
and/or have a dental [t] are very likely to substitute the English interdental fricative [θ] with [t] (Teasdale
1997). This would indeed support the Dutch preference for [t]-substitutes, because [s] is articulated further back
in Dutch than in German, whose speakers prefer [s]-substitutes. The present study further explored this proposal
and found acoustic differences in L2-production of both [t] and [s] between German and Dutch speakers,
supporting the phonetic explanation proposed by Teasdale (1997).
The second question concerned acoustic differences between [t,s,f]-substitutes and intended [t,s,f]. Given
the nature of the natural elicitation method, we restricted the analysis only to dominant substitutes within an
L2 group to ensure enough data points for a comparison. We found that not only did [t,s,f]-substitutes differ
from the intended [t,s,f], they also differed from the [θ]-realization within each L2 group. This suggests that
the conventional labelling of [θ]-substitutions as [t,s,f] might not sufficiently characterize L2-productions, at least
with respect to their acoustics. Rather, [θ]-substitutions seem to show gradient properties, exhibiting acoustic
properties that are often in between those of [θ]-realizations and [t,s,f]-realizations. However, perceptually
these substitutes could still be perceived as good exemplars of [t,s,f]. To answer this question, results from a
categorization experiment might be more telling, and we leave this issue to future studies.
The last question addressed an acoustic comparison of L2 and L1 [θ]-realizations. We found that German
speakers did not differ from Dutch speakers, but differed from the English speakers in RMS and duration.
Dutch speakers differed from the English speakers only in duration. Differences in amplitude and duration,
however, are not surprising when comparing non-native with native speakers. Importantly, both L2 groups
resembled the L1 group in the COG.
To conclude, this study showed that despite the difficulties that L2 speakers have with the English
fricative [θ], more than half of the produced instances were target-like. Acoustically, the [θ]-substitutions
were not clear instances of [t,s,f]. Articulatory differences between German and Dutch [t] and [s] were found
and show a promising (phonetic) approach to future investigations of differential substitutions in L2 speech.
5. ACKNOWLEDGEMENTS
This work was supported by the Max-Planck-Gesellschaft (MPG). We would like to thank Laurence
Bruggeman, Anne Blankenhorn, Sabrina Jung, Julia Lennertz, Simon Mack, Berit Meinert, Rachel Sheer,
and Karina Visser for their assistance, and Frank Eisner for comments on an earlier version of the paper.
6. REFERENCES
Boersma, P. 2001. PRAAT, a system for doing phonetics by computer. Glot International 5. 341-345.
Brannen, K. 2002. The role of perception in differential substitution. Canadian Journal of Linguistics – Revue Canadienne de
Linguistique 47. 1–20.
Cutler, A., Weber, A., Smits, R., Cooper, N. 2004. Patterns of English phoneme confusions by native and non-native listeners.
Journal of the Acoustical Society of America 116(6). 3668-3678.
Flege, J.E., Davidian, R. 1984. Transfer and developmental processes in adult foreign language production. Journal of Applied
Psycholinguistic Research 5. 323-347.
Hancin-Bhatt, B.J. 1994a. Segment transfer: a consequence of a dynamic system. Second Language Research 10(3). 241-269.
Hancin-Bhatt, B.J. 1994b. Phonological Transfer in Second Language Perception and Production. Doctoral Dissertation, University
of Illinois.
Harris, K.S. 1958. Cues for the discrimination of American English fricatives in spoken syllables. Language and Speech 1. 1-7.
Hickey, R. (ed.). 2004. A Sound Atlas of Irish English. Berlin: Mouton de Gruyter.
Hughes, G.W., Halle, M. 1956. Spectral properties of fricative consonants. Journal of the Acoustical Society of America 28. 303–310.
Keating, P.A. 1984. Phonetic and phonological representation of stop consonant voicing. Language 60. 286-319.
Jassem, W. 1962. Noise spectra of Swedish, English, and Polish fricatives. Proceedings of the Speech Communication Seminar,
Stockholm, Royal Institute of Technology Speech Transmission Laboratory. 1–4.
Jongman, A., Wayland, R., Wong, S. 2000. Acoustic characteristics of English fricatives. Journal of the Acoustical Society of
America 108(3). 1252-1263.
Lisker, L., Abramson, A.S. 1964. A cross-language study of voicing in initial stops: acoustical measurements. Word 20. 384-422.
McGuire, G. 2003. The realization of interdental fricatives in Columbus, OH, AAVE. Paper presented at the Montreal-Ottawa-
Toronto Phonology Workshop. Retrieved February 2009 from www.ling.ohio-state.edu/~mcguire/Interdentals]Handout.doc.
Mees, I., Collins, B. 1982. A phonetic description of the consonant system of Standard Dutch (ABN). Journal of the International
Phonetic Association 12. 2-12.
Miller, G.A., Nicely, P.E. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical
Society of America 27(2). 338-352.
Picard, M. 2002. The differential substitution of English /θ ð/ in French: The case against underspecification in L2 phonology.
Lingvisticæ Investigationes 25(1). 87–96.
Rietveld, A.C.M., van Heuven, V.J. 2001. Algemene Fonetiek. Bussum: Coutinho.
Stevens, K., Blumstein, S. 1981. The search for invariant acoustic correlates of phonetic features. In Eimas, P.D. and Miller, J.L.
(eds.), Perspectives on the study of speech. Hillsdale, NJ: Erlbaum. 1-38.
Strevens, P. 1960. Spectra of fricative noise in human speech. Language and Speech 3. 32–49.
Tabain, M. 1998. Non-sibilant fricatives in English: spectral information above 10 kHz. Phonetica 55. 107–130.
Teasdale, A.M. 1997. On the differential substitution of English [θ]: a phonetic approach. Calgary Working Papers in Linguistics 19.
71-85.
van Alphen, P., Smits, R. 2004. Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: the role of
prevoicing. Journal of Phonetics 32. 455-491.
Weinberger, S.H. 1994. Theoretical Foundations of Second Language Phonology. Doctoral Dissertation, University of Washington.
Wells, J.C. 1982. Accents of English 2: The British Isles. Cambridge: Cambridge University Press.
Westers, F., Gilbers, D., Lowie, W. 2007. Substitution of dental fricatives in English by Dutch L2 speakers. Language Sciences 29.
477-491.