All content in this area was uploaded by Esther de Leeuw on Jan 04, 2019
Page 1 of 52
This is the accepted version of this manuscript before going to press; to appear in the Journal
of Phonetics, special issue edited by Esther de Leeuw and Chiara Celata on “Plasticity of
Native Phonetic and Phonological Domains in the Context of Bilingualism”.
Minor changes may still occur in this manuscript before it is in press.
Native speech plasticity in the German-English late bilingual Stefanie Graf:
A longitudinal study over four decades
Esther de Leeuw
Queen Mary University of London
Acknowledgements:
I am grateful to Jonathan Harrington for providing such a welcoming and knowledgeable
environment at the IPS (Institute of Phonetics and Speech Processing, Ludwig-Maximilian
University, Munich). At the IPS, I have been privileged to learn from many highly skilled and
knowledgeable colleagues. In particular, I am indebted to Ulrich Reubold for his time and advice
regarding the statistical analysis. Of course, any remaining shortcomings remain my own
responsibility. Funding from the Alexander von Humboldt Foundation enabled me to spend time at
the IPS, for which I am also grateful. Finally, I am very grateful to Stefanie Graf, for wishing me
success with this research in its early stages, and for being such an inspiring woman on whom to base
my research.
Abstract
The purpose of this study was to expose the trajectory of native speech plasticity in the context of
late bilingualism through analysis of spontaneous speech of Stefanie Graf (SG) over four decades.
With regard to segmental variables, results showed a significant lowering of F2 in /l/,
suggesting darkening of the German lateral under the influence of English as a second language (L2).
F2 significantly increased in /i/, indicating a more front pronunciation, as predicted due to English
L2 acquisition. There was also a significant decrease in F1 of /l/, as well as of /i/, but a significant
increase in the F1 frequency of /a/, suggesting widening of the vertical vowel space, potentially due
to increasing age. Regarding prosody, there was a significant decrease in pitch level and narrowing
of pitch span over time, as expected with increasing age, and an increase in average maximum f0
over time, possibly as a result of English L2 acquisition of prosody.
These findings suggest that native speech remains plastic post adolescence throughout
adulthood, here proposed to result from the acquisition of L2 counterparts in late
bilingualism, and that such changes are intertwined with natural biological speech developments
over the lifespan.
The main objective of the present research was to expose the trajectory of plasticity in native speech
over the course of four decades in the native German speech of Stefanie Graf (SG) from youth (12
years of age) to mid adulthood (48 years of age). This objective was set to pursue the questions of
whether and how native speech remains plastic post adolescence within the context of late
bilingualism. Indeed, we know very little about the extent to which speech changes and develops
throughout the lifespan post childhood, and what we do know comes largely from studying
monolinguals.
Monolingual longitudinal research into speech plasticity
Seminal longitudinal research examining the Queen’s Christmas speeches shows that phonetic norms
change in adulthood in monolinguals (Harrington, 2006; Harrington, Palethorpe, & Watson, 2000a,
2000b). In Harrington et al.’s research, one of the many findings was that the final vowel of words
like happy became more tense in the Queen’s Christmas speeches over an analysis window of 50
years (so-called happY-tensing). Based on their results, they proposed that in Received
Pronunciation (RP), the KIT and happY vowels had undergone phonetic raising, and that the Queen
participated in these sound changes (Harrington, 2006). As underscored by Labov, “[t]he
significance of Harrington’s result […] rests on his strengthening the case for adult change rather
than stability” (2006, p. 501). Indeed, he was able to show that “one individual—who might have
been expected to be the most resistant to changes in the community pattern—did in fact absorb
community change in her own speech” (Labov, 2006, pp. 501–502).1
1 Note, however, that the style of speech used by the Queen in her scripted Christmas speeches was not necessarily
representative of her “vernacular”, as defined by Labov (1997, p. 168). Helen Mirren notes the difference in the Queen’s
speech styles in an interview regarding her role as the Queen: “I mean, she has a voice for her speeches, and then she has
her own voice, and they are really quite different, and it was very hard to get to the real voice because we hear it so
infrequently” (‘Interview with Helen Mirren’, n.d.). This is important to note because it may be that the speech analysed
in her Christmas speeches was representative of her formal speech, which is becoming more like the community’s
vernacular.
Moreover, longitudinal research into monolinguals also indicates speech plasticity in
adulthood as a result of new dialect acquisition in individuals who have moved to a new dialect area.
A study investigating vowel changes in three sessions over two years among London students from
the Midlands, UK, where the local accent is classified as a variety of northern English (Evans, 2005;
Wells, 1982), showed that the subjects’ accents became more like Standard Southern British English
(SSBE) revealing plasticity in early adulthood. However, there were differences in the amount and
direction of accent change: although the majority of subjects were rated as sounding more ‘southern’
after two years than when they were first recorded, in a minority, there was only a very slight change
in accent; three subjects were rated not to have changed their accent at all; and one subject was
judged to have a more ‘northern’ accent at the final testing session than when she first moved to
London. Similarly, Sankoff (2004) carried out an impressionistic study of dialect change over the
lifespan in two speakers who were recorded from childhood to mid-adulthood. Until the age of 16
both speakers lived in northern England, where the front [a] is produced in BATH words and [ʊ] in
both could and cud. Both speakers began to distinguish could and cud words after moving to the
south of England. Only one speaker, however, inconsistently produced non-northern [ɑ:], again
suggesting that speech plasticity takes place in adulthood as a result of new dialect acquisition, but is
variable between speakers. Furthermore, in the case of the broadcaster and journalist Alistair Cooke,
who moved to the United States from England in early adulthood, it was found that he evidenced a
more American-like pronunciation in mid-life, but then reverted to a more British English-like
pronunciation in late adulthood, although he continued to live in the United States at that time
(Reubold & Harrington, 2015, 2017). Such longitudinal dialect acquisition studies indicate that new
dialect features are indeed acquired by adult speakers, although the degree of change in native dialect
varies between and within participants, thereby supporting findings from cross-sectional dialectal
research (see Nycz, 2015 for an overview and also e.g. Munro, Derwing, & Flege, 1999; Nycz, 2013;
Shockey, 1984).
Additionally, some research has homed in on natural biological changes which speech
undergoes in adulthood, i.e. neither as a result of sound changes in the community nor as a result of
new dialect acquisition. For example, a further longitudinal study of Queen Elizabeth II, as well as of
Margaret Thatcher, and Margaret Lockwood (a popular British actress in the 1930s and 1940s),
investigated speech developments in mid-adulthood (mid 30’s) versus late adulthood (mid 60’s to early
70’s) (Reubold, Harrington, & Kleber, 2010). It was found that both fundamental frequency (f0) and
F1 followed a generally decreasing trajectory with increasing age, but that F2 and F3 evidenced no
systematic changes over time, except in Lockwood who displayed a decrease in F3. The results of this
longitudinal analysis were interpreted to be consistent with other cross-sectional investigations into
biological aging effects by showing a decrease in f0 and F1 with increasing age in female voices
(Linville, 1996; Linville & Rens, 2001). Such decreases in formant values due to aging have been
attributed to a lengthening of the vocal tract, caused by a lowering of the larynx and of the
tracheobronchial tree and lungs (Laver & Trudgill, 1979; Linville & Rens, 2001), although these
explanations are not conclusive (see e.g. Reubold et al., 2010 for a discussion). In relation to the current
investigation, it is worth noting that a later age range (34–70 years of age) was investigated in the
Reubold et al. (2010) study, whereas the recordings of SG were from ages 12–48, which may have
impacted any potential findings as related to biological changes (i.e. potentially the same effects will
not be found in SG as she is a younger subject).
More recent longitudinal findings confirm that biological age-related changes are brought
about through a declining f0, which more specifically corresponds to a declining F1 in high vowels,
but not in low vowels, for which an increase in F1 is observed (Reubold & Harrington, 2015). In a re-
analysis of the speech of Alistair Cooke, recordings were taken when he was between 42 and 95 years
of age. It was found that both age-related biological effects and new dialect acquisition induced
changes in the analysis: age-related changes were brought about by a declining f0, which in turn
influenced the F1 in his high vowels, whilst F1 increased in low vowels. The dialectal changes
confirmed a shift of accent from General American towards RP in late adulthood, and it was reported
that Cooke seemed to have shifted his attitude towards the American society and re-identified with his
British past.
Finally, in another study by Reubold and Harrington (2017) examining the German native
speech of the German female newsreader Dagmar Berghoff, recordings from the Tagesschau were
investigated from 34-70 years of age, i.e. again over a later overall age range than that of SG. In
contrast to the aforementioned studies, no consistent age-related change in F1 was observed, and f0
stayed about the same until the age of 50 years, after which it decreased. These findings are particularly
relevant to the investigation into SG, because they suggest that in German female native speech
unexposed to English (as Dagmar Berghoff did not move to the US), no changes in F1 or f0 would
be expected as a result of increasing age up to 48 years of age. However, it should be observed that
Dagmar Berghoff was a trained actress and newsreader, and that she therefore most likely undertook
voice training, which could have countered biological aging effects evidenced in non-trained voices.
Alternatively, it is also most likely that Queen Elizabeth II, Alistair Cooke, Margaret Thatcher, and
Margaret Lockwood undertook some sort of voice training, which therefore might suggest that the
observed stability of F1 in Berghoff’s speech may have been characteristic of German monolingual
female speech, and therefore comparable to the speech of SG.
Bilingual longitudinal research into speech plasticity
Although a wide range of cross-sectional research has indicated phonetic and phonological attrition in
a bilingual context (Bergmann et al., 2016; de Leeuw et al., 2012, 2013, 2010; de Leeuw, Tusha, &
Schmid, 2017; Dmitrieva, Jongman, & Sereno, 2010; Flege, 1987; Flege & Eefting, 1987; Hopp &
Schmid, 2013; Major, 1992; Mayr et al., 2012; Mennen, 2004; Ulbrich & Ordin, 2014), there is very
little longitudinal research into changes in native speech post adolescence, and we know very little
about the trajectory of speech plasticity across the individual bilingual lifespan.
A pioneering short-term longitudinal case study revealed speech plasticity in a
Brazilian-Portuguese – English late bilingual (Sancier & Fowler, 1997). The VOT values of plosives in a
27-year-old female native speaker of Brazilian Portuguese were studied in three sessions: after a stay in the US
right before leaving for Brazil, upon return from Brazil after a stay in Brazil, and once again, right
before leaving for Brazil after a stay in the US. It was found that the VOT values of her plosives were
always shorter for productions in Brazilian Portuguese than in English, but VOTs produced in both
languages were shorter after the several month stay in Brazil than after the several month stay in the
United States, in line with similar cross-sectional phonetic research into VOT of late consecutive
bilinguals (Flege & Hillenbrand, 1984; Flege, 1987; Major, 1992). Sancier and Fowler (1997)
emphasised that the perceptually guided changes in speech production occurred in a speaker who was
“well past the critical period for language acquisition” (p. 421). This study was followed up by Tobin,
Nam, and Fowler (2017) who obtained similar results regarding 11 Spanish-English bilinguals living
in the United States. Although their results showed more extensive changes in the VOT of the
bilinguals’ English than in Spanish, considerable interspeaker variation was reported with one
participant evidencing changes in both languages, whilst two others showed no L1 changes at all.
Similarly, in a study by Chang (2012, 2013), 19 adult native speakers of American English
(AE) learning Korean participated in a longitudinal study, which was run a total of five times over five
weeks. Significant changes in L1 production were reported: the VOT of English voiceless plosives
lengthened significantly from Week 1 to Week 5. In the analysis of vowel production, phonetic drift
was reported to be more systematic than in the consonantal analysis, occurring uniformly over the
entire vowel system (see also Mayr et al., 2012 and Guion, 2003 who suggest systematic L2 induced
L1 changes, rather than counterpart-specific L2 induced L1 changes, e.g. L2 /a/ → L1 /a/). However,
in this study the acquisition of Korean was not tested, and, therefore, to a certain extent, it is difficult
to know whether the participants’ acquisition of Korean impacted the native English, or whether,
potentially, their English was simply changing in Korea, but not through the acquisition of a new
language (e.g. potentially as a result of exposure to Korean accented English speech).
One further case study offers insight into speech plasticity within an individual. “Freddy” spent
his childhood in the Netherlands and immigrated to Indonesia at the age of 13, where he acquired
Indonesian (Giesbers, 1997). Although he regularly spoke to Dutch tourists and read Dutch
newspapers, it was assumed that Freddy did not have contact with the Dutch language or culture apart
from his working environment. Data for the case study came from a single interview with Freddy when
he was 45 years of age. As such, the study itself was not longitudinal, but the assumption was that
Freddy spoke more Dutch-like while living in the Netherlands, and that, after moving to Indonesia, his
Dutch was influenced by Indonesian, i.e. the possibility of Indonesian having influenced his Dutch
before the move to Indonesia was discounted.2 The authors summarised that the most frequent
deviations were found on the suprasegmental level. Of a total of 48 phonological cases in their
recordings, 19 observations involved incorrect word stress assignment (e.g. óverstroming for
overstróming (flood)), and another 11 observations displayed an incorrect sentence intonation pattern.
Other, less frequent phonological findings included consonant cluster reduction through word-final (t)
deletion; and a lax vowel for a tense vowel. This study is relevant to the current study, as plasticity of
both segmental and prosodic elements was investigated.
2 On page 164, the following is stated: “He was born of a Javanese father and a Dutch mother. Freddy’s father worked in
the Dutch navy and we can assume that his proficiency in Dutch was excellent considering the fact that his own father
(Freddy’s grandfather) had been the headmaster of a Dutch school in colonial Indonesia. Freddy’s mother is a native of
the city of Heerlen in the Dutch province of Limburg”. It is nonetheless possible that the father may have spoken
Indonesian with Freddy during his childhood, and that this may have had some phonetic impact on his Dutch
pronunciation (see e.g. Caramazza, Yeni‐Komshian, Zurif, & Carbone, 1973; Sundara, Polka, & Baum, 2006 who
showed bidirectional phonetic interaction in simultaneous bilinguals).
Finally, in contrast to the previous studies, 16 native Japanese children and 16 adults were
investigated at two data points over a one year period of time while they acquired English in the United
States (Oh et al., 2011). Their study revealed that in the adult group, no difference in Japanese vowel
production occurred over time, but that the adults likewise showed no changes in L2 English vowel
production across time, indicating that they had not actually acquired the English vowels. However,
changes in the child group did occur, and appeared to reflect their acquisition of English vowels. The
interpretation of this study suggested that there was an effect of age of acquisition (see also e.g. Ahn,
Chang, DeKeyser, and Lee‐Ellis (2017) and Bylund (2009), but also note the confound between
attrition and acquisition in children (Schmid, 2011)). However, as already mentioned, as the Japanese
adults had not actually acquired the English vowel productions, it is difficult to assert whether, had
they acquired the English productions, plasticity in their native speech would have been evidenced.
Summarising, these previous longitudinal studies provide insight into how the acquisition of
an L2 might impact adult native speech developments over time, but they are all restricted to shorter
time spans, ranging from two to four weeks (Tobin et al., 2017), to five weeks (Chang, 2012), to eleven
months (Sancier & Fowler, 1997), to one year (Oh et al., 2011). In Sancier and Fowler’s longitudinal
research it was ensured that the new language was acquired in the first place, which, it is argued here,
would have enabled it to have had the potential to influence native speech production.
In the current study, SG was investigated over four decades, both before and after
her move to the United States. As she moved to the United States midway through the recordings, it is
ensured that she had acquired English, although the extent of this acquisition was not measured.
Therefore, an in depth analysis into the trajectory of native speech plasticity over an individual lifespan
is possible, across numerous segmental and prosodic variables.
Voiced lateral approximant /l/ in German and English
The voiced lateral approximant /l/ is known to differ in German and English. In Standard German,
the back of the tongue is usually not constricted during the realisation of /l/ (Kufner, 1970; Moulton,
1970; Wells, 1982). This position of the dorsum is reflected in a higher second formant (F2)
frequency and a ‘clear’ (Gimson, 1989, p. 202) or ‘light’ /l/ (Olive, Greenwood, & Coleman, 1993,
pp. 204–216) lateral in German. However, the high F2 frequency of [l] in German can be influenced
by either regressive or progressive coarticulation (Neppert, 1999, pp. 229, 242). Contrastingly, in
AE, the back of the tongue is generally retracted during the realisation of word final laterals. The
constriction creates what can be termed a velarised, or even pharyngealised, lateral which is most
clearly reflected in the acoustic signal as a decrease in the frequency of the F2 (Hayward, 2000, p.
201; Kent & Read, 1996, p. 140; Olive et al., 1993, p. 207). When F2 frequency is low, the literature
refers to a ‘dark’ /l/ (Gimson, 1989; Olive et al., 1993), expressed by the IPA symbol [ɫ]. Based on
initial research by Sproat and Fujimura (1993), Ladefoged and Maddieson elaborate that in AE word
final /l/ may be more velarised than word initial /l/, but that both are characterised by a low F2
frequency (2007). Recasens confirms the lower F2 in the North AE /l/ than in German, additionally
stating that a lower F1 is observed in the lighter lateral than in the darker (2004).
Based on the characteristics of /l/ in German and English, as described above, it was
predicted that under the influence of English L2 acquisition, F1 would increase over time in the
speech of SG, indicating a darkening of her native German lateral. Similarly, as a result of English
acquisition, F2 was predicted to decrease over time in her speech, indicating a darkening of her
native German lateral. However, as previously mentioned, recent longitudinal research has indicated
that with increasing age, the F1 frequency of non-low vowels in female speakers decreases, although
this was not observed in a female German native speaker with little English immersion (Reubold &
Harrington, 2015, 2017; Reubold et al., 2010).
High front unrounded /i/ in German and English
Very generally, the high vowels in German are considered to be higher than those in English. For
example, Delattre has stated that English “close vowels are less close [than German close vowels]”
(Delattre, 1964, p. 53, as cited in Biersack, 2002, p. 62). Such articulatory descriptions are in line
with a lower F1 frequency in the German high front unrounded /i/ vowel, in comparison to the
English /i/. In line with these impressionistic descriptions, an acoustic study of /i/ found an average
F1 frequency of 437 Hz and an average F2 frequency of 2761 Hz for AE females (Hillenbrand,
Getty, Clark, & Wheeler, 1995, p. 3103). Another study reported an average F1 frequency for AE
women of 390 Hz and an average F2 frequency of 2826 Hz (Yang, 1996). In contrast, German
females have been reported to have an average F1 frequency of 302 Hz, and an average F2 frequency
of 2369 Hz (Künzel, 2001), and of respectively 302 Hz and 2533 Hz (Sendlmeier & Seebode, 2006).
Accordingly, it could be generalised that German /i/ is both slightly closer than the AE /i/, and also
potentially that German /i/ is slightly more back on the horizontal axis of the vowel space, due to the
reported lower F2 frequency in German /i/ than in English.
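The direction of these differences can be checked directly from the four reported means. The following Python sketch uses only the values cited above; the unweighted averaging across studies and all variable names are illustrative assumptions rather than part of the cited analyses, and the comparison ignores differences in speakers, elicitation, and measurement method between the studies.

```python
# Mean female /i/ formants (F1, F2) in Hz, as reported in the studies cited above.
AE_I = {
    "Hillenbrand et al. (1995)": (437, 2761),
    "Yang (1996)": (390, 2826),
}
GERMAN_I = {
    "Künzel (2001)": (302, 2369),
    "Sendlmeier & Seebode (2006)": (302, 2533),
}


def mean_formants(studies):
    """Unweighted mean (F1, F2) across studies (assumption: sample sizes ignored)."""
    f1 = sum(values[0] for values in studies.values()) / len(studies)
    f2 = sum(values[1] for values in studies.values()) / len(studies)
    return f1, f2


ae_f1, ae_f2 = mean_formants(AE_I)
de_f1, de_f2 = mean_formants(GERMAN_I)

# Positive differences mean the AE value is higher than the German value.
f1_diff = ae_f1 - de_f1
f2_diff = ae_f2 - de_f2
print(f"AE minus German /i/: F1 {f1_diff:+.1f} Hz, F2 {f2_diff:+.1f} Hz")
```

Both differences are positive, which is in line with the generalisation that German /i/ is closer (lower F1) and somewhat further back (lower F2) than AE /i/.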
Based on the characteristics of /i/ in German and English, as described above, it was
predicted that under the influence of English L2 acquisition, F1 would increase over time in the
speech of SG, indicating a more open pronunciation of her native German high front unrounded /i/
vowel. Similarly, as a result of English acquisition, F2 was predicted to increase over time in her
speech, indicating a more front pronunciation of her native German /i/. On the other hand, as SG
aged at the same time as acquiring more English, it was considered possible that F1 in her native
German speech would become lower over time given previous findings regarding effects of aging on
speech, although, again, the lowering effect of F1 in high vowels was not observed in a female
German native speaker with little English immersion (Reubold & Harrington, 2015, 2017; Reubold
et al., 2010).
Low front unrounded /a/ in German and English
Very generally, it has been considered that the German centre of gravity is higher than that of
English, i.e. “[English] centre of gravity is lower. And its low vowels are more extreme (close to
cardinal vowels) than its high vowels, which is not the case with [German]” (Delattre, 1964, p. 53, as
cited in Biersack, 2002, p. 62). In the past, there have been thought to be two low unrounded vowels
in both German and in English (Delattre, 1964). Traditionally, these were transcribed as /a/ and /ɑ/ in
German, with the former being a low front unrounded vowel and the latter being a low back
unrounded vowel; in English, similar corresponding vowels have been transcribed as /æ/ and /ɑ/ with
the former considered to be more front than the latter (Delattre, 1964). However, more recently, as
described, /a/ has been used to transcribe these vowels in both German and English. In early acoustic
analyses comparing German and English, research revealed that F1 frequency values for all of these
vowels were the same (i.e. 750 Hz); however, F2 values were reported to differ between /a/ in
German and /æ/ in English, respectively 1250 Hz in German /a/ and 1700 Hz in English /æ/
(Delattre, 1964). A recent acoustic investigation indicates that in AE female speakers, the average F1
value for /æ/ is 669 Hz, and the average F2 value is 2349 Hz (Hillenbrand et al., 1995). Similarly
examining AE female speakers, Yang (1996) reported average F1 and F2 values of respectively 825
Hz and 2059 Hz. A formant analysis of Standard German female speakers arrived at the values in /a/
of 710 Hz for F1 and 1505 Hz for F2; and in /a:/ of 781 Hz for F1 and 1462 Hz for F2 (Künzel,
2001). Similarly examining Standard German female speakers, Sendlmeier and Seebode (2006)
arrived at the average frequency values in /a/ of 836 Hz for F1 and 1586 Hz for F2; and in /a:/ of 896
Hz for F1 and 1517 Hz for F2. Although the latter studies into German separated long and short
vowels, in comparing the AE female speakers with the Standard German female speakers, one can
deduce that /a/ in German appears to be more back than its English equivalent, as reflected by lower
F2 values in German than in English. However, with regard to closeness of /a/, the acoustic findings
are less clear, although analyses would suggest that /a/ is more open in English than in German, as
reflected by higher F1 values in English than in German.
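How clear each of these two comparisons is can be illustrated by pairing every German value with every AE value reported above. The Python sketch below uses only the cited means of the more recent studies; the pairwise comparison itself and the variable names are illustrative assumptions, and pairing German /a/ and /a:/ with AE /æ/ across studies is a rough heuristic rather than a statistical test.

```python
# Mean female formants (F1, F2) in Hz, as reported in the studies cited above.
AE_LOW = {
    "Hillenbrand et al. (1995)": (669, 2349),
    "Yang (1996)": (825, 2059),
}
GERMAN_LOW = {
    ("Künzel (2001)", "/a/"): (710, 1505),
    ("Künzel (2001)", "/a:/"): (781, 1462),
    ("Sendlmeier & Seebode (2006)", "/a/"): (836, 1586),
    ("Sendlmeier & Seebode (2006)", "/a:/"): (896, 1517),
}

# Compare every German mean with every AE mean (4 x 2 = 8 pairings).
pairs = [(g, a) for g in GERMAN_LOW.values() for a in AE_LOW.values()]
f1_lower = [g[0] < a[0] for g, a in pairs]  # German less open than AE?
f2_lower = [g[1] < a[1] for g, a in pairs]  # German further back than AE?

print(f"German F2 lower in {sum(f2_lower)} of {len(pairs)} pairings")
print(f"German F1 lower in {sum(f1_lower)} of {len(pairs)} pairings")
```

On these figures the F2 difference holds in every pairing, whereas the F1 comparison depends on which studies are paired, which is consistent with the acoustic findings on closeness being described as less clear.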
Given these characteristics of /a/ in German and English, as described above, it was predicted
that as a result of English L2 acquisition, F1 would increase over time in the speech of SG, indicating
a more open pronunciation of her native German low front /a/ vowel. Similarly, as a result of English
acquisition, F2 would increase over time in her speech, indicating a more front pronunciation of her
native German /a/. However, the predicted increase of F1 frequency in her native German speech,
due to English L2 acquisition, was thought to potentially be amplified due to the age effect of open
vowels exhibiting a potential increase in F1 frequency as a function of biological aging although,
again, the lowering effect of F1 in high vowels was not observed in a female German native speaker
with little English immersion (Reubold & Harrington, 2015, 2017; Reubold et al., 2010).
Pitch in German and English
Impressionistically, English female voices have been reported to be high-pitched and “sound
aggressive and over-excited to the German hearer” (Gibbon, 1998, p. 89). Instrumental cross-
sectional research which has contrasted pitch in German and English indicates that German females
have a lower pitch level and a narrower pitch span than English females of the same age group
(Mennen et al., 2014, 2012; Mennen, Schaeffler, & Docherty, 2007; Scharff-Rethfeldt et al., 2008)
although the English L2 may also influence the German L1 (de Leeuw, 2009, 2019).
Therefore, in this stage of the current analysis, pitch level and span were investigated in order
to examine whether pitch level would increase and pitch span would widen under the continued
influence of English over SG’s lifespan. Given the above differences between German and English
pitch, it was predicted that as a result of English L2 acquisition, mean f0 would increase over time in
the native German speech of SG, indicating an increasing pitch level over time. Similarly, as a result
of English acquisition, 80%Range was predicted to increase over time in her native German speech,
indicating a widening of pitch span over time. Finally, due to English acquisition, maxf0 was
predicted to increase over time in the native German speech of SG, indicating wider maximum pitch
excursions.
As a function of biological age, it was nonetheless thought possible that f0 would decrease
over time (see e.g. Linville, 1996; Linville & Rens, 2001; Reubold & Harrington, 2015, 2017), which
could have a narrowing effect on 80%Range and decreasing effect on maxf0, although again, this
lowering effect on f0 was not observed in a female German native speaker with little English
immersion (Reubold & Harrington, 2015, 2017; Reubold et al., 2010).
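The three pitch measures named above (mean f0, 80%Range, and maxf0) can be sketched for a single utterance as follows. This is a minimal Python illustration under stated assumptions, not the procedure used in the study: 80%Range is assumed here to be the difference between the 90th and 10th percentiles of the voiced f0 values, unvoiced frames are assumed to be coded as 0, and the function name `pitch_measures` and the example values are hypothetical. An average maximum f0 would additionally require averaging the per-utterance maxima over many utterances.

```python
from statistics import fmean, quantiles


def pitch_measures(f0_track_hz):
    """Return (mean f0, 80% range, max f0) in Hz for one f0 track.

    Assumptions: frames with f0 <= 0 are unvoiced and are dropped, and the
    80% range is the 90th minus the 10th percentile of the voiced values.
    """
    voiced = [f0 for f0 in f0_track_hz if f0 > 0]
    deciles = quantiles(voiced, n=10, method="inclusive")  # 9 cut points
    p10, p90 = deciles[0], deciles[-1]
    return fmean(voiced), p90 - p10, max(voiced)


# Hypothetical f0 track: two unvoiced frames around a 100-200 Hz rise.
example = [0, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 0]
level, span, peak = pitch_measures(example)
print(f"mean f0 = {level} Hz, 80%Range = {span} Hz, maxf0 = {peak} Hz")
```

Using percentiles for the span, rather than the raw minimum and maximum, makes the measure less sensitive to isolated tracking errors at the extremes of the f0 distribution.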
Objectives of the study
The main objective of the present research was to expose the trajectory of plasticity in the spontaneous
native German speech of SG from youth to mid adulthood in the segmental and prosodic variables
described above. This objective was set to pursue the questions of whether and how native speech
remains plastic post adolescence within the context of late bilingualism and therefore expands on the
aforementioned monolingual and bilingual longitudinal research by delving into similar questions
within a late bilingual over an extended period of time. It has been suggested that the first years after
migration have the greatest influence on the extent to which an individual undergoes attrition (de Bot
& Clyne, 1994); however, as SG moved to the United States already fluent in English, the first years
after moving may have been less impactful. With regular contact to English before her move, and to
German after her move, as will be discussed, plasticity effects across the four decades might be more
variable, with ups and downs dependent also on recent language exposure (Sancier & Fowler, 1997).
That said, to date no longitudinal study has verified such claims, and the primary objective of the
current study was therefore to investigate plasticity in SG’s native German spontaneous speech as a
result of the acquisition of L2 English.
A secondary objective of this study was to investigate whether plasticity in the native speech
of a late bilingual, if evidenced at all, does so uniformly across all variables examined, or whether
certain variables are more likely to undergo change than others. It has been suggested that cross-
linguistic interactions operate at a system-wide level (Mayr et al., 2012), rather than at the level of
individual sounds, which is consistent with related research into early bilinguals (Guion, 2003),
short-term phonetic drift effects (Chang, 2012; Guion, 2003), as well as with the notion of language
specific phonetic settings (Laver, 1980; Mennen, Scobbie, de Leeuw, Schaeffler, & Schaeffler,
2010). The idea of a systematic L1 change is to a certain extent unsupported by models of L2
acquisition, such as the SLM (Flege, 1995) which maintains that perceptual similarity between
individual sounds underlies L2 speech perception, and therefore also potential merging between L1
and L2 counterparts, e.g. if L1 /a/ and L2 /a/ are perceived as similar, although they are different,
these counterparts will merge in production as well. Likewise, perceptual similarity between
individual properties of the L1 and the L2 underlies related models of speech perception and
production (Best, 1995; Escudero, 2005). Moreover, dissimilation effects, which have also been
reported in phonetic attrition studies (de Leeuw et al., 2012; Evans & Iverson, 2007; Flege &
Eefting, 1987), do not feed into the interpretation of systematic change cleanly. If there were a
system wide shift, it is unclear why and how some individuals evidence dissimilation whilst others
evidence assimilation for the same phonetic variable. For example, in de Leeuw et al. (2017), three
of the Albanian-English bilinguals evidenced dissimilation of coda-/l/, whilst three evidenced
assimilation. If a system-wide change were occurring in the bilinguals’ speech, it would be more
intuitive to assume that coda-/l/ would move in the same direction for all of the speakers. As this did
not occur, it suggests that other factors, such as socio-indexical information, most likely also play a
role alongside systematic changes (i.e. L2 system to L1 system) in determining L2 influence on the L1.
Through examination of numerous segments, particularly focussing on formant values,
it will be possible to determine whether e.g. the vowels follow a general trend towards more open
(higher F1) realisations, as in Mayr et al. (2012), who examined the effect of English late L2
acquisition on Dutch native speech, or whether differences are observed between variables over time.
A final objective of this study was to investigate the potential effects of biological aging on
the speech of SG in a late bilingual context. As previously discussed, some research has shown that
aging effects cause a decrease in female f0 over time, a decrease in the F1 of high vowels, and an
increase in the F1 of low vowels (Linville, 1996; Linville & Rens, 2001; Reubold & Harrington,
2015, 2017; Reubold, Harrington, & Kleber, 2010). However, it has also been shown in the German
female speaker Dagmar Berghoff, who had no documented increased English exposure, that there
were no noticeable changes in F1 of her high vowels, and no rising trend in F1 of low vowels.
Moreover, f0 only started to drop at 50 years of age, which could suggest that no decrease in f0
would be expected in the recordings of SG, and that there would be no F1 changes as a result of
biological aging. However, the findings from Berghoff run counter to the previous case studies
investigated by Reubold et al., and it may be that she therefore represents a highly trained anomaly.
Based on these three objectives, the following general hypothesis was established.
The production of German native speech patterns, as measured in segmental and prosodic variables,
will change over time, as they are influenced by increased exposure to English counterparts.
Systematic changes regarding variation between variables were considered important to
bear in mind, as well as any biological aging effects which may have overridden or enhanced L2
effects.
Methodology
Stefanie Graf
The subject of this case study is Stefanie Graf (SG), born 14 June, 1969 in Mannheim, Germany, and
raised in a monolingual German environment in Brühl, Germany. SG is the world’s longest-reigning
number 1 ranked tennis player and is widely recognised as the world’s greatest tennis player (see both
the Women’s Tennis Association and the Association of Tennis Professionals)3.
She was coached by her German father until 1986 (Finn, 1991; Ostler,
1987). When she was 17 years of age, a new Czech coach, Pavel Slozil, was hired to supplement her father’s
coaching, at which point her exposure to English increased as she travelled more frequently on the
international tennis circuit. At 30 years of age, in part due to injuries, SG retired from her tennis career
and was fully immersed in English upon moving to Las Vegas, USA in 2000 to be with her husband,
Andre Agassi (Danis, 1999). However, SG will have started to acquire English before she moved to
Las Vegas; she will have learned English at school in Germany, and have used English frequently in
an international environment as a professional tennis player, both in Germany and abroad, and her age
of English acquisition (AOA) is therefore earlier than the onset of her move to Las Vegas.
Since moving to Las Vegas, SG has reported a decrease in the amount of German she uses. For
example, in 2009 (40 years of age) she stated that her family mostly speaks English, although she
could not really ever lose touch with German.
In unserer Familie in Las Vegas unterhalten wir uns überwiegend auf
Englisch. Aber ich spreche mit meinen Kindern auch Deutsch, und sie
verstehen es gut. Außerdem bin ich ständig mit meinem Büro in Deutschland
oder mit meiner Stiftung in Verbindung. So richtig rauskommen kann ich also
gar nicht.
In our family in Las Vegas, we mostly speak English. But I also speak German
with my children, and they understand it well. Besides, I’m constantly in
contact with my office in Germany, or with my charity. So I can’t
really ever lose touch. (Herffs, 2009)
3 See [www.wtatennis.com] and [www.atpworldtour.com].
However, later in 2012 (43 years of age), she is reported to have stated that she sometimes has
difficulties finding the right words in German.
Ja, wenn ich mit ihnen allein bin [spreche ich mit den Kindern Deutsch]. Wenn
ich aber will, dass sie etwas auf jeden Fall verstehen, spreche ich Englisch mit
ihnen. Außerdem unterhält sich auch meine Mutter mit ihnen auf Deutsch. Ich
habe aber mittlerweile manchmal Schwierigkeiten, die richtigen Worte auf
Deutsch zu finden.
Yes, when I’m alone with the children, [I speak German with them]. But when
I want to make sure that they definitely understand something, I speak English
with them. My mother also speaks German with them. But lately I’ve
sometimes had difficulties finding the right words in German. (Hüdaverdi &
Mitatselis, 2012)
However, although living in Las Vegas full time, SG has continued to maintain a strong connection
to Germany. She has reported that she returns five or six times annually for personal and
professional reasons related to her charity Children for Tomorrow, and she distinguishes between the
German terms “zu Hause” (Eng. at home) and “Heimat” (Eng. homeland) (2013, 44 years old).
Dadurch, dass meine Kinder in die Schule gehen, [komme ich] etwas seltener
[nach Deutschland] [...]... Etwa fünf-, sechsmal im Jahr. Im Sommer, während
der Schulferien, versuchen wir immer etwas länger zu bleiben... Heimat wird
immer Deutschland bleiben. Zu Hause aber ist dort, wo meine Familie ist.
As my children go to school, I come less frequently to Germany [...] Around
five or six times per year. In the summer during the school holiday, we always
try to stay longer... The homeland will always be Germany. But I’m at home
where my family is. (Hungermann, 2013)
More recently, in 2016 (47 years of age), she is reported to have stated that she rarely speaks
German.
Ich spreche [Deutsch] mittlerweile eigentlich nur noch, wenn mir irgendwas
nicht passt. Dann rutscht mir "Mensch, jetzt mach mal!" raus und [die Kinder]
wissen: Jetzt müssen sie wirklich aufpassen.
I actually only ever speak German now when I’m upset. Then "Mensch, jetzt
mach mal!" (Eng. C’mon, hurry up!) slips out and the children know – now
they really have to be careful. (Mol-Wolf & Voss, 2016)
Although SG has had an exceptional career as a tennis player, her descriptions of her language use
patterns as a late bilingual can be interpreted as representative of many other late bilinguals who have
immigrated to a new country in adulthood. Despite the geographical distance between Germany and the
United States, she may be able to return more regularly than others who have emigrated from Europe
to North America (see e.g. Schmid, 2004 and de Leeuw et al., 2010; in the latter, late bilinguals return
on average less than once per year to Germany from Canada), but many late bilinguals might also have
family and friends who visit them regularly, or live in closer proximity to their home countries, thereby
enabling them to return more frequently. In some ways her language exposure, which was
characterised by regular exposure to English before moving to Las Vegas, as well as continued frequent
contact with German after the move to Las Vegas, is perhaps more representative of present-day
bilingual immigrants in a connected world, where international travel can be considered less costly and
faster than before the turn of the 21st century. Nevertheless, by the very nature of the move, such late
bilinguals, like SG, who have established themselves in a new country, are likely to speak their new
language far more regularly than their native language, and most definitely more regularly than they did
in their country of origin. The findings from the phonetic analysis of SG’s native speech are therefore interpreted as
applicable to other late bilinguals who have immigrated to a new country in adulthood, although due
to the fact that this case study is based on only one speaker, it is considered possible that some of the
findings may in part be due to unique characteristics and developments in the speech of SG.
Recordings
Recordings of SG were collected from various online German media resources, such as ZDF and
RTL Mediathek. The recordings were conducted in various settings, for example after SG won a
match on the tennis court, in an interview room in either Germany or abroad, or at her charity in
Hamburg. Initially, 40 recordings were selected. From this total, 36 recordings of SG were finally
examined, ranging in duration from approximately 1-20 minutes. Four recordings were discarded
because there was too much background noise in them, e.g. people were cheering, or the recording
was overlaid with music. In total, 1250 segments were analysed in all 36 recordings, with an average
of 38 segments per recording (in total 491 /l/ segments, 261 /i/ segments and 175 /a/ segments). For
the prosodic analysis, one value for each variable was obtained per recording as these were long term
durational measurements over the entire recording.
The recordings were transcribed by hand using Praat (Boersma & Weenink, 2010) initially at
the sentence level orthographically. Thereafter words containing the relevant segmental variables
were located and the variables were transcribed using IPA including the segments on either side of
the targeted segments. The advantage of hand transcriptions in this case was that all recordings were
listened to by the author in detail, such that a provisional impressionistic analysis could also be
conducted, as the content of the interviews was also considered informative. Once the recordings had
been annotated, relevant phonetic variables were measured, as detailed below.
The original sampling frequency of the recordings varied, as in the previous longitudinal
studies into monolinguals (Harrington, 2006; Reubold & Harrington, 2017; Reubold et al., 2010)
from 16 kHz to 22 kHz to 24 kHz, largely due to technological developments over time. Through the
downloading process, recordings were resampled uniformly to 44.1 kHz.
As already mentioned, recordings were all of spontaneous speech, and therefore comprise
false-starts, hesitation markers, long pauses, and interviewer questions which were rejected by SG,
due to their inappropriate or personal nature. The overall impression from the recordings is that her
speech was informal, and therefore representative of naturalistic communicative events experienced
by both monolinguals and bilinguals who, in their day to day lives, encounter different people from
varying backgrounds, with different communicative agendas. For this reason, the analysis of these
recordings is considered to enhance previous studies implementing controlled word list elicitation
methods (see e.g. de Leeuw, Mennen, & Scobbie, 2012, 2013; de Leeuw, Tusha, & Schmid, 2017;
Mayr, Price, & Mennen, 2012) and read speech (see e.g. Harrington, 2006; Reubold et al., 2010).
Furthermore, as a public figure, SG was interviewed and recorded without the primary
objective of language testing, but rather to gather information regarding her tennis performances, visits
to Germany, family life in Las Vegas, or regarding her career. Accordingly, the methodological
constraint which can potentially arise in longitudinal linguistic studies of bilinguals, that the repeated
testing of variables within the bilingual group may disturb the “natural course of the process [the test]
hoped to track down” (Jaspaert & Kroon, 1989, p. 81), is considered to have been circumvented in the
current research, because no “test” as such was consistently implemented during these recordings
which would have potentially maintained her L1.
Segmental analysis
The primary aim of the segmental analysis was to examine the trajectory of native speech segments
in the spontaneous German speech of SG, and to see whether these segments might evidence
different trajectories under the influence of English L2 acquisition. To do so, specified formant
frequencies were examined which are known to differ between English and German, as described
previously. Formant frequencies were measured in the lateral approximant /l/ (F1 and F2); the rhotic
/r/ (F3); the high front vowel /i/ (F1 and F2); the low front vowel /a/ (F1 and F2); and the high back
vowel /u/ (F1 and F2). However, only the following sounds exhibited significant changes; therefore,
due to limitations in space, only these segments were analysed in more detail.
voiced lateral approximant /l/
high front unrounded /i/
low front unrounded /a/
Measuring formants
Formant frequencies were measured in Praat (Boersma & Weenink, 2010) with standard settings:
the window length was set to 25 ms, five formants were displayed in the spectrogram with a
maximum formant frequency of 5500 Hz, and pre-emphasis was applied from 50 Hz. In Praat, Burg’s
algorithm (Andersen, 1974) is used to extract formant frequencies from the speech signal through linear
predictive coding. As in Harrington et al. (2010), formants were sometimes inspected using
narrowband (50 Hz) as well as wideband displays, especially when F1 and F2 were very close
together.
All segments (i.e. /l/, /i/ and /a/), measured at the mid-point, were delineated manually
(Moosmüller, Schmid, & Kasess, 2015), initially through listening to the recordings and observing
where the segments occurred. Segments were not included if their duration was so short that no
reliable measurement could be taken (Harrington, Palethorpe, & Watson, 2007). Changes in the
waveform and spectrogram’s intensity as well as formant transitions were used as the main
determiners of segmental boundaries (Chang, 2012; Mayr et al., 2012; Moosmüller et al., 2015).
Onset of /l/, /i/, and /a/ was marked at the first glottal striation where the formants were visible, while
offset was marked either at the final glottal striation or at the point where a clear F1 and F2 were no
longer visible. F1 and F2 were then measured at the mid-point of the segment, as in e.g. Harrington,
2006; Reubold et al., 2010, in order to minimise coarticulation effects.
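The mid-point rule described above can be illustrated with a minimal sketch. The function name and the minimum-duration cut-off of 30 ms are hypothetical: the study excluded segments that were too short for a reliable measurement but does not report a numeric threshold, and the formant values themselves were read off in Praat rather than computed in this way.

```python
def segment_midpoint(onset_s, offset_s, min_duration_s=0.03):
    """Return the mid-point time (in seconds) of a manually delineated segment.

    Returns None when the segment is shorter than min_duration_s, mirroring
    the exclusion of very short segments. The 30 ms default is a hypothetical
    illustration, not a value reported in the study.
    """
    duration = offset_s - onset_s
    if duration <= 0:
        raise ValueError("offset must be later than onset")
    if duration < min_duration_s:
        return None  # too short for a reliable formant measurement
    return onset_s + duration / 2.0


# Example: a segment annotated from 1.0 s to 1.5 s is measured at 1.25 s.
midpoint = segment_midpoint(1.0, 1.5)
```

Measuring at the temporal centre keeps the measurement point as far as possible from both segment boundaries, which is where coarticulatory formant transitions are strongest.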
Prosodic analysis
The primary aim of the prosodic analysis was to examine whether the acquisition of English pitch
patterns would affect native speech pitch patterns in SG. A number of prosodic variables were
extracted from the pitch contour in the speech of SG: mean f0 (pitch level), 80%Range (pitch span),
and maxf0, as described below.
Measuring pitch
For this stage in the analysis, recordings were visually and auditorily inspected in Praat before
running a script to extract the pitch variables. For the inspection, the waveform and spectrogram
were initially examined in 5-10 second intervals, and where the pitch contour was viewed to “jump
up and down, doubling or halving the actual f0” (Styler, 2013, p. 15), the audio files were examined
in more detail. In doing so, sections of approximately 1000ms were further visually and auditorily
inspected in order to gauge whether the pitch contour indeed reported an accurate f0. In some cases,
the individual cycles of the waveform were also examined and measured to compare the manual
pitch measurement with the pitch contour. Following the guidelines below, where the pitch contour
in Praat was considered to be unreliable, these portions were deleted, as they would have otherwise
adversely affected the overall f0 measurements (Styler, 2013, p. 17).
1. Where creaky voice occurred, the pitch contour in Praat was considered to be unreliable
because although auditorily a low pitch was heard, and verified by a visual single cycle
inspection of the waveform, the pitch contour in Praat often reported a high f0, or sometimes
even dropped out altogether (see also Styler, 2013).
2. Similarly, at some points in the analysis, where no speech was actually produced, the pitch
contour was nevertheless reported in Praat. In these cases, the portion of silence was also
deleted from the original recording, as, again, an inaccurate f0 was reported in Praat.
Analysis settings were those recommended for women: pitch floor was set to 100 Hz while pitch
ceiling was set to 500 Hz. Thereafter, a number of different values were obtained. For pitch level,
mean f0 (Hz) (Mennen, Schaeffler, & Docherty, 2007) was extracted, in order to examine whether
over time the pitch of SG would increase under the influence of English, or decrease as a function of
biological age. Maximum pitch excursions have also been shown to socially index speakers in
conversation (Podesva, 2007); as such, maximum f0 (maxf0) was also extracted to investigate
whether - although mean f0 might have decreased with time - an increase in maximum f0 excursions
might have occurred. Finally, for pitch span, the difference between the 90th and 10th percentiles
(80%Range) in semitones (ST) was obtained.
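As a hedged illustration of how the three pitch variables relate to one another, the sketch below computes them from a list of f0 values in Hz, such as the voiced frames remaining after the unreliable portions have been removed. It is not the Praat script used in the study. The function names, the linear-interpolation percentile and the conversion of the span to semitones as 12 × log2(P90 / P10) are assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Percentile (q between 0 and 100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def pitch_summary(f0_hz):
    """Summarise one recording: mean f0, maximum f0 and 80% range."""
    p10 = percentile(f0_hz, 10)
    p90 = percentile(f0_hz, 90)
    return {
        "mean_f0_hz": sum(f0_hz) / len(f0_hz),   # pitch level
        "max_f0_hz": max(f0_hz),                 # maximum excursion
        "range80_st": 12 * math.log2(p90 / p10)  # pitch span in semitones
    }


# Hypothetical f0 track (Hz), for illustration only.
example = pitch_summary([90, 100, 120, 130, 140, 150, 160, 170, 180, 200, 210])
```

Expressing the span in semitones rather than Hz makes it a ratio measure, so spans can be compared across recordings even when the overall pitch level differs.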
Statistical analysis
For both the segmental and prosodic results, data were organised in CSV files using Excel software.
Thereafter, R software (R Core Team, 2017) was used for the analyses and a series of linear mixed-
effects regression models were built using the lme4 package (Bates, Maechler, Bolker, & Walker,
2015). The fixed effects were (1) age of SG (continuous variable from 12 to 48 years of age), which
was always included, and (2) position in syllable (three levels: onset, nucleus and coda), which was
included for /l/ but not for /i/ and /a/; the random effects were type of word (e.g. verb, noun,
preposition), preceding sound, and following sound. The
choice of age of SG as the fixed factor was based on the general hypothesis of the study, as in e.g.
Harrington et al. (2010). The choice of position in syllable was motivated by the observation that the
light lateral occurs more frequently in onset and the dark lateral more frequently in coda position in
AE (Ladefoged & Maddieson, 2007; Sproat & Fujimura, 1993), whilst the vowels were always in
nucleus position. For the latter random factors, no specific predictions were made with regard to how
they would influence the dependent variable, as e.g. the specific sound represented the preceding and
following sound, not a categorical distinction such as front versus back. Using the lmerTest package
in R (Kuznetsova, Brockhoff, & Christensen, 2017) with both the summary and step functions (the
latter performs automatic backward model selection of the fixed and random parts of the linear
mixed model), linear mixed models were fit by REML, with t-tests using Satterthwaite
approximations to degrees of freedom. An alpha level of 0.05 was used throughout for hypothesis
testing (model outputs are displayed in Table 2).
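Written out, the full model for /l/ can be sketched as follows. This is an illustrative reading rather than the exact fitted specification: it assumes random intercepts only for the three grouping factors, which the description above does not state explicitly, and the symbols are introduced here for illustration.

```latex
F_{i} = \beta_0 + \beta_1\,\mathrm{age}_{i}
      + \boldsymbol{\beta}_2^{\top}\,\mathbf{position}_{i}
      + u_{w[i]} + v_{p[i]} + z_{f[i]} + \varepsilon_{i},
\qquad
u_{w} \sim \mathcal{N}(0, \sigma_u^2),\;
v_{p} \sim \mathcal{N}(0, \sigma_v^2),\;
z_{f} \sim \mathcal{N}(0, \sigma_z^2),\;
\varepsilon_{i} \sim \mathcal{N}(0, \sigma^2)
```

Here F_i is the formant frequency (F1 or F2) of token i, position_i is the dummy-coded three-level position in syllable, and u, v and z are the random intercepts for the word type w[i], preceding sound p[i] and following sound f[i] of that token. For /i/ and /a/, which only occurred in nucleus position, the position term drops out.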
Results
Long-term changes in segmental variables
The first linear mixed model regression was conducted with F1 frequency of /l/ as the dependent
variable to see if, as predicted, there would be an increase in F1 over time. Only the fixed factor of
age had a significant influence on F1, i.e. χ2[1] = 33.6804, p < .0001, whereas position in syllable did
not. As displayed in Figure 1, F1 frequency decreased over time in contrast to the L2 acquisition
prediction. Of the random factors, both preceding and following sound had a significant influence on
F1, i.e. respectively χ2[1] = 16.23, p < .0001 and χ2[1] = 21.21, p < .0001, whereas type of word did
not. In the second regression, the dependent variable was F2 frequency, in order to see whether there
was a significant darkening of /l/ over time as a result of English L2 acquisition. Again, only the
fixed factor of age had a significant influence on F2, i.e. χ2[1] = 10.522, p < .01. As shown in Figure
1, with increasing age, and therefore increasing exposure to English, F2 decreased, i.e. the lateral
became darker over time. Note additionally that the proportion of laterals in onset position increased
after her move to Las Vegas from 53% of total lateral tokens pre-move to 69% post-move, thus
revealing that even though laterals in onset position became more frequent, there was still a
significant decline in F2 frequency. Of the random factors, both preceding and following sound had a
significant influence on F2, i.e. respectively χ2[1] = 43.50, p<.0001 and χ2[1] = 62.81, p<.0001.
Figure 1: Average F1 and F2 frequencies in /l/ of native speech of SG, displaying significant
decreases over time, interpreted to reflect the effects of biological aging, 95% CI.
For the analysis of /i/, the dependent variable was F1 frequency, in order to see whether an
increase would occur. As /i/ was only ever in nucleus position, the fixed effect was only age of SG.
The fixed factor of age had a significant influence on F1, χ2[1] = 13.283, p < .001, with a decrease
in overall F1 frequency revealed, in contrast to the prediction as a result of L2 acquisition (see Figure
2), most likely as a result of natural aging effects. Of the random factors, only following sound had a
significant influence on F1 χ2[1] = 26.27, p < .0001. In the second analysis of /i/, the dependent
variable was this segment’s F2 frequency, in order to see whether it would increase over time. As
expected, the fixed factor of age had a significant influence on F2, i.e. χ2[1] = 24.0217, p < .0001,
with, as predicted, an overall increase in F2 frequency (see Figure 2). Here, the random effects of
preceding sound and type of word were significant factors, respectively χ2[1] = 4.92, p < .05 and
χ2[1] = 5.20, p < .05.
Figure 2: Average F1 and F2 frequencies in /i/ of native speech of SG, displaying a significant
decrease in F1 over time and a significant increase in F2. 95% CI.
In the next analysis of /a/, the dependent variable was F1 frequency, in order to see whether it
would increase as a result of English L2 acquisition with the fixed effect of age of SG and random
effects of type of word, preceding sound, and following sound. Only the fixed factor of age had a
significant influence on F1, i.e. χ2[1] = 10.7448, p < .01, with, as predicted, an increase in overall F1
frequency revealed (see Figure 3), potentially augmented by natural aging effects. Of the random
factors, again, only following sound had a significant influence on F1 χ2[1] = 4.53, p < .05. In the
second analysis of /a/, the dependent variable was F2 to see whether it would potentially increase, as
predicted as a result of English L2 acquisition. The same fixed and random effects were used as in
the previous step of the /a/ analysis; however, this test was not significant.
Figure 3: Average F1 frequencies in /a/ of native speech of SG, displaying a significant increase in
F1 frequency over time, interpreted to reflect the effects of English L2 acquisition, although
potentially augmented by natural aging effects, 95% CI.
The segmental findings are summarised in Figure 4, which shows ellipses of /l/, /i/ and /a/
drawn in ggplot (Wickham, 2016) with the default confidence level of .95, i.e. 95% of the spontaneous
speech tokens of SG fell within these ellipses. In addition to the changes in F1 and F2
mentioned above, the ellipses also show a high level of variation in the segmental speech production
of SG both before and after her move to Las Vegas, as also reported in Table 1.
Figure 4: Ellipses of /l/, /i/ and /a/ before and after SG’s move to Las Vegas.
Results /l/    F1 before move    F1 after move    F2 before move    F2 after move
mean           468.1             414.2            1776.8            1669.9
stdev          94.9              98.7             238.6             258.2
n              174               279              178               287

Results /i/    F1 before move    F1 after move    F2 before move    F2 after move
mean           413.3             371.02           2117.5            2261.5
stdev          88.4              61.2             213.0             208.9
n              86                169              86                169

Results /a/    F1 before move    F1 after move    F2 before move    F2 after move
mean           723.2             779.4            1530.5            1485.1
stdev          102.1             127.3            174.4             250.7
n              80                92               80                92
Table 1: Means, standard deviations and number of tokens of F1 and F2 (Hz) of /l/, /i/ and /a/
before and after SG's move to Las Vegas.
                           Estimate     Std. Error   t-value   p-value
F1 of /l/
(Intercept)                525.7500     19.4181      27.075    < 2e-16 ***
Age of SG                  -2.1131      0.3713       -5.691    2.32e-08 ***
Position in word, onset    -28.9442     18.4061      -1.573    0.124
F2 of /l/
(Intercept)                1800.6949    59.4662      30.281    < 2e-16 ***
Age of SG                  -2.9943      0.9604       -3.118    0.00194 **
Position in word, onset    -107.0731    60.0187      -1.784    0.08146
F1 of /i/
(Intercept)                426.6912     19.7134      21.645    < 2e-16 ***
Age of SG                  -1.4039      0.3817       -3.678    0.000289 ***
F2 of /i/
(Intercept)                2025.887     56.203       36.046    < 2e-16 ***
Age of SG                  5.681        1.178        4.823     2.46e-06 ***
F1 of /a/
(Intercept)                685.1481     27.1978      25.191    < 2e-16 ***
Age of SG                  2.3907       0.7371       3.244     0.00147 **
Table 2: Estimates, standard errors, t values and p values for each linear mixed effect model,
significance codes: *** < 0.001; ** < 0.01, * < 0.05.
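To make the age estimates in Table 2 more concrete, the following sketch converts each slope (Hz per year) into the change implied over the 12 to 48 year age range covered by the recordings. It uses only the intercepts and age slopes from Table 2. The position term and the random effects are ignored, so the values describe the fixed-effect line at the reference level of position only and are not model predictions for individual tokens. The dictionary and function names are hypothetical.

```python
# (intercept, age slope) pairs copied from Table 2; units are Hz and Hz per year.
TABLE2_AGE_MODELS = {
    "F1 of /l/": (525.7500, -2.1131),
    "F2 of /l/": (1800.6949, -2.9943),
    "F1 of /i/": (426.6912, -1.4039),
    "F2 of /i/": (2025.887, 5.681),
    "F1 of /a/": (685.1481, 2.3907),
}


def fixed_effect_line(measure, age):
    """Value of intercept + slope * age for one measure (Hz)."""
    intercept, slope = TABLE2_AGE_MODELS[measure]
    return intercept + slope * age


def change_over_span(measure, start_age=12, end_age=48):
    """Change in Hz implied by the age slope between two ages."""
    _, slope = TABLE2_AGE_MODELS[measure]
    return slope * (end_age - start_age)


for name in TABLE2_AGE_MODELS:
    print(f"{name}: {change_over_span(name):+.1f} Hz from age 12 to 48")
```

Read this way, the slope for F2 of /l/ corresponds to a lowering of roughly 108 Hz across the period studied, and the slope for F2 of /i/ to a raising of roughly 205 Hz.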
Long-term changes in prosodic variables
In the prosodic analysis, only one measurement was obtained for mean f0, 80%Range and maxf0 per
recording (i.e. three measurements per recording), and, therefore, no random effects were measured.
Accordingly, a single linear regression was conducted with age of SG as the dependent variable and
mean f0, maxf0, and 80%Range as independent variables. This model proved to be significant
(F(3,31)=9.148, p<.0001) with a total adjusted R2 of .418. Only mean f0 added significantly to the
model, p<.0001 with a standardized beta value of -.609. The Pearson’s correlation test indicated a
significant negative relationship between mean f0 and age of SG (r=-.645, p<.0001), indicating that
as SG became older, her pitch decreased, as expected as a result of natural aging effects. Moreover, a
Pearson’s correlation for 80%Range and age of SG was also significant (r=-.315, p<.05), indicating
that with increasing age, her pitch span decreased. Surprisingly, although her overall pitch decreased
with time, her maxf0 significantly increased with age (r=.287, p<.05). This difference also proved to
be significant in a dependent t-test comparing maxf0 before her move to Las Vegas with after her
move (t(16) = -2.218, p<.05, r=.49), revealing a large effect size (Figure 5), although both before and
after her move to Las Vegas there was a high level of variation in maxf0 (see Table 3).
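An effect size r for a t-test is commonly derived from the t statistic and its degrees of freedom as r = sqrt(t² / (t² + df)). The text does not state how r was obtained, so the sketch below assumes this conversion; the function name is hypothetical. Applied to t(16) = -2.218, it yields a value of roughly 0.48 to 0.49, in line with the reported r.

```python
import math


def r_from_t(t_value, df):
    """Effect size r from a t statistic: sqrt(t^2 / (t^2 + df))."""
    t_squared = t_value ** 2
    return math.sqrt(t_squared / (t_squared + df))


# Values reported for the maxf0 comparison before and after the move.
effect_size = r_from_t(-2.218, 16)
```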
Figure 5: Boxplot indicating significantly higher maximum f0 (Hz) in the speech of SG after moving
to Las Vegas in comparison to before.
Results mean f0 (Hz)     Before move to Las Vegas    After move to Las Vegas
mean                     218.6                       197.5
stdev                    18.9                        9.1

Results 80%Range (ST)    Before move to Las Vegas    After move to Las Vegas
mean                     5.0                         3.9
stdev                    3.7                         1.0

Results maxf0 (Hz)       Before move to Las Vegas    After move to Las Vegas
mean                     358.2                       418.6
stdev                    82.1                        81.5
Table 3: Means and standard deviations of mean f0 (Hz), 80%Range (ST) and maxf0 (Hz) before and
after SG's move to Las Vegas.
Discussion
The primary objective of the present research was to expose the trajectory of plasticity in the
spontaneous native German speech of SG from youth to mid adulthood as a consequence of English
L2 acquisition. As hypothesised, modifications to L1 segmental and prosodic variables occurred over
time, most likely having been influenced by increased exposure to English counterparts. With regard
to the front-back dimension of /l/, /i/ and /a/, results showed a significant lowering of the second
formant (F2) frequency in /l/, possibly suggesting an overall darkening of the German lateral under
the influence of English as a second language (L2), as well as a significant increase in the F2
frequency of /i/, indicating a more front pronunciation, as predicted due to English L2 acquisition
(see Figure 4). Through an analysis of spontaneous speech, these findings corroborate previous
cross-sectional research into attritional effects in the lateral of German native speakers who have
acquired North AE (Bergmann et al., 2016; de Leeuw et al., 2013, 2017). However, no changes were
observed in the F2 frequency of /a/, which substantiates previous cross-sectional findings into
phonetic attrition in this same sound (Bergmann et al., 2016), and it is also important to note that in
the preliminary analysis of the data, no significant changes occurred in the F1 and F2 frequencies of
/u/ nor in the F3 frequency of /r/. More generally, these segmental findings support previous short-
term longitudinal research into changes in native speech as a result of the acquisition of a new
language in adulthood (Chang, 2012; Sancier & Fowler, 1997), although the current findings do so
over a much longer time span of 40 years.
With regard to the prosodic level of analysis, the average maxf0 in all recordings increased
significantly with age, as predicted as a result of English L2 acquisition. This finding enhances
previous research which suggests that in phonetic studies examining change in the L1 as a result of
L2 acquisition, “the most frequent deviations are found on the suprasegmental level” (Giesbers,
1997, p. 166), as in the current investigation only one of the three prosodic hypotheses was upheld,
whilst biological aging effects also influenced the native speech of SG over time, as will be
discussed shortly. The prosodic results also support previous cross-sectional studies into attrition of
prosodic elements of native speech which indicate that L2 English prosody affects L1 German
prosody (de Leeuw, 2019; de Leeuw et al., 2012; Mennen, 2004).
It is important to note that both before and after SG’s move to Las Vegas, there was a wide
range of variation in her spontaneous speech production with regard to both the segmental and
prosodic variables (see Table 1 and Table 3). This variation was most likely influenced by the fact
that the analysed speech was spontaneous, and therefore less controlled than e.g. word lists, which
control preceding and following sounds and to date have often been used in studies related to the
effects of the L2 on the L1 (see for example de Leeuw et al., 2013, 2017; Major, 1992; Mayr et al.,
2012). Note that the random factors of preceding and following sounds were often significant in the
linear mixed models, which was unsurprising given the spontaneous nature of the recordings.
Future research into the relationship between specific phonetic contexts and target
variables may nevertheless prove to be fruitful, i.e. it may be the case that the preponderance of
particular contexts at certain recording times adds to an explanation of changes in speech over time.
With regard to the lateral, however, there was a darkening over time, even though more laterals
occurred in onset position after her move to Las Vegas than prior to her move, and, moreover,
position in syllable was not a significant fixed effect.
The wide range of variation may also have arisen due to recency of native language exposure,
as found in Sancier and Fowler (1997). For example, if SG had only recently returned to Germany
when a specific recording took place, it may have been possible to see more L2 effects on the L1
than had the recording taken place after she had already stayed in Germany for a longer period of
time. Although most of the recordings were made in Germany, and the year of the interviews was
known, it was not known, for example, at what point the interview was made during her stay in
Germany. Such short-term effects of ambient language exposure may have counter-acted or even
enhanced any long-term effects, which might have been evidenced had SG returned less frequently
to Germany (see de Bot et al., 2007; de Bot & Schrauf, 2010; de Leeuw, Opitz, et al., 2013 in
relation to the emphasis on short and long term changes in native speech with regard to Dynamic
Systems Theory, as well as Paradis, 2004, 2007 in relation to language exposure influencing
language production with regard to the Activation Threshold Hypothesis). However, it is likewise
possible that the frequency of SG’s visits may have made L2 effects on the L1 more likely as it has
also been found that interaction may be more prevalent where dual activation occurs commonly
(Mora, Keidel, & Flege, 2015), although it is not known how strictly German and English were
separated during SG’s visits to Germany.
Moreover, it may be that SG participated in sound changes already occurring in Standard
German, and that her /l/, /i/ and maxf0 changed as a result of these more general sound changes,
in which she could have taken part given her frequent visits back to Germany,
rather than her German being influenced by English L2 acquisition at the individual level. A number
of studies have examined contact situations in which languages undergo phonological convergence
(Bullock & Gerfen, 2004; Colantoni & Gurlekian, 2004; Heselwood & McChrystal, 1999; Mayr,
Morris, Mennen, & Williams, 2015), and as English is acquired en masse in German schools, it may
be that this has influenced the German language more generally in Germany and that SG participated
in this sound change. However, the author is not aware of studies documenting sound changes in
Standard German with regard to changes in the front-back dimension of /l/ and /i/, nor with regard to
increases in maxf0, and this possibility is therefore not considered likely in the current study,
although such general language change phenomena would be important to consider in future research
examining L1 changes in immigrant populations.
In addition to recency of language exposure effects, the variation in the segmental and
prosodic variables of SG may also have arisen due to potentially different levels of formality in the
interviews, for example, whether she already knew her interviewer well, and felt relaxed with him or
her. Again, it was impossible to deduce this information from the recordings of her spontaneous
speech, but previous cross-sectional research suggests that level of formality influences phonetic
attrition with more formal settings eliciting less phonetic attrition than informal settings (Major,
1992; see also Shockey, 1984 with regard to new dialect acquisition).
Moreover, there is a growing body of research which suggests that heritage language markers
may be retained in second generation speakers to fulfil socio-indexical functions on the part of the
heritage language speakers (Alam & Stuart-Smith, 2011; Heselwood & McChrystal, 1999, 2000;
Kirkham, 2011; Sharma & Sankaran, 2011) and some preliminary studies which suggest that even in
first generation late bilingual immigrants, general attrition processes (i.e. the L2 influencing the L1)
are influenced by socio-indexical factors (de Leeuw, 2019; Passoni, Mehrabi, Levon, & de Leeuw,
2018). It may be that the variation evidenced in the speech of SG was in part influenced by socio-
indexical factors, potentially related to her interviewer, as well as the wider audience body she may
have been targeting in her interviews. Nonetheless, although variation in the recordings was reported,
it is interpreted that even with regular visits back to Germany, the effects of English on her German
native speech were apparent, i.e. over time there was a significant increase in maxf0; significant
decrease of F2 in /l/; and significant increase of F2 in /i/.
The intrapersonal variation in the spontaneous speech of SG, potentially due to recency
effects, level of formality and/or socio-indexical factors, has implications for previous group
cross-sectional research into the effects of the L2 on the L1, which often reports a high degree of
interpersonal variation amongst late bilinguals. In almost all research into the effects of either a new
language on native language pronunciation (Bergmann et al., 2016; de Leeuw et al., 2012, 2010,
2017; de Leeuw, Mennen, et al., 2013; Flege & Eefting, 1987; Hopp & Schmid, 2013; Major, 1992;
Mennen, 2004; Mora et al., 2015; Tobin et al., 2017; Ulbrich & Ordin, 2014), or a new dialect on
native dialect pronunciation (Munro et al., 1999; Shockey, 1984), as well as group longitudinal
research into the effects of new dialect acquisition on the native dialect (Evans & Iverson, 2007;
Sankoff, 2004), interpersonal variation has been reported: some participants evidence the analysed
effects, some do not, and some only do so to a moderate extent. If the findings of SG are considered
in relation to these group studies, it seems plausible that intrapersonal variation may have fed into
the interpersonal variation. For example, when interpreting interpersonal variation, age of
acquisition and length of residence are often investigated as potential factors which may modulate the
influence of the L2 on the L1. At least within adults, however, most research shows that these factors
do not significantly influence the effects on the L1: it does not appear to be the case that the
longer a late bilingual resides in an L2 country, the more likely he or she is to undergo phonetic
attrition, and it is likewise often found that, within adult late bilinguals, the effects of age of L2
acquisition are weak to non-existent (de Leeuw, 2009; Hopp & Schmid, 2013; Schmid, 2011). It may
be that the intrapersonal variation explains the observed interpersonal variation: those late bilinguals
displaying strong effects of the L2 on the L1 in group studies did so at that particular time of testing,
and those displaying weak or no effects of the L2 on the L1 did so at that particular time of testing,
perhaps due to e.g. recency of exposure to the L1 and the L2, rather than due to fixed factors such as
age of acquisition and length of residence. It may be prudent for future studies to focus on short as
well as long term effects of the L2 on the L1 in order to explain native speech plasticity over the
lifespan, which would likewise bring together research into shorter term phonetic drift (Chang, 2012,
2013; Sancier & Fowler, 1997; Tobin et al., 2017) as well as longer term phonetic attrition
(Bergmann et al., 2016; de Leeuw et al., 2012, 2010, 2017; de Leeuw, Mennen, et al., 2013; Flege &
Eefting, 1987; Hopp & Schmid, 2013; Major, 1992; Mayr et al., 2012; Mennen, 2004; Ulbrich &
Ordin, 2014). Likewise, examining the effects of language mode on speech production would also
potentially prove to be informative (Amengual, 2018; Antoniou, Tyler, & Best, 2012; Simonet,
2014). Such investigations would need to ensure that the repeated testing of variables would not
disturb the “natural course of the process [the test] hope[s] to track down” (Jaspaert & Kroon, 1989,
p. 81), but there is potential for innovative research in this field incorporating both short and long
term effects of speech plasticity over the lifespan.
A secondary objective of this study was to investigate whether plasticity in the native speech
of SG was evidenced uniformly across all variables examined, or whether certain variables were
more likely to undergo change than others. The conclusion from the analysis of SG is that the
variables studied did not evidence plasticity uniformly. As already mentioned, although not further
reported, at the segmental level, neither /u/ nor /r/ evidenced significant changes over time, whilst
both /i/ and /l/ changed significantly over time. Similarly, of the prosodic variables, only maxf0
increased significantly, most likely as a result of L2 acquisition of English, but pitch level and pitch
range were potentially more impacted by the natural biological effects of aging. These findings do
not support the suggestion that cross-linguistic interactions operate at a system-wide level (as
suggested by Chang, 2012; Guion, 2003; Mayr et al., 2012). Specifically, there was nothing to
suggest from the current study that there was a general trend towards more open (higher F1)
realisations (as found in Mayr et al., 2012). Instead, the F1 of /i/ decreased significantly over time;
the F1 of /u/ did not change significantly; the F1 of /a/ increased significantly over time, but it is not
unequivocal whether the latter was caused by natural aging effects or the influence of the L2 on the
L1; and the F1 of /l/ decreased significantly over time. Accordingly, and as will be discussed with
regard to the third objective of this investigation, these results align more consistently with the
effects of biological aging than they do with the impact of the L2 on the L1, although in the research
by Mayr et al. (2012), Chang (2012), and Guion (2003), age was a controlled variable, which
increases the likelihood of the L2 having indeed systematically affected the L1, rather than aging
processes. Nevertheless, the findings for SG similarly do not show systematic changes in the
front-back dimension: F2 frequency of /i/ significantly increased over time; whilst neither the F2 of
/a/ nor /u/ significantly changed over time; and the F2 of /l/ significantly decreased over time.
Considering both the segmental and prosodic variables, these findings appear to verify the
understanding that the production of native speech patterns, as measured in segmental and prosodic
variables, is influenced specifically by increased exposure to their L2 counterparts, and that not all
variables are influenced uniformly to the same degree. Accordingly, these findings could align more
consistently with research which has shown that specific L1 variables are more prone to being
influenced by a new language or dialect due to socio-indexical marking, as reported in Sankoff
(2004) and Bergmann et al. (2016), which has also been reported in the L2 of heritage language
speakers (Heselwood & McChrystal, 1999, 2000; Kirkham, 2011; Sharma & Sankaran, 2011). This
is not to say that some kind of systematic change does not occur at all, e.g. in the form of change to
an L1 articulatory setting (Laver, 1980; Mennen et al., 2010), but it is well established that different
linguistic variables have different social meaning (e.g. stereotypes, markers and indicators, Labov,
1971), so it seems reasonable to assume that different L1 phonetic variables may be affected
differently by their L2 counterparts dependent on their social meaning, as reported in Sankoff (2004),
and potentially due to the perceptual similarity between L1 and L2 counterparts (Best, 1995;
Escudero, 2005; Flege, 1995; Flege, 1987). Further longitudinal research examining a greater
number of participants, incorporating the analysis of numerous phonetic variables which differ in
social meaning and phonetic similarity may make headway on the question of whether plasticity of
native speech occurs systematically over phonetic variables within a particular language, or whether
it occurs dependent upon the individual variables, the latter of which the current investigation
appears to suggest.
Finally, with regard to the third objective, these findings into plasticity of spontaneous native
speech as a result of acquisition of L2 counterparts were entangled with the effects of biological
aging. As predicted by previous longitudinal research (Reubold & Harrington, 2015, 2017; Reubold
et al., 2010), the F1 of /i/ and /l/ decreased over time, whereas the F1 of /a/ increased over time,
which was potentially amplified as a result of English L2 acquisition. This is to say that, particularly
with regard to F1, aging effects appeared to either counteract any potential influence brought about
by SG’s acquisition of English segments, or amplify changes in her native speech. Particularly with
regard to changes in the F1 as a result of biological aging over time, the findings from the current
investigation support previous longitudinal research which suggests that in high vowels F1 decreases
whilst in low vowels F1 increases, thereby expanding the vertical vowel space as the vocal tract ages
(Reubold & Harrington, 2015, 2017; Reubold et al., 2010). Interestingly, the recordings of SG were
made when she was much younger than the participants in the Reubold et al. research, therefore
suggesting that the vowel space might begin to change already in early adulthood as a result of
natural aging processes. Accordingly, the results of SG do not align with those of Dagmar Berghoff,
who did not exhibit any F1 aging effects until 56 years of age (Reubold & Harrington, 2017),
potentially as a result of Berghoff’s extensive professional voice training, which SG would not have
undergone to the same extent given their different professions. Again, the current results therefore
suggest that aging effects in the untrained voice might start already in early adulthood, and in the
case of SG, be intertwined with L2 acquisition of English counterparts. Similarly, with regard to
prosody, although it was expected that over time mean f0 would have increased as a result of English
acquisition of prosody, the more powerful effect appeared to be that of natural aging processes and,
instead, a decrease in mean f0 was evidenced in the speech of SG. It is therefore all the more striking
that maxf0 increased over time, which is suggested to indicate plasticity in the speech of SG as a
result of English L2 acquisition of prosody. It would also be interesting to examine this prosodic
effect from the perspective of intonational phonology (i.e., whether/how there may be some effects
on the actual tone-segment alignment patterns and intonational composition), rather than solely
examining long-term distributional effects (see e.g. Graham & Post, 2018; Grice et al., 2017;
Leemann et al., 2018) in order to further our understanding of the phonetics/phonology of bilinguals in
connection with fine-grained phonetic detail of intonational structure.
In order to tease apart the longitudinal effects of biological aging and plasticity of native
speech as a result of late L2 acquisition, it would be necessary to follow both bilingual and
monolingual speakers over time, and compare their progression. For example, it could be that
although the f0 of SG decreased over time, it decreased less than it would have had she not
learned English as an L2. Likewise, although her 80%Range narrowed over time, most probably as a
result of aging effects, had she not been immersed in English as an L2 in Las Vegas, it may have
narrowed even more. Such questions can only be answered through larger studies, which track both
monolinguals and bilinguals over time, whilst bearing in mind the effects of repeated testing of
variables (Jaspaert & Kroon, 1989). As it stands, through investigating a single late bilingual over the
course of time, the current study has been able to reveal plasticity in native speech which is
interpreted to be a result of L2 acquisition in a few selected variables (maxf0, indicating a significant
increase in maximum pitch excursions over time; F2 of /l/, indicating a significant darkening of /l/
over time; F2 of /i/, indicating significant fronting of /i/ over time), whilst in other variables, there
was no effect, or the effect was potentially overridden by aging effects.
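The comparison proposed above, in which bilingual and monolingual speakers are followed over time and their progression is compared, can be sketched as a difference between two per-speaker trends. The following is a minimal illustration only and is not the analysis used in the current study, which relied on linear mixed models. The function names `ols_slope` and `slope_difference` are hypothetical, and all numerical values are invented placeholders rather than measurements from SG or from any other speaker.

```python
from statistics import mean


def ols_slope(years, values):
    """Ordinary least-squares slope of `values` regressed on `years`."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    x_bar, y_bar = mean(years), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    return sxy / sxx


def slope_difference(bilingual, monolingual):
    """Bilingual slope minus monolingual slope, in units per year.

    Each argument is a (years, values) pair for one speaker.
    """
    return ols_slope(*bilingual) - ols_slope(*monolingual)


# Hypothetical mean f0 values (Hz) per recording year, invented for illustration.
years = [0, 10, 20, 30]
bilingual_f0 = [220.0, 215.0, 210.0, 205.0]    # placeholder: falls 0.5 Hz per year
monolingual_f0 = [220.0, 210.0, 200.0, 190.0]  # placeholder: falls 1.0 Hz per year

diff = slope_difference((years, bilingual_f0), (years, monolingual_f0))
print(f"slope difference: {diff:+.2f} Hz per year")
```

With these placeholder values, a positive difference would correspond to the scenario described above, in which f0 decreases in both speakers but decreases less in the bilingual. Any real comparison would require several speakers per group and an inferential model that accounts for within-speaker variation, rather than two single trend lines.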
Nevertheless, as previously discussed, much cross-sectional research into monolinguals relies
on the premise that once a language is acquired during childhood, it will not readily be altered by
subsequent input (see e.g. Labov, 2006). The current findings strengthen the case for adult change
rather than stability through examining native speech in the late German-English bilingual SG
over a timeframe of four decades. Together with other longitudinal research into monolingual adults
(Evans & Iverson, 2007; Harrington, 2006; Reubold & Harrington, 2015, 2017; Reubold et al., 2010;
Sankoff, 2004), as well as bilingual adults (Chang, 2012, 2013; Giesbers, 1997; Oh et al., 2011;
Sancier & Fowler, 1997; Tobin et al., 2017), the study shows that, if at all, one must proceed with
caution when deducing an individual’s earlier native speech patterns from his or her present adult
speech patterns, and, therefore, likewise when deducing a community’s childhood speech patterns
from the same community’s adult speech patterns.
In sum, although this study was only based on a single individual, and it is of course
necessary to collect further longitudinal data to make unequivocal conclusions, the findings are
interpreted to strengthen “the case for adult change rather than stability” (Labov, 2006, p. 501). They
do so by revealing that L2 counterparts influence native speech phonetic variables at both segmental
and prosodic levels, and that these changes are intertwined with biologically induced age-based
changes. Moreover, the findings reveal that individual phonetic variables follow different trajectories
within the same individual. It therefore seems compelling to continue to incorporate bilingual
research into our understanding of speech plasticity, as experienced by the individual, as well as
change within the community. Such studies into bilinguals will enhance our understanding of
continued speech development post adolescence, and therefore provide a meaningful window into
our understanding of human speech, language and cognition across the lifespan.
References
Ahn, S., Chang, C. B., DeKeyser, R., & Lee‐Ellis, S. (2017). Age effects in first language attrition: Speech
perception by Korean-English bilinguals. Language Learning, 67(3), 694–733.
Alam, F., & Stuart-Smith, J. (2011). Identity and ethnicity in /t/ in Glasgow-Pakistani high-school girls. In
ICPhS (pp. 216–219). Hong Kong.
Amengual, M. (2018). Asymmetrical interlingual influence in the production of Spanish and English laterals
as a result of competing activation in bilingual language processing. Journal of Phonetics, 69, 12–28.
Andersen, N. (1974). On the calculation of filter coefficients for maximum entropy spectral analysis.
Geophysics, 39(1), 69–72.
Antoniou, M., Tyler, M. D., & Best, C. T. (2012). Two ways to listen: Do L2-dominant bilinguals perceive
stop voicing according to language mode? Journal of Phonetics, 40(4), 582–594.
Babel, M., & Bulatov, D. (2012). The role of fundamental frequency in phonetic accommodation. Language
and Speech, 55(2), 231–248.
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). lme4: Linear mixed-effects models using Eigen and
S4.
Bergmann, C., Nota, A., Sprenger, S. A., & Schmid, M. S. (2016). L2 immersion causes non-native-like L1
pronunciation in German attriters. Journal of Phonetics, 58(Supplement C), 71–86.
Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech
perception and linguistic experience: Theoretical and methodological issues (pp. 171–204).
Baltimore: York Press.
Boersma, P., & Weenink, D. (2010). PRAAT. University of Amsterdam. Retrieved from http://www.praat.org
Bullock, B. E., & Gerfen, C. (2004). Phonological convergence in a contracting language variety.
Bilingualism: Language and Cognition, 7(2), 95–104.
Bylund, E. (2009). Maturational constraints and first language attrition. Language Learning, 59(3), 687–715.
Caramazza, A., Yeni‐Komshian, G. H., Zurif, E. B., & Carbone, E. (1973). The acquisition of a new
phonological contrast: The case of stop consonants in French‐English bilinguals. The Journal of the
Acoustical Society of America, 54(2), 421–428.
Chang, C. B. (2012). Rapid and multifaceted effects of second-language learning on first-language speech
production. Journal of Phonetics, 40(2), 249–268.
Chang, C. B. (2013). A novelty effect in phonetic drift of the native language. Journal of Phonetics, 41(6),
520–533.
Colantoni, L., & Gurlekian, J. (2004). Convergence and intonation: Historical evidence from Buenos Aires
Spanish. Bilingualism: Language and Cognition, 7(2), 107–119.
Danis, K. (1999, September 18). Steffi and Andre find love under the same roof. Retrieved from
http://nypost.com/1999/09/18/steffi-and-andre-find-love-under-same-roof/
de Bot, K., & Clyne, M. (1994). A 16‐year longitudinal study of language attrition in Dutch immigrants in
Australia. Journal of Multilingual and Multicultural Development, 15(1), 17–28.
de Bot, K., Lowie, W., & Verspoor, M. (2007). A Dynamic Systems Theory approach to second language
acquisition. Bilingualism: Language and Cognition, 10(01), 7–21.
de Bot, K., & Schrauf, R. W. (2010). Language Development Over the Lifespan. Routledge.
de Leeuw, E. (2009). When your native language sounds foreign: A phonetic investigation into first language
attrition (PhD). Queen Margaret University, Edinburgh.
de Leeuw, E. (2019, in press). Gendered attrition of pitch in the German – English heritage language
community in Vancouver, Canada. In J. Treffers-Daller, B. Bremer, & D. Berndt (Eds.), Lost in
Transmission. Amsterdam: John Benjamins Publishing Company.
de Leeuw, E., Mennen, I., & Scobbie, J. M. (2012). Singing a different tune in your native language: First
language attrition of prosody. International Journal of Bilingualism, 16(1), 101–116.
de Leeuw, E., Mennen, I., & Scobbie, J. M. (2013). Dynamic systems, maturational constraints and L1
phonetic attrition. International Journal of Bilingualism, 17(6), 683–700.
de Leeuw, E., Opitz, C., & Lubińska, D. (2013). Dynamics of first language attrition across the lifespan.
International Journal of Bilingualism, 17(6), 667–674.
de Leeuw, E., Schmid, M. S., & Mennen, I. (2010). The effects of contact on native language pronunciation in
an L2 migrant setting. Bilingualism: Language and Cognition, 13, 33–40.
de Leeuw, E., Tusha, A., & Schmid, M. S. (2017). Individual phonological attrition in Albanian-English late
bilinguals. Bilingualism: Language and Cognition, 278–295.
Delattre, P. (1964). Comparing the Phonetic Features of English, French, German and Spanish. Heidelberg:
Julius Groos.
Dmitrieva, O., Jongman, A., & Sereno, J. (2010). Phonological neutralization by native and non-native
speakers: The case of Russian final devoicing. Journal of Phonetics, 38(3), 483–492.
Escudero, P. (2005). Linguistic Perception and Second Language Acquisition: Explaining the Attainment of
Optimal Phonological Categorization. University of Utrecht, Utrecht.
Evans, B. G., & Iverson, P. (2007). Plasticity in vowel perception and production: A study of accent change in
young adults. The Journal of the Acoustical Society of America, 121(6), 3814–3826.
Finn, R. (1991, November 12). Graf Splits with her long-time coach. The New York Times. Retrieved from
http://www.nytimes.com/1991/11/12/sports/tennis-graf-splits-with-her-longtime-coach.html
Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.),
Speech perception and linguistic experience: Theoretical and methodological issues (pp. 233–277).
Baltimore: York Press.
Flege, J. E., & Hillenbrand, J. (1984). Limits on phonetic accuracy in foreign language speech production. The
Journal of the Acoustical Society of America, 76(3), 708–721.
Flege, J. E. (1987). The production of ‘new’ and ‘similar’ phones in a foreign language: Evidence for the
effect of equivalence classification. Journal of Phonetics, 15, 47–65.
Flege, J. E., & Eefting, W. (1987). Cross-language switching in stop consonant perception and production
by Dutch speakers of English. Speech Communication, 6(3), 185–202.
Gibbon, D. (1998). Intonation in German. In D. Hirst & A. Di Cristo (Eds.), Intonation Systems. A Survey of
Twenty Languages. Cambridge: Cambridge University Press.
Giesbers, H. W. M. (1997). Dutch in Indonesia. Language attrition or language contact? In J. Klatter-Folmer &
S. Kroon (Eds.), Dutch Overseas. Studies in Maintenance, Shift and Loss of Dutch as an Immigrant
Language (pp. 163–188). Tilburg: Tilburg University Press. Retrieved from
http://repository.ubn.ru.nl/handle/2066/105794.
Graham, C., & Post, B. (2018). Second language acquisition of intonation: Peak alignment in American
English. Journal of Phonetics, 66, 1–14.
Grice, M., Ritter, S., Niemann, H., & Roettger, T. B. (2017). Integrating the discreteness and continuity of
intonational categories. Journal of Phonetics, 64, 90–107.
Guion, S. G. (2003). The vowel systems of Quichua-Spanish bilinguals. Age of acquisition effects on the
mutual influence of the first and second languages. Phonetica, 60(2), 98–128.
Hadley, W. (2016). ggplot2: Elegant Graphics for Data Analysis. New York: Springer.
Harrington, J. (2006). An acoustic analysis of ‘happy-tensing’ in the Queen’s Christmas broadcasts. Journal
of Phonetics, 34(4), 439–457.
Harrington, J., Palethorpe, S., & Watson, C. (2000a). Monophthongal vowel changes in Received
Pronunciation: An acoustic analysis of the Queen’s Christmas broadcasts. Journal of the International
Phonetic Association, 30(1–2), 63–78.
Harrington, J., Palethorpe, S., & Watson, C. I. (2000b). Does the Queen speak the Queen’s English? Nature,
408(6815), 927–928.
Harrington, J., Palethorpe, S., & Watson, C. I. (2007). Age-related changes in fundamental frequency and
formants: A longitudinal study of four speakers (pp. 2753–2756). Presented at the Interspeech,
Antwerp, Belgium.
Herffs, H. (2009, October 14). Steffi Graf: ‘Ich bin nicht immer konsequent’. GALA. Retrieved from
http://www.gala.de/stars/news/interview/steffi-graf-ich-bin-nicht-immer-konsequent_19362-p2.html.
Heselwood, B., & McChrystal, L. (1999). The effect of age-group and place of L1 acquisition on the
realisation of Panjabi stop consonants in Bradford: An acoustic sociophonetic study. In Leeds
Working Papers in Linguistics & Phonetics 7 (pp. 49–68).
Heselwood, B., & McChrystal, L. (2000). Gender, accent features and voicing in Panjabi-English bilingual
children. In Leeds Working Papers in Linguistics & Phonetics (pp. 49–68).
Hillenbrand, J., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American
English vowels. The Journal of the Acoustical Society of America, 97.
Hopp, H., & Schmid, M. S. (2013). Perceived foreign accent in first language attrition and second language
acquisition: The impact of age of acquisition and bilingualism. Applied Psycholinguistics, 34(02),
361–394.
Hüdaverdi, N., & Mitatselis, C. (2012, October 1). Steffi Graf im Interview: „Ich möchte nichts ändern“.
Retrieved 9 November 2017, from https://www.ksta.de/koeln/steffi-graf-im-interview--ich-moechte-
nichts-aendern--3968044.
Hungermann, J. (2013, July 2). Steffi Graf: „Deutschland ist die Heimat, Las Vegas das Zuhause“. Die Welt.
Retrieved from https://www.welt.de/sport/article117611312/Deutschland-ist-die-Heimat-Las-Vegas-
das-Zuhause.html.
Interview with Helen Mirren. (n.d.). Michael Parkinson Show. London: ITV. Retrieved from
https://www.youtube.com/watch?v=JU7899DXQGI on 26 September 2017.
Jaspaert, K., & Kroon, S. (1989). Social determinants of language loss. ITL International Journal of Applied
Linguistics, (83), 75–98.
Kirkham, S. (2011). The acoustics of coronal stops in British Asian English. In ICPhS (pp. 1102–1105).
Hong Kong.
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest Package: Tests in Linear Mixed
Effects Models. Journal of Statistical Software, 82(13).
Künzel, H. J. (2001). Beware of the ‘telephone effect’: The influence of telephone transmission on the
measurement of formant frequencies. Forensic Linguistics, 8(1), 80–99.
Labov, W. (1971). The study of language in its social context. In J. A. Fishman (Ed.), Advances in the
Sociology of Language (Vol. 1, pp. 152–216). The Hague: Mouton.
Labov, W. (1997). The Social Stratification of (r) in New York City Department Stores. In Sociolinguistics
(pp. 168–178). Palgrave, London.
Labov, W. (2006). A sociolinguistic perspective on sociophonetic research. Journal of Phonetics, 34(4), 500–
515.
Ladd, D. R. (2008). Intonational Phonology. Cambridge: Cambridge University Press.
Ladefoged, P., & Maddieson, I. (2007). The sounds of the world’s languages. Oxford, UK: Blackwell.
Laver, J. (1980). The Phonetic Description of Voice Quality. Cambridge: Cambridge University Press.
Laver, J., & Trudgill, P. (1979). Phonetic and linguistic markers in speech. In K. R. Scherer & H. Giles
(Eds.), Social Markers in Speech. Cambridge: Cambridge University Press.
Leemann, A., Kolly, M.-J., Nolan, F., & Li, Y. (2018). The role of segments and prosody in the identification
of a speaker’s dialect. Journal of Phonetics, 68, 69–84.
Levon, E. (2009). Dimensions of style: Context, politics and motivation in gay Israeli speech. Journal of
Sociolinguistics, 13(1), 29–58.
Linville, S. E. (1996). The sound of senescence. Journal of Voice, 10(2), 190–200.
Linville, S. E., & Rens, J. (2001). Vocal tract resonance analysis of aging voice using long-term average
spectra. Journal of Voice, 15(3), 323–330.
Major, R. C. (1992). Losing English as a first language. The Modern Language Journal, 76(2), 190–208.
Mayr, R., Morris, J., Mennen, I., & Williams, D. (2015). Disentangling the effects of long-term language
contact and individual bilingualism: The case of monophthongs in Welsh and English. International
Journal of Bilingualism, 245–267.
Mayr, R., Price, S., & Mennen, I. (2012). First language attrition in the speech of Dutch–English bilinguals:
The case of monozygotic twin sisters. Bilingualism: Language and Cognition, 15(04), 687–700.
Mennen, I. (2004). Bi-directional interference in the intonation of Dutch speakers of Greek. Journal of
Phonetics, 32(4), 543–563.
Mennen, I., & de Leeuw, E. (2014). Beyond segments. Studies in Second Language Acquisition, 36(02), 183–
194.
Mennen, I., Schaeffler, F., & Dickie, C. (2014). Second language acquisition of pitch range in German
learners of English. Studies in Second Language Acquisition, 36, 303–329.
Mennen, I., Schaeffler, F., & Docherty, G. (2012). Cross-language differences in fundamental frequency
range: A comparison of English and German. The Journal of the Acoustical Society of America,
131(3), 2249–2260.
Mennen, I., Schaeffler, F., & Docherty, G. J. (2007). Pitching it differently: A comparison of the pitch ranges
of German and English speakers. In Proceedings of the XVIth International Congress on Phonetic
Sciences (pp. 1769–1772). Saarbrücken, Germany.
Mennen, I., Scobbie, J. M., de Leeuw, E., Schaeffler, S., & Schaeffler, F. (2010). Measuring language-specific
phonetic settings. Second Language Research, 26(1), 13–41.
Mol-Wolf, K., & Voss, J. (2016). Im Interview: Stefanie Grafs Gespür für Glück. Emotion, 26–31.
Moosmüller, S., Schmid, C., & Kasess, C. H. (2015). Alveolar and velarized laterals in Albanian and in the
Viennese dialect. Language and Speech, 488–515.
Mora, J. C., Keidel, J. L., & Flege, J. E. (2015). Effects of Spanish use on the production of Catalan vowels by
early Spanish-Catalan bilinguals. In J. Romero & M. Riera (Eds.), The phonetics–phonology
interface: Representations and methodologies (pp. 33–54). Amsterdam: John Benjamins Publishing.
Munro, M. J., Derwing, T. M., & Flege, J. E. (1999). Canadians in Alabama: A perceptual study of dialect
acquisition in adults. Journal of Phonetics, 27(4), 385–403.
Neppert, J. M. H. (1999). Elemente einer Akustischen Phonetik. Hamburg: Buske.
Nycz, J. (2013). Changing words or changing rules? Second dialect acquisition and phonological
representation. Journal of Pragmatics, 52, 49–62.
Nycz, J. (2015). Second dialect acquisition: A sociophonetic perspective. Language and Linguistics Compass,
9(11), 469–482.
Oh, G. E., Guion-Anderson, S., Aoyama, K., Flege, J. E., Akahane-Yamada, R., & Yamada, T. (2011). A one-
year longitudinal study of English and Japanese vowel production by Japanese adults and children in
an English-speaking setting. Journal of Phonetics, 39(2), 156–167.
Ohara, Y. (1999). Performing gender through voice pitch: A cross-cultural analysis of Japanese and American
English. In U. Pasero & F. Braun (Eds.), Wahrnehmung und Herstellung von Geschlecht (pp. 105–
116). VS Verlag für Sozialwissenschaften.
Ostler, S. (1987, June 29). For Steffi Graf, Pappa knows best: When it comes to her tennis career, he calls the
shots. Los Angeles Times. Retrieved from
http://articles.latimes.com/1987-06-29/sports/sp-16_1_tennis/
Paradis, M. (2007). L1 attrition features predicted by a neurolinguistic theory of bilingualism. In B. Köpke, M.
S. Schmid, M. Keijzer, & S. Dostert (Eds.), Language attrition: A theoretical perspective (pp. 121–
134). Amsterdam: John Benjamins Publishing Company.
Paradis, M. (2004). A Neurolinguistic Theory of Bilingualism. John Benjamins Publishing.
Passoni, E., Mehrabi, A., Levon, E., & de Leeuw, E. (2018). Bilingualism, pitch range and social factors:
Preliminary results from sequential Japanese-English bilinguals. In Speech Prosody 2018 Conference
Proceedings (pp. 1–5). Poznań, Poland.
Podesva, R. J. (2007). Phonation type as a stylistic variable: The use of falsetto in constructing a persona.
Journal of Sociolinguistics, 11(4), 478–504.
R Core Team. (2017). R: A Language and Environment for Statistical Computing. Vienna, Austria: R
Foundation for Statistical Computing. Retrieved from https://www.R-project.org.
Reubold, U., & Harrington, J. (2015). Disassociating the effects of age from phonetic change: A longitudinal
study of formant frequencies. In Language Development: The Lifespan Perspective (pp. 9–37).
Amsterdam: John Benjamins Publishing Company.
Reubold, U., & Harrington, J. (2017). The influence of age on estimating sound change acoustically from
longitudinal data. In S. E. Wagner & I. Buchstaller (Eds.), Panel Studies of Language Variation and
Change (pp. 129–152). Routledge.
Reubold, U., Harrington, J., & Kleber, F. (2010). Vocal aging effects on f0 and the first formant: A
longitudinal analysis in adult speakers. Speech Communication, 52(7–8), 638–651.
Sancier, M. L., & Fowler, C. A. (1997). Gestural drift in a bilingual speaker of Brazilian Portuguese and
English. Journal of Phonetics, 25(4), 421–436.
Sankoff, G. (2004). Adolescents, young adults and the critical period: Two case studies from “Seven Up”. In
C. Fought (Ed.), Sociolinguistic Variation: Critical Reflections (pp. 121–139). Oxford, New York:
Oxford University Press.
Scharff-Rethfeldt, W., Miller, N., & Mennen, I. (2008). Speaking fundamental frequency differences in highly
proficient bilinguals of German/English. Sprache, Stimme, Gehör, 32(3), 123–128.
Scherer, K. R. (1974). Voice quality analysis of American and German Speakers. Journal of Psycholinguistic
Research, 3(3), 281–298.
Schmid, M. S. (2004). First Language Attrition: Interdisciplinary Perspectives on Methodological Issues.
Amsterdam: John Benjamins Publishing.
Schmid, M. S. (2011). Language Attrition. Cambridge: Cambridge University Press.
Sendlmeier, W. F., & Seebode, J. (2006). Formantkarten des deutschen Vokalsystems (pp. 1–4). Berlin: TU
Berlin, Institut für Sprache und Kommunikation. Retrieved from
http://www.ba.tu-berlin.de/fileadmin/a01311100/Formantkarten_des_deutschen_Vokalsystems.pdf.
Sharma, D., & Sankaran, L. (2011). Cognitive and social forces in dialect shift: Gradual change in London
Asian speech. Language Variation and Change, 23(3), 399–428.
Shockey, L. (1984). All in a flap: Long-term accommodation in phonology. International Journal of the
Sociology of Language, 1984(46), 87–96.
Simonet, M. (2014). Phonetic consequences of dynamic cross-linguistic interference in proficient bilinguals.
Journal of Phonetics, 43, 26–37.
Sproat, R., & Fujimura, O. (1993). Allophonic variation in English /l/ and its implications for phonetic
implementation. Journal of Phonetics, 21(3), 291–311.
Styler, W. (2013). Using Praat for Linguistic Research (No. 1.3.5) (pp. 1–70). Boulder, Colorado: University
of Colorado at Boulder Phonetics Lab. Retrieved from
https://phonetique.uqam.ca/upload/files/LIN2623/Styler_2013_2.pdf.
Sundara, M., Polka, L., & Baum, S. (2006). Production of coronal stops by simultaneous bilingual adults.
Bilingualism: Language and Cognition, 9(1), 97–114.
Tobin, S. J., Nam, H., & Fowler, C. A. (2017). Phonetic drift in Spanish-English bilinguals: Experiment and a
self-organizing model. Journal of Phonetics, 65, 45–59.
Ulbrich, C., & Ordin, M. (2014). Can L2-English influence L1-German? The case of post-vocalic /r/. Journal
of Phonetics, 45, 26–42.
Ullakanoja, R. (2007). Comparison of pitch range in Finnish (L1) and Russian (L2) (pp. 1701–1704).
Presented at the International Congress of Phonetic Sciences, Saarbrücken.
Van Bezooijen, R. (1995). Sociocultural aspects of pitch differences between Japanese and Dutch women.
Language and Speech, 38(3), 253–265.
Wells, J. C. (1982). Accents of English. Cambridge: Cambridge University Press.
Willems, N. (1982). English Intonation from a Dutch Point of View. Walter de Gruyter.