Article

Durations of American English Vowels by Native and Non-native Speakers: Acoustic Analyses and Perceptual Effects

Abstract

The goal of this study was to examine durations of American English vowels produced by English-, Chinese-, and Korean-native speakers and the effects of vowel duration on vowel intelligibility. Twelve American English vowels were recorded in the /hVd/ phonetic context by native speakers and non-native speakers. The English vowel duration patterns as a function of vowel produced by non-native speakers were generally similar to those produced by native speakers. These results imply that using duration differences across vowels may be an important strategy for non-native speakers' production before they are able to employ spectral cues to produce and perceive English speech sounds. In the intelligibility experiment, vowels were selected from 10 native and non-native speakers and vowel durations were equalized at 170 ms. Intelligibility of vowels with original and equalized durations was evaluated by American English native listeners. Results suggested that vowel intelligibility of native and non-native speakers degraded slightly by 3-8% when durations were equalized, indicating that vowel duration plays a minor role in vowel intelligibility.

... The productive-perceptual layer is where L2 Learners and L1 Listeners are directly engaged in a "speech circuit" (De Saussure, 1959) and the characteristics of L2 Learners' pronunciation act on L1 Listeners' perception. Perception and production can be measured subjectively and objectively (Munro and Derwing, 1995; Flege et al., 2003), and their relationships can be identified through statistical analyses and psychoacoustic experiments (Liu et al., 2014; Porretta et al., 2015). Such research can characterize L2 oral communication and suggest effective targets for pronunciation instruction (Trofimovich and Isaacs, 2012). ...
... Acoustic properties can be synthetically manipulated to verify causal relationships between acoustics and perception. For example, Liu et al. (2014) observed that L2 Learners might use duration as a cue to differentiate lax and tense vowels in production. To test this hypothesis, they equalized the duration of L2 Learners' productions and found that intelligibility was reduced. ...
... In contrast with how Liu et al. (2014) removed one dimension of acoustic variance, acoustic cues can be varied to form a continuum. Chan et al. (2017) manipulated spectral features gradually and found that the frequencies of vowel formants were a primary cue for the perception of L2 speech. ...
Article
Full-text available
Second language (L2) pronunciation patterns that differ from those of first language (L1) speakers can affect communication effectiveness. Research on children’s L2 pronunciation in bilingual education that involves non-English languages is much needed for the field of language acquisition. Due to limited research in these specific populations and languages, researchers often need to refer to literature on L2 pronunciation in general. However, the multidisciplinary literature can be difficult to access. This paper draws on research from different disciplines to provide a brief but holistic overview of L2 pronunciation. A conceptual model of L2 pronunciation is developed to organize multidisciplinary literature, including interlocutors’ interactions at three layers: the sociopsychological, acquisitional, and productive-perceptual layers. Narrative literature review method is used to identify themes and gaps in the field. It is suggested that challenges related to L2 pronunciation exist in communication. However, the interlocutors share communication responsibilities and can improve their communicative and cultural competencies. Research gaps are identified and indicate that more studies on child populations and non-English L2s are warranted to advance the field. Furthermore, we advocate for evidence-based education and training programs to improve linguistic and cultural competencies for both L1 speakers and L2 speakers to facilitate intercultural communication.
... To illustrate, the English /i/ and /ɪ/ contrast can be distinguished along the spectrum dimension (vowel quality, related mainly to the first and second formant frequencies, F1 and F2) and the duration dimension (vowel length). Native English speakers have been reported to rely primarily on spectral properties, with vowel duration playing only a secondary role (Hillenbrand et al., 2000; Mermelstein, 1978), whereas adult Chinese learners of English rely predominantly on vowel duration instead of spectral properties in both perception and production (Escudero & Boersma, 2004; Liu et al., 2014). According to the A2D models, successful acquisition of the English /i/ and /ɪ/ categories by native Chinese learners involves not only enhancement of attention to the under-attended spectral properties, but also a simultaneous inhibition of over-attended vowel duration. ...
... Evidence shows that adult Chinese learners of English have difficulty in both perceiving and producing the English /i/ and /ɪ/ vowels (Escudero & Boersma, 2004; Liu et al., 2014). Mandarin Chinese has only an /i/ category in its vowel system, with average F1 and F2 frequencies of 290-320 Hz and 2360-2800 Hz (Wang, 1995). ...
Article
Talker variability has been reported to facilitate generalization and retention of speech learning, but is also shown to place demands on cognitive resources. Our recent study provided evidence that phonetically-irrelevant acoustic variability in single-talker (ST) speech is sufficient to induce equivalent amounts of learning to the use of multiple-talker (MT) training. This study is a follow-up contrasting MT versus ST training with varying degrees of temporal exaggeration to examine how cognitive measures of individual learners may influence the role of input variability in immediate learning and long-term retention. Native Chinese-speaking adults were trained on the English /i/-/ɪ/ contrast. We assessed the trainees' working memory and inhibition control before training. The two trained groups showed comparable long-term retention of training effects in terms of word identification performance and more native-like cue weighting in both perception and production regardless of talker variability condition. The results demonstrate the role of phonetically-irrelevant variability in robust speech learning and modulatory functions of nonlinguistic domain-general inhibitory control and working memory, highlighting the necessity to consider the interaction between input characteristics, task difficulty, and individual differences in cognitive abilities in assessing learning outcomes.
... To examine whether the training effect was extended to English vowels produced by novel speakers, the identification of English vowels produced by a new male English-native (EN) speaker was measured in quiet and babble (eight-talker babble) in T2 and T3. This EN speaker was selected because his vowel intelligibility was similar to that of the female speaker in T1, T2, and T3, i.e., the average vowel intelligibility scores were 92.4% and 89.8% for the female and male EN speaker, respectively (Liu et al., 2014; Jin and Liu, 2014b). This male EN speaker was not used in the T1 session and training sessions. ...
... Untrained vowels of the same female talker received a small but significant improvement for several listening conditions (e.g., training effect of about 20% in quiet and 10%-20% in most noise conditions from T1 to T2). Moreover, as the vowel intelligibility of the old (female) and new (male) talkers was matched with less than 3% difference (Liu et al., 2014; Jin and Liu, 2014b), the three groups of Chinese-native listeners in this study identified vowels of the new talker with 60%-90% correctness for trained vowels and about 80% for untrained vowels when they listened to the new talker for the first time (e.g., at T2), markedly higher than the accuracy of about 30% for trained vowels and about 60% for untrained vowels for the old talker at the first-time exposure (e.g., at T1). Nishi and Kewley-Port (2007) reported that no training effect on vowel recognition transferred to six untrained vowels for their subset training group (i.e., three difficult vowels used in training), whereas Leong et al. (2018) found small but significant training effects for untrained /e-ae/ pairs that were produced by new talkers, which is similar to the present study. These results suggest that a small training effect may be transferred to untrained vowels of the same and new talkers, possibly depending on the size of the trained vowel set (e.g., nearly half of the entire vowel inventory in this study). ...
Article
Full-text available
Noise makes speech perception much more challenging for non-native listeners than for native listeners. Training for non-native speech perception is usually implemented in quiet. It remains unclear whether background noise benefits or hampers non-native speech perception learning. In this study, 51 Chinese-native listeners were randomly assigned to three groups: vowel training in quiet (TIQ), vowel training in noise (TIN), and watching videos in English as an active control. Vowel identification was assessed before (T1), right after (T2), and three months after training (T3) in quiet and various noise conditions. Results indicated that compared with the video watching group, the TIN group improved vowel identification in both quiet and noise significantly more at T2 and at T3. In contrast, the TIQ group improved significantly more in quiet and also in non-speech noise conditions at T2, but the improvement did not hold at T3. Moreover, compared to the TIQ group, the TIN group showed significantly less informational masking at both T2 and T3 and less energetic masking at T3. These results suggest that L2 speech training in background noise may improve non-native vowel perception more effectively than training in quiet only. The implications for non-native speech perception learning are discussed.
... Learning non-native speech sounds can be a particularly challenging task. For example, the distinction of the English vowel contrast /i/-/ɪ/ is exceptionally difficult for many native Chinese speakers who learn English as a second language (L2) (Wang, 1997; Wang and Heuven, 2006; Cheng and Zhang, 2013; Liu et al., 2014; Huang et al., 2018). The literature has well documented that high variability phonetic training (HVPT) is effective in improving non-native speech perception. ...
... Native English speakers can use both duration and formant frequency cues for distinguishing the two sounds (Mermelstein, 1978; Whalen, 1989; Grenon et al., 2019), and they predominantly rely on the spectral cues (Mermelstein, 1978; Hillenbrand et al., 2000). On the contrary, at least at the initial learning stage, English-as-a-second-language (ESL) learners rely dominantly on duration cues rather than spectral cues in both perception and production (Escudero and Boersma, 2004; Morrison, 2005; Wang and Heuven, 2006; Yang, 2011; Liu et al., 2014). ...
Article
Full-text available
High variability phonetic training (HVPT) has been found to be effective in helping adult learners acquire non-native phonetic contrasts. The present study investigated the role of temporal acoustic exaggeration by comparing the canonical HVPT paradigm without involving acoustic exaggeration with a modified adaptive HVPT paradigm that integrated key temporal exaggerations in infant-directed speech (IDS). Sixty native Chinese adults participated in the training of the English /i/ and /ɪ/ vowel contrast and were randomly assigned to three subject groups. Twenty were trained with the typical HVPT paradigm (the HVPT group), twenty were trained under the modified adaptive approach with acoustic exaggeration (the HVPT-E group), and twenty were in the control group. Behavioral tasks for the pre- and post-tests used natural word identification, synthetic stimuli identification, and synthetic stimuli discrimination. Mismatch negativity (MMN) responses from the HVPT-E group were also obtained to assess the training effects in within- and across-category discrimination without requiring focused attention. Like previous studies, significant generalization effects to new talkers were found in both the HVPT group and the HVPT-E group. The HVPT-E group, by contrast, showed greater improvement as reflected in larger progress in natural word identification performance. Furthermore, the HVPT-E group exhibited more native-like categorical perception based on spectral cues after training, together with corresponding training-induced changes in the MMN responses to within- and across-category differences. These data provide the initial evidence supporting the important role of temporal acoustic exaggeration with adaptive training in facilitating phonetic learning and promoting brain plasticity at the perceptual and pre-attentive neural levels.
... For native English-speaking listeners, vowel identification scores dropped by 5% when the vowel duration was equalized to 144 ms (Hillenbrand et al., 2000), and dropped by 3% when equalized to 170 ms. Moreover, studies have shown that non-native listeners depend more on vowel duration than native English-speaking listeners, regardless of the existence of duration contrasts in their native vowel system (Cebrian, 2006; Kondaurova and Francis, 2008; Escudero et al., 2009; Chládková et al., 2013), possibly because the vowel duration cue is easier to grasp than the spectral cues to vowel perception for non-native listeners (Bohn and Flege, 1990; Bohn, 1995; Flege et al., 1997; Cebrian, 2006; Wang, 2006; Liu et al., 2014). Since previous studies mainly focused on two or three vowel pairs with spectral and durational contrasts, such as /i/-/ɪ/ in "beat" and "bit", the extent of non-native listeners' dependence on duration cues in English vowel identification for a large set of vowels had not been examined. ...
... In one condition, the original length of the vowel nucleus was preserved (i.e., duration cues preserved); in the other condition, the length of all vowel nuclei was shortened to 170 ms (i.e., duration cue removed). The 170 ms was adopted in order to facilitate comparisons with the results of previous studies in our laboratories (Mi et al., 2013; Liu et al., 2014). The vowel stimuli had 10 ms rise-fall ramps. ...
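The stimulus preparation described in this excerpt (a 170 ms nucleus with 10 ms rise-fall ramps) can be illustrated with a minimal Python sketch. Everything beyond those two numbers is an assumption made for illustration: the excerpt does not say how the nuclei were shortened or what shape the ramps had, so the sketch simply truncates the sample list and uses raised-cosine ramps. The function names `equalize_duration` and `apply_ramps` are hypothetical and are not taken from the cited studies.

```python
import math


def equalize_duration(samples, sample_rate, target_ms=170.0):
    """Hypothetical helper: shorten a vowel nucleus to target_ms by truncation.

    Assumption: plain truncation. The cited work may have used another method.
    A nucleus already shorter than the target is returned unchanged.
    """
    n_target = round(sample_rate * target_ms / 1000.0)
    return list(samples[:n_target])


def apply_ramps(samples, sample_rate, ramp_ms=10.0):
    """Hypothetical helper: apply rise and fall ramps of ramp_ms each.

    Assumption: raised-cosine ramp shape (the excerpt does not state the shape).
    """
    out = list(samples)
    # Never let the two ramps overlap on very short signals.
    n_ramp = min(round(sample_rate * ramp_ms / 1000.0), len(out) // 2)
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        out[i] *= gain        # rise at the onset
        out[-1 - i] *= gain   # mirrored fall at the offset
    return out


# Illustrative input only: a constant 300 ms signal sampled at 1 kHz.
stimulus = apply_ramps(equalize_duration([1.0] * 300, 1000), 1000)
```

With the illustrative input above, `stimulus` holds 170 samples whose first and last 10 samples are attenuated and whose middle portion is untouched.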
Article
The purpose of this study was to examine the relationship between English vowel identification and English vowel formant discrimination for native Mandarin Chinese- and native English-speaking listeners. The identification of 12 English vowels was measured with the duration cue preserved or removed. The thresholds of vowel formant discrimination on the F2 of two English vowels, /ʌ/ and /i/, were also estimated using an adaptive-tracking procedure. Native Mandarin Chinese-speaking listeners showed significantly higher thresholds of vowel formant discrimination and lower identification scores than native English-speaking listeners. The duration effect on English vowel identification was similar between native Mandarin Chinese- and native English-speaking listeners. Moreover, regardless of listeners' language background, vowel identification was significantly correlated with vowel formant discrimination for the listeners who were less dependent on duration cues, whereas the correlation between vowel identification and vowel formant discrimination was not significant for the listeners who were highly dependent on duration cues. This study revealed individual variability in using multiple acoustic cues to identify English vowels for both native and non-native listeners.
... For linguistics and language learning, studies covered areas such as bilingual children's language learning experiences (Cote & Bornstein, 2014; Kan, 2014; Uchikoshi, 2014), Asian Americans' heritage language development (C. E. Kim & Pyun, 2014; Leung, 2014), and English pronunciation (Jin & Liu, 2014; Liu, Jin, & Chen, 2014; A. W.-m. Wong & Hall-Lew, 2014). ...
Article
Full-text available
This 2014 review of Asian American psychology is the sixth review in the series. It includes 316 articles that met the inclusion criteria established by the past 5 annual reviews. Featured articles were derived from 3 sources: 137 were generated via the search term “Asian American” in PsycINFO, 111 were generated via a search for specific Asian American ethnic groups, and 32 were generated via author searches of articles that met the inclusion criteria. The top primary topic was health and health-related behaviors, the most frequently employed study design was cross-sectional, and the most studied Asian American ethnic group was Chinese. This year’s review includes information on the target population of the primary topic, the age range and developmental period of participants, and whether the study design was cross-sectional or longitudinal. It also identifies top authors and journals contributing to the 2014 annual review. These new features reveal that the most common target population of the primary topic was youths; studies most commonly included emerging adults ages 18 to 25; cross-sectional study design was employed more often than longitudinal design; the top contributor to the 2014 review was Stephen Chen, who authored the highest number of articles included; and the Asian American Journal of Psychology generated the highest number of publications for this review.
Article
Full-text available
This study investigated how Chinese-native listeners use spectral and duration cues for English vowel identification. The first experiment examined whether Chinese-native listeners' English vowel perception was related to their sensitivity to changes in vowel formant frequency, a critical spectral cue to vowel identification. Identification of 12 isolated American English vowels was measured for 52 Chinese college students in Beijing. Thresholds of vowel formant discrimination were also examined for these students. Results showed a significant moderate correlation between Chinese college students' English vowel identification and their thresholds of vowel formant discrimination. That is, the lower a listener's vowel formant threshold, the better the vowel identification. However, the moderate size of this correlation suggested that other factors also account for the individual variability in English vowel identification for Chinese-native listeners. In Experiment 2, vowel identification was measured with and without duration cues, showing that vowel identification was reduced by 5.1% when the duration cue was removed. Further analysis suggested that for the listeners who depended less on the duration cue, better thresholds of formant discrimination were associated with higher vowel identification scores, but no such correlation was found for listeners who relied heavily on duration cues.
... Default bandwidths of 50, 70, and 110 Hz were used for F1, F2, and F3. The duration of the synthesized vowels was kept constant at 350 ms, slightly longer than the average duration of isolated American English vowels, ≈ 260 ± 82 ms (Liu et al., 2014), for ease of identification in simulation (Guérit et al., 2014). These stimuli were then used to generate five listening conditions, namely, BiEAS, EAS, bimodal, CI alone, and HA alone, as outlined in the Signal Processing section. ...
Article
Full-text available
Purpose The aim of this study was to measure the effects of frequency spacing (i.e., F2 minus F1) on spectral integration for vowel perception in simulated bilateral electric–acoustic stimulation (BiEAS), electric–acoustic stimulation (EAS), and bimodal hearing. Method Twenty listeners with typical hearing participated in synthetic vowel recognition. Four vowels were used with varying frequency spacings (/ɔ/: 270 Hz, /ʊ/: 653 Hz, /æ/: 1040 Hz, and /ɪ/: 1607 Hz). F1 was acoustically simulated with band-pass filtering, while F2 was electrically simulated using an eight-channel sine wave vocoder with matched input and output frequency range. Vowel recognition was measured in five listening conditions: BiEAS (F1 and F2 in both ears), EAS (F1 and F2 in the left ear), bimodal (F1 and F2 in opposite ears), cochlear implant alone (F2 alone in the left ear), and hearing aid alone (F1 alone in the left ear). Results In EAS, spectral integration was significantly better at a 270-Hz spacing, while in bimodal hearing, spectral integration was significantly poorer at a 1607-Hz spacing than at other frequency spacings. BiEAS conditions offered the best spectral integration, regardless of frequency spacing. Vowel confusion remained consistent and below chance level across the first three listening conditions. Bimodal interference occurred for the /ɪ/ vowel when the cochlear implant ear perceived the dominant cue and the hearing aid ear perceived the nondominant cue. The F2 place cue was transmitted significantly better than the F1 height cue in BiEAS, EAS, and bimodal conditions. Conclusions EAS and bimodal hearing integrate narrower frequency spacings better than wider spacings. EAS hearing provided better outcomes than bimodal hearing, suggesting that within-ear (EAS) integration is more effective than across-ear (bimodal) integration. Bimodal interference may be a factor for variability in bimodal performance. Cautious interpretation and further research with real EAS and bimodal users are suggested to validate and extend these findings. Supplemental Material https://doi.org/10.23641/asha.28127249
... With respect to L2 speech learning, it has been argued that acquisition is made more difficult because learners process L2 input using their L1 segmentation patterns: for example, Japanese listeners overrely on second formant [F2] information for English [r] and [l] (Iverson et al., 2003); Chinese and Spanish listeners use duration cues for English vowel sounds (Escudero & Boersma, 2004; Flege et al., 1997; Liu et al., 2014); and Chinese and Vietnamese listeners depend more on pitch information and less on other information during their categorization of English stress (Nguyễn et al., 2008; Yu & Andruski, 2010; Y. Zhang & Francis, 2010). ...
Article
Full-text available
Public Significance Statement Everyone has a unique skill set when it comes to processing general characteristics of sounds, such as pitch and duration. An emerging theory proposes that individual differences in how well we can perceive sound can impact how quickly, effectively, and easily we learn our first and second languages. To date, auditory processing has traditionally been considered as a bottom-up, automatic, and isolated phenomenon. Following the interaction view, however, we argue that auditory processing can be considered a multidimensional phenomenon that encompasses not only the perception of acoustic characteristics (perceptual proficiency), but also the direction of attention towards specific acoustical elements (cognitive proficiency) and the transformation of auditory information into motor output (motoric proficiency). Indeed, recent studies have introduced new concepts of auditory processing that extend beyond mere acuity to include aspects such as attention and integration. In our study involving 102 Chinese English learners, we empirically examined the extent to which a model incorporating not just perceptual, but also cognitive and motoric facets of auditory processing could account for additional variance in language learning outcomes. Our findings align with an expanding body of evidence suggesting that auditory processing, which includes perceptual, cognitive, and motoric components, plays a pivotal role in language learning throughout the lifespan.
... To illustrate, the English /i/ and /ɪ/ contrast can be distinguished along the spectrum dimension (vowel quality, related mainly to the first and second formant frequencies, F1 and F2) and the duration dimension (vowel length). Native English speakers have been reported to rely primarily on spectral properties, with vowel duration playing only a secondary role (Hillenbrand et al., 2000; Mermelstein, 1978), whereas adult Chinese learners of English rely predominantly on vowel duration instead of spectral properties in both perception and production (Escudero & Boersma, 2004; Liu et al., 2014). According to the A2D models, successful acquisition of the English /i/ and /ɪ/ categories by native Chinese learners involves not only enhancement of attention to the under-attended spectral properties, but also a simultaneous withdrawal of attention from vowel duration. ...
Preprint
Full-text available
Talker variability has been reported to facilitate generalization and retention of speech learning, but is also shown to place demands on cognitive resources. Our recent study provided evidence that phonetically-irrelevant acoustic variability in single-talker (ST) speech is sufficient to induce equivalent amounts of learning to the use of multiple-talker (MT) training. This study is a follow-up contrasting MT versus ST training with varying degrees of temporal exaggeration to examine how cognitive measures of individual learners may influence the role of input variability in immediate learning and long-term retention. Native Chinese-speaking adults were trained on the English /i/-/ɪ/ contrast. We assessed the trainees' working memory and selective attention before training. Trained participants showed retention of more native-like cue weighting in both perception and production regardless of talker variability condition. The ST training group showed long-term benefit in word identification, whereas the MT training group did not retain the improvement. The results demonstrate the role of phonetically-irrelevant variability in robust speech learning and modulatory functions of nonlinguistic working memory and selective attention, highlighting the necessity to consider the interaction between input characteristics, task difficulty, and individual differences in cognitive abilities in assessing learning outcomes.
... For speech production, Mandarin vowel-plus-tone stimuli significantly differed in duration across tone categories (J. Yang et al., 2017) but not across vowel categories (Liu, Jin, & Chen, 2014). For speech perception, signal duration significantly influenced the categoricality of Mandarin tone perception; that is, categorical perception became less for shorter duration (100 ms) than for longer duration (400 ms), particularly for older listeners (Y. ...
Article
Full-text available
Purpose The purpose of this study was to measure Mandarin Chinese vowel-plus-tone identification in quiet and noise for younger and older listeners. Method Two types of noise served as the masker, namely, six-talker babble and babble-modulated noise, at two signal-to-noise ratios (SNRs) of −4 and −8 dB. Fourteen listeners from both age groups were recruited, and three sets of data analyses were conducted: the identification of vowel plus tone, the identification of vowel, and the identification of tone. Results Younger listeners outperformed older listeners in all listening conditions, whereas the younger–older listener difference became greater in noise than in quiet, indicating a more detrimental effect of noise for older listeners than for younger listeners. In addition, vowel identification showed slightly better scores than tone identification in noise, suggesting that noise appeared to affect tone perception more negatively than vowel perception in Mandarin Chinese. At −4 dB SNR, there was a significantly greater amount of informational masking (IM) and a greater amount of energetic masking (EM) for older listeners than for younger listeners. At −8 dB SNR, there was a greater amount of EM for older listeners than for younger listeners but with no group difference in the amount of IM. Conclusion These results suggest that older listeners received a more negative impact of noise for Mandarin Chinese phonemic and tone recognition and had a larger amount of IM or EM from competing speech noise than younger listeners, depending on the SNR.
... Like native speakers of Spanish, adult Chinese speakers also rely predominantly on the durational rather than spectral cues in both perception and production of the English /i/-/ɪ/ contrast (Escudero & Boersma, 2004; Liu et al., 2014; Morrison, 2005). Given this fact, the present investigation followed up our recent study (Cheng et al., 2019) in training Chinese adults to learn the English /i/-/ɪ/ contrast and further tested the hypothesis that increasing the amount of acoustic variability along the secondary dimension (i.e., vowel duration) can induce successful learning independent of talker variability. ...
Article
Full-text available
The current investigation adopted high variability phonetic training with additional audiovisual input and adaptive acoustic exaggeration to examine the role of talker variability. Sixty native Chinese-speaking adults were randomly assigned to a multiple-talker (MT) training group, a single-talker (ST) training group, and a control (CTRL) group without training. The target sounds were the English /i/-/ɪ/ contrast, delivered in 7 sessions using minimal pair word lists. Pre- and post-tests employed natural word identification, synthetic phoneme identification, and word production. Unlike the CTRL group, both training groups showed significant identification improvements, and the effects generalized to novel talkers and new phonetic contexts. Although training did not improve speech intelligibility, there was a significant gain in the use of the primary spectral cues and a decrease in the use of the secondary durational cue. No differences were observed between the MT and ST groups. By removing the "enhancement" features, however, the training program with independent samples was able to verify the advantages of MT over ST training. These results provide the first evidence for the efficacy of other facilitative training features, independent of talker variability, in retuning second language learners' attention to critical acoustic cues for the target speech contrast and producing transfer of learning.
... It is generally acknowledged that tense and lax vowels in German are distinguished by both length and quality, although the spectral difference may be small, as for the German /aː-a/ pair. Thus, a line of studies has been conducted in terms of durational difference (Liu et al., 2014), or spectral difference (Chen et al., 2001; Jin and Liu, 2013), or both separately (Chen, 2006; Wang and Heuven, 2006) to illustrate how differently native speakers and L2 learners of English realize the tense-lax vowel contrast. However, the duration-quality relationship is mutually dependent and not the same for all vowels in German (Weiss, 1974). ...
Article
Full-text available
This study analyzed the durational and spectral differences and their interaction in the production of seven German tense-lax vowel pairs between 30 German native speakers and 30 Mandarin learners of German. The results showed that Mandarin speakers differed significantly from the German speakers in producing the German tense-lax contrast. The general pattern was that Mandarin learners employed temporal features more strongly than spectral features to indicate the tense-lax contrast as compared to German speakers. The phonetic influences of the Mandarin language on the production of German tense and lax vowels are discussed.
... The duration-associated advantage for older listeners' tone perception in this study corroborates the benefits in the intelligibility of clear speech observed with older populations (Gordon-Salant et al., 2006; Helfer, 1998; Schum, 1996). The duration effect has been presented in many speech perception studies, such as the identification of English vowels for native listeners (Liu, Jin, & Chen, 2014) and nonnative listeners (Cebrian, 2006; Chladkova, Escudero, & Lipski, 2013; Mi et al., 2016) as well as the perception of Mandarin tones by amusics (Huang et al., 2015). However, whether longer duration could bring better categoricality in tone perception remained an understudied issue. ...
Article
Full-text available
Purpose: The purpose of this study was to investigate the aging effect on the categorical perception of Mandarin Chinese tones with varied fundamental frequency (F0) contours and signal duration. Method: Both younger and older native Chinese listeners with normal hearing were recruited in 2 experiments: tone identification and tone discrimination on a series of stimuli with the F0 contour systematically varying from the flat tone to the rising–falling tones. Apart from F0 contour, tone duration was manipulated at 3 levels: 100, 200, and 400 ms. Results: Results suggested that, compared with younger listeners, older listeners performed with shallower slope in the identification function and smaller peakedness in the discrimination function, particularly for Tones 1 and 2, whereas for Tones 1 and 4, comparable categorical perception was found between younger and older listeners. Conclusions: The current study suggested that longer duration facilitated categorical perception in the flat–rising tones for the older listeners. Such an aging effect was not found with the flat–falling tones, suggesting that the aging-related deficit in categorical perception might relate to different tone types. Aging resulted in less categoricality of Mandarin tone perception for the flat–rising tones with short duration like 100 ms, possibly due to the aging-related decline in temporal processing.
... Non-native English listeners, such as native Arabic, Japanese and Spanish listeners, rely mainly on duration cues to identify English vowels [15,16]. Recent studies reported that native Chinese listeners also relied more on vowel duration for English vowel perception compared with native English listeners [5,17]. Moreover, studies showed that training with stimuli of equalized duration improved vowel perception [18,19]. ...
Article
Full-text available
Difficulties with second-language vowel perception may be related to the significant challenges in using acoustic-phonetic cues. This study investigated the effects of perception training with duration-equalized vowels on native Chinese listeners’ English vowel perception and their use of acoustic-phonetic cues. Seventeen native Chinese listeners were perceptually trained with duration-equalized English vowels, and another 17 native Chinese listeners watched English videos as a control group. Both groups were tested with English vowel identification and vowel formant discrimination before training, immediately after training, and three months later. The results showed that the training effect was greater for the vowel training group than for the control group, while both groups improved their English vowel identification and vowel formant discrimination after training. Moreover, duration-equalized vowel perception training significantly reduced listeners’ reliance on duration cues and improved their use of spectral cues in identifying English vowels, but video-watching did not help. The results suggest that duration-equalized English vowel perception training may improve non-native listeners’ English vowel perception by changing their perceptual weights of acoustic-phonetic cues.
... In order to ensure that participants attended not only to the arrow location but also to the word that it contained, we asked them to read the word aloud, which very likely contributed to our slower observed reaction times (i.e., in excess of one second) than are typically found in reaction time experiments (e.g., Bar-Anan et al., 2007). Of note, the absence of any main effect for vowel suggests that participants did not respond especially fast or slow to either front or back vowels (despite certain differences in vowel duration; Liu, Jin, & Chen, 2014) but, instead, showed evidence of a matching or fit facilitation effect, responding especially quickly when front vowels were paired with near arrows and back vowels were paired with far arrows. ...
Article
This study investigated how 40 Chinese learners of English as a foreign language (EFL learners) differed from 40 native English speakers in the production of four English tense-lax contrasts, /i-ɪ/, /u-ʊ/, /ɑ-ʌ/, and /æ-ε/, by examining the acoustic measurements of duration, the first three formant frequencies, and the slope of the first formant movement (F1 slope). The dynamic formant trajectory was modeled using discrete cosine transform coefficients to demonstrate the time-varying properties of formant trajectories. A discriminant analysis was employed to illustrate the extent to which Chinese EFL learners relied on different acoustic parameters. This study found that: (1) Chinese EFL learners overemphasized durational differences and weakened spectral differences for the /i-ɪ/, /u-ʊ/, and /ɑ-ʌ/ pairs, although they maintained sufficient spectral differences for /æ-ε/. In contrast, native English speakers predominantly used spectral differences across all four pairs; (2) in non-low tense-lax contrasts, unlike native English speakers, Chinese EFL learners failed to exhibit different F1 slope values, indicating a non-nativelike tongue-root placement during the articulatory process. The findings underscore the contribution of dynamic spectral patterns to the differentiation between English tense and lax vowels, and reveal the influence of precise articulatory gestures on the realization of the tense-lax contrast.
Article
Full-text available
The Speech Learning Model aims to explain the variables contributing to differences in L2 phonetic production. Most previous studies comparing L2 vowel production with L1 vowel production attribute the differences to mother-tongue interference, as the Speech Learning Model also proposes. However, past transfer studies show a number of discrepant findings, even regarding the production of the same L2 vowel. This systematic review therefore collected past studies that compared L2 vowel production with L1 vowel production in order to understand the causes of the discrepant findings. Relevant articles published from 2000 onward were searched in the online database with keywords such as "L2 accented English", "vowel space", "L2 Formants", "Chinese-accented English", and "comparing Chinese English". The initial keyword search found 120 articles. After two screenings of titles and abstracts, and based on inclusion and exclusion criteria, 14 articles were kept for review. A second search of the reference lists of the selected articles added another 2 articles, for a total of 16 articles reviewed in this paper. The review starts with an overview of the Speech Learning Model, followed by a synthesis and analysis of the collected articles. Pedagogical implications and recommendations for future studies of language transfer are discussed.
Article
Full-text available
Second language acquisition involves readjusting features from one’s L1 onto counterparts in the L2. Learners often face difficulty during this process due to the presence of an already firmly rooted L1 grammar. Furthermore, a learner’s L1 serves to constrain sensitivity to non-native contrasts during the acquisition process. If a learner’s L2 grammar lacks the phonological feature that can differentiate a non-native contrast, then that learner may experience persistent difficulties in representing the L2 sounds as a result. Mandarin learners of English as a second language have to contend with a substantially expanded L2 vowel inventory in the early stages of acquisition, grappling with the addition of pronounced features less prevalent in their L1. In an attempt to account for front vowel acquisition difficulties and possible routes to progress for L1-Mandarin L2-English learners using a direct transfer approach, this work follows the Toronto School of contrastive phonology, which holds that phonological representation is determined primarily through the ordering of contrastive features. We present data from recent phonetic research that catalogues Mandarin learners’ progress in incorporating English front vowels while, at the same time, examining the underlying phonological processes. This serves as the basis for a preliminary model of contrastive hierarchy in language acquisition using elements of a feature geometry paradigm. The model provides a theoretical roadmap showing that, as Mandarin learners progress and gradually incorporate English front vowels into their L2 repository, the learner’s L2 hierarchy evolves through successive stages as contrasts are perceived and categorized.
Chapter
The focus of this unique publication is on Ethiopian languages and linguistics. Not only major languages such as Amharic and Oromo receive attention, but also lesser studied ones like Sezo and Nuer are dealt with. The Gurage languages, that often present a descriptive and sociolinguistic puzzle to researchers, have received ample coverage. And for the first time in the history of Ethiopian linguistics, two chapters are dedicated to descriptive studies of Ethiopian Sign Language, as well as two studies on acoustic phonetics. Topics range over a wide spectrum of issues covering the lexicon, sociolinguistics, socio-cultural aspects and micro-linguistic studies on the phonology, morphology and syntax of Ethiopian languages.
Article
Full-text available
Purpose: The purpose of this study was to examine the intelligibility of English consonants and vowels produced by Chinese-native (CN) and Korean-native (KN) students enrolled in American universities. Method: 16 English-native (EN), 32 CN, and 32 KN speakers participated in this study. The intelligibility of 16 American English consonants and 16 vowels spoken by native and nonnative speakers of English was evaluated by EN listeners. All nonnative speakers also completed a survey of their language backgrounds. Results: Although the intelligibility of consonants and diphthongs for nonnative speakers was comparable to that of native speakers, the intelligibility of monophthongs was significantly lower for CN and KN speakers than for EN speakers. Sociolinguistic factors such as the age of arrival in the United States and daily use of English, as well as a linguistic factor, the difference in vowel space between the native (L1) and nonnative (L2) languages, partially contributed to vowel intelligibility for the CN and KN groups. There was no significant correlation between the length of U.S. residency and phoneme intelligibility. Conclusion: Results indicated that the major difficulty in phonemic production in English for Chinese and Korean speakers is with vowels rather than consonants. This might be useful for developing training methods to improve English intelligibility for foreign students in the United States.
Article
Full-text available
This study assessed the effect of English-language experience on non-native speakers' production and perception of English vowels. Twenty speakers each of German, Spanish, Mandarin, and Korean, as well as a control group of 10 native English (NE) speakers, participated. The non-native subjects, who were first exposed intensively to English when they arrived in the United States (mean age = 25 years), were assigned to relatively experienced or inexperienced subgroups based on their length of residence in the US (M = 7.3 vs. 0.7 years). The 90 subjects' accuracy in producing English /i ɪ ɛ æ/ was assessed by having native English-speaking listeners attempt to identify which vowels had been spoken, and through acoustic measurements. The same subjects also identified the vowels in synthetic beat-bit (/i/-/ɪ/) and bat-bet (/æ/-/ɛ/) continua. The experienced non-native subjects produced and perceived English vowels more accurately than did the relatively inexperienced non-native subjects. The non-native subjects' degrees of accuracy in producing and perceiving English vowels were related. Finally, both production and perception accuracy varied as a function of native language (L1) background in a way that appeared to depend on the perceived relation between English vowels and vowels in the L1 inventory.
Article
Full-text available
An acoustic study of American English vowels produced by native Mandarin speakers was performed. First and second formant frequencies (F1 and F2) of 11 vowels were examined in syllable-level productions of 40 Mandarin speakers compared to 40 American English speakers. Results of comparative acoustic analysis indicated that male and female Mandarin speakers differed significantly from American English speakers in their production of several English vowels. For female and male Mandarin speakers, the overall vowel quadrilaterals appeared to be smaller than corresponding American speakers' quadrilaterals. The general pattern shown across the Mandarin subjects was one in which vowels are produced with less acoustic diversity compared to native speakers of American English. Phonetic influences of the Mandarin language on production of American English vowels are discussed, as are implications of these findings with regard to clinical management of Chinese individuals who speak English as a second language.
Article
Full-text available
A series of experiments shows that Spanish learners of English acquire the ship-sheep contrast in a way specific to their target dialect (Scottish or Southern British English) and that many learners exhibit a perceptual strategy found in neither Spanish nor English. To account for these facts as well as for the findings of earlier research on second language (L2) speech perception, we provide an Optimality Theoretic model of phonological categorization that comes with a formal learning algorithm for its acquisition. Within this model, the dialect-dependent and L2-specific facts provide evidence for the hypotheses of Full Transfer and Full Access.
Article
Full-text available
The major issues in relating acoustic waveforms of spoken vowels to perceived vowel categories are presented and discussed in terms of the author's auditory-perceptual theory of phonetic recognition. A brief historical review of formant-ratio theory is presented, as well as an analysis of frequency scales that have been proposed for description of the vowel. It is illustrated that the monophthongal vowel sounds of American English can be represented as clustered in perceptual target zones within a three-dimensional auditory-perceptual space (APS), and it is shown that preliminary versions of these target zones segregate a corpus of vowels of American English with 93% accuracy. Furthermore, it is shown that the nonretroflex vowels of American English fall within a narrow slab within the APS, with spread vowels near the front of this slab and rounded vowels near the back. Retroflex vowels fall in a distinct region behind the vowel slab. Descriptions of the vowels within the APS are shown to be correlated with their descriptions in terms of dimensions of articulation and timbre. Additionally, issues related to talker normalization, coarticulation effects, segmentation, pitch, transposition, and diphthongization are discussed.
Article
Full-text available
Arcsine or angular transformations have been used for many years to transform proportions to make them more suitable for statistical analysis. A problem with such transformations is that the arcsines do not bear any obvious relationship to the original proportions. For this reason, results expressed in arcsine units are difficult to interpret. In this paper a simple linear transformation of the arcsine transform is suggested. This transformation produces values that are numerically close to the original percentage values over most of the percentage range while retaining all of the desirable statistical properties of the arcsine transform.
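The abstract above describes a linear rescaling of the arcsine transform but does not give the formula. As a minimal sketch, the Python below uses the commonly cited form of the rationalized arcsine transform; the constants 146/π and 23 and the two-term arcsine expression are assumptions not stated in the abstract, and the function name is hypothetical.

```python
import math


def rationalized_arcsine_units(correct: int, total: int) -> float:
    """Convert a score of `correct` out of `total` items to rationalized arcsine units.

    Assumed (commonly cited) form, not taken from the abstract above:
        theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
        RAU   = (146 / pi) * theta - 23
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    # Arcsine transform of the score, in radians.
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    # Linear rescaling so that mid-range values stay close to the percentage.
    return (146.0 / math.pi) * theta - 23.0


if __name__ == "__main__":
    # Compare raw percentages with transformed values for a 100-item test.
    for correct in (0, 10, 50, 90, 100):
        print(correct, round(rationalized_arcsine_units(correct, 100), 1))
```

Under this assumed form, values near the middle of the range track the raw percentage closely, while values near 0% and 100% are stretched beyond those limits, which is the property the abstract attributes to the transform.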
Article
Full-text available
The purpose of this study was to replicate and extend the classic study of vowel acoustics by Peterson and Barney (PB) [J. Acoust. Soc. Am. 24, 175-184 (1952)]. Recordings were made of 45 men, 48 women, and 46 children producing the vowels /i, ɪ, e, ɛ, æ, ɑ, [symbol: see text], o, ʊ, u, ʌ, ɝ/ in h-V-d syllables. Formant contours for F1-F4 were measured from LPC spectra using a custom interactive editing tool. For comparison with the PB data, formant patterns were sampled at a time that was judged by visual inspection to be maximally steady. Analysis of the formant data shows numerous differences between the present data and those of PB, both in terms of average frequencies of F1 and F2, and the degree of overlap among adjacent vowels. As with the original study, listening tests showed that the signals were nearly always identified as the vowel intended by the talker. Discriminant analysis showed that the vowels were more poorly separated than the PB data based on a static sample of the formant pattern. However, the vowels can be separated with a high degree of accuracy if duration and spectral change information is included.
Article
Full-text available
This study was designed to examine the role of duration in vowel perception by testing listeners on the identification of CVC syllables generated at different durations. Test signals consisted of synthesized versions of 300 utterances selected from a large, multitalker database of /hVd/ syllables [Hillenbrand et al., J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. Four versions of each utterance were synthesized: (1) an original duration set (vowel duration matched to the original utterance), (2) a neutral duration set (duration fixed at 272 ms, the grand mean across all vowels), (3) a short duration set (duration fixed at 144 ms, two standard deviations below the mean), and (4) a long duration set (duration fixed at 400 ms, two standard deviations above the mean). Experiment 1 used a formant synthesizer, while a second experiment was an exact replication using a sinusoidal synthesis method that represented the original vowel spectrum more precisely than the formant synthesizer. Findings included (1) duration had a small overall effect on vowel identity since the great majority of signals were identified correctly at their original durations and at all three altered durations; (2) despite the relatively small average effect of duration, some vowels, especially [see text] and [see text], were significantly affected by duration; (3) some vowel contrasts that differ systematically in duration, such as [see text], and [see text], were minimally affected by duration; (4) a simple pattern recognition model appears to be capable of accounting for several features of the listening test results, especially the greater influence of duration on some vowels than others; and (5) because a formant synthesizer does an imperfect job of representing the fine details of the original vowel spectrum, results using the formant-synthesized signals led to a slight overestimate of the role of duration in vowel recognition, especially for the shortened vowels.
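The three fixed durations in the abstract above imply a standard deviation that the abstract does not state directly. The short Python check below, a sketch using only the reported values, recovers it from both the short and the long condition; the variable and function names are hypothetical.

```python
# Durations (ms) reported in the abstract above.
neutral_ms = 272  # grand mean across all vowels
short_ms = 144    # two standard deviations below the mean
long_ms = 400     # two standard deviations above the mean


def implied_sd(mean_ms: float, extreme_ms: float, n_sd: float = 2.0) -> float:
    """Standard deviation implied by a value lying n_sd deviations from the mean."""
    return abs(extreme_ms - mean_ms) / n_sd


sd_from_short = implied_sd(neutral_ms, short_ms)
sd_from_long = implied_sd(neutral_ms, long_ms)
print(sd_from_short, sd_from_long)
```

Both conditions give the same implied value, so the short and long sets are symmetric around the neutral duration.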
Article
Full-text available
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
Article
F1-F3 and f0 of 10 Korean vowels and 13 American English vowels produced by 10 male and 10 female speakers of each language group were studied while holding dialectal factors as homogeneous as possible in each group. Within- and across-language comparisons of the collected data revealed considerable variation in vocal tract length between male and female speakers and between Korean and American English speakers. For a more precise comparison, the variation was drastically reduced by applying uniform scaling within and across the two languages. In the cross-language comparison of the normalized data, which were converted to a perceptual dimension, it is argued that adaptive dispersion is operating within a language’s system of contrasts to fulfill a condition of sufficient contrast. t-tests were conducted on those vowels transcribed using the same or similar IPA symbols in the two languages to assess the statistical significance of those comparisons.
Article
Sets of synthetic vowel sounds were presented to a group of 8 listeners for identification. The sets differed in first-formant frequency (F1), second-formant frequency (F2), and duration. Subjects' judgments depended on all of these factors. Duration was a relatively more important cue for vowels located in the center of the F1-F2 space, where a vowel might more readily be confused with one of its neighbors. The perceived duration of a vowel was biased by the rhythm of the sounds that preceded it. In the case of sounds lying near perceptual boundaries, this was sometimes sufficient to change the identity of the vowel.
Article
Flege, Bohn, and Jang (1997) and Escudero and Boersma (2004) analyzed first language-Spanish second language-English listeners' perception of English /i/–/ɪ/ continua that varied in spectral and duration properties. They compared individuals and groups on the basis of spectral reliance and duration reliance measures. These reliance measures indicate the change in identification rates from one extreme of the stimulus space to the other; they make use of only a portion of the data collected and suffer from a ceiling effect. The current paper presents a reanalysis of Escudero and Boersma's data using first-order logistic regression modeling. All of the available data contribute to the calculation of logistic regression coefficients, and they do not suffer from the same ceiling effect as the reliance measures. It is argued that, as a metric of cue weighting, logistic regression coefficients offer methodological and substantive advantages over the reliance measures.
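To illustrate how logistic regression coefficients can serve as cue weights, the Python sketch below fits a first-order logistic model to an idealized listener's identification proportions over a spectral-by-duration stimulus grid. Everything in it is hypothetical: the 7-by-7 grid, the simulated listener's "true" weights, and the plain gradient-ascent fit are illustrative assumptions, not the data or the fitting procedure of the paper summarized above.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, targets, lr=1.0, steps=5000):
    """Fit intercept + one coefficient per cue by gradient ascent on the
    mean log-likelihood. `targets` are proportions of /i/ responses (0..1)."""
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)  # intercept first
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, targets):
            features = (1.0, *x)
            p = sigmoid(sum(w * f for w, f in zip(weights, features)))
            for j, f in enumerate(features):
                grad[j] += (y - p) * f
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights


def demo_cue_weights():
    # Hypothetical stimulus grid: 7 spectral x 7 duration steps, scaled to [-1, 1].
    levels = [-1.0 + 2.0 * i / 6.0 for i in range(7)]
    stimuli = [(spec, dur) for spec in levels for dur in levels]
    # Hypothetical duration-reliant listener: intercept 0, spectral 1, duration 3.
    true_w = (0.0, 1.0, 3.0)
    proportions = [sigmoid(true_w[0] + true_w[1] * s + true_w[2] * d)
                   for s, d in stimuli]
    return fit_logistic(stimuli, proportions)


if __name__ == "__main__":
    intercept, w_spec, w_dur = demo_cue_weights()
    # Relative duration weight: share of the total coefficient magnitude.
    rel_dur = abs(w_dur) / (abs(w_spec) + abs(w_dur))
    print(round(w_spec, 3), round(w_dur, 3), round(rel_dur, 3))
```

Because every cell of the grid contributes to the fit, the coefficients use all of the responses, whereas a measure based only on the extremes of the stimulus space would discard the intermediate steps.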
Article
Speech produced by two speakers was manually segmented, generating two databases with 18,000 and 6,000 vowel segments. The effects on vowel duration of several contextual factors were measured, including those of syllabic stress, pitch accent, the identities of adjacent segments, the syllabic structure of a word, and proximity to a syntactic boundary. With statistical techniques for de-confounding factors, detailed characterizations of the effects of the factors and their interactions could be given, which were then summarized in the form of a simple equation for predicting vowel duration from context.
Article
The intelligibility of speech is known to be lower if the speaker is non-native instead of native for the given language. This study is aimed at quantifying the overall degradation due to limitations of non-native speakers of Dutch, specifically of Dutch-speaking Americans who have lived in the Netherlands 1–3 years. Experiments were focused on phoneme intelligibility and sentence intelligibility, using additive noise as a means of degrading the intelligibility of speech utterances for test purposes. The overall difference in sentence intelligibility between native Dutch speakers and American speakers of Dutch, using native Dutch listeners, was found to correspond to a difference in speech-to-noise ratio (SNR) of approximately 3 dB. The main segmental contribution to the degradation of speech intelligibility by introducing non-native speakers and/or listeners is the confusion of vowels, especially those that do not occur in American English. Vowels that are difficult for second-language speakers to produce are also difficult for second-language listeners to classify; such vowels attract false recognition, reducing the overall recognition rate for all vowels.
Article
A revision of the author's thesis, Indiana University, entitled: The vowels and tones of Mandarin Chinese; acoustical measurements and experiments.
Article
Research on the perception of vowels in the last several years has given rise to new conceptions of vowels as articulatory, acoustic, and perceptual events. Starting from a "simple" target model in which vowels were characterized articulatorily as static vocal tract shapes and acoustically as points in a first and second formant (F1/F2) vowel space, this paper briefly traces the evolution of vowel theory in the 1970s and 1980s in two directions. (1) Elaborated target models represent vowels as target zones in perceptual spaces whose dimensions are specified as formant ratios. These models have been developed primarily to account for perceivers' solution of the "speaker normalization" problem. (2) Dynamic specification models emphasize the importance of formant trajectory patterns in specifying vowel identity. These models deal primarily with the problem of "target undershoot" associated with the coarticulation of vowels with consonants in natural speech and with the issue of "vowel-inherent spectral change" or diphthongization of English vowels. Perceptual studies are summarized that motivate these theoretical developments.
Article
An adequate theory of vowel perception must account for perceptual constancy over variations in the acoustic structure of coarticulated vowels contributed by speakers, speaking rate, and consonantal context. We modified recorded consonant-vowel-consonant syllables electronically to investigate the perceptual efficacy of three types of acoustic information for vowel identification: (1) static spectral "targets," (2) duration of syllabic nuclei, and (3) formant transitions into and out of the vowel nucleus. Vowels in /b/-vowel-/b/ syllables spoken by one adult male (experiment 1) and by two females and two males (experiment 2) served as the corpus, and seven modified syllable conditions were generated in which different parts of the digitized waveforms of the syllables were deleted and the temporal relationships of the remaining parts were manipulated. Results of identification tests by untrained listeners indicated that dynamic spectral information, contained in initial and final transitions taken together, was sufficient for accurate identification of vowels even when vowel nuclei were attenuated to silence. Furthermore, the dynamic spectral information appeared to be efficacious even when durational parameters specifying intrinsic vowel length were eliminated.
Article
In this study we assessed age-related differences in the perception and production of American English (AE) vowels by native Mandarin speakers as a function of the amount of exposure to the target language. Participants included three groups of native Mandarin speakers: 87 children, adolescents and young adults living in China, 77 recent arrivals who had lived in the U.S. for two years or less, and 54 past arrivals who had lived in the U.S. between three and five years. The latter two groups arrived in the U.S. between the ages of 7 and 44 years. Discrimination of six AE vowel pairs /i-i/, /i-e(I)/, /e-ae/, /ae-a/, /a-(symbol see text)/, and /u-a/ was assessed with a categorial AXB task. Production of the eight vowels /i, i, e(I), e, ae, (symbol see text), a, u/ was assessed with an immediate imitation task. Age-related differences in performance accuracy changed from an older-learner advantage among participants in China, to no age differences among recent arrivals, and to a younger-learner advantage among past arrivals. Performance on individual vowels and vowel contrasts indicated the influence of the Mandarin phonetic/phonological system. These findings support a combined environmental and L1 interference/transfer theory as an explanation of the long-term younger-learner advantage in mastering L2 phonology.