Poster

Using high variability phonetic training to train non-tonal listeners with no musical background to perceive lexical tones

Abstract

Previous research has not extensively investigated whether High Variability Phonetic Training (HVPT) is effective in training listeners with no musical background and no prior experience with a tone language to identify non-native lexical tones. This study investigated whether HVPT is applicable to the acquisition of non-native tones by such listeners. Twenty-one speakers of American English were trained in eight sessions using the HVPT approach to identify Mandarin tones in monosyllabic words. Ten of the participants were exposed to words produced by multiple talkers (MT condition), and eleven participants were exposed to words produced by a single talker (ST condition). The listeners’ identification accuracy revealed an average 44% increase from the pretest to the posttest for the MT condition and an average 30% increase for the ST condition. The improvement also generalized to new monosyllabic words produced by a familiar talker and those produced by two unfamiliar talkers. The learning, however, did not generalize to novel disyllabic words produced by either a familiar talker or an unfamiliar talker. Comparisons between the two groups further revealed that the improvement of the listeners in the MT condition was significantly greater than that of the listeners in the ST condition.

Article
Speech variability facilitates non-tonal language speakers’ lexical tone learning. However, it remains unknown whether tonal language speakers can also benefit from speech variability while learning second language (L2) lexical tones. Researchers also reported that the effectiveness of speech variability was only shown on learning new items. Considering that the first language (L1) and L2 probably share similar tonal categories, the present study hypothesizes that speech variability only promotes the tonal language speakers’ acquisition of L2 tones that are different from the tones in their L1. To test this hypothesis, the present study trained native Mandarin (a tonal language) speakers to learn Cantonese tones with either high variability (HV) or low variability (LV) speech materials, and then compared their learning performance. The results partially supported this hypothesis: only Mandarin subjects’ productions of Cantonese low level and mid level tones benefited from the speech variability. They probably relied on the mental representations in L1 to learn the Cantonese tones that had similar Mandarin counterparts. This learning strategy limited the impact of speech variability. Furthermore, the results also revealed a discrepancy between L2 perception and production. The perception improvement may not necessarily lead to an improvement in production.
Article
This study investigates the interaction between voice quality and pitch by revisiting the well-known case of Mandarin creaky voice. This study first provides several pieces of experimental data to assess whether the mechanism behind allophonic creaky voice in Mandarin is tied to tonal categories or is driven by phonetic pitch ranges. The results show that the presence of creak is not exclusively limited to tone 3, but can accompany any of the low pitch targets in the Mandarin tones; further, tone 3 is less creaky when the overall pitch range is raised, but more creaky when the overall pitch range is lowered. More importantly, tone 3 is not unique in this regard, and other tones such as tone 1 are also subject to similar variations. In sum, voice quality is quite systematically tied to F0 in Mandarin. Results from a pitch glide experiment further suggest that voice quality overall covaries with pitch height in a wedge-shaped function. Non-modal voice tends to occur when pitch production exceeds certain limits. Voice quality, thus, has the potential to enhance the perceptual distinctiveness of extreme pitch targets.
Article
Studies of lexical tone learning generally focus on monosyllabic contexts, while reports of phonetic learning benefits associated with input variability are based largely on experienced learners. This study trained inexperienced learners on Mandarin tonal contrasts to test two hypotheses regarding the influence of context and variability on tone learning. The first hypothesis was that increased phonetic variability of tones in disyllabic contexts makes initial tone learning more challenging in disyllabic than monosyllabic words. The second hypothesis was that the learnability of a given tone varies across contexts due to differences in tonal variability. Results of a word learning experiment supported both hypotheses: tones were acquired less successfully in disyllables than in monosyllables, and the relative difficulty of disyllables was closely related to contextual tonal variability. These results indicate limited relevance of monosyllable-based data on Mandarin learning for the disyllabic majority of the Mandarin lexicon. Furthermore, in the short term, variability can diminish learning; its effects are not necessarily beneficial but dependent on acquisition stage and other learner characteristics. These findings thus highlight the importance of considering contextual variability and the interaction between variability and type of learner in the design, interpretation, and application of research on phonetic learning.
Article
Previous studies suggest that musicians show an advantage in processing and encoding foreign-language lexical tones. The current experiments examined whether musical experience influences the perceptual learning of lexical tone categories. Experiment I examined whether musicians with no prior experience of tonal languages differed from nonmusicians in the perception of a lexical tone continuum. Experiment II examined whether short-term perceptual training on lexical tones altered the perception of the lexical tone continuum differentially in English-speaking musicians and nonmusicians. Results suggested that (a) musicians exhibited higher sensitivity overall to tonal changes, but perceived the lexical tone continuum in a manner similar to nonmusicians (continuously), in contrast to native Mandarin speakers (categorically); and (b) short-term perceptual training altered perception; however, there were no significant differences between the effects of training on musicians and nonmusicians.
Article
Although the high-variability training method can enhance learning of non-native speech categories, this can depend on individuals’ aptitude. The current study asked how general the effects of perceptual aptitude are by testing whether they occur with training materials spoken by native speakers and whether they depend on the nature of the to-be-learned material. Forty-five native Dutch listeners took part in a 5-day training procedure in which they identified bisyllabic Mandarin pseudowords (e.g., asa) pronounced with different lexical tone combinations. The training materials were presented to different groups of listeners at three levels of variability: low (many repetitions of a limited set of words recorded by a single speaker), medium (fewer repetitions of a more variable set of words recorded by three speakers), and high (similar to medium but with five speakers). Overall, variability did not influence learning performance, but this was due to an interaction with individuals’ perceptual aptitude: increasing variability hindered improvements in performance for low-aptitude perceivers while it helped improvements in performance for high-aptitude perceivers. These results show that the previously observed interaction between individuals’ aptitude and effects of degree of variability extends to natural tokens of Mandarin speech. This interaction was not found, however, in a closely matched study in which native Dutch listeners were trained on the Japanese geminate/singleton consonant contrast. This may indicate that the effectiveness of high-variability training depends not only on individuals’ aptitude in speech perception but also on the nature of the categories being acquired.
Chapter
Language experience systematically constrains perception of speech contrasts that deviate phonologically and/or phonetically from those of the listener's native language. These effects are most dramatic in adults, but begin to emerge in infancy and undergo further development through at least early childhood. The central question addressed here is: How do nonnative speech perception findings bear on phonological and phonetic aspects of second language (L2) perceptual learning? A frequent assumption has been that nonnative speech perception can also account for the relative difficulties that late learners have with specific L2 segments and contrasts. However, evaluation of this assumption must take into account the fact that models of nonnative speech perception such as the Perceptual Assimilation Model (PAM) have focused primarily on naive listeners, whereas models of L2 speech acquisition such as the Speech Learning Model (SLM) have focused on experienced listeners. This chapter probes the assumption that L2 perceptual learning is determined by nonnative speech perception principles, by considering the commonalities and complementarities between inexperienced listeners and those learning an L2, as viewed from PAM and SLM. Among the issues examined are how language learning may affect perception of phonetic vs. phonological information, how monolingual vs. multiple language experience may impact perception, and what these may imply for attunement of speech perception to changes in the listener's language environment.
Article
The current study investigates the learning of nonnative suprasegmental patterns for word identification. Native English-speaking adults learned to use suprasegmentals (pitch patterns) to identify a vocabulary of six English pseudosyllables superimposed with three pitch patterns (18 words). Successful learning of the vocabulary necessarily entailed learning to use pitch patterns in words. Two major facets of sound-to-word learning were investigated: could native speakers of a nontone language learn the use of pitch patterns for lexical identification, and what effect did more basic auditory ability have on learning success. We found that all subjects improved to a certain degree, although large individual differences were observed. Learning success was found to be associated with the learners' ability to perceive pitch patterns in a nonlexical context and their previous musical experience. These results suggest the importance of a phonetic–phonological–lexical continuity in adult nonnative word learning, including phonological awareness and general auditory ability.
Article
While the tones of Mandarin are conveyed mainly by the F0 contour, they also differ consistently in duration and in amplitude contour. The contribution of these factors was examined by using signal-correlated noise stimuli, in which natural speech is manipulated so that it has no F0 or formant structure but retains its original amplitude contour and duration. Tones 2, 3 and 4 were perceptible from just the amplitude contour, even when duration was not also a cue. In two further experiments, the location of the critical information for the tones during the course of the syllable was examined by extracting small segments from each part of the original syllable. Tones 2 and 3 were often confused with each other, and segments which did not have much F0 change were most often heard as Tone 1. There were, though, also cases in which a low, unchanging pitch was heard as Tone 3, indicating a partial effect of register even in Mandarin. F0 was positively correlated with amplitude, even when both were computed on a pitch period basis. Taken together, the results show that Mandarin tones are realized in more than just the F0 pattern, that amplitude contours can be used by listeners as cues for tone identification, and that not every portion of the F0 pattern unambiguously indicates the original tone.
Conference Paper
It has been suggested that music and speech maintain entirely dissociable mental processing systems. The current study, however, provides evidence that there is an overlap in the processing of certain shared aspects of the two. This study focuses on fundamental frequency (pitch), which is an essential component of melodic units in music and lexical and/or intonational units in speech. We hypothesize that extensive experience with the processing of musical pitch can transfer to a lexical pitch-processing domain. To that end, we asked nine English-speaking musicians and nine English-speaking non-musicians to identify and discriminate the four lexical tones of Mandarin Chinese. The subjects performed significantly differently on both tasks; the musicians identified the tones with 89% accuracy and discriminated them with 87% accuracy, while the non-musicians identified them with only 69% accuracy and discriminated them with 71% accuracy. These results provide counter-evidence to the theory of dissociation between music and speech processing.
Article
Studies evaluating phonological contrast learning typically investigate either the predictiveness of specific pretraining aptitude measures or the efficacy of different instructional paradigms. However, little research considers how these factors interact--whether different students learn better from different types of instruction--and what the psychological basis for any interaction might be. The present study demonstrates that successfully learning a foreign-language phonological contrast for pitch depends on an interaction between individual differences in perceptual abilities and the design of the training paradigm. Training from stimuli with high acoustic-phonetic variability is generally thought to improve learning; however, we found high-variability training enhanced learning only for individuals with strong perceptual abilities. Learners with weaker perceptual abilities were actually impaired by high-variability training relative to a low-variability condition. A second experiment assessing variations on the high-variability training design determined that the property of this learning environment most detrimental to perceptually weak learners is the amount of trial-by-trial variability. Learners' perceptual limitations can thus override the benefits of high-variability training where trial-by-trial variability in other irrelevant acoustic-phonetic features obfuscates access to the target feature. These results demonstrate the importance of considering individual differences in pretraining aptitudes when evaluating the efficacy of any speech training paradigm.
Article
Auditory training has been shown to be effective in the identification of non-native segmental distinctions. This study investigated whether such training is applicable to the acquisition of non-native suprasegmental contrasts, i.e., Mandarin tones. Using the high-variability paradigm, eight American learners of Mandarin were trained in eight sessions during the course of two weeks to identify the four tones in natural words produced by native Mandarin talkers. The trainees' identification accuracy revealed an average 21% increase from the pretest to the post-test, and the improvement gained in training was generalized to new stimuli (18% increase) and to new talkers and stimuli (25% increase). Moreover, the six-month retention test showed that the improvement was retained long after training by an average 21% increase from the pretest. The results are discussed in terms of non-native suprasegmental perceptual modification, and the analogies between L2 acquisition processes at the segmental and suprasegmental levels.
Article
This study examined the perception of the four Mandarin lexical tones by Mandarin-naïve Hong Kong Cantonese, Japanese, and Canadian English listener groups. Their performance on an identification task, following a brief familiarization task, was analyzed in terms of tonal sensitivities (A-prime scores on correct identifications) and tonal errors (confusions). The A-prime results revealed that the English listeners' sensitivity to Tone 4 identifications specifically was significantly lower than that of the other two groups. The analysis of tonal errors revealed that all listener groups showed perceptual confusion of tone pairs with similar phonetic features (T1-T2, T1-T4 and T2-T3 pairs), but not of those with completely dissimilar features (T1-T3, T2-T4, and T3-T4). Language-specific errors were also observed in their performance, which may be explained within the framework of the Perceptual Assimilation Model (PAM: Best, 1995; Best & Tyler, 2007). The findings imply that linguistic experience with native tones does not necessarily facilitate non-native tone perception. Rather, the phonemic status and the phonetic features (similarities or dissimilarities) between the tonal systems of the target language and the listeners' native languages play critical roles in the perception of non-native tones.
Article
This study investigated whether individuals with small and large native-language (L1) vowel inventories learn second-language (L2) vowel systems differently, in order to better understand how L1 categories interfere with new vowel learning. Listener groups whose L1 was Spanish (5 vowels) or German (18 vowels) were given five sessions of high-variability auditory training for English vowels, after having been matched to assess their pre-test English vowel identification accuracy. Listeners were tested before and after training in terms of their identification accuracy for English vowels, the assimilation of these vowels into their L1 vowel categories, and their best exemplars for English (i.e., perceptual vowel space map). The results demonstrated that Germans improved more than Spanish speakers, despite the Germans' more crowded L1 vowel space. A subsequent experiment demonstrated that Spanish listeners were able to improve as much as the German group after an additional ten sessions of training, and that both groups were able to retain this learning. The findings suggest that a larger vowel category inventory may facilitate new learning, and support a hypothesis that auditory training improves identification by making the application of existing categories to L2 phonemes more automatic and efficient.
Article
This study investigated speaker normalization in perception of Mandarin tone 2 (mid-rising) and tone 3 (low-falling-rising) by examining listeners' use of F0 range as a cue to speaker identity. Two speakers were selected such that tone 2 of the low-pitched speaker and tone 3 of the high-pitched speaker occurred at equivalent F0 heights. Production and perception experiments determined that turning point (or inflection point of the tone) and delta F0 (the difference in F0 between onset and turning point) distinguished the two tones. Three tone continua varying in either turning point, delta F0, or both acoustic dimensions were then appended to a natural precursor phrase from each of the two speakers. Results showed identification shifts such that identical stimuli were identified as low tones for the high precursor condition, but as high tones for the low precursor condition. Stimuli varying in turning point showed no significant shift, suggesting that listeners normalize only when the precursor varies in the same dimension as the stimuli. The magnitude of the shift was greater for stimuli varying only in delta F0, as compared to stimuli varying in both turning point and delta F0, indicating that normalization effects are reduced for stimuli more closely matching natural speech.
Article
Many weak elements in speech, such as schwa in English and neutral tone in Standard Chinese, are commonly assumed to be unspecified or underspecified phonologically. The surface phonetic values of these elements are assumed to derive from interpolation between the adjacent phonologically specified elements or from the spreading of the contextual phonological features. In the present study, we re-evaluate this view by investigating detailed F0 contours of neutral-tone syllables in Standard Chinese, which are widely accepted as toneless underlyingly. We recorded sentences containing 0-3 consecutive neutral-tone syllables at two speaking rates with two focus conditions. Results of the experiment indicate that neutral-tone syllables do have a target that is independent of the surrounding tones, which is likely to be static and mid. Furthermore, the neutral tone is found to be different from the full lexical tones in the manner with which the underlying tonal target is implemented: it is slow and ineffective both in overcoming the influence of the preceding full lexical tone and in approaching its own target. Applying the recently proposed pitch target approximation model, we conclude that the neutral tone differs from the other lexical tones in Standard Chinese not only in terms of its mid target, but also in terms of the weak articulatory strength with which this target is implemented. Finally, we suggest that this new understanding is potentially applicable to other weak elements in speech.
Article
The present study explores the use of extrinsic context in perceptual normalization for the purpose of identifying lexical tones in Cantonese. In each of four experiments, listeners were presented with a target word embedded in a semantically neutral sentential context. The target word was produced with a mid level tone and it was never modified throughout the study, but on any given trial the fundamental frequency of part or all of the context sentence was raised or lowered to varying degrees. The effect of perceptual normalization of tone was quantified as the proportion of non-mid level responses given in F0-shifted contexts. Results showed that listeners' tonal judgments (i) were proportional to the degree of frequency shift, (ii) were not affected by non-pitch-related differences in talker, and (iii) were affected by the frequency of both the preceding and following context, although (iv) following context affected tonal decisions more strongly than did preceding context. These findings suggest that perceptual normalization of lexical tone may involve a "moving window" or "running average" type of mechanism that selectively weights more recent pitch information over older information, but does not depend on the perception of a single voice.
Article
This article is a critical research synthesis of 32 studies that used the High Variability Phonetic Training (HVPT) technique to teach learners to better perceive and produce L2 sounds. Taken together, the studies surveyed provide compelling evidence that HVPT is a very effective pronunciation training tool, and that resulting improvement is long-lasting. The analysis of this research also helps to explain why very few teachers have heard of this empirically-driven approach to pronunciation instruction: HVPT studies are largely published in technically oriented journals; few are accessible to language teachers. A variety of obstacles to the widespread use of HVPT are discussed, and some possible solutions are provided.
Article
We tested the usability of prosody visualization techniques for second language (L2) learners. Eighteen Danish learners realized target sentences in German based on different visualization techniques. The sentence realizations were annotated by means of the phonological Kiel Intonation Model and then analyzed in terms of (a) prosodic-pattern consistency and (b) correctness of the prosodic patterns. In addition, the participants rated the usability of the visualization techniques. The results from the phonological analysis converged with the usability ratings in showing that iconic techniques, in particular the stylized “hat pattern” visualization, performed better than symbolic techniques, and that marking prosodic information beyond intonation can be more confusing than instructive. In discussing our findings, we also provide a description of the new Danish-German learner corpus we created: DANGER. It is freely available for interested researchers upon request.
Article
Research into pronunciation has often disregarded its potential to inform pedagogy. This is due partly to the historical development of pronunciation teaching and research, but its effect is that there is often a mismatch between research and teaching. This paper looks at four areas in which the (mis)match is imperfect but in which a greater recognition of research can lead to better teaching materials (high variability phonetic training, intonation, information structure, and setting priorities). Furthermore, two areas in which teaching materials are desperate for research to be carried out (connected speech and the primacy of suprasegmentals) will be discussed.
Article
Productions of Tone 4 and Tone 3 (mài/măi, ‘sell’/‘buy’) in comparable sentences suggest that although the two tones are realized in different ways by different speakers in different speech acts, some features are constant. Tone 3 is connected with a low pitch level throughout the second half of the vowel and Tone 4 with a gradual fall over the main part of the vocalic segment. These observations were tested in a series of manipulations of pitch movements over mài from Tone 4 to Tone 3 in the sentence Sòng Yán mài niúròu. The manipulated sentences were presented in a test, in which listeners were asked if they heard mài or măi. The result confirmed the observed constant features and indicated in addition that it was important for both tones to have a clear reference. The identification of Tone 4 was favoured by an introductory rising or level part, and for Tone 3 an introductory fall seemed to be important. Creaky voice is a concomitant but not a necessary feature of Tone 3.
Article
Native speakers of Japanese learning English generally have difficulty differentiating the phonemes /r/ and /l/, even after years of experience with English. Previous research that attempted to train Japanese listeners to distinguish this contrast using synthetic stimuli reported little success, especially when transfer to natural tokens containing /r/ and /l/ was tested. In the present study, a different training procedure that emphasized variability among stimulus tokens was used. Japanese subjects were trained in a minimal pair identification paradigm using multiple natural exemplars contrasting /r/ and /l/ from a variety of phonetic environments as stimuli. A pretest-posttest design containing natural tokens was used to assess the effects of training. Results from six subjects showed that the new procedure was more robust than earlier training techniques. Small but reliable differences in performance were obtained between pretest and posttest scores. The results demonstrate the importance of stimulus variability and task-related factors in training nonnative speakers to perceive novel phonetic contrasts that are not distinctive in their native language.
Article
This study first replicated Shen and Lin’s [Lang. Speech 34(2), 145–156 (1991)] experiment with improved stimuli to confirm their claim that the F0 turning point was a perceptual cue to Mandarin tones 2 and 3, and further investigated the relative importance of the timing of the turning point and F0 height. The results indicated that, indeed, the distinction between these two tones was cued by the timing of the turning point, and that F0 height had little effect on the labeling.
Article
This study examines whether second language (L2) learners from tonal and non-tonal first language (L1) backgrounds differ in their perception and production of L2 tones. Ten English-speaking and nine Cantonese-speaking learners participated in Experiment 1, which consisted of the following three tasks: identifying auditory tonal stimuli using Mandarin tonal labels (Identification), mimicking tonal stimuli (Mimicry), and producing tones based upon Mandarin tonal labels (Reading). The results of Experiment 1 showed that the Cantonese group did not perform significantly better than the English group in perceiving and producing Mandarin tones. Both groups had significant difficulty in distinguishing Mandarin Tone 2 (T2) and Tone 3 (T3), and the Cantonese group also had additional trouble distinguishing Mandarin Tone 1 (T1) and Tone 4 (T4). Overall, across the different tasks of Experiment 1 learners had similar accuracy rates and error patterns, indicating comparable tone perception and production abilities. However, learners were significantly better at mimicking tones than at identifying or reading them, suggesting that the major difficulty learners faced in acquiring Mandarin tones was associating pitch contours with discrete tonal labels. This difficulty, however, may be specific to tone acquisition. Seven of the nine Cantonese participants took part in Experiment 2, which assessed their perceptual assimilation of Mandarin tones to Cantonese tones. The results of Experiment 2 helped explain Cantonese learners' T1–T4 confusion by showing that these two tones were mapped onto overlapping Cantonese tonal categories. However, the mapping results would not predict prevailing T2–T3 confusion as observed in Experiment 1, suggesting that this confusion stemmed from factors outside of learners' L1 experience. This study argues that the T2–T3 contrast is hard for L2 learners regardless of their native languages, because of these two tones' acoustic similarity and complex phonological relationship. This suggests that for explaining difficulties in acquisition of certain L2 sounds, factors other than learners' L1 background may also play a significant role.
Article
This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis.
Article
This reference grammar provides, for the first time, a description of the grammar of "Mandarin Chinese", the official spoken language of China and Taiwan, in functional terms, focusing on the role and meanings of word-level and sentence-level structures in actual conversations.
Article
The ability of native English (NE) and native Chinese (NC) speakers to identify and discriminate the mid- versus the low-tone contrast in Thai was investigated before and after auditory training. The variables under investigation were first language background and the interstimulus interval (ISI) of the presentation (500 ms vs. 1500 ms). The NC group outperformed the NE group in its ability to discriminate the two Thai tones under the ISI 500 ms condition before training and under both ISI conditions after training. A significant improvement in identification from the pretest to the posttest was observed in the NC group under both ISI conditions, but not in the NE group. These results suggest that prior experience with the tone system in one tone language may be transferable to the perception of tone in another language.
Article
The importance of consonant transitions for American English vowel identification has been demonstrated by studies of “silent-center” syllables (Strange, 1989a), in which only the initial and final portions of the syllable are presented. For such syllables, identification is quite accurate, nearly reaching the accuracy of intact syllables. The present study examined sources of acoustic information for vowels and tones in Mandarin Chinese, comparing identification of intact syllables to syllables with the initial and final portions removed (center-only), syllables with the centers removed (silent-center), and syllables with only the initial transition presented (initial-only). In Experiment 1, test syllables were presented with the word /dzι/ which immediately followed the test syllable in the original sentence. Native Chinese listeners made few identification errors with intact, center-only, and silent-center syllables. Non-native listeners made more errors overall, especially errors of tone identification, and did more poorly on silent-center syllables than on center-only syllables. Both natives and non-natives made more tone confusions in the initial-only condition, and they misidentified the diphthong /uo/ as /u/. In Experiment 2, test syllables were presented without the word /dzι/. Natives made more errors than in Experiment 1, suggesting that carry-over tonal coarticulation in the /dzι/ provided information about the test word's tone. Types of errors made by natives were different from those of non-natives. Explanations for the different error patterns are offered.
Article
J. M. Howie (1974, “On the Domain of Tone in Mandarin,” Phonetica 30, 129–148) has claimed that the basic contours of Mandarin tone are coextensive only with the syllabic vowel and any voiced segment that may follow it, and that any portion of fundamental frequency (F0) preceding them is merely an anticipatory adjustment of the voice. This is true insofar as tones are considered in isolated citation form. However, in connected speech, it has been found that Mandarin tones are perturbed due to tonal coarticulation, viz., a portion of F0 of syllabic vowels is affected (X.-N. S. Shen, 1990, “Tonal Coarticulation in Mandarin,” Journal of Phonetics 18, 281–295). Because of the carryover effects of the preceding tone, the adjustment of the voice in a subsequent tone is not only anticipatory but also preservative. A perceptual experiment was conducted to examine whether carryover coarticulatory perturbations occurring at syllabic vowels in excised tones were perceptible. The results show that listeners identified the F0 perturbations under carryover effects in excised tones at above-chance levels, and that the perturbations at initial syllabic vowels were better identified. Therefore, the concept of tone in Mandarin is revised as follows: in connected speech, a portion of fundamental frequency at intertonemic onset is perturbed; this includes initial voiced consonants and vowels (syllabic and nonsyllabic alike), and the perturbations result from preservative as well as anticipatory adjustments of the voice. This study also suggests that, F0 being an acoustic attribute of the signal and tone being a linguistic unit, their relation is many-to-one. In acoustical studies of tones in connected speech, tonal perturbations must be taken into consideration when measuring F0 to determine the phonetic tones; moreover, the acoustic properties of tones in isolated citation form are not appropriate for use as norms.
Article
Monolingual speakers of Japanese were trained to identify English /r/ and /l/ using Logan et al.'s [J. Acoust. Soc. Am. 89, 874-886 (1991)] high-variability training procedure. Subjects' performance improved from the pretest to the post-test and during the 3 weeks of training. Performance during training varied as a function of talker and phonetic environment. Generalization accuracy to new words depended on the voice of the talker producing the /r/-/l/ contrast: Subjects were significantly more accurate when new words were produced by a familiar talker than when new words were produced by an unfamiliar talker. This difference could not be attributed to differences in intelligibility of the stimuli. Three and six months after the conclusion of training, subjects returned to the laboratory and were given the post-test and tests of generalization again. Performance was surprisingly good on each test after 3 months without any further training: Accuracy decreased only 2% from the post-test given at the end of training to the post-test given 3 months later. Similarly, no significant decrease in accuracy was observed for the tests of generalization. After 6 months without training, subjects' accuracy was still 4.5% above pretest levels. Performance on the tests of generalization did not decrease and significant differences were still observed between talkers. The present results suggest that the high-variability training paradigm encourages a long-term modification of listeners' phonetic perception. Changes in perception are brought about by shifts in selective attention to the acoustic cues that signal phonetic contrasts. These modifications in attention appear to be retained over time, despite the fact that listeners are not exposed to the /r/-/l/ contrast in their native language environment.
Article
Two experiments were carried out to extend Logan et al.'s recent study [J. S. Logan, S. E. Lively, and D. B. Pisoni, J. Acoust. Soc. Am. 89, 874-886 (1991)] on training Japanese listeners to identify English /r/ and /l/. Subjects in experiment 1 were trained in an identification task with multiple talkers who produced English words containing the /r/-/l/ contrast in initial singleton, initial consonant clusters, and intervocalic positions. Moderate, but significant, increases in accuracy and decreases in response latency were observed between pretest and posttest and during training sessions. Subjects also generalized to new words produced by a familiar talker and novel words produced by an unfamiliar talker. In experiment 2, a new group of subjects was trained with tokens from a single talker who produced words containing the /r/-/l/ contrast in five phonetic environments. Although subjects improved during training and showed increases in pretest-posttest performance, they failed to generalize to tokens produced by a new talker. The results of the present experiments suggest that variability plays an important role in perceptual learning and robust category formation. During training, listeners develop talker-specific, context-dependent representations for new phonetic categories by selectively shifting attention toward the contrastive dimensions of the non-native phonetic categories. Phonotactic constraints in the native language, similarity of the new contrast to distinctions in the native language, and the distinctiveness of contrastive cues all appear to mediate category acquisition.
Article
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/ which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' increase in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.
Chuang, C. K., Hiki, S., Sone, T., & Nimura, T. (1972). The acoustical features and perceptual cues of the four tones of standard colloquial Chinese. In Proceedings of the 7th International Congress of Acoustics, vol. 3 (pp. 297-300). Budapest: Akadémiai Kiadó.
Gandour, J. (1983). Tone perception in Far Eastern languages. Journal of Phonetics. https://doi.org/10.1016/S0095-4470(19)30813-7
Li, C. (1987). Remarks on the Underlying Representations of Neutral-Tone Words in Mandarin Chinese. Stud. English Lit. Linguist, 189-199.
Lin, M. C. (1965). The pitch indicator and the pitch characteristics of tones in Standard Chinese. Acta Acoustica (China), 2, 8-15.
Liu, Y. (2008). Integrated Chinese: Simplified Characters Textbook, Level 1, Part 1 (English and Chinese Edition) (3rd ed.). Cheng & Tsui.
Tseng, C. (1990). An Acoustic Phonetic Study on Tones in Mandarin Chinese. Nankang, Taipei: Institute of History & Philology, Academia Sinica. special publications.
Wang, X. (2013). Perception of Mandarin tones: The effect of L1 background and training. The Modern Language Journal, 97(1), 144-160. https://doi.org/10.1111/j.1540-4781.2013.01386.x
Wong, P. C. M., Skoe, E., Russo, N. M., Dees, T., & Kraus, N. (2007). Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neuroscience, 10(4), 420. https://doi.org/10.1038/nn1872