Article

Abstract

The present study investigated the neurophysiological correlates of categorical perception of Chinese lexical tones in Mandarin Chinese. Relative to standard stimuli, both within- and across-category deviants elicited mismatch negativity (MMN) in bilateral frontal-central recording sites. The MMN elicited in the right sites was marginally larger than in the left sites, which reflects the role of the right hemisphere in acoustic processing. At the same time, relative to within-category deviants, the across-category deviants elicited larger MMN in the left recording sites, reflecting the long-term phonemic traces of lexical tones. These results provide strong neurophysiological evidence in support of categorical perception of lexical tones in Chinese. More important, they demonstrate that acoustic and phonological information is processed in parallel within the MMN time window for the perception of lexical tones. Finally, homologous nonspeech stimuli elicited similar MMN patterns, indicating that lexical tone knowledge influences the perception of nonspeech signals.


... Most previous L2 studies of lexical tone learning have focused on naturally spoken tonal categories as the perceptual targets (e.g., Mandarin Tone 1, Tone 2, Tone 3, and Tone 4) and found differential gains associated with the difficulty levels of individual lexical tones (e.g., Hao, 2012; Wayland & Guion, 2004). More recently, several studies of native speakers of tonal languages have examined two specific types of information critical for lexical tone perception (e.g., Luo et al., 2006; Xi, Zhang, Shu, Zhang, & Li, 2010). Specifically, acoustic information refers to the physical features of lexical tones as assessed by F0 (e.g., pitch height and pitch contour), and phonological information refers to the linguistic properties that distinguish each lexical tone for expressing semantic distinctions (e.g., the Mandarin syllable /fu/ spoken with a high flat tone, Tone 1, means skin, and the same syllable means rich when spoken with a short falling tone, Tone 4). ...
... Using ERP, particularly the MMN (mismatch negativity) measure, Xi et al. (2010), K. Yu, Wang, Li, and Li (2014), and L. Zhang, Xi, Wu, Shu, and Li (2012) have shown that these two types of information play significant but different roles in native lexical tone perception at the preattentive and attentive stages. The preattentive stage is an early processing stage at which stimuli are processed automatically and unconsciously, in contrast to the later attentive stage, at which stimuli are processed consciously (Kubovy et al., 1999; Neisser, 1967). ...
... In order to explore the processing of acoustic and phonological information in lexical tones, previous studies have adopted several methods to distinguish the acoustic information from the phonological information in lexical tones, such as the cross-language (native vs. nonnative) comparison (e.g., Chandrasekaran, Krishnan, & Gandour, 2007), speech and nonspeech (hum) comparison (e.g., Jia, Tsang, Huang, & Chen, 2015), interstimulus interval manipulation (Y. H. Yu, Shafer, & Sussman, 2017), and categorical perception paradigm (e.g., Xi et al., 2010). ...
Article
Learning the acoustic and phonological information in lexical tones is significant for learners of tonal languages. Although there is a wealth of knowledge from studies of second language (L2) tone learning, it remains unclear how L2 learners process acoustic versus phonological information differently depending on whether their first language (L1) is a tonal language. In the present study, we first examined proficient L2 learners of Mandarin with tonal and nontonal L1 in a behavioral experiment (identifying a Mandarin tonal continuum) to construct tonal contrasts that could differentiate the phonological from the acoustic information in Mandarin lexical tones for the L2 learners. We then conducted an ERP experiment to investigate these learners' automatic processing of acoustic and phonological information in Mandarin lexical tones via mismatch negativity (MMN). Although both groups of L2 learners showed behavioral identification features for the Mandarin tonal continuum similar to those of native speakers, L2 learners with nontonal L1, as compared with both native speakers and L2 learners with tonal L1, showed longer reaction times to the tokens of the Mandarin tonal continuum. More importantly, the MMN data further revealed distinct roles of acoustic and phonological information in the automatic processing of L2 lexical tones between the two groups of L2 learners. Taken together, the results indicate that the processing of acoustic and phonological information in L2 lexical tones may be modulated by L1 experience with a tonal language. The theoretical implications of the current study are discussed in light of models of L2 speech learning.
... Previous research has shown varying (i.e., from non-significant to significant) degrees of association between MMN responses and behavioral discrimination performance under different experimental conditions (e.g., Näätänen et al., 1993; Tiitinen et al., 1994; Kujala et al., 2001; Novitski et al., 2004; Chen and Sussman, 2013; Yu et al., 2017). There were also different findings regarding whether the MMN can reflect the category boundary effect in categorical perception (e.g., Ylinen et al., 2006; Xi et al., 2010). For instance, Ylinen et al. (2006) suggested that the status of the deviant relative to the phoneme boundary did not affect the MMN amplitude; that is, the MMN responses to within- and across-category differences did not differ regardless of whether the listeners demonstrated CP for the target sounds or not. ...
... For instance, Ylinen et al. (2006) suggested that the status of the deviant relative to the phoneme boundary did not affect the MMN amplitude; that is, the MMN responses to within- and across-category differences did not differ regardless of whether the listeners demonstrated CP for the target sounds or not. On the contrary, in studying the neurophysiological correlates of categorical perception of Chinese lexical tones, Xi et al. (2010) found that across-category deviants elicited larger MMN than within-category deviants in native speakers of Chinese, a phenomenon that was not observed in non-native speakers. Despite the controversies, speech research has indeed documented that the MMN can reflect learning-induced changes with enhanced MMN amplitude and reduced MMN latency (Menning et al., 2002; Kujala and Näätänen, 2010). ...
... In Cheng and Zhang (2013), Step 3 was 100% identified by 10 native speakers of American English as the phoneme /i/, while Step 7 and Step 11 were identified as the phoneme /I/ with accuracy rates of 96.5% and 100%, respectively. Stimulus presentation followed the Double Oddball Paradigm in Xi et al. (2010), which was implemented in E-Prime 2.0 (Psychology Software Tools Inc., United States). In this paradigm, an MMN response can be elicited when the repeated presentation of the standard stimulus (Step 7) is interrupted by either deviant stimulus (Step 3 or Step 11). ...
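The Double Oddball sequence described above can be sketched as a simple trial-list generator: a frequent standard (Step 7) interrupted by two equiprobable deviants (Step 3 and Step 11). This is only an illustrative sketch, not the authors' implementation; the overall deviant probability and the minimum number of standards between deviants are assumptions based on common oddball designs.

```python
import random

def make_oddball_sequence(n_trials=500, standard="step7",
                          deviants=("step3", "step11"),
                          deviant_prob=0.2, min_gap=2, seed=1):
    """Generate a double-oddball trial list: one frequent standard
    interrupted by two equiprobable deviants, with at least `min_gap`
    standards between successive deviants (assumed constraint)."""
    rng = random.Random(seed)
    seq, since_last = [], min_gap  # allow a deviant from the start
    for _ in range(n_trials):
        if since_last >= min_gap and rng.random() < deviant_prob:
            seq.append(rng.choice(deviants))  # the two deviants are equiprobable
            since_last = 0
        else:
            seq.append(standard)
            since_last += 1
    return seq

seq = make_oddball_sequence()
print(seq.count("step7") / len(seq))  # standard proportion, roughly 0.8-0.9
```

Because deviants are gated by the gap constraint, the realized deviant rate sits slightly below `deviant_prob`; real experiments typically fix exact trial counts per condition instead.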
Article
Full-text available
High variability phonetic training (HVPT) has been found to be effective in helping adult learners acquire nonnative phonetic contrasts. The present study investigated the role of temporal acoustic exaggeration by comparing the canonical HVPT paradigm without acoustic exaggeration with a modified adaptive HVPT paradigm that integrated key temporal exaggerations of infant-directed speech (IDS). Sixty native Chinese adults participated in training on the English /i/ and /ɪ/ vowel contrast and were randomly assigned to three subject groups. Twenty were trained with the typical HVPT (the HVPT group), twenty were trained under the modified adaptive approach with acoustic exaggeration (the HVPT-E group), and twenty were in the control group. Behavioral tasks for the pre- and post-tests used natural word identification, synthetic stimuli identification, and synthetic stimuli discrimination. Mismatch negativity (MMN) responses from the HVPT-E group were also obtained to assess the training effects on within- and across-category discrimination without requiring focused attention. As in previous studies, significant generalization effects to new talkers were found in both the HVPT group and the HVPT-E group. The HVPT-E group, by contrast, showed greater improvement as reflected in larger gains in natural word identification performance. Furthermore, the HVPT-E group exhibited more native-like categorical perception based on spectral cues after training, together with corresponding training-induced changes in the MMN responses to within- and across-category differences. These data provide initial evidence supporting the important role of temporal acoustic exaggeration with adaptive training in facilitating phonetic learning and promoting brain plasticity at the perceptual and pre-attentive neural levels.
... The mismatch response has also been used to examine the processing of tonal information in tone languages. Xi et al. (2010) used MMN to investigate categorical perception of Mandarin lexical tone 2 and tone 4 with the syllable structure /pa/. A 10-interval lexical tone continuum from tone 2 (stimulus 1) to tone 4 (stimulus 11) was created, producing a total of 11 stimuli. ...
... Interestingly, physically different tones across tonal categories yielded a stronger MMN amplitude than did physically different tones within the same tonal category in the left hemisphere but not in the right hemisphere. These results suggest that the left hemisphere may be more involved in the long-term phonemic processing of lexical tones, while the right hemisphere may play a more important role in acoustic processing (Ren et al., 2009; Xi et al., 2010). ...
... The lack of a tone 3 sandhi environment in this condition ensured that the initial tone 2 of the standards had a surface and an underlying tone 2 form. In this condition, there is no mismatch between the standards and deviants in underlying tone category or surface tone category (see Figure 1), as both the initial syllable of standards and deviants are tone 2. However, as this study utilizes a design in which multiple physically different tokens of the standards are used, this condition is intended to assess whether the physical difference between the set of standards and deviants is sufficient to yield an MMN (Kasai et al., 2001; Xi et al., 2010), or whether MMN will be absent due to the lack of a many-to-one ratio at the level of tone category. ...
Article
Full-text available
Phonological alternation (sound change depending on the phonological environment) poses challenges to spoken word recognition models. Mandarin Chinese T3 sandhi is such a phenomenon, in which a tone 3 (T3) changes into a tone 2 (T2) when followed by another T3. In a mismatch negativity (MMN) study examining Mandarin Chinese T3 sandhi, participants passively listened to either a T2 word [tʂu2 je4] /tʂu2 je4/, a T3 word [tʂu3 je4] /tʂu3 je4/, a sandhi word [tʂu2 jen3] /tʂu3 jen3/, or a mix of T3 and sandhi word standards. The deviant in each condition was a T2 word [tʂu2]. Results showed an MMN only in the T2 and T3 conditions, but not in the Sandhi or Mix conditions. All conditions also yielded omission MMNs. This pattern cannot be explained based on the surface forms of standards and deviants; rather, these data suggest that an underspecified or underlying stored T3 linguistic representation is used in spoken word processing.
... We try to bridge speech perception and reading through the former's predictive role. We chose the classical oddball paradigm, and the stimuli were classical speech stimuli (Xi et al., 2010), a set that has been used in many studies (Yu et al., 2014, 2017). The neural marker we selected was the late MMN, and this component, together with the early MMN, has confirmed its sensitivity in groups such as autistic children and bilingual Cantonese–Mandarin speakers (Yu et al., 2015, 2017). ...
... To measure children's phonological awareness as well as their reading ability, we selected classic behavioral tasks that have been used in previous studies. The first is the classical task of categorical perception, the identification task (Xi et al., 2010). This task involves only basic cognitive processes such as decision-making, engaging fewer cognitive abilities and providing a purer measure of perceptual abilities than other behavioral tasks. ...
... The /pa2/ and /pa4/ stimuli were taken as the endpoint stimuli, and a morphing technique was then performed in MATLAB (MathWorks Corporation, Natick, MA, United States) using STRAIGHT (Kawahara et al., 1999) to create a 10-interval lexical tone continuum (Xi et al., 2010). All 11 stimuli in the /pa2/–/pa4/ lexical tone continuum were used in the behavioral identification test. ...
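STRAIGHT morphing interpolates full spectro-temporal representations and cannot be reproduced in a few lines, but the core idea of a 10-interval continuum can be sketched as a weighted interpolation between the two endpoint F0 contours. The contour values below are illustrative assumptions, not the study's stimuli.

```python
import numpy as np

def f0_continuum(f0_start, f0_end, n_steps=11):
    """Interpolate between two endpoint F0 contours in equal steps,
    yielding n_steps stimuli spanning 10 equal intervals."""
    f0_start = np.asarray(f0_start, dtype=float)
    f0_end = np.asarray(f0_end, dtype=float)
    weights = np.linspace(0.0, 1.0, n_steps)  # 0 = Tone 2 end, 1 = Tone 4 end
    return np.stack([(1 - w) * f0_start + w * f0_end for w in weights])

# Stylized endpoint contours in Hz (hypothetical values):
tone2 = np.linspace(110, 140, 50)   # rising contour, /pa2/
tone4 = np.linspace(160, 100, 50)   # falling contour, /pa4/
continuum = f0_continuum(tone2, tone4)
print(continuum.shape)  # (11, 50): 11 stimuli, 50 F0 samples each
```

Each intermediate row is acoustically equidistant from its neighbors, which is what lets within- and across-category deviants be matched for physical distance from the standard.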
Article
Full-text available
Background Deficits in phonological processing are commonly reported in dyslexia, but longitudinal evidence that poor speech perception compromises reading is scant. This 2-year longitudinal ERP study investigates changes in the pre-attentive auditory processing that underlies categorical perception of Mandarin lexical tones during the years children learn to read fluently. The main purpose of the present study was to explore the development of lexical tone categorical perception to see whether it can predict children's reading ability. Methods Both behavioral and electrophysiological measures were taken in this study. Auditory event-related potentials were collected with a passive listening oddball paradigm. Using a stimulus continuum spanning from one lexical tone category exemplar to another, we identified a between-category and a within-category tone deviant that were acoustically equidistant from a standard stimulus. The standard stimulus occurred on 80% of trials, and one of the two deviants (between-category or within-category) occurred equiprobably on the remaining trials. Eight-year-old Mandarin speakers participated in an initial ERP oddball paradigm and returned for a 2-year follow-up. Results The between-category MMN and within-category MMN significantly correlate with each other at age 8 (p = 0.001) but not at age 10. The between-category MMN at age 8 can predict children's reading ability at age 10 (p = 0.03), but the within-category MMN cannot. Conclusion The categorical perception of lexical tone is still developing from age 8 to age 10. The behavioral and electrophysiological results demonstrate that categorical perception of lexical tone at age 8 predicts children's reading ability at age 10.
... Interestingly, physically different tones across tonal categories yielded a stronger MMN amplitude than did physically different tones within the same tonal category in the left hemisphere but not in the right hemisphere. These results suggest that the right hemisphere may play a more important role in acoustic processing (Xi et al., 2010; Ren, Yang & Li, 2009), while the left hemisphere may be more involved in the long-term phonemic processing of lexical tones (Xi et al., 2010). These data also suggest that acoustic information (right hemisphere) and phonological information (left hemisphere) were processed simultaneously for Mandarin lexical tones in the same MMN time window. ...
... These data also suggest that acoustic information (right hemisphere) and phonological information (left hemisphere) were processed simultaneously for Mandarin lexical tones in the same MMN time window. Finally, these data demonstrated that non-speech stimuli with the same F0 contours as their corresponding speech counterparts elicited a similar MMN pattern as speech stimuli, indicating that Mandarin speakers can transfer their lexical tone knowledge to non-speech perception (Xi et al., 2010;Ren et al., 2009). ...
... In Mandarin, there are four lexical tones, which use different pitch contours to discriminate lexical items: /mā/ (level f0 contour with a slight drop at the end of the utterance) means mother, /má/ (rising f0) means hemp, /mǎ/ (falling-rising f0) means horse, and /mà/ (falling f0) means scold. Previous studies have argued that lexical tone is perceived categorically, based on both behavioral and ERP data (Xu, Gandour, & Francis, 2006; Xi, Zhang, Shu, Zhang, & Li, 2010; Zhang, Xi, Wu, Shu, & Li, 2012). However, some of these studies are problematic methodologically (e.g., low N's; inappropriate use of high-pass filters in ERP data), and they have often not investigated non-native listeners' responses to tone stimuli. ...
... A number of studies examining lexical tone perception in adult listeners have used the categorical perception framework to understand how native speakers of tone languages categorize stimuli varying along tone continua (Xu et al., 2006; Xi et al., 2010; Zhang et al., 2012). One way that this has been studied is by editing f0 contours in spoken syllables in a graded manner, with the ends of the continua representing prototypical contours for tones. ...
... That is, two members of the same category would be less discriminable than two tokens from different categories, even with an equivalent acoustic difference between them. Xi et al. (2010) used a categorical perception task to examine listeners' perception of tone stimuli varying along an 11-step continuum from tones 2 to 4, embedded in the syllable /ba/. Identification data showed typical categorization functions, with a steep slope at the category boundary, and discrimination performance showed a peak at the category boundary, consistent with categorical perception. ...
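The identification function described here, with its steep slope at the category boundary, is commonly quantified by fitting a sigmoid and taking the 50% crossover point as the boundary. A minimal sketch using a logit-linear fit on made-up response proportions (not the original study's data or analysis):

```python
import numpy as np

def category_boundary(steps, p_tone4):
    """Estimate the category boundary (50% crossover) of an identification
    function by fitting a straight line to the log-odds of the responses."""
    p = np.clip(np.asarray(p_tone4, dtype=float), 1e-3, 1 - 1e-3)
    log_odds = np.log(p / (1 - p))
    slope, intercept = np.polyfit(steps, log_odds, 1)
    return -intercept / slope  # step where log-odds = 0, i.e., p = 0.5

# Hypothetical proportions of "Tone 4" responses along an 11-step continuum:
steps = np.arange(1, 12)
p_tone4 = [0.02, 0.03, 0.05, 0.10, 0.25, 0.55, 0.85, 0.95, 0.97, 0.98, 0.99]
boundary = category_boundary(steps, p_tone4)
print(round(boundary, 2))  # the 50% crossover falls between steps 5 and 7
```

Discrimination peaks are then predicted for stimulus pairs that straddle this estimated boundary, which is the pattern reported in the identification and discrimination data above.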
Article
Full-text available
Native speakers of tonal languages have been argued to perceive lexical tone continua in a more categorical manner than speakers of non-tonal languages. Among these studies, Zhang and colleagues (NeuroReport 23 (1): 35-9) conducted an event-related potential (ERP) study using an oddball paradigm showing that native Mandarin speakers exhibit different sensitivity to deviant tones that cross category boundaries compared to deviants that belong to the same category as the standard. Other recent ERP findings examining consonant voicing categories question whether perception is truly categorical. The current study investigated these discrepant findings by replicating and extending the Zhang et al. study. Native Mandarin speakers and naïve English speakers performed an auditory oddball detection test while ERPs were recorded. Naïve English speakers were included to test for language experience effects. We found that Mandarin speakers and English speakers demonstrated qualitatively similar responses, in that both groups showed a larger N2 to the across-category deviant and a larger P3 to the within-category deviant. The N2/P3 pattern also did not differ in scalp topography for the within- versus across-category deviants, as was reported by Zhang et al. Cross-language differences surfaced in behavioral results, where Mandarin speakers showed better discrimination for the across-category deviant, but English speakers showed better discrimination for within-category deviants, though all results were near ceiling. Our results therefore support models suggesting that listeners remain sensitive to gradient acoustic differences in speech even when they have learned phonological categories along an acoustic dimension.
... As shown in a number of studies in both the auditory and visual modalities, larger MMN amplitudes are elicited by across-category deviants compared with within-category deviants. For example, discontinuous MMN responses reflecting adult categorical perception have been reported for phonemes (Kazanina et al., 2006; Kasai et al., 2001; Sharma & Dorman, 1999; Winkler et al., 1999; Dehaene-Lambertz, 1997), Mandarin tones (Xi, Zhang, Shu, Zhang, & Li, 2010), and colors (Mo et al., 2011). These findings suggest that the enhanced MMN response to cross-category deviants compared with within-category deviants with equal physical variance is a reliable indicator of categorical discrimination at a relatively early stage of perceptual processing. ...
... The first step in computing sMMN amplitude was to subtract the ERP for one stimulus as the control from the ERP when the same stimulus was the deviant (Zheng, Minett, Peng, & Wang, 2012; Xi et al., 2010). The most negative peak in the deviant-minus-control difference wave between 100 and 200 msec (Garrido et al., 2009; Näätänen et al., 2005) was identified at the selected electrodes for each participant. ...
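The deviant-minus-control subtraction and peak search described in this passage can be sketched in a few lines of numpy. The sampling grid, search window, and synthetic waveforms below are illustrative assumptions, not the study's recordings.

```python
import numpy as np

def mmn_peak(erp_deviant, erp_control, times, window=(0.100, 0.200)):
    """Identity-MMN sketch: subtract the ERP to a stimulus serving as
    control from the ERP to the same stimulus serving as deviant, then
    return the most negative point of the difference wave in the window."""
    diff = np.asarray(erp_deviant) - np.asarray(erp_control)
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmin(diff[mask])          # most negative peak
    return diff[mask][idx], times[mask][idx]

# Synthetic single-channel ERPs sampled every 2 ms (illustrative only):
times = np.arange(-0.1, 0.4, 0.002)
control = np.zeros_like(times)
deviant = -2.0 * np.exp(-((times - 0.150) ** 2) / (2 * 0.02 ** 2))  # dip at 150 ms
amp, lat = mmn_peak(deviant, control, times)
print(lat, amp)  # peak latency near 0.150 s, amplitude near -2 µV
```

Using the same stimulus as its own control (the identity MMN method mentioned above) removes stimulus-specific exogenous responses, so the residual negativity reflects deviance detection rather than acoustic differences between stimuli.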
... The current results provide novel electrophysiological evidence of categorical discrimination in the tactile modality, in line with findings in the domain of speech perception (e.g., Xi et al., 2010; Minagawa-Kawai, Mori, & Sato, 2005). In our first protocol, the category boundary that was examined was the wrist joint, which separates the hand from the forearm. ...
Article
The focus of the current study is on a particular aspect of tactile perception: categorical segmentation of the body surface into discrete body parts. The MMN has been shown to be sensitive to categorical boundaries and language experience in the auditory modality. Here we recorded the somatosensory MMN (sMMN) using two tactile oddball protocols and compared sMMN amplitudes elicited by within- and across-boundary oddball pairs. Both protocols employed the identity MMN method, which controls for responsivity at each body location. In the first protocol, we investigated the categorical segmentation of tactile space at the wrist by presenting pairs of tactile oddball stimuli across equal spatial distances, either across the wrist or within the forearm. The amplitude of the sMMN elicited by stimuli presented across the wrist boundary was significantly greater than for stimuli presented within the forearm, suggesting a categorical effect at an early stage of somatosensory processing. The second protocol was designed to investigate the generality of this MMN effect and involved three digits on one hand. The amplitude of the sMMN elicited by a contrast of the third digit and the thumb was significantly larger than for a contrast between the third and fifth digits, suggesting a functional boundary effect that may derive from the way that objects are typically grasped. These findings demonstrate that the sMMN is a useful index of somatosensory spatial discrimination that can be used to study body part categories.
... Specifically, the acoustic information contains pitch features, such as pitch height and contour variations, while the phonological information differentiates different lexical semantics on the basis of different tonal categories (Gandour et al., 2000). Several neuroimaging studies have indicated that the acoustic information and the phonological information of lexical tones in native speakers are essentially two distinct auditory inputs, which exhibit different cognitive and physiological characteristics (Xi et al., 2010; Zhang et al., 2011, 2012; Yu et al., 2014, 2017). The phonological processing, which develops early in native speakers, is driven by an accumulation of perceptual development from the ambient native sound inputs (Kuhl, 2004). ...
... In order to illuminate the level of lexical tone processing difficulties in Mandarin-speaking amusics, this study manipulated tonal category (within-category/across-category) along a pitch continuum to directly dissociate acoustic processing from the phonological processing of lexical tones for native speakers, an approach that was widely adopted in previous ERP studies (Xi et al., 2010; Zhang et al., 2012; Yu et al., 2014, 2017). A posttest identification test after the current P300 experiment confirmed that both the controls and amusics perceived (1) stimulus #1 (within-category deviant) as the same tone category as stimulus #4 (standard) and (2) ...
... The other line of evidence regarding brain lateralization indicates that although lexical tonal processing engages both hemispheres, pure acoustic processing tends to occur in the right hemisphere, while phonological and semantic processing mainly occurs in the left hemisphere (Gandour et al., 2004; Gandour, 2006). Especially for Mandarin tone processing, Xi et al. (2010) found that between-category tonal deviants (i.e., phonological processing) elicited larger MMN than within-category tonal deviants (i.e., acoustic processing) in the left hemisphere, while the within-category deviants elicited larger MMN in the right hemisphere. At the attentive stage, Zhang et al. (2012) found that for the P300 component, the amplitudes elicited by the within-category deviants were similar between the left and the right recording sites. ...
Article
Full-text available
Previous studies have shown that for congenital amusics, long-term tone language experience cannot compensate for lexical tone processing difficulties. However, it is still unknown whether such difficulties are merely caused by domain-transferred insensitivity in lower-level acoustic processing and/or by higher-level phonological processing of linguistic pitch as well. The current P300 study links and extends previous studies by uncovering the neurophysiological mechanisms underpinning lexical tone perception difficulties in Mandarin-speaking amusics. Both the behavioral index (d′) and P300 amplitude showed reduced within-category as well as between-category sensitivity among the Mandarin-speaking amusics regardless of the linguistic status of the signal. The results suggest that acoustic pitch processing difficulties in amusics are manifested profoundly and further persist into the higher-level phonological processing that involves the neural processing of different lexical tone categories. Our findings indicate that long-term tone language experience may not compensate for the reduced acoustic pitch processing in tone language speakers with amusia but rather may extend to the neural processing of the phonological information of lexical tones during the attentive stage. However, from both the behavioral and neural evidence, the peakedness scores of the d′ and P300 amplitude were comparable between amusics and controls. It seems that the basic categorical perception (CP) pattern of native lexical tones is preserved in Mandarin-speaking amusics, indicating that they may have normal or near normal long-term categorical memory.
... Most prior studies examined lexical tones as a whole (e.g., Wang et al., 2001; Francis et al., 2003; Hallé et al., 2004), while more recent studies have turned to the dynamic interaction between the acoustic and phonological information of lexical tones (e.g., Xi et al., 2010; Zhang et al., 2011; Yu et al., 2014). In general, the acoustic information consists of the physical features of lexical tones as estimated by F0 (e.g., pitch height and pitch contour), while the phonological information refers to the linguistic properties with tonal categories to distinguish lexical semantics (Yu et al., 2019). ...
... Although some secondary cues might influence the judgment of lexical tone contrasts, F0 remains the most critical, as amply confirmed by the seminal study of Wang (1976) and subsequent studies of categorical perception of lexical tones (Xu et al., 2006; Peng et al., 2010; Shen and Froud, 2016). It is reported that for Mandarin lexical tones, the perception of within-category pairs mainly depends on lower-level acoustic information of pitch, yet the perception of across-category comparisons is principally reliant on higher-level phonological information of lexical categories (Fujisaki and Kawashima, 1971; Xi et al., 2010; Yu et al., 2014). Importantly, distinguishing the acoustic and phonological information of lexical tones paints a clearer picture of the mechanisms underlying lexical tone perception. ...
... Importantly, distinguishing the acoustic and phonological information of lexical tones paints a clearer picture of the mechanisms underlying lexical tone perception. For example, the study by Xi et al. (2010) revealed that when listening to Mandarin lexical tones, native speakers process the acoustic information and the phonological information simultaneously. A follow-up study by Yu et al. (2014) manipulated the phonological categories and acoustic intervals of lexical tones and replicated the results of Xi et al. (2010). ...
Article
Full-text available
Music impacting on speech processing is vividly evidenced in most reports involving professional musicians, while the question of whether the facilitative effects of music are limited to experts or may extend to amateurs remains to be resolved. Previous research has suggested that analogous to language experience, musicianship also modulates lexical tone perception but the influence of amateur musical experience in adulthood is poorly understood. Furthermore, little is known about how acoustic information and phonological information of lexical tones are processed by amateur musicians. This study aimed to provide neural evidence of cortical plasticity by examining categorical perception of lexical tones in Chinese adults with amateur musical experience relative to the non-musician counterparts. Fifteen adult Chinese amateur musicians and an equal number of non-musicians participated in an event-related potential (ERP) experiment. Their mismatch negativities (MMNs) to lexical tones from Mandarin Tone 2–Tone 4 continuum and non-speech tone analogs were measured. It was hypothesized that amateur musicians would exhibit different MMNs to their non-musician counterparts in processing two aspects of information in lexical tones. Results showed that the MMN mean amplitude evoked by within-category deviants was significantly larger for amateur musicians than non-musicians regardless of speech or non-speech condition. This implies the strengthened processing of acoustic information by adult amateur musicians without the need of focused attention, as the detection of subtle acoustic nuances of pitch was measurably improved. In addition, the MMN peak latency elicited by across-category deviants was significantly shorter than that by within-category deviants for both groups, indicative of the earlier processing of phonological information than acoustic information of lexical tones at the pre-attentive stage. 
The results mentioned above suggest that cortical plasticity can still be induced in adulthood, hence non-musicians should be defined more strictly than before. Besides, the current study enlarges the population demonstrating the beneficial effects of musical experience on perceptual and cognitive functions, namely, the effects of enhanced speech processing from music are not confined to a small group of experts but extend to a large population of amateurs.
... The P300 effect has been interpreted as an index of discrimination of speech stimuli by phonological information (Maiste et al., 1995; Frenck-Mestre et al., 2005; Zheng et al., 2012). Both the MMN and the P300 have been found in lexical tone processing in tone languages, but they have predominantly been examined at the level of categorical perception (Luo et al., 2006; Xi et al., 2010; Zheng et al., 2012; Yu et al., 2014, 2017). Kung et al. (2014) examined the online interplay of tone and intonation in a larger context (rather than single word stimuli) in Cantonese Chinese. ...
... However, these previous ERP studies in tone languages have focused on lexical tones (Xi et al., 2010; Zheng et al., 2012). So far, no study has examined ERP responses during the processing of grammatical tone. ...
Article
Full-text available
Previous electrophysiological studies that examined temporal agreement violations in (Indo-European) languages that use grammatical affixes to mark time reference have found Left Anterior Negativity (LAN) and/or P600 ERP components, reflecting morpho-syntactic and syntactic processing, respectively. The current study investigates the electrophysiological processing of temporal relations in an African language (Akan) that uses grammatical tone, rather than morphological inflection, for time reference. Twenty-four native speakers of Akan listened to sentences with time reference violations. Our results demonstrate that a violation of a present context by a past verb yields a P600 time-locked to the verb. There was no such effect when a past context was violated by a present verb. In conclusion, while Akan and Indo-European languages are similar as far as the modulation of the P600 effect is concerned, the nature of this effect seems to differ between these languages.
... There is some discrepancy with respect to the effect of tone language experience on MMN responses to linguistic and non-linguistic pitch change. When non-speech stimuli (harmonic or pure tones) are closely matched to lexical tones (identical amplitude information, duration, and fundamental frequency (F0)), native tone language listeners have been shown to exhibit comparable MMNs to lexical tones and non-speech analogues (Gu, Zhang, Hu, & Zhao, 2013; Xi, Zhang, Shu, Zhang, & Li, 2010), suggesting common neural mechanisms for processing both speech and non-speech pitch contours. Different results have been found when comparing tone and non-tone language listeners' MMN responses to pitch stimuli. ...
... How language experience may affect MMN lateralization is poorly understood. When presented with native lexical tones, MMN lateralization in tone language listeners differed across studies: some have found right lateralization (Luo et al., 2006;Ren, Yang, & Li, 2009;Xi et al., 2010); others no clear lateralization (Chandrasekaran, et al., 2007b); and yet others left lateralization (Gu et al., 2013). With regard to music, several studies report a frontal-central distribution of the MMN elicited by contour violation without clear lateralization (Trainor, McDonald, & Alain, 2002;Vuust, Brattico, Seppänen, Näätänen, & Tervaniemi, 2012). ...
Article
Language experience shapes musical and speech pitch processing. We investigated whether speaking a lexical tone language natively modulates neural processing of pitch in language and music, as well as their correlation. We tested tone language (Mandarin Chinese) and non-tone language (Dutch) listeners in a passive oddball paradigm, measuring mismatch negativity (MMN) for (i) Chinese lexical tones and (ii) three-note musical melodies with similar pitch contours. For lexical tones, Chinese listeners showed a later MMN peak than the non-tone language listeners, whereas for MMN amplitude there were no significant differences between groups. Dutch participants also showed a late discriminative negativity (LDN). In the music condition, two MMNs, corresponding to the two notes that differed between the standard and the deviant, were found for both groups, as was an LDN for both the Dutch and the Chinese listeners. The music MMNs were significantly right lateralized. Importantly, significant correlations were found between the lexical tone and the music MMNs for the Dutch but not the Chinese participants. The results suggest that speaking a tone language natively does not necessarily enhance neural responses to pitch in either language or music, but that it does change the nature of neural pitch processing: non-tone language speakers appear to perceive lexical tones as musical, whereas for tone language speakers, lexical tones and music may activate different neural networks. Neural resources seem to be assigned differently to lexical tones and to musical melodies, presumably depending on the presence or absence of long-term phonological memory traces.
... While previous studies in line with the NLNC have largely focused on cortical processing of phonemic representations (e.g., /r/ vs /l/), some studies have found supporting evidence at the suprasegmental level. For instance, when compared with non-tonal language users, native speakers of tonal languages such as Chinese and Thai show enhanced cortical responses to perceptually challenging but linguistically relevant pitch contrasts (Chandrasekaran et al., 2007; Xi et al., 2010) and better discriminative ability for tones in the native speech context (Lee et al., 1996). ...
... This FFR enhancement is thought to reflect the fine-tuned auditory processing in Chinese speakers for stronger neural phase-locking of waveform periodicity and representation of F0 for lexical tones (Krishnan et al., 2005, 2009a, b, 2010a, b, c; Swaminathan et al., 2008a). Notably, such a cross-domain experience-dependent effect on pitch processing is also evident at the cortical level (Krishnan et al., 2014, 2015; Xi et al., 2010). Although the exact mechanisms of experience-driven attunement in FFRs and cortical responses are unknown, it has been proposed that experience-dependent neural tuning relies on coordination among the ascending, descending, and local pathways of the auditory system (Krishnan and Gandour, 2017), with the FFR representing a snapshot of the confluence of this hierarchical system on auditory processing. ...
Article
A current topic in auditory neurophysiology is how brainstem sensory coding contributes to higher-level perceptual, linguistic and cognitive skills. This cross-language study was designed to compare frequency following responses (FFRs) for lexical tones in tonal (Mandarin Chinese) and non-tonal (English) language users and test the correlational strength between FFRs and behavior as a function of language experience. The behavioral measures were obtained in the Garner paradigm to assess how lexical tones might interfere with vowel category and duration judgement. The FFR results replicated previous findings about between-group differences, showing enhanced pitch tracking responses in the Chinese subjects. The behavioral data from the two subject groups showed that lexical tone variation in the vowel stimuli significantly interfered with vowel identification with a greater effect in the Chinese group. Moreover, the FFRs for lexical tone contours were significantly correlated with the behavioral interference only in the Chinese group. This pattern of language-specific association between speech perception and brainstem-level neural phase-locking of linguistic pitch information provides evidence for a possible native language neural commitment at the subcortical level, highlighting the role of experience-dependent brainstem tuning in influencing subsequent linguistic processing in the adult brain.
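FFR "pitch tracking" of the kind described above is commonly quantified by estimating F0 frame by frame from the response waveform, often via short-time autocorrelation, and comparing it against the stimulus F0 contour. The following is a minimal sketch of that core step under assumed parameters (sampling rate, F0 search range, frame length); it is not the analysis pipeline of the study.

```python
# Minimal sketch of autocorrelation-based F0 estimation, the kind of
# analysis often used to quantify FFR "pitch tracking".  All parameters
# (sampling rate, search range, synthetic signal) are hypothetical.
import math

FS = 8000  # sampling rate in Hz (assumed)

def estimate_f0(frame, fs, fmin=80.0, fmax=400.0):
    """Return the F0 (Hz) whose lag maximizes the autocorrelation."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]                    # remove DC offset
    lo, hi = int(fs / fmax), int(fs / fmin)          # lag search range
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, hi + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return fs / best_lag

# A synthetic periodic "response" at 120 Hz plus its 2nd harmonic.
sig = [math.sin(2 * math.pi * 120 * i / FS)
       + 0.5 * math.sin(2 * math.pi * 240 * i / FS)
       for i in range(800)]                          # 100 ms frame

print(f"estimated F0: {estimate_f0(sig, FS):.1f} Hz")
```

Sliding this estimator across successive frames of an FFR yields a pitch track that can then be correlated with the stimulus F0 contour, which is roughly how tracking accuracy measures are built.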
... This paradigm has been widely used to investigate automatic encoding of different sound features, including speech, by the human auditory cortex, eliminating task-related attention or memory confounds (Näätänen et al., 1978; Näätänen and Alho, 1997; Näätänen, 2001). Studies on the processing of pitch timescales have demonstrated earlier MMNs to contrasts with an early cue-divergence time point, for pitch variations on the syllable "Ma" at both the lexical-tone level and the sentence-accentuation level, as well as differential MMNs in response to lexical tone and intonation-level pitch variations either within the same acoustic category or across categories (Luo et al., 2006; Ren et al., 2009; Xi et al., 2010; Yu et al., 2014; Li and Chen, 2018). Along the same line, some studies have demonstrated larger and earlier MMN responses elicited by lexical tone pairs with a relatively early acoustic cue divergence point compared to later ones (i.e., T1/T2 vs. T2/T3) in Mandarin Chinese speakers (Chandrasekaran et al., 2007; Li and Chen, 2015). ...
... Our MMN responses to local- and global-timescale FM sweeps lend further support to this notion. The finding regarding the processing of different timescales is also consistent with existing studies showing that unattended processing of speech pitch contours can be observed for lexical-level tone and sentence-level pitch variations (Xi et al., 2010; Li and Chen, 2015). The main difference is that most of these studies employed natural lexical-tonal materials of relatively long duration (i.e., 300-550 ms), whereas our study employed an acoustic variation free of lexical meaning on a shorter, local timescale. ...
Article
Full-text available
Speech comprehension across languages depends on encoding the pitch variations in frequency-modulated (FM) sweeps at different timescales and frequency ranges. While the timescale and spectral contour of FM sweeps play important roles in differentiating acoustic speech units, relatively little work has been done to understand the interaction between these two acoustic dimensions in early cortical processing. An auditory oddball paradigm was employed to examine the interaction of timescale and pitch contour in pre-attentive processing of FM sweeps. Event-related potentials to frequency sweeps that vary in linguistically relevant pitch contour (fundamental frequency F0 vs. first formant frequency F1) and timescale (local vs. global) in Mandarin Chinese were recorded. Mismatch negativities (MMNs) were elicited by all types of sweep deviants. For the local timescale, FM sweeps with F0 contours yielded larger MMN amplitudes than those with F1 contours. A reversed MMN amplitude pattern for F0/F1 contours was obtained for global-timescale stimuli. An interhemispheric asymmetry of MMN topography was observed corresponding to local- and global-timescale contours. Falling but not rising sweep contours elicited right-hemispheric dominance in the difference waveforms. Results showed that timescale and pitch contour interact with each other in pre-attentive auditory processing of FM sweeps. The findings suggest that FM sweeps, a type of non-speech signal, are processed at an early stage with reference to their linguistic function. That the dynamic interaction between timescale and spectral pattern is processed during early cortical processing of non-speech frequency sweep signals may be critical for facilitating speech encoding at a later stage.
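FM sweep stimuli like those described above are typically synthesized by integrating an instantaneous-frequency trajectory into phase. The sketch below shows that standard construction for a linear sweep; the frequency range, durations, and sampling rate are illustrative assumptions, not the study's stimulus parameters.

```python
# Sketch of synthesizing a linear frequency-modulated (FM) sweep by
# integrating instantaneous frequency into phase.  The frequency range,
# durations, and sampling rate are illustrative, not the study's values.
import math

FS = 16000  # sampling rate in Hz (assumed)

def fm_sweep(f_start, f_end, dur, fs=FS):
    """Linear FM sweep from f_start to f_end Hz over dur seconds."""
    n = int(dur * fs)
    samples, phase = [], 0.0
    for i in range(n):
        f_inst = f_start + (f_end - f_start) * i / n   # instantaneous freq
        phase += 2 * math.pi * f_inst / fs             # integrate phase
        samples.append(math.sin(phase))
    return samples

# A short rising sweep (local timescale) and a long one (global
# timescale), mirroring the local/global manipulation in spirit.
local_sweep = fm_sweep(100, 200, 0.050)    # 50 ms
global_sweep = fm_sweep(100, 200, 0.500)   # 500 ms
print(len(local_sweep), len(global_sweep))
```

Integrating phase sample by sample, rather than evaluating `sin(2*pi*f(t)*t)` directly, keeps the waveform continuous and the instantaneous frequency exactly on the intended trajectory.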
... In non-speech conditions (nSN and nSQ), the stimulus paradigm (including stimulus duration, rise/fall times, SOA, and AM noise) was the same as in the speech conditions, except that the foreground stimuli were complex tones which contained fundamental frequency components as well as five harmonic (3rd, 6th, 7th, 8th, and 12th) components. Similar to previous studies ( Xi et al., 2010;Wang et al., 2017), the non-speech stimuli in our experiment were generated to match the speech stimuli in terms of fundamental frequency, amplitude, and duration parameters. Speech and non-speech stimuli differed only in spectral components; in non-speech complex tones, some harmonics were absent to create the non-speech percept. ...
... means not significant. Kozou et al., 2005), but not consistent with reports that speech deviants elicited stronger MMN (Jaramillo et al., 2001; Sorokin et al., 2010), stronger P300 (Sorokin et al., 2010), earlier MMN (Xi et al., 2010), and earlier N2b and P300 than non-speech deviants (Sussman et al., 2004). Also, speech deviants were reported to elicit weaker MMN than non-speech deviants in traffic noise but not in other types of noise (Kozou et al., 2005). ...
Article
Since sound perception takes place against a background with a certain amount of noise, both speech and non-speech processing involve extraction of target signals and suppression of background noise. Previous works on early processing of speech phonemes largely neglected how background noise is encoded and suppressed. This study aimed to fill in this gap. We adopted an oddball paradigm where speech (vowels) or non-speech stimuli (complex tones) were presented with or without a background of amplitude-modulated noise and analyzed cortical responses related to foreground stimulus processing, including mismatch negativity (MMN), N2b, and P300, as well as neural representations of the background noise, i.e. auditory steady-state response (ASSR). We found that speech deviants elicited later and weaker MMN, later N2b, and later P300 than non-speech ones, but N2b and P300 had similar strength, suggesting more complex processing of certain acoustic features in speech. Only for vowels, background noise enhanced N2b strength relative to silence, suggesting an attention-related speech-specific process to improve perception of foreground targets. In addition, noise suppression in speech contexts, quantified by ASSR amplitude reduction after stimulus onset, was lateralized towards the left hemisphere. The left-lateralized suppression following N2b was associated with the N2b enhancement in noise for speech, indicating that foreground processing may interact with background suppression, particularly during speech processing. Together, our findings indicate that the differences between perception of speech and non-speech sounds involve not only the processing of target information in the foreground but also the suppression of irrelevant aspects in the background.
... [11] In the last decade, a surge of interest has developed in assessing the pitch contour contrasts of tonal languages. [11][12][13][14] Mandarin Chinese is one such tonal language, whose contour tones are perceived categorically. ...
... Researchers have reported strong evidence of categorical perception of pitch contrasts in Mandarin, [11] Thai, [15] and Cantonese listeners. [16] Xi et al. [13] found that lexical tones in Mandarin Chinese are perceived categorically. ...
... Tonal languages such as Mandarin Chinese deploy lexical tones together with consonants and vowels to define word meaning. Previous neuroimaging studies, including positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), have reported that the bilateral superior temporal gyri (STG), the left anterior insula cortex, and the left middle temporal gyrus, as well as the right lateralized cortical activations in the posterior inferior frontal gyrus, are activated during the processing of lexical tone in Mandarin Chinese (Klein et al., 2001;Wong et al., 2004;Liu et al., 2006;Xi et al., 2010;Chang et al., 2014). In our previous study, spectrograms of the syllable /bai/ pronounced in four lexical tones (bai1, bai2, bai3, and bai4) illustrated that the lexical tones are characterized by varying frequencies with time, and that lexical tones have minimal effects on the voice onset time of the consonant /b/; moreover, spectrograms of the syllables /bai, /dai, and /tai/ pronounced in a flat tone (bai1, dai1, and tai1) illustrated that the syllables show relatively unchanged frequencies with time and that the consonants in the upper syllables are characterized by temporal variations as reflected by the voice onset time. ...
... Notably, the MEG data for the neural basis of perceptual processing of lexical tones indicated a left hemispheric dominance for detecting large lexical-tone changes and small deviant contrasts involving less left hemispheric activation in the auditory cortex and greater activation in the right frontal cortex at a later time window (Hsu et al., 2014). The cross-category contrasts also revealed larger MMN responses than within-category contrasts in the left scalp, but not in the right scalp (Xi et al., 2010;Zhang et al., 2011). In addition, an MMN study investigating the effect of allophonic variation on the mental representation and neural processing of lexical tones suggested that activation of the allophonic tonal variants can lead to right-hemisphere-dominant processing of lexical tones, which are otherwise categorically processed via recruitment of both left and right hemispheres (Li and Chen, 2015). ...
Article
Full-text available
Labor division of the two brain hemispheres refers to the dominant processing of input information on one side of the brain. At an early stage, or a preattentive stage, the right brain hemisphere is shown to dominate the auditory processing of tones, including lexical tones. However, little is known about the influence of brain damage on the labor division of the brain hemispheres for the auditory processing of linguistic tones. Here, we demonstrate swapped dominance of brain hemispheres at the preattentive stage of auditory processing of Chinese lexical tones after a stroke in the right temporal lobe (RTL). In this study, we frequently presented lexical tones to a group of patients with a stroke in the RTL and infrequently varied the tones to create an auditory contrast. The contrast evoked a mismatch negativity response, which indexes auditory processing at the preattentive stage. In the participants with a stroke in the RTL, the mismatch negativity response was lateralized to the left side, in contrast to the right lateralization pattern in the control participants. The swapped dominance of brain hemispheres indicates that the RTL is a core area for early-stage auditory tonal processing. Our study indicates the necessity of rehabilitating tonal processing functions for tonal language speakers who suffer an RTL injury.
... Näätänen et al. found evidence for language-dependent vowel representations in the human brain [9]. Another study examined the categorical perception of lexical tones and found that across-category contrasts elicited a larger MMN than within-category distinctions [10]. In animal experiments, more accurate EEG signals were obtained through invasive procedures. ...
... Therefore, the TFR of the grand-averaged EEG was calculated for each sound to identify vowel recognition-related changes in the magnitude and phase of EEG oscillations at specific frequencies (Fig 5A). From the TFR analysis, high power activation was observed around the delta (1-4 Hz), theta (4-8 Hz), and alpha (8-12 Hz) bands at 0.3-0.6 s from the stimulus onset, regardless of the speech sound stimulation. ...
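Band-limited power of the kind summarized above (delta, theta, alpha) can be illustrated with a plain DFT over one channel. This is a deliberately minimal sketch: real TFR pipelines use wavelet or multitaper decompositions over time, and the signal and parameters here are synthetic assumptions.

```python
# Sketch of extracting delta/theta/alpha band power from one EEG channel
# with a plain DFT (stdlib only; real TFR pipelines use wavelets or
# multitaper methods).  Signal, sampling rate, and bands are illustrative.
import cmath
import math

FS = 256                                 # sampling rate in Hz (assumed)
N = 512                                  # 2 s analysis window
# Synthetic channel: a 6 Hz (theta) rhythm plus a weaker 10 Hz (alpha) one.
sig = [math.sin(2 * math.pi * 6 * i / FS)
       + 0.4 * math.sin(2 * math.pi * 10 * i / FS) for i in range(N)]

def band_power(x, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins inside [f_lo, f_hi] Hz."""
    n = len(x)
    total = 0.0
    for k in range(1, n // 2):
        f = k * fs / n                   # frequency of bin k
        if f_lo <= f <= f_hi:
            coef = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n)
                       for i in range(n))
            total += abs(coef) ** 2
    return total

bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12)}
powers = {name: band_power(sig, FS, lo, hi) for name, (lo, hi) in bands.items()}
print(powers)
```

With this synthetic input, theta power dominates and delta is near zero; a TFR adds the time dimension by repeating such a decomposition over short sliding windows.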
Article
Full-text available
Over the years, considerable research has been conducted to investigate the mechanisms of speech perception and recognition. Electroencephalography (EEG) is a powerful tool for identifying brain activity; therefore, it has been widely used to determine the neural basis of speech recognition. In particular, for the classification of speech recognition, deep learning-based approaches are in the spotlight because they can automatically learn and extract representative features through end-to-end learning. This study aimed to identify particular components that are potentially related to phoneme representation in the rat brain and to discriminate brain activity for each vowel stimulus on a single-trial basis using a bidirectional long short-term memory (BiLSTM) network and classical machine learning methods. Nineteen male Sprague-Dawley rats subjected to microelectrode implantation surgery to record EEG signals from the bilateral anterior auditory fields were used. Five different vowel speech stimuli were chosen, /a/, /e/, /i/, /o/, and /u/, which have highly different formant frequencies. EEG recorded under randomly given vowel stimuli was minimally preprocessed and normalized by a z-score transformation to be used as input for the classification of speech recognition. The BiLSTM network showed the best performance among the classifiers by achieving an overall accuracy, f1-score, and Cohen’s κ values of 75.18%, 0.75, and 0.68, respectively, using a 10-fold cross-validation approach. These results indicate that LSTM layers can effectively model sequential data, such as EEG; hence, informative features can be derived through BiLSTM trained with end-to-end learning without any additional hand-crafted feature extraction methods.
... Although some studies have investigated speech perception of Mandarin-speaking children, they generally focused on the perception of lexical tones. This focus is probably due to (a) the critical role that lexical tones play in a tonal language, such as Chinese (Xi, Zhang, Shu, Zhang, & Li, 2010;Yu et al., 2015;Zhang et al., 2012) and (b) the limitations that current CI devices have in presenting vocal pitch information (Han et al., 2009;Peng, Tomblin, Cheung, Lin, & Wang, 2004;Xu et al., 2011). For example, Peng et al. (2004) found that the overall average accuracy of lexical tone identification in prelingually deafened pediatric CI users was only about 73%, which is apparently much lower than the age-matched peers with normal hearing (NH). ...
... Aside from the notion of an automatic response to acoustic deviation, MMN also reflects the activation of experience-dependent, long-term memory traces for speech sounds, including phonemes (Näätänen et al., 1997), lexical tones (Gu et al., 2013; Xi et al., 2010), syllables, word forms (Gu et al., 2012; Pulvermüller et al., 2001), compounds (Cappelle et al., 2010), and word meanings (Pulvermüller et al., 2005; Shtyrov et al., 2004). The retrieval of these language-specific memory traces is usually reflected by the enhancement of MMN elicited by familiar (native) speech sounds compared to the MMN elicited by unfamiliar (non-native) speech sounds. ...
Article
The remarkable rapidity and effortlessness of speech perception and word reading by skilled listeners or readers suggest implicit or automatic mechanisms underlying language processing. In speech perception, the implicit mechanisms are reflected by the auditory mismatch negativity (MMN) response, suggesting that phonemic, lexical, semantic, and syntactic information are automatically and rapidly processed in the absence of focused attention. In visual word reading, implicit orthographic and lexical processing are reflected by visual mismatch negativity (vMMN), the visual counterpart of auditory MMN. The semantic processing of spoken words is reflected by MMN. This study investigated whether semantic processing is also reflected by vMMN. For this purpose, visual Chinese words belonging to different semantic categories (color, taste, and action) were presented to participants in oddball paradigms. A set of words belonging to the same semantic category was frequently presented as standards; a word belonging to a different semantic category was presented sporadically as deviant. Participants were instructed to perform a visual cross-change detection task and ignore the words. Significant vMMN was elicited in Experiments 1 to 3, in which the deviant word carried a semantic radical that overtly indicated the word’s semantic category information. The vMMNs were most prominent around 260 ms after word onset, were parieto-occipital distributed, and were significantly left-hemisphere lateralized, suggesting rapid semantic processing of the visual words’ category-related information. No significant vMMN was elicited in Experiment 4, in which the deviant word did not carry any semantic radicals. Thus, the semantic radical, which has a high frequency of occurrence because it is carried by many words, may be critical for the elicitation of vMMN.
... A plethora of studies have synthesized tonal continua with systematic variations in pitch contour while keeping the duration and intensity of the stimuli constant, and these have demonstrated robust CP of Mandarin tones among native listeners with normal hearing (NH) (e.g., Wang 1973; Xu et al. 2006; Peng et al. 2010; Xi et al. 2010; Zhang et al. 2012; Yu et al. 2019; Chen & Peng 2021; Ma et al. 2021; Zhu et al. 2021; Feng & Peng 2022). Moreover, the CP paradigm of Mandarin tones has been successfully extended to individuals with CIs (e.g., Luo et al. 2014; Peng et al. 2017; Zhang et al. 2019a; Zhang et al. 2020c), showing that CI users exhibit impaired but improvable lexical tone categorization/normalization. ...
Preprint
Full-text available
Objectives: Although pitch reception poses a great challenge for individuals with cochlear implants (CIs), formal auditory training (e.g., high variability phonetic training, HVPT) has been shown to provide direct benefits in pitch-related perceptual performance such as lexical tone recognition for CI users. As lexical tones in spoken language are expressed with a multitude of distinct spectral, temporal, and intensity cues, it is important to determine the sources of training benefits for CI users. The purpose of the present study was to conduct a rigorous fine-scale evaluation with the categorical perception (CP) paradigm to control the acoustic parameters and test the efficacy and sustainability of HVPT for Mandarin-speaking pediatric CI recipients. The main hypothesis was that HVPT-induced perceptual learning would greatly enhance CI users' ability to extract the primary pitch contours from spoken words for lexical tone identification and discrimination. Furthermore, individual differences in immediate and long-term gains from training would likely be attributable to baseline performance and duration of CI use. Design: Twenty-eight prelingually deaf Mandarin-speaking kindergarteners with CIs were tested. Half of them received five sessions of HVPT within a period of three weeks. The other half served as controls who did not receive the formal training. Two classical CP tasks on a tonal continuum from Mandarin Tone 1 (high-flat in pitch) to Tone 2 (mid-rising in pitch) were administered. Participants were instructed to either label a speech stimulus along the continuum (i.e., identification task) or determine whether a pair of stimuli separated by zero or two steps on the continuum was the same or different (i.e., discrimination task).
Identification function measures (i.e., boundary position and boundary width) and discrimination function scores (i.e., between-category score, within-category score, and peakedness score) were assessed for each child participant across the three test sessions. Results: Linear mixed-effects (LME) models showed significant training-induced enhancement in lexical tone categorization, with significantly narrower boundary width and better between-category discrimination in the immediate posttest relative to the pretest for the trainees. Furthermore, training-induced gains were reliably retained in the follow-up test 10 weeks after training. By contrast, no significant changes were found in the control group across sessions. Regression analysis confirmed that baseline performance (i.e., boundary width in the pretest session) and duration of CI use were significant predictors of the magnitude of training-induced benefits. Conclusions: The stringent CP tests, with synthesized stimuli that excluded acoustic cues other than the pitch contour and were never used in training, showed strong evidence for the efficacy of HVPT in yielding immediate and sustained improvement in lexical tone categorization for Mandarin-speaking children with CIs. The training results and individual differences have remarkable implications for developing personalized computer-based short-term HVPT protocols that may have sustainable long-term benefits for aural rehabilitation in this clinical population.
Abbreviations: CI = cochlear implant; CP = categorical perception; FDR = false discovery rate; F0 = fundamental frequency; H-NTLA = Hiskey-Nebraska test of learning aptitude; HVPT = high variability phonetic training; LME = linear mixed-effects; MCI = melodic contour identification; MMN = mismatch negativity; NH = normal hearing; PSOLA = Pitch-Synchronous Overlap Add; T1 = Tone 1; T2 = Tone 2; T3 = Tone 3; T4 = Tone 4; 2AFC = two-alternative forced choice; 4AFC = four-alternative forced choice; 9AFC = nine-alternative forced choice
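The identification-function measures named above, boundary position and boundary width, are conventionally obtained by fitting a sigmoid to the labeling proportions along the continuum: the position is the 50% crossover and the width is the distance between the 25% and 75% points. The sketch below fits a logistic by grid search to made-up Tone 1/Tone 2 labeling data; the data and fitting method are illustrative, not the study's.

```python
# Illustrative computation of the two identification-function measures:
# boundary position (50% crossover) and boundary width (distance between
# the 25% and 75% points).  Labeling proportions below are hypothetical.
import math

# Proportion of "Tone 2" responses at each of 9 continuum steps (made up).
steps = list(range(1, 10))
p_t2 = [0.02, 0.05, 0.10, 0.30, 0.55, 0.80, 0.92, 0.97, 0.99]

def logistic(x, x0, k):
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

# Grid-search least-squares fit (crude but adequate for a sketch).
best = None
for x0 in [i / 100 for i in range(300, 700)]:        # boundary candidates
    for k in [j / 10 for j in range(5, 40)]:         # slope candidates
        err = sum((logistic(x, x0, k) - p) ** 2 for x, p in zip(steps, p_t2))
        if best is None or err < best[0]:
            best = (err, x0, k)

_, x0, k = best
# For a logistic, the 25%-75% distance is 2*ln(3)/k.
width = 2 * math.log(3) / k
print(f"boundary position: {x0:.2f}  boundary width: {width:.2f} steps")
```

Narrower boundary width after training, as reported above, corresponds to a steeper fitted slope `k` and hence a smaller `width`.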
... In Mandarin, for example, four tones are described that differ in pitch height and contour: A high level tone (Tone 1), a high rising tone (Tone 2, hereafter T2), a low falling-rising tone (Tone 3), and a high falling tone (Tone 4, hereafter T4) (Ladefoged & Johnson, 2011). Previous studies have suggested that lexical tonal information is functionally similar to phonemic information in language comprehension, given that tonal language speakers access tonal information early for word recognition (Malins & Joanisse, 2012) and process tone quasi-categorically (Feng, Gan, Wang, Wong, & Chandrasekaran, 2018; Gandour & Krishnan, 2016; Peng et al., 2010; Xi, Zhang, Shu, Zhang, & Li, 2010). Neuroimaging work has shown that tone processing in native Mandarin speakers involves brain regions for the analysis and abstraction of acoustic signals, and the processing of phonological and/or semantic information. ...
Article
Full-text available
Intonation, the modulation of pitch in speech, is a crucial aspect of language that is processed in right‐hemispheric regions, beyond the classical left‐hemispheric language system. Whether or not this notion generalises across languages remains, however, unclear. Particularly, tonal languages are an interesting test case because of the dual linguistic function of pitch that conveys lexical meaning in form of tone, in addition to intonation. To date, only few studies have explored how intonation is processed in tonal languages, how this compares to tone and between tonal and non‐tonal language speakers. The present fMRI study addressed these questions by testing Mandarin and German speakers with Mandarin material. Both groups categorised mono‐syllabic Mandarin words in terms of intonation, tone, and voice gender. Systematic comparisons of brain activity of the two groups between the three tasks showed large cross‐linguistic commonalities in the neural processing of intonation in left fronto‐parietal, right frontal, and bilateral cingulo‐opercular regions. These areas are associated with general phonological, specific prosodic, and controlled categorical decision‐making processes, respectively. Tone processing overlapped with intonation processing in left fronto‐parietal areas, in both groups, but evoked additional activity in bilateral temporo‐parietal semantic regions and subcortical areas in Mandarin speakers only. Together, these findings confirm cross‐linguistic commonalities in the neural implementation of intonation processing but dissociations for semantic processing of tone only in tonal language speakers.
... Lexical tone is primarily identified by fundamental frequency (F0) cues, including mean pitch and pitch contour (Gandour, 1983, 1984; Massaro, Cohen, & Tseng, 1985), and is secondarily identified by cues in duration, amplitude, vocal range, and register (Blicher, Diehl, & Cohen, 1990; Liu & Samuel, 2004). Thus, lexical tone is perceived distinctly from segmental phonology and is classified on the basis of these cues according to a fixed set of categories in a given tonal language (Francis, Ciocca, & Ng, 2003; Xi, Zhang, Shu, Zhang, & Li, 2010). Mandarin, the tonal language of interest in the current study, is a contour tone system, meaning that its tones are distinguished from one another by the pitch trajectory of their F0 (Chao, 1965). ...
Article
Full-text available
This study examined how bilingualism in an atonal language, in addition to a tonal language, influences lexical and non-lexical tone perception and word learning during childhood. Forty children aged 5;3–7;2, bilingual either in English and Mandarin or English and another atonal language, were tested on Mandarin lexical tone discrimination, level-pitch sine-wave tone discrimination, and learning of novel words differing minimally in Mandarin lexical tone. Mandarin–English bilingual children discriminated between and learned novel words differing minimally in Mandarin lexical tone more accurately than their atonal–English bilingual peers. However, Mandarin–English and atonal–English bilingual children discriminated between level-pitch sine-wave tones with similar accuracy. Moreover, atonal–English bilingual children showed a tendency to perceive differing Mandarin lexical and level-pitch sine-wave tones as identical, whereas their Mandarin–English peers showed no such tendency. These results indicate that bilingualism in a tonal language in addition to an atonal language—but not bilingualism in two atonal languages—allows for continued sensitivity to lexical tone beyond infancy. Moreover, they suggest that although tonal–atonal bilingualism does not enhance sensitivity to differences in pitch between sine-wave tones beyond infancy any more effectively than atonal–atonal bilingualism, it protects against the development of biases to perceive differing lexical and non-lexical tones as identical. Together, the results indicate that, beyond infancy, tonal–atonal bilinguals process lexical tones using different cognitive mechanisms than atonal–atonal bilinguals, but that both groups process level-pitch non-lexical tone using the same cognitive mechanisms.
... Apart from these behavioral studies, the CP pattern at the suprasegmental level has been confirmed by research adopting electroencephalography. In Mandarin-speaking adult listeners, between-category tone deviants elicited larger mismatch negativity amplitudes than within-category tones (Xi et al., 2010). This supported strong CP of lexical tones among native tone language listeners. ...
Article
Full-text available
This study investigated the developmental trajectories of categorical perception (CP) of segments (i.e., stops) and suprasegments (i.e., lexical tones) in an attempt to examine the perceptual development of phonological categories and whether CP of suprasegments develops in parallel with that of segments. Forty-seven Mandarin-speaking monolingual preschoolers aged four to six years old, and fourteen adults completed both identification and discrimination tasks of the Tone 1-2 continuum and the /pa/-/pha/ continuum. Results revealed that children could perceive both lexical tones and aspiration of stops in a categorical manner by age four. The boundary position did not depend on age, with children having similar positions to adults regardless of speech continuum types. The boundary width, on the other hand, reached the adult-like level at age six for lexical tones, but not for stops. In addition, the within-category discrimination score did not differ significantly between children and adults for both continua. The between-category discrimination score improved with age and achieved the adult-like level at age five for lexical tones, but still not for stops even at age six. It suggests that the fine-grained perception of phonological categories is a protracted process, and the improvement and varying timeline of the development of segments and suprasegments are discussed in relation to statistical learning of the regularities of speech sounds in ambient language, ongoing maturation of perceptual systems, the memory mechanism underlying perceptual learning, and the intrinsic nature of speech elements.
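The identification measures used in studies like this one (boundary position and boundary width) can be read directly off a listener's identification function. The following minimal Python sketch is illustrative only: the data, the 25/75% definition of width, and the interpolation are assumptions for demonstration, not taken from the cited study. The 50% crossover gives the boundary position; the span between the 25% and 75% crossovers gives the boundary width (smaller width = steeper, more categorical identification).

```python
def crossover(steps, probs, level):
    """Linearly interpolate the continuum step at which the identification
    probability crosses `level` (assumes probs is roughly monotonic)."""
    for x0, y0, x1, y1 in zip(steps, probs, steps[1:], probs[1:]):
        if (y0 - level) * (y1 - level) <= 0 and y0 != y1:
            return x0 + (level - y0) * (x1 - x0) / (y1 - y0)
    return None

# Hypothetical identification data: proportion of "Tone 2" responses
# at each of 7 continuum steps (a steep sigmoid = categorical perception).
steps = [1, 2, 3, 4, 5, 6, 7]
p_tone2 = [0.02, 0.05, 0.15, 0.50, 0.85, 0.95, 0.98]

boundary = crossover(steps, p_tone2, 0.50)                       # boundary position
width = crossover(steps, p_tone2, 0.75) - crossover(steps, p_tone2, 0.25)  # boundary width
```

A child whose identification function is shallower than an adult's would show a larger `width` at the same `boundary`, which is exactly the adult–child difference the study reports for stops at age six.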
... Neuroscientific evidence for the phonological nature of lexical tone contrasts has come from ERP studies. Larger mismatch negativities (MMNs), associated with deviant processing, have been found in the left hemisphere across Mandarin Chinese tone categories than within them, which has been advanced as support for the categorical nature of Chinese tone contrasts (Shen & Froud, 2019; Xi, Zhang, Shu, Zhang, & Li, 2010). Similarly, Zheng, Minett, Peng, and Wang (2012) report categorical effects in behavioural identification results for across-category rising vs. level pitch contours from Mandarin and Cantonese listeners, with the latter group additionally showing a larger P300 effect, commonly associated with stimulus categorization. ...
Article
Full-text available
We intended to establish if two lexical tone contrasts in Zhumadian Mandarin, one between early and late aligned falls and another between early and late aligned rises, are perceived categorically, while the difference between declarative and interrogative pronunciations of these four tones is perceived gradiently. Presenting stimuli from 7-point acoustic continua between tones and between intonations, we used an identification task and a discrimination task with an experimental group of native listeners and a control group of Indonesian listeners, whose language employs none of the differences within either the falling or the rising pitch contours in its phonology. Only the lexical condition as perceived by the experimental group yielded sigmoid identification functions and a heightened discriminatory sensitivity around the midpoint of continua. The intonational condition in the native group and both conditions in the control group yielded gradient identification functions and smaller, reverse effects of the continuum midpoints in the discrimination task. The results are interpreted to mean that sentence modality contrasts can be expressed gradiently, but that lexical tone differences are represented phonologically, and hence are perceived categorically, despite low phonetic salience of the contrast. This conclusion challenges assumptions about the relation between linguistic functions and linguistic structures.
... Categorical perception of pitch is not confined to lexical tone perception, but extends also to pitch accent alignment perception in intonational languages (D'Imperio and House, 1997). Recent neuro-imaging studies confirm that native listeners process tones similarly to other speech segments in the left hemisphere and with the activation of the left frontal operculum, which demonstrates that the phonological processing of suprasegmental units also occurs near Broca's area (Gandour et al., 2000; Brown-Schmidt and Canseco-Gonzalez, 2004; Xi et al., 2010). ...
Article
Full-text available
Research investigating listeners’ neural sensitivity to speech sounds has largely focused on segmental features. We examined Australian English listeners’ perception and learning of a supra-segmental feature, pitch direction in a non-native tonal contrast, using a passive oddball paradigm and electroencephalography. The stimuli were two contours generated from naturally produced high-level and high-falling tones in Mandarin Chinese, differing only in pitch direction (Liu and Kager, 2014). While both contours had similar pitch onsets, the pitch offset of the falling contour was lower than that of the level one. The contrast was presented in two orientations (standard and deviant reversed) and tested in two blocks with the order of block presentation counterbalanced. Mismatch negativity (MMN) responses showed that listeners discriminated the non-native tonal contrast only in the second block, reflecting indications of learning through exposure during the first block. In addition, listeners showed a later MMN peak for their second block of test relative to listeners who did the same block first, suggesting linguistic (as opposed to acoustic) processing or a misapplication of perceptual strategies from the first to the second block. The results also showed a perceptual asymmetry for change in pitch direction: listeners who encountered a falling tone deviant in the first block had larger frontal MMN amplitudes than listeners who encountered a level tone deviant in the first block. The implications of our findings for second language speech and the developmental trajectory for tone perception are discussed.
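The passive oddball paradigm used in MMN studies like this one interleaves rare deviants among frequent standards, usually enforcing a minimum run of standards before each deviant. The sketch below generates such a pseudo-randomised trial sequence; the probability, run-length constraint, and trial count are illustrative assumptions, not the parameters of the cited study.

```python
import random

def oddball_sequence(n_trials, p_deviant=0.15, min_standards=2, seed=1):
    """Pseudo-randomised oddball sequence: deviant trials ('D') occur with
    probability ~p_deviant and are always preceded by at least
    `min_standards` standard trials ('S'), as is typical in MMN designs."""
    rng = random.Random(seed)
    seq, since_last = [], min_standards  # deviant allowed once enough standards passed
    for _ in range(n_trials):
        if since_last >= min_standards and rng.random() < p_deviant:
            seq.append("D")
            since_last = 0
        else:
            seq.append("S")
            since_last += 1
    return seq

seq = oddball_sequence(400)
deviant_rate = seq.count("D") / len(seq)  # somewhat below p_deviant due to the constraint
```

Counterbalancing the two stimulus orientations, as in this study, amounts to running the same generator twice with the roles of standard and deviant swapped.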
... Research going back to Wang (1976), who conducted identification and discrimination tasks with synthetic tokens varying along the T1-T2 continuum, has found that L1 Mandarin listeners perceive tone more categorically than naive L1 English listeners with no experience of tonal languages. Subsequent studies including continua of all six possible tone pairs in Mandarin have confirmed these findings (Hallé et al., 2004; Peng et al., 2010; Shen and Froud, 2016; Xi et al., 2010; Xu et al., 2006). At the level of lexical processing in tonal languages, most previous work has focused on the independent contributions of tones versus segments in the process of lexical access. ...
Article
Successful listening in a second language (L2) involves learning to identify the relevant acoustic–phonetic dimensions that differentiate between words in the L2, and then use these cues to access lexical representations during real-time comprehension. This is a particularly challenging goal to achieve when the relevant acoustic–phonetic dimensions in the L2 differ from those in the L1, as is the case for the L2 acquisition of Mandarin, a tonal language, by speakers of non-tonal languages like English. Previous work shows tone in L2 is perceived less categorically (Shen and Froud, 2019) and weighted less in word recognition (Pelzl et al., 2019) than in L1. However, little is known about the link between categorical perception of tone and use of tone in real-time L2 word recognition at the level of the individual learner. This study presents evidence from 30 native Mandarin speakers and 29 L1-English learners of Mandarin who completed a real-time spoken word recognition and a tone identification task. Results show that L2 learners differed from native speakers in both the extent to which they perceived tone categorically as well as in their ability to use tonal cues to distinguish between words in real-time comprehension. Critically, learners who reliably distinguished between words differing by tone alone in the word recognition task also showed more categorical perception of tone on the identification task. Moreover, within this group, performance on the two tasks was strongly correlated. This provides the first direct evidence showing that the ability to perceive tone categorically is related to the weighting of tonal cues during spoken word recognition, thus contributing to a better understanding of the link between phonemic and lexical processing, which has been argued to be a key component in the L2 acquisition of tone (Wong and Perrachione, 2007).
... Moreover, Chen, Peter, et al. (2018) found that while tone (Mandarin Chinese) and non-tone (Dutch) language adults show similar right-lateralised mismatch negativity (MMN) for 3-note musical melodies based on Mandarin tones, for the lexical tones on which the f0 of these 3-note melodies was based, Chinese adults showed a later MMN peak to lexical tone oddballs than did non-tone Dutch adults. These results imply that, consistent with findings that tone language speakers process tones categorically (Feng, Gan, Wang, Wong, & Chandrasekaran, 2018; Gandour & Krishnan, 2016; Peng et al., 2010; Xi, Zhang, Shu, Zhang, & Li, 2010), a larger f0 difference is necessary for Chinese than for Dutch adults to detect the lexical tone change. Moreover, Chen and colleagues (Chen, Liu, & Kager, 2016; Liu, Chen, & Kager, 2020) found significant correlations between lexical tone and music pitch for non-tone, but not tone language speakers. ...
Article
Some prior investigations suggest that tone perception is flexible, reasonably independent of native phonology, whereas others suggest it is constrained by native phonology. We address this issue in a systematic and comprehensive investigation of adult tone perception. Sampling from diverse tone and non-tone speaking communities, we tested discrimination of the three major tone systems (Cantonese, Thai, Mandarin) that dominate the tone perception literature, in relation to native language and language experience as well as stimulus variation (tone properties, presentation order, pitch cues) using linear mixed effect modelling and multidimensional scaling. There was an overall discrimination advantage for tone language speakers and for native tones. However, language- and tone-specific effects, and presentation order effects also emerged. Thus, over and above native phonology, stimulus variation exerts a powerful influence on tone discrimination. This study provides a tone atlas, a reference guide to inform empirical studies of tone sensitivity, both retrospectively and prospectively.
... One of the debates lies in hemispheric asymmetry. Using various methodologies, studies have reported either right- (Ren et al., 2009; Ge et al., 2015) or left-biased (Xi et al., 2010; Gu et al., 2013) activation for lexical tone perception. The discrepancy could be partially reconciled by the modulatory effect of language experience in the interplay of bottom-up and top-down processes during lexical tone perception (Zatorre and Gandour, 2008). ...
Article
Full-text available
In tonal languages such as Chinese, lexical tone serves as a phonemic feature in determining word meaning. Meanwhile, it is close to prosody in terms of suprasegmental pitch variations and larynx-based articulation. The important yet mixed nature of lexical tone has evoked considerable research, but no consensus has been reached on its functional neuroanatomy. This meta-analysis aimed at uncovering the neural network of lexical tone perception in comparison with that of phoneme and prosody in a unified framework. Independent Activation Likelihood Estimation meta-analyses were conducted for different linguistic elements: lexical tone by native tonal language speakers, lexical tone by non-tonal language speakers, phoneme, word-level prosody, and sentence-level prosody. Results showed that lexical tone and prosody studies demonstrated more extensive activations in the right than the left auditory cortex, whereas the opposite pattern was found for phoneme studies. Only tonal language speakers consistently recruited the left anterior superior temporal gyrus (STG) for processing lexical tone, an area implicated in phoneme processing and word-form recognition. Moreover, an anterior-lateral to posterior-medial gradient of activation as a function of element timescale was revealed in the right STG, in which the activation for lexical tone lay between that for phoneme and that for prosody. Another topological pattern was shown on the left precentral gyrus (preCG), with the activation for lexical tone overlapping with that for prosody but ventral to that for phoneme. These findings provide evidence that the neural network for lexical tone perception is hybrid with those for phoneme and prosody.
That is, resembling prosody, lexical tone perception, regardless of language experience, involved right auditory cortex, with activation localized between sites engaged by phonemic and prosodic processing, suggesting a hierarchical organization of representations in the right auditory cortex. For tonal language speakers, lexical tone additionally engaged the left STG lexical mapping network, consistent with the phonemic representation. Similarly, when processing lexical tone, only tonal language speakers engaged the left preCG site implicated in prosody perception, consistent with tonal language speakers having stronger articulatory representations for lexical tone in the laryngeal sensorimotor network. A dynamic dual-stream model for lexical tone perception was proposed and discussed.
... S.-Y. Wang, 1976; Xi et al., 2010; Xu et al., 2006) and linguistic time (VOT; Cheung et al., 2009; Feng, 2018; Xi et al., 2009) in a highly categorical perception (CP) mode, with a sharp identification boundary and enhanced sensitivity to between-category contrasts relative to within-category ones (Liberman et al., 1957; Massaro, 1987). Moreover, developmental studies showed that Mandarin-speaking TD children begin to show adultlike competence in the CP of lexical tones by age 6 (Chen et al., 2017; Xi et al., 2009), and that 10-year-old children generally reach adultlike CP of VOT (Feng, 2018). ...
Article
Purpose: Previous studies have shown enhanced pitch and impaired time perception in individuals with autism spectrum disorders (ASD). However, it remains unclear whether such deviant patterns of auditory processing, which depend on acoustic dimension, transfer to higher-level linguistic pitch and time processing. In this study, we compared the categorical perception (CP) of lexical tones and voice onset time (VOT) in Mandarin Chinese, which utilize pitch and time changes, respectively, to convey phonemic contrasts.
Method: The data were collected from 22 Mandarin-speaking adolescents with ASD and 20 age-matched neurotypical controls. In addition to the identification and discrimination tasks to test CP performance, all the participants were evaluated for language ability and phonological working memory. Linear mixed-effects models were constructed to evaluate the identification and discrimination scores across different groups and conditions.
Results: The basic CP pattern of cross-boundary benefit when perceiving both native lexical tones and VOT was largely preserved in high-functioning adolescents with ASD. The degree of CP of lexical tones in ASD was similar to that in typical controls, whereas the degree of CP of VOT in ASD was greatly reduced. Furthermore, the degree of CP of lexical tones correlated with language ability and digit span in ASD participants.
Conclusions: These findings suggest that the unbalanced acoustic processing capacities for pitch and time generalize to higher-level linguistic processing in ASD. Furthermore, a higher degree of CP of lexical tones correlated with better language ability in Mandarin-speaking individuals with ASD.
... This interpretation assumes that listener sensitivity to ambient acoustic and statistical information can pave the way for perception and learning of new contrasts in a second language. However, our results should not be construed as evidence that native English speakers cannot achieve a level of tone categorization matching that of native tone speakers [76]. Many factors may play a role. ...
Article
Full-text available
As many distributional learning (DL) studies have shown, adult listeners can achieve discrimination of a difficult non-native contrast after a short repetitive exposure to tokens falling at the extremes of that contrast. Such studies have shown using behavioural methods that a short distributional training can induce perceptual learning of vowel and consonant contrasts. However, much less is known about the neurological correlates of DL, and few studies have examined non-native lexical tone contrasts. Here, Australian-English speakers underwent DL training on a Mandarin tone contrast using behavioural (discrimination, identification) and neural (oddball-EEG) tasks, with listeners hearing either a bimodal or a unimodal distribution. Behavioural results show that listeners learned to discriminate tones after both unimodal and bimodal training; while EEG responses revealed more learning for listeners exposed to the bimodal distribution. Thus, perceptual learning through exposure to brief sound distributions (a) extends to non-native tonal contrasts, and (b) is sensitive to task, phonetic distance, and acoustic cue-weighting. Our findings have implications for models of how auditory and phonetic constraints influence speech learning.
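The core manipulation in distributional learning studies like this one is the shape of the exposure distribution over a stimulus continuum: bimodal exposure peaks near the two category poles, unimodal exposure peaks at the continuum centre. The sketch below samples exposure tokens from an 8-step continuum; the step count and weight values are illustrative assumptions, not those of the cited study.

```python
import random

def exposure_tokens(n, mode="bimodal", seed=0):
    """Sample n exposure tokens from an 8-step tone continuum.
    'bimodal' peaks near both category poles (here steps 2 and 7);
    'unimodal' peaks at the continuum centre. Weights are illustrative."""
    weights = {"bimodal": [1, 4, 2, 1, 1, 2, 4, 1],
               "unimodal": [1, 2, 3, 4, 4, 3, 2, 1]}[mode]
    rng = random.Random(seed)
    return rng.choices(range(1, 9), weights=weights, k=n)

bimodal = exposure_tokens(2000, mode="bimodal")
unimodal = exposure_tokens(2000, mode="unimodal")
```

The prediction tested behaviourally and with oddball EEG is that only the bimodal exposure, with its two density peaks, should nudge listeners toward a two-category parse of the continuum.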
... Although the auditory stimuli used by Bidelman et al. (2013) were not tonal, it can be argued that the timing of the categorisation (i.e. reflected in the P2 wave, around 175 ms after the stimulus onset) is similar in tones because they are also perceived categorically (Xi, Zhang, Shu, Zhang, & Li, 2010; Xu, Gandour, & Francis, 2006). However, what precise process the ERPs reflect remains ambiguous. ...
Conference Paper
Speech processing was studied by looking at brain processes underlying speech perception and production. Existing models of speech and empirical data propose that producing speech decreases neural activity relative to perceiving speech (termed Speech-Induced Suppression - SIS). SIS is associated with monitoring the intended auditory targets against perceived speech output. SIS has been frequently reported at cortical levels but not at subcortical levels. If SIS occurs at subcortical levels, then speech processing models would be expanded to incorporate these in the internal sensory prediction (i.e. the intended auditory targets). Auditory tonal stimuli were used in this thesis. Such stimuli are commonly used in research on subcortical activity during speech perception. Knowing what the benchmark response (i.e. subcortical activity to tones in speech perception) looks like allows us to compare our findings made during speech production to speech perception research. The first four studies recorded cortical activity using EEG, a common method in studying SIS. The same experimental conditions were used across the studies to facilitate comparison. The results showed a large variation in the magnitude and direction of the SIS effect across conditions and experiments. Even though mean amplitudes appeared to indicate that the cortical activity was indeed suppressed in some cases, when the random effects were controlled for using linear mixed models, the suppression was not significant. A potential explanation of this result might be that the alien voice auditory stimuli played during the experimental tasks were not recognised as one’s own. This mismatch would preclude occurrence of SIS. SIS was tested for the first time using functional near-infrared spectroscopy (fNIRS) using the same experimental conditions that were used in the EEG studies. The suppression of the fNIRS signal (HbO peaks) was not significant.
However, the haemoglobin concentration plots suggested that the responses to conditions that involved vocalisation differed from those that did not. This thesis also describes attempts at recording subcortical responses (FFR) during speech production. SIS has been reported at the brainstem level in the past (Papanicolaou, Raz, Loring, & Eisenberg, 1986) but this required further exploration because of procedural issues in the study. Recording FFRs during vocalisation was attempted here to test whether subcortical activity is suppressed. This required the development of a processing pipeline to extract clean signals (FFR) from brainstem recordings during speech production. Recording FFRs during speech production turned out to be very challenging. Methodological improvements introduced in the later experiments improved signal quality but it was far from the standard achieved during speech perception. Combining these two strands of research, i.e. SIS on cortical and subcortical level, led to methodological improvements. The main theoretical contribution of the thesis is the finding that SIS cannot be consistently observed when an external audio stimulus is presented whilst speech production occurs concurrently. This result agrees with a previous finding which described that less prototypical speech sounds are less suppressed (Niziolek, Nagarajan, & Houde, 2013). These results support speech models which postulate that suppression is due to matching predicted and perceived feedback.
... In addition, only behavioral protocols were adopted in this study. A body of evidence has demonstrated the underlying neural correlates of the CP of lexical tones for both native speakers (e.g., Xi et al., 2010; Zheng et al., 2012) and nonnative speakers (e.g., G. Shen & Froud, 2019; Yu et al., 2019), as well as the cortical abnormalities of the tonal CP for children with dyslexia (e.g., Y.) and autism (e.g., X. Wang et al., 2017). Further investigations with electrophysiological approaches are warranted to flesh out the underlying mechanisms of the bimodal benefits in lexical tone categorization from the neurophysiological perspective. ...
Article
Full-text available
Purpose: Pitch reception poses challenges for individuals with cochlear implants (CIs), and adding a hearing aid (HA) in the nonimplanted ear is potentially beneficial. The current study used fine-scale synthetic speech stimuli to investigate the bimodal benefit for lexical tone categorization in Mandarin-speaking kindergarteners using a CI and an HA in opposite ears.
Method: The data were collected from 16 participants who were required to complete two classical tasks for speech categorical perception (CP) under the CI + HA condition and the CI-alone condition. Linear mixed-effects models were constructed to evaluate the identification and discrimination scores across different device conditions.
Results: The bimodal kindergarteners showed CP for the continuum varying from Mandarin Tone 1 to Tone 2. Moreover, the additional acoustic information from the contralateral HA contributed to improved lexical tone categorization, with a steeper identification slope, higher discrimination scores for between-category stimulus pairs, and an improved peakedness score (i.e., an increased benefit magnitude for discriminations of between-category over within-category pairs) in the CI + HA condition than in the CI-alone condition. Bimodal kindergarteners with better residual hearing thresholds at 250 Hz in the nonimplanted ear perceived lexical tones more categorically.
Conclusion: The enhanced CP with bimodal listening provides clear evidence for the clinical practice of fitting a contralateral HA in the nonimplanted ear of kindergarteners with unilateral CIs, with direct benefits from low-frequency acoustic hearing.
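The peakedness score described in this abstract is simply the between-category discrimination advantage over within-category pairs. A minimal sketch with invented accuracies (the function name and all numbers are hypothetical, purely for illustration):

```python
def peakedness(between_acc, within_acc):
    """Mean between-category minus mean within-category discrimination
    accuracy; a larger value indicates more categorical perception."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(between_acc) - mean(within_acc)

# Hypothetical accuracies for one listener under two device conditions.
ci_alone = peakedness(between_acc=[0.70, 0.72], within_acc=[0.62, 0.60])
ci_plus_ha = peakedness(between_acc=[0.85, 0.83], within_acc=[0.63, 0.61])
```

On this toy data the CI + HA condition yields the larger peakedness, mirroring the direction of the bimodal benefit the study reports.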
... In particular, since lexical tone is used to convey grammatical or word meanings in more than half of the world's languages (Yip 2002), the functional anatomy and lateralization of lexical tone processing in speech have been intensively investigated. A bilateral cortical network has been shown to mediate speech tone, involving superior temporal regions, inferior prefrontal regions, and the insula (Gandour et al. 2000; Hsieh et al. 2001; Klein et al. 2001; Wong 2002; Liu et al. 2006; Luo et al. 2006; Ren et al. 2009; Li et al. 2010; Xi et al. 2010; Nan and Friederici 2013; Chang et al. 2014; Yu et al. 2014; Ge et al. 2015; Kwok et al. 2016, 2017; Liang and Du 2018). ...
Article
Full-text available
One prominent theory in neuroscience and psychology assumes that cortical regions for language are left hemisphere lateralized in the human brain. In the current study, we used a novel technique, quantitative magnetic resonance imaging (qMRI), to examine interhemispheric asymmetries in language regions in terms of macromolecular tissue volume (MTV) and quantitative longitudinal relaxation time (T1) maps in the living human brain. These two measures are known to reflect cortical myeloarchitecture from the microstructural perspective. One hundred and fifteen adults (55 male, 60 female) were examined for their myeloarchitectonic asymmetries of language regions. We found that the cortical myeloarchitecture of inferior frontal areas including the pars opercularis, pars triangularis, and pars orbitalis is left lateralized, while that of the middle temporal gyrus, Heschl’s gyrus, and planum temporale is right lateralized. Moreover, the leftward lateralization of myelination structure is significantly correlated with language skills measured by phonemic and speech tone awareness. This study reveals for the first time a mixed pattern of myeloarchitectonic asymmetries, which calls for a general theory to accommodate the full complexity of principles underlying human hemispheric specialization.
Article
This study examines brain responses to boundary effects with respect to Mandarin lexical tone continua for three groups of adult listeners: (1) native English speakers who took advanced Mandarin courses; (2) naïve English speakers; and (3) native Mandarin speakers. A cross-boundary tone pair and a within-category tone pair derived from tonal contrasts (Mandarin Tone 1/Tone 4; Tone 2/Tone 3) with equal physical/acoustical distance were used in an auditory oddball paradigm. For native Mandarin speakers, the cross-category deviant elicited a larger MMN over left hemisphere sensors and larger P300 responses over both hemispheres relative to within-category deviants, suggesting categorical perception of tones at both pre-attentive and attentional stages of processing. In contrast, native English speakers and Mandarin learners did not demonstrate categorical effects. However, learners of Mandarin showed larger P300 responses than the other two groups, suggesting heightened sensitivity to tones and possibly greater attentional resource allocation to tone identification.
Article
Enhanced pitch perception has been identified in autistic individuals, but it remains understudied whether such enhancement can be observed in the lexical tone perception of language-delayed autistic children. This study examined the categorical perception of Mandarin lexical tones in 23 language-delayed autistic children and two groups of non-autistic children, with one matched on chronological age (n = 23) and the other on developmental age in language ability (n = 23). The participants were required to identify and discriminate lexical tones. A wider identification boundary width and a lower between-category discrimination accuracy were found in autistic children than their chronological-age-matched non-autistic peers, but the autistic group exhibited seemingly comparable performance to the group of developmental-age-matched non-autistic children. While both non-autistic groups displayed a typical categorical perception pattern with enhanced sensitivity to between-category tone pairs relative to within-category ones, such a categorical perception pattern was not observed in the autistic group. These findings suggest that among language-delayed autistic children with a developmental age around 4, categorical perception is still developing. Finally, we found categorical perception performance correlated with language ability, indicating autistic children’s language disability might be predictive of their poor categorical perception of speech sounds.
Lay abstract: Some theories suggest that autistic people have better pitch perception skills than non-autistic people. However, in a context where pitch patterns are used to differentiate word meanings (i.e. lexical tones), autistic people may encounter difficulties, especially those with less language experience.
We tested this by asking language-delayed autistic children to identify and discriminate two Mandarin lexical tones (/yi/ with Tone 1, meaning ‘clothes’; /yi/ with Tone 2, meaning ‘aunt’; /yi/: the standard romanization of Mandarin Chinese). On average, these autistic children were 7.35 years old, but their developmental age in language ability was 4.20, lagging behind 7-year-old non-autistic children in terms of language ability. Autistic children’s performance in identifying and discriminating lexical tones was compared with two groups of non-autistic children: one group was matched with the autistic group on age, and the other was matched based on language ability. Autistic children performed differently from the non-autistic children matched on age, while autistic and non-autistic children matched on language ability exhibited seemingly similar performance. However, both the non-autistic groups have developed the perceptual ability to process lexical tones as different categories, but this ability was still developing in autistic children. Finally, we found autistic children who performed worse in identifying lexical tones had poorer language ability. The results suggest that language disability might have adverse influence on the development of skills of speech sound processing.
Article
Lexical tone processing in speech is mediated by bilateral superior temporal and inferior prefrontal regions, but little is known concerning the neural circuitries of lexical tone phonology in reading. Using fMRI, we examined the neural systems for lexical tone in visual Chinese word recognition. We found that the extraction of lexical tone phonology in print was subserved by bilateral fronto-parietal regions. Seed-to-voxel analyses showed that functionally connected cortical regions involved right inferior frontal gyrus and SMA, right middle frontal gyrus and right inferior parietal lobule, and SMA and bilateral cingulate gyri. Our results indicate that in Chinese tone reading, a bilateral network of frontal, parietal, motor, and cingulate regions is engaged, without involvement of temporal regions crucial for tone identification in auditory domain. Although neural couplings for lexical tone processing are different in speech and reading to some degree, the motor cortex seems to be a key component independent of modality.
Article
Full-text available
The accurate perception of lexical tones in Mandarin Chinese is an important foundation for successfully understanding spoken Chinese. Previous behavioral studies have shown that the ability to perceive lexical tones in Mandarin declines in elderly individuals. In addition to other research areas related to language and aging, the central issue in phonetic perception during aging concerns whether perceptual changes related to aging are area-specific or area-general. The area-general language hypothesis of aging assumes that changes in language perception related to aging are caused by a decline in both general sensory perception function and high-order cognitive function. In contrast, the area-specific language hypothesis of aging assumes that changes in aging-related language perception are caused by specific deficits in language processing. Previous studies have mostly tested listeners in attentive conditions and focused on how area-general factors affect the processing of segmental phonemes in elderly individuals. The present study examined neurophysiological responses, particularly the mismatch negativity (MMN), to explore whether the aging of lexical tone perception is language-specific for Mandarin. The current study recruited 22 healthy elderly participants (age range: 55.6~79.6 years) and 18 young participants (age range: 22.7~29.0 years). In a passive oddball task, we used event-related potentials (ERPs) to examine Mandarin lexical tone perception. Three syllables from a lexical tone continuum were chosen as stimuli to form an across-category stimulus pair and a within-category stimulus pair for the ERP oddball task. A non-speech stimulus pair was generated on the basis of the within-category stimulus pair. During the experiment, participants were instructed to ignore the presented sounds while watching a self-selected movie.
ERP data showed that in the across-category condition, the elderly group had a smaller MMN than the young group, whereas there was no between-group difference in the within-category condition. In the young group, a non-speech tone elicited a larger MMN amplitude than a speech tone sharing the same pitch contour, while the elderly group showed no such effect. In addition, the MMN elicited by the non-speech contrast was significantly smaller in the elderly group than in the young group. These results indicate that the general decline in central auditory processing function was not related to the pre-attentive processing of lexical tone. Moreover, when the audibility of the auditory input was controlled according to peripheral hearing abilities, the decline in peripheral auditory function was not related to the preservation of or decline in lexical tone perception in the current study. Thus, there is no evidence in the current study that the age-related decline in area-general factors affects tone perception under pre-attentive conditions. On this basis, the study further speculates that elderly Mandarin-speaking individuals' ability to perceive lexical tones pre-attentively is broadly preserved, with the decline confined to language-specific processing: the reduced processing of Mandarin tone category knowledge, alongside the wider preservation of speech tone processing, is language-specific. The present study provides evidence for the area-specific language hypothesis of aging.
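The MMN comparisons described above rest on a simple computation: averaging standard and deviant epochs separately, subtracting to obtain a difference wave, and measuring its mean amplitude in a latency window. A minimal NumPy sketch; the epoch shapes and the 100–250 ms window are illustrative assumptions, not the study's actual parameters:

```python
import numpy as np

def mmn_difference_wave(standard_trials, deviant_trials, times, window=(0.1, 0.25)):
    """Compute the deviant-minus-standard difference wave and its mean amplitude.

    standard_trials, deviant_trials: arrays of shape (n_trials, n_samples)
    times: sample times in seconds; window: assumed MMN latency range.
    """
    erp_std = standard_trials.mean(axis=0)        # average standard ERP
    erp_dev = deviant_trials.mean(axis=0)         # average deviant ERP
    diff = erp_dev - erp_std                      # MMN difference wave
    mask = (times >= window[0]) & (times <= window[1])
    return diff, diff[mask].mean()                # waveform and mean MMN amplitude
```

A smaller (more negative) mean amplitude for one group or condition is what the between-group comparisons above are quantifying.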
Article
The categorical perception of lexical tones is important for understanding tonal languages. Recent studies have provided electrophysiological evidence for the categorical perception of lexical tones at the cortical level; however, whether neural correlates exist at subcortical levels remains unknown. In this study, using across-category and within-category lexical tone contrasts with equivalent physical intervals, we recorded deviance detection activity simultaneously at the brainstem level (reflected by the frequency-following response) and the cortical level (reflected by mismatch negativity). We found that significantly enhanced intertrial phase-locking of frequency-following responses was observed only during across-category deviance detection, which indicates that phonological differences can be detected at the level of the brainstem. In addition, the across-category deviants induced stronger mismatch negativity than within-category deviants. For the first time, our results demonstrate that neural correlates of categorical perception of lexical tones exist even in the brainstem, and suggest that both cortical and subcortical processes are involved in the coding and categorization of tonal information.
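Intertrial phase-locking of the FFR is commonly quantified as inter-trial phase coherence: the resultant length of unit phase vectors across trials at the frequency of interest (0 for random phase, 1 for perfect locking). A hedged sketch; the single-FFT-bin approach and parameter names here are illustrative assumptions, not the authors' pipeline:

```python
import numpy as np

def intertrial_phase_locking(trials, fs, freq):
    """Inter-trial phase coherence at one frequency.

    trials: (n_trials, n_samples) FFR epochs; fs: sampling rate in Hz;
    freq: target frequency (e.g., the stimulus F0) in Hz.
    """
    n = trials.shape[1]
    spectrum = np.fft.rfft(trials, axis=1)
    k = int(round(freq * n / fs))                 # nearest FFT bin to target frequency
    phases = np.angle(spectrum[:, k])             # per-trial phase at that bin
    return np.abs(np.mean(np.exp(1j * phases)))   # resultant length of unit phase vectors
```

Enhanced phase-locking for across-category deviants, as reported above, would appear as a larger coherence value for those trials.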
Article
Full-text available
There is growing interest in developing and using novel measures to assess how the body is represented in human infancy. Various lines of evidence with adults and older children show that tactile perception is modulated by a high-level representation of the body. For instance, the distance between two points of tactile stimulation is perceived as being greater when they cross a joint boundary than when they are within a body part, suggesting that the representation of the body is structured with joints acting as categorical boundaries between body parts. Investigating the developmental origins of this categorical effect has been constrained by infants’ inability to verbally report on the properties of tactile stimulation. Here we made novel use of an infant brain measure, the somatosensory mismatch negativity (sMMN), to explore categorical aspects of tactile body processing in infants aged 6 to 7 months. Amplitude of the sMMN elicited by tactile stimuli across the wrist boundary was significantly greater than for stimuli of equal distance that were within the boundary, suggesting a categorical effect in body processing in infants. We suggest that an early-appearing, structured representation of the body into ‘parts’ may play a role in mapping correspondences between self and other.
Article
This study used the categorical perception (CP) paradigm, a fine-grained perceptual method, to investigate the perceptual performance of lexical tones in Chinese people with post-stroke aphasia (PWA). Twenty patients with post-stroke aphasia (10 Broca's and 10 Wernicke's) and ten neurologically intact age-matched control participants were recruited to complete both identification and discrimination tasks on the Mandarin Tone 1–2 continuum. In addition, all participants completed tests of auditory comprehension ability and working memory. The results showed that both Broca's and Wernicke's patients exhibited reduced sensitivity to within-category and between-category information but preserved CP of lexical tones. The degree of CP of lexical tones was related to working memory in aphasic patients. Furthermore, lower-level acoustic processing underpinned higher-level phonological processing in the CP of lexical tones, since both patient groups' unbalanced pitch processing ability extended to their CP of lexical tones. These findings are significant for researchers and clinicians in speech-language rehabilitation, clinical psychology, and cognitive communication.
Chapter
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that presents with core deficits in language and social communication areas. Past decades have witnessed a growing number of studies concerning this population’s language and communication skills. However, studies focusing on Chinese-speaking individuals with ASD are rare and have just begun to accumulate. This review focuses on prosody and lexical tone perception and production in Chinese-speaking individuals with ASD. We also briefly review the evidence from general ASD literature for cross-language comparisons. Similar to patterns seen in many non-tonal language speakers with ASD, Chinese-speaking individuals with ASD generally demonstrate atypical pitch in terms of both average and range of values in verbal productions. Behavioral and neurophysiological evidence suggest atypicality, such as enhanced lower-level auditory processing and reduced higher-level linguistic processing in Chinese-speaking individuals with ASD. We also report some preliminary neural intervention data on bilingual English–Mandarin-learning children with ASD. Future directions on advancing theory and practice are discussed.
Chapter
The perception of acoustic and phonological information in lexical tones is crucial for understanding Chinese words correctly. Past research has considered the linguistic functions of both acoustic and phonological information. However, it has been debated whether Chinese lexical tones are processed in the right or the left hemisphere, and whether different types of information may be handled differently in the two hemispheres. For native Chinese speakers (L1), the acoustic information of tones appears to be processed in the right hemisphere, whereas the phonological information of tones is mostly processed in the left hemisphere. For second language (L2) Chinese learners, it has been hypothesized that they may show a right-lateralized pattern for processing both acoustic and phonological information at the early stage of Chinese learning; when their processing of these two types of information improves at a later stage of learning, native-like patterns emerge. In this chapter, we discuss how these two types of information play their roles in the processing of lexical tones in Chinese by both native speakers and second language learners of Chinese.
Article
The purpose of this review is to provide an accessible exploration of key considerations of lateralization in speech and non-speech perception using clear and defined language. From these considerations, the primary arguments for each side of the linguistics versus acoustics debate are outlined and explored in context of emerging integrative theories. This theoretical approach entails a perspective that linguistic and acoustic features differentially contribute to leftward bias, depending on the given context. Such contextual factors include stimulus parameters and variables of stimulus presentation (e.g., noise/silence and monaural/binaural) and variances in individuals (sex, handedness, age, and behavioural ability). Discussion of these factors and their interaction is also aimed towards providing an outline of variables that require consideration when developing and reviewing methodology of acoustic and linguistic processing laterality studies. Thus, there are three primary aims in the present paper: (1) to provide the reader with key theoretical perspectives from the acoustics/linguistics debate and a synthesis of the two viewpoints, (2) to highlight key caveats for generalizing findings regarding predominant models of speech laterality, and (3) to provide a practical guide for methodological control using predominant behavioural measures (i.e., gap detection and dichotic listening tasks) and/or neurophysiological measures (i.e., mismatch negativity) of speech laterality.
Article
Purpose Although acquisition of Chinese lexical tones by second language (L2) learners has been intensively investigated, very few studies focused on categorical perception (CP) of lexical tones by highly proficient L2 learners. This study was designed to address this issue with behavioral and electrophysiological measures. Method Behavioral identification and auditory event-related potential (ERP) components for speech discrimination, including mismatch negativity (MMN), N2b, and P3b, were measured in 23 native Korean speakers who were highly proficient late L2 learners of Chinese. For the ERP measures, both passive and active listening tasks were administered to examine the automatic and attention-controlled discriminative responses to within- and across-category differences for carefully chosen stimuli from a lexical tone continuum. Results The behavioral task revealed native-like identification function of the tonal continuum. Correspondingly, the active oddball task demonstrated larger P3b amplitudes for the across-category than within-category deviants in the left recording site, indicating clear CP of lexical tones in the attentive condition. By contrast, similar MMN responses in the right recording site were elicited by both the across- and within-category deviants, indicating the absence of CP effect with automatic phonological processing of lexical tones at the pre-attentive stage even in L2 learners with high Chinese proficiency. Conclusion Although behavioral data showed clear evidence of categorical perception of lexical tones in proficient L2 learners, ERP measures from passive and active listening tasks demonstrated fine-grained sensitivity in terms of response polarity, latency, and laterality in revealing different aspects of auditory versus linguistic processing associated with speech decoding by means of largely implicit native language acquisition versus effortful explicit L2 learning.
Article
The processing of lexical tones, vowels, and consonants is significant in tonal language speech perception. However, it remains unclear whether their processing is similar or distinct concerning the extent and time course and whether their processing is independent or integrated. Thus in the present study, we conducted two event-related potential (ERP) experiments to explore how native speakers of Cantonese process lexical tones (including level and contour tones), vowels, and consonants in real vs. pseudo-Cantonese words with mismatch negativity (MMN). The MMN amplitudes and latencies showed that lexical tones and vowels were processed similarly in extent and time course. Lexical tones and consonants were processed differently in extent and time course. Vowels and consonants were processed to similar extents but over different time courses. Lexicality (real words vs. pseudowords) and tonal type (level vs. contour tones) modulated the differences in the extent and time courses of processing between lexical tones/vowels and consonants. The MMN additivity analyses further suggested that the processing of lexical tones and vowels, lexical tones and consonants, and vowels and consonants were integrated regardless of lexicality and tonal type. The results revealed that distinct but integrated processing occurs for lexical tones, vowels, and consonants in the speech perception of tonal languages. The findings provided neurophysiological evidence for the mechanism underlying tonal language spoken word recognition.
Article
Previous studies proposed different views to explain the hemispheric lateralization of lexical tone processing, but how acoustic and phonological information modulate it remains unclear. Acoustic information refers to the physical acoustic features of lexical tones, and phonological information refers to the different word meanings differentiated by lexical tones. In the present study, we adopted the active oddball paradigm to explore the effects of pitch type and lexicality on native Cantonese speakers' lexical tone processing with the event-related potential (ERP) technique. We used Cantonese level and contour tones (pitch type) to examine the role of acoustic information, and real words and pseudowords (lexicality) to detect the effect of phonological information. The results showed that pitch type and lexicality interactively affected N2b amplitudes between the left and right hemispheres, while they played no role in P3b amplitudes. This indicates that acoustic and phonological information modulated the hemispheric lateralization of lexical tone processing interactively only in the early stage (N2b time window) but not in the later stage (P3b time window). The findings suggest a two-stage model of hemispheric lateralization in lexical tone processing.
Article
Purpose Congenital deafness not only delays auditory development but also hampers the ability to perceive nonspeech and speech signals. This study aimed to use auditory event-related potentials to explore the mismatch negativity (MMN), P3a, negative wave (Nc), and late discriminative negativity (LDN) components in children with and without hearing loss. Method Nineteen children with normal hearing (CNH) and 17 children with hearing loss (CHL) participated in this study. Two sets of pure tones (1 kHz vs. 1.1 kHz) and lexical tones (/ba2/ vs. /ba4/) were used to examine the auditory discrimination process. Results MMN could be elicited by the pure tone and the lexical tone in both groups. The MMN latency elicited by nonspeech and speech was later in CHL than in CNH. Additionally, the MMN latency induced by speech occurred later in the left than in the right hemisphere in CNH, and the MMN amplitude elicited by speech in CHL revealed a discrimination deficit relative to CNH. Although the P3a latency and amplitude elicited by nonspeech did not differ significantly between CHL and CNH, the Nc amplitude elicited by speech was much smaller in CHL than in CNH. Furthermore, the LDN latency elicited by nonspeech was later in CHL than in CNH, and the LDN amplitude induced by speech showed right-hemisphere dominance in both CNH and CHL. Conclusion By incorporating nonspeech and speech auditory conditions, we propose using MMN, Nc, and LDN as potential indices to investigate auditory perception, memory, and discrimination.
Article
Objective: Speech is a common way of communication. Decoding verbal intent could provide a naturalistic communication channel for people with severe motor disabilities. The active brain-computer interface (BCI) speller is one of the most commonly used speech BCIs. To reduce the spelling time of Chinese words, identifying the vowels and tones embedded in imagined Chinese words is essential. Functional near-infrared spectroscopy (fNIRS) has been widely used in BCI because it is portable, non-invasive, safe, low cost, and has a relatively high spatial resolution. Approach: In this study, an active BCI speller based on fNIRS is presented, in which participants covertly rehearsed tonal monosyllables with vowels (i.e., /a/, /i/, /o/, and /u/) and the four lexical tones of Mandarin Chinese (i.e., tones 1, 2, 3, and 4) for 10 s. Main results: fNIRS results showed significant differences in the right superior temporal gyrus between imagined vowels with tone 2/3/4 and those with tone 1 (i.e., more activations and stronger connections to other brain regions for imagined vowels with tones 2/3/4 than for those with tone 1). Speech-related areas for tone imagery (i.e., the right hemisphere) provided the majority of the information for identifying tones, while the left hemisphere had an advantage in vowel identification. When both vowels and tones were decoded over the post-stimulus 15 s period, the average classification accuracies exceeded 40% and 70% in multiclass (i.e., four classes) and binary settings, respectively. To spell words more quickly, the time window size for decoding was reduced from 15 s to 2.5 s without a significant reduction in classification accuracy. Significance: For the first time, this work demonstrates the possibility of discriminating lexical tones and vowels in imagined tonal syllables. In addition, the reduced decoding time window indicates that the spelling time of Chinese words could be significantly reduced in fNIRS-based BCIs.
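Decoding from window-averaged fNIRS features can be sketched with a deliberately simple nearest-centroid classifier. The shapes, window choice, and classifier here are illustrative assumptions, not the study's actual pipeline, which would include its own preprocessing, feature selection, and cross-validation:

```python
import numpy as np

def window_features(epochs, fs, t_start, t_end):
    """Average each channel over a decoding window.

    epochs: (n_trials, n_channels, n_samples) hemodynamic responses;
    fs: sampling rate in Hz; window bounds in seconds (assumed values).
    """
    a, b = int(t_start * fs), int(t_end * fs)
    return epochs[:, :, a:b].mean(axis=2)         # (n_trials, n_channels) features

def nearest_centroid(train_X, train_y, test_X):
    """Assign each test trial to the class with the closest mean feature vector."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(test_X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]
```

Shrinking the decoding window, as in the study, corresponds to narrowing `t_start`/`t_end` and re-measuring accuracy.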
Article
Full-text available
Previous work has not yielded clear conclusions about the categorical nature of perception of tone contrasts by native listeners of tone languages. We reopen this issue in a cross-linguistic study comparing Taiwan Mandarin and French listeners. We tested these listeners on three tone continua derived from natural Mandarin utterances within carrier sentences, created via a state-of-the-art pitch-scaling technique in which within-continuum interpolation was applied to both f0 and intensity contours. Classic assessments of categorization and discrimination of each tone continuum were conducted with both groups of listeners. In Experiment 1, Taiwanese listeners identified the tone of target syllables within carrier sentence context and discriminated tones of single syllables. In Experiment 2, both French and Taiwanese listeners completed an AXB identification task on single syllables. Finally, French listeners were run on an AXB discrimination task in Experiment 3. Results indicated that Taiwanese listeners’ perception of tones is quasi-categorical whereas French listeners’ is psychophysically based. French listeners nevertheless show substantial sensitivity to tone contour differences, though to a lesser extent than Taiwanese listeners. Thus, the findings suggest that despite the lack of lexical tone contrasts in the French language, French listeners are not absolutely “deaf” to tonal variations. They simply fail to perceive tones along the lines of a well-defined and finite set of linguistic categories.
Article
Full-text available
Linguistic analyses of tone and intonation, as well as experimental and clinical studies of pitch in the speech signal, indicate that pitch cues play various roles in language behavior. These different functions, occurring at different levels of grammar, are located along a continuum of linguistic structure from most structured (e.g., tones) to least structured (e.g., voice quality). Hemispheric laterality studies show that highly structured pitch contrasts are associated with left cerebral processing, whereas least structured pitch cues are specialized to the right hemisphere. Intermediate functional roles of pitch, those conveyed on intonation contours, which are made up of intricate meshings of both all-or-none and graded phenomena, are correspondingly ambiguous with respect to laterality. The studies reviewed lead to the conclusion that pitch in the acoustic signal is processed in the brain according to its functional context, properties of which may be specialized in either hemisphere.
Article
Full-text available
The affinity and temporal course of functional fields in middle and posterior superior temporal cortex for the categorization of complex sounds was examined using functional magnetic resonance imaging (fMRI) and event-related potentials (ERPs) recorded simultaneously. Data were compared before and after subjects were trained to categorize a continuum of unfamiliar nonphonemic auditory patterns with speech-like properties (NP) and a continuum of familiar phonemic patterns (P). fMRI activation for NP increased after training in left posterior superior temporal sulcus (pSTS). The ERP P2 response to NP also increased with training, and its scalp topography was consistent with left posterior superior temporal generators. In contrast, the left middle superior temporal sulcus (mSTS) showed fMRI activation only for P, and this response was not affected by training. The P2 response to P was also independent of training, and its estimated source was more anterior in left superior temporal cortex. Results are consistent with a role for left pSTS in short-term representation of relevant sound features that provide the basis for identifying newly acquired sound categories. Categorization of highly familiar phonemic patterns is mediated by long-term representations in left mSTS. Results provide new insight regarding the function of ventral and dorsal auditory streams.
Article
Full-text available
Human beings differ in their ability to master the sounds of their second language (L2). Phonetic training studies have proposed that differences in phonetic learning stem from differences in psychoacoustic abilities rather than speech-specific capabilities. We aimed at finding the origin of individual differences in L2 phonetic acquisition in natural learning contexts. We consider two alternative explanations: a general psychoacoustic origin vs. a speech-specific one. For this purpose, event-related potentials (ERPs) were recorded from two groups of early, proficient Spanish-Catalan bilinguals who differed in their mastery of the Catalan (L2) phonetic contrast /e-ε/. Brain activity in response to acoustic change detection was recorded in three different conditions involving tones of different length (duration condition), frequency (frequency condition), and presentation order (pattern condition). In addition, neural correlates of speech change detection were also assessed for both native (/o/-/e/) and nonnative (/o/-/ö/) phonetic contrasts (speech condition). Participants' discrimination accuracy, reflected electrically as a mismatch negativity (MMN), was similar between the two groups of participants in the three acoustic conditions. Conversely, the MMN was reduced in poor perceivers (PP) when they were presented with speech sounds. Therefore, our results support a speech-specific origin of individual variability in L2 phonetic mastery. Keywords: mismatch negativity, event-related potentials, bilingualism
Article
Full-text available
Neural representation of pitch is influenced by lifelong experiences with music and language at both cortical and subcortical levels of processing. The aim of this article is to determine whether neural plasticity for pitch representation at the level of the brainstem is dependent upon specific dimensions of pitch contours that commonly occur as part of a native listener's language experience. Brainstem frequency following responses (FFRs) were recorded from Chinese and English participants in response to four Mandarin tonal contours presented in a nonspeech context in the form of iterated rippled noise. Pitch strength (whole contour, 250 msec; 40-msec segments) and pitch-tracking accuracy (whole contour) were extracted from the FFRs using autocorrelation algorithms. Narrow band spectrograms were used to extract spectral information. Results showed that the Chinese group exhibits smoother pitch tracking than the English group in three out of the four tones. Moreover, cross-language comparisons of pitch strength of 40-msec segments revealed that the Chinese group exhibits more robust pitch representation of those segments containing rapidly changing pitch movements across all four tones. FFR spectral data were complementary showing that the Chinese group exhibits stronger representation of multiple pitch-relevant harmonics relative to the English group across all four tones. These findings support the view that at early preattentive stages of subcortical processing, neural mechanisms underlying pitch representation are shaped by particular dimensions of the auditory stream rather than speech per se. Adopting a temporal correlation analysis scheme for pitch encoding, we propose that long-term experience sharpens the tuning characteristics of neurons along the pitch axis with enhanced sensitivity to linguistically relevant variations in pitch.
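The pitch strength measure described above is autocorrelation-based: the height of the normalized autocorrelation peak within a plausible F0 search range, with the peak lag giving the F0 estimate. A minimal sketch; the search range, normalization, and function names are assumptions, not the authors' exact algorithm:

```python
import numpy as np

def pitch_strength(signal, fs, f0_range=(80.0, 400.0)):
    """Pitch strength and F0 estimate from the normalized autocorrelation peak.

    signal: 1-D waveform; fs: sampling rate in Hz;
    f0_range: assumed F0 search range in Hz.
    """
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]  # non-negative lags only
    ac = ac / ac[0]                                    # normalize so lag 0 is 1
    lo = int(fs / f0_range[1])                         # shortest period (highest F0)
    hi = int(fs / f0_range[0])                         # longest period (lowest F0)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))           # best candidate period
    return ac[lag], fs / lag                           # (strength, estimated F0 in Hz)
```

Applied to short segments (e.g., 40 ms frames), this yields the segment-wise pitch strength values that the cross-language comparisons describe; periodic signals give values near 1, noise gives values near 0.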
Article
Full-text available
There is considerable debate about whether the early processing of sounds depends on whether they form part of speech. Proponents of such speech specificity postulate the existence of language-dependent memory traces, which are activated in the processing of speech but not when equally complex, acoustic non-speech stimuli are processed. Here we report the existence of these traces in the human brain. We presented to Finnish subjects the Finnish phoneme prototype /e/ as the frequent stimulus, and other Finnish phoneme prototypes or a non-prototype (the Estonian prototype /õ/) as the infrequent stimulus. We found that the brain's automatic change-detection response, reflected electrically as the mismatch negativity (MMN), was enhanced when the infrequent, deviant stimulus was a prototype (the Finnish /ö/) relative to when it was a non-prototype (the Estonian /õ/). These phonemic traces, revealed by MMN, are language-specific, as /õ/ caused enhancement of MMN in Estonians. Whole-head magnetic recordings located the source of this native-language, phoneme-related response enhancement, and thus the language-specific memory traces, in the auditory cortex of the left hemisphere.
Article
Full-text available
In studies of pitch processing, a fundamental question is whether shared neural mechanisms at higher cortical levels are engaged for pitch perception of linguistic and nonlinguistic auditory stimuli. Positron emission tomography (PET) was used in a crosslinguistic study to compare pitch processing in native speakers of two tone languages (that is, languages in which variations in pitch patterns are used to distinguish lexical meaning), Chinese and Thai, with that of speakers of English, a nontone language. Five subjects from each language group were scanned under three active tasks (tone, pitch, and consonant) that required focused-attention, speeded-response, auditory discrimination judgments, and one passive silence baseline. Subjects were instructed to judge pitch patterns of Thai lexical tones in the tone condition; pitch patterns of nonspeech stimuli in the pitch condition; and syllable-initial consonants in the consonant condition. Analysis was carried out by paired-image subtraction. When comparing the tone to the pitch task, only the Thai group showed significant activation in the left frontal operculum. Activation of the left frontal operculum in the Thai group suggests that phonological processing of suprasegmental as well as segmental units occurs in the vicinity of Broca's area. Baseline subtractions showed significant activation in the anterior insular region for the English and Chinese groups, but not Thai, providing further support for the existence of possibly two parallel, separate pathways projecting from the temporo-parietal to the frontal language area. More generally, these differential patterns of brain activation across language groups and tasks support the view that pitch patterns are processed at higher cortical levels in a top-down manner according to their linguistic function in a particular language.
Article
Full-text available
This study examined neurophysiologic correlates of the perception of native and nonnative phonetic categories. Behavioral and electrophysiologic responses were obtained from Hindi and English listeners in response to a stimulus continuum of naturally produced, bilabial CV stimuli that differed in VOT from -90 to 0 ms. These speech sounds constitute phonemically relevant categories in Hindi but not in English. As expected, the native Hindi listeners identified the stimuli as belonging to two distinct phonetic categories (/ba/ and /pa/) and were easily able to discriminate a stimulus pair across these categories. On the other hand, English listeners discriminated the same stimulus pair at chance level. In the electrophysiologic experiment, N1 and MMN cortical evoked potentials (considered neurophysiologic indices of stimulus processing) were measured. The changes in N1 latency, which reflected the duration of pre-voicing across the stimulus continuum, were not significantly different for Hindi and English listeners. On the other hand, in response to the /ba/-/pa/ stimulus contrast, a robust MMN was seen only in Hindi listeners and not in English listeners. These results suggest that neurophysiologic levels of stimulus processing reflected by the MMN and N1 are differentially altered by linguistic experience.
Article
Full-text available
We used positron emission tomography to examine the response of human auditory cortex to spectral and temporal variation. Volunteers listened to sequences derived from a standard stimulus, consisting of two pure tones separated by one octave alternating with a random duty cycle. In one series of five scans, spectral information (tone spacing) remained constant while speed of alternation was doubled at each level. In another five scans, speed was kept constant while the number of tones sampled within the octave was doubled at each level, resulting in increasingly fine frequency differences. Results indicated that (i) the core auditory cortex in both hemispheres responded to temporal variation, while the anterior superior temporal areas bilaterally responded to the spectral variation; and (ii) responses to the temporal features were weighted towards the left, while responses to the spectral features were weighted towards the right. These findings confirm the specialization of the left-hemisphere auditory cortex for rapid temporal processing, and indicate that core areas are especially involved in these processes. The results also indicate a complementary hemispheric specialization in right-hemisphere belt cortical areas for spectral processing. The data provide a unifying framework to explain hemispheric asymmetries in processing speech and tonal patterns. We propose that differences exist in the temporal and spectral resolution of corresponding fields in the two hemispheres, and that they may be related to anatomical hemispheric asymmetries in myelination and spacing of cortical columns.
Article
Full-text available
As indexed by electrophysiological measures, in native speakers of a language with linguistically significant opposition between short and long phonemes, the pre-attentive detection accuracy of duration changes in speech sounds was tuned in comparison with that in non-speech sounds. This was not observed in advanced second-language users of the same language, suggesting that second-language acquisition does not lead to speech-specific tuning of the duration processing as does native language acquisition in early childhood.
Article
Full-text available
Functional magnetic resonance imaging was employed before and after six native English speakers completed lexical tone training as part of a program to learn Mandarin as a second language. Language-related areas including Broca's area, Wernicke's area, auditory cortex, and supplementary motor regions were active in all subjects before and after training and did not vary in average location. Across all subjects, improvements in performance were associated with an increase in the spatial extent of activation in left superior temporal gyrus (Brodmann's area 22, putative Wernicke's area), the emergence of activity in adjacent Brodmann's area 42, and the emergence of activity in right inferior frontal gyrus (Brodmann's area 44), a homologue of putative Broca's area. These findings demonstrate a form of enrichment plasticity in which the early cortical effects of learning a tone-based second language involve both expansion of preexisting language-related areas and recruitment of additional cortical regions specialized for functions similar to the new language functions.
Article
Full-text available
Identification and discrimination of lexical tones in Cantonese were compared in the context of a traditional categorical perception paradigm. Three lexical tone continua were used: one ranging from low level to high level, one from high rising to high level, and one from low falling to high rising. Identification data showed steep slopes at category boundaries, suggesting that lexical tones are perceived categorically. In contrast, discrimination curves generally showed much weaker evidence for categorical perception. Subsequent investigation showed that the presence of a tonal context played a strong role in the identification of target tones and less of a role in discrimination. The results are consistent with the hypothesis that tonal category boundaries are determined by a combination of regions of natural auditory sensitivity and the influence of linguistic experience.
Article
Full-text available
Auditory pitch patterns are significant ecological features to which nervous systems have exquisitely adapted. Pitch patterns are found embedded in many contexts, enabling different information-processing goals. Do the psychological functions of pitch patterns determine the neural mechanisms supporting their perception, or do all pitch patterns, regardless of function, engage the same mechanisms? This issue is pursued in the present study by using 15O-water positron emission tomography to study brain activations when two subject groups discriminate pitch patterns in their respective native languages, one of which is a tonal language and the other of which is not. In a tonal language, pitch patterns signal lexical meaning. Native Mandarin-speaking and English-speaking listeners discriminated pitch patterns embedded in Mandarin and English words and also passively listened to the same stimuli. When Mandarin listeners discriminated pitch embedded in Mandarin lexical tones, the left anterior insular cortex was the most active. When they discriminated pitch patterns embedded in English words, the homologous area in the right hemisphere activated as it did in English-speaking listeners discriminating pitch patterns embedded in either Mandarin or English words. These results support the view that neural responses to physical acoustic stimuli depend on the function of those stimuli and implicate anterior insular cortex in auditory processing, with the left insular cortex especially responsive to linguistic stimuli.
Article
Full-text available
Linguistic experience alters an individual's perception of speech. We here provide evidence of the effects of language experience at the neural level from two magnetoencephalography (MEG) studies that compare adult American and Japanese listeners' phonetic processing. The experimental stimuli were American English /ra/ and /la/ syllables, phonemic in English but not in Japanese. In Experiment 1, the control stimuli were /ba/ and /wa/ syllables, phonemic in both languages; in Experiment 2, they were non-speech replicas of /ra/ and /la/. The behavioral and neuromagnetic results showed that Japanese listeners were less sensitive to the phonemic /r-l/ difference than American listeners. Furthermore, processing non-native speech sounds recruited significantly greater brain resources in both hemispheres and required a significantly longer period of brain activation in two regions, the superior temporal area and the inferior parietal area. The control stimuli showed no significant differences except that the duration effect in the superior temporal cortex also applied to the non-speech replicas. We argue that early exposure to a particular language produces a "neural commitment" to the acoustic properties of that language and that this neural commitment interferes with foreign language processing, making it less efficient.
Article
Full-text available
This study investigates the neural substrates underlying the perception of two sentence-level prosodic phenomena in Mandarin Chinese: contrastive stress (initial vs. final emphasis position) and intonation (declarative vs. interrogative modality). In an fMRI experiment, Chinese and English listeners were asked to selectively attend to either stress or intonation in paired 3-word sentences, and make speeded-response discrimination judgments. Between-group comparisons revealed that the Chinese group exhibited significantly greater activity in the left supramarginal gyrus and posterior middle temporal gyrus relative to the English group for both tasks. These same two regions showed a leftward asymmetry in the stress task for the Chinese group only. For both language groups, rightward asymmetries were observed in the middle portion of the middle frontal gyrus across tasks. All task effects involved greater activity for the stress task as compared to intonation. A left-sided task effect was observed in the posterior middle temporal gyrus for the Chinese group only. Both language groups exhibited a task effect bilaterally in the intraparietal sulcus. These findings support the emerging view that speech prosody perception involves a dynamic interplay among widely distributed regions not only within a single hemisphere but also between the two hemispheres. This model of speech prosody processing emphasizes the role of right hemisphere regions for complex-sound analysis, whereas task-dependent regions in the left hemisphere predominate when language processing is required.
Article
Full-text available
In the present experiment, the authors tested Mandarin and English listeners on a range of auditory tasks to investigate whether long-term linguistic experience influences the cognitive processing of nonspeech sounds. As expected, Mandarin listeners identified Mandarin tones significantly more accurately than English listeners; however, performance did not differ across the listener groups on a pitch discrimination task requiring fine-grained discrimination of simple nonspeech sounds. The crucial finding was that cross-language differences emerged on a nonspeech pitch contour identification task: The Mandarin listeners more often misidentified flat and falling pitch contours than the English listeners in a manner that could be related to specific features of the sound structure of Mandarin, which suggests that the effect of linguistic experience extends to nonspeech processing under certain stimulus and task conditions.
Article
Full-text available
Whether or not categorical perception results from the operation of a special, language-specific, speech mode remains controversial. In this cross-language (Mandarin Chinese, English) study of the categorical nature of tone perception, we compared native Mandarin and English speakers' perception of a physical continuum of fundamental frequency contours ranging from a level to rising tone in both Mandarin speech and a homologous (nonspeech) harmonic tone. This design permits us to evaluate the effect of language experience by comparing Chinese and English groups; to determine whether categorical perception is speech-specific or domain-general by comparing speech to nonspeech stimuli for both groups; and to examine whether categorical perception involves a separate categorical process, distinct from regions of sensory discontinuity, by comparing speech to nonspeech stimuli for English listeners. Results show evidence of strong categorical perception of speech stimuli for Chinese but not English listeners. Categorical perception of nonspeech stimuli was comparable to that for speech stimuli for Chinese but weaker for English listeners, and perception of nonspeech stimuli was more categorical for English listeners than was perception of speech stimuli. These findings lead us to adopt a memory-based, multistore model of perception in which categorization is domain-general but influenced by long-term categorical representations.
Article
Full-text available
A hallmark of categorical perception is better discrimination of stimulus tokens from 2 different categories compared with token pairs that are equally dissimilar but drawn from the same category. This effect is well studied in speech perception and represents an important characteristic of how the phonetic form of speech is processed. We investigated the brain mechanisms of categorical perception of stop consonants using functional magnetic resonance imaging and a passive short-interval habituation trial design (Zevin and McCandliss 2005). The paradigm takes advantage of neural adaptation effects to identify specific regions sensitive to an oddball stimulus presented in the context of a repeated item. These effects were compared for changes in stimulus characteristics that result in either a between-category (phonetic and acoustic) or a within-category (acoustic only) stimulus shift. Significantly greater activation for between-category than within-category stimuli was observed in left superior temporal sulcus and middle temporal gyrus as well as in inferior parietal cortex. In contrast, only a subcortical region specifically responded to within-category changes. The data suggest that these habituation effects are due to the unattended detection of a phonetic stimulus feature.
Article
Full-text available
In tonal languages such as Mandarin Chinese, a lexical tone carries semantic information and is preferentially processed in the left brain hemisphere of native speakers as revealed by the functional MRI or positron emission tomography studies, which likely measure the temporally aggregated neural events including those at an attentive stage of auditory processing. Here, we demonstrate that early auditory processing of a lexical tone at a preattentive stage is actually lateralized to the right hemisphere. We frequently presented to native Mandarin Chinese speakers a meaningful auditory word with a consonant-vowel structure and infrequently varied either its lexical tone or initial consonant using an odd-ball paradigm to create a contrast resulting in a change in word meaning. The lexical tone contrast evoked a stronger preattentive response, as revealed by whole-head electric recordings of the mismatch negativity, in the right hemisphere than in the left hemisphere, whereas the consonant contrast produced an opposite pattern. Given the distinct acoustic features between a lexical tone and a consonant, this opposite lateralization pattern suggests the dependence of hemisphere dominance mainly on acoustic cues before speech input is mapped into a semantic representation in the processing stream.
Article
Arguments continue over categorical perception of phonetic segments versus continuous perception. In tone languages, phonemic tones are characterized principally by F0 contours. Theories of categoricity would predict continuous perception of tones; i.e., there ought to be no peaks in the discrimination of variants at category boundaries, more as in experiments with isolated synthetic vowels than with synthetic stop consonants. This was the outcome of earlier work [A. S. Abramson, J. Acoust. Soc. Am. 33, 842 (1961)], but it seems to have been contradicted by recent work on Mandarin [S. W. Chan, C‐K. Chuang and W. S‐Y. Wang, J. Acoust. Soc. Am. 58, S119 (A) (1975)]. The present study involves a considerably larger number of subjects, 34 native speakers of Thai, a language with five phonemic tones. Sixteen flat F0 variants synthesized on a syllable of the type [kha:] were sorted into the three “static” high, mid and low tones with considerable overlap. Discrimination tests yielded a high level of discrimination across the continuum with no effects of boundaries between categories, thus implying noncategorical perception of tone categories.
Article
Ss can discriminate phonemes presented singly and in random order. Ss discriminated better between speech sounds to which they have attached different phonemic labels than between sounds which they normally put in the same phoneme class. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
A series of thirteen two-formant vowels was synthesized and used as the basis of labelling and discrimination tests with a group of English-speaking listeners. The sounds varied only in their position in the F1/F2 plane, and the resulting vowel qualities were such that listeners found no difficulty in assigning each sound to one of three phonemic categories, those of the vowels in bid, bed and bad. The results of the tests were compared with those previously obtained in experiments involving the consonant phonemes /b, d, g/. It appears from the data that the phoneme boundaries in the case of the three vowel phonemes are less sharply defined than in the case of the stop consonants. The labelling functions for the vowels show a gradual slope and the discrimination functions do not show any marked increase in sensitivity to change in the region of the phoneme boundaries. It is clear also that the listeners were able to discriminate differences very much smaller than would need to be distinguished simply in order to place vowels in the appropriate category. The results show further that the effect of sequence or acoustic context in the perception of vowels is very considerable. In all the aspects examined in these experiments, the perception of synthetic vowels is found to be different from that of synthetic stop consonants. These differences lend some support to the hypothesis that the degree of articulatory discontinuity between sounds may be correlated with the sharpness of the phonemic boundaries that separate them.
Article
To test the effect of linguistic experience on the perception of a cue that is known to be effective in distinguishing between [r] and [l] in English, 21 Japanese and 39 American adults were tested on discrimination of a set of synthetic speech-like stimuli. The 13 “speech” stimuli in this set varied in the initial stationary frequency of the third formant (F3) and its subsequent transition into the vowel over a range sufficient to produce the perception of [ra] and [la] for American subjects and to produce [ra] (which is not in phonemic contrast to [la]) for Japanese subjects. Discrimination tests of a comparable set of stimuli consisting of the isolated F3 components provided a “nonspeech” control. For Americans, the discrimination of the speech stimuli was nearly categorical, i.e., comparison pairs which were identified as different phonemes were discriminated with high accuracy, while pairs which were identified as the same phoneme were discriminated relatively poorly. In comparison, discrimination of speech stimuli by Japanese subjects was only slightly better than chance for all comparison pairs. Performance on nonspeech stimuli, however, was virtually identical for Japanese and American subjects; both groups showed highly accurate discrimination of all comparison pairs. These results suggest that the effect of linguistic experience is specific to perception in the “speech mode.”
Article
Event-related brain potentials (ERP) were recorded to infrequent changes of a synthesized vowel (standard) to another vowel (deviant) in speakers of Hungarian and Finnish language, which are remotely related to each other with rather similar vowel systems. Both language groups were presented with identical stimuli. One standard-deviant pair represented an across-vowel category contrast in Hungarian, but a within-category contrast in Finnish, with the other pair having the reversed role in the two languages. Both within- and across-category contrasts elicited the mismatch negativity (MMN) ERP component in the native speakers of either language. The MMN amplitude was larger in across- than within-category contrasts in both language groups. These results suggest that the pre-attentive change-detection process generating the MMN utilized both auditory (sensory) and phonetic (categorical) representations of the test vowels.