Wei Zhang

  • Doctor of Philosophy
  • PhD Student at McGill University

About

24
Publications
3,205
Reads
30
Citations
Current institution
McGill University
Current position
  • PhD Student

Publications

Article
Full-text available
The prosody of an utterance encodes multiple types of information simultaneously, including information status of constituents—for example, by modulations in prosodic prominence to encode focus—and information about syntactic constituent structure—by modulations of prosodic phrasing. According to many prosodic theories, however, focus and constitue...
Preprint
The perception of the Mandarin flat-falling tonal contrast has been proposed to be guided by both linguistic categories, expected in native speakers, and psychophysical categories, relevant to all speakers. A recent study has shown that phonetic imitation of Mandarin tones is mediated by categories, but it remains unclear whether this non-linear imitat...
Article
Phonetic imitation is mediated by phonological contrast, as evident in features such as formants, VOT, and F0. However, a recent study observed that duration imitation was not mediated by phonological contrast. In contrast to other studies, duration served as a non-primary cue to the phonological contrast in this recent study. The current study furt...
Article
The acoustic cues for prosodic prominence have been explored extensively, but one open question is to what extent they differ by context. This study investigates the extent to which vowel type affects how acoustic cues are related to prominence ratings provided in a corpus of spoken Mandarin. In the corpus, each syllable was rated as either promine...
Article
Phonetic imitation has been found to be mediated by phonological contrast. For features whose values vary around a phonological prototype, the imitation is distorted by the phonological category, i.e., the imitation is non-linear. This phonological mediation effect was mostly found in segmental features such as VOT and formants. Supra-segmental fea...
Article
This study investigates how prosodic prominence mediates the perception of American English vowels, testing the effects of F0 and duration. In Experiment 1, the perception of four vowel continua varying in duration and formants (high: /i-ɪ/, /u-ʊ/, non-high: /ɛ-æ/, /ʌ-ɑ/) was examined under changes in F0-based prominence. Experiment 2 tested if c...
Conference Paper
Full-text available
Taiwanese Southern Min (TSM) has a checked versus unchecked tonal contrast, the perception of which has received limited investigation. This study examined the role of duration in perceiving this contrast in TSM, for both the mid and high registers. Identification experiments were carried out on stimuli sampled from an 'All-cue' Continuum where all cues were p...
Conference Paper
Full-text available
Prosodic prominence modulates vowel production and acoustics. In the present study we test if the same effects play out in perception. We test perception of four American English vowel contrasts, varying formants and duration along a continuum. We test how F0-based prominence modulates vowel categorization, and explore the role of vowel duration as...
Article
It is well known that rising/falling pitch is employed to distinguish the rising (R) or falling (F) tones from the high-level (H) tone in Mandarin, but whether F0 range or F0 slope is the more critical F0 cue to perception is still inconclusive. To clarify this issue quantitatively, we took the F tone as the test case, and conducted two-alter...
Article
Full-text available
Featured Application: This work could be used in phonetic analysis from a speaker’s limited speech, for example, in helping with the speech analysis of new users for a language learning application. Abstract: From a very brief speech, human listeners can estimate the pitch range of the speaker and normalize pitch perception. Spectral features which...
Conference Paper
Introduction: Phonetic imitation has been found to be affected by several factors. • Phonological contrast effect: Imitation of phonologically ambiguous stimuli is inaccurate or inhibited (Nielsen, 2011; Kim & Clayards, 2019) • Feature type effect: Supra-segmental features (e.g. F0) are easier to imitate than segmental features (e.g. formants) (Sato et al., 2...
Conference Paper
As a crucial perceived trait of speech, prominence is associated with various communicative functions. The acoustic cues for prominence have been explored extensively, but how the cue weighting pattern varies in different contexts needs a closer examination. This paper investigates how Mandarin tones affect the cue weighting pattern of prominence....
Article
Full-text available
Detecting pronunciation erroneous tendency (PET) can provide detailed instructive feedback for second language learners in computer-aided pronunciation training (CAPT). In this paper, we proposed to apply soft targets from various models to improve the detection performance of PET. First, we examined the effectiveness of soft targets in three singl...
Article
Pitch-range estimation from brief speech segments could benefit many tasks such as automatic speech recognition and speaker recognition. To estimate pitch range, previous studies have proposed to utilize deep-learning-based models with spectrum information as input. They demonstrated that such a method works and could still achieve reliable es...
Conference Paper
Full-text available
In human speech communication, pitch can be normalized automatically by listeners through a subjective estimation of the speaker’s overall pitch range, even from a very brief speech input. In speech technologies, pitch range used to be estimated by direct analysis of F0 values from a lengthy speech input, but a reliable estimation from a brief spee...
Conference Paper
Full-text available
Detecting pronunciation erroneous tendency (PET) can provide detailed instructive feedback for second language learners in computer-aided pronunciation training (CAPT). In this paper, we utilize soft targets with knowledge from various models for improving the detection performance of PET. First, we examined the effectiveness of soft targets in thr...
Conference Paper
Full-text available
Prosodic strength refers to the relative prominence of each syllable in continuous speech. It used to be annotated at several degrees by perception, but perceptual annotation is highly subjective and cannot give continuous values that are potentially useful in speech technologies. This study proposed a method to estimate prosodic strength of each s...
Conference Paper
Automatic prosodic boundary detection and annotation are important for both speech understanding and natural speech synthesis. Manual annotation of prosodic boundary labels is laborious and time-consuming. In this paper, from the perspective of interaction of adjacent tones, we proposed a method to automatically detect prosodic boundaries based on...
