
Joan A Sereno
University of Kansas | KU · Department of Linguistics
PhD
Publications: 113
Reads: 29,119
Citations: 3,304
Publications (113)
Clearly articulated speech, relative to plain-style speech, has been shown to improve intelligibility. We examine if visible speech cues in video only can be systematically modified to enhance clear-speech visual features and improve intelligibility. We extract clear-speech visual features of English words varying in vowels produced by multiple mal...
The current study investigated the merger-in-progress between word-initial nasal and lateral consonants in Fuzhou Min, examining the linguistic and social factors that modulate the merger. First, the acoustic cues to the l-n distinction were examined in Fuzhou Min. Acoustic analyses suggested a collapse of phonemic contrast between prescriptive L a...
Previous research indicates nonnative listeners may have an advantage at understanding nonnative speech of talkers with the same L1 due to shared interlanguage knowledge. The present study offers a comprehensive analysis of various factors that may modulate this advantage, including the proficiency of interlocutors, the ma...
We examine the acoustic characteristics of clear and plain conversational productions of Mandarin tones. Twenty-one native Mandarin speakers were asked to produce a selection of Mandarin words in both plain and clear speaking styles. Several tokens were gathered for each of the four tones giving a total of 2045 productions. Six critical tonal cues...
Phonological alternations pose challenges for models of spoken word recognition in how surface information is mapped onto stored representations in the lexicon. In the current study, an auditory-auditory priming lexical decision experiment was conducted to investigate the alternating representations of Mandarin Tone 3 in both half-third and third t...
Differentially accenting syllables within a word gives rise to lexical stress, with one syllable in a word more dominant than others. Languages differ in their phoneme inventories as well as their suprasegmental properties, with lexical contrasts in language signaled not only by segmental differences (e.g., vowel quality) but also by distinct supra...
This paper reports on a comprehensive phonetic study of American classroom learners of Russian, investigating the influence of the second language (L2) on the first language (L1). Russian and English productions of 20 learners were compared to 18 English monolingual controls focusing on the acoustics of word-initial and word-final voicing. The resu...
This study examines the acoustic characteristics of clear and plain conversational productions of Mandarin tones. Twenty-one native Mandarin speakers were asked to produce a selection of Mandarin words in both plain and clear speaking styles. Several tokens were gathered for each of the four tones giving a total of 2045 produ...
Research has shown that facial articulatory cues aid speech perception. However, how such cues are employed in different speech styles remains unclear. This study examined facial articulatory features of Mandarin tones in clear versus conversational speech styles produced by 20 native Mandarin speakers. Using computer-visi...
The Interlanguage Speech Intelligibility Benefit for talkers (ISIB-T) claims non-native learners are better at understanding talkers with a shared L1 than they are at understanding native talkers, and the ISIB for listeners (ISIB-L) claims non-native learners are better at understanding talkers with a shared L1 than native...
This chapter surveys the role of visual cues in Chinese lexical tone production and perception, addressing the extent to which visual information involves either linguistically relevant cues to signal tonal category distinctions or is attention-grabbing in general. Specifically, the survey summarizes research findings on which visual facial cues ar...
The effect of clear speech on the integration of auditory and visual cues to the tense-lax vowel distinction in English was investigated in native and non-native (Mandarin) perceivers. Clear speech benefits for tense vowels /i, ɑ, u/ were found for both groups across modalities, while lax vowels /ɪ, ʌ, ʊ/ showed a clear speech disadvantage for both...
Using computer-vision and image processing techniques, we aim to identify specific visual cues as induced by facial movements made during monosyllabic speech production. The method is named ADFAC: Automatic Detection of Facial Articulatory Cues. Four facial points of interest were detected automatically to represent head, eyebrow and lip movements:...
Phonological alternation (sound change depending on the phonological environment) poses challenges to spoken word recognition models. Mandarin Chinese T3 sandhi is such a phenomenon in which a tone 3 (T3) changes into a tone 2 (T2) when followed by another T3. In a mismatch negativity (MMN) study examining Mandarin Chinese T3 sandhi, participants p...
This study aims to characterize distinctive acoustic features of Mandarin tones based on a corpus of 1025 monosyllabic words produced by 21 native Mandarin speakers. For each tone, 22 acoustic cues were extracted. Besides standard F0, duration, and intensity measures, further cues were determined by fitting two mathematical functions to the pitch c...
Research shows that acoustic modifications in clearly enunciated fricative consonants (relative to the plain, conversational productions) facilitate auditory fricative perception, particularly for auditorily salient sibilant fricatives and for native perception. However, clear-speech effects on visual fricative perception have received less attenti...
Using computer-vision and image processing techniques, we aim to identify specific visual cues as induced by facial movements made during Mandarin tone production and examine how they are associated with each of the four Mandarin tones. Audio-video recordings of 20 native Mandarin speakers producing Mandarin words involving the vowel /ɜ/ with each...
Typological studies have shown that there are more falling tones than rising tones in tone languages, including Chinese. We test the hypothesis that this may be due to a perceptually-based advantage for falling tones over rising tones. Two acoustically comparable (and matched for naturalness) tonal continua in Mandarin (level-falling T1-T4, and lev...
This study investigates the role of language background and bilingual status in the perception of foreign lexical tones. Eight groups of participants, consisting of children of 6 and 8 years from one of four language background (tone or non-tone) × bilingual status (monolingual or bilingual) groups: Thai monolingual, English monolingual, English-Thai bilin...
Using mathematical modeling, this study aims to characterize distinctive acoustic features of Mandarin tones based on a corpus of 1013 monosyllabic words produced by 21 native Mandarin speakers. For each tone, 22 acoustic cues were extracted. Besides standard F0, duration, and intensity measures, further cues were determined by fitting two mathemat...
We aim to identify visual cues resulting from facial movements made during Mandarin tone production and examine how they are associated with each of the four tones. We use signal processing and computer vision techniques to analyze audio-video recordings of 21 native Mandarin speakers uttering the vowel /ɜ/ with each tone. Four facial interest poin...
Speech communication can adopt different styles as a function of speaking environments and communicative needs. In auditorily or visually challenging contexts, speakers often alter their speech production using a clarified, hyper-articulated speech style with the intention of enhancing speech intelligibility. Such modifications may result in percep...
Speech perception involves multiple input modalities. Research has indicated that perceivers establish cross-modal associations between auditory and visuospatial events to aid perception. Such intermodal relations can be particularly beneficial for speech development and learning, where infants and non-native perceivers need additional resources to...
Clearly enunciated speech (relative to conversational, plain speech) involves articulatory and acoustic modifications that have been shown to enhance segmental intelligibility. However, little research has explored clear-speech effects on the perception of suprasegmental properties such as lexical tone, particularly involving visual (facial) percep...
Research shows that acoustic modifications in clearly enunciated fricative consonants (relative to the plain, conversational productions) facilitate auditory fricative perception. However, clear-speech effects on visual fricative perception have received less attention. A comparison of auditory and visual (facial) clear-fricative perception is part...
Previous studies on tones suggest that Mandarin listeners are more sensitive to pitch direction and slope while English listeners primarily attend to pitch height. In this study, just noticeable differences were established for pitch discrimination using a three-interval, forced-choice procedure with a two-down, one-up staircase design. A high risi...
Using fetal biomagnetometry, this study measured changes in fetal heart rate to assess discrimination of two rhythmically different languages (English and Japanese). Two-minute passages in English and Japanese were read by the same female bilingual speaker. Twenty-four mother-fetus pairs (mean gestational age=35.5 weeks) participated. Fetal magneto...
Studies on acoustic and visual characteristics of English tense and lax vowels show consistent enhancement of tensity contrasts in clear speech. However, the degree to which listeners utilize these enhancements in speech perception remains unclear. The present study addresses this issue by testing speech style effects on tense and lax vowel percept...
Previous studies suggest that Chinese listeners may be more sensitive to pitch direction while American listeners primarily attend to pitch height. The present study sought to establish JNDs for pitch discrimination using a three-interval, forced-choice procedure with a two-down, one-up staircase design (cf. Liu, JASA 2013). We manipulated a high r...
Phonological alternation, in which a sound changes depending on its phonological environment, poses challenges to spoken word recognition models. Mandarin T3 sandhi is such a phenomenon in which a tone 3 changes into a tone 2 when followed by another T3, which raises questions regarding whether the human brain processes the surface acoustic-phoneti...
Speech perception involves multiple input modalities. Research has indicated that perceivers may establish a cross-modal association between auditory and visual-spatial events to aid perception. Such intermodal relations can be particularly beneficial for non-native perceivers who need additional resources to process challenging new sounds. This st...
Clearly produced vowels exhibit longer duration and more extreme spectral properties than plain, conversational vowels. These features also characterize tense relative to lax vowels. This study explored the interaction of clear-speech and tensity effects by comparing clear and plain productions of three English tense–lax vowel pairs (/i-ɪ/, /ɑ-ʌ/,...
The present study examined lexical stress patterns in Uyghur, a Turkic language. The main goal of this research was to isolate and determine which acoustic parameters provide cues to stress in Uyghur. A number of studies have investigated the phonetic correlates of lexical stress across the world's languages, with stressed syllables often longer in...
One basic feature of the Arabic script is its semicursive style: some letters are connected to the next, but others are not, as in the Uyghur word ياخشى /ya xʃi/ ("good"). None of the current orthographic coding schemes in models of visual-word recognition, which were created for the Roman script, assign a differential role to the coding of within...
Phonological alternation poses problems for spoken word recognition. In Mandarin Tone 3 sandhi, a Tone 3 syllable changes to a Tone 2 syllable when followed by another Tone 3 syllable. A traditional phonological account assumes that the initial syllable of Mandarin disyllabic sandhi words is Tone 3 (T3) underlyingly, but becomes Tone 2 (T2) on the...
This study investigated the relationship between clearly produced and plain citation form speech styles and motion of visible articulators. Using state-of-the-art computer-vision and image processing techniques, we examined both front and side view videos of speakers' faces while they recited six English words (keyed, kid, cod, cud, cooed, could) c...
The acoustic features of clearly produced vowels have been widely studied, but a less explored area concerns the difference in the adaptations of tense and lax clear vowels. This study explored the clear production of three pairs of English tense and lax vowels (/i-ɪ/, /ɑ-ʌ/, /u-ʊ/) to determine whether tense vowels show a larger clear versus conve...
Training has been shown to improve American English speakers’ perception and production of the Spanish /ɾ, r, d/ contrast; however, it is unclear whether successfully trained contrasts are encoded in the lexicon. This study investigates whether learners of Spanish process the /ɾ, r, d/ contrast differently than native speakers and whether training...
The present study examines the relative impact of segments and intonation on accentedness, comprehensibility, and intelligibility, specifically investigating the separate contribution of segmental and intonational information to perceived foreign accent in Korean-accented English. Two English speakers and two Korean speakers recorded 40 English sen...
In Mandarin, tone 3 sandhi is a tonal alternation phenomenon in which a tone 3 syllable changes to a tone 2 syllable when it is followed by another tone 3 syllable. Thus, the initial syllable of Mandarin bisyllabic sandhi words is tone 3 underlyingly but becomes tone 2 on the surface. An auditory-auditory priming lexical decision experiment was con...
Sound symbolism is a concept in which the sound of a word and the meaning of the word are systematically related. The current study investigated whether the voicing contrast between voiced /d, g, z/ and voiceless /t, k, s/ consonants systematically affects categorization of Japanese mimetic stimuli along a number of perceptual and evaluative dimens...
The Silver Medal is presented to individuals, without age limitation, for contributions to the advancement of science, engineering, or human welfare through the application of acoustic principles, or through research accomplishment in acoustics.
We examined how letter position coding is achieved in a script (Arabic) in which the different letter forms (i.e., allographs) may vary depending on their position within the letter string (e.g., compare the same-ligation pair [...] and [...] vs. the different-ligation pair [...] and [...]). To that end, we conducted an experiment in Uyghur, an agglutinative langu...
Two priming experiments examined the separate contribution of lexical tone and segmental information in the processing of spoken words in Mandarin Chinese. Experiment 1 contrasted four types of prime-target pairs: tone-and-segment overlap (ru4-ru4), segment-only overlap (ru3-ru4), tone-only overlap (sha4-ru4) and unrelated (qin1-ru4) in an auditory...
Spoken words carry linguistic and indexical information to listeners. Abstractionist models of spoken word recognition suggest that indexical information is stripped away in a process called normalization to allow processing of the linguistic message to proceed. In contrast, exemplar models of the lexicon suggest that indexical information is retai...
This study investigates the effectiveness of three high variability training paradigms in training 42 speakers of American English to correctly perceive and produce Spanish intervocalic /d, ɾ, r/. Since Spanish spirantization and English flapping both affect /d/ intervocalically, the acquisition of the /d/-/ɾ/ contrast proves difficult for English...
Previous research established that high variability training improves both perception and production of novel L2 contrasts and that training noncontrastive sounds in subjects' L1 results in increased MMN responses to those sounds. However, it is unclear whether training novel contrasts in an L2 also results in increased amplitude of MMN responses t...
Two priming experiments examined the separate contribution of lexical tone and segmental information in the processing of spoken words in Mandarin Chinese. Experiment 1 contrasted four types of prime-target pairs: tone-and-segment overlap (bo1-bo1), segment-only overlap (bo2-bo1), tone-only overlap (zhua1-bo1), and unrelated (han3-bo1) in an audito...
The present research examines how perceptual constancy is achieved by exploring three fundamental sources of acoustic variability: changes in the rate at which a sentence is spoken, differences due to which speaker produced the sentence, and variations resulting from the sentence context. These sources of variability are investigated by examining t...
Acoustic and perceptual effects of emphasis, a secondary articulation in the posterior vocal tract, were investigated in Urban Jordanian Arabic. Twelve speakers of Jordanian Arabic recorded both consonants and vowels of monosyllabic minimal CVC pairs containing plain or emphatic consonants in initial and final position to investigate the extent of...
This paper presents an acoustic and perceptual study of alveolar flaps in American English. In the acoustic study, vowel duration differences in disyllabic tokens replicated previous findings in that vowels preceding /d/ were significantly longer than those preceding /t/. Flap frequency was also analyzed based on a method of distinguishing flapped...
The present study investigates the extent of word-final devoicing in Russian for three groups of speakers: monolingual native Russian speakers (4 Ss), native Russian speakers with knowledge of English (7 Ss), and American English learners of Russian (9 Ss). Thirty-four minimal pairs of Russian words differing in the underlying voicing of word-final...
Background: Numerous studies have demonstrated that the negative effect of noise and other distortions on speech understanding is greater for older adults than for younger adults. Anecdotal evidence suggests that older adults may also be disproportionately negatively affected by foreign accent. While two previous studies found no interaction betwe...
The present research investigates the effects of variation in speaking rate on the production of Mandarin tone. Fourteen speakers (6M, 8F) produced 15 syllables, each with the four different Mandarin tones. Speakers produced these syllables at fast, normal, and slow speaking rates in isolation. To induce rate change, the 60 target words were presen...
In a previous experiment [Ferguson et al., J. Acoust. Soc. Am. 118, 1932 (2005)], young listeners with normal hearing, older adults with essentially normal hearing, and older adults with hearing loss were found to be similarly affected by the presence of a foreign accent on a word identification task in various listening conditions. This result sta...
Neighborhood density refers to the number of words that sound similar to a given word. Previous studies have found that neighborhood density influences the recognition of spoken words (Luce & Pisoni, 1998); however, this work has focused almost exclusively on monosyllabic words in English. To investigate the effects of neighborhood density on longe...
English-speaking learners of Spanish often fail to achieve nativelike pronunciation of the tap-trill distinction in words like caro "expensive" and carro "car." The trill proves difficult because it is neither a phoneme nor an allophone in English. Although the tap exists as an allophone of t and d in American English, learners of Spanish must lear...
The present study investigates the role of F0, duration, and vowel quality in English lexical stress perception by second language learners with a tonal L1. Mandarin speakers learning English as a second language (advanced learners, n=25; beginning learners n=25) were compared to a control group of native English speakers (n=25). Resynthesized disy...
Acoustic characteristics of stress were examined in second language learners’ productions of English lexical stress. Fourteen minimal pairs of disyllabic English nouns and verbs contrasting in stress pattern (trochees and iambs; e.g., OBject and obJECT) were recorded. Eighteen Chinese learners of English (NNSs) (nine advanced, nine basic) and ten n...
This study addressed whether acoustic variability and category overlap in non-native speech contribute to difficulty in its recognition, and more generally whether the benefits of exposure to acoustic variability during categorization training are stable across differences in category confusability. Three experiments considered a set of Spanish-acc...
Individuals who speak English as a second language vary in their ability to produce appropriate stress, which often impedes their intelligibility. The present study investigated the production of lexical stress by native speakers of English as well as learners of English. Minimal pairs were recorded by 8 native speakers of English and 8 Arabic lear...
This research explores the representation and access of lexical form during spoken word recognition. Two experiments were conducted examining segmental priming effects. In the first set of experiments, a single fricative segment functioned as a prime to targets with either a matching or mismatching fricative in initial (Experiment 1a) or final (Exp...
The present study employs event related potentials (ERPs) to verify the utility of using electrophysiological measures to study developmental questions within the field of language comprehension. Established ERP components (N400 and P600) that reflect semantic and syntactic processing were examined. Fifteen adults and 14 children (ages 8-13) proces...
The authors conducted 4 experiments to test the decision-bound, prototype, and distribution theories for the categorization of sounds. They used as stimuli sounds varying in either resonance frequency or duration. They created different experimental conditions by varying the variance and overlap of 2 stimulus distributions used in a training phase...
This study, following up on work on Dutch by Warner, Jongman, Sereno, and Kemps (2004. Journal of Phonetics, 32, 251–276), investigates the influence of orthographic distinctions and underlying morphological distinctions on the small sub-phonemic durational differences that have been called incomplete neutralization. One part of the previous work i...
In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expect...
This study investigated hemispheric lateralization of Mandarin tone. Four groups of listeners were examined: native Mandarin listeners, English–Mandarin bilinguals, Norwegian listeners with experience with Norwegian tone, and American listeners with no tone experience. Tone pairs were dichotically presented and listeners identified which tone they...
Words which are expected to contain the same surface string of segments may, under identical prosodic circumstances, sometimes be realized with slight differences in duration. Some researchers have attributed such effects to differences in the words’ underlying forms (incomplete neutralization), while others have suggested orthographic influence an...
Functional magnetic resonance imaging was employed before and after six native English speakers completed lexical tone training as part of a program to learn Mandarin as a second language. Language-related areas including Broca's area, Wernicke's area, auditory cortex, and supplementary motor regions were active in all subjects before and after tra...
Either orthographic distinctions or underlying morphological distinctions might cause incomplete neutralization. Much previous work has found small durational differences between pairs such as German "Rad" "wheel" and "Rat" "advice," which are traditionally said to be neutralized to [rat] in both cases. Incomplete neutralization effects are...
Training American listeners to perceive Mandarin tones has been shown to be effective, with trainees' identification improving by 21%. Improvement also generalized to new stimuli and new talkers, and was retained when tested six months after training [Y. Wang et al., J. Acoust. Soc. Am. 106, 3649-3658 (1999)]. The present study investigates whether...
The present study investigates the perception of foreign-accented speech. It seeks to address the issue of whether systematic exposure to a representative sample of speech from a specific foreign accent improves comprehension of that accent. In this study, perception of Spanish-accented English was examined before and after a training regimen that...