Olga Dmitrieva, Wai Ling Law, Mengxi Lin, Yuanyuan Wang, Jenna Conklin, Ashley Kentner
Purdue University
{odmitrie, wlaw, lin211, wang861, jconkli, akentner}
The study examines a number of acoustic properties
of non-native speech directed to a native speaker, a
non-native speaker with a shared first language
background, and a non-native speaker with a different
first language. Results demonstrate that the
interlocutor condition interacts with the language
attitudes factor: Participants with more positive
attitudes towards their second language (English)
differ along several acoustic dimensions from
participants with more positive attitudes towards their
first language (Mandarin), especially when
interacting with native speakers of English. Expanded
vowel space, higher articulation rate, and increased
pitch adopted by English-oriented participants in
interactions with native speakers of English may be
indicative of their greater positive emotional
involvement in the interaction.
Keywords: non-native speech, language attitudes,
listener-oriented, vowel space, rate of speech
It has been known for some time that speakers can
adjust the acoustic characteristics of their speech to
accommodate the communicative needs of the
listeners. The speaking style directed at increasing
speech intelligibility, dubbed ‘clear speech’, has been
shown to be adopted by native speakers in specific
communicative settings, for example in the presence
of noise or when addressing hearing-impaired
listeners [11]. Clear speech is typically characterized
by a decrease in the rate of speech, higher pitch and
expanded pitch range, an increase in vowel duration,
and expanded vowel space. Other populations of
listeners who elicit similar adaptations in native
speech include foreigners, infants and young
children, and even pets [5], [12], [14].
Less is known about listener-oriented speaking
style adaptations that may occur in non-native (L2)
speech. Perceptual studies demonstrate that clear
speech produced by proficient L2 speakers leads to
intelligibility benefits comparable to those produced
by native speakers’ clear speech [13]. There is also
some evidence that different pairings of native and
non-native interlocutors may result in changes in
speaking style, as assessed via the differences in
resulting intelligibility and degree of phonetic
convergence [7], [8], [13], [15]. These findings
suggest that non-native speakers are not only able to
modify their speech ‘at will’; they may also do so in
the absence of explicit instructions (e.g. to speak more
clearly), in spontaneous response to a change of
listener and that listener’s perceived communicative needs.
The present study investigates the possible effects
that the change of the interlocutor characteristics in
terms of native language background may have on the
acoustic properties of non-native speech. More
specifically, we test the hypothesis that non-native
speakers may choose to speak ‘more clearly’ to a
particular group of listeners: those whom the speakers
expect to experience the greatest
intelligibility-related difficulty with their accented
speech. Listeners with whom the speakers do not share
a common native language and those who have less
exposure to non-native speech may belong to this
group.
However, listener-oriented adaptations in non-
native speech may also be modulated by speakers’ L2
proficiency levels and attitudes towards their second
language. In particular, participants with a more
positive attitude to their second language and a
greater motivation to be perceived as a proficient
speaker by native listeners may choose to speak more
clearly when addressing native speakers of their
second language.
We address these questions by examining the
acoustic characteristics of non-native speech
addressed to native and non-native listeners (those
with the same and different L1 backgrounds) in light
of speakers’ language attitudes.
2.1. Participants
Thirteen participants (5 women, 8 men) have taken part in
the study to date. All were native speakers of
Mandarin from the same dialectal area: Northern
regions of mainland China (north of Yangtze River).
Participants were recruited on the campus of a major
Midwestern university and received payment for their
participation. All participants completed a post-test
questionnaire, adapted from [3] and [10], with
detailed questions concerning their proficiency in
their second language (English, self-rated), amount of
first and second language use (in hours per week), quality
of linguistic interactions (with native vs. non-native
speakers of the language), and language attitudes
(how much importance they attach to being perceived
as a proficient/authentic speaker of their native vs.
non-native language). Only language attitude results
are discussed in this paper.
Three confederates (all women) served as the
conversation partners in the experimental sessions.
The first confederate was a native speaker of
Mandarin (a non-native speaker with the same L1
background as the participants); the second
confederate was a native speaker of the Midwestern
dialect of American English (a native speaker of the
participants’ L2); the third confederate was a native
speaker of Russian (a non-native speaker with a
different L1 background). Both non-native
confederates learned English as a second language in
adolescence/adulthood and spoke noticeably
accented English. The participants were also notified
of the confederates’ native language backgrounds
during the introductions that preceded the experimental
sessions. All confederates are authors of this paper.
2.2. Materials
Three different versions of a map, similar to those
found in the HCRC Map Task Corpus [1], were
created for the experiment. Each map contained the
same 13 labeled landmarks, which were arranged in a
different order and connected by a different route
on each map. In each interaction, the participant and
the confederate were given copies of the same map;
however, the confederate’s copy did not show the route.
2.3. Procedure
After participants had given informed consent, they
were instructed to complete the map task three times.
For each task, participants were instructed to explain
the route on their map to the task partner (one of the
three confederates) such that the partner could
replicate the route on their map. Participants were
informed that their task partner did not have the
route drawn on their map. Confederates were
presented as fellow participants in order to allow for
the most natural interaction possible and avoid any
formality that may have been induced by speaking
knowingly with an experimenter. During the map
task, participants were seated across the table from
the confederate in a sound-attenuated booth. A
custom divider prevented the task partners from seeing
each other’s maps but did not interfere with visual
contact. The order of interactions with the three
confederates was counterbalanced across
participants. Each interaction lasted about 10
minutes, and the entire experiment lasted between 30 and
40 minutes. The participants’ and the confederates’
voices were recorded digitally on separate channels.
2.4. Measurements
The participants’ recordings were manually
annotated for the syllable and stressed vowel
boundaries in the target words (map landmarks). The
values of the first two formant frequencies (F1 and
F2) at the midpoint of the vowel were collected. Formant
values were examined for outliers and corrected
manually where necessary.
Average pitch per syllable was also obtained using
an autocorrelation pitch tracking algorithm. Outlying
values due to pitch tracking errors were removed
from the analysis. All the annotations and
measurements were done in Praat [4].
Vowel space between the point vowels [i] [æ] [u]
and [ɑ] was calculated by adding the areas of the two
triangles, that between vowels [i] [æ] and [u] and that
between [u] [æ] and [ɑ]. The areas of the triangles
were found using the formula in (1), where x
corresponds to the F1 value, y corresponds to the F2
value, and A, B, and C stand for the three point
vowels.
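The two-triangle calculation can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that formula (1) is the standard coordinate formula for the area of a triangle, area = |x_A(y_B - y_C) + x_B(y_C - y_A) + x_C(y_A - y_B)| / 2, with x = F1 and y = F2 in Hz. The function names and the formant values are hypothetical.

```python
def triangle_area(a, b, c):
    """Area of a triangle whose vertices are (F1, F2) pairs.

    Assumption: this is the standard coordinate formula; the paper's
    formula (1) is taken to be equivalent.
    """
    (xa, ya), (xb, yb), (xc, yc) = a, b, c
    return abs(xa * (yb - yc) + xb * (yc - ya) + xc * (ya - yb)) / 2.0


def vowel_space_area(i, ae, u, a):
    """Sum of the [i]-[ae]-[u] and [u]-[ae]-[a] triangle areas (Hz^2)."""
    return triangle_area(i, ae, u) + triangle_area(u, ae, a)


# Hypothetical mean (F1, F2) values in Hz, for illustration only.
example = {
    "i": (300.0, 2300.0),
    "ae": (700.0, 1800.0),
    "u": (350.0, 1000.0),
    "a": (750.0, 1100.0),
}
area = vowel_space_area(example["i"], example["ae"], example["u"], example["a"])
print(area)
```

Because the two triangles share the [u]-[æ] edge, their sum covers the quadrilateral [i]-[æ]-[ɑ]-[u] as long as [i] and [ɑ] fall on opposite sides of that edge, which is the usual configuration for these point vowels.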
Articulation rate was calculated by dividing the
number of syllables in each participant’s response
(estimated as the number of vocalic segments) by the
participant’s phonation time (total response time
minus silence time).
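As an illustration of this definition, the sketch below computes articulation rate in syllables per second. The function name and the example numbers are hypothetical; how silences were detected is not specified in the text, so silence time is simply passed in as a value.

```python
def articulation_rate(n_vocalic_segments, total_time_s, silence_time_s):
    """Syllables per second of phonation time.

    The syllable count is approximated by the number of vocalic
    segments, as described in the text. Phonation time is the total
    response time minus the silence time.
    """
    phonation_time_s = total_time_s - silence_time_s
    if phonation_time_s <= 0:
        raise ValueError("phonation time must be positive")
    return n_vocalic_segments / phonation_time_s


# Hypothetical values: 1200 vocalic segments, 600 s of response time,
# 300 s of which is silence.
rate = articulation_rate(1200, 600.0, 300.0)
print(rate)  # syllables per second
```

Excluding silence from the denominator is what distinguishes articulation rate from overall speaking rate, so a speaker who pauses more is not automatically counted as articulating more slowly.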
Attitude ratio was calculated based on
participants’ responses to the Language Attitudes part
of the questionnaire. Participants rated statements
such as “I identify with an
English/Mandarin-speaking culture” and “I want
others to think I am a native/proficient speaker of
English/Mandarin” on a 6-point scale (0-strongly
disagree, 6-strongly agree). The point totals for
English-related statements and Mandarin-related
statements were obtained and a Mandarin/English
attitudes ratio (AR) was calculated. A ratio of 1
indicated that the participant valued the authenticity
of their Mandarin- and English-speaking identities
equally. A ratio lower than 1 indicated a greater value
of the English-speaking identity, while a ratio greater
than 1 indicated a greater value of the
Mandarin-speaking identity.
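The attitude ratio can be sketched as below. The function name and the example ratings are hypothetical, and the sketch assumes that each language's total is a plain sum of the ratings given to that language's statements.

```python
def attitude_ratio(mandarin_ratings, english_ratings):
    """Mandarin/English attitudes ratio (AR).

    Each argument is a list of ratings for the statements about one
    language. AR > 1 means the Mandarin total is larger; AR < 1 means
    the English total is larger.
    """
    mandarin_total = sum(mandarin_ratings)
    english_total = sum(english_ratings)
    if english_total == 0:
        raise ValueError("English total is zero; the ratio is undefined")
    return mandarin_total / english_total


# Hypothetical ratings for one participant, for illustration only.
ar = attitude_ratio([5, 4, 6, 5], [4, 4, 5, 3])
print(ar)
```

Because the measure is a ratio of totals, it reflects the relative weight of the two identities rather than the absolute strength of either one: a participant who rates every statement low and one who rates every statement high can both have an AR of 1.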
2.5. Analysis
Participants were divided into two groups based on
the attitudes ratio: Group 1 contained seven
participants whose AR was higher than 1 (Mandarin-
oriented); Group 2 contained six participants whose
AR was 1 or lower (English-oriented). Acoustic
parameters were checked for interactions of the AR
grouping variable (English-oriented vs. Mandarin-
oriented) with the interlocutor’s native language
variable (English, Mandarin, and Russian) in a series
of repeated measures ANOVAs. Analyses with
significant interactions were followed up by repeated
measures ANOVAs within each AR group.
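The grouping rule and the degrees of freedom implied by this design can be checked with a short sketch. It assumes a standard two-way mixed design (AR group between subjects, interlocutor language within subjects) with no sphericity correction; the function names are hypothetical. The resulting values agree with the degrees of freedom reported alongside the F statistics in this paper.

```python
def ar_group(ar):
    """Assign a participant to a group from the attitudes ratio (AR)."""
    return "Mandarin-oriented" if ar > 1 else "English-oriented"


def mixed_interaction_df(group_sizes, n_conditions):
    """(effect df, error df) for the group x condition interaction.

    Assumes a two-way mixed ANOVA with sphericity assumed.
    """
    n_subjects = sum(group_sizes)
    n_groups = len(group_sizes)
    df_effect = (n_groups - 1) * (n_conditions - 1)
    df_error = (n_subjects - n_groups) * (n_conditions - 1)
    return df_effect, df_error


def within_group_df(n_subjects, n_conditions):
    """(effect df, error df) for a one-way repeated measures ANOVA."""
    return n_conditions - 1, (n_subjects - 1) * (n_conditions - 1)


# Seven Mandarin-oriented and six English-oriented participants,
# three interlocutor conditions.
print(mixed_interaction_df([7, 6], 3))  # interaction test
print(within_group_df(6, 3))            # English-oriented follow-up
print(within_group_df(7, 3))            # Mandarin-oriented follow-up
```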
3.1. Vowel space
There was a significant interaction between the AR
variable and the Interlocutor’s Language (IL) in the
analysis of vowel space: F(2,22)=5.907, p<0.01,
which indicated that English-oriented and Mandarin-
oriented groups demonstrated different vowel space
patterns across the interlocutor conditions. Figure 1
shows that the two groups diverged in terms of vowel
space in the native English-speaking condition.
Figure 1: Vowel space in English, Mandarin, and
Russian Interlocutor Conditions for English-
oriented and Mandarin-oriented groups.
Follow-up within-group analyses showed a
significant effect of IL within the English-oriented
group: F(2,10)=5.507, p<0.05. In this group of
participants, vowel space was more expanded when
they were addressing a native English-speaking
interlocutor compared to interactions with non-native
listeners. Post hoc pairwise comparisons (Bonferroni)
showed a trend toward a difference in vowel space
between the English and Russian conditions that did
not reach significance (p=0.074).
3.2. Articulation rate
There was a significant interaction between the AR
factor and the IL factor in the analysis of articulation
rate: F(2,22)=5.631, p<0.05. Figure 2 shows that
participants in the English-oriented and
Mandarin-oriented groups used different articulation
rates when addressing the native English-speaking
interlocutor.
Figure 2: Articulation rate in English, Mandarin,
and Russian Interlocutor Conditions for English-
oriented and Mandarin-oriented groups.
Within-group analyses demonstrated a significant
effect of Interlocutor Language within the
Mandarin-oriented group: F(2,12)=4.001, p<0.05. These
participants spoke more slowly when addressing the
native English-speaking interlocutor than in
interactions with non-native interlocutors. While the
effect of Interlocutor Language did not reach
significance within the English-oriented group, the
numerical tendency was opposite to that of the
Mandarin-oriented group, mirroring the pattern of
vowel space results.
3.3. Pitch
The analysis of mean f0 showed a significant
interaction between the AR factor and the IL factor:
F(2,22)=5.512, p<0.05. Figure 3 shows that English-
oriented and Mandarin-oriented groups of
participants adopted different mean levels of pitch
across different interlocutor conditions. In particular,
English-oriented participants spoke with a higher f0
when addressing English- and Russian-speaking
interlocutors, while Mandarin-oriented participants
spoke with a higher f0 when addressing Mandarin
interlocutors.
Figure 3: Mean f0 in English, Mandarin, and
Russian Interlocutor Conditions for English-
oriented and Mandarin-oriented groups.
Within-group comparisons demonstrated a near-
significant effect of Interlocutor Language within the
English-oriented group: F(2,10)=4.103, p=0.05.
The results demonstrated that listener-oriented
adaptations in the non-native speech of the participants
were strongly influenced by their language attitudes.
English-oriented and Mandarin-oriented groups of
participants made different acoustic adjustments in
their speech across the three interlocutor conditions.
Especially prominent is the quantitative tendency to
treat the native English-speaking group differently
from the two non-native groups. English-oriented
speakers used a more hyperarticulated (expanded)
vowel space when addressing native English
listeners, while for Mandarin-oriented speakers the
tendency was in the opposite direction. This finding
is consistent with the prediction that non-native
speakers who value their L2 identity and strive to be
perceived as authentic/proficient L2 speakers will
choose to speak more clearly to native English
listeners.
However, the articulation rate results point
in a different direction. In this analysis, the
English-oriented group of speakers spoke faster when
addressing native English listeners, while
Mandarin-oriented speakers spoke more slowly to native
English speakers. Clear speech is typically characterized
by a slower speech rate, and it is difficult to reconcile
this result with the clear speech pattern. However, it is
plausible that rate of speech varies in the English- and
Mandarin-oriented groups across interlocutor
conditions as a function of speakers’ emotional
involvement during the interaction. Research shows
that rate of speech increases as speakers take a
stronger stance in the conversation [6].
English-oriented speakers may be showing a greater
degree of emotional involvement in the interactions
with native speakers by increasing articulation rate.
Mean f0 results are also largely consistent with
this interpretation: English-oriented speakers’ speech
was on average higher-pitched when addressing
English and Russian listeners than the speech of the
Mandarin-oriented group, while Mandarin-oriented
speakers adopted higher pitch when addressing
Mandarin listeners. Higher pitch has also been shown
to correlate with greater emotional engagement,
stance taking, and positive affect in speech [2], [9].
Thus, the present results demonstrate that groups
of non-native speakers adopt different
listener-oriented strategies depending on the value they
attach to their first and second language. The findings are
most consistent with the interpretation that speakers
who value their L2 identity are more positively
emotionally involved in the interactions with native
speakers, which is manifested in hyperarticulated
vowel space, faster articulation rate, and higher pitch.
Speakers who value their L1 identities demonstrate
nearly opposite acoustic patterns in interactions with
native and non-native listeners.
These results expand our understanding of
listener-oriented properties of non-native speech
beyond the clear speech settings, where participants
are explicitly instructed to modify their speech. They
show an interaction of the listener characteristics
(such as that of a potential ‘judge’ of speakers’
authenticity) and speaker characteristics (such as
attitudes to one’s first and second language) in a more
spontaneous conversational environment.
[1] Anderson, A., Bader, M., Bard, E., Boyle, E., Doherty,
G.M., Garrod, S., Isard, S., Kowtko, J., McAllister, J.,
Miller, J., Sotillo, C., Thompson, H.S., Weinert, R.
1991. The HCRC Map Task Corpus. Language and
Speech 34, 351-366.
[2] Belyk, M., Brown, S. 2014. The acoustic correlates of
valence depend on emotion family. Journal of
Voice, 28(4), 523-e9.
[3] Birdsong, D., Gertken, L.M., Amengual, M. 2012.
Bilingual Language Profile: An Easy-to-Use
Instrument to Assess Bilingualism. COERLL,
University of Texas at Austin. Web. 20 Jan.
[4] Boersma, P., Weenink, D. 2001. Praat, a system for
doing phonetics by computer. Glot International
5:9/10, 341-345.
[5] Burnham, D., Kitamura, C., Vollmer-Conna, U. 2002.
What's new, pussycat? On talking to babies and
animals. Science, 296(5572), 1435-1435.
[6] Freeman, V., Wright, R., Levow, G.-A., Luan, Y.,
Chan, J., Tran, T., Zayats, V., Antoniak, M., Ostendorf,
M. 2014. Phonetic correlates of stance-taking. The
Journal of the Acoustical Society of America, 136(4),
[7] Hayes-Harb, R., Smith, B. L., Bent, T., Bradlow, A. R.
2008. The interlanguage speech intelligibility benefit
for native speakers of Mandarin: Production and
perception of English word-final voicing
contrasts. Journal of phonetics, 36(4), 664-679.
[8] Kim, M., Horton, W. S., Bradlow, A. R. 2011. Phonetic
convergence in spontaneous conversations as a
function of interlocutor language distance. Laboratory
phonology, 2(1), 125-156.
[9] Laukka, P., Neiberg, D., Forsell, M., Karlsson, I.,
Elenius, K. 2011. Expression of affect in spontaneous
speech: Acoustic correlates and automatic detection of
irritation and resignation. Computer Speech &
Language, 25(1), 84-104.
[10] Marian, V., Blumenfeld, H. K., Kaushanskaya, M.
2007. The language experience and proficiency
questionnaire (LEAP-Q): Assessing language profiles
in bilinguals and multilinguals. Journal of Speech
Language and Hearing Research, 50(4), 940-967.
[11] Picheny, M.A., Durlach, N.I., Braida, L.D. 1986.
Speaking Clearly for the Hard of Hearing II: Acoustic
Characteristics of Clear and Conversational Speech.
JSHR 29, 434-436.
[12] Scarborough, R., Brenier, J., Zhao, Y., Hall-Lew, L.,
and Dmitrieva, O. 2007. An acoustic study of real and
imagined foreigner-directed speech. Proceedings of the
15th International Congress of Phonetic Sciences,
[13] Smiljanić, R., Bradlow, A. R. 2011. Bidirectional
clear speech perception benefit for native and high
proficiency non-native talkers and listeners:
Intelligibility and accentedness. The Journal of the
Acoustical Society of America, 130(6), 4020-4031.
[14] Uther, M., Knoll, M.A., Burnham, D. 2007. Do you
speak E-NG-L-I-SH? A comparison of foreigner- and
infant-directed speech. Speech Communication 49, 2-7.
[15] Van Engen, K. J., Baese-Berk, M., Baker, R. E., Choi,
A., Kim, M., Bradlow, A. R. 2010. The Wildcat Corpus
of native- and foreign-accented English:
Communicative efficiency across conversational dyads
with varying language alignment profiles. Language
and speech, 53(4), 510-540.