Article

Complain like you mean it! How prosody conveys suffering even about innocuous events

Abstract

When complaining, speakers can use their voice to convey a feeling of pain, even when describing innocuous events. Rapid detection of emotive and identity features of the voice may constrain how the semantic content of complaints is processed, as indexed by N400 and P600 effects evoked by the final, pain-related word. Twenty-six participants listened to statements describing painful and innocuous events expressed in a neutral or complaining voice, produced by ingroup- and outgroup-accented speakers. Participants evaluated how hurt the speaker felt while under EEG monitoring. Principal Component Analysis of Event-Related Potentials from final-word onset demonstrated N400 and P600 increases when complainers described innocuous vs. painful events in a neutral voice, but these effects were altered when utterances were expressed in a complaining voice. Independent of prosody, N400 amplitudes increased for complaints spoken in outgroup vs. ingroup accents. Results demonstrate that prosody and accent constrain the processing of spoken complaints, as proposed in a parallel-constraint-satisfaction model.
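
As an illustration of the analysis strategy described in the abstract, here is a minimal sketch of a temporal PCA over ERP waveforms in Python. All shapes, counts, and variable names are assumptions for illustration; the study's actual pipeline is not available here.

```python
# Minimal sketch of a temporal PCA over ERP waveforms, loosely analogous to
# the analysis described above. Shapes and names are assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# erps: (observations, time points) -- e.g., one averaged waveform per
# subject x condition x electrode, time-locked to final-word onset.
erps = rng.normal(size=(26 * 4 * 32, 300))

pca = PCA(n_components=10)
scores = pca.fit_transform(erps)        # component scores per observation
loadings = pca.components_              # temporal loadings (components x time)

# Components whose loadings peak in the N400 (~300-500 ms) or P600
# (~600-900 ms) windows can then be tested across conditions, e.g., with
# mixed-effects models on the scores.
print(pca.explained_variance_ratio_[:3])
```
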

References
Article
Full-text available
What makes human communication exceptional is the ability to grasp a speaker's intentions beyond what is said verbally. How the brain processes communicative functions is one of the central concerns of the neurobiology of language and pragmatics. Linguistic-pragmatic theories define these functions as speech acts, and various pragmatic traits characterise them at the levels of propositional content, action sequence structure, related commitments and social aspects. Here I discuss recent neurocognitive studies, which have shown that the use of identical linguistic signs in conveying different communicative functions elicits distinct and ultra-rapid neural responses. Interestingly, cortical areas show differential involvement underlying various pragmatic features related to theory-of-mind, emotion and action for specific speech acts expressed with the same utterances. Drawing on a neurocognitive model, I posit that understanding speech acts involves the expectation of typical partner follow-up actions, and that this predictive knowledge is immediately reflected in mind and brain.
Article
Full-text available
The perception of accented speech gives rise to a number of biases, but experimental evidence concerning their nature and the factors at play in these processes is scarce. The present study focused on French-speaking populations in Montreal, assessing implicit and explicit accent-based attitudes between Québécois (French Canadian) and European French groups. Twenty-seven Québécois and thirty-one French participants completed a modified implicit association test using speech samples with Québécois and French accents, along with stereotype-content questionnaires probing the perceived warmth and competence of each group. On the implicit measure, French participants exhibited a significant bias in favor of their own group, whereas Québécois participants showed no preference for either group. Explicit attitudes, in contrast, were largely congruent across groups: Québécois speakers were judged slightly warmer than competent, while French speakers were judged much less warm than competent by all participants. No correlation was found between the implicit and explicit measures. The asymmetry in implicit biases can be explained by accent exposure, as Québécois listeners are more familiar with the French accent than French listeners are with the Québécois accent. Explicit attitudes are partly explained by prestige differences between the two accents, but may also have been influenced by the moral beliefs of the French participants as recent immigrants to Quebec. These findings underscore that the implicit and explicit biases induced by speakers' accents arise from distinct but interrelated mechanisms involving both global and local situational factors. Our results have implications for understanding the dynamics of intergroup relations in Montreal and in many other settings where a mix of regional accents is the norm.
Article
Full-text available
Expectation-based theories of language processing, such as Surprisal theory, are supported by evidence of anticipation effects in both behavioural and neurophysiological measures. Online measures of language processing, however, are known to be influenced by factors such as lexical association that are distinct from—but often confounded with—expectancy. An open question therefore is whether a specific locus of expectancy related effects can be established in neural and behavioral processing correlates. We address this question in an event-related potential experiment and a self-paced reading experiment that independently cross expectancy and lexical association in a context manipulation design. We find that event-related potentials reveal that the N400 is sensitive to both expectancy and lexical association, while the P600 is modulated only by expectancy. Reading times, in turn, reveal effects of both association and expectancy in the first spillover region, followed by effects of expectancy alone in the second spillover region. These findings are consistent with the Retrieval-Integration account of language comprehension, according to which lexical retrieval (N400) is facilitated for words that are both expected and associated, whereas integration difficulty (P600) will be greater for unexpected words alone. Further, an exploratory analysis suggests that the P600 is not merely sensitive to expectancy violations, but rather, that there is a continuous relation. Taken together, these results suggest that the P600, like reading times, may reflect a meaning-centric notion of Surprisal in language comprehension.
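
For readers unfamiliar with Surprisal theory's central quantity, the sketch below estimates per-word surprisal from a pretrained causal language model. The model choice ("gpt2") and this usage are illustrative assumptions, not the method used in the study.

```python
# Sketch: per-token surprisal, -log2 P(token | context), from a causal LM.
# The model ("gpt2") and this usage are illustrative, not the study's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The children went outside to play", return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits                        # (1, seq_len, vocab)

logprobs = torch.log_softmax(logits[:, :-1], dim=-1)  # P(next | left context)
target = ids[:, 1:]
tok_logp = logprobs.gather(2, target.unsqueeze(-1)).squeeze(-1)
surprisal = -tok_logp / torch.log(torch.tensor(2.0))  # convert to bits

for t, s in zip(tok.convert_ids_to_tokens(target[0].tolist()), surprisal[0]):
    print(f"{t:>12s} {s:6.2f} bits")
```
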
Article
Full-text available
The ecology of human language is face-to-face interaction, comprising cues such as prosody, co-speech gestures and mouth movements. Yet, the multimodal context is usually stripped away in experiments, as dominant paradigms focus on linguistic processing only. In two studies we presented video clips of an actress producing naturalistic passages to participants while recording their electroencephalogram. We quantified multimodal cues (prosody, gestures, mouth movements) and measured their effect on a well-established electroencephalographic marker of processing load in comprehension (N400). We found that brain responses to words were affected by the informativeness of co-occurring multimodal cues, indicating that comprehension relies on linguistic and non-linguistic cues. Moreover, they were affected by interactions between the multimodal cues, indicating that the impact of each cue dynamically changes based on the informativeness of other cues. Thus, the results show that multimodal cues are integral to comprehension; hence, our theories must move beyond the limited focus on speech and linguistic processing.
Article
Full-text available
Emotive speech is a social act in which a speaker displays emotional signals with a specific intention; in the case of third-party complaints, this intention is to elicit empathy in the listener. The present study assessed how the emotivity of complaints was perceived in various conditions. Participants listened to short statements describing painful or neutral situations, spoken with a complaining or neutral prosody, and evaluated how complaining the speaker sounded. In addition to manipulating features of the message, social-affiliative factors which could influence complaint perception were varied by adopting a cross-cultural design: participants were either Québécois (French Canadian) or French and listened to utterances expressed by both cultural groups. The presence of a complaining tone of voice had the largest effect on participant evaluations, while the nature of statements had a significant, but smaller influence. Marginal effects of culture on explicit evaluation of complaints were found. A multiple mediation analysis suggested that mean fundamental frequency was the main prosodic signal that participants relied on to detect complaints, though most of the prosody effect could not be linearly explained by acoustic parameters. These results highlight a tacit agreement between speaker and listener: what characterizes a complaint is how it is said (i.e., the tone of voice), more than what it is about or who produces it. More generally, the study emphasizes the central importance of prosody in expressive speech acts such as complaints, which are designed to strengthen social bonds and supportive responses in interactive behavior. This intentional and interpersonal aspect in the communication of emotions needs to be further considered in research on affect and communication.
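
To make the mediation logic concrete, here is a minimal single-mediator sketch (prosody -> mean F0 -> complaint rating) with simulated data; the study itself used a multiple mediation analysis, and all numbers and names below are invented.

```python
# Sketch of a single-mediator analysis (prosody -> mean F0 -> rating),
# a simplified stand-in for the multiple mediation reported above.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
prosody = rng.integers(0, 2, n)               # 0 = neutral, 1 = complaining
mean_f0 = 180 + 40 * prosody + rng.normal(0, 15, n)
rating = 2 + 0.03 * mean_f0 + 0.8 * prosody + rng.normal(0, 1, n)

# Path a: prosody -> mediator
a = sm.OLS(mean_f0, sm.add_constant(prosody)).fit().params[1]
# Path b: mediator -> outcome, controlling for prosody
X = sm.add_constant(np.column_stack([prosody, mean_f0]))
fit = sm.OLS(rating, X).fit()
b, c_prime = fit.params[2], fit.params[1]

print(f"indirect effect a*b = {a * b:.3f}, direct effect c' = {c_prime:.3f}")
# In practice one would bootstrap a*b for a confidence interval.
```
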
Article
Full-text available
Distinct theoretical proposals have described how communicative constraints (contextual biases, speaker identity) impact verbal irony processing. Modular models assume that social and contextual factors have an effect at a late stage of processing. Interactive models claim that contextual biases are considered early on. The constraint‐satisfaction model further assumes that speaker's and context's characteristics can compete at early stages of analysis. The present ERP study teased apart these models by testing the impact of context and speaker features (i.e., speaker accent) on irony analysis. Spanish native speakers were presented with Spanish utterances that were ironic or literal. Each sentence was preceded by a negative or a positive context. Each story was uttered in a native or a foreign accent. Results showed that contextual biases and speaker accent interacted as early as 150 ms during irony processing. Greater N400‐like effects were reported for ironic than literal sentences only with positive contexts and native accent, possibly suggesting semantic difficulties when non‐prototypical irony was produced by natives. A P600 effect of irony was also reported indicating inferential processing costs. The results support the constraint‐satisfaction model and suggest that multiple sources of information are weighted and can interact from the earliest stages of irony analysis.
Article
Full-text available
Humans possess a robust speech-perception apparatus that is able to cope with variation in spoken language. However, linguists have often claimed that this coping ability must be limited, since otherwise there is no way for such variation to lead to language change and regional accents. Previous research has shown that the presence or absence of perceptual compensation is indexed by the N400 and P600 components, where the N400 reflects the general awareness of accented speech input, and the P600 responds to phonological-rule violations. The present exploratory paper investigates the hypothesis that these same components are involved in the accommodation to sound change, and that their amplitudes reduce as a sound change becomes accepted by an individual. This is investigated on the basis of a vowel shift in Dutch that has occurred in the Netherlands but not in Flanders (the Dutch-speaking part of Belgium). Netherlandic and Flemish participants were presented auditorily with words containing either conservative or novel vowel realizations, plus two control conditions. Exploratory analyses found no significant differences in ERPs to these realizations, but did uncover two systematic differences. Over 9 months, the N400 response became less negative for both groups of participants, but this effect was significantly smaller for the Flemish participants, a finding in line with earlier results on accent processing. Additionally, in one control condition where a “novel” realization was produced based on vowel lengthening, which cannot be achieved by any rule of either Netherlandic or Flemish Dutch and changes the vowel's phonemic identity, a P600 was obtained in the Netherlandic participants, but not in the Flemish participants. This P600 corroborates a small number of other studies which found phonological P600s, and provides ERP validation of earlier behavioral results that adaptation to variation in speech is possible, until the variation crosses a phoneme boundary. The results of this exploratory study thus reveal two types of perceptual-compensation (dys)function: on-line accent processing, visible as N400 amplitude, and failure to recover from an ungrammatical realization that crosses a phoneme boundary, visible as a P600. These results provide further insight on how these two ERPs reflect the processing of variation.
Article
Full-text available
Research has examined persuasive language, but relatively little is known about how persuasive people are when they attempt to persuade through paralanguage, or acoustic properties of speech (e.g., pitch and volume). People often detect and react against what communicators say, but might they be persuaded by speakers' attempts to modulate how they say it? Four experiments support this possibility, demonstrating that communicators engaging in paralinguistic persuasion attempts (i.e., modulating their voice to persuade) naturally use paralinguistic cues that influence perceivers' attitudes and choice. Rather than being effective because they go undetected, however, the results suggest a subtler possibility. Even when they are detected, paralinguistic attempts succeed because they make communicators seem more confident without undermining their perceived sincerity. Consequently, speakers' confident vocal demeanor persuades others by serving as a signal that they more strongly endorse the stance they take in their message. Further, we find that paralinguistic approaches to persuasion can be uniquely effective even when linguistic ones are not. A cross-study exploratory analysis and replication experiment reveal that communicators tend to speak louder and vary their volume during paralinguistic persuasion attempts, both of which signal confidence and, in turn, facilitate persuasion.
Article
Full-text available
The functional interpretation of two salient language-sensitive ERP components – the N400 and the P600 – remains a matter of debate. Prominent alternative accounts link the N400 to processes related to lexical retrieval, semantic integration, or both, while the P600 has been associated with syntactic reanalysis or, alternatively, to semantic integration. The often overlapping predictions of these competing accounts in extant experimental designs, however, has meant that previous findings have failed to clearly decide among them. Here, we present an experiment that directly tests the competing hypotheses using a design that clearly teases apart the retrieval versus integration view of the N400, while also dissociating a syntactic reanalysis/reprocessing account of the P600 from semantic integration. Our findings provide support for an integrated functional interpretation according to which the N400 reflects context-sensitive lexical retrieval – but not integration – processes. While the observed P600 effects were not predicted by any account, we argue that they can be reconciled with the integration view, if spatio-temporal overlap of ERP components is taken into consideration.
Article
Full-text available
Although speaking a foreign language is undoubtedly an asset, foreign-accented speakers are usually perceived negatively. It is unknown, however, to what extent this bias impacts cognitive processes. Here, we used ERPs and pupillometry to investigate whether the negative bias generated by a short exposure to a foreign accent influences the overall perception of a speaker, even when the person is not speaking. We compared responses to written sentence comprehension, memory and visual perception, associated with native speakers (high and low social status) and a foreign-accented speaker (high social status). The foreign-accented speaker consistently fell in between the high-status native speaker and the low-status native speaker. This is the first physiological demonstration that short exposure to a foreign accent impacts subsequent cognitive processes, and that foreign-accented speakers seem to be considered less reliable than native speakers, even with equally high social status. Awareness of this bias is essential to avoid discrimination in our multilingual society.
Article
Full-text available
Research on social cognition has fruitfully applied computational modeling approaches to explain how observers understand and reason about others’ mental states. By contrast, there has been less work on modeling observers’ understanding of emotional states. We propose an intuitive theory framework to studying affective cognition—how humans reason about emotions—and derive a taxonomy of inferences within affective cognition. Using this taxonomy, we review formal computational modeling work on such inferences, including causal reasoning about how others react to events, reasoning about unseen causes of emotions, reasoning with multiple cues, as well as reasoning from emotions to other mental states. In addition, we provide a roadmap for future research by charting out inferences—such as hypothetical and counterfactual reasoning about emotions—that are ripe for future computational modeling work. This framework proposes unifying these various types of reasoning as Bayesian inference within a common “intuitive Theory of Emotion.” Finally, we end with a discussion of important theoretical and methodological challenges that lie ahead in modeling affective cognition.
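
The framework treats emotion inference as Bayesian. A toy illustration of such a cue-to-emotion posterior, with made-up priors and likelihoods, follows.

```python
# Toy Bayesian cue -> emotion inference in the spirit of the "intuitive
# Theory of Emotion" framework described above. All numbers are invented.
import numpy as np

emotions = ["joy", "anger", "sadness"]
prior = np.array([0.5, 0.3, 0.2])        # P(emotion)
# P(cue = "crying" | emotion), from an observer's assumed intuitive model
likelihood = np.array([0.05, 0.10, 0.70])

posterior = prior * likelihood
posterior /= posterior.sum()             # Bayes' rule, normalized

for e, p in zip(emotions, posterior):
    print(f"P({e} | crying) = {p:.2f}")
```
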
Article
Full-text available
While evidence suggests that pain cries produced by human babies and other mammal infants communicate pain intensity, whether the pain vocalisations of human adults also encode pain intensity, and which acoustic characteristics influence listeners’ perceptions, remains unexplored. Here, we investigated how trained actors communicated pain by comparing the acoustic characteristics of nonverbal vocalisations expressing different levels of pain intensity (mild, moderate and severe). We then performed playback experiments to examine whether vocalisers successfully communicated pain intensity to listeners, and which acoustic characteristics were responsible for variation in pain ratings. We found that the mean and range of voice fundamental frequency (F0, perceived as pitch), the amplitude of the vocalisation, the degree of periodicity of the vocalisation and the proportion of the signal displaying non-linear phenomena all increased with the level of simulated pain intensity. In turn, these parameters predicted increases in listeners’ ratings of pain intensity. We also found that while different voice features contributed to increases in pain ratings within each level of expressed pain, a combination of these features explained an impressive amount of the variance in listeners’ pain ratings, both across (76%) and within (31–54%) pain levels. Our results show that adult vocalisers can volitionally simulate and modulate pain vocalisations to influence listeners’ perceptions of pain in a manner consistent with authentic human infant and nonhuman mammal pain vocalisations, and highlight potential for the development of a practical quantitative tool to improve pain assessment in populations unable to self-report their subjective pain experience.
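
Several of the acoustic measures mentioned (F0 mean and range, amplitude) can be extracted with standard tools. A sketch using the praat-parselmouth Python library follows; the file path is hypothetical, and periodicity and non-linear phenomena would require additional analysis.

```python
# Sketch: extracting a few of the acoustic features discussed above with
# praat-parselmouth. The audio file name is a hypothetical placeholder.
import parselmouth

snd = parselmouth.Sound("pain_vocalisation.wav")

pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]
f0 = f0[f0 > 0]                          # drop unvoiced frames (F0 = 0)
print("F0 mean:", f0.mean(), "Hz; F0 range:", f0.max() - f0.min(), "Hz")

intensity = snd.to_intensity()
print("max intensity:", intensity.values.max(), "dB")
```
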
Article
Full-text available
Emotional communication often requires the integration of affective prosodic and semantic components from speech and the speaker's facial expression. Affective prosody may have a special role by virtue of its dual nature: pre-verbal on one side and accompanying semantic content on the other. This consideration led us to hypothesize that it could act transversely, encompassing a wide temporal window involving the processing of facial expressions and semantic content expressed by the speaker. This would allow powerful communication in contexts of potential urgency, such as witnessing the speaker's physical pain. Seventeen participants were shown faces preceded by verbal reports of pain. Facial expressions, intelligibility of the semantic content of the report (i.e., participants' mother tongue vs. fictional language) and the affective prosody of the report (neutral vs. painful) were manipulated. We monitored event-related potentials (ERPs) time-locked to the onset of the faces as a function of semantic content intelligibility and affective prosody of the verbal reports. We found that affective prosody may interact with facial expressions and semantic content in two successive temporal windows, supporting its role as a transverse communication cue.
Article
Full-text available
Event-Related Potentials (ERPs)—stimulus-locked, scalp-recorded voltage fluctuations caused by post-synaptic neural activity—have proven invaluable to the study of language comprehension. Of interest in the ERP signal are systematic, reoccurring voltage fluctuations called components, which are taken to reflect the neural activity underlying specific computational operations carried out in given neuroanatomical networks (cf. Näätänen and Picton, 1987). For language processing, the N400 component and the P600 component are of particular salience (see Kutas et al., 2006, for a review). The typical approach to determining whether a target word in a sentence leads to differential modulation of these components, relative to a control word, is to look for effects on mean amplitude in predetermined time-windows on the respective ERP waveforms, e.g., 350–550 ms for the N400 component and 600–900 ms for the P600 component. The common mode of operation in psycholinguistics, then, is to tabulate the presence/absence of N400- and/or P600-effects across studies, and to use this categorical data to inform neurocognitive models that attribute specific functional roles to the N400 and P600 component (see Kuperberg, 2007; Bornkessel-Schlesewsky and Schlesewsky, 2008; Brouwer et al., 2012, for reviews). Here, we assert that this Waveform-based Component Structure (WCS) approach to ERPs leads to inconsistent data patterns, and hence, misinforms neurocognitive models of the electrophysiology of language processing. The reason for this is that the WCS approach ignores the latent component structure underlying ERP waveforms (cf. Luck, 2005), thereby leading to conclusions about component structure that do not factor in spatiotemporal component overlap of the N400 and the P600. This becomes particularly problematic when spatiotemporal component overlap interacts with differential P600 modulations due to task demands (cf. Kolk et al., 2003). While the problem of spatiotemporal component overlap is generally acknowledged, and occasionally invoked to account for within-study inconsistencies in the data, its implications are often overlooked in psycholinguistic theorizing that aims to integrate findings across studies. We believe WCS-centric theorizing to be the single largest reason for the lack of convergence regarding the processes underlying the N400 and the P600, thereby seriously hindering the advancement of neurocognitive theories and models of language processing.
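
The window-based measure that this piece critiques is easy to state concretely: mean amplitude in fixed time windows. A minimal sketch, with sampling rate and epoch layout assumed, follows.

```python
# Minimal sketch of the window-based ("WCS") measure discussed above:
# mean amplitude in fixed N400/P600 windows. Arrays and shapes are assumed.
import numpy as np

sfreq = 500                                    # Hz, assumed sampling rate
times = np.arange(-0.2, 1.0, 1 / sfreq)        # epoch from -200 to 1000 ms
erp = np.random.default_rng(2).normal(size=times.size)  # one ERP waveform

def mean_amp(erp, times, t0, t1):
    """Mean voltage in the window [t0, t1) seconds."""
    mask = (times >= t0) & (times < t1)
    return erp[mask].mean()

n400 = mean_amp(erp, times, 0.350, 0.550)
p600 = mean_amp(erp, times, 0.600, 0.900)
print(f"N400 window: {n400:.3f}, P600 window: {p600:.3f}")
```
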
Article
Full-text available
Here, we conducted the first study to explore how motivations expressed through speech are processed in real time. Participants listened to sentences spoken in two types of well-studied motivational tones (autonomy-supportive and controlling), or a neutral tone of voice. To examine this, listeners were presented with sentences that either signaled motivations through prosody (tone of voice) and words simultaneously (e.g., "You absolutely have to do it my way" spoken in a controlling tone of voice), or lacked motivationally biasing words (e.g., "Why don't we meet again tomorrow" spoken in a motivational tone of voice). Event-related brain potentials (ERPs) in response to motivations conveyed through words and prosody showed that listeners rapidly distinguished between motivational and neutral forms of communication, as shown in enhanced P2 amplitudes in response to motivational compared to neutral speech. This early detection mechanism is argued to help determine the importance of incoming information. Once assessed, motivational language is continuously monitored and thoroughly evaluated. When compared to neutral speech, listening to controlling (but not autonomy-supportive) speech led to enhanced late potential ERP mean amplitudes, suggesting that listeners are particularly attuned to controlling messages. The importance of controlling motivation for listeners is mirrored in effects observed for motivations expressed through prosody only. Here, an early rapid appraisal, as reflected in enhanced P2 amplitudes, is only found for sentences spoken in controlling (but not autonomy-supportive) prosody. Once identified as sounding pressuring, the message seems to be preferentially processed, as shown by enhanced late potential amplitudes in response to controlling prosody. Taken together, the results suggest that motivational and neutral language are differentially processed; further, the data suggest that cues signaling pressure and control cannot be ignored and lead to preferential, more in-depth processing.
Article
Full-text available
One of the frequent questions by users of the mixed model function lmer of the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package, by overloading the anova and summary functions by providing p values for tests for fixed effects. We have implemented the Satterthwaite's method for approximating degrees of freedom for the t and F tests. We have also implemented the construction of Type I - III ANOVA tables. Furthermore, one may also obtain the summary as well as the anova table using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). Some other convenient mixed model analysis tools such as a step method, that performs backward elimination of nonsignificant effects - both random and fixed, calculation of population means and multiple comparison tests together with plot facilities are provided by the package as well.
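
lmerTest is an R package; to keep the examples here in one language, the sketch below fits an analogous random-intercept model in Python with statsmodels. Note this is not lmerTest: statsmodels does not provide Satterthwaite-corrected tests, and the data are simulated.

```python
# Analogous mixed-model fit in Python via statsmodels (NOT lmerTest).
# summary() gives z-based tests here, where lmerTest would provide
# Satterthwaite-corrected t/F tests in R. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_trials = 20, 30
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_trials),
    "prosody": np.tile([0, 1], n_subj * n_trials // 2),
})
subj_int = rng.normal(0, 0.5, n_subj)[df["subject"]]   # random intercepts
df["rating"] = 3 + 0.6 * df["prosody"] + subj_int + rng.normal(0, 1, len(df))

fit = smf.mixedlm("rating ~ prosody", df, groups=df["subject"]).fit()
print(fit.summary())
```
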
Article
Full-text available
Stimulus-locked averaged event-related potentials (ERPs) are among the most frequently used signals in Cognitive Neuroscience. However, the late, cognitive or endogenous ERP components are often variable in latency from trial to trial in a component-specific way, compromising the stability assumption underlying the averaging scheme. Here we show that trial-to-trial latency variability of ERP components not only blurs the average ERP waveforms, but may also attenuate existing or artificially induce condition effects in amplitude. Hitherto this problem has not been well investigated. To tackle this problem, a method to measure and compensate component-specific trial-to-trial latency variability is required. Here we first systematically analyze the problem of single trial latency variability for condition effects based on simulation. Then, we introduce a solution by applying residue iteration decomposition (RIDE) to experimental data. RIDE separates different clusters of ERP components according to their time-locking to stimulus onsets, response times, or neither, based on an algorithm of iterative subtraction. We suggest to reconstruct ERPs by re-aligning the component clusters to their most probable single trial latencies. We demonstrate that RIDE-reconstructed ERPs may recover amplitude effects that are diminished or exaggerated in conventional averages by trial-to-trial latency jitter. Hence, RIDE-corrected ERPs may be a valuable tool in conditions where ERP effects may be compromised by latency variability.
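
The core problem RIDE addresses can be demonstrated in a few lines: averaging trials whose component latency jitters from trial to trial blurs and attenuates the component. The sketch below simulates only the problem, not the RIDE decomposition itself.

```python
# Simulation of the problem RIDE targets: latency jitter across trials
# attenuates and blurs the averaged component. This is a sketch of the
# problem only, not of the RIDE algorithm.
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(0, 1.0, 0.002)                   # 0-1000 ms, 2 ms steps

def component(t, latency, width=0.05):
    """Gaussian bump standing in for a late component (e.g., 'P600')."""
    return np.exp(-0.5 * ((t - latency) / width) ** 2)

fixed = np.mean([component(t, 0.6) for _ in range(100)], axis=0)
jittered = np.mean(
    [component(t, rng.normal(0.6, 0.08)) for _ in range(100)], axis=0
)
print(f"peak, no jitter: {fixed.max():.2f}; peak, jittered: {jittered.max():.2f}")
# Re-aligning trials to estimated single-trial latencies (as RIDE does)
# restores the attenuated peak before averaging.
```
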
Article
Full-text available
Ten years ago, researchers using event-related brain potentials (ERPs) to study language comprehension were puzzled by what looked like a Semantic Illusion: Semantically anomalous, but structurally well-formed sentences did not affect the N400 component---traditionally taken to reflect semantic integration---but instead produced a P600-effect, which is generally linked to syntactic processing. This finding led to a considerable amount of debate, and a number of complex processing models have been proposed as an explanation. What these models have in common is that they postulate two or more separate processing streams, in order to reconcile the Semantic Illusion and other semantically induced P600-effects with the traditional interpretations of the N400 and the P600. Recently, however, these multi-stream models have been called into question, and a simpler single-stream model has been proposed. According to this alternative model, the N400 component reflects the retrieval of word meaning from semantic memory, and the P600 component indexes the integration of this meaning into the unfolding utterance interpretation. In the present paper, we provide support for this 'Retrieval–Integration' account by instantiating it as a neurocomputational model. This neurocomputational model is the first to successfully simulate N400 and P600 amplitude in language comprehension, and simulations with this model provide a proof of concept of the single-stream Retrieval–Integration account of semantically-induced patterns of N400 and P600 modulations.
Article
Full-text available
Very little previous research has addressed the prosodic characteristics of third party complaints. This paper discusses an utterance and word level intonational analysis of this speech act in four speakers of Mexican Spanish. The effect of social distance/ power relationships was incorporated into the study by creating an experimental data elicitation task in which participants addressed identical complaints to a friend, as well as a boss, based on a series of hypothetical contexts. Major global findings revealed that all speakers significantly increased their fundamental frequency (F0) mean when directing their complaints to friends, however, only two speakers significantly expanded their F0 range in the same circumstance. Locally, peaks and valleys were manifested at significantly higher levels across the board when addressing friends. Finally, while speakers produced complaint contours of similar overall shape regardless of hearer, individual variation was present in the form of circumflex versus suppressed utterance-final F0 configurations. Overall, the relatively small data set initiated preliminary thoughts on the application of both cross-linguistic and language-and dialect-specific intonational concepts to complaints while also emphasizing the importance of relationships between interlocutors for future studies.
Article
Full-text available
We consider several key aspects of prediction in language comprehension: its computational nature, the representational level(s) at which we predict, whether we use higher-level representations to predictively pre-activate lower level representations, and whether we “commit” in any way to our predictions, beyond pre-activation. We argue that the bulk of behavioural and neural evidence suggests that we predict probabilistically and at multiple levels and grains of representation. We also argue that we can, in principle, use higher-level inferences to predictively pre-activate information at multiple lower representational levels. We suggest that the degree and level of predictive pre-activation might be a function of its expected utility, which, in turn, may depend on comprehenders’ goals and their estimates of the relative reliability of their prior knowledge and the bottom-up input. Finally, we argue that all these properties of language understanding can be naturally explained and productively explored within a multi-representational hierarchical actively generative architecture whose goal is to infer the message intended by the producer, and in which predictions play a crucial role in explaining the bottom-up input.
Article
Full-text available
Analyses of complaint discourse examine the procedures involved in the interactional management of this communicative activity and the conversational devices employed to invite an affiliative display. In this article, we centre particularly on emotions display in complaint discourse and discuss the gender meaning of certain affective intensification devices. After reviewing previous studies on complaint discourse, we apply methods of conversation analysis and interpretative sociolinguistics to the analysis of conversational extracts, in which female and male speakers complain about the negative behaviour of a third party displaying a high degree of emotive involvement. Our analysis emphasizes the construction of female and male styles, in the display of indignation, through complaint activities in these interactions, and the key role played by prosody in this respect.
Article
Full-text available
This study investigates the mechanisms responsible for fast changes in processing foreign-accented speech. Event Related brain Potentials (ERPs) were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations only elicited an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker.
Article
Full-text available
Does it matter if you speak with a regional accent? Speaking immediately reveals something of one's own social and cultural identity, be it consciously or unconsciously. Perceiving accents involves not only reconstructing such imprints but also augmenting them with particular attitudes and stereotypes. Even though we know much about attitudes and stereotypes that are transmitted by, e.g., skin color, names or physical attractiveness, we do not yet have satisfactory answers as to how accent perception affects human behavior. How do people act in economically relevant contexts when they are confronted with regional accents? This paper reports a laboratory experiment in which we address this question. Participants in our experiment conduct cognitive tests where they can choose to either cooperate or compete with a randomly matched male opponent identified only via his rendering of a standardized text in either a regional accent or a standard accent. We find a strong connection between the linguistic performance and the cognitive rating of the opponent. When matched with an opponent who speaks the accent of the participant's home region (the in-group opponent), individuals tend to cooperate significantly more often. By contrast, they are more likely to compete when matched with an accent speaker from outside their home region (the out-group opponent). Our findings demonstrate, firstly, that the perception of an out-group accent leads not only to social discrimination but also influences economic decisions. Secondly, they suggest that this economic behavior is not necessarily attributable to the perception of a regional accent per se, but rather to the social rating of linguistic distance and the in-group/out-group perception it evokes.
Article
Full-text available
Accents provide information about the speaker's geographical, socio-economic, and ethnic background. Research in applied psychology and sociolinguistics suggests that we generally prefer our own accent to other varieties of our native language and attribute more positive traits to it. Despite the widespread influence of accents on social interactions and on educational and work settings, the neural underpinnings of this social bias toward our own accent, and what may drive it, are unexplored. We measured brain activity while participants from two different geographical backgrounds listened passively to 3 English accent types embedded in an adaptation design. Cerebral activity in several regions, including bilateral amygdalae, revealed a significant interaction between the participants' own accent and the accent they listened to: while repetition of own accents elicited an enhanced neural response, repetition of the other group's accent resulted in reduced responses classically associated with adaptation. Our findings suggest that increased social relevance of, or greater emotional sensitivity to, in-group accents may underlie the own-accent bias. Our results provide a neural marker for the bias associated with accents, and show, for the first time, that the neural response to speech is partly shaped by the geographical background of the listener.
Article
Interpersonal communication often involves sharing our feelings with others; complaining, for example, aims to elicit empathy in listeners by vocally expressing a speaker's suffering. Despite the growing neuroscientific interest in the phenomenon of empathy, few have investigated how it is elicited in real time by vocal signals (prosody), and how this might be affected by interpersonal factors, such as a speaker's cultural background (based on their accent). To investigate the neural processes at play when hearing spoken complaints, twenty-six French participants listened to complaining and neutral utterances produced by in-group French and out-group Québécois (i.e., French-Canadian) speakers. Participants rated how hurt the speaker felt while their cerebral activity was monitored with electroencephalography (EEG). Principal Component Analysis of Event-Related Potentials (ERPs) taken at utterance onset showed culture-dependent time courses of emotive prosody processing. The high motivational relevance of ingroup complaints increased the P200 response compared to all other utterance types; in contrast, outgroup complaints selectively elicited an early posterior negativity in the same time window, followed by an increased N400 (due to ongoing effort to derive affective meaning from outgroup voices). Ingroup neutral utterances evoked a late negativity which may reflect re-analysis of emotively less salient, but culturally relevant ingroup speech. Results highlight the time-course of neurocognitive responses that contribute to emotive speech processing for complaints, establishing the critical role of prosody as well as social-relational factors (i.e., cultural identity) on how listeners are likely to “empathize” with a speaker.
Article
Our perception of someone's accent influences our expectations about what they might say or do. In this experiment, EEG data were recorded while participants listened to cliché sentences matching or not the stereotypes associated with the speaker's accent (upper-class Parisian accent or banlieue accent, a negatively connoted accent associated with youth from suburban areas; e.g. “I always listen to rap in my car” said with a banlieue accent (congruent) or an upper-class accent (incongruent)). Mismatches between social accent and stereotypical content triggered an event-related potential (ERP) known as the N400, albeit more anterior than the one observed for semantic violations, as well as a P3. These results are in line with other studies – conducted in particular with gender stereotypes – suggesting that stereotypes are stored in semantic categorical knowledge and that mismatches trigger integration difficulties and checking and updating mechanisms, and extend them to socially marked accents.
Article
This study investigated the impact of the speaker's identity generated by the voice on sentence processing. We examined the relation between ERP components associated with the processing of the voice (N100 and P200) from voice onset and those associated with sentence processing (N400 and late positivity) from critical word onset. We presented Dutch native speakers with sentences containing true (and known) information, unknown (but true) information or information violating world knowledge and had them perform a truth evaluation task. Sentences were spoken either in a native or a foreign accent. Truth evaluation judgments were not different for statements spoken by the native-accented and the foreign-accented speakers. Reduced N100 and P200 were observed in response to the foreign speaker's voice compared to the native speaker's. While statements containing unknown information or world knowledge violations generated a larger N400 than true statements in the native condition, they were not significantly different in the foreign condition, suggesting shallower processing of foreign-accented speech. The N100 was a significant predictor of the N400, in that the reduced N100 observed for the foreign speaker compared to the native speaker was related to a smaller N400 effect. These findings suggest that the impression of the speaker that listeners rapidly form from the voice affects semantic processing, which confirms that speaker identity and language comprehension cannot be dissociated.
Article
In social interactions, speakers often use their tone of voice (“prosody”) to communicate their interpersonal stance to pragmatically mark an ironic intention (e.g., sarcasm). The neurocognitive effects of prosody as listeners process ironic statements in real time are still poorly understood. In this study, 30 participants judged the friendliness of literal and ironic criticisms and compliments in the absence of context while their electrical brain activity was recorded. Event-related potentials reflecting the uptake of prosodic information were tracked at two time points in the utterance. Prosody robustly modulated P200 and late positivity amplitudes from utterance onset. These early neural responses registered both the speaker's stance (positive/negative) and their intention (literal/ironic). At a later timepoint (You are such a great/horrible cook), P200, N400, and P600 amplitudes were all greater when the critical word valence was congruent with the speaker’s vocal stance, suggesting that irony was contextually facilitated by early effects from prosody. Our results exemplify that rapid uptake of salient prosodic features allows listeners to make online predictions about the speaker’s ironic intent. This process can constrain their representation of an utterance to uncover nonliteral meanings without violating contextual expectations held about the speaker, as described by parallel-constraint satisfaction models.
Preprint
Neurocognitive models (e.g., Schirmer & Kotz, 2006) have helped to characterize how listeners incrementally derive meaning from vocal expressions of emotion in spoken language, what neural mechanisms are involved at different processing stages, and their relative time course. But how can these insights be applied to communicative situations in which prosody serves a predominantly interpersonal function? This comment examines recent data highlighting the dynamic interplay of prosody and language, when vocal attributes serve the sociopragmatic goals of the speaker or reveal interpersonal information that listeners use to construct a mental representation of what is being communicated. Our comment serves as a beacon to researchers interested in how the neurocognitive system "makes sense" of socioemotive aspects of prosody.
Article
Speakers modulate their voice (prosody) to communicate non-literal meanings, such as sexual innuendo (She inspected his package this morning, where "package" could refer to a man's penis). Here, we analyzed event-related potentials to illuminate how listeners use prosody to interpret sexual innuendo and what neurocognitive processes are involved. Participants listened to third-party statements with literal or 'sexual' interpretations, uttered in an unmarked or sexually evocative tone. Analyses revealed: (1) rapid neural differentiation of neutral vs. sexual prosody from utterance onset; (2) an N400-like response differentiating contextually constrained vs. unconstrained utterances following the critical word (reflecting integration of prosody and word meaning); and (3) a selective increased negativity response to sexual innuendo around 600 ms after the critical word. Findings show that the brain quickly integrates prosodic and lexical-semantic information to form an impression of what the speaker is communicating, triggering a unique response to sexual innuendos, consistent with their high social relevance.
Article
The way that speakers communicate their stance towards the listener is often vital for understanding the interpersonal relevance of speech acts, such as basic requests. To establish how interpersonal dimensions of an utterance affect neurocognitive processing, we compared event-related potentials elicited by requests that linguistically varied in how much they imposed on listeners (e.g., Lend me a nickel vs. hundred) and in the speaker's vocally-expressed stance towards the listener (polite or rude tone of voice). From utterance onset, effects of vocal stance were robustly differentiated by an early anterior positivity (P200) which increased for rude versus polite voices. At the utterance-final noun that marked the 'cost' of the request (nickel vs. hundred), there was an increased negativity between 300 and 500 ms in response to high-imposition requests accompanied by a rude stance compared to the rest of the conditions. This N400 effect was followed by interactions of stance and imposition that continued to inform several effects in the late positivity time window (500–800 ms post-onset of the critical noun), some of which correlated significantly with prosody-related changes in the P200 response from utterance onset. Results point to rapid neural differentiation of voice-related information conveying stance (around 200 ms post-onset of speech) and exemplify the interplay of different sources of interpersonal meaning (stance, imposition) as listeners evaluate the social implications of a request. Data show that representations of speaker meaning are actively shaped by vocal and verbal cues that encode interpersonal features of an utterance, promoting attempts to reanalyze and infer the pragmatic significance of speech acts in the 500–800 ms time window.
Article
Most research on cross-cultural emotion recognition has focused on facial expressions. To integrate the body of evidence on vocal expression, we present a meta-analysis of 37 cross-cultural studies of emotion recognition from speech prosody and nonlinguistic vocalizations, including expressers from 26 cultural groups and perceivers from 44 different cultures. Results showed that a wide variety of positive and negative emotions could be recognized with above-chance accuracy in cross-cultural conditions. However, there was also evidence for in-group advantage with higher accuracy in within- versus cross-cultural conditions. The distance between expresser and perceiver culture, measured via Hofstede’s cultural dimensions, was negatively correlated with recognition accuracy and positively correlated with in-group advantage. Results are discussed in relation to the dialect theory of emotion.
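
The two key quantities of the meta-analysis, in-group advantage and its relation to cultural distance, can be expressed compactly; all numbers in this sketch are invented for illustration.

```python
# Sketch of the meta-analytic quantities described above: in-group advantage
# as within- minus cross-cultural recognition accuracy, correlated with a
# cultural-distance score. All numbers are invented.
import numpy as np
from scipy.stats import pearsonr

within = np.array([0.82, 0.78, 0.85, 0.74, 0.80])   # within-culture accuracy
cross = np.array([0.75, 0.70, 0.81, 0.62, 0.72])    # cross-culture accuracy
distance = np.array([0.2, 0.5, 0.1, 0.9, 0.4])      # e.g., Hofstede-based

ingroup_advantage = within - cross
r_acc, _ = pearsonr(distance, cross)                # expected negative
r_adv, _ = pearsonr(distance, ingroup_advantage)    # expected positive
print(f"r(distance, accuracy) = {r_acc:.2f}, r(distance, advantage) = {r_adv:.2f}")
```
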
Article
We investigated the neurocognitive processes behind the asymmetry of affect observed in irony understanding, where ironic criticism is more easily understood than ironic praise. We recorded the ERPs of participants while they listened to positive (e.g., “These children are always smiling”) or negative (e.g., “His son is very unfortunate”) remarks pronounced with a sincere or ironic prosody. Participants had to decide whether or not the speaker was sincere. Behavioural results confirmed the asymmetry of affect phenomenon and ERP results revealed that the N400 and P600 were differentially sensitive to the negative or positive emotional connotations of the speaker's messages. These findings shed new light on the cognitive processes behind biphasic N400/P600 cycles, and how they are differentially affected by negativity.
Article
Our decision to believe what another person says can be influenced by vocally expressed confidence in speech and by whether the speaker-listener are members of the same social group. The dynamic effects of these two information sources on neurocognitive processes that promote believability impressions from vocal cues are unclear. Here, English Canadian listeners were presented personal statements (She has access to the building) produced in a confident or doubtful voice by speakers of their own dialect (in-group) or speakers from two different "out-groups" (regional or foreign-accented English). Participants rated how believable the speaker is for each statement and event-related potentials (ERPs) were analysed from utterance onset. Believability decisions were modulated by both the speaker's vocal confidence level and their perceived in-group status. For in-group speakers, ERP effects revealed an early differentiation of vocally expressed confidence (i.e., N100, P200), highlighting the motivational significance of doubtful voices for drawing believability inferences. These early effects on vocal confidence perception were qualitatively different or absent when speakers had an accent; evaluating out-group voices was associated with increased demands on contextual integration and re-analysis of a non-native representation of believability (i.e., increased N400, late negativity response). Accent intelligibility and experience with particular out-group accents each influenced how vocal confidence was processed for out-group speakers. The N100 amplitude was sensitive to out-group attitudes and predicted actual believability decisions for certain out-group speakers. We propose a neurocognitive model in which vocal identity information (social categorization) dynamically influences how vocal expressions are decoded and used to derive social inferences during person perception.
Article
New research is exploring ways that prosody fulfils different social-pragmatic functions in spoken language by revealing the mental or affective state of the speaker, thereby contributing to an understanding of speaker’s meaning. Prosody is often pivotal in signaling speaker attitudes or stance in the interpersonal context of the speaker-hearer; in contrast to the vocal communication of emotions, the significance of prosodic cues serving an emotive or interpersonal function is often dependent on the type of speech act being made and other contextual-situational parameters. Here, we discuss recent acoustic-perceptual studies in our lab demonstrating how prosody marks the interpersonal stance of a speaker and how this information is used by listeners to uncover social intentions that may be non-literal or covert. Focusing on how prosody operates in the communication of politeness, ironic attitudes, and sincerity, our data add to research on the acoustic-perceptual characteristics of prosody in these interpersonal contexts. Future implications for how acoustic cues underlying “prosodic attitudes” affect the interpretive process during on-line speech processing and how they influence behavioral outcomes are considered.
Article
Extending affective speech communication research in the context of authentic, spontaneous utterances, the present study investigates two signals of affect defined by extreme levels of physiological arousal—Passion and Indifference. Exemplars were mined from podcasts conducted in informal, unstructured contexts to examine communication at extreme levels of perceived hyper- and hypo-arousal. Utterances from twenty native speakers of Canadian/American English were submitted for perceptual validation for judgments of affective meaning (Passion, Indifference, or Neutrality) and level of arousal (“Not At All” to “Very Much”). Arousal ratings, acoustic patterns, and linguistic cues (affect/emotion words and expletives) were analyzed. In comparison to neutral utterances, Passion was communicated with the highest maximum pitch and pitch range, and highest maximum and mean amplitude, while Indifference was communicated via decreases in these measures in comparison to neutral affect. Interestingly, Passion and Neutrality were expressed with comparable absolute ranges of amplitude, while the minimum amplitudes of both Passion and Indifference were greater than those of Neutral expressions. Linguistically, Indifference was marked by significantly greater use of explicit expressions of affect (e.g. I don't care…), suggesting a linguistic encoding preference in this context. Passion was expressed with greater use of expletives; yet, their presence was not necessary to facilitate perception of a speaker's level of arousal. These findings shed new light upon the paralinguistic and linguistic features of spontaneous expressions at the extremes of the arousal continuum, highlighting key distinctions between Indifference and Neutrality with implications for vocal communication research in healthy and clinical populations.
Article
It is widely accepted that emotional expressions can be rich communicative devices. We can learn much from the tears of a grieving friend, the smiles of an affable stranger, or the slamming of a door by a disgruntled lover. So far, a systematic analysis of what can be communicated by emotional expressions of different kinds and of exactly how such communication takes place has been missing. The aim of this article is to introduce a new framework for the study of emotional expressions that I call the theory of affective pragmatics (TAP). As linguistic pragmatics focuses on what utterances mean in a context, affective pragmatics focuses on what emotional expressions mean in a context. TAP develops and connects two principal insights. The first is the insight that emotional expressions do much more than simply expressing emotions. As proponents of the Behavioral Ecology View of facial movements have long emphasized, bodily displays are sophisticated social tools that can communicate the signaler's intentions and requests. Proponents of the Basic Emotion View of emotional expressions have acknowledged this fact, but they have failed to emphasize its importance, in part because they have been in the grip of a mistaken theory of emotional expressions as involuntary readouts of emotions. The second insight that TAP aims to articulate and apply to emotional expressions is that it is possible to engage in analogs of speech acts without using language at all. I argue that there are important and so far largely unexplored similarities between what we can “do” with words and what we can “do” with emotional expressions. In particular, the core tenet of TAP is that emotional expressions are a means not only of expressing what's inside but also of directing other people's behavior, of representing what the world is like and of committing to future courses of action. Because these are some of the main things we can do with language, the take home message of my analysis is that, from a communicative point of view, much of what we can do with language we can also do with non-verbal emotional expressions. I conclude by exploring some reasons why, despite the analogies I have highlighted, emotional expressions are much less powerful communicative tools than speech acts.
Article
Hierarchies in the correlated forms of power (resources) and status (prestige) are constants that organize human societies. This article reviews relevant social psychological literature and identifies several converging results concerning power and status. Whether rank is chronically possessed or temporarily embodied, higher ranks create psychological distance from others, allow agency by the higher ranked, and exact deference from the lower ranked. Beliefs that status entails competence are essentially universal. Interpersonal interactions create warmth-competence compensatory tradeoffs. Along with societal structures (enduring inequality), these tradeoffs reinforce status-competence beliefs. Race, class, and gender further illustrate these dynamics. Although status systems are resilient, they can shift, and understanding those change processes is an important direction for future research, as global demographic changes disrupt existing hierarchies.
Chapter
Complaints might be thought a priori to be a good place to find paralinguistic features in a natural setting. Using conversation analytic methodology, I argue that the phonetic design of complaints is mostly determined by other sequential features of the turn in which the complaint is delivered. In particular, a turn delivering a complaint can either be marked as designed to receive an affiliative response (and thus a continuation of the activity of complaining), or marked as closing down the complaint sequence.
Chapter
Prosody is constitutive of spoken interaction. Over more than 25 years, its study has grown into a full-fledged and very productive field with a sound catalogue of research methods and principles. This volume presents the state of the art, illustrates current research trends and uncovers potential directions for future research. It will therefore be of major interest to everyone studying spoken interaction. The collection brings together an impressive range of internationally renowned scholars from different, yet closely related and compatible research traditions which have made a significant contribution to the field. They cover issues such as the units of language, the contextualization of actions and activities, conversational modalities and genres, the display of affect and emotion, the multimodality of interaction, language acquisition and aphasia. All contributions are based on empirical, audio- and/or video-recorded data of natural talk-in-interaction, including languages such as English, German and Japanese. The methodologies employed come from Ethnomethodology, Conversation Analysis and Interactional Linguistics.
Article
Recent years have seen a major change in views on language and language use. During the last decades, language use has been increasingly recognized as an intentional action (Grice 1957). In the form of speech acts (Austin 1962; Searle 1969), language expresses the speaker's attitudes and communicative intents to shape the listener's reaction. Notably, the speaker's intention is often not directly coded in the lexical meaning of a sentence, but rather conveyed implicitly, for example via nonverbal cues such as facial expressions, body posture, and speech prosody. The theoretical work of intonational phonologists seeking to define the meaning of specific vocal intonation profiles (Bolinger 1986; Kohler 1991) demonstrates the role of prosody in conveying the speaker's conversational goal. However, to date little is known about the neurocognitive architecture underlying the comprehension of communicative intents in general (Holtgraves 2005; Egorova, Shtyrov, Pulvermüller 2013), and the distinctive role of prosody in particular. The present study therefore aimed to investigate this interpersonal role of prosody in conveying the speaker's intents and its underlying acoustic properties. Taking speech act theory as a framework for intention in language (Austin 1962; Searle 1969), we created a novel set of short (non-)word utterances intoned to express different speech acts. Adopting an approach from emotional prosody research (Banse, Scherer 1996; Sauter, Eisner, Calder, Scott 2010), this stimulus set was employed in a combination of behavioral ratings and acoustic analyses to test the following hypotheses: if prosody codes for the communicative intention of the speaker, we expect 1) above-chance behavioral recognition of different intentions that are merely expressed via prosody, 2) acoustic markers in the prosody that identify these intentions, and 3) independence of acoustics and behavior from the overt lexical meaning of the utterance. The German words "Bier" (beer) and "Bar" (bar) and the non-words "Diem" and "Dahm" were recorded from four (two female) speakers expressing six different speech acts in their prosody—criticism, wish (expressives), warning, suggestion (directives), doubt, and naming (assertives). Acoustic features for pitch, duration, intensity, and spectral properties were extracted with PRAAT. These measures were subjected to discriminant analyses—separately for words and non-words—in order to test whether the acoustic features have enough discriminant power to assign the stimuli to their corresponding speech act category. Furthermore, 20 participants were tested for the behavioral recognition of the speech act categories with a six-alternative forced-choice task. Finally, a new group of 40 participants performed subjective ratings of the different speech acts (e.g. "How much does the stimulus sound like criticism?") to obtain more detailed information on the perception of different intentions and to allow, as a quantitative variable, further analyses in combination with the acoustic measures. The discriminant analyses of the acoustic features yielded predictions well above chance for each speech act category, with an overall classification accuracy of about 90% for both words and non-words (chance level: 17%).
Multiple regression analyses of participants' ratings of the different speech acts and the acoustic measures further identified distinct patterns of physical features that were able to predict the behavioral perception. Likewise, participants were behaviorally very well able to classify the stimuli into the correct category, with slightly lower accuracy for non-words (73%) than for words (81%). These findings indicate that prosodic cues convey sufficient detail to classify short (non-)word utterances according to their underlying intention, at acoustic as well as perceptual levels. Lexical meaning seems to be supportive but not necessary for the comprehension of different intentions, given that participants performed well on the non-words but scored higher for the words. In total, our results show that prosodic cues are powerful indicators of the speaker's intentions in interpersonal communication. The present carefully constructed stimulus set will serve as a useful tool to study the neural correlates of intentional prosody in the future.
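For readers who want to reproduce this kind of analysis in spirit, here is a minimal sketch of the discriminant-analysis step using scikit-learn; the feature matrix, labels, and array sizes are simulated placeholders, not the authors' data or pipeline:

# Minimal sketch: cross-validated linear discriminant classification of
# utterances into speech-act categories from acoustic features. X and y are
# simulated placeholders for a real feature matrix (pitch, duration,
# intensity, spectral measures) and the six category labels.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_stimuli, n_features, n_classes = 96, 12, 6
X = rng.normal(size=(n_stimuli, n_features))    # acoustic features per stimulus
y = rng.integers(0, n_classes, size=n_stimuli)  # speech-act label per stimulus

lda = LinearDiscriminantAnalysis()
scores = cross_val_score(lda, X, y, cv=5)
print(f"mean cross-validated accuracy: {scores.mean():.2f} "
      f"(chance is about {1 / n_classes:.2f})")

With random features the accuracy hovers around chance; with real acoustic measures, the same kind of pipeline can yield the above-chance classification reported in the abstract.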
Article
Background: Conventionally, event-related brain potentials (ERPs) are obtained by averaging a number of single trials. This can be problematic due to trial-to-trial latency variability. Residue iteration decomposition (RIDE) was developed to decompose ERPs into component clusters with different latency variability and to re-synchronize the separated components into a reconstructed ERP.
New method: RIDE has been continuously upgraded and has now converged to a robust version. We describe the principles of RIDE and the detailed algorithms of the functional modules of a toolbox. We give recommendations and provide caveats for using RIDE from both methodological and psychological perspectives.
Results: RIDE was applied to several data samples to demonstrate its ability to decompose and reconstruct latency-variable components of ERPs and to retrieve single-trial variability information. Different functionalities of RIDE were shown in appropriate examples.
Comparison with existing methods: RIDE employs several modules to achieve a robust decomposition of ERPs. As its main innovations, RIDE (1) is able to extract components based on the combination of known event markers and estimated latencies, (2) prevents distortions much more effectively than previous methods based on least-square algorithms, and (3) allows time-window confinements to target relevant components associated with sub-processes of interest.
Conclusions: RIDE is a convenient method that decomposes ERPs and provides single-trial analysis, yielding rich information about sub-components, and that reconstructs ERPs more closely reflecting the combined activity of single-trial ERPs. The outcomes of RIDE provide new dimensions for studying brain–behavior relationships based on EEG data.
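To make the core idea concrete, here is a toy numpy illustration of latency re-synchronization: simulate trials whose component jitters in time, estimate each trial's latency by template matching, and re-align before averaging. This is a sketch of the general principle only, not the RIDE algorithm itself (which estimates latencies iteratively and without assuming a known template):

# Toy illustration of latency re-synchronization (not the RIDE algorithm):
# single trials contain a component with variable latency; estimating each
# trial's latency and undoing it before averaging recovers a sharper
# component than the conventional ERP average.
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_samples = 100, 600
t = np.arange(n_samples)
template = np.exp(-0.5 * ((t - 300) / 25.0) ** 2)  # idealized component waveform

trials = np.empty((n_trials, n_samples))
shifts = rng.integers(-60, 61, size=n_trials)      # trial-to-trial latency jitter
for i, s in enumerate(shifts):
    trials[i] = np.roll(template, s) + rng.normal(scale=0.5, size=n_samples)

# Estimate each trial's latency by cross-correlating with the template ...
lags = np.array([np.argmax(np.correlate(tr, template, mode="same")) - n_samples // 2
                 for tr in trials])
# ... and undo the estimated shift before averaging.
realigned = np.array([np.roll(tr, -lag) for tr, lag in zip(trials, lags)])

print("peak of conventional average:", round(trials.mean(axis=0).max(), 2))
print("peak of re-aligned average:  ", round(realigned.mean(axis=0).max(), 2))

The conventional average smears the jittered component and underestimates its peak amplitude; the re-aligned average recovers it, which is the intuition behind RIDE's reconstructed ERPs.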
Article
With increasing availability of digital text, there has been an explosion of computational methods designed to turn patterns of word co-occurrence in large text corpora into numerical scores expressing the "semantic distance" between any two words. The success of such methods is typically evaluated by how well they predict human judgments of similarity. Here, I examine how well corpus-based methods predict the amplitude of the N400 component of the event-related potential (ERP), an online measure of lexical processing in brain electrical activity. ERPs elicited by the second words of 303 word pairs were analyzed at the level of individual items. Three corpus-based measures (mutual information, distributional similarity, and latent semantic analysis) were compared to a traditional measure of free association strength. In a regression analysis, corpus-based and free association measures each explained some of the variance in N400 amplitude, suggesting that these may tap distinct aspects of word relationships. Lexical factors of concreteness of word meaning, word frequency, number of semantic associates, and orthographic similarity also explained variance in N400 amplitude at the single-item level.
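As an illustration of this item-level regression design (with simulated placeholder values, not the paper's data), a multiple regression of single-item N400 amplitude on the four relatedness measures could be set up as follows, assuming statsmodels:

# Illustrative sketch of an item-level regression: predict single-item N400
# amplitude from corpus-based relatedness measures plus free association
# strength. All values are simulated placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_items = 303  # one row per word pair, as in the study

predictors = np.column_stack([
    rng.normal(size=n_items),  # mutual information (simulated)
    rng.normal(size=n_items),  # distributional similarity (simulated)
    rng.normal(size=n_items),  # latent semantic analysis (simulated)
    rng.normal(size=n_items),  # free association strength (simulated)
])
# Simulated N400 amplitudes with some dependence on the predictors.
n400 = predictors @ np.array([0.2, 0.3, 0.1, 0.4]) + rng.normal(size=n_items)

model = sm.OLS(n400, sm.add_constant(predictors)).fit()
print(model.summary())  # per-predictor coefficients and overall R-squared

Fitting all measures simultaneously lets each predictor's coefficient reflect the variance it explains over and above the others, which is how the study separates corpus-based from free-association contributions.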