Article

Pitch Changes during Attempted Deception

Abstract

Two studies on speech samples from 32 male college students are reported. In the first, it was shown that the average voice fundamental frequency of the subjects was higher when lying than when telling the truth. In the second, judges rated the truthfulness of 64 true and false utterances either from an audiotape that had been electronically filtered to render the semantic content unintelligible or from an unfiltered tape. The truthfulness ratings of the judges who heard the content-filtered tape were negatively correlated with fundamental frequency, whereas for the unfiltered condition, truthfulness ratings were uncorrelated with pitch. Although ratings made under the two conditions did not differ in overall accuracy, accuracy differences were found that depended on how an utterance had been elicited originally.
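As a rough illustration of the correlational analysis described in the abstract, the following sketch computes a Pearson correlation between per-utterance mean fundamental frequency and judges' truthfulness ratings. The function name, the variable names, and all numbers are hypothetical and chosen only for illustration; they are not data from the study, and the abstract does not state which correlation statistic the authors used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only (NOT data from the study):
# mean F0 of five utterances in Hz, and a judge's truthfulness rating of each.
mean_f0_hz = [110.0, 118.0, 125.0, 131.0, 140.0]
truth_rating = [6.0, 5.5, 4.0, 3.5, 2.0]

r = pearson_r(mean_f0_hz, truth_rating)
print(f"r = {r:.2f}")  # negative here: higher pitch goes with lower ratings
```

A negative coefficient on made-up data like this mirrors the direction of the effect reported for the content-filtered condition; it says nothing about its size in the actual study.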

... Acoustic pitch cues include the change in average fundamental frequency and the variance in fundamental frequency within statements. Higher fundamental frequency has been associated with deceitful speech (Anolli & Ciceri, 1997; Ekman, O'Sullivan, Friesen, & Scherer, 1991; Streeter, Krauss, Geller, Olson, & Apple, 1977), but not consistently (Kirchübel & Howard, 2013; Rockwell et al., 1997a, 1997b). Furthermore, a greater variance in fundamental frequency has been associated with deceitful speech (Anolli & Ciceri, 1997; Rockwell et al., 1997a). ...
... Higher fundamental frequency as the result of tightening of the vocal muscles has been associated with emotional stress such as fear, while a faster tempo has been associated with happiness and surprise (Scherer & Oshinsky, 1977; Tolkmitt & Scherer, 1986). Streeter et al. (1977) instructed adults to try to deceive an interviewer on certain questions during an interview task and found that a difference in average fundamental frequency was a significant indicator of deception. In addition, when participants were emotionally aroused by being told that successful deception was correlated with IQ, the difference in fundamental frequency between truths and lies became significantly greater. ...
... In contrast, the blunted emotional response theory led to the hypothesis that bilinguals speaking their L2 will provide fewer cues to deception than monolinguals and bilinguals using their L1. Although the cognitive-load theory and the emotional-arousal theory generate similar predictions, we expected different dependent variables to be stronger indicators of each process, with reaction time effects lending more support to the cognitive-load theory (e.g., Walczyk et al., 2003) and fundamental frequency effects lending more support to the emotional-arousal theory (e.g., Streeter et al., 1977). ...
Article
Acoustic cues to deception on a picture-naming task were analyzed in three groups of English speakers: monolinguals, bilinguals with English as their first language, and bilinguals with English as a second language. Results revealed that all participants had longer reaction times when generating falsehoods than when producing truths, and that the effect was more robust for English as a second language bilinguals than for the other two groups. Articulation rate was higher for all groups when producing lies. Mean fundamental frequency and intensity cues were not reliable cues to deception, but there was lower variance in both of these parameters when generating false versus true labels for all participants. Results suggest that naming latency was the only cue to deception that differed by language background. These findings broadly support the cognitive-load theory of deception, suggesting that a combination of producing deceptive speech and using a second language puts an extra load on the speaker.
... Another set of vocalic cues to attempted deception are the paralinguistic features of an utterance. Paralinguistics have been investigated according to four (4) logical subcategories: rate, pitch, tone and volume (Knapp, Hart & Dennis, 1974; Mehrabian, 1971; Streeter et al., 1977). It is commonsensical to think that deception-induced stress produces faster speech rates in those attempting deception, yet the empirical evidence is inconclusive (Markel, Vargas & Howard, 1972). ...
... To the extent that this is true, one would expect to find higher fundamental frequencies for senders of deceptive messages than for truthful ones; indeed, such a difference has been reported (Ekman, Friesen & Scherer, 1976). In addition, when the semantic content of an utterance is rendered unintelligible, as with filtered speech, subjects have relied on pitch cues to discern truth from lie (Streeter et al., 1977). Research has been done to assess the importance of tone as an auditory cue (DePaulo, Zuckerman & Rosenthal, 1980; see Scherer, 1977 for a review of 20 studies). ...
... Other research has addressed the difference in the importance of nonverbal cues, some using a DCA methodology (Feldman, 1976; Krauss, 1977; Littlepage & Pineault, 1978; Streeter et al., 1977 (Krauss, Geller & Olsen, 1976). ...
... Besides the contributions of Hansen and colleagues, the first studies that come to mind with regard to psychological stress are probably those on deceptive speech by Hirschberg, Benus, and colleagues (e.g., [6,7,8]). Although these are all recent studies, this line of research actually stretches back over at least 40 years, see [9,10,11,12,13,14]. ...
... For example, fear (particularly cold fear) is often found to cause high flat F0 patterns, lower and less variable intensity patterns, a softer voice quality, and a faster speaking rate [15,16,17,18]. Similarly, deception as well as other types of increased mental/cognitive workload, for example, those due to multi-tasking, are typically characterized by an increase in all F0 and intensity parameters, including ranges, a tenser or creakier voice quality, and a lower speaking rate [8,9,12,13,14,19,20]. ...
... The quizmaster's realizations of the four alternative answers per question were acoustically analyzed with respect to a set of 7 prosodic parameters that are known from previous studies to be involved in psychological stress/deception and (cold) fear [6-14], perceptual prominence and accentuation [22,23], and prosodic phrasing [24]. That is, if the quizmaster did subconsciously send out subtle cues to correct answers, then we wanted to detect these telltale details, irrespective of whether they originated from stress and emotion or just occurred as changes in the degree to which the correct answer was singled out from its alternatives, for example, by means of a stronger/weaker accentuation or a stronger/weaker phrase boundary. ...
Conference Paper
Starting from previous research on the prosodic patterns of emotion, psychological stress and deceptive speech, the paper investigates whether quizmasters convey telltale cues to correct answers in the popular four-alternatives (a/b/c/d) framework of "Who Wants to Be a Millionaire?" (WWTBAM). We simulated this game-show scenario in the lab, based on 20 naive German participants who took the roles of either quizmaster or contestant. Quizmasters were instructed to take care not to reveal correct answers to contestants. Despite this explicit instruction, our acoustic-prosodic analysis yielded clear telltale signs of correct answers. These telltale signs were consistent across all quizmasters, but complex insofar as they differed across question positions (a/b/c/d) and could not be found in the introductory letters. Cues to correct answers involved timing and range of F0 and intensity patterns, speaking rate, and degree of final lengthening; pause durations between answers and introductory letters were irrelevant. The results are discussed with respect to their implications for real quiz shows and the elicitation of emotions and stress in the lab.
... There have been a number of studies of linguistic cues to deceptive speech and text, mostly conducted by psychologists, and more recently by computer scientists. Early work by Ekman et al. [6] and Streeter et al. [7] found pitch increases in deceptive speech. An increased effect was observed when subjects were highly motivated to deceive. ...
... Previous work [7] found that there are some correlations between deception and fundamental frequency. In order to capture the frequency information, we extracted 42 features which come from fundamental frequency variation (FFV) spectrum with 7 components [24]. ...
Conference Paper
Improving methods of automatic deception detection is an important goal of many researchers from a variety of disciplines, including psychology, computational linguistics, and criminology. We present a system to automatically identify deceptive utterances using acoustic-prosodic, lexical, syntactic, and phonotactic features. We train and test our system on the Interspeech 2016 ComParE challenge corpus, and find that our combined features result in performance well above the challenge baseline on the development data. We also perform feature ranking experiments to evaluate the usefulness of each of our feature sets. Finally, we conduct a cross-corpus evaluation by training on another deception corpus and testing on the ComParE corpus.
... While most of the past studies on deception, conducted by psychologists, have focused on standard biometric indicators which are commonly measured in polygraph (cardiovascular, electrodermal, and respiratory) [28] and non-verbal clues like gestures, body movement, facial expressions, brain imaging, body odor, vocal behavior [3,23,35], it appears that literature on the acoustic correlates of deception by real criminals is limited. Therefore further research into this area would provide a valuable contribution. ...
... They were under the influence of real stress which led to higher percentage increase in mean F0 as compared to that reported by [22]. Increase in pitch was also reported to be a reliable clue associated with deception in [23-27]. Fundamental frequency (F0) of subject responses was reported to be higher for deceptive than non-deceptive responses in [28]. ...
... As previously noted, other vocal qualities beyond speech rate and intonation should influence persuasion in a similar manner as shown in Experiment 1. For example, research has shown that changes in vocal pitch reliably influence listener's judgments of a speaker on various dimensions, including competence (Brown, Strong, & Rencher, 1974), honesty (Streeter, Krauss, Geller, Olson, & Apple, 1977), and anxiety (Apple, Streeter, & Krauss, 1979), such that raised pitch elicits more negative evaluations on each dimension. Given that raised pitch is associated with negative evaluations on anxiety, and anxiety and confidence are inversely related, it follows that listeners may associate decreased confidence with raised pitch. ...
Article
Three experiments were designed to investigate the effects and psychological mechanisms of three vocal qualities on persuasion. Experiment 1 (N = 394) employed a 2 (elaboration: high vs. low) × 2 (vocal speed: fast vs. slow) × 2 (vocal intonation: falling vs. rising) between-participants factorial design. As predicted, vocal speed and vocal intonation influenced global perceptions of speaker confidence. Under high-elaboration, vocal confidence biased thought-favorability, which influenced attitudes. Under low-elaboration, vocal confidence directly influenced attitudes as a peripheral cue. Experiments 2 (N = 412) and 3 (N = 397) conceptually replicated the bias and cue effects in Experiment 1, using a 2 (elaboration: high vs. low) × 2 (vocal pitch: raised vs. lowered) between-participants factorial design. Vocal pitch influenced perceptions of speaker confidence as predicted. These studies demonstrate that changes in three vocal properties influence global perceptions of speaker confidence, influencing attitudes via different mediating processes moderated by amount of thought. Evaluation of alternative mediators in Experiments 2 and 3 failed to support these alternatives to global perceptions of speaker confidence.
... Louder voices are judged to be friendlier than softer ones [15]. Deception is correlated with a rising F0 mean [22]. ...
Article
Suprasegmental (prosody) features of discourse provide a vehicle by which speakers reflect their mental purposes to listeners. Generating suitable prosody information is critical to expressing messages and improving the intelligibility and naturalness of synthetic speech. Generic prosody generators should provide information about pitch frequency (F0) contours, energy levels, word durations, and inter-word pause durations for speech synthesizers. The present study used a recurrent neural network (RNN) for prosody generation. The inputs of this RNN were word-level and syllable-level linguistic features. To provide data efficiently for the RNN-based prosody generator in the training, validation, and test phases, automatic segmentation and labeling of phonemes were performed. The number of inputs to the RNN was reduced by employing a binary gravitational search algorithm (BGSA) for feature selection (FS). The proposed prosody generator provided 12 output prosodic parameters for the current syllable for representing pitch contour, log-energy contour, inter-syllable pause duration, duration of syllable, duration of the vowel in the syllable, and vowel onset time. Experimental results demonstrated the success of the RNN-based prosody generator in synthesizing the six prosodic elements with acceptable root mean square error (RMSE). By using a BGSA-based FS unit, a lighter neural model was achieved with a 53% reduction in the number of weight connections, producing RMSEs with acceptable degradation over the no-FS unit prosody generator. The performance of the BGSA-based FS method was compared with a binary particle swarm optimization (BPSO) algorithm, and the BGSA showed slightly better results. A modified mean opinion score scale was used to evaluate the intelligibility and naturalness of synthesized speech using the proposed method.
... Despite the fact that raters assigned marginally higher romantic interest ratings to romantic clips than friend clips based upon paralanguage, personality assessments were less positive for these romantic clips. Stripping the content from the paralanguage clips perhaps allowed raters to notice potential stress cues, such as differences in prosody and intensity (Streeter et al. 1977). These results potentially point to the vulnerability associated with early stage romantic love. ...
... The sole study appears to be Heeschen et al. (1988). Based on earlier studies indicating higher vocal pitch (fundamental frequency, F0) is related to greater stress in normal adults (Ekman, Friesen, & Scherer, 1976; Streeter, Geller, & Apple, 1977), these authors measured F0 as an indicator of stress. F0 was measured from the speech of adults with Broca's aphasia, those with Wernicke's aphasia, and a matched comparison group during two conditions: A high stress condition required the description of specific actions presented in visual stimuli (i.e., "What is X doing?"), and the low stress condition consisted of a casual conversation. ...
Article
Individuals with aphasia face significant challenges in their lives. These challenges stem from the difficulties caused by impaired language function. Impairment in the ability to successfully communicate could be a significant source of stress to individuals with aphasia. The purpose of the current paper is to present a review of the literature on the neuropsychobiology of stress and aphasia, give a contemporary conceptualization of stress (both neurobiological and psychological), offer a framework and directions for future investigations in stress and aphasia, and finally suggest clinical implications for this line of inquiry.
... Previous studies have examined stress in voice based on laboratory induced stress (simulated situations), namely subjects trying to deceive the investigator (Streeter et al., 1977; Pollina et al., 1998; Patil et al., 2013), stress due to work overload (Scherer et al., 2002), deprived sleep, fatigue (Bagnall, 2007), stressful visual stimuli (Johnson et al., 1979), unpleasant slide shows, pictures of humans suffering with skin diseases or severe accident injuries (Tolkmitt and Scherer, 1986), voice of pilot just prior to a fatal aircraft crash or the classic Hindenburg radio announcement (Williams and Stevens, 1972; Williams and Stevens, 1969), recordings of pilot connecting with the base station prior to the loss of control of the helicopter and shortly thereafter (Protopapas and Lieberman, 1995), phonetically rich sentences from the exam stress corpus (Sigmund, 2007), answering verbal quiz while simultaneously playing simulated air controller designed to gradually get complex (Scherer et al., 2008), database consisting of voiced segments of five vowels (Sigmund et al., 2008), database consisting of task-oriented dialog (in a limited time frame) between subjects (Frampton et al., 2010), investigations based on the (SUSAS) database (Hansen and Bou-Ghazale, 1997; Hansen and Womack, 1996; Casale et al., 2007) and SAVEE database (Mongia and Sharma, 2014), EMOVO corpus, an Italian emotional speech database (Mencattini et al., 2014), and the TU Berlin database, containing 800 emotional sentences uttered by actors and actresses (Bageshree et al., 2012). Unlike previous studies, the emphasis here is on analysing the voices of subjects in real stressful situations. ...
Article
When a person is emotionally charged, stress could be discerned in his voice. This paper presents a simplified and a non-invasive approach to detect psycho-physiological stress by monitoring the acoustic modifications during a stressful conversation. Voice database consists of audio clips from eight different popular FM broadcasts wherein the host of the show vexes the subjects who are otherwise unaware of the charade. The audio clips are obtained from real-life stressful conversations (no simulated emotions). Analysis is done using PRAAT software to evaluate mean fundamental frequency (F0) and formant frequencies (F1, F2, F3, F4) both in neutral and stressed state. Results suggest that F0 increases with stress; however, formant frequency decreases with stress. Comparison of Fourier and chirp spectra of short vowel segment shows that for relaxed speech, the two spectra are similar; however, for stressed speech, they differ in the high frequency range due to increased pitch modulation.
... Specifically, it is possible that high-pitched male voices are perceived as more trustworthy up to a certain pitch level above which they start to appear untrustworthy. A number of studies have revealed a link between deception and high vocal pitch (Apple, Streeter, & Krauss, 1979; Ekman, Friesen, & Scherer, 1976; Ekman, O'Sullivan, Friesen, & Scherer, 1991; Sporer & Schwandt, 2006; Lakhani & Taylor, 2003; Streeter, Krauss, Geller, Olson, & Apple, 1977; Taylor & Hick, 2007; Villar, Arciuli, & Paterson, 2013; Zuckerman, Koestner, & Driver, 1981). Borkowska and Pawlowski (2011) have reported a similar, nonlinear, relationship between vocal pitch and attractiveness in female voices. ...
Article
Vocal pitch has been found to influence judgments of perceived trustworthiness and dominance from a novel voice. However, the majority of findings arise from using only male voices and in context-specific scenarios. In two experiments, we first explore the influence of average vocal pitch on first-impression judgments of perceived trustworthiness and dominance, before establishing the existence of an overall preference for high or low pitch across genders. In Experiment 1, pairs of high- and low-pitched temporally reversed recordings of male and female vocal utterances were presented in a two-alternative forced-choice task. Results revealed a tendency to select the low-pitched voice over the high-pitched voice as more trustworthy, for both genders, and more dominant, for male voices only. Experiment 2 tested an overall preference for low-pitched voices, and whether judgments were modulated by speech content, using forward and reversed speech to manipulate context. Results revealed an overall preference for low pitch, irrespective of direction of speech, in male voices only. No such overall preference was found for female voices. We propose that an overall preference for low pitch is a default prior in male voices irrespective of context, whereas pitch preferences in female voices are more context- and situation-dependent. The present study confirms the important role of vocal pitch in the formation of first-impression personality judgments and advances understanding of the impact of context on pitch preferences across genders.
... For example, Vrij [9] noted that liars may experience emotional arousal, contributing to a higher pitched voice (cf. [10]). Cognitive load also has been implicated as a factor, based on evidence that speech hesitations and speech errors increase with the complexity of the lie in question [6]. ...
... Ekman et al. [4] found a significant increase of pitch measures in deceptive speech. Similar results are reported in the work [5], with a higher pitch when lying than when telling the truth. The work of [6] shows that Teager energy-related features and formant variations indicate the possibility of discriminating between truthful and deceptive utterances. ...
... The problem is that these predictions are contradictory across, as well as within theories. This is likely due to the methods used in the studies on deception so far: As can be seen in, e.g., [2, 6-11], there are a number of studies that investigated artificial situations in laboratory settings. While such a setting has the advantage of a controlled environment, it is questionable if those results can be translated to real-life situations. ...
... In three studies, experimental stress led only to negligible increases in f0 [1,6,14]. In two studies the observed increase in f0 was significant [4,19], but was smaller than observed in real-life emergencies. Wittels [23], were able to induce short-term psychoemotional stress by asking study participants to negotiate a risky physical obstacle, the guerilla slide (GS). ...
Conference Paper
It is commonly known that a relationship exists between the human voice and various emotional states. Past studies have demonstrated changes in a number of vocal features, such as fundamental frequency f0 and peakSlope, as a result of varying emotional state. These voice characteristics have been shown to relate to emotional load, vocal tension, and, in particular, stress. Although much research exists in the domain of voice analysis, few studies have assessed the relationship between stress and changes in the voice during a dyadic team interaction. The aim of the present study was to investigate the multimodal interplay between speech and physiology during a high-workload, high-stress team task. Specifically, we studied task-induced effects on participants' vocal signals, namely the f0 and peakSlope features, as well as participants' physiology, through cardiovascular measures. Further, we assessed the relationship between physiological states related to stress and changes in the speaker's voice. We recruited participants with the specific goal of working together to defuse a simulated bomb. Half of our sample participated in an "Ice Breaker" scenario, during which they were allowed to converse and familiarize themselves with their teammate prior to the task, while the other half of the sample served as our "Control". Fundamental frequency (f0), peakSlope, physiological state, and subjective stress were measured during the task. Results indicated that f0 and peakSlope significantly increased from the beginning to the end of each task trial, and were highest in the last trial, which indicates an increase in emotional load and vocal tension. Finally, cardiovascular measures of stress indicated that the vocal and emotional load of speakers towards the end of the task mirrored a physiological state of psychological "threat".
... Different measures of speech latencies have been also used to index differences in native vs. foreign language processing, and between false and true statements. For example, vocal pitch changes and high fundamental frequency (F0) have been considered relatively good markers of deceptive speech (e.g., Ecoff, Ekman, Mage, & Frank, 2000; Streeter et al., 1977; but see Spence, Villar, & Arciuli, 2012). More importantly for our purposes here, speech latency measures are also sensitive to deception. ...
Article
This study explores the interaction between deceptive language and second language processing. One hundred participants were asked to produce veridical and false statements in either their first or second language. Pupil size, speech latencies, and utterance durations were analyzed. Results showed additive effects of statement veracity and the language in which these statements were produced. That is, false statements elicited larger pupil dilations and longer naming latencies compared with veridical statements, and statements in the foreign language elicited larger pupil dilations and longer speech durations compared with the first language. Importantly, these two effects did not interact, suggesting that the processing cost associated with deception is similar in a native and foreign language. The theoretical implications of these observations are discussed.
... Numerous lexical and acoustic-prosodic cues have been evaluated. Early work by Ekman et al. [3] and Streeter et al. [4] found pitch increases in deceptive speech. Linguistic Inquiry and Word Count (LIWC) categories were found to be useful in deception detection studies across five corpora, where subjects lied or told the truth about their opinions on controversial topics [5]. ...
... Previous research has found that people speak with a higher pitch and with more variation in pitch or fundamental frequency when under increased stress or arousal [1,17,18,35]. However, there are many other factors that can contribute to variation in pitch. ...
Article
We have created an automated kiosk that uses embodied intelligent agents to interview individuals and detect changes in arousal, behavior, and cognitive effort by using psychophysiological information systems. In this paper, we describe the system and propose a unique class of intelligent agents, which are described as Special Purpose Embodied Conversational Intelligence with Environmental Sensors (SPECIES). SPECIES agents use heterogeneous sensors to detect human physiology and behavior during interactions, and they affect their environment by influencing human behavior using various embodied states (i.e., gender and demeanor), messages, and recommendations. Based on the SPECIES paradigm, we present three studies that evaluate different portions of the model, and these studies are used as foundational research for the development of the automated kiosk. The first study evaluates human–computer interaction and how SPECIES agents can change perceptions of information systems by varying appearance and demeanor. Instantiations that had the agents embodied as males were perceived as more powerful, while female embodied agents were perceived as more likable. Similarly, smiling agents were perceived as more likable than neutral demeanor agents. The second study demonstrated that a single sensor measuring vocal pitch provides SPECIES with environmental awareness of human stress and deception. The final study ties the first two studies together and demonstrates an avatar-based kiosk that asks questions and measures the responses using vocalic measurements.
... If not, receivers might observe telltale signs of stress in a persuader's voice (Giddens et al. 2013; Hollien 1980). While this is far from perfectly diagnostic (some competent people show signs of stress when speaking whereas some lacking skill do not), stress in a persuader's voice may be perceived as revealing hidden information (Streeter et al. 1977), lowering perceived competence and undermining the persuasion attempt (Apple et al. 1979). Indeed, extant research alludes to this possibility by underscoring that stressed entrepreneurs may be viewed as struggling to competently ensure project success (Grant and Ferris 2012; Wincent, Örtqvist, and Drnovsek 2008). ...
Article
Persuasion success is often related to hard-to-measure characteristics, such as the way the persuader speaks. To examine how vocal tones impact persuasion in an online appeal, this research measures persuaders’ vocal tones in Kickstarter video pitches using novel audio mining technology. Connecting vocal tone dimensions with real-world funding outcomes offers insight into the impact of vocal tones on receivers’ actions. The core hypothesis of this paper is that a successful persuasion attempt is associated with vocal tones denoting (1) focus, (2) low stress, and (3) stable emotions. These three vocal tone dimensions—which are in line with the stereotype content model—matter because they allow receivers to make inferences about a persuader’s competence. The hypotheses are tested with a large-scale empirical study using Kickstarter data, which is then replicated in a different category. In addition, two controlled experiments provide evidence that perceptions of competence mediate the impact of the three vocal tones on persuasion attempt success. The results identify key indicators of persuasion attempt success and suggest a greater role for audio mining in academic consumer research.
... Interestingly, with pitch and tone of voice, a clearer distinction can be made to determine if honesty is achieved or if truth is being told. According to Ekman et al. (1976), when a person is lying, vocal pitch tends to be higher and through this means, the receiver can deduce the genuineness of utterances (Streeter, Krauss, Geller, Olson, & Apple, 1977). ...
... The most well-known of these is vocal analysis as part of lie-detector testing to identify deception [13]. Similarly, attempts to assess an individual's underlying stress within experimental or real-world scenarios have long been subject to acoustic analysis [14, 15]. The most commonly assessed parameters of voice include the previously mentioned fundamental frequency and the first four formants. ...
Article
Introduction: Stress may serve as an adjunct (challenge) or hindrance (threat) to the learning process. Determining the effect of an individual’s response to situational demands in either a real or simulated situation may enable optimisation of the learning environment. Studies of acoustic analysis suggest that mean fundamental frequency and formant frequencies of voice vary with an individual’s response during stressful events. This hypothesis is reviewed within the otolaryngology (ORL) simulation environment to assess whether acoustic analysis could be used as a tool to determine participants’ stress response and cognitive load in medical simulation. Such an assessment could lead to optimisation of the learning environment. Methodology: ORL simulation scenarios were performed to teach the participants teamwork and refine clinical skills. Each was performed in an actual operating room (OR) environment (in situ) with a multidisciplinary team consisting of ORL surgeons, OR nurses and anaesthesiologists. Ten of the scenarios were led by an ORL attending and ten were led by an ORL fellow. The vocal communication of each of the 20 individual leaders was analysed using long-term pitch analysis in PRAAT software (autocorrelation method) to obtain mean fundamental frequency (F0) and the first four formant frequencies (F1, F2, F3 and F4). In reviewing individual scenarios, each leader’s voice was analysed in a non-stressful setting (WHO sign-out procedure) and compared with their voice during a stressful portion of the scenario (responding to deteriorating oxygen saturations in the manikin). Results: The mean unstressed F0 for the male voice was 161.4 Hz and for the female voice was 217.9 Hz. The mean fundamental frequency of speech in the ORL fellow (lead surgeon) group increased by 34.5 Hz between the scenario’s baseline and stressful portions. This was significantly different to the mean change of −0.5 Hz noted in the attending group (p=0.01). No changes were seen in F1, F2, F3 or F4. Conclusions: This study demonstrates a method of acoustic analysis of the voices of participants taking part in medical simulations. It suggests acoustic analysis of participants may offer a simple, non-invasive, non-intrusive adjunct in evaluating and titrating the stress response during simulation.
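The abstract above reports mean F0 obtained with an autocorrelation method and compares a baseline portion with a stressful one. The following is only a minimal sketch of that general idea, not the PRAAT algorithm itself, which adds windowing, interpolation and voicing decisions. The function names (`estimate_f0`, `mean_f0`, `tone`), the 75 to 500 Hz search range, the frame length and the synthetic sine tones standing in for speech are all assumptions made for illustration.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) of one frame from the peak of its normalised autocorrelation.

    Simplified illustration: the search range f0_min..f0_max is an assumed default.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]           # remove DC offset
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None                           # silent frame: no pitch estimate
    lag_min = int(sample_rate / f0_max)       # shortest period considered
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_r = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation, normalised by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None:
        return None
    return sample_rate / best_lag

def mean_f0(signal, sample_rate, frame_len=800):
    """Average the per-frame F0 estimates over non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    estimates = [f for f in (estimate_f0(fr, sample_rate) for fr in frames)
                 if f is not None]
    return sum(estimates) / len(estimates) if estimates else None

def tone(freq, sample_rate, seconds):
    """Synthetic sine tone used here as a stand-in for a voiced recording."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(count)]

# Hypothetical comparison of a baseline and a stressed segment.
baseline = mean_f0(tone(160.0, 8000, 0.5), 8000)
stressed = mean_f0(tone(200.0, 8000, 0.5), 8000)
shift = stressed - baseline                   # change in mean F0, in Hz
```

On real recordings the per-frame estimates would also need a voicing check so that unvoiced or noisy frames do not distort the mean. That step is omitted here.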
... Studies by L. Streeter and B. DePaulo also demonstrate an increase in fundamental frequency and in vocal fold tension during the production of false statements [11,12]. ...
Article
Full-text available
The article presents an overview of prosodic and acoustic indicators of lying in utterances. A preliminary frequency-spectral analysis of speech fragments showed an increase in fundamental frequency when false statements were produced, compared with the subject's speech in the absence of psychological stress. Comparison of spectrograms yielded data on such indicators of deception as filled pauses, laughter, and rising intonation.
... Even if they can be overcome by using milder forms of stimuli, the relationship between stimulus and response is by no means simple. The responses of different subjects to the same stimulus may not only differ but also change unpredictably depending on their experience and personality psychotype [17]. In addition, it is often practically impossible to measure the level of stress actually experienced. ...
... Among other speech cues, acoustic-prosodic features (e.g., formant frequencies, speech intensity) and lexical features (e.g., verb tense, use of negative emotion words) were found to be predictive of deceptive utterances [67]. Increased changes in speech parameters were observed when speakers are highly motivated to deceive [98]. ...
Chapter
Full-text available
Internet-connected devices, such as smartphones, smartwatches, and laptops, have become ubiquitous in modern life, reaching ever deeper into our private spheres. Among the sensors most commonly found in such devices are microphones. While various privacy concerns related to microphone-equipped devices have been raised and thoroughly discussed, the threat of unexpected inferences from audio data remains largely overlooked. Drawing from literature of diverse disciplines, this paper presents an overview of sensitive pieces of information that can, with the help of advanced data analysis methods, be derived from human speech and other acoustic elements in recorded audio. In addition to the linguistic content of speech, a speaker's voice characteristics and manner of expression may implicitly contain a rich array of personal information, including cues to a speaker's biometric identity, personality, physical traits, geographical origin, emotions, level of intoxication and sleepiness, age, gender, and health condition. Even a person's socioeconomic status can be reflected in certain speech patterns. The findings compiled in this paper demonstrate that recent advances in voice and speech processing induce a new generation of privacy threats.
... Speech-based deception studies focused on identifying the correlation between deception and acoustic information incorporated in speech such as the intonation, the pitch, the energy, the accent, the rhythm, the melody, etc. Ekman in [26] has proven that deceivers present a high pitch range compared to truth-tellers. Similar results have been found by Streeter et al. in [75] and [69]. Furthermore, authors in [20] reported an increased vocal tension during deception. ...
Article
Full-text available
Automatic deception detection is an important task that has attracted great interest in different fields due to its potential applications. In particular, it can improve justice and security in society by helping to detect deceivers in high-stakes situations across jurisprudence, law enforcement, and national security domains, among others. However, existing deception detection systems are not yet as accurate as expected, which makes their use very risky, especially in critical fields. This article outlines an approach for automatically distinguishing between deceit and truth based on audio, video and text modalities and explores the possibility of combining them in order to detect deception more accurately. First, each modality was evaluated separately, and then feature-level and decision-level fusion approaches were proposed to combine the considered modalities. The proposed feature-level fusion approach investigates a diverse set of feature selection techniques to select the most relevant features from the whole feature set, while the decision-level fusion approach is based on belief theory, considering information about the certainty degree of each modality. To do so, we used a real-life video dataset of people communicating truthfully or deceptively, collected from public American court trials. Unimodal models trained on audio, video and text separately achieved accuracy rates of 60%, 94% and 58% respectively. With the feature-level fusion approach, the best deception detection accuracy reaches 93% using only 19 combined features, while a 100% deception recognition rate was obtained with the proposed decision-level fusion approach, outperforming the results reported in the literature.
... Fundamental frequency, denoted as f0, is the lowest frequency of a periodic waveform (Pell, 1999). Streeter established that fundamental frequency is higher during deception than during non-deceptive times (Streeter, Krauss, Geller, Olson, & Apple, 1977). More recently, Villar et al. found supporting evidence that subjects' fundamental frequencies increased when they lied (Villar, Arciuli, & Paterson, 2013). ...
Conference Paper
Full-text available
In this paper we explore the range of multimodal signals involved in deceptive communication, including discourse and language, vocal and acoustic, nonverbal and gestural, and facial cues that combine to convey deceptive messages. We examine emerging theoretical perspectives that relate the physical embodiment of political phenomena, integrating perspectives from the hard sciences, like biology and neurology, and the social sciences, like politics and psychology. The burgeoning "text-as-data" approach from discourse analysis likewise holds strong potential for researchers in political science, and the social sciences more broadly. Researchers in this area have used computational linguistics to assess syntactic and semantic features of language, sentiment, document topics, and psychological attributes using a battery of linguistic instruments. Scholars are presently able to make inferences relating to an individual's intended audience, cognitive framework, emotional state, group affiliation, organizational hierarchy, and issue priorities. Linguistic analysis alone, however, may only reveal part of the larger picture involved in the delivery of political speeches and addresses. For this reason, we broaden the perspective to include the range of multimodalities in our analysis.
... Acoustic-prosodic features chosen here have been proved to possess certain influences when telling and judging lies [7,26,27,28,29,30,31]. It is interesting that the deception behaviour could be clustered in the corpus. ...
Conference Paper
Full-text available
Being able to distinguish the differences between deceptive and truthful statements in a dialogue is an important skill in daily life. Extensive studies on the acoustic features of deceptive English speech have been reported, but such research in Mandarin is relatively scarce. We constructed a Mandarin deception database of daily dialogues from native speakers in Taiwan. College students were recruited to participate in a game in which they were encouraged to lie and convince their opponents of experiences that they did not have. After data collection, acoustic-prosodic features were extracted. The statistics of these features were calculated so that the differences between truthful and deceptive sentences, both as they were intended and perceived, can be compared. Results indicate that different people tend to use different acoustic features when telling a lie; the participants could be put into 10 categories in a dendrogram, with an exception of 31 people from whom no acoustic indicators for deception were found. Without considering interpersonal differences, our best classifier reached an F1 score of 53.37% in distinguishing deceptive and truthful segmentation units. We hope to present this new database as a corpus for future studies on deception in Mandarin conversations.
... For example, for the most utilitarian categories, characteristics such as reliability, confidence, competency, and responsibility may be important, and to project reliability and confidence, pitch and intensity may be salient sound traits. This may reflect the results of some research that higher pitch in an interview is regarded as more competent by listeners [51], and powerful speakers are perceived as more credible and trustworthy [52]. The two segments show both similarities and differences in the preference of voices. ...
Article
Full-text available
Firms endeavor to differentiate their products and brands by applying various elements to their marketing communication tools. One underutilized element is the human voice, which carries much information about the speaker (both static and dynamic) and may serve as the “auditory face” of a brand. In this study, we propose a concept that we have named the “voiceprint,” which identifies the “ideal” voice for promoting a product/brand. In our conceptual framework, we illustrate how different combinations of acoustic features in voices that represent the product evoke certain perceptions and images, which ultimately drive preferences toward the product/brand. To empirically demonstrate that consumers are indeed affected by voices, we conducted a laboratory study wherein subjects evaluated different actors’ voices in radio advertisements for various product categories. The data were analyzed using voice feature extraction methods and by applying a latent class multivariate model. The results showed that different voices indeed have a significant effect on people’s preferences. In addition, heterogeneity in different consumer segments and product categories was found regarding important voice features that drove preferences. Managerially, our findings provide guidance for marketers regarding how to effectively select the right voice for their marketing communications.
... Early work by psychologists (e.g. Ekman et al., 1991, Streeter et al., 1977, Newman et al., 2003) has found that indicators of deceptive speech include pitch increases, LIWC (Pennebaker et al., 2001) features, etc. More recently, computer scientists have investigated deception detection in various contexts, identifying cues from texts, speech signals, gestures, and facial expressions. ...
Article
Subjects took a test of skill at interpreting nonverbal communication, the Profile of Nonverbal Sensitivity (PONS), and a videotaped test of ability to distinguish between truthful and deceptive statements. It was hypothesized that skill at reading the types of nonverbal cues known to indicate deception, i.e., body movement, paralinguistics, would be related to actual accuracy at detection of deception. Correlations between 23 PONS scales and 3 deception measures were generally nonsignificant, suggesting that accurate detection of deception may not be merely a function of sensitivity to nonverbal cues.
Chapter
Full-text available
This chapter provides an overview of research on non-verbal lie detection and some new meta-analytic findings. The chapter begins with a review of classic theories of deception which predict the existence of non-verbal cues to deception. A brief overview of the empirical findings on behavioural cues to deception based on meta-analytic findings is also provided. We next provide a meta-analysis documenting a decline effect in research on deception: cues to deception decrease in strength the longer they are studied. Next discussed are various aspects of the relationship between demeanour and perceivers’ judgments of deception. In particular, a meta-analysis of perceived honesty across various modalities is reported. The results show that people who look honest also sound honest, and these same individuals spin convincing tales. Finally, we analyse and discuss the determinants of accurate lie detection from demeanour based on direct and indirect measures of veracity. Contrary to some current perspectives, meta-analysis suggests that accuracy is higher when assessed with direct measures than indirect assessment.
Article
This study models the process by which auditors draw a suspicion of lying from an interviewee's behavior and tests the model with perceptions and beliefs collected from 147 auditors after they observed an interview. An understanding of the way auditors become suspicious of lying may lead to improvements in their deception-detection training. The proposed structural model includes impressions of the interviewee as intervening variables between perceptions of behavior and suspicion of lying. It was evaluated using SEM. The results show the instrumentality of two impressions, uneasiness and non-informativeness. An additional finding is that mental stress, which can be more intense for liars than for truth tellers, is not a suspicious behavior. Auditors may disregard signs of mental stress because interviewees typically have to answer thought-provoking questions, thereby confounding the mental stress of lying with the mental stress of the interview.
Book
Liars give off clues that they are skirting the truth, often through language, refuting the tacit assumption that language is under the conscious control of the liar. In reality the opposite is true, as speakers effectively control only the meaning they convey, and not the linguistic style of their communication. This book examines the promising field of computer applications to identify the language of the liar and reviews the systems built in the past 10 years to discriminate this language. It also provides a background in deception studies from the physiological, psychological, and linguistic work on deception, and discusses the sources and issues of collecting real-world critical data containing lies including police, security, border crossing, customs, and asylum interviews; congressional hearings; financial reporting; legal depositions; human resource evaluation; predatory communications that include Internet scams, identity theft, and fraud; and false product reviews. The book concludes with a set of open questions that the computational work has generated.
Article
Full-text available
Background: Voice, apart from its semantic content, also carries information about the speaker's psychological and physical state. Emotional stress and physical fatigue are the pathological elements of this condition. The possible relationship between emotional stress and measurable changes to the voice signal was the subject of this study. Method: Eleven subjects were interviewed with questions from two domains and their responses were recorded. In the first domain, two men, two women and three teenagers were asked to remember an incident from their past where they felt embarrassed or ashamed of their own act. In the second domain, three women and one man from the housekeeping staff were interviewed about a stolen mobile phone. These subjects were different from the subjects who participated in domain 1. Stress in voice was detected as a shift in the acoustic parameters with respect to the baseline. All recordings were analyzed using PRAAT software. Spectrograms were also plotted for qualitative comparison between normal speech and stressed speech. Result: A significant increase in mean pitch and a substantial decrease in the first two formants (F1 and F2) were observed under stress. Other acoustic measures did change under stress, but the changes were not significant. Spectrograms were distinct for the two conditions. Conclusion: The results indicate that, when a person is emotionally charged, stress can be discerned in their voice. Mean pitch and formants F1 and F2 were found to be reliable vocal indicators of emotional stress. This study proposes a simple non-invasive approach which can act as an alibi for innocent people. General Terms: Voice stress analysis, speech processing, human computer interface.
Chapter
As adults in a Western culture, most of us have come to learn to control the extent to which, and the ways in which, we nonverbally display our emotions and attitudes. We may hide our feelings of disgust. Our anger may be partially concealed with a fixed smile. There are even situations in which we may wish to conceal our positive attitudes; for instance, in a bargaining exchange.
Chapter
There are two major points that I would like to make in this chapter: first, that voice and speech cues are important yet neglected indicators of stress, and second, that stress should be studied within the general framework of a comprehensive theory of emotion. The order of these two points also reflects the development of research interests and theoretical inclinations in my research group during the past decade. Consequently, in addition to developing arguments for these points, I use the opportunity to describe a series of studies conducted in our laboratory. In doing so, I stress the historical development of this research, since it may illustrate how we came to hold the theoretical views that are presented in this chapter.
Chapter
Purpose - The advancement of multimedia technology has spurred the use of multimedia in business practice. The adoption of audio and visual data will accelerate as marketing scholars become more aware of the value of audio and visual data and the technologies required to reveal insights into marketing problems. This chapter aims to introduce marketing scholars to this field of research. Design/methodology/approach - This chapter reviews the current technology in audio and visual data analysis and discusses rewarding research opportunities in marketing using these data. Findings - Compared with traditional data like survey and scanner data, audio and visual data provide richer information and are easier to collect. Given this superiority, data availability, feasibility of storage, and increasing computational power, we believe that these data will contribute to better marketing practices with the help of marketing scholars in the near future. Practical implications - The adoption of audio and visual data in marketing practices will help practitioners to get better insights into marketing problems and thus make better decisions. Value/originality - This chapter makes a first attempt in the marketing literature to review the current technology in audio and visual data analysis and proposes promising applications of such technology. We hope it will inspire scholars to utilize audio and visual data in marketing research.
Chapter
During the past decade evidence from cross-cultural studies of emotions has provided strong confirmation for Darwin’s century-old hypothesis that there is continuity of facial expressions in animals and human beings and that the facial expressions of certain emotions are innate and universal. Scherer’s chapter argues that the vocal expressions of certain emotions also show evolutionary continuity and universality. The adaptive advantages of vocal expression over facial expression are obvious in situations where darkness or distance would prevent social communication by way of the facial-visual system.
Article
Evaluating truthfulness and detecting deception is a capstone skill of criminal justice professionals, and researchers have long examined nonverbal cues to aid in such determinations. This paper examines the notion that testing clusters of nonverbal behaviors is a more fruitful way of making such determinations than single, specific behaviors. Participants from four ethnic groups participated in a mock crime and either told the truth or lied in an investigative interview. Fourteen nonverbal behaviors of the interviewees were coded from the interviews; differences in the behaviors were tested according to type of question and veracity condition. Different types of questions produced different nonverbal reactions. Clusters of nonverbal behaviors differentiated truth tellers from liars, and the specific clusters were moderated by question. Accuracy rates ranged from 62.6 to 72.5% and were above deception detection accuracy rates for humans and random data. These findings have implications for practitioners as well as future research and theory.
Chapter
It is without question that law enforcement, security, intelligence and related agencies need an effective method for the detection of deception. Incidentally, “detection of deception” is the phrase used by many professionals when they mean lying. Whether a lie detector—were it to exist—would be used effectively, legally and ethically is not the focus of our concern here (however, these issues will be discussed briefly at the end of the chapter). Rather, I will stress the need for such a system and some approaches to the problem. But first, let us consider the basic issue: can lies be detected; is there such a thing as a lie response?
Chapter
Perhaps W. C. Fields’ intense dislike of children stemmed from his experience that children were unable to appreciate his sarcastic and lampooning humor. After all, a comedian is only as funny as the strength of his audience’s response. By definition, sarcastic humor expresses meaning contrary to what might be expected in a particular context. Similarly, feelings of ambivalence and attempts at deception also might lead senders, or comedians, to express different messages or affects in different verbal and nonverbal channels. This chapter is concerned with how and when children learn to interpret and understand these discrepancies among social messages, channels, or affects. In everyday life, children’s and adults’ interpretations, weighing, and “trusting” of these discrepant or “inconsistent” social messages certainly have implications for the development of satisfying interpersonal relations in general, to say nothing of the appreciation of sardonic comedians in particular.
Chapter
We hope that this chapter, and indeed the book, will give the reader some idea of the wide range of psychological research relevant to the law. Our intention is to illustrate the sorts of questions which have interested psychologists and to indicate the kinds of findings which have been produced. The review is not intended to be exhaustive, for this would require a book rather than a chapter (for other reviews of this literature see Bermant et al., 1976; Tapp, 1976). Some of the topics mentioned here are reviewed in more detail in succeeding chapters. Although most of the research quoted is American, we have tried to pay special attention to British work.
Article
Full-text available
Evidence suggests that many physical, behavioral, and trait qualities can be detected solely from the sound of a person’s voice, irrespective of the semantic information conveyed through speech. This study examined whether raters could accurately assess the likelihood that a person has cheated on committed, romantic partners simply by hearing the speaker’s voice. Independent raters heard voice samples of individuals who self-reported that they either cheated or had never cheated on their romantic partners. To control for aspects that may clue a listener to the speaker’s mate value, we used voice samples that did not differ between these groups for voice attractiveness, age, voice pitch, and other acoustic measures. We found that participants indeed rated the voices of those who had a history of cheating as more likely to cheat. Male speakers were given higher ratings for cheating, while female raters were more likely to ascribe the likelihood to cheat to speakers. Additionally, we manipulated the pitch of the voice samples, and for both sexes, the lower pitched versions were consistently rated to be from those who were more likely to have cheated. Regardless of the pitch manipulation, speakers were able to assess actual history of infidelity; the one exception was that men’s accuracy decreased when judging women whose voices were lowered. These findings expand upon the idea that the human voice may be of value as a cheater detection tool and very thin slices of vocal information are all that is needed to make certain assessments about others.
Conference Paper
Full-text available
THE SPECIAL COURSE "PROFESSIONALLY ORIENTED MEDIA EDUCATION" IN DEVELOPING THE PROFESSIONAL COMPETENCE OF FUTURE WORKERS IN THE MINING AND OIL AND GAS INDUSTRIES. Biletskyi Volodymyr Stefanovych, Doctor of Technical Sciences, Professor, NTU "Kharkiv Polytechnic Institute", ukcdb@i.ua. Onkovych Hanna Volodymyrivna, Doctor of Pedagogical Sciences, Professor, Kyiv Medical University, onkan@ukr.net. Onkovych Artem Dmytrovych, Candidate of Pedagogical Sciences, Associate Professor, National University of Culture and Arts, ioj@ukr.net. Abstract. The modern educational process in engineering relies substantially on the use of the internet. Media literacy has become a key marker of a specialist's professionalism. Its acquisition is accompanied by the development of media education technologies. Researchers and practitioners have examined various aspects of using mass media and their products in the educational process. Special courses in media education are appearing at various educational institutions. This indicates the development of higher-education media didactics, which has been enriched with new technologies, terms, and concepts. A need has arisen to generalize these theoretical developments, drawing on the experience of practitioners, in particular in mining and oil and gas engineering. For specialties 184 "Mining" and 185 "Oil and Gas Education", the article proposes a programme for a new special course, "PROFESSIONALLY ORIENTED MEDIA EDUCATION FOR SPECIALISTS IN MINING AND OIL AND GAS ENGINEERING", and gives some recommendations for introducing it into the higher education system. Keywords: mining engineering, oil and gas engineering, media culture, media didactics, higher-education media didactics, media education, professionally oriented media education, professional competence, media competence, special course, development of professional competence. Pp. 340-349.