Chapter

Conducting Judgment Studies: Some Methodological Issues

Abstract

For many years the Handbook of Methods in Nonverbal Behavior Research (Scherer & Ekman, 1982) has been an invaluable text for researchers looking for methods to study nonverbal behavior and the expression of affect. A successor to this essential text, The New Handbook of Methods in Nonverbal Behavior Research is a substantially updated volume with 90% new material. It includes chapters on coding and methodological issues for a variety of areas in nonverbal behavior: facial actions, vocal behavior, and body movement. Issues relevant to judgment studies, methodology, reliability, analyses, etc. have also been updated. The topics are broad and include specific information about methodology and coding strategies in education, psychotherapy, deception, nonverbal sensitivity, and marital and group behavior. There is also a chapter detailing specific information on the technical aspects of recording the voice and face, and specifically in relation to deception studies. This volume will be valuable for both new researchers and those already working in the fields of nonverbal behavior, affect expression, and related topics. It will play a central role in further refining research methods and coding strategies, allowing a comparison of results from various laboratories where research on nonverbal behavior is being conducted. This will advance research in the field and help to coordinate results so that a more comprehensive understanding of affect expression can be developed.

... Instead, we focus on third-party observers' perceptual ratings of emotional expressiveness. This approach has the potential to introduce subjectivity and bias, but we can quantify this subjectivity through inter-rater reliability analyses and mitigate it through the aggregation of ratings [38,51]. Operationalizing emotional expressiveness through observer ratings also has the benefit of paralleling the standard approach used in medicine, where clinicians observe patients' behavior and rate their emotional expressiveness [1]. ...
... In order to measure how emotionally expressive each participant in the GFT dataset was, we paid human annotators to watch and rate each video. To mitigate the influence of annotator subjectivity, each video was rated by multiple annotators, and ratings were averaged per video [51]. Moreover, to increase the reliability and validity of these ratings, annotators completed multiple items (i.e., questions) measuring different aspects of emotional expressiveness, which we combined into a single score per video using latent variable modeling [34]. ...
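The excerpt above describes averaging multiple annotators' ratings per video and quantifying their subjectivity with inter-rater reliability. The sketch below illustrates that recipe under stated assumptions (toy ratings, hypothetical column names); it uses the pingouin library's intraclass correlation and is not the cited authors' code.

```python
# Sketch (not the paper's code): average multiple annotators' ratings per video
# and quantify their agreement with an intraclass correlation.
import pandas as pd
import pingouin as pg

ratings = pd.DataFrame({
    "video": ["v1", "v1", "v1", "v2", "v2", "v2", "v3", "v3", "v3"],
    "rater": ["r1", "r2", "r3"] * 3,
    "score": [3.0, 3.5, 4.0, 1.0, 1.5, 1.0, 4.5, 5.0, 4.0],  # expressiveness ratings (illustrative)
})

# Reliability of the averaged rating: ICC2k (two-way random effects, mean of k raters)
icc = pg.intraclass_corr(data=ratings, targets="video", raters="rater", ratings="score")
icc2k = icc.loc[icc["Type"] == "ICC2k", "ICC"].iloc[0]

# Aggregate by averaging across raters, yielding one score per video
per_video = ratings.groupby("video")["score"].mean()
print(f"ICC(2,k) of averaged ratings: {icc2k:.2f}")
print(per_video)
```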
Conference Paper
Emotional expressiveness captures the extent to which a person tends to outwardly display their emotions through behavior. Due to the close relationship between emotional expressiveness and behavioral health, as well as the crucial role that it plays in social interaction, the ability to automatically predict emotional expressiveness stands to spur advances in science, medicine, and industry. In this paper, we explore three related research questions. First, how well can emotional expressiveness be predicted from visual, linguistic, and multimodal behavioral signals? Second, how important is each behavioral modality to the prediction of emotional expressiveness? Third, which behavioral signals are reliably related to emotional expressiveness? To answer these questions, we add highly reliable transcripts and human ratings of perceived emotional expressiveness to an existing video database and use this data to train, validate, and test predictive models. Our best model shows promising predictive performance on this dataset (RMSE = 0.65, R² = 0.45, r = 0.74). Multimodal models tend to perform best overall, and models trained on the linguistic modality tend to outperform models trained on the visual modality. Finally, examination of our interpretable models' coefficients reveals a number of visual and linguistic behavioral signals, such as facial action unit intensity, overall word count, and use of words related to social processes, that reliably predict emotional expressiveness.
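The abstract reports its performance with three metrics (RMSE, R², Pearson r). The sketch below simply shows how those metrics are computed on held-out predictions; the arrays are placeholders, not the study's data.

```python
# Sketch of the three evaluation metrics reported above, computed on placeholder predictions.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([0.2, 1.0, -0.5, 0.8, 0.1])   # held-out expressiveness scores (illustrative)
y_pred = np.array([0.1, 0.7, -0.2, 0.9, 0.0])   # model predictions (illustrative)

rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root-mean-squared error
r2 = r2_score(y_true, y_pred)
r, _ = pearsonr(y_true, y_pred)
print(f"RMSE={rmse:.2f}  R2={r2:.2f}  r={r:.2f}")
```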
Article
Full-text available
The facilitative interpersonal skills (FIS) task is a performance-based task designed to assess clinicians’ capacity for facilitating a collaborative relationship. Performance on FIS is a robust clinician-level predictor of treatment outcomes. However, the FIS task has limited scalability because human rating of FIS requires specialized training and is time-intensive. We aimed to catalyze a “big needle jump” by developing an artificial intelligence- (AI-) based automated FIS measurement that captures all behavioral audiovisual markers available to human FIS raters. A total of 956 response clips were collected from 78 mental health clinicians. Three human raters rated the eight FIS subscales and reached sufficient interrater reliability (intraclass correlation based on three raters [ICC3k] for overall FIS = 0.85). We extracted text-, audio-, and video-based features and applied multimodal modeling (multilayer perceptron with a single hidden layer) to predict overall FIS and eight FIS subscales rated along a 1–5 scale continuum. We conducted 10-fold cross-validation analyses. For overall FIS, we reached moderate size relationships with the human-based ratings (Spearman’s ρ = .50). Performance for subscales was variable (Spearman’s ρ from .30 to .61). Inclusion of audio and video modalities improved the accuracy of the model, especially for the Emotional Expression and Verbal Fluency subscales. All three modalities contributed to the prediction performance, with text-based features contributing relatively most. Our multimodal model performed better than previously published unimodal models on the overall FIS and some FIS subscales. If confirmed in external validation studies, this AI-based FIS measurement may be used for the development of feedback tools for more targeted training, supervision, and deliberate practice.
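The modelling recipe the abstract describes (a single-hidden-layer perceptron evaluated with 10-fold cross-validation and Spearman's rho against human ratings) can be sketched as below. The feature matrix and FIS ratings are simulated stand-ins, and the hyperparameters are assumptions, not the authors' configuration.

```python
# Minimal sketch: single-hidden-layer MLP regressor, 10-fold cross-validation,
# agreement with human ratings summarised by Spearman's rho.
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))               # fused text/audio/video features (illustrative)
y = X[:, 0] * 0.5 + rng.normal(size=200)     # overall FIS rating (illustrative)

preds = np.empty_like(y)
for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))
    model.fit(X[train], y[train])
    preds[test] = model.predict(X[test])

rho, _ = spearmanr(y, preds)
print(f"Spearman rho (cross-validated): {rho:.2f}")
```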
Article
Full-text available
The aim was to define the association between depression severity, prosody, and voice acoustic features in women with depression, and to compare these features with those of nondepressed people. Prosody and acoustic features were investigated in a cross-sectional study of 30 women with major depression hospitalized in a psychiatric ward and 30 healthy women. The Hamilton Rating Scale for Depression (HRS-D) was used to define the severity of depression. Acoustic parameters such as jitter, shimmer, cepstral peak prominence (CPP), standard deviation of fundamental frequency (SD F0), harmonic-to-noise ratio, and F0, as well as prosodic features including speech rate, mean switching-pause duration, and the durations of sentences produced in different modalities, were measured quantitatively. In addition, six raters judged the patients' prosody qualitatively. SPSS V.28 was used for all statistical analyses (p < 0.05). There was a significant correlation between HRS-D scores and jitter, SD F0, speech rate, and mean switching-pause duration (p ≤ 0.05). Mean CPP and the duration of produced emotional sentences differed between the depression and control groups. HRS-D scores were significantly correlated with switching pauses in patients (Pearson coefficient = 0.47, p = 0.05). The perceptual evaluations of prosody made by the six raters showed 85% agreement (p ≤ 0.001). Some acoustic and prosodic parameters differ between healthy women and women with depression (e.g., CPP and duration of emotional sentences) and may also be associated with the severity of depression (e.g., jitter, SD F0, speech rate, and mean switching-pause duration). The results also indicated that exclamatory sentences, compared with declarative and interrogative ones, would be the best sentence modality for assessing prosody in patients with depression.
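The correlational part of such an analysis is straightforward to illustrate: Pearson correlations between depression severity and each acoustic or prosodic measure. The DataFrame columns and values below are assumptions standing in for the paper's measurements.

```python
# Sketch: Pearson correlations between HRS-D severity and acoustic/prosodic measures.
import pandas as pd
from scipy.stats import pearsonr

df = pd.DataFrame({
    "hrsd": [18, 22, 30, 15, 25, 28],                  # Hamilton scores (illustrative)
    "jitter": [0.9, 1.1, 1.6, 0.8, 1.3, 1.5],           # %
    "sd_f0": [22, 18, 14, 25, 16, 15],                  # Hz
    "speech_rate": [3.9, 3.5, 2.8, 4.2, 3.1, 2.9],      # syllables/s
    "switching_pause": [0.6, 0.8, 1.3, 0.5, 1.0, 1.2],  # s
})

for feature in ["jitter", "sd_f0", "speech_rate", "switching_pause"]:
    r, p = pearsonr(df["hrsd"], df[feature])
    print(f"{feature:16s} r = {r:+.2f}  p = {p:.3f}")
```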
Article
Full-text available
Computer vision is essential for making machines more human-like: it aims to mimic the capabilities of the human visual system in order to extract high-level understanding from images or recordings. Such an application gives a machine the ability to recognize a person and detect their emotions so that it can respond appropriately. Facial verification is the process of identifying or validating a person from an image, video, or other audio-visual capture of their face; it is a biometric identification strategy that distinguishes individuals through their facial structure and biometric information. The system collects a distinctive set of biometric information about each individual's face and facial expressions in order to identify that person. Facial emotion recognition, in turn, analyses affective cues from a variety of sources, such as images and recordings, and helps machines perceive people the way humans do and respond according to their emotions. We use a deep learning method, a convolutional neural network (CNN), to train a model that determines the emotion expressed in an input image. The images are pre-processed before training and testing: they are resized, histogram-equalized, and converted to grayscale. The model has potential applications in both surveillance and feedback systems.
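A minimal sketch of the preprocessing steps the abstract lists (grayscale conversion, histogram equalization, resizing) followed by a small CNN classifier is given below. The input size, layer sizes, and the seven-class emotion layout are assumptions, not the authors' exact architecture.

```python
# Sketch: image preprocessing (grayscale, equalization, resize) plus a small CNN classifier.
import cv2
import numpy as np
from tensorflow.keras import layers, models

def preprocess(image_bgr, size=48):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    gray = cv2.equalizeHist(gray)                        # histogram equalization
    gray = cv2.resize(gray, (size, size))                # resize to the network input size
    return gray.astype("float32")[..., None] / 255.0     # scale to [0, 1], add channel axis

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(7, activation="softmax"),  # e.g. seven emotion categories (assumption)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)  # train on preprocessed face images
```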
Article
Full-text available
A large number of publications have focused on the study of pain expressions. Despite the growing knowledge, the availability of pain-related face databases is still very scarce compared with other emotional facial expressions. The Pain E-Motion Faces Database (PEMF) is a new open-access database currently consisting of 272 micro-clips of 68 different identities. Each model displays one neutral expression and three pain-related facial expressions: posed, spontaneous-algometer and spontaneous-CO2 laser. Normative ratings of pain intensity, valence and arousal were provided by students of three different European universities. Six independent coders carried out a coding process on the facial stimuli based on the Facial Action Coding System (FACS), in which ratings of intensity of pain, valence and arousal were computed for each type of facial expression. Gender and age effects of models across each type of micro-clip were also analysed. Additionally, participants' ability to discriminate the veracity of pain-related facial expressions (i.e., spontaneous vs posed) was explored. Finally, a series of ANOVAs were carried out to test the presence of other basic emotions and common facial action unit (AU) patterns. The main results revealed that posed facial expressions received higher ratings of pain intensity, more negative valence and higher arousal compared with spontaneous pain-related and neutral faces. No differential effects of model gender were found. Participants were unable to accurately discriminate whether a given pain-related face represented spontaneous or posed pain. PEMF thus constitutes a large open-source and reliable set of dynamic pain expressions useful for designing experimental studies focused on pain processes.
Article
The ability to identify whether a user is “zoning out” (mind wandering) from video has many HCI applications (e.g., distance learning, high-stakes vigilance tasks). However, it remains unknown how well humans can perform this task, how they compare to automatic computerized approaches, and how a fusion of the two might improve accuracy. We analyzed videos of users’ faces and upper bodies recorded 10 s prior to self-reported mind wandering (i.e., ground truth) while they engaged in a computerized reading task. We found that a state-of-the-art machine learning model had comparable accuracy to aggregated judgments of nine untrained human observers (area under receiver operating characteristic curve [AUC] = .598 versus .589). A fusion of the two (AUC = .644) outperformed each, presumably because each focused on complementary cues. Furthermore, adding more humans beyond 3–4 observers yielded diminishing returns. We discuss implications of human–computer fusion as a means to improve accuracy in complex tasks.
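The human/model/fusion comparison above can be sketched as three AUCs: one for the model's probabilities, one for the mean of several observers' judgments, and one for a simple average-score fusion of the two. The arrays below are illustrative, and averaging scores is only one possible fusion rule.

```python
# Sketch: AUC for a model, aggregated human judgments, and a naive score-level fusion.
import numpy as np
from sklearn.metrics import roc_auc_score

y = np.array([1, 0, 0, 1, 1, 0, 1, 0])                        # self-reported mind wandering
model_p = np.array([0.7, 0.4, 0.3, 0.6, 0.5, 0.4, 0.8, 0.2])  # model probabilities (illustrative)
human_p = np.array([0.6, 0.3, 0.5, 0.7, 0.4, 0.3, 0.6, 0.4])  # mean of observer judgments (illustrative)

fusion_p = (model_p + human_p) / 2  # naive score-level fusion
for name, scores in [("model", model_p), ("humans", human_p), ("fusion", fusion_p)]:
    print(f"{name:7s} AUC = {roc_auc_score(y, scores):.3f}")
```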
Article
Full-text available
The role of face and context in emotion perception was investigated by manipulating features relevant to the stimuli and to the observer. A nested-stimulus design was used, with subjects nested under stimulus item (an encoder’s facial expression or a written emotion-eliciting scenario presented alone or in an incongruent pair) and type of task instruction (judgment of encoder’s expressed or felt emotion). Subjects, using one type of task instruction, completed a decoding task in which they viewed a facial expression, a written scenario, or a facial expression paired with an emotion-incongruent scenario. Type of task instruction was intended to alter subjects’ perception by directing attention to favor face information (judgment of expressed emotion) or context information (judgment of felt emotion). Then subjects selected the predominant emotion and indicated the intensity of various emotions they perceived the encoder to be expressing or feeling. Judgments were examined for target emotion match to face and/or context using a by-stimulus analysis. The results suggest that when an encoder’s facial expression is discordant with the emotion-eliciting event, subjects will favor facial information when judging what the encoder is expressing, whereas they will favor context information when judging what the encoder is feeling. When face or context was seen alone, type of task instruction did not influence subjects’ judgments. This research provides a more detailed understanding of the role of face and context by exploring how features associated with the observer-encoder interaction influence emotion perception.
Article
Full-text available
Language discordance poses a barrier to effective physician-patient communication, and health care outcomes, such as patient satisfaction, can be associated with language barriers experienced by Spanish-speaking patients. This exploratory study assessed specific aspects of communication between 128 Spanish-speaking primary care patients and their physicians (primary English speakers without an interpreter present). The rating scale developed for this study was used by five raters, who listened to audiotapes of each of these medical visits. Patients and physicians completed measures of visit satisfaction. Results indicated that physicians with better Spanish-language skills were less frustrated with medical visit communication and more connected to their patients; patients whose physicians were rated as having better Spanish-speaking ability reported having greater choice in their medical care. Patients whose physicians spoke more Spanish were more satisfied with the information given by their physicians. Physicians rated as having better Spanish-speaking ability were more likely to say they could not understand all that their patients wanted to tell them. These data support the importance of language concordance in physician-patient communication and awareness of potential communication barriers between physicians and patients.
Article
Full-text available
The assessment of personality traits is now a key part of many important social activities, such as job hunting, accident prevention in transportation, disease treatment, policing, and interpersonal interactions. In a previous study, we predicted personality based on positive images of college students. Although this method achieved a high accuracy, the reliance on positive images alone results in the loss of much personality-related information. Our new findings show that using real-life 2.5D static facial contour images, it is possible to make statistically significant predictions about a wider range of personality traits for both men and women. We address the objective of comprehensive understanding of a person's personality traits by developing a multiperspective 2.5D hybrid personality-computing model to evaluate the potential correlation between static facial contour images and personality characteristics. Our experimental results show that the deep neural network trained by large labeled datasets can reliably predict people's multidimensional personality characteristics through 2.5D static facial contour images, and the prediction accuracy is better than the previous method using 2D images.
Article
Full-text available
Appearance can affect social interaction, which in turn affects personality development. There is ample evidence that facial morphology and social cues provide information about human personality and behavior. In this study, we focused on the relationship between self-reported personality characteristics and facial features. We propose a new approach for predicting college students’ personality characteristics (based on the Big Five personality traits) from static facial images. First, we construct a dataset containing 13,347 data pairs composed of facial images and personality characteristics. Second, we train a deep neural network with 10,667 sample pairs from the dataset and use the remaining samples for testing (1335 pairs) and validation (1335 pairs) against self-reported Big Five personalities. We trained a series of deep neural networks on this large, labeled dataset to predict the self-reported Big Five personality trait scores; this work is novel in applying deep learning to the topic. We also verify the network’s performance on a publicly available database with clearly expressed personality characteristics. The experimental results show that (1) personality traits can be reliably predicted from facial images with an accuracy that exceeds 70%; in the five-trait classification, recognition of neuroticism and extroversion was the most accurate, with prediction accuracy exceeding 90%; (2) deep neural network features outperform traditional hand-crafted features in predicting personality characteristics, strongly supporting the application of neural networks trained on large-scale labeled datasets to multidimensional personality prediction from static facial images; and (3) there are some differences in the personality traits of college students with different academic backgrounds. Future research can explore the relative contribution of other facial image features in predicting other personality characteristics.
Article
Full-text available
Thin slices are used across a wide array of research domains to observe, measure, and predict human behavior. This article reviews the thin-slice method as a measurement technique and summarizes current comparative thin-slice research regarding the reliability and validity of thin slices to represent behavior or social constructs. We outline decision factors in using thin-slice behavioral coding and detail three avenues of thin-slice comparative research: (1) assessing whether thin slices can adequately approximate the total of the recorded behavior or be interchangeable with each other (representativeness); (2) assessing how well thin slices can predict variables that are different from the behavior measured in the slice (predictive validity), and (3) assessing how interpersonal judgment accuracy can depend on the length of the slice (accuracy-length validity). The aim of the review is to provide information researchers may use when designing and evaluating thin-slice behavioral measurement.
Article
Full-text available
Background: Narrative communication is often more persuasive for promoting health behaviour change than communication using facts and figures; the extent to which narrative persuasiveness is due to patients’ identification with the storyteller vs engagement with the story is unclear. Objective: To examine the relative impacts of patient engagement, age concordance and gender concordance on perceived persuasiveness of video‐recorded narrative clips about opioid tapering. Methods: Patient raters watched and rated 48 brief video‐recorded clips featuring 1 of 7 different storytellers describing their experiences with opioid tapering. The dependent variable was clips’ perceived persuasiveness for encouraging patients to consider opioid tapering. Independent variables were rater engagement with the clip, rater‐storyteller gender concordance and rater‐storyteller age concordance (<60 vs ≥60). Covariates were rater beliefs about opioids and opioid tapering, clip duration and clip theme. Mixed‐effects models accounted for raters viewing multiple clips and clips nested within storytellers. Results: In multivariable models, higher rater engagement with the clip was associated with higher perceived persuasiveness (coefficient = 0.46, 95% CI 0.39‐0.53, P < .001). Neither age concordance nor gender concordance significantly predicted perceived persuasiveness. The theme “Problems with opioids” also predicted perceived persuasiveness. Conclusion: Highly engaging, clinically relevant stories are likely persuasive to patients regardless of the match between patient and storyteller age and gender. When using patient stories in tools to promote health behaviour change, stories that are clinically relevant and engaging are likely to be persuasive regardless of storytellers’ demographics. Patient or public contribution: Patients were involved as storytellers (in each clip) and assessed the key study variables.
Article
We report two studies that used facial features to automatically detect mind wandering, a ubiquitous phenomenon whereby attention drifts from the current task to unrelated thoughts. In a laboratory study, university students (N = 152) read a scientific text, whereas in a classroom study high school students (N = 135) learned biology from an intelligent tutoring system. Mind wandering was measured using validated self-report methods. In the lab, we recorded face videos and analyzed these at six levels of granularity: (1) upper-body movement; (2) head pose; (3) facial textures; (4) facial action units (AUs); (5) co-occurring AUs; and (6) temporal dynamics of AUs. Due to privacy constraints, videos were not recorded in the classroom. Instead, we extracted head pose, AUs, and AU co-occurrences in real-time. Machine learning models, consisting of support vector machines (SVM) and deep neural networks, achieved F1 scores of .478 and .414 (25.4% and 20.9% above-chance improvements, both with SVMs) for detecting mind wandering in the lab and classroom, respectively. The lab-based detectors achieved 8.4% improvement over the previous state-of-the-art; no comparison is available for classroom detectors. We discuss how the detectors can integrate into intelligent interfaces to increase engagement and learning by responding to wandering minds.
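One way to express a detector's F1 relative to chance, as the abstract does, is to compare it against a chance-level baseline. The sketch below uses label-shuffled predictions as that baseline; the paper's exact chance-correction may differ, and the labels and predictions are simulated.

```python
# Sketch: F1 of a mind-wandering detector versus a chance baseline built by shuffling predictions.
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.3, size=500)                        # ~30% mind-wandering base rate (assumption)
y_pred = np.where(rng.random(500) < 0.7, y_true, 1 - y_true)   # a noisy detector (illustrative)

f1 = f1_score(y_true, y_pred)
chance_f1 = np.mean([f1_score(y_true, rng.permutation(y_pred)) for _ in range(200)])
print(f"F1 = {f1:.3f}, chance F1 = {chance_f1:.3f}, "
      f"above-chance difference = {f1 - chance_f1:.3f}")
```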
Article
Full-text available
Although the functions of messages varying in verbal person centeredness (PC) are well-established, we know less about the linguistic content that differentiates messages with distinct levels of PC. This study examines the lexicon of different levels of PC comfort and seeks to ascertain whether computerized analysis can complement human coders when coding supportive conversations. Transcripts from support providers trained to enact low, moderate, or high levels of PC were subjected to the Linguistic Inquiry and Word Count (LIWC) dictionary. Results reveal that several categories in the LIWC dictionary vary systematically as a function of conversational PC level. LIWC categories, particularly pronouns, social process, cognitive process, anxiety, and anger words, reliably predict which level of the PC hierarchy an interaction represents based on whether a conversation was designed to be high, moderate, or low in PC. The implications are discussed in the context of the lexicon of conversations that vary in PC.
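The prediction step this abstract implies, using word-category proportions to classify whether a conversation was designed to be low, moderate, or high in person centeredness, can be sketched as below. LIWC itself is proprietary, so the feature table here is a simulated stand-in for LIWC-style category proportions.

```python
# Sketch: multinomial classification of PC level from word-category proportions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.dirichlet(np.ones(5), size=150)   # proportions for 5 LIWC-style categories (illustrative)
levels = np.repeat([0, 1, 2], 50)         # 0 = low, 1 = moderate, 2 = high PC
X[:, 0] += 0.05 * levels                  # toy signal: pronoun use tracks PC level

clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, X, levels, cv=5).mean()
print(f"Cross-validated accuracy predicting PC level: {acc:.2f}")
```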
Article
This study examined communication between 51 transition-aged foster youth and their social workers as related to perceived relationship quality and satisfaction with care receipt/provision. Youth–worker dyads were audio-recorded during a requisite monthly meeting and completed assessments of perceived relationship quality and satisfaction with social services. Communication was rated in a 5-minute excerpt across full audio-recorded speech, verbal transcribed content, and nonverbal content-filtered tone. Findings: Ratings of workers’ communication in transcribed content most closely reflected workers’ reported perceptions of their relationship with the youth. In turn, youth’s perceptions of the relationship and satisfaction with care were most strongly linked to the content of workers’ communication. Similarly, youth’s communication in full speech and content most closely reflected their reported perceptions of their relationship with the worker and their satisfaction with care, and workers’ perceptions of the relationship and satisfaction with care were most strongly linked to these channels of youth communication. Applications: Findings suggest that foster youth and social workers may communicate their authentic beliefs and expectations differentially by communicative channel. Further, both communication partners appeared selectively attuned to the most authentic speaker channels. These findings can inform case planning and intervention work focused on leveraging the power of the worker–youth relationship to improve key service outcomes for foster youth.
Article
Affective computing (AC) adopts a computational approach to study affect. We highlight the AC approach towards automated affect measures that jointly model machine-readable physiological/behavioral signals with affect estimates as reported by humans or experimentally elicited. We describe the conceptual and computational foundations of the approach followed by two case studies: one on discrimination between genuine and faked expressions of pain in the lab, and the second on measuring nonbasic affect in the wild. We discuss applications of the measures, analyze measurement accuracy and generalizability, and highlight advances afforded by computational tipping points, such as big data, wearable sensing, crowdsourcing, and deep learning. We conclude by advocating for increasing synergies between AC and affective science and offer suggestions toward that direction.
Chapter
The objective of the chapter is to highlight the potentially problematic effects that emotional variables and, in particular, evaluative anxiety may have on response processing data collected using think-aloud interviews. Given the collection of response processing data to inform the nature of processing skills that test-takers use to respond to test items, and related validity arguments, it is imperative that the integrity of these data be considered. Recent research illustrating how distractions and disruptions, in some cases emotionally-induced, can influence response processing data in think-aloud interviews is presented. Methods to control and minimize disruptive emotions for participants in think-aloud interviews are also considered, as well as areas for future research.
Article
Full-text available
Recent research has indicated that judgments of competence based on very short exposure to political candidates’ faces reliably predict electoral success. An unexplored question is whether presenting written information of the kind to which voters are typically exposed during an election alongside candidates’ faces affects competence judgments. We conducted three studies using photographs of 16 pairs of competing politicians in 16 medium-sized towns of northeast Italy as stimuli. Study 1 confirmed the external validity of earlier research in which participants were exposed to candidates’ faces without providing any other information. Study 2a showed that competence judgments were not subject to in-group favoritism: candidates’ faces were presented alongside information about the political coalition to which they belonged (center left; center right) to participants who declared a left or right political orientation. Finally, Study 2c compared the competence inferences made in Study 1 (face-only condition) with those of Study 2a (face plus political coalition label) and with new inferences (Study 2b) based on candidates’ faces plus information about campaign promises (greater equality; lower taxes). The results showed that automatic competence inferences are not substantially modified when relevant written information is presented alongside candidates’ faces.
Chapter
Research agendas aimed at the development of socially believable behaving systems often indicate automatic social perception as one of the steps. However, the exact meaning of the word “perception” seems still to be unclear in the computing community, in particular when it applies to social and psychological phenomena that are not accessible to direct observation. This chapter tries to shed light on the problem by showing examples of approaches that perform Automatic Personality Perception, i.e. the prediction of personality traits that people attribute to others.
Article
Full-text available
Nonverbal behavior is a hot topic in the popular management press. However, management scholars have lagged behind in understanding this important form of communication. Although some theories discuss limited aspects of nonverbal behavior, there has yet to be a comprehensive review of nonverbal behavior geared toward organizational scholars. Furthermore, the extant literature is scattered across several areas of inquiry, making the field appear disjointed and challenging to access. The purpose of this paper is to review the literature on nonverbal behavior with an eye towards applying it to organizational phenomena. We begin by defining nonverbal behavior and its components. We review and discuss several areas in the organizational sciences that are ripe for further explorations of nonverbal behavior. Throughout the paper, we offer ideas for future research as well as information on methods to study nonverbal behavior in lab and field contexts. We hope our review will encourage organizational scholars to develop a deeper understanding of how nonverbal behavior influences the social world of organizations.
Article
Full-text available
Previous work in automatic affect analysis (AAA) has emphasized static expressions to the neglect of the dynamics of facial movement and considered head movement only a nuisance variable to control. We investigated whether the dynamics of head and facial movements apart from specific facial expressions communicate affect in infants, an under-studied population in AAA. Age-appropriate tasks were used to elicit positive and negative affect in 31 ethnically diverse infants. 3D head and facial movements were tracked from 2D video. Head angles in the horizontal (pitch), vertical (yaw), and lateral (roll) directions were used to measure head movement; and the 3D coordinates of 49 facial points to measure facial movements. Strong effects were found for both head and facial movements. Angular velocity and angular acceleration of head pitch, yaw, and roll were higher during negative relative to positive affect. Amplitude, velocity, and acceleration of facial movement were higher as well during negative relative to positive affect. A linear discriminant analysis using head and facial movement achieved a mean classification rate of positive and negative affect equal to 65% (Kappa = 0.30). Head and facial movements individually and in combination were also strongly related to observer ratings of affect intensity. Our results suggest that the dynamics of head and facial movements communicate affect at ages as young as 13 months. These interdisciplinary findings from behavioral science and computer vision deepen our understanding of communication of affect and provide a basis for studying individual differences in emotion in socio-emotional development.
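The classification analysis summarised above, linear discriminant analysis separating positive from negative affect from head- and face-movement features, scored with accuracy and Cohen's kappa, can be sketched as follows. The features are simulated placeholders, not the infant data.

```python
# Sketch: LDA classification of positive vs negative affect, scored with accuracy and kappa.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 6))        # e.g. velocity/acceleration of pitch, yaw, roll, facial points
y = rng.binomial(1, 0.5, size=120)   # 0 = positive affect, 1 = negative affect
X[y == 1, :3] += 0.8                 # toy effect: faster head movement under negative affect

pred = cross_val_predict(LinearDiscriminantAnalysis(), X, y, cv=5)
print(f"accuracy = {accuracy_score(y, pred):.2f}, kappa = {cohen_kappa_score(y, pred):.2f}")
```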
Article
Affect detection systems require reliable methods to annotate affective data. Typically, two or more observers independently annotate audio-visual affective data. This approach results in inter-observer reliabilities that can be categorized as fair (Cohen's kappas of approximately .40). In an alternative iterative approach, observers independently annotate small amounts of data, discuss their annotations, and annotate a different sample of data. After a pre-determined reliability threshold is reached, the observers independently annotate the remainder of the data. The effectiveness of the iterative approach was tested in an annotation study where pairs of observers annotated affective video data in nine annotate-discuss iterations. Self-annotations were previously collected on the same data. Mixed effects linear regression models indicated that inter-observer agreement increased (unstandardized coefficient B = .031) across iterations, with agreement in the final iteration reflecting a 64 percent improvement over the first iteration. Follow-up analyses indicated that the improvement was nonlinear in that most of the improvement occurred after the first three iterations (B = .043), after which agreement plateaued (B ≈ 0). There was no notable complementary improvement (B ≈ 0) in self-observer agreement, which was considerably lower than observer-observer agreement. Strengths, limitations, and applications of the iterative affective annotation approach are discussed.
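The core measurement loop, Cohen's kappa per annotate-discuss iteration plus a trend over iterations, is easy to illustrate. The paper used mixed-effects regression; plain OLS is used below only to show the slope, and the agreement data are simulated.

```python
# Sketch: Cohen's kappa per iteration and a simple linear trend across iterations.
import numpy as np
from scipy.stats import linregress
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)
kappas = []
for iteration in range(9):
    agreement_rate = 0.6 + 0.03 * iteration                   # toy: agreement improves over iterations
    a = rng.integers(0, 2, size=100)                           # observer A's labels
    b = np.where(rng.random(100) < agreement_rate, a, 1 - a)   # observer B, partly agreeing
    kappas.append(cohen_kappa_score(a, b))

slope = linregress(range(9), kappas).slope
print("kappa by iteration:", np.round(kappas, 2))
print(f"linear trend per iteration: {slope:+.3f}")
```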
Article
This paper presents a research framework for understanding the empathy that arises between people while they are conversing. By focusing on the process by which empathy is perceived by other people, this paper aims to develop a computational model that automatically infers perceived empathy from participant behavior. To describe such perceived empathy objectively, we introduce the idea of using the collective impressions of external observers. In particular, we focus on the fact that the perception of other's empathy varies from person to person, and take the standpoint that this individual difference itself is an essential attribute of human communication for building, for example, successful human relationships and consensus. This paper describes a probabilistic model of the process that we built based on the Bayesian network, and that relates the empathy perceived by observers to how the gaze and facial expressions of participants co-occur between a pair. In this model, the probability distribution represents the diversity of observers' impression, which reflects the individual differences in the schema when perceiving others' empathy from their behaviors, and the ambiguity of the behaviors. Comprehensive experiments demonstrate that the inferred distributions are similar to those made by observers.
Article
This study investigates the role of gender in physician-patient communication among African American patients in primary care. Patients (N = 137) aged 33 to 67 were nested within 79 southern California primary care physicians' practices. In 48 interactions (35%), the physician was female and/or a member of a minority group. The study directly assessed gender differences through audiotaped physician-patient interactions as well as by measuring patients' and physicians' perceptions of their visit. This study employed a multi-informant design, in which independent raters assessed both physician and patient in audiotaped interactions, and both physician and patient self-reported on aspects of their visit. Discussions of prevention and health promotion were found to be significantly more common with male patients than with female patients but only when the physician was a nonminority male; these disparities disappeared when the physician was female and/or minority. Findings are discussed in terms of physician training, particularly for men and nonminorities.
Article
Full-text available
Four studies investigated the reliability and validity of thin slices of nonverbal behavior from social interactions including (a) how well individual slices of a given behavior predict other slices in the same interaction; (b) how well a slice of a given behavior represents the entirety of that behavior within an interaction; (c) how long a slice is necessary to sufficiently represent the entirety of a behavior within an interaction; (d) which slices best capture the entirety of behavior, across different behaviors; and (e) which behaviors (of six measured behaviors) are best captured by slices. Notable findings included strong reliability and validity for thin slices of gaze and nods, and that a 1.5-min slice from the start of an interaction may adequately represent some behaviors. Results provide useful information to researchers making decisions about slice measurement of behavior.
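The representativeness question above, how well a single thin slice of a behaviour correlates with the total of that behaviour across the whole interaction, can be sketched as below. The slice-by-interaction counts are simulated, and the opening 1.5-minute slice is an assumption for illustration.

```python
# Sketch: correlating an opening thin slice of a behaviour with its total across the interaction.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(4)
n_interactions, n_slices = 60, 8
# e.g. gaze counts per slice; each interaction has its own underlying rate
slices = rng.poisson(lam=rng.uniform(2, 10, size=(n_interactions, 1)),
                     size=(n_interactions, n_slices))

first_slice = slices[:, 0]   # e.g. the opening 1.5-minute slice
total = slices.sum(axis=1)   # entirety of the behaviour in the interaction
r, _ = pearsonr(first_slice, total)
print(f"slice-to-total correlation: r = {r:.2f}")
```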
Article
Full-text available
Effective pain communication is essential if adequate treatment and support are to be provided. Pain communication is often multimodal, with sufferers utilising speech, nonverbal behaviours (such as facial expressions), and co-speech gestures (bodily movements, primarily of the hands and arms that accompany speech and can convey semantic information) to communicate their experience. Research suggests that the production of nonverbal pain behaviours is positively associated with pain intensity, but it is not known whether this is also the case for speech and co-speech gestures. The present study explored whether increased pain intensity is associated with greater speech and gesture production during face-to-face communication about acute, experimental pain. Participants (N = 26) were exposed to experimentally elicited pressure pain to the fingernail bed at high and low intensities and took part in video-recorded semi-structured interviews. Despite rating more intense pain as more difficult to communicate (t(25) = 2.21, p = .037), participants produced significantly longer verbal pain descriptions and more co-speech gestures in the high intensity pain condition (Words: t(25) = 3.57, p = .001; Gestures: t(25) = 3.66, p = .001). This suggests that spoken and gestural communication about pain is enhanced when pain is more intense. Thus, in addition to conveying detailed semantic information about pain, speech and co-speech gestures may provide a cue to pain intensity, with implications for the treatment and support received by pain sufferers. Future work should consider whether these findings are applicable within the context of clinical interactions about pain.
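The paired comparisons reported above, word counts and co-speech gesture counts for the same participants under high- versus low-intensity pain, can be sketched with paired t-tests. The counts below are illustrative, not the study's data.

```python
# Sketch: paired t-tests comparing verbal and gestural output under high vs low pain.
import numpy as np
from scipy.stats import ttest_rel

words_low = np.array([42, 55, 38, 60, 47, 51])
words_high = np.array([58, 70, 45, 73, 52, 66])
gestures_low = np.array([3, 5, 2, 6, 4, 3])
gestures_high = np.array([6, 8, 4, 9, 5, 6])

for label, low, high in [("words", words_low, words_high),
                         ("gestures", gestures_low, gestures_high)]:
    t, p = ttest_rel(high, low)
    print(f"{label:8s} t = {t:.2f}, p = {p:.3f}")
```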
Article
Full-text available
Scholars of supportive communication are primarily concerned with how variations in the quality of enacted support affect individual and relational health and well-being. But who gets to determine what counts as enacted support? There is a large degree of operational heterogeneity for what gets called enacted support, but little attention has been afforded to the issue of whether these assessments are substitutable. In two studies we use self-reports, conversational partner-reports, and third-party ratings of two quintessential behavioral support indicators, namely, listening and immediacy. Using a multitrait–multimethod (MTMM) design, Study 1 found (1) little association between the enacted support assessments and (2) a high degree of common method variance. A second study found moderate-to-high degrees of effective reliability (i.e., consistency of judgments within a set of judgments, or mean judgments) for enacted support evaluations from the perspective of unacquainted and untrained third-party judges. In general, our data provide cautionary evidence that when scholars examine evaluations of enacted support, perspective matters and might ultimately contribute differently to well-being and health.
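The "effective reliability" statistic mentioned above is conventionally the Spearman-Brown reliability of the mean of n judges, computed from their mean inter-judge correlation. The sketch below uses a simulated judge-by-target rating matrix to show the calculation.

```python
# Sketch: effective reliability (Spearman-Brown) of the mean of n third-party judges.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(5)
true_scores = rng.normal(size=40)                                      # targets' "true" immediacy (illustrative)
ratings = true_scores[:, None] + rng.normal(scale=1.0, size=(40, 6))   # 6 judges' ratings

pairwise = [np.corrcoef(ratings[:, i], ratings[:, j])[0, 1]
            for i, j in combinations(range(ratings.shape[1]), 2)]
mean_r = np.mean(pairwise)
n = ratings.shape[1]
effective_reliability = n * mean_r / (1 + (n - 1) * mean_r)            # Spearman-Brown formula
print(f"mean inter-judge r = {mean_r:.2f}, effective reliability = {effective_reliability:.2f}")
```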
Article
Full-text available
Personality is a psychological construct aimed at explaining the wide variety of human behaviors in terms of a few, stable and measurable individual characteristics. In this respect, any technology involving understanding, prediction and synthesis of human behavior is likely to benefit from Personality Computing approaches, i.e. from technologies capable of dealing with human personality. This paper is a survey of such technologies and it aims at providing not only a solid knowledge base about the state-of-the-art, but also a conceptual model underlying the three main problems addressed in the literature, namely Automatic Personality Recognition (inference of the true personality of an individual from behavioral evidence), Automatic Personality Perception (inference of personality others attribute to an individual based on her observable behavior) and Automatic Personality Synthesis (generation of artificial personalities via embodied agents). Furthermore, the article highlights the issues still open in the field and identifies potential application areas.
Conference Paper
Full-text available
Computer classification of facial expressions requires large amounts of data and this data needs to reflect the diversity of conditions seen in real applications. Public datasets help accelerate the progress of research by providing researchers with a benchmark resource. We present a comprehensively labeled dataset of ecologically valid spontaneous facial responses recorded in natural settings over the Internet. To collect the data, online viewers watched one of three intentionally amusing Super Bowl commercials and were simultaneously filmed using their webcam. They answered three self-report questions about their experience. A subset of viewers additionally gave consent for their data to be shared publicly with other researchers. This subset consists of 242 facial videos (168,359 frames) recorded in real world conditions. The dataset is comprehensively labeled for the following: 1) frame-by-frame labels for the presence of 10 symmetrical FACS action units, 4 asymmetric (unilateral) FACS action units, 2 head movements, smile, general expressiveness, feature tracker fails and gender, 2) the location of 22 automatically detected landmark points, 3) self-report responses of familiarity with, liking of, and desire to watch again for the stimuli videos and 4) baseline performance of detection algorithms on this dataset. This data is available for distribution to researchers online, the EULA can be found at: http://www.affectiva.com/facial-expression-dataset-am-fed/.
Article
Recent judgment studies have shown that people are able to fairly correctly attribute emotional states to others' bodily expressions. It is, however, not clear which movement qualities are salient, and how this applies to emotional gesture during speech-based interaction. In this study we investigated how the expression of emotions that vary on three major emotion dimensions-that is, arousal, valence, and potency-affects the perception of dynamic arm gestures. Ten professional actors enacted 12 emotions in a scenario-based social interaction setting. Participants (N = 43) rated all emotional expressions with muted sound and blurred faces on six spatiotemporal characteristics of gestural arm movement that were found to be related to emotion in previous research (amount of movement, movement speed, force, fluency, size, and height/vertical position). Arousal and potency were found to be strong determinants of the perception of gestural dynamics, whereas the differences between positive or negative emotions were less pronounced. These results confirm the importance of arm movement in communicating major emotion dimensions and show that gesture forms an integrated part of multimodal nonverbal emotion communication.
Article
Full-text available
This study is interested in what sources of team identity formation are related to self-categorization as a sport team fan and the strength of that team identification, and what affective and psychological outcomes become salient in spectatorship scenarios. Participants were administered self-report instruments previously designed to measure team identity formation and psychological effects, then given cognitive tasks adapted from a previous study (Markus, 1977). Participants were required to return to the lab to watch highlights and lowlights of the attending football team's season. These videos were recorded and coded for affective responses. Because previous evidence supports connections between identity formation, self-categorization/strength of identity, psychological effects, and affective responses, a generalized latent variable model was estimated. The model fit the data, exposing a mediated relationship. This study extends upon previous research by isolating specific aspects of team identity formation that differentially influence affective and communicative responses, especially when mediated by sport team identification. Findings also support the assertion that identity is related to the value and emotional attachment placed on a group membership.
Article
Full-text available
Facial encoding of a sample of children with high-functioning autism spectrum disorders (HFASD) was compared to facial encoding of matched typically developing children. Each participant was photographed after being prompted to enact a facial expression for six basic emotions. Raters evaluated (a) the extent to which the photo reflected the emotion, (b) the emotion in the photograph, and (c) the degree to which the photo appeared odd. Children with HFASD were significantly less adept at encoding sadness, and their expressions were significantly odder than those of their typical peers. Nonsignificant trends for children with HFASD suggested somewhat greater difficulty encoding anger and fear, as well as somewhat greater skill in encoding surprise and disgust, which was unanticipated.
Article
Full-text available
The ability to measure agreement between two independent observers is vital to any observational study. We use a unique situation, the calculation of inter-rater reliability for transcriptions of a parrot’s speech, to present a novel method of dealing with inter-rater reliability which we believe can be applied to situations in which speech from human subjects may be difficult to transcribe. Challenges encountered included (1) a sparse original agreement matrix which yielded an omnibus measure of inter-rater reliability, (2) “lopsided” 2 × 2 matrices (i.e. subsets) from the overall matrix and (3) categories used by the transcribers which could not be pre-determined. Our novel approach involved calculating reliability on two levels—that of the corpus and that of the above mentioned smaller subsets of data. Specifically, the technique included the “reverse engineering” of categories, the use of a “null” category when one rater observed a behavior and the other did not, and the use of Fisher’s Exact Test to calculate r-equivalent for the smaller paired subset comparisons. We hope this technique will be useful to those working in similar situations where speech may be difficult to transcribe, such as with small children.
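The subset-level step described above, Fisher's Exact Test on a 2 × 2 agreement table with the p-value converted to an r-equivalent effect size, can be sketched as below. The table is illustrative, and the conversion shown is the standard published r-equivalent formula (from a one-tailed p and sample size), not necessarily the authors' exact code.

```python
# Sketch: Fisher's exact test on a 2x2 agreement subset, with r-equivalent from the one-tailed p.
import numpy as np
from scipy.stats import fisher_exact, t as t_dist

table = np.array([[12, 3],   # both raters transcribed the word / only rater A did
                  [2, 8]])   # only rater B did / neither did ("null" category)
_, p_one_tailed = fisher_exact(table, alternative="greater")

n = table.sum()
df = n - 2
t_value = t_dist.isf(p_one_tailed, df)            # t whose one-tailed p equals the observed p
r_equivalent = t_value / np.sqrt(t_value**2 + df)
print(f"p = {p_one_tailed:.4f}, r-equivalent = {r_equivalent:.2f}")
```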
Article
Full-text available
To investigate the relation between vocal prosody and change in depression severity over time, 57 participants from a clinical trial for treatment of depression were evaluated at seven-week intervals using a semistructured clinical interview for depression severity (Hamilton Rating Scale for Depression (HRSD)). All participants met criteria for major depressive disorder (MDD) at week one. Using both perceptual judgments by naive listeners and quantitative analyses of vocal timing and fundamental frequency, three hypotheses were tested: 1) Naive listeners can perceive the severity of depression from vocal recordings of depressed participants and interviewers. 2) Quantitative features of vocal prosody in depressed participants reveal change in symptom severity over the course of depression. 3) Interpersonal effects occur as well; such that vocal prosody in interviewers shows corresponding effects. These hypotheses were strongly supported. Together, participants' and interviewers' vocal prosody accounted for about 60 percent of variation in depression scores, and detected ordinal range of depression severity (low, mild, and moderate-to-severe) in 69 percent of cases (kappa = 0.53). These findings suggest that analysis of vocal prosody could be a powerful tool to assist in depression screening and monitoring over the course of depressive disorder and recovery.
Article
Full-text available
Computational research with continuous representations depends on obtaining continuous representations from human labellers. The main method used for that purpose is tracing. Tracing raises a range of challenging issues, both psychological and statistical. Naive assumptions about these issues are easy to make, and can lead to inappropriate requirements and uses. The natural function of traces is to capture perceived affect, and as such they belong in long traditions of research on both perception and emotion. Experiments on several types of material provide information about their characteristics, particularly the ratings on which people tend to agree. Disagreement is not necessarily a problem in the technique. It may correctly show that people's impressions of emotion diverge more than commonly thought. A new system, Gtrace, is designed to let rating studies capitalise on a decade of experience and address the research questions that are opened up by the data now available.
Article
Full-text available
Two studies examined vocal affect in medical providers’ and patients’ content-filtered (CF) speech. A digital methodology for content-filtering and a set of reliable global affect rating scales for CF voice were developed. In Study 1, ratings of affect in physicians’ CF voice correlated with patients’ satisfaction, perceptions of choice/control, medication adherence, mental and physical health, and physicians’ satisfaction. In Study 2, ratings of affect in the CF voices of physicians and nurses correlated with their patients’ satisfaction, and the CF voices of nurses and patients reflected their satisfaction. Voice tone ratings of providers and patients were intercorrelated, suggesting reciprocity in their vocal affective communication.
Conference Paper
Full-text available
Many people believe that emotions and subjective feelings are one and the same and that a goal of human-centered computing is emotion recognition. The first belief is outdated; the second mistaken. For human-centered computing to succeed, a different way of thinking is needed. Emotions are species-typical patterns that evolved because of their value in addressing fundamental life tasks. Emotions consist of multiple components, of which subjective feelings may be one. They are not directly observable, but inferred from expressive behavior, self-report, physiological indicators, and context. I focus on expressive facial behavior because of its coherence with other indicators and research. Among the topics included are measurement, timing, individual differences, dyadic interaction, and inference. I propose that design and implementation of perceptual user interfaces may be better informed by considering the complexity of emotion, its various indicators, measurement, individual differences, dyadic interaction, and problems of inference.