ABSTRACT: MACH (My Automated Conversation coacH) is a novel system that provides ubiquitous access to social skills training. The system includes a virtual agent that reads facial expressions, speech, and prosody and responds with verbal and nonverbal behaviors in real time. This paper presents an application of MACH in the context of training for job interviews. During the training, MACH asks interview questions, automatically mimics certain user behaviors, and exhibits appropriate nonverbal behaviors. Following the interaction, MACH provides visual feedback on the user's performance. The development of this application draws on data from 28 interview sessions involving employment-seeking students and career counselors. The effectiveness of MACH was assessed through a weeklong trial with 90 MIT undergraduates. Students who interacted with MACH were rated by human experts as having improved in overall interview performance, while the ratings of students in control groups did not improve. Post-experiment interviews indicate that participants found the interview experience informative about their behaviors and expressed interest in using MACH in the future.
ABSTRACT: We create two experimental situations to elicit two affective states: frustration and delight. In the first experiment, participants were asked to recall situations while expressing either delight or frustration, while the second experiment tried to elicit these states naturally through a frustrating experience and through a delightful video. There were two significant differences in the nature of the acted versus natural occurrences of expressions. First, the acted instances were much easier for the computer to classify. Second, in 90 percent of the acted cases, participants did not smile when frustrated, whereas in 90 percent of the natural cases, participants smiled during the frustrating interaction, despite self-reporting significant frustration with the experience. As a follow-up study, we develop an automated system to distinguish between naturally occurring spontaneous smiles under frustrating and delightful stimuli by exploring their temporal patterns given video of both. We extracted local and global features related to human smile dynamics. Next, we evaluated and compared two variants of Support Vector Machine (SVM), Hidden Markov Models (HMM), and Hidden-state Conditional Random Fields (HCRF) for binary classification. While human classification of the smile videos under frustrating stimuli was below chance, an accuracy of 92 percent in distinguishing smiles under frustrating and delightful stimuli was obtained using a dynamic SVM classifier.
ABSTRACT: Have you ever wondered whether it's possible to quantitatively measure how friendly or welcoming a community is? Or imagined which parts of the community are happier than others? In this work, we introduce a new technology that begins to address these questions.
ABSTRACT: This work is part of research to build a system to combine facial and prosodic information to recognize commonly occurring user states such as delight and frustration. We create two experimental situations to elicit two emotional states: the first involves recalling situations while expressing either delight or frustration; the second experiment tries to elicit these states directly through a frustrating experience and through a delightful video. We find two significant differences in the nature of the acted vs. natural occurrences of expressions. First, the acted ones are much easier for the computer to recognize. Second, in 90% of the acted cases, participants did not smile when frustrated, whereas in 90% of the natural cases, participants smiled during the frustrating interaction, despite self-reporting significant frustration with the experience. This paper begins to explore the differences in the patterns of smiling that are seen under natural frustration and delight conditions, to see if there might be something measurably different about the smiles in these two cases, which could ultimately improve the performance of classifiers applied to natural expressions.
ABSTRACT: This work is part of a research effort to understand and characterize the morphological and dynamic features of polite and amused smiles. We analyzed a dataset consisting of young adults (n=61), interested in learning about banking services, who met with a professional banker face-to-face in a conference room while both participants’ faces were unobtrusively recorded. We analyzed 258 instances of amused and polite smiles from this dataset, noting also whether they were shared, which we defined as the rise of one smile starting before the decay of another. Our analysis confirms previous findings showing longer durations of amused smiles while also suggesting new findings about symmetry of the smile dynamics. We found more symmetry in the velocities of the rise and decay of the amused smiles, and less symmetry in the polite smiles. We also found the fastest decay velocity for polite but shared smiles.
Affective Computing and Intelligent Interaction - 4th International Conference, ACII 2011, Memphis, TN, USA, October 9-12, 2011, Proceedings, Part I; 01/2011
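The definition of a shared smile in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Smile` record, its field names, and the reading of "before the decay" as "before the onset of the decay" are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Smile:
    """Hypothetical record of one smile; all times are in seconds."""
    rise_start: float   # onset of the rise phase
    decay_start: float  # onset of the decay phase
    decay_end: float    # end of the decay phase


def is_shared(a: Smile, b: Smile) -> bool:
    """Return True if the later smile starts rising before the earlier one starts decaying.

    Assumption: "before the decay of another" means before the decay onset.
    """
    first, second = (a, b) if a.rise_start <= b.rise_start else (b, a)
    return second.rise_start < first.decay_start


banker = Smile(rise_start=0.0, decay_start=2.0, decay_end=3.0)
customer_overlapping = Smile(rise_start=1.0, decay_start=4.0, decay_end=5.0)
customer_late = Smile(rise_start=2.5, decay_start=4.0, decay_end=5.0)

print(is_shared(banker, customer_overlapping))
print(is_shared(banker, customer_late))
```

The check is symmetric in its two arguments, so it does not matter which participant's smile is passed first.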
ABSTRACT: Participatory user interface design with adolescent users on the autism spectrum presents a number of unique challenges and opportunities. Through our work developing a system to help autistic adolescents learn to recognize facial expressions, we have learned valuable lessons about software and hardware design issues for this population. These lessons may also be helpful in assimilating iterative user input to customize technology for other populations with special needs.
Proceedings of the 27th International Conference on Human Factors in Computing Systems, CHI 2009, Extended Abstracts Volume, Boston, MA, USA, April 4-9, 2009; 01/2009
ABSTRACT: This paper describes the challenges of getting ground truth affective labels for spontaneous video, and presents implications for systems such as virtual agents that have automated facial analysis capabilities. We first present a dataset from an intelligent tutoring application and describe the most prevalent approach to labeling such data. We then present an alternative labeling approach, which closely models how the majority of automated facial analysis systems are designed. We show that while participants, peers and trained judges report high inter-rater agreement on expressions of delight, confusion, flow, frustration, boredom, surprise, and neutral when shown the entire 30 minutes of video for each participant, inter-rater agreement drops below chance when human coders are asked to watch and label short 8-second clips for the same set of labels. We also perform discriminative analysis for facial action units for each affective state represented in the clips. The results emphasize that human coders heavily rely on factors such as familiarity of the person and context of the interaction to correctly infer a person's affective state; without this information, the reliability of humans as well as machines attributing affective labels to spontaneous facial-head movements drops significantly.
Intelligent Virtual Agents, 9th International Conference, IVA 2009, Amsterdam, The Netherlands, September 14-16, 2009, Proceedings; 01/2009
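To make "inter-rater agreement below chance" concrete, here is a minimal Python sketch of one common chance-corrected agreement measure, Cohen's kappa, for two raters. The abstract above does not say which statistic the authors used, so the choice of kappa, the function name `cohens_kappa`, and the toy labels are all assumptions for illustration.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from each rater's label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label everywhere; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy labels (invented for illustration, not data from the paper).
full_video_a = ["delight", "frustration", "boredom", "delight"]
full_video_b = ["delight", "frustration", "boredom", "delight"]
short_clip_a = ["delight", "frustration"]
short_clip_b = ["frustration", "delight"]

print(cohens_kappa(full_video_a, full_video_b))
print(cohens_kappa(short_clip_a, short_clip_b))
```

A kappa of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance.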
ABSTRACT: Individuals on the autism spectrum often have difficulties producing intelligible speech with either high or low speech rate, and atypical pitch and/or amplitude affect. In this study, we present a novel intervention towards customizing speech-enabled games to help them produce intelligible speech. In this approach, we clinically and computationally identify the areas of speech production difficulties of our participants. We provide an interactive and customized interface for the participants to meaningfully manipulate the prosodic aspects of their speech. Over the course of 12 months, we have conducted several pilots to set up the experimental design, and developed a suite of games and audio processing algorithms for prosodic analysis of speech. Preliminary results demonstrate that our intervention is engaging and effective for our participants.
INTERSPEECH 2009, 10th Annual Conference of the International Speech Communication Association, Brighton, United Kingdom, September 6-10, 2009; 01/2009
ABSTRACT: Social communication in autism is significantly hindered by difficulties processing affective cues in real-time face-to-face interaction. The interactive Social-Emotional Toolkit (iSET) allows its users to record and annotate video with emotion labels in real time, then review and edit the labels later to bolster understanding of affective information present in interpersonal interactions. The iSET demo will let the ACII audience experience the augmentation of interpersonal interactions by using the iSET system.