Facial Action Coding System (FACS): A technique for the measurement of facial action
... The coder was qualified through successful completion of an examination administered by the system's developers. To calculate interrater reliability, 30 seconds of 20% of videos were additionally FACS coded and the Ekman-Friesen formula (35) was used to compute reliability, reaching 90.5% agreement between FACS codes, which compared favorably to previous studies (35). FACS coding was performed using specialized software, namely the Observer Video-Pro (Noldus Information Technology). ...
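As a rough illustration of the agreement computation described above, here is a minimal Python sketch. It assumes a commonly cited reading of the Ekman-Friesen agreement index (twice the number of AUs both coders scored, divided by the total number of AUs scored by either coder); the exact formula from reference (35) is not reproduced, and the AU sets below are invented for illustration.

```python
# Hedged sketch: inter-coder agreement index as commonly attributed to
# Ekman & Friesen (1978): twice the number of AUs both coders scored,
# divided by the total number of AUs scored by either coder.
# The AU sets below are illustrative, not data from the study.

def facs_agreement(coder_a, coder_b):
    """Return the agreement ratio for two sets of FACS codes of one event."""
    coder_a, coder_b = set(coder_a), set(coder_b)
    agreed = len(coder_a & coder_b)
    total = len(coder_a) + len(coder_b)
    return 2 * agreed / total if total else 1.0

# Example: both coders scored AU6 and AU12; coder B additionally scored AU25.
print(facs_agreement({"AU6", "AU12"}, {"AU6", "AU12", "AU25"}))  # 0.8
```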
... Similarly, Wong et al. (27) demonstrated that hindering patients from observing smiles because doctors wore face masks had a significant negative effect on perceptions of empathy. Notably, our study differs from these previous studies as the first to systematically code smiling using the Facial Action Coding System (FACS), which allows for a nuanced, reliable, and valid analysis of facial expression (35). ...
Introduction
Although the importance of facial expressions for good doctor-patient communication is widely acknowledged, empirical evidence supporting this notion is scarce. We used a fine-grained, anatomically-based measure to investigate which facial expressions are displayed in (simulated) doctor-patient consultations and whether these can predict communication quality.
Methods
Fifty-two medical students engaged in simulated doctor-patient consultations with standardized patients (SPs), and their facial expressions were analyzed using the Facial Action Coding System (FACS). The quality of the communication was rated by SPs, by the medical students themselves, and by communication experts. SPs also rated their level of comfort.
Results
The predominant facial expression displayed by medical students was smiling. Medical students' smiling significantly and positively predicted the quality of communication and the level of comfort experienced by SPs. In contrast, smiling had little effect on medical students' self-assessments and on expert assessments of communication quality. This predictive power was found for genuine and for social smiles alike, as well as for smiles displayed during speaking and during listening.
Discussion
Smiling seems to be a robust non-verbal behavior that has the potential to improve doctor-patient communication. This knowledge should be taken into consideration in medical training programs.
... Emotions and facial expressions were broadly studied by Ekman (2003), which led to the creation of the Facial Action Coding System (FACS; Ekman and Friesen, 1978; Ekman, 2002). The FACS describes the movements of facial muscles (anatomical facial movements), which are composed of "action units" (AU; Ekman and Friesen, 1978). Ekman (2003) also identified seven emotions that are universal across cultures: fear, disgust, joy, sadness, anger, contempt, and surprise, which are all distinctive but share related action units. ...
... The analysis of the action units and emotions was performed using iMotions computer software, version 8.2.4.0 (iMotions). The software is based on the Facial Action Coding System (FACS; Ekman and Friesen, 1978; Ekman, 2002) ...
Introduction
Self-protection, also called protective anger or assertive anger, is a key factor in mental health. Thus far, researchers have focused mainly on the qualitative analysis of self-protection.
Methods
Therefore, we investigated facial action units, emotions, and vocal cues in low and high self-protective groups of participants in order to detect any differences. The total sample consisted of 239 participants. Using the Performance factor in the Short version of the Scale for Interpersonal Behavior (lower 15th percentile and upper 15th percentile), we selected 33 high self-protective participants (11 men, 22 women) and 25 low self-protective participants (8 men, 17 women). The self-protective dialogue was recorded using the two-chair technique script from Emotion Focused Therapy. The subsequent analysis was performed using iMotions software (for action units and emotions) and Praat software (for the vocal cues of pitch and intensity). We used multilevel models in the program R for the statistical analysis.
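As a rough Python analogue of the multilevel-model analysis mentioned above (the study itself used R), the following sketch fits a random-intercept model with statsmodels. The column names (participant, group, au_intensity) and the values are placeholders, not the study's variables or data.

```python
# Hedged sketch: a random intercept per participant with a fixed effect of
# self-protective group, in the spirit of the multilevel models described above.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5", "p6", "p6"],
    "group": ["high"] * 6 + ["low"] * 6,
    "au_intensity": [0.42, 0.51, 0.55, 0.48, 0.60, 0.52, 0.20, 0.27, 0.25, 0.30, 0.22, 0.18],
})

model = smf.mixedlm("au_intensity ~ group", df, groups=df["participant"])
result = model.fit()
print(result.summary())
```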
Results
Compared to low self-protective participants, high self-protective participants exhibited more contempt and fear and less surprise and joy. Compared to low self-protective participants, high self-protective participants expressed the following action units less often: Mouth Open (AU25), Smile (AU12), Brow Raise (AU2), Cheek Raise (AU6), and Inner Brow Raise (AU1), and the following more often: Brow Furrow (AU4), Chin Raise (AU17), Smirk (AU12), Upper Lip Raise (AU10), and Nose Wrinkle (AU9). We found no differences between the two groups in the use of vocal cues.
Discussion
These findings bring us closer to understanding and diagnosing self-protection.
... These systems leverage a modular pipeline approach: capturing input, preprocessing images, detecting and cropping the facial region, extracting features, and classifying emotions (Goodfellow et al., 2013) [5]. Beyond improving human-computer interaction, these systems have shown promise in areas such as mental health monitoring, adaptive learning environments, and marketing analysis (Ekman & Friesen, 1978) [6]. This paper focuses on presenting a robust system capable of detecting seven primary human emotions using a combination of CNNs for feature extraction and logistic regression for classification. ...
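As a rough illustration of the modular pipeline described above (CNN feature extraction feeding a logistic-regression classifier), here is a minimal sketch. The tiny network, the 48x48 input size, and the random faces and labels are placeholders; the paper's actual architecture is not reproduced.

```python
# Hedged sketch of a capture -> preprocess -> CNN features -> logistic regression pipeline.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

class TinyCNN(nn.Module):
    """Minimal convolutional feature extractor for 48x48 grayscale face crops."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.features(x)

# Stand-in data: 64 face crops, 7 emotion classes.
faces = torch.randn(64, 1, 48, 48)
labels = np.random.randint(0, 7, size=64)

with torch.no_grad():
    feats = TinyCNN()(faces).numpy()

clf = LogisticRegression(max_iter=1000).fit(feats, labels)
print("train accuracy:", clf.score(feats, labels))
```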
... This procedure was originally expected to inhibit smiling, and it lowered the funniness ratings of a set of cartoons. The lip position has been assumed to suppress the smile by engaging the orbicularis oris muscle, leading to lower EMG activity in the zygomaticus major and potentially increased activity in the orbicularis oris (Ekman et al., 1971; Ekman & Friesen, 1978; Hjortsjö, 1970; Izard, 1971; Soussignan, 2002). However, this is not without controversy, since EMG evidence (see Oberman et al., 2007) suggests that this procedure promotes a pattern of facial muscle activation close to relaxation, with no differential activation of any muscle (levator, zygomaticus, orbicularis oris, or buccinator). ...
... Hold a pen horizontally in the mouth with the lips (preventing the teeth from touching it); we will refer to this as the Niedenthal procedure (see Maringer et al., 2011). EMG evidence (Domingos, 2012) found support for the assumption that it interferes with the activation of lower face muscles, since it activates the zygomaticus, usually associated with happy displays (Ekman et al., 1971; Ekman & Friesen, 1978; Hjortsjö, 1970; Izard, 1971). Congruently, Niedenthal et al. (2009; Experiment 3) found the procedure to selectively impair the accuracy of judging joy words as emotional. ...
Processing is oriented by goals that determine the details of the stimuli to be attended. Previous studies claim that the determination of word valence (neutral, positive, or negative) is prioritized at early processing stages. This effect of immediate processing of affective information is supported by behavioral and psychophysiological evidence. Here we address this primacy of affect hypothesis in word processing by performing different blocking procedures on the facial muscles relevant for processing the affective dimension of the stimuli on preference (Experiment 1) and lexical decision tasks (Experiment 2). The results show that not only evaluative judgments were disturbed by blocking procedures, but that the same result occurred when the affective information was irrelevant to the task. Evidence shows similar interference from blocking facial muscle activity on affective word processing in both experiments, with procedures that immobilize the zygomatic muscle having a greater impact on the processing of positive words. We discuss the informative role of demonstrating these effects as occurring regardless of the processing goal, highlighting different patterns associated with the various blocking procedures.
... The Facial Action Coding System (FACS) has been a widely used tool in human facial expression research since Ekman and colleagues made it available in a training manual [1][2][3]. FACS is a standardised coding system, which identifies and describes in detail the brief facial movements of the human face based on the corresponding musculature. Whenever a facial muscle contracts, parts of the skin move, leading to noticeable changes in the face's appearance. ...
... In this study, we introduce the GorillaFACS (Gorilla Facial Action Coding System), the first objective, systematic, and quantifiable tool for the scientific measurement of gorilla facial movements. It is based on the underlying musculature of the gorilla face and its muscular homologies to the human face, following the methodology of the human FACS, which was initially developed to study human facial behaviour [1][2][3]. ...
The Facial Action Coding System (FACS) is an objective observation tool for measuring human facial behaviour. It avoids subjective attributions of meaning by objectively measuring independent movements linked to facial muscles, called Action Units (AUs). FACS has been adapted to 11 other taxa, including most apes, macaques and domestic animals, but not yet gorillas. To carry out cross-species studies of facial expressions within and beyond apes, gorillas need to be included in such studies. Hence, we developed the GorillaFACS for the Gorilla spp. We followed a similar methodology as previous FACS: First, we examined the facial muscular plan of the gorilla. Second, we analysed gorilla videos in a wide variety of contexts to identify their spontaneous facial movements. Third, we classified the individual facial movements according to appearance changes produced by the corresponding underlying musculature. A diverse repertoire of 42 facial movements was identified in the gorilla, including 28 AUs and 14 Action Descriptors, with several new movements not identified in the HumanFACS. Although some of the movements in gorillas differ from humans, the total number of AUs is comparable to the HumanFACS (32 AUs). Importantly, the gorilla’s range of facial movements was larger than expected, suggesting a more relevant role in social interactions than was previously assumed. GorillaFACS is a scientific tool to measure facial movements and thus will allow us to better understand the gorilla’s expressions and communication. Furthermore, GorillaFACS has the potential to be used as an important tool to evaluate this species’ welfare, particularly in settings of close proximity to humans.
... Acknowledging the challenges these children might face in navigating real-world social interactions, we opted to utilize cartoon representations of animals with simplified human expressions in the game. This design choice focuses on key facial features, such as eyes, mouth, and eyebrows, to vividly portray a spectrum of emotions, according to the FACS (Facial Action Coding System) [29]. The simplification to non-human cartoons not only makes the emotional expressions clearer but also reduces the social complexity often associated with ...
... FACS-based [29] emotional expression designs in EMooly:
Anger: Brow Lowerer, Upper Lid Raiser, Lid Tightener, Lip Tightener
Fear: Inner Brow Raiser, Outer Brow Raiser, Brow Lowerer, Upper Lid Raiser, Lid Tightener, Lip Stretcher, Jaw Drop
Neutral: In a natural state, facial expressions involve no muscle activity.
This approach aims to facilitate a more accessible and less pressured learning environment for autistic children, allowing them to understand and learn about emotions more comfortably [74]. ...
Children with autism spectrum disorder (ASD) have social-emotional deficits that lead to difficulties in recognizing emotions as well as understanding and responding to social interactions. This study presents EMooly, a tablet game that actively involves caregivers and leverages augmented reality (AR) and generative AI (GenAI) to enhance social-emotional learning for autistic children. Through a year of collaborative effort with five domain experts, we developed EMooly that engages children through personalized social stories, interactive and fun activities, and enhanced caregiver participation, focusing on emotion understanding and facial expression recognition. Compared with a baseline, a controlled study with 24 autistic children and their caregivers showed EMooly significantly improved children's emotion recognition skills and its novel features were preferred and appreciated. EMooly demonstrates the potential of AI and AR in enhancing social-emotional development for autistic children via prompt personalizing and engagement, and highlights the importance of caregiver involvement for optimal learning outcomes.
... And in the second, this is a reflexive process that a person manifests unconsciously. Paul Ekman and Wallace Friesen developed the Facial Action Coding System (FACS) in 1978 [19]. It classifies a person's facial expressions. ...
This article discusses the rapid development of emotional artificial intelligence and its significance in distance education. It highlights the necessity of considering students' emotions to enhance educational quality. The analysis shows that emotions are reactions to stimuli and help the body adapt, with facial expressions being a universal method of expressing emotions. These expressions can be encoded using a facial movement coding system, simplifying the process of reading emotions. The article also examines the influence of personality types on emotions and presents existing classifications of personality types. A functional model has been developed for organizing personalized distance education, integrating a decision support system within the LMS that considers students' emotional states. Furthermore, a mathematical model for managing the distance education process has been created that takes into account students' emotional information, focusing on five primary emotions and four personality types. The article details the development of an algorithm for providing emotional support to influence students' emotional states, which has been implemented in the decision support system for distance education. Enhancing distance education presents significant opportunities, such as personalized learning, reduced stress, increased motivation, and efficiency, all contributing to a higher quality of education.
... The Facial Action Coding System (FACS) [6] maps facial movements into Action Units (AUs) with intensity levels, aggregating into the Prkachin and Solomon Pain Intensity (PSPI) score [23]. The UNBC dataset [19] provides AU- and PSPI-labeled videos, supporting AU-informed methods like K-Nearest Neighbor [30] and Bayesian Networks [9], which report high accuracy but suffer from overoptimism due to class imbalance and overlook AU relationships. ...
Understanding pain-related facial behaviors is essential for digital healthcare in terms of effective monitoring, assisted diagnostics, and treatment planning, particularly for patients unable to communicate verbally. Existing data-driven methods of detecting pain from facial expressions are limited due to interpretability and severity quantification. To this end, we propose GraphAU-Pain, leveraging a graph-based framework to model facial Action Units (AUs) and their interrelationships for pain intensity estimation. AUs are represented as graph nodes, with co-occurrence relationships as edges, enabling a more expressive depiction of pain-related facial behaviors. By utilizing a relational graph neural network, our framework offers improved interpretability and significant performance gains. Experiments conducted on the publicly available UNBC dataset demonstrate the effectiveness of the GraphAU-Pain, achieving an F1-score of 66.21% and accuracy of 87.61% in pain intensity estimation.
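As a rough illustration of the graph construction idea described above (AUs as nodes, co-occurrence relationships as weighted edges), here is a minimal sketch. The binary AU matrix is random stand-in data, not the UNBC dataset, and GraphAU-Pain's actual relational GNN layers are not reproduced.

```python
# Hedged sketch: build an AU co-occurrence adjacency matrix from per-frame AU occurrences.
import numpy as np

au_names = ["AU4", "AU6", "AU7", "AU9", "AU10", "AU43"]  # pain-related AUs (illustrative choice)
frames = np.random.randint(0, 2, size=(500, len(au_names)))  # frame x AU occurrence (stand-in)

# Co-occurrence weight: how often two AUs are active in the same frame.
adjacency = frames.T @ frames
np.fill_diagonal(adjacency, 0)

for i, a in enumerate(au_names):
    for j, b in enumerate(au_names):
        if j > i and adjacency[i, j] > 0:
            print(f"{a} -- {b}: weight {adjacency[i, j]}")
```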
... In the control condition, a still image is shown in which the activist has a neutral facial expression, while in the experimental condition, a still image is shown in which the same activist has an expression of contempt on her face. Both images were analysed independently by two F.A.C.S. coders (Facial Action Coding System; Ekman et al. 1978; Hager et al. 2002). The analysis of facial expressions showed that in the neutral image (Figure 1) Luisa shows no Action Unit (AU), whereas in the contempt image (Figure 2), analysed with the F.A.C.S., she shows AU14 (forming dimples at the sides of the mouth due to the activation of the Buccinator muscle), AU17 (raising of the chin through the Mentalis muscle) and a slight AU24 (pressure of the lips given by the Orbicularis oris muscle). ...
Despite the growing interest in the social effects of emotional expressions within contemporary psychosocial studies, the impact of the expression of moral emotions, and in particular the expression of contempt, on the process of counteracting denial on threatening issues remains an under‐researched area. This study examines the impact of a speaker's contemptuous facial expression on the audience's perception and emotional reactions during a speech aimed at breaking politicians' denial and inaction regarding climate change. The participants (N = 100) were randomly assigned to view either a neutral or a contemptuous facial expression of climate activist Luisa Neubauer, followed by an excerpt of her speech in which she criticises political inaction. The findings indicated that participants who observed the neutral expression perceived the speech as more strategic and cold than those who observed the expression of contempt, although both groups reported low scores on these perceptions. Conversely, when invited to put themselves in the shoes of the audience to which the talk was originally addressed, participants who observed the expression of contempt indicated that they would experience greater happiness than those who observed the neutral expression. These findings highlight the significance of considering the multimodal aspects of communication. Indeed, when verbal condemnation is not accompanied by a congruent facial expression, the communicated message is perceived more negatively, thereby increasing the risk of failing to achieve the desired effect. Finally, the limitations and future perspectives are presented to further investigate the interaction between different moral emotions, audience characteristics, and the specific context of climate communication.
... Decades of affective science research has provided valuable knowledge about how humans make emotion-related perceptual decisions (e.g., Cuthbert et al. 1998;Ekman and Friesen 1978;Lang and Bradley 2010;Méndez-Bértolo et al. 2016;Russell 1980;Schupp et al. 2000). However, our current knowledge on this subject has largely come from paradigms asking people to judge a single stimulus. ...
Emotion‐guided endogenous attention (e.g., attending to fear) may play a crucial role in determining how humans integrate emotional evidence from various sources when assessing the general emotional tenor of the environment. For instance, what emotion a presenter focuses on can shape their perception of the overall emotion of the room. While there is an increasing interest in understanding how endogenous attention affects emotion perception, existing studies have largely focused on single‐stimulus perception. There is limited understanding of how endogenous attention influences emotion evidence integration across multiple sources. To investigate this question, human participants ( N = 40) were invited to judge the average emotion across an array of faces ranging from fearful to happy. Endogenous attention was manipulated by instructing participants to decide whether the face array was “fearful or not” (fear attention) or “happy or not” (happy attention). Eye movement results revealed an endogenous attention‐induced sampling bias such that participants paid more attention to extreme emotional evidence congruent with the target emotion. Computational modeling revealed that endogenous attention shifted the decision criterion to be more conservative, leading to reduced target‐category decisions. These findings unraveled the cognitive and computational mechanisms of how endogenous attention impacts the way we gather emotional evidence and make integrative decisions, shedding light on emotion‐related decision‐making.
... Deep learning has been one of the key directions of academic research recently. It has also made significant strides in the research of college students' affective cognition [8][9][10]. Many scholars have proposed deep neural network models such as CNN [11] and Bi-LSTM [12] to analyze the factors influencing students' affective cognition and to establish analytical models that classify and predict students' affective cognition problems. ...
Currently, under the impact of COVID-19, college students are facing increasingly elevated employment pressure and higher education pressure. This can easily place a huge psychological burden on them, causing affective cognition problems such as anxiety and depression. In the long run, this is not conducive to students' physical and mental health, nor is it conducive to the healthy development of the school and even the whole society. Therefore, it is imperative to build a novel adaptive affective cognition analysis model for college students. In particular, in the context of smart cities and smart China, many universities have adopted the smart campus mode, which provides a huge data resource for our research. Due to the problems of low real-time evaluation and a single data source in traditional questionnaire evaluation methods, evaluation errors are prone to occur, which in turn interferes with subsequent treatment. Therefore, for the purpose of alleviating the above deficiencies and improving the efficiency and accuracy of the affective cognition analysis model for college students, this paper studies an adaptive affective cognition analysis method for college students on the basis of deep learning. First, because students' psychological problems are often not sudden and, on the contrary, most of these abnormalities leave traces in their daily activities, this paper constructs a multisource dataset with the access control data, network data, and learning data collected from the smart campus platform to describe the affective cognition status of students. Second, the multisource dataset is divided into two categories, image and text, and the CNN model is introduced to mine the psychological characteristics of college students, so as to provide a reference for the subsequent affective cognition state assessment. Finally, simulation tests are developed to confirm the viability of the technique suggested in this research. The experiments demonstrate that the accuracy of the assessment model is significantly increased because it can fully reflect the heterogeneity and comprehensiveness of the data. This also highlights that the new method has a wide range of potential applications in the modern campus setting and is also helpful in fostering the accuracy and depth of college students' work on their affective cognition.
... Figure 2 presents the total Action Unit 14 (AU14) values according to the frame number of the experimental videos. Action Units are measures to capture human facial movements by the appearance of the face [9,12,13]. AU14 corresponds to a movement of the buccinator, which can be an index of smiling and laughing [18,50]. In the analysis of our preliminary study, we confirmed a significant difference in AU14 between when learners keep up with the contents and when they cannot (p<.001 in Mann-Whitney U test, and effect size 0.22). [Figure 2: Values of Action Unit 14 (AU14) of the 5 participants according to frame number.] ...
Among various methods to learn a second language (L2), such as listening and shadowing, Extensive Viewing involves learning L2 by watching many videos. However, it is difficult for many L2 learners to smoothly and effortlessly comprehend video contents made for native speakers at the original speed. Therefore, we developed a language learning assistance system that automatically adjusts the playback speed according to the learner's comprehension. Our system judges that learners understand the contents if they laugh at the punchlines of comedy dramas, and vice versa. Experimental results show that this system supports learners with relatively low L2 ability (under 700 in TOEIC Score in the experimental condition) to understand video contents. Our system can widen learners' possible options of native speakers' videos as Extensive Viewing material.
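As a rough illustration of the AU14 group comparison described above (frames where learners keep up versus frames where they do not), here is a minimal sketch using a Mann-Whitney U test. The arrays are invented stand-ins, not the study's data.

```python
# Hedged sketch: Mann-Whitney U test on AU14 values for two frame groups.
import numpy as np
from scipy.stats import mannwhitneyu

au14_keeping_up = np.random.beta(2, 5, size=200)       # placeholder intensities
au14_not_keeping_up = np.random.beta(1.5, 6, size=200)  # placeholder intensities

stat, p = mannwhitneyu(au14_keeping_up, au14_not_keeping_up, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.4f}")
```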
... Indeed, the human face has the unique ability to convey both simple emotions, such as happiness and sadness (see, e.g. Ekman and Friesen, 1978), and more complex expressions, such as mistrust or threat (Oosterhof and Todorov 2008). These facial expressions provide essential information in interactions, enabling others to infer intentions and desires even in the absence of verbal communication (see Todorov et al. 2015). ...
Introduction
Facial expressions play a crucial role in social interactions, influencing trust, and decision‐making. In negotiations, threatening expressions may convey dominance or hostility, potentially reducing cooperation. This study explores how threatening facial expressions and autistic traits (ATs) affect social decision‐making in the Ultimatum Game (UG), focusing on their main effects on UG offers.
Method
Fifty adults participated in the study. A Linear Mixed Model (LMM) was conducted to analyze the main effects of threat level and ATs on UG proposals. In addition, eye‐tracking technology was used to investigate participants' visual attention toward different facial areas.
Results
The results revealed a significant main effect of threatening facial expressions, as participants made lower offers in response to high‐threat faces. However, ATs did not show a significant main effect on UG proposals. Eye‐tracking data showed that participants focused more on the eyes of high‐threat faces compared to low‐threat faces.
Conclusion
These findings support the Emotion‐as‐Social‐Information (EASI) model, suggesting that emotional expressions, particularly threatening ones, influence negotiation behavior. The study enhances understanding of how facial cues and individual differences in ATs affect cooperation and decision‐making in social interactions.
... These emotional expressions are coded through a system called FACS (Facial Action Coding System). This system is based on facial anatomy and muscle movement analysis (Lidstrom, 2008, p. 87; Ekman & Rosenberg, 2005; Ekman et al., 1978). ...
In Roman history, political and social events are very interesting and similar enough to be compared with the present day. In addition, the lives of the people, senators, and especially the emperors who shaped Roman history continue to be a subject of curiosity. The recording of almost every incident in the Roman period has enabled us to learn not only the situations of these people but also their reactions in the face of these events. As a result, both the accuracy of the information transmitted to us and the psychological framework of the reactions it describes have aroused curiosity. In Roman sculpture, emperor portraits are the most defined and categorized works. The typology of the emperors' portraits is based on the critical events that took place during their reigns, as well as the details of their hair and beards. However, the expression of emotion and state of mind in the emperors' facial expressions has not been adequately analyzed. In this study, four Roman emperors were selected, and the meanings conveyed by the facial expressions and gestures in their portraits were interpreted with the help of psychology. The evaluations sought to determine whether the character traits described in ancient texts are reflected in the emperors' portraits. As a result, the accuracy of the information transmitted in ancient manuscripts was analyzed with the data obtained. In addition, an attempt was made to provide an additional perspective on the identification of the emperors' portraits.
... For a broad comprehension of dogs' visual communication, the emission of other facial expressions (for example, ear positions) that could be elicited by blinking and nose licking was investigated using the DogFACS [13,76], which is an objective coding system based on facial muscles and adapted from the original 'HumanFACS - Facial Action Coding System' [77]. The need for a standardized tool was underlined by [78,79], who have shown that even experts and professionals cannot be completely accurate when assessing some emotions (e.g.
Blinking, along with other facial expressions, has been suggested to play a role in dogs’ intra- and interspecific communication, however the feedback this signal elicits from the audience is still poorly studied. In this study, we investigated the behavioural and physiological responses of 54 domestic dogs to videos of conspecifics performing blink. Based on existing literature, we hypothesized that dogs would show a higher rate of blinking when exposed to blink than to another facial expression (nose lick) and to an attentive still-looking face (control). Results showed that dogs blinked more during the blink video compared to the nose lick (NL) video, suggesting a mimicry phenomenon and implying a possible role of blinking in dogs’ communication. Cardiac analyses showed increased heart rate variability values during the video sessions independently to the type of facial signal projected, suggesting that the stimuli were not perceived as stressful. The present results open the door to future investigation of blink synchronization, as this aspect was not directly addressed in the present study. Future research should also explore the effects of eye blink and NL in modulating intraspecific social interactions.
... A large number of researchers have worked on creating facial action coding systems (FACS) for animals. Notably, FACS was initially developed for human faces by observing facial muscular movements [23]. Subsequently, the same approach was developed for animals such as dogs, cats [12], and horses. ...
... Following Ekman, the basic emotions are happiness, sadness, surprise, disgust, fear and anger (Ekman, 1982). The underlying basis for the expression of emotions can be described with the Facial Action Coding System (FACS; Ekman and Friesen, 1978). The FACS characterises the visible facial movements and does so by classifying the individual facial moving parts into groups. ...
In the age of digitalization with its progressive visualization of a wide variety of content, avatars are becoming increasingly important. Consequently, virtual human representations are also being used in healthcare, for example in therapeutic applications. The use of technologies such as virtual reality or virtual characters like avatars can support people who find social interactions challenging. This is often the case for people with autism spectrum disorder (ASD). The aim of the presented study was to evaluate the suitability of different avatar types for the development of a future digital application for people with ASD. Based on the results of a previous study, two avatar types that differed in their degree of realism were analysed: a cartoonish, simple avatar and a stylized, moderately detailed avatar. First, it was investigated how participants set different basic emotions on these avatars and whether these settings are comparable between the different types of realism as well as with previously defined values for the basic emotions. Afterwards, a threshold test was used to determine at what point the emotions displayed by the avatars became recognizable and at what point they were rated as pronounced. Overall, ten men and ten women (M = 24.9 years old, SD = 3.64) participated in the study. The results show that the average values set by the participants correspond to the previously defined values for the basic emotions, with a relatively high degree of variation in some cases. Overall, the insights gained within this experiment provide a basis for future research on this topic.
... Notably, it stands as the sole dataset capturing EEG signals and facial videos concurrently while subjects viewed videos across various categories such as funny, horror, disgusting, and relaxing. These emotion categories were inspired by Ekman's [18] six basic emotion categories: anger, happiness, fear, surprise, disgust and sadness. Details pertaining to the preparation process are provided in Table 2. ...
Nowadays, bio-signal-based emotion recognition have become a popular research topic. However, there are some problems that must be solved before emotion-based systems can be realized. We therefore aimed to propose a feature-level fusion (FLF) method for multimodal emotion recognition (MER). In this method, first, EEG signals are transformed to signal images named angle amplitude graphs (AAG). Second, facial images are recorded simultaneously with EEG signals, and then peak frames are selected among all the recorded facial images. After that, these modalities are fused at the feature level. Finally, all feature extraction and classification experiments are evaluated on these final features. In this work, we also introduce a new multimodal benchmark dataset, KMED, which includes EEG signals and facial videos from 14 participants. Experiments were carried out on the newly introduced KMED and publicly available DEAP datasets. For the KMED dataset, we achieved the highest classification accuracy of 89.95% with k-Nearest Neighbor algorithm in the (3-disgusting and 4-relaxing) class pair. For the DEAP dataset, we got the highest accuracy of 92.44% with support vector machines in arousal compared to the results of previous works. These results demonstrate that the proposed feature-level fusion approach have considerable potential for MER systems. Additionally, the introduced KMED benchmark dataset will facilitate future studies of multimodal emotion recognition.
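As a rough illustration of the feature-level fusion described above (EEG-derived features and facial peak-frame features concatenated per trial before classification), here is a minimal sketch with k-NN. The random arrays stand in for the real KMED features; the actual AAG transform and peak-frame selection are not reproduced.

```python
# Hedged sketch: concatenate per-trial EEG and facial features, then classify with k-NN.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

n_trials = 120
eeg_features = np.random.randn(n_trials, 64)     # e.g., features derived from AAG images (stand-in)
face_features = np.random.randn(n_trials, 32)    # e.g., features from peak frames (stand-in)
labels = np.random.randint(0, 2, size=n_trials)  # e.g., disgusting vs. relaxing

fused = np.hstack([eeg_features, face_features])  # feature-level fusion
X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))
```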
... The facial action coding system (FACS; Ekman and Friesen, 1978) was used to quantify the facial expression of pain in healthy participants while brain responses evoked by brief moderately painful heat stimulation were recorded using fMRI. For each trial, the intensity and the frequency of pain-related action units were scored and combined into a FACS composite score (Materials and methods). ...
Pain is a private experience observable through various verbal and non-verbal behavioural manifestations, each of which may relate to different pain-related functions. Despite the importance of understanding the cerebral mechanisms underlying those manifestations, there is currently limited knowledge of the neural correlates of the facial expression of pain. In this functional magnetic resonance imaging (fMRI) study, noxious heat stimulation was applied in healthy volunteers and we tested if previously published brain signatures of pain were sensitive to pain expression. We then applied a multivariate pattern analysis to the fMRI data to predict the facial expression of pain. Results revealed the inability of previously developed pain neurosignatures to predict the facial expression of pain. We thus propose a facial expression of pain signature (FEPS) conveying distinctive information about the brain response to nociceptive stimulations with minimal or no overlap with other pain-relevant brain signatures associated with nociception, pain ratings, thermal pain aversiveness, or pain valuation. The FEPS may provide a distinctive functional characterization of the distributed cerebral response to nociceptive pain associated with the socio-communicative role of non-verbal pain expression. This underscores the complexity of pain phenomenology by reinforcing the view that neurosignatures conceived as biomarkers must be interpreted in relation to the specific pain manifestation(s) predicted and their underlying function(s). Future studies should explore other pain-relevant manifestations and assess the specificity of the FEPS against simulated pain expressions and other types of aversive or emotional states.
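As a rough illustration of the per-trial FACS composite mentioned above (intensity and frequency of pain-related AUs combined), here is a minimal sketch. The combination rule (mean intensity times occurrence frequency, summed over AUs) is an assumption for illustration, not the study's exact formula, and the intensity matrix is random stand-in data.

```python
# Hedged sketch: combine AU frequency and mean intensity into one composite per trial.
import numpy as np

# Rows: frames within one trial; columns: pain-related AUs (illustrative set, e.g. AU4, AU6, AU7, AU10, AU43).
au_intensity = np.random.uniform(0, 5, size=(120, 5)) * (np.random.rand(120, 5) > 0.7)

present = au_intensity > 0
frequency = present.mean(axis=0)                      # fraction of frames each AU is present
counts = np.maximum(present.sum(axis=0), 1)           # avoid division by zero
mean_intensity = au_intensity.sum(axis=0) / counts    # mean intensity when present

facs_composite = float((frequency * mean_intensity).sum())
print("FACS composite for this trial:", round(facs_composite, 2))
```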
... We considered that the reason for the difference in expression intensity ratings among 2D images, EMOCA, and DECA was related to the reproducibility of the expression muscle representations, or Action Units (AU) [5], in the 3D models. We obtained the intensity of each AU using Py-Feat [14]. ...
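As a rough illustration of obtaining AU intensities with Py-Feat, as mentioned above, here is a minimal sketch. It assumes the Detector.detect_image interface and the .aus accessor found in py-feat 0.6-era releases (newer versions expose a similar detect() method), and "face.jpg" is a placeholder path.

```python
# Hedged sketch: per-face AU activations from a single image with Py-Feat.
from feat import Detector

detector = Detector()                      # default face / landmark / AU models
fex = detector.detect_image("face.jpg")    # placeholder path; returns a Fex dataframe
print(fex.aus)                             # AU columns (e.g. AU01, AU06, AU12, ...)
```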
... The motion capture was realised with 16 synchronized cameras operating at 120 Hz, and participants wore Vicon suits with 77 reflective markers. ARKit and a depth camera were used for the facial motion capture system to extract 52 blendshape weights at 60 Hz, designed according to the Facial Action Coding System (FACS) [7]. Additionally, it claims to provide various annotations: text transcriptions at both the word level, using an in-house-built ASR model, and the phoneme level, using the Montreal Forced Aligner (MFA) [20], which relies on Kaldi [23]; emotion annotations at the recording level; and semantic annotations at the gesture level. ...
This paper presents a real-time system for the detection and classification of facial micro-expressions, evaluated on the CASME II dataset. Micro-expressions are brief and subtle indicators of genuine emotions, posing significant challenges for automatic recognition due to their low intensity, short duration, and inter-subject variability. To address these challenges, the proposed system integrates advanced computer vision techniques, rule-based classification grounded in the Facial Action Coding System, and artificial intelligence components. The architecture employs MediaPipe for facial landmark tracking and action unit extraction, expert rules to resolve common emotional confusions, and deep learning modules for optimized classification. Experimental validation demonstrated a classification accuracy of 93.30% on CASME II, highlighting the effectiveness of the hybrid design. The system also incorporates mechanisms for amplifying weak signals and adapting to new subjects through continuous knowledge updates. These results confirm the advantages of combining domain expertise with AI-driven reasoning to improve micro-expression recognition. The proposed methodology has practical implications for various fields, including clinical psychology, security, marketing, and human-computer interaction, where the accurate interpretation of emotional micro-signals is essential.
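As a rough illustration of the landmark-tracking step described above, here is a minimal sketch using MediaPipe Face Mesh. The brow-to-eyelid distance computed at the end is only an illustrative proxy for a brow-related action unit; the paper's actual rule set and AU extraction are not reproduced, and "face.jpg" is a placeholder path.

```python
# Hedged sketch: extract face landmarks with MediaPipe and compute a simple brow measure.
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh
image = cv2.imread("face.jpg")  # placeholder path

if image is not None:
    with mp_face_mesh.FaceMesh(static_image_mode=True) as face_mesh:
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            lm = results.multi_face_landmarks[0].landmark
            # Landmark 105 lies on the left eyebrow, 159 on the left upper eyelid (MediaPipe mesh indexing).
            brow_raise_proxy = lm[159].y - lm[105].y
            print("brow-to-eyelid distance (normalized):", round(brow_raise_proxy, 4))
```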
Plant vegetation is nature’s symphony, offering sensory experiences that influence ecological systems, human well-being, and emotional states and significantly impact human societal progress. This study investigated the emotional and perceptual impacts of specific monocultural vegetation (palm and rubber) in Nigeria, through audiovisual interactions using facial expression analysis, soundscape, and visual perception assessments. The findings reveal three key outcomes: (1) Facial expressions varied significantly by vegetation type and time of day, with higher “happy” valence values recorded for palm vegetation in the morning (mean = 0.39), and for rubber vegetation in the afternoon (mean = 0.37). (2) Gender differences in emotional response were observed, as male participants exhibited higher positive expressions (mean = 0.40) compared to females (mean = 0.33). (3) Perceptual ratings indicated that palm vegetation was perceived as more visually beautiful (mean = 4.05), whereas rubber vegetation was rated as having a more pleasant soundscape (mean = 4.10). However, facial expressions showed weak correlations with soundscape and visual perceptions, suggesting that other cognitive or sensory factors may be more influential. This study addresses a critical gap in soundscape research for monocultural vegetation and offers valuable insights for urban planners, environmental psychologists, and restorative landscape designs.
Background
Conventional approaches for major depressive disorder (MDD) screening rely on two effective but subjective paradigms: self-rated scales and clinical interviews. Artificial intelligence (AI) can potentially contribute to psychiatry, especially through the use of objective data such as audiovisual signals.
Objective
This study aimed to evaluate the efficacy of different paradigms using AI analysis on audiovisual signals.
Methods
We recruited 89 participants (mean age, 37.1 years; male: 30/89, 33.7%; female: 59/89, 66.3%), including 41 patients with MDD and 48 asymptomatic participants. We developed AI models using facial movement, acoustic, and text features extracted from videos obtained via a tool, incorporating four paradigms: conventional scale (CS), question and answering (Q&A), mental imagery description (MID), and video watching (VW). Ablation experiments and 5-fold cross-validation were performed using two AI methods to ascertain the efficacy of paradigm combinations. Attention scores from the deep learning model were calculated and compared with correlation results to assess comprehensibility.
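As a rough illustration of the 5-fold cross-validation set-up described above, here is a minimal scikit-learn sketch. The features and the classifier are placeholders (the study's fused facial, acoustic, and text features and its actual AI methods are not reproduced); only the participant counts mirror the text.

```python
# Hedged sketch: stratified 5-fold cross-validation on placeholder features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X = np.random.randn(89, 40)            # 89 participants, stand-in feature vectors
y = np.array([1] * 41 + [0] * 48)      # 41 MDD, 48 asymptomatic (as described above)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print("fold accuracies:", np.round(scores, 3), "mean:", round(scores.mean(), 3))
```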
Results
In video clip-based analyses, Q&A outperformed MID with a mean binary sensitivity of 79.06% (95%CI 77.06%‐83.35%; P =.03) and an effect size of 1.0. Among individuals, the combination of Q&A and MID outperformed MID alone with a mean extent accuracy of 80.00% (95%CI 65.88%‐88.24%; P = .01), with an effect size 0.61. The mean binary accuracy exceeded 76.25% for video clip predictions and 74.12% for individual-level predictions across the two AI methods, with top individual binary accuracy of 94.12%. The features exhibiting high attention scores demonstrated a significant overlap with those that were statistically correlated, including 18 features (all Ps <.05), while also aligning with established nonverbal markers.
Conclusions
The Q&A paradigm demonstrated higher efficacy than MID, both individually and in combination. Using AI to analyze audiovisual signals across multiple paradigms has the potential to be an effective tool for MDD screening.
Abstract: Individuals with schizophrenia often experience social skill deficits, leading to reduced social interaction quality. Emotional mimicry, the automatic imitation of a counterpart’s expression, plays a crucial role in social interactions. This study introduces a novel methodology for assessing positive emotional mimicry during a naturalistic conversation. We recruited interacting partners (n=20), each engaging in two interactions: one with an individual diagnosed with schizophrenia (n=20) and one with a matched healthy control (n=20). Participants were video recorded while taking turns sharing happy personal memories for six minutes. Using OpenFace, we detected participants’ emotional expressions and computed mimicry scores based on their temporal alignment. Consistent with our hypotheses, individuals with schizophrenia exhibited reduced smiling and positive emotion mimicry. Furthermore, interacting partners reported lower willingness to continue interacting with individuals with schizophrenia compared to healthy controls. This study stands out for its innovative methodology, assessing a key social skill in an ecological setting. Our findings highlight the potential of emotional mimicry training as an important intervention to improve social interaction in schizophrenia.
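As a rough illustration of a mimicry score based on temporal alignment of the two partners' smile time series, in the spirit of the method described above, here is a minimal sketch. OpenFace would supply per-frame AU12 intensities; here random series stand in, and the 0-2 second lag window at 30 fps is an illustrative choice, not the study's parameter.

```python
# Hedged sketch: mimicry as the best lagged correlation of partner B's AU12 following partner A's.
import numpy as np

fps = 30
au12_a = np.random.rand(fps * 360)  # partner A, 6 minutes of stand-in AU12 intensities
au12_b = np.random.rand(fps * 360)  # partner B

def lagged_corr(x, y, lag):
    """Correlation of y shifted 'lag' frames after x."""
    if lag == 0:
        return np.corrcoef(x, y)[0, 1]
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]

mimicry_score = max(lagged_corr(au12_a, au12_b, lag) for lag in range(0, 2 * fps + 1))
print("mimicry score (B following A):", round(mimicry_score, 3))
```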
Facial Emotion Recognition (FER) has emerged as a significant component in the creation of emotionally intelligent systems, attempting to bridge the communication gap between humans and technology. This project presents a real-time face emotion detection web application that uses live camera feeds to determine users' emotional states and dynamically improves user engagement according to mood. To provide a smooth and responsive experience, the system makes use of the DeepFace framework, OpenCV for video processing, and Flask for backend administration. When the application detects an emotion, such as happiness, sorrow, anger, surprise, fear, or neutrality, it instantly modifies the web interface's background color to match the user's mood and recommends a carefully chosen playlist of YouTube songs that are appropriate for the emotion. The user experience can be further customized with optional features like facial recognition and predicted age detection. The suggested method provides a dynamic and immersive platform with real-time input, thereby addressing the shortcomings of conventional static emotion analysis systems. Through the integration of visual, aural, and interactive components, the program improves emotional engagement and shows how emotion-aware services may be used in a variety of industries, including customer service, entertainment, mental wellness, and adaptive learning. Because of the system's emphasis on accessibility, simplicity, and computational economy, it can function properly even on common consumer hardware without the need for expensive GPUs. The potential for developing sympathetic human-computer interfaces that react to users' current states both rationally and emotionally is demonstrated by this work. Embedding direct multimedia playing, enabling multi-emotion detection per frame, and expanding its application to mobile and edge computing platforms are possible future advances. Keywords— Facial Emotion Detection, DeepFace, OpenCV, Flask, Real-Time Processing, Mood-Based Adaptation, Human-Computer Interaction
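As a rough illustration of the core detection loop described above (the Flask interface and playlist logic are omitted), here is a minimal sketch that grabs one webcam frame with OpenCV and classifies its dominant emotion with DeepFace. The return format of DeepFace.analyze differs across versions (dict vs. list of dicts), so both are handled.

```python
# Hedged sketch: single-frame emotion detection with OpenCV + DeepFace.
import cv2
from deepface import DeepFace

cap = cv2.VideoCapture(0)   # default webcam
ok, frame = cap.read()
cap.release()

if ok:
    result = DeepFace.analyze(img_path=frame, actions=["emotion"], enforce_detection=False)
    first = result[0] if isinstance(result, list) else result
    print("dominant emotion:", first["dominant_emotion"])
```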
Emotion recognition plays a vital role in human–computer interaction and mental health assessment. Despite rapid evolution from psychological approaches to AI-driven methods, the field lacks a systematic understanding of its knowledge structure and research patterns. This scientometric analysis examined 39,686 articles from the Web of Science (2004–2023), revealing the field's intellectual structure and major research clusters. Our findings show the United States, China, and the United Kingdom lead the field, yet research collaboration remains fragmented. The analysis identifies three major transitions: from single-modal to multimodal analysis, from laboratory settings to real-world applications, and from algorithm development to addressing practical challenges (exemplified by innovations in mask detection during COVID-19). This study is limited by its reliance on a single database source and exclusion of non-English publications, which may introduce regional and linguistic biases in the findings. Future research should incorporate multiple databases and multilingual sources to provide more comprehensive insights. This work provides a framework for understanding emotion recognition research development, offering insights for both theoretical advancement and practical applications.
Facial action units (AUs) serve as a precise descriptor of facial expressions, revealing an individual’s psychological and mental state. Because each AU is confined to a specific facial region, AU feature extraction usually necessitates the integration of landmark detection tools and prior knowledge regarding the locations of different AUs to partition the face, which leads to a time-consuming and laborious pre-processing procedure. To tackle this issue, a weakly supervised guided attention inference network is proposed for AU detection. The network encompasses two modules with shared parameters: a classification and ROI segmentation module (CRSM) and an attention mining module (AMM). The CRSM autonomously identifies regions of interest for target AUs and generates class activation attention maps. The AMM utilizes these maps to exclude the facial regions of target AUs so that a weak constraint that minimizes AU prediction scores for target AUs can be imposed on network training, thereby ensuring that the network’s attention maps encompass all of the most discriminant regions contributing to AU classification decisions. Experimental results on the BP4D and DISFA datasets demonstrate that, even in the absence of landmark detection and pre-facial region partitioning, the proposed model sustains excellent detection performance during testing. Furthermore, the generated AU attention map can accurately indicate the spatial locations of AU occurrences, which makes the AU detection results explainable.
Communication is one of the significant milestones of human beings; however, most of what we communicate is transmitted through facial expressions. Due to this, Facial Expression Recognition is one of the areas of most outstanding research in Artificial Intelligence for the potential benefits it can provide both economically and socially in areas such as marketing, education, security, technology, entertainment, politics, human resource management, or physical and mental health. This paper presents a review with the result of the comparative analysis of the main advances in the challenges that the identification of emotions entails, presenting the conceptual architecture model of each solution. It describes the praxis of how machine learning, artificial neural networks, deep learning, optimization algorithms, metaheuristics, and even anatomy and psychology tools—such as Action Units and the Arousal-Valence Two-Dimensional Model—drive the development of this transcendent area.
Football is a dynamic sport where physicality, competitiveness, and psychological tension often give rise to aggressive behaviors. While aggression can sometimes serve as a motivating force, unchecked aggression may lead to injuries, disciplinary actions, and disrupted team dynamics. This study explores the complex nature of football player aggression during gameplay and training, emphasizing how modern technologies-such as wearable sensors, video analytics, artificial intelligence (AI), and biofeedback systems-are being used to monitor, predict, and manage aggressive behaviors. By combining psychological theory with technological interventions, the paper highlights how real-time data collection and behavior prediction can help coaches and sports psychologists reduce harmful aggression while maintaining competitive intensity. The research draws upon current literature, case studies, and data from elite clubs to present a holistic view of aggression control in modern football. Furthermore, it discusses ethical considerations, potential biases in AI models, and the future of aggression analytics in sport.
An adaptive game design includes human emotions as a key factor in selecting the next level of hardness in the game design. In most situations, the facial expression cannot be identified exactly, and this may lead to uncertainty in selecting the next level in the game flow. An efficient game design requires an efficient behaviour tree construction model based on the emotions of a player. This paper presents an artificial-intelligence-based game design using a behaviour tree model that includes an efficient emotion detection and classification system based on the deep learning model ResNet-50. The proposed technique classifies the emotion of a player into five categories, and the player's current state of mind is calculated based on this emotion score. The behaviour tree is constructed from the foundation based on the hardness value calculated for each sub-BT. The performance evaluation for the emotion classification achieves close to 92% accuracy, and this accuracy leads to the construction of an efficient BT for any game. We have also evaluated the emotion detection system against other related deep learning models.
This study aimed to explore the influence of various mask attributes on the recognition of micro-expressions (happy, neutral, and fear) and facial favorability under different background emotional conditions (happy, neutral, and fear). The participants were asked to complete an ME (micro-expression) recognition task, and the corresponding accuracy (ACC), reaction time (RT), and facial favorability were analyzed. Results: (1) Background emotions significantly impacted the RT and ACC in micro-expression recognition, with fear backgrounds hindering performance. (2) Mask wearing, particularly opaque ones, prolonged the RT but had little effect on the ACC. Transparent masks and non-patterned masks increased facial favorability. (3) There was a significant interaction between background emotions and mask attributes; negative backgrounds amplified the negative effects of masks on recognition speed and favorability, while positive backgrounds mitigated these effects. This study provides insights into how masks influence micro-expression recognition, crucial for future research in this area.
Head and eyebrow movements have been reported as question markers in both spoken (e.g. Swerts & Krahmer, 2004) and sign languages (e.g., Zeshan, 2004). However, the relative weight of these visual cues in conveying prosodic meaning remains unexplored. This study examines, through a kinematic analysis, if (and how) the amplitude of head falling movements varies in statements versus questions, both in Portuguese Sign Language (LGP) and in the spoken modality of European Portuguese. The results show that the head falling movement plays a key role in conveying interrogativity in Portuguese, in varying degrees. In LGP, the head amplitude is larger than in the spoken modality, and the shape of the head movement varies across sentence types, thus showing the primary role of this visual cue in LGP prosodic grammar. In spoken Portuguese, although the head amplitude also differs between sentence types, the shape of the movement over time is always the same (falling), thus pointing to a secondary/complementary role in spoken Portuguese.
These findings not only contribute to our knowledge of the prosodic grammar of spoken and sign languages, but also challenge traditional language processing models, which remain mostly focused on verbal language.
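A minimal sketch of how such a kinematic amplitude measure could be computed from per-frame head-pitch tracks (toy trajectories and an assumed data layout, not the authors' pipeline):

```python
# Minimal sketch: amplitude of a head falling movement as the peak-to-trough
# drop in head pitch, compared between a statement and a question.
import numpy as np

def falling_amplitude(pitch_deg: np.ndarray) -> float:
    """Peak-to-trough drop in pitch, with the trough taken after the peak."""
    peak_idx = int(np.argmax(pitch_deg))
    trough = np.min(pitch_deg[peak_idx:])
    return float(pitch_deg[peak_idx] - trough)

# toy trajectories (degrees per video frame); real data would come from motion tracking
statement = np.array([2.0, 1.5, 1.0, 0.5, 0.0, -1.0, -1.5])
question  = np.array([3.0, 2.0, 0.0, -2.0, -4.0, -5.0, -4.5])

print("statement amplitude:", falling_amplitude(statement))
print("question amplitude: ", falling_amplitude(question))
```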
The assumption that there is a small number (typically six) of universal expressions of evolutionarily acquired emotions has come under theoretical and methodological criticism based on a growing number of empirical findings. Thus, the field is at a crossroads: should the concept of the expression of basic emotions disappear as a viable scientific concept? Should researchers give it less importance as a channel for the communication of affective states? Or should it become a more expansive, catch-all concept that encompasses a large number of communication channels and affective states? We describe the evolution of the field over the last 50 years and the problems that led to these dilemmas.
Learning objective
To validate a novel video-based emotion identification measure in persons with neurodegeneration and to demonstrate its correspondence to emotion-relevant brain systems.
Background
Given advances in disease-modifying therapies for dementia, the dementia field needs objective, practical behavioral assessment tools for patient trial selection and monitoring. The Dynamic Affect Recognition Test (DART) was designed to remedy limitations of instruments typically used to measure emotion identification deficits in persons with dementia (PWD).
Method
Participants were 372 individuals: 257 early-stage PWD (Clinical Dementia Rating ≤1, Mini-Mental State Examination ≥20; 66 behavioral variant frontotemporal dementia [bvFTD], 27 semantic variant primary progressive aphasia [svPPA], 23 semantic bvFTD [sbvFTD], 33 non-fluent variant PPA [nfvPPA], 26 progressive supranuclear palsy [PSP], 28 corticobasal syndrome [CBS], 42 Alzheimer’s disease [AD], 12 logopenic variant PPA [lvPPA]) and 115 healthy controls (HC). Participants watched twelve 15-second videos of an actor expressing a basic emotion (happy, surprised, sad, angry, fearful, disgusted) via congruent facial/vocal/postural cues, with semantically neutral scripts, and selected the emotion from a randomized visual array. Voxel-based morphometry (VBM) analysis was performed to identify brain structure correlates of DART performance, controlling for non-emotional naming ability (Boston Naming Test, BNT).
Results
DART performance was worse in PWD than in older HC (p<0.001), with the lowest scores observed in the sbvFTD group. A DART cut-off score of 10 differentiated PWD from HC with 90% sensitivity and 49% specificity (AUC=82%). A cut-off of 9/12 yielded 93% sensitivity and 67% specificity (AUC=87%) for discriminating social cognition disorders from HC, while a cut-off of 7/12 differentiated sbvFTD from HC with 100% sensitivity and 93% specificity (AUC=97%). VBM showed that poorer DART performance significantly predicted focal brain volume loss in right-sided emotion processing areas, including the insula, temporal pole, caudate, superior frontal gyrus, and supplementary motor cortex (pFWE<0.05).
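To make the cut-off logic concrete, here is a minimal sketch on synthetic DART-like scores (not the study data) showing how a chosen cut-off yields sensitivity and specificity and how AUC is computed:

```python
# Minimal sketch with synthetic 0-12 scores: sensitivity/specificity at a
# cut-off and AUC for separating patients (PWD) from healthy controls (HC).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
pwd_scores = rng.integers(4, 11, size=100)   # hypothetical patient scores
hc_scores  = rng.integers(9, 13, size=60)    # hypothetical control scores

scores = np.concatenate([pwd_scores, hc_scores])
is_patient = np.concatenate([np.ones(100), np.zeros(60)])   # 1 = PWD, 0 = HC

cutoff = 10                                   # call "impaired" if score <= cutoff
predicted_patient = (scores <= cutoff).astype(int)

sensitivity = (predicted_patient[is_patient == 1] == 1).mean()
specificity = (predicted_patient[is_patient == 0] == 0).mean()
auc = roc_auc_score(is_patient, -scores)      # lower score -> more likely PWD

print(f"cut-off {cutoff}: sensitivity={sensitivity:.2f}, "
      f"specificity={specificity:.2f}, AUC={auc:.2f}")
```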
Conclusions
The DART is a brief, psychometrically robust, video-based test of emotion reading that (i) is designed to be practically useful in realistic assessment settings, (ii) effectively reveals emotion identification impairments in PWD, (iii) shows specificity for identifying PWD with real-life social cognition disorders (SCDs; i.e., bvFTD, svPPA, sbvFTD), (iv) corresponds to the expected structural anatomy of emotion reading, and (v) is freely available to researchers and clinicians.
This paper presents the first in-car VR motion sickness (VRMS) detection model based on lower-face action units (LF-AUs). The model was initially developed in a simulated in-car environment with 78 participants, and its generalizability was later tested in real-world driving conditions. Motion sickness was induced using visual linear motion in the VR headset and physical horizontal rotation via a rotating chair. We used a convolutional neural network (MobileNetV3) to automatically extract LF-AUs from images of the users' mouth region, captured by the VR headset's built-in camera. These LF-AUs were then used to train a Support Vector Regression (SVR) model to estimate motion sickness scores. We compared the SVR model's performance using LF-AUs, pupil diameters, and physiological features (individually and in combination) from the same VR headset. Results showed that both an individual LF-AU (right dimple) and combined LF-AUs had significant Pearson correlations with self-reported motion sickness scores and achieved lower root mean squared error than pupil diameters. The best detection results were obtained by combining LF-AUs and pupil diameters, while physiological features alone did not yield significant results. The LF-AU-based model demonstrated encouraging generalizability across the different settings of the independent studies.
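A minimal sketch of the regression step, using synthetic features and hypothetical dimensions in place of the extracted LF-AUs and pupil diameters (scikit-learn's SVR stands in for the authors' exact setup):

```python
# Minimal sketch (synthetic features, not the study data): fit an SVR on
# lower-face AU intensities plus pupil diameter to estimate motion sickness
# scores, and report Pearson r and RMSE as evaluation metrics.
import numpy as np
from scipy.stats import pearsonr
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
n = 200
lf_aus = rng.random((n, 6))      # hypothetical LF-AU intensities per sample
pupil = rng.random((n, 2))       # left/right pupil diameter (normalized)
X = np.hstack([lf_aus, pupil])
y = 2.0 * lf_aus[:, 1] + pupil.mean(axis=1) + rng.normal(0, 0.2, n)   # toy target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = SVR(kernel="rbf", C=1.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

r, p = pearsonr(y_te, pred)
rmse = mean_squared_error(y_te, pred) ** 0.5
print(f"Pearson r={r:.2f} (p={p:.3f}), RMSE={rmse:.2f}")
```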
As facial expression recognition becomes an increasingly active topic in image processing and artificial intelligence research, more and more scholars are paying attention to the fact that micro-expressions, revealed on the face for only an instant, better reflect a person's inner emotions and thoughts. This paper first describes the current state of micro-expression research and the commonly used micro-expression datasets, analyzing the advantages and disadvantages of each dataset. It then analyzes micro-expression feature extraction from the perspective of two algorithmic approaches. Finally, it discusses the application fields of micro-expression research and the challenges facing its future development.
The authors reviewed and analysed Russian and foreign experimental studies on pantomimic stereotypes. Initially, stereotypical behaviour was evaluated negatively, whereas modern research considers its adaptive functions and the possibility that self-stimulation serves as a way to harmonise emotional and mental state. A comparative analysis of circular, pendulum, and diagonal movements in children and Old World monkeys was conducted using an ethological approach to the study of behavioural patterns. Human observations were conducted in the psychoneurological department of the Silischeva Astrakhan Regional Children's Clinical Hospital, with 40 preschool children with mental dysontogenesis participating. Five laboratory macaques and a family of hamadryas baboons with homologous kinesics, kept in an aviary at the Sukhumi nursery, were also observed. According to the authors, walking (running) in a circle and diagonally and swinging the body "right-to-left" in the pantomimic production of children and monkeys are associated with self-stimulation of an altered state of consciousness. Trance stereotypes divert attention from external stressors and stimuli and harmonise mental homeostasis. The study may be of interest to anthropologists, primatologists, and specialists in the study of the psyche and pathological behaviour of animals and Homo sapiens.