Conference Paper

Emotion Recognition in Gamers Wearing Head-mounted Display



... The proposed architecture is developed for user engagement estimation in entertainment applications. In [127], a CNN is used to estimate emotions from face images that are partially occluded by a head-mounted display (HMD). The work presented in [128] discusses a framework that captures emotional expressions and predicts a person's mood as perceived by other people. ...
Article
Full-text available
With the advancement of human-computer interaction, robotics, and especially humanoid robots, there is an increasing trend toward human-to-human communication over online platforms (e.g., Zoom). This has become more significant in recent years due to the COVID-19 pandemic. The increased use of online platforms for communication underlines the need to build efficient and more interactive human emotion recognition systems. In a human emotion recognition system, the physiological signals of human beings are collected, analyzed, and processed with the help of dedicated learning techniques and algorithms. With the proliferation of emerging technologies, e.g., the Internet of Things (IoT), the future Internet, and artificial intelligence, there is a high demand for building scalable, robust, efficient, and trustworthy human emotion recognition systems. In this paper, we present the development and progress in sensors and technologies to detect human emotions. We review the state-of-the-art sensors used for human emotion recognition and different types of activity monitoring. We present the design challenges and provide practical references for such human emotion recognition systems in the real world. Finally, we discuss current trends in applications and explore future research directions to address issues such as scalability, security, trust, privacy, transparency, and decentralization.
... Explainability is an active topic in machine learning, mostly pursued in the domain of medical image analysis; we will review the literature and apply and extend the most promising approaches to movement. In addition to movement classification and analysis, there has already been work on emotion recognition from image data specifically for VR, handling the face that is partially occluded by the head-mounted display [33]. If other sensors for face muscle activation [19,22] or EEG [23] are available, they can also be used for emotion detection. ...
Article
Recently, the concept of a virtual emotion barrier built from wireless signals has gained considerable interest because it supports human emotion recognition within a given IoT environment. If human emotion can be detected with high accuracy by a virtual emotion barrier, it can be applied to a wide range of emotion-based applications and communication software services. However, failures of some IoT devices in the barrier can critically degrade the performance of an emotion-based system. At the same time, fast, intelligent UAVs have emerged as promising devices for restoring existing infrastructure. In this study, a fault-tolerant, mutually assisted virtual emotion barrier system with intelligent autonomous UAVs is introduced for convergent smart cities. We present a formal definition of the problem, whose goal is to maximize the total emotion detection accuracy when failed locations are recovered by moving UAVs from their initial positions without conflicts among UAVs. A greedy UAV movement scheme is proposed that constructs resilient virtual emotion barriers by continuously handling the failed parts of the barriers. The proposed schemes are evaluated through comprehensive simulations under realistic scenarios.
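The greedy movement idea can be illustrated with a small, heavily simplified Python sketch: failed barrier points are covered in decreasing order of their accuracy contribution, and each is assigned the closest still-available UAV so that no two UAVs receive conflicting assignments. The positions, accuracy values, and the specific greedy rule are illustrative assumptions, not the paper's actual scheme.

```python
import math

# Hypothetical initial UAV positions and failed barrier points (location, detection accuracy).
uavs = {"u1": (0.0, 0.0), "u2": (5.0, 5.0), "u3": (10.0, 0.0)}
failed = {"p1": ((1.0, 1.0), 0.92), "p2": ((6.0, 4.0), 0.85), "p3": ((9.0, 1.0), 0.78)}

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

assignment = {}
available = set(uavs)

# Greedy step: cover failed points in decreasing order of their accuracy contribution,
# each time sending the closest UAV that has not yet been assigned (no conflicts).
for point, (loc, acc) in sorted(failed.items(), key=lambda kv: -kv[1][1]):
    if not available:
        break
    chosen = min(available, key=lambda u: dist(uavs[u], loc))
    assignment[point] = chosen
    available.remove(chosen)

recovered_accuracy = sum(failed[p][1] for p in assignment)
print(assignment, recovered_accuracy)
```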
Article
Similar to language and music, dance performances provide an effective way to express human emotions. With the abundance of motion capture data, content-based motion retrieval and classification have been intensively investigated. Although researchers have attempted to interpret body language in terms of human emotions, progress is limited by the scarcity of 3D motion databases annotated with emotion labels. This article proposes a hybrid feature for emotion classification in dance performances, composed of an explicit feature and a deep feature. The explicit feature is calculated based on Laban movement analysis, which considers the body, effort, shape, and space properties. The deep feature is the latent representation obtained from a 1D convolutional autoencoder. An elaborate feature fusion network combines the two into a hybrid feature that is almost linearly separable. Extensive experiments demonstrate that the hybrid feature is superior to either feature alone for emotion classification in dance performances.
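As a rough illustration of how such a deep feature could be obtained and fused with an explicit descriptor, the following PyTorch sketch builds a 1D convolutional autoencoder over motion sequences and concatenates its latent code with a placeholder Laban-style feature. The layer sizes, sequence length, and feature dimensions are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MotionAutoencoder(nn.Module):
    def __init__(self, channels=63, latent_dim=64, seq_len=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, 128, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(128, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * (seq_len // 4), latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * (seq_len // 4)),
            nn.Unflatten(1, (64, seq_len // 4)),
            nn.ConvTranspose1d(64, 128, kernel_size=5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(128, channels, kernel_size=5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):                       # x: (batch, joints * 3, frames)
        z = self.encoder(x)                     # deep feature (latent code)
        return self.decoder(z), z

model = MotionAutoencoder()
motion = torch.randn(4, 63, 128)                # 4 clips, 21 joints x 3 coords, 128 frames
recon, deep_feat = model(motion)                # reconstruction loss on `recon` trains the autoencoder
laban_feat = torch.randn(4, 32)                 # placeholder for the explicit Laban-style feature
hybrid = torch.cat([laban_feat, deep_feat], dim=1)  # hybrid descriptor fed to an emotion classifier
```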
Conference Paper
Full-text available
This paper discusses the baseline for the Emotion Recognition in the Wild (EmotiW) 2016 challenge. Continuing the theme of automatic affect recognition 'in the wild', the EmotiW 2016 challenge consists of two sub-challenges: an audio-video based emotion recognition sub-challenge and a new group-based emotion recognition sub-challenge. The audio-video based sub-challenge is based on the Acted Facial Expressions in the Wild (AFEW) database. The group-based emotion recognition sub-challenge is based on the Happy People Images (HAPPEI) database. We describe the data, baseline method, challenge protocols, and the challenge results. A total of 22 and 7 teams participated in the audio-video based emotion and group-based emotion sub-challenges, respectively.
Conference Paper
Full-text available
This paper presents the techniques employed in our team's submissions to the 2015 Emotion Recognition in the Wild contest, for the sub-challenge of Static Facial Expression Recognition in the Wild. The objective of this sub-challenge is to classify the emotions expressed by the primary human subject in static images extracted from movies. We follow a transfer learning approach for deep Convolutional Neural Network (CNN) architectures. Starting from a network pre-trained on the generic ImageNet dataset, we perform supervised fine-tuning of the network in a two-stage process, first on datasets relevant to facial expressions, followed by the contest's dataset. Experimental results show that this cascading fine-tuning approach achieves better results compared to a single-stage fine-tuning with the combined datasets. Our best submission exhibited an overall accuracy of 48.5% on the validation set and 55.6% on the test set, which compares favorably to the respective 35.96% and 39.13% of the challenge baseline.
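A minimal sketch of this cascaded fine-tuning idea, assuming a torchvision backbone and hypothetical data loaders (`expression_loader`, `contest_loader` are placeholders), might look as follows; the choice of backbone and all hyperparameters are illustrative, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

def finetune(model, loader, epochs=3, lr=1e-4):
    """Supervised fine-tuning of the whole network on one labelled dataset."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
    return model

# Stage 0: ImageNet-pretrained backbone, classifier head replaced for 7 emotion classes.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 7)

# Stage 1: fine-tune on a larger facial-expression dataset (FER-style data).
# Stage 2: fine-tune again, at a lower rate, on the contest's training set (SFEW-style data).
# model = finetune(model, expression_loader)
# model = finetune(model, contest_loader, lr=1e-5)
```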
Conference Paper
Full-text available
There are currently no solutions for enabling direct face-to-face interaction between virtual reality (VR) users wearing head-mounted displays (HMDs). The main challenge is that the headset obstructs a significant portion of a user's face, preventing effective facial capture with traditional techniques. To advance virtual reality as a next-generation communication platform, we develop a novel HMD that enables 3D facial performance-driven animation in real-time. Our wearable system uses ultra-thin flexible electronic materials that are mounted on the foam liner of the headset to measure surface strain signals corresponding to upper face expressions. These strain signals are combined with a head-mounted RGB-D camera to enhance the tracking in the mouth region and to account for inaccurate HMD placement. To map the input signals to a 3D face model, we perform a single-instance offline training session for each person. For reusable and accurate online operation, we propose a short calibration step to readjust the Gaussian mixture distribution of the mapping before each use. The resulting animations are visually on par with cutting-edge depth sensor-driven facial performance capture systems and hence, are suitable for social interactions in virtual worlds.
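One way to realize a Gaussian-mixture mapping from headset sensor signals to face-model parameters, in the spirit of the system above, is joint-GMM regression: fit a mixture over concatenated (signal, blendshape) vectors and predict by conditional expectation. The sketch below uses scikit-learn with random placeholder data; the signal dimensionality, blendshape count, and the paper's training and per-use calibration steps are assumptions and are not reproduced here.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

D_SIGNAL, D_BLEND, K = 8, 20, 5        # assumed strain channels, blendshape count, mixture size

# Hypothetical per-user training session: paired (strain, blendshape) samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, D_SIGNAL))  # strain/RGB-D derived measurements
Y = rng.normal(size=(2000, D_BLEND))   # reference blendshape weights

# Fit a joint GMM over concatenated (signal, blendshape) vectors.
gmm = GaussianMixture(n_components=K, covariance_type="full", random_state=0)
gmm.fit(np.hstack([X, Y]))

def predict_blendshapes(x):
    """Conditional expectation E[y | x] under the joint Gaussian mixture."""
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    resp = np.zeros(K)
    cond = np.zeros((K, D_BLEND))
    for k in range(K):
        mu_x, mu_y = means[k, :D_SIGNAL], means[k, D_SIGNAL:]
        S_xx = covs[k][:D_SIGNAL, :D_SIGNAL]
        S_yx = covs[k][D_SIGNAL:, :D_SIGNAL]
        resp[k] = weights[k] * multivariate_normal.pdf(x, mean=mu_x, cov=S_xx)
        cond[k] = mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x)
    resp /= resp.sum()
    return resp @ cond

print(predict_blendshapes(rng.normal(size=D_SIGNAL)).shape)   # (20,)
```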
Article
Full-text available
Virtual reality (VR) offers tourism many useful applications that deserve greater attention from tourism researchers and professionals. As VR technology continues to evolve, the number and significance of such applications undoubtedly will increase. Planning and management, marketing, entertainment, education, accessibility, and heritage preservation are six areas of tourism in which VR may prove particularly valuable. Part of VR's possible utility as a preservation tool derives from its potential to create virtual experiences that tourists may accept as substitutes for real visitation to threatened sites. However, the acceptance of such substitutes will be determined by a tourist's attitudes toward authenticity and his or her motivations and constraints. As VR is further integrated into the tourism sector new questions and challenges clearly will emerge. The sector will benefit from future research into the topics that are discussed and numerous suggestions for future research are presented.
Conference Paper
Full-text available
The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called "ImageNet", a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.
Article
Full-text available
A fundamental challenge in face recognition lies in determining which facial characteristics are important in the identification of faces. Several studies have indicated the significance of certain facial features in this regard, particularly internal ones such as the eyes and mouth. Surprisingly, however, one rather prominent facial feature has received little attention in this domain: the eyebrows. Past work has examined the role of eyebrows in emotional expression and nonverbal communication, as well as in facial aesthetics and sexual dimorphism. However, it has not been made clear whether the eyebrows play an important role in the identification of faces. Here, we report experimental results which suggest that for face recognition the eyebrows may be at least as influential as the eyes. Specifically, we find that the absence of eyebrows in familiar faces leads to a very large and significant disruption in recognition performance. In fact, a significantly greater decrement in face recognition is observed in the absence of eyebrows than in the absence of eyes. These results may have important implications for our understanding of the mechanisms of face recognition in humans as well as for the development of artificial face-recognition systems.
Conference Paper
Full-text available
This paper describes the use of statistical techniques and hidden Markov models (HMMs) in the recognition of emotions. The method aims to classify 6 basic emotions (anger, dislike, fear, happiness, sadness, and surprise) from both facial expressions (video) and emotional speech (audio). The emotions of 2 human subjects were recorded and analyzed. The findings show that the audio and video information can be combined using a rule-based system to improve the recognition rate.
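A minimal sketch of per-emotion HMM classification with a simple fusion of the two modalities is shown below using hmmlearn. The feature extraction, training data, and the weighted log-likelihood fusion rule are illustrative assumptions; the paper's actual rule-based combination is not reproduced here.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

EMOTIONS = ["anger", "dislike", "fear", "happiness", "sadness", "surprise"]
rng = np.random.default_rng(0)

def train_models(feature_dim, n_states=3):
    """One Gaussian HMM per emotion, trained on that emotion's feature sequences."""
    models = {}
    for emo in EMOTIONS:
        seqs = [rng.normal(size=(50, feature_dim)) for _ in range(5)]  # placeholder sequences
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        models[emo] = GaussianHMM(n_components=n_states, n_iter=20).fit(X, lengths)
    return models

video_models = train_models(feature_dim=12)   # e.g., facial-expression features per frame
audio_models = train_models(feature_dim=13)   # e.g., MFCC features per frame

def classify(video_seq, audio_seq, audio_weight=0.4):
    """Simple fusion rule: weighted sum of per-modality HMM log-likelihoods."""
    scores = {
        emo: (1 - audio_weight) * video_models[emo].score(video_seq)
        + audio_weight * audio_models[emo].score(audio_seq)
        for emo in EMOTIONS
    }
    return max(scores, key=scores.get)

print(classify(rng.normal(size=(60, 12)), rng.normal(size=(60, 13))))
```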
Conference Paper
We report our image-based static facial expression recognition method for the Emotion Recognition in the Wild Challenge (EmotiW) 2015. We focus on the sub-challenge of the SFEW 2.0 dataset, where one seeks to automatically classify a set of static images into 7 basic emotions. The proposed method contains a face detection module based on an ensemble of three state-of-the-art face detectors, followed by a classification module with an ensemble of multiple deep convolutional neural networks (CNNs). Each CNN model is initialized randomly and pre-trained on a larger dataset provided by the Facial Expression Recognition (FER) Challenge 2013. The pre-trained models are then fine-tuned on the training set of SFEW 2.0. To combine multiple CNN models, we present two schemes for learning the ensemble weights of the network responses: by minimizing the log-likelihood loss, and by minimizing the hinge loss. Our proposed method generates state-of-the-art results on the FER dataset. It also achieves 55.96% and 61.29% on the validation and test sets of SFEW 2.0, respectively, surpassing the challenge baselines of 35.96% and 39.13% by significant margins.
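The first of the two weighting schemes, learning ensemble weights by minimizing a log-likelihood (cross-entropy) loss over the models' softmax outputs, can be sketched as follows; the probability matrices and labels are random placeholders standing in for validation-set predictions, and the optimization details are assumptions rather than the authors' exact procedure.

```python
import torch

n_models, n_samples, n_classes = 3, 500, 7
# Placeholder per-model softmax outputs; each row is one model's prediction for one image.
probs = torch.softmax(torch.randn(n_models, n_samples, n_classes), dim=-1)
labels = torch.randint(0, n_classes, (n_samples,))

# Unconstrained parameters; a softmax keeps the ensemble weights positive and summing to 1.
theta = torch.zeros(n_models, requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    w = torch.softmax(theta, dim=0)                      # ensemble weights
    mixed = torch.einsum("m,msc->sc", w, probs)          # weighted average of probabilities
    nll = -torch.log(mixed[torch.arange(n_samples), labels] + 1e-12).mean()
    nll.backward()
    opt.step()

print(torch.softmax(theta, dim=0))   # learned ensemble weights
```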
Conference Paper
This paper reviews the various optical technologies that have been developed to implement HMDs (Head-Mounted Displays), as AR (Augmented Reality) devices, as VR (Virtual Reality) devices, and more recently as smart glasses, smart eyewear, or connected glasses. We review the typical requirements and optical performance of such devices, categorize them into distinct groups suited for different (and constantly evolving) market segments, and analyze this market segmentation.
Article
Many research fields concerned with the processing of information contained in human faces would benefit from face stimulus sets in which specific facial characteristics are systematically varied while other important picture characteristics are kept constant. Specifically, a face database in which displayed expressions, gaze direction, and head orientation are parametrically varied in a complete factorial design would be highly useful in many research domains. Furthermore, these stimuli should be standardised in several important, technical aspects. The present article presents the freely available Radboud Faces Database offering such a stimulus set, containing both Caucasian adult and children images. This face database is described both procedurally and in terms of content, and a validation study concerning its most important characteristics is presented. In the validation study, all frontal images were rated with respect to the shown facial expression, intensity of expression, clarity of expression, genuineness of expression, attractiveness, and valence. The results show very high recognition of the intended facial expressions.
Article
Virtual and Artificial Reality have become, in the last few years, major new hype words. Subsequently there has been a plethora of glossy books and droll conference proceedings describing various systems and hardware implementation problems. As has always been discovered in computer science, the major effort is in designing and building the software applications. Alan's aim has been to ignore the hardware side and concentrate on the far larger and almost impossible problem of what to do with it. This book is a collection of ten essays that look slightly into the future and define actual uses for Virtual Reality kits rather than showing off expensive hardware. This has resulted in a series of topics, each of which defines a different interface problem between the user and the machine that may have some solution in Virtual Reality. Even though the topics vary, at times drastically, Alan has managed to use editorial selection very well, intertwining them into a reasonably coherent whole. The scope is too large for any single book to cover in detail and, as is inevitable, important topics, for example military and medicine, have been excluded. The topics chosen range from traditional computer information database visualisation to planetary exploration, to the Virtual Reality version of the music video, and to literacy in cyberspace.