Figure 4 - uploaded by Alessandro Delfino
First two layers of Parrot's emotion classification 
Source publication
Chapter
Full-text available
Human-machine interaction is performed by devices such as the keyboard, the touch-screen, or speechto- text applications. For example, a speech-to-text application is software that allows the device to translate the spoken words into text. These tools translate explicit messages but ignore implicit messages, such as the emotional status of the spea...

Context in source publication

Context 1
... requires a sophisticated internetworking approach and, in particular, advanced inter-processor communication techniques aimed at increasing processing availability and at reducing overheads and power consumption, which otherwise result in reduced battery life and usage time for the end user. A possible low-cost solution to this problem is to merge the Application Processor and the Cellular Baseband blocks into a single ASIC consisting of two or three cores. This approach eliminates the performance conflict between the communication protocol and multimedia tasks, but it does not significantly reduce the complexity of the inter-processor communication. In recent years, mobile network providers and users have gained the opportunity to offer more advanced and innovative context-aware services based on real-time knowledge of the user's surroundings. In fact, context data can be acquired from the smartphone's audio environment. In general, the classification of an audio environment, the correspondence between two or more audio contexts, and the number and gender of active speakers near the smartphone, together with other possible context features, provide useful information for delivering helpful content and services directly to mobile users' devices. Human beings and animals have to deal with emotions in everyday life. People often use expressions like "I'm happy!", "It's sad but true" or "I love you" to express a sensation that is generally called "emotion". Every one of us knows the great impact that statements such as these may have on our interpersonal relations or social life. Our emotional state often affects our way of approaching people and, more generally, animals or things. Furthermore, the externalization of this interior state can change the way other people approach us. 
In addition, it is well known that emotions have a great influence on our actions and opinions. For example, a scared person is rarely able to control themselves in front of a danger or a threat. Despite its socio-cultural importance, the concept of "emotion" represents a particularly thorny problem. Even though the term is used very frequently, the question "What is an emotion?" rarely generates the same answer from different individuals, scientists and laymen alike. One of the most challenging issues of emotion detection is the description of what an emotion is and how it can be classified. Since the emotional state of a person is highly subjective, it is operatively very difficult to find an objective and universal definition of "emotion". The emotion classification problem was scientifically addressed for the first time by Descartes in the treatise Passions of the Soul in 1649. In this work Descartes defines six basic emotions, called "primitive passions": wonder, love, hatred, desire, joy and sadness. Along with these primary emotions, Descartes also accepts "an unlimited number" of further, specific passions, called secondary passions, which are combinations of the six primitive passions in different proportions. The secondary emotions are blends of the innate basic emotions, much in the way that different colors can be created by mixing red, blue and green; for this reason this classification is known as the palette theory. This kind of classification was also adopted in the 19th century, when Charles Darwin's (Darwin, 1872) and William James' (James, 1884) theories of emotion were proposed. Darwin, in his work, highlights the universal nature of expressions: "...the young and the old of widely different races, both with man and animals, express the same state of mind by the same movements". 
On this assumption Darwin classifies the emotions that are innate and can be recognized in every human culture, and calls them "basic". Six basic emotions are distinguished: happiness, surprise, fear, disgust, anger and sadness. This type of classification has been widely used by scientists: Ekman in 1972 (Ekman P., 1971) codified six basic or primary emotions in facial expressions: anger, disgust, fear, happiness, sadness and surprise. A similar widely accepted classification is the one provided by Plutchik (Plutchik, 2001). He defines eight basic emotions and provides a graphic representation of them, the wheel shown in Figure 3. The emotions defined by Plutchik are: anger, anticipation, joy, trust, fear, surprise, sadness and disgust. Plutchik's wheel is formed by four pairs of bipolar emotions: joy is opposed to sadness, anger to fear, anticipation to surprise and disgust to trust. A tree-structured list was proposed by Parrott in 2001 (Parrott, 2001), where the first level is composed of six primary emotions (love, joy, surprise, anger, sadness, fear). The first two layers of Parrott's classification are shown in Figure 4. This classification differs from the others previously described because the secondary emotions are derivations of the primary ones instead of being combinations of them. Some scientists refused to engage in this irresolvable dispute on the number and type of emotions, producing an alternative to the categorical classification: the dimensional classification, a more flexible solution to the problem of representing emotional states. These dimensions include evaluation, activation, control, power, etc. In particular, a 2-D representation of emotions called the activation-evaluation space (Cowie, et al., 2001) has a long history in psychology (Figure 5). Research from Darwin forward has recognized that emotional states involve dispositions to act in certain ways. 
A basic way of reflecting that theme turns out to be surprisingly useful. States are simply rated in terms of the associated activation level, i.e., the strength of the person's disposition to take some action rather than none. The axes of the activation-evaluation space reflect those themes: the vertical axis shows the activation level and the horizontal axis the evaluation. The most recent classification merges the categorical and the dimensional classifications: the cube representation proposed by Hugo Lövheim (Lövheim, 2012). In this representation eight basic emotions are ordered in an orthogonal coordinate system formed by the three main monoaminergic axes. The axes represent serotonin (5-HT, 5-hydroxytryptamine), dopamine (DA) and noradrenaline (NE), and the two ends of each arrow represent low and high levels of signaling, respectively. Serotonin, noradrenaline and dopamine are the most important monoamine neurotransmitters. Many studies from different research fields support the belief that all three monoamines are essential in the control of behaviors and emotions, and that each of them is involved in a different behavior or emotion. The serotonin axis seems to represent aspects such as self-confidence, inner strength and satisfaction. The dopamine axis has been found to be involved in reward, motivation and reinforcement. Noradrenaline has been coupled to the fight-or-flight response and to stress and anxiety, and appears to represent an axis of activation, vigilance and attention. From this, the 3-D representation shown in Figure 6 is derived. The basic emotions that label the axes are derived from Tomkins' affect theory (Tomkins, 1962). 
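Since each axis of the cube is binarized into low/high signaling, the eight basic emotions can be sketched as corners of a binary 3-tuple space. The corner assignments below follow the table commonly reported from Lövheim (2012), but they are reproduced from memory and should be verified against the original paper and Figure 6 before reuse.

```python
# Sketch of Lövheim's cube: eight basic emotions (labels from Tomkins'
# affect theory) at the corners of a 3-D space whose axes are the
# monoamine signaling levels (serotonin 5-HT, dopamine DA, noradrenaline NE).
# 0 = low signaling, 1 = high. Assignments should be checked against
# Lövheim (2012); the structural idea is the point of this sketch.
LOVHEIM_CUBE = {
    #                       (5-HT, DA, NE)
    "shame/humiliation":    (0, 0, 0),
    "distress/anguish":     (0, 0, 1),
    "fear/terror":          (0, 1, 0),
    "anger/rage":           (0, 1, 1),
    "contempt/disgust":     (1, 0, 0),
    "surprise":             (1, 0, 1),
    "enjoyment/joy":        (1, 1, 0),
    "interest/excitement":  (1, 1, 1),
}
```

The representation is categorical (eight discrete emotions) and dimensional (three continuous neurotransmitter axes) at once, which is exactly the merging the text describes.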
Speech carries two different kinds of emotion information: the first refers to what is said, and the corresponding task is recognizing the emotion from the explicit message of the speech; the second is the implicit message, i.e. how it is said, and the corresponding task is detecting the emotion from the pitch, intonation and prosody of the speaker. Emotion recognition from the explicit part of the speech can be reduced to a text-based emotion detection problem (Kao, Liu, Yang, Hsieh, & Soo, 2009) by applying speech-to-text algorithms. Approaches to recognizing emotions in text can be divided into three categories: 1) Keyword-based detection: emotions are detected by the presence of certain keywords in the input text; 2) Learning-based detection: emotions are detected based on previous training results, which collect specific statistics through learning methods; 3) Hybrid detection: emotions are detected by combining detected keywords, learned patterns, and other supplementary information. Keyword-based emotion detection serves as the starting point of textual emotion recognition. Once the set of emotion labels (and related words) is constructed, it can be used exhaustively to examine whether a sentence contains any emotion. This type of emotion detection has the advantage of being very straightforward and easy to use, but it has some limitations: the meaning of a keyword can be ambiguous, since the same word can change its meaning according to usage and context; and a keyword-based detector cannot recognize sentences that do not contain keywords. Since this approach is based entirely on the set of emotion keywords, sentences without any keyword are assumed not to contain any emotion at all. Another limitation is that the keyword-based approach does not take linguistic information into account. 
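A keyword-based detector of the kind described above can be sketched in a few lines: a lexicon maps keywords to emotion labels, and a sentence is tagged with every emotion whose keywords appear in it. The lexicon below is a tiny invented stand-in for illustration, not a real linguistic resource.

```python
import re

# Toy lexicon mapping emotion keywords to emotion labels
# (illustrative only; a real system would use a curated resource).
LEXICON = {
    "happy": "joy", "glad": "joy",
    "sad": "sadness", "cry": "sadness",
    "afraid": "fear", "scared": "fear",
    "angry": "anger", "furious": "anger",
}

def detect_emotions(sentence: str) -> set[str]:
    """Return the set of emotions whose keywords appear in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {LEXICON[t] for t in tokens if t in LEXICON}
```

For example, `detect_emotions("I'm happy but a bit scared")` yields both joy and fear, while a sentence with no lexicon word yields the empty set, illustrating the limitation noted above: keyword absence is silently interpreted as emotional neutrality, and word-sense ambiguity is ignored entirely.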
The learning-based approach classifies input texts into different emotions using a previously trained classifier. This approach also has limitations: although learning-based methods can automatically determine the probabilities between features and emotions, they still need keywords, just in the form of features, so the cascading problems are the same as in keyword-based methods. Moreover, most learning-based detectors classify sentences into only two classes, positive and negative. Since keyword-based methods and naive learning-based methods could ...
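The two-class (positive/negative) learning-based detector mentioned above can be sketched as a naive Bayes classifier over bag-of-words features. The training pairs below are invented for illustration; note how the word features play exactly the role of keywords, which is the cascading problem the text points out.

```python
import math
import re
from collections import Counter, defaultdict

# Toy training data (invented for illustration).
TRAIN = [
    ("what a wonderful day", "positive"),
    ("i love this so much", "positive"),
    ("this is terrible and sad", "negative"),
    ("i hate waiting in the rain", "negative"),
]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(pairs):
    """Count per-class word frequencies and class frequencies."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in pairs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the class with the highest log-probability (Laplace-smoothed)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for label in label_counts:
        lp = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for w in tokenize(text):
            lp += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

word_counts, label_counts = train(TRAIN)
```

With this toy model, `classify("i love this wonderful day", word_counts, label_counts)` comes out positive; a sentence made of unseen words falls back on the smoothed priors, mirroring the no-keyword blind spot of the keyword-based approach.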

Similar publications

Article
Full-text available
Computers are arguably among the most powerful machines created by mankind. A specific field of computer technology, called Artificial Intelligence, explores how computers can take on increasingly complex human tasks. In one way of telling the story, the field of Artificial Intelligence started with Alan Turing’s question “Can machines think?” (Tur...
Article
Full-text available
In the human–machine interactive scene of the service robot, obstacle information and destination information are both required, and both kinds of information need to be saved and used at the same time. In order to solve this problem, this paper proposes a topological map construction pipeline based on regional dynamic growth and a map representati...
Conference Paper
Full-text available
This research proposes an on-line incremental 3D reconstruction framework that can be used on human machine interaction (HMI) or augmented reality (AR) applications. There is a wide variety of research opportunities including high performance imaging, multi-view video, virtual view synthesis, etc. One fundamental challenge in geometry reconstructio...
Thesis
Full-text available
Die meisten Unfälle mit Personenschaden in der Bundesrepublik Deutschland sind infolge urbaner Verkehrskonflikte zu verzeichnen. Die Mehrzahl dieser Unfälle findet in Kreuzungssituationen statt (sog. Kreuzen-, Einbiege- und Abbiege-Unfälle). Heutige Assistenzsysteme zur Kollisionsvermeidung oder -abschwächung stoßen in diesen Situationen aufgrund d...

Citations

... For example, there is particular interest in computer-aided affective learning (Moridis and Economides, 2008), where the most significant work is probably the model proposed by Kort et al. (2001), which establishes specific emotions for learning in students in the Science, Math, Engineering, and Technology (SMET) areas. An alternative to the categorical classification with a defined number of emotions is the dimensional approach, whose dimensions include evaluation, activation, control, and power, among others (Bisio et al., 2016). Here, the most common dimensions used for words are arousal, which goes from calm to excited; valence, which goes from negative to positive; and the degree of control or attention that a word might provoke in people (Bradley and Lang, 1999). ...
Article
Purpose Most studies on Sentiment Analysis are performed in English. However, as the third most spoken language on the Internet, Sentiment Analysis for Spanish presents its challenges from a semantic and syntactic point of view. This review presents a scope of the recent advances in this area. Design/methodology/approach A systematic literature review on Sentiment Analysis for the Spanish language was conducted on recognized databases by the research community. Findings Results show classification systems through three different approaches: Lexicon based, Machine Learning based and hybrid approaches. Additionally, different linguistic resources as Lexicon or corpus explicitly developed for the Spanish language were found. Originality/value This study provides academics and professionals, a review of advances in Sentiment Analysis for the Spanish language. Most reviews on Sentiment Analysis are for English, and other languages such as Chinese or Arabic, but no updated reviews were found for Spanish.
Article
Full-text available
During COVID-19 pandemic, interest in mHealth rose dramatically. An ample literature review was carried out to discover whether personality traits could be the basis for mHealth personalization for human-computer interaction improvement. Moreover, the study of three most popular mHealth applications was conducted to determine data collected by users. The results showed that personality traits affected communication and physical activity preferences, motivation, and application usage. mHealth personalization based on personality traits could suggest enjoyable physical activities and motivational communication. mHealth applications already process enough user information to enable seamless inference of personality traits.
Article
We present a novel framework to predict the success of Kickstarter campaigns based on the emotional intensity induced by domain specific aspects. The framework enables to automatically mine (from campaign descriptions and product reviews) clusters of aspects characterizing a domain of interest. A Need Index-based model is built in order to predict whether a campaign will result in success (i.e., reach its funding goal). The easy to interpret Need Index representation enables to understand and monitor the most relevant domain aspects and their related emotional intensities. We tested our framework on Kickstarter campaigns in the dominant domain of mobile games with a prediction accuracy of 94.4%. The methodology opens new ground for further interdisciplinary research on causal inference to support predictions related to customer needs, particularly in the areas of behavioural economics, marketing, brand management and market research.
Article
Full-text available
The emergence of new wearable technologies, such as action cameras and smart glasses, has driven the use of the first-person perspective in computer applications. This field is now attracting the attention and investment of researchers aiming to develop methods to process first-person vision (FPV) video. The current approaches present particular combinations of different image features and quantitative methods to accomplish specific objectives, such as object detection, activity recognition, user–machine interaction, etc. FPV-based navigation is necessary in some special areas, where Global Position System (GPS) or other radio-wave strength methods are blocked, and is especially helpful for visually impaired people. In this paper, we propose a hybrid structure with a convolutional neural network (CNN) and local image features to achieve FPV pedestrian navigation. A novel end-to-end trainable global pooling operator, called AlphaMEX, has been designed to improve the scene classification accuracy of CNNs. A scale-invariant feature transform (SIFT)-based tracking algorithm is employed for movement estimation and trajectory tracking of the person through each frame of FPV images. Experimental results demonstrate the effectiveness of the proposed method. The top-1 error rate of the proposed AlphaMEX-ResNet outperforms the original ResNet (k = 12) by 1.7% on the ImageNet dataset. The CNN-SIFT hybrid pedestrian navigation system reaches 0.57 m average absolute error, which is an adequate accuracy for pedestrian navigation. Both positions and movements can be well estimated by the proposed pedestrian navigation algorithm with a single wearable camera.