Article

Abstract

Automatic human affect recognition is a key step towards more natural human-computer interaction. Recent trends include recognition in the wild using a fusion of audiovisual and physiological sensors, a challenging setting for conventional machine learning algorithms. Since 2010, novel deep learning algorithms have been applied increasingly in this field. In this paper, we review the literature on human affect recognition between 2010 and 2017, with a special focus on approaches using deep neural networks. By classifying a total of 950 studies according to their usage of shallow or deep architectures, we are able to show a trend towards deep learning. Reviewing a subset of 233 studies that employ deep neural networks, we comprehensively quantify their applications in this field. We find that deep learning is used for learning of (i) spatial feature representations, (ii) temporal feature representations, and (iii) joint feature representations for multimodal sensor data. Exemplary state-of-the-art architectures illustrate the progress. Our findings show the role deep architectures will play in human affect recognition, and can serve as a reference point for researchers working on related applications.


Article
Full-text available
This research aims to reveal the semiotic meaning of the poem Syuhadā’ul-`Ilmi wal-Gharbah by Ahmed Shawqi. Shawqi observed the condition of the Egyptian people of his time, who were experiencing hardship, sorrow, poverty, ignorance, and British colonialism. The Egyptian government therefore sent young people to Western countries to study, in the hope that after returning to Egypt they could build their country and nation. To reveal the poem's semiotic meaning, semiotic theory was applied through heuristic reading and hermeneutic (retroactive) reading. The reading proceeded gradually, from the heuristic reading to the hermeneutic or retroactive reading.
Article
Full-text available
This research is motivated by three things: 1) moral literacy is an ability that needs attention in the 21st century, 2) moral literacy is used to make ethical decisions, and 3) the novel Burlian is a literary work that reflects moral literacy. This study aims to describe ethical decision making in the novel Burlian by Tere Liye through the study of moral literacy. The research is qualitative and descriptive. Data were collected through library techniques and analyzed using content analysis. The results indicate that ethical decision making is based on three components of moral literacy, namely ethical sensitivity, ethical reasoning, and moral imagination. In the pattern of the three components, ethical sensitivity is a form of ethical action, whereas ethical reasoning and moral imagination are the foundation of that action.
Article
Full-text available
This research is motivated by the fact that speaking is one of the language skills a student must master, because speaking competence is one of the components of the learning objectives. In learning to speak Indonesian, many students have not yet been able to apply speaking methods properly. They also think that speaking is difficult, when in fact they have not mastered the material they are going to talk about. One way to measure speaking ability is through giving speeches, because this activity requires careful preparation and clear concepts to be carried out properly. This study describes speaking ability in the speaking skills teaching course for Indonesian Language Education students at UMMY Solok. The research is quantitative and uses a descriptive method; the population and sample are the 8 students of the 2018 batch. The results show that the students' speaking abilities can be grouped as follows: a score of 88.9 with a Very Good qualification (BS) was obtained by 4 students (50%); a score of 83.3 with a Good qualification (B) was obtained by 1 student (12.5%); and scores of 72.2 and 66.7 with a More than Adequate qualification (LDC) were obtained by 3 students (37.5%).
Article
Full-text available
Interest in reading is a strong desire accompanied by one's efforts to read. It does not arise suddenly from within a person; rather, it arises from encouragement and the right environment. The COVID-19 pandemic has limited the encouragement and environment that foster primary school students' interest in reading, so attention to the reading interest of elementary school students during the pandemic is needed. During the COVID-19 pandemic, the reading interest of elementary school students decreased. This is evident from the study's results: elementary school students tend to regard reading as an ordinary activity. In addition, they tend to read only because a teacher assigns it or parents order it. The duration of student reading is also limited to 10 to 30 minutes, because of the limitations during the pandemic. As a result, students begin to think of reading as a normal routine, done without motivation or a feeling of happiness.
Article
Full-text available
This research aims to analyze astral projection as a concept of Death and Dying, using an analytical approach to postmodern fiction in the Insidious film script. The research shows that astral projection is a person's ability to leave the physical body and explore an astral world or spirit world. Astral projection is a concept of death and dying introduced as a postmodern fictional strategy known as superimposition. This strategy illustrates that there are two worlds that overlap each other. Its presence is a way of deconstructing thoughts about something that is considered strange (uncanny) and unusual, as well as a counter to the whole that emphasizes the ontological side of the existence of something. McHale (1987) argues that science fiction is a sister genre of postmodern fiction. Science fiction explores ontological issues in order to build a good story, whereas postmodern fiction merely presents the case without having to build a story. Furthermore, the two genres can adopt each other's strategies. The bond between postmodern fiction and fantasy fiction is the same: the two borrow strategies for exploring ontological issues.
Article
Full-text available
In this paper, we discuss the objective and subjective observations the author uses in depicting the world as he travels. The world here is not only landscapes or places but also people. In objective observation, the writer keeps a distance from the world he describes. Subjective observation, on the other hand, emphasizes the involvement of the author's personality in the world being told. Observation always moves from the objective to the subjective: a number of places are described objectively, but the description then shifts to the subjective.
Article
Full-text available
Speech in advertising discourse has a psychological impact on consumers and potential consumers; producers expect that it will interest them and lead them to buy the advertised product. However, utterances in advertisements affect not only the interest of consumers and potential consumers but also the speech acts of a product's users, especially high school students, who may be potential consumers or merely an audience for advertising discourse. Theoretically, the objective of this study is to obtain a deeper picture of psychopragmatics in advertising discourse and utterances that have a psychological impact on high school students in Madiun City. The study uses qualitative methods, with interviews and open questionnaires as data collection tools. The data are analyzed using referential techniques with a distinction theory approach and are presented descriptively. Of the data obtained, 52.8% indicated that the words in the advertisements were seen, mimicked, or muttered by students in acts of communication. Broken down in more detail, the words in the advertisements have more effect on female students (66.7%) than on male students (33.3%). Based on the analysis, the words in advertising discourse had a considerable effect on the psychological condition of the students in performing speech acts. The benefit of this research is to enrich insight into the field of pragmatics, especially psychopragmatics, and into the application of psychopragmatics in language learning.
Article
Full-text available
The purpose of this research is (1) to describe violations of the politeness principle in the speech of children with intellectual disabilities (tunagrahita) at SLB N Ungaran, and (2) to describe the conversational implicature in the speech of those children. The research approach is descriptive qualitative. Data were collected using the observation method with several techniques, namely participatory observation, recording, and note-taking. The study shows that children with intellectual disabilities violate the politeness principle. Violations of the politeness principle account for 61.40% of the data. They involve the tact maxim (10.52%), the generosity maxim (8.77%), the approbation maxim (7.01%), the modesty maxim (7.01%), the agreement maxim (14.03%), and the sympathy maxim (12.28%). The most frequently violated maxim is the agreement maxim (14.03%). Conversational implicature accounts for 38.59%, comprising conventional implicature (12.28%), nonconventional implicature (14.03%), and presupposition (12.28%). The most frequent type is nonconventional implicature (14.03%). The violations arise from intentional and accidental factors on the part of speakers and speech partners communicating in everyday language.
Article
Full-text available
This study aims to find a new, practical method for learning and teaching literature to students and to determine student responses to the assignment of poetry musicalization projects through social media. The results show that students' creativity appears both in their works and in the way they package them when uploading, so that they are able to apply the theories or concepts learned during the semester in their final work, from the poem produced with its intrinsic elements, to the musical arrangement that accompanies the reading of the poem, to the final collection process on social media.
Article
Full-text available
The goal of this research is to identify the register in the language of sugar glider lovers on the social media platform Facebook. Data were taken from the Facebook group Sugar Glider House. The data were analyzed using the equivalent method. The results show that a register exists among sugar glider lovers, characterized by its form, meaning, and function.
Article
Full-text available
Reading interest is a strong desire accompanied by one's efforts to read. Reading interest does not arise suddenly from within a person; it arises from encouragement and the right environment. The COVID-19 pandemic has limited the encouragement and environment that foster primary school students' reading interest. Therefore, this study focuses on the reading interest of elementary school students during the COVID-19 pandemic. It aims to determine the impact of the pandemic on their reading interest. The research method is descriptive qualitative.
Article
Full-text available
Today, studies of post-colonial urban space in relation to modern Indonesian prose remain minimal. Yet the history of Indonesian literature is endowed with writers such as Armijn Pane, Idrus, Pramoedya Ananta Toer, Iwan Simatupang, and Budi Darma, who consistently write prose set in ex-colonial towns. Unfortunately, there are not enough studies of the encounter between those writers and the city as a representation of post-colonial experience. This paper tracks the integral relationship between modern Indonesian prose writers and the city, using spatial theory in relation to the study of post-colonial literature in Indonesia. The analysis takes Sara Upstone's spatial theory as its approach, especially with regard to the space of the city.
Article
Full-text available
The emergence of literary criticism as a sort of "literary institution" in the modern sense is a relatively new phenomenon in France: it took place in the 19th century and became more institutionalized in the 20th century.
Article
Full-text available
The source of data in this study is online news published by radarmadiun.co.id in its October 2021 edition. The data collection techniques are documentation, reading, and note-taking. The data analysis techniques are data reduction, data presentation, and verification or conclusion drawing. The results concern the Ngawi regional news in the online media radarmadiun.co.id, October 2021 edition. In terms of inclusion in Van Leeuwen's version of critical discourse analysis, there are objectivation-abstraction, nomination-categorization, and nomination-identification.
Article
Full-text available
This study aimed to describe the terms and symbolic meanings of the patra ornaments of the Javanese keris. The meaning is not only lexical but also cultural, that is, meaning associated with the civilization prevailing in that society. The results indicate that the patra ornament of the Javanese keris has specific terms for each part. The meaning contained in the patra deder of the Javanese keris relates to human life: religious support, behaving politely and without arrogance, respecting each other, being responsible, obeying prevailing norms, and thinking positively are essential in human life.
Article
Full-text available
The Internet is believed to have changed reading habits: deep reading has shifted because of easy access to the Net. This research is intended to unravel the reading behavior of junior high school students in the Municipality of Malang. The results indicate that reading is not listed as a favorite pastime: the students spend little time reading. They tend to spend their time accessing the Internet, in a way different from before. The analysis shows a positive correlation between reading habit and reading variables related to the Internet.
Article
Full-text available
The research reviews symbolic violence in mass media. The aspects reviewed include (a) the forms, (b) the strategies, and (c) the effects of symbolic violence on the reader. The research uses the critical discourse analysis design developed by Fairclough. The data analysis found that symbolic violence is realized in news text in the form of (a) blurred meaning, (b) biased logic, and (c) biased values, all of which are forms of biased information in news text. The strategies of symbolic violence found are (a) the softening of information, (b) information reasoning (logic), and (c) creating positive information. The effects of symbolic violence on readers are seen in how they receive information from the mass media, which includes (a) reception based on prejudiced attitudes and views and (b) reception based on a neutral attitude. Symbolic violence does not affect the readers' reality in the short term.
Article
Full-text available
This study aims to analyze astral projection as a concept of Death and Dying from a postmodern fiction discourse analysis perspective in the Insidious movie script. The study found that astral projection is a person's capability to leave the physical body and explore an astral world or spirit world. Astral projection is a death and dying concept presented as one of the postmodern fictional strategies known as superimposition. This strategy illustrates that there are two worlds that accumulate and co-exist with each other. Its presence is a way of deconstructing thoughts about something that is considered uncanny and unusual, as well as a counterpart of totality that foregrounds the ontological side of the existence of something. McHale (1987) says that science fiction is a sister genre of postmodern fiction. Science fiction explores ontological issues in order to build a good story, while postmodern fiction simply presents the problem without having to build a story. Furthermore, both genres can adopt each other's strategies. The relation between postmodern fiction and fantasy fiction is the same: the two borrow strategies for exploring ontological issues.
Article
Full-text available
Questioning the terminology of ecranization, filmization, or the screen adaptation of literary works, commonly known as the change of a literary work from a textual product into a film as a visual product and vice versa, shows that each creative process requires a fairly good process of imagining and depicting. Eagleton explains that imagery in the mind is the subjective work of the energy of human spirituality, which is not limited by absolute truth but is controlled by the consciousness of human reason. From there arises the idea of ownership of literary works as a representative of the latest form of energy about the depiction. Therefore, the transfer process requires adjustment, description, and interpretation, so some conversions will occur. The ecranization process is also known as filmization and is more often referred to as adaptation, transformation, or transfer. Such processes often lead to the creation of new characters as part of the change, for example from short stories to films or vice versa. One example is the film Tak Ada Yang Gila di Kota Ini, which was rewritten, developed, and adapted into a short film from the short story of the same title, one of the many short stories in Eka Kurniawan's book Cinta Tak Ada Mati. Wregas' decision to adapt Tak Ada Yang Gila di Kota Ini (Nothing Crazy in This City) from Eka Kurniawan's book was not without reason. Wregas' concern with the story of a group of people who are mentally marginalized, alienated from civilization, and often considered unfit for social life became a good premise to reveal and develop, especially the story of the characters with psychological problems in Eka Kurniawan's short story Tak Ada Yang Gila di Kota Ini.
Kartono said that abnormal psychology is closely related to deviations and abnormal behavior. In fact, the boundary between the concepts of normal and abnormal is very difficult to identify. In this study, a character named Athirah found a change in principle
Article
Full-text available
Indonesia was born out of war, a revolution against Dutch colonialism assisted by several other European countries and the military. Members of the military became entrepreneurs and ministers; they are even in charge of religion and art.
Article
Full-text available
This study analyzes memory formation, along with its transmission, and forgiveness as tools used by Gin, the main character, to form a discourse of truth within a society whose way of thinking had not yet been touched by the modernization of the Meiji Restoration era. The novel Hanauzumi implies that a bitter and tragic past cannot simply be forgotten by Gin. Furthermore, Gin has a chance at redemption by striving to realize her goal of entering the medical profession as an act of solidarity with fellow women. The purpose of this study is to present memory and forgiveness as tools Gin uses to form the discourse of truth, thus giving her the power to fight for justice and for women's rights, both her own and those of other women. The method used is descriptive qualitative. The research illustrates the traumatic events experienced by the main character, which became the starting point of her struggle to become the first female doctor in Japan.
Article
Full-text available
Social media is common nowadays, especially for millennials as the net generation. Social media itself is vulnerable to the spread of hoaxes. Adolescents under 20 years old tend to be more easily taken in by hoaxes. They have high self-confidence and are active on social media. Teenagers under 20 are usually students, who are part of the net generation. Students are easily taken in by hoaxes, one reason being a lack of digital literacy skills, which makes them vulnerable. Students, especially those under 20, also need the ability to think critically. When viewing or receiving information from social media, it is best to check the truth of the information first and to be careful when digesting it. Information or news that is not relevant should be reported so that it does not spread widely. Doing so shows concern about the spread of hoaxes on social media. Social media is a platform where we can communicate and obtain information very easily.
Article
Full-text available
This article examines how the editorials in Koran Tempo and Kompas represent their ideology regarding the handling of COVID-19 in Indonesia. This linguistic research is conducted qualitatively. The data were in the form of Indonesian-language editorial discourse that discussed the handling of COVID-19 in Indonesia. The written research data were taken from national newspapers, namely Koran Tempo and Kompas, and were obtained through listening and note-taking techniques. They were then analyzed using Van Dijk’s critical discourse analysis model. The results of the analysis show that there are differences in the representation of ideology in Koran Tempo and Kompas on the handling of COVID-19 in Indonesia through their editorials, which are systematically constructed in microstructure, superstructure, and macrostructure. In the microstructure, ideology is realized through the lexicon, specifically the use of the dominant persona; the use of syntactic structures in the form of active-passive sentences, affirmative sentences, and imperative sentences; and the use of repetition styles and metaphors. Koran Tempo uses ideological patterns as actions and ideological beliefs in its superstructure. Meanwhile, Kompas uses ideological patterns as systems of thought and systems of action. The difference between the microstructure and the superstructure results in a different macrostructure. Koran Tempo portrays the government as the key stakeholder in handling COVID-19 in Indonesia. Meanwhile, Kompas’ editorial was directed at how the handling of COVID-19 was done through communal actions. The Koran Tempo ideology underlines who has a role in handling COVID-19, while the Kompas ideology focuses on what needs to be done in handling COVID-19.
Article
Full-text available
The moral decadence of the nation's children today is very worrying. It affects not only students at ordinary schools but also students in madrasahs, which focus more on Islamic education and the Arabic language than ordinary schools do. Therefore, student character should be cultivated through curriculum development in all subjects, including Arabic lessons. Curriculum development is carried out by (1) using teaching materials drawn from texts rich in character values, (2) maximizing mahfuzhat learning, (3) organizing routine Arabic language activities, (4) providing an example, and (5) conditioning. Inculcating character through curriculum development can help create a young generation that understands the character needed for the future.
Article
Full-text available
The short film “Tilik” has become a hot topic of discussion among Indonesian netizens on social media. Many things in this film attract attention and can be studied. One of them is the gossip phenomenon displayed in almost every part, which represents the reality of Indonesian people's lives, especially in rural communities with strong social interactions. Therefore, the aim of this research is to examine the actors' dialogues that reflect one form of non-formal social control, namely rumors or gossip. The research method is descriptive qualitative, using a sociology of literature analysis to connect the content of the film to the reality of existing societal phenomena. The study reached the following conclusions: first, there is a close relationship between literary works and the real life of society, shown by gossip activities that have become a phenomenon in society, especially among women. Second, gossip in society functions to influence individuals to comply with applicable norms and rules. © 2021 The Authors. Published by UNNES. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). E-mail: ikeheppiyani@upi.edu. Address: Jl. Dr. Setiabudhi no 229 Bandung. DOI 10.15294/jsi.v10i2.47268
Article
Full-text available
Ghost stories, both oral and written, have existed as myths for hundreds of years, and urban legends have a place in literature. Typically, female ghosts are the most well-known and storied spirits. These female ghosts seem to embody the authors' own feminine values as well as those of ghosts. This study investigates the female ghosts in Intan Paramaditha's short story collection Sihir Perempuan, which she wrote using a feminist construction approach. The conceptual foundation of the study is Cixous' (2008) theory of feminist writing and a method called "shift to the center." The research reveals a paradox in Intan Paramaditha's reconstruction of femininity. Female ghosts are portrayed as myths in an effort to promote femininity.
Article
Full-text available
The emergence of literary criticism as a "literary institution", in the modern sense, is a relatively new phenomenon in France, dating from around the 19th century and increasingly institutionalized in the 20th century. The institutionalization of literary criticism emerged along with the separation between criticism by creators acting as critics and criticism by the “critic”, and along with the collapse of the poetic representation built on the hierarchy of literary genres and on the superiority of the critic over the creator in the 17th century. This article examines the emergence of the institutionalization of literary criticism and the phenomenon of the duality of literary criticism through Roland Barthes' theory of "two criticisms", applied to the practice of reading literary criticism in France from the 16th century to the middle of the 20th century.
Article
Full-text available
Writing about environmental issues before and after climate change and other human-made ecological damage, Eka Budianta has continually taken up environmental topics in his oeuvre. This article scrutinizes six selected poems by Eka Budianta to see how the poet has dealt with ecological debate over the years. The six poems comprise four poems written in 1984, i. Third, an optimistic tone is palpable in Eka Budianta’s newest poem «Sungai Sejati». This study concurs that literature can take part in exposing global climate change and in advocating sustainable living, in a way often ignored by an ecological praxis that celebrates only concrete results.
Article
Full-text available
have anything at the same time. The economic gap was so high. There were a fair number of rich people who could afford cars and lux. Most of his creative works were poems, which were written in his youth. Many of his poems have been studied by researchers, but most of the studies focus on the figures of speech he uses in them. Thus, there has been no research examining how Reagan's poems reveal his unconscious perception of Am. ed as drowning into alienation in the poem's text. The interpretation of "we"-with the structure constructed by the 'big Other'-in the analysis is based on Žižek's dialectical materialism perspective in conjunction with Lacan's conception of the symbolic, the imaginary, and the real. Žižek believes that the subject will not bear its own desire. Rather, it is manipulated by 'big Other's desire. Therefore, it is also important to look at the manipulated desire which leads to the pseudo activity that the subject does. The symbolic and imaginary aspects of the poem are specifically examined to discover the realm of the 'big. Repeated 20 times in the text, and it is mostly present in lines that express the struggle of fighting. The speaker glorifies the struggle as a persuasive invitation. The similarity between particular lines in his poem and the abovementioned sentences in his speech makes the topic in
Article
Full-text available
Indonesia is recognized as one of the main exporters of migrant workers in Southeast Asia and has become the largest country of origin for international labor migration. Several Indonesian citizens who were sent abroad as Indonesian Migrant Workers have not been officially recorded, and several IMDWs work in the private sector. Several narratives circulate to describe IMDWs' illegitimate conditions, and in most cases they lack nuance. These narratives can be one-sided, coming from the media or from employers. When the story of the IMDWs' situation is told, the workers are always portrayed as a problem because some of them face problems and conflicts in the workplace. General depictions with little nuance tend to build condescending images. As a result, the prevailing narrative is always unfavorable and undignified for the immigrants. Six poems written by known people describe migration experiences that differ greatly from the existing narratives. Studies have shown how the resilience and courage of immigrants contribute to their migration experience, and how this mental strength supports their struggle to renegotiate their status, subvert stigma, and rewrite negative narratives about them. The renegotiation of the current stereotype of IMDWs, and its challenges, is observed through the presence of fortitude, courage, subversion, and protest or confrontation in the six selected poems. Each of them is contextualized in the fact that IMDWs have long been the target of social stigma because of their work and that they experience a master/slave binary relationship with their employers, which they seek to confront through these poems.
Article
Full-text available
In the third semester, short stories form part of the intermediate reading course material, which aims to develop students' reading strategies in making inferences; analyzing figurative language, word choice, word order, organization, and idea development; identifying types of tests; and appreciating short stories (Department of English Catalogue, 2018). The course carries three credits and is delivered twice a week for one semester. Of the 32 meetings, 16 were for discussing popular articles, 14 were for discussing short stories, and the first was devoted to introducing the subject through the Course Profile (Intermediate Reading Course Profile, 2018). Popular articles and short stories are discussed alternately, starting from the third week. The course material consists of two parts. The first part covers critical words, inference, figurative language, diction, and juxtaposition (Andreani and Oka, 2012). Each topic begins with theory, followed by exercises on sentences, paragraphs, and articles so that students' skills increase.
Article
Full-text available
Humans always interact in social life through language. The human speech apparatus produces a system of sound symbols that is used as a communication tool called language (Agustina, 2020). Language is a part of human culture that has very high value; as Agustina (2020) conveys, through language all feelings, ideas, desires, and experiences can be channeled both orally and in writing. A person who can produce sentences that have never been heard before and knows thousands of sentences can be said to have mastered the language. The purpose of this research is to describe the locutionary speech acts of the characters in Tere Liye's novel About Us. Theoretically, this research is expected to benefit the development of pragmatic theory regarding how to analyze speech acts in novels, while practically it can be used as teaching material for students regarding the types of locutionary speech acts. Locutionary speech acts of statements (declarative) only serve to tell something to others so that listeners are expected to pay attention. Locutionary speech acts of questions (interrogative) serve to ask something with the aim that the audience can provide an answer to a question posed by the speaker. Locutionary speech acts of commands (imperative) aim to make the audience react to the requested activity or action. In Tere Liye's novel About Us, according to the results of the analysis of internal locutionary speech acts, three types of locutionary forms are found: statement locutions (declarative), question locutions (interrogative), and command locutions (imperative). The locutionary statements in the novel About Us mostly contain information that the speaker wants to convey to the speech partner. The locutionary questions contained in the novel About Us are marked by punctuation marks that state the
Article
Full-text available
Research on the transformation of the Kamandaka legend into a ballet in Banyumas is one of the research focuses on the potential of local wisdom. Using a descriptive-analytic method supported by the theory of transformation, this research examines how the Kamandaka legend changes shape when it becomes a ballet in Banyumas. These changes occur in each of the ballet's five rounds. The first round immediately displays Kamandaka, who meets the gods, and omits the story of the Padjajaran kingdom. The second round features King Pulebahas, who wants to propose to Ciptarasa, whereas in the legend this occurs in the last parts of the story. The third round is the contest of Kamandaka, who has changed into a langur without Ciptarasa knowing, whereas in the legend Ciptarasa knows. The fourth round presents the slapstick of the mother waistband and the “prenes” of Kamandaka; in the legend, the mother waistband slapstick is not depicted and Kamandaka's prenes occurs before the langur contest. In round five, the fight between Kamandaka and Pulebahas is the highlight of the story, while there is still no continuity with the royal attack of Pulebahas of Nusatembini on Pasir Luhur. This transformation also serves as an entertainment alternative for the preservation of local wisdom that has potential as a tourist attraction.
Article
Full-text available
Although there are no standard rules for the use of Indonesian on social media, this journal conveys that social media users must observe ethics when using it. According to this journal's research, social media texts, whether they use positive or negative words, tend to contain impolite expressions. In addition, technology has advanced and there are many social media platforms on which to express oneself; social media is so fast that it greatly affects the form and nature of interaction between individuals. We as Indonesians should start using good and correct Indonesian, convey our thoughts politely, and appreciate the feedback given by others. The rise of hate speech is contrary to our culture, which emphasizes good manners. In this journal, researchers try to summarize the impoliteness of social media users in expressing opinions and also channeling typing in social media. The purposes of this study are 1) to describe the forms of positive and negative face-threatening acts in social media texts based on Brown and Levinson's theory; and 2) to describe the politeness strategies in speech in social media texts. This research was conducted on social media texts from Facebook and Twitter in 2018. The method used is the qualitative descriptive method. Data were collected using a note-taking technique. There are four subbidal faces in Facebook text media, namely 1) expression of disapproval, 2) emotional expression, 3) impolite expression. For adverse action in Facebook social media text there are two subbidal, namely 1) warning expression, and 2) expression of negative falling. There are five subbidials, 1) expression of non-negative on the text of social media Twitter there are two subbidal, that is 1) warning expression, and 2) expression of negative feelings.
Article
Full-text available
Self-study in English is now becoming the norm, especially in the context of online learning during the 2020 pandemic. This study aims to determine learners' level of awareness of and attitudes towards independent English learning, as well as the affordances and challenges in implementing independent English learning. This research uses a cross-sectional survey design involving students majoring in English at a leading private university in Malang, East Java. The results of descriptive data analysis show that awareness of the importance of learning English independently is very high, while the respondents' attitude towards learning English independently is also generally positive. According to respondents, this learning, especially when conducted in an online and informal context, is fun and educational, but some technical problems were also found. The next aspect discusses the various things that allow respondents to carry out self-study of English, which are found in the technical area, learning resources, and learning management. As for the challenges, findings from the data analysis show that technical constraints and aspects of learning management are the two main factors.
Article
Full-text available
A Narrative About Discrimination, Racism, and the Holocaust The Nazi regime, in power since 1933, made no secret of its hatred of Jews. Adolf Hitler even declared that "rational antisemitism must lead to a system of legal opposition, the ultimate goal of wiping out all Jews". He also developed the idea that Jews were the evil race that dominated the world. Nazi anti-Semitism was instilled in religious situations and in political contexts.
Article
Full-text available
A "fairy tale" is a short story that aims to entertain the reader. Usually, fairy tales are intended for children to read or listen to the stories. In addition, fairy tales also aim to convey moral values to the reader, explicitly or implicitly. The use of language that is simple and easy to understand is one of the reasons why fairy tales are so popular. One of the novels known throughout the world is Haensel and Gretel. The popularity of the novel finally made many book publishers around the world publish the novel Haensel and Gretel in a translated version. The number of popular fairy tales from foreign works attracts Indonesian people. Because of language limitations, translated fairy tales are an alternative for people to read fairy tales abroad. However, when we read translated fairy tales, we often feel uncomfortable. Unclear linguistic stories and unfamiliar words are the reasons for our discomfort in reading translated fairy tales. The interpretation of external terms that become ambiguous sentences when translated makes people unable to understand what moral values the author wants to convey. These things become the weaknesses of translated fairy tales. The Haensel and Gretel fairy tales are very popular among children. The use of language that is easily understood by children makes fairy tales a good vehicle to convey moral values. Because the Haensel and Gretel fairy tale is a foreign fairy tale, many people, especially book publishers in Indonesia, want to publish the Indonesian version of the Haensel and Gretel fairy tale. However, the writers of these translated fairy tales often do not pay attention to the differences between languages. In translating a fairy tale, the writer must know the difference between the original language of the fairy tale and the language of the translation. Cultural differences can also lead to terms that are not known by Indonesians, so that the sentences produced from the translation become ambiguous. 
As a result, the message that the author of the fairy tale intends is not conveyed properly to the reader.
Article
Full-text available
Although previous studies have given so much attention to exploring the students’ attributions in various settings and academic levels, there is a scarcity of research that presents and compares findings on the attributions of Indonesian university students, specifically in the reading (receptive skill) and writing (productive skill) courses. This study researched students’ attributions in their English as a Foreign Language (EFL) writing and reading classrooms at a private university in Indonesia.
Article
Full-text available
The environmental destruction of agriculture in Kailasa Village is the main problem facing the character Yahya in relation to nature and the environment. The existence of a communal movement of farmers that is anthropocentric in nature opens up contestation over access to natural resources. Nature is used to obtain great benefits, both for the farmers and for other interested parties, but there is no long-term balance in the agricultural areas. This is observed from an ecocritical point of view, through the narrative of ecosystems in the contestation of ecological interests.
Article
Full-text available
A plethora of research has been carried out internationally among students of different levels to explore the success or failure attributions of their learning. In a large-scale study involving 2,000 Taiwanese secondary school students. Researching 319 adult students who learned EFL in an intensive language program in Turkey, highlighted students' efforts as the main attributions for the students' success in their EFL learning. Research involving 113 students from three high schools in Morocco, where teacher influence and classroom became the main attributions for the students' success in their English language learning. Therefore, teachers who can motivate their students to study. The study revealed that high-proficiency students regarded effort and ability as the attributions for their success and considered classroom atmosphere and interest in the task as the attributions for their failure. Although previous studies have given much attention to exploring students' attributions, there is a scarcity of research that presents and compares findings on the attributions of Indonesian university students. With these justifications, the present study aimed to explore Indonesian students' attributions in their EFL reading and writing courses and answer the following research questions: 1. Do the students feel that they successfully achieve learning objectives in their EFL reading and writing classes? and 2. How do they clarify the most common attributions for their success or failure in achieving the learning objectives in the classes? This study was conducted in two courses in an English Language Education Program of a private university (EDU) in Indonesia. The selection of those two courses, reading (receptive skill) and
Article
Full-text available
Women can't be equated with men, especially as nowadays women are no longer synonymous with three things, namely the kitchen, the well, and the mattress. In the journal, Nurjanah's character changed because her outlook on life changed. According to her, marriage is what makes her disappear so that she is not herself, and she wants to be independent. She has her own belief as a woman that she must be autonomous. Indeed, nothing prohibits her from adhering to the principles of her life. But what Nurjannah does leads not to positive things but to harm to herself. This journal itself tells us that there is nothing wrong with being an autonomous woman, but we must be able to become women who are firm in their stance and not easily provoked by others; women still have to carry out their obligations as mothers, as the light in the house they have chosen, and the movement we want to start must be positive, not harming ourselves or others. Women are seen as weak, and the many assumptions that circulate in the community about women cause them to be further marginalized. The problems of this study are: what form does women's rebellion take, and what factors cause the emergence of women's rebellion in the novel Perempuan Badai by Mustofa Wahid Hasyim? The purpose of this research is to describe the form of women's rebellion and the factors causing its emergence in the novel Perempuan Badai by Mustofa Wahid Hasyim. This study uses the theories of feminism, rebellion, and violence. The data, which are the object of the research, are parts of the text of the novel. The results showed the other side of women's lives, a phenomenon that rarely happens, in which women show the determination and persistence to get out of a life of disadvantage. The rebellion described in this novel shows that nowadays being a woman is not easy. If they stay in place, not following the flow of change is a problem. They are considered stone women and deserving of being marginalized by society.
Article
Full-text available
The journal review discussed a poem entitled "Perhaps Because of the Moon" by W.S Rendra. Understanding the meaning and the symbols is needed so that readers can appreciate poetry well. In this journal, discussions about the meaning of poetry, the elaboration of literary works, and the presentation of an analysis of the meaning of marking, symbols, imitation of a literary work, and the criticisms that want to be expressed in this poem are conveyed well so that readers can understand the intent to be conveyed by the author. Creating literary works requires the expression of various forms of emotions related to life. The journal explains the meaning and the expressions in the poem well so that they can be easily understood. By analyzing the elements that exist in poetry, readers and other parties can be helped in knowing other points of view in an analysis of poetry. Each country has its own literary works in its language. This is the uniqueness of the nation and must be preserved. Literary works are divided into three main categories, namely prose fiction, poetry, and drama. In this article, we discuss the criticism of the poem entitled "Perhaps Because of the Moon". Poetry expresses forms of language as a unique picture that can be different for each author. The forms of expression in question are emotions and also problems related to life. In the poem "Perhaps Because of the Moon" there is a form of expression with a special sign and meaning in it. The method used to understand the meaning of this poem is the reflection method of poetry. This method is done by retelling poetry and also linking poetry with real life. In addition, analyzing poetry requires a critical and analytical character. There are four approaches to critically understanding meaning.
The first is a mimetic approach that is oriented towards imitating the universe and nature; then a pragmatic approach with reader orientation; an expressive approach that pays attention to the elements of the author; and an objective approach that pays attention to the literary work itself.
Article
Full-text available
Appreciating the Indonesian contemporary paintings created by Agus Suwage must be related to the painter's birth, existence, culture, and the condition of the nation's environment. Mostly, Suwage exploits himself as an object which later becomes a medium to express his restlessness through his ideas. The object of the self-portrait in Suwage's painting is a metaphoric icon. However, the portraiture technique that Suwage chooses in his paintings does not tell a story about himself.
Article
Full-text available
Humans interact through communication and tend to make adjustments with their interlocutors. This event is known as communication accommodation. This study aims to describe the forms of communication accommodation that occur in conversations between multilingual family members and identify the factors that encourage the emergence of such communication accommodation. The data in this study are conversational data between multilingual family members. Based on the results of the analysis, it was found that there is a communication accommodation process that occurs in conversations between multilingual family members. These accommodations are either convergent or divergent.
Article
Full-text available
The novel Negeri Lima Menara by Ahmad Fuadi is one of the novels that uses a pesantren as its main setting. Islamic boarding schools are educational institutions in the field of religion that provide the teaching of Islam, as well as the development and spread of Islam. The purpose of this study is to describe the life of the pesantren in the novel Negeri Lima Menara and to analyze the forms of social interaction in the novel. This research method is qualitative, using social interaction theory. The literature technique is carried out using descriptive analysis. The results of this study show that the social interactions in the novel Negeri Lima Menara are associative and dissociative. Associative interactions include cooperation, acculturation, and accommodation, whereas dissociative interactions include conflict, competition, and contravention.
Article
Full-text available
Poetry is a literary work that is unique and distinguishes it from other literary works. In addition, poetry can also have characteristics made by the poet in creating the poem where the application of symbols and meanings plays a significant role in the process of interpreting meaning in poetry. Therefore, an analysis of poetry is needed to understand the symbolism, meaning, or even imitation in poetry. One approach that can be taken to understand this process is the mimetic literary criticism approach which is an approach to interpreting the model, personification, meaning, and symbolism of a work of poetry by depicting it using the reality that exists in the universe. This study uses the poem "Barangkali Karena Bulan" by W.S Rendra. This study aims to interpret the meanings and symbols of mimesis in the poem by looking at another point of view in an analysis of poetry. This research was conducted using a qualitative method with a descriptive-analytic approach. The results showed that WS Rendra used the disclosure of symbols and imitation of the universe to express his love and longing for someone.
Article
Full-text available
Since March 2020, the coronavirus has spread over the world. The education sector has experienced significant disruption, as have all other industries. Teachers and lecturers, in particular, hurried to switch their learning methods to online ones. Language is a communication aid. Symbols and consensus build up into messages that communicators express through language. Language requires courtesy. The critical requirement for effective communication is courteous language. Language etiquette is the central topic that needs research regarding online learning. This study aims to identify and characterize the compliance and courtesy breaches that occur while learning a language. A methodological and theoretical approach is used in this study. The research data are presented in the form of (written) speech conveyed by students via the LMS (Bella). The students of the Indonesian Language Education Study Program (Madiun City Campus) of Widya Mandala Catholic University Surabaya served as the source of the research data. According to the research, students understand the rules of linguistic etiquette when speaking, particularly in the official (educational) setting.
Conference Paper
Full-text available
Research in automatic affect recognition has come a long way. This paper describes the fifth Emotion Recognition in the Wild (EmotiW) challenge 2017. EmotiW aims at providing a common benchmarking platform for researchers working on different aspects of affective computing. This year there are two sub-challenges: a) audio-video emotion recognition and b) group-level emotion recognition. These challenges are based on the acted facial expressions in the wild and group affect databases, respectively. The particular focus of the challenge is to evaluate methods in 'in the wild' settings. 'In the wild' here is used to describe the various environments represented in the images and videos, which represent real-world (not lab-like) scenarios. The baseline, data, and protocol of the two challenges and the challenge participation are discussed in detail in this paper.
Conference Paper
Full-text available
This paper presents our approach for group-level emotion recognition in the Emotion Recognition in the Wild Challenge 2017. The task is to classify an image into one of the group emotion categories: positive, neutral, or negative. Our approach is based on two types of Convolutional Neural Networks (CNNs), namely individual facial emotion CNNs and global image based CNNs. For the individual facial emotion CNNs, we first extract all the faces in an image, and assign the image label to all faces for training. In particular, we utilize a large-margin softmax loss for discriminative learning and we train two CNNs on both aligned and non-aligned faces. For the global image based CNNs, we compare several recent state-of-the-art network structures and data augmentation strategies to boost performance. For a test image, we average the scores from all faces and the image to predict the final group emotion category. We win the challenge with accuracies of 83.9% and 80.9% on the validation set and testing set respectively, which improve the baseline results by about 30%.
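The test-time fusion described in this abstract can be illustrated with a minimal sketch. The abstract does not say how face-level and image-level scores are weighted, so the scheme below is an assumption: face scores are first averaged across faces, then averaged with the global image scores with equal weight. The function names and the score layout are hypothetical; only the three category labels come from the abstract.

```python
# Category labels as named in the abstract.
LABELS = ("positive", "neutral", "negative")


def mean_scores(score_rows):
    """Element-wise mean over a non-empty list of per-class score lists."""
    n = len(score_rows)
    return [sum(row[k] for row in score_rows) / n for k in range(len(LABELS))]


def fuse_group_scores(face_scores, image_scores):
    """Fuse per-face and global-image class scores (assumed equal weighting).

    face_scores: list of per-class score lists, one per detected face.
    image_scores: per-class scores from the global image model.
    Returns (predicted_label, fused_scores).
    """
    if face_scores:
        face_mean = mean_scores(face_scores)
        fused = [(f + g) / 2 for f, g in zip(face_mean, image_scores)]
    else:
        # No face detected: fall back to the global image scores alone.
        fused = list(image_scores)
    best = max(range(len(LABELS)), key=lambda k: fused[k])
    return LABELS[best], fused


if __name__ == "__main__":
    faces = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
    print(fuse_group_scores(faces, [0.7, 0.2, 0.1]))
```

Averaging faces first keeps an image with many faces from drowning out the global image model; pooling all faces and the image into one flat average would be an equally plausible reading of the abstract.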
Article
Full-text available
Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their effectiveness limits further improvements on their architectures. In this paper, we present a visual analytics method for understanding and comparing RNN models for NLP tasks. We propose a technique to explain the function of individual hidden state units based on their expected response to input texts. We then co-cluster hidden state units and words based on the expected response and visualize co-clustering results as memory chips and word clouds to provide more structured knowledge on RNNs' hidden states. We also propose a glyph-based sequence visualization based on aggregate information to analyze the behavior of an RNN's hidden state at the sentence-level. The usability and effectiveness of our method are demonstrated through case studies and reviews from domain experts.
Conference Paper
Full-text available
Automatic emotion recognition is a challenging task which can make great impact on improving natural human computer interactions. In this paper, we present our effort for the Affect Subtask in the Audio/Visual Emotion Challenge (AVEC) 2017, which requires participants to perform continuous emotion prediction on three affective dimensions: Arousal, Valence and Likability based on the audiovisual signals. We highlight three aspects of our solutions: 1) we explore and fuse different hand-crafted and deep learned features from all available modalities including acoustic, visual, and textual modalities, and we further consider the interlocutor influence for the acoustic features; 2) we compare the effectiveness of non-temporal model SVR and temporal model LSTM-RNN and show that the LSTM-RNN can not only alleviate the feature engineering efforts such as construction of contextual features and feature delay, but also improve the recognition performance significantly; 3) we apply multi-task learning strategy for collaborative prediction of multiple emotion dimensions with shared representations according to the fact that different emotion dimensions are correlated with each other. Our solutions achieve the CCC of 0.675, 0.756 and 0.509 on arousal, valence, and likability respectively on the challenge testing set, which outperforms the baseline system with corresponding CCC of 0.375, 0.466, and 0.246 on arousal, valence, and likability.
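The CCC values reported here refer to the concordance correlation coefficient, which penalizes both low correlation and any offset between the means of predictions and labels. As a minimal sketch using the standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population variances; the function name is hypothetical and this is not the challenge's official scoring script.

```python
def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two equal-length sequences."""
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    # Population (divide-by-n) variance and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    gold = [0.1, 0.4, 0.3, 0.8]
    pred = [0.2, 0.5, 0.4, 0.9]  # perfectly correlated but shifted by 0.1
    print(concordance_cc(pred, gold))
```

Unlike Pearson correlation, a constant shift between predictions and labels lowers the score, which is why CCC is commonly used for continuous arousal and valence prediction.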
Article
Full-text available
Dimensional affect recognition is a challenging topic and current techniques do not yet provide the accuracy necessary for HCI applications. In this work we propose two new methods. The first is a novel self-organizing model that learns from similarity between features and affects. This method produces a graphical representation of the multidimensional data which may assist expert analysis. The second method uses extreme learning machines, an emerging artificial neural network model. Aiming for minimum intrusiveness, we use only heart rate variability, which can be recorded using a small set of sensors. The methods were validated with two datasets. The first is composed of 16 sessions with different participants and was used to evaluate the models in a classification task. The second is the publicly available Remote Collaborative and Affective Interaction (RECOLA) dataset, which was used for dimensional affect estimation. The performance evaluation used the kappa score, unweighted average recall, and the concordance correlation coefficient. The concordance coefficient on the RECOLA test partition was 0.421 in arousal and 0.321 in valence. Results show that our models outperform state-of-the-art models on the same data and provide new ways to analyze affective states.
Conference Paper
Full-text available
In this paper, we propose a multimodal deep learning architecture for emotion recognition in video, developed for our participation in the audio-video based sub-challenge of the Emotion Recognition in the Wild 2017 challenge. Our model combines cues from multiple video modalities, including static facial features, motion patterns related to the evolution of the human expression over time, and audio information. Specifically, it is composed of three sub-networks trained separately: the first and second ones extract static visual features and dynamic patterns through 2D and 3D Convolutional Neural Networks (CNN), while the third one consists of a pretrained audio network which is used to extract useful deep acoustic signals from video. In the audio branch, we also apply Long Short Term Memory (LSTM) networks in order to capture the temporal evolution of the audio features. To identify and exploit possible relationships among different modalities, we propose a fusion network that merges cues from the different modalities in one representation. The proposed architecture outperforms the challenge baselines (38.81% and 40.47%): we achieve an accuracy of 50.39% and 49.92% respectively on the validation and the testing data.
Article
Full-text available
Recent advancements in human–computer interaction research have led to the possibility of emotional communication via brain–computer interface systems for patients with neuropsychiatric disorders or disabilities. In this study, we efficiently recognize emotional states by analyzing the features of electroencephalography (EEG) signals, which are generated from EEG sensors that non-invasively measure the electrical activity of neurons inside the human brain, and select the optimal combination of these features for recognition. The scalp EEG data of 21 healthy subjects (12–14 years old) were recorded using a 14-channel EEG machine while the subjects watched images with four types of emotional stimuli (happy, calm, sad, or scared). After preprocessing, the Hjorth parameters (activity, mobility, and complexity) were used to measure the signal activity of the time series data. We selected the optimal EEG features using a balanced one-way ANOVA after calculating the Hjorth parameters for different frequency ranges. Features selected by this statistical method outperformed univariate and multivariate features. The optimal features were further processed for emotion classification using support vector machine (SVM), k-nearest neighbor (KNN), linear discriminant analysis (LDA), Naive Bayes, Random Forest, deep learning, and four ensemble methods (bagging, boosting, stacking, and voting). The results show that the proposed method substantially improves the emotion recognition rate with respect to the commonly used spectral power band method.
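The three Hjorth parameters named in this abstract have standard time-domain definitions: activity is the variance of the signal, mobility is the square root of the ratio between the variance of the first difference and the variance of the signal, and complexity is the mobility of the first difference divided by the mobility of the signal. The following is a minimal sketch of those definitions for a single channel, using discrete differences in place of derivatives; it does not reproduce the study's preprocessing, frequency-band splitting, or ANOVA-based selection, and the function names are hypothetical.

```python
import math


def _variance(xs):
    """Population variance of a non-empty sequence."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def _diff(xs):
    """First discrete difference x[n+1] - x[n]."""
    return [b - a for a, b in zip(xs, xs[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) for a 1-D signal."""
    if len(signal) < 3:
        raise ValueError("need at least 3 samples")
    d1 = _diff(signal)
    d2 = _diff(d1)
    activity = _variance(signal)
    var_d1 = _variance(d1)
    var_d2 = _variance(d2)
    if activity == 0 or var_d1 == 0:
        raise ValueError("parameters are undefined for constant or linear signals")
    mobility = math.sqrt(var_d1 / activity)
    # Mobility of the first difference divided by mobility of the signal.
    complexity = math.sqrt(var_d2 / var_d1) / mobility
    return activity, mobility, complexity


if __name__ == "__main__":
    w = 2 * math.pi / 50  # one period every 50 samples
    sine = [math.sin(w * n) for n in range(1000)]
    print(hjorth_parameters(sine))
```

For a pure sinusoid the complexity is close to 1, which is a convenient sanity check; more irregular signals give larger values.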
Article
Full-text available
Many paralinguistic tasks are closely related and thus representations learned in one domain can be leveraged for another. In this paper, we investigate how knowledge can be transferred between three paralinguistic tasks: speaker, emotion, and gender recognition. Further, we extend this problem to cross-dataset tasks, asking how knowledge captured in one emotion dataset can be transferred to another. We focus on progressive neural networks and compare these networks to the conventional deep learning method of pre-training and fine-tuning. Progressive neural networks provide a way to transfer knowledge and avoid the forgetting effect present when pre-training neural networks on different tasks. Our experiments demonstrate that: (1) emotion recognition can benefit from using representations originally learned for different paralinguistic tasks and (2) transfer learning can effectively leverage additional datasets to improve the performance of emotion recognition systems.
Article
Full-text available
Facial expressions play a significant role in human communication and behavior. Psychologists have long studied the relationship between facial expressions and emotions. Paul Ekman et al. devised the Facial Action Coding System (FACS) to taxonomize human facial expressions and model their behavior. The ability to recognize facial expressions automatically enables novel applications in fields like human-computer interaction, social gaming, and psychological research. Research in this field has been tremendously active, with several recent papers utilizing convolutional neural networks (CNN) for feature extraction and inference. In this paper, we employ CNN understanding methods to study the relation between the features these computational networks are using, the FACS and Action Units (AU). We verify our findings on the Extended Cohn-Kanade (CK+), NovaEmotions and FER2013 datasets. We apply these models to various tasks and tests using transfer learning, including cross-dataset validation and cross-task performance. Finally, we exploit the nature of the FER based CNN models for the detection of micro-expressions and achieve state-of-the-art accuracy using a simple long-short-term-memory (LSTM) recurrent neural network (RNN).
Article
Full-text available
Automatic affect recognition is a challenging task due to the various modalities through which emotions can be expressed. Applications can be found in many domains including multimedia retrieval and human computer interaction. In recent years, deep neural networks have been used with great success in determining emotional states. Inspired by this success, we propose an emotion recognition system using auditory and visual modalities. To capture the emotional content for various styles of speaking, robust features need to be extracted. For this purpose, we utilize a Convolutional Neural Network (CNN) to extract features from the speech, while for the visual modality we use a deep residual network (ResNet) of 50 layers. In addition to the importance of feature extraction, a machine learning algorithm also needs to be insensitive to outliers while being able to model the context. To tackle this problem, Long Short-Term Memory (LSTM) networks are utilized. The system is then trained in an end-to-end fashion where, by also taking advantage of the correlations of each of the streams, we manage to significantly outperform the traditional approaches based on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions on the RECOLA database of the AVEC 2016 research challenge on emotion recognition.
Article
Facial expression recognition (FER) is increasingly gaining importance in various emerging affective computing applications. In practice, achieving accurate FER is challenging due to the large amount of inter-personal variations such as expression intensity variations. In this paper, we propose a new spatio-temporal feature representation learning for FER that is robust to expression intensity variations. The proposed method utilizes representative expression-states (e.g., onset, apex and offset of expressions) which can be specified in facial sequences regardless of the expression intensity. The characteristics of facial expressions are encoded in two parts in this paper. As the first part, spatial image characteristics of the representative expression-state frames are learned via a convolutional neural network. Five objective terms are proposed to improve the expression class separability of the spatial feature representation. In the second part, temporal characteristics of the spatial feature representation in the first part are learned with a long short-term memory of the facial expression. Comprehensive experiments have been conducted on a deliberate expression dataset (MMI) and a spontaneous micro-expression dataset (CASME II). Experimental results showed that the proposed method achieved higher recognition rates in both datasets compared to the state-of-the-art methods.
Conference Paper
Automatic emotion recognition from speech is a challenging task which relies heavily on the effectiveness of the speech features used for classification. In this work, we study the use of deep learning to automatically discover emotionally relevant features from speech. It is shown that using a deep recurrent neural network, we can learn both the short-time frame-level acoustic features that are emotionally relevant, as well as an appropriate temporal aggregation of those features into a compact utterance-level representation. Moreover, we propose a novel strategy for feature pooling over time which uses local attention in order to focus on specific regions of a speech signal that are more emotionally salient. The proposed solution is evaluated on the IEMOCAP corpus, and is shown to provide more accurate predictions compared to existing emotion recognition algorithms.
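The attention-based pooling over time described in this abstract can be illustrated with a minimal sketch. It assumes the frame-level features are already available as plain lists (in the paper they would come from the recurrent network) and that `u` is a learned attention vector; the name `u`, the dot-product scoring, and the toy numbers are assumptions for illustration, not the authors' exact formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, u):
    """Pool T frame-level vectors into one utterance-level vector.

    frames: list of T feature vectors, each of length D (e.g. RNN outputs).
    u: hypothetical learned attention vector of length D (an assumption).
    """
    # One scalar relevance score per frame: dot product with u.
    scores = [sum(f_d * u_d for f_d, u_d in zip(f, u)) for f in frames]
    # Normalize scores over time so the weights sum to 1.
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted average of the frames: salient frames contribute more.
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return pooled, weights

# Toy input: three frames with two features each; the middle frame is "salient".
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
pooled, weights = attention_pool(frames, u=[1.0, 1.0])
mean_pooled, uniform = attention_pool(frames, u=[0.0, 0.0])
```

With `u` set to all zeros every frame receives the same weight, so the function reduces to plain mean pooling; a non-zero `u` shifts the weight towards the frames that score highest.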
Article
Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since assessing it manually is costly in human resources and difficult to do objectively, there is a need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performance of the popular scheme based on the canonically normalized appearance with that of taking the whole image into account. As a result, we outperform the current state-of-the-art area under the curve performance on the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results on the Cohn Kanade+ facial expression database.
Article
In many present studies dealing with multi-modal fusion, decisions are synchronously forced for fixed time segments across all modalities. Varying success is reported; sometimes performance is worse than unimodal classification. Our goal is the synergistic exploitation of multimodality whilst implementing a real-time system for affect recognition in a naturalistic setting. Therefore, we present a categorization of possible fusion strategies for affect recognition on continuous time frames of complete recording sessions, and we evaluate multiple implementations from the resulting categories. These involve conventional fusion strategies as well as novel approaches that incorporate the asynchronous nature of observed modalities. Some of the latter algorithms consider temporal alignments between modalities and observed frames by applying asynchronous neural networks that use memory blocks to model temporal dependencies. Others use an indirect approach that introduces events as an intermediate layer to accumulate evidence for the target class through all modalities. Recognition results gained on a naturalistic conversational corpus show a drop in recognition accuracy when moving from unimodal classification to synchronous multimodal fusion. However, with our proposed asynchronous and event-based fusion techniques we are able to raise the recognition system’s accuracy by 7.83% compared to video analysis and 13.71% in comparison to common fusion strategies.
Conference Paper
This paper presents the techniques used in our contribution to Emotion Recognition in the Wild 2016’s video based sub-challenge. The purpose of the sub-challenge is to classify the six basic emotions (angry, sad, happy, surprise, fear & disgust) and neutral. Compared to earlier years’ movie based datasets, this year’s test dataset introduced reality TV videos containing more spontaneous emotion. Our proposed solution is the fusion of facial expression recognition and audio emotion recognition subsystems at score level. For facial emotion recognition, starting from a network pre-trained on ImageNet training data, a deep Convolutional Neural Network is fine-tuned on FER2013 training data for feature extraction. Three classifiers, namely kernel SVM, logistic regression and partial least squares, are studied for comparison. An optimal fusion of classifiers learned from different kernels is carried out at the score level to improve system performance. For audio emotion recognition, a deep Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) is trained directly using the challenge dataset. Experimental results show that both subsystems individually and as a whole can achieve state-of-the-art performance. The overall accuracy of the proposed approach on the challenge test dataset is 53.9%, which is better than the challenge baseline of 40.47%.
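Score-level fusion of a face subsystem and an audio subsystem, as named in this abstract, can be sketched as a weighted average of the per-class scores of the two subsystems. The sketch below is illustrative only: the fusion weight `face_weight`, the function name, and the score values are assumptions, and the abstract does not state how the authors actually derived their fusion weights.

```python
# The seven target classes listed in the abstract.
LABELS = ["angry", "sad", "happy", "surprise", "fear", "disgust", "neutral"]

def fuse_scores(face_scores, audio_scores, face_weight=0.6):
    """Weighted score-level fusion of two subsystems.

    face_weight is a hypothetical value; in practice it would be tuned
    on a validation set.
    """
    fused = [
        face_weight * f + (1.0 - face_weight) * a
        for f, a in zip(face_scores, audio_scores)
    ]
    # The predicted class is the one with the highest fused score.
    best = max(range(len(fused)), key=lambda i: fused[i])
    return LABELS[best], fused

# Made-up class scores for a single clip (each list sums to 1).
face = [0.10, 0.05, 0.50, 0.10, 0.05, 0.05, 0.15]
audio = [0.30, 0.10, 0.20, 0.05, 0.05, 0.05, 0.25]

label, fused = fuse_scores(face, audio)
audio_only_label, _ = fuse_scores(face, audio, face_weight=0.0)
```

Setting `face_weight` to 0.0 or 1.0 recovers the audio-only or face-only decision, which makes it easy to compare the fused system against each subsystem on its own.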
Conference Paper
State-of-the-art approaches for the previous emotion recognition in the wild challenges are usually built on prevailing Convolutional Neural Networks (CNNs). Although there is clear evidence that CNNs with increased depth or width can usually bring improved prediction accuracy, existing top approaches provide supervision only at the output feature layer, resulting in insufficient training of deep CNN models. In this paper, we present a new learning method named Supervised Scoring Ensemble (SSE) for advancing this challenge with deep CNNs. We first extend the idea of recent deep supervision to deal with the emotion recognition problem. Benefiting from adding supervision not only to deep layers but also to intermediate layers and shallow layers, the training of deep CNNs can be well eased. Second, we present a new fusion structure in which class-wise scoring activations at diverse complementary feature layers are concatenated and further used as the inputs for second-level supervision, acting as a deep feature ensemble within a single CNN architecture. We show that our proposed learning method consistently brings large accuracy gains over diverse backbone networks. On this year's audio-video based emotion recognition task, the average recognition rate of our best submission is 60.34%, forming a new envelope over all existing records.
Article
Regularization is one of the crucial ingredients of deep learning, yet the term regularization has various definitions, and regularization methods are often studied separately from each other. In our work we present a systematic, unifying taxonomy to categorize existing methods. We distinguish methods that affect data, network architectures, error terms, regularization terms, and optimization procedures. We do not provide all details about the listed methods; instead, we present an overview of how the methods can be sorted into meaningful categories and sub-categories. This helps reveal links and fundamental similarities between them. Finally, we include practical recommendations both for users and for developers of new regularization methods.
Article
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabelled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labelled data sets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.
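The phrase "sparse representations with true zeros" can be made concrete with a small sketch that compares a rectifier with the hyperbolic tangent on the same pre-activations. The input values are made up for illustration; only the two activation functions themselves come from the abstract.

```python
import math

def relu(x):
    """Rectifier activation: passes positive inputs, outputs exactly 0 otherwise."""
    return max(0.0, x)

def count_exact_zeros(values):
    """Number of activations that are exactly zero (not merely small)."""
    return sum(1 for v in values if v == 0.0)

# Made-up pre-activations of five hidden units.
pre_activations = [-1.5, -0.2, 0.0, 0.7, 2.3]

relu_out = [relu(x) for x in pre_activations]
tanh_out = [math.tanh(x) for x in pre_activations]

relu_zeros = count_exact_zeros(relu_out)  # every non-positive input maps to 0.0
tanh_zeros = count_exact_zeros(tanh_out)  # only an input of exactly 0.0 maps to 0.0
```

The rectifier silences every unit with a non-positive input, whereas tanh returns a non-zero value for every non-zero input, so its representation is dense even when many activations are small.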
Conference Paper
Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end training methods such as Connectionist Temporal Classification make it possible to train RNNs for sequence labelling problems where the input-output alignment is unknown. The combination of these methods with the Long Short-term Memory RNN architecture has proved particularly fruitful, delivering state-of-the-art results in cursive handwriting recognition. However, RNN performance in speech recognition has so far been disappointing, with better results returned by deep feedforward networks. This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs. When trained end-to-end with suitable regularisation, we find that deep Long Short-term Memory RNNs achieve a test set error of 17.7% on the TIMIT phoneme recognition benchmark, which to our knowledge is the best recorded score.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
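The top-1 and top-5 error rates quoted in this abstract count a prediction as correct when the true class is among the k highest-scoring classes. A minimal sketch of that metric follows; the function name and the toy scores (three samples, four classes) are assumptions for illustration and are unrelated to the ImageNet results.

```python
def top_k_error(score_rows, true_labels, k):
    """Fraction of samples whose true label is NOT among the k top-scoring classes."""
    errors = 0
    for scores, label in zip(score_rows, true_labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return errors / len(true_labels)

# Toy class scores for three samples over four classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1 is ranked first
    [0.5, 0.3, 0.1, 0.1],  # true class 1 is ranked second
    [0.2, 0.3, 0.4, 0.1],  # true class 3 is ranked last
]
labels = [1, 1, 3]

top1 = top_k_error(scores, labels, k=1)  # two of three samples are wrong
top2 = top_k_error(scores, labels, k=2)  # one of three samples is wrong
```

Because the top-k set grows with k, the top-k error can only stay the same or decrease as k increases, which is why a top-5 error is always at most the top-1 error on the same predictions.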
Article
Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images from the Internet by querying three major search engines using 1250 emotion related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
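The abstract does not spell out the attention mechanism it refers to. The sketch below assumes the scaled dot-product form commonly associated with the Transformer, softmax(QKᵀ/√d_k)V, written in plain Python for a single head without masking or learned projections; the function names and toy inputs are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: one output vector per query.

    queries: list of vectors of length d_k
    keys:    list of n vectors of length d_k
    values:  list of n vectors of length d_v
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of the query to every key, scaled by sqrt(d_k).
        scores = [
            sum(q_i * k_i for q_i, k_i in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Toy check: identical keys give uniform weights, so the output is the mean value.
out = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

Each query is processed independently of the others, which is one way to see why this operation parallelizes more easily than a recurrent update that must step through the sequence in order.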
Article
This paper presents a novel and efficient Deep Fusion Convolutional Neural Network (DF-CNN) for multi-modal 2D+3D Facial Expression Recognition (FER). DF-CNN comprises a feature extraction subnet, a feature fusion subnet and a softmax layer. In particular, each textured 3D face scan is represented as six types of 2D facial attribute maps (i.e., geometry map, three normal maps, curvature map, and texture map), all of which are jointly fed into DF-CNN for feature learning and fusion learning, resulting in a highly concentrated facial representation (32-dimensional). Expression prediction is performed in two ways: 1) learning linear SVM classifiers using the 32-dimensional fused deep features; 2) directly performing softmax prediction using the 6-dimensional expression probability vectors. Different from existing 3D FER methods, DF-CNN combines feature learning and fusion learning into a single end-to-end training framework. To demonstrate the effectiveness of DF-CNN, we conducted comprehensive experiments to compare the performance of DF-CNN with handcrafted features, pre-trained deep features, fine-tuned deep features, and state-of-the-art methods on three 3D face datasets (i.e., BU-3DFE Subset I, BU-3DFE Subset II, and Bosphorus Subset). In all cases, DF-CNN consistently achieved the best results. To the best of our knowledge, this is the first work to introduce deep CNNs to 3D FER and deep learning based feature-level fusion for multi-modal 2D+3D FER.
Article
We present a new action recognition deep neural network which adaptively learns the best action velocities in addition to the classification. While deep neural networks have reached maturity for image understanding tasks, we are still exploring network topologies and features to handle the richer environment of video clips. Here, we tackle the problem of multiple velocities in action recognition, and provide state-of-the-art results for facial expression recognition on known and newly collected datasets. We further provide the training steps for our semi-supervised network, suited to learn from huge unlabeled datasets with only a fraction of labeled examples.
Article
Emotion analysis is a crucial problem for endowing artificial machines with real intelligence in many potential applications. As external appearances of human emotions, electroencephalogram (EEG) signals and video face signals are widely used to track and analyze human affective information. Based on their common characteristics as spatial-temporal volumes, in this paper we propose a novel deep learning framework named spatial-temporal recurrent neural network (STRNN) to unify the learning of two different signal sources into a spatial-temporal dependency model. In STRNN, to capture the spatially co-occurrent variations of human emotions, a multi-directional recurrent neural network (RNN) layer is employed to capture long-range contextual cues by traversing the spatial region of each time slice from multiple angles. Then a bi-directional temporal RNN layer is further used to learn discriminative temporal dependencies from the sequences concatenating spatial features of each time slice produced from the spatial RNN layer. To further select the salient regions of emotion representation, we impose sparse projection onto the hidden states of the spatial and temporal domains, which also increases the model's discriminative ability because of this global consideration. Consequently, such a two-layer RNN model builds spatial dependencies as well as temporal dependencies of the input signals. Experimental results on the public emotion datasets of EEG and facial expression demonstrate that the proposed STRNN method is more competitive than state-of-the-art methods.
Article
Speech Emotion Recognition (SER) can be regarded as a static or dynamic classification problem, which makes SER an excellent test bed for investigating and comparing various deep learning architectures. We describe a frame-based formulation to SER that relies on minimal speech processing and end-to-end deep learning to model intra-utterance dynamics. We use the proposed SER system to empirically explore feed-forward and recurrent neural network architectures and their variants. Experiments conducted illuminate the advantages and limitations of these architectures in paralinguistic speech recognition and emotion recognition in particular. As a result of our exploration, we report state-of-the-art results on the IEMOCAP database for speaker-independent SER and present quantitative and qualitative assessments of the model’s performance.
Article
One of the serious obstacles to the application of speech emotion recognition systems in real-life settings is the lack of generalisation of the emotion classifiers. Many recognition systems present a dramatic drop in performance when tested on speech data obtained from different speakers, acoustic environments, linguistic content, and domain conditions. In this letter, we propose a novel unsupervised domain adaptation model, called Universum Autoencoders, to improve the performance of the systems evaluated in mismatched training and test conditions. To address the mismatch, our proposed model not only learns discriminative information from labelled data, but also learns to incorporate the prior knowledge from unlabelled data into the learning. Experimental results on the labelled Geneva Whispered Emotion Corpus (GeWEC) database plus three other unlabelled databases demonstrate the effectiveness of the proposed method when compared to other domain adaptation methods.
Conference Paper
We present a novel method for classifying emotions from static facial images. Our approach leverages the recent success of Convolutional Neural Networks (CNN) on face recognition problems. Unlike the settings often assumed there, far less labeled data is typically available for training emotion classification systems. Our method is therefore designed with the goal of simplifying the problem domain by removing confounding factors from the input images, with an emphasis on image illumination variations. This is done in an effort to reduce the amount of data required to effectively train deep CNN models. To this end, we propose novel transformations of image intensities to 3D spaces, designed to be invariant to monotonic photometric transformations. These are applied to CASIA Webface images which are then used to train an ensemble of multiple architecture CNNs on multiple representations. Each model is then fine-tuned with limited emotion labeled training data to obtain final classification models. Our method was tested on the Emotion Recognition in the Wild Challenge (EmotiW 2015), Static Facial Expression Recognition sub-challenge (SFEW) and shown to provide a substantial, 15.36% improvement over baseline results (40% gain in performance).
Article
Multimodal recognition of affective states is a difficult problem, unless the recording conditions are carefully controlled. For recognition “in the wild”, large variances in face pose and illumination, cluttered backgrounds, occlusions, audio and video noise, as well as issues with subtle cues of expression are some of the issues to target. In this paper, we describe a multimodal approach for video-based emotion recognition in the wild. We propose using summarizing functionals of complementary visual descriptors for video modeling. These features include deep convolutional neural network (CNN) based features obtained via transfer learning, for which we illustrate the importance of flexible registration and fine-tuning. Our approach combines audio and visual features with least squares regression based classifiers and weighted score level fusion. We report state-of-the-art results on the EmotiW Challenge for “in the wild” facial expression recognition. Our approach scales to other problems, and ranked top in the ChaLearn-LAP First Impressions Challenge 2016 from video clips collected in the wild.
Article
Affective computing is an emerging interdisciplinary research field bringing together researchers and practitioners from various fields, ranging from artificial intelligence and natural language processing to cognitive and social sciences. With the proliferation of videos posted online (e.g., on YouTube, Facebook, Twitter) for product reviews, movie reviews, political views, and more, affective computing research has increasingly evolved from conventional unimodal analysis to more complex forms of multimodal analysis. This is the primary motivation behind our first of its kind, comprehensive literature review of the diverse field of affective computing. Furthermore, existing literature surveys lack a detailed discussion of state of the art in multimodal affect analysis frameworks, which this review aims to address. Multimodality is defined by the presence of more than one modality or channel, e.g., visual, audio, text, gestures, and eye gaze. In this paper, we focus mainly on the use of audio, visual and text information for multimodal affect analysis, since around 90% of the relevant literature appears to cover these three modalities. Following an overview of different techniques for unimodal affect analysis, we outline existing methods for fusing information from different modalities. As part of this review, we carry out an extensive study of different categories of state-of-the-art fusion techniques, followed by a critical analysis of potential performance improvements with multimodal analysis compared to unimodal analysis. A comprehensive overview of these two complementary fields aims to form the building blocks for readers, to better understand this challenging and exciting research field.
Conference Paper
With rapid developments in the design of deep architecture models and learning algorithms, methods referred to as deep learning have come to be widely used in a variety of research areas such as pattern recognition, classification, and signal processing. Deep learning methods are being applied in various recognition tasks such as image, speech, and music recognition. Convolutional Neural Networks (CNNs) especially show remarkable recognition performance for computer vision tasks. In addition, Recurrent Neural Networks (RNNs) show considerable success in many sequential data processing tasks. In this study, we investigate the result of the Speech Emotion Recognition (SER) algorithm based on CNNs and RNNs trained using an emotional speech database. The main goal of our work is to propose a SER method based on concatenated CNNs and RNNs without using any traditional hand-crafted features. By applying the proposed methods to an emotional speech database, the classification result was verified to have better accuracy than that achieved using conventional classification methods.
Article
Background and objective: Using deep-learning methodologies to analyze multimodal physiological signals is becoming increasingly attractive for recognizing human emotions. However, conventional deep emotion classifiers may suffer from a lack of expertise in determining the model structure and from an oversimplified combination of multimodal feature abstractions. Methods: In this study, a multiple-fusion-layer based ensemble classifier of stacked autoencoder (MESAE) is proposed for recognizing emotions, in which the deep structure is identified based on a physiological-data-driven approach. Each SAE consists of three hidden layers to filter the unwanted noise in the physiological features and derive stable feature representations. An additional deep model is used to achieve the SAE ensembles. The physiological features are split into several subsets according to different feature extraction approaches with each subset separately encoded by a SAE. The derived SAE abstractions are combined according to the physiological modality to create six sets of encodings, which are then fed to a three-layer, adjacent-graph-based network for feature fusion. The fused features are used to recognize binary arousal or valence states. Results: The DEAP multimodal database was employed to validate the performance of the MESAE. Compared with the best existing emotion classifier, the mean of the classification rate and F-score improves by 5.26%. Conclusions: The superiority of the MESAE against the state-of-the-art shallow and deep emotion classifiers has been demonstrated under different sizes of the available physiological instances.
Conference Paper
In the past three years, the Emotion Recognition in the Wild (EmotiW) Grand Challenge has drawn more and more attention due to its huge potential applications. In the fourth challenge, aimed at the task of video based emotion recognition, we propose a multi-clue emotion fusion (MCEF) framework by modeling human emotion from three mutually complementary sources: facial appearance texture, facial action, and audio. To extract high-level emotion features from sequential face images, we employ a CNN-RNN architecture, where the face image from each frame is first fed into the fine-tuned VGG-Face network to extract face features, and then the features of all frames are sequentially traversed in a bidirectional RNN so as to capture dynamic changes of facial textures. To attain more accurate facial actions, a facial landmark trajectory model is proposed to explicitly learn emotion variations of facial components. Further, audio signals are also modeled in a CNN framework by extracting low-level energy features from segmented audio clips and then stacking them as an image-like map. Finally, we fuse the results generated from the three clues to boost the performance of emotion recognition. Our proposed MCEF achieves an overall accuracy of 56.66% with a large improvement of 16.19% with respect to the baseline.