Emoji as a Proxy of Emotional Communication


Abstract

Nowadays, emoji play a fundamental role in computer-mediated human communication, allowing users to convey body language, objects, symbols, or ideas in text messages using Unicode-standardized pictographs and logographs. Emoji allow people to express their emotions and personalities more “authentically” by increasing the semantic content of visual messages. The relationship between language, emoji, and emotions is now being studied by several disciplines such as linguistics, psychology, natural language processing (NLP), and machine learning (ML). In particular, the last two are employed for the automatic detection of emotions and personality traits, for building emoji sentiment lexicons, and for endowing artificial agents with the ability to express emotions through emoji. In this chapter, we introduce the concept of emoji and review the main challenges in using them as a proxy for language and emotions and the ML and NLP techniques used for the classification and detection of emotions using emoji, and we present new trends in the exploitation of discovered emotional patterns for robotic emotional communication.
Guillermo Santamaría-Bonfil
and Orlando Grabiel Toledano López
Keywords: emoji, machine learning, natural language processing,
emotional communication, human-robot interaction
1. Introduction
Recently, in the episode “Smile” of the popular science fiction television program “Doctor Who,” a hypothetical off-Earth colony is presented. This colony is maintained and operated by robots, which communicate and express emotions with humans and with their peers through the use of emoji. One may argue that such technology, besides being mere science fiction, is ridiculous, since phonetic communication is much simpler and much easier to understand. While this is true for conventional information (e.g., explaining the concept of real numbers), communicating bodily emotional responses or gesticulation (e.g., describing confusion) using only phonograms would require many more words to convey the same message than an emoji (e.g., or ). In this sense, emoji serve as a simplified visual form of (affective) communication that broadens the total amount of information (e.g., cues and gestures) that can be shared between humans and virtual or embodied artificial entities. If we consider that human languages such as Chinese, Nahuatl [1], or even Sign Language have evolved from lexicons of ideographs and pictographs, can we expect that in the near future artificial entities (virtual or embodied) will employ emoji in their emotional communication?
The Japanese word emoji (e = picture and moji = word) literally stands for “picture word.” Although emoji were popularized only recently, their predecessors can be traced to the nineteenth century, when cartoons were employed for humorous writing. Smileys followed in 1964 and were meant to be used on an insurance company’s promotional merchandise to improve the morale of its employees. Carnegie Mellon researchers were the first to employ the emoticon :) in an online bulletin forum to denote humorous messages, in 1982, and 10 years later emoticons were already widespread in emails and websites [2]. Finally, in 1998, Shigetaka Kurita devised emoji to improve on emoticons pictorially, and they became widespread by 2010. Since then, the use of emoji has gained considerable momentum, to the point that the emoji named “Face with Tears of Joy” (😂) was chosen by Oxford Dictionaries as the Word of the Year [24]. This choice was made under the assumption that the pictograph represented the ideas, beliefs, mood, and concerns of English speakers in 2015.
Since their origin, emoji have undoubtedly become part of mainstream communication around the globe, allowing people with different languages and cultural backgrounds to share and interpret ideas and emotions more accurately. In this vein, it has been hypothesized that emoji will become a universal language owing to their generic communication features and their ever-growing lexicon [2, 5–7]. This idea is controversial, however [8, 9], since emoji usage during communication is influenced by factors such as context, the relationships between users, the users’ first language, repetitiveness, and socio-demographics, among others [2, 5, 8]. This clearly adds ambiguity to how emoji should be employed and interpreted. Nevertheless, in the same fashion as sentiment analysis mines sentiments, attitudes, and emotions from text [10], we can employ the billions (or perhaps more) of written messages on the Internet that contain emoji to generate affective responses in artificial entities. More precisely, using natural language processing (NLP) along with machine learning (ML), we can extract semantics, beat gestures, emotional states, affective cues, and personality layers, among other characteristics, from text. All this knowledge can be used to build, for instance, emoji sentiment lexicons [10] that will make up the emoji communication competence [2] powering the emotional expression and communication of an artificial entity.
In the rest of this chapter, we first review the elements of the emoji code and how emoji are used in emotional expression and communication (Section 2). Afterward, in Section 3, we review the state of the art in the use of NLP and ML to classify and predict the annotation and expression of emotions, gestures, affective cues, and so on, using written messages from multiple types of sources. In Section 4, we present several examples of how emoji are currently employed by artificial entities, both virtual and embodied, to express emotions during their interaction with humans. Lastly, Section 5 summarizes the chapter and discusses open questions regarding emoji usage as a source for robotic emotional communication.
2. Competence, lexicon, and ways of usage of emoji
To study how emoji are employed and the challenges they pose, we must first specify the emoji competence [2]. Loosely speaking, competence (either linguistic or communicative) stands for the rules (e.g., grammar) and abilities an individual possesses to correctly employ a given language to convey a specific idea [11]. Hence, the emoji competence stands for an adequate usage of emoji within messages, not only in their representation but also in their exact position within the message, to address a specific function (e.g., emotional expression, gestures, maintaining interest in the communication, etc.) [2]. Although the emoji competence has not been formally defined yet and can only be developed through the usage of emoji themselves [2, 6], here we elaborate on several of its elements.
Future of Robotics - Becoming Human with Humanoid or Emotional Intelligence
A key element of the emoji competence is the emoji lexicon, which is the standardization of pictograms (i.e., figures that resemble a real-world object), ideograms (i.e., figures that represent an idea), and logograms (i.e., figures that represent a sound or word) into anime-like graphical representations that belong to the ever-growing Unicode character standard [2, 6, 12]. These are employed within a message in three different ways: adjunctively, substitutively, or providing mixed textuality. In the first case, emoji appear alongside text at specific points of the written message (e.g., at the end), giving it an emotional tone or adding visual annotations; this requires a low overall emoji competence. In the second case, emoji replace words, which requires a higher degree of competence to understand not only the symbols per se but also the layout structure of the message; consider, for instance, syntagms, which are sequentially grouped symbols that together form a new idea (e.g., I love coffee = ). The third case intertwines text with emoji in a substitutive rather than adjunctive form. This case requires the highest degree of emoji competence, since its decoding requires sophisticated knowledge of rhetorical structures and the proper usage of signs and symbols.
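Because the emoji lexicon is part of the Unicode standard, every emoji is an ordinary character with a code point and a standardized name. The following minimal sketch illustrates this with Python's standard `unicodedata` module; the three sample symbols and the helper `describe` are illustrative choices, not part of the chapter.

```python
import unicodedata

# Illustrative sample: three emoji given by their Unicode code points.
SAMPLES = ["\U0001F602", "\U0001F44B", "\u2615"]

def describe(symbol):
    """Return 'U+XXXX NAME' for a single-code-point symbol."""
    return f"U+{ord(symbol):04X} {unicodedata.name(symbol)}"

for symbol in SAMPLES:
    print(describe(symbol))
```

Looking symbols up by their standardized names is one simple way to build the kind of lexicon discussed above without depending on how a given platform renders each glyph.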
The emoji lexicon possesses generic features such as representationality, which allows signs and usage rules to be combined in specific forms to convey a message. Similarly, any person who is well versed in the code’s signs and rules is capable of interpreting any message based on the code (i.e., interpretability). However, messages built using the emoji lexicon are affected by contextualization: references, interpersonal relationships, and other factors affect the meaning of the message [4, 5]. Besides these features, the emoji code is composed of a core and a peripheral lexicon [2, 5]. As in the Swadesh list, the core lexicon stands for those emoji whose meaning and usage are, to some extent, universally accepted, even though Unicode supports more than 1000 different emoji [10]. The core lexicon includes all facial emoji, among them those that stand for Ekman’s six basic emotions, such as surprise ( ) or anger ( ) [2, 13]. On the other hand, the peripheral lexicon is constituted by specialized communication symbols such as those required for marketing, education [14], promoting national identity, or cultural cues [2], among others. Nevertheless, it is worth mentioning that since emoji may be used as nouns, verbs, or other grammatical structures, even those in the core lexicon can be used as peripheral elements depending on the users’ first language, their position within the message, or their concatenation into a syntagm.
2.1 How do we use emoji?
Emoji within a message can have several functions; Figure 1 summarizes them. As the figure shows, one of the most important functions of an emoji is emotivity, which adds an emotional layer to plain-text communication. In this sense, emoji serve as a substitute for face-to-face (F2F) facial expressions, gestures, and body language to state one’s emotional states, moods, or affective nuances. When used in this manner, emoji take the role of discourse strategies such as intonation or phrasing [2, 4, 15]. Emoji emotivity mostly conveys positive emotions; hence it can be employed to emphasize a specific point of view, such as sarcasm, while softening the negative emotions associated with it (e.g., toward the person being sarcastic), allowing the receiver of the message to focus on the content instead of the negativity elicited [2, 14].
Another important role of emoji is as a phatic instrument during communication [2, 16]. In this sense, they are employed as utterances that allow the conversation to unfold pleasantly and smoothly. Emoji serve as an opening or closing utterance (e.g., a waving hand) to open or close a conversation, maintaining a positive dialog regardless of the content. Similarly, emoji can be used to fill uncomfortable moments of silence during a conversation, avoiding its abrupt interruption. Beat gestures are another function of emoji; a beat gesture can be defined as a repetitive, rhythmical co-speech gesture that emphasizes the rhythm of speech [9]. For instance, in the same way that nodding up and down during a conversation emphasizes agreement with the interlocutor, emoji can be repeated to convey the same meaning (e.g., ). Keep in mind that although emoji used as utterances or beat gestures do not explicitly stand for an emotional reaction, they implicitly convey a (positive) emotional tone to the conversation. Likewise, another function of emoji that is also implicitly related to emotion is personality. Personality stands for basal characteristics that have pre-established effects on thoughts, behaviors, and emotions [17]. Being considered a genetic trait, it varies less over time than emotions and moods [17]. In this sense, emoji can be used to elucidate the underlying personality traits of individuals, either by data mining or by replacing text-based items with their emoji equivalents in personality tests [18].
Figure 1.
Emoji functions within the computer-mediated communications.
3. Studying emoji usage using formal frameworks
Emoji usage has had a deep impact on human computer-mediated communication (CMC). With the increasing use of social media platforms such as Facebook, Twitter, or Instagram, people now exchange messages and ideas on a massive scale through text-based chat tools that support emoji, imbuing those messages with semantic and emotional meaning. In order to analyze and extract comprehensive knowledge from data sets of emoji-embedded messages, many methods have been developed through a multidisciplinary approach that involves ML along with NLP, psychology, robotics, and so on. The tasks addressed with ML algorithms for the analysis of emoji usage include sentiment analysis [5, 19], polarity analysis [10, 20], sentiment lexicon building [10], utterance embeddings [21], and personality assessment [18], to mention a few. These applications are summarized in Table 1.
The following section analyzes the use of ML algorithms to support tasks related to sentiment analysis through emoji: classification, comparison, polarity detection, data preprocessing of tweets with emoji embeddings, and computer vision techniques for video processing to detect facial expressions.
3.1 Emoji classification and comparison
In recent years, deep learning (DL) has emerged as a new area of ML that involves new techniques for signal and information processing. This type of algorithm employs several nonlinear layers for information processing through supervised and unsupervised feature extraction and transformation for pattern analysis and classification. It also includes algorithms for multiple levels of representation, attaining models that can describe the complex relations within data. In particular, if data sets are considerably large, a deep-learning approach is the best option for reaching a well-trained model regardless of whether the data are labeled [25, 26]. To date, ML algorithms that use shallow architectures, for instance, linear regression (LR), support vector machines (SVM), multilayer perceptrons (MLP) with a single hidden layer, and decision trees such as random forest or ID3, among others, have shown good performance and effectiveness for solving simple problems. These architectures have limitations for extracting patterns from a wide variety of complex problems, such as signal, human speech, natural language, image, and sound processing [25]. For this reason, a deep-learning approach allows these limitations to be overcome with good results.
Emoji classification and comparison constitute two important tasks for discriminating between several kinds of emoji, including those with similar meaning. Deep-learning models have been used for this goal in texts where emoji are embedded, producing better results than softmax methods such as logistic regression, naive Bayes, artificial neural networks, and others. For example, Xiang Li et al. developed a deep neural network architecture to obtain a trained model that could predict the correct emoji for its corresponding utterance [21]. This approach makes it possible for machines to generate automatic messages for humans during a conversation that carry implicit sentiment and better convey the semantics of ideas.
In the proposal of Li et al. [21], the system receives as input an utterance set $Y$ and an emoji set $X = \{x_1, x_2, \ldots, x_n\}$. The main goal is to train a classification model that can predict the correct emoji for a given utterance. The architecture used in this work has two parts. The first is a convolutional neural network (CNN) that produces a sentence embedding representing an utterance, and the second is the emoji embedding, which must also be trained. To join both parts, a matching structure was created, since embeddings in a continuous vector space can represent emoji well and consequently perform better than a discrete softmax classifier.
Table 1.
Comparative table of the articles analyzed.
Ref. | Problems addressed | Method | Emoji use | Emoji function
[21] | Emoji classification; correct emoji; matching utterance embeddings with emoji embeddings; emoji for sentiment analysis | One-hot vector; sliding windows; cosine similarity; dynamic pooling | |
[20] | Sentiment analysis; polarity detection | 10-fold cross validation; shallow classifiers: SVM and LR; search-based classifier | |
[22] | Image processing & computer vision to detect facial expression; emoji embeddings | Haar classifier; Canny algorithms | Adjunctive | Emotive
[19] | Sentiment analysis; auto-labeling using emoji; emoji classification; emoticons as a heuristic data; tweets data preprocessing | Ensemble classifiers; deep learning | |
[10] | Sentiment analysis; emoji sentiment map & lexicon; polarity detection; emoji sentiment ranking | Discrete probability; Welch’s t-test; Krippendorff’s Alpha | |
[5] | Sentiment analysis; automated analysis of social media; emoji classification; correlation analysis among | Nearest neighbors | | Emotive
[23] | Emotions detection; emoji classification | Adaptive Boosted Decision Trees (ADT); 10-fold cross validation; Random Forests (RF) | |
[18] | Emotions representation using; Big 5 personality assessment test using emoji | Exploratory Factor Analysis (EFA); Confirmatory Factor Analysis (CFA); Bonferroni correction | |
[9] | Emoji as co-speech element; emoji-based measures | NA | | Beat
[24] | Facial expressions recognition; emoji embeddings; emoji usage for peer; emoji as social cues | Haar classifier | |
The bottom of the CNN is a word embedding layer, as is common in NLP tasks. It provides semantic information about a word using a real-valued vector that represents its features. An utterance is a sequence of words; each word $w_i$ is a one-hot vector of the dictionary dimension, in which the bit corresponding to the word’s position in the dictionary takes the value 1 and the remaining bits take the value 0. In Eq. (1), the embedding matrix is defined such that [21]:
$$E_1 \in \mathbb{R}^{D \times V}, \qquad (1)$$
where $D$ and $V$ are the word embedding and word dictionary dimensions, respectively. Each $e_1(w_i) \in E_1$ is the embedding of a word in the dictionary. The convolutional layer uses sliding windows to get information from the word embeddings; for this process, the function in Eq. (2) is used [21]:
[Eq. (2) is not recoverable from the extracted text; see [21].]
where $t$ is the size of the window and $b_1$ is the bias vector. Hence, the parameter to be trained is $W_1$.
Once a series of continuous representations of local features has been obtained from the convolutional layer, dynamic pooling is used to synthesize these embeddings into a single vector for the whole utterance; the output is the result of max pooling. The hidden layer takes the sentence embedding obtained in this way, $y_2$, and finally returns the vector that represents the utterance.
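The word embedding, sliding-window convolution, and max-pooling steps described above can be illustrated with a small pure-Python sketch. It is not the implementation of [21]: the toy sizes, the random weights, and in particular the tanh nonlinearity applied to each window are assumptions made only for illustration, since the exact form of the convolution function is not given here.

```python
import math
import random

random.seed(0)
# Toy sizes (assumed): embedding dim D, dictionary size V, window size t, feature maps H.
D, V, t, H = 4, 6, 3, 5

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

E1 = rand_matrix(D, V)      # word embedding matrix, E1 in R^{D x V}
W1 = rand_matrix(H, t * D)  # convolution weights (learned in practice)
b1 = [0.0] * H              # bias vector

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def embed(word_id):
    # Multiplying E1 by a one-hot vector selects one column of E1.
    return matvec(E1, one_hot(word_id, V))

def convolve(word_ids):
    """Slide a window of t word embeddings over the utterance."""
    embeddings = [embed(w) for w in word_ids]
    features = []
    for i in range(len(embeddings) - t + 1):
        window = [x for e in embeddings[i:i + t] for x in e]  # concatenate t embeddings
        pre_activation = matvec(W1, window)
        # tanh is an assumed nonlinearity, not taken from [21].
        features.append([math.tanh(z + b) for z, b in zip(pre_activation, b1)])
    return features

def max_pool(features):
    # Keep the maximum of every feature over all window positions.
    return [max(column) for column in zip(*features)]

utterance = [0, 3, 2, 5, 1]  # word indices in the dictionary
local_features = convolve(utterance)
sentence_vector = max_pool(local_features)
```

With five words and a window of three, the convolution yields three local feature vectors, which max pooling reduces to a single fixed-length vector regardless of the utterance length.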
Similarly to the word embedding layer, the emoji embedding layer uses a matrix defined as $E_2 \in \mathbb{R}^{D \times K}$ to obtain $e_2(x_i)$, where $K$ is the length of the one-hot vector that represents each emoji $x_i$. Each $e_2(x_i)$ of $E_2$ is a parameter of the neural network. Training consists of a forward propagation that computes the matching score between the given utterance and the correct emoji and the matching score between the given utterance and a negative emoji, followed by a backward propagation that updates the model parameters. The matching score is calculated with the cosine similarity measure, whereas the neural network is trained with the hinge loss function. It is worth mentioning that the latter is very useful for carrying out pairwise comparisons to identify similar emoji types.
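As a rough illustration of the matching step, the sketch below computes the cosine similarity between an utterance vector and two emoji vectors and combines the two scores in a pairwise hinge loss. The margin value of 0.5 and the toy vectors are assumptions for illustration; the values used in [21] are not stated here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hinge_loss(utterance, correct_emoji, negative_emoji, margin=0.5):
    """Pairwise hinge loss: zero once the correct emoji outscores
    the negative emoji by at least `margin` (an assumed value)."""
    positive_score = cosine(utterance, correct_emoji)
    negative_score = cosine(utterance, negative_emoji)
    return max(0.0, margin - positive_score + negative_score)

# Toy vectors: the utterance points in the same direction as the correct emoji.
utterance_vec = [1.0, 0.0]
correct_vec = [2.0, 0.0]
negative_vec = [0.0, 3.0]
print(hinge_loss(utterance_vec, correct_vec, negative_vec))
```

The loss only penalizes the model while the negative emoji scores too close to the correct one, which is why this objective suits pairwise comparison of similar emoji.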
Finally, the authors obtain an architecture that uses a CNN and a matching approach for classifying emoji and learning emoji embeddings. The importance of this work for the field of robotics is the possibility of producing a facial gesture in response to a statement, conversation, or idea presented to a machine, employing the semantic and emotional relations of emoji.
3.2 Emoji sentiment analysis
In the area of decision making, it is relevant to know how people think and what they will do in the future. This creates the need to group people according to their interactions on the Internet and social networks. Sentiment analysis, or opinion mining, is the study of people’s opinions, sentiments, or emotions using an NLP approach, which includes, but is not limited to, text mining, data mining, ML, and deep learning [20]. For instance, CNNs have been employed to predict the polarity of emoji in tweets. These techniques have been shown to be more effective than shallow models in image recognition and text classification, where they reach better results [19].
Processing tweets for opinion mining and text analysis plays a crucial role in different areas of industry because it produces relevant feedback for the design of products and services. As Twitter is a platform where user interactions are very informal and unstructured and people use many languages and acronyms, it becomes necessary to build a language-independent model based on unsupervised learning. In this scenario, emoji or emoticons can be used as heuristic labels for a system whose feature extraction process relies on unsupervised techniques. The emoji or emoticon is the final result that represents the sentiment a tweet contains. According to Mohammad Hanafy et al., in order to obtain a trained model for text processing, it is essential to preprocess the data to obtain the data sets: noisy elements such as hashtags and other strange characters like “at” are removed, words are reduced by removing duplicates, and, very importantly, emoticons are reemphasized with their scores. Each emoticon has raw data containing a sentiment classified as negative, neutral, or positive. For each class, a continuous value is recorded. This representation is used in the auto-labeling phase to generate the training data, using the score to determine the emoji [19].
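A minimal sketch of such a preprocessing and auto-labeling step is shown below. It is only an approximation of the procedure summarized above: the emoticon scores in `EMOTICON_SCORES` are hypothetical, hashtags and mentions are dropped entirely, and "duplicated words" is interpreted as consecutive repetitions, all of which are assumptions.

```python
import re

# Hypothetical emoticon scores in [-1, 1]; the actual scores of [19] are not given here.
EMOTICON_SCORES = {":)": 1.0, ":D": 1.0, ":(": -1.0, ":/": -0.5}

def preprocess(tweet):
    """Drop hashtags and @-mentions, then collapse consecutive repeated words."""
    cleaned = re.sub(r"[#@]\w+", " ", tweet)
    tokens = []
    for token in cleaned.split():
        if not tokens or tokens[-1].lower() != token.lower():
            tokens.append(token)
    return tokens

def auto_label(tokens):
    """Label a tweet from the summed scores of the emoticons it contains."""
    score = sum(EMOTICON_SCORES.get(token, 0.0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

tokens = preprocess("@bob so so happy today :) #monday")
print(tokens, auto_label(tokens))
```

Tweets labeled in this heuristic way can then serve as training data without any manual annotation.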
The feature extraction stage uses the TF-IDF approach, which indicates the importance of a word in a text through its frequency in that text and in the set of texts. It can be calculated with Eq. (3) as follows [19, 27]:
$$\mathrm{TfIDF}(t, d, F) = tf(t, d) \cdot \log \frac{n_d}{df(d, t) + 1}, \qquad (3)$$
where $t$ is the word and $d$ is the tweet; $tf$ is the term frequency in the document, $df$ is the document frequency (the number of tweets in which the word occurs), and $n_d$ is the number of tweets.
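Eq. (3) can be checked on a toy corpus with a few lines of Python. In this sketch the term frequency is taken to be the raw count of the word in the tweet, which is an assumption; the corpus itself is invented for illustration.

```python
import math

def tf(term, doc):
    """Raw count of `term` in a tokenized tweet (assumed definition of tf)."""
    return doc.count(term)

def df(term, docs):
    """Number of tweets in which `term` occurs."""
    return sum(1 for doc in docs if term in doc)

def tf_idf(term, doc, docs):
    """Eq. (3): tf(t, d) * log(n_d / (df(d, t) + 1))."""
    return tf(term, doc) * math.log(len(docs) / (df(term, docs) + 1))

# Invented toy corpus of four tokenized tweets.
corpus = [
    ["good", "day"],
    ["bad", "day"],
    ["good", "good", "mood"],
    ["rain"],
]
print(tf_idf("mood", corpus[2], corpus))  # 1 * log(4 / 2)
print(tf_idf("good", corpus[2], corpus))  # 2 * log(4 / 3)
```

A word that occurs in few tweets ("mood") receives a larger logarithmic factor than a word spread over many tweets ("good"), although the latter can still score higher in a tweet where it is repeated.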
Other feature extraction methods employed were bag-of-words (BOW) and Word2Vec. BOW selects a set of important words in the tweets, and then each document is represented as a vector of the number of occurrences of the selected words. Word2Vec uses a two-layer neural network to represent each word as a vector of a certain length based on its context. This feature extraction model computes a distributed vector representation of the word, its main advantage being that similar words are close in the vector space. Moreover, it is very useful for named entity recognition, parsing, disambiguation, tagging, and machine translation. In the area of big data processing, the Spark ML library within the Apache Spark engine uses a skip-gram-based implementation that seeks to learn vector representations that take into account the contexts in which words occur [27].
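The bag-of-words representation described above can be sketched as follows. Here the "important" words are simply the most frequent ones in the corpus, which is an assumption made for illustration; other selection criteria are possible.

```python
from collections import Counter

def build_vocabulary(docs, size):
    """Select the `size` most frequent words as the vocabulary (assumed criterion)."""
    counts = Counter(token for doc in docs for token in doc)
    return [word for word, _ in counts.most_common(size)]

def bow_vector(doc, vocabulary):
    """Represent a document by the occurrence counts of the vocabulary words."""
    counts = Counter(doc)
    return [counts[word] for word in vocabulary]

# Invented toy corpus of tokenized tweets.
corpus = [["good", "day"], ["bad", "day"], ["good", "good", "mood"]]
vocabulary = build_vocabulary(corpus, 2)
vectors = [bow_vector(doc, vocabulary) for doc in corpus]
print(vocabulary, vectors)
```

Unlike Word2Vec, this representation ignores word order and context, so two tweets with the same words in a different arrangement map to the same vector.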
The skip-gram model learns word vector representations that are good at predicting a word’s context in the same sentence, given a sequence of training words $w_1, w_2, \ldots, w_T$, where $T$ is the length of the sequence. The objective is to maximize the average log-likelihood, which is defined by Eq. (4) [27]:
$$\frac{1}{T} \sum_{t=1}^{T} \sum_{j=-k}^{k} \log p(w_{t+j} \mid w_t), \qquad (4)$$
where $k$ is the size of the training window. Each word $w$ is associated with two vectors, $u_w$ and $v_w$, which represent it as a word and as a context, respectively. Given the word $w_j$, the probability of correctly predicting $w_i$ is computed with Eq. (5) as [27]:
$$p(w_i \mid w_j) = \frac{\exp(u_{w_i}^{\top} v_{w_j})}{\sum_{l=1}^{V} \exp(u_l^{\top} v_{w_j})}, \qquad (5)$$
where $V$ is the vocabulary size. Computing $p(w_i \mid w_j)$ is expensive, since the denominator sums over the whole vocabulary; consequently, Spark ML uses hierarchical softmax, with a computational cost of $O(\log(V))$ [27].
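To make the softmax probability and the average log-likelihood concrete, the sketch below evaluates both with the full (non-hierarchical) softmax on randomly initialized vectors. The toy sizes and random vectors are assumptions, and pairs with $j = 0$ or with positions outside the sequence are skipped, which is a common convention rather than something stated above.

```python
import math
import random

random.seed(1)
V, D = 5, 3  # toy vocabulary size and vector length (assumed)
U = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(V)]    # u_w: word vectors
CTX = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(V)]  # v_w: context vectors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def p(i, j):
    """Full-softmax probability p(w_i | w_j); its cost grows linearly with V."""
    scores = [dot(U[l], CTX[j]) for l in range(V)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    return exps[i] / sum(exps)

def average_log_likelihood(sequence, k):
    """Average log-likelihood over a window of size k around each word."""
    total = 0.0
    for t, center in enumerate(sequence):
        for j in range(-k, k + 1):
            if j != 0 and 0 <= t + j < len(sequence):
                total += math.log(p(sequence[t + j], center))
    return total / len(sequence)

print(average_log_likelihood([0, 1, 2, 3, 4], k=2))
```

Training would adjust `U` and `CTX` to increase this quantity; hierarchical softmax replaces the sum over all `V` scores with a walk down a binary tree of depth about $\log V$.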
These feature extraction models were used with several classifiers, such as SVM, MaxEnt, voting ensembles, CNN, and LSTM, the latter being an extension of the recurrent neural network (RNN) architecture. As the proposed solution, a weighted voting ensemble classifier is used that combines the outputs of the different models and their classification probabilities; each model is assigned a different weight when voting. The proposed model reaches considerable accuracy in comparison with the other models. This approach is very important in scenarios that allow neither human intervention nor any information about the language used; combining classical and deep-learning algorithms appropriately is very useful for achieving better accuracy [19].
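The idea of weighted voting over class probabilities can be sketched as follows. The model outputs and weights are invented for illustration and do not come from [19]; how the weights are chosen in that work is not described here.

```python
def weighted_vote(model_probabilities, weights):
    """Combine per-model class probabilities with per-model weights
    and return the label with the highest weighted sum."""
    totals = {}
    for probabilities, weight in zip(model_probabilities, weights):
        for label, probability in probabilities.items():
            totals[label] = totals.get(label, 0.0) + weight * probability
    return max(totals, key=totals.get)

# Invented outputs of three hypothetical models for one tweet.
outputs = [
    {"positive": 0.60, "negative": 0.40},  # e.g., an SVM
    {"positive": 0.30, "negative": 0.70},  # e.g., a CNN
    {"positive": 0.45, "negative": 0.55},  # e.g., an LSTM
]
print(weighted_vote(outputs, [0.2, 0.5, 0.3]))
```

With these weights the ensemble follows the two models that favor the negative class; giving all the weight to the first model would flip the decision, which shows how strongly the outcome depends on the weighting.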
3.3 From video to emoji
As a consequence of the semantic meaning that emoji carry, some applications and research involve image processing to generate an emoji classification or an utterance with embedded emoji. For that purpose, Chembula et al. created an application that receives as input a video stream or images of a person and creates an emoticon based on the image of the face. The solution detects the facial expression at the time the message is being generated. Once the facial expression is detected, the device generates a message with the suitable emoticon [28].
This system performs face detection, facial feature detection, and classification to finally identify the facial expression. Although the initial processing proposed by Chembula and Pradesh [22] was not specified in the general description, open source solutions can be used to accomplish this job.
OpenCV is an open source library for computer vision that includes classifiers for real-time face detection and tracking, such as Haar classifiers and Adaptive Boosting (AdaBoost). A trained model for this task can be downloaded as an XML file and imported into an OpenCV project. For feature extraction, the library includes algorithms for detecting regions of interest in the human face, such as the eyes, mouth, and nose. For this purpose, it is important to first discard information from the image stream by converting it to grayscale and then apply a Gaussian blur to reduce noise. The Canny algorithm may be used for tracking facial features with more precision than others such as Sobel and Laplace [29].
In [24], Microsoft’s Emotion API is used as a tool to detect faces in images captured from the computer’s webcam. Once the image is captured, the detected face is classified into seven emotion tags. Although the process is not specified exactly, the API mentioned works on an implementation of the OpenCV library for .NET [30], so the algorithms used for face detection should be the same as those described above.
For the classification task, we can use a nearest neighbor classifier, SVM, a logistic classifier, Parzen density classifiers, normal density Bayes classifiers, or Fisher’s linear discriminant [31]. Finally, when the classification is done, the output layer consists of a group of emoji types according to the meaning of each type of emotion detected in the image of the face. The importance of this contribution lies in the possibility of introducing new forms of human-computer interaction through the use of emotions. This can be useful for intelligent assistants, both physical and virtual, that are able to react according to the mood of the people who use a particular intelligent ecosystem.
Figure 2 shows in a general way the operation of what has been explained above.
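The final classification and emoji selection step can be illustrated with a nearest neighbor classifier, one of the options listed above. Everything specific in this sketch is hypothetical: the two facial features (mouth curvature and eyebrow raise), the training points, the four emotion labels, and the emoji assigned to each label are invented for illustration.

```python
import math

# Hypothetical training data: (mouth_curvature, eyebrow_raise) -> emotion label.
TRAINING = [
    ((0.8, 0.1), "happiness"),
    ((-0.7, -0.2), "sadness"),
    ((0.0, 0.9), "surprise"),
    ((-0.4, -0.8), "anger"),
]

# Hypothetical mapping from emotion label to an emoji code point.
EMOJI = {
    "happiness": "\U0001F600",
    "sadness": "\U0001F622",
    "surprise": "\U0001F62E",
    "anger": "\U0001F620",
}

def classify(features):
    """1-nearest-neighbor: return the label of the closest training point."""
    nearest = min(TRAINING, key=lambda item: math.dist(features, item[0]))
    return nearest[1]

def to_emoji(features):
    return EMOJI[classify(features)]

print(classify((0.7, 0.0)), to_emoji((0.1, 0.8)))
```

In practice the feature vectors would be derived from the detected facial regions, and any of the other listed classifiers could replace the nearest neighbor rule without changing the final label-to-emoji lookup.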
4. Applications to virtual and embodied robotics
As already mentioned, our intention in this work is to elaborate the elements that will provide an artificial intelligent entity, either virtual or physically embodied, with the capacity to recognize and express (R&E) emotional content using emoji. In this sense, we can collect massive amounts of human-human (and perhaps human-machine) interactions from multiple Internet sources such as social media or forums to train ML algorithms that can R&E emotions, utterances, and beat gestures, and even assess the personality of the interlocutor. Furthermore, we may even reconstruct text phrases from speech and embed emoji in them to obtain a bigger picture of the semantic meaning. For instance, if we asked the robot “are you sure?” while raising our eyebrows to emphasize our incredulity, we may obtain an equivalent expression such as “are you sure?” followed by the corresponding emoji. Once the models are defined and trained, they will be embedded into the artificial entity that will interact with humans. This conceptual framework is displayed in Figure 3.
While in a virtual entity such as a chatbot the inference of emotional states or personality, as well as the expression of emotions or beat gestures using emoji, is straightforward, an embodied entity such as a physical robot requires a little more elaboration. In the latter, inferring an interlocutor’s emotional state or personality first requires the human’s facial expressions and gestures to be transformed into emoji from video streams or speech, similarly to what is shown in [22]. Then, the same pipeline as the one used for a chatbot may be employed, identifying the corresponding emotional state using a pretrained sentiment detection algorithm such as that in [20]. Therefore, since both embodied and virtual artificial entities can employ the same pipeline, we focus on applications to the former. In particular, we discuss some works that delve in this direction and how the cognitive interaction between humans and artificial entities may be improved by modeling the emotional exchange as shaped by emoji usage.
Figure 2.
General process of facial detection and its corresponding classification using emoji.
4.1 Embodied service robots study cases
Service robots are a type of embodied artificial intelligent entities (EAIE), which
are meant to enhance and support human social activities such as health and elder
care, education, and domestic chores, among others [32–34]. A very important
element for EAIE is improving the naturalness of human-robot interactions (HRI),
which can be achieved by providing EAIE with the capacity to R&E emotions to/from
their human interlocutors [32, 33].
Regarding the emotional mechanisms of an embodied robot per se, a relevant
example is the work by [33], which consists of an architecture for imbuing an EAIE
with emotions that are displayed on an LED screen using emoticons. Such
architecture establishes that a robot’s emotions are expressed in terms of long-,
medium-, and short-term affective states, namely its personality (i.e., social and
mood changes), the surrounding environment (i.e., temperature, brightness, and
sound levels), and human interaction (i.e., hit, pat, and stroke sensors),
respectively. All of these sensory inputs are employed to determine the EAIE
emotional state using ad hoc rules coded into a fuzzy logic algorithm, and the
resulting state is then displayed on an LED face. Facial gestures corresponding to
Ekman’s basic emotion expressions are shown in the form of emoticons.
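To make the idea of rule-based fuzzy emotion generation more concrete, the
following Python sketch evaluates three illustrative rules over an ambient
temperature reading and two touch sensors, and picks the emotion with the
strongest activation. The membership functions, thresholds, rules, and emoticon
table are assumptions made for illustration; they are not the rules used in
[33], and the sketch skips the defuzzification step of a full fuzzy controller.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 outside (a, c), rising to 1 at x == b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def infer_emotion(temperature_c: float, pat: float, hit: float) -> str:
    """Evaluate three hypothetical rules and return the strongest emotion.

    pat and hit are touch intensities in [0, 1]; all thresholds are assumptions.
    """
    comfortable = triangular(temperature_c, 15.0, 22.0, 30.0)
    hot = triangular(temperature_c, 25.0, 35.0, 45.0)
    activations = {
        # IF ambient is comfortable AND robot is patted THEN happy (AND = min)
        "happy": min(comfortable, pat),
        # IF robot is hit THEN angry
        "angry": hit,
        # IF ambient is hot AND robot is NOT patted THEN sad (NOT = 1 - x)
        "sad": min(hot, 1.0 - pat),
    }
    return max(activations, key=activations.get)


# Hypothetical emoticons shown on the LED face for each emotion.
EMOTICONS = {"happy": ":)", "angry": ">:(", "sad": ":("}

if __name__ == "__main__":
    print(EMOTICONS[infer_emotion(temperature_c=22.0, pat=0.8, hit=0.0)])
```

Because every rule is written by hand, changing the robot’s character means
editing thresholds and rules one by one, which is the maintenance cost that
expert-designed models carry.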
An important application of embodied service robots is the support of elders’
daily activities to promote a healthy lifestyle and provide them with an enriching
companion. In such a case, more advanced interaction models for EAIE based on an
emotional model, gestures, facial expressions, and R&E utterances are proposed
[32, 35–37]. The authors of these works put forward several cost-efficient EAIE
based on mobile device technologies, namely iPhonoid, iPhonoid-C, and iPadrone.
These are robotic companions based on an architecture which, among other features,
is built upon the informationally structured space (ISS) concept. The latter
allows gathering, storing, and transforming multimodal data from the surrounding
environment into a unified framework for perception, reasoning, and decision
making. This is a very interesting concept since EAIE behavior may be improved not
only by its own perceptions and HRI but also by remote users’ information, such as
elders’ activities on the Internet or grocery shopping. Likewise, all this
multimodal information can be exploited by any family member to improve the
quality of his/her relationship with the elders [36]. Regarding the emotional
model, the perception
and action modules are the most relevant. Among the perceptions considered in
these frameworks are the number of people in the room, gestures, utterances,
colors, etc. In the same fashion as [33], these EAIE implement an emotional
time-varying framework, which considers emotion, feeling, and mood (from shorter
to longer emotional duration states, respectively). First, perceptions are
transformed into emotions using expert-defined parameters; then, emotions and
long-term traits (i.e., mood) serve as the input of feelings, whose activation
follows a spiking neural network model [32, 35]. Particularly, mood and feelings
are within a feedback loop, which emphasizes the emotional time-varying approach.
Once perceptions are turned into their corresponding emotional state, the latter
is sent to the action module to determine the robot behavior (i.e., conversation
content, gestures, and facial expression). As mentioned earlier, EAIE also R&E
utterances, which provide feedback to the robot’s emotional state. Another
interesting feature of the architecture of these EAIE is its conversational
framework. In this sense, the usage of certain utterances, gestures, or facial
expressions depends on conversation modes, which in turn depend on NLP for
syntactic and semantic analyses [32, 37]. Nevertheless, with regard to facial and
gesture expressions, these works take them for granted and barely discuss either.
In particular, how facial expressions are designed and expressed can only be
guessed from the figures in these EAIE papers, which closely resemble emoji-like
facial expressions.
Figure 3.
Emoji emotional communication conceptual framework.
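The emotion, feeling, and mood time scales described above can be sketched with
a simple discrete-time update. Note that this sketch deliberately swaps the
spiking neural network of [32, 35] for plain exponential smoothing, so it only
illustrates the layering of short-, medium-, and long-term states and the
mood-feeling feedback loop. The `AffectiveState` and `step` names, the rates, and
the identity mapping from perception to emotion are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AffectiveState:
    emotion: float = 0.0  # short-lived reaction to the current perception
    feeling: float = 0.0  # medium-term state driven by emotion and mood
    mood: float = 0.0     # long-term state that drifts slowly


def step(state: AffectiveState, perception: float,
         feeling_rate: float = 0.3, mood_rate: float = 0.05) -> AffectiveState:
    """Advance the state by one time step; values are valences in [-1, 1]."""
    # Stand-in for the expert-defined perception-to-emotion mapping (identity).
    emotion = perception
    # Feeling is pulled toward the mean of the new emotion and the current mood.
    target = 0.5 * (emotion + state.mood)
    feeling = state.feeling + feeling_rate * (target - state.feeling)
    # Mood follows feeling slowly, which closes the mood-feeling feedback loop.
    mood = state.mood + mood_rate * (feeling - state.mood)
    return AffectiveState(emotion, feeling, mood)


if __name__ == "__main__":
    state = AffectiveState()
    for _ in range(5):  # five consecutive pleasant perceptions
        state = step(state, perception=1.0)
    print(state)
```

With these rates, a pleasant perception changes the emotion immediately, the
feeling over a few steps, and the mood only after a sustained run of similar
perceptions.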
Embodied service robots are also beneficial in the pedagogical area as educational
agents [38, 39]. Under this setting, robots are employed in a learning-by-teaching
approach where students (ranging from kindergarten to preadolescence) read and
prepare educational material beforehand, which is then taught to the robotic peer.
This has been shown to improve students’ understanding and knowledge retention of
the studied subject, increasing their motivation and concentration [38, 40].
Likewise, robots may enhance their classroom presence and the elaboration of
affective strategies by recognizing and expressing emotional content. For
instance, one may desire to elicit an affective state that engages students in an
activity, or to identify boredom in students. Then, the robot’s reaction has to be
an optimized combination of gestures, intonation, and other nonverbal cues, which
maximizes learning gains while minimizing distraction [41]. Humanoid robots are
preferred in education due to their anthropomorphic emotional expression, which is
readily available through body and head posture, arms, speech intonation, and so
on. Among the most popular humanoid robotic frameworks are the Nao and Pepper
robots [38–40]. In particular, Pepper is a small humanoid robot provided with
microphones, 3D sensors, touch sensors, a gyroscope, an RGB camera, and a touch
screen placed on its chest, among other sensors. Through the ALMood module, Pepper
is able to process perceptions from its sensors (e.g., the interlocutors’ gaze,
voice intonation, or linguistic semantics of speech) to provide an estimation of
the instantaneous emotional state of the speaker, of the surrounding people, and
of the ambiance mood [42, 43]. However, Pepper’s communication and emotional
expression are mainly carried out through speech, as a consequence of limitations
such as a static face, unrefined gestures, and other nonverbal cues that are not
as flexible as human standards [44]. Consider, for instance, Figure 4, a picture
displaying a sad Pepper: from the picture alone, it is unclear whether the robot
is sad, looking at its wheels, or simply turned off.
4.2 Study cases through the emoji communication lens
In summary, in the EAIE cases reviewed above (emoticon-based expression,
iPadrone/iPhonoid, and Pepper), emotions are generated through an ad hoc
architecture, which considers emotions and moods that are determined by multimodal
data. A cartoon of these works is presented in Figure 5, displaying in (a) the
work of [33], in (b) the work of [32, 35–37], and in (c) Pepper the robot.
In these cases, we can integrate emoji-based models to enhance the emotional
communication with humans, for some tasks more directly than for others. Take, for
instance, the facial expressions by themselves: in the case of (a) and (b), the
replacement of emoticon-based emotional expression by its emoji counterpart is
straightforward. This will not only visually improve the robot’s facial expression
but also allow more complex facial expressions to be displayed, such as sarcasm or
co-speech gestures, for example after making a joke. Another important feature of
replacing emoticon-based faces by emoji is that the latter are used mostly to
convey positive emotions, even when criticizing or giving negative feedback [2].
Therefore, this feature could be really useful for maintaining a perpetually
friendly tone in an elder’s robotic partner (b) or in an educational agent (c).
Regarding the emotional expression of the discussed EAIE, this is contingent on
the emotional model, which in the case of (a) and (b) is expert-designed knowledge
coded into fuzzy logic behavior rules and more complex neural networks,
respectively. In both cases, this will not only bias the EAIE toward specific
emotional states but also require vast human effort to maintain it. In contrast,
Pepper’s framework is more robust: it includes a developer kit, which allows
modifying the robot’s behaviors and integrating third-party chatbots, performs
semantic and utterance analysis, and is maintained and improved by a robotics
enterprise. Yet, Pepper’s emotional communication is constrained by a static face:
while it can express emotions by changing the color of its LED eyes and adopting
the corresponding body posture, its emotional communication is mainly done through
verbal expressions.
Figure 4.
Is Pepper sad or just shut down?
Nevertheless, in a pragmatic sense, do we really need to emulate emotions for a
robot to have an emotional communication, or is it enough to R&E emotions so that
a human interlocutor cannot distinguish between man and machine? In this sense,
NLP and ML can be used to leverage the emotional communication of a robot by first
mapping multimodal data into a discourse-like text where emoji are embedded, and
then using emoji-based models to recognize sentiments, utterances, and gestures,
so that the decision-making module can determine the corresponding message along
with its corresponding emoji. In the case of (a), the microphone, and in the case
of (b), the microphone, camera, and ambient sensors will be responsible for
capturing speech and facial expressions that will be converted into a
discourse-like text. Once the emotional content of the message is identified, the
corresponding emoji shall be displayed. In the case of Pepper, F2F communication
can be improved directly by displaying emoji on its front tablet. For instance,
when Pepper starts waving to a potential speaker, a friendly emoji such as a
waving hand or a greeting smile shall be portrayed on the tablet. Likewise, emoji
usage as utterances and beat gestures can be employed by Pepper to avoid silences
in a goofy manner, to indicate a lack of knowledge about a particular topic, or to
emphasize politeness when asking an interlocutor for an action.
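The recognize-then-express loop described above can be sketched as follows. This
is a minimal Python illustration that assumes the multimodal input has already
been converted into a discourse-like text with embedded emoji. The sentiment
scores, the thresholds, the reply policy, and the `recognize` and `express` names
are hypothetical; a real system would rely on a published emoji sentiment lexicon
or a trained classifier rather than on this hand-written table.

```python
# Hypothetical emoji sentiment scores in [-1, 1] (illustrative values only).
EMOJI_SENTIMENT = {
    "\U0001F600": 0.8,   # grinning face
    "\U0001F622": -0.7,  # crying face
    "\U0001F928": -0.2,  # face with raised eyebrow
}

# Hypothetical reply policy that keeps a friendly tone in every case.
REPLY_EMOJI = {
    "positive": "\U0001F642",  # slightly smiling face
    "negative": "\U0001F917",  # hugging face
    "neutral": "\U0001F44B",   # waving hand
}


def recognize(text: str) -> str:
    """Label a text by the mean score of the known emoji it contains.

    Only single-code-point emoji are matched, since the text is scanned
    character by character.
    """
    scores = [EMOJI_SENTIMENT[ch] for ch in text if ch in EMOJI_SENTIMENT]
    if not scores:
        return "neutral"
    mean = sum(scores) / len(scores)
    if mean > 0.1:
        return "positive"
    if mean < -0.1:
        return "negative"
    return "neutral"


def express(message: str, sentiment: str) -> str:
    """Attach the emoji that the reply policy selects for the given sentiment."""
    return f"{message} {REPLY_EMOJI[sentiment]}"


if __name__ == "__main__":
    incoming = "I failed the exam \U0001F622"
    print(express("I am here to help.", recognize(incoming)))
```

On a robot with a screen, the string returned by `express` would be split so
that the message is spoken and the emoji is rendered on the display.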
Figure 5.
Case studies using emoji-based modules to improve its emotional R&E models.
5. Discussion
Emotional communication is a key piece for enhancing HRI; after all, it would be
very useful if our smartphones, personal computers, cars or buses, and other
devices could exploit our emotional information to improve our experience and
communication. While several proposals for robotic emotional communication are
currently under way, emoji as a framework for the latter present a novel approach
with high applicability and broad usage opportunities. Some of the works presented
here discussed the linguistic aspects of emoji, as well as the technical aspects
in terms of ML and NLP to R&E emotions, utterances, and gestures in texts that
contain emoji. Furthermore, we also presented some related works in the area of
HRI, which can easily adopt emoji for imbuing an embodied artificial intelligent
entity with the capacity for expressing and recognizing emotional aspects of the
communication. On the whole, ML models support these tasks, but we do not overlook
the important task of processing and transforming the data to reach a suitable
input representation for training an appropriate model.
On the other hand, there are several open questions regarding the usage of emoji
for emotional communication. For instance, are emoji suitable for the
communication of every robotic entity? Emoji are mostly employed in a friendly
manner and for maintaining a positive communication. If the objective is to model
a virtual human, emoji usage will clearly restrain the spectrum of emotions that
may be detected and expressed, due to its knowledge base. An important example to
consider is the humanoid robot designed by Hiroshi Ishiguro, the man who made a
copy of himself [45]. Ishiguro’s proposal is that in order to understand and model
emotions, we must first understand ourselves. Hence, this humanoid robot, namely
Geminoid HI-1, is capable of displaying ultrarealistic human-like behaviors.
However, do we really want to interact with service robots that may have bad
personality traits, such as being unsociable and fickle, or whose mood can be
affected by heat and noise as a human’s is? Do we really want to interact with
service robots that can be as rude as a real elderly caretaker could be? In this
sense, emoji usage for emotional communication may be best suited when the task at
hand (e.g., a robotic retail store cashier or an educational agent) requires
keeping a friendly tone with the human interlocutor. Another question is whether
the entire emoji lexicon should be used or only the core lexicon, which refers to
facial expressions. In an ultrarealistic anthropomorphic robot such as Geminoid
HI-1, all hand gestures might be carried out by the robot’s hands themselves;
thus, it should be unnecessary to even fit a screen for displaying a waving emoji
while greeting. On the contrary, more constrained entities such as a Roomba or
Pepper may clearly benefit from both core and peripheral emoji lexicons for
improving their emotional communication with humans. Also, since most of the emoji
knowledge is based on short text messages, multimodal data first need to be
converted into their corresponding discourse text message, which is, by itself, an
open research question.
Author GSB thanks the Cátedra CONACYT program for supporting this
research. Author OGTL thanks GSB for his excellent collaboration.
Author details
Guillermo Santamaría-Bonfil¹* and Orlando Grabiel Toledano López²
1 CONACYT-INEEL, National Institute of Electricity and Clean Energies,
Cuernavaca, Morelos, Mexico
2 University of Informatics Sciences, La Habana, Cuba
*Address all correspondence to:
© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms
of the Creative Commons Attribution License (
by/3.0), which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.
References
[1] Hurlburt G. Emoji: Lingua franca or
passing fancy? IT Professional. 2018;
[2] Danesi M. The Semiotics of Emoji:
The Rise of Visual Language in the Age
of the Internet. Bloomsbury Academic:
UK; 2016
[3] Skiba D. Face with tears of joy is
word of the year: Are emoji a sign of
things to come in health care? Nursing
Education Perspectives. 2016;37(1):
56-57. Available from: http://insights.
[4] Wiseman S, Gould S. Repurposing
emoji for personalised communication.
In: 2018 CHI Conference on Human
Factors in Computing Systems.
Montréal, QC: ACM; 2018. pp. 1-10
[5] Barbieri F, Kruszewski G, Ronzano F,
Saggion H. How cosmopolitan are
emojis?: Exploring emojis usage and
meaning over different languages with
distributional semantics. In:
Proceedings of the 2016 ACM
Multimedia Conference. 2016.
pp. 531-535
[6] Alshenqeeti H. Are emojis creating a
new or old visual language for new
generations? A socio-semiotic study.
Advances in Language and Literary
Studies. 2016;7(6):56-69
[7] Ai W, Lu X, Liu X, Wang N,
Huang G, Mei Q. Untangling emoji
popularity through semantic
embeddings. In: Proceedings of the
Eleventh International AAAI
Conference on Web and Social Media
ICWSM 17 [Internet]. 2017. pp. 2-11.
Available from:
[8] Kerslake L, Wegerif R. The semiotics
of emoji: The rise of visual language in
the age of the internet. Media and
Communication. 2017;5(4):75
[9] McCulloch G, Gawne L. Emoji
grammar as beat gestures. CEUR
Workshop Proceedings. 2018;2130:3-6
[10] Kralj Novak P, Smailović J,
Sluban B, Mozetič I. Sentiment of
emojis. PLoS One. 2015;10(12):
e0144296. DOI: 10.1371/journal.
[11] Chomsky N. Aspects of the theory of
syntax. The Philosophical Quarterly.
MIT press; 2014;11:1-8
[12] Barbieri F, Ballesteros M, Saggion H.
Are emojis predictable? In: Proceedings
of the 15th Conference of the European
Chapter of the Association for
Computational Linguistics: Volume 2,
Short Papers [Internet]. Stroudsburg,
PA, USA: Association for Computational
Linguistics; 2017. pp. 105-111. Available
[13] Hussien W, Al-Ayyoub M,
Tashtoush Y, Al-Kabi M. On the Use of
Emojis to Train Emotion Classifiers.
2019. Available from:
[14] Doiron JAG. Emojis: Visual
communication in higher education.
PUPIL: International Journal of
Teaching, Education and Learning.
[15] Betz N, Hoemann K, Barrett LF.
Words are a context for mental
inference. Emotion. 2019:1-15. DOI:
[16] Guibon G, Ochs M, Bellot P, et al. From
emoji usage to categorical emoji
prediction. In: 19th International
Conference on Computational
Linguistics and Intelligent Text
Processing (CICLING 2018). Hanoï,
Vietnam; 2018
[17] Querengässer J, Schindler S. Sad but
true? How induced emotional states
differentially bias self-rated Big Five
personality traits. BMC Psychology.
[18] Marengo D, Giannotta F,
Settanni M. Assessing personality using
emoji: An exploratory study. Personality
and Individual Differences. 2017;112:
74-78. DOI: 10.1016/j.paid.2017.02.037
[19] Hanafy M, Khalil MI, Abbas HM.
Combining Classical and Deep Learning
Methods for Twitter Sentiment
Analysis. Switzerland: Springer Nat
Switz; 2018. pp. 281-292
[20] Karthik V. Opinion mining on emoji
using deep learning techniques.
Procedia Computer Science. 2018;132:
[21] Li X, Yan R, Zhang M. Joint emoji
classification and embedding. Learning.
[22] Chembula AB, Pradesh A. Generating
Emoticons Based on an Image of Face.
Vol. 2. US Patent; 21 February 2017
[23] Zhang AX, Igo M, Karger D,
Facciotti M. Using Student Annotated
Hashtags and Emojis to Collect Nuanced
Affective States. London: ACM; 2017
[24] Liu M, Wong A, Pudipeddi R,
Hou B, Wang D, Hsieh G. ReactionBot:
Exploring the effects of expression-
triggered emoji in text messages.
Proceedings of the ACM on Human
Computer Interaction. 2018;2:1-16
[25] Deng L, Yu D. Deep learning
methods and applications. Foundations
and Trends in Signal Processing. 2014;7:
[26] Bishop CM. Pattern Recognition and
Machine Learning. Cambridge, UK:
Springer Science+Business Media, LLC;
2006. 749 p
[27] Pentreath N. Machine Learning with
Spark. Birmingham, UK: Packt
Publishing; 2015. 338 p
[28] Chembula AB, Pradesh A. Generating
emoticons based on an image of face.
Vol. 2. USA. US 9,576,175 B2. 2017
[29] Baggio DL. OpenCV 3.0 Computer
Vision with Java. Birmingham, UK:
Packt Publishing; 2015. 174 p
[30] Larsen L. Learning Microsoft
Cognitive Services. Birmingham, UK:
Packt Publishing; 2017. 484 p
[31] Pelillo M. Advances in Computer
Vision and Pattern Recognition. LLC:
Springer Science+Business Media; 2013.
293 p
[32] Tang D, Yusuf B, Botzheim J,
Kubota N, Chan CS. A novel multimodal
communication framework using robot
partner for aging population. Expert
Systems with Applications. 2015;42(9):
4540-4555. Available from. DOI:
[33] Daosodsai N, Maneewarn T. Fuzzy
based emotion generation mechanism
for an emoticon robot. 13th
International Conference on Control,
Automation and Systems (ICCAS 2013).
Gwangju; 2013:1073-1078. DOI:
10.1109/ICCAS.2013.6704075. Available
[34] Clabaugh C, Mataric M. Robots for
the people, by the people: Personalizing
human-machine interaction. Science
Robotics. 2015;3(21):1-2
[35] Yorita A, Botzheim J, Kubota N.
Emotional models for multi-modal
communication of robot partners. IEEE
International Symposium on Industrial
Electronics. 2013:1-6
[36] Obo T, Kakudi HA, Yoshihara Y,
Loo CK, Kubota N. Lifelog visualization
for elderly health care in
informationally structured space. In:
2015 4th Int Conf Informatics, Electron
Vision, ICIEV 2015. 2015;(March 2017)
[37] Woo J, Botzheim J, Kubota N. A
socially interactive robot partner using
content-based conversation system for
information support.
Journal of Advanced Computational
Intelligence and Intelligent Informatics.
[38] Tanaka F, Isshiki K, Takahashi F,
Uekusa M, Sei R, Hayashi K. Pepper
learns together with children:
Development of an educational
application. In: 2015 IEEE-RAS 15th
International Conference on Humanoid
Robots (Humanoids). 2015. pp. 270-275
[39] Lehmann H, Rossi G. Social robots
in educational contexts: Developing an
application in enactive didactics. Journal
of e-Learning and knowledge Society.
[40] Jamet F, Masson O, Jacquet B,
Stilgenbauer J-L, Baratgin J. Learning by
teaching with humanoid robot: A new
powerful experimental tool to improve
children’s learning ability. Journal of
Robotics. 2018;2018:1-11
[41] Belpaeme T, Kennedy J,
Ramachandran A, Scassellati B,
Tanaka F. Social robots for education: A
review. Science Robotics. 2018;3(21):
1-9. Available from: https://robotics.scie
[42] Europe SR. ALMood API. 2019
[43] Val-Calvo M, Grima-Murcia MD,
Sorinas J, Álvarez-Sánchez JR, de la Paz
Lopez F, Ferrández-Vicente JM, et al.
Exploring the physiological basis of
emotional HRI using a BCI Interface. In:
Ferrández Vicente JM, Álvarez-
Sánchez JR, de la Paz López F, Toledo
Moreo J, Adeli H, editors. Natural and
Artificial Computation for Biomedicine
and Neuroscience. Cham: Springer
International Publishing; 2017.
pp. 274-285
[44] Europe SR. How to Create a Great
Experience with Pepper. September 2017.
Available from: http://doc.aldebaran.
com/ [Last accessed: 17/09/2019]
[45] Guizzo E. Hiroshi Ishiguro: The Man
Who Made a Copy of Himself
[Internet]. IEEE Spectrum. 2010.
Available from:
Emoji as a Proxy of Emotional Communication
Full-text available
WhatsApp is a free messaging and calling application that can be downloaded to mobile phones all over the world that supports the feature. It is straightforward, trustworthy, and secure. WhatsApp enables us to exchange text, voice, and audio-visual (video) messages. However, text message recipients often fail to understand the intended meaning of a text message because the text messages are not provided with intonation and direct expression, resulting in the recipient’s difficulty interpreting the message's intended meaning. This study aimed to find out the functions of emojis in WhatsApp text messages. Interpretative qualitative was used as the research methodology in this study. The data source used is the results of screenshots of chat activities on the WhatsApp application. The study came up with the findings that the primary function of emojis in WhatsApp text messages is to show and emphasize emotion, expression, and feeling like happiness, sadness, apology, embarrassment, support, anger, mockery, gratitude, criticism, humorous, etc. that cannot be spoken verbally or conveyed/shown directly.
In most cultures, the Slightly Smiling Face (smiley) icon indicates friendliness and niceness. However, this SSF symbol in emoji may also indicate a negative meaning of sarcasm and irony to some Chinese social media users. This research analyses the sentiment reflected in the use of the SSF emoji as used by Chinese users on Twitter and applies quantitative methods to investigate the linguistic and social constraints of the SSF emoji's negative variable from 2016 to 2020. Results show that positive or negative emotional expression of SSF emoji is highly dependent on the content of the sentence and its context. Therefore, the SSF emoji has no semantic value as a word for expressing emotions but acts as an emotive anaphora or a modal particle. Simplified Chinese users from mainland China use the SSF emoji with a negative sense more than Traditional Chinese users from Taiwan. These differences may reflect the users’ media preference and cultural identification through the use of emoji, a global language for the digital age. Most Chinese users use a single SSF emoji, which can convey either emotion, at the end of the sentence, but when the SSF emoji is used in a repetitive manner, it is more likely to indicate a sarcastic emotion. Both variables, (single use / repetition) and Chinese types (Simplified / Traditional), significantly correlate with the use of the negative variant of the SSF emoji (p < 0.05). The change in the meaning of the SSF emoji from the expression of positive to negative sentiment demonstrates that emojis may change through time in ways similar to other forms of language.
Emojis are becoming a new visual and linguistic tool that allows users to express their feelings and communicate with each other on social media. Driven by the importance of emoji interpretation, an emoji prediction task, which aims to predict the most likely emojis within a text, has gained significant attention. Thus, we propose a new deep neural network model for this task, MultiEmo (multi-task framework for emoji prediction), to predict the most relevant emoji by considering the emotion detection task. Our experiment shows that MultiEmo is superior to existing models using the Twitter dataset, implying that the models can learn richer representations from semantically related tasks. We systematically confirm that each emoji is associated with a particular emotion in a similar context. We also introduce new evaluation metrics to measure the comprehensive performance of the models.
With the prevalence of affective computing, emotion recognition becomes vital in any work related to natural language understanding. The inspiration for this work is provided by supplying machines with complete emotional intelligence and integrating them into routine life to satisfy complex human desires and needs. The text being a common communication medium on social media even now, it is important to analyze the emotions expressed in the text which is challenging due to the absence of audiovisual cues. Additionally, the conversational text conveys many emotions through communication contexts. Emoticon serves the purpose of self-annotation of writer's emotion in text. Therefore, a machine learning-based text emotion recognition model using emotive features proposed and evaluated it on the SemEval-2019 dataset. The proposed work involves exploitation of different emotion-based features with classical machine learning classifiers like SVM, Multilayer perceptron, REPTree, and decision tree classifiers. The proposed system performs competitively well in terms of f-score 65.31% and accuracy 87.55%.
Full-text available
AI has enabled new ways of learning and teaching. This may change society in ways that pose new challenges for educators and educational institutions. Intelligent Learning Environments (ILE) will provide exciting new opportunities for adapting learning content based on students' cognitive and affective individual needs. In this volume, we present seven research works in some of the most interesting fields of intelligent learning systems. These papers show clearly the main directions of research in intelligent learning environments in México. The papers were carefully chosen by the editorial board based on three reviews made by the members of the technical reviewing committee. The reviewers took into account the originality, scientific contribution to the field, soundness and technical quality of the papers. We appreciate the funding provided by RedICA (Conacyt Thematic Network in Applied Computational Intelligence) and we thank its members that were part of the Technical Committee as well as members of Mexican Society for Artificial Intelligence (SMIA Sociedad Mexicana de Inteligencia Artificial). Last, but not least, we thank Centro de Investigación en Computación-Instituto Politécnico Nacional (CIC-IPN) for their support in preparation of this volume.
Full-text available
In recent years, many studies have explored endowing chatbots with personality comprising various human-like characteristics, such as personal traits and emotional behavior. From these studies an area of research has emerged focusing on enabling chatbots to even perform in a human-like fashion as an individual-like conversational agent. As yet, however, no survey paper has sufficiently reviewed these various studies on personality-related chatbots. Therefore, this article sets out to be the first survey to provide a comprehensive and systematic review of the history, the state-of-the-art, and potential trends in this emerging field. We offer a unified definition and a layered model of personality to the rich personal characteristics that can be endowed on chatbots' personalities. We further propose the term of Personality-Aware Chatbot (PAC) to unify the various terms used to taxonomize chatbots with personality. We have further subdivided all PACs in this survey into two basic categories, i.e., the Self-Personality-Aware Chatbot (SPAC), which has its own personality, and the Other Personality-Aware Chatbot (OPAC) that adapts to a specific user's personality. Then, we review the details of SPACs and OPACs as well as their concrete personalities, functions and applications in existing studies. Moreover, a general PAC framework is proposed to set out four kinds of personality embedding techniques, namely personality regulation, correlation, constraint and adjustment. In addition, this article also reviews available personality-labeled corpora and evaluation approaches used in current PACs, and discusses technical challenges and potential applications of future PACs.
The English as a foreign language (EFL) learners’ levels of attention and meditation as well as brainwaves while interacting with an interlocutor in three different second-language (L2) socialization contexts—with another human in person, with another person through a virtual platform, and with an artificial intelligence (AI) chatbot—were explored in this study. Thirty participants participated in an experiment, throughout which they were asked to wear a NeuroSky Mindwave headset to assess their real-time levels of attention and meditation, as well as their brainwave activities in each of the three contexts. Statistical analyses of the results revealed a significant effect of the EFL socialization context on participants’ level of attention and meditation. The EFL learners’ level of attention was highest when they were socializing with other humans in person. When their interlocutor was a chatbot, their level of meditation was highest. When they were interacting with another person in a virtual environment, both their attention and meditation were lowest. A one-way multivariate analysis of variance (MANOVA) revealed a significant main effect of the dominant ratios of participants’ brainwave activities, based on their interactions with interlocutors in all three contexts. The AI chatbot was associated with the greatest dominant ratio of delta and theta brainwaves for EFL learners. Face-to-face L2 socialization with interlocutors triggered alpha and beta brainwaves, whereas interaction with human interlocutors in the virtual environment made gamma brainwaves dominant. The present study is the first to have empirically examined EFL learners’ levels of attention and meditation as well as brainwaves during L2 socialization in three different contexts.
Full-text available
Accumulating evidence indicates that context has an important impact on inferring emotion in facial configurations. In this paper, we report on three studies examining whether words referring to mental states contribute to mental inference in images from the Reading the Mind in the Eyes Test (Study 1), Baron-Cohen et al. (2001) in static emoji (Study 2), and in animated emoji (Study 3). Across all three studies, we predicted and found that perceivers were more likely to infer mental states when relevant words were embedded in the experimental context (i.e., in a forced-choice task) versus when those words were absent (i.e., in a free-labeling task). We discuss the implications of these findings for the widespread conclusion that faces or parts of faces “display” emotions or other mental states, as well as for psychology’s continued reliance on forced-choice methods.
Browsing the literature shows that an increasing number of authors choose to use the learning by teaching approach in the field of educational robotics. The goal of this paper is, on the one hand, to review articles describing the effects of this approach on learning and, on the other hand, to explore the characteristics at the core of this approach. We focus only on work using a humanoid robot. The areas of learning studied are writing, reading, vocabulary, and reasoning, as well as some metacognitive abilities such as task commitment and mental state attribution. The target learners range from very young children to preadolescents, and some studies already address pupils with special educational needs. In all of these domains, the results show a non-negligible effect of learning by teaching on both learning and metacognitive abilities. Although the concept of learning by teaching is clear, a careful investigation of the different studies shows that the experimental paradigms do not share the same basic characteristics. For some, the main requirement of the approach is the robot’s weakness and the care that must be given to it, while for others it is the unbalanced distribution of knowledge. The learning by teaching approach we study has two components: the robot and the child tutor. We analyze the characteristics of the robot and what is asked of the child in his or her role as tutor.
Millions of tweets are published every day, containing a massive number of opinions and sentiments. Thus, Twitter is used heavily in research and business. Twitter is a global platform accessed from all over the world. Users express their opinions freely, using informal language, without any rules, and in different languages. We propose a unified system that can be applied to any raw tweets without manual intervention. We use emoticons as heuristic labels for our system and extract features statistically or with unsupervised techniques. We combine classical and deep learning algorithms with an ensemble algorithm to make use of the different features of each model and achieve better accuracy. The results show that our approach is reliable and achieves accuracy near the state of the art with a smaller set of labeled tweets.
Social robots can be used in education as tutors or peer learners. They have been shown to be effective at increasing cognitive and affective outcomes and have achieved outcomes similar to those of human tutoring on restricted tasks. This is largely because of their physical presence, which traditional learning technologies lack. We review the potential of social robots in education, discuss the technical challenges, and consider how the robot’s appearance and behavior affect learning outcomes.
Emojis have gone viral on the Internet across platforms and devices. Interwoven into our daily communications, they have become a ubiquitous new language. However, little has been done to analyze the usage of emojis at scale and in depth. Why do some emojis become especially popular while others don't? How are people using them among the words? In this work, we take the initiative to study the collective usage and behavior of emojis, and specifically, how emojis interact with their context. We base our analysis on a very large corpus collected from a popular emoji keyboard, which contains a full month of inputs from millions of users. Our analysis is empowered by a state-of-the-art machine learning tool that computes the embeddings of emojis and words in a semantic space. We find that emojis with clear semantic meanings are more likely to be adopted. While entity-related emojis are more likely to be used as alternatives to words, sentiment-related emojis often play a complementary role in a message. Overall, emojis are significantly more prevalent in a sentimental context.
The increasing use of emojis, digital images that can represent a word or feeling in a text or email, and the fact that they can be strung together to create a sentence with real and full meaning raise the question of whether they are creating a new language amongst technologically savvy youth, or devaluing existing language. There is, however, a further depth to emoji usage as language, suggesting that they are in fact returning language to an earlier stage of human communication. Parallels between emojis and hieroglyphs and cuneiform can be seen, which indicate the universality of visual communication forms, rather than written alphabetised language. There are also indications that emojis may be cultural or gender-specific: women appear to use more emojis than men to express their feelings, and age is less of an indicator of usage than technological awareness and capability. It appears that emojis are filling the need for adding non-verbal cues in digital communication about the intent and emotion behind a message. Examinations of the way that emojis have developed and evolved and their current and forecast usage lead to the conclusion that they are not a “new” language developed by the technologically adept younger generations, but instead are an evolution of older visual language systems that make use of digital technology to create greater layers and nuance in asynchronous communications. Furthermore, emojis are devices for demonstrating tone, intent and feelings that would normally be conveyed by non-verbal cues in personal communications but which cannot be achieved in digital messages. It is also evident from prior works and analyses of usage that there are universal meanings to emojis. This suggests that as a language form, emojis may be able to contribute to increased cross-cultural communication clarity. Further research is, however, recognised as being necessary to fully understand the role that emojis can play as a visual language for all generations, not just those termed millennials or technologically savvy youths.
In this paper we present ReactionBot, a system that attaches emoji based on users' facial expressions to text messages on Slack. Through a study of 16 dyads, we found that ReactionBot was able to help communicate participants' affect, reducing the need for participants to self-react with emoji during conversations. However, contrary to our hypothesis, ReactionBot reduced social presence (behavioral interdependence) between dyads. Post-study interviews suggest that the emotion feedback through ReactionBot indeed provided valuable nonverbal cues: it offered more genuine feedback, and participants were more aware of their own emotions. However, this can come at the cost of increased anxiety from concerns about negative emotion leakage. Further, the more active role of the system in facilitating the conversation can also result in unwanted distractions and may have contributed to the reduced sense of behavioral interdependence. We discuss implications for utilizing this type of cue in text-based communication.
The emoji has a long historical tail as a visual means of communication. Nonetheless, it is truly a product of the digital age. This paper explores the worldwide acceptance of emoji and the implications of this acceptance.
The development of robot partners for supporting human life has been growing for many years. One main feature that should be considered in developing such robots is the conversation system. In this study, a conversation system called iPhonoid-C is introduced. The iPhonoid-C is a robot partner based on a smart device. A conversation is a form of communication in which two or more people exchange words and information. Therefore, one important part of judging the effectiveness of the interaction must be to evaluate if the appropriate amount of information is provided by the robot. In this research, we focused on a time-dependent utterance system to adjust the amount of conversation based on Grice’s maxim of quantity. By utilizing Grice’s theory, it is possible to tailor the robot’s communication by selecting the Grice value to correspond to the human’s condition. Using this method, the robot partner can control the amount of information it communicates to adapt to the human’s situation based on Grice’s maxim of quantity. An experimental result with the robot partner is presented to validate the proposed time-dependent conversation system.
Multimodal, interactive, and multitask machine learning can be applied to personalize human-robot and human-machine interactions for the broad diversity of individuals and their unique needs.