HBE: Hashtag-Based Emotion Lexicons for
Twitter Sentiment Analysis
Fajri Koto
Faculty of Computer Science
University of Indonesia
Depok, Jawa Barat, Indonesia
fajri91@ui.ac.id
Mirna Adriani
Faculty of Computer Science
University of Indonesia
Depok, Jawa Barat, Indonesia
mirna@cs.ui.ac.id
ABSTRACT
In this paper we report the first effort to construct an emotion lexicon using Twitter as the source of data. Specifically, we used the hashtag feature to obtain English tweets carrying a given emotion label. Eight emotion classes from Plutchik's wheel are used in our work: angry, disgust, fear, joy, sad, surprise, trust and anticipation. To obtain the lexicon, we first ranked the words by term frequency and then removed irrelevant low-frequency words. We also enriched the lexicon with synonyms and filtered it against a sentiment lexicon of 40,288 words. As a result, we constructed four Hashtag-Based Emotion (HBE) lexicons through different procedures: HBE-A1 (50,613 words), HBE-B1 (23,400 words), HBE-A2 (26,909 words) and HBE-B2 (14,905 words). In our experiments we applied these lexicons to Twitter Sentiment Analysis, and the results reveal that the proposed emotion lexicons boost accuracy and even improve over the NRC-Emotion lexicon. It is also worth noting that our construction method is simple, automatic, inexpensive and well suited to Social Media analysis.
Keywords
emotion lexicon, twitter, hashtag, sentiment analysis, subjectivity, polarity
1. INTRODUCTION
Emotion is the part of human communication that conveys the psychological dimension of a message. It shapes the identity of the message and determines how the listener or conversation partner reacts. In face-to-face communication, emotion can be distinguished not only through facial expression and voice intonation but also through the words used. For example, delightful and yummy indicate the emotion of joy, gloomy and cry are indicative of sadness,
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
FIRE '15, December 04-06, 2015, Gandhinagar, India
© 2015 ACM. ISBN 978-1-4503-4004-5/15/12...$15.00
DOI: http://dx.doi.org/10.1145/2838706.2838718
shout and boiling are indicative of anger, and so on. Thus,
it is obvious that words are associated with emotion [1].
Previous work has shown that emotion is related to Sentiment Analysis: by incorporating emotion into sentiment classification, a system can achieve better performance [2]. Sentiment Analysis is a classification task that determines the polarity of a text, and it belongs to the broad areas of natural language processing, computational linguistics and text mining [3]. In this work, we follow [2] and use two types of binary classification: 1) polarity classification with classes = {positive, negative}, and 2) subjectivity classification with classes = {subjective, objective}. As the names suggest, positive or negative sentences carry positive or negative polarity. On the other hand, an objective sentence usually presents information containing facts or news and lacks argument, while a subjective sentence reflects a private point of view, emotion or belief [4].
Various studies of Sentiment Analysis have been carried out on structured and unstructured documents such as news [5], blogs [6] and forum reviews [7]. Due to its rapid growth, Social Media has also become an interesting domain for sentiment investigation: the abundance of text in Social Media yields critical information for today's business, and enhancements and further investigations are in progress to build more sophisticated systems. Driven by this fact, we are interested in studying the correlation between emotion and sentiment in a Social Media domain, Twitter1, by building an emotion word list.
As the second largest social media platform today, Twitter had more than 600 million active users in 20142. Twitter is a simple microblogging environment that allows users to post short free texts, limited to 140 characters, called tweets. Since 2014, more than 58 million tweets have been posted per day on average. A unique phenomenon of tweets is the use of one or more words immediately preceded by a hash symbol (#). These words are called hashtags and are often found in tweet data. A hashtag may serve many purposes, but most notably it indicates the topic; it often adds information to the tweet, including the internal emotion of the author [8]. For instance: "when i'm sad, i listen to songs that make me even more sad. #sad".
By utilizing the hashtag feature of Twitter, we build an emotion word list, first collecting corpora for the emotion classes indicated by hashtags. We then use our emotion lexicon to classify tweet sentiment and compare it against the existing emotion lexicon, NRC [1].
1http://www.twitter.com
2http://www.statisticbrain.com
2. RELATED WORK
The first study of tweet sentiment was done by Go et al., who utilized emoticons to annotate tweets with sentiment labels [9]. A subsequent study by Agarwal et al. used manually annotated tweets and performed classification with a unigram model [10]. Koto et al. investigated the use of POS sequences in Twitter Sentiment Analysis to understand the sentence patterns of tweets with sentiment [18]. In another study, Bravo-Marquez et al. attempted to boost Sentiment Analysis by combining aspects such as opinion strength, emotion and polarity; their results show that adding the emotion factor to sentiment analysis can provide significant improvements [2].
In their work, Bravo-Marquez et al. used the English emotion lexicon published by Mohammad et al. as a feature to classify sentiment [1,2]. This word list was built around eight basic emotion classes drawn from human psychology. Their work refers to the Ekman emotions [11] and annotates words with the classes of Plutchik's wheel [12]: joy, trust, sadness, anger, surprise, fear, anticipation and disgust. This lexicon is known as the National Research Council Canada (NRC)-Emotion lexicon and contains 8,258 unique words that were manually annotated through crowdsourcing on Amazon Mechanical Turk [1].
Even though the NRC lexicon is a large word list, some formal and informal words and phrases commonly used in Social Media are not included; lol and wtf are examples. We argue that such words should be considered, especially for Social Media analysis, since NRC was built from the Macquarie Thesaurus [13], a resource constructed two decades before the Social Media era. Driven by this fact and by the insight of [8], we construct emotion corpora by using hashtags as labels: following [8], we collect tweets with emotion labels and then extract the lexicon.
3. BUILDING THE EMOTION LEXICONS
Fig. 1 describes the procedure used to construct our emotion lexicons. We again use the eight basic emotions introduced by Plutchik as queries to collect tweets; for example, we use #angry as the query to obtain tweets relevant to anger. Since our preliminary research focuses on English, we apply an English language filter during the crawling process.
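As an illustration of this labeling step, the sketch below groups already-collected tweets by the emotion hashtag they contain. The function and variable names are ours; the actual collection in this work was done manually through a browser rather than in code.

```python
import re

# The eight Plutchik emotions used as hashtag queries.
EMOTIONS = ["angry", "disgust", "fear", "joy", "sad",
            "surprise", "trust", "anticipation"]

def label_by_hashtag(tweets):
    """Group tweets by the emotion hashtag they contain; a tweet lands
    in several groups if it carries more than one emotion hashtag."""
    corpus = {e: [] for e in EMOTIONS}
    for tweet in tweets:
        hashtags = {h.lower() for h in re.findall(r"#(\w+)", tweet)}
        for emotion in EMOTIONS:
            if emotion in hashtags:
                corpus[emotion].append(tweet)
    return corpus

sample = ["when i'm sad, i listen to songs that make me even more sad. #sad",
          "what a goal in the last minute! #joy #surprise"]
grouped = label_by_hashtag(sample)
# grouped["sad"], grouped["joy"] and grouped["surprise"] each hold one tweet
```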
Figure 1: The construction stage of emotion lexicon
To obtain the raw data quickly, we used a conventional method, searching the emotion queries in a browser instead of using the API. The process was done manually, collecting raw data starting from November 22, 2013 and moving backward until the browser could not return any older tweets. From the HTML files we collected 24,016 tweets in total; the summary is given in Table 1. We adjusted the collection period for certain emotion classes, such as Sad, whose volume would have been too large if collected back to 2008; for the Sad label we therefore limited the crawl to tweets posted after February 19, 2013.
Table 1: Corpus of tweets with emotion hashtag labels
Emotion        Oldest tweet   Number of tweets
Angry          21-Jul-07                  2759
Disgust        12-Mar-09                   226
Fear           24-Oct-08                  2967
Joy            29-Sep-08                  5316
Sad            19-Feb-13                  4906
Surprise       29-Mar-08                  4212
Trust          18-Oct-12                  2908
Anticipation   3-Dec-08                    722
The raw data was processed through several preprocessing stages: 1) removing usernames (the @ entities) and URLs, 2) removing the RT marker, 3) lowercasing all words, 4) removing all special characters, 5) stemming, 6) lemmatization, and 7) removing stopwords. Stemming and lemmatization were done with the NLTK library [14].
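Stages 1)-4) and 7) of this pipeline can be sketched in plain Python as below. The stopword list is a tiny illustrative stand-in, and the NLTK stemming and lemmatization used in the paper are omitted to keep the sketch dependency-free.

```python
import re

# Tiny illustrative stopword list; the paper uses NLTK's full English list.
STOPWORDS = {"a", "an", "the", "is", "am", "are", "i", "to", "and", "of"}

def preprocess(tweet):
    text = re.sub(r"@\w+", "", tweet)          # 1) drop @username mentions
    text = re.sub(r"https?://\S+", "", text)   # 1) ...and URLs
    text = re.sub(r"\bRT\b", "", text)         # 2) drop the RT marker
    text = text.lower()                        # 3) lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # 4) strip special characters
    # 5) stemming and 6) lemmatization would be applied here via NLTK
    return [t for t in text.split() if t not in STOPWORDS]  # 7) stopwords

print(preprocess("RT @bob: I am SO angry!! http://t.co/x #angry"))
# → ['so', 'angry', 'angry']
```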
After these preprocessing steps, we counted the term frequency within each emotion set. We then reduced irrelevant words by 1) removing words with a term frequency of 1 and 2) keeping only the top-75% of the remaining words. The process continued by enriching the lexicon with synonym sets, using the NLTK library in Python to complete the word list. The result is the emotion lexicon we call Hashtag-Based Emotion-A (HBE-A).
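A minimal sketch of this HBE-A construction for a single emotion class, assuming preprocessed token lists and a precomputed word-to-synonyms mapping (the paper obtains synonyms via NLTK; the helper name and toy data here are ours):

```python
from collections import Counter

def build_hbe_a(token_lists, synonyms, top_fraction=0.75):
    """Frequency-rank the words of one emotion corpus, drop hapaxes,
    keep the top fraction, then enrich with synonyms."""
    freq = Counter(t for tokens in token_lists for t in tokens)
    ranked = [w for w, c in freq.most_common() if c > 1]   # 1) drop freq == 1
    kept = ranked[: int(len(ranked) * top_fraction)]       # 2) keep top-75%
    lexicon = set(kept)
    for w in kept:                                         # 3) add synonyms
        lexicon |= synonyms.get(w, set())
    return lexicon

tweets = [["angry", "mad"], ["angry", "mad"], ["angry", "furious"], ["angry"]]
lexicon = build_hbe_a(tweets, {"angry": {"irate"}})
# lexicon == {"angry", "irate"}
```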
We argue that keeping only the HBE-A words that carry polarity, as sentiment lexicons do, may further improve performance; moreover, emotion clearly has a strong correlation with sentiment. Therefore, in the last step, we filter the resulting lexicon with the sentiment words obtained from 1) Bing Liu [21], 2) Wilson [22], 3) AFINN [16], and 4) SentiWordNet [23]. In total, 40,288 unique sentiment words were used for this final filtering. For SentiWordNet specifically, we selected the words whose positive or negative score is non-zero (non-neutral words). The result of this filtering is the HBE-B lexicon.
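The filtering step itself reduces to a set intersection between HBE-A and the merged sentiment word list, as in this sketch (the function name is ours):

```python
def filter_by_sentiment(hbe_a, sentiment_words):
    """HBE-B keeps only the HBE-A entries that also appear in the merged
    sentiment lexicon (Bing Liu, Wilson, AFINN and non-neutral
    SentiWordNet words in the paper; any set of polar words here)."""
    return set(hbe_a) & set(sentiment_words)

hbe_b = filter_by_sentiment({"angry", "chair", "happy"},
                            {"angry", "happy", "bad"})
# hbe_b == {"angry", "happy"}
```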
In total, four emotion lexicons were produced and used in the experiment: HBE-A1, HBE-B1, HBE-A2, and HBE-B2. They differ in the procedure applied at the synonym stage: 1) we add the synonyms of all words produced by the previous stage, yielding A1 and B1; or 2) we add the synonyms of only the top-25% of words, yielding A2 and B2. The second variant tests the hypothesis that expanding only the higher-frequency words, rather than all words, may produce a better lexicon. Table 2 summarizes the HBE and NRC lexicons.
Table 3: The accuracy (%) of subjectivity classification
Dataset   AFINN   +NRC    +HBE-A1   +HBE-B1   +HBE-A2   +HBE-B2
Sanders   61.89   63.40   63.95     59.87     64.41     64.66
HCR       60.36   60.89   65.00     60.54     64.64     62.32
SemEval   68.04   68.20   68.15     68.53     68.48     68.66
OMD       52.12   52.00   60.44     64.56     61.31     64.00
Table 4: The accuracy (%) of polarity classification
Dataset   AFINN   +NRC    +HBE-A1   +HBE-B1   +HBE-A2   +HBE-B2
Sanders   70.72   73.69   72.70     75.41     72.94     73.87
HCR       61.55   58.42   60.19     61.96     62.09     60.73
SemEval   73.88   75.22   76.17     76.28     76.34     76.06
OMD       62.25   63.31   64.31     63.94     66.12     65.56
Table 2: Number of words in the hashtag-based emotion and NRC lexicons
Emotion        HBE-A1   HBE-B1   HBE-A2   HBE-B2    NRC
Angry            7305     3444     3909     1978   1245
Disgust           909      482      331     1948   1055
Fear             7088     3440     3746     1975   1474
Joy              9447     4211     5275     2469    689
Sad              9555     4377     5530     2647   1191
Surprise         7788     3302     4078     1809    839
Trust            6042     2940     3068     1612    564
Anticipation     2479     1204      972      467   1231
Total           50613    23400    26909    14905   8258
Table 5: Balanced datasets
Subjectivity   Sanders    HCR    OMD   SemEval
#subjective       1190    280    800      2256
#objective        1190    280    800      2256
#total            2380    560   1600      4512
Polarity       Sanders    HCR    OMD   SemEval
#negative          555    368    800       896
#positive          555    368    800       896
#total            1110    736   1600      1792
4. EXPERIMENT
4.1 Experimental Set-Up
The experiments covered two Sentiment Analysis tasks, polarity and subjectivity, and used 4 different datasets: 1) Sanders3, 2) Health Care Reform (HCR)4, 3) Obama-McCain Debate (OMD)4, which were used by Speriosu et al. [15], and 4) the International Workshop SemEval 2013 (SemEval)5 data. Each tweet in these datasets is tagged as positive, negative, or neutral. In this work, we performed binary classification and tackled class imbalance by sampling tweets. The summary of these data is given in Table 5.
As a baseline we used AFINN [16], a lexicon containing
3http://www.sanalytics.com/lab/twitter-sentiment
4https://bitbucket.org/speriosu/
5http://www.cs.york.ac.uk/semeval-2013/
2,477 English words, constructed on the basis of the Affective Norms for English Words (ANEW) lexicon proposed by Bradley and Lang [17]. This choice is motivated by its good performance in sentiment classification over Twitter; a recent comparative study on Twitter sentiment analysis [19] has also shown that AFINN is a good feature to use as a baseline.
4.2 Experimental Results
To investigate the performance of HBE in sentiment classification, we compared the AFINN lexicon alone against the combinations of 1) AFINN and NRC and 2) AFINN and HBE, for both classification tasks (subjectivity and polarity). In Tables 3 and 4 these combinations are written as +NRC and +HBE. The AFINN lexicon is used by extracting two features from each tweet: APO (AFINN positivity), the sum of the scores of its positive words (1 to 5), and ANE (AFINN negativity), the sum of the scores of its negative words (-5 to -1). Each emotion feature is simply the number of words matching the word list of the corresponding emotion class. Combination is done by concatenating both feature groups, giving 10 attributes in total. Classification was performed with 5-fold cross-validation, with 80% of the tweets as training set and the remainder as test set, using LibSVM in the open-source tool RapidMiner6.
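The feature extraction described above can be sketched as follows. The sketch covers only the 10-attribute vector (APO, ANE, and eight emotion match counts), not the LibSVM/RapidMiner classification step, and all names are ours.

```python
EMOTIONS = ["angry", "disgust", "fear", "joy", "sad",
            "surprise", "trust", "anticipation"]

def extract_features(tokens, afinn, emotion_lexicons):
    """Return [APO, ANE, count_angry, ..., count_anticipation].
    `afinn` maps words to scores in [-5, 5]; `emotion_lexicons`
    maps each emotion name to its word set."""
    scores = [afinn.get(t, 0) for t in tokens]
    apo = sum(s for s in scores if s > 0)                  # AFINN positivity
    ane = sum(s for s in scores if s < 0)                  # AFINN negativity
    emo = [sum(t in emotion_lexicons[e] for t in tokens)   # per-class matches
           for e in EMOTIONS]
    return [apo, ane] + emo

lexicons = {e: set() for e in EMOTIONS}
lexicons["joy"] = {"good"}
print(extract_features(["good", "bad", "meh"], {"good": 3, "bad": -2}, lexicons))
# → [3, -2, 0, 0, 0, 1, 0, 0, 0, 0]
```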
According to Tables 3 and 4, combining AFINN with HBE improves the baseline accuracy more than combining AFINN with NRC, the existing emotion lexicon. The colored cells in the tables mark results whose scores are better than both the baseline and +NRC. In both tables, the proposed lexicons clearly achieve the best accuracy on every dataset.
In Table 3, three HBE-B variants and one HBE-A variant give the best subjectivity classification. The AFINN accuracy increases substantially on the HCR and OMD data, by 4.64% and 12.44% respectively. This shows that filtering HBE-A with sentiment words helps the classifier recognize subjective tweets: selecting sentiment words may make it easier for the machine to identify subjective tweets, which tend to reflect a private point of view
6http://www.rapidminer.com
or opinion.
For the second task, Table 4 shows three HBE-A2 cells and one HBE-B1 cell with the best performance. The AFINN accuracy increases by 4.69%, 0.54%, 2.46%, and 3.87% for Sanders, HCR, SemEval, and OMD respectively. Here, filtering with sentiment words may not give a major improvement, since the sentiment words combine all types of polarity. However, applying synonyms only to the top-25% of the lexicon (A2 and B2) yields larger improvements than A1 and B1. Table 3 shows a similar pattern: all cells in the A2 and B2 columns are colored.
5. CONCLUSION
In this work we built emotion lexicons from Twitter data by using hashtags as labels to differentiate emotions. We call the resulting lexicons the HBE lexicons and used them in Twitter sentiment classification as a preliminary investigation. The experimental results reveal better performance than the existing emotion lexicon, NRC. Our lexicons are more suitable for Social Media analysis since they were built from Twitter itself. Another advantage of our approach is that the lexicons can grow in size and coverage along with the growth of Twitter data. As future work, these emotion lexicons can be used to investigate emotion classification on Social Media or in other domains such as speech [20], news, and chat.
6. ACKNOWLEDGEMENT
Part of this work was supported by FredMan Foundation,
Samsung R&D Institute Indonesia.
7. REFERENCES
[1] S. M. Mohammad, and P. D. Turney. Crowdsourcing a
word-emotion association lexicon. In Computational
Intelligence, 29(3): 436–465, 2013.
[2] F. Bravo-Marquez, M. Mendoza, and B. Poblete.
Combining strengths, emotions and polarities for
boosting Twitter sentiment analysis. In Proceedings of
the Second International Workshop on Issues of
Sentiment Discovery and Opinion Mining, 2, 2013.
[3] W. J. Trybula. Data Mining and Knowledge
Discovery. In Annual review of information science
and technology (ARIST), 32: 197–229, 1997.
[4] S. Raaijmakers and W. Kraaij. A Shallow Approach to
Subjectivity Classification. In ICWSM, 2008.
[5] W. Zhang and S. Skiena. Trading Strategies to Exploit
Blog and News Sentiment. In ICWSM, 2010.
[6] V. K. Singh, R. Piryani, A. Uddin, and P. Waila.
Sentiment analysis of Movie reviews and Blog posts.
In Advance Computing Conference (IACC): 893–898,
2013.
[7] L. Shi, B. Sun, L. Kong, and Y. Zhang. Web forum
Sentiment analysis based on topics. In Computer and
Information Technology, 2: 148–153, 2009.
[8] S. M. Mohammad. #Emotional tweets. In Proceedings
of the Sixth International Workshop on Semantic
Evaluation, Association for Computational
Linguistics, 2012.
[9] A. Go, R. Bhayani, and L. Huang. Twitter sentiment
classification using distant supervision. In CS224N
Project Report, Stanford, 2009.
[10] A. Agarwal, B. Xie, I. Vovsha, O. Rambow, and R.
Passonneau. Sentiment analysis of twitter data. In
Proceedings of the Workshop on Languages in Social
Media: 30–38, 2011.
[11] P. Ekman. An argument for basic emotions. Cognition
and Emotion, 6(3-4): 169–200, 1992.
[12] R. Plutchik. The psychology and biology of emotion.
HarperCollins College Publishers, 1994.
[13] The Macquarie Thesaurus. Macquarie Library, 1986.
[14] S. Bird. NLTK: the natural language toolkit. In
Proceedings of the COLING/ACL on Interactive
presentation sessions: 69–72, 2006.
[15] M. Speriosu, N. Sudan, S. Upadhyay, and J.
Baldridge. Twitter polarity classification with label
propagation over lexical links and the follower graph.
In Proceedings of the First workshop on Unsupervised
Learning in NLP: 53–63, 2011.
[16] F. A. Nielsen. A new ANEW: Evaluation of a word
list for sentiment analysis in microblogs. arXiv
preprint arXiv:1103.2903, 2011.
[17] M. M. Bradley and P. J. Lang. Affective norms for
English words (ANEW): Instruction manual and
affective ratings. In Technical Report C-1, The Center
for Research in Psychophysiology, University of
Florida, 1999.
[18] F. Koto and M. Adriani. The Use of POS Sequence for
Analyzing Sentence Pattern in Twitter Sentiment
Analysis. In Proceedings of the 29th WAINA (the
Eight International Symposium on Mining and Web),
Gwangju, Korea: 547–551, 2015.
[19] F. Koto and M. Adriani. A Comparative Study on
Twitter Sentiment Analysis: Which Features are
Good?. In Natural Language Processing and
Information Systems, Springer International
Publishing, 9103: 453–457, 2015.
[20] N. Lubis, D. Lestari, A. Purwarianti, S. Sakti, and S.
Nakamura. Emotion recognition on Indonesian
television talk shows. In Proceedings of Spoken
Language Technology Workshop (SLT): 466–471, 2014.
[21] B. Liu, M. Hu and J. Cheng. Opinion observer:
analyzing and comparing opinions on the web. In
Proceedings of the 14th international conference on
World Wide Web: 342–351, 2005.
[22] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing
contextual polarity in phrase-level sentiment analysis.
In Proceedings of the conference on human language
technology and empirical methods in natural language
processing: 347–354, 2005.
[23] S. Baccianella, A. Esuli, and F. Sebastiani.
SentiWordNet 3.0: An Enhanced Lexical Resource for
Sentiment Analysis and Opinion Mining. In LREC,
10: 2200–2204, 2010.
... Moreover, this study also emphasized that sentiment classification is more arduous than general classification tasks. In order to improve the performance of the sentiment analysis model, researchers tried to add different features to the supervised model [7][8][9][10]. The diversity of different features improves the performance of the model, but the supervised model requires a lot of labeled data, which is commonly costly, resulting in a high labor cost in real world applications. ...
... When the corpus is relatively large, the threshold should be small, because the low frequency will be arduous to fit. We set the value of minCount to (2,3,4,5,6,7,8) for testing. Finally, the model has the highest accuracy, recall and F1 score at the value of 5. ...
... Therefore, several features are explored: the number of positive, negative, and neutral words, the number of links, negative terms or retweets, or linguistic attributes (e.g., the ratio of adverbs and adjectives). The system of [5] uses hashtags to detect different emotions (e.g., sadness, joy, etc.). A term frequency is computed for each hashtag, and four hashtag-based emotion lexicons are built and applied for the whole process. ...
Article
Full-text available
"Identifying the sentiment of collected tweets has become a challenging and interesting task. In addition, mining and defining relevant features that can improve the quality of a classification system is crucial. The data modeling phase is fundamental for the whole process since it can reveal hidden information from the textual inputs. Two models are defined in the presented paper, considering Twitter-specific concepts: a hashtagbased representation and a text-based one. These models will be compared and integrated into an unsupervised system that determines groups of tweets based on sentiment labels (positive and negative). Moreover, wordembedding techniques (TF-IDF and frequency vectors) are used to convert the representations into a numeric input needed for the clustering methods. The experimental results show good values for Silhouette and Davies-Bouldin measures in the unsupervised environment. A detailed investigation is presented considering several items (dataset, clustering method, data representation, or word embeddings) for checking the best setup for increasing the quality of detecting the sentiment from Twitter’s messages. The analysis and conclusions show that the first results can be considered for more complex experiments. Keywords: Sentiment Analysis, Twitter, Data Representation, Hashtags, Clustering. "
... Rainy days always make me feel sad #ihaterain Positive I prefer the fresh air on rainy days #iredhaterain Table 1: Example of Inconsistent Sentiment between Online Posts and their Hashtags overcome these challenges, the extensive use of hashtags in social media has drawn growing attention from researchers (Davidov, Tsur, and Rappoport 2010;Wang et al. 2011;Koto and Adriani 2015;Kouloumpis, Wilson, and Moore 2011). ...
Preprint
Full-text available
With the spreading of hate speech on social media in recent years, automatic detection of hate speech is becoming a crucial task and has attracted attention from various communities. This task aims to recognize online posts (e.g., tweets) that contain hateful information. The peculiarities of languages in social media, such as short and poorly written content, lead to the difficulty of learning semantics and capturing discriminative features of hate speech. Previous studies have utilized additional useful resources, such as sentiment hashtags, to improve the performance of hate speech detection. Hashtags are added as input features serving either as sentiment-lexicons or extra context information. However, our close investigation shows that directly leveraging these features without considering their context may introduce noise to classifiers. In this paper, we propose a novel approach to leverage sentiment hashtags to enhance hate speech detection in a natural language inference framework. We design a novel framework SRIC that simultaneously performs two tasks: (1) semantic relation inference between online posts and sentiment hashtags, and (2) sentiment classification on these posts. The semantic relation inference aims to encourage the model to encode sentiment-indicative information into representations of online posts. We conduct extensive experiments on two real-world datasets and demonstrate the effectiveness of our proposed framework compared with state-of-the-art representation learning models.
... Although the models analyzed in the existing literature, which are characterized by the diversity of different features, improve performance that can be evaluated by metrics such as accuracy, Recall and F1-score, these supervised models have been trained on a large volume of data and, therefore, require a lot of labeled data, which is usually costly and leads to high labor cost in real-world applications [10,11]. ...
Article
Full-text available
Most sentiment analysis models that use supervised learning algorithms consume a lot of labeled data in the training phase in order to give satisfactory results. This is usually expensive and leads to high labor costs in real-world applications. This work consists in proposing a hybrid sentiment analysis model based on a Long Short-Term Memory network, a rule-based sentiment analysis lexicon and the Term Frequency-Inverse Document Frequency weighting method. These three (input) models are combined in a binary classification model. In the latter, each of these algorithms has been implemented: Logistic Regression, k-Nearest Neighbors, Random Forest, Support Vector Machine and Naive Bayes. Then, the model has been trained on a limited amount of data from the IMDB dataset. The results of the evaluation on the IMDB data show a significant improvement in the Accuracy and F1 score compared to the best scores recorded by the three input models separately. On the other hand, the proposed model was able to transfer the knowledge gained on the IMDB dataset to better handle a new data from Twitter US Airlines Sentiments dataset.
... The ratio of sentiments after AFINN is shown in Table VII. We used AFINN in comparison with TextBlob because many studies show the significance of AFINN for tweets sentiment analysis [36]- [38]. Table VIII showing the performance of machine learning models on AFINN sentiments. ...
... When pictures were presented, participants were required to perceive the emotion. However, a sticker is not always a direct labeling of emotional content; it can sometimes be ambiguous (Koto and Adriani, 2015) and thus may add to the task difficulty. ...
Article
Full-text available
This article aims to investigate the interaction effects of emotional valence (negative, positive) and stimulus type (sticker, face) on attention allocation and information retrieval in spatial working memory (WM). The difference in recognition of emotional faces and stickers was also further explored. Using a high-resolution event-related potential (ERP) technique, a time-locked delayed matching-to-sample task (DMST) was employed that allowed separate investigations of target, delay, and probe phases. Twenty-two subjects participated in our experiment. The results indicated that negative face can catch early attention in information encoding, which was indicated by the augmentation of the attention-related P200 amplitude. In the delay phase, the N170 component represents facial specificity and showed a negative bias against stickers. For information retrieval, the increase in the emotion-related late positive component (LPC) showed that positive emotion could damage spatial WM and consume more cognitive resources. Moreover, stickers have the ability to catch an individual’s attention throughout the whole course of spatial WM with larger amplitudes of the attention-related P200, the negative slow wave (NSW), and the LPC. These findings highlight the role of stickers in different phases of spatial WM and provide new viewpoints for WM research on mental patients.
... This work Wheel of emotions (Plutchik, 1991) EL involves the classification of words in tweets into eight affective categories, based on the affective classification developed by Plutchik, by detecting hashtags that can be associated with emotions. A similar study was presented by Koto and Adriani (2015); however, these authors added synonyms and polarity and included other classification schemes in addition to the Plutchik model. Volkova et al. (2012) presented an affective lexicon called CLex that was created through crowdsourcing. ...
Article
Full-text available
Purpose This paper aims to propose a method for automatically labelling an affective lexicon with intensity values by using the WordNet Similarity (WS) software package, with the purpose of improving the results of an affective analysis process, which is relevant to interpreting the textual information available in social networks. The hypothesis states that it is possible to improve affective analysis by using a lexicon that is enriched with intensity values obtained from similarity metrics. Encouraging results were obtained when an affective analysis based on a labelled lexicon was compared with one based on a lexicon without intensity values. Design/methodology/approach The authors propose a method for the automatic extraction of the affective intensity values of words using the similarity metrics implemented in WS. First, the intensity values were calculated for words having an affective root in WordNet. Then, to evaluate the effectiveness of the proposal, the results of the affective analysis based on a labelled lexicon were compared to the results of an analysis with and without affective intensity values. Findings The main contribution of this research is a method for the automatic extraction of the intensity values of affective words used to enrich a lexicon, compared with the manual labelling process. The results obtained from the affective analysis with the new lexicon are encouraging, as they provide better performance than those achieved using a lexicon without affective intensity values. Research limitations/implications Given the restrictions on calculating the similarity between two words, the lexicon labelled with intensity values is a subset of the original lexicon, which means that a large proportion of the words in the corpus are not labelled in the new lexicon. Practical and social implications This work provides tools to improve the analysis of the feelings of social network users. In particular, it offers an affective lexicon that improves attempts to solve the problems of a digital society, such as the detection of cyberbullying: by achieving greater precision in the detection of emotions, it is possible to detect the roles of participants in a cyberbullying situation, for example, the bully and the victim. Affective lexicons are also important for detecting aggressiveness against women or gender violence and for detecting depressive states in young people and children. Originality/value The originality of the research lies in the proposed method for automatically labelling the words of an affective lexicon with intensity values by using WS. To date, a lexicon labelled with intensity values has been constructed using the opinions of experts, but that method is more expensive and requires more time than other existing methods. On the other hand, the new method developed herein is applicable to larger lexicons, requires less time and facilitates automatic updating.
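The intensity-labelling idea above can be sketched in a few lines of Python. This is only a minimal illustration: a toy hop-count table stands in for the WordNet Similarity package, and the words, hop counts and the 1/(1 + hops) formula are illustrative assumptions, not the paper's actual metrics.

```python
# Sketch: label lexicon words with an affective intensity equal to
# their similarity to an emotion "root" word. A toy hop-count table
# replaces the WordNet Similarity package (all entries illustrative).
HOPS_TO_ROOT = {
    ("furious", "anger"): 1,
    ("annoyed", "anger"): 3,
    ("ecstatic", "joy"): 1,
    ("content", "joy"): 4,
}

def toy_similarity(word: str, root: str) -> float:
    """Path-style similarity: 1 / (1 + hops), 0 if unrelated."""
    hops = HOPS_TO_ROOT.get((word, root))
    return 0.0 if hops is None else 1.0 / (1.0 + hops)

def label_intensity(lexicon, roots):
    """Attach to each word its best (root, intensity) pair, if any."""
    labelled = {}
    for word in lexicon:
        root, intensity = max(((r, toy_similarity(word, r)) for r in roots),
                              key=lambda rs: rs[1])
        if intensity > 0:  # words unrelated to every root stay unlabelled,
            labelled[word] = (root, intensity)  # as the abstract notes
    return labelled

print(label_intensity(["furious", "annoyed", "content", "table"],
                      ["anger", "joy"]))
```

Words with no affective root receive no label, mirroring the limitation the abstract describes (the labelled lexicon is a subset of the original).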
Article
Full-text available
Social media popularity and importance are on the increase as people use it for various types of social interaction across multiple channels. This systematic review focuses on the evolving research area of Social Opinion Mining, tasked with the identification of multiple opinion dimensions, such as subjectivity, sentiment polarity, emotion, affect, sarcasm and irony, from user-generated content represented across multiple social media platforms and in various media formats, like text, image, video and audio. Through Social Opinion Mining, natural language can be understood in terms of the different opinion dimensions, as expressed by humans. This contributes towards the evolution of Artificial Intelligence, which in turn helps the advancement of several real-world use cases, such as customer service and decision making. A thorough systematic review was carried out on Social Opinion Mining research, covering 485 published studies and spanning a period of twelve years between 2007 and 2018. The in-depth analysis focuses on the social media platforms, techniques, social datasets, language, modality, tools and technologies, and other aspects derived. Social Opinion Mining can be utilised in many application areas, ranging from marketing, advertising and sales for product/service management, and in multiple domains and industries, such as politics, technology, finance, healthcare, sports and government. The latest developments in Social Opinion Mining beyond 2018 are also presented together with future research directions, with the aim of leaving a wider academic and societal impact in several real-world applications.
Conference Paper
Full-text available
In this paper, we investigate Sentiment Analysis over Twitter, a well-known Social Media platform. The literature shows that some work related to Twitter Sentiment Analysis has been done and has delivered interesting feature ideas, but there is no comparative study that shows the best features for performing Sentiment Analysis. In total we used 9 feature sets (41 attributes) that comprise punctuation, lexical, part of speech, emoticon, SentiWord lexicon, AFINN-lexicon, Opinion lexicon, Senti-Strength method, and Emotion lexicon. Feature analysis was done by conducting supervised classification for each feature set, followed by feature selection in the subjectivity and polarity domains. Using four different datasets, the results reveal that the AFINN lexicon and the Senti-Strength method are the best current approaches for performing Twitter Sentiment Analysis.
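As a concrete illustration of how a valence lexicon such as AFINN is typically applied, the sketch below sums integer word scores over a tweet's tokens. The dictionary entries are a tiny made-up subset with plausible values, not the real AFINN-111 list.

```python
# AFINN-style scoring: each word carries an integer valence in [-5, 5]
# and a tweet's polarity is the sum over its tokens. The entries below
# are a small illustrative subset, not the real AFINN-111 lexicon.
AFINN_LIKE = {"love": 3, "great": 3, "happy": 3,
              "bad": -3, "hate": -3, "awful": -3}

def polarity(tweet: str) -> str:
    """Classify a tweet by the summed valence of its known tokens."""
    score = sum(AFINN_LIKE.get(tok, 0) for tok in tweet.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("I love this great phone"))  # positive
```

In practice the summed score (or its positive and negative parts) is often used directly as a classifier feature rather than thresholded, which is how lexicon features enter the supervised setup described above.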
Conference Paper
Full-text available
As one of the largest Social Media platforms providing public data every day, Twitter has attracted the attention of researchers who investigate it in order to mine public opinion, a task known as Sentiment Analysis. Consequently, many techniques and studies related to Sentiment Analysis over Twitter have been proposed in recent years. However, no study discusses the sentence patterns of positive/negative or subjective/objective sentences. In this paper we propose POS sequences as features to investigate the patterns or word combinations of tweets in two domains of Sentiment Analysis: subjectivity and polarity. Specifically, we utilize Information Gain to extract POS sequences in three forms: sequences of 2 tags, 3 tags, and 5 tags. The results reveal tendencies in sentence patterns that distinguish positive, negative, subjective and objective tweets. Our approach also shows that POS-sequence features can improve Sentiment Analysis accuracy.
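The POS-sequence idea can be sketched as follows: turn each tweet's tag sequence into tag n-grams, then rank each n-gram by its Information Gain against the class label. The tags, classes and tiny dataset below are illustrative stand-ins, not the paper's data.

```python
import math
from collections import Counter

def tag_ngrams(tags, n=2):
    """All contiguous n-grams of a POS tag sequence."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(dataset, ngram, n=2):
    """IG of the binary feature 'tweet contains this tag n-gram'."""
    labels = [y for _, y in dataset]
    with_f = [y for tags, y in dataset if ngram in tag_ngrams(tags, n)]
    without = [y for tags, y in dataset if ngram not in tag_ngrams(tags, n)]
    cond = sum(len(part) / len(labels) * entropy(part)
               for part in (with_f, without) if part)
    return entropy(labels) - cond

# toy corpus: (POS tag sequence, subjectivity label)
data = [(["PRP", "VBP", "JJ"], "subjective"),
        (["PRP", "VBP", "NN"], "subjective"),
        (["NN", "VBZ", "NN"], "objective"),
        (["NN", "VBD", "NN"], "objective")]
print(information_gain(data, ("PRP", "VBP")))  # 1.0: perfectly splits the classes
```

Ranking all observed 2-, 3- and 5-tag sequences this way yields the discriminative patterns the abstract refers to.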
Conference Paper
Full-text available
There is high demand for automated tools that assign polarity to microblog content such as tweets (Twitter posts), but this is challenging due to the terseness and informality of tweets in addition to the wide variety and rapid evolution of language in Twitter. It is thus impractical to use standard supervised machine learning techniques dependent on annotated training examples. We do without such annotations by using label propagation to incorporate labels from a maximum entropy classifier trained on noisy labels and knowledge about word types encoded in a lexicon, in combination with the Twitter follower graph. Results on polarity classification for several datasets show that our label propagation approach rivals a model supervised with in-domain annotated tweets, and it outperforms the noisily supervised classifier it exploits as well as a lexicon-based polarity ratio classifier.
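The core of the label propagation approach can be sketched with a simple iterative scheme: seed nodes carry noisy polarity scores (e.g. from a lexicon or a maximum entropy classifier) and every other node repeatedly averages its neighbours' scores until the values settle. The graph and seed values below are illustrative, not the paper's follower data or exact algorithm.

```python
# Sketch of label propagation over a follower graph: seeds are
# clamped to their noisy labels; all other nodes repeatedly take
# the mean of their neighbours' scores (toy Gauss-Seidel sweeps).
def propagate(graph, seeds, iters=50):
    scores = {n: seeds.get(n, 0.0) for n in graph}
    for _ in range(iters):
        for node in graph:
            if node in seeds:  # clamp seed nodes to their labels
                continue
            nbrs = graph[node]
            if nbrs:
                scores[node] = sum(scores[v] for v in nbrs) / len(nbrs)
    return scores

# toy follower graph (undirected for simplicity)
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
seeds = {"a": +1.0, "d": -1.0}  # a known-positive, d known-negative
scores = propagate(graph, seeds)
print({n: round(s, 2) for n, s in scores.items()})
# {'a': 1.0, 'b': 0.33, 'c': -0.33, 'd': -1.0}
```

Unlabelled nodes end up with a polarity interpolated between the seeds they are connected to, which is how follower-graph structure supplements the noisy classifier labels.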
Article
Full-text available
Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We conducted experiments on how to formulate the emotion-annotation questions, and show that asking if a term is associated with an emotion leads to markedly higher inter-annotator agreement than that obtained by asking if a term evokes an emotion.
Article
Full-text available
We examine sentiment analysis on Twitter data. The contributions of this paper are: (1) We introduce POS-specific prior polarity features. (2) We explore the use of a tree kernel to obviate the need for tedious feature engineering. The new features (in conjunction with previously proposed features) and the tree kernel perform approximately at the same level, both outperforming the state-of-the-art baseline. We investigate two kinds of models: a feature based model and a tree kernel based model. For the feature based model we use some of the features proposed in past literature and propose new features. For the tree kernel based model we design a new tree representation for tweets. We use a unigram model, previously shown to work well for sentiment analysis for Twitter data, as our baseline. Our experiments show that a unigram model is indeed a hard baseline, achieving over 20% over the chance baseline for both classification tasks. Our feature based model that uses only 100 features achieves similar accuracy as the unigram model that uses over 10,000 features. Our tree kernel based model outperforms both these models by a significant margin. We also experiment with a combination of models: combining unigrams with our features and combining our features with the tree kernel. Both these combinations outperform the unigram baseline by over 4% for both classification tasks. In this paper, we present extensive feature analysis of the 100 features we propose. Our experiments show that Twitter-specific features (emoticons, hashtags etc.) add value to the classifier but only marginally. Features that combine prior polarity of words with their parts-of-speech tags are most important for both the classification tasks. Thus, we see that standard natural language processing tools are useful even in a genre which is quite different from the genre on which they were trained (newswire). Furthermore, we also show that the tree kernel model performs roughly as well as the best feature based models, even though it does not require detailed feature engineering.
Conference Paper
Full-text available
Twitter sentiment analysis or the task of automatically retrieving opinions from tweets has received an increasing interest from the web mining community. This is due to its importance in a wide range of fields such as business and politics. People express sentiments about specific topics or entities with different strengths and intensities, where these sentiments are strongly related to their personal feelings and emotions. A number of methods and lexical resources have been proposed to analyze sentiment from natural language texts, addressing different opinion dimensions. In this article, we propose an approach for boosting Twitter sentiment classification using different sentiment dimensions as meta-level features. We combine aspects such as opinion strength, emotion and polarity indicators, generated by existing sentiment analysis methods and resources. Our research shows that the combination of sentiment dimensions provides significant improvement in Twitter sentiment classification tasks such as polarity and subjectivity.
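The meta-level feature idea can be sketched as follows: run several base sentiment resources on a tweet and stack their outputs into a single feature vector for a downstream classifier. The three base scorers here are stand-ins for the polarity, strength and emotion resources the abstract mentions; all dictionary entries and scores are illustrative.

```python
# Toy stand-ins for base sentiment resources (illustrative entries):
POLARITY = {"love": 1, "hate": -1}            # polarity indicator
STRENGTH = {"love": 3, "hate": 4}             # opinion-strength scores
EMOTION = {"love": "joy", "hate": "anger"}    # word-emotion lexicon
EMOTIONS = ["joy", "anger", "fear", "sadness"]

def meta_features(tweet: str):
    """Stack the outputs of all base resources into one feature vector."""
    toks = tweet.lower().split()
    pol = sum(POLARITY.get(t, 0) for t in toks)
    strength = max((STRENGTH.get(t, 0) for t in toks), default=0)
    emo_counts = [sum(EMOTION.get(t) == e for t in toks) for e in EMOTIONS]
    return [pol, strength] + emo_counts

print(meta_features("i love love this"))  # [2, 3, 2, 0, 0, 0]
```

Such stacked vectors would then be fed to a standard classifier, which is the "meta-level" combination the abstract reports as improving polarity and subjectivity classification.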
Article
As interaction between humans and computers continues to develop toward the most natural form possible, it becomes more and more urgent to incorporate emotion in the equation. The field continues to develop, yet exploration of the subject in Indonesian remains very limited. This paper presents the first study of emotion recognition in Indonesian, including the construction of the first emotionally colored speech corpus in the language, and the building of an emotion classifier through an optimized machine learning process. We construct our corpus using television talk show recordings on various topics of discussion, yielding colorful emotional utterances. In our machine learning experiment, we employ the support vector machine (SVM) algorithm with feature selection and parameter optimization to ensure the best resulting model possible. Evaluation of the experiment result shows recognition accuracy of 68.31% at best.
Conference Paper
This paper presents our experimental work on performance evaluation of the SentiWordNet approach for document-level sentiment classification of Movie reviews and Blog posts. We have implemented SentiWordNet approach with different variations of linguistic features, scoring schemes and aggregation thresholds. We used two pre-existing large datasets of Movie Reviews and two Blog post datasets on revolutionary changes in Libya and Tunisia. We have computed sentiment polarity and also its strength for both movie reviews and blog posts. The paper also presents an evaluative account of performance of the SentiWordNet approach with two popular machine learning approaches: Naïve Bayes and SVM for sentiment classification. The comparative performance of the approaches for both movie reviews and blog posts is illustrated through standard performance evaluation metrics of Accuracy, F-measure and Entropy.
Article
Reviews research on data mining (DM) and knowledge discovery (KD). Discusses popular conceptions of information growth, the knowledge acquisition process, major elements of the DM process, and evaluation methods. Research on DM and KD is divided into two categories: analysis of numerical databases and analysis of non-numerical or textual databases. Contains 99 references. (AEF)