HBE: Hashtag-Based Emotion Lexicons for
Twitter Sentiment Analysis
Fajri Koto
Faculty of Computer Science
University of Indonesia
Depok, Jawa Barat, Indonesia
fajri91@ui.ac.id
Mirna Adriani
Faculty of Computer Science
University of Indonesia
Depok, Jawa Barat, Indonesia
mirna@cs.ui.ac.id
ABSTRACT
In this paper we report the first effort to construct an emotion lexicon using Twitter as the source of data. Specifically, we used the hashtag feature to obtain English tweets labeled with a particular emotion. Our work uses eight emotion classes, namely angry, disgust, fear, joy, sad, surprise, trust and anticipation, which refer to Plutchik's wheel. To obtain the lexicon, we first ranked the words of each class by term frequency. We then removed low-frequency words to reduce irrelevant entries, enriched the lexicon with synonyms, and filtered it against a sentiment word list (40,288 words). As a result, we constructed four Hashtag-Based Emotion (HBE) lexicons through different procedures, called HBE-A1 (50,613 words), HBE-B1 (23,400 words), HBE-A2 (26,909 words) and HBE-B2 (14,905 words). In our experiments we used the lexicons for Twitter Sentiment Analysis, and the results reveal that the proposed emotion lexicons boost accuracy and even improve over the NRC-Emotion lexicon. It is also worth noting that our construction approach is simple, automatic, inexpensive and suitable for Social Media analysis.
Keywords
emotion lexicon, twitter, hashtag, sentiment analysis, subjectivity, polarity
1. INTRODUCTION
Emotion is a part of human communication that carries the psychological expression behind a message. Emotion shapes the identity of a message and determines how the listener reacts. In face-to-face communication, emotion can be distinguished not only through facial expression and voice intonation but also through the words used. For example, delightful and yummy indicate joy, gloomy and cry are indicative of sadness,
shout and boiling are indicative of anger, and so on. Thus,
it is obvious that words are associated with emotion [1].
Previous studies have shown that emotion is related to Sentiment Analysis: by incorporating emotion into sentiment classification, a system can achieve better performance [2]. Sentiment Analysis is a classification task that determines the polarity of text and belongs to the broad areas of natural language processing, computational linguistics and text mining [3]. In this work, we follow [2] and use two types of binary classification: 1) polarity classification with classes = {positive, negative}, and 2) subjectivity classification with classes = {subjective, objective}. As the names suggest, positive or negative sentences carry positive or negative polarity. An objective sentence usually presents information containing facts or news and lacks arguments, while a subjective sentence reflects a private point of view, emotion or belief [4].
Various studies of Sentiment Analysis have been carried out on structured and unstructured documents such as news [5], blogs [6], and forum reviews [7]. Due to its rapid growth, Social Media has also become an interesting domain for sentiment investigation: the abundance of text in Social Media yields critical information for today's business, and further enhancements and investigations are in progress to build more sophisticated systems. Driven by this fact, we study the correlation between emotion and sentiment in the Social Media domain, specifically Twitter^1, by building an emotion word list.
As the second largest social media platform today, Twitter had more than 600 million active users as of 2014^2. Twitter is a simple microblogging environment which allows users to post short free text, limited to 140 characters, called a tweet. Since 2014, more than 58 million tweets have been posted per day on average. A unique phenomenon of tweets is the use of one or more words immediately preceded by a hash symbol (#). These words are called hashtags and are often found in tweet data. A hashtag may serve many purposes, but most notably it indicates a topic. Often these words add to the information in the tweet, including the internal emotion of the author [8]. For instance: “when i’m sad, i listen to songs that make me even more sad. #sad”.
By utilizing the hashtag feature in Twitter, we build an emotion word list by first collecting corpora based on emotion classes indicated by hashtags. We then use our emotion lexicon to classify tweet sentiment and compare it with an existing emotion lexicon, NRC [1].
^1 http://www.twitter.com
^2 http://www.statisticbrain.com
2. RELATED WORK
The first study of tweet sentiment was conducted by Go et al., who utilized emoticons to annotate tweets with sentiment labels [9]. A later study by Agarwal et al. used manually annotated tweets and applied a unigram model for classification [10]. Koto et al. investigated the use of POS sequences in Twitter Sentiment Analysis to understand the sentence patterns of tweets carrying sentiment [18]. In another study, Bravo-Marquez et al. attempted to boost Sentiment Analysis by combining aspects such as opinion strength, emotion and polarity; their results show that incorporating emotion into sentiment analysis provides significant improvements [2].
In their work, Bravo-Marquez et al. used the English emotion lexicon published by Mohammad et al. as a feature to classify sentiment [1, 2]. This word list was built around eight basic emotion classes from human psychology. Their work refers to the Ekman emotions [11] and annotates words with the classes of Plutchik's wheel [12]: joy, trust, sadness, anger, surprise, fear, anticipation and disgust. This lexicon is known as the National Research Council Canada (NRC) Emotion lexicon and contains 8,258 unique words that were manually annotated through crowdsourcing on Amazon Mechanical Turk [1].
Even though the NRC lexicon is a large word list, some formal and informal words and phrases commonly used in Social Media are not included; “lol” and “wtf” are examples. We argue that such words should be considered, especially in Social Media analysis, since NRC was built from the Macquarie Thesaurus [13], a source constructed two decades before the Social Media era. Driven by this fact and by the insight from [8], we construct emotion corpora by using hashtags as labels. Following [8], we collect tweets with emotion labels and then extract the lexicon.
3. BUILDING THE EMOTION LEXICONS
Fig. 1 describes the procedure used to construct our emotion lexicons. We again use the eight basic emotions introduced by Plutchik as queries to collect tweets. For example, we use #angry as the query to obtain tweets relevant to anger. Since our preliminary research focuses on English, we apply an English language filter during crawling.
Figure 1: The construction stage of emotion lexicon
To obtain the raw data quickly, we used a conventional approach, searching for the emotion queries through a web browser instead of the API. The process was done manually, collecting raw data starting from November 22, 2013 and moving backward until the browser could not return any older tweets. From the HTML files we collected 5,316 tweets in total, as summarized in Table 1. For certain emotion classes we adjusted the crawling period; in particular, because of the very large number of Sad tweets available since 2008, we limited the backward crawl for the Sad label to February 19, 2013.
Table 1: Corpus of tweet with emotion hashtag label

Emotion       Oldest tweet   Total tweets
Angry         21-Jul-07      2759
Disgust       12-Mar-09      226
Fear          24-Oct-08      2967
Joy           29-Sep-08      5316
Sad           19-Feb-13      4906
Surprise      29-Mar-08      4212
Trust         18-Oct-12      2908
Anticipation  3-Dec-08       722
The raw data was processed through several preprocessing stages: 1) removing usernames containing the '@' entity and URLs in a tweet, 2) removing the “RT” phrase, 3) normalizing all words to lowercase, 4) removing all special characters, 5) stemming, 6) lemmatization, and 7) removing stopwords. Stemming and lemmatization were done with the NLTK library [14].
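The pipeline above can be sketched in Python with NLTK. This is a minimal illustration rather than the authors' actual code; the helper name preprocess and the choice of the Porter stemmer are our assumptions.

    # Hedged sketch of preprocessing steps 1)-7); assumes the NLTK data
    # packages (punkt, stopwords, wordnet) have been downloaded.
    import re
    from nltk.corpus import stopwords
    from nltk.stem import PorterStemmer, WordNetLemmatizer
    from nltk.tokenize import word_tokenize

    stemmer = PorterStemmer()          # choice of stemmer is an assumption
    lemmatizer = WordNetLemmatizer()
    stop_words = set(stopwords.words('english'))

    def preprocess(tweet):
        tweet = re.sub(r'@\w+|https?://\S+', ' ', tweet)   # 1) usernames and URLs
        tweet = re.sub(r'\bRT\b', ' ', tweet)              # 2) retweet marker
        tweet = tweet.lower()                              # 3) lowercase
        tweet = re.sub(r'[^a-z\s]', ' ', tweet)            # 4) special characters
        tokens = [lemmatizer.lemmatize(stemmer.stem(t))    # 5) stem, 6) lemmatize
                  for t in word_tokenize(tweet)]
        return [t for t in tokens if t not in stop_words]  # 7) stopwords

    print(preprocess("RT @user when i'm sad, i listen to songs that make me even more sad. #sad"))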
After these preprocessing steps, we counted the term frequency of each word within each emotion set. Based on the result, we removed irrelevant words by 1) discarding words with a term frequency of 1 and 2) keeping only the top 75% of the remaining words. The process continued by enriching the lexicon with synonym sets, again using the NLTK library in Python. As a result, we obtained the first emotion lexicon, called Hashtag-Based Emotion-A (HBE-A).
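As an illustration, the ranking, pruning and synonym-expansion steps could look roughly as follows. The function name build_hbe_a, the reuse of the preprocess helper sketched earlier, and the synonym_fraction parameter (which anticipates the A1/A2 variants described later) are our assumptions; only the frequency and top-75% thresholds come from the text.

    from collections import Counter
    from nltk.corpus import wordnet

    def build_hbe_a(tweets, synonym_fraction=1.0):
        # rank the words of one emotion class by term frequency
        counts = Counter(w for t in tweets for w in preprocess(t))
        ranked = [w for w, c in counts.most_common() if c > 1]   # drop frequency-1 words
        kept = ranked[:int(0.75 * len(ranked))]                  # keep the top 75%
        # enrich with WordNet synonyms (all kept words, or only a top fraction)
        expand = kept[:int(synonym_fraction * len(kept))]
        synonyms = {lemma.name().lower().replace('_', ' ')
                    for w in expand
                    for syn in wordnet.synsets(w)
                    for lemma in syn.lemmas()}
        return set(kept) | synonyms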
We argue that keeping only the words of HBE-A that carry polarity, as in a sentiment lexicon, may further improve its performance; moreover, emotion clearly has a strong correlation with sentiment. Therefore, in the last step, we filter the resulting lexicon against sentiment words obtained from 1) Bing Liu [21], 2) Wilson [22], 3) AFINN [16], and 4) SentiWordNet [23]. In total, 40,288 unique sentiment words were used for this final filtering. For SentiWordNet specifically, we selected words whose positive or negative score is not equal to 0 (non-neutral words). As a result of this filtering, the HBE-B lexicon was obtained.
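A hedged sketch of this filtering step is given below. The helper build_hbe_b and the loading of the merged sentiment word list are assumptions; only the rule of keeping SentiWordNet entries with a non-zero positive or negative score follows the text.

    from nltk.corpus import sentiwordnet as swn

    def non_neutral_sentiwordnet_words():
        # keep SentiWordNet entries whose positive or negative score is not 0
        words = set()
        for ss in swn.all_senti_synsets():
            if ss.pos_score() != 0 or ss.neg_score() != 0:
                words.update(l.lower().replace('_', ' ') for l in ss.synset.lemma_names())
        return words

    def build_hbe_b(hbe_a, other_sentiment_words):
        # other_sentiment_words: the merged Bing Liu, Wilson and AFINN word lists (assumed)
        sentiment_words = other_sentiment_words | non_neutral_sentiwordnet_words()
        return {w for w in hbe_a if w in sentiment_words}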
In total, four emotion lexicons were produced and used in the experiments: HBE-A1, HBE-B1, HBE-A2, and HBE-B2. They differ in how the synonym stage is applied: 1) we add synonyms of all words produced by the previous stage and label the results A1 and B1; 2) we add synonyms of only the top 25% of words, which results in A2 and B2. This was done based on the hypothesis that, instead of adding synonyms of all words, using only higher-frequency words can produce a better lexicon. Table 2 summarizes the HBE and NRC lexicons.
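Put together, the four variants can be read as two switches on the hypothetical helpers sketched above: full versus top-25% synonym expansion, with or without the sentiment filter.

    # tweets_by_emotion: dict mapping each emotion hashtag to its tweets (assumed)
    # sentiment_words: the merged 40,288-word sentiment list (assumed)
    hbe = {}
    for emotion, tweets in tweets_by_emotion.items():
        a1 = build_hbe_a(tweets, synonym_fraction=1.0)    # synonyms of all words
        a2 = build_hbe_a(tweets, synonym_fraction=0.25)   # synonyms of top-25% only
        hbe[emotion] = {'A1': a1, 'B1': build_hbe_b(a1, sentiment_words),
                        'A2': a2, 'B2': build_hbe_b(a2, sentiment_words)}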
Table 3: The accuracy (%) of subjectivity classification

Dataset   AFINN   +NRC    +HBE-A1  +HBE-B1  +HBE-A2  +HBE-B2
Sanders   61.89   63.40   63.95    59.87    64.41    64.66
HCR       60.36   60.89   65.00    60.54    64.64    62.32
SemEval   68.04   68.20   68.15    68.53    68.48    68.66
OMD       52.12   52.00   60.44    64.56    61.31    64.00
Table 4: The accuracy (%) of polarity classification

Dataset   AFINN   +NRC    +HBE-A1  +HBE-B1  +HBE-A2  +HBE-B2
Sanders   70.72   73.69   72.70    75.41    72.94    73.87
HCR       61.55   58.42   60.19    61.96    62.09    60.73
SemEval   73.88   75.22   76.17    76.28    76.34    76.06
OMD       62.25   63.31   64.31    63.94    66.12    65.56
Table 2: Number of words in the hashtag-based emotion and NRC lexicons

Emotion        HBE-A1   HBE-B1   HBE-A2   HBE-B2   NRC
Angry          7305     3444     3909     1978     1245
Disgust        909      482      331      1948     1055
Fear           7088     3440     3746     1975     1474
Joy            9447     4211     5275     2469     689
Sad            9555     4377     5530     2647     1191
Surprise       7788     3302     4078     1809     839
Trust          6042     2940     3068     1612     564
Anticipation   2479     1204     972      467      1231
Total          50613    23400    26909    14905    8258
Table 5: Balanced datasets

Subjectivity   Sanders   HCR   OMD    SemEval
#neutral       1190      280   800    2256
#objective     1190      280   800    2256
#total         2380      560   1600   4512

Polarity       Sanders   HCR   OMD    SemEval
#negative      555       368   800    896
#positive      555       368   800    896
#total         1110      736   1600   1792
4. EXPERIMENT
4.1 Experimental Set-Up
The experiments were conducted on two Sentiment Analysis tasks, polarity and subjectivity classification, and used four different datasets: 1) Sanders^3, 2) Health Care Reform (HCR)^4 and 3) Obama-McCain Debate (OMD), both of which were used by Speriosu et al. [15], and 4) the International Workshop SemEval-2013 (SemEval)^5 data. Each tweet in these datasets carries a positive, negative, or neutral tag. In this work, we performed binary classification and tackled class imbalance by sampling tweets. The datasets are summarized in Table 5.
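A minimal sketch of this balancing step, assuming each dataset is available as a mapping from class label to its tweets (the function name and the fixed seed are illustrative):

    import random

    def balance(examples_by_class, seed=0):
        # downsample every class to the size of the smallest one
        random.seed(seed)
        n = min(len(v) for v in examples_by_class.values())
        return {label: random.sample(v, n) for label, v in examples_by_class.items()}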
As a baseline we used AFINN [16], a lexicon containing 2,477 English words, constructed based on the Affective Norms for English Words (ANEW) lexicon proposed by Bradley and Lang [17]. This choice is motivated by its good performance in sentiment classification over Twitter: a recent comparative study of Twitter sentiment analysis [19] has also shown that AFINN is a good feature to use as a baseline.
^3 http://www.sanalytics.com/lab/twitter-sentiment
^4 https://bitbucket.org/speriosu/
^5 http://www.cs.york.ac.uk/semeval-2013/
4.2 Experiment Result
To investigate the performance of HBE in sentiment classification, we compared the AFINN lexicon alone with the incorporation of 1) AFINN and NRC, and 2) AFINN and HBE, for both tasks (subjectivity and polarity). In Table 3 and Table 4 these combinations are written as +NRC and +HBE. The AFINN lexicon is used by extracting two features from each tweet, called APO (AFINN positivity) and ANE (AFINN negativity): APO sums the scores of positive words (from 1 to 5), while ANE sums the scores of negative words (from -5 to -1). The emotion feature is simply extracted by counting the number of words that match the word list of the corresponding emotion class. The incorporation is done by concatenating both feature sets, so 10 attributes are used in total. Classification was performed with 5-fold cross validation, in which 80% of the tweets form the training set and the remainder the test set. We used LibSVM in the open-source tool RapidMiner^6 to classify the datasets.
^6 http://www.rapidminer.com
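The feature vector and the evaluation can be sketched as follows. The authors ran LibSVM inside RapidMiner; the sketch below substitutes scikit-learn's SVC (itself a LibSVM wrapper), and afinn_scores, hbe (one chosen HBE variant per emotion), preprocess, tweets and labels are the hypothetical resources and helpers assumed above.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    EMOTIONS = ['angry', 'disgust', 'fear', 'joy',
                'sad', 'surprise', 'trust', 'anticipation']

    def features(tweet):
        tokens = preprocess(tweet)
        scores = [afinn_scores.get(t, 0) for t in tokens]
        apo = sum(s for s in scores if s > 0)              # AFINN positivity
        ane = sum(s for s in scores if s < 0)              # AFINN negativity
        emo = [sum(t in hbe[e] for t in tokens) for e in EMOTIONS]
        return [apo, ane] + emo                            # 2 + 8 = 10 attributes

    # tweets and labels are assumed to be loaded from one of the datasets
    X = np.array([features(t) for t in tweets])
    y = np.array(labels)
    print(cross_val_score(SVC(), X, y, cv=5).mean())       # 5-fold cross validation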
Tables 3 and 4 show that incorporating AFINN and HBE improves the baseline accuracy more than incorporating AFINN and NRC, the existing emotion lexicon. In both tables, a result counts as an improvement when it is better than both the baseline and +NRC, and under this criterion the proposed lexicons achieve the best accuracy on every dataset.
In Table 3, three HBE-B variants and one HBE-A variant give the best subjectivity classification results. The AFINN accuracy increases significantly on the HCR and OMD data, by 4.64% and 12.44% respectively. This shows that filtering HBE-A with sentiment words helps the classifier recognize subjective tweets: selecting and using sentiment words may make it easier for the machine to identify subjective tweets, which tend to reflect a private point of view or opinion.
The second set of results, in Table 4, shows three HBE-A2 and one HBE-B1 as the best performers. The AFINN accuracy increases by 4.69%, 0.54%, 2.46%, and 3.87% for Sanders, HCR, SemEval, and OMD respectively. Here, the filtering with sentiment words may not give a major improvement, since the sentiment words were obtained by combining all types of polarity. However, applying synonyms only to the top 25% of the lexicon (A2 and B2) gives a better improvement than A1 and B1. A similar pattern appears in Table 3, where every score in the A2 and B2 columns improves over both the baseline and +NRC.
5. CONCLUSION
In this work we built emotion lexicons from Twitter data by using hashtags as labels to differentiate emotions. We call the resulting lexicons HBE lexicons and used them in Twitter sentiment classification as a preliminary investigation. The experimental results reveal better performance than the existing emotion lexicon, NRC. Our lexicons are more suitable for Social Media analysis since they were built from Twitter itself. Another advantage of our approach is that the lexicons can grow in size and coverage along with the growth of Twitter data. As future work, these emotion lexicons can be used to investigate emotion classification on Social Media or in other domains such as speech [20], news, and chat.
6. ACKNOWLEDGEMENT
Part of this work was supported by FredMan Foundation,
Samsung R&D Institute Indonesia.
7. REFERENCES
[1] S. M. Mohammad, and P. D. Turney. Crowdsourcing a
word-emotion association lexicon. In Computational
Intelligence, 29(3): 436–465, 2013.
[2] F. Bravo-Marquez, M. Mendoza, and B. Poblete.
Combining strengths, emotions and polarities for
boosting Twitter sentiment analysis. In Proceedings of
the Second International Workshop on Issues of
Sentiment Discovery and Opinion Mining, 2, 2013.
[3] W. J. Trybula. Data Mining and Knowledge
Discovery. In Annual review of information science
and technology (ARIST), 32: 197–229, 1997.
[4] S. Raaijmakers and W. Kraaij. A Shallow Approach to
Subjectivity Classification. In ICWSM, 2008.
[5] W. Zhang and S. Skiena. Trading Strategies to Exploit
Blog and News Sentiment. In ICWSM, 2010.
[6] V. K. Singh, R. Piryani, A. Uddin, and P. Waila.
Sentiment analysis of Movie reviews and Blog posts.
In Advance Computing Conference (IACC): 893–898,
2013.
[7] L. Shi, B. Sun, L. Kong, and Y. Zhang. Web forum
Sentiment analysis based on topics. In Computer and
Information Technology, 2: 148–153, 2009.
[8] S. M. Mohammad. #Emotional tweets. In Proceedings
of the Sixth International Workshop on Semantic
Evaluation, Association for Computational
Linguistics, 2012.
[9] A. Go, R. Bhayani, and L. Huang. Twitter sentiment
classification using distant supervision. In CS224N
Project Report, Stanford, 2009.
[10] A. Agarwal, B. Xie, I. Vovsha, O. Rambow, and R.
Passonneau. Sentiment analysis of twitter data. In
Proceedings of the Workshop on Languages in Social
Media: 30–38, 2011.
[11] P. Ekman. An argument for basic emotions. Cognition
and Emotion, 6(3-4): 169–200, 1992.
[12] R. Plutchik. The psychology and biology of emotion.
HarperCollins College Publishers, 1994.
[13] The Macquarie Thesaurus. Macquarie Library, 1986.
[14] S. Bird. NLTK: the natural language toolkit. In
Proceedings of the COLING/ACL on Interactive
presentation sessions: 69–72, 2006.
[15] M. Speriosu, N. Sudan, S. Upadhyay, and J.
Baldridge. Twitter polarity classification with label
propagation over lexical links and the follower graph.
In Proceedings of the First workshop on Unsupervised
Learning in NLP: 53–63, 2011.
[16] F. A. Nielsen. A new ANEW: Evaluation of a word
list for sentiment analysis in microblogs. In arXiv
preprint arXiv:1103.2903, 2011.
[17] M. M. Bradley and P. J. Lang. Affective norms for
English words (ANEW): Instruction manual and
affective ratings. In Technical Report C-1, The Center
for Research in Psychophysiology, University of
Florida, 1999.
[18] F. Koto and M. Adriani. The Use of POS Sequence for
Analyzing Sentence Pattern in Twitter Sentiment
Analysis. In Proceedings of the 29th WAINA (the
Eighth International Symposium on Mining and Web),
Gwangju, Korea: 547–551, 2015.
[19] F. Koto and M. Adriani. A Comparative Study on
Twitter Sentiment Analysis: Which Features are
Good?. In Natural Language Processing and
Information System, Springer International
Publishing, 9103: 453–457, 2015.
[20] N. Lubis, D. Lestari, A. Purwarianti, S. Sakti, and S.
Nakamura. Emotion recognition on Indonesian
television talk shows. In Proceedings of Spoken
Language Technology Workshop (SLT): 466–471, 2014.
[21] B. Liu, M. Hu and J. Cheng. Opinion observer:
analyzing and comparing opinions on the web. In
Proceedings of the 14th international conference on
World Wide Web: 342–351, 2005.
[22] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing
contextual polarity in phrase-level sentiment analysis.
In Proceedings of the conference on human language
technology and empirical methods in natural language
processing: 347–354, 2005.
[23] S. Baccianella, A. Esuli, and F. Sebastiani.
SentiWordNet 3.0: An Enhanced Lexical Resource for
Sentiment Analysis and Opinion Mining. In LREC,
10: 2200–2204, 2010.