The Use of POS Sequence for Analyzing Sentence
Pattern in Twitter Sentiment Analysis
Fajri Koto and Mirna Adriani
Faculty of Computer Science
University of Indonesia
Depok, Jawa Barat, Indonesia 16423
Email: fajri91@ui.ac.id, mirna@cs.ui.ac.id
Abstract—As one of the largest social media platforms providing
public data every day, Twitter has attracted researchers who
mine public opinion from it, a task known as Sentiment Analysis.
Consequently, many techniques and studies for Sentiment Analysis
over Twitter have been proposed in recent years. However, no
study has examined the sentence patterns of positive/negative
or subjective/objective sentences. In this paper we propose POS
sequences as a feature for investigating the patterns, or word
combinations, of tweets in two domains of Sentiment Analysis:
subjectivity and polarity. Specifically, we use Information Gain
to extract POS sequences in three forms: sequences of 2 tags,
3 tags, and 5 tags. The results reveal tendencies in sentence
pattern that distinguish positive from negative and subjective
from objective tweets. Our approach also shows that POS sequence
features can improve Sentiment Analysis accuracy.
Keywords—social media, twitter, sentiment analysis, POS sequence,
subjectivity, polarity
I. INTRODUCTION
The growing popularity of social media brings
an overwhelming amount of data published by people.
According to Statistic Brain¹, Twitter² is one of the world's
largest social media platforms, with more than 600 million active
users in 2014. Twitter is a microblogging environment that allows
users to post free text of up to 140 characters,
called a tweet. As of 2014, an average of 58 million tweets
are posted per day. Recent estimates indicate that one
out of five tweets discusses products or brands [1]. The
abundance of user-generated content published through
such social media makes automated information-monitoring
tools crucial for today's businesses [2].
In information retrieval, the automatic prediction of the
sentiment of textual data is known as Sentiment Analysis.
This field spans a broad area of natural language processing,
computational linguistics, and text mining. Typically, the goal
is to determine the polarity of natural language texts [2].
Sentiments can be categorized either into two categories,
positive and negative, or onto an n-point scale, e.g., very
good, good, satisfactory, bad, very bad. In this respect, a
sentiment analysis task can be interpreted as a classification
task in which each category represents a sentiment [3]. In this
work, we follow Bravo-Marquez et al., who roughly divide
these tasks into two categories: 1) subjectivity classification
and 2) polarity classification [4].
¹ http://www.statisticbrain.com/
² http://www.twitter.com
Subjectivity classification involves discriminating between
subjective and objective utterances. Formally, it can
be defined as follows: given a collection of tweets T and the set
of binary subjectivity classes S = {subjective, objective},
the goal is to approximate the unknown target function
F: T → S, which is called the binary subjectivity
classifier. An objective utterance commonly contains facts, while
a subjective one reflects a private point of view, emotion, or
belief [5]. In polarity classification, three sentiment classes
are used: positive, negative, and neutral. According
to Aisopos et al., the classification can be defined as two
problems: 1) classes P1 = {positive, negative} for binary
polarity classification and 2) classes P2 = {positive, negative,
neutral} for general polarity classification. The goal is again to
approximate the unknown target functions F1: T → P1 and
F2: T → P2 [6].
Many approaches have been proposed to approximate the
unknown target function of Sentiment Analysis. However,
among the existing approaches, no study examines the
sentence patterns of positive/negative or
subjective/objective tweets. We hypothesize that there may
be tendencies of pattern or word combination that
distinguish tweets in both the polarity and subjectivity domains.
For instance, people tend to write "I like this phone" rather
than "I don't hate this phone". It is uncommon for people
to use a negation word when expressing a positive impression.
In the subjectivity domain, the presence of adverbs and adjectives
also commonly differs between subjective and objective
sentences. People tend to use adjectives or adverbs rather than
nouns when expressing an opinion, emotion, or belief.
Therefore, in this study we propose Part of Speech (POS)
sequences to investigate this issue. Part of speech is a linguistic
category of words that is generally defined by their syntactic
behavior. We combine consecutive POS tags and call the result a
POS sequence, in order to investigate the patterns of word
combination that commonly appear in tweets containing sentiment.
Specifically, we conduct experiments with sequences
of 2, 3, and 5 tags. For each type of sequence, we calculate
the Information Gain and then use the top-k sequences. In
addition, we perform supervised classification in which
we combine POS sequences with a previous method for Twitter
sentiment analysis.
The rest of this paper is structured as follows. Section II
reviews related work. Section III describes our approach to
using POS sequences as a feature for Sentiment Analysis. The
experimental set-up is given in Section IV, while Section V
describes the experiment results, consisting of sentence pattern
analysis and POS sequence performance in sentiment
classification. Finally, conclusions are drawn in Section VI.
2015 29th International Conference on Advanced Information Networking and Applications Workshops
978-1-4799-1775-4/15 $31.00 © 2015 IEEE
DOI 10.1109/WAINA.2015.58
II. RELATED WORK
The first investigation of tweet sentiment was done by
Go et al., who used emoticons to annotate tweets
with sentiment labels [7]. A later study by Agarwal et al.
used tweets manually annotated with sentiment and applied a
unigram model for classification [8]. In other studies, Wang
et al. used hashtags³ to perform graph-based classification
[9], while Cui et al. analyzed the emoticons in tweets with a
graph propagation algorithm for emoticon weighting [10].
Several lexical resources for Sentiment Analysis have also been
released, such as OpinionFinder [11], SentiWordNet [12],
ANEW [14], AFINN [13], and the NRC emotion lexicon [15].
As discussed in the previous section, the purpose of our work is
to investigate the sentence patterns that distinguish tweets in
the sentiment domain. Although our effort is the first in Twitter
sentiment analysis, some previous work on POS sequences
has been published. Bendersky et al. used POS sequences as
one of the features for detecting memorable quotes in structured
documents such as books. Specifically, they used Information Gain
to select the top-i sequences and then performed supervised
quotable phrase detection using other lexical and syntactic
features [16]. Mukherjee et al. also proposed POS sequences
as a feature for gender classification of blog authors. The main
idea of their algorithm is to perform a level-wise search for
such patterns, which are POS sequences that satisfy minsup and
minadherence thresholds [17].
We realize that generating POS sequences from unstructured
documents like tweets is not a trivial matter. Applying a POS
tagger to Twitter needs care, since a tagger trained on
well-formed language loses accuracy when tested on Twitter data.
A study by Ritter et al. shows that POS tagger accuracy drops by
about 0.1-0.2 when it is applied to tweets. A key reason for this
drop in accuracy is that Twitter contains far more Out of Vocabulary
(OOV) words than grammatical text [18]. However, this is a
research area in itself. As a preliminary study, we start
our investigation by using an existing POS tagger to generate
POS sequences. We argue that this reduction in accuracy is still
acceptable for conducting the investigation.
III. POS SEQUENCE AS SENTIMENT ANALYSIS FEATURE
A POS sequence is defined as a series of consecutive tags of
a fixed length. For example, the sentence
"I went to school yesterday", with POS tags
PRP-VBD-TO-NN-ADV, produces three sequences of 3 tags:
PRP-VBD-TO, VBD-TO-NN, and TO-NN-ADV. To retrieve the sequences
that best represent the dataset, a weighting mechanism is
required. In this work, following Bendersky et al., Information
Gain (Eq. 1 and Eq. 2) is used to select the top-i sequences.
$$IG(X, Y) = H(X) - H(X \mid Y) \tag{1}$$
$$H(X) = -\sum_{x} p(x) \log_2 p(x) \tag{2}$$
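Eq. 1 and Eq. 2 can be illustrated with a short sketch. It assumes, as the text goes on to define, that X is the presence or absence of one POS sequence in a tweet and Y is the tweet's class. The function names `entropy` and `information_gain` are illustrative and do not come from the paper.

```python
import math
from collections import Counter

def entropy(observations):
    """Shannon entropy in bits of a list of discrete observations (Eq. 2)."""
    total = len(observations)
    counts = Counter(observations)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(presence, classes):
    """IG(X, Y) = H(X) - H(X | Y) (Eq. 1).

    presence[i] is True when the sequence occurs in tweet i (X);
    classes[i] is the sentiment class of tweet i (Y).
    """
    total = len(presence)
    h_x = entropy(presence)
    # H(X | Y): entropy of X inside each class, weighted by class size.
    h_x_given_y = 0.0
    for label in set(classes):
        subset = [x for x, y in zip(presence, classes) if y == label]
        h_x_given_y += len(subset) / total * entropy(subset)
    return h_x - h_x_given_y
```

A sequence that occurs in every tweet of one class and in none of the other gets the maximum gain of 1 bit, while a sequence spread evenly over both classes gets 0, so ranking by this score favors sequences whose presence depends on the class.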
³ A word that starts with the # symbol and is used to mark keywords or topics
in a tweet.
Fig. 1. Sequence of n-tags Extraction
Technically, the POS sequence feature is written as
#IGSeq[i], which expresses the number of occurrences of the i-th
POS sequence in the tweet. In the Information Gain calculation,
X indicates the presence or absence of a POS sequence in the
current tweet, and Y indicates the type of tweet: positive or
negative for polarity classification, and subjective or objective
for subjectivity classification. Intuitively, the features #IGSeq[i]
measure how often a tweet contains POS sequences that are
indicative of a certain sentiment.
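The #IGSeq[i] features can be sketched as a count vector. This is a minimal illustration that assumes each feature is the number of times the i-th selected sequence occurs in the tweet's tag list. The helper names `ngrams` and `igseq_features` are hypothetical.

```python
from collections import Counter

def ngrams(tags, n):
    """All runs of n consecutive tags, joined with '-' as in the paper."""
    return ["-".join(tags[i:i + n]) for i in range(len(tags) - n + 1)]

def igseq_features(tags, selected):
    """Return one count per selected sequence, in the order given."""
    lengths = {len(seq.split("-")) for seq in selected}
    counts = Counter()
    for n in lengths:
        counts.update(ngrams(tags, n))
    return [counts[seq] for seq in selected]

# Tags of the example sentence "I went to school yesterday".
tags = ["PRP", "VBD", "TO", "NN", "ADV"]
print(igseq_features(tags, ["PRP-VBD-TO", "NN-NN-NN", "TO-NN-ADV"]))
```

Keeping the selected sequences in a fixed order gives every tweet a vector of the same length, which is what a classifier such as an SVM expects.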
The POS sequences are generated by the procedure
described in Fig. 1. The preprocessing stage includes: 1) removing
URLs and Twitter @account mentions, 2) removing non-alphabetic
symbols, 3) removing the RT phrase, and 4) converting
the tweet to lowercase. After that, the POS tagger
is applied. The dataset is then divided by
sentiment class in order to calculate the Information Gain of all
sequences. Finally, we select the top-i sequences and use them
as Sentiment Analysis features.
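The preprocessing and extraction steps above can be sketched as follows. The regular expressions are assumptions about how each step might be implemented, and the tagger is passed in as a callable that maps a token list to (token, tag) pairs. NLTK's `nltk.pos_tag` has that shape, but a toy lookup tagger is used here so that the example needs no model download.

```python
import re

def preprocess(tweet):
    """Apply the four cleaning steps; the exact patterns are assumptions."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", tweet)  # 1) URLs
    text = re.sub(r"@\w+", " ", text)                    # 1) @account mentions
    text = re.sub(r"\bRT\b", " ", text)                  # 3) RT phrase
    text = re.sub(r"[^A-Za-z\s]", " ", text)             # 2) non-alphabetic symbols
    return " ".join(text.lower().split())                # 4) lowercase

def extract_sequences(tweet, tagger, n):
    """Clean a tweet, tag it, and return every run of n consecutive tags."""
    tokens = preprocess(tweet).split()
    tags = [tag for _, tag in tagger(tokens)]
    return ["-".join(tags[i:i + n]) for i in range(len(tags) - n + 1)]

# Toy tagger for the paper's example sentence (hypothetical stand-in for a real tagger).
TOY_TAGS = {"i": "PRP", "went": "VBD", "to": "TO", "school": "NN", "yesterday": "ADV"}

def toy_tagger(tokens):
    return [(tok, TOY_TAGS.get(tok, "NN")) for tok in tokens]

print(extract_sequences("I went to school yesterday", toy_tagger, 3))
```

Passing the tagger as an argument keeps the sketch independent of any particular tagging library.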
IV. EXPERIMENTAL SET-UP
A. Datasets
The experiments were conducted in two domains, polarity
and subjectivity, and used five different datasets: 1) Stanford
Twitter Sentiment (STS)⁴, which was used by Go et al. [7], 2)
Sanders⁵, 3) Health Care Reform (HCR)⁶ and 4) Obama-McCain
Debate (OMD)⁶, both of which were used by Speriosu et al. [19], and
5) International Workshop SemEval 2013 (SemEval)⁷ data.
Each tweet in these datasets is labeled positive, negative, or
neutral. A summary of all datasets is given in Table I.
TABLE I. DATASETS STATISTIC
            STS    Sanders   HCR     OMD     SemEval
#negative   177    635       784     1582    896
#positive   182    555       368     844     2341
#neutral    139    2293      280     813     2256
#total      498    3483      1432    3239    5493
⁴ http://cs.standford.edu/people/alecmgo/trainingtestdata.zip
⁵ http://www.sanalytics.com/lab/twitter-sentiment
⁶ https://bitbucket.org/speriosu/updown/src/5de483437466/data?at=default
⁷ http://www.cs.york.ac.uk/semeval-2013/
TABLE II. BALANCED DATASET
Subjectivity   STS    Sanders   HCR    OMD     SemEval
#subjective    139    1190      280    800     2256
#objective     139    1190      280    800     2256
#total         278    2380      560    1600    4512
Polarity       STS    Sanders   HCR    OMD     SemEval
#negative      177    555       368    800     896
#positive      177    555       368    800     896
#total         354    1110      736    1600    1792
In this work, we perform binary classification and tackle
class imbalance by sampling the tweets in Table I. Polarity
classification uses only the positive and negative
labels, while our subjectivity classification treats neutral
tweets as objective and positive/negative tweets as subjective.
For example, the class imbalance in STS is tackled by sampling
177 of the 182 positive tweets for polarity classification,
whereas we use only 139 subjective and 139 objective tweets to
conduct subjectivity classification. A summary of the balanced data
is given in Table II.
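The balancing step can be sketched as downsampling every class to a common size. The paper says only that tweets were sampled, so uniform random sampling without replacement, the `seed` parameter, and the optional `size` cap are assumptions made for this illustration.

```python
import random

def balance(tweets_by_class, size=None, seed=0):
    """Downsample each class to the same number of tweets.

    By default the target is the size of the smallest class; `size`
    can cap it further (Table II uses 800 per class for OMD).
    """
    rng = random.Random(seed)
    smallest = min(len(tweets) for tweets in tweets_by_class.values())
    target = smallest if size is None else min(size, smallest)
    return {label: rng.sample(tweets, target)
            for label, tweets in tweets_by_class.items()}

# STS polarity counts from Table I: 182 positive, 177 negative (placeholder tweets).
sts = {"positive": [f"p{i}" for i in range(182)],
       "negative": [f"n{i}" for i in range(177)]}
balanced = balance(sts)
print({label: len(tweets) for label, tweets in balanced.items()})
```

With equal class sizes, a classifier that always predicts one class scores 50% accuracy, which makes the reported accuracies easier to compare across datasets.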
B. Experiment Stage
Fig. 2. Stage of experiments
Experiments followed the stages
described in Figure 2. First, we used NLTK for Python [20] as the
POS tagger to convert the data into POS tag form. From the tweet
datasets we then extracted sequences of n tags following the
steps in Figure 1. We experimented with three forms of POS
sequence, n = 2, n = 3, and n = 5, and
selected only the top-100 sequences of n tags by
Information Gain. After that, SVM weighting was applied to the
extracted data in order to filter them down to the top-10.
This procedure was applied to each of the five datasets described in
the previous subsection. Consequently, five variations of top-10
sequences were generated. We then selected the sequences that
appear in two or more of the top-10 lists in order to construct
the analysis of sentence patterns in Twitter Sentiment Analysis.
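The last step, keeping the sequences that appear in two or more top-10 lists, can be sketched with a counter. The per-dataset lists below are shortened, made-up examples rather than the paper's actual top-10 lists, and `shared_sequences` is a hypothetical helper name.

```python
from collections import Counter

def shared_sequences(top_lists, min_datasets=2):
    """Map each sequence to the number of datasets whose top list contains it,
    keeping only those found in at least `min_datasets` lists."""
    counts = Counter(seq for top in top_lists.values() for seq in set(top))
    return {seq: c for seq, c in counts.items() if c >= min_datasets}

# Illustrative top lists only; real lists would hold ten sequences per dataset.
tops = {
    "STS": ["RB-VB", "RB-JJ"],
    "Sanders": ["RB-VB", "NN-NNS"],
    "HCR": ["RB-VB", "RB-JJ"],
}
print(shared_sequences(tops))
```

The count returned for each sequence corresponds to what the result tables report in their #Dataset column.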
In addition, we conducted sentiment classification using
the top-100 POS sequences selected by Information Gain.
We performed 5-fold cross validation with 80% of the tweets as the
training set and the remainder as the test set. For
each fold in the cross validation, the POS sequences were extracted
from the training set. Here, we used LibSVM [21] in the open
source tool Rapidminer⁸ [22] to classify the tweet datasets. As a
baseline, we used AFINN [13], a lexicon containing 2477
English words, constructed based on the Affective Norms
for English Words lexicon (ANEW) proposed by Bradley and
Lang [14]. Bravo-Marquez et al. also used this lexicon as a
baseline in their latest study [4]. The choice is motivated by its
good performance in sentiment classification over
Twitter.
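The cross-validation protocol can be sketched in plain Python. The point it illustrates is that sequence selection is repeated inside every fold using training tweets only. The fold assignment (interleaved, no shuffling) and the two callback parameters `select_features` and `train_and_score` are assumptions for illustration; the paper ran this step in Rapidminer.

```python
def k_fold_splits(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k folds."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, test

def cross_validate(tweets, labels, select_features, train_and_score, k=5):
    """Average the score over k folds, selecting features per fold."""
    scores = []
    for train, test in k_fold_splits(len(tweets), k):
        # Selection sees the training fold only, so nothing leaks from test tweets.
        selected = select_features([tweets[i] for i in train],
                                   [labels[i] for i in train])
        scores.append(train_and_score(selected, train, test))
    return sum(scores) / len(scores)
```

With k = 5, each fold trains on four fifths of the tweets and tests on the remaining fifth, matching the 80% training share described above.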
V. EXPERIMENT RESULT
A. Sentence Pattern of Tweet Containing Sentiment
After applying SVM weighting to all five datasets in the three
forms of sequence, following Figure 2, 3x5 variations of top-10
POS sequences are produced. For the analysis, we selected the
sequences that appear in two or more datasets. We provide the
results in Table III for the subjectivity domain and Table IV for
the polarity domain. These tables give the results for sequences
of 2 and 3 tags. The #Dataset column gives the number of
datasets in which a given sequence appears in the top-10.
Unlike sequences of 2 and 3 tags, the results for 5-tag
sequences are difficult to interpret. The cause is that the number
of possible POS sequences grows with n. As a result,
the extracted POS sequences are very sparse, and the feature
vectors are close to the zero vector. Therefore, we do not show
these results in the tables.
We also provide a frequency column to show the tendency
of each word combination between the two classes. The column gives
the number of occurrences of a sequence across all datasets, by
sentiment class. For example, in Table III the sequence RB-VBG
has a subjective frequency of 258. This number means that
RB-VBG appears 258 times in all of our subjective tweets.
TABLE III. WORD COMBINATION IN SUBJECTIVITY DOMAIN
Sequence of 2 tags   Description            #Dataset   Frequency    Frequency
                                                       Subjective   Objective
RB-VBG               Adverb-Verb            4          258          124
RB-VB                Adverb-Verb            3          709          381
RB-JJ                Adverb-Adj             3          477          343
NN-PRP               Noun-Pronoun           2          1506         981
VBZ-VBG              Verb-Verb              2          221          177
PRP-VBP              Pronoun-Verb           2          1702         1189
NN-NNS               Noun-Noun              2          436          541
RB-NN                Adverb-Noun            2          296          235
VBP-JJ               Verb-Adj               2          326          269
Sequence of 3 tags   Description            #Dataset   Frequency    Frequency
                                                       Subjective   Objective
NN-NN-PRP            Noun-Noun-Pronoun      3          593          374
RB-JJ-NN             Adverb-Adj-Noun        3          247          181
PRP-VBP-JJ           Pronoun-Verb-Adj       3          209          105
NN-NN-IN             Noun-Noun-Conj         2          1356         1608
NN-PRP$-NN           Noun-Possessive-Noun   2          143          96
IN-DT-JJ             Conj-Det-Adj           2          321          337
MD-RB-VB             Modal-Adverb-Verb      2          333          131
NN-NN-NN             Noun-Noun-Noun         2          2933         3368
NN-VBZ-RB            Noun-Verb-Adverb       2          143          98
1) Subjectivity: The results in Table III reveal differences in
word combination between subjective and objective tweets. Among
the sequences of 2 tags, RB-VBG, RB-VB, RB-NN, RB-JJ, VBZ-VBG,
and VBP-JJ tend to be POS sequences of
subjective tweets. Examples such as "am happy", "am itchy",
"too big", "seriously hate", "so amazing", and "so boring" agree
with the purpose of a subjective utterance, which reflects a private
point of view, emotion, or opinion. Moreover, these sequences contain
adjectives and adverbs, which is in line with our hypothesis.
For objective tweets, in contrast, the table shows only a
tendency toward the POS sequence NN-NNS. This indicates
that objective tweets tend to have more nouns than subjective
tweets. Examples of this sequence are "city politics" and
"house correspondents".
In the results for 3-tag sequences, the selected sequences are
in line with those of 2 tags. Subjective tweets again contain
POS sequences with adverbs and adjectives (RB-JJ-NN,
PRP-VBP-JJ, MD-RB-VB, and NN-VBZ-RB), while objective
tweets again tend to have POS sequences made of nouns (NN-NN-IN
and NN-NN-NN).
⁸ http://www.rapidminer.com
TABLE IV. WORD COMBINATION IN POLARITY DOMAIN
Sequence of 2 tags   Description           #Dataset   Frequency   Frequency
                                                      Positive    Negative
RB-VB                Adverb-Verb           4          792         1346
NN-DT                Noun-Det              2          476         557
PRP-VBD              Pronoun-Verb          2          195         292
NN-PRP               Noun-Pronoun          2          834         853
PRP-RB               Pronoun-Adverb        2          156         237
VBZ-DT               Verb-Det              2          124         190
NN-WRB               Noun-WH(adverb)       2          52          131
Sequence of 3 tags   Description           #Dataset   Frequency   Frequency
                                                      Positive    Negative
VBZ-DT-NN            Verb-Det-Noun         3          54          116
PRP-VBP-RB           Pronoun-Verb-Adverb   3          123         236
NN-NN-IN             Noun-Noun-Conj        2          691         681
VBD-NN-NN            Verb-Noun-Noun        2          109         95
MD-RB-VB             Modal-Adverb-Verb     2          133         225
NN-DT-NN             Noun-Det-Noun         2          296         350
2) Polarity: As shown in Table IV, the results for sequences of
2 tags reveal that negative tweets tend to have the POS sequences
RB-VB, PRP-RB, and PRP-VBD. Examples of RB-VB, such as
"not love", "firmly believe", "ugly love", and "not regret",
indicate that tweets with negative sentiment tend to place an
affirmation word before the verb. Affirmation is also shown by the
POS sequence PRP-RB, which uses an adverb to affirm the
negativity. Examples of this sequence are "I highly",
"I seriously", "I never", "me crazy", and "I just". On the other
hand, the result for PRP-VBD reveals that people prefer to use the
past tense rather than the present tense when expressing negativity.
Affirmation words are also found in the results for sequences of
3 tags. Negative tweets tend to have the POS sequences PRP-VBP-RB
and MD-RB-VB. In our datasets, the words
commonly used to express negativity are "not" and "never".
Examples are "I am not", "can not wait", and "will never buy".
B. POS sequence to boost Sentiment Analysis Classification
To investigate the performance of POS sequences in sentiment
classification, we compared the AFINN lexicon alone with the
combination of POS sequences and the AFINN lexicon for both
classification tasks (subjectivity and polarity). The AFINN lexicon
is used by extracting two main features from each tweet, called APO
(AFINN Positivity) and ANE (AFINN Negativity). APO is
extracted by summing the scores of positive words (from 1 to 5),
while ANE is extracted by summing the scores of negative words
(from -5 to -1). The strength of AFINN for sentiment
classification over Twitter is that its vocabulary includes slang
and obscene words as well as acronyms and web jargon. The
AFINN and POS sequence features are combined simply
by concatenating them. Thus, 102 attributes were
used for the combination.
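The APO and ANE features and their concatenation with the POS sequence counts can be sketched as follows. The four-word lexicon is a tiny illustrative stand-in for the real AFINN word list, and the function names are hypothetical.

```python
# Illustrative stand-in for AFINN: word -> integer score in [-5, 5].
TOY_LEXICON = {"love": 3, "good": 3, "hate": -3, "bad": -3}

def afinn_features(tokens, lexicon):
    """Return [APO, ANE]: sums of the positive and of the negative word scores."""
    scores = [lexicon.get(tok, 0) for tok in tokens]
    apo = sum(s for s in scores if s > 0)
    ane = sum(s for s in scores if s < 0)
    return [apo, ane]

def combined_features(tokens, lexicon, sequence_counts):
    """Concatenate the two AFINN features with the POS sequence counts."""
    return afinn_features(tokens, lexicon) + list(sequence_counts)

tokens = ["i", "love", "this", "bad", "phone"]
vector = combined_features(tokens, TOY_LEXICON, [0] * 100)
print(vector[:2], len(vector))
```

Two AFINN features plus 100 sequence counts give the 102 attributes mentioned above.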
Fig. 3. The Accuracy of Polarity Classification using AFINN and
incorporation of AFINN and Top-100 sequence for each dataset
Fig. 4. The Accuracy of Subjectivity Classification using AFINN and
incorporation of AFINN and Top-100 sequence for each dataset
Given the results in the previous section, we discarded
sequences of 5 tags and used only sequences of 3 tags, as in
the work of Bendersky et al. on classifying memorable quotes [16].
As features for sentiment classification, we used the
top-100 sequences obtained from each training set. The results
of the experiment are shown in Figure 3 and Figure 4 and reveal
that the combination of AFINN and POS sequences
boosts the accuracy of the AFINN lexicon.
In polarity classification, all five datasets show an
improvement: 0.23%, 3.06%, 3.25%, 1.67%, and 2.37% for
STS, Sanders, HCR, SemEval, and OMD, respectively.
Positive results are also seen in subjectivity
classification, where the accuracies increase by 2.52%, 0.42%,
0.18%, 1.24%, and 11.44% for STS, Sanders, HCR, SemEval,
and OMD, respectively. These results allow us to affirm that
POS sequences can be used in sentiment classification
over Twitter for both subjectivity and polarity.
VI. CONCLUSION
In this study, we discussed the use of POS sequences in
Sentiment Analysis over Twitter in two domains: subjectivity
and polarity. To find the most suitable POS sequence
for uncovering sentence patterns, we conducted the study with
three variations of POS sequence (n = 2, n = 3, and
n = 5). In addition, we performed sentiment classification
by combining the AFINN lexicon and POS sequences.
In our first experiment, the results reveal that subjective
tweets tend to have word combinations consisting of adverbs
and adjectives. This is in line with the purpose of a subjective
utterance, which expresses emotion or a private point of view. In
contrast, objective tweets tend to have word combinations of nouns,
which basically aim to express a fact or neutrality rather than
emotion. In the polarity domain, negative tweets tend
to have word combinations with affirmation words, which often
appear as negation words. In the second experiment, the results
show that POS sequence features boost
accuracy when combined with AFINN.
This affirms that POS sequences can be used for
Sentiment Analysis over Twitter.
REFERENCES
[1] B. J. Jansen, M. Zhang, K. Sobel, and A. Chowdury, “Twitter power:
Tweets as electronic word of mouth”. In Journal of the American society
for information science and technology 60.11, 2009, pp. 2169-2188.
[2] A. Hogenboom, D. Bal, F. Frasincar, M. Bal, F. de Jong, and U.
Kaymak, “Exploiting emoticons in sentiment analysis”. In Proc. of the
28th Annual ACM Symposium on Applied Computing, 2013, pp. 703-710.
[3] R. Prabowo, and M. Thelwall, “Sentiment Analysis: A Combined
Approach”. In Journal of Informetrics 3.2, 2009, pp. 143-157.
[4] F. Bravo-Marquez, M. Mendoza, and B. Poblete, “Combining strengths,
emotions and polarities for boosting Twitter sentiment analysis”. In
Proc. of the Second International Workshop on Issues of Sentiment
Discovery and Opinion Mining, 2013.
[5] S. Raaijmakers, and W. Kraaij, “A Shallow Approach to Subjectivity
Classification”. In ICWSM, 2008.
[6] F. Aisopos, G. Papadakis, K. Tserpes, and T. Varvarigou, “Content vs.
context for sentiment analysis: a comparative analysis over microblogs.”
In Proc. of the 23rd ACM conference on Hypertext and social media,
2012, pp. 187-196.
[7] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification
using distant supervision.” In CS224N Project Report, Stanford, 2009,
pp. 1-12.
[8] A. Agarwal, B. Xie, I. Vovsha, O. Rambow, and R. Passonneau,
“Sentiment analysis of twitter data”. In Proc. of the Workshop on
Languages in Social Media, 2011, pp. 30-38.
[9] X. Wang, F. Wei, X. Liu, M. Zhou, and M. Zhang, “Topic sentiment
analysis in twitter: a graph-based hashtag sentiment classification
approach”. In Proc. of the 20th ACM international conference on
Information and knowledge management, 2011, pp. 1031-1040.
[10] A. Cui, M. Zhang, Y. Liu, and S. Ma, “Emotion tokens: Bridging the gap
among multilingual twitter sentiment analysis”. In Proc. Information
retrieval technology, Springer, Berlin Heidelberg, 2011, pp. 238-249.
[11] T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing contextual polarity
in phrase-level sentiment analysis”. In Proc. of the conference on
human language technology and empirical methods in natural language
processing, 2005, pp. 347-354.
[12] A. Esuli, and F. Sebastiani, “Sentiwordnet: A publicly available lexical
resource for opinion mining”. In Proc. of LREC Vol. 6, 2006, pp. 417-
422.
[13] F. A. Nielsen, “A new ANEW: Evaluation of a word list
for sentiment analysis in microblogs”. 2011, Available at
http://arxiv.org/abs/1103.2903
[14] M. M. Bradley, and P. J. Lang, “Affective norms for English words
(ANEW): Instruction manual and affective ratings”. Technical Report
C-1, The Center for Research in Psychophysiology, University of
Florida, 1999.
[15] S. M. Mohammad, and P. D. Turney, “Crowdsourcing a word-emotion
association lexicon”. In Computational Intelligence, 2013, pp. 436-465.
[16] M. Bendersky, and D. A. Smith, “A dictionary of wisdom and wit:
Learning to extract quotable phrases.” In Proc. NAACL-HLT 2012, 2012,
pp. 69.
[17] A. Mukherjee, and B. Liu, “Improving Gender Classification of Blog
Authors”. In Proc. of the 2010 Conference on Empirical Methods in
Natural Language Processing, pp. 207-217.
[18] A. Ritter, S. Clark, Mausam, and O. Etzioni, “Named Entity Recognition
in Tweets: An Experimental Study”. In Proc. of the 2011 Conference
on Empirical Methods in Natural Language Processing, pp. 1524-1534.
[19] M. Speriosu, N. Sudan, S. Upadhyay, J. Baldridge, “Twitter polarity
classification with label propagation over lexical links and the follower
graph”. In Proc. of the EMNLP First workshop on Unsupervised
Learning in NLP. Edinburgh, Scotland, 2011.
[20] S. Bird, “NLTK: the natural language toolkit”. In Proc. of the COL-
ING/ACL on Interactive presentation sessions, 2006, pp. 69-72.
[21] C. C. Chang, and C. J. Lin, “LIBSVM: a library for support vector
machines”. In ACM Transactions on Intelligent Systems and Technology
(TIST), 2011, 2(3), 27.
[22] F. Akthar, and C. Akthar, “RapidMiner 5 Operator Reference”, 2012.