Textual and Contextual Patterns
for Sentiment Analysis over Microblogs
Fotis Aisopos$, George Papadakis$,3, Konstantinos Tserpes$, Theodora Varvarigou$
3 L3S Research Center, Germany, papadakis@L3S.de
$ ICCS, National Technical University of Athens, Greece, {fotais, gpapadis, tserpes, dora}@mail.ntua.gr
ABSTRACT
Microblog content poses serious challenges to the applicability of sentiment analysis, due to its inherent characteristics. We introduce a novel method relying on content-based and context-based features, guaranteeing high effectiveness and robustness in the settings we are considering. The evaluation of our methods over a large Twitter data set indicates significant improvements over the traditional techniques.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information
Search and Retrieval—Information filtering; I.2.7 [Artificial
Intelligence]: Natural Language Processing—Text analysis
General Terms
Algorithms, Experimentation
Keywords
Sentiment Analysis, N-gram Graphs, Social Context, Social
Media
1. INTRODUCTION
The inherent characteristics of the text shared in social media pose challenges to the extraction of sentiment-expressive patterns [4, 5, 6] and call for a different approach than those commonly followed by existing Sentiment Analysis (SA) systems. These characteristics are sparsity (short free-form text), neologisms [2], noise (misspelled text) and multilinguality. Existing systems typically employ either discriminative (series of) words [1] or dictionaries that assess the meaning and the lexical category of specific words and phrases (e.g. SentiWordNet¹). We introduce a novel approach based on two complementary sources of evidence, which are language-neutral and robust to noise: a content-based approach using the n-gram graphs document representation model and a context-based approach relying on social graph connections to capture the mood expressed in the social context of each message. We apply our approach to a large Twitter dataset, compare the two sources of evidence and analytically examine how they perform in conjunction. We focus on effectiveness and efficiency, experimenting with multiple classification algorithms and configurations, aiming at the lowest possible processing time.
¹ http://sentiwordnet.isti.cnr.it
Copyright is held by the author/owner(s).
WWW 2012 Companion, April 16–20, 2012, Lyon, France.
ACM 978-1-4503-1230-1/12/04.
2. PROBLEM FORMULATION
In this work, we exclusively focus on document-level SA in the context of microblog posts, detecting the sentiment polarity of individual Twitter messages (tweets). This task can be divided into two distinct problems:
Binary Polarity Classification: Classify a document collection into two polarization classes, P_B = {negative, positive}.
General Polarity Classification: Classify a document collection into three polarization classes, P_G = {negative, neutral, positive}.
3. APPROACH
3.1 Content-based Models
The content-based approach extracts the polarity features of each document (tweet) using one of the following alternative representation models:
Term Vector Model: this model aggregates the set of distinct words contained in a document and represents it as a vector of their frequencies.
Punctuation Model: this model takes into account the punctuation and character-based features contained in a document: (i) number of special characters, (ii) number of “!”, (iii) number of quotes, (iv) number of “?”, (v) number of capitalized tokens, (vi) length in characters.
Character N-grams Model: this model comprises all substrings of length n in a document and constructs a vector of the n-gram frequencies.
Character N-gram Graphs Model: this model forms a graph whose nodes correspond to distinct n-grams, while its edges are weighted proportionally to the average distance, in terms of n-grams, between the adjacent nodes.
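The three vector-based models above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the function names are hypothetical, tokens are assumed to be whitespace-separated, "special characters" are assumed to be the ASCII punctuation marks, and a token is assumed to be "capitalized" when its first character is uppercase.

```python
import string
from collections import Counter


def term_vector(text):
    """Frequencies of the distinct (lowercased, whitespace-separated) tokens."""
    return Counter(text.lower().split())


def char_ngrams(text, n=4):
    """Frequencies of all substrings of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def punctuation_features(text):
    """The six punctuation and character-based features of the Punctuation Model."""
    tokens = text.split()
    return {
        # Assumption: "special characters" means ASCII punctuation.
        "special_chars": sum(1 for c in text if c in string.punctuation),
        "exclamations": text.count("!"),
        "quotes": text.count('"') + text.count("'"),
        "questions": text.count("?"),
        # Assumption: a token is capitalized if it starts with an uppercase letter.
        "capitalized_tokens": sum(1 for t in tokens if t[:1].isupper()),
        "length": len(text),
    }
```

For instance, `char_ngrams("abcab", n=2)` counts the bigram "ab" twice and "bc" and "ca" once each, which is the frequency vector the Character N-grams Model would feed to a classifier.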
3.1.1 Character N-Gram Graphs Model
In the N-gram Graphs model, each polarity class is modeled by a single graph that uniformly aggregates the documents comprising it. After all individual document graphs are merged into the class graph, its edges encapsulate the most characteristic patterns contained in the class’ content, such as recurring and neighboring character sequences, special characters, and digits.
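A simplified sketch of how document graphs could be built and merged into a class graph is given below. It rests on several assumptions that are not taken from the paper: edges are directed, an edge weight counts how often two n-grams co-occur within a sliding window (a common simplification, rather than the distance-based weighting), the window size is a free parameter, and "uniform aggregation" is read as averaging each edge weight over all documents of the class. The experiments relied on the JInsect library, whose behavior may differ.

```python
from collections import Counter


def ngram_graph(text, n=4, window=4):
    """Map each (n-gram, following n-gram) edge to its co-occurrence count."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    edges = Counter()
    for i, gram in enumerate(grams):
        # Connect each n-gram with the n-grams that follow it inside the window.
        for neighbour in grams[i + 1:i + 1 + window]:
            edges[(gram, neighbour)] += 1
    return edges


def merge_into_class_graph(doc_graphs):
    """Average the edge weights over all document graphs of one class.

    An edge missing from a document contributes a weight of 0 for it.
    """
    total = Counter()
    for graph in doc_graphs:
        total.update(graph)
    k = len(doc_graphs)
    return {edge: weight / k for edge, weight in total.items()}
```

Under these assumptions, edges that recur in many tweets of a class keep a high averaged weight, while edges seen in only a few tweets fade towards zero.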
To estimate the similarity between a new document (tweet) graph G_ti and a class graph G_Tp, we employ one of the established n-gram graph similarity metrics [3]:
(i) Containment Similarity (CS), which expresses the proportion of edges of the small graph G_ti that are shared with graph G_Tp.
Table 1: Accuracy of all combinations between models and classification algorithms over both polarity problems.

Problem 1
        4-Gram Graphs   Discr. Graphs   Punct.   Polarity   Discr. Polarity   Social Context
NB      91.51%          96.36%          56.64%   53.40%     74.61%            51.05%
C4.5    98.76%          97.17%          60.98%   80.08%     72.89%            60.44%
SVM     86.10%          84.57%          50.12%   73.19%     72.89%            56.93%

Problem 2
        4-Gram Graphs   Discr. Graphs   Punct.   Polarity   Discr. Polarity   Social Context
NB      75.82%          93.43%          44.69%   37.40%     60.02%            34.33%
C4.5    96.85%          94.98%          46.00%   66.55%     61.47%            46.38%
SVM     79.18%          78.82%          39.02%   52.86%     57.27%            36.68%
[Figure 1: the tweet graph G_ti is compared against each of the class graphs G_neg, G_neu and G_pos; the three graph comparisons yield the feature vector (CS_neg, NVS_neg, VS_neg, CS_neu, NVS_neu, VS_neu, CS_pos, NVS_pos, VS_pos).]
Figure 1: Deriving the feature vector from the n-gram graphs model for the General Polarity Classification Problem.
(ii) Value Similarity (VS), which indicates how many of the edges contained in graph G_ti are shared with graph G_Tp, considering also their weights.
(iii) Normalized Value Similarity (NVS), which decouples value similarity from the effect of the largest graph’s size.
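The three metrics can be sketched as follows, using the formulas as commonly stated for n-gram graphs in [3]; they are reproduced here from memory, so treat the exact normalizations as an assumption. Graphs are represented as dictionaries mapping an edge to a positive weight, and all function names are hypothetical. The last function assembles the nine-dimensional feature vector of Figure 1.

```python
def containment_similarity(g_doc, g_class):
    """Share of edges found in both graphs, relative to the smaller graph."""
    if not g_doc or not g_class:
        return 0.0
    shared = sum(1 for edge in g_doc if edge in g_class)
    return shared / min(len(g_doc), len(g_class))


def value_similarity(g_doc, g_class):
    """Like containment, but each shared edge counts by its weight ratio."""
    if not g_doc or not g_class:
        return 0.0
    ratio_sum = sum(
        min(w, g_class[edge]) / max(w, g_class[edge])  # assumes positive weights
        for edge, w in g_doc.items() if edge in g_class
    )
    return ratio_sum / max(len(g_doc), len(g_class))


def normalized_value_similarity(g_doc, g_class):
    """Value similarity divided by the size ratio of the two graphs."""
    if not g_doc or not g_class:
        return 0.0
    size_similarity = min(len(g_doc), len(g_class)) / max(len(g_doc), len(g_class))
    return value_similarity(g_doc, g_class) / size_similarity


def feature_vector(g_doc, class_graphs):
    """(CS, NVS, VS) for the neg, neu and pos class graphs, as in Figure 1."""
    vector = []
    for polarity in ("neg", "neu", "pos"):
        g_class = class_graphs[polarity]
        vector += [
            containment_similarity(g_doc, g_class),
            normalized_value_similarity(g_doc, g_class),
            value_similarity(g_doc, g_class),
        ]
    return vector
```

For the Binary Polarity Classification problem the same function would be applied to the negative and positive class graphs only, giving six features.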
To enhance the classification efficiency of the n-gram graphs model, we propose an intuitive method for discretizing its similarity values: pair-wise comparisons between the values of the same metric for different polarity classes produce a nominal label.
Figure 1 depicts the described process for estimating polarity between graph G_ti and the three class graphs (G_neg, G_neu, G_pos) in the General Polarity Classification Problem.
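The discretization step is described only briefly, so the following Python sketch is one possible reading rather than the authors' procedure: for every metric and every pair of polarity classes, the two similarity values are compared and the name of the class with the larger value (or "equal") becomes a nominal feature. The feature naming scheme and the handling of ties are assumptions.

```python
from itertools import combinations


def discretize(similarities, classes=("neg", "neu", "pos"),
               metrics=("CS", "NVS", "VS")):
    """Turn numeric similarities into nominal labels via pair-wise comparisons.

    `similarities` maps (metric, polarity class) to a similarity value.
    """
    labels = {}
    for metric in metrics:
        for a, b in combinations(classes, 2):
            value_a = similarities[(metric, a)]
            value_b = similarities[(metric, b)]
            if value_a > value_b:
                winner = a
            elif value_b > value_a:
                winner = b
            else:
                winner = "equal"  # assumption: ties get their own label
            labels[f"{metric}:{a}_vs_{b}"] = winner
    return labels
```

With three classes and three metrics this yields nine nominal features; with the two classes of the binary problem it yields three.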
3.2 Context-based Models
The two representation models of the context-based approach aim at quantifying the effect of social context. They are presented below, along with their features.
3.2.1 Social Polarity Model
The aggregate sentiment of a set of tweets is determined by the dominant polarity class: if the positive messages significantly outnumber the negative ones, the overall sentiment is considered positive, and vice versa. We consider the following context-based features:
Author Polarity Ratio: the aggregate polarity of all messages posted by the same author.
Author’s Followees Polarity Ratio: the aggregate sentiment of all messages posted by the author’s followees.
Author’s Reciprocal Friends Polarity Ratio: the aggregate sentiment of the tweets posted by the author’s reciprocal friends.
Topic(s) Polarity Ratio: the overall sentiment of all tweets that pertain to the same topic.
Mention(s) Polarity Ratio: the overall sentiment of all tweets that mention the same user.
URL(s) Polarity Ratio: the aggregate polarity of all tweets with the same URL.
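The text does not give a formula for the polarity ratio, so the sketch below uses a hypothetical definition: the difference between positive and negative messages divided by their sum, which ranges from -1 (all negative) to 1 (all positive). The dictionary keys and function names are likewise illustrative.

```python
from collections import defaultdict


def polarity_ratio(labels):
    """Hypothetical ratio in [-1, 1]: (positives - negatives) / (positives + negatives)."""
    pos = sum(1 for label in labels if label == "positive")
    neg = sum(1 for label in labels if label == "negative")
    if pos + neg == 0:
        return 0.0  # no polarized messages in this group
    return (pos - neg) / (pos + neg)


def group_polarity_ratios(tweets, key):
    """Polarity ratio of each group of tweets sharing the same value of `key`."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tweet[key]].append(tweet["label"])
    return {value: polarity_ratio(labels) for value, labels in groups.items()}
```

Calling `group_polarity_ratios(tweets, "author")` would give an Author Polarity Ratio per author; grouping by topic, mentioned user or URL follows the same pattern. When such a ratio is used as a feature, the message being classified should presumably be left out of its own group.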
3.2.2 Social Context Model
To reduce the feature extraction cost of the above model, we also consider an alternative set of context-based features that can be directly derived from a user’s account and the characteristics of her messages. These features are the number of: Author’s Tweets, Author’s Followees, Author’s Reciprocal Friends, Topics, Mentions, URLs.
These features rely on the same evidence as the Polarity Ratio features of the Social Polarity model, but do not take into account the aggregate polarity of the underlying instances.
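The message-level counts of this model could be extracted with simple pattern matching, as in the sketch below. It assumes that topics correspond to hashtags and uses deliberately loose regular expressions; the account-level counts (tweets, followees, reciprocal friends) come from the user's profile and are not covered here.

```python
import re


def tweet_context_counts(text):
    """Count topics (assumed to be hashtags), mentions and URLs in one tweet."""
    return {
        "topics": len(re.findall(r"#\w+", text)),
        "mentions": len(re.findall(r"@\w+", text)),
        "urls": len(re.findall(r"https?://\S+", text)),
    }
```

These patterns are approximate: an e-mail address would be counted as a mention, and a URL containing a fragment identifier would also be counted as a topic.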
4. EVALUATION
Dataset. To examine the performance of our models, we conducted a thorough experimental study on the large-scale data set that was employed in [7]. To measure the effectiveness of the classification models, we considered the established metric of classification accuracy α.
Evaluation Method. To evaluate the performance of our models, we employ the traditional 10-fold cross-validation approach. For the comparative analysis of the document representation models, we employed Naive Bayes Multinomial (NBM) and Support Vector Machines (SVM). For the rest of the models, we employed Naive Bayes (NB), C4.5 and SVM. For the functionality of the n-gram graphs, we employed the open source library JInsect². For the implementation of the classification algorithms, we used the Weka open source library³.
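The evaluation protocol can be illustrated with a small, library-free Python sketch. It is not the Weka setup used in the experiments: the majority-class "classifier" is only a placeholder, accuracy is averaged over the folds, the shuffling seed is arbitrary, and the data set is assumed to contain at least k instances.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(predicted, actual):
    """Classification accuracy: share of correctly classified instances."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Mean accuracy over k folds; train_fn(X, y) must return a predict function."""
    scores = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        predictions = [model(features[i]) for i in test_idx]
        scores.append(accuracy(predictions, [labels[i] for i in test_idx]))
    return sum(scores) / len(scores)


def train_majority(train_x, train_y):
    """Placeholder learner: always predicts the most frequent training label."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority
```

Any learner that follows the same `train_fn` contract, for example a wrapper around an NB, C4.5 or SVM implementation, could be plugged in instead of `train_majority`.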
Evaluation Results. Table 1 summarizes all experimental results over both polarity classification problems. On the whole, the 4-gram graphs achieve the highest accuracy across all representation models and classification algorithms (up to 98.76% with C4.5 on Problem 1 and 96.85% on Problem 2). Discretizing their values (as explained in Section 3.1.1) helps NB in particular, raising its accuracy from 91.51% to 96.36% on Problem 1 and from 75.82% to 93.43% on Problem 2. This means that the n-gram graphs are more suitable for tackling the inherent characteristics of microblog content.
5. ACKNOWLEDGEMENT
This work has been partly funded by the FP7 EU Project
SocIoS (Contract No. 257774).
References
[1] D. Davidov, O. Tsur, and A. Rappoport. Enhanced sentiment learning using Twitter hashtags and smileys. In COLING, 2010.
[2] J. Eisenstein, B. O’Connor, N. A. Smith, and E. P. Xing. A latent variable model for geographic lexical variation. In EMNLP, 2010.
[3] G. Giannakopoulos, V. Karkaletsis, G. A. Vouros, and P. Stamatopoulos. Summarization system evaluation revisited: N-gram graphs. TSLP, 5(3), 2008.
[4] L. Jiang, M. Yu, M. Zhou, X. Liu, and T. Zhao. Target-dependent Twitter sentiment classification. In COLING, 2011.
[5] B. O’Connor, R. Balasubramanyan, B. R. Routledge, and N. A. Smith. From tweets to polls: Linking text sentiment to public opinion time series. In ICWSM, 2010.
[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe. Predicting elections with Twitter: What 140 characters reveal about political sentiment. In ICWSM, 2010.
[7] J. Yang and J. Leskovec. Patterns of temporal variation in online media. In WSDM, pages 177–186, 2011.
² http://sourceforge.net/projects/jinsect
³ http://www.cs.waikato.ac.nz/ml/weka