
Anger is More Influential Than Joy: Sentiment Correlation in Weibo


arXiv:1309.2402v1 [cs.SI] 10 Sep 2013
Rui Fan, Jichang Zhao, Yan Chen and Ke Xu
State Key Laboratory of Software Development Environment, Beihang University,
Beijing 100191, P.R.China
Recent years have witnessed the tremendous growth of online social media. In China, Weibo, a Twitter-like service, has attracted more than 500 million users in less than four years. Connected by online social ties, different users influence each other emotionally. We find that the correlation of anger among users is significantly higher than that of joy, which indicates that anger could spread more quickly and broadly in the network. In contrast, the correlation of sadness is surprisingly low and fluctuates strongly. Moreover, there is a stronger sentiment correlation between a pair of users if they share more interactions, and users with a larger number of friends possess more significant sentiment influence on their neighborhoods. Our findings could provide insights for modeling sentiment influence and propagation in online social networks.
Keywords: Sentiment influence, Emotion propagation, Sentiment analysis, Online social network, Weibo
1. Introduction
From the view of conventional social theory, homophily leads to connections in social networks, as the saying “Birds of a feather flock together” states [1]. Even in online social networks, more and more evidence indicates that users with similar properties will be connected in the future with high probability [2, 3]. It is clear that homophily could affect user behavior both online and offline [4, 5], while the records in online social networks are relatively easy to track and collect. Moreover, the continuous growth of online social media attracts a vast number of internet users and produces many huge social networks. Twitter, a microblogging website launched in 2006, has over 200 million active users, with over 140 million microblog posts, known as tweets, being posted every day. In China, Weibo, a Twitter-like service launched in 2009, has accumulated more than 500 million registered users in less than four years, and more than 100 million Chinese tweets are published every day. The high-dimensional content generated by millions of global users is a “big data” window [6] for investigating online social networks. That is to say, these large-scale online social networks provide an unprecedented opportunity for the study of human behavior.
Preprint submitted to Elsevier September 11, 2013
Beyond typical demographic features such as age, race, hometown, common friends and interests, homophily also includes psychological states, like loneliness and happiness [1, 7, 4]. Previous studies also show that computer-mediated emotional communication is similar to traditional face-to-face communication, which means there is no evident indication that human communication in online social media is less emotional or less personal [8]. The tweets posted in online social networks deliver not only factual information but also the sentiment of the users, which represents their reflections on different social events. A recent study [4] shows that happiness is assortative in the Twitter network, and [6] finds that the average happiness scores are positively correlated between Twitter users connected by one, two or three social ties. In these studies, however, human emotion is simplified to two classes, positive and negative, or to a single score of general happiness, neglecting the detailed aspects of human sentiment, especially negative emotion. Because of this oversimplified emotion classification, it is hard for the previous literature to disclose the different correlations of different sentiments and to compare them. Yet negative emotions, like anger, sadness or disgust, are more applicable in real-world scenarios such as abnormal event detection or emergency spread tracking. Figuring out the correlation of these emotions could shed light on why and how an abnormal event begins to spread in the network and then leads to large-scale collective behavior across the entire network. On the other hand, how the local structure affects the emotion correlation has not yet been investigated systematically, although it is essential to studying the mechanism of sentiment influence and contagion.
Aiming to fill these vital gaps, we divide the sentiment of a person into four categories, namely anger, joy, sadness and disgust, and investigate the emotion correlation between connected users in the interaction network obtained from Weibo. Contrary to our expectation, we find that anger has a stronger correlation between different users than joy, while the correlation of sadness is trivial. This indicates that anger could propagate fast and broadly in the network, which could explain why real-world events about food security, government bribery or demolition scandals are always hot trends on the internet in China. Moreover, node degree and tie strength both could positively boost the emotion correlation in online social networks. Finally, we make the datasets used in this paper publicly available to the research community.
The rest of this paper is organized as follows. In Section 2, closely related literature is reviewed, including the methods of sentiment analysis and the difference between our contributions and previous findings. The data sets employed in this paper and the methods of emotion classification are introduced in Section 3, where we also define the correlation of emotion. Section 4 reports our findings in detail, and our empirical explanations and several real-world case studies are elucidated in Section 5. Finally, we give a further discussion in Section 6 and then conclude the paper briefly.
2. Related works
The content in online social media like Twitter or Weibo is mainly recorded in the form of text, and many approaches have been presented to mine sentiment from these texts in recent years. One of them is the lexicon-based method, in which the sentiment of a tweet is determined by counting the number of sentimental words, i.e., positive terms and negative terms. For example, Dodds and Danforth measured the happiness of songs, blogs and presidents [9]. They also employed Amazon Mechanical Turk to score over 10,000 unique English words on an integer scale from 1 to 9, where 1 represents sadness and 9 represents happiness [10]. Golder and Macy collected 509 million English tweets from 2.4 million users on Twitter, then measured positive and negative affect using Linguistic Inquiry and Word Count (LIWC). Another one is the machine learning based solution, in which different features are considered to perform the classification, including terms, smileys, emoticons, etc. The first step was taken by Pang et al. [11], who treat the sentiment classification of movie reviews simply as a text categorization problem and investigate several typical classification algorithms. According to the experimental results, machine learning based classifiers outperform the baseline method based on human-compiled word lists [12, 13, 14]. Different from most work, which just categorizes emotion into negative and positive, [15] divided the sentiment into four classes, then presented a framework based on emoticons that needs no manually labelled training tweets and achieved a convincing precision. Because of its ability to classify multiple emotions, we employ this framework in the present paper.
Each user in the online social network could be a social sensor, and the huge amount of tweets conveys complicated signals about the users and real-world events, among which the sentiments are an essential part. The emotional states of users play a key role in understanding user behavior in social networks, whether from an individual or a group perspective. In addition, users’ mood states are significantly affected by real-world events [16]. [17] employed public mood states to predict the stock market, and [15] found that the variation of emotion, especially negative sentiment, could be used to detect abnormal events in the real world. Individual happiness was measured and several temporal patterns of happiness were revealed in [10]. In [18], Golder and Macy collected 509 million English tweets from 2.4 million users on Twitter and disclosed individual-level diurnal and seasonal mood rhythms in cultures across the globe. The population’s mood status was also used for political forecasting [19]. Regarding emotion correlation, recent studies [4, 6] show that happiness is assortative in Twitter. However, the correlations of negative emotions are not considered in these studies, and how the local structure affects the sentiment influence is also not fully investigated. We focus more on the difference in correlation between different sentiments and probe deeper into the relation between local structure and emotion correlation. We conjecture that emotion plays a significant role in information contagion, especially the negative emotions [5]. Because of this, understanding the difference in correlation could shed light on the origin of abnormal event propagation in online social media and provide inspiration for modeling sentiment influence.
3. Methods
In this section, the methodology of the present paper is described. First, we introduce the collection of tweets from Weibo and the construction of the interaction network. Then we report the classifier we employ to mine sentiment from tweets. Finally, we define two kinds of emotion correlation for connected users in the interaction network.
3.1. Weibo Dataset
As pointed out in [4], the following relationship in Twitter-like social networks does not stand for social interaction, whereas if two users reply to, retweet or mention each other in their tweets a certain number of times, the online social tie between them is sufficient to present an alternative means of deriving a conventional social network [6]. So here we construct an interaction network from the tweets we crawled from Weibo from April 2010 to September 2010, where an interaction link means that the number of times two users retweet or mention each other is at least a threshold T. From the roughly 70 million tweets and 200,000 users we crawled, an undirected but weighted graph G(V, E, T) is constructed, in which V is the set of users, E represents the set of interactive links among V, and T is the minimum number of interactions on each link. For each link in E, its weight is the total number of retweets and mentions between its two ends in the specified time period. Specifically, to exclude occasional users that are not truly involved in the Weibo social network, we only keep in our interaction network those active users who posted more than one tweet every two days on average over the six months. To guarantee the validity of the users’ social interaction, if the number of times two users retweet or mention each other is less than T, we omit the connection between them. As shown in Figure 1, by tuning T we can obtain networks of different scales. In general we set T = 30, and the interaction network G then contains 9868 nodes and 19517 links. We also make our entire data set publicly available.
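The construction described in this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for the paper: the function name `build_interaction_network`, the input formats, and the 183-day length of the sampling period (April to September 2010) are hypothetical choices made for the example.

```python
from collections import Counter

def build_interaction_network(interactions, tweet_counts, T=30, days=183):
    """Build an undirected weighted interaction graph.

    interactions: iterable of (user_a, user_b), one entry per retweet or mention
                  (hypothetical input format).
    tweet_counts: dict mapping user -> number of tweets posted in the period.
    T:            minimum number of interactions required to keep a link.
    days:         assumed length of the sampling period in days.
    """
    # Keep only active users: more than one tweet every two days on average.
    active = {u for u, n in tweet_counts.items() if n / days > 0.5}

    # The weight of a link is the number of retweets or mentions between its ends.
    weights = Counter()
    for a, b in interactions:
        if a != b and a in active and b in active:
            weights[frozenset((a, b))] += 1

    # Omit links with fewer than T interactions.
    edges = {pair: w for pair, w in weights.items() if w >= T}
    nodes = set().union(*edges)
    return nodes, edges
```

Calling the function with several values of `T` on the same input gives networks of different scales, which is the kind of tuning summarized in Figure 1.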
3.2. Emotion classification
In this paper, emotion is divided into four classes: anger, sadness, joy and disgust. We employ the Bayesian classifier developed in previous work [15]. In this method, we use emoticons, which are pervasively used in Weibo, to label the sentiment of the tweets. In the first stage, 95 frequently used emoticons are manually labelled with different sentiments, and a tweet that only contains emoticons of a certain sentiment is labelled with this sentiment. From around 70 million tweets, 3.5 million tweets with valid emoticons are extracted and labelled. Using this data set as a training corpus, a simple but fast Bayesian classifier is built in the second stage to mine the sentiment of the tweets without emoticons, which make up about 95% of the tweets in Weibo. The average precision of this classifier is convincing, and the large number of tweets we employ in the experiment further helps to guarantee its accuracy. Based on this framework, we show a sampled snapshot of the interaction network with T = 30 in Figure 2b, in which each user is colored by its emotion. We can roughly see that closely connected nodes generally share the same color, indicating emotion correlations in the Weibo network. Besides, different colors show different degrees of clustering. For example, red, which represents anger, shows more evident clustering. These preliminary findings suggest that different emotions might have different correlations and that a deeper investigation is necessary.
Figure 1: The number of nodes or edges varies for different interaction thresholds T. In the following part of the present work, we set T = 30 to extract a large enough network with convincing interaction strength.
Figure 2: The giant connected cluster of a network sample with T = 30. (a) An interaction network: the network structure, in which each node stands for a user and the link between two users represents the interaction between them. (b) Nodes colored by emotions: based on this topology, we color each node by its emotion, i.e., the sentiment with the maximum number of tweets published by this node in the sampling period. Red stands for anger, green represents joy, blue stands for sadness and black represents disgust. Regions of the same color indicate that closely connected nodes share the same sentiment.
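The two-stage labelling described in this subsection can be illustrated with a small sketch. It is not the classifier of [15], whose details are not given here. As assumptions for the example, the emoticon table `EMOTICONS` is a hypothetical four-entry stand-in for the 95 hand-labelled emoticons, tweets are already tokenized, and the second stage is a plain multinomial naive Bayes model with add-one smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical emoticon table; the paper labels 95 emoticons by hand.
EMOTICONS = {"[angry]": "anger", "[haha]": "joy",
             "[tears]": "sadness", "[vomit]": "disgust"}

def emoticon_label(tokens):
    """Stage 1: return a sentiment only if all emoticons in the tweet agree."""
    found = {EMOTICONS[t] for t in tokens if t in EMOTICONS}
    return found.pop() if len(found) == 1 else None

def train_naive_bayes(tweets):
    """Stage 2: train on the tweets that stage 1 could label."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for tokens in tweets:
        label = emoticon_label(tokens)
        if label is None:
            continue  # no emoticon, or emoticons of conflicting sentiments
        class_counts[label] += 1
        word_counts[label].update(t for t in tokens if t not in EMOTICONS)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def classify(tokens, model):
    """Pick the sentiment with the highest smoothed log-posterior."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best
```

With a model trained this way, the emotion of a node as used in Figure 2b would be the label that `classify` assigns most often to that user's tweets.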
3.3. Emotion correlation
Emotion correlation is a metric to quantify the strength of sentiment influence between connected users. For a fixed T, we first extract an interaction network G and all the tweets posted by the nodes in G. Then, by employing the classifier established in the former section, the tweets of each user are divided into four categories, in which $f_1$, $f_2$, $f_3$ and $f_4$ represent the fractions of angry, joyful, sad and disgusting tweets, respectively. Hence we can use the emotion vector $e_i = (f_1^i, f_2^i, f_3^i, f_4^i)$ to denote user i’s sentiment status. Based on this, we define the pairwise sentiment correlation as follows. Given a certain hop distance h, we collect all user pairs with distance h from G. For one of the four emotions m (m = 1, 2, 3, 4) and a user pair (j, q), we put the source user j’s $f_m^j$ into a sequence $S_m$ and the target user q’s $f_m^q$ into another sequence $T_m$. Then the pairwise correlation can be calculated by the Pearson correlation as

$$C_p^m = \frac{1}{l-1} \sum_{i=1}^{l} \left( \frac{S_i - \langle S_m \rangle}{\sigma_{S_m}} \right) \left( \frac{T_i - \langle T_m \rangle}{\sigma_{T_m}} \right),$$

where $\langle S_m \rangle = \frac{1}{l} \sum_{i=1}^{l} S_i$ is the mean, $\sigma_{S_m} = \sqrt{\frac{1}{l-1} \sum_{i=1}^{l} (S_i - \langle S_m \rangle)^2}$ is the standard deviation and l is the length of $S_m$ or $T_m$. It can also be obtained from the Spearman correlation as

$$C_s^m = 1 - \frac{6 \sum_{i=1}^{l} d_i^2}{l (l^2 - 1)},$$

where $d_i$ is the rank difference between $S_i$ in $S_m$ and $T_i$ in $T_m$. Intuitively, larger $C_p^m$ and $C_s^m$ both suggest a more positive correlation for sentiment m.
Figure 3: Correlation for different emotions as the hop distance h varies. (a) Pearson correlation. (b) Spearman correlation. A large h means that a pair of users are far away from each other in the social network we build. Here T = 30 is fixed.
Based on the dataset and the classifier, interaction networks can be built and the tweets of each user in the network can be emotionally labelled. Using the definitions of correlation, we can then present the comparison of emotion correlations and the impact of local structures in the following section.
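The pairwise correlation defined in this section can be computed along the following lines. This is a sketch under assumptions rather than the code used for the reported results: the graph is an adjacency dict, each user's emotion fractions are stored in a list indexed by m = 0..3 (instead of 1..4), and every pair at distance h is counted in both orders, once as (j, q) and once as (q, j). Because repeated values produce ties, the Spearman coefficient is computed here as the Pearson correlation of average ranks, which reduces to the rank-difference formula when there are no ties.

```python
import math
from collections import deque

def pairs_at_distance(adj, h):
    """Yield ordered pairs (j, q) whose shortest-path distance is exactly h."""
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == h:
                continue  # no need to explore beyond h hops
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if d == h:
                yield source, target

def pearson(s, t):
    """Pearson correlation; assumes both sequences have non-zero variance."""
    l = len(s)
    mean_s, mean_t = sum(s) / l, sum(t) / l
    sd_s = math.sqrt(sum((x - mean_s) ** 2 for x in s) / (l - 1))
    sd_t = math.sqrt(sum((y - mean_t) ** 2 for y in t) / (l - 1))
    cov = sum((x - mean_s) * (y - mean_t) for x, y in zip(s, t)) / (l - 1)
    return cov / (sd_s * sd_t)

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(s, t):
    return pearson(average_ranks(s), average_ranks(t))

def emotion_correlation(adj, fractions, m, h):
    """Return (Pearson, Spearman) correlation of emotion m at hop distance h."""
    s, t = [], []
    for j, q in pairs_at_distance(adj, h):
        s.append(fractions[j][m])
        t.append(fractions[q][m])
    return pearson(s, t), spearman(s, t)
```

Repeating `emotion_correlation` for h = 1, 2, 3, ... and for each of the four emotions would give curves of the kind shown in Figure 3.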
4. Results
First we compare the correlations of different emotions based on the graph with T = 30, which ensures a sufficient number of ties and users and at the same time guarantees relatively high social tie strength. As shown in Figure 3, both the Pearson correlation and the Spearman correlation indicate that different sentiments have different correlations and that anger has a surprisingly higher correlation than the other emotions. This suggests that anger could spread quickly and broadly across the network because of its strong influence on the neighborhood within about three hops. Although previous studies [4, 6] show that happiness is assortative in online social networks, Figure 3 further demonstrates that the correlation of anger is much stronger than that of happiness. It means that information carrying an angry message might propagate very fast in the network, a phenomenon that is contrary to our intuition. Sadness and disgust, in contrast, both have an unexpectedly low correlation even for small h. For instance, the correlation of sadness is less than 0.15 at h = 1, which means that a sad status hardly affects directly connected friends at all. The results are also consistent with the previous finding that the strength of the emotion correlation decreases as h grows, especially for h > 6 [6]. In fact, for h > 3 the emotion correlation becomes weak for all the sentiments, which means that the influence of sentiment in the social network is limited significantly by the social distance. For example, for strongly assortative emotions like anger and joy, the correlations just fluctuate around 0 for h > 5.
In order to test the above correlation further, we also shuffle $S_m$ randomly for sentiment m and recalculate its correlation. As shown in Figure 4, for the shuffled emotion sequence there is no correlation for any of the sentiments. It indicates that the former correlation is truly significant and that for random pairs of users in the social network there is no emotion homophily. It further confirms that, through social ties, sentiment indeed spreads between closely connected friends, and that users can influence their neighbors’ mood statuses because of the social bonds between them.
Figure 4: The emotion sequence is randomly shuffled to test the correlation. (a) Pearson correlation. (b) Spearman correlation.
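The shuffling test can be sketched as follows. The number of repetitions, the fixed random seed and the function name `shuffled_correlations` are assumptions made for this illustration; the text only states that the source sequence is shuffled and the correlation recalculated.

```python
import math
import random

def pearson(s, t):
    """Pearson correlation; assumes both sequences have non-zero variance."""
    l = len(s)
    mean_s, mean_t = sum(s) / l, sum(t) / l
    sd_s = math.sqrt(sum((x - mean_s) ** 2 for x in s) / (l - 1))
    sd_t = math.sqrt(sum((y - mean_t) ** 2 for y in t) / (l - 1))
    cov = sum((x - mean_s) * (y - mean_t) for x, y in zip(s, t)) / (l - 1)
    return cov / (sd_s * sd_t)

def shuffled_correlations(s, t, trials=100, seed=0):
    """Correlations obtained after randomly permuting the source sequence.

    Shuffling breaks the pairing between connected users while keeping the
    distribution of values, so the results show what to expect by chance.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        shuffled = list(s)  # copy, so the caller's sequence is left untouched
        rng.shuffle(shuffled)
        results.append(pearson(shuffled, t))
    return results
```

If the observed correlation lies far outside the range of the shuffled values, it is unlikely to be produced by random pairing of users.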
Investigating to what extent the local structure, like tie strength and node degree, affects the emotion correlation is important for modeling sentiment influence and propagation. As shown in Figure 5, we first disclose how the interaction threshold T affects the sentiment influence. As discussed in Section 3.1, a larger T produces smaller networks but with closer social relations and stronger online interactions. It is also intuitive that frequent interactions in online social networks are positively related to strong social ties and convincing social bonds. Because of this, we can see in Figure 5 that for all four emotions the correlations within two hops increase steadily as T grows. Particularly for anger, the Pearson correlation rises to around 0.52. For weakly correlated emotions like sadness and disgust, although the correlation grows slowly for h = 1 and h = 2, its maximum value is still lower than 0.25. At h = 3, the increment of the sentiment influence is trivial, especially for sadness and disgust. This illustrates that the primary factor controlling emotion correlation is still the social distance, and that social tie strength only matters for close neighbors within two hops.
Figure 5: Pearson correlations at different h for different networks extracted by varying T. (a) Anger. (b) Joy. (c) Sadness. (d) Disgust. The case of h > 3 is not considered here because of the weak sentiment correlation found in Figure 3.
Secondly, we check the effect of users’ degrees on the sentiment influence. We select a node i with degree k and then average its neighbors’ emotion vectors to $\langle e \rangle_i = (\langle f_1^j \rangle, \langle f_2^j \rangle, \langle f_3^j \rangle, \langle f_4^j \rangle)$, where j is an arbitrary neighbor of i. By adding $f_m^i$ into $S_m$ and $\langle f_m^j \rangle$ into $T_m$, we can get the correlation of sentiment m for the users with degree k. As can be seen in Figure 6, the sentiment correlation grows with k, especially for anger and joy, which illustrates that nodes with higher degrees in the online social network possess more significant emotional influence on their neighborhoods. This finding is consistent with the conventional viewpoint that high-degree nodes in a social network own more social influence and social capital. Specifically, the correlations of anger and joy are almost the same for very small degrees, but anger then shows a significant jump for large degrees and widens the gap to joy. As k rises to 30, the correlation of anger grows to 0.85, while the correlations of sadness and disgust do not show an obvious increasing trend and just fluctuate around 0.2 or even lower. It is worth emphasizing that, because the network is small, the maximum degree we have is only around 30, which is far below Dunbar’s number [20]. We suspect that the correlation might stop rising if the degree is larger than Dunbar’s number. The results for the Spearman correlation are similar and are not reported here.
Figure 6: Here T is fixed to 30. Because the network is relatively small, the largest degree we get is only 30. The results just indicate that when the degree is small, the correlations of all sentiments increase with node degree.
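The degree-based correlation described above can be sketched as follows. As assumptions for the example, the graph is an adjacency dict, emotion fractions are lists indexed by m = 0..3, and the function returns `None` when fewer than two nodes have degree k; none of these choices come from the paper.

```python
import math

def pearson(s, t):
    """Pearson correlation; assumes both sequences have non-zero variance."""
    l = len(s)
    mean_s, mean_t = sum(s) / l, sum(t) / l
    sd_s = math.sqrt(sum((x - mean_s) ** 2 for x in s) / (l - 1))
    sd_t = math.sqrt(sum((y - mean_t) ** 2 for y in t) / (l - 1))
    cov = sum((x - mean_s) * (y - mean_t) for x, y in zip(s, t)) / (l - 1)
    return cov / (sd_s * sd_t)

def degree_correlation(adj, fractions, m, k):
    """Correlate each degree-k node's emotion m with its neighbours' average."""
    s, t = [], []
    for i, neighbours in adj.items():
        if len(neighbours) != k or k == 0:
            continue
        s.append(fractions[i][m])  # the node's own fraction
        t.append(sum(fractions[j][m] for j in neighbours) / k)  # neighbour mean
    if len(s) < 2:
        return None  # a correlation needs at least two data points
    return pearson(s, t)
```

Evaluating `degree_correlation` for each observed degree k and each emotion would give curves of the kind shown in Figure 6.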
To sum up, different emotions have different correlations in social media. Compared to the other sentiments, anger has the most positive correlation, which indicates its fast and broad propagation. Local structure can affect the sentiment influence in near neighborhoods: tie strength and node degree both enhance the sentiment influence, especially for anger and joy, while their contributions to sadness and disgust are greatly limited. The high correlation of angry mood but weak influence of sad status requires much more detailed exploration to disclose the underlying reasons.
5. Empirical Explanation
With their continuous growth, online social media in China like Weibo have become the primary channel of information exchange. In Weibo, messages not only deliver factual information but also propagate the users’ opinions about social events or individual affairs. Hence we try to unravel the underlying reason why anger has a surprisingly high correlation but the spread of sadness is weak from the view of the keywords that the corresponding tweets present. For a certain emotion m, we collect all the retweeted tweets (which usually contain a phrase like “@” or “retweet”) with this sentiment in a specified time period and combine them into a long text document. Focusing only on retweeted tweets helps to reduce the impact of external media and to consider only the social influence from the social ties in Weibo. Several typical techniques are employed to mine keywords or topic phrases from the documents, which are reported in Figure 7. Based on the keywords or topics we find, the real-world events or social issues can be summarized to understand the sentiment influence in detail.
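A simple version of this keyword mining step can be sketched as follows. The text does not name the techniques that were used, so plain term frequency is swapped in here as a stand-in. The stopword list, the whitespace tokenization (real Chinese text would need a word segmenter) and the function name `top_keywords` are likewise assumptions made for the illustration.

```python
from collections import Counter

RETWEET_MARKERS = ("@", "retweet")  # markers of retweeted tweets named in the text
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}  # hypothetical list

def top_keywords(tweets, emotion, n=20):
    """Return the n most frequent terms in retweeted tweets of one emotion.

    tweets: iterable of (text, emotion_label) pairs (hypothetical input format).
    """
    counts = Counter()
    for text, label in tweets:
        lowered = text.lower()
        # Keep only retweeted tweets that carry the requested sentiment.
        if label != emotion or not any(mark in lowered for mark in RETWEET_MARKERS):
            continue
        for token in lowered.split():
            token = token.strip(".,!?:;\"'")
            if not token or token.startswith("@"):
                continue  # skip empty tokens and user mentions
            if token in STOPWORDS or token == "retweet":
                continue
            counts[token] += 1
    return [word for word, _ in counts.most_common(n)]
```

Running the function once per emotion gives one keyword list per sentiment, comparable in spirit to the lists reported in Figure 7.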
With respect to anger, we find that two kinds of social events are apt to trigger the angry mood of users in Weibo. The first is domestic social problems like food security, government bribery and demolition for resettlement. The “shrimp washing powder”, which results in muscle degeneration, and the self-burning event in Fenggang Yihuang County of Jiangxi province represent this category. These events reflect that people living in China are dissatisfied with some aspects of the current society, and this type of event can spread quickly as users want to show their sympathy for the victims by retweeting tweets and criticizing the criminals or the government. Frequently appearing phrases like “government”, “bribery”, “demolition” and so on are strongly related to these events. The second type is diplomatic issues, such as conflicts between China and foreign countries. For instance, in August 2010, the United States and South Korea held a drill in the Yellow Sea, which is located to the east of China. In September 2010, the ship collision between China and Japan also made users in Weibo extremely angry. These events could arouse patriotism and stimulate the angry mood. Keywords like “Diaoyu Island”, “ship collision” and “Philippines” show the popularity of these events at that time. To sum up, Weibo is a convenient and ubiquitous channel for Chinese users to share their concern about continuing social problems and diplomatic issues. Pushed by real-world events, these users tend to retweet tweets, express their anger and hope to get resonance from their neighborhoods in the online social network. Regarding sadness, we find that its strength of correlation is strongly affected by real-world natural disasters like earthquakes, as shown in Figure 7b. Because natural disasters happen only occasionally, the average correlation of sadness is very low and the strength of its correlation might fluctuate strongly.
Figure 7: Example Chinese keywords extracted for anger and sadness, respectively. (a) Anger phrases. (b) Sadness phrases. An English translation of the top 20 keywords is also provided.
In summary, real-world social issues easily attract attention from the public, and people tend to express their anger towards these issues by posting and retweeting tweets in online social media. The angry mood delivered through social ties could boost the spread of the corresponding news and speed up the formation of public opinion and collective behavior. This can explain why events related to social problems propagate extremely fast in Weibo.
6. Discussion and Conclusion
Users with similar demographics have high probabilities of getting connected in both online and offline social networks. Recent studies reveal that even psychological states like happiness are assortative, which means that happiness or well-being is strongly correlated between connected users in online social media like Twitter. Considering the oversimplification of the sentiment classification in the previous literature, we divide emotion into four categories and discuss their different correlations in detail based on tweets collected from Weibo in China, and the data set has been made publicly available to the research community.
Our results show that anger is more influential than other emotions like joy, which indicates that angry tweets can spread quickly and broadly in the network. Contrary to our expectation, the correlation of sadness is low. Through keyword and topic mining in retweeted angry tweets, we find that public opinion towards social problems and diplomatic issues is always angry, and this extreme mental status also boosts the propagation of information in Weibo. This might be the origin of large-scale online collective behavior in Weibo about social problems such as food security and demolition for resettlement in recent years. We conjecture that anger plays a non-negligible role in the massive propagation of negative news about society, which is always a hot trend on today’s internet in China.
Besides, we also investigate the effect of local structure on the emotion correlation in online social media, which has not been fully probed before. We find that for a pair of users the emotion correlation is stronger if more interactions happen between them. We also disclose that a node’s degree could significantly enhance its sentiment influence on its neighborhood, especially for anger and joy. These findings could shed light on modeling sentiment influence and spread in social networks.
7. Acknowledgements
This work was partially supported by the fund of the State Key Laboratory of Software Development Environment under Grant SKLSDE-2011ZX-02, the Research Fund for the Doctoral Program of Higher Education of China under Grant 20111102110019, and the National 863 Program under Grant 2012AA011005. JZ and YC both thank the Innovation Foundation of BUAA for PhD Graduates.
[1] M. Miller, S.-L. Lynn, C. James, M, Birds of a feather: Homophily in
social networks, Annual Review of Sociology 27 (2001) 415– 444.
[2] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, B. Bhattacharjee,
Measurement and analysis o f online social networks, in: the 7th ACM
SIGCOMM conference on Internet measurement, IMC ’07, 2007, pp.
29–42. doi:
[3] D. Liben-Nowell, J. Kleinberg , The link prediction problem for so-
cial networks, in: the twelfth international conference on Informa-
tion and knowledge management, CIKM ’03, 2003, pp. 556–559.
[4] J. Bollen, B. Gon¸calves, G. Ruan, H. Mao, Happiness is assortat ive in
online social networks, Artif. L ife 17 (3) (2011) 237–251.
[5] A. Chmiel, P. Sobkowicz, J. Sienkiewicz, G. Paltoglou, K. Buckley,
M. Thelwa ll, J. A. Holyst, Negative emotions boost user activity at
bbc forum, Physica A: Statistical Mechanics and it s Applications 390
(2011) 2936–2944.
[6] C. A. Bliss, I. M. Kloumann, K. D. Harris, C. M. Danforth, P. S. Dodds,
Twitter reciprocal reply networks exhibit assortativity with respect to
happiness, Journal of Computational Science 3 (2012) 388–397.
[7] A. L. Traud, E. D. Kelsic, P. J. Mucha, M. A. Porter, Comparing com-
munity structure to characteristics in online collegiate social networks,
ArXiv e-prints, arXiv0809.0690.
[8] D. Derks, A. H. Fischer, A. E. Bos, The role of emotion in computer-
mediated communication: A review, Computers in Human Behavior 24
(2008) 766–785.
[9] P. S. Dodds, C. M. Danforth, Measuring the happiness of large-scale
written expression: So ngs, blogs, and presidents, Journal of Happiness
Studies 11 (4) (2010) 441–456.
[10] P. S. Dodds, K . D. Harris, I. M. Kloumann, C. A. Bliss, C. M. Dan-
forth, Temporal pat terns of happiness and information in a global social
network: Hedonometrics and twitter, PLoS O NE 6.
[11] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up?: sentiment classification
using machine learning techniques, in: EMNLP, 2002, pp. 79–86.
[12] R. Parikh, M. Movassate, Sentiment analysis of user-generated twitter
updates using various classification techniques, Technical report .
[13] J. Read, Using emoticons to reduce dependency in machine learning
techniques for sentiment classification, in: ACLstudent, 2005, pp. 43–
[14] A. Go, R. Bhayani, L. Huang, Twitter sentiment classification using
distant supervision, Technical report, Stanford Digital Library Tech-
nologies Project.
[15] J. Zhao, L. Dong , J. Wu, K. Xu, Moodlens: an emoticon-based sentiment
analysis system for chinese tweets, in: KDD ’12, 2012, pp. 1528–1531.
[16] J. Bollen, A. Pepe, H. Mao, Modeling public mood and emotion: Twitter
sentiment and socio-economic phenomena, in: Fifth ICWSM, 2011.
[17] J. Bollen, H. Mao, X. Zeng, Twitter mood predicts the stock mar ket,
Journal of Computational Science 2 (1) (2011) 1–8.
[18] S. A. Golder, M. W. Macy, Diurnal and seasonal mood vary with work,
sleep, and day length across diverse cultures, Science 333 (6051) (2011)
[19] M. Marchetti-Bowick, N. Chambers, Learning for microblogs with
distant supervision: Political forecasting with twitter, in: 13th EACL,
2012, pp. 603–612.
[20] R. Dunbar, Grooming, Gossip, and the Evolution of Language, Harvard
University Press, Cambridge, MA, 1998.