Modeling brand post popularity dynamics in online social networks
Amir Hassan Zadeh , Ramesh Sharda
Spears School of Business, Oklahoma State University, Stillwater, OK 74078, USA
Article info
Available online 13 May 2014
Keywords:
Online social networks
Social media marketing
Brand post popularity
Brand-generated content
Hawkes point process
Abstract
Today's social media platforms are excellent vehicles for businesses to build and foster relationships with
customers. Companies create official fan pages on social network websites to provide customers with
information about their brands, products, promotions, and more. Customers can become fans of these pages, and
like, reply, share or mark the brand post as favorite. Marketing departments are using these activities to
crowdsource marketing and increase brand awareness and popularity. Understanding how crowdsourcing-oriented
marketing and promotion evolves would be helpful in managing such campaigns. In this paper, we adopt a
multidimensional point process methodology to study crowd engagement activities and interactions. Specifically,
we investigate the brand post popularity as a joint probability function of time and number of followers.
One-dimensional and two-dimensional Hawkes point process models are calibrated to simulate popularity growth
patterns of brand post contents on Twitter. Our results suggest that the two-dimensional point process model
provides a good model for understanding such crowdsourcing behavior.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
The emergence of Internet-based social media has started a new
kind of conversation among consumers and companies, challenging
traditional ideas about marketing and brand management while creat-
ing new opportunities for organizations to understand customers and
connect with them instantly [56]. Research firm Chadwick Martin Bailey
in partnership with Constant Contact conducted a study that analyzed
the behavior of 1491 consumers ages 18 and older throughout the
U.S., and revealed that a whopping 77% of consumers interact with
brands on Twitter or Facebook primarily through reading posts and
updates from the brands. They also noted that 60% of social customers
are more likely to recommend a brand to a friend after following the
brand on Twitter or Facebook, and 50% of them are more likely to buy
from that brand as well. When it comes to “Liking” brand posts on
Facebook, the reasons are varied, but for the most part, respondents
said they like a brand on Facebook because they are a customer (58%)
or because they want to receive discounts and promotions (57%) [21].
Today, the customer experience shared through social media, blogs
and discussion forums is becoming a major driver of purchasing
decisions, because these platforms provide consumers a more influential
voice in effecting changes in their own customer care [15]. Barnes'
research [9] indicates that 70% of consumers use social media platforms
at least some of the time to learn about the customer care offered by a
company before they make a purchase. Furthermore, 74% of these
consumers choose companies based on customer care experiences
shared by others in online forums.
Over the past few years, big brands have started taking social media
seriously, and social media marketing has been an inevitable part of
their marketing plan. For example, Coca-Cola, one of the world's most
recognizable brands, had 800 fans on Facebook in 2007, 16.5 million
in 2010, and it has currently crossed over 62.3 million likes. In 2012,
in honor of the Coca-Cola Facebook page becoming the first retailer
brand to receive 50 million likes, Coca-Cola developed a new
Facebook application to identify and support individuals developing,
influencing and shaping ideas and to ask them to collaborate with the
Facebook community to spread them globally. Through this application,
Coca-Cola teaches the world to sing in perfect harmony, mobilizes
millions of people behind their favorite cause, and encourages them to
become more active and socially involved. As an end result, consumers
become involved in suggesting modifications of products and services
and the distribution of these innovations [11,12].
Starbucks, as one of the top ten most followed brands on Twitter,
uses tweets to share knowledge with customers and promote their lat-
est products, campaigns and events [20]. With an average of ten tweets
per day on Twitter, Starbucks extracts relevant knowledge from a net-
work of current and prospective customers around the globe who
express their expectations, likes and dislikes about the brand [20,48].
In 2010, Delta Airlines launched the first social media “ticket
window” on Facebook, which allows customers to book a flight without
having to go to any other website. Delta pointed out that Facebook is being
used by more customers while in flight than any other Web site, making
it a “natural launching point” for its initiative [8]. Access to OSNs on
mobile devices has certainly accelerated the popularity of OSNs.
Decision Support Systems 65 (2014) 59–68
Corresponding author: A. Hassan Zadeh.
As more and more major brands have established their communities
and fan pages within online social networks (OSNs) and started offering
commerce opportunities delivered through social media platforms,
crowdsourcing applications have become some of the most engaging
tools in the digital marketing realm, enabling brands to realize the potential
for their fans' input into the product development and the market
development processes [36]. Such innovative and creative initiatives
enable businesses to improve their products, get brand recommendations,
increase brand awareness and popularity, find new customers
or even excite a specific demographic. In many cases where fans within
social media are particularly passionate about a brand and its products,
there will be a clear desire to become part of the product itself, have
input as a group and energize the brand and its product lines [57].
Today's openness and flexibility of OSNs provide brands with a huge
opportunity to get in touch with customers, crowdsource marketing
tasks and enhance brand awareness. Understanding the structure and
behavior of the fans on OSNs is important to the content providers to
enable better organization of brand post information, design of effective
online communities and implementation of successful marketing
campaigns. Several aspects of online social interaction structures are
not clearly understood: the formation of relationships and interactions,
how information moves on social media platforms, and how users respond
to various stimuli like videos, contests, or posts. The answers
to these questions will offer a more complete picture of the social
dynamics of networking and how individuals manage their virtual
relationships and follow their favorites or brand communities, or how
they influence their friends to become followers as well. In this paper,
we model the spread of information across Twitter, the most popular
and widely used micro-blogging online social network [37], and analyze
the data from a number of brand posts to discover what rules might
govern the spread of information online. By understanding these
behaviors, companies can become more effective in designing marketing
campaigns. Being able to analyze a social network of customers,
how customers interact on this type of platform, and what the rhythm
and timing of the most engaging postings look like provides brands
a competitive advantage through forecasting the spread of brand
influence and intervening at times with promotions to foster relationships
with customers.
The timing pattern of human communication in online social
networks is not random. It has been shown that the communication is
explained by emergent statistical laws such as non-trivial correlations
and clustering [55]. With the possibility of analyzing the multivariate
distribution of the occurrences of activity on OSNs, we can add to our
understanding of these interactions.
Standard models assume a Poisson distribution for event occurrence,
which is an unrealistic assumption in many social systems.
Point processes have shown promise for modeling social event patterns
where the occurrence of an event increases the likelihood of subsequent
events [22]. They offer a novel way of modeling and clustering high-frequency
and irregular data in time, using a branching structure that corresponds
to background events and offspring events, and are able to capture bursts
of activity, dynamics and reactions over time.
In this paper, we model the popularity of a brand post or, more
generally, of online content on online social networks. The popularity
of online content is not a well-defined term but a highly subjective one
[39]. Brand post popularity can be defined as a mixture of various factors
such as vividness, interactivity, the content of the brand post (information,
entertainment), and the number of times the brand post is mentioned
by fans [25]. We take the perspective of an individual user who
infers the popularity of a brand's tweet from publicly observable
data, namely the number of impressions it has received (including the
total number of retweets, replies, favorites) or the lifespan of threads
over its entire timeline. A tweet is considered popular if the number of
retweets, replies, and favorites it receives over its lifespan is
no less than a certain threshold [40,43]. Our goal is to
develop a mechanism for capturing the evolution of the popularity of
online content posted by brands on OSNs. In our approach, a model is
specified via the conditional intensity for each event. This provides a
powerful and more natural modeling framework for multivariate social
network event data. Specifically, the current study examines the
influence of user activities on the timing and frequency of a brand
post. The self-exciting Hawkes point process and the ETAS (Epidemic
Type Aftershock Sequences) models are used to analyze data on brand
post popularity. Unlike Poisson processes, the self-exciting Hawkes point
process and ETAS are counting processes that behave as
continuous-time non-Markov chains because of their dependence on the
history of the process (i.e. $H_t$): the process has states 0,
1, 2, ..., and moves from state $n$ to state $n + 1$, where $n \ge 0$. In the case of the
content popularity problem, each state indicates the total number of users
who hit the content by time $t$, and $\lambda(t)$ is the transition rate of moving
from one state to another.
The remainder of the paper is organized as follows. The next section
starts with a discussion of online social networks (OSNs). We also
review literature about stochastic point processes and their many
uses. The following section describes how we map the content popular-
ity to the point processes framework. Also, we introduce brand post
data collected from Twitter and the assumptions necessary to proceed
with analysis. In Section 4, we fit competing models to the data and then
compare the accuracy and complexity of the models in capturing the burst
of activity on OSNs. The managerial implications of our findings, limitations
and possible directions for future work are discussed in Section 5.
The final section presents a general conclusion of the paper.
2. Review of the literature
2.1. Online social networks
During the past few years, millions of people have used social media
applications (Facebook, Twitter, YouTube, Google+, etc.) as a part of
their daily online activities [30]. In 2011, more than half of social
media users followed brands on social media sites, and brands are
increasingly investing in social media to crowdsource marketing activi-
ties, indicated by worldwide marketing spending on social networking
sites of about $4.3 billion [25].
Today companies develop official fan pages and online communities
within online social networks to understand customers, connect with
them instantly and provide them with information about their brands,
products, promotions and more. Meanwhile, brand fans can like, comment
on and share brand posts. Users of Twitter can retweet, which is
much like a Facebook share. Followers retweet the tweets of those
they are following to propagate information to other people. People
respond to popular users by “replying” and/or “mentioning” [7].
Followers can also mark the content as favorite, which is functionally
similar to the “like” action on Facebook. The “like” and “retweet”
buttons are the easiest ways for Facebook and Twitter users, respectively,
to join in on the brand conversation and give feedback. Comments/
replies on brand posts can be positive, neutral or negative. In most
cases, social media users who choose to become fans of a product are
those who are particularly passionate about a brand and its products
and enjoy having input or being a member of a group of like-minded
fans. The brand benefits from these fans because they help communicate
with a diverse audience of other consumers.
Such individual activities associated with a brand post are visible to
network friends and many times influence friends to retweet, like, or
mention. If a company produces fan page updates that earn high quality
scores, it will reap the benefits of greater exposure and possibly an
increased fan base because other network members will see them in their
news feeds. Jansen et al. [38] discuss OSNs as a form of electronic word
of mouth (eWOM) for sharing consumer opinions concerning brands
and as a part of an organization's marketing strategy. This openness
and flexibility of social media provides businesses a great opportunity
to bring together a group of people, or crowd, to solve a problem or
engage in an activity and achieve powerful social engagement.
In many ways, the interactivity of social media supports
crowdsourcing. Crowdsourcing is a term coined by journalist Jeff
Howe [34] to mean “taking advantage of the talent of the public” [46].
Social media provide platforms for existing and potential customers to
engage, learn, and be entertained. They enable content marketers to
crowdsource their marketing, reaching vast audiences via word of
mouth. For example, Starbucks developed the “My Starbucks Idea”
campaign, an online customer community, where customers are asked
to contribute their views and ideas about the company. It keeps
customers in the loop on what business ideas Starbucks is currently
implementing on both the brand and product level. Through linking
this platform to Facebook, Twitter and other social media websites,
customers are able to see what others are suggesting, vote on ideas
and check out the results [49].
Internet service providers, content creators, and online marketers
would like to be able to predict how many views and actions an individual
item might create on a given website [58]. This is true as well for companies
that benefit from aspects of online social networks by utilizing
fan pages and web advertising. Leveraging the social networking sites
to understand what is most popular helps e-commerce providers decide
what content to promote on their website. E-commerce providers can
leverage these social signals to ensure the products or services people
are talking about appear higher in their product listings.
Over the last few years, much effort has been devoted to exploring
the statistical features of content popularity in online social networks
(OSNs). Most previous empirical analyses of OSNs have treated such
networks as static [29,61]. They analyze the social networks on a single
data snapshot [3,28,45]. However, such social network systems are
inherently dynamic, characterized by a high burstiness and a strong
positive correlation between two users' activities and consist of a set
of dyadic, directed, time-stamped, cross-affected and sometimes
weighted events. To the best of our knowledge, only a few studies
have analyzed popularity growth patterns of content on OSNs using
prediction models [16,22,29,44,58]. Crane and Sornette [22] propose
contagion models as models of YouTube video viewing dynamics to
understand how popularity bursts can be described. They differentiate
four classes of popularity dynamics (memoryless, viral, quality and
junk) which are all explained by properties of Hawkes point process.
Szabo and Huberman [58] find a strong linear correlation between
early and later times of the content popularity on YouTube and Digg
networks. This correlation confirms that if the content is popular
when new, it will continue to be popular as it ages. Another interesting
work on social media mining is reported by Chatzopoulou et al. [17].
They find a strong correlation between total number of comments (or
favorites) and total view count in YouTube. There are relatively few
studies in the literature which explore the capability of online social
networks to predict real-world outcomes such as the revenue or release
time of a product on the market. Sadikov et al. [62], Abel et al. [1] and Rui
and Whinston [54] present case studies in which blogosphere content
can be used as a predictor of movie and music success. They show that
the number of microblog views of content related to the music or
movie (such as FB posts, tweets, YouTube videos, etc.) can provide an
accurate prediction of the movie's or music's success.
While previous studies build popularity models based on a one-
dimensional function of time, we suggest that the content popularity
can be a joint probability function of time and the number of followers.
We focus more on incorporating the number of followers as an influential
metric into predictive models of the content popularity, explicitly
looking at the impact of influential users on their followers to persuade
them to contribute to brand post popularity. In this paper, we adapt a
mathematical framework based on self-exciting point process to study
brand post popularity on online social networks. Specifically, we
calibrate one-dimensional and two-dimensional self-exciting point
process models to estimate popularity growth patterns of brand post
contents on Twitter.
2.2. Stochastic point processes
In this section we present the statistical theory underlying our
approach. First, we define the conditional intensity function for a point
process. A point process is a stochastic model commonly used
to describe the occurrence of discrete events in time and space. It can
be viewed in terms of a list of times $t_1, t_2, \ldots, t_n$ at which corresponding
events $1, 2, \ldots, n$ occur [27]. Intuitively, a point process is characterized
by its conditional intensity $\lambda(t)$, which represents the mean spontaneous
rate at which events are expected to occur given the history of the
process up to time $t$ [50]. In particular, a version of the conditional
intensity may be given by the process
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{E\left[\, N[t, t+\Delta t] \mid H_t \,\right]}{\Delta t}$$
where $H_t$ denotes the history of events prior to time $t$, and the expectation
represents the number of events $N[t, t+\Delta t]$ occurring between
time $t$ and $t+\Delta t$. The Poisson process is a special case of a point process
where the interval times between two arrivals are independent, identically
distributed exponential random variables. The conditional intensity
of a Poisson process is deterministic, which means that events are
linked causally to the conditional intensity. In other words, a point
process is classified as a Poisson process if events occurring at two
different times are statistically independent of one another, meaning
that an event at one time neither increases nor decreases the probability
of an event occurring at any subsequent time [27]. Since a homogeneous
Poisson process indicates complete randomness, it is most commonly
used as a suitable benchmark for assessing self-exciting process models.
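As a minimal sketch of the Poisson benchmark described above, the following Python snippet simulates a homogeneous Poisson process by drawing independent exponential inter-arrival times. The function name `simulate_poisson`, the rate and the horizon are illustrative assumptions, not values taken from the study.

```python
import random

def simulate_poisson(rate, horizon, seed=0):
    """Simulate event times of a homogeneous Poisson process on [0, horizon]."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # i.i.d. exponential inter-arrival time
        if t > horizon:
            return times
        times.append(t)

# Illustrative values only: 2 events per time unit over 1000 time units.
events = simulate_poisson(rate=2.0, horizon=1000.0)
empirical_rate = len(events) / 1000.0
```

Because past events do not feed back into the rate, the empirical rate should stay close to the constant `rate` regardless of how the events happen to cluster.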
A point process is called self-exciting if any one event increases the
likelihood of future events [32]. A self-exciting or Hawkes point
process is a versatile point process which has been extensively studied
from a theoretical and practical point of view. It is defined by its
conditional intensity function
$$\lambda(t) = \mu + \beta \int_{-\infty}^{t} \phi(t-u)\, dZ(u) = \mu + \beta \sum_{t_i < t} \phi(t - t_i)$$
where $Z$ is the normal counting measure [33]. The rate of events $\lambda(t)$ is
decomposed into the sum of a Poisson background rate $\mu$, which in most
applications is assumed to be constant in time [33], and a self-exciting
component in which events trigger an increase in the rate of the
process. The self-exciting part of the process has two components: $\beta$
and $\phi$. $\beta$ is a constant which reflects the magnitude of self-excitation
and $\phi$ is a density function describing the waiting time (lag) distribution
between excited and exciting events. A proper skewed distribution in
which the overall shape reflects a long time dependency should be
introduced for the triggering density.
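The decomposition into a background rate and a self-exciting sum can be sketched in a few lines of Python. This is an illustration under stated assumptions: `hawkes_intensity` and `unit_exponential` are hypothetical names, and the exponential kernel with rate 1 is chosen only because it is a valid density, not because the source prescribes it at this point.

```python
import math

def hawkes_intensity(t, history, mu, beta, phi):
    """Return mu + beta * sum of phi(t - t_i) over past events t_i < t."""
    return mu + beta * sum(phi(t - t_i) for t_i in history if t_i < t)

def unit_exponential(lag):
    """Hypothetical triggering density: exponential with rate 1 (integrates to 1)."""
    return math.exp(-lag)

history = [0.0, 0.5]  # illustrative event times
# Before any event has occurred, only the background rate contributes.
before = hawkes_intensity(0.0, history, mu=0.2, beta=0.8, phi=unit_exponential)
# Shortly after two events, both add a decaying contribution to the rate.
after = hawkes_intensity(0.6, history, mu=0.2, beta=0.8, phi=unit_exponential)
```

Passing the kernel as a function keeps the sketch agnostic about which skewed, long-tailed density is eventually chosen for the triggering term.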
In the Hawkes-based analysis, the events can be viewed as
the realization of a multivariate point process. That is, every single
event is characterized by the occurrence time and the event's type.
Notationally, $\{T_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ are random variables where $T_i$ is the
occurrence time of the $i$th event and $Z_i \in \{1, 2, \ldots, M\}$ indicates the $i$th event's
type [13]. A point process is said to be mutually exciting if any one event
of a specific type increases the likelihood of a later
event in another event type's stream. The mutually exciting
Hawkes process is used to capture cross interactions and
mutual information between one sequence of events and another.
Similar to the self-exciting Hawkes process, a mutually exciting Hawkes
process with $n$ event type(s) is defined by its conditional intensity
$$\lambda_k(t) = \mu_k + \sum_{j=1}^{n} \sum_{t_i^{j} < t} \beta_{kj}\, \phi_{kj}\!\left(t - t_i^{j}\right), \qquad k = 1, 2, \ldots, n$$
$$\int_0^{\infty} \phi_{kj}(\upsilon)\, d\upsilon = 1, \qquad \phi_{kj}(\upsilon) \le 1, \qquad \upsilon \ge 0$$
where $t_i^{j}$ is the time of the $i$th event of type $j$ and the rate of event type $k$,
$\lambda_k(t)$, is partitioned into the sum of a
Poisson background rate and mutually exciting components in which
events trigger an increase in the rate of the process. $\beta_{kj}$ is a constant
which reflects the strength of self-excitation for ($k = j$) and the strength
of mutual excitation for ($k \ne j$), and $\phi_{kj}$ is a density function describing
the triggering distribution between excited event type $k$ and exciting
event type $j$.
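To illustrate how cross-excitation between event types enters the rate, here is a small Python sketch. It assumes, purely for illustration, that every pair of types shares one exponential triggering density and that types are indexed from 0; `mutual_intensity`, the background rates and the excitation matrix are hypothetical.

```python
import math

def mutual_intensity(k, t, events, mu, beta, phi):
    """Rate of type-k events at time t given typed past events.

    events: list of (time, type) pairs; beta[k][j] is the excitation that a
    type-j event adds to the type-k stream.
    """
    return mu[k] + sum(beta[k][j] * phi(t - t_i) for t_i, j in events if t_i < t)

def phi(lag):
    """Hypothetical shared triggering density (exponential with rate 1)."""
    return math.exp(-lag)

mu = [0.1, 0.3]            # background rates for types 0 and 1
beta = [[0.5, 0.2],        # row k: excitation received by type k
        [0.0, 0.4]]
events = [(0.0, 1)]        # a single type-1 event at time 0

rate_type0 = mutual_intensity(0, 1.0, events, mu, beta, phi)  # cross-excited by type 1
rate_type1 = mutual_intensity(1, 1.0, events, mu, beta, phi)  # self-excited
```

The diagonal of `beta` plays the role of self-excitation and the off-diagonal entries the role of mutual excitation.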
Hawkes-based analysis has long been used in seismology to
recognize similar clustering patterns in earthquake occurrence data
and to predict subsequent earthquakes, or aftershocks [2,51,59,60].
It has been applied to many other areas such as finance [10,13],
neurophysiology [19], ecology, social networks [5,27,47] and online
social networks [22,41].
Engle and Lunde [26] and Bowsher [13] present a bivariate Hawkes
process model to jointly analyze the timing of trades and quote arrivals
in stock markets. Chavez-Demoulin et al. [18] and Bacry et al. [6] use
Hawkes process structure to estimate value at risk for portfolios of
traded assets over a given holding period of time. Dassios and Zhao
[24] present dynamic contagion process as a generalization of the Cox
process and Hawkes process and use it to model risk process with the
arrival of claims.
Mohler et al. [47], Egesdal et al. [63], and Erik et al. [27] use self-
exciting point process models to predict violent events and security
threats. Erik et al. [27] utilize step functions parameterized by various
values, linear functions and non-parametric approaches as non-
stationary background rates (μ) of the point process.
Alexey et al. [5] use a self-exciting point process to discover missing
data in the series of interaction events between agents in a social
network. They apply this model to the Los Angeles gang network to
predict affiliation of the unknown offenders.
Recently, this approach has been used to analyze the dynamics of online
social networks. Crane and Sornette [22] and Mitchell and Cates [64]
analyze a family of self-exciting point processes to model correlated
event timing of viewing YouTube videos. They deploy a Pareto distribution
(power law) as a distribution of waiting times between cause and
action, describing the cascade of influences on the online social network.
It is shown that a Hawkes process incorporating power law distributions
offers many capabilities to calibrate the model to characteristics of the
YouTube views. These characteristics are classified by a combination of
endogenous/exogenous user interactions and the ability of viewers to
influence others to respond across the network (critical/subcritical).
Howison et al. [35] deploy a mutually exciting Hawkes process to understand
the dynamics of user-generated content on open contribution
platforms such as Wikipedia and Linux. They study the influence
of visible activity of others on the timing and amount of participation in
the Wikipedia environment. They model the time at which a response to an
event occurs as a log-normal distribution. However, this analysis has not yet
been conducted on social media activities, in particular on Twitter postings
and follow-up actions. Also, the role of the influential users within
OSNs has not yet been considered in such predictive models.
In this paper, we provide a more realistic investigation of the
benefits of stochastic point processes for predicting the brand post
popularity on OSNs. To the best of our knowledge, there are relatively
few studies in the literature which explore the capability of point
processes on online social networks to model dynamics and growth
patterns. We use the ETAS model, one of the most widely used point
processes in the literature, to shed light on how the content popularity
on OSNs can be described by a function of time and the number of
followers. The number of followers is one of the best metrics to demonstrate
the role of the influential users within OSNs.
3. Problem formulation
Understanding rules governing collective human behavior, especially
as they affect social interactions on internet-based social media, is a
difficult task in the field of social media analytics. Our main objective
is to analyze how the popularity of individual brand posts evolves
when the posts are shared with people on social media outlets. We
examine how fans' sequential interactions with network friends
contribute to the popularity of a brand post. The majority of brand
posts experience few hits and can be well described by a Poisson
process. In such a case of little activity, popularity oscillation is quite
steady. In contrast, some brand posts experience bursts of activity and
word of mouth growth through friend sharing features of OSNs. A
standard stochastic process (i.e. Poisson process) fails to address the
burst of popularity, since it is based on the assumption of independent
arrivals, which is unrealistic in the case of future activities arising
from a specific tweet/post/etc. Clustering point processes and epidemic
type models are a good fit for modeling such phenomena.
In the online social networks analysis, the social activity event data
can be viewed as the realization of a multivariate point process. Each
event is characterized by its occurrence time ($t_i$) and the magnitude of
influence (number of followers) ($m_i$), with an additional mark attached
to it representing the event's type ($z_i$). Retweeting, replying, tagging
and marking a brand post as a favorite, etc. are different types of user
activities. For the purpose of this paper, we combine these three types
of events into one common set of events.
The beauty of major OSN platforms is that they are structurally
isomorphic. Their similar features, while labeled with site-specific
vocabulary, operate in the same way, making studies of their data easier.
For the purposes of this paper, we will utilize Twitter notations to
explain properties of OSNs.
In order to build our two-dimensional point process, we define
$\{T_i, M_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ as random variables where $T_i$ is the occurrence
time, $M_i$ the magnitude of the $i$th triggering event and $Z_i \in \{1, 2, \ldots\}$
indicates the type of the $i$th event. Any event of a specific type
increases the likelihood of a later event of any type.
Now, we formulate the problem using Hawkes process properties
and discuss how those mechanisms work on the timeline.
3.1. Candidate models
First, we formulate one sequence of events using a self-exciting point
process to measure the likelihood that individuals are talking about the
brand regardless of the type of events. This model lets us aggregate the
content popularity from across Twitter into a single stream of information.
It concurrently captures the idea that any given activity on a brand
post can causally correspond to a background Poisson process $\mu$ (in this
case constant) and a foreground self-exciting process as follows:
$$\lambda(t) = \mu + \sum_{t_i < t} \beta\, \phi(t - t_i) \tag{3}$$
The summation component indicates the influence of users' activity
on the stream. It describes how past events at times $t_i$ influence the
current event rate. Parameter $\beta$ indicates the amount of excitation an
event contributes to the stream. In behavioral terms, it can be described
as the number of potential users influenced directly by individuals in
the past who retweeted or replied to the brand post tweet at time $t_i$.
As mentioned earlier, function $\phi$ is a triggering function describing
the distribution of waiting time between a trigger and the response from
users who are influenced to recommend the brand. Mining of our data on
the life cycles of various brand posts in Twitter indicates that, unlike
on YouTube, a brand's tweet gets most of its hits within the first days,
or even hours, of its life cycle and quickly becomes obsolete. Since most
responses occur almost immediately in the Twitter case, we need a
distribution that enforces the highest intensity at the most immediate
possible time. Furthermore, it should be skewed and long tailed to
reflect a long time dependency and burstiness.
3.1.1. Model 1
First, we use an exponential distribution for the response density,
giving the conditional intensity
$$\lambda(t) = \mu + \sum_{t_i < t} \beta\, \alpha\, e^{-\alpha (t - t_i)} \tag{4}$$
where $t - t_i$ is the time elapsed since event $i$, and $\alpha$ reflects the rate
of decay of the triggering density, which controls how long self-excitation
lasts following a tweet. If $\alpha$ is large, mentions of the brand
post by users will last only a short while, and only a few events (retweets or
replies) will be added above the background rate over a short period of time
after the initial brand's tweet. Conversely, if $\alpha$ is small,
self-excitation will last for a much longer period of time, and
many more events will be added to the background rate.
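A rough way to see how such a process generates bursts is to simulate it. The Python sketch below uses Ogata-style thinning and assumes the exponential response density is normalized, i.e. each past event contributes `beta * alpha * exp(-alpha * (t - t_i))`; that normalization, the function names and all parameter values are assumptions made for illustration, not the authors' calibrated settings.

```python
import math
import random

def exp_hawkes_intensity(t, events, mu, beta, alpha):
    """mu + sum of beta * alpha * exp(-alpha * (t - t_i)) over events t_i < t."""
    return mu + sum(beta * alpha * math.exp(-alpha * (t - t_i))
                    for t_i in events if t_i < t)

def simulate_exp_hawkes(mu, beta, alpha, horizon, seed=0):
    """Simulate event times on [0, horizon] by Ogata-style thinning."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # The kernel only decays, so the intensity right after time t
        # (counting an event at t itself) bounds the intensity until the
        # next accepted event.
        upper = mu + sum(beta * alpha * math.exp(-alpha * (t - t_i))
                         for t_i in events)
        t += rng.expovariate(upper)  # candidate time drawn from the bounding rate
        if t > horizon:
            return events
        # Accept the candidate with probability lambda(t) / upper.
        if rng.random() * upper <= exp_hawkes_intensity(t, events, mu, beta, alpha):
            events.append(t)

# Illustrative parameters: background rate 1, half an offspring per event on average.
events = simulate_exp_hawkes(mu=1.0, beta=0.5, alpha=2.0, horizon=300.0)
```

Under this normalization and with `beta` below 1, each event triggers on average `beta` direct follow-up events, so the long-run event rate approaches `mu / (1 - beta)`, here about twice the background rate.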
3.1.2. Model 2
There is another characteristic of events in OSNs that should
be taken into consideration. We suggest that a user's
contribution to future events depends not only on the occurrence
time of his/her activity but also on the number of followers he/she
has. Therefore, our second model takes into consideration two
quantities: the occurrence time and the magnitude of the triggering
event (number of friends and followers). In other words, the effect of
an event scales not just with the occurrence time but also with the
magnitude of the triggering event.
One particular form of a self-exciting point process is the ETAS
model (space–time–magnitude Hawkes process), which is widely
used to describe spatial–temporal patterns. This model takes more
parameters (inputs) into account. We use an early form of this model
(i.e., the time–magnitude Hawkes process), similar to [50], to quantify the
popularity of a brand tweet. This model incorporates magnitudes and
occurrence times of triggering events concurrently. The conditional
intensity for the ETAS model is given by
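
$$\lambda(t \mid H_t) = \mu + \sum_{t_i < t} \phi(t - t_i, m_i) \tag{5}$$

where the history of the process $H_t = \{(t_i, m_i) : t_i < t\}$ also includes
the magnitudes m_i, μ is the arrival rate of new users, and ϕ is a triggering
function. The ETAS model uses a combination of the exponential distribution
and the Pareto distribution for the triggering density ϕ, giving the
conditional intensity

$$\lambda(t \mid H_t) = \mu + \sum_{t_i < t} \frac{\beta \, e^{\alpha (m_i - M_0)}}{(t - t_i + c)^{p}}$$
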
where the power-law term governs the temporal distribution of subsequent
triggered events and the exponential term gives the factor by which
the user's magnitude m_i inflates the expected number of influencers. The
term t − t_i denotes the time elapsed since event i. β is the amplitude
coefficient indicating the amount of direct excitation triggered by
event i. The exponent p is the decay rate, α is interpreted as the
productivity rate controlling the number of potential users influenced by
individuals in the past, and c is a time offset that is determined
empirically from the dataset under consideration. Furthermore, M_0 is
the lowest magnitude (number of followers), which is taken
from the dataset (rescaled to the appropriate range).
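The time–magnitude intensity described above can be sketched in a few lines of Python. The sketch assumes the standard ETAS form, a background rate plus β·exp(α(m_i − M_0))/(t − t_i + c)^p summed over past events. The function name, the toy history and the parameter values are hypothetical.

```python
import math

def etas_intensity(t, events, mu, beta, alpha, p, c, m0):
    """mu + sum over t_i < t of beta * exp(alpha * (m_i - m0)) / (t - t_i + c) ** p."""
    total = mu
    for t_i, m_i in events:
        if t_i < t:
            total += beta * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    return total

# Hypothetical parameters and (time, rescaled follower magnitude) pairs.
params = dict(mu=0.5, beta=0.01, alpha=2.0, p=1.2, c=0.01, m0=0.3)
low = etas_intensity(3.0, [(2.0, 0.3)], **params)    # trigger at the minimum magnitude
high = etas_intensity(3.0, [(2.0, 1.2)], **params)   # same time, more followers
print(low, high)
```

Holding the event time fixed, the event with the larger magnitude contributes more to the intensity, which is the behaviour the exponential productivity term is meant to capture.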
3.2. Empirical testing of the models on Twitter datasets
We next apply these models to real data. As mentioned earlier, a
basic analysis of our data on the life cycles of various brand posts on
Twitter, using the Topsy API and the Twitter search API, indicates that the
majority of a brand's tweets get most of their activity within the first days,
even hours, of their life cycle and hence quickly become obsolete. Since we
focus on brand post popularity, we take brand posts that experience
bursts of activity and electronic word-of-mouth growth through the
friend-sharing features of Twitter. Using Twitter's publicly available
API, we crawled the Twitter information streams of more than 120 major
brands that were among the top 500 most valuable global brands [14].
These brands were among the most followed brands and were actively
posting tweets on their fan pages on Twitter. They come from
different product and service categories, including clothing, cosmetics,
electronics, accessories, foods, beverages, automotive, credit cards,
airlines, etc. Together, these brands published more than 26,500 tweets
in a typical period of one week to provide information to their
customers and promote their latest products, campaigns and events.
We downloaded information on all subsequent activities (retweets,
replies, and marks as favorite) for all of these 26,500
brand post tweets. We observed that the majority of brand post tweets
experience few hits and therefore, as mentioned earlier, can be modeled
by a Poisson process. However, some brand posts become a
major topic (“trending” in Twitter parlance), are frequently mentioned
by the brand's followers, and experience bursts of activity. For the
purpose of this paper, we searched through the downloaded tweets to
isolate those that are original tweets from the brands and
that have been mentioned (retweeted, replied to, or marked
as favorite) at least 300 times. We identified 221 such brand post tweets
followed by many hits and bursts of activity. At this
stage, 125,861 Twitter activities, including information on the original
tweets and all subsequent retweets, replies and marks as favorite,
were processed. The data were divided into individual
datasets. Each dataset contains an individual brand
post tweet and its subsequent activities (retweets, replies, and marks as
favorite), along with their timestamps, user ids and the number of followers
of the user who contributes to the tweet stream. We take into consideration
only the timestamp of events and the number of followers, while
aggregating the events “retweet”, “reply”, and “mark as favorite” into a
single stream of information.
We investigated the content of these 221 most popular brand tweets
and noted that the primary topic was brand campaigns on Twitter
(44%). Some of these campaigns use Twitter to communicate with
fans and followers. Several campaigns use Twitter hashtags to deliver
rewards and sweepstakes to customers. Other campaigns run interactive
competitions to create buzz with fans. The second most engaging
brand tweet category relates to events held by the brands on
Twitter (36%), including surveys. The rest of the most popular
brand tweets were related to information and entertainment posted
by brands on Twitter.
3.3. Parameter estimation, goodness-of-fit, and model comparison
Given brand post data collected from Twitter, we use maximum
likelihood estimation (MLE) to estimate the parameters
of the candidate self-exciting point process models. While numerical
optimization routines such as the quasi-Newton method, the conjugate
gradient method, the simplex algorithm of Nelder and Mead, and
simulated annealing [23,50,52] are often used to compute
maximum likelihood estimates for self-exciting point process
models, we use the expectation-maximization (EM) algorithm provided
by Veen and Schoenberg [59] to estimate the parameters. Veen and
Schoenberg [59] have demonstrated that the EM algorithm, as the estimation
method of choice for incomplete data problems, is extremely robust
and accurate compared to traditional methods. The brand post
popularity can be viewed as an incomplete data problem in which the
unobservable or latent variables ascertain whether an activity belongs
to a background event or whether it is a foreground event and was
triggered by a preceding activity.
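The latent-variable view can be illustrated with the expectation step of an EM-type scheme. The sketch below is not the Veen and Schoenberg code. It assumes an exponential kernel of the form β·exp(−α(t − t_j)) and sorted event times, and the function name is hypothetical. For each event it returns the probability that the event is a background event and the probabilities that it was triggered by each earlier event.

```python
import math

def triggering_probabilities(event_times, mu, alpha, beta):
    """For each event i, return (P(background), [P(triggered by j) for j < i])."""
    result = []
    for i, t_i in enumerate(event_times):
        # Contribution of every earlier event j to the intensity at t_i.
        kernel = [beta * math.exp(-alpha * (t_i - t_j)) for t_j in event_times[:i]]
        lam = mu + sum(kernel)
        result.append((mu / lam, [k / lam for k in kernel]))
    return result

# Hypothetical event times in minutes and illustrative parameter values.
for background, triggered in triggering_probabilities([0.0, 1.0, 1.5], 0.2, 1.0, 0.5):
    print(round(background, 3), [round(q, 3) for q in triggered])
```

In a full EM iteration these probabilities would be used as weights when re-estimating the parameters in the maximization step, and the two steps would be repeated until the estimates stabilize.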
Finally, the reliability of each model is statistically tested using the
Kolmogorov–Smirnov (K–S) statistic to assess the extent to which the
model fits the data. This criterion provides useful information on the
absolute goodness-of-fit of the candidate models. Furthermore, the relative
ability of each model to describe the data is measured by computing the
Akaike information criterion (AIC) [4]. The AIC provides
germane numerical comparisons of the global fit of competing models.
R packages and code are used for fitting both of the
above models to the datasets (the ptproc package [53], PtProcess [31],
the ETAS package (Jalilian [65]), and R code [59]).
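For readers who want to see what the likelihood-based criteria involve, here is a minimal Python sketch of the log-likelihood of an exponential-kernel Hawkes process and of the AIC, defined as 2k − 2 log L for k fitted parameters. It is an illustration under the assumption that the intensity is μ + β Σ exp(−α(t − t_i)); it is not the R code used in the paper, and the function names and toy data are hypothetical.

```python
import math

def hawkes_exp_loglik(event_times, horizon, mu, alpha, beta):
    """Log-likelihood of sorted event times on [0, horizon]:
    sum of log-intensities at the events minus the integrated intensity."""
    loglik = 0.0
    decayed_sum = 0.0   # sum of exp(-alpha * (t_k - t_i)) over earlier events i
    previous = None
    for t_k in event_times:
        if previous is not None:
            decayed_sum = math.exp(-alpha * (t_k - previous)) * (decayed_sum + 1.0)
        loglik += math.log(mu + beta * decayed_sum)
        previous = t_k
    compensator = mu * horizon + (beta / alpha) * sum(
        1.0 - math.exp(-alpha * (horizon - t_i)) for t_i in event_times
    )
    return loglik - compensator

def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik

# Hypothetical event times (minutes) and parameter values.
toy_times = [0.5, 1.0, 2.5]
ll = hawkes_exp_loglik(toy_times, horizon=4.0, mu=0.2, alpha=1.5, beta=0.8)
print(aic(ll, n_params=3))
```

A numerical optimizer or an EM-type scheme would search over (μ, α, β) to maximize this log-likelihood; the AIC is then computed at the maximum.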
Furthermore, we employ autoregressive integrated moving average
(ARIMA) models as benchmarks, which have been regarded as the
closest framework to point processes for event data [23]. We used the
R package forecast (Hyndman et al. [66]) to perform the time series
analysis. This package allows fitting of time series and linear models.
The functions available in this package conduct a search over possible
models within the order constraints provided and return the best
ARIMA model for a univariate time series according to AIC values. In
the next section, we first present our results for one of our crawled
datasets to illustrate how our approach works, and then we discuss the
goodness-of-fit of the candidate models by computing their average
AIC values across all the datasets that we compiled from Twitter.
In summary, Fig. 1 illustrates the methodology used for modeling
content popularity on Twitter in this paper. At each stage, the inputs,
the R packages used to produce the results, and the outputs are
specified clearly.
Fig. 1 (flowchart contents):
Input: individual tweet dataset (including user ids, timestamps, number of followers) in CSV or XML format
Point process branch:
  Parameter estimation for the point process models: Veen and Schoenberg's R code, PTPROC and PTPROCESS R packages
  Simulation: PTPROC
  Output: estimated parameters, log-likelihood function value, simulated conditional intensity function, K–S and AIC values, etc.
Benchmark branch:
  Parameter estimation and simulation for the benchmark ARIMA (p,q,r) model: Forecast package
  Output: estimated parameters for the best-fitted ARIMA model, log-likelihood function value, simulated ARIMA model, AIC value, etc.
Models comparison: AIC values
Fig. 1. The methodology used in predicting the online content popularity on Twitter.
Fig. 2. Frequency of different types of events (axis: frequency of activities; categories include mark as favorite).
Fig. 3. A histogram of the number of events per minute.
Fig. 4. Simulated conditional intensity function for model #1.
4. Results and analysis
In this section, we focus on one particular dataset to demonstrate
how the models work in practice. We set Δt = 1 min as the bin width in
order to control the amount of data through the parameter t. This
specific dataset contains 751 events spanning 10,080 min (one week).
Figs. 2 and 3 show the frequency of the different types of hits and a histogram
of the frequency of all events per minute, respectively. The maximum number
of events in a single minute is 15, and the mean number of events per
minute is 0.074. Of the 751 events, 278 occurred
during the first two days. Thus, we reason that people respond to a
brand post tweet immediately. Therefore, we expect that the
distributions to be selected should place the largest probability mass
at the earliest possible response time.
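The binning step and the summary figures quoted above can be reproduced with a short Python sketch. The helper name and the toy timestamps are hypothetical; only the totals (751 events, 10,080 min, 278 events in the first two days) come from the text.

```python
from collections import Counter

def events_per_minute(timestamps_min, horizon_min):
    """Count events in consecutive 1-minute bins covering [0, horizon_min)."""
    counts = Counter(int(t) for t in timestamps_min if 0 <= t < horizon_min)
    return [counts.get(b, 0) for b in range(horizon_min)]

# Toy example: three events within the first five minutes.
bins = events_per_minute([0.2, 0.9, 3.5], horizon_min=5)

# Figures quoted in the text.
mean_rate = 751 / 10080      # mean number of events per minute
early_share = 278 / 751      # share of events in the first two days
print(bins, round(mean_rate, 4), round(early_share, 3))
```

The mean rate works out to roughly 0.0745 events per minute, in line with the reported 0.074, and about 37% of all events fall within the first two of the seven days.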
Table 1 summarizes the parameter estimates for the first candidate model.
The fit of the self-exciting point process model to the data is plotted
in Fig. 4.
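A simulated event stream of the kind underlying such a plot can be generated with a thinning procedure in the style of Ogata. The sketch below is a generic illustration and not the ptproc routine used in the paper. It assumes the intensity μ + β Σ exp(−α(t − t_i)), uses the Table 1 estimates as inputs, and the function name and seed are arbitrary choices.

```python
import math
import random

def simulate_hawkes_exp(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0   # beta * sum(exp(-alpha * (t - t_i))) at the current time
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for proposing the next candidate.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        excitation *= math.exp(-alpha * wait)
        t += wait
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += beta

# Parameter values from Table 1; one week expressed in minutes.
events = simulate_hawkes_exp(mu=0.05673, alpha=12.14027, beta=2.91944,
                             horizon=10080.0, seed=1)
print(len(events))
```

Repeating the simulation with different seeds gives a sense of the variability in the number and clustering of events that the fitted parameters imply.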
The parameter estimate for β indicates that immediately after an
event occurs, the conditional intensity is amplified by about 3 events
per minute. The parameter estimate for α indicates that an event related to
the brand post tweet is talked about for up to 12 min after posting.
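As a rough consistency check on the Table 1 estimates, and assuming the Model 1 intensity has the form λ(t) = μ + β Σ exp(−α(t − t_i)) (an assumption about how the kernel is normalized), the expected number of directly triggered events per event, n, and the implied long-run event rate can be worked out as follows.

```latex
n = \int_0^{\infty} \beta e^{-\alpha s}\, ds = \frac{\beta}{\alpha}
  = \frac{2.91944}{12.14027} \approx 0.240,
\qquad
\bar{\lambda} = \frac{\mu}{1 - n} = \frac{0.05673}{1 - 0.240}
  \approx 0.0747 \ \text{events per minute}.
```

Under this assumption the implied rate is close to the observed mean of 751/10,080 ≈ 0.0745 events per minute, and n < 1 indicates that the fitted process is stationary.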
Now let us look at the ETAS model, which takes into account the occurrence
time and the number of followers for every single triggering
event. Fig. 5 provides a snapshot of the number of followers of those
users who appear to have been influenced by the brand post tweet,
either spontaneously or in response to certain triggers.
Table 2 summarizes the parameter estimates for the ETAS model.
Simulated data from the corresponding ETAS point process model are
shown in Fig. 6.
Our hypothesis is that the greater the number of followers per event,
the greater the influence. Therefore, incorporating the number of
followers into our predictive model as another dimension presumably
provides better results. Fig. 6 reveals that the ETAS model is much
better able to capture jumps and leaps of the process compared to our
first model.
Statistical tests such as the K–S goodness-of-fit test and the AIC
allow us to test whether the number of followers impacts the
model. Table 3 summarizes the results of a two-sample K–S test
showing how well both models perform relative to the original
data. It contains the p-values and the values of the K–S test statistic
(D) corresponding to each model.
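The statistic D of a two-sample K–S test is the largest gap between the empirical distribution functions of the two samples. The Python sketch below computes it from scratch. Which quantities the paper compares (for example, observed versus simulated inter-event times) is not stated in this section, so the sample values here are hypothetical.

```python
def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum absolute gap D between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value less than or equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical observed and simulated waiting times, in minutes.
observed = [1, 2, 3, 4]
simulated = [3, 4, 5, 6]
print(ks_two_sample_statistic(observed, simulated))
```

A smaller D means the simulated sample tracks the observed one more closely; the p-value is then obtained from the distribution of D under the hypothesis that both samples come from the same distribution.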
These results support our hypothesis that incorporating the number
of followers into the predictive models provides a better simulation for
understanding such phenomena.
Since the ETAS model has more parameters than the
self-exciting Hawkes process, AIC values are used to analyze the parsimony,
complexity and accuracy of the models. The homogeneous Poisson
model is also often used as a reference model for comparing
competing point process models. Table 4 summarizes the AIC values for the
candidate models.
The AIC values show that the ETAS model has the
minimum AIC value. Therefore, the ETAS model provides a better fit
than the homogeneous Poisson model, the self-exciting Hawkes process,
or the benchmark ARIMA time series model.
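The comparison can be made explicit by ranking the reported AIC values and computing each model's gap to the best one. The sketch below also evaluates the closed-form AIC of a homogeneous Poisson process fitted by maximum likelihood to 751 events over 10,080 min, as a sanity check on the reference model. The dictionary labels and function name are illustrative; the AIC values are those reported in Table 4.

```python
import math

def poisson_aic(n_events, horizon):
    """AIC of a homogeneous Poisson process with the rate fitted by MLE."""
    rate = n_events / horizon
    loglik = n_events * math.log(rate) - rate * horizon
    return 2 * 1 - 2 * loglik   # one free parameter: the rate

# AIC values reported in Table 4.
table4 = {
    "ARIMA (3, 1, 3)": 9787.670,
    "Poisson": 5401.592,
    "Hawkes (Model #1)": 4473.017,
    "ETAS (Model #2)": 4012.011,
}
best = min(table4, key=table4.get)
delta_aic = {name: value - table4[best] for name, value in table4.items()}
baseline = poisson_aic(751, 10080)
print(best, delta_aic, round(baseline, 1))
```

The closed-form Poisson baseline comes out near 5.4 × 10^3, the same order as the Poisson entry in Table 4, and the gap of about 461 between the Hawkes and ETAS models is large relative to the usual rule of thumb that AIC differences above roughly 10 indicate strong support for the lower-AIC model.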
We next estimate the self-exciting Hawkes process model, the ETAS
model, the benchmark Poisson process model and the benchmark
ARIMA model and compare their goodness-of-fit by computing their
average AIC values across all datasets.
According to Table 5, the ETAS model has the lowest average AIC
value. The proposed ETAS model outperforms the three benchmarks,
which indicates that it can capture the influence network better than
the other models. The benchmark homogeneous Poisson process and the
benchmark ARIMA time series model fare much worse than
the ETAS and the self-exciting Hawkes process. The Poisson process
model fails to capture any exciting effects among user activities when
making predictions. Also, the ARIMA time series model appears to fail
to capture the dependency between the current event and past
events on the timeline. Recall that, in the online content popularity
context, where the occurrence of an event increases the likelihood of
subsequent events, whether slightly or greatly, it is imperative to
account for exciting effects among users' activities.
Our result implies that the impact of the number of followers on
brand post popularity is an important issue in OSNs. It is necessary to
consider the event occurrence time and the number of followers as
two major factors in modeling online social dynamics.
We found that the ETAS model predicts the popularity of brand posts
much more accurately. It allows us to consider the role of
influential users in amplifying brand post popularity and, secondarily,
in proposing the brand to their networks of friends and followers. It
implies that influential users with a high number of followers can
have a significant influence in spreading the content of the brand post
to others.
5. Discussion and limitations
We have adapted a powerful approach for modeling the content
popularity in OSNs. In contrast to the previous studies that focused on
a one-dimensional function of time, the model recommended in this
paper allows us to characterize and quantify the content popularity as
a joint probability function of time and the number of followers. The
self-exciting Hawkes process and ETAS models have been calibrated to
Table 1
Specification of the self-exciting Hawkes process model (1) used for simulation.
Parameter    μ          α           β
Value        0.05673    12.14027    2.91944
Fig. 5. Number of followers over time (horizontal axis ticks from 0 to 10,000; vertical axis: number of followers).
Table 2
Specification of the self-exciting point process model (2) used for simulation.
Parameter    μ         β             α            p           c             m
Value        0.5886    0.01376837    2.1254544    1.157623    0.01343711    0.3
simulate popularity growth patterns of brand post contents on Twitter
and, as expected, the ETAS model outperforms the other models in
capturing bursts of activity over time.
This model can enable brand marketing managers to observe how
often their fans respond to their posts within OSNs, and gauge the
response to different types of content such as news, contests, applications,
video, pictures, product information, brand history, testimonials,
etc. They will also be able to see how these brand posts move
through the Internet. These predictive models can help companies
decide how often and when a new brand post should be posted, and
how many times the same piece of content can be shared in order to
engage more fans and followers. Certainly there is no magic number
for the ideal number of posts within OSNs; it is important for brands
to post enough content while refraining from posting too much at the
same time. The mathematical configuration of the ETAS model also
confirms that if the time difference between two consecutive events is
large enough, the brand post has most likely become obsolete, which
suggests that it is time to post new content to keep a connection
open with fans.
As another managerial implication of this study, the mathematical
formulation of the ETAS model reveals that the greater the number of
followers per event, the greater the influence. This means that a high
number of followers increases activity in posting tweets and the
likelihood of being retweeted. It highlights the role of influential users who
significantly affect the engagement of a brand post, even if they become
involved later. Thus, if companies identify and increase the number of
influential users within their online social networks, they should experience
an increase in brand recommendations and awareness. Engaging
more influential users during the early life of the brand post
could cause viral effects, which are likely to influence potential
consumers for a longer period. Many approaches have been proposed
to find influential users within OSNs. The simplest approach is to
count the number of followers, but there are other efficient techniques
based on mining link structure along with the temporal order of information
adoption [42].
Also, since fans' reactions and response time to different types of
the brand post content are dissimilar, it is important for brands to look
carefully at the performance of their various brand post contents
and see which of them during their lifecycle have similar looking
stationary/non-stationary background rates. If they do not follow the
same growth pattern, each category needs an individual point process
to represent it.
Our work proposes a mechanism for capturing the evolution of the
popularity of online content posted by brands on Twitter. It facilitates
the early prediction of a tweet's behavior on Twitter and the simulation
of the rhythm and timing of the most engaging postings. Through the
simulation and early prediction of a brand's tweet, brands gain a
better view of how to time promotions to foster relationships with customers.
Our research can be extended to determine a peak release time for
products of consumer interest on the market by analyzing aggregate/collective
brand posts from OSNs. If brand posts are not propagating
further on OSNs, it could indicate that the brand is losing its
fans' awareness and popularity, so improvement actions should be
taken.
Several limitations of our study deserve mention. First, we assume
that all users follow the same response time distribution for their own
activities. However, an individual's activity bursts form a sequence of
discrete events, and a single exponential or Pareto distribution is
unlikely to fit the long-term dependency of every user.
Another limitation is that the various types of events are aggregated.
Multivariate self- and mutually exciting point process models should be
developed to deal with different streams of information and measure
cross interactions and mutual information between one sequence of
events and another.
Furthermore, even though we chose a small time increment, i.e., Δt =
1 min for the bin width in order to control the amount of data through
the parameter t, we cannot determine whether events occurring in the same
minute are correlated with one another. This means that events recorded in
the same minute are assumed to be statistically independent.
Although we treat all fan response times as equally important, a
brand's most engaging minutes, hours and days of the
week could be tracked to determine the effective time windows that should
enter the computation in order to provide a better prediction.
In summary, our analysis indicates that a stationary Poisson process
for the background rate of spontaneous events is a rather unlikely
assumption in many social systems. The ETAS model and self-exciting
point process can be considered a more reliable underlying process.
6. Conclusion and directions for future research
This paper adopts a stochastic point process framework for analysis
of the dynamic microstructure of online social networks (OSNs).
Specifically, we investigate the possibility of using crowdsourcing on OSNs
Fig. 6. Simulated conditional intensity function for model #2.
Table 3
The K–S goodness-of-fit test output.
Model                                      D         p-value
Self-exciting Hawkes process (Model #1)    0.3993    0.03135
ETAS model (Model #2)                      0.1223    0.02216
Table 4
AIC test results.
Model                                      AIC
Time series model (ARIMA (3, 1, 3))        9787.670
Poisson model                              5401.592
Self-exciting Hawkes process (Model #1)    4473.017
ETAS model (Model #2)                      4012.011
Table 5
Models' comparative average AIC values.
Model                                      Average AIC
Time series model (ARIMA (p,q,r))          13,398.661
Poisson model                              9047.397
Self-exciting Hawkes process (Model #1)    7143.110
ETAS model (Model #2)                      6415.187
as a marketing mechanism to enhance brand awareness and popularity.
Such crowdsourcing activities help brands spur innovation and drive
brand awareness across OSN platforms. We describe such dynamics
in terms of the stochastic occurrence times and number of followers.
One-dimensional and two-dimensional self-exciting point process
models are calibrated to simulate popularity growth patterns of brand
post contents on Twitter. Our findings indicate that point process models are
able to describe the cascade of influencers on online social networks.
Our results suggest that incorporating the number of followers into
predictive models as another dimension of input provides a better
understanding of content popularity. Our future work focuses on
applying a full package of multivariate point processes to different
streams of events within OSNs.
References
[1] F. Abel, E. Diaz-Aviles, et al., Analyzing the blogosphere for predicting the success of music and movie products, International Conference on Advances in Social Networks Analysis and Mining (ASONAM), IEEE, 2010, pp. 276–280.
[2] L. Adamopoulos, Cluster models for earthquakes: regional comparisons, Mathematical Geology 8 (4) (1976) 463–475.
[3] Y.-Y. Ahn, S. Han, et al., Analysis of topological characteristics of huge online social networking services, Proceedings of the 16th International Conference on World Wide Web, ACM, Banff, Alberta, Canada, 2007, pp. 835–844.
[4] H. Akaike, Information theory and an extension of the maximum likelihood principle, 2nd Inter. Symp. on Information Theory, 1, 1992, pp. 610–624.
[5] S. Alexey, B.S. Martin, et al., Reconstruction of missing data in social networks based on temporal patterns of interactions, Inverse Problems 27 (11) (2011) 115013.
[6] E. Bacry, S. Delattre, et al., Modelling microstructure noise with mutually exciting point processes, Quantitative Finance (2012) 1–13.
[7] Y. Bae, H. Lee, A sentiment analysis of audiences on twitter: who is the positive or negative audience of popular twitterers? Proceedings of the 5th International Conference on Convergence and Hybrid Information Technology, Springer-Verlag, Daejeon, Korea, 2011, pp. 732–739.
[8] C.H. Baird, G. Parasnis, From social media to social customer relationship management, Strategy & Leadership 39 (5) (2011) 30–37.
[9] N.G. Barnes, Exploring the link between customer care and brand reputation in the age of social media, in: S. f. NC Research (Ed.), Society for New Communication Research, 2008.
[10] L. Bauwens, N. Hautsch, Modelling financial high frequency data using point processes, in: T. Mikosch, J.-P. Kreiß, R.A. Davis, T.G. Andersen (Eds.), Handbook of Financial Time Series, Springer, Berlin Heidelberg, 2009, pp. 953–979.
[11] P.R. Berthon, When customers get clever: managerial approaches to dealing with creative consumers, Strategic Direction 23 (8) (2007).
[12] P.R. Berthon, L.F. Pitt, et al., Marketing meets Web 2.0, social media, and creative consumers: implications for international marketing strategy, Business Horizons 55 (3) (2012) 261–271.
[13] C.G. Bowsher, Modelling security market events in continuous time: intensity based, multivariate point process models, Journal of Econometrics 141 (2) (2007)
[14] Brand Directory, BrandFinance Banking 500 2013 [online], [Accessed 07/01/2013] Available from 2013.
[15] L. Capozzi, L.B. Zipfel, The conversation age: the opportunity for public relations, Corporate Communications: An International Journal 17 (3) (2012) 336–349.
[16] M. Cha, H. Kwak, et al., Analyzing the video popularity characteristics of large-scale user generated content systems, IEEE/ACM Transactions on Networking 17 (5) (2009) 1357–1370.
[17] G. Chatzopoulou, S. Cheng, et al., A first step towards understanding popularity in youtube, INFOCOM IEEE Conference on Computer Communications Workshops,
[18] V. Chavez-Demoulin, A.C. Davison, et al., Estimating value-at-risk: a point process approach, Quantitative Finance 5 (2) (2005) 227–234.
[19] E. Chornoboy, L. Schramm, et al., Maximum likelihood identification of neural point process systems, Biological Cybernetics 59 (4) (1988) 265–275.
[20] A.Y.K. Chua, S. Banerjee, Customer knowledge management via social media: the case of Starbucks, Journal of Knowledge Management 17 (2) (2013) 237–249.
[21] Constant Contact, Report on consumer behavior highlights the need for small businesses to be active on Facebook, Constant Contact Inc., 2011
[22] R. Crane, D. Sornette, Robust dynamic classes revealed by measuring the response function of a social system, Proceedings of the National Academy of Sciences 105 (41) (2008) 15649–15653.
[23] D.J. Daley, D. Vere-Jones, Conditional intensities and likelihoods, An Introduction to the Theory of Point Processes, Springer, New York, 2003, pp. 211–287.
[24] A. Dassios, H. Zhao, Ruin by dynamic contagion claims, Insurance: Mathematics and Economics 51 (1) (2012) 93–106.
[25] L. de Vries, S. Gensler, et al., Popularity of brand posts on brand fan pages: an investigation of the effects of social media marketing, Journal of Interactive Marketing 26 (2) (2012) 83–91.
[26] R.F. Engle, A. Lunde, Trades and quotes: a bivariate point process, Journal of Financial Econometrics 1 (2) (2003) 159–188.
[27] L. Erik, M. George, et al., Self-exciting point process models of civilian deaths in Iraq,
[28] F. Benevenuto, T. Rodrigues, V. Almeida, J. Almeida, K. Ross, Video interactions in online video social networks, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 5 (4) (2009) 30.
[29] F. Figueiredo, Fabr, et al., The tube over time: characterizing popularity growth of youtube videos, Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, ACM, Hong Kong, China, 2011, pp. 745–754.
[30] I. Guy, M. Jacovi, et al., Same places, same things, same people?: mining user similarity on social media, Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, ACM, Savannah, Georgia, USA, 2010, pp. 41–50.
[31] D. Harte, PtProcess: an R package for modelling marked point processes indexed by time, Journal of Statistical Software 35 (8) (2010) 1–32.
[32] A.G. Hawkes, Spectra of some self-exciting and mutually exciting point processes, Biometrika 58 (1) (1971) 83–90.
[33] A.G. Hawkes, D. Oakes, A cluster process representation of a self-exciting process, Journal of Applied Probability 11 (3) (1974) 493–503.
[34] J. Howe, The rise of crowdsourcing, Wired Magazine 14 (6) (2006) 1–4.
[35] J. Howison, J.F. Olson, A. Kittur, K.M. Carley, Motivation through visibility in open contribution systems, 2011 (accessed May 19,
[36] B.A. Huberman, Crowdsourcing and attention, Computer 41 (11) (2008) 103–105.
[37] L.B. Jabeur, L. Tamine, et al., Uprising microblogs: a bayesian network retrieval model for tweet search, Proceedings of the 27th Annual ACM Symposium on Applied Computing, ACM, Trento, Italy, 2012, pp. 943–948.
[38] B.J. Jansen, M. Zhang, et al., Twitter power: tweets as electronic word of mouth, Journal of the American Society for Information Science and Technology 60 (11) (2009)
[39] L. Jong Gun, M. Sue, et al., An approach to model and predict the popularity of online contents with explanatory factors, Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on,
[40] S. Kong, L. Feng, et al., Predicting lifespans of popular tweets in microblog, Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, Portland, Oregon, USA, 2012, pp. 1129–1130.
[41] M. Lawrence, E.C. Michael, Hawkes process as a model of social interactions: a view on video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4) (2010) 045101.
[42] C. Lee, H. Kwak, et al., Finding influentials based on the temporal order of information adoption in twitter, Proceedings of the 19th International Conference on World Wide Web, ACM, Raleigh, North Carolina, USA, 2010, pp. 1137–1138.
[43] J.G. Lee, S. Moon, K. Salamatian, Modeling and predicting the popularity of online contents with Cox proportional hazard regression model, Neurocomputing 76 (1) (2012) 134–145.
[44] K. Lerman, T. Hogg, Using a model of social dynamics to predict popularity of news, Proceedings of the 19th International Conference on World Wide Web, ACM, Raleigh, North Carolina, USA, 2010, pp. 621–630.
[45] J. Leskovec, K.J. Lang, et al., Statistical properties of community structure in large social and information networks, Proceedings of the 17th International Conference on World Wide Web, ACM, Beijing, China, 2008, pp. 695–704.
[46] W.B. Lober, J.L. Flowers, Consumer empowerment in health care amid the internet and social media, Seminars in Oncology Nursing 27 (3) (2011) 169–182.
[47] G.O. Mohler, M.B. Short, et al., Self-exciting point process modeling of crime, Journal of the American Statistical Association 106 (493) (2011) 100–108.
[48] A. Noff, Learning from Starbucks one tweet at a time, available at: http://www. ucks-one-tweet-at-a-time / 2009 (accessed August 25, 2012).
[49] A. Noff, The Starbucks formula for social media success, URL: http://thenextweb.com/2010/01/11/starbucks-formula-social-media-success/ 2011.
[50] Y. Ogata, Statistical models for earthquake occurrences and residual analysis for point processes, Journal of the American Statistical Association 83 (401) (1988)
[51] Y. Ogata, D. Vere-Jones, Inference for earthquake models: a self-correcting model, Stochastic Processes and their Applications 17 (2) (1984) 337–347.
[52] T. Ozaki, Maximum likelihood estimation of Hawkes' self-exciting point processes, Annals of the Institute of Statistical Mathematics 31 (1) (1979) 145–155.
[53] R.D. Peng, Multi-dimensional point process models in R, 2002.
[54] H. Rui, A. Whinston, Designing a social-broadcasting-based business intelligence system, ACM Transactions on Management Information Systems 2 (4) (2012) 1–19.
[55] D. Rybski, S.V. Buldyrev, et al., Communication activity in a social network: relation between long-term correlations and inter-event clustering, Scientific Reports 2
[56] SAS Harvard Business Review Analytic Services, The New Conversation: Taking Social Media from Talk to Action, Harvard Business School Publishing, 2010.
[57] C.M. Sashi, Customer engagement, buyer–seller relationships, and social media, Management Decision 50 (2) (2012) 253–272.
[58] G. Szabo, B.A. Huberman, Predicting the popularity of online content, Communications of the ACM 53 (8) (2010) 80–88.
[59] A. Veen, F.P. Schoenberg, Estimation of space–time branching process models in seismology using an EM-type algorithm, Journal of the American Statistical Association 103 (482) (2008) 614–624.
[60] T. Wang, M. Bebbington, et al., Markov-modulated Hawkes process with stepwise decay, Annals of the Institute of Statistical Mathematics 64 (3) (2012) 521–544.
[61] W. Willinger, R. Rejaie, et al., Research on online social networks: time to face the real challenges, SIGMETRICS Performance Evaluation Review 37 (3) (2010) 49–54.
[62] E. Sadikov, A.G. Parameswaran, P. Venetis, et al., Blogs as Predictors of Movie
Success, International AAAI Conference on Weblogs and Social Media (ICWSM) (2009).
[63] M. Egesdal, C. Fathauer, K. Louie, J. Neuman, G. Mohler, E. Lewis, Statistical and
stochastic modeling of gang rivalries in Los Angeles, SIAM Undergraduate Research
Online 3 (2010) 72–94.
[64] L. Mitchell, M.E. Cates, Hawkes process as a model of social interactions: a view on
video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4) (2010)
[65] A. Jalilian, ETAS: Modeling earthquake data using Epidemic Type Aftershock
Sequence model, 2012.
[66] R.J. Hyndman, Y. Khandakar, Automatic time series for forecasting: the forecast
package for R, 2007.
Amir Hassan Zadeh is a PhD student in the Management
Science and Information Systems Department within the
Spears School of Business at Oklahoma State University. He
received his master's in Industrial and Systems Engineering
from Amirkabir University of Technology, and his bachelor's
from the Department of Mathematics and Computer Science,
Shahed University, Tehran, Iran. He has been published in
the Journal of Production Planning and Control, Annals of
Information Systems, Advances in Intelligent and Soft Computing,
African Journal of Business Management, and also conference
proceedings of DSI, INFORMS and IEEE. His current research
interests include big data and analytics, social networks and
recommender systems. His research also involves decision
support systems, data mining and knowledge discovery,
and system analysis and design. Other areas of interest include supply chain management,
product design, and healthcare.
Ramesh Sharda is the interim Vice Dean of the Watson
Graduate School of Management, Watson/ConocoPhillips
Chair and a Regents Professor of Management Science and
Information Systems in the Spears School of Business at
Oklahoma State University. He also serves as the Executive
Director of the PhD in Business for Executives Program. He
has coauthored two textbooks (Business Intelligence and
Analytics: Systems for Decision Support, 10th edition, Prentice
Hall and Business Intelligence: A Managerial Perspective on
Analytics, 3rd Edition, Prentice Hall). His research has been
published in major journals in management science and
information systems including Management Science, Operations
Research, Information Systems Research, Decision Support
Systems, Interfaces, INFORMS Journal on Computing, and many
others. He is a member of the editorial boards of journals such as Decision Support
Systems and Information Systems Frontiers. He is currently serving as the Executive Director
of Teradata University Network and received the 2013 INFORMS HG Computing Society
Lifetime Service Award.
A. Hassan Zadeh, R. Sharda / Decision Support Systems 65 (2014) 59–68