Modeling brand post popularity dynamics in online social networks
Amir Hassan Zadeh ⁎, Ramesh Sharda
Spears School of Business, Oklahoma State University, Stillwater, OK 74078, USA
Article info
Available online 13 May 2014
Keywords:
Online social networks
Social media marketing
Crowdsourcing
Brand post popularity
Brand-generated content
Hawkes point process
Abstract
Today's social media platforms are excellent vehicles for businesses to build and foster relationships with
customers. Companies create official fan pages on social network websites to provide customers with information
about their brands, products, promotions, and more. Customers can become fans of these pages, and like,
reply, share or mark the brand post as favorite. Marketing departments are using these activities to crowdsource
marketing and increase brand awareness and popularity. Understanding how crowdsourcing-oriented marketing
and promotion evolves would be helpful in managing such campaigns. In this paper, we adopt a multidimensional
point process methodology to study crowd engagement activities and interactions. Specifically, we investigate
the brand post popularity as a joint probability function of time and number of followers. One-dimensional and
two-dimensional Hawkes point process models are calibrated to simulate popularity growth patterns of brand
post contents on Twitter. Our results suggest that the two-dimensional point process model provides a good
model for understanding such crowdsourcing behavior.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
The emergence of Internet-based social media has started a new
kind of conversation among consumers and companies, challenging
traditional ideas about marketing and brand management while creat-
ing new opportunities for organizations to understand customers and
connect with them instantly [56]. Research firm Chadwick Martin Bailey
in partnership with Constant Contact conducted a study that analyzed
the behavior of 1491 consumers ages 18 and older throughout the
U.S., and revealed that 77% of consumers interact with
brands on Twitter or Facebook primarily through reading posts and
updates from the brands. They also noted that 60% of social customers
are more likely to recommend a brand to a friend after following the
brand on Twitter or Facebook, and 50% of them are more likely to buy
from that brand as well. When it comes to “Liking” brand posts on
Facebook, the reasons are varied, but for the most part, respondents
said they like a brand on Facebook because they are a customer (58%)
or because they want to receive discounts and promotions (57%) [21].
Today, the customer experience shared through social media, blogs
and discussion forums is becoming a major driver of purchasing
decisions, because these platforms provide consumers a more influential
voice in effecting changes in their own customer care [15]. Barnes'
research [9] indicates that 70% of consumers use social media platforms
“at least some of the time” to learn about the customer care offered by a
company before they make a purchase. Furthermore, 74% of these
consumers choose companies based on customer care experiences
shared by others in online forums.
Over the past few years, big brands have started taking social media
seriously, and social media marketing has become an inevitable part of
their marketing plan. For example, Coca-Cola, one of the world's most
recognizable brands, had 800 fans on Facebook in 2007, 16.5 million
in 2010, and it has currently crossed over 62.3 million “likes”. In 2012,
in honor of the Coca-Cola Facebook page becoming the first retailer
brand to receive 50 million “likes”, Coca-Cola developed a new
Facebook application to identify and support individuals who develop,
influence and shape ideas, and to ask them to collaborate with the
Facebook community to spread those ideas globally. Through this application,
Coca-Cola teaches the world to sing in perfect harmony, mobilizes
millions of people behind their favorite cause, and encourages them to
become more active and socially involved. As an end result, consumers
become involved in suggesting modifications of products and services
and the distribution of these innovations [11,12].
Starbucks, as one of the top ten most followed brands on Twitter,
uses tweets to share knowledge with customers and promote their lat-
est products, campaigns and events [20]. With an average of ten tweets
per day on Twitter, Starbucks extracts relevant knowledge from a net-
work of current and prospective customers around the globe who
express their expectations, likes and dislikes about the brand [20,48].
In 2010, Delta Airlines launched the first social media “ticket
window” on Facebook, which allows customers to book a flight without
having to go to any other website. Delta pointed out that Facebook is
used by more customers while in flight than any other website, making
it a “natural launching point” for its initiative [8]. Access to online social
networks (OSNs) on mobile devices has certainly accelerated their popularity.
Decision Support Systems 65 (2014) 59–68
journal homepage: www.elsevier.com/locate/dss
⁎ Corresponding author.
E-mail address: Amir.zadeh@okstate.edu (A. Hassan Zadeh).
http://dx.doi.org/10.1016/j.dss.2014.05.003
0167-9236/© 2014 Elsevier B.V. All rights reserved.
As more and more major brands have established their communities
and fan pages within online social networks (OSNs) and started offering
commerce opportunities delivered through social media platforms,
crowdsourcing applications have become some of the most engaging
tools in the digital marketing realm, enabling brands to realize the potential
for their fans' input into the product development and the market
development processes [36]. Such innovative and creative initiatives
enable businesses to improve their products, get brand recommenda-
tions, increase brand awareness and popularity, find new customers
or even excite a specific demographic. In many cases where fans within
social media are particularly passionate about a brand and its products,
there will be a clear desire to become part of the product itself, have
input as a group and energize the brand and its product lines [57].
Today's openness and flexibility of OSNs provide brands with a huge
opportunity to get in touch with customers, crowdsource marketing
tasks and enhance brand awareness. Understanding the structure and
behavior of the fans on OSNs is important to the content providers to
enable better organization of brand post information, design of effective
online communities and for implementing successful marketing
campaigns. However, the structure of online social interactions, the
formation of relationships, how information moves
on social media platforms, and how users respond to various stimuli
like video, contests, or posts are not clearly understood. The answers
to these questions will offer a more complete picture of the social
dynamics of networking and how individuals manage their virtual
relationships and follow their favorites or brand communities, or how
they influence their friends to become followers as well. In this paper,
we model the spread of information across Twitter, the most popular
and widely used micro-blogging online social network [37] and analyze
the data from a number of brand posts to discover what rules might
govern the spread of information online. By understanding these
behaviors, companies can become more effective in designing marketing
campaigns. Being able to analyze a social network of customers,
how customers interact on these platforms, and what the rhythm
and timing of the most engaging postings look like provides brands
a competitive advantage through forecasting the spread of brand
influence and intervening at times with promotions to foster relationships
with customers.
The timing pattern of human communication in online social
networks is not random. It has been shown that the communication is
explained by emergent statistical laws such as non-trivial correlations
and clustering [55]. With the possibility of analyzing the multivariate
distribution of the occurrences of activity on OSNs, we can add to our
understanding of these interactions.
Standard models assume a Poisson distribution for event occurrence,
which is an unrealistic assumption in many social systems.
Point processes have shown promise for modeling social event patterns
where the occurrence of an event increases the likelihood of subsequent
events [22]. They offer a novel way of modeling and clustering high-frequency
data that are irregularly spaced in time. A self-exciting point process uses a
branching structure that distinguishes background events from offspring
events and is able to capture bursts of activity, dynamics and reactions over time.
In this paper, we model the popularity of a brand post or, more
generally, of online content on online social networks. The popularity
of online content is not a well-defined term but a highly subjective one
[39]. Brand post popularity can be defined as a mixture of various factors
such as vividness, interactivity, the content of the brand post (information,
entertainment), and the number of times the brand post is mentioned
by fans [25]. We take the perspective of an individual user who
infers the popularity of a brand's tweet from publicly observable
data, namely the number of impressions it has received (including the
total number of retweets, replies, and favorites) or the lifespan of its threads
over its entire timeline. A tweet is considered popular if the number of
retweets, replies, and favorites it receives over its lifespan is
no less than a certain threshold [40,43]. Our goal is to
develop a mechanism for capturing the evolution of the online content
popularity posted by brands on OSNs. In our approach, a model is
specified via the conditional intensity for each event. This provides a
powerful and more natural modeling framework for multivariate social
network event data. Specifically, the current study examines the
influence of user activities on the timing and frequency of a brand
post. The self-exciting Hawkes point process and the ETAS (Epidemic
Type Aftershock Sequences) models are used to analyze data on brand
post popularity. Unlike Poisson processes, the self-exciting Hawkes point
process and ETAS are counting processes that form a continuous-time
non-Markov chain, because the transition rate depends on the
history of the process (i.e., $H_t$). The chain has states 0,
1, 2, ..., and moves from state $n$ to state $n + 1$, where $n \ge 0$. In the case of the
content popularity problem, each state indicates the total number of users
who have hit the content by time $t$, and $\lambda(t)$ is the transition rate of moving
from one state to the next.
The remainder of the paper is organized as follows. The next section
starts with a discussion of online social networks (OSNs). We also
review literature about stochastic point processes and their many
uses. The following section describes how we map the content popular-
ity to the point processes framework. Also, we introduce brand post
data collected from Twitter and the assumptions necessary to proceed
with analysis. In Section 4, we fit competing models to data and then
compare the accuracy and complexity of models in capturing the burst
of activity on OSNs. The managerial implications of our findings, limita-
tions and possible directions for future work are discussed in Section 5.
The final section presents a general conclusion of the paper.
2. Review of the literature
2.1. Online social networks
During the past few years, millions of people have used social media
applications (Facebook, Twitter, YouTube, Google+, etc.) as a part of
their daily online activities [30]. In 2011, more than half of social
media users followed brands on social media sites, and brands are
increasingly investing in social media to crowdsource marketing activi-
ties, indicated by worldwide marketing spending on social networking
sites of about $4.3 billion [25].
Today companies develop official fan pages and online communities
within online social networks to understand customers, connect with
them instantly and provide them with information about their brands,
products, promotions and more. Meanwhile, brand fans can like, com-
ment and share brand posts. Users of Twitter can retweet, which is
much like a Facebook share. Followers retweet the tweets of those
they are following to propagate information to other people. People
respond to popular users by “replying” and/or “mentioning” [7].
Followers can also mark the content as a favorite, which is functionally
similar to the “like” action on Facebook. The “like” and “retweet”
buttons are the easiest ways for Facebook and Twitter users, respectively,
to join in on the brand conversation and give feedback. Comments/
replies on brand posts can be positive, neutral or negative. In most
cases, social media users who choose to become fans of a product are
those who are particularly passionate about a brand and its products
and enjoy having input or being a member of a group of like-minded
fans. The brand benefits from these fans because they help communi-
cate with a diverse audience of other consumers.
Such individual activities associated with a brand post are visible to
network friends and often influence friends to retweet, like, or
mention. If a company produces fan page updates that earn high quality
scores, it will reap the benefits of greater exposure and possibly an
increased fan base because other network members will see the updates in their
news feeds. Jansen et al. [38] discuss OSNs as a form of electronic word
of mouth (eWOM) for sharing consumer opinions concerning brands
and as a part of an organization's marketing strategy. This openness
and flexibility of social media provides businesses a great opportunity
to bring together a group of people, or “crowd”, to solve a problem or
engage in an activity and achieve powerful social engagement and
activation.
In many ways, the interactivity of social media supports
“crowdsourcing”. Crowdsourcing is a term coined by journalist Jeff
Howe [34] to mean “taking advantage of the talent of the public” [46].
Social media provide platforms for existing and potential customers to
engage, learn, and be entertained. They enable content marketers to
crowdsource their marketing, reaching vast audiences via word of
mouth. For example, Starbucks developed the “My Starbucks Idea”
campaign, an online customer community, where customers are asked
to contribute their views and ideas about the company. It keeps
customers in the loop on what business ideas Starbucks is currently
implementing on both the brand and product level. Through linking
this platform to Facebook, Twitter and other social media websites,
customers are able to see what others are suggesting, vote on ideas
and check out the results [49].
Internet service providers, content creators, and online marketers
would like to be able to predict how many views and actions an individual
item might create on a given website [58]. This is also true for companies
that benefit from online social networks by utilizing
fan pages and web advertising. Leveraging social networking sites
to understand what is most popular helps e-commerce providers decide
what content to promote on their websites. E-commerce providers can
use these social signals to ensure that the products or services people
are talking about appear higher in their product listings.
Over the last few years, much effort has been devoted to exploring
the statistical features of content popularity in online social networks
(OSNs). Most previous empirical analyses of OSNs have treated such
networks as static [29,61]. They analyze the social networks on a single
data snapshot [3,28,45]. However, such social network systems are
inherently dynamic, characterized by a high burstiness and a strong
positive correlation between two users' activities and consist of a set
of dyadic, directed, time-stamped, cross-affected and sometimes
weighted events. To the best of our knowledge, only a few studies
have analyzed popularity growth patterns of content on OSNs using
prediction models [16,22,29,44,58]. Crane and Sornette [22] propose
contagion models as models of YouTube video viewing dynamics to
understand how popularity bursts can be described. They differentiate
four classes of popularity dynamics (memoryless, viral, quality and
junk) which are all explained by properties of Hawkes point process.
Szabo and Huberman [58] find a strong linear correlation between
early and later times of the content popularity on YouTube and Digg
networks. This correlation confirms that if the content is popular
when new, it will continue to be popular as it ages. Another interesting
work on social media mining is reported by Chatzopoulou et al. [17].
They find a strong correlation between total number of comments (or
favorites) and total view count in YouTube. There are relatively few
studies in the literature which explore the capability of online social
networks to predict real-world outcomes such as the revenue or release
time of a product on the market. Sadikov et al. [62], Abel et al. [1] and Rui
and Whinston [54] present case studies in which blogosphere content
can be used as a predictor of movie and music success. They show that
the number of microblog views of content related to the music or
movie (such as FB posts, tweets, YouTube videos, etc.) can provide an
accurate prediction of the movie's or music's success.
While previous studies build popularity models based on a one-
dimensional function of time, we suggest that the content popularity
can be a joint probability function of time and the number of followers.
We focus more on incorporating the number of followers as an influential
metric into predictive models of the content popularity, explicitly
looking at the impact of influential users on their followers to persuade
them to contribute to brand post popularity. In this paper, we adapt a
mathematical framework based on self-exciting point process to study
brand post popularity on online social networks. Specifically, we
calibrate one-dimensional and two-dimensional self-exciting point
process models to estimate popularity growth patterns of brand post
contents on Twitter.
2.2. Stochastic point processes
In this section we present the statistical theory underlying our
approach. First, we define the conditional intensity function for a point
process. A point process is a stochastic model commonly used
to describe the occurrence of discrete events in time and space. It can
be viewed in terms of a list of times $t_1, t_2, \ldots, t_n$ at which the corresponding
events $1, 2, \ldots, n$ occur [27]. Intuitively, a point process is characterized
by its conditional intensity $\lambda(t)$, which represents the mean spontaneous
rate at which events are expected to occur given the history of the
process up to time $t$ [50]. In particular, a version of the conditional intensity
may be given by

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{E\left[\, N[t, t+\Delta t] \mid H_t \,\right]}{\Delta t}$$

where $H_t$ denotes the history of events prior to time $t$, and the expectation
is taken over the number of events $N[t, t+\Delta t]$ occurring between
time $t$ and $t+\Delta t$. The Poisson process is a special case of a point process
where the inter-arrival times are independent, identically
distributed exponential random variables. The conditional intensity
of a Poisson process is deterministic, which means that it does not depend
on the history of the process. In other words, a point
process is classified as a Poisson process if events occurring at two
different times are statistically independent of one another, meaning
that an event at time $t_1$ neither increases nor decreases the probability
of an event occurring at any subsequent time [27]. Since a homogeneous
Poisson process indicates complete randomness, it is most commonly
used as a suitable benchmark for assessing self-exciting process models.
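The Poisson benchmark described above can be sketched directly from its definition. The following is a minimal illustration, not the authors' code: the function name `simulate_poisson`, the rate of 2.0 events per unit time and the horizon of 1000 time units are all assumptions chosen for the example.

```python
import random

def simulate_poisson(rate, horizon, seed=0):
    """Event times of a homogeneous Poisson process on [0, horizon].

    Inter-arrival times are drawn as i.i.d. exponential variables with
    mean 1/rate, which is the defining property quoted in the text.
    """
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential waiting time to the next event
        if t > horizon:
            return times
        times.append(t)

# Hypothetical parameters, chosen only for illustration.
events = simulate_poisson(rate=2.0, horizon=1000.0)
empirical_rate = len(events) / 1000.0  # should be close to the rate of 2.0
```

Because the rate never reacts to past events, a series generated this way shows no bursts, which is what makes it a useful baseline against the self-exciting models.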
A point process is called self-excited if any one event increases the
likelihood of the future events [32]. A self-exciting or Hawkes point
process is a versatile point process which has been extensively studied
from a theoretical and practical point of view. It is defined by its conditional
intensity function

$$\lambda(t) = \mu + \int_{-\infty}^{t} \beta\, \phi(t - u)\, dZ(u) = \mu + \beta \sum_{\{t_i < t\}} \phi(t - t_i), \qquad \int_{0}^{\infty} \phi(\upsilon)\, d\upsilon = 1, \quad \phi(\upsilon) \le 1, \quad \forall \upsilon \ge 0 \tag{1}$$

where $Z$ is the normal counting measure [33]. The rate of events $\lambda(t)$ is
decomposed into the sum of a Poisson background rate $\mu$, which in most
applications is assumed to be constant in time [33], and a self-exciting
component in which events trigger an increase in the rate of the
process. The self-exciting part of the process has two components: $\beta$
and $\phi$. $\beta$ is a constant which reflects the magnitude of self-excitation,
and $\phi$ is a density function describing the waiting time (lag) distribution
between excited and exciting events. A suitably skewed distribution
whose overall shape reflects a long time dependency should be
chosen for the triggering density.
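To make the decomposition in Eq. (1) concrete, the sum form can be evaluated for a given event history. This is a minimal sketch under stated assumptions: the helper names `hawkes_intensity` and `exp_density`, the choice of a normalized exponential density $\phi(\upsilon) = \omega e^{-\omega \upsilon}$, and all numeric values are illustrative and do not come from the paper.

```python
import math

def hawkes_intensity(t, history, mu, beta, phi):
    """Sum form of Eq. (1): mu + beta * sum of phi(t - t_i) over t_i < t."""
    return mu + beta * sum(phi(t - ti) for ti in history if ti < t)

def exp_density(omega):
    """Normalized exponential density phi(v) = omega * exp(-omega * v).

    It integrates to 1 on [0, inf) and stays below 1 whenever omega <= 1.
    """
    return lambda v: omega * math.exp(-omega * v)

phi = exp_density(0.5)        # assumed triggering density
history = [1.0, 2.0, 2.5]     # hypothetical event times

# Before the first event only the background rate contributes.
before = hawkes_intensity(0.5, history, mu=0.2, beta=0.8, phi=phi)
# Shortly after a cluster of events the rate is lifted above mu.
after = hawkes_intensity(2.6, history, mu=0.2, beta=0.8, phi=phi)
```

Since $\phi$ integrates to one, $\beta$ can be read as the expected number of events directly triggered by a single event, so values below one keep the process from growing without bound.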
In the Hawkes-based analysis, the events can be viewed as
the realization of a multivariate point process. That is, every single
event is characterized by its occurrence time and its type.
Notationally, $\{T_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ are random variables where $T_i$ is the occurrence
time of the $i$th event and $Z_i \in \{1, 2, \ldots, M\}$ indicates the $i$th event's
type [13]. A point process is said to be mutually exciting if any one event
of a specific type at time $t_1$ increases the likelihood of an
event in another type's stream occurring at time $t_2$. The mutually
exciting Hawkes process is used to capture cross interactions and
mutual information between one sequence of events and another.
Similar to the self-exciting Hawkes process, a mutually exciting Hawkes
process with $n$ event type(s) is defined by its conditional intensity
functions

$$\lambda_k(t) = \mu_k + \sum_{j=1}^{n} \sum_{\{t_i < t\}} \beta_{ij}\, \phi_{ij}(t - t_i), \quad k = 1, 2, \ldots, n, \qquad \int_{0}^{\infty} \phi_{ij}(\upsilon)\, d\upsilon = 1, \quad \phi_{ij}(\upsilon) \le 1, \quad \forall \upsilon \ge 0 \tag{2}$$

where the rate of event type $k$, $\lambda_k(t)$, is partitioned into the sum of a
Poisson background rate and mutually exciting components in which
events trigger an increase in the rate of the process. $\beta_{ij}$ is a constant
which reflects the strength of self-excitation for $i = j$ and the strength
of mutual excitation for $i \ne j$, and $\phi_{ij}$ is a density function describing
the triggering distribution between excited event type $i$ and exciting
event type $j$.
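The multivariate rate in Eq. (2) can be sketched in the same way. The example below assumes one particular reading of the indices: the first subscript of $\beta$ and $\phi$ is the excited type (the $k$ of $\lambda_k$) and the second is the type of the past event doing the exciting. The function name `mutual_intensity`, the exponential form of each $\phi_{kj}$, the two event types and all parameter values are hypothetical.

```python
import math

def mutual_intensity(k, t, events, mu, beta, omega):
    """Rate of event type k at time t for an n-type mutually exciting process.

    events      : list of (time, type) pairs with type in range(n)
    beta[k][j]  : strength with which past type-j events excite type k
    omega[k][j] : decay rate of the assumed exponential density phi_kj
    """
    rate = mu[k]
    for ti, j in events:
        if ti < t:
            w = omega[k][j]
            rate += beta[k][j] * w * math.exp(-w * (t - ti))
    return rate

# Two hypothetical event types: 0 = retweet, 1 = reply.
events = [(1.0, 0), (1.5, 1)]
mu = [0.1, 0.05]
beta = [[0.5, 0.2],   # retweets excited by retweets / by replies
        [0.3, 0.4]]   # replies excited by retweets / by replies
omega = [[1.0, 1.0], [1.0, 1.0]]

retweet_rate = mutual_intensity(0, 2.0, events, mu, beta, omega)
reply_rate = mutual_intensity(1, 2.0, events, mu, beta, omega)
```

The diagonal entries of `beta` play the role of self-excitation and the off-diagonal entries the role of cross-excitation between streams.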
Hawkes-based analysis has long been used in seismology to
recognize similar clustering patterns in earthquake occurrence data
and to predict subsequent earthquakes, or aftershocks [2,51,59,60].
It has been applied to many other areas such as finance [10,13],
neurophysiology [19], ecology, social networks [5,27,47] and online
social networks [22,41].
Engle and Lunde [26] and Bowsher [13] present a bivariate Hawkes
process model to jointly analyze the timing of trades and quote arrivals
in stock markets. Chavez-Demoulin et al. [18] and Bacry et al. [6] use
Hawkes process structure to estimate value at risk for portfolios of
traded assets over a given holding period of time. Dassios and Zhao
[24] present dynamic contagion process as a generalization of the Cox
process and Hawkes process and use it to model risk process with the
arrival of claims.
Mohler et al. [47], Egesdal et al. [63], and Erik et al. [27] use self-
exciting point process models to predict violent events and security
threats. Erik et al. [27] utilize step functions parameterized by various
values, linear functions and non-parametric approaches as non-
stationary background rates (μ) of the point process.
Alexey et al. [5] use a self-exciting point process to discover missing
data in the series of interaction events between agents in a social
network. They apply this model to the Los Angeles gang network to
predict affiliation of the unknown offenders.
Recently, this approach has been used to analyze the dynamics of on-
line social networks. Crane and Sornette [22] and Mitchell and Cates [64]
analyze a family of self-exciting point processes to model correlated
event timing of viewing YouTube videos. They deploy a Pareto distribu-
tion (power law) as a distribution of waiting times between cause and
action, describing the cascade of influences on the online social network.
It is shown that a Hawkes process enclosing power law distributions
offers many capabilities to calibrate the model to characteristics of the
YouTube views. These characteristics are classified by a combination of
endogenous/exogenous user interactions and the ability of viewers to
influence others to respond across the network (critical/subcritical).
Howison et al. [35] deploy a mutually excited Hawkes process to understand
the dynamics of user-generated content on open contribution
platforms such as Wikipedia and Linux. They study the influence
of the visible activity of others on the timing and amount of participation in
the Wikipedia environment. They model the time at which a response to an
event occurs as a log-normal distribution. However, this analysis has not yet
been conducted on social media activities, in particular on Twitter postings
and follow-up actions. Also, the role of influential users within
OSNs has not yet been considered in such predictive models.
In this paper, we provide a more realistic investigation of the
benefits of stochastic point processes for predicting the brand post
popularity on OSNs. To the best of our knowledge, there are relatively
few studies in the literature which explore the capability of point
processes on online social networks to model dynamics and growth
patterns. We use the ETAS model, one of the most widely used point
process models in the literature, to shed light on how the content popularity
on OSNs can be described by a function of time and the number of
followers. The number of followers is one of the best metrics to demonstrate
the role of the influential users within OSNs.
3. Problem formulation
Understanding rules governing collective human behavior, especial-
ly as they affect social interactions on internet-based social media, is a
difficult task in the field of social media analytics. Our main objective
is to analyze how the popularity of individual brand posts evolves
when the posts are shared with people on social media outlets. We
examine how fans' sequential interactions with network friends
contribute to the popularity of a brand post. The majority of brand
posts experience few hits and can be well described by a Poisson
process. In such a case of little activity, popularity oscillation is quite
steady. In contrast, some brand posts experience bursts of activity and
word of mouth growth through friend sharing features of OSNs. A
standard stochastic process (i.e., a Poisson process) fails to capture such
bursts of popularity, since it is based on the assumption of independent
arrivals, which is unrealistic for future activities arising
from a specific tweet, post, etc. Clustering point processes and epidemic
type models are a good fit for modeling such phenomena.
In online social network analysis, the social activity event data
can be viewed as the realization of a multivariate point process. Each
event is characterized by its occurrence time ($t_i$) and its magnitude of
influence, i.e., the number of followers ($m_i$), with an additional mark attached
to it representing the event's type ($z_i$). Retweeting, replying, tagging
and marking a brand post as a favorite, etc. are different types of user
activities. For the purpose of this paper, we combine these three types
of events into one common set of events.
The beauty of major OSN platforms is that they are structurally
isomorphic. Their similar features, while labeled with site-specific
vocabulary, operate in the same way, making studies of their data easier.
For the purposes of this paper, we will utilize Twitter notations to
explain properties of OSNs.
In order to build our two-dimensional point process, we define
$\{T_i, M_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ as random variables where $T_i$ is the occurrence
time, $M_i$ the magnitude of the $i$th triggering event and $Z_i \in \{1, 2, \ldots\}$
indicates the type of the $i$th event. Any event of a specific type at time $t_1$
increases the likelihood of an event of any type occurring at
time $t_2$. Now, we formulate the problem using Hawkes process properties
and discuss how those mechanisms work on the timeline.
3.1. Candidate models
First, we formulate one sequence of events using a self-exciting point
process to measure the likelihood that individuals are talking about the
brand regardless of the type of events. This model lets us aggregate the
content popularity from across Twitter into a single stream of information.
It concurrently captures the idea that any given activity on a brand
post can be attributed either to a background Poisson process $\mu$ (in this
case constant) or to a foreground self-exciting process, as follows:

$$\lambda(t) = \Lambda(t \mid H_t) = \mu + \sum_{\{i:\, t_i < t\}} \beta\, \phi(t - t_i) \tag{3}$$
The summation component indicates the influence of users' activity
on the stream. It describes how past events at times $t_i$ influence the
current event rate. Parameter $\beta$ indicates the amount of excitation an
event contributes to the stream. In behavioral terms, it can be described
as the number of potential users influenced directly by individuals
who retweeted or replied to the brand post tweet at time $t_i$.
As mentioned earlier, function $\phi$ is a triggering function describing the
distribution of waiting time between a trigger and the response from
users who are influenced to recommend the brand. Mining of our data on
the life cycles of various brand posts on Twitter indicates that, unlike on
YouTube, a brand's tweet gets most of its hits within the first days,
even hours, of its life cycle and quickly becomes obsolete. Since most
responses occur almost immediately in the Twitter case, we need a
distribution that enforces the highest intensity at the most immediate
possible time. Furthermore, it should be skewed and long tailed to
reflect a long time dependency and burstiness.
3.1.1. Model 1
First, we use an exponential distribution for the response density,
giving the conditional intensity

$$\lambda(t) = \mu + \sum_{\{i:\, t_i < t\}} \beta\, e^{-\alpha (t - t_i)} \tag{4}$$

where $t - t_i$ is the time elapsed since event $i$, and $\alpha$ is the rate
of decay of the triggering density, which controls how long self-excitation
lasts following a tweet. If $\alpha$ is large, the effect of users mentioning the brand
post will last only a short while, and only a few events (retweets or
replies) will be added above the background rate over a short period after the
initial brand's tweet. Conversely, if $\alpha$ is small,
self-excitation will last for a much longer period of time, and
many more events will be added to the background rate.
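The effect of $\alpha$ and $\beta$ in Eq. (4) can be explored by simulation. The sketch below uses the standard thinning technique for point processes, which is a substitute chosen for illustration and not necessarily the procedure used in this study; the function names `rate_after` and `simulate_model1` and all parameter values are assumptions. For this kernel each event triggers on average $\beta/\alpha$ direct follow-up events, so the long-run mean rate is $\mu / (1 - \beta/\alpha)$ when $\beta/\alpha < 1$.

```python
import math
import random

def rate_after(t, events, mu, beta, alpha):
    """Intensity of Eq. (4), counting every event at or before t."""
    return mu + sum(beta * math.exp(-alpha * (t - ti)) for ti in events if ti <= t)

def simulate_model1(mu, beta, alpha, horizon, seed=0):
    """Simulate event times on [0, horizon] by thinning."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # The exponential kernel only decays between events, so the
        # current rate is an upper bound until the next event arrives.
        bound = rate_after(t, events, mu, beta, alpha)
        t += rng.expovariate(bound)  # candidate event time
        if t > horizon:
            return events
        # Accept the candidate with probability lambda(t) / bound.
        if rng.random() * bound <= rate_after(t, events, mu, beta, alpha):
            events.append(t)

# Hypothetical parameters: beta / alpha = 0.4 follow-up events per event.
events = simulate_model1(mu=0.5, beta=0.8, alpha=2.0, horizon=1000.0)
empirical_rate = len(events) / 1000.0
stationary_rate = 0.5 / (1.0 - 0.8 / 2.0)  # mu / (1 - beta / alpha)
```

Raising `alpha` while keeping `beta` fixed shortens each burst and lowers the long-run rate toward `mu`, which matches the qualitative description above.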
3.1.2. Model 2
There is another characteristic of events in OSNs that should
be taken into consideration. We suggest that the amount of users'
contributions to future events is not only dependent on the occurrence
time, but that the number of followers he/she has is an important factor
as well. Therefore, our second model takes into consideration two
parameters: the occurrence time and the magnitude of triggering
event (number of friends and followers). It means that the event does
not scale just with the occurrence time, but also the magnitude of the
triggering event as well.
One particular form of a self-exciting point process is the ETAS
model (space–time–magnitude Hawkes process), which is widely
used to describe spatial–temporal patterns. This model takes more
parameters (inputs) into account. We use an early form of this model
(i.e. time–magnitude Hawkes process), similar to [50], to quantify the
popularity of a brand tweet. This model incorporates magnitudes and
occurrence time of triggering events concurrently. The conditional
intensity for the ETAS model is given by
$$\lambda(t) = \lambda(t \mid H_t) = \mu + \sum_{\{i:\, t_i < t\}} \phi(t - t_i, m_i) \qquad (5)$$
where the history of the process $H_t = \{(t_i, m_i) : t_i < t\}$ also includes
the magnitudes $m_i$, μ is the arrival rate of new users and ϕ is a triggering
function. The ETAS model uses a combination of the exponential distribution
and the Pareto distribution for the triggering density ϕ, giving the
conditional intensity
$$\lambda(t) = \mu + \sum_{\{i:\, t_i < t\}} \phi(t - t_i, m_i) = \mu + \sum_{\{i:\, t_i < t\}} \frac{\beta}{(t - t_i + c)^{1+p}}\, e^{\alpha (m_i - M_0)} \qquad (6)$$
where the power law term governs the temporal distribution of subsequent
triggered events and the exponential term gives the factor by which
the user's magnitude $m_i$ inflates the expected number of influencers. The
term $t - t_i$ denotes the time elapsed since event $i$. β is the amplitude
coefficient indicating the amount of direct excitation triggered by
event $i$. The exponent p is the decay rate, α is interpreted as the
productivity rate that controls the number of potential users influenced by
individuals in the past, and c is the time offset that will be empirically
determined from the dataset under consideration. Furthermore, $M_0$ is
the lowest magnitude (number of followers) that will be substituted
from the dataset (rescaled to the appropriate range).
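To make the roles of the power law and exponential terms concrete, here is a minimal Python sketch of the conditional intensity in Eq. (6). It is an illustration only and not the ETAS R package used in the paper; the function name `etas_intensity` and all numerical values, including the rescaled magnitudes, are hypothetical.

```python
import math

def etas_intensity(t, events, mu, beta, alpha, p, c, m0):
    """Eq. (6) for a history given as a list of (t_i, m_i) pairs."""
    total = mu
    for t_i, m_i in events:
        if t_i < t:  # only strictly earlier events contribute
            time_decay = beta / (t - t_i + c) ** (1.0 + p)   # power law in elapsed time
            productivity = math.exp(alpha * (m_i - m0))      # boost from the magnitude
            total += time_decay * productivity
    return total

# Hypothetical parameter values, chosen only for illustration.
params = dict(mu=0.1, beta=0.05, alpha=2.0, p=1.2, c=0.01, m0=0.3)

# Two histories that differ only in the (rescaled) magnitude of the single event.
low_reach = etas_intensity(1.0, [(0.5, 0.3)], **params)
high_reach = etas_intensity(1.0, [(0.5, 1.0)], **params)
```

Holding the timing fixed, the event with the larger magnitude raises the intensity more, which is the behavior the exponential term in Eq. (6) is meant to encode.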
3.2. Empirical testing of the models on Twitter datasets
We next apply these models to real data. As mentioned earlier, a
basic analysis of our data on the life cycle of various brand posts on
Twitter, using the Topsy API and the Twitter search API, indicates that a
brand's tweet typically receives most of its activity within the first days,
or even hours, of its life cycle and hence quickly becomes obsolete. Since we
focus on brand post popularity, we select brand posts that experience
bursts of activity and electronic word-of-mouth growth through the
friend sharing features of Twitter. Using Twitter's publicly available
API, we crawled Twitter information streams of more than 120 major
brands that were among the top 500 most valuable global brands [14].
These brands were among the most followed brands and were actively
posting tweets at their fan pages on Twitter. These brands are from
different product and service categories including clothing, cosmetics,
electronics, accessories, foods, beverages, automotive, credit cards,
airlines, etc. Together, these brands published more than 26,500 tweets
in a typical period of one week to provide information to their
customers and promote their latest products, campaigns and events.
We downloaded information on all subsequent activities (retweets,
replies, and marks as favorite) for all of these 26,500
brand post tweets. We observed that the majority of brand post tweets
experience few hits and therefore, as mentioned earlier, can be modeled
by a Poisson process. However, there are brand posts that became a
major topic (“trending” in Twitter parlance), are frequently mentioned
by the brand's followers, and experience bursts of activity. For the
purpose of this paper, we searched through the downloaded tweets to
isolate those that are original tweets from the brands and
that have been mentioned (retweeted, replied to, or marked
as favorite) at least 300 times. A total of 221 such brand post tweets,
followed by many hits and bursts of activity, were identified. At this
stage, 125,861 Twitter activities, including information on the original
tweets and all subsequent retweets, replies and marks as favorite,
were processed. The data were divided into individual
datasets. Each dataset contains an individual brand
post tweet and its subsequent activities (retweets, replies, and marks as
favorite), along with their timestamps, user ids and the number of followers
of each user who contributes to the tweet stream. We take into consideration
only the timestamps of events and the number of followers, while
aggregating the events “retweet”, “reply”, and “mark as favorite” into a
single stream of information.
We investigated the content of these 221 most popular brand tweets
and note that the primary topic was the brand campaigns on Twitter
(44%). Some of these campaigns use Twitter to communicate with
fans and followers. Several campaigns use Twitter hashtags to deliver
rewards and sweepstakes to customers. Other campaigns have interactive
competitions to create buzz with fans. The second most engaging
brand tweet category is related to the events held by the brands on
Twitter (36%), including surveys etc. The rest of the most popular
brand tweets were related to the information and entertainment posted
by brands on Twitter.
3.3. Parameter estimation, goodness-of-fit, and model comparison
Given brand post data collected from Twitter, we utilize maximum
likelihood estimation (MLE) methods to estimate the parameters
of candidate self-exciting point process models. While numerical
optimization routines such as the quasi-Newton method, the conjugate
gradient method, the simplex algorithm of Nelder and Mead and the
simulated annealing procedure [23,50,52] are often used to compute
maximum likelihood estimates for self-exciting point process
models, we use the expectation-maximization (EM) algorithm provided
by Veen and Schoenberg [59] to estimate parameters. Veen and Schoenberg
[59] have demonstrated that the EM algorithm, the estimation
method of choice for incomplete data problems, is extremely robust
and accurate compared to traditional methods. The brand post
popularity can be viewed as an incomplete data problem in which the
unobservable, or latent, variables indicate whether an activity is
a background event or a foreground event that was
triggered by a preceding activity.
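The latent-variable view can be illustrated with the expectation step of an EM scheme. The sketch below is in Python and is not Veen and Schoenberg's R code; it assumes the exponential triggering density of Model 1, event times sorted in increasing order, and hypothetical parameter values. For each event it returns the probability that the event is a background event and the probabilities that it was triggered by each earlier event.

```python
import math

def e_step(event_times, mu, alpha, beta):
    """For event j, return [P(background), P(triggered by event 0), ..., P(triggered by event j-1)].

    Assumes event_times is sorted in increasing order and uses the
    exponential triggering density beta * exp(-alpha * (t_j - t_i)).
    """
    responsibilities = []
    for j, t_j in enumerate(event_times):
        weights = [mu]  # weight of the background component
        weights.extend(
            beta * math.exp(-alpha * (t_j - t_i)) for t_i in event_times[:j]
        )
        total = sum(weights)  # equals the conditional intensity at t_j
        responsibilities.append([w / total for w in weights])
    return responsibilities

# Hypothetical sorted event times (minutes) and parameter values.
times = [0.0, 0.2, 0.3, 5.0]
probs = e_step(times, mu=0.05, alpha=2.0, beta=1.0)

# Expected number of background events under the current parameters.
expected_background = sum(row[0] for row in probs)
```

In a full EM iteration these probabilities would then be used to update μ, α and β in the maximization step, which is omitted here.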
Finally, the reliability of each model is statistically tested using the
Kolmogorov–Smirnov (K–S) statistic to assess the extent to which the
model fits the data. This criterion provides useful information on the
absolute goodness-of-fit of candidate models. Furthermore, the relative
ability of each model to describe the data is measured by computing the
Akaike information criterion (AIC) [4]. The Akaike statistic provides
germane numerical comparisons of the global fit of competing models.
The required package functions in the R software are used for fitting both
of the above models to the datasets (ptproc package [53], PtProcess [31],
ETAS package [65], and R code [59]).
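As an illustration of how AIC values of this kind can be obtained, the following Python sketch computes the log-likelihood of the exponential-kernel model of Eq. (4) on an observation window [0, T], the log-likelihood of a homogeneous Poisson process at its maximum likelihood rate, and AIC = 2k − 2 log L. This is a sketch under those assumptions rather than the R routines cited above; the function names and the sample values are hypothetical.

```python
import math

def hawkes_exp_loglik(event_times, horizon, mu, alpha, beta):
    """Log-likelihood on [0, horizon] for the intensity of Eq. (4); event_times must be sorted."""
    loglik = 0.0
    excitation = 0.0  # running sum of beta * exp(-alpha * (t_j - t_i)) over i < j
    previous = None
    for t_j in event_times:
        if previous is not None:
            # Recursive update: decay the old excitation plus the jump from the previous event.
            excitation = (excitation + beta) * math.exp(-alpha * (t_j - previous))
        loglik += math.log(mu + excitation)
        previous = t_j
    # Integral of the intensity over the window (the compensator).
    compensator = mu * horizon + sum(
        (beta / alpha) * (1.0 - math.exp(-alpha * (horizon - t_i)))
        for t_i in event_times
    )
    return loglik - compensator

def poisson_loglik(n_events, horizon):
    """Log-likelihood of a homogeneous Poisson process at its MLE rate n_events / horizon."""
    rate = n_events / horizon
    return n_events * math.log(rate) - rate * horizon

def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik

# Hypothetical sorted event times (minutes) on a window of 10 minutes.
times = [0.5, 0.6, 0.7, 4.0, 4.1, 9.0]
aic_poisson = aic(poisson_loglik(len(times), 10.0), n_params=1)
aic_hawkes = aic(hawkes_exp_loglik(times, 10.0, mu=0.3, alpha=5.0, beta=3.0), n_params=3)
```

The Hawkes parameters here are fixed by hand rather than estimated, so the two AIC values only demonstrate the calculation; in practice the log-likelihood would be evaluated at the fitted parameters.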
Furthermore, we employ autoregressive integrated moving average
(ARIMA) models as benchmarks, which have been regarded as the
closest framework to point processes for event data [23]. We used the
R package “forecast” (Hyndman et al. [66]) to perform the time series
analysis. This package allows fitting of time series and linear models.
The functions available in this package conduct a search over possible
models within the order constraints provided and return the best
ARIMA model for a univariate time series according to AIC values. In
the next section, we will first present our results for one of our crawled
datasets to illustrate how our approach works and then we discuss
goodness-of-fit of the candidate models by computing their average
AIC values across all the datasets that we compiled from Twitter.
In summary, Fig. 1 illustrates the methodology used for modeling
content popularity on Twitter in this paper. At each stage, the inputs,
the required R packages used to produce the results and the outputs are
specified clearly.
Fig. 1. The methodology used in predicting the online content popularity on Twitter.
[Flowchart. Twitter database; input: individual tweet dataset (including user
ids, timestamps, number of followers) in CSV or XML format. Point process
branch: parameter estimation (Veen and Schoenberg's R code, ptproc and
PtProcess R packages), then simulation (ptproc, PtProcess and ETAS R
packages); output: estimated parameters, log-likelihood function value,
simulated conditional intensity function, K–S and AIC values, etc. Benchmark
branch: parameter estimation and simulation for the ARIMA (p,q,r) model
(forecast package); output: estimated parameters for the best fitted ARIMA
model, log-likelihood function value, simulated ARIMA model, AIC value, etc.
Both branches feed the models comparison by AIC values, followed by End.]
Fig. 2. Frequency of different types of events.
[Chart of the frequency of activities by type (Retweet, Reply, Mark as
Favorite); data labels as extracted: 412, 127, 212.]
Fig. 3. A histogram of the number of events per minute.
Fig. 4. Simulated conditional intensity function for model #1.
4. Results and analysis
In this section, we focus on one particular dataset to demonstrate
how the models work in practice. We set Δt = 1 min for the bin width in
order to control the amount of data through parameter t. This specific
dataset contains 751 events spanning 10,080 min (one week).
Figs. 2 and 3 show the frequency of the different types of hits and a histogram
of the frequency of all events per minute, respectively. The largest number of
events occurring in a single minute is 15 and the mean number of events in a
single minute is 0.074. Of the 751 events, 278 occurred
during the first two days. Thus, we reason that people respond to a
brand post tweet immediately. Therefore, we expect that the
distributions to be selected should place the largest probability mass
at the most immediate possible response time.
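The summary statistics above come from counting events in bins of width Δt. A minimal Python sketch of that binning step is given below; the function name `bin_counts` and the sample timestamps are hypothetical, and timestamps are assumed to be non-negative minutes measured from the brand's original tweet. As a check on the reported mean, 751 events over 10,080 one-minute bins gives 751/10,080 ≈ 0.0745 events per minute, in line with the 0.074 quoted above.

```python
import math

def bin_counts(event_times, horizon, bin_width=1.0):
    """Count events per bin of width bin_width on [0, horizon]."""
    n_bins = int(math.ceil(horizon / bin_width))
    counts = [0] * n_bins
    for t in event_times:
        # An event exactly at the horizon is placed in the last bin.
        index = min(int(t // bin_width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical event times in minutes since the original tweet.
sample_times = [0.2, 0.7, 3.1]
counts = bin_counts(sample_times, horizon=5.0)

max_per_minute = max(counts)
mean_per_minute = sum(counts) / len(counts)
```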
Table 1 summarizes the parameter estimates for the first candidate
model.
The fit for the data with self-exciting point process model is plotted
in Fig. 4.
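For readers who want to reproduce a simulation of this kind outside R, the following Python sketch generates event times from the intensity of Eq. (4) using Ogata's thinning method. It is not the ptproc or PtProcess implementation used for Fig. 4. The function name `simulate_hawkes_exp` is hypothetical, and plugging in the Table 1 estimates assumes that they are expressed per minute exactly as in Eq. (4), which the paper does not state explicitly.

```python
import math
import random

def simulate_hawkes_exp(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) for the intensity of Eq. (4) by thinning.

    Requires beta / alpha < 1 so that the process does not explode.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the sum of beta * exp(-alpha * (t - t_i))
    while True:
        # Between events the intensity only decays, so its current value is an upper bound.
        upper_bound = mu + excitation
        wait = rng.expovariate(upper_bound)
        t += wait
        if t >= horizon:
            break
        excitation *= math.exp(-alpha * wait)  # decay the excitation to the candidate time
        # Accept the candidate with probability intensity(t) / upper_bound.
        if rng.random() * upper_bound <= mu + excitation:
            events.append(t)
            excitation += beta  # each accepted event adds a jump of size beta
    return events

# Table 1 estimates, under the per-minute assumption stated above; one week = 10,080 min.
simulated = simulate_hawkes_exp(mu=0.05673, alpha=12.14027, beta=2.91944, horizon=10080.0, seed=1)
```

Binning the simulated times per minute and comparing the counts with the observed histogram gives a quick visual check of the fit.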
The parameter estimate for β indicates that immediately after an
event occurs, the conditional intensity is amplified by about 3 events
per minute. The parameter estimate for α indicates that an event related to
the brand post tweet is talked about for up to 12 min after posting.
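A complementary way to read the Table 1 estimates is through the expected number of events that a single event triggers directly, often called the branching ratio. The short calculation below assumes that the estimates are expressed per minute exactly as in Eq. (4); the R packages may use a different parameterization, so this is a consistency check rather than a result of the paper.

$$n = \int_0^{\infty} \beta e^{-\alpha s}\, ds = \frac{\beta}{\alpha} = \frac{2.91944}{12.14027} \approx 0.24$$

$$\frac{\mu T}{1 - n} \approx \frac{0.05673 \times 10080}{1 - 0.2405} \approx \frac{571.84}{0.7595} \approx 753$$

Under this reading, each event triggers about 0.24 further events on average, and the implied expected number of events over the one-week window is roughly 753, close to the 751 events observed in this dataset.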
Now let us look at the ETAS model, which takes into account the occurrence
time and the number of followers for every single triggering
event. Fig. 5 provides a snapshot of the number of followers for those
users who appear to have been influenced by the brand post tweet
either spontaneously or in response to certain triggers.
Table 2 summarizes the parameter estimates for the ETAS model.
Simulated data with the corresponding ETAS point process model are
shown in Fig. 6.
Our hypothesis is that the greater the number of followers per event,
the greater the influence. Therefore incorporating the number of
followers into our predictive model as another dimension presumably
provides better results. Fig. 6 indicates that, when compared against our
dataset, the ETAS model is much better able to capture the jumps and leaps
of the process.
Utilizing statistical tests such as the K–S goodness-of-fit test and the AIC
test allows us to test whether the number of followers impacts the
model. Table 3 summarizes the results of a two-sample K–S test
demonstrating how well both models perform relative to the original
data. It contains the p-values and the values of the K–S test statistic
(D) corresponding to each model.
These results support our hypothesis that incorporating the number
of followers into the predictive models provides a better simulation for
understanding such phenomena.
Since the ETAS model has more parameters in comparison to the
self-exciting Hawkes process, AIC values are used to analyze parsimony,
complexity and accuracy of the models. The homogeneous Poisson
model is also often used as a reference model for comparison of com-
peting point process models. Table 4 summarizes the AIC values for
candidate models.
The AIC values show that the ETAS model is the one with the
minimum AIC value. Therefore, the ETAS model provides a better fit
than a homogeneous Poisson model or self-exciting Hawkes process
or the benchmark ARIMA time series model.
We next estimate the self-exciting Hawkes process model, ETAS
model, the benchmark Poisson process model and the benchmark
ARIMA model and compare their goodness-of-fit by computing their
average AIC values across all datasets.
According to Table 5, the ETAS model has the lowest average AIC
value. The proposed ETAS model outperforms the three benchmarks,
which indicates that it can capture the influence network better than
other models. The benchmark homogeneous Poisson process and the
benchmark ARIMA time series model fare much worse than
the ETAS and the self-exciting Hawkes process. The Poisson process
model fails to capture any exciting effects among user activities to
make the prediction. Also, the ARIMA time series model appears to fail
to capture the dependency between the current event and the past
events on the time line. Recall that, in the online content popularity
context where the occurrence of an event increases the likelihood of
subsequent events, whether slightly or greatly, it is imperative to
account for exciting effects among users' activities.
Our result implies that the impact of the number of followers on
brand post popularity is an important issue in OSNs. It is necessary to
consider the event occurrence time and the number of followers as
two major factors in modeling of online social dynamics.
We found that the ETAS model predicts the popularity of brand posts much
more accurately. It allows us to consider the role of
influential users in amplifying brand post popularity and, secondarily,
in proposing the brand to their networks of friends and followers. It
implies that influential users with a high number of followers can
have a significant influence in spreading the content of the brand post
to others.
5. Discussion and limitations
We have adapted a powerful approach for modeling the content
popularity in OSNs. In contrast to the previous studies that focused on
a one-dimensional function of time, the model recommended in this
paper allows us to characterize and quantify the content popularity as
a joint probability function of time and the number of followers. The
self-exciting Hawkes process and ETAS models have been calibrated to
simulate popularity growth patterns of brand post contents on Twitter
and, as expected, the ETAS model outperforms the other models in
capturing bursts of activity over time.

Table 1
Specification of the self-exciting Hawkes process model (1) used for simulation.
Parameter   μ         α          β
Value       0.05673   12.14027   2.91944

Fig. 5. Number of followers over time.
[Plot of the number of followers (vertical axis, 0 to 700) against time (horizontal axis, 0 to 10,000).]

Table 2
Specification of the self-exciting point process model (2) used for simulation.
Parameter   μ        β            α           p          c            M_0
Value       0.5886   0.01376837   2.1254544   1.157623   0.01343711   0.3
This model can enable brand marketing managers to observe how
often their fans respond to their posts within OSNs, and gauge the
response for different types of content such as news, contests, applica-
tions, video, pictures, product information, brand's history, testimonials,
etc. They will also have the ability to see how these brand posts move
through the Internet. These predictive models can help companies
decide how often and when a new brand post should be posted, and
how many times the same piece of content can be shared in order to
engage more fans and followers. Certainly there is no magic number
for the ideal number of posts within OSNs; it is important for brands
to post enough content while refraining from posting too much at the
same time. The mathematical configuration of the ETAS model also
confirms that if the time difference between two consecutive events is
big enough, the brand post will most likely become obsolete, which
suggests that it is time to post new content to keep a connection
open with fans.
As another managerial implication of this study, the mathematical
formulation of the ETAS model reveals that the greater the number of
followers per event, the greater the influence. This means that users with a
high number of followers generate more activity and are
retweeted more often. It highlights the role of influential users who
significantly affect the engagement of a brand post, even if they
become involved later. Thus, if companies identify and increase the number of
influential users within their online social networks, they should
experience an increase in brand recommendations and awareness. Engaging
more influential users during the early life of the brand post
could cause viral effects, which are likely to influence potential
consumers for a longer period. Many approaches have been proposed
to find influential users within OSNs. The simplest approach is to
count the number of followers, but there are other efficient techniques
based on mining link structure along with the temporal order of infor-
mation adoption [42].
Also, since fans' reactions and response time to different types of
the brand post content are dissimilar, it is important for brands to look
carefully at the performance of their various brand post contents
and see which of them during their lifecycle have similar looking
stationary/non-stationary background rates. If they do not follow the
same growth pattern, each category needs an individual point process
to represent it.
Our work proposes a mechanism for capturing the evolution of the
online content popularity posted by brands on Twitter. It facilitates
the early prediction of a tweet's behavior on Twitter and the simulation
of the rhythm and timing of the most engaging postings. Through the
simulation and the early prediction of a brand's tweet, brands have a
better view of how to time promotions to foster relationships with customers.
Our research can be extended to determine a peak release time for
products of consumer interest on the market through analyzing aggre-
gative/collective brand posts from OSNs. If brand posts are not propa-
gating further on OSNs, it could indicate that the brand is losing its
fans' awareness and popularity, so improvement actions should be
taken.
Several limitations of our study deserve mention. First, we assume
that all users follow the same response time distribution for their own
activities. However, an individual activity burst is a sequence of
discrete events, and a single distribution is unlikely to describe all users
when fitting exponential or Pareto distributions to the long-term
dependency. Another limitation is that the various types of events are aggregated.
Multivariate self-exciting and mutually exciting point process models should be
developed to deal with different streams of information and measure
cross interactions and mutual information between one sequence of
events and another.
Furthermore, even though we chose a small time increment, i.e. Δt =
1 min for the bin width, in order to control the amount of data through
parameter t, we cannot determine whether events occurring in the same minute
are correlated with one another. This means that the events recorded in
the same minute are assumed to be statistically independent.
While we assign the same importance to all fans' response times, one
could track a brand's most engaging minutes, hours and days of the
week to determine the effective time windows that should be taken
into account in order to provide a better prediction.
In summary, our analysis indicates that a stationary Poisson process
for the background rate of spontaneous events is a rather unlikely
assumption in many social systems. The ETAS model and self-exciting
point process can be considered a more reliable underlying process.
Fig. 6. Simulated conditional intensity function for model #2.

Table 3
The K–S goodness-of-fit test output.
Model                                      D        p-value
Self-exciting Hawkes process (Model #1)    0.3993   0.03135
ETAS model (Model #2)                      0.1223   0.02216

Table 4
AIC test results.
Model                                      AIC
Time series model (ARIMA (3, 1, 3))        9787.670
Homogeneous Poisson model                  5401.592
Self-exciting Hawkes process (Model #1)    4473.017
ETAS model (Model #2)                      4012.011

Table 5
Models' comparative average AIC values.
Model                                      Average AIC
Time series model (ARIMA (p,q,r))          13,398.661
Homogeneous Poisson model                  9047.397
Self-exciting Hawkes process (Model #1)    7143.110
ETAS model (Model #2)                      6415.187

6. Conclusion and directions for future research
This paper adopts a stochastic point process framework for analysis
of the dynamic microstructure of online social networks (OSNs). In
particular, we investigate the possibility of using crowdsourcing on OSNs
as a marketing mechanism to enhance brand awareness and popularity.
Such crowdsourcing activities help brands spur innovation and drive
brand awareness across OSN platforms. We describe such dynamics
in terms of the stochastic occurrence times and number of followers.
One-dimensional and two-dimensional self-exciting point process
models are calibrated to simulate popularity growth patterns of brand
post contents on Twitter. Our findings indicate that point process models are
able to describe the cascade of influencers on online social networks.
Our results suggest that incorporating the number of followers into
predictive models as another dimension of input provides a better
understanding of the content popularity. Our future work focuses on
applying a full package of multivariate point processes to different
streams of events within OSNs.
References
[1] F. Abel, E. Diaz-Aviles, et al., Analyzing the blogosphere for predicting the success of
music and movie products, International Conference on Advances in Social Net-
works Analysis and Mining (ASONAM), IEEE, 2010, pp. 276–280.
[2] L. Adamopoulos, Cluster models for earthquakes: regional comparisons, Mathemat-
ical Geology 8 (4) (1976) 463–475.
[3] Y.-Y. Ahn, S. Han, et al., Analysis of topological characteristics of huge online social
networking services, Proceedings of the 16th international conference on World
Wide Web, ACM, Banff, Alberta, Canada, 2007, pp. 835–844.
[4] H. Akaike, Information theory and an extension of the maximum likelihood princi-
ple, 2nd Inter. Symp. on Information Theory, 1, 1992, pp. 610–624.
[5] S. Alexey, B.S. Martin, et al., Reconstruction of missing data in social networks based
on temporal patterns of interactions, Inverse Problems 27 (11) (2011) 115013.
[6] E. Bacry, S. Delattre, et al., Modelling microstructure noise with mutually exciting
point processes, Quantitative Finance (2012) 1–13.
[7] Y. Bae, H. Lee, A sentiment analysis of audiences on twitter: who is the positive or
negative audience of popular twitterers? Proceedings of the 5th international
conference on Convergence and hybrid information technology, Springer-Verlag,
Daejeon, Korea, 2011, pp. 732–739.
[8] C.H. Baird, G. Parasnis, From social media to social customer relationship manage-
ment, Strategy & Leadership 39 (5) (2011) 30–37.
[9] N.G. Barnes, Exploring the link between customer care and brand reputation in the
age of social media, in: S. f. NC Research (Ed.), Society for New Communication
Research, 2008.
[10] L. Bauwens, N. Hautsch, Modelling financial high frequency data using point
processes, in: T. Mikosch, J.-P. Kreiß, R.A. Davis, T.G. Andersen (Eds.), Handbook of
Financial Time Series, Springer, Berlin Heidelberg, 2009, pp. 953–979.
[11] P.R. Berthon, When customers get clever: managerial approaches to dealing with
creative consumers, Strategic Direction 23 (8) (2007).
[12] P.R. Berthon, L.F. Pitt, et al., Marketing meets Web 2.0, social media, and creative
consumers: implications for international marketing strategy, Business Horizons
55 (3) (2012) 261–271.
[13] C.G. Bowsher, Modelling security market events in continuous time: intensity based,
multivariate point process models, Journal of Econometrics 141 (2) (2007)
876–912.
[14] Brand Directory, “BrandFinance Banking 500 2013” [online], [Accessed 07/01/2013]
Available from http://www.brandirectory.com 2013.
[15] L. Capozzi, L.B. Zipfel, The conversation age: the opportunity for public relations,
Corporate Communications: An International Journal 17 (3) (2012) 336–349.
[16] M. Cha, H. Kwak, et al., Analyzing the video popularity characteristics of large-scale
user generated content systems, IEEE/ACM Transactions on Networking 17 (5)
(2009) 1357–1370.
[17] G. Chatzopoulou, S. Cheng, et al., A first step towards understanding popularity in
youtube, INFOCOM IEEE Conference on Computer Communications Workshops,
2010.
[18] V. Chavez-Demoulin, A.C. Davison, et al., Estimating value-at-risk: a point process
approach, Quantitative Finance 5 (2) (2005) 227–234.
[19] E. Chornoboy, L. Schramm, et al., Maximum likelihood identification of neural point
process systems, Biological Cybernetics 59 (4) (1988) 265–275.
[20] A.Y.K. Chua, S. Banerjee, Customer knowledge management via social media: the
case of Starbucks, Journal of Knowledge Management 17 (2) (2013) 237–249.
[21] Constant Contact, Report on consumer behavior highlights the need for small
businesses to be active on Facebook, Constant Contact Inc., 2011.
[22] R. Crane, D. Sornette, Robust dynamic classes revealed by measuring the response
function of a social system, Proceedings of the National Academy of Sciences 105
(41) (2008) 15649–15653.
[23] D.J. Daley, D. Vere-Jones, Conditional intensities and likelihoods, An Introduction to
the Theory of Point Processes, Springer, New York, 2003, pp. 211–287.
[24] A. Dassios, H. Zhao, Ruin by dynamic contagion claims, Insurance: Mathematics and
Economics 51 (1) (2012) 93–106.
[25] L. de Vries, S. Gensler, et al., Popularity of brand posts on brand fan pages: an
investigation of the effects of social media marketing, Journal of Interactive
Marketing 26 (2) (2012) 83–91.
[26] R.F. Engle, A. Lunde, Trades and quotes: a bivariate point process, Journal of Financial
Econometrics 1 (2) (2003) 159–188.
[27] L. Erik, M. George, et al., Self-exciting point process models of civilian deaths in Iraq,
2010.
[28] F. Benevenuto, T. Rodrigues, V. Almeida, J. Almeida, K. Ross, Video interactions in
online video social networks, ACM Transactions on Multimedia Computing,
Communications, and Applications (TOMCCAP) 5 (4) (2009) 30.
[29] F. Figueiredo, Fabr, et al., The tube over time: characterizing popularity growth of
youtube videos, Proceedings of the fourth ACM international conference on Web
search and data mining, ACM, Hong Kong, China, 2011, pp. 745–754.
[30] I. Guy, M. Jacovi, et al., Same places, same things, same people?: mining user
similarity on social media, Proceedings of the 2010 ACM conference on Computer
supported cooperative work, ACM, Savannah, Georgia, USA, 2010, pp. 41–50.
[31] D. Harte, PtProcess: an R package for modelling marked point processes indexed by
time, Journal of Statistical Software 35 (8) (2010) 1–32.
[32] A.G. Hawkes, Spectra of some self-exciting and mutually exciting point processes,
Biometrika 58 (1) (1971) 83–90.
[33] A.G. Hawkes, D. Oakes, A cluster process representation of a self-exciting process,
Journal of Applied Probability 11 (3) (1974) 493–503.
[34] J. Howe, The rise of crowdsourcing, Wired Magazine 14 (6) (2006) 1–4.
[35] J. Howison, J.F. Olson, A. Kittur, K.M. Carley, Motivation through visibility in open
contribution systems, http://repository.cmu.edu/isr/493/ 2011 (accessed May 19,
2014).
[36] B.A. Huberman, Crowdsourcing and attention, Computer 41 (11) (2008) 103–105.
[37] L.B. Jabeur, L. Tamine, et al., Uprising microblogs: a bayesian network retrieval
model for tweet search, Proceedings of the 27th Annual ACM Symposium on
Applied Computing, ACM, Trento, Italy, 2012, pp. 943–948.
[38] B.J. Jansen, M. Zhang, et al., Twitter power: tweets as electronic word of mouth,
Journal of the American Society for Information Science and Technology 60 (11) (2009)
2169–2188.
[39] L. Jong Gun, M. Sue, et al., An approach to model and predict the popularity of
online contents with explanatory factors, Web Intelligence and Intelligent
Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on,
2010.
[40] S. Kong, L. Feng, et al., Predicting lifespans of popular tweets in microblog, Proceed-
ings of the 35th international ACM SIGIR conference on Research and development
in information retrieval, ACM, Portland, Oregon, USA, 2012, pp. 1129–1130.
[41] M. Lawrence, E.C. Michael, Hawkes process as a model of social interactions: a view
on video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4)
(2010) 045101.
[42] C. Lee, H. Kwak, et al., Finding influentials based on the temporal order of informa-
tion adoption in twitter, Proceedings of the 19th International Conference on World
Wide Web, ACM, Raleigh, North Carolina, USA, 2010, pp. 1137–1138.
[43] J.G. Lee, S. Moon, K. Salamatian, Modeling and predicting the popularity of online
contents with Cox proportional hazard regression model, Neurocomputing 76 (1)
(2012) 134–145.
[44] K. Lerman, T. Hogg, Using a model of social dynamics to predict popularity of news,
Proceedings of the 19th International Conference on World wide Web, ACM, Ra-
leigh, North Carolina, USA, 2010, pp. 621–630.
[45] J. Leskovec, K.J. Lang, et al., Statistical properties of community structure in large
social and information networks, Proceedings of the 17th international conference on
World Wide Web, ACM, Beijing, China, 2008, pp. 695–704.
[46] W.B. Lober, J.L. Flowers, Consumer empowerment in health care amid the internet
and social media, Seminars in Oncology Nursing 27 (3) (2011) 169–182.
[47] G.O. Mohler, M.B. Short, et al., Self-exciting point process modeling of crime, Journal
of the American Statistical Association 106 (493) (2011) 100–108.
[48] A. Noff, Learning from Starbucks —one tweet at a time, available at: http://www.
blonde20.com/blog/2009/11/19/learning-from-starbucks-one-tweet-at-a-time/
2009 (accessed August 25, 2012).
[49] A. Noff, The Starbucks formula for social media success, URL: http://thenextweb.
com/2010/01/11/starbucks-formula-social-media-success/ 2011.
[50] Y. Ogata, Statistical models for earthquake occurrences and residual analysis for
point processes, Journal of the American Statistical Association 83 (401) (1988)
9–27.
[51] Y. Ogata, D. Vere-Jones, Inference for earthquake models: a self-correcting model,
Stochastic Processes and their Applications 17 (2) (1984) 337–347.
[52] T. Ozaki, Maximum likelihood estimation of Hawkes' self-exciting point processes,
Annals of the Institute of Statistical Mathematics 31 (1) (1979) 145–155.
[53] R.D. Peng, Multi-dimensional point process models in r, 2002.
[54] H. Rui, A. Whinston, Designing a social-broadcasting-based business intelligence
system, ACM Transactions on Management Information Systems 2 (4) (2012) 1–19.
[55] D. Rybski, S.V. Buldyrev, et al., Communication activity in a social network: relation
between long-term correlations and inter-event clustering, Scientific Reports 2
(2012).
[56] SAS Harvard Business Review Analytic Services, The New Conversation:
Taking Social Media from Talk to Action, Harvard Business School Publishing, 2010.
[57] C.M. Sashi, Customer engagement, buyer–seller relationships, and social media,
Management Decision 50 (2) (2012) 253–272.
[58] G. Szabo, B.A. Huberman, Predicting the popularity of online content, Communica-
tions of the ACM 53 (8) (2010) 80–88.
[59] A. Veen, F.P. Schoenberg, Estimation of space–time branching process models in
seismology using an EM-type algorithm, Journal of the American Statistical Associ-
ation 103 (482) (2008) 614–624.
[60] T. Wang, M. Bebbington, et al., Markov-modulated Hawkes process with stepwise
decay, Annals of the Institute of Statistical Mathematics 64 (3) (2012) 521–544.
[61] W. Willinger, R. Rejaie, et al., Research on online social networks: time to face
the real challenges, SIGMETRICS Performance Evaluation Review 37 (3)
(2010) 49–54.
[62] E. Sadikov, A.G. Parameswaran, P. Venetis, et al., Blogs as Predictors of Movie Suc-
cess, International AAAI Conference on Weblogs and Social Media (ICWSM) (2009).
[63] M. Egesdal, C. Fathauer, K. Louie, J. Neuman, G. Mohler, E. Lewis, Statistical and sto-
chastic modeling of gang rivalries in Los Angeles, SIAM Undergraduate Research On-
line 3 (2010) 72–394.
[64] L. Mitchell, M.E. Cates, Hawkes process as a model of social interactions: a view on
video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4) (2010)
045101.
[65] A. Jalilian, ETAS: Modeling earthquake data using Epidemic Type Aftershock Se-
quence model, 2012.
[66] R.J. Hyndman, Y. Khandakar, Automatic time series for forecasting: the forecast
package for R, 2007.
Amir Hassan Zadeh is a PhD student in the Management
Science and Information Systems Department within the
Spears School of Business at Oklahoma State University. He
received his master's in Industrial and Systems Engineering
from Amirkabir University of Technology, and his bachelor's
from the Department of Mathematics and Computer Science,
Shahed University, Tehran, Iran. He has been published in
the Journal of Production Planning and Control, Annals of
Information Systems, Advances in Intelligent and Soft Computing,
African Journal of Business Management, and also in conference
proceedings of DSI, INFORMS and IEEE. His current research
interests include big data and analytics, social networks and
recommender systems. His research also involves decision
support systems, data mining and knowledge discovery,
and system analysis and design. Other areas of interest include supply chain management,
product design, and healthcare.
Ramesh Sharda is the interim Vice Dean of the Watson
Graduate School of Management, Watson/ConocoPhillips
Chair and a Regents Professor of Management Science and
Information Systems in the Spears School of Business at
Oklahoma State University. He also serves as the Executive
Director of the PhD in Business for Executives Program. He
has coauthored two textbooks (Business Intelligence and
Analytics: Systems for Decision Support, 10th edition, Prentice
Hall and Business Intelligence: A Managerial Perspective on
Analytics, 3rd edition, Prentice Hall). His research has been
published in major journals in management science and
information systems including Management Science, Operations
Research, Information Systems Research, Decision Support
Systems, Interfaces, INFORMS Journal on Computing, and many
others. He is a member of the editorial boards of journals such as Decision Support
Systems and Information Systems Frontiers. He is currently serving as the Executive Director
of Teradata University Network and received the 2013 INFORMS HG Computing Society
Lifetime Service Award.
A. Hassan Zadeh, R. Sharda / Decision Support Systems 65 (2014) 59–68