
Abstract

Today's social media platforms are excellent vehicles for businesses to build and foster relationships with customers. Companies create official fan pages on social network websites to provide customers with information about their brands, products, promotions, and more. Customers can become fans of these pages and like, reply to, share, or mark a brand post as a favorite. Marketing departments are using these activities to crowdsource marketing and increase brand awareness and popularity. Understanding how crowdsourcing-oriented marketing and promotion evolve would be helpful in managing such campaigns. In this paper, we adapt a multidimensional point process methodology to study crowd engagement activities and interactions. Specifically, we investigate brand post popularity as a joint probability function of time and number of followers. One-dimensional and two-dimensional Hawkes point process models are calibrated to simulate popularity growth patterns of brand post content on Twitter. Our results suggest that the two-dimensional point process model provides a good model for understanding such crowdsourcing behavior.
Modeling brand post popularity dynamics in online social networks
Amir Hassan Zadeh, Ramesh Sharda
Spears School of Business, Oklahoma State University, Stillwater, OK 74078, USA
Available online 13 May 2014
Keywords:
Online social networks
Social media marketing
Crowdsourcing
Brand post popularity
Brand-generated content
Hawkes point process
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
The emergence of Internet-based social media has started a new kind of conversation among consumers and companies, challenging traditional ideas about marketing and brand management while creating new opportunities for organizations to understand customers and connect with them instantly [56]. Research firm Chadwick Martin Bailey, in partnership with Constant Contact, conducted a study that analyzed the behavior of 1491 consumers ages 18 and older throughout the U.S. and revealed that 77% of consumers interact with brands on Twitter or Facebook primarily through reading posts and updates from the brands. They also noted that 60% of social customers are more likely to recommend a brand to a friend after following the brand on Twitter or Facebook, and 50% of them are more likely to buy from that brand as well. When it comes to "Liking" brand posts on Facebook, the reasons are varied, but for the most part, respondents said they like a brand on Facebook because they are a customer (58%) or because they want to receive discounts and promotions (57%) [21].
Today, the customer experience shared through social media, blogs and discussion forums is becoming a major driver of purchasing decisions, because these platforms give consumers a more influential voice in effecting changes in their own customer care [15]. Barnes' research [9] indicates that 70% of consumers use social media platforms "at least some of the time" to learn about the customer care offered by a company before they make a purchase. Furthermore, 74% of those consumers choose companies based on customer care experiences shared by others in online forums.
Over the past few years, big brands have started taking social media seriously, and social media marketing has become an inevitable part of their marketing plans. For example, Coca-Cola, one of the world's most recognizable brands, had 800 fans on Facebook in 2007 and 16.5 million in 2010, and it has currently passed 62.3 million likes. In 2012, in honor of the Coca-Cola Facebook page becoming the first retailer brand to receive 50 million likes, Coca-Cola developed a new Facebook application to identify and support individuals developing, influencing and shaping ideas and to ask them to collaborate with the Facebook community to spread those ideas globally. Through this application, Coca-Cola teaches the world to sing in perfect harmony, mobilizes millions of people behind their favorite cause, and encourages them to become more active and socially involved. As a result, consumers become involved in suggesting modifications of products and services and in the distribution of these innovations [11,12].
Starbucks, one of the top ten most followed brands on Twitter, uses tweets to share knowledge with customers and promote its latest products, campaigns and events [20]. With an average of ten tweets per day on Twitter, Starbucks extracts relevant knowledge from a network of current and prospective customers around the globe who express their expectations, likes and dislikes about the brand [20,48].
In 2010, Delta Airlines launched the first "social media ticket window" on Facebook, which allows customers to book a flight without having to go to any other website. Delta pointed out that Facebook is used by more customers while in flight than any other website, making it a "natural launching point" for its initiative [8]. Access to OSNs on mobile devices has certainly accelerated the popularity of OSNs.
Decision Support Systems 65 (2014) 59–68
Corresponding author. E-mail address: Amir.zadeh@okstate.edu (A. Hassan Zadeh).
http://dx.doi.org/10.1016/j.dss.2014.05.003
0167-9236/© 2014 Elsevier B.V. All rights reserved.
As more and more major brands have established their communities and fan pages within online social networks (OSNs) and started offering commerce opportunities delivered through social media platforms, crowdsourcing applications have become some of the most engaging tools in the digital marketing realm, enabling brands to realize the potential of their fans' input into the product development and market development processes [36]. Such innovative and creative initiatives enable businesses to improve their products, get brand recommendations, increase brand awareness and popularity, find new customers or even excite a specific demographic. In many cases where fans within social media are particularly passionate about a brand and its products, there will be a clear desire to become part of the product itself, have input as a group and energize the brand and its product lines [57].
Today's openness and flexibility of OSNs provide brands with a huge opportunity to get in touch with customers, crowdsource marketing tasks and enhance brand awareness. Understanding the structure and behavior of fans on OSNs is important to content providers because it enables better organization of brand post information, the design of effective online communities and the implementation of successful marketing campaigns. In online social interaction structures, the formation of relationships and interactions, how information moves on social media platforms, and how users respond to various stimuli like videos, contests, or posts are not clearly understood. Answers to these questions will offer a more complete picture of the social dynamics of networking and of how individuals manage their virtual relationships, follow their favorites or brand communities, and influence their friends to become followers as well. In this paper, we model the spread of information across Twitter, the most popular and widely used micro-blogging online social network [37], and analyze the data from a number of brand posts to discover what rules might govern the spread of information online. By understanding these behaviors, companies can become more effective in designing marketing campaigns. Being able to analyze a social network of customers, how customers interact on this type of platform, and what the rhythm and timing of the most engaging postings look like provides brands a competitive advantage through forecasting the spread of brand influence and intervening at times with promotions to foster relationships with customers.
The timing pattern of human communication in online social
networks is not random. It has been shown that the communication is
explained by emergent statistical laws such as non-trivial correlations
and clustering [55]. With the possibility of analyzing the multivariate
distribution of the occurrences of activity on OSNs, we can add to our
understanding of these interactions.
Standard models assume a Poisson distribution for event occurrence, which is an unrealistic assumption in many social systems. Point processes have shown promise for modeling social event patterns where the occurrence of an event increases the likelihood of subsequent events [22]. They offer a novel way of modeling and clustering high-frequency data that are irregular in time. Such a model uses a branching structure that distinguishes background events from offspring events and is able to capture bursts of activity, dynamics and reactions over time.
In this paper, we model the popularity of a brand post or, more generally, of online content on online social networks. The popularity of online content is not a well-defined term but a highly subjective one [39]. Brand post popularity can be defined as a mixture of various factors such as vividness, interactivity, the content of the brand post (information, entertainment), and the number of times the brand post is mentioned by fans [25]. We take the perspective of an individual user who infers the popularity of a brand's tweet from publicly observable data, namely the number of impressions it has received (including the total number of retweets, replies and favorites) or the lifespan of its threads over its entire timeline. A tweet is considered popular if, over its lifespan, it receives a number of retweets, replies, and favorites that is no less than a certain threshold [40,43]. Our goal is to develop a mechanism for capturing the evolution of the popularity of online content posted by brands on OSNs. In our approach, a model is specified via the conditional intensity for each event. This provides a powerful and more natural modeling framework for multivariate social network event data. Specifically, the current study examines the influence of user activities on the timing and frequency of a brand post. The self-exciting Hawkes point process and the ETAS (Epidemic Type Aftershock Sequences) models are used to analyze data on brand post popularity. Unlike Poisson processes, the self-exciting Hawkes point process and ETAS are counting processes that behave as continuous-time non-Markov chains, because they depend on the history of the process (i.e., $H_t$). They take states 0, 1, 2, ... and move from state $n$ to state $n + 1$, where $n \ge 0$. In the case of the content popularity problem, each state indicates the total number of users who have hit the content by time $t$, and $\lambda(t)$ is the transition rate from one state to the next.
The remainder of the paper is organized as follows. The next section starts with a discussion of online social networks (OSNs). We also review the literature on stochastic point processes and their many uses. The following section describes how we map content popularity to the point process framework. We also introduce the brand post data collected from Twitter and the assumptions necessary to proceed with the analysis. In Section 4, we fit competing models to the data and then compare the accuracy and complexity of the models in capturing bursts of activity on OSNs. The managerial implications of our findings, limitations and possible directions for future work are discussed in Section 5. The final section presents a general conclusion of the paper.
2. Review of the literature
2.1. Online social networks
During the past few years, millions of people have used social media
applications (Facebook, Twitter, YouTube, Google+, etc.) as a part of
their daily online activities [30]. In 2011, more than half of social
media users followed brands on social media sites, and brands are
increasingly investing in social media to crowdsource marketing activi-
ties, indicated by worldwide marketing spending on social networking
sites of about $4.3 billion [25].
Today companies develop official fan pages and online communities within online social networks to understand customers, connect with them instantly and provide them with information about their brands, products, promotions and more. Meanwhile, brand fans can like, comment on and share brand posts. Users of Twitter can retweet, which is much like a Facebook share. Followers retweet the tweets of those they are following to propagate information to other people. People respond to popular users by "replying" and/or "mentioning" [7]. Followers can also mark content as a favorite, which is functionally similar to the "like" action on Facebook. The "like" and "retweet" buttons are the easiest ways for Facebook and Twitter users, respectively, to join in on the brand conversation and give feedback. Comments and replies on brand posts can be positive, neutral or negative. In most cases, social media users who choose to become fans of a product are those who are particularly passionate about a brand and its products and enjoy having input or being a member of a group of like-minded fans. The brand benefits from these fans because they help it communicate with a diverse audience of other consumers.
Such individual activities associated with a brand post are visible to network friends and often influence those friends to retweet, like, or mention. If a company produces fan page updates that earn high quality scores, it will reap the benefits of greater exposure and possibly an increased fan base, because other network members will see the updates in their news feeds. Jansen et al. [38] discuss OSNs as a form of electronic word of mouth (eWOM) for sharing consumer opinions concerning brands and as a part of an organization's marketing strategy. This openness and flexibility of social media provides businesses a great opportunity to bring together a group of people, or crowd, to solve a problem or engage in an activity and to achieve powerful social engagement and activation.
In many ways, the interactivity of social media supports crowdsourcing. Crowdsourcing is a term coined by journalist Jeff Howe [34] to mean "taking advantage of the talent of the public" [46]. Social media provide platforms for existing and potential customers to engage, learn, and be entertained. They enable content marketers to crowdsource their marketing, reaching vast audiences via word of mouth. For example, Starbucks developed the "My Starbucks Idea" campaign, an online customer community where customers are asked to contribute their views and ideas about the company. It keeps customers in the loop on which business ideas Starbucks is currently implementing at both the brand and the product level. Because this platform is linked to Facebook, Twitter and other social media websites, customers are able to see what others are suggesting, vote on ideas and check out the results [49].
Internet service providers, content creators, and online marketers would like to be able to predict how many views and actions an individual item might generate on a given website [58]. The same is true for companies that benefit from online social networks by utilizing fan pages and web advertising. Leveraging social networking sites to understand what is most popular helps e-commerce providers decide what content to promote on their websites. E-commerce providers can use these social signals to ensure that the products or services people are talking about appear higher in their product listings.
Over the last few years, much effort has been devoted to exploring the statistical features of content popularity in online social networks (OSNs). Most previous empirical analyses of OSNs have treated such networks as static [29,61], analyzing the social network on a single data snapshot [3,28,45]. However, such social network systems are inherently dynamic: they are characterized by high burstiness and a strong positive correlation between two users' activities, and they consist of a set of dyadic, directed, time-stamped, cross-affected and sometimes weighted events. To the best of our knowledge, only a few studies have analyzed popularity growth patterns of content on OSNs using prediction models [16,22,29,44,58]. Crane and Sornette [22] propose contagion models of YouTube video viewing dynamics to understand how popularity bursts can be described. They differentiate four classes of popularity dynamics (memoryless, viral, quality and junk), all of which are explained by properties of the Hawkes point process. Szabo and Huberman [58] find a strong linear correlation between early and later popularity of content on the YouTube and Digg networks. This correlation confirms that if content is popular when new, it will continue to be popular as it ages. Another interesting work on social media mining is reported by Chatzopoulou et al. [17], who find a strong correlation between the total number of comments (or favorites) and the total view count on YouTube. Relatively few studies in the literature explore the capability of online social networks to predict real-world outcomes such as the revenue or release time of a product on the market. Sadikov et al. [62], Abel et al. [1] and Rui and Whinston [54] present case studies in which blogosphere content is used as a predictor of movie and music success. They show that the number of microblog views of content related to the music or movie (such as Facebook posts, tweets, YouTube videos, etc.) can provide an accurate prediction of the movie's or music's success.
While previous studies build popularity models based on a one-dimensional function of time, we suggest that content popularity can be a joint probability function of time and the number of followers. We focus on incorporating the number of followers as an influential metric into predictive models of content popularity, explicitly looking at the impact that influential users have in persuading their followers to contribute to brand post popularity. In this paper, we adapt a mathematical framework based on the self-exciting point process to study brand post popularity on online social networks. Specifically, we calibrate one-dimensional and two-dimensional self-exciting point process models to estimate popularity growth patterns of brand post content on Twitter.
2.2. Stochastic point processes
In this section we present the statistical theory underlying our approach. First, we define the conditional intensity function for a point process. A point process is a stochastic model commonly used to describe the occurrence of discrete events in time and space. It can be viewed as a list of times $t_1, t_2, \ldots, t_n$ at which the corresponding events $1, 2, \ldots, n$ occur [27]. Intuitively, a point process is characterized by its conditional intensity $\lambda(t)$, which represents the mean spontaneous rate at which events are expected to occur given the history of the process up to time $t$ [50]. In particular, a version of the conditional intensity may be given by
\[
\lambda(t) = \lim_{\Delta t \to 0} \frac{E\left[\, N[t, t+\Delta t] \mid H_t \,\right]}{\Delta t}
\]
where $H_t$ denotes the history of events prior to time $t$, and the expectation is taken over the number of events $N[t, t+\Delta t]$ occurring between time $t$ and $t+\Delta t$. The Poisson process is a special case of a point process in which the times between successive arrivals are independent, identically distributed exponential random variables. The conditional intensity of a Poisson process is deterministic, which means that events are driven only by the conditional intensity and not by earlier events. In other words, a point process is classified as a Poisson process if events occurring at two different times are statistically independent of one another, meaning that an event at time $t_1$ neither increases nor decreases the probability of an event occurring at any subsequent time [27]. Since a homogeneous Poisson process indicates complete randomness, it is most commonly used as a suitable benchmark for assessing self-exciting process models.
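The Poisson benchmark described above can be illustrated with a minimal Python sketch. It draws independent exponential waiting times and accumulates them into event times. The function name `simulate_poisson`, the rate of 2 hits per hour and the 1000-hour horizon are illustrative assumptions, not values from the study.

```python
import random


def simulate_poisson(rate, horizon, seed=0):
    """Event times of a homogeneous Poisson process on [0, horizon)."""
    rng = random.Random(seed)
    times = []
    # Inter-arrival times are i.i.d. exponential with mean 1 / rate.
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Hypothetical example: a post that receives on average 2 hits per hour.
arrivals = simulate_poisson(rate=2.0, horizon=1000.0)
empirical_rate = len(arrivals) / 1000.0  # should be close to the true rate
```

Because earlier arrivals never change the rate, the empirical rate stays close to the constant rate over any long window, which is what makes this process a natural baseline for bursty data.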
A point process is called self-exciting if any one event increases the likelihood of future events [32]. The self-exciting, or Hawkes, point process is a versatile point process that has been extensively studied from both theoretical and practical points of view. It is defined by its conditional intensity function
\[
\begin{aligned}
\lambda(t) &= \mu + \beta \int_{-\infty}^{t} \phi(t - u)\, dZ(u) = \mu + \beta \sum_{\{i:\, t_i < t\}} \phi(t - t_i), \\
&\int_{0}^{\infty} \phi(\upsilon)\, d\upsilon = 1, \qquad \phi(\upsilon) \le 1, \qquad \upsilon \ge 0
\end{aligned}
\tag{1}
\]
where $Z$ is the normal counting measure [33]. The rate of events $\lambda(t)$ is decomposed into the sum of a Poisson background rate $\mu$, which in most applications is assumed to be constant in time [33], and a self-exciting component in which events trigger an increase in the rate of the process. The self-exciting part of the process has two components: $\beta$ and $\phi$. $\beta$ is a constant that reflects the magnitude of self-excitation, and $\phi$ is a density function describing the waiting time (lag) distribution between exciting and excited events. The triggering density should be a suitably skewed distribution whose overall shape reflects a long time dependency.
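As a minimal sketch of how the sum form of Eq. (1) is evaluated, the Python snippet below adds one lagged contribution per past event to the background rate. The exponential waiting-time density, the parameter values and the event times are assumptions chosen for illustration only.

```python
import math


def exponential_density(decay):
    """Waiting-time density phi(v) = decay * exp(-decay * v); integrates to 1."""
    return lambda lag: decay * math.exp(-decay * lag)


def hawkes_intensity(t, event_times, mu, beta, phi):
    """Sum form of Eq. (1): mu + beta * sum of phi(t - t_i) over t_i < t."""
    return mu + beta * sum(phi(t - ti) for ti in event_times if ti < t)


phi = exponential_density(decay=1.5)   # assumed kernel, for illustration
history = [0.5, 1.0, 1.2]              # hypothetical event times
quiet = hawkes_intensity(0.4, history, mu=0.2, beta=0.8, phi=phi)
excited = hawkes_intensity(1.25, history, mu=0.2, beta=0.8, phi=phi)
```

Before the first event the intensity equals the background rate, while shortly after a cluster of events it rises above it. Any other density on the nonnegative lags could be passed in place of `phi`.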
In Hawkes-based analysis, the events can be viewed as the realization of a multivariate point process. That is, every single event is characterized by its occurrence time and its type. Notationally, $\{T_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ are random variables, where $T_i$ is the occurrence time of the $i$th event and $Z_i \in \{1, 2, \ldots, M\}$ indicates the $i$th event's type [13]. A point process is said to be mutually exciting if any one event of a specific type at time $t_1$ increases the likelihood of an event in another type's stream occurring at a later time $t_2$. The mutually exciting Hawkes process is used to capture cross interactions and mutual information between one sequence of events and another. Similar to the self-exciting Hawkes process, a mutually exciting Hawkes process with $n$ event types is defined by its conditional intensity functions
\[
\begin{aligned}
\lambda_k(t) &= \mu_k + \sum_{j=1}^{n} \sum_{\{i:\, t_i^{(j)} < t\}} \beta_{kj}\, \phi_{kj}\!\left(t - t_i^{(j)}\right), \qquad k = 1, 2, \ldots, n, \\
&\int_{0}^{\infty} \phi_{kj}(\upsilon)\, d\upsilon = 1, \qquad \phi_{kj}(\upsilon) \le 1, \qquad \upsilon \ge 0
\end{aligned}
\tag{2}
\]
where the rate of event type $k$, $\lambda_k(t)$, is partitioned into the sum of a Poisson background rate $\mu_k$ and mutually exciting components in which events trigger an increase in the rate of the process. Here $t_i^{(j)}$ denotes the time of the $i$th event of type $j$. $\beta_{kj}$ is a constant that reflects the strength of self-excitation for $k = j$ and the strength of mutual excitation for $k \ne j$, and $\phi_{kj}$ is a density function describing the triggering distribution between excited event type $k$ and exciting event type $j$.
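The mutually exciting intensity of Eq. (2) can be sketched in Python as follows. The sketch assumes one exponential triggering density with a common decay for every pair of types, and it stores the excitation strengths in a matrix whose entry `beta[k][j]` is the amount a type-`j` event adds to the type-`k` stream. The two event types and all numerical values are hypothetical.

```python
import math


def mutual_intensity(t, k, events, mu, beta, decay):
    """Rate of type-k events at time t.

    events: list of (time, type) pairs; beta[k][j] is the excitation that a
    type-j event adds to the type-k stream. An exponential density with a
    common decay is assumed for every pair of types.
    """
    rate = mu[k]
    for ti, j in events:
        if ti < t:
            rate += beta[k][j] * decay * math.exp(-decay * (t - ti))
    return rate


# Hypothetical two-type example: 0 = retweet, 1 = reply.
mu = [0.1, 0.05]
beta = [[0.5, 0.3],   # retweets excited by retweets / by replies
        [0.0, 0.4]]   # replies excited only by earlier replies
events = [(1.0, 0), (1.5, 1)]
retweet_rate = mutual_intensity(2.0, 0, events, mu, beta, decay=1.0)
reply_rate = mutual_intensity(2.0, 1, events, mu, beta, decay=1.0)
```

The diagonal entries of `beta` play the role of self-excitation and the off-diagonal entries the role of cross-excitation, so setting an off-diagonal entry to zero switches off the influence of one stream on another.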
Hawkes-based analysis has long been used in seismology to recognize similar clustering patterns in earthquake occurrence data and to predict subsequent earthquakes, or aftershocks [2,51,59,60]. It has been applied to many other areas such as finance [10,13], neurophysiology [19], ecology, social networks [5,27,47] and online social networks [22,41].
Engle and Lunde [26] and Bowsher [13] present a bivariate Hawkes process model to jointly analyze the timing of trades and quote arrivals in stock markets. Chavez-Demoulin et al. [18] and Bacry et al. [6] use the Hawkes process structure to estimate value at risk for portfolios of traded assets over a given holding period. Dassios and Zhao [24] present the dynamic contagion process as a generalization of the Cox process and the Hawkes process and use it to model a risk process with the arrival of claims.
Mohler et al. [47], Egesdal et al. [63], and Erik et al. [27] use self-
exciting point process models to predict violent events and security
threats. Erik et al. [27] utilize step functions parameterized by various
values, linear functions and non-parametric approaches as non-
stationary background rates (μ) of the point process.
Alexey et al. [5] use a self-exciting point process to recover missing data in the series of interaction events between agents in a social network. They apply this model to the Los Angeles gang network to predict the affiliation of unknown offenders.
Recently, this approach has been used to analyze the dynamics of online social networks. Crane and Sornette [22] and Mitchell and Cates [64] analyze a family of self-exciting point processes to model the correlated event timing of YouTube video views. They deploy a Pareto distribution (power law) as the distribution of waiting times between cause and action, describing the cascade of influences on the online social network. They show that a Hawkes process with power law distributions offers considerable flexibility to calibrate the model to the characteristics of YouTube views. These characteristics are classified by a combination of endogenous/exogenous user interactions and the ability of viewers to influence others to respond across the network (critical/subcritical).
Howison et al. [35] deploy a mutually exciting Hawkes process to understand the dynamics of user-generated content on open contribution platforms such as Wikipedia and Linux. They study the influence of the visible activity of others on the timing and amount of participation in the Wikipedia environment. They model the time at which a response to an event occurs as a log-normal distribution. However, this kind of analysis has not yet been conducted on social media activities, in particular on Twitter postings and follow-up actions. Moreover, the role of influential users within OSNs has not yet been considered in such predictive models.
In this paper, we provide a more realistic investigation of the benefits of stochastic point processes for predicting brand post popularity on OSNs. To the best of our knowledge, relatively few studies in the literature explore the capability of point processes to model dynamics and growth patterns on online social networks. We use the ETAS model, one of the most widely used point processes in the literature, to shed light on how content popularity on OSNs can be described by a function of time and the number of followers. The number of followers is one of the best metrics to demonstrate the role of influential users within OSNs.
3. Problem formulation
Understanding the rules governing collective human behavior, especially as they affect social interactions on Internet-based social media, is a difficult task in the field of social media analytics. Our main objective is to analyze how the popularity of individual brand posts evolves when the posts are shared with people on social media outlets. We examine how fans' sequential interactions with network friends contribute to the popularity of a brand post. The majority of brand posts experience few hits and can be well described by a Poisson process. In such cases of little activity, popularity is quite steady. In contrast, some brand posts experience bursts of activity and word-of-mouth growth through the friend-sharing features of OSNs. A standard stochastic process (i.e., a Poisson process) fails to capture such bursts of popularity, since it is based on the assumption of independent arrivals, which is unrealistic for the follow-up activities arising from a specific tweet or post. Clustering point processes and epidemic-type models are a good fit for modeling such phenomena.
In online social network analysis, social activity event data can be viewed as the realization of a multivariate point process. Each event is characterized by its occurrence time ($t_i$) and its magnitude of influence, i.e., the number of followers ($m_i$), with an additional mark attached to it representing the event's type ($z_i$). Retweeting, replying, tagging and marking a brand post as a favorite are different types of user activities. For the purpose of this paper, we combine these event types into one common set of events.
The beauty of major OSN platforms is that they are structurally isomorphic. Their similar features, while labeled with site-specific vocabulary, operate in the same way, making studies of their data easier. For the purposes of this paper, we use Twitter terminology to explain the properties of OSNs.
In order to build our two-dimensional point process, we define $\{T_i, M_i, Z_i\}_{i \in \{1, 2, \ldots\}}$ as random variables, where $T_i$ is the occurrence time and $M_i$ the magnitude of the $i$th triggering event, and $Z_i \in \{1, 2, \ldots\}$ indicates the type of the $i$th event. Any event of a specific type at time $t_1$ increases the likelihood of an event of any type occurring at a later time $t_2$. Now, we formulate the problem using Hawkes process properties and discuss how those mechanisms work on the timeline.
3.1. Candidate models
First, we formulate one sequence of events using a self-exciting point process to measure the likelihood that individuals are talking about the brand, regardless of the type of event. This model lets us aggregate content popularity from across Twitter into a single stream of information. It captures the idea that any given activity on a brand post can be attributed either to a background Poisson process $\mu$ (in this case constant) or to a foreground self-exciting process, as follows:
\[
\lambda(t) = \Lambda(t \mid H_t) = \mu + \sum_{\{i:\, t_i < t\}} \beta\, \phi(t - t_i)
\tag{3}
\]
The summation component indicates the influence of users' activity on the stream. It describes how past events at times $t_i$ influence the current event rate. Parameter $\beta$ indicates the amount of excitation an event contributes to the stream. In behavioral terms, it can be described as the number of potential users influenced directly by individuals who retweeted or replied to the brand post at time $t_i$ in the past. As mentioned earlier, $\phi$ is a triggering function describing the distribution of the waiting time between a trigger and the response from users who are influenced to recommend the brand. Mining of our data on the life cycles of various brand posts on Twitter indicates that, unlike on YouTube, a brand's tweet gets most of its hits within the first days, or even hours, of its life cycle and quickly becomes obsolete. Since most responses occur almost immediately in the Twitter case, we need a distribution that enforces the highest intensity at the most immediate possible time. Furthermore, it should be skewed and long-tailed to reflect a long time dependency and burstiness.
3.1.1. Model 1
First, we use an exponential distribution for the response density, giving the conditional intensity
\[
\lambda(t) = \mu + \sum_{\{i:\, t_i < t\}} \beta\, e^{-\alpha (t - t_i)}
\tag{4}
\]
where $t - t_i$ is the time elapsed since event $i$, and $\alpha$ reflects the rate of decay of the triggering density, which controls how long self-excitation lasts following a tweet. If $\alpha$ is large, users' mentions of the brand post last only a short while, and only a few events (retweets or replies) are added above the background rate over a short period after the brand's initial tweet. Conversely, if $\alpha$ is small, self-excitation lasts for a much longer period of time and many more events are added to the background rate.
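To make the role of the decay parameter concrete, the following Python sketch simulates event times from the intensity in Eq. (4). It uses Ogata's thinning algorithm, a standard simulation technique that is brought in here for illustration and is not described in the text. The function names and parameter values are hypothetical, and the sketch assumes that the ratio of the excitation to the decay parameter is below 1 so that the simulated process does not explode.

```python
import math
import random


def model1_intensity(t, events, mu, beta, alpha):
    """Eq. (4): mu + sum of beta * exp(-alpha * (t - t_i)) over t_i < t."""
    return mu + sum(beta * math.exp(-alpha * (t - ti)) for ti in events if ti < t)


def simulate_model1(mu, beta, alpha, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # The kernel only decays between events, so the intensity just after
        # time t (counting an event at t itself) is an upper bound until the
        # next event arrives.
        bound = mu + sum(beta * math.exp(-alpha * (t - ti)) for ti in events)
        t += rng.expovariate(bound)
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= model1_intensity(t, events, mu, beta, alpha):
            events.append(t)


# Hypothetical parameters; beta / alpha < 1 keeps the process from exploding.
burst_times = simulate_model1(mu=0.5, beta=0.5, alpha=1.0, horizon=500.0)
```

With these illustrative values the background process alone would produce about 250 events over the horizon, and self-excitation adds further events on top of that baseline. Rerunning with a different decay while holding the other parameters fixed shows how the length of each burst changes.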
3.1.2. Model 2
There is another characteristic of events in OSNs that should
be taken into consideration. We suggest that a user's contribution
to future events depends not only on the occurrence time, but also
on the number of followers he or she has. Therefore, our second model
takes into consideration two quantities: the occurrence time and the
magnitude of the triggering event (number of friends and followers).
That is, the excitation produced by an event scales not just with the
occurrence time, but also with the magnitude of the triggering event.
One particular form of a self-exciting point process is the ETAS
model (space–time–magnitude Hawkes process), which is widely
used to describe spatial–temporal patterns. This model takes more
parameters (inputs) into account. We use an early form of this model
(i.e., the time–magnitude Hawkes process), similar to [50], to quantify the
popularity of a brand tweet. This model incorporates the magnitudes and
occurrence times of triggering events concurrently. The conditional
intensity for the ETAS model is given by

$$\lambda(t) = \Lambda(t \mid H_t) = \mu + \sum_{\{i:\, t_i < t\}} \phi(t - t_i, m_i) \tag{5}$$
where the history of the process H_t = {(t_i, m_i): t_i < t} also includes
the magnitudes m_i, μ is the arrival rate of new users, and ϕ is a
triggering function. The ETAS model uses a combination of the exponential
distribution and the Pareto distribution for the triggering density ϕ,
giving the conditional intensity

$$\lambda(t) = \mu + \sum_{\{i:\, t_i < t\}} \phi(t - t_i, m_i) = \mu + \sum_{\{i:\, t_i < t\}} \frac{\beta}{(t - t_i + c)^{1+p}}\, e^{\alpha (m_i - M_0)} \tag{6}$$
where the power-law term governs the temporal distribution of subsequent
triggered events and the exponential term gives the factor by which
the user's magnitude m_i inflates the expected number of influencers. The
term t − t_i denotes the time elapsed since event i. β is the amplitude
coefficient indicating the amount of direct excitation triggered by
event i. The exponent p is the decay rate, α is interpreted as the
productivity rate controlling the number of potential users influenced by
individuals in the past, and c is the time offset, which is empirically
determined from the dataset under consideration. Furthermore, M_0 is
the lowest magnitude (number of followers), which is taken
from the dataset (rescaled to the appropriate range).
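Eq. (6) can be evaluated directly once a history of (time, magnitude) pairs is available. The sketch below assumes magnitudes have already been rescaled as described above; the function name and the example history are hypothetical, and the parameter values are the estimates reported in Table 2, used here only as an illustration.

```python
import math


def etas_intensity(t, history, mu, beta, alpha, p, c, m0):
    """Eq. (6): background rate plus power-law-in-time, exponential-in-magnitude triggering."""
    total = mu
    for ti, mi in history:
        if ti < t:  # only strictly earlier events excite the process
            time_decay = beta / (t - ti + c) ** (1.0 + p)
            magnitude_boost = math.exp(alpha * (mi - m0))
            total += time_decay * magnitude_boost
    return total


# Estimates reported in Table 2 (illustrative use only).
params = dict(mu=0.5886, beta=0.01376837, alpha=2.1254544,
              p=1.157623, c=0.01343711, m0=0.3)

# Hypothetical history: (minutes since the tweet, rescaled follower count).
few_followers = [(0.0, 0.3), (1.0, 0.3)]
many_followers = [(0.0, 0.3), (1.0, 0.9)]

print(etas_intensity(1.1, few_followers, **params))
print(etas_intensity(1.1, many_followers, **params))
```

The second history differs only in the magnitude of the most recent event, so the comparison isolates the contribution of the exponential magnitude term.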
3.2. Empirical testing of the models on Twitter datasets
We next apply these models to real data. As mentioned earlier, a
basic analysis of our data on the life cycle of various brand posts on
Twitter, using the Topsy API and the Twitter search API, indicates that a
brand's tweet typically gets most of its activity within the first days,
or even hours, of its life cycle and hence quickly becomes obsolete. Since
we focus on brand post popularity, we take brand posts that experience
bursts of activity and electronic word-of-mouth growth through the
friend-sharing features of Twitter. Using Twitter's publicly available
API, we crawled the Twitter information streams of more than 120 major
brands that were among the top 500 most valuable global brands [14].
These brands were among the most followed brands and were actively
posting tweets on their fan pages on Twitter. They come from
different product and service categories, including clothing, cosmetics,
electronics, accessories, foods, beverages, automotive, credit cards,
airlines, etc. Together, these brands published more than 26,500 tweets
in a typical one-week period to provide information to their
customers and promote their latest products, campaigns and events.
We downloaded information on all subsequent activities (retweets,
replies, and marks as favorite) for all of these 26,500
brand post tweets. We observed that the majority of brand post tweets
experience few hits and therefore, as mentioned earlier, can be modeled
by a Poisson process. However, some brand posts became a
major topic ("trending" in Twitter parlance), were frequently mentioned
by the brand's followers, and experienced bursts of activity. For the
purpose of this paper, we searched through the downloaded tweets to
isolate those that are original tweets from the brands and
that have been mentioned (retweeted, replied to, or marked
as favorite) at least 300 times. A total of 221 such brand post tweets,
followed by many hits and bursts of activity, were identified. At this
stage, 125,861 Twitter activities, including information on the original
tweets and all subsequent retweets, replies and marks as favorite, were
processed. The data were divided into individual
datasets. Each dataset contains an individual brand
post tweet and its subsequent activities (retweets, replies, and marks as
favorite), along with their timestamps, user ids and the number of
followers of each user who contributes to the tweet stream. We take into
consideration only the timestamp of events and the number of followers,
while aggregating the events "retweet", "reply", and "mark as favorite"
into a single stream of information.
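The selection and aggregation steps described above can be sketched as follows. This is only an illustration under assumed data structures: the record keys (`tweet_id`, `kind`, `timestamp`, `followers`) and the function name are hypothetical and do not correspond to any Twitter API schema; only the threshold of 300 activities comes from the text.

```python
from collections import defaultdict

MIN_ACTIVITIES = 300  # popularity threshold stated in the text


def build_datasets(activities, min_activities=MIN_ACTIVITIES):
    """Group activities by brand tweet, keep popular tweets, merge event types.

    `activities` is an iterable of dicts with hypothetical keys:
    'tweet_id', 'kind' ('retweet' | 'reply' | 'favorite'),
    'timestamp' (minutes since the original tweet) and 'followers'.
    """
    by_tweet = defaultdict(list)
    for activity in activities:
        # The event type is dropped on purpose: all kinds feed one stream.
        by_tweet[activity["tweet_id"]].append(
            (activity["timestamp"], activity["followers"])
        )
    return {
        tweet_id: sorted(stream)  # chronological (timestamp, followers) pairs
        for tweet_id, stream in by_tweet.items()
        if len(stream) >= min_activities
    }
```

Each value in the returned dictionary is one dataset in the sense used above: a single chronological stream of (timestamp, number of followers) pairs for one brand post tweet.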
We investigated the content of these 221 most popular brand tweets
and note that the primary topic was brand campaigns on Twitter
(44%). Some of these campaigns use Twitter to communicate with
fans and followers. Several campaigns use Twitter hashtags to deliver
rewards and sweepstakes to customers. Other campaigns run interactive
competitions to create buzz with fans. The second most engaging
brand tweet category relates to events held by the brands on
Twitter (36%), including surveys. The rest of the most popular
brand tweets relate to information and entertainment posted
by brands on Twitter.
3.3. Parameter estimation, goodness-of-fit, and model comparison
Given brand post data collected from Twitter, we use maximum
likelihood estimation (MLE) methods to estimate the parameters
of candidate self-exciting point process models. While numerical
optimization routines such as the quasi-Newton method, the conjugate
gradient method, the simplex algorithm of Nelder and Mead and the
simulated annealing procedure [23,50,52] are often used to compute
maximum likelihood estimates for self-exciting point process
models, we use the expectation-maximization (EM) algorithm provided
by Veen and Schoenberg [59] to estimate parameters. Veen and
Schoenberg [59] have demonstrated that the EM algorithm, as the estimation
method of choice for incomplete data problems, is extremely robust
and accurate compared to traditional methods. The brand post
popularity can be viewed as an incomplete data problem in which the
unobservable (latent) variables indicate whether an activity is
a background event or a foreground event that was
triggered by a preceding activity.
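The latent-variable view can be made concrete for the exponential kernel of Model 1. The following is a minimal sketch of only the expectation step, assuming sorted event times and given parameter values: for each event it computes the probability that the event is a background event rather than one triggered by an earlier event. It is not the code of Veen and Schoenberg [59]; the function name and the example numbers are hypothetical.

```python
import math


def background_probabilities(event_times, mu, alpha, beta):
    """For each event j, return mu / lambda(t_j) under the intensity of Eq. (4).

    Assumes `event_times` is sorted in increasing order, so the events
    before index j are exactly the candidate triggers of event j.
    """
    probs = []
    for j, tj in enumerate(event_times):
        triggered = sum(
            beta * math.exp(-alpha * (tj - ti)) for ti in event_times[:j]
        )
        probs.append(mu / (mu + triggered))
    return probs


# Hypothetical times: two events close together, then an isolated one.
probs = background_probabilities([0.0, 0.1, 50.0], mu=0.05, alpha=1.0, beta=0.5)
print([round(p, 3) for p in probs])
```

In a full EM iteration, these probabilities (together with the complementary triggering probabilities) would be used in the maximization step to update μ, α and β; that step is omitted here.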
Finally, the reliability of each model is statistically tested using the
Kolmogorov–Smirnov (K–S) statistic to assess the extent to which the
model fits the data. This criterion provides useful information on the
absolute goodness-of-fit of candidate models. Furthermore, the relative
ability of each model to describe the data is measured by computing the
Akaike information criterion (AIC) [4]. The Akaike statistic provides
germane numerical comparisons of the global fit of competing models.
The required package functions in R are used for fitting both of the
above models to the datasets (ptproc package [53], PtProcess [31],
ETAS package (Jalilian [65]), and R code [59]).
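For readers who want to see the two criteria spelled out, the sketch below computes the AIC from a log-likelihood and the two-sample K–S statistic D from two samples. It is a plain-Python illustration of the standard definitions, not the R routines used in this study; the function names and the toy inputs are hypothetical, and p-values are not computed.

```python
from bisect import bisect_right


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:  # the maximum gap is attained at an observed value
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d


# Toy example: observed versus simulated inter-event times (hypothetical).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [3.0, 4.0, 5.0, 6.0]
print(ks_two_sample_statistic(observed, simulated))
print(aic(log_likelihood=-10.0, n_params=3))
```

A smaller D indicates that the simulated sample tracks the observed one more closely, and a smaller AIC indicates a better trade-off between fit and the number of parameters.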
Furthermore, we employ autoregressive integrated moving average
(ARIMA) models as benchmarks, which have been regarded as the
closest framework to point processes for event data [23]. We used the
R package forecast (Hyndman et al. [66]) to perform the time series
analysis. This package allows fitting of time series and linear models.
The functions available in this package conduct a search over possible
models within the order constraints provided and return the best
ARIMA model for a univariate time series according to AIC values. In
the next section, we first present our results for one of our crawled
datasets to illustrate how our approach works, and then we discuss the
goodness-of-fit of the candidate models by computing their average
AIC values across all the datasets that we compiled from Twitter.
In summary, Fig. 1 illustrates the methodology used for modeling
content popularity on Twitter in this paper. At each stage, the inputs,
the required R packages used to produce the results and the outputs are
specified clearly.
Fig. 1. The methodology used in predicting the online content popularity on Twitter.
[Flowchart. Input: individual tweet dataset (user ids, timestamps, number of followers) in CSV or XML format, drawn from the Twitter database. Point process branch: parameter estimation (Veen and Schoenberg's R code, PTPROC and PTPROCESS R packages), then simulation (PTPROC, PTPROCESS and ETAS R packages); output: estimated parameters, log-likelihood function value, simulated conditional intensity function, K-S and AIC values. Benchmark branch: parameter estimation and simulation for the ARIMA (p,q,r) model (Forecast package); output: estimated parameters for the best fitted ARIMA model, log-likelihood function value, simulated ARIMA model, AIC value. Final step: models comparison by AIC values.]
Fig. 2. Frequency of different types of events. [Chart of the frequency of activities by type (Retweet, Reply, Mark as Favorite); values shown: 412, 127, 212.]
Fig. 3. A histogram of the number of events per minute.
Fig. 4. Simulated conditional intensity function for model #1.
4. Results and analysis
In this section, we focus on one particular dataset to demonstrate
how the models work in practice. We set Δt = 1 min for the bin width in
order to control the amount of data through parameter t. This specific
dataset contains 751 events spanning 10,080 min (one week).
Figs. 2 and 3 provide the frequency of different types of hits and a
histogram of the frequency of all events per minute, respectively. The
largest number of events occurring in a single minute is 15, and the mean
number of events in a single minute is 0.074. Of the 751 events, 278
occurred during the first two days. Thus, we reason that people respond to a
brand post tweet immediately. Therefore, we would expect that the
distributions to be selected should impose the largest probability mass
at the most immediate possible response time.
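The binning with Δt = 1 min can be sketched as follows, assuming event times are expressed in minutes since the original tweet. The function name and the short list of example times are hypothetical; only the totals (751 events over 10,080 min) come from the text.

```python
import math


def events_per_minute(event_times, horizon_minutes):
    """Count events in consecutive one-minute bins [k, k + 1)."""
    counts = [0] * horizon_minutes
    for t in event_times:
        k = math.floor(t)
        if 0 <= k < horizon_minutes:
            counts[k] += 1
    return counts


# Hypothetical event times in minutes; the real dataset has 751 of them.
counts = events_per_minute([0.2, 0.9, 1.5, 10079.9], horizon_minutes=10080)
print(max(counts), sum(counts))

# Mean events per minute for the dataset described in the text.
print(751 / 10080)  # about 0.0745
```

The histogram in Fig. 3 corresponds to the distribution of the values in such a `counts` list.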
Table 1 summarizes the parameter estimates for the first candidate
model.
The fit of the self-exciting point process model to the data is plotted
in Fig. 4.
The parameter estimate for β denotes that immediately after an
event occurs, the conditional intensity is amplified by about 3 events
per minute. The parameter estimate for α indicates an event related to
the brand post tweet is talked about for up to 12 min after posting.
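To give a sense of how a simulated path such as the one in Fig. 4 can be produced, the sketch below simulates Model 1 with Ogata's thinning algorithm, a standard technique for self-exciting processes. It is an assumption-laden illustration and not necessarily what the R packages used in this study do internally; the function name and the seed are hypothetical, and the parameter values are the Table 1 estimates.

```python
import math
import random


def simulate_hawkes_exp(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) for the intensity of Eq. (4)."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the sum in Eq. (4)
    while True:
        # Between events the intensity only decays, so its current
        # value is a valid upper bound for proposing the next candidate.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        if t + wait >= horizon:
            break
        t += wait
        excitation *= math.exp(-alpha * wait)  # decay up to the candidate time
        if rng.random() * upper <= mu + excitation:
            events.append(t)      # candidate accepted as an event
            excitation += beta    # each event adds a jump of size beta
    return events


# Table 1 estimates over one week (10,080 min).
simulated = simulate_hawkes_exp(mu=0.05673, alpha=12.14027, beta=2.91944,
                                horizon=10080.0, seed=1)
print(len(simulated))
```

The simulated event times can then be binned per minute and compared with the observed counts.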
Now let us look at the ETAS model, which takes into account the
occurrence time and the number of followers for every single triggering
event. Fig. 5 provides a snapshot of the number of followers for those
users who appear to have been influenced by the brand post tweet,
either spontaneously or in response to certain triggers.
Table 2 summarizes the parameter estimates for the ETAS model.
Simulated data with the corresponding ETAS point process model are
shown in Fig. 6.
Our hypothesis is that the greater the number of followers per event,
the greater the influence. Therefore, incorporating the number of
followers into our predictive model as another dimension presumably
provides better results. Fig. 6 reveals that the ETAS model is much
more able to capture jumps and leaps of the process compared to our
dataset.
Statistical tests such as the K–S goodness-of-fit test and the AIC
allow us to test whether the number of followers impacts the
model. Table 3 summarizes the results of a two-sample K–S test
showing how well both models perform relative to the original
data. It contains the p-values and the values of the K–S test statistic
(D) corresponding to each model.
These results support our hypothesis that incorporating the number
of followers into the predictive models provides a better simulation for
understanding such phenomena.
Since the ETAS model has more parameters than the
self-exciting Hawkes process, AIC values are used to analyze the parsimony,
complexity and accuracy of the models. The homogeneous Poisson
model is also often used as a reference model for comparison of
competing point process models. Table 4 summarizes the AIC values for the
candidate models.
The AIC values show that the ETAS model has the
minimum AIC value. Therefore, the ETAS model provides a better fit
than the homogeneous Poisson model, the self-exciting Hawkes process,
or the benchmark ARIMA time series model.
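The comparison can be reproduced from the values in Table 4 with a few lines. This is only a worked illustration of ranking models by AIC; the dictionary keys are shorthand labels introduced here.

```python
# AIC values as reported in Table 4.
aic_values = {
    "ARIMA (3, 1, 3)": 9787.670,
    "Homogeneous Poisson": 5401.592,
    "Hawkes (Model #1)": 4473.017,
    "ETAS (Model #2)": 4012.011,
}

# The preferred model is the one with the smallest AIC.
best = min(aic_values, key=aic_values.get)

# Difference of each model's AIC from the best one.
delta = {name: round(value - aic_values[best], 3)
         for name, value in aic_values.items()}

for name in sorted(delta, key=delta.get):
    print(f"{name:22s} AIC = {aic_values[name]:9.3f}  delta = {delta[name]:9.3f}")
```

The differences from the best model make the size of the gap between the ETAS model and each alternative explicit.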
We next estimate the self-exciting Hawkes process model, the ETAS
model, the benchmark Poisson process model and the benchmark
ARIMA model, and compare their goodness-of-fit by computing their
average AIC values across all datasets.
According to Table 5, the ETAS model has the lowest average AIC
value. The proposed ETAS model outperforms the three benchmarks,
which indicates that it can capture the influence network better than
the other models. The benchmark homogeneous Poisson process and the
benchmark ARIMA time series model fare much worse than
the ETAS and the self-exciting Hawkes process. The Poisson process
model fails to capture any exciting effects among user activities when
making the prediction. Also, the ARIMA time series model appears to fail
to capture the dependency between the current event and past
events on the timeline. Recall that, in the online content popularity
context, where the occurrence of an event increases the likelihood of
subsequent events, whether slightly or greatly, it is imperative to
account for exciting effects among users' activities.
Our result implies that the impact of the number of followers on
brand post popularity is an important issue in OSNs. It is necessary to
consider the event occurrence time and the number of followers as
two major factors in modeling online social dynamics.
We found that the ETAS model predicts the popularity of brand posts
much more accurately. It allows us to consider the role of
influential users in amplifying brand post popularity and, secondarily,
in proposing the brand to their networks of friends and followers. It
implies that influential users with a high number of followers can
have a significant influence in spreading the content of the brand post
to others.
5. Discussion and limitations
We have adapted a powerful approach for modeling the content
popularity in OSNs. In contrast to the previous studies that focused on
a one-dimensional function of time, the model recommended in this
paper allows us to characterize and quantify the content popularity as
a joint probability function of time and the number of followers. The
self-exciting Hawkes process and ETAS models have been calibrated to
Table 1
Specification of the self-exciting Hawkes process model (1) used for simulation.
Parameter   μ         α          β
Value       0.05673   12.14027   2.91944
Fig. 5. Number of followers over time. [Plot; x-axis: Time, y-axis: Number of followers.]
Table 2
Specification of the self-exciting point process model (2) used for simulation.
Parameter   μ        β            α           p          c            m_0
Value       0.5886   0.01376837   2.1254544   1.157623   0.01343711   0.3
simulate popularity growth patterns of brand post contents on Twitter
and, as expected, the ETAS model outperforms the other models in
capturing bursts of activity over time.
This model can enable brand marketing managers to observe how
often their fans respond to their posts within OSNs, and gauge the
response to different types of content such as news, contests,
applications, video, pictures, product information, brand history,
testimonials, etc. They will also have the ability to see how these brand
posts move through the Internet. These predictive models can help companies
decide how often and when a new brand post should be posted, and
how many times the same piece of content can be shared in order to
engage more fans and followers. Certainly there is no magic number
for the ideal number of posts within OSNs; it is important for brands
to post enough content while refraining from posting too much at the
same time. The mathematical configuration of the ETAS model also
confirms that if the time difference between two consecutive events is
large enough, the brand post will most likely become obsolete, which
suggests that it is time to post new content to keep the connection
with fans open.
As another managerial implication of this study, the mathematical
formulation of the ETAS model reveals that the greater the number of
followers per event, the greater the influence. This means that a high
number of followers increases the activity around a tweet and how
often it is retweeted. It highlights the role of influential users, who
significantly affect the engagement of a brand post even if they become
involved later. Thus, if companies identify and increase the number of
influential users within their online social networks, they should
experience an increase in brand recommendations and awareness. Engaging
more influential users during the early life of the brand post
could cause viral effects, which are likely to influence potential
consumers for a longer period. Many approaches have been proposed
to find influential users within OSNs. The simplest approach is to
count the number of followers, but there are other efficient techniques
based on mining link structure along with the temporal order of
information adoption [42].
Also, since fans' reactions and response time to different types of
the brand post content are dissimilar, it is important for brands to look
carefully at the performance of their various brand post contents
and see which of them during their lifecycle have similar looking
stationary/non-stationary background rates. If they do not follow the
same growth pattern, each category needs an individual point process
to represent it.
Our work proposes a mechanism for capturing the evolution of the
popularity of online content posted by brands on Twitter. It facilitates
the early prediction of a tweet's behavior on Twitter and the simulation
of the rhythm and timing of the most engaging postings. Through the
simulation and early prediction of a brand's tweet, brands gain a
better view of how to time promotions to foster relationships with customers.
Our research can be extended to determine a peak release time for
products of consumer interest on the market by analyzing
aggregate/collective brand posts from OSNs. If brand posts are not
propagating further on OSNs, it could indicate that the brand is losing its
fans' awareness and popularity, so corrective actions should be
taken.
Several limitations of our study deserve mention. First, we assume
that all users follow the same response time distribution for their own
activities. However, an individual's activity bursts form a sequence of
discrete events, and a single distribution is unlikely to hold across users
when fitting exponential or Pareto distributions to the long-term
dependency. Another limitation is that the various types of events are
aggregated. Multivariate self- and mutually-exciting point process models
should be developed to deal with different streams of information and measure
cross interactions and mutual information between one sequence of
events and another.
Furthermore, even though we chose a small time increment, i.e. Δt=
1 min for the bin width in order to control the amount of data through
parameter t, we cannot determine if events occurring in the same minute
are correlated with one another. This means that the events recorded on
the same minute are assumed to be statistically independent.
While we assign the same importance to all of the fans' response times,
we could track down a brand's most engaging minutes, hours and days of the
week to determine the effective time windows that should be taken
into account in order to provide a better prediction.
In summary, our analysis indicates that a stationary Poisson process
for the background rate of spontaneous events is a rather unlikely
assumption in many social systems. The ETAS model and self-exciting
point process can be considered a more reliable underlying process.
6. Conclusion and directions for future research
This paper adopts a stochastic point process framework for analysis
of the dynamic microstructure of online social networks (OSNs). In
particular, we investigate the possibility of using crowdsourcing on OSNs
Fig. 6. Simulated conditional intensity function for model #2.
Table 3
The K–S goodness-of-fit test output.
Model                                      D        p-value
Self-exciting Hawkes process (Model #1)    0.3993   0.03135
ETAS model (Model #2)                      0.1223   0.02216
Table 4
AIC test results.
Model                                      AIC
Time series model (ARIMA (3, 1, 3))        9787.670
Homogeneous Poisson model                  5401.592
Self-exciting Hawkes process (Model #1)    4473.017
ETAS model (Model #2)                      4012.011
Table 5
Models' comparative average AIC values.
Model                                      Average AIC
Time series model (ARIMA (p,q,r))          13,398.661
Homogeneous Poisson model                  9047.397
Self-exciting Hawkes process (Model #1)    7143.110
ETAS model (Model #2)                      6415.187
as a marketing mechanism to enhance brand awareness and popularity.
Such crowdsourcing activities help brands spur innovation and drive
brand awareness across OSN platforms. We describe such dynamics
in terms of the stochastic occurrence times and the number of followers.
One-dimensional and two-dimensional self-exciting point process
models are calibrated to simulate popularity growth patterns of brand
post contents on Twitter. Our findings indicate that point process models
are able to describe the cascade of influencers on online social networks.
Our results suggest that incorporating the number of followers into
predictive models as another dimension of input provides a better
understanding of content popularity. Our future work focuses on
applying a full package of multivariate point processes to different
streams of events within OSNs.
References
[1] F. Abel, E. Diaz-Aviles, et al., Analyzing the blogosphere for predicting the success of music and movie products, International Conference on Advances in Social Networks Analysis and Mining (ASONAM), IEEE, 2010, pp. 276–280.
[2] L. Adamopoulos, Cluster models for earthquakes: regional comparisons, Mathematical Geology 8 (4) (1976) 463–475.
[3] Y.-Y. Ahn, S. Han, et al., Analysis of topological characteristics of huge online social networking services, Proceedings of the 16th international conference on World Wide Web, ACM, Banff, Alberta, Canada, 2007, pp. 835–844.
[4] H. Akaike, Information theory and an extension of the maximum likelihood principle, 2nd Inter. Symp. on Information Theory, 1, 1992, pp. 610–624.
[5] S. Alexey, B.S. Martin, et al., Reconstruction of missing data in social networks based on temporal patterns of interactions, Inverse Problems 27 (11) (2011) 115013.
[6] E. Bacry, S. Delattre, et al., Modelling microstructure noise with mutually exciting point processes, Quantitative Finance (2012) 1–13.
[7] Y. Bae, H. Lee, A sentiment analysis of audiences on twitter: who is the positive or negative audience of popular twitterers? Proceedings of the 5th international conference on Convergence and hybrid information technology, Springer-Verlag, Daejeon, Korea, 2011, pp. 732–739.
[8] C.H. Baird, G. Parasnis, From social media to social customer relationship management, Strategy & Leadership 39 (5) (2011) 30–37.
[9] N.G. Barnes, Exploring the link between customer care and brand reputation in the age of social media, in: S. f. NC Research (Ed.), Society for New Communication Research, 2008.
[10] L. Bauwens, N. Hautsch, Modelling financial high frequency data using point processes, in: T. Mikosch, J.-P. Kreiß, R.A. Davis, T.G. Andersen (Eds.), Handbook of Financial Time Series, Springer, Berlin Heidelberg, 2009, pp. 953–979.
[11] P.R. Berthon, When customers get clever: managerial approaches to dealing with creative consumers, Strategic Direction 23 (8) (2007).
[12] P.R. Berthon, L.F. Pitt, et al., Marketing meets Web 2.0, social media, and creative consumers: implications for international marketing strategy, Business Horizons 55 (3) (2012) 261–271.
[13] C.G. Bowsher, Modelling security market events in continuous time: intensity based, multivariate point process models, Journal of Econometrics 141 (2) (2007) 876–912.
[14] Brand Directory, BrandFinance Banking 500 2013 [online], [Accessed 07/01/2013] Available from http://www.brandirectory.com 2013.
[15] L. Capozzi, L.B. Zipfel, The conversation age: the opportunity for public relations, Corporate Communications: An International Journal 17 (3) (2012) 336–349.
[16] M. Cha, H. Kwak, et al., Analyzing the video popularity characteristics of large-scale user generated content systems, IEEE/ACM Transactions on Networking 17 (5) (2009) 1357–1370.
[17] G. Chatzopoulou, S. Cheng, et al., A first step towards understanding popularity in youtube, INFOCOM IEEE Conference on Computer Communications Workshops, 2010.
[18] V. Chavez-Demoulin, A.C. Davison, et al., Estimating value-at-risk: a point process approach, Quantitative Finance 5 (2) (2005) 227–234.
[19] E. Chornoboy, L. Schramm, et al., Maximum likelihood identification of neural point process systems, Biological Cybernetics 59 (4) (1988) 265–275.
[20] A.Y.K. Chua, S. Banerjee, Customer knowledge management via social media: the case of Starbucks, Journal of Knowledge Management 17 (2) (2013) 237–249.
[21] Constant Contact, Report on consumer behavior highlights the need for small businesses to be active on Facebook, Constant Contact Inc., 2011.
[22] R. Crane, D. Sornette, Robust dynamic classes revealed by measuring the response function of a social system, Proceedings of the National Academy of Sciences 105 (41) (2008) 15649–15653.
[23] D.J. Daley, D. Vere-Jones, Conditional intensities and likelihoods, An Introduction to the Theory of Point Processes, Springer, New York, 2003, pp. 211–287.
[24] A. Dassios, H. Zhao, Ruin by dynamic contagion claims, Insurance: Mathematics and Economics 51 (1) (2012) 93–106.
[25] L. de Vries, S. Gensler, et al., Popularity of brand posts on brand fan pages: an investigation of the effects of social media marketing, Journal of Interactive Marketing 26 (2) (2012) 83–91.
[26] R.F. Engle, A. Lunde, Trades and quotes: a bivariate point process, Journal of Financial Econometrics 1 (2) (2003) 159–188.
[27] L. Erik, M. George, et al., Self-exciting point process models of civilian deaths in Iraq, 2010.
[28] F. Benevenuto, T. Rodrigues, V. Almeida, J. Almeida, K. Ross, Video interactions in online video social networks, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 5 (4) (2009) 30.
[29] F. Figueiredo, Fabr, et al., The tube over time: characterizing popularity growth of youtube videos, Proceedings of the fourth ACM international conference on Web search and data mining, ACM, Hong Kong, China, 2011, pp. 745–754.
[30] I. Guy, M. Jacovi, et al., Same places, same things, same people?: mining user similarity on social media, Proceedings of the 2010 ACM conference on Computer supported cooperative work, ACM, Savannah, Georgia, USA, 2010, pp. 41–50.
[31] D. Harte, PtProcess: an R package for modelling marked point processes indexed by time, Journal of Statistical Software 35 (8) (2010) 1–32.
[32] A.G. Hawkes, Spectra of some self-exciting and mutually exciting point processes, Biometrika 58 (1) (1971) 83–90.
[33] A.G. Hawkes, D. Oakes, A cluster process representation of a self-exciting process, Journal of Applied Probability 11 (3) (1974) 493–503.
[34] J. Howe, The rise of crowdsourcing, Wired Magazine 14 (6) (2006) 1–4.
[35] J. Howison, J.F. Olson, A. Kittur, K.M. Carley, Motivation through visibility in open contribution systems, http://repository.cmu.edu/isr/493/ 2011 (accessed May 19, 2014).
[36] B.A. Huberman, Crowdsourcing and attention, Computer 41 (11) (2008) 103–105.
[37] L.B. Jabeur, L. Tamine, et al., Uprising microblogs: a bayesian network retrieval model for tweet search, Proceedings of the 27th Annual ACM Symposium on Applied Computing, ACM, Trento, Italy, 2012, pp. 943–948.
[38] B.J. Jansen, M. Zhang, et al., Twitter power: tweets as electronic word of mouth, Journal of the American Society for Information Science and Technology 60 (11) (2009) 2169–2188.
[39] L. Jong Gun, M. Sue, et al., An approach to model and predict the popularity of online contents with explanatory factors, Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on, 2010.
[40] S. Kong, L. Feng, et al., Predicting lifespans of popular tweets in microblog, Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, ACM, Portland, Oregon, USA, 2012, pp. 1129–1130.
[41] M. Lawrence, E.C. Michael, Hawkes process as a model of social interactions: a view on video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4) (2010) 045101.
[42] C. Lee, H. Kwak, et al., Finding influentials based on the temporal order of information adoption in twitter, Proceedings of the 19th International Conference on World Wide Web, ACM, Raleigh, North Carolina, USA, 2010, pp. 1137–1138.
[43] J.G. Lee, S. Moon, K. Salamatian, Modeling and predicting the popularity of online contents with Cox proportional hazard regression model, Neurocomputing 76 (1) (2012) 134–145.
[44] K. Lerman, T. Hogg, Using a model of social dynamics to predict popularity of news, Proceedings of the 19th International Conference on World Wide Web, ACM, Raleigh, North Carolina, USA, 2010, pp. 621–630.
[45] J. Leskovec, K.J. Lang, et al., Statistical properties of community structure in large social and information networks, Proceedings of the 17th international conference on World Wide Web, ACM, Beijing, China, 2008, pp. 695–704.
[46] W.B. Lober, J.L. Flowers, Consumer empowerment in health care amid the internet and social media, Seminars in Oncology Nursing 27 (3) (2011) 169–182.
[47] G.O. Mohler, M.B. Short, et al., Self-exciting point process modeling of crime, Journal of the American Statistical Association 106 (493) (2011) 100–108.
[48] A. Noff, Learning from Starbucks one tweet at a time, available at: http://www.blonde20.com/blog/2009/11/19/learning-from-starbucks-one-tweet-at-a-time/ 2009 (accessed August 25, 2012).
[49] A. Noff, The Starbucks formula for social media succe ss, URL:http://thenextweb.
com/2010/01/11/starbucks-formula-social-media-success/ 2011.
[50] Y. Ogata, Statistical models for earthquake occurrences and residual analysis for
point processes, Journal of the American Statistical Association 83 (401) (1988)
927.
[51] Y. Ogata, D. Vere-Jones, Inference for earthquake models: a self-correcting model,
Stochastic Processes and their Applications 17 (2) (1984) 337347.
[52] T. Ozaki, Maximum likelihood estimation of Hawkes' self-exciting point processes,
Annals of the Institute of Statistical Mathematics 31 (1) (1979) 145155.
[53] R.D. Peng, Multi-dimensional point process models in r, 2002.
[54] H. Rui, A. Whinston, Designing a social-broadcasting-based business intelligence
system, ACM Transactions on Management Information Systems 2 (4) (2012) 119.
[55] D. Rybski, S.V. Buldyrev, et al., Communication activity in a social network: relation
between long-term correlations and inter-e vent clustering, Scienti c Reports 2
(2012).
[56] SAS Harvard Business Review Analytic Services, The New Conversation:
TakingSocial Media from Talk to Action, Harvard Business School Publishing, 2010.
[57] C.M. Sashi, Customer engagement, buyerseller relationships, and social media,
Management Decision 50 (2) (2012) 253272.
[58] G. Szabo, B.A. Huberman, Predicting the popularity of online content, Communica-
tions of the ACM 53 (8) (2010) 8088.
[59] A. Veen, F.P. Schoenberg, Estimation of space–time branching process models in
seismology using an EM-type algorithm, Journal of the American Statistical
Association 103 (482) (2008) 614–624.
[60] T. Wang, M. Bebbington, et al., Markov-modulated Hawkes process with stepwise
decay, Annals of the Institute of Statistical Mathematics 64 (3) (2012) 521–544.
[61] W. Willinger, R. Rejaie, et al., Research on online social networks: time to face
the real challenges, SIGMETRICS Performance Evaluation Review 37 (3)
(2010) 49–54.
[62] E. Sadikov, A.G. Parameswaran, P. Venetis, et al., Blogs as Predictors of Movie
Success, International AAAI Conference on Weblogs and Social Media (ICWSM) (2009).
[63] M. Egesdal, C. Fathauer, K. Louie, J. Neuman, G. Mohler, E. Lewis, Statistical and
stochastic modeling of gang rivalries in Los Angeles, SIAM Undergraduate Research
Online 3 (2010) 72–94.
[64] L. Mitchell, M.E. Cates, Hawkes process as a model of social interactions: a view on
video dynamics, Journal of Physics A: Mathematical and Theoretical 43 (4) (2010)
045101.
[65] A. Jalilian, ETAS: Modeling earthquake data using Epidemic Type Aftershock
Sequence model, 2012.
[66] R.J. Hyndman, Y. Khandakar, Automatic time series for forecasting: the forecast
package for R, 2007.
Amir Hassan Zadeh is a PhD student in the Management
Science and Information Systems Department within the
Spears School of Business at Oklahoma State University. He
received his master's in Industrial and Systems Engineering
from Amirkabir University of Technology, and his bachelor's
from the Department of Mathematics and Computer Science,
Shahed University, Tehran, Iran. He has been published in
the Journal of Production Planning and Control, Annals of
Information Systems, Advances in Intelligent and Soft Computing,
African Journal of Business Management, and also conference
proceedings of DSI, INFORMS and IEEE. His current research
interests include big data and analytics, social networks and
recommender systems. His research also involves decision
support systems, data mining and knowledge discovery,
and system analysis and design. Other areas of interest include supply chain management,
product design, and healthcare.
Ramesh Sharda is the interim Vice Dean of the Watson
Graduate School of Management, Watson/ConocoPhillips
Chair and a Regents Professor of Management Science and
Information Systems in the Spears School of Business at
Oklahoma State University. He also serves as the Executive
Director of the PhD in Business for Executives Program. He
has coauthored two textbooks (Business Intelligence and
Analytics: Systems for Decision Support, 10th edition, Prentice
Hall and Business Intelligence: A Managerial Perspective on
Analytics, 3rd edition, Prentice Hall). His research has been
published in major journals in management science and
information systems including Management Science, Operations
Research, Information Systems Research, Decision Support
Systems, Interfaces, INFORMS Journal on Computing, and many
others. He is a member of the editorial boards of journals such as Decision Support
Systems and Information Systems Frontiers. He is currently serving as the Executive Director
of Teradata University Network and received the 2013 INFORMS HG Computing Society
Lifetime Service Award.
A. Hassan Zadeh, R. Sharda / Decision Support Systems 65 (2014) 59–68
... While some prior studies have employed distinct models to explain likes, comments and shares individually (Antoniadis et al., 2019; Banerjee and Chua, 2019; Schultz, 2017), others favored a weighted combination of the number of likes, shares and comments as the dependent variable (Karpinska-Krakowiak and Modlinski, 2020). Facebook has been the most intensively studied platform, followed by Instagram (Geurin and Burch, 2017; Mazloom et al., 2016; Yu and Sun, 2019), with X (Twitter) receiving less attention (Zadeh and Sharda, 2014). ...
Purpose: This research aims to understand the dynamics that drive consumer engagement with multinational brands' social media posts on platform X, formerly known as Twitter. Taking the emotional tone of posts into account, the effects of vivid, interactive, informative, entertaining and practical features of posts on consumer interactions are evaluated across English- and Turkish-speaking markets. Methodology: Inspired by the conceptual framework proposed in previous literature, features were extracted computationally using natural language processing from platform X posts of 33 Fortune 500 brands from various industries from June 2016 to June 2021. Following evaluation of regression models on alternative distributions of the dependent variable, which is the total number of likes, shares and comments, random subspace regression using bootstrap resampling was applied to calculate an importance score and evaluate the effect of features. Findings: Consumers in English- and Turkish-speaking markets perceive and engage with content differently. While informative and entertaining posts resonate more with English speakers, emotions play a broader role for Turkish speakers. The English-speaking audience prefers happy and vivid daytime messages with questions, while the Turkish-speaking audience is drawn to angry messages and leans toward nighttime posts. Originality: This research is a pioneer in evaluating the factors that influence brands' platform X post engagement across markets of different cultural orientation. Beyond assessing the distinctions in brand post elements, the role of emotional content in brand messages was also analyzed across English- and Turkish-speaking markets.
... Our research methodology consists of four major phases: data acquisition, data processing, data modeling, and data analysis [51]. The analytical methodology used in this study is depicted in Fig. 1, illustrating how we identified the characteristics of data breaches. ...
... RELATED WORK The role of social media in marketing is to serve as a vehicle for making promotional activities more efficient [9]. Other research using data from the social media platform Twitter can provide information on the location of a disaster more quickly. ...
The growth in the number of social media users over time offers new potential for crowdsourced data acquisition. Data acquisition no longer requires much cost or time, because crowdsourced data can be mined easily and even at no cost. This study addresses the question of whether crowdsourced data from social media can serve as alternative data in geoinformatics studies. Data were acquired from users' posts on social media. The posts include a location point and caption text. The data obtained were then processed to determine landscape points and tendencies in language use. Language use was analyzed with the RQDA method, with the result that 5.37% of posts discuss landscapes. Meanwhile, the social media location points, compared with DEMNAS data, had an accuracy score of 437.8, validated with the RMSE method, which is not close to 1.0. It is suggested that social media data are still far from being an alternative source of landscape data.
... Essentially, the practice highlighted here can also be applied on other point process models such as those of [17], [18], and [19] to get their corresponding order (−1) and harmonic medians based on their simulated predictive distributions. In summary, prediction functionals which are (theoretically) optimal relative to their respective evaluation metrics are exhibited in Table I. ...
The prediction of future retweet counts for tweets shared on Twitter has been a topic of immense interest recently. Numerous models have been proposed for such prediction, with their accuracy being assessed using certain choices of evaluation metrics. Admittedly, the majority of predictive models involved have overlooked the problem on the use of theoretically optimal functionals as point predictions and resort to employing options which are more accessible like the predictive mean. This motivates our discussion wherein the practicality of using theoretically consistent functionals with respect to the evaluation metrics considered is put forth. We discuss how the median of order (-1) and harmonic median are optimal in theory relative to the mean and median absolute percentage errors respectively, followed by highlighting in contrast how predictive models extant in the literature may suggest otherwise. Specifically, using a Poisson model supported by a large corpus of Twitter data, our numerical experiments indicate that predictions based on different functionals derived from the predictive distribution do not vary materially across the different metrics used, although predictions stemming from the predictive mean are slightly yet consistently more accurate than those based on the other functionals. We further outline how consistent functionals can be obtained accordingly under the settings of more complex predictive models.
... Small businesses can change how they interact with customers, market their products and services, and communicate through social media to improve customer relationships. Zadeh & Sharda (2014) state that social media is used to disseminate information so that friends or followers help launch a brand or product. Innovation is defined by Shafigullina & Palyakin (2016) as "a new product, service, idea, or perception from a person." ...
Understanding content-based promotion and marketing on various social media is of interest to academics, business people, policymakers, and other business communities, and we therefore discuss it here. Our data were obtained electronically by searching Google Scholar and Google Search for documents and scientific evidence relevant to answering this study's question. The procedures we carried out were as follows: reviewing the data with a data coding system, evaluating the data, and drawing conclusions through interpretation, so that the data we present meet high validity and reality principles. After a series of studies and discussions, we find that content-based promotion and marketing on social media in the technological era merits discussion, because the various content published on social media for business promotion has helped business people increase sales and sustain their businesses. Thus, these findings will be helpful in studies and discussions in the academic environment.
Instagram is one of the most popular and widely used social network platforms. It is used as a digital tool to connect with other users and also to share information and influence them for marketing and advertising purposes. The influence of popular users is broadly determined by their posts' engagement rate in terms of likes, comments, and shares, and by the number of followers as well. An objective and comprehensive measure of popularity is necessary to understand the factors that will help make an influencer marketing campaign more successful and beneficial for business activities. This research work attempts to take various features of an influencer account and Instagram posts dataset and develop a novel model that accurately quantifies and determines the influence of a user on Instagram. The research is based on datasets of top regional Instagram influencers and their posts based on categories signified through hashtags and captions. Our research attempts to develop a model using principal component analysis to quantify influence and to use it to rank influencers. In our experiment, the proposed model gave the Instagram username “iqbaal.e” an influence score of 874,712.9526, username “1nctdream” a score of 753,830.5847 and username “weareone.exo” a score of 668,054.4360. The proposed model's ranks were compared with other ranks for Instagram users based on other measures such as follower rank. User names “huyitian”, “bintangemon” and “bimopd” are top social media influencers based on the proposed model for better business advertising and digital marketing outcomes with collected data and experiment context. The proposed approach offers stakeholders a way to quantify the impact of influencers in social media and demonstrates an innovative approach.
Understanding information cascades in social networks is a critical research area with implications in various domains, such as viral marketing, opinion formation, and misinformation propagation. In information cascade prediction problem, one of the most important factors is the cascade structure of the social network, which can be described as a cascade graph, global graph, or an r-reachable graph. However, the majority of existing studies primarily focus on a singular type of relationship within the social network, relying on the homogeneous graph neural network. We introduce two novel approaches for heterogeneous social network cascading and analyze whether heterogeneous social networks have higher predictive accuracy than homogeneous networks, taking into account the potential differential effects of temporal sequences on the models. Further, our work highlights that the selection of edge types plays an important role in the accuracy of predicting information cascades within social networks.
Online social platforms like Twitter, Weibo, and Facebook have developed rapidly in recent years. These platforms offer people more opportunities to exchange information. Understanding and predicting information cascade on social media platforms is a fundamental problem and one of the primary challenges is to predict the popularity of information. However, most existing methods fail to distinguish the cascade structural feature and global structural feature, resulting in unsatisfactory prediction performance. In this paper, we propose a novel framework named VGCas to distinguish the features of cascade structure and global structure and combine them with temporal features of the cascade to predict popularity. To extract the cascade structural feature and global structural feature simultaneously, we utilize a graph attention based variational autoencoder. Then, we use a gated recurrent unit to extract the temporal feature from the time series. Finally, we feed the combination of the two outputs into a multilayer perceptron to predict popularity. We verify the effectiveness of VGCas by applying it to predict retweet cascades on Twitter and Sina Weibo. Experimental results demonstrate a substantial improvement in predictive accuracy over existing approaches.
Communication in the favorite social networks in brands and companies. Currently, the basic principles of marketing are applied considering the influence of the Internet and, especially, of social networks; this constant technological development makes companies more creative when targeting specific audiences and obtaining profits. Through the use of new technologies, companies attract new customers, serve existing customers and earn money, as well as promote brand image, provide useful services and prepare well-defined advertising campaigns. Therefore, to achieve the success of the brand, it is essential that organizations acquire knowledge about the different tools and strategies offered by digital marketing, including the use of social networks, in order to recognize which tool or strategy attracts the public and internet traffic to a greater degree, according to the definition of the company's objectives. The objective of this bibliographic review is to identify the favorite social networks of brands and companies today, considering that in recent years the expansion of the Internet and the growing popularity of online social networks have had a great impact on how brands and companies develop marketing strategies to position themselves in the market.
We consider the inherent timeline structure of the appearance of content in online social networks (OSNs) while studying content propagation. We model the propagation of a post/content of interest by an appropriate multi-type branching process. The branching process allows one to predict the emergence of global macro properties (e.g., the spread of a post in the network) from the laws and parameters that determine local interactions. The local interactions largely depend upon the timeline structure (an inverse stack capable of holding many posts, with one dedicated to each user), the number of friends (i.e., connections) of users, etc. We explore the use of multi-type branching processes to analyze the viral properties of the post, e.g., to derive the expected number of shares, the probability of virality of the content, etc. In OSNs, new posts push down the existing contents in timelines, which can greatly influence content propagation; our analysis considers this influence. We find that one is led to incorrect conclusions when the timeline (TL) structure is ignored: (a) for instance, even less attractive posts are shown to go viral; (b) ignoring TL structure also indicates erroneous growth rates. More importantly, one cannot capture some interesting paradigm shifts/phase transitions; for example, virality chances are not monotone in the network activity parameter, as shown by the analysis including TL influence. In the last part, we integrate online auctions into our viral marketing model. We study the optimization problem considering real-time bidding. We again compare the study with and without the TL structure for varying activity levels of the network. We find that the analysis without TL structure fails to capture the relevant phase transitions, thereby making the study incomplete.
Our goal in this article is to characterize temporal patterns of violent civilian deaths in Iraq. These patterns are expected to evolve on time-scales ranging from years to minutes as a result of changes in the security environment on equally varied time-scales. To assess the importance of multiple time-scales in evolving security threats, we develop a self-exciting point process model similar to that used in earthquake analysis. Here the rate of violent events is partitioned into a background rate and a foreground self-exciting component. Background rates are assumed to change on relatively long time-scales. Foreground self-excitation, in which events trigger an increase in the rate of violence, is assumed to be short-lived. We explore the model using data from Iraq Body Count on civilian deaths between 2003 and 2007. Our results indicate that self-excitation makes up as much as 37–50 per cent of all violent events and that self-excitation lasts at most between two and six weeks, depending upon the district in question. Appropriate security responses may benefit from taking these different time-scales of violence into consideration.
Purpose The purpose of this paper is to analyze the extent to which the use of social media can support customer knowledge management (CKM) in organizations relying on a traditional bricks‐and‐mortar business model. Design/methodology/approach The paper uses a combination of qualitative case study and netnography on Starbucks, an international coffee house chain. Data retrieved from varied sources such as newspapers, newswires, magazines, scholarly publications, books, and social media services were textually analyzed. Findings Three major findings could be culled from the paper. First, Starbucks deploys a wide range of social media tools for CKM that serve as effective branding and marketing instruments for the organization. Second, Starbucks redefines the roles of its customers through the use of social media by transforming them from passive recipients of beverages to active contributors of innovation. Third, Starbucks uses effective strategies to alleviate customers' reluctance for voluntary knowledge sharing, thereby promoting engagement in social media. Research limitations/implications The scope of the paper is limited by the window of the data collection period. Hence, the findings should be interpreted in the light of this constraint. Practical implications The lessons gleaned from the case study suggest that social media is not a tool exclusive to online businesses. It can be a potential game‐changer in supporting CKM efforts even for traditional businesses. Originality/value This paper represents one of the earliest works that analyzes the use of social media for CKM in an organization that relies on a traditional bricks‐and‐mortar business model.
In this paper, we give an overview of the state-of-the-art in the econometric literature on the modeling of so-called financial point processes. The latter are associated with the random arrival of specific financial trading events, such as transactions, quote updates, limit orders or price changes observable based on financial high-frequency data. After discussing fundamental statistical concepts of point process theory, we review duration-based and intensity-based models of financial point processes. Whereas duration-based approaches are mostly preferable for univariate time series, intensity-based models provide powerful frameworks to model multivariate point processes in continuous time. We illustrate the most important properties of the individual models and discuss major empirical applications.
The ideas to be discussed in this chapter have been the subject of intensive development during the last two decades, as much by engineers as by mathematicians and statisticians. The underlying theme is the need for a theory of estimation, prediction, and control for point processes. In the late 1960s and early 1970s engineers, in particular, began to exploit a remarkable analogy between point processes and diffusion processes, with the Poisson process playing a role analogous to that of Brownian motion. Early papers by Yashin (1970) in the Soviet Union and Snyder (1972) and Rubin (1972) in the United States explored the analogy between filtering and detection problems for point processes and the Kalman filtering techniques for signal-from-noise problems in the Gaussian context; the analogy is closest for doubly stochastic (i.e., Cox) processes. The paper by Gaver (1963) may be regarded as some kind of precursor of these developments. These papers were followed by more systematic studies in the theses by Brémaud (1972) and van Schuppen (1973), and papers by Boel, Varaiya, and Wong (1975), Kailath and Segall (1975), and Davis (1976), to mention only a few. On the probabilistic side, the possibility of a powerful link with martingale theory was noted as early as 1964 by Watanabe (1964) who gave a martingale characterization of the Poisson process; the martingale theory was developed further in Kunita and Watanabe (1967). A synthesis of these approaches was presented by Kabanov, Liptser, and Shiryayev (1975) and incorporated in Volume II of Liptser and Shiryayev (1978). Further important reviews are found in Brémaud and Jacod (1977), Brémaud (1981), Shiryayev (1981), and Jacobsen (1982).
In recent years methods of data analysis for point processes have received some attention, for example, by Cox & Lewis (1966) and Lewis (1964). In particular Bartlett (1963 a,b) has introduced methods of analysis based on the point spectrum. Theoretical models are relatively sparse. In this paper the theoretical properties of a class of processes with particular reference to the point spectrum or corresponding covariance density functions are discussed. A particular result is a self-exciting process with the same second-order properties as a certain doubly stochastic process. These are not distinguishable by methods of data analysis based on these properties.