Strong Regularities in Growth and Decline
of Popularity of Social Media Services
University of Bonn,
TU Dortmund University,
We analyze general trends and patterns in time series that
characterize the dynamics of collective attention to social
media services and Web-based businesses. Our study is
based on search frequency data available from Google Trends
and considers 175 diﬀerent services. For each service, we
collect data from 45 diﬀerent countries as well as global av-
erages. This way, we obtain more than 8,000 time series
which we analyze using diﬀusion models from the economic
sciences. We ﬁnd that these models accurately characterize
the empirical data and our analysis reveals that collective
attention to social media grows and subsides in a highly
regular and predictable manner. Regularities persist across
regions, cultures, and topics and thus hint at general mech-
anisms that govern the adoption of Web-based services. We
discuss several cases in detail to highlight interesting findings.
Our methods are of economic interest as they may
inform investment decisions and can help assess at what
stage of its life cycle a Web service currently is.
Categories and Subject Descriptors
G.3 [Probability and Statistics]: Time series analysis;
H.3.5 [Online Information Services]: Web-based services
Economics, Human Factors, Measurement
social media services, collective attention, trend prediction
The problem of understanding the dynamics of collective
human attention has been called a key scientiﬁc challenge for
the information age . In this paper, we address a spe-
ciﬁc aspect of this problem and mine search frequency data
for common trends and shared characteristics. Our focus is
on query logs which summarize the evolution of global and
regional interests in social media services and we explore
to what extent the general dynamics of collective attention
apparent from these data can be modeled mathematically.
Search frequency analysis is an emerging topic and a grow-
ing body of work shows that patterns found in aggregated
search data of large populations of Web users can provide
insights into collective concerns, interests, or habits. Results
on temporal dynamics of search engine queries are reported
from various fields and include data-driven models of the
spread of diseases , accounts of the propagation of news
items [4,5,14], characterizations of the formation of political
opinions , or predictions of tourism flows .
Figure 1: Examples of Google Trends time series
which summarize how worldwide searches for different
social media services evolve over time. Even
though individual curves differ considerably, an appropriately
parameterized diffusion model accounts
well for the apparent general trends of initial growth
and subsequent decline of interest. Results obtained
from more than 8,000 temporal signatures of collective
attention on the Web indicate that these findings
are universal and that interests of large crowds
of users follow these patterns regardless of regional,
cultural, or linguistic backgrounds.
Search frequencies are of particular interest in nowcast-
ing which aims at real time monitoring of economic trends
and developments . Aggregated search behaviors of mil-
lions of users yield reliable predictions for sales or general
economic indicators [13,33]. Temporal changes in search
volumes were found to correlate with changes in the behav-
ior of investors [9,16] and to allow for predicting abnormal
stock returns [18,26]. Accordingly, analysts in the social
sciences, public health, or economics are beginning to em-
brace query log analysis as an alternative to more traditional
The work reported here originates from a project on Web
intelligence where we ask for socio-economic motivations for
individuals to participate in collective endeavors on the Web.
Regarding services, products, and campaigns we investigate
approaches that would allow companies or marketeers to
recognize whether they need to adjust their strategies in or-
der to remain competitive in the modern Web environment.
In particular, we ask to what extent it is possible to pre-
dict the future success or adoption of services, products, or
marketing messages using collective Web intelligence?
Our paradigm is to mine Web data for possible indicators
of trends in collective attention. In this paper, we consider
time series obtained from Google Trends which summarize
search interests of millions of users worldwide and we focus
on temporal signatures that characterize evolving interests
in social media. Extending previously published work ,
our contributions are as follows:
1) We brieﬂy review recent results which underline that
Google Trends data provide meaningful and reliable proxies
for research on how opinions and interests of large crowds
and populations evolve over time.
2) We analyze search frequency data from 45 countries
related to 175 social media services and Web businesses.
Given this comprehensive empirical basis, we perform trend
analysis using economic diﬀusion models and ﬁnd them to
be in excellent agreement with the data. In particular, we
ﬁnd that collective attention to social media as evident from
search frequencies evolves according to notably regular pat-
terns. Although microscopic behaviors may be chaotic, gen-
eral trends apparent in these data typically show simple and
highly regular dynamics of growth and decline.
3) We present evidence that this phenomenon persists
across regions, cultures, and linguistic backgrounds and we
elaborate on several particular examples to highlight sev-
eral interesting ﬁndings. We investigate the potential of our
models for forecasting and present qualitative results which
indicate that they indeed allow for reasonable predictions of
future developments of collective attention.
Next, we discuss the empirical basis of our study. Section 3
reviews models and methods applied for analysis; results
are discussed in section 4. Section 5 contrasts our work
with the related literature and section 6 concludes this paper.
2. SEARCH FREQUENCY DATA: A PROXY
OF COLLECTIVE ATTENTION
Our overall goal is to proceed towards a better under-
standing of the dynamics of collective interests and concerns
of large populations of Web users. The empirical basis for
the work reported here consists of time series obtained from
Google Trends which indicate how search volumes related to
speciﬁc topics evolve over time.
Google Trends is a publicly accessible service that pro-
vides statistics on queries users submitted to Google’s search
engine. It allows for retrieving weekly summaries of how
frequently a query has been used since January 1st 2004.
Aggregated statistics are available in form of global aver-
ages but can be narrowed down to regional statistics, for
instance on the level of individual countries.
Analyzing topic-specific search dynamics is an increasingly
popular approach in studies on collective preferences [2,4,
5,9,12,13,16,18,22,26,33] and important questions
pertaining to its validity and the significance of search data
have been addressed in two recent contributions.
Table 1: 45 countries considered in this study
Africa     MA, NG, ZA
Asia       CN, ID, IN, JP, KR, MY, PH, TH, TW
Australia  AU, NZ
Europe     AT, BE, CH, CZ, DE, DK, ES, FI, FR, GR,
           IE, IL, IT, NL, NO, PL, PT, RU, SE, TR,
N-America  CA, MX, US
S-America  AR, BR, CL, CO, PE, VE
Mellon  correlated results from traditional Gallup sur-
veys with Google Trends data and found that, w.r.t. politi-
cal and economic issues covered in traditional opinion polls,
search frequencies provide accurate proxies of the dynamics
of salient public opinions. Teevan et al.  studied how peo-
ple navigate the Web and found that over 25% of all queries
to search engines are navigational queries, i.e. searches for
company names such as facebook, youtube, or myspace that
are intended to ﬁnd and then access particular Web sites.
In other words, a large percentage of Web users consistently
relies on Google searches rather than on bookmarks or on
entering URLs in order to navigate to Web sites. Together
these ﬁndings thus suggest that data from Google Trends
which aggregate information about the search activities of
millions of users are indeed indicative of collective interests
in Internet services, technical products, or novelties.
2.2 Data Collection and Preprocessing
In this paper, we analyze global and regional temporal
search statistics related to query terms such as ebay, facebook,
or youtube that indicate a population's interest in social
media services or Web-based businesses. For potentially
ambiguous queries, we retrieve data for diﬀerent spellings
(e.g. google plus, googleplus, google+, google +) and compute
their average. In total, we consider data from 45 diﬀerent
countries related to 175 services. As we also retrieve corre-
sponding global search activities, our empirical basis consists
of more than 8,000 data sets.
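
The averaging of spelling variants and the size of the resulting data set can be illustrated with a minimal sketch. The weekly values below are made up for illustration, and the helper name `average_series` is hypothetical; the paper does not specify how the averaging was implemented.

```python
# Hypothetical weekly Google Trends values for spelling variants of one service.
variants = {
    "google plus": [0, 10, 40, 80],
    "googleplus": [0, 6, 20, 50],
    "google+": [0, 20, 60, 100],
    "google +": [0, 4, 20, 30],
}


def average_series(series_by_query):
    """Average several equally long time series week by week."""
    rows = list(series_by_query.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all series must cover the same weeks")
    return [sum(week) / len(rows) for week in zip(*rows)]


averaged = average_series(variants)

# 45 countries plus one worldwide aggregate, for each of 175 services.
n_series = (45 + 1) * 175
```

Counting one worldwide series in addition to the 45 country series per service gives the stated total of more than 8,000 time series.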
The 45 countries considered in this study are listed in
Tab. 1. They were selected according to population size,
Internet penetration, and availability of query logs. Note
that this sample covers various regions, cultures, and oﬃcial
languages and is deliberately not restricted to countries that
are technologically far advanced.
The 175 social media sites and Web businesses we consider
are listed in Tab. 2. These, too, were chosen according
to penetration and profile. Among others, they include
general and specialized social networking sites, photo- and
video sharing sites, music streaming services, virtual hangouts,
(micro-)blogging services, online retailers, trading
platforms, as well as social games providers and thus cover
a wide spectrum of social media.
Table 2: 175 social media services and businesses
43things      flixter          mocospace      studivz
adify         fotoki           myheritage     stumbleupon
airbnb        fotolog          mylife         svpply
aisanavenue   foursquare       myspace        sysomos
amazon        friendsreunited  nasza-klasa    taringa
amirite       friendster       netlog         techcrunch
anobii        gaiaonline       netvibes       technorati
asmallworld   getglue          nexopia        tencent-qq
badoo         github           odnoklassniki  tripadvisor
bebo          gogoyoko         openbc         tripit
betfair       goodreads        openid         tuenti
bigadda       google+          orkut          tumblr
biip.no       grono            owly           twango
bitly         grooveshark      paypal         twitpic
blackplanet   groupon          photobucket    twitter
bliptv        habbo            pinterest      viadeo
boxcryptor    hi5              plaxo          vimeo
busuu         hulu             playdom        virb
buzznet       ibibo            posterous      vkontakte
cafemom       imgur            qapacity       wakoopa
cloob         instagram        quechup        wattpad
cotweet       italki           qzone          weeworld
cozycot       itsmy            ravelry        weibo
craiglist     iwiw             reddit         weread
cyworld       janrain          renren         wesabe
dailybooth    jiepang          revver         wikia
dailymotion   joost            scribd         wikipedia
deezer        justin-tv        scvngr         winpalace
delicious     kdice            secondlife     wordpress
deviantart    kickstarter      seedrs         xanga
digg          kiwibox          sevenload      xing
disaboom      knitty           shelfari       yelp
disqus        lagbook          shopify        youku
dontstayin    last.fm          skyblog        youtube
dropbox       librarything     skype          zaarly
dwolla        linkedin         skyrock        zappos
ebay          livejournal      slashdot       zoho
elftown       livemocha        slide.com      zoomr
elixio        living-social    songza         zooppa
epinions      logoworks        sonico         zotero
facebook      meebo            soonr          zynga
faceparty     meetin           soundcloud
failblog      mendely          sourceforge
fetlife       metacafe         spotify
flickr        mixi             stackoverflow
For each combination of country and service, we collect
a discrete time series $z = [z_1, z_2, \ldots, z_{483}]$ of weekly search
counts $z_t$ from January 2004 to March 2013.
As many services in our sample made their first appearance
later than January 2004 (e.g. youtube) and were thus
not actively searched for during the whole observation period,
we determine individual onset times $t_o$ using CUSUM
statistics . This leaves us with shortened time series
$y = [y_{t_o}, \ldots, y_{483}]$ which we shift to $y_{t'}$ where $t' = t - t_o$ in
order to facilitate statistical analysis.
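
Onset detection of this kind can be sketched as follows. The text only states that CUSUM statistics are used, so the one-sided variant below, the function name `cusum_onset`, and the values of `baseline`, `slack`, and `threshold` are assumptions chosen for illustration, not the authors' settings.

```python
def cusum_onset(z, baseline=0.0, slack=1.0, threshold=5.0):
    """Return the index where a sustained upward shift starts, or None.

    One-sided CUSUM: s accumulates the excess of z[t] over baseline + slack
    and is reset to zero whenever it would turn negative.
    """
    s = 0.0
    start = None  # first index of the current positive excursion
    for t, value in enumerate(z):
        s = max(0.0, s + value - baseline - slack)
        if s == 0.0:
            start = None
        elif start is None:
            start = t
        if s > threshold:
            return start
    return None


# Hypothetical weekly search counts with a late onset.
z = [0, 0, 1, 0, 0, 0, 2, 4, 7, 12, 20, 31]
t_o = cusum_onset(z)
y = z[t_o:]  # shortened series, re-indexed so that t' = t - t_o starts at 0
```

Isolated small values (such as the single count at week 2) do not trigger a detection because the accumulated sum is reset before it reaches the threshold.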
For query terms related to services that were launched
prior to January 2004 (e.g. amazon), we manually determine
the number of weeks $T$ between their first public occurrence
$t_o$ and January 1st 2004 and consider shifted time series
Given these data, we resort to descriptive data mining
techniques in order to identify commonalities or signiﬁcant
diﬀerences between time series. In particular, we consider
three diﬀusion models which we review in the next section.
3. DIFFUSION MODELS
Visual inspection of search frequency time series related to
social media reveals noticeably common patterns: although,
on a microscopic level, collective interest in individual ser-
vices varies chaotically, macroscopic trends typically show
an initial phase of accelerated growth followed by periods
of saturation and prolonged decline (see, for instance, the
examples in Fig. 1).
Skewed temporal distributions like these frequently occur
in economics where they characterize buying behaviors or rates
of adoption and are studied using diffusion models. We adhere
to this methodology and investigate to what extent simple
diﬀusion models can characterize general trends in our data.
Note that more elaborate approaches such as Gaussian
mixtures or kernel techniques might provide more accurate
fits. Alas, they typically lack interpretability since they yield
abstract descriptions in terms of (numerous) latent variables without
physical meaning. Diﬀusion models, on the other hand, are
deliberately designed to explain time series in terms of intu-
itive concepts that represent knowledge about everyday life
and the real world.
Since we are interested in macroscopic trends, we restrict
our analysis to two-parameter models which are unlikely to
over-ﬁt the data but will capture its gist. Moreover, they
facilitate data exploration and simplify comparisons of sets
of time series. In order for this paper to be self-contained,
this section briefly reviews the three diffusion models we consider.
3.1 The Bass Model
In an inﬂuential paper, Bass  proposed a diﬀusion model
to describe how rates of adoption of novel products vary
over time. Introducing a parameter $p$ to model a propensity
for innovation and a parameter $q$ to model a propensity for
imitation, he cast the hazard rate of product adoption as
$$h(t) = \frac{f(t)}{1 - F(t)} = p + q F(t) \tag{1}$$
where $f(t)$ is a probability density and $F(t) = \int_0^t f(\tau)\, d\tau$ is
the corresponding cumulative density. Solving the differential
equation in (1) leads to the Bass distribution
$$f_{BA}(t \mid p, q) = \frac{(p+q)^2}{p} \, \frac{e^{-(p+q)t}}{\left(1 + \frac{q}{p} e^{-(p+q)t}\right)^2}. \tag{2}$$
Depending on the choice of $p$ and $q$, this distribution can
assume a variety of shapes. In particular, for $q > p$, it
will increase to a maximum before decreasing to zero. This
becomes explicit by writing (1) as $f(t) = (p + qF(t)) - (p + qF(t))\,F(t)$
which exposes the adoption rate to result from composing
two antagonistic processes: a propensity $p + qF(t)$ to grow
countered by a propensity $(p + qF(t))\,F(t)$ to decline.
We include the Bass model in our analysis because it often
accurately models sales and thus may also be able to explain
collective attention dynamics on the Web.
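
As a minimal sketch, the Bass density and its cumulative density can be evaluated directly and checked against the hazard relation (1). The parameter values are hypothetical weekly rates, and the closed-form peak location $\ln(q/p)/(p+q)$ is the standard result for the Bass density rather than something stated in this paper.

```python
import math


def bass_pdf(t, p, q):
    """Bass density f_BA(t | p, q) for t >= 0."""
    e = math.exp(-(p + q) * t)
    return (p + q) ** 2 / p * e / (1.0 + q / p * e) ** 2


def bass_cdf(t, p, q):
    """Cumulative density F(t) belonging to the Bass density."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + q / p * e)


def bass_peak_time(p, q):
    """Location of the maximum of the density; positive only if q > p."""
    return math.log(q / p) / (p + q)


p, q = 0.002, 0.05  # hypothetical innovation and imitation rates per week
t_peak = bass_peak_time(p, q)

# Hazard rate f / (1 - F) compared with p + q F at an arbitrary time.
t = 30.0
hazard = bass_pdf(t, p, q) / (1.0 - bass_cdf(t, p, q))
growth = p + q * bass_cdf(t, p, q)
```

With $q > p$ the density rises until `t_peak` and declines afterwards, which is the pattern of growth and decline discussed in the text.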
3.2 The Shifted Gompertz Model
As our second model, we consider the shifted Gompertz
$$f_{SG}(t \mid \beta, \eta) = \beta e^{-\beta t} e^{-\eta e^{-\beta t}} \left[1 + \eta \left(1 - e^{-\beta t}\right)\right] \tag{3}$$
where $t, \beta, \eta \geq 0$. It was introduced by Bemmaor  who
showed that the Bass model results from compounding the
shifted Gompertz with an Exponential distribution, i.e.
$$f_{BA}(t \mid p, q) = \int_0^\infty f_{SG}(t \mid \beta, \eta) \, \frac{1}{\sigma} e^{-\eta/\sigma} \, d\eta \tag{4}$$
such that $p = \beta/(1+\sigma)$ and $q = p\sigma$. This reveals a latent
coupling of the Bass parameters $p$ and $q$ due to taking the
average over the shape parameter $\eta$ of the shifted Gompertz.
Bemmaor’s shifted Gompertz therefore provides a more ﬂex-
ible characterization of adoption dynamics and we explore
its merits in our experiments below.
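
The compounding relation between the shifted Gompertz and the Bass model can be checked numerically. The sketch below assumes that the shape parameter is exponentially distributed with mean $\sigma = q/p$ and that $\beta = p + q$, which is what $p = \beta/(1+\sigma)$ and $q = p\sigma$ imply; the integration cutoff, the step count, and the parameter values are arbitrary choices for illustration.

```python
import math


def shifted_gompertz_pdf(t, beta, eta):
    """Shifted Gompertz density f_SG(t | beta, eta)."""
    e = math.exp(-beta * t)
    return beta * e * math.exp(-eta * e) * (1.0 + eta * (1.0 - e))


def bass_pdf(t, p, q):
    """Bass density f_BA(t | p, q)."""
    e = math.exp(-(p + q) * t)
    return (p + q) ** 2 / p * e / (1.0 + q / p * e) ** 2


def compounded_pdf(t, beta, sigma, cutoff=40.0, steps=20000):
    """Average f_SG over eta ~ Exponential(mean sigma) with Simpson's rule.

    The integral over [0, inf) is cut off at eta = cutoff * sigma, where the
    mixing density exp(-eta / sigma) / sigma is negligible. steps must be even.
    """
    upper = cutoff * sigma
    h = upper / steps

    def integrand(eta):
        return shifted_gompertz_pdf(t, beta, eta) * math.exp(-eta / sigma) / sigma

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


p, q = 0.002, 0.05            # hypothetical Bass parameters
beta, sigma = p + q, q / p    # implied shifted Gompertz rate and mixing mean

# Relative deviation between the compounded density and the Bass density.
deviations = [
    abs(compounded_pdf(t, beta, sigma) - bass_pdf(t, p, q)) / bass_pdf(t, p, q)
    for t in (0.0, 20.0, 60.0, 120.0)
]
```

Because the Bass model fixes the distribution of the shape parameter, fitting the shifted Gompertz with a free $\eta$ leaves one more degree of freedom for the shape of the curve.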
3.3 The Weibull Model
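The Weibull distribution is the type III extreme value
distribution and often applied as a life-time model . Its
probability density function is defined for $t \in [0, \infty)$ and given by
$$f_{WB}(t \mid \kappa, \lambda) = \frac{\kappa}{\lambda} \left(\frac{t}{\lambda}\right)^{\kappa-1} e^{-(t/\lambda)^\kappa} \tag{5}$$
where $\kappa$ and $\lambda$ determine shape and scale. For $\kappa = 1$, the
Weibull coincides with the Exponential and, for $\kappa \approx 3.5$, its
shape approximates that of a Normal distribution.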
Studying the dynamics of Internet memes, Bauckhage et
al.  pointed out that the Weibull, too, implicitly couples
two antagonistic growth dynamics. This can be seen from
considering its cumulative density function
$$F_{WB}(t \mid \kappa, \lambda) = 1 - e^{-(t/\lambda)^\kappa}. \tag{6}$$
Setting $\alpha = (1/\lambda)^\kappa$ for brevity, rearranging the terms in (6),
and substituting into (5) yields $f(t) = \alpha\kappa t^{\kappa-1} - \alpha\kappa t^{\kappa-1} F(t)$.
Considered as a diffusion model, the Weibull distribution
thus combines a propensity $\alpha\kappa t^{\kappa-1}$ for collective attention
to a service or product to grow with a propensity $\alpha\kappa t^{\kappa-1} F(t)$
for attention to subside. In passing, we note that by letting
$\alpha = \alpha(t) = p + qF(t)$ and setting $\kappa = 1$, this expression
formally reduces to $f(t) = (p + qF(t))(1 - F(t))$, i.e. to the Bass model in (1).
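
The decomposition of the Weibull density into a growth term and a decline term can be verified numerically. The following sketch uses hypothetical values for $\kappa$ and $\lambda$; the function names are illustrative only.

```python
import math


def weibull_pdf(t, kappa, lam):
    """Weibull density f_WB(t | kappa, lam)."""
    return kappa / lam * (t / lam) ** (kappa - 1) * math.exp(-((t / lam) ** kappa))


def weibull_cdf(t, kappa, lam):
    """Weibull cumulative density F_WB(t | kappa, lam)."""
    return 1.0 - math.exp(-((t / lam) ** kappa))


def growth_term(t, kappa, lam):
    """Propensity to grow: alpha * kappa * t^(kappa - 1) with alpha = (1/lam)^kappa."""
    alpha = (1.0 / lam) ** kappa
    return alpha * kappa * t ** (kappa - 1)


kappa, lam = 2.0, 200.0  # hypothetical shape and scale (in weeks)
t = 150.0

grow = growth_term(t, kappa, lam)
decline = grow * weibull_cdf(t, kappa, lam)  # propensity to subside
density = weibull_pdf(t, kappa, lam)
```

Here `grow - decline` reproduces `density` up to floating point error, which mirrors the identity obtained by substituting (6) into (5).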
3.4 Model Fitting
When applying the above diffusion models to analyze temporal
signatures of collective attention on the Web, we must
cope with the fact that none of the models provides a closed form
solution for the maximum likelihood estimates of its parameters.
Addressing this issue and aiming at high efficiency
for large scale processing, we propose the use of multinomial
maximum likelihood techniques.
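Throughout, we fit continuous distributions $f(t \mid \theta_1, \theta_2)$ to
discrete series of frequency counts $y_1, \ldots, y_m$ grouped into $m$
distinct intervals $(t_0, t_1], (t_1, t_2], \ldots, (t_{m-1}, t_m]$. To devise an
efficient algorithm for estimating optimal model parameters
$\theta_1^*$ and $\theta_2^*$, we note that a histogram $h(y_1, \ldots, y_m)$ of counts
can be thought of as a multinomial distribution
$$h(y_1, \ldots, y_m) = n! \prod_{i=1}^m \frac{p_i^{y_i}}{y_i!} \tag{7}$$
where $n = \sum_i y_i$. Since the cumulative density of the model is
$$F(t) = F(t \mid \theta_1, \theta_2) = \int_0^t f(\tau \mid \theta_1, \theta_2)\, d\tau, \tag{8}$$
the probabilities $p_i$ of the multinomial can be expressed as
$p_i(\theta_1, \theta_2) = F(t_i) - F(t_{i-1})$ so that $\sum_i p_i = F(t_m) - F(t_0)$.
Accordingly, the likelihood for a discrete, truncated time
series $y_1, \ldots, y_m$ is given by
$$L(\theta_1, \theta_2) = n! \prod_{i=1}^m \frac{1}{y_i!} \left(\frac{F(t_i) - F(t_{i-1})}{F(t_m) - F(t_0)}\right)^{y_i} \tag{9}$$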
and maximum likelihood estimates of $\theta_1$ and $\theta_2$ result from
computing the roots of $\nabla_\theta \log L$. Again, this may not lead to
closed form solutions but may require numerical optimization.
To this end, we apply an efficient, iterative weighted
least squares scheme
$$\min_{\theta_1, \theta_2} \sum_{i=1}^m w_i \left(y_i - n\, p_i(\theta_1, \theta_2)\right)^2 \tag{10}$$
which regresses the $y_i$ onto their expectations $n p_i$ and requires
updating the weights $w_i = (n p_i)^{-1}$ in each iteration.
In addition to computational convenience, this approach
is robust and has the property that, for $p_i = p_i(\theta_1^*, \theta_2^*)$, the
final residual sum of squares follows a $\chi^2$ statistic . We
thus resort to the $\chi^2$-test for goodness of fit (GoF) testing.
Yet, we note that the $\chi^2$-test may underestimate the quality
of fits to time series  so that the results reported below
may improve even further if more elaborate tests were used.
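
The ingredients of this procedure can be sketched for the Weibull model. Note the simplification: instead of the iterative weighted least squares scheme described above, the sketch picks the best parameter pair from a small candidate grid by maximizing the truncated multinomial log-likelihood, and then reports the weighted residual sum of squares. The synthetic counts, the grid, and all function names are assumptions made for illustration.

```python
import math


def weibull_cdf(t, kappa, lam):
    return 1.0 - math.exp(-((t / lam) ** kappa))


def bin_probabilities(edges, kappa, lam):
    """Bin probabilities F(t_i) - F(t_{i-1}), normalized by F(t_m) - F(t_0)."""
    cdf = [weibull_cdf(t, kappa, lam) for t in edges]
    mass = cdf[-1] - cdf[0]
    return [(b - a) / mass for a, b in zip(cdf, cdf[1:])]


def log_likelihood(counts, probs):
    """Multinomial log-likelihood of the counts under the bin probabilities."""
    ll = math.lgamma(sum(counts) + 1)
    for y, p in zip(counts, probs):
        ll += y * math.log(max(p, 1e-300)) - math.lgamma(y + 1)
    return ll


def chi_square(counts, probs):
    """Weighted residual sum of squares with weights 1 / (n p_i)."""
    n = sum(counts)
    return sum((y - n * p) ** 2 / (n * p) for y, p in zip(counts, probs))


def grid_fit(counts, edges, kappas, lams):
    """Return the candidate (kappa, lam) with the highest log-likelihood."""
    candidates = [(k, l) for k in kappas for l in lams]
    return max(
        candidates,
        key=lambda th: log_likelihood(counts, bin_probabilities(edges, *th)),
    )


edges = list(range(0, 484))  # 483 weekly intervals (t_{i-1}, t_i]
# Synthetic counts generated from a known model (kappa = 2, lam = 200).
counts = [round(100_000 * p) for p in bin_probabilities(edges, 2.0, 200.0)]

best = grid_fit(counts, edges, [1.5, 2.0, 2.5], [150.0, 200.0, 250.0])
stat = chi_square(counts, bin_probabilities(edges, *best))
```

In practice the grid search would be replaced by a numerical optimizer, and `stat` would be compared against a $\chi^2$ distribution with $m - 3$ degrees of freedom to obtain a p-value for a two-parameter model.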
4. EMPIRICAL RESULTS
This section presents and discusses trend analysis results
for our data set of about 8,000 social media related search
frequency signatures. In order to illustrate several arguably
important ﬁndings, we compare results obtained for distinct
countries, regions of the world, linguistic backgrounds, and
types of service in form of small case studies.
4.1 Time to Adoption
In a preparatory analysis, we gather statistics as to times
to adoption of social media in diﬀerent countries. For each
service in our data set, we determine its global onset, i.e. the
point in time at which it ﬁrst became visible in Google’s
search frequency data. Then, for every country in our data,
we determine the delay ∆t(in days) between the service’s
global onset and its onset in the country. Finally, we com-
pute the mean (µ) and median (m) delay per country in
order to perform comparisons.
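
The computation of per-country delays and their summary statistics can be sketched as follows. The onset days are invented for illustration, and the data layout (a dictionary per service with a `"global"` entry) is an assumption.

```python
from statistics import mean, median

# Hypothetical onset days (counted from 2004-01-01): service -> region -> onset.
onsets = {
    "service_a": {"global": 100, "US": 100, "DE": 400, "KR": 900},
    "service_b": {"global": 500, "US": 520, "DE": 1000, "KR": 1500},
    "service_c": {"global": 50, "US": 150, "DE": 350, "KR": 650},
}


def adoption_delays(onsets):
    """Collect, per country, the delays between global and national onsets."""
    delays = {}
    for per_region in onsets.values():
        start = per_region["global"]
        for country, day in per_region.items():
            if country != "global":
                delays.setdefault(country, []).append(day - start)
    return delays


delays = adoption_delays(onsets)
summary = {country: (mean(d), median(d)) for country, d in delays.items()}

# Rank countries by median delay, fastest adopters first.
ranking = sorted(summary, key=lambda country: summary[country][1])
```

Ranking by the median instead of the mean reduces the influence of individual services with unusually late national onsets.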
Table 3 ranks the 45 countries considered according to
their mean- and median times to adoption; the world map
at the bottom of the table shows a heat map visualization of
median times to adoption. Together with Japan, countries
from the western world lead both rankings. With respect
to both metrics, the US is the country where social media
most quickly achieve noticeable rates of adoption. This is
hardly surprising since many popular social media services such
as facebook are based in the US and thus may gather an
American audience faster than a global one.
A less anticipated ﬁnding comes from looking at Fig. 2
which plots median times to adoption along the time axis.
The delay between the US and the next fastest adopting
country, the UK, amounts to more than 200 days. For the
majority of countries in our study, we ﬁnd that Web-based
social media achieve noticeable rates of adoption between
400 and 600 days after their launch or ﬁrst observable onset.
At first sight, it thus appears surprising to find South Korea,
a technologically highly advanced nation, to lag behind in
this statistic. Yet, this can be attributed to peculiar aspects
of South Korean Web culture which features many social
media such as cyworld or me2day that are very popular
within the country but rather unknown elsewhere.
Table 3: Rankings of countries w.r.t. mean (µ) and
median (m) time to adoption of a novel service
rank µ  m    rank µ  m    rank µ  m
 1.  US US   16.  NL IE   31.  NZ AR
 2.  UK UK   17.  AT MX   32.  TW DK
 3.  FR CA   18.  IE AT   33.  DK NO
 4.  CA FR   19.  MX PL   34.  ZA ZA
 5.  JP DE   20.  PL MY   35.  CN TW
 6.  DE JP   21.  PH IL   36.  CO CO
 7.  AU ES   22.  IN NZ   37.  TH VE
 8.  IT IT   23.  PT BR   38.  NO GR
 9.  ES NL   24.  IL PH   39.  GR NG
10.  BE BE   25.  MA CL   40.  CZ CZ
11.  SE AU   26.  VE MA   41.  ID UA
12.  MY SE   27.  CL PE   42.  UA ID
13.  PE FI   28.  AR IN   43.  KR TH
14.  FI PT   29.  TR TR   44.  RU RU
15.  CH CH   30.  BR CN   45.  NG KR
Figure 2: Time line showing median times to adoption
(in days) of social media for different countries; the time axis ranges from 0 to 1000 days.
Findings like these further underline that search frequency
signatures indeed provide plausible proxies for the study of
collective attention on the Web. Next, we therefore address
aspects of attention dynamics expressed in query log data.
4.2 Attention Dynamics
In our main analysis, we apply the economic diffusion
models from section 3 in order to mine our data for shared
characteristics or noteworthy exceptions.
Table 4 presents Goodness-of-Fit (GoF) results for all
three models in terms of p-value statistics obtained from
$\chi^2$-tests. To produce these statistics, data from different
countries were grouped into clusters representing continents
and the models were evaluated for each cluster. For the
shifted Gompertz, average p-values (the higher the better)
significantly exceed 0.5. This holds for fits to data which
reflect worldwide interests as well as for fits to continent
specific data. Moreover, at a significance level of 5%, we
find the shifted Gompertz to provide accurate fits for the
majority of our data. In terms of overall GoF, the Bass and
the Weibull perform slightly worse, yet both models yield
statistically significant fits for most of the data, too.
Table 4: Goodness of fit w.r.t. regions of the world
region      f_SG            f_BA            f_WB
            ⟨p⟩   p > 0.05  ⟨p⟩   p > 0.05  ⟨p⟩   p > 0.05
Africa      0.61  68%       0.55  62%       0.50  57%
Asia        0.57  63%       0.49  54%       0.48  53%
Australia   0.66  70%       0.53  59%       0.50  58%
Europe      0.59  65%       0.48  51%       0.56  54%
N-America   0.54  57%       0.44  50%       0.39  44%
S-America   0.65  71%       0.54  59%       0.55  62%
worldwide   0.59  64%       0.50  55%       0.47  53%
Table 5: Goodness of fit w.r.t. languages of the world
language    f_SG            f_BA            f_WB
            ⟨p⟩   p > 0.05  ⟨p⟩   p > 0.05  ⟨p⟩   p > 0.05
English     0.55  58%       0.44  49%       0.39  45%
Spanish     0.63  68%       0.52  56%       0.54  60%
Portuguese  0.60  67%       0.50  56%       0.47  51%
Russian     0.68  76%       0.58  66%       0.69  76%
French      0.55  60%       0.46  51%       0.39  45%
German      0.58  64%       0.47  52%       0.47  54%
Chinese     0.50  52%       0.42  46%       0.43  47%
Japanese    0.42  52%       0.38  44%       0.31  38%
Hindi       0.57  64%       0.47  54%       0.48  52%
average     0.57  62%       0.47  52%       0.45  51%
Table 5 provides an alternative view on our data. While
Tab. 4 shows results w.r.t. geographic regions, Tab. 5 lists
GoF results w.r.t. major languages spoken across the world.
Data from different countries were grouped into clusters
representing official languages and the three diffusion models
were evaluated for each cluster. Apparently, the results in
Tab. 5 mirror those in Tab. 4. Quality and significance of
fits are comparable and the shifted Gompertz again provides
the most accurate explanation.
These results are interesting and important for they sug-
gest that the dynamics of collective attention apparent from
search frequency data can be accurately described in terms of
diﬀusion models. Moreover, they indicate that, around the
world, collective attention to social media evolves similarly
and independent of regions of origin or cultural backgrounds
of crowds of Web users.
Figure 3 shows how our diffusion models fit general trends
for several well known social media platforms and Web-based
businesses. Gray curves show evolving global search volumes
available from Google Trends; colored curves represent fitted
models where the best fitting one (in terms of GoF)
is emphasized. These plots are in line with the results in
Tabs. 4 and 5 and illustrate that all three models are able to
capture general dynamics even if data for diﬀerent services
show seemingly distinct patterns of growing and declining
A considerable advantage of descriptive data mining for
attention analysis and, in particular, of using two-parameter
diffusion models $f(t \mid \theta_1, \theta_2)$ is that they facilitate visual analytics.
Once a diffusion model has been fit to a temporal
signature of search activities, its parameters $[\theta_1, \theta_2]$ provide
a two-dimensional feature vector that characterizes the
time series and may be used in further analysis. Specifically,
our approach immediately allows for non-linear, two-dimensional
embeddings of the data which can be plotted to
visualize whole data sets of time series.
Figure 3: Exemplary visualizations of how the three diffusion models (Bass, shifted Gompertz, and Weibull)
fit general trends in temporal signatures of worldwide query logs related to several popular and well known
social media services and Web-based businesses; the respective best fitting model is emphasized.
Figure 4: Non-linear, two-dimensional embeddings of more than 8,000 search frequency time series into the
parameter spaces of (a) the Bass model, (b) the shifted Gompertz model, and (c) the Weibull model. In each case, the 2D
embedding coordinates of the eight examples in Fig. 3 are highlighted in color.
Figure 4 displays two-dimensional embeddings of all our
data according to the different diffusion models. To facilitate
interpretation, the coordinates of the eight time series in
Fig. 3 are highlighted in color.
In each case, the embedding coordinates of amazon, a
business that continues to attract increasing user interest,
mark an extreme location in the embedding space. Similarly
extreme locations are occupied by craiglist and ebay,
two Web-platforms that were launched in the 1990s and
reached global peak popularity around 2008. The embed-
ding coordinates of google+, a service whose search frequency
time series indicate a spike of global attention after its launch
in 2011, reside at opposite extreme locations. All other time
series from Fig. 3are found more or less close together in
respective giant clusters of embedded search frequency data.
The existence of these giant clusters which contain almost
90% of all time series tested is arguably the most important
result of our analysis. Irrespective of the diﬀusion model
used to characterize general collective attention dynamics
and regardless of which region in the world is considered, it
appears that most time series in our collection show similar
behavior: individual social media services seem to be able to
attract increasing collective attention for a period of 4 to 6
years before user interest inevitably begins to subside. This
is visible in many of the time series shown throughout this
paper, well accounted for by the shape and scale parameters
of economic diﬀusion models, and thus strikingly apparent
in Fig. 4.
4.2.1 Case Study: Countries, Continents, Languages
Figure 5 compares examples of attention dynamics for different
countries, continents, and linguistic backgrounds.
In Fig. 5(a), we embed data from the US and South Korea
in the parameter space of the shifted Gompertz model.
Above, both countries were found to be most diﬀerent re-
garding median times to adoption of the services considered
in this study. Figure 5(a), however, indicates that attention
dynamics in both countries are rather similar.
Israel and Malaysia, two countries from different parts
of the world, occupy middle ranks in Tab. 3. Yet, their
embeddings in Fig. 5(b) overlap with those of the US and
South Korea and do not indicate noteworthy differences as
to collective attention dynamics. Similar conclusions apply
to the comparison of Asian and South American countries in
Fig. 5(c) and the comparison of English and Russian speaking
countries in Fig. 5(d).
Figure 5: Exemplary comparisons of search frequency time series from different countries, continents, and
languages plotted in the two-dimensional parameter space of the shifted Gompertz model:
(a) US and South Korea, (b) Israel and Malaysia, (c) Asia and South America, (d) English and Russian.
Figure 6: Exemplary comparisons of temporal
query log data related to different social media services:
(a) facebook and myspace, (b) flickr and imgur.
4.2.2 Case Study: Social Networks, Photo Sharing
While diﬀusion models seem not to allow for a distinction
of attention dynamics in diﬀerent countries or regions, we
ﬁnd that data related to individual services tend to form
compact, separable clusters in the parameter spaces of the
models we consider.
As an example, Fig. 6 compares two social networks and
two photo sharing sites. Country speciﬁc time series re-
lated to myspace and facebook form distinct clusters in the
embedding space of the shifted Gompertz model. Whereas
myspace is a social networking site that came and went,
facebook seems to just have reached global peak popularity.
This diﬀerence is expressed in the scale parameter of the
shifted Gompertz. Likewise, attention dynamics for ﬂickr
and imgur are well explicable in terms of the general cycle
of growth and decline; the apparent diﬀerence is that, in
most countries interest in ﬂickr seems to decline while for
imgur it is still on the rise.
4.2.3 Case Study: amazon
Among all Web-based services considered in this study,
amazon, an online retailer, is found to cause the most diverse
patterns of collective attention dynamics. Figure 7 shows
that, while in most countries interest in amazon rises steadily
over the whole observation period, it remains rather constant
in others, and actually declines in some, albeit few, cases.
Figure 7: Query log data related to amazon.
Figure 8: Query log data related to twitter.
4.2.4 Case Study: twitter
Twitter, a popular micro-blogging service, is another example
of a service where attention dynamics vary significantly
between countries. While in most countries in our study, in-
terest in twitter seems to just have reached its peak (see the
distinct cluster within the giant component in Fig. 8), there
are a few countries in which interest in this service continues
to rise, notably in France, Malaysia, and Turkey.
Prompted by the overall high statistical signiﬁcance of
ﬁts provided by the three diﬀusion models, we apply them
to predict the future evolution of global collective interest in
existing social media. Next, we present qualitative results of
predictions over the next ﬁve years for exemplary services.
In addition, we consider services launched prior to 2004 and
demonstrate that the technique discussed in section 3 also
allows for reasonably predicting the past and actually is able
to reconstruct unobserved past developments.
Figure 9: Predictions of future collective interest in exemplary social media services. Gray curves show
data obtained from Google Trends; solid colored curves indicate ﬁts to these data, and dashed colored curves
show corresponding 5 year predictions. Note that these predictions do not indicate absolute user interest but
predict the evolution of relative search frequencies w.r.t. the maximum interest so far which is scaled to 100.
Figure 10: Predictions of past and future collective interest in Web-based businesses launched prior to 2004.
Figure 9 shows predictions of the development of collective
attention to three of today's prominent social media platforms.
To create these plots, we scale the range of the best
fitting instance of each model in our tests to match the range
of values used by Google Trends. While our predictions do
not allow for an estimation of the development of absolute
numbers of users interested in a service, they indicate how
interest may evolve relative to the present.
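The scaling step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used to produce Fig. 9: it assumes the Weibull density in the form f(t) = alpha * kappa * t^(kappa-1) * exp(-alpha * t^kappa), uses made-up parameter values, and rescales the curve so that its maximum over the observed months equals 100, the top of the Google Trends scale. The names scale_to_trends, ALPHA, KAPPA, N_OBSERVED, and N_PREDICTED are hypothetical.

```python
import math


def weibull_pdf(t, alpha, kappa):
    """Weibull density, assuming f(t) = alpha*kappa*t^(kappa-1)*exp(-alpha*t^kappa)."""
    return alpha * kappa * t ** (kappa - 1.0) * math.exp(-alpha * t ** kappa)


def scale_to_trends(curve, n_observed, top=100.0):
    """Rescale a curve so that its maximum over the observed part equals `top`."""
    peak_so_far = max(curve[:n_observed])
    return [top * value / peak_so_far for value in curve]


# Hypothetical fitted parameters (time measured in years since onset).
ALPHA, KAPPA = 0.01, 2.5
# Hypothetical setting: 60 observed months followed by a 60 month prediction.
N_OBSERVED, N_PREDICTED = 60, 60

times = [(month + 1) / 12.0 for month in range(N_OBSERVED + N_PREDICTED)]
curve = [weibull_pdf(t, ALPHA, KAPPA) for t in times]
scaled = scale_to_trends(curve, N_OBSERVED)

# Interest at the end of the prediction horizon relative to the present.
relative_change = scaled[-1] / scaled[N_OBSERVED - 1]
print(f"predicted interest relative to present: {relative_change:.2f}")
```

Because only the ratio to the observed maximum is retained, such a curve says nothing about absolute user numbers; it only expresses predicted interest relative to the present.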
As the available data for the three services are truncated
from above, i.e., each service has either just reached or not yet
reached peak popularity, traditional maximum likelihood estimates
may not be reliable. However, visual inspection suggests
that, when using multinomial maximum likelihood, all
three diﬀusion models provide reasonable predictions. While
predictions according to the Weibull model seem overly optimistic
and those due to the Bass model seem rather pessimistic,
the shifted Gompertz model marks a middle ground. For
instance, in the case of facebook it predicts that by 2017
collective interest in this service will reduce to 50% of its
current intensity. We remark that, while at ﬁrst sight such
a development may seem improbable from today’s point of
view, the vast majority of the 175 social media considered in
this paper show characteristic cycles of growth and decline.
Given the data available as of this writing, collective atten-
tion to facebook so far seems to follow the same pattern.
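A statement such as "interest will reduce to 50% of its current intensity" can be read off a fitted curve as sketched below. The example is only illustrative: it uses a Weibull curve with hypothetical parameters as a stand-in (the figure quoted above stems from the shifted Gompertz model), assumes the density form f(t) = alpha * kappa * t^(kappa-1) * exp(-alpha * t^kappa) with kappa > 1, and locates the crossing by simple bisection. The helper names weibull_mode and time_to_fraction are not taken from the paper.

```python
import math


def weibull_pdf(t, alpha, kappa):
    """Weibull density, assuming f(t) = alpha*kappa*t^(kappa-1)*exp(-alpha*t^kappa)."""
    return alpha * kappa * t ** (kappa - 1.0) * math.exp(-alpha * t ** kappa)


def weibull_mode(alpha, kappa):
    """Time of peak interest under the assumed density (requires kappa > 1)."""
    return ((kappa - 1.0) / (alpha * kappa)) ** (1.0 / kappa)


def time_to_fraction(now, fraction, alpha, kappa, horizon=50.0, tol=1e-9):
    """First time after `now` at which the curve has fallen to fraction * f(now).

    Assumes `now` lies at or beyond the peak, where the density is decreasing.
    """
    if now < weibull_mode(alpha, kappa):
        raise ValueError("`now` must not lie before the peak")
    target = fraction * weibull_pdf(now, alpha, kappa)
    lo, hi = now, now + horizon
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_pdf(mid, alpha, kappa) > target:
            lo = mid  # still above the target level, move right
        else:
            hi = mid  # at or below the target level, move left
    return 0.5 * (lo + hi)


# Hypothetical parameters; time is measured in years since onset.
ALPHA, KAPPA = 0.01, 2.5
NOW = 5.2  # hypothetical present, shortly after the peak

peak = weibull_mode(ALPHA, KAPPA)
t_half = time_to_fraction(NOW, 0.5, ALPHA, KAPPA)
print(f"peak after {peak:.2f} years, half of current interest after {t_half:.2f} years")
```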
Figure 10 shows examples of ﬁts to severely truncated
search frequency data. Each Web-based business in this ﬁg-
ure was launched prior to 2004 so that data from Google
Trends is incomplete regarding the past. Nevertheless, based
on multinomial maximum likelihood, the characterizations
of general trends according to each diﬀusion model are again
reasonable; in particular, onset times predicted by the shifted
Gompertz model match the dates on which these businesses were launched.
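Fits to truncated histories such as those discussed above can be illustrated with a small sketch. It is not the estimation algorithm used in this paper; it merely shows one common way to set up a multinomial likelihood when only a window of the full life-cycle is observed: bin probabilities are renormalized by the probability mass inside the observed window, so that unobserved mass before or after the window does not distort the fit. The Weibull form F(t) = 1 - exp(-alpha * t^kappa), the brute-force grid search, the synthetic counts, and all function names are assumptions made for the example.

```python
import math


def weibull_cdf(t, alpha, kappa):
    """Weibull CDF, assuming the form F(t) = 1 - exp(-alpha * t^kappa)."""
    return 1.0 - math.exp(-alpha * t ** kappa)


def bin_probs(start, n_bins, alpha, kappa):
    """Probabilities of unit-width bins (start+i, start+i+1], conditioned on the window."""
    edges = [weibull_cdf(start + i, alpha, kappa) for i in range(n_bins + 1)]
    window_mass = edges[-1] - edges[0]
    if window_mass <= 0.0:
        return None  # model puts no mass on the observed window
    return [(edges[i + 1] - edges[i]) / window_mass for i in range(n_bins)]


def log_likelihood(counts, start, alpha, kappa):
    """Multinomial log-likelihood (up to a constant) of binned counts."""
    probs = bin_probs(start, len(counts), alpha, kappa)
    if probs is None:
        return -math.inf
    total = 0.0
    for n, p in zip(counts, probs):
        if n > 0:
            if p <= 0.0:
                return -math.inf
            total += n * math.log(p)
    return total


def grid_fit(counts, start, alphas, kappas):
    """Brute-force search for the parameter pair with the highest likelihood."""
    candidates = ((a, k) for a in alphas for k in kappas)
    return max(candidates, key=lambda ak: log_likelihood(counts, start, ak[0], ak[1]))


# Synthetic example: the series starts 2 years after onset and covers 5 years,
# i.e., it is truncated from below and from above.
TRUE_ALPHA, TRUE_KAPPA, START, N_BINS = 0.01, 2.5, 2, 5
counts = [1000.0 * p for p in bin_probs(START, N_BINS, TRUE_ALPHA, TRUE_KAPPA)]

fitted = grid_fit(counts, START, [0.005, 0.01, 0.02, 0.04], [1.5, 2.0, 2.5, 3.0])
print("fitted (alpha, kappa):", fitted)
```

In practice the onset time itself is unknown and has to be estimated as well, and a numerical optimizer would replace the coarse grid; both are left out here to keep the sketch short.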
5. RELATED WORK
Understanding the collective behavior of crowds of Web
users is a research topic of growing popularity and model-
based approaches have been used in this context before. We
divide our discussion of related work into two major parts:
ﬁrst, we review previous contributions to attention dynamics
on the Web in general and then we discuss two recent, highly
related publications on the evolution of popularity of social
media services which themselves attracted considerable
attention in early 2014.
5.1 Attention Dynamics on the Web
Statistical distributions similar to the ones considered in
this paper have been previously applied to characterize the
dynamics of the behavior of crowds of Web users. In an
early contribution, Huberman et al.  analyzed browsing
behaviors and found that the number of links a user is
likely to follow on a Web site is distributed according to an
inverse Gaussian. In , Wu and Huberman studied life-cycles
of news items on a social bookmarking site and found
that the amount of attention novel content receives is
distributed log-normally. The log-normal distribution was also
found to model sizes of cascades of messages passed through
a peer-to-peer recommendation network  or the number
of messages exchanged in instant messaging services .
The Weibull distribution in (5) was recently reported to
account well for statistics of dwell times on Web sites ,
times people spend playing online games , or the dynamics
of Internet memes . The Bass diﬀusion model in (1) has
recently been considered in order to reason about structures
of online social networks  or twitter information cascades
. The shifted Gompertz distribution, on the other hand,
was apparently not yet considered in the context of social
media or Web usage dynamics.
While attention dynamics on shorter time scales have been
modeled using random ﬁelds , structured models ,
or diﬀerential equations , long term temporal dynam-
ics of collective attention have previously been modeled us-
ing mixtures of power-law and Poisson distributions 
or systems of diﬀerential equations [1,28] which were in-
spired by techniques from the area of epidemic modeling
[10,17]. In this context, we note that the diﬀusion mod-
els considered in this paper also allow for interpretations in
terms of the dynamics of elementary diﬀerential equations.
For instance, the Weibull model in (5) can be expressed as
$f(t) = \frac{d}{dt} F(t) = \alpha \kappa t^{\kappa-1} - \alpha \kappa t^{\kappa-1} F(t)$, which hints at a similarity
in spirit between economic diffusion and established
epidemic models that seems to merit further research.
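The differential form of the Weibull model given above can be checked numerically. The following sketch integrates dF/dt = alpha * kappa * t^(kappa-1) * (1 - F) with F(0) = 0 using a classical fourth-order Runge-Kutta scheme and compares the result with the closed form F(t) = 1 - exp(-alpha * t^kappa). That closed form is the solution implied by the differential equation; we assume it coincides with the parametrization in (5). Parameter values and function names are hypothetical.

```python
import math


def weibull_cdf(t, alpha, kappa):
    """Closed form implied by the differential equation with F(0) = 0."""
    return 1.0 - math.exp(-alpha * t ** kappa)


def rhs(t, F, alpha, kappa):
    """Right-hand side of dF/dt = alpha*kappa*t^(kappa-1) * (1 - F)."""
    return alpha * kappa * t ** (kappa - 1.0) * (1.0 - F)


def integrate_rk4(alpha, kappa, t_end, steps):
    """Integrate the growth equation from t = 0 to t_end with fixed-step RK4."""
    h = t_end / steps
    t, F = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, F, alpha, kappa)
        k2 = rhs(t + 0.5 * h, F + 0.5 * h * k1, alpha, kappa)
        k3 = rhs(t + 0.5 * h, F + 0.5 * h * k2, alpha, kappa)
        k4 = rhs(t + h, F + h * k3, alpha, kappa)
        F += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return F


ALPHA, KAPPA, T_END = 0.01, 2.5, 8.0  # hypothetical values, kappa > 1
numeric = integrate_rk4(ALPHA, KAPPA, T_END, steps=8000)
exact = weibull_cdf(T_END, ALPHA, KAPPA)
print(f"numeric F = {numeric:.8f}, closed form F = {exact:.8f}")
```

Read this way, the term (1 - F) plays the role of the not yet reached fraction of the population, while alpha * kappa * t^(kappa-1) acts as a time-dependent rate, which is the structural similarity to epidemic models noted above.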
With respect to time series retrieved from Google Trends,
epidemic models based on diﬀerential equations involving
exogenous and endogenous influences have been discussed in
. There, they were used as means of classifying, i.e. dis-
tinguishing, diﬀerent types of attention dynamics. Trend
analysis based on data from Google Trends was also per-
formed in  yet there the focus was on developing clus-
tering algorithms to characterize diﬀerent phases in search
frequency data. The approaches in [14,15] are thus related
to what is reported here; however, in contrast to these contributions,
we do not explicitly devise new models but consider
simpler representations that implicitly account for diﬀerent
kinds of dynamics. Due to the simplicity of the diﬀusion
models considered here and because of their apparent empir-
ical validity and theoretical plausibility, the results reported
in this paper therefore provide a new baseline for research
on the mechanisms and long-term dynamics of collective at-
tention on the Web.
5.2 The Princeton / Facebook Controversy and
a Contribution from CMU
In a delightful synchronicity, Cannarella and Spechler ,
Ribeiro , and we ourselves  all published analyses of
how attention to social media evolves over time in early 2014.
While  was uploaded to arXiv,  and  were both
presented at the International World Wide Web Conference
The work by Cannarella and Spechler from Princeton is
noteworthy for triggering a brief but ﬁerce media frenzy.
Just as in the work presented here, the results in  were
obtained from analyzing Google Trends time series. Diﬀer-
ing from our approach, Cannarella and Spechler considered
epidemic models to analyze search frequency time series that
indicate interest in services such as myspace or facebook.
While this methodology had earlier been applied to analyze
the temporal evolution of interest in Internet memes ,
Cannarella and Spechler caused a controversy, because they
used their models to predict that facebook would lose 80% of
its users by 2017. Media interest was particularly stirred
by the fact that facebook data scientist Mike Develin was
quick to humorously “debunk” the Princeton “ﬁndings”
Interestingly, our “qualitative” results in Fig. 9 seem to
corroborate Cannarella and Spechler’s predictions and we
note that they were obtained from the same data but diﬀer-
ent models. In any case, we certainly agree with Develin’s
objection that predictions based on search frequency data
have to be taken with a grain of salt. Yet, we disagree with
his argument that social media related search interests of
millions of Web users are not indicative of user engagement
(see again our discussion in Section 2) and note the curious
absence of any direct engagement data in his reply.
However, data that directly reﬂects engagement played an
important role in Ribeiro’s analysis performed at CMU .
He considered statistics available from alexa, a subsidiary of
amazon which provides Web traﬃc data that are gathered
using the alexa toolbar, a plugin that volunteers install in
their browsers so that alexa can track which Web pages they
Regarding Ribeiro’s approach, we note that he extended
established epidemic models by new parameters and found
these new models to be in good agreement with his data.
His findings, too, attracted considerable media attention since
he predicted collective user interest in facebook to remain
constant for years to come. Yet, this result should also
be taken with a grain of salt. While it was derived from
direct engagement data, we point out that alexa data are
likely biased towards technology-savvy users who installed
the toolbar and will hardly reﬂect the surﬁng behavior of
average Web users.
Given this discussion, the approach and results presented
here mark a middle ground. On the one hand, we consider
simple diﬀusion models rather than (intricate) models for
the epidemic spread of novelties. On the other hand, the
statistical basis for our analysis far exceeds those in [11,36].
Neither Cannarella and Spechler nor Ribeiro considers country-specific
data and neither of them considers as large a
number of different services as we do in this paper. Moreover,
we see the main contribution of this paper not in the
predictions in Fig. 9 but rather in the empirical observation
that collective attention to social media shows highly reg-
ular patterns of growth and decline regardless of region of
origin or cultural background of crowds of Web users.
In this paper, we performed search frequency analysis in
order to gain insights into the dynamics of collective atten-
tion to social media and Web-based businesses. Search fre-
quency analysis is an emerging topic and a quickly growing
literature shows that data available from Google Trends can
lead to novel insights into collective concerns, interests, or
Interested in collective attention to social media, we col-
lected Google Trends data from 45 diﬀerent countries that
show how user interests in 175 social media services evolved
over time. Focusing on general trends, we considered descriptive
data mining techniques and applied economic diffusion
models to search our data set of more than 8,000 time
series for common patterns or distinctive differences.
Diﬀusion models are well established in economics and we
considered their use due to their conceptual simplicity. This
is in contrast to more elaborate approaches such as, say,
Gaussian mixtures or kernel techniques, which yield results
in terms of parameters for which there usually is no physi-
cally plausible counterpart. Diﬀusion models, on the other
hand, are designed to characterize time series in terms of ev-
eryday concepts such as propensities for attention to grow
and to decline, and we note that Occam’s razor suggests
preferring simple explanations whenever available.
Using an eﬃcient algorithm for robust maximum likeli-
hood parameter estimation even under incomplete data, we
fitted the Bass, the shifted Gompertz, and the Weibull diffusion
models and evaluated their performance. Our most
important results can be summarized as follows:
• economic diffusion models provide accurate and statistically
significant explanations of general trends in aggregated
search frequency data which summarize how
collective attention to social media evolves over time.
This capability of diﬀusion models to characterize the data
considered in this study thus suggests that:
• collective attention to social media evolves according
to simple and highly regular dynamics of growth and
decline.
In a comparative analysis w.r.t. individual countries, dif-
ferent continents, or linguistic backgrounds, we found these
patterns to be persistent and conclude that
• collective attention to social media evolves globally
similarly and independent of regions of origin or cultural
backgrounds of crowds of Web users.
Regarding individual services, however, rates of adoption
may vary between countries. Nevertheless, for almost 90%
of the time series in our data set, we found strikingly similar
attention dynamics and it seems that
• most social media services are able to attract growing
collective attention for a period of 4 to 6 years before
user interest inevitably begins to subside.
Finally, because of the way growth dynamics are encoded
in the diﬀusion models studied here, it appears that public
attention to social media hinges on perceived novelty. In
other words, the more a crowd of users gets used to a service
or the less novel it appears, the faster it loses its appeal.
These are the characteristics of hype cycles. The temporal
behavior exposed in our analysis is therefore well in line with
everyday experience and aptly summarized by the statement
that what goes up, must come down.
Our results are of interest to professionals in marketing
and public relations. According to ﬁndings in [34,38] per-
taining to the saliency of query logs for behavioral studies,
data which aggregate the Web search behavior of millions
of people worldwide provide reasonable proxies for public
interests and preferences. The strongly regular patterns we
identiﬁed in time series that served as proxies for the pop-
ularity of social media therefore indicate that interests of
crowds of Web users are surprisingly predictable.
In summary, the models of attention dynamics considered
in this paper provide simple yet reliable and theoretically
well founded tools for Web trend analysis. They thus consti-
tute new baselines for Web intelligence research that targets
socio-economic questions. In particular, they provide baseline
tools that help estimate the future success or customer
adoption of particular services or Web-based businesses.
The work reported in this paper was carried out within the
Fraunhofer / University of Southampton research project
SoFWIReD and funded by the Fraunhofer ICON initiative.
Kristian Kersting was additionally supported by the Fraun-
hofer ATTRACT fellowship “Statistical Relational Activity
[1] A. Acerbi, S. Ghirlanda, and M. Enquist. The Logic of
Fashion Cycles. PLoS ONE, 7(3):e32541, 2012.
[2] C. Artola and E. Galan. Tracking the Future on the
Web: Construction of Leading Indicators using
Internet Searches. Documentos Ocasionales 1203,
Banco de España, 2012.
[3] F. Bass. A New Product Growth Model for Consumer
Durables. Management Science, 15(5):215–227, 1969.
[4] C. Bauckhage. Insights into Internet Memes. In Proc.
ICWSM. AAAI, 2011.
[5] C. Bauckhage, K. Kersting, and F. Hadiji.
Mathematical Models of Fads Explain the Temporal
Dynamics of Internet Memes. In Proc. ICWSM.
[6] C. Bauckhage, K. Kersting, and B. Rastegarpanah.
Collective Attention to Social Media Evolves
According to Diffusion Models. In Proc. WWW. ACM,
[7] C. Bauckhage, K. Kersting, R. Sifa, C. Thurau,
A. Drachen, and A. Canossa. How Players Lose
Interest in Playing a Game: An Empirical Study
Based on Distributions of Total Playing Times. In
Proc. CIG. IEEE, 2012.
[8] A. Bemmaor. Modeling the Diffusion of New Durable
Goods: Word-of-mouth Effect Versus Consumer
Heterogeneity. In G. Laurent, G. Lilien, and B. Pras,
editors, Research Traditions in Marketing, pages
201–229. Springer, 1994.
[9] I. Bordino, S. Battiston, G. Caldarelli, M. Cristelli,
A. Ukkonen, and I. Weber. Web Search Queries can
Predict Stock Market Volumes. PLoS ONE,
[10] T. Britton. Stochastic Epidemic Models: A Survey.
Mathematical Biosciences, 225(1):24–35, 2010.
[11] J. Cannarella and J. Spechler. Epidemiological
Modeling of Online Social Network Dynamics.
arXiv:1401.4208 [cs.SI], 2014.
[12] J. Castle, N. Fawcett, and D. Hendry. Nowcasting Is
Not Just Contemporaneous Forecasting. National
Institute Economic Review, 210(1):71–89, 2009.
[13] H. Choi and H. Varian. Predicting the Present with
Google Trends. Economic Record, 88(S1):2–9, 2012.
[14] L. Christiansen, T. Schimoler, R. Burke, and
B. Mobasher. Modeling Topic Trends on the Social
Web Using Temporal Signatures. In Proc. WIDM.
[15] R. Crane and D. Sornette. Robust Dynamic Classes
Revealed by Measuring the Response Function of a
Social System. PNAS, 105(41):15649–15653, 2008.
[16] Z. Da, J. Engelberg, and P. Gao. In Search of
Attention. J. of Finance, 66(5):1461–1499, 2011.
[17] K. Dietz. Epidemics and Rumors: A Survey. J. of the
Royal Statistical Society A, 130(4):505–528, 1967.
[18] A. Gerow and M. Keane. Mining the Web for the
Voice of the Herd to Track Stock Market Bubbles. In
Proc. IJCAI. AAAI, 2011.
[19] J. Ginsberg, M. Mohebbi, R. Patel, L. Brammer,
M. Smolinski, and L. Brilliant. Detecting Influenza
Epidemics Using Search Engine Query Data. Nature,
[20] L. Gleser and D. Moore. The Effect of Dependence on
Chi-Square and Empiric Distribution Tests of Fit. The
Annals of Statistics, 11(4):1100–1108, 1983.
[21] S. Goel, D. Watts, and D. Goldstein. The Structure of
Online Diffusion Networks. In Proc. EC. ACM, 2012.
[22] L. Granka. Inferring the Public Agenda from Implicit
Query Data. In Proc. SIGIR. ACM, 2009.
[23] J. Hermann, W. Rand, B. Schein, and N. Vedopivec.
An Agent-Based Model of Urgent Diffusion in Social
Media. Technical report, Social Science Research
[24] B. Huberman, P. Pirolli, J. Pitkow, and R. Lukose.
Strong Regularities in World Wide Web Surfing.
Science, 280(5360):95–97, 1998.
[25] R. Jennrich and R. Moore. Maximum Likelihood
Estimation by Means of Nonlinear Least Squares. In
Proc. of the Statistical Computing Section. American
Statistical Association, 1975.
[26] K. Joseph, J. Wintoki, and Z. Zhang. Forecasting
Abnormal Stock Returns and Trading Volume Using
Investor Sentiment: Evidence from Online Search.
Int. J. of Forecasting, 27(4):1116–1127, 2011.
[27] J. Leskovec and E. Horvitz. Planetary-Scale Views on
a Large Instant-Messaging Network. In Proc. WWW.
[28] J. Leskovec, L. Adamic, and B. Huberman. The
Dynamics of Viral Marketing. ACM Trans. Web,
[29] J. Leskovec, L. Backstrom, and J. Kleinberg.
Meme-tracking and the Dynamics of the News Cycle.
In Proc. KDD. ACM, 2009.
[30] C. Lin, B. Zhao, Q. Mei, and J. Han. PET: A
Statistical Model for Popular Events Tracking in
Social Communities. In Proc. KDD. ACM, 2010.
[31] C. Liu, R. White, and S. Dumais. Understanding Web
Browsing Behavior through Weibull Analysis of Dwell
Times. In Proc. SIGIR. ACM, 2010.
[32] D. Luu, E.-P. Lim, T.-A. Hoang, and F. Chua.
Modeling Diffusion in Social Networks Using Network
Properties. In Proc. ICWSM. AAAI, 2012.
[33] N. McLaren and R. Shanbhogue. Using Internet
Search Data as Economic Indicators. Bank of England
Quarterly Bulletin, 51(2):134–140, 2011.
[34] J. Mellon. Search Indices and Issue Salience: the
Properties of Google Trends as a Measure of Issue
Salience. Sociology Working Papers 2011-01,
University of Oxford, 2011.
[35] E. Page. Continuous Inspection Scheme. Biometrika,
[36] B. Ribeiro. Modeling and Predicting the Growth and
Death of Membership-Based Websites. In Proc.
WWW. ACM, 2014.
[37] H. Rinne. The Weibull Distribution. Chapman & Hall
/ CRC, 2008.
[38] J. Teevan, D. Liebling, and G. Geetha. Understanding
and Predicting Personal Navigation. In Proc. WSDM.
[39] F. Wu and B. Huberman. Novelty and Collective
Attention. PNAS, 104(45):17599–17601, 2007.