All content in this area was uploaded by Mohammed Abufouda on Jan 27, 2018
Stochastic Modeling of the Decay Dynamics of
Online Social Networks
Mohammed Abufouda and Katharina A. Zweig
Computer Science Department, University of Kaiserslautern,
Gottlieb-Daimler-Str. 48, 67663, Kaiserslautern, Germany, e-mail: {abufouda,zweig}@cs.uni-kl.de
Abstract The dynamics of online social networks (OSNs) involves a complicated mixture of growth
and decay. In the last decade, many online social networks, like MySpace and Orkut, suffered from
decay until they were too small to sustain themselves. Thus, understanding this decay process
is crucial for many scenarios, including: (1) engineering a resilient network, (2) accelerating
the disruption of malicious network structures, and (3) predicting users’ leave dynamics. In this
work we are interested in modeling and understanding the decay dynamics in OSNs to handle the
aforementioned three scenarios. Here, we present a probabilistic model that captures the dynamics
of the social decay due to the inactivity of the members in a social network. We prove
mathematically that the model is submodular. We provide preliminary results, analyse some
properties of real networks under a decay process, and compare them to the model’s results. The
results show, at the macro level of the networks, that there is a match between the properties of
the decaying real networks and the model.
1 Introduction
Today’s online social networks represent a main source of communication and information exchange
among people all over the world. Many online social networks, like Facebook, Twitter, and LinkedIn,
have proven their usefulness in connecting people and providing a new medium for
sharing news, forming groups of people with the same interests, and eliciting knowledge. The growth
of these networks in terms of user activity shows that they have become a
vital part of today’s human activities. One well-studied aspect of online social network dynamics
is the growth dynamics of a network. The work by Barabási et al. [6] presented a simple model
for understanding the growth dynamics of a network, namely the Preferential Attachment Model
(PAM), which is a rich-get-richer model. Jin et al. [15] noticed that the model by Barabási et al. [6]
and other similar models, like the work by Dorogovtsev et al. [11] for modeling the growth of random
networks, are not suitable to understand the growth dynamics of social networks. Thus, they
provided a model that accounts for the particular features of social networks, namely the absence
of a power-law distribution and a large clustering coefficient [15]. With the availability of online
datasets, Newman [25] empirically studied the growth of social networks, comparing scientific
collaboration networks against the PAM [6]. Bala et al. [5] provided a model of network
formation based on non-cooperative games. Later, Jackson [14] surveyed the models and methods that
were used to capture the network formation process and compared them in terms of stability and
efficiency. Leskovec et al. [21] first showed, on dynamic network data, that networks densify over
time and that their diameters shrink. They also provided another growth dynamics model that was
able to produce networks with these properties. The previous work and the availability of rich
datasets pushed the research
to an in-depth investigation of the properties of networks over time. Kumar et al. [20] studied
the growth of a large social network in terms of network component analysis, Kossinets et al. [18]
studied the tie formation process within social networks, which is affected by internal and external
factors, and Capocci et al. [9] studied the statistical properties of the growth characteristics
of the Wikipedia collaboration social network. Likewise, Backstrom et al. [4] studied empirically how
groups form and evolve over time in MySpace social networks, while Mislove et al. [23]
provided a study of the growth of the Flickr social network. Even though there are many successful
social networks, the evolution of a social network also incorporates decay. In the last decade,
some online social networks were closed after a huge loss or inactivity of their members.
Online social networks, like FriendFeed, Friendster, MySpace, Orkut, and many websites of the
Stack Exchange platform, are now out of service, despite the fact that some of them, e.g., Orkut
and MySpace, showed tremendous growth [2] just a decade ago. The decay of these networks poses
many questions about the reasons behind their downfall. Garcia et al. [12] and Chhabra et al. [10]
studied the static properties of Friendster and MySpace, respectively, in order to understand the
network-related properties of these networks as an example of a decayed network. Recent studies
by Malliaros et al. [22] and Bhawalkar et al. [8] provided theoretical models for understanding the
social engagement in online social networks with a potential to predict social inactivity. Torkjazi
et al. [28] provided an analysis of the MySpace online social network and examined the activity and
inactivity of its users, with some insights about the reasons behind the fall of MySpace. Similarly,
Ribeiro [26] studied the activity and inactivity of users by providing a model that uses the number
of daily active users as a proxy for the dynamics in membership-based websites. Kairam et
al. [16] provided machine learning models to predict community longevity, i.e., how long a
community in an online social network will survive. Another related work by Asur et al. [3]
discussed the persistence and decay of Twitter tweets. While investigating the reasons behind the
inactivity of members of online social networks is not in the scope of this work, some recent
studies proposed some answers [27, 17], suggesting that the main reason behind this decay is the
inactivity of the members of the online social networks.
Building a sound understanding of the decay dynamics of networks requires not only studying the
static properties of these networks but also investigating their dynamics and properties
over time, which is what we are interested in here. As a scenario, we consider the Stack Exchange
websites that were closed after some period of time due to a lack of the activity required to
keep the website alive. The closed websites are an example of social network decay, where we
model the members of a website as the nodes of the network, and an edge exists between any two
nodes if they post, comment on, or answer the same question on the website.
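The network construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of `(user, question_id)` pairs, one per post, comment, or answer), the function name `build_interaction_network`, and the choice of counting shared questions as the edge weight are hypothetical and not taken from the original study.

```python
from collections import defaultdict
from itertools import combinations


def build_interaction_network(events):
    """Build an undirected, weighted co-participation network.

    events: iterable of (user, question_id) pairs (hypothetical input format).
    Returns a dict mapping frozenset({u, v}) to the number of questions
    on which both users were active (an assumed notion of edge weight).
    """
    participants = defaultdict(set)
    for user, question in events:
        participants[question].add(user)

    weights = defaultdict(int)
    for users in participants.values():
        # Every pair of users active on the same question gets an edge.
        for u, v in combinations(sorted(users), 2):
            weights[frozenset((u, v))] += 1
    return dict(weights)


events = [
    ("ann", "q1"), ("bob", "q1"), ("cid", "q1"),
    ("ann", "q2"), ("bob", "q2"),
    ("dan", "q3"),
]
edges = build_interaction_network(events)
```

In this toy input, `ann` and `bob` share two questions, so their edge has weight 2, while `dan` never interacts with anyone and therefore has no edge.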
While we cannot answer why a person starts losing interest in a social network, we can try to
analyze and model the effect of this behavior on other people. Such a model might in turn hint at
the causes of social decay or at least explain some part of it.
In this work, we provide a probabilistic model for understanding the social decay phenomenon in
online social networks. The model presented here can provide insights regarding the effect of a
node’s leave on its neighboring nodes. Our contributions in this work are the following: (1) A
longitudinal network analysis of the Stack Exchange sites showing their decay. (2) A probabilistic
model for social network decay, which is a step-by-step mechanistic model of a node’s leave and the
effect of its leave. (3) A theoretical proof of the submodularity of the model, which makes
optimization viable, e.g., determining the minimal set of nodes to leave the network for
accelerating/decelerating the decay process.
2 Model and notations
A network $G = (V, E)$ is a tuple of two sets $V$ and $E$, where $V$ is the set of nodes and $E$ is
the set of edges such that an undirected edge $e$ is defined as $e = \{u, v\} \in E$, where
$u, v \in V$. As we consider a dynamic system, the notation $G_t$ is a network at time $t$. We assume
that every node $w \in V$ has an initial Leave Probability $\pi_w^{t=0}$, which denotes the
probability of node $w$ leaving the network at time 1, and generally at $t+1$. If a node $w$ did not
leave at $t+1$, i.e., $w \in V(G_{t+1})$, then its current leave probability, $\pi_w^t$, will be
increased depending on its neighbors who left at $t-1$. The tie strength at time $t-1$, representing
some possibly dynamic measure of closeness of a relationship, is denoted by $\delta_{v,w}^{t-1}$ and
assumed to be in $(0, 1]$. The details of this process are described in the following sections.
Definition 1. A dynamic network $G$ is called a "Decaying Network" if $|E(G)_{t-1}| \geq |E(G)_t|$,
$|V(G)_{t-1}| \geq |V(G)_t|$, and $V(G)_t \subseteq V(G)_{t-1}$, $\forall t > 0$.
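As a small illustration of Definition 1, the following sketch checks whether a sequence of snapshots forms a Decaying Network. The snapshot representation (a pair of a node set and an edge set) and the name `is_decaying` are assumptions made for this example only.

```python
def is_decaying(snapshots):
    """Check Definition 1 on a list of (nodes, edges) snapshots.

    nodes is a set of node ids and edges is a set of frozenset({u, v});
    this representation is an assumption of the sketch.
    """
    for (v_prev, e_prev), (v_cur, e_cur) in zip(snapshots, snapshots[1:]):
        nodes_shrink = v_cur <= v_prev            # V_t is a subset of V_{t-1}
        edges_shrink = len(e_cur) <= len(e_prev)  # |E_t| <= |E_{t-1}|
        if not (nodes_shrink and edges_shrink):
            return False
    return True


g0 = ({1, 2, 3}, {frozenset((1, 2)), frozenset((2, 3))})
g1 = ({1, 2}, {frozenset((1, 2))})
g2 = ({1, 2, 4}, {frozenset((1, 2))})  # node 4 joins, so decay is violated

decaying = is_decaying([g0, g1])
not_decaying = is_decaying([g0, g1, g2])
```

The subset condition on the node sets already implies the condition on their sizes, so only the subset test and the edge-count test are needed.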
[Figure 1 panels: network snapshots at t = 0, t = 1, t = 2, t = 3, t = 4, t = 5, and t = 6]
Fig. 1: An illustration of the model. The color of the nodes represents how likely a node will leave in the future, where
white nodes are very unlikely to leave and the level of grayness correlates with the probability to leave. Whenever a
node leaves the network it is marked as black, all its edges are removed, and all of its neighbors get affected by its
leave by increasing their leave probability. The dotted edges are the removed edges.
We assume the model starts with a Decaying Network, i.e., no further nodes or edges are added to
the network. The main idea of the model is shown in Figure 1.
2.1 Probability Gain
At any point of time $t$ where $t > 0$, the node’s leave probability changes from $\pi_w^{t-1}$ to
$\pi_w^t$ by adding the Probability Gain $\Delta\pi_w^t$, and never exceeds the value of 1. Thus, a
node $w$ will leave at time $t+1$ with probability $\pi_w^{t+1}$ such that:
$$\pi_w^{t+1} = \min\{1, \pi_w^{t-1} + \Delta\pi_w^t\} \qquad (1)$$
If a node $w$ did not leave the network at time $t$, then we have two sets: $\Gamma_w^{t-1}$ and
$\overline{\Gamma}_w^{t-1}$, which are the sets of $w$’s neighbors who left and did not leave the
network at $t-1$, respectively.
2.1.1 Probability gain due to one node leave:
We first define the probability gain due to the leave of a single neighbor $v$ of the node $w$ at
time point $t-1$, and then generalize it to all of $w$’s neighbors that left the network,
$\Gamma_w^{t-1}$. The probability gain that a node $w$ will get at $t+1$ due to the leave of its
neighbor node $v$ at $t-1$ is defined as:
$$\Delta\pi_w^{t+1}(v) = 1 - (1 - \pi_v^{t-1})(1 - \delta_{v,w}^{t-1}) \qquad (2)$$
where the edge $e = (v, w) \in E(G)_{t-2}$ and $e = (v, w) \notin E(G)_{t-1}$, as
$v \in \Gamma_w^{t-1}$ and $w \in V(G)_{t-1}$. Thus, the total probability gain produced by the leave
of node $v$ to all of its neighbors which did not leave, $\overline{\Gamma}_v^{t-1}$ (see Figure 2
for an illustration), is given by:
$$\Delta\pi^t(v) = \sum_{w \in \overline{\Gamma}_v^{t-1}} \left[ 1 - (1 - \pi_v^{t-1})(1 - \delta_{v,w}^{t-1}) \right] \qquad (3)$$
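Equations 2 and 3 can be evaluated directly. The following is a minimal sketch with time indices dropped; the function names `single_gain` and `total_gain` and the dictionary layout for tie strengths are hypothetical choices for illustration.

```python
def single_gain(pi_v, delta_vw):
    """Equation 2: gain induced on w by the leave of one neighbor v.

    pi_v is v's leave probability, delta_vw the tie strength in (0, 1].
    """
    return 1.0 - (1.0 - pi_v) * (1.0 - delta_vw)


def total_gain(pi_v, deltas_to_remaining):
    """Equation 3: sum of Equation 2 over v's neighbors that did not leave.

    deltas_to_remaining maps each remaining neighbor w to delta_{v,w}.
    """
    return sum(single_gain(pi_v, d) for d in deltas_to_remaining.values())


# v leaves with pi_v = 0.2; neighbor "a" has a strong tie, "b" a weak one.
gain_a = single_gain(0.2, 0.5)   # 1 - 0.8 * 0.5 = 0.6
gain_b = single_gain(0.2, 0.1)   # 1 - 0.8 * 0.9 = 0.28
gain_all = total_gain(0.2, {"a": 0.5, "b": 0.1})
```

The stronger tie yields the larger gain, which matches the darker shading of strongly tied neighbors in Figure 2.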
[Figure 2 panels: the neighborhood of node $v$ at $t-2$, $t-1$, and $t$]
Fig. 2: This figure shows how a node $v$ affects all of its neighbors when it leaves. At $t-2$, the
node $v$ has a leave probability $\pi_v^{t-2}$, which was gained by $v$’s initial leave probability
$\pi_v^0$ and possible probability gains caused earlier by leaving neighbors, i.e.,
$\pi_v^{t-2} = \pi_v^0 + \sum_{t'=1}^{t-3} \Delta\pi_v^{t'}$. At time $t-1$, the node $v$ leaves the
network, affecting its neighbors by increasing the leave probability of nodes 1, 2, 4, 5. Here we
assume that the tie strength between $v$ and the nodes 1, 2, 5 is greater than the tie strength
between $v$ and 4. That is why the nodes 1, 2, 5 gain more leave probability than node 4, which is
represented by a darker color of nodes 1, 2, 5.
[Figure 3 panels: the neighborhood of node $w$ at $t-2$, $t-1$, and $t$]
Fig. 3: This figure shows how a node $w$ is affected by the leave of its neighbors. At $t-2$, the
nodes 1, 4 have leave probabilities $\pi_1^{t-2}$ and $\pi_4^{t-2}$, respectively, which were gained
by the nodes’ initial leave probabilities $\pi_1^0$ and $\pi_4^0$ and possible earlier probability
gains. At time $t-1$, the nodes 1, 4 leave the network, affecting their neighbors; here we are
interested in the node $w$. The leave of nodes 1, 4 left node $w$ with an increased leave
probability at $t$. Note that nodes 2, 3, 5, 6 are also affected by the leave of 1, 4, but for
simplicity and for visualization traceability we concentrated on node $w$.
2.1.2 Probability gain due to multiple nodes leave:
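We now generalize the probability gain induced by the leave of a single node to capture the impact
of all neighbors that left, i.e., $\Gamma_w^{t-1}$.
$$\Delta\pi_w^t = 1 - \Big[ \underbrace{(1 - \xi_w^{t-1})}_{\text{Assures leave}} \underbrace{\Big( \prod_{u \in \Gamma_w^{t-1}} (1 - \pi_u^{t-1}) \Big)}_{\text{Leave probabilities effect}} \underbrace{\Big( \prod_{u \in \Gamma_w^{t-1}} (1 - \delta_{u,w}^{t-1}) \Big)}_{\text{Tie strength effect}} \Big]$$
$$= 1 - \Big[ (1 - \xi_w^{t-1}) \Big( \prod_{u \in \Gamma_w^{t-1}} (1 - \pi_u^{t-1})(1 - \delta_{u,w}^{t-1}) \Big) \Big] \qquad (4)$$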
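where $\xi_w^{t-1}$ is a ratio of the cardinalities of $w$’s neighbor sets at $t-1$ that equals 1
when all of $w$’s neighbors have left. The quantity $1 - \xi_w^{t-1}$ thus assures that when all of
the neighbors of the node $w$ leave, then the node $w$ will (be forced to) leave too, as it will be
disconnected. Thus, Equation 1 becomes: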
$$\pi_w^t = \min\Big\{1, \pi_w^{t-1} + 1 - \Big[ (1 - \xi_w^{t-1}) \Big( \prod_{u \in \Gamma_w^{t-1}} (1 - \pi_u^{t-1})(1 - \delta_{u,w}^{t-1}) \Big) \Big] \Big\} \qquad (5)$$
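To make Equations 4 and 5 concrete, here is a short numerical sketch. It assumes, as an interpretation rather than a statement of the original definition, that $\xi_w^{t-1}$ is the fraction of $w$'s neighbors that left at $t-1$. The names `multi_gain` and `update_leave_probability` and the argument layout are hypothetical.

```python
import math


def multi_gain(left, n_neighbors):
    """Equation 4 for one node w.

    left: list of (pi_u, delta_uw) pairs, one per neighbor u that left.
    n_neighbors: number of neighbors w had before these leaves.
    Assumption: xi is the fraction of w's neighbors that left.
    """
    if n_neighbors == 0 or not left:
        return 0.0  # nobody left, so there is nothing to gain
    xi = len(left) / n_neighbors
    stay = math.prod((1.0 - p) * (1.0 - d) for p, d in left)
    return 1.0 - (1.0 - xi) * stay


def update_leave_probability(pi_w, left, n_neighbors):
    """Equation 5: add the gain to pi_w and cap the result at 1."""
    return min(1.0, pi_w + multi_gain(left, n_neighbors))


# One of four neighbors leaves: xi = 0.25, product = 0.8 * 0.5 = 0.4.
partial = update_leave_probability(0.1, [(0.2, 0.5)], 4)

# Both neighbors of a degree-2 node leave: xi = 1, so w is forced out.
forced = update_leave_probability(0.1, [(0.2, 0.5), (0.1, 0.3)], 2)
```

In the first case the gain is 1 - 0.75 * 0.4 = 0.7, so the updated probability is 0.8. In the second case the factor $1 - \xi$ vanishes and the probability is capped at 1, which is the disconnection behavior the text describes.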
3 Monotonicity and submodularity
In this section, we show the monotonicity and submodularity properties of the model’s equations.¹
Definition 2. Let $f: 2^V \to \mathbb{R}_{\geq 0}$, where
$\mathbb{R}_{\geq 0} = \{x \in \mathbb{R} \mid x \geq 0\}$, be an arbitrary function that maps subsets
of $V$ to non-negative real values, and let $S \subseteq T \subset V$. Then, the function $f$
is submodular [19] if it satisfies the following inequality:
$f(S \cup \{v\}) - f(S) \geq f(T \cup \{v\}) - f(T)$, where $v \in V \setminus T$.
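Definition 2 can be checked by brute force on a small ground set, which is useful for building intuition. The sketch below is exponential in $|V|$ and is meant only for toy examples; the helper names and the two test functions (a set-coverage function and $|S|^2$) are illustrative choices, not functions from the model.

```python
from itertools import combinations


def all_subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_submodular(f, ground, tol=1e-12):
    """Test f(S + v) - f(S) >= f(T + v) - f(T) for all S within T, v outside T."""
    ground = frozenset(ground)
    subsets = list(all_subsets(ground))
    for T in subsets:
        for S in subsets:
            if not S <= T:
                continue
            for v in ground - T:
                gain_small = f(S | {v}) - f(S)
                gain_large = f(T | {v}) - f(T)
                if gain_small < gain_large - tol:
                    return False
    return True


# Coverage functions are a standard example of submodular functions.
covers = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def coverage(S):
    return len(set().union(*(covers[x] for x in S))) if S else 0


def squared_size(S):
    return len(S) ** 2  # marginal gains grow with |S|, so not submodular


coverage_ok = is_submodular(coverage, covers)
squared_ok = is_submodular(squared_size, covers)
```

The coverage function shows diminishing returns: an element adds fewer newly covered items as the set grows. For $|S|^2$ the marginal gain grows instead, and the check fails.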
Lemma 1 (Order preserving of the probability gain sum). Let $\pi^t = \{\pi_1, \pi_2, \cdots, \pi_n\}$,
where $\pi_i \in \pi^t$ and $\pi_i \in (0, 1]$. Then we have:
$\sum_{\pi_i \in \pi^t} \pi_i \leq \sum_{\pi_i \in \pi^{t+1}} \pi_i$ where $\pi^t \subseteq \pi^{t+1}$,
and the sets $\pi^t$ and $\pi^{t+1}$ are defined like above.
¹ Detailed proofs are provided in an earlier technical paper [1].
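Lemma 2 (Order preserving of the probability gain product). Let
$\pi^t = \{\pi_1, \pi_2, \cdots, \pi_n\}$, where $\pi_i \in \pi^t$ and $\pi_i \in (0, 1]$. Then we
have: $\prod_{\pi_i \in \pi^t} \pi_i \geq \prod_{\pi_i \in \pi^{t+1}} \pi_i$ where
$\pi^t \subseteq \pi^{t+1}$, and the sets $\pi^t$ and $\pi^{t+1}$ are defined like above.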
Theorem 1. The leave probability gain function, Equation 3, is submodular.
The interpretation of the theorem is that the more friends a node $v$ had before leaving, the higher
its total induced leave probability gain.
Theorem 2. The leave probability gain function, Equation 4, is monotone, i.e., for a node $w$ we
have $\pi_w^t \leq \pi_w^{t+1}$ if the node $w$ did not leave the network at $t+1$.
Theorem 3. The leave probability gain function, Equation 4, is submodular.
The theorem states that the more of your friends leave, the less important the others become.
Submodularity entails interesting properties: the minimization problem of a submodular function can
be solved in polynomial time [13], and the maximization problem of a submodular function,
which is an NP-hard problem, can be approximated within a factor of $\alpha = (1 - 1/e)$ using a
greedy algorithm [24].
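The greedy scheme behind the $(1 - 1/e)$ guarantee can be sketched as follows. The guarantee applies to monotone submodular functions maximized under a cardinality constraint $k$. The coverage function used here is a stand-in for illustration and is not the model's gain function; the names `greedy_maximize` and `covers` are hypothetical.

```python
def greedy_maximize(f, ground, k):
    """Pick up to k elements, each time adding the largest marginal gain."""
    chosen = frozenset()
    for _ in range(k):
        candidates = [v for v in sorted(ground) if v not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda v: f(chosen | {v}) - f(chosen))
        chosen = chosen | {best}
    return chosen


# Stand-in monotone submodular objective: number of covered items.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(S):
    return len(set().union(*(covers[x] for x in S))) if S else 0


selection = greedy_maximize(coverage, covers, k=2)
covered = coverage(selection)
```

Here the first pick is `c` (4 new items) and the second is `a` (3 new items), covering all 7 items. In the model's setting, the ground set would be the nodes and the objective the induced leave probability gain, which is the optimization the authors leave for future work.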
4 Results
In this section, we provide the analysis of the decaying Stack Exchange websites and the results
of the model. Figure 4a shows the distribution of the number of user comments for alive and
decayed websites. The figure shows that the decayed websites clearly have different distribution
characteristics, with a low mean and a low standard deviation. A similar behavior is found in
Figure 4b and Figure 4c, which represent the distributions of users’ total received Reputation and
Upvotes, respectively. These two properties reflect the level of knowledge and experience that the
members of a website have. For the decayed websites, it is clear that, on average, the members have
much less reputation and far fewer upvotes than those in the alive websites. Figures 4a, 4b, and 4c
show that there is less social activity in the decayed websites, which may be used as an indication
for studying the future of the alive websites. However, understanding the decay dynamics of the
decayed websites requires a deeper investigation and modeling of the nature of the interaction
among the members. Our approach to better understand what happens during the decay process is
to represent the members’ interactions, like comments, upvotes, and posts,
as networks. Then, we build a network-based model of the decay process.
(a) Comments per user. (b) Reputation per user. (c) Upvotes per user.
Fig. 4: (Color Online) The characteristics of the interaction decay in the decayed and alive websites of the Stack
Exchange platform. The figures show the probability distributions of different types of interactions in these websites.
Markers with bold borders are decayed websites, µ is the mean, and σ is the standard deviation. From the figures
it is clear that the decayed networks have different distribution properties from the alive networks.
Algorithm 1 depicts the steps we followed in our experiments. Line 4 initializes the initial leave
probability $\pi_v^0$, which is a design decision; we selected values from 0.0005 to 0.045 in steps
of 0.0005. For each of these values, the model runs and simulates Equation 4. The update step
in line 13 simulates Equation 5. The output of the algorithm is a large set of graphs that are used
for the analysis. For example, in the case of the Startup Business website, we analyzed more than
200k graphs, with 250 runs for each probability to gain more confidence in the results. The tie
strength was a normalized edge weight, where the weight is the frequency of interaction between two
nodes.
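The experimental setup above can be sketched as follows. The grid of initial leave probabilities follows the stated range and step. The normalization is an assumption: the text does not say how edge weights were normalized, so this sketch divides by the maximum weight, which keeps tie strengths in $(0, 1]$ for positive weights. Both function names are hypothetical.

```python
def initial_probability_grid(start=0.0005, stop=0.045, step=0.0005):
    """All initial leave probabilities from start to stop (inclusive)."""
    count = round((stop - start) / step) + 1
    # Rounding removes floating-point drift from repeated addition.
    return [round(start + i * step, 4) for i in range(count)]


def normalize_weights(weights):
    """Scale interaction frequencies into (0, 1] by the maximum weight.

    Max-normalization is an assumed choice, not taken from the paper.
    """
    top = max(weights.values())
    return {edge: w / top for edge, w in weights.items()}


grid = initial_probability_grid()
runs_per_value = 250
total_runs = len(grid) * runs_per_value

tie_strength = normalize_weights({("ann", "bob"): 2, ("ann", "cid"): 4})
```

The grid contains 90 values, so 250 runs per value amount to 22,500 simulation runs, each of which produces a sequence of graphs.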
Fig. 5: (Color Online) Macro properties of the real networks under decay for the Startup Business site. Panels (a), (b),
and (c) show the degrees of the nodes, the node coreness, and the network density over time, respectively.
In Figure 5 we show the macro properties of the real networks of the Startup Business website over
time. The network evolution shows a clear decay that is represented as a decrease in the number
of nodes. This decrease was associated with a decrease in the average degree of the nodes
over time and also with a decrease of the nodes’ coreness [7]. Another macro measure we used is
the network density. Figure 5c shows an increase in the density over time. This increase is due to
the early leave of the nodes with lower degrees, i.e., the nodes that are part of dense subgraphs
seem to leave the network late.
Now, we will show the results of the model simulation. Figure 6a shows the number of components in
the network over the simulation for different values of $\pi_v^0$. The number of components starts
to increase to a maximum value before it starts to decrease. The reason is that at the beginning the
model starts with a graph consisting of one connected component, and after each step some
nodes are removed due to the leave probability. The leave of some nodes results in a disconnected
graph with more components. The number of these disconnected components increases until they
are composed of only triples or single edges. As a result, a node that leaves
from these triples or from these edges will not increase the number of components anymore.
Figure 6b and Figure 6c show a similar behaviour for the average degree and the average coreness
over time, respectively. The more nodes are removed from the network, the fewer edges remain,
and thus the average degree and the average coreness decrease uniformly over time. This behavior
of the model is similar to the real data presented in Figure 5. The last global measure that we use
is the network density, as shown in Figure 6d. The density of the simulated networks increases over
time for the same reason stated for the real networks in Figure 5. These results show that the model
provides a real-like behaviour of the networks under decay.
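The macro measures discussed above (number of components, average degree, density, and coreness) can be computed with a few lines of plain Python. This is a sketch on an adjacency-dictionary representation chosen for illustration. The coreness routine is a simple peeling procedure and not the O(m) algorithm of [7].

```python
def macro_properties(adj):
    """adj maps each node to the set of its neighbors (undirected graph)."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1          # new component found, explore it fully
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
    return {
        "components": components,
        "avg_degree": 2 * m / n if n else 0.0,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
    }


def coreness(adj):
    """Repeatedly remove a minimum-degree node; its core number is the
    largest minimum degree seen so far."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# A triangle a-b-c with a pendant node d attached to a.
toy = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
stats = macro_properties(toy)
cores = coreness(toy)
```

In the toy graph the pendant node has coreness 1 and the triangle nodes have coreness 2, which illustrates why low-degree nodes leaving first can raise the density of what remains.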
Algorithm 1 Model simulation
1: Input: Graph $G_0$
2: Output: Graphs $= \{G_0, G_1, \cdots, G_{n-1}\}$ where $G_n$ is an empty graph
3: for all $v \in V(G_0)$ do
4:     initialize $\pi_v^0$
5: $t = 0$, $G_t = G_0$, Graphs.add($G_t$)
6: while $G_t$ is not empty do
7:     LeftNodes$_t$ $= \emptyset$
8:     $t = t + 1$
9:     for $v \in V(G_t)$ do
10:        if Leave($v$, $\pi_v^t$) is True then
11:            LeftNodes$_t$.Add($v$)
12:    for all $u \notin$ LeftNodes$_t$ with $\Gamma_u^{t-1} \neq \emptyset$ do
13:        update($\pi_u^t$, $\Gamma_u^{t-1}$)
14:    remove LeftNodes$_t$ from $G_t$
15:    Graphs.add($G_t$)
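A compact Python sketch of Algorithm 1 follows. It is not the authors' implementation and makes several simplifying assumptions, each marked in the comments: $\xi$ is taken as the fraction of a node's neighbors that left in the current step, tie strengths are static, the gain is applied in the same step in which the neighbors leave, and a `max_steps` guard is added for safety. The returned history includes the final empty graph.

```python
import random


def simulate_decay(adj, delta, pi0, rng=None, max_steps=100_000):
    """adj: node -> set of neighbors; delta: frozenset({u, v}) -> tie
    strength in (0, 1]; pi0: initial leave probability for every node."""
    rng = rng or random.Random(0)
    graph = {v: set(nbrs) for v, nbrs in adj.items()}   # working copy
    pi = {v: pi0 for v in graph}                        # lines 3-4
    history = [{v: set(nbrs) for v, nbrs in graph.items()}]  # line 5
    steps = 0
    while graph and steps < max_steps:                  # line 6 (guard assumed)
        steps += 1
        # Lines 9-11: each node leaves with its current probability.
        left = {v for v in graph if rng.random() < pi[v]}
        # Lines 12-13: remaining nodes with leaving neighbors gain probability.
        for u, nbrs in graph.items():
            gone = nbrs & left
            if u in left or not gone:
                continue
            xi = len(gone) / len(nbrs)                  # assumed form of xi
            stay = 1.0 - xi
            for v in gone:
                stay *= (1.0 - pi[v]) * (1.0 - delta[frozenset((u, v))])
            pi[u] = min(1.0, pi[u] + 1.0 - stay)        # Equation 5
        # Line 14: remove the nodes that left, together with their edges.
        graph = {v: nbrs - left for v, nbrs in graph.items() if v not in left}
        history.append({v: set(nbrs) for v, nbrs in graph.items()})  # line 15
    return history


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
delta = {frozenset(e): 0.5 for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
history = simulate_decay(adj, delta, pi0=0.3, rng=random.Random(42))
sizes = [len(snapshot) for snapshot in history]
```

Each snapshot in `history` can be passed to any routine that computes global measures such as density or the number of components, which mirrors how the output graphs are used in the analysis.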
5 Discussion
There are different applications where the model can be utilized.
1. Social network resilience: the resilience against huge disruptions in social networks is not
well studied. We think that the model provides a first step towards engineering a resilient social
network via understanding the decay dynamics of a network.
2. Leave cascade detection: the leave of one member is not as harmful as a cascade of leaves for
the networks that seek growth. The model captures the dynamics of leave cascades by observing the
leave probabilities of the nodes and their increase.
3. Maximizing the leave effect: for a network where a dissolving process is required, like criminal
social networks, the model is able to provide a viable disruption maximization (thanks to the
submodularity property of the model), with insights about the influential members and the effect
of their leave.
Fig. 6: (Color Online) The results of multiple global measures of the model. Figure 6a, Figure 6b, Figure 6c, and
Figure 6d show the number of components, the average degree, the average coreness, and the density of the network
over time for different values of the initial leave probability $\pi_v^0$, respectively. The model started with $G_0$
as the input network and simulated the decay over it.
6 Conclusion
In this work, we presented an empirical analysis of the social decay dynamics of the closed Stack
Exchange websites. The closed websites showed inactivity, which might have caused their decay.
We modeled the interactions between the members of these websites as a network, which enabled us to
build a model to understand the decay dynamics. We then presented a model for capturing the
decay dynamics in social networks. The model is a probabilistic model that assumes that the leave
of social network members affects the leave of their neighbors. In this work we have also presented
some mathematical properties and proved them. We proved that the model’s main equations are
submodular, which makes optimization over the model feasible. Also, we presented
the macro network properties of real networks under decay and compared these results with the
results of the model simulation. The results of the model and the real networks under decay showed
a similar behavior, which supports the potential of the model for different usages. In the future, we
will design the optimization algorithms, study the applicability of the model, and provide
more empirical validation of its properties.
References
1. M. Abufouda and K. A. Zweig. A theoretical model for understanding the dynamics of online social networks
decay. arXiv preprint arXiv:1610.01538, 2016.
2. Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong. Analysis of topological characteristics of huge online social
networking services. In Proceedings of the 16th international conference on WWW, pages 835–844. ACM, 2007.
3. S. Asur, B. A. Huberman, G. Szabo, and C. Wang. Trends in social media: Persistence and decay. Available at
SSRN 1755748, 2011.
4. L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership,
growth, and evolution. In Proceedings of the 12th ACM SIGKDD, pages 44–54. ACM, 2006.
5. V. Bala and S. Goyal. A noncooperative model of network formation. Econometrica, 68(5):1181–1229, 2000.
6. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science,
286(5439):509–512, 1999.
7. V. Batagelj and M. Zaversnik. An O(m) algorithm for cores decomposition of networks. arXiv preprint
cs/0310049, 2003.
8. K. Bhawalkar, J. Kleinberg, K. Lewi, T. Roughgarden, and A. Sharma. Preventing unraveling in social networks:
the anchored k-core problem. SIAM Journal on Discrete Mathematics, 29(3):1452–1475, 2015.
9. A. Capocci et al. Preferential attachment in the growth of social networks: The internet encyclopedia
Wikipedia. Physical Review E, 74(3):036116, 2006.
10. S. S. Chhabra, A. Brundavanam, and S. Shannigrahi. An alternative explanation for the rise and fall of MySpace.
CoRR, abs/1403.5617, 2014.
11. S. N. Dorogovtsev and J. F. F. Mendes. Scaling behaviour of developing and decaying networks. EPL
(Europhysics Letters), 52(1):33, 2000.
12. D. Garcia, P. Mavrodiev, and F. Schweitzer. Social resilience in online communities: The autopsy of Friendster.
In Proceedings of the first ACM conference on Online social networks, pages 39–50. ACM, 2013.
13. S. Iwata, L. Fleischer, and S. Fujishige. A combinatorial strongly polynomial algorithm for minimizing
submodular functions. Journal of the ACM (JACM), 48(4):761–777, 2001.
14. M. O. Jackson. A survey of network formation models: stability and efficiency. Group Formation in Economics:
Networks, Clubs, and Coalitions, pages 11–49, 2003.
15. E. M. Jin, M. Girvan, and M. E. Newman. Structure of growing social networks. Physical Review E, 64(4), 2001.
16. S. R. Kairam, D. J. Wang, and J. Leskovec. The life and death of online groups: Predicting group growth and
longevity. In Proceedings of the fifth international conference on Web search and data mining, pages 673–682.
ACM, 2012.
17. A. A. Kordestani, M. Limayem, E. Salehi-Sangari, H. Blomgren, and A. Afsharipour. Why a few social networking
sites succeed while many fail. In The Sustainable Global Marketplace, pages 283–285. Springer, 2015.
18. G. Kossinets and D. J. Watts. Empirical analysis of an evolving social network. Science, 311(5757):88–90, 2006.
19. A. Krause and D. Golovin. Submodular function maximization. Tractability: Practical Approaches to Hard
Problems, 2012.
20. R. Kumar, J. Novak, and A. Tomkins. Structure and evolution of online social networks. In Proceedings of the
12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 611–617, 2006.
21. J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and
possible explanations. In Proceedings of the eleventh ACM SIGKDD, pages 177–187. ACM, 2005.
22. F. D. Malliaros and M. Vazirgiannis. To stay or not to stay: modeling engagement dynamics in social graphs.
In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pages 469–478,
2013.
23. A. Mislove, H. S. Koppula, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Growth of the Flickr social
network. In Proceedings of the first workshop on Online social networks, pages 25–30. ACM, 2008.
24. G. L. Nemhauser and L. A. Wolsey. Best algorithms for approximating the maximum of a submodular set
function. Mathematics of Operations Research, 3(3):177–188, 1978.
25. M. E. Newman. Clustering and preferential attachment in growing networks. Physical Review E, 64(2):025102,
2001.
26. B. Ribeiro. Modeling and predicting the growth and death of membership-based websites. In Proceedings of the
23rd international conference on World Wide Web, pages 653–664. ACM, 2014.
27. S. Stieger, C. Burger, M. Bohn, and M. Voracek. Who commits virtual identity suicide? Differences in privacy
concerns, internet addiction, and personality between Facebook users and quitters. Cyberpsychology, Behavior,
and Social Networking, 16(9):629–634, 2013.
28. M. Torkjazi, R. Rejaie, and W. Willinger. Hot today, gone tomorrow: on the migration of MySpace users. In
Proceedings of the 2nd ACM workshop on Online social networks, pages 43–48. ACM, 2009.