IDENTIFYING HIGH VALUE CONSUMERS IN A NETWORK:
NETWORK STRUCTURE VERSUS INDIVIDUAL CHARACTERISTICS
Gary J. Russell†
†Sang-Uk Jung is a lecturer of Marketing in the Business School at University of Auckland, New Zealand, Qin
Zhang is an Assistant Professor of Marketing and Gary J. Russell is the Henry B. Tippie Research Professor of
Marketing in the Tippie College of Business at University of Iowa. Corresponding author: Qin Zhang, e-mail:
firstname.lastname@example.org, Ph: 319-335-3125.
Firms are interested in identifying customers who generate the highest
revenues. Traditionally, customers are regarded as isolated individuals whose buying behavior
depends solely on their own characteristics. In a social network setting, however, customer
interactions can play an important role in purchase behavior. This study proposes a spatial
autoregressive model that explicitly shows how network effects and individual characteristics
interact in generating firm revenue. Using model output, we develop a method of identifying
individuals whose purchase behavior most impacts the total revenues in the network. An
empirical study using a user-level online gaming dataset demonstrates that the proposed model
outperforms benchmark models in predicting revenues. Moreover, the proposed value measure
outperforms a variety of benchmark measures in identifying the most valuable customers.
Keywords: Social Network, Social Influence, Spatial Autoregressive Model, Customer
Relationship Management, Customer Value, Online Games.
Firms have long been interested in identifying and predicting high value customers. In
traditional customer relationship management (CRM), customers are often viewed as isolated
individuals whose purchase behaviors are solely determined by their own characteristics (e.g.,
demographic and/or behavioral characteristics). In a network setting where customers are
connected and communicate with each other, however, the purchases of a customer can also be
influenced by other customers in the network.
Clearly, not all customers have the same degree of influence on others in a network. It is
vital for marketing managers to identify key customers who have greater influence on others’
behavior and to understand how behavioral contagion spreads through networks. Because of
spillover effects, focusing marketing efforts on influential customers can increase the magnitude
and speed of diffusion of marketing effects in a way not possible in traditional marketing. This,
in turn, provides opportunities for improvement in the return on marketing investment (ROMI).
Traditionally, customer influence is inferred from the structure of the network (Burt 1987).
This amounts to measuring the influence of a customer based on how well he/she is connected
with others. Recent studies (Trusov, Bodapati and Bucklin 2010) have argued that the network
connections of an individual may not directly translate into his/her impact in influencing the
purchase behavior of others, i.e., the purchases that generate revenues for a firm. Instead, they
propose to measure an individual’s influence based on how an individual’s behavioral outcomes
affect those of others in the network. We agree with this view and consider that it is necessary to
account for both network structure and purchase behavior in measuring the influence of a
customer on others in a network.
In this paper, we extend this view to customer valuation in a network setting,
where not only the influence of a customer on others’ contributions but also the customer’s own
contribution to a firm should be taken into account. Specifically, we argue that when measuring
customer value in a network it is important to simultaneously account for both network structure
and individual (demographic and/or behavioral) characteristics. A measure that accounts
for both aspects can help firms better understand the value of their customers, enabling them to
compare the value of customers who show different merits: for example, customers who are well
connected in the network but contribute little to the firm’s revenues on their own versus
customers who are important individual contributors to the firm’s revenues but are not well
connected to other customers.
For this purpose, we first propose a spatial autoregressive (SAR) model to explicitly
examine how network structure and customers’ characteristics interact in generating revenues for
a firm. The spatial weight matrix of the SAR model, which represents the social network, is
constructed in such a way that potential asymmetric influential relationships between connected
customers are accommodated and real-valued parameter estimation is ensured. Using the model
output, we construct a measure of customer valuation that takes into account both
network effects and individual characteristics. This measure enables us to identify customers
who have greater impact on the total revenues of a network.
In our empirical application, we take advantage of data on a virtual social
network – an online gaming community. We obtained this unique user-level dataset from a
popular online gaming company in Korea. The dataset contains information about individual
players’ characteristics – demographics and game-playing behaviors in the virtual game
environment, the behavioral outcomes of these players - the revenues they generated for the
company, and information about the social network – communications between players. These
information sources provide great opportunities for us to study how member characteristics and
network linkages interact to generate behavioral outcomes.
We estimate the proposed model using the online gaming data. The estimated social
network effect is in line with what is found in the literature, and other parameter estimates are
also consistent with our expectations. We also estimate two benchmark models - a model that
only accounts for the individual characteristics but ignores the network effects and a model that
only accounts for the network effects but ignores the effects of individual characteristics on
customer revenues. The proposed model outperforms the benchmark models for both calibration
and holdout samples based on standard fit criteria.
Next, we conduct policy simulations to demonstrate how the proposed customer value
measure can be used to help firms identify high value customers in a network. We compare the
proposed measure with a variety of benchmark measures that are commonly used to evaluate
customer value in a network. Our proposed measure outperforms all the benchmark
measures, and does so to a great extent. Consistent with the findings by Trusov, Bodapati and
Bucklin (2010) and Stephen and Toubia (2010), our results support the view that knowledge of
network structure alone is not sufficient in identifying the most valuable customers in a network.
The rest of the paper is organized as follows. We first discuss pertinent previous literature
and position our work relative to this literature. Next, we describe the proposed model and the
proposed customer valuation measure, and apply the theory empirically. We conclude by
discussing the applicability of our proposed measure in more general network settings and
outline opportunities for future research.
There is a growing body of research on social influence and social networks in
marketing. While there are many studies in the literature about the diffusion of
innovations (Iyengar, Van den Bulte and Valente 2010; Watts and Dodds 2007; Argo, Dahl and
Morales 2006, 2008; Goldenberg et al. 2009), the effects of word-of-mouth (Trusov, Bucklin and
Pauwels 2009; Godes et al. 2005; Godes and Mayzlin 2004) and joint group decision making
(Hartmann 2010), we focus attention here on modeling interdependent decisions of individuals
and identifying an individual’s impact on revenue generation to firms.
Identifying Individual Influence in a Network
Social network analysis (SNA) has emerged as a key technique to measure an
individual’s influence in a network in the quantitative sociology area. It has been widely used to
analyze social networks in various disciplines such as sociology, economics, physics, computer
science and marketing. SNA measures are grounded on the basic assumption that people
occupying important positions in a network tend to have greater access to relevant
resources and tend to have more influence on others (Freeman 1979; Keller and Barry 2003).
Various measures have been suggested, such as centrality, structural equivalence and structural
holes (Burt 1987, 1992). For example, centrality measures such as degree, betweenness and
closeness represent the social power of an individual based on how well he or she is connected with
others in the network. Thus, the importance of an individual can be inferred from his or her
location in the network (Bavelas 1950; Beauchamp 1965; Freeman 1977; Opsahl, Agneessens
and Skvoretz 2010).
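As an illustration of how such structural measures are computed, the following is a minimal sketch of normalized degree and closeness centrality. The toy network `toy` and the function names are hypothetical and are not taken from the study's data; closeness is computed with the common (reachable nodes) / (sum of shortest-path lengths) convention, which is one of several definitions in use.

```python
from collections import deque


def degree_centrality(adj):
    """Normalized degree: share of the other n - 1 nodes each node is tied to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """(number of reachable nodes) / (sum of shortest-path lengths), via BFS."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical undirected network: a hub "a" tied to b, c, d, plus a tie c - e.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "e"},
    "d": {"a"},
    "e": {"c"},
}

degree = degree_centrality(toy)
closeness = closeness_centrality(toy)
most_central = max(closeness, key=closeness.get)
```

Both measures depend only on who is tied to whom; neither uses any information about what the individuals actually purchase.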
The notion that individuals’ influence on others can be measured using their location in a
network has also been discussed in marketing (e.g., Iacobucci 1990, 1996, 1998; Iacobucci and
Hopkins 1992; Van Den Bulte and Wuyts 2007). Social networking sites allow researchers to
gather relational data among individuals, such as connections of friendship. Because these links
are easily observable by the firm and researchers, it is tempting to apply SNA directly to infer a
person’s importance in the network to the firm. However, measuring customer value from a
revenue creation perspective is more challenging because little is known about how network
connections are translated into revenues for the firm.
The connection between network structure and behavior has been explored in several
studies. Trusov, Bodapati and Bucklin (2010) found that having many links (high degree) does
not make users influential in terms of revenue creation. They illustrate the potential for large
gaps in financial returns to the firm from using model-based estimates of influence versus count
of connections. Also, Stephen and Toubia (2010) found that the sellers who benefit the most
from the network are not necessarily those who are central to the network, but rather those whose
accessibility is most enhanced by the network. Iyengar, Han and Gupta (2009) attempted to
quantify social influence in terms of purchase probability and revenues at the individual level
using actual purchase data in a social networking site. They found significant heterogeneity in
social influence, in that highly connected people tend to be negatively influenced by their friends’
purchases, whereas people with moderate connections are positively influenced by such
purchases. All these studies make the larger point that knowledge of network structure per se
does not provide sufficient information to predict behavioral outcomes.
Modeling Interdependent Behaviors in a Network
Choosing the correct model of interpersonal influence plays a key role in measuring
individual impact in a network. Individual decision-making models can be categorized into two
major types: linear-in-means, and spatial econometrics. The linear-in-means model has often
been used in social econometrics (Manski 1993). Spatial econometrics, originally developed in
the academic geography literature, is widely applied to social network research in sociology
(Leenders 2002). These two types of models share the basic assumption that an
individual’s preferences or behaviors are a function of others’ preferences or behaviors.
However, there are several key differences.
The linear-in-means model assumes that the outcome of each individual in a group is
linearly dependent upon the average outcomes and characteristics of his or her reference group
(Manski 1993). In this domain, individual-level variables are typically aggregated into group-
level measures (Hartmann et al. 2008), and significant group-level variables are interpreted as
the presence of neighborhood effects. One basic assumption of this model is that the social
influence on different people in the same group is the same. Since the pioneering work by
Datcher (1982), much of the empirical literature on social interactions in econometrics has
involved extending the general form of the linear-in-means model (Solon 1999, Durlauf and
Seshadri 2003, Graham and Hahn 2005). However, these patterns of social interactions are
highly specialized and cannot be generally applied to all social networks.
In contrast, the spatial econometrics approach is more flexible (LeSage and Pace 2009).
These models focus on the microstructure of interactions among individuals and allow for the
heterogeneity of interactions across pairs of individual actors. Depending on the theory and the
empirical applications, the interdependence in a network can be represented in two different
ways. First, an individual’s behavioral outcome may depend directly upon the outcomes of
others and thus in proportion to their influence. This model, called a spatial autoregressive (SAR)
process, is formalized by including a lagged dependent variable as an additional predictor.
Second, interdependence may be modeled through error terms. This may occur when the
observed dependence does not reflect a truly causal effect (such as homophily or unobserved
Research Framework of This Study
In this research, we adopt a spatial autoregressive (SAR) model to investigate interactions
in a social network. The framework has three key advantages. First, it assumes that individuals
who are near each other are more related than individuals who are distant. Second, it allows for
heterogeneous interactions across different pairs of individuals. Third, it implies a causal link
between individual behavior and the behavior of others in the network. Because the model
allows for spillover and magnified effects across individuals (Anselin 1988; Kelejian, Tavlas
and Hondroyiannis 2006), it is possible to infer the relative influence of different individuals on
overall network outcomes.
MODELING CUSTOMER VALUE IN A NETWORK
Customer value in a network is defined as the impact of a particular customer’s actions
on the total behavior of a network. We begin by proposing a spatial autoregressive (SAR) model
and discussing how properties of the SAR model specification are useful in the social network
setting. Using this specification, we show how the SAR model structure can be manipulated to
yield an easily-computed measure of customer value.
Spatial Autoregressive (SAR) Model
Drawing upon the spatial statistics literature (LeSage and Pace 2009), we propose to use
a spatial autoregressive (SAR) model to explore how a customer’s own characteristics interact
with the purchases of others in the network in generating revenues for a firm. Let Y denote an
$N \times 1$ vector of revenues generated by N customers in a customer network. We can describe the
network revenues using a SAR model as
$$Y = \rho W Y + X\beta + \varepsilon \tag{1}$$
where X is an $N \times k$ matrix that denotes the N customers’ own (k) characteristics (such as
demographic and behavioral characteristics) that affect purchase behavior, $\beta$ is the $k \times 1$
parameter vector, and $\varepsilon$ is an $N \times 1$ vector of errors, assumed to be normally
distributed, i.e., $\varepsilon \sim N(0, \sigma^2 I)$. Network effects are represented by the term $\rho W Y$. Here, W is
an $N \times N$ spatial weight matrix that represents the network connections between customers and $\rho$ is a
spatial lag parameter. The parameter $\rho$ measures the degree of overall interdependence of
purchases among customers in the network. We call $\rho$
the social influence parameter.
The autoregressive term $WY$ can be understood as a weighted sum of revenues of other
customers in the network. Assuming that the matrix $(I - \rho W)$ is invertible, the model in (1) can be
rewritten as
$$Y = (I - \rho W)^{-1} X\beta + (I - \rho W)^{-1} \varepsilon \tag{2}$$
The first term on the right-hand side of (2), $(I - \rho W)^{-1} X\beta$, can be interpreted as the expected
value of revenues, given individual characteristics X and the network structure W. Further,
assuming that $|\rho| < 1$, the matrix inverse in (2), $(I - \rho W)^{-1}$, can be expanded in an infinite
power series (Debreu and Herstein 1953) as
$$(I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \rho^3 W^3 + \cdots \tag{3}$$
where $W^m$ is the mth-order neighbor matrix, measuring the extent to which any two individuals
can be reached in m relational jumps. This expression implies that actions of each customer are
magnified across the network due to a spillover pattern dictated by the network structure.
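The power series expansion can be checked numerically. The sketch below assumes a hypothetical four-customer network (the matrix `C` is invented for illustration) with a row-standardized weight matrix, and uses the social influence estimate of 0.388 reported later in the paper as the value of the spatial lag parameter. It compares a truncated series with the exact matrix inverse.

```python
import numpy as np


def neumann_partial_sum(W, rho, order):
    """Return I + rho*W + rho^2*W^2 + ... + rho^order * W^order."""
    n = W.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for _ in range(order):
        term = rho * (term @ W)  # next power: rho^m * W^m
        total = total + term
    return total


# Hypothetical symmetric connections among four customers (zero diagonal).
C = np.array(
    [
        [0.0, 1.0, 1.0, 0.0],
        [1.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
)
W = C / C.sum(axis=1, keepdims=True)  # row standardization: each row sums to 1
rho = 0.388

exact = np.linalg.inv(np.eye(4) - rho * W)
approx = neumann_partial_sum(W, rho, order=60)
max_gap = np.abs(exact - approx).max()
```

Because every row of W sums to one, each row of the inverse sums to 1 / (1 - rho), which gives a simple way to see how strongly a unit change is magnified once all orders of neighbors are counted.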
More generally, equation (2) implies that the impact of each customer on expected
revenues is an interaction between personal characteristics and network structure. Stated
intuitively, the impact of a customer on network revenues depends on both who the customer is
and where he/she is located in the network.
Representing the Network
Constructing the social network is a critical decision in modeling interdependence among
individuals. A social network is typically represented by the contiguity/adjacency matrix C, in
which each element indicates the strength of the relationship between the two
individuals represented by the corresponding row and column. While the contiguity/adjacency
matrix in spatial econometrics is often constructed using geographical proximity (Manchanda,
Xie and Youn 2008; Nam, Manchanda and Chintagunta 2008; Bell and Song 2007), it can also
be constructed using socio-demographic or geo-demographic similarity (Strang and Tuma 1993;
Robins, Pattison, and Elliott 2001), self-reported relationships (Iyengar, Van den Bulte and
Valente 2010; Nair, Manchanda and Bhatia, 2010) or observed friendships (Iyengar, Han and
Gupta 2009; Trusov, Bodapati and Bucklin 2010; Trusov, Bucklin and Pauwels 2009).
In this research, we use the communication among network members to represent the
network, which takes into account both interactions as well as the connections between members.
We first construct a symmetric contiguity matrix C. A typical element, $C_{ij}$, represents the
degree of strength in the connection between customer i and customer j. We assume that
$$C_{ij} = S_{ij} B_i B_j \tag{4}$$
where $S_{ij}$ represents the communication between i and j; $B_i$ represents the total communication
of i with all other customers in the network; $B_j$ represents the communication of j with all other
customers in the network. The spatial weight matrix W is then generated by row standardizing
the contiguity matrix C (so that all rows in W sum up to unity). Following conventional
practice in the spatial statistics literature, all diagonal elements of C (and consequently W) are set
to zero.
The matrix W in our model has two key features. First, W is in the form of a quasi-symmetric
matrix. This ensures that the SAR model is appropriately specified, leading to a real-valued
estimate of the spatial lag parameter $\rho$ (Bhatia, Kittaneh and Li 1998). Second, W
incorporates a type of dominance pattern into the spatial weight structure, allowing for
asymmetric influential relationships between members that may exist in a social network. A
typical element in W, $W_{ij}$, represents the influence of individual j on individual i in the network
and is proportional to $S_{ij} B_j$, i.e., $W_{ij} = S_{ij} B_j / \sum_k S_{ik} B_k$, which results from the row
standardization of the contiguity matrix C. The term $S_{ij} B_j$ indicates that the influence depends
not only on the number of communications between individual i and j (represented by $S_{ij}$) but
also on the total number of communications of individual j (represented by $B_j$). In other words,
the influence is determined by how active individual j is in the network, as well as how much
interaction exists between individual i and j. Therefore, individuals who are more active in the
network tend to have larger weights, thus greater influence, than those who are less active.
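The construction of the weight matrix can be sketched as follows, assuming the contiguity element takes the product form $C_{ij} = S_{ij} B_i B_j$ and that $B_i$ is the row total of the symmetric communication matrix S. The function name `build_weight_matrix` and the three-player message counts are hypothetical.

```python
import numpy as np


def build_weight_matrix(S):
    """Build the contiguity matrix C and the row-standardized weight matrix W.

    S is a symmetric matrix of communication counts; its diagonal is ignored.
    """
    S = np.array(S, dtype=float)
    np.fill_diagonal(S, 0.0)          # no self-connections
    B = S.sum(axis=1)                 # assumed: total communication of each customer
    C = S * np.outer(B, B)            # C_ij = S_ij * B_i * B_j (symmetric)
    row_sums = C.sum(axis=1, keepdims=True)
    # Row standardization; isolated customers (zero row sum) keep a zero row.
    W = np.divide(C, row_sums, out=np.zeros_like(C), where=row_sums > 0)
    return C, W


# Hypothetical message counts among three players.
S_toy = [
    [0, 5, 1],
    [5, 0, 2],
    [1, 2, 0],
]
C_toy, W_toy = build_weight_matrix(S_toy)
eigenvalues = np.linalg.eigvals(W_toy)
```

In this toy case the influence of player 1 on player 0 differs from the influence of player 0 on player 1 even though C is symmetric, and the eigenvalues of W remain real because W is a positive diagonal rescaling of a symmetric matrix.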
The SAR model can be regarded as a simultaneous set of regression models (one for each
customer) that are interlocked: the purchases of customer i impact customer j, and vice versa.
Parameters are estimated using maximum likelihood procedures that take into account the special
structure of the SAR model (LeSage and Pace 2009).
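One standard way to carry out maximum likelihood estimation for a SAR model is to concentrate the likelihood over the spatial lag parameter and search over a grid. The sketch below illustrates that general approach on synthetic data. It is not the authors' estimation code, it ignores the instrumental-variable correction discussed later, and all names and simulated values are assumptions made for illustration.

```python
import numpy as np


def sar_concentrated_loglik(rho, Y, X, W):
    """Log-likelihood of Y = rho*W*Y + X*beta + eps with beta, sigma^2 profiled out."""
    n = len(Y)
    A = np.eye(n) - rho * W
    AY = A @ Y
    beta, *_ = np.linalg.lstsq(X, AY, rcond=None)   # OLS of (I - rho W)Y on X
    resid = AY - X @ beta
    sigma2 = resid @ resid / n
    _, logdet = np.linalg.slogdet(A)                # ln|I - rho W| (Jacobian term)
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0) + logdet
    return loglik, beta, sigma2


def fit_sar_grid(Y, X, W, grid=None):
    """Pick the rho on the grid with the highest concentrated log-likelihood."""
    grid = np.linspace(-0.9, 0.9, 91) if grid is None else grid
    results = [(rho, *sar_concentrated_loglik(rho, Y, X, W)) for rho in grid]
    rho_hat, loglik, beta_hat, sigma2_hat = max(results, key=lambda r: r[1])
    return rho_hat, beta_hat, sigma2_hat, loglik


# Synthetic network and data (illustrative only).
rng = np.random.default_rng(0)
n = 200
A_net = (rng.random((n, n)) < 0.02).astype(float)
A_net = np.triu(A_net, 1)
A_net = A_net + A_net.T
idx = np.arange(n)
A_net[idx, (idx + 1) % n] = 1.0                     # ring guarantees no isolated node
A_net[(idx + 1) % n, idx] = 1.0
W_sim = A_net / A_net.sum(axis=1, keepdims=True)

X_sim = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
rho_true = 0.4
eps = rng.normal(scale=0.2, size=n)
Y_sim = np.linalg.solve(np.eye(n) - rho_true * W_sim, X_sim @ beta_true + eps)

rho_hat, beta_hat, sigma2_hat, loglik_hat = fit_sar_grid(Y_sim, X_sim, W_sim)
```

The log-determinant term is what distinguishes this from ordinary least squares on a lagged regressor; dropping it would bias the estimate of the spatial lag parameter.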
Measuring Customer Value in a Network
We define customer value in a network as the impact of an individual customer on the
total revenues of the network. The general strategy is to first fit the proposed SAR interaction
model to network data and then to use model output to construct individual measures of
customer value. The proposed measure takes into account not only the effects of a customer’s
characteristics on his/her own revenue contribution, but also how those characteristics interact
with the network effect and affect others’ revenue contributions. In other words, the measure
accommodates both the spillover effect - any change of a customer’s characteristics will affect
the revenues of other customers - and the magnified effect - any change of characteristics of
other customers will affect the revenue of the focal customer.
Assume that a firm has N customers and the goal is to choose the customers on whom an
intervention, such as a targeted promotion, is implemented. The objective is to maximize the total
revenues of the whole network with the intervention. Based on equation (2), we define the
long-run mean revenue of customers as
$$\mu = (I - \rho W)^{-1} X\beta \tag{5}$$
where $\mu$ is an $N \times 1$ vector, and $\rho$, $W$, $X$, and $\beta$ are as described for equation (1) earlier. This is the
expected revenue of each customer taking into account their own characteristics X and the
spillover effects due to the purchases of other customers.
Suppose now the firm implements an intervention on customers, which yields a direct
increase in spending for each of the customers before they re-enter the network. This direct
increase represents customers’ individual responses to the intervention. It is determined by
customers’ characteristics and is a function of X. We denote it as an $N \times 1$ vector, $\delta$. The ith
element of $\delta$ represents the direct increase in spending by customer i. The updated revenues after
the intervention can be written as
$$Y^{*} = \rho W Y^{*} + X\beta + \delta + \varepsilon \tag{6}$$
Thus, the distribution of revenues after an intervention can be written as $Y^{*} \sim N(\mu^{*}, \Sigma)$, where
$$\mu^{*} = (I - \rho W)^{-1}(X\beta + \delta), \qquad \Sigma = \sigma^2 (I - \rho W)^{-1} [(I - \rho W)^{-1}]' \tag{7}$$
Using this analysis, we define the impact of an intervention on the whole network as the
difference between the total mean revenues across all customers before and after an intervention.
It can be written as
$$\mathbf{1}'(\mu^{*} - \mu) = \mathbf{1}'(I - \rho W)^{-1} \delta \tag{8}$$
where $\mathbf{1}$ is an $N \times 1$ vector of ones. Thus, the increase in system revenues due to an intervention
depends upon the direct increase in revenues $\delta$, which is a function of X, and the network
structure defined by W. By setting the elements of $\delta$ to 0 for all customers except customer i, we can use this
expression to measure the value of each customer.
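The cancellation of the baseline term in the before/after comparison can be illustrated numerically. The sketch below assumes a hypothetical four-customer network; `Xb` stands for the individual-characteristics component of revenue and `delta` for the direct spending increase, both invented for illustration.

```python
import numpy as np


def mean_revenue(W, rho, Xb, delta=None):
    """Expected revenue vector: solve (I - rho W) mu = Xb (+ delta if given)."""
    n = W.shape[0]
    rhs = Xb if delta is None else Xb + delta
    return np.linalg.solve(np.eye(n) - rho * W, rhs)


def intervention_impact(W, rho, Xb, delta):
    """Total mean revenue after the intervention minus total mean revenue before."""
    after = mean_revenue(W, rho, Xb, delta)
    before = mean_revenue(W, rho, Xb)
    return (after - before).sum()


# Hypothetical network, baseline revenue component, and direct increases.
C = np.array(
    [
        [0.0, 1.0, 1.0, 0.0],
        [1.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
)
W = C / C.sum(axis=1, keepdims=True)
rho = 0.388
Xb = np.array([3.0, 1.5, 2.0, 0.5])
delta = np.array([10.0, 0.0, 5.0, 0.0])

impact = intervention_impact(W, rho, Xb, delta)

# Closed form: ones' (I - rho W)^{-1} delta, which does not involve Xb at all.
closed_form = np.ones(4) @ np.linalg.solve(np.eye(4) - rho * W, delta)
```

With a positive spatial lag parameter the total impact exceeds the sum of the direct increases, which is the magnification attributed to the network.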
Selecting Customers for Intervention
In any practical application, we need to select a group of customers (not just one) in such
a way that spillover and magnified effects across customers in the network are taken into account.
For this purpose, we assume that the firm plans to select a group of m customers to implement an
intervention and the direct increase in spending yielded from the intervention is $z \circ \delta$ (the
element-wise product of $z$ and the vector of direct spending increases $\delta$), where
the ith element of the vector $z$, $z_i$, is defined as:
$$z_i = \begin{cases} 1, & \text{if customer } i \text{ is selected for the intervention} \\ 0, & \text{otherwise} \end{cases} \tag{9}$$
Thus, we can rewrite equation (8) for the total network impact from the selected
customers as
$$\mathbf{1}'(I - \rho W)^{-1}(z \circ \delta) = \sum_{i=1}^{N} z_i \delta_i v_i \tag{10}$$
where $v_i$ is the sum of all elements across the ith column of $(I - \rho W)^{-1}$.
From equation (10), it can be seen that to maximize the total impact of the intervention
using m customers, the firm can first sort all customers in descending order by their respective
values of the product $\delta_i v_i$, and then choose the top m customers from this list. Given this logic,
we propose to use the product $\delta_i v_i$ to measure the value of each customer in a network. We
call this proposed measure of customer value in a network Customer Network Value.
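The selection rule can be sketched as follows. The notation is an assumption made for this illustration: `delta` holds each customer's direct spending increase and `v` holds the column sums of the inverse matrix, so that the score of customer i is the product of the two. The random network and the brute-force comparison are included only as a sanity check on a small hypothetical example.

```python
from itertools import combinations

import numpy as np


def customer_network_value(W, rho, delta):
    """Score each customer by delta_i times the i-th column sum of (I - rho W)^{-1}."""
    n = W.shape[0]
    # Solving the transposed system yields all column sums without forming the inverse.
    v = np.linalg.solve((np.eye(n) - rho * W).T, np.ones(n))
    return delta * v


def select_top_m(cnv, m):
    """Indices of the m customers with the largest scores."""
    return np.argsort(-cnv, kind="stable")[:m]


def brute_force_best_impact(W, rho, delta, m):
    """Largest total network impact over every possible group of m customers."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    best = -np.inf
    for group in combinations(range(n), m):
        z = np.zeros(n)
        z[list(group)] = 1.0
        best = max(best, np.linalg.solve(A, z * delta).sum())
    return best


# Hypothetical seven-customer network with random symmetric tie strengths.
rng = np.random.default_rng(1)
n, m, rho = 7, 3, 0.388
C = rng.uniform(0.1, 1.0, size=(n, n))
C = np.triu(C, 1)
C = C + C.T
W = C / C.sum(axis=1, keepdims=True)
delta = rng.uniform(1.0, 10.0, size=n)

cnv = customer_network_value(W, rho, delta)
chosen = select_top_m(cnv, m)
impact_of_chosen = cnv[chosen].sum()
```

Because the total impact is additive in the per-customer scores, sorting is sufficient and no combinatorial search is required; the brute-force routine exists only to confirm this on a small case.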
We apply the proposed model to a unique user-level dataset from the largest online game
publisher and developer in Korea. The dataset contains information about demographics and
gaming behaviors of 1000 game players, interaction between players, and revenues generated by
individual players during a six-month period from March 2010 to August 2010.
The game in the dataset was first released in Korea in July 2009. The game has been
quite successful in the market. Currently, there are 0.3 million registered users and
approximately 20,000 active players. This game is a typical Massively Multiplayer Online Role-
playing Game (MMORPG), in which a large number of players interact with one another in a
virtual gaming world. MMORPG is distinguished from other online games such as single-player
or small multi-player role-playing games by the number of players and by the continuity of the
virtual world that continues to evolve after a player exits from the game environment.
Player interaction is one of the most important aspects of MMORPG. For example, some
of the gaming tasks, such as coordinated combat and hunting the monsters in the virtual world,
require a number of players to work together in order to successfully complete those tasks.
Another form of player interaction is socialization through “guilds,” groups of like-minded
players who bond together to achieve common goals in the game. A typical guild has a
hierarchical organization structure that consists of guild masters and different levels of officers
who are elected to manage the guild. Decisions like new member recruitment and the distribution
of special items often result from interactions among guild members. The
significance of player interaction in this game makes it an appropriate empirical
setting in which to apply our proposed model and study network influence.
The company which operates this game has two revenue sources from the game: virtual
item purchases by the players, and payments from internet cafés. Virtual items (e.g., virtual
armor, virtual weapons, and avatar decoration) are designed to enhance players’ gaming
experience and are purchased with real money. Playing the game at players’ own homes is free,
but some people choose to play the game at internet cafés, which are particularly popular in Asia.
Internet cafés charge players by the length of their playtime, and the gaming company gets a
share of the revenues by providing participating internet cafés the access to the game. In our
dataset, an internet café pays about 4.15 won (the Korean currency) per minute per player to the
company to access the game. About 60% of the total revenue of the company is from virtual item
purchases while the remaining 40% is from internet café revenue-sharing.
Because our objective is to predict total revenue of a player, we combine the two sources
of revenue to construct a revenue measure as the dependent variable as follows:
$$Y_i = \ln(P_i + 4.15\,T_i + 1),$$
where $Y_i$ is the total revenue earned by player i,
$P_i$ is the spending on virtual items by player i, and
$T_i$ is the total time (in minutes) that i spends at internet cafés to play the game.
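As a worked example of this revenue measure, the sketch below assumes the log form ln(item spending + 4.15 × café minutes + 1), where the added 1 keeps the measure defined for players who generate no revenue. The function name and the example amounts are hypothetical.

```python
import math

WON_PER_CAFE_MINUTE = 4.15  # per-minute payment from internet cafés, as stated in the text


def log_revenue(item_spend_won, cafe_minutes):
    """Log of total revenue: virtual item spending plus café revenue share, plus 1."""
    total = item_spend_won + WON_PER_CAFE_MINUTE * cafe_minutes
    return math.log(total + 1.0)


# Hypothetical player: 1,000 won on virtual items and 100 minutes in internet cafés.
example = log_revenue(1000.0, 100.0)   # ln(1000 + 415 + 1)
no_revenue = log_revenue(0.0, 0.0)     # ln(1) for a player with no spending
```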
The data also contains the information about players’ demographics, such as age and
gender. Though the data does not contain the information about household income of a player
directly, we are able to combine the address information given by a player when he/she first
registered to play the game and the census information to obtain the average house value at the
zip code level. This serves as a proxy for the household income level.
Game Playing Behavior
There are 13 variables in the data that describe the players’ game-playing behaviors. We
describe these variables in Table 1. To eliminate the potential multicollinearity problem that may
exist among these variables (we report the correlations between the variables in Table 2), we
conduct a principal component analysis (PCA) with varimax rotation. We report the
factor loading matrix from the PCA in Table 3. According to the correlations between each
factor and variable shown in the factor loading matrix, we determine that four factors can be
used to best describe the information contained in the original 13 variables:
1. Guild Activity represents a player’s activities related to the guild. The factor contains
most of the information in GuildSize, GuildWar and GuildMem.
2. Game Status represents the gaming level of a player. The factor contains most of the
information in ExpScore, RepLevel, RoleLevel, TimeLevel and Master.
3. Action on Other Players represents a player’s activities related to playing with/against
other players in the game. The factor contains most of the information in Wanted, PK
4. Action on Virtual World represents a player’s activities related to playing
with/against virtual figures in the game. The factor contains most of the information
in Hunt and Building.
We use the four factors as the key behavioral variables in the estimation of the SAR model.
[Insert Tables 1 – 3 Here]
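The factor extraction step can be sketched as a PCA on the correlation matrix followed by a varimax rotation of the retained loadings. The code below uses synthetic data with six variables and two latent factors; it is an illustration of the general technique under those assumptions, not a reproduction of the study's 13-variable analysis, and the function names are hypothetical.

```python
import numpy as np


def pca_loadings(Z, n_factors):
    """Unrotated loadings: top eigenvectors of the correlation matrix times sqrt(eigenvalue)."""
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)             # ascending order
    top = np.argsort(eigvals)[::-1][:n_factors]
    return eigvecs[:, top] * np.sqrt(eigvals[top])


def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via the usual SVD-based iteration."""
    p, k = L.shape
    R = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        LR = L @ R
        grad = L.T @ (LR**3 - LR @ np.diag((LR**2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        R = U @ Vt                                   # nearest orthogonal matrix
        new_crit = s.sum()
        if new_crit < crit * (1.0 + tol):            # stop once improvement stalls
            break
        crit = new_crit
    return L @ R, R


# Synthetic data: variables 0-2 load on one latent factor, variables 3-5 on another.
rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 2))
true_loadings = np.array(
    [[0.9, 0.0], [0.8, 0.1], [0.7, 0.0], [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]]
)
Z = latent @ true_loadings.T + 0.4 * rng.normal(size=(300, 6))

L_unrotated = pca_loadings(Z, n_factors=2)
L_rotated, rotation = varimax(L_unrotated)
```

The rotation leaves each variable's communality unchanged and only redistributes loadings across factors, which is why it helps with labeling factors without altering how much variance they jointly explain.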
Besides the demographic and behavioral variables, we add one additional variable –
Virtual Money – in our analysis. The Virtual Money is a measure that represents a player’s
spending level of using his/her game money, which is earned by completing some gaming tasks
such as treasure hunting and monster hunting. The game money is mainly used to purchase
virtual items that have decorative purposes, while the real money (in Korean currency) is mainly
used to purchase virtual items that could enhance a player's gaming capability. In other words,
substitution between the two sources of money exists but the substitutability is quite limited.
Adding this variable helps account for the potential effect of virtual money on the company
To construct the contiguity matrix C, which represents the network connections among
players, and subsequently the spatial weight matrix W, we use the information about numbers of
messages communicated between players. As noted earlier, the connection between player i and
player j is constructed as
$$C_{ij} = S_{ij} B_i B_j$$
where $S_{ij}$ is the total number of messages communicated between player i and player j, $B_i$ is the
total number of messages that player i sent to all other players in the network, and $B_j$ is the total
number of messages that player j sent to all other players in the network. The spatial weight
matrix W is then generated by row standardizing the contiguity matrix C, and thus
$$W_{ij} = \frac{S_{ij} B_j}{\sum_{k} S_{ik} B_k}$$
This suggests that the influence of player j on player i is determined not only by how frequently
players i and j communicated with each other, but also by how frequently j communicated in
the network.1
1 We recognize that $W_{ij}$ may understate the influence of player j on i, as players may
communicate with each other outside of the gaming environment. However, we do not have information about those
activities in our data.
Estimation Results and Discussion
We estimate the SAR model using the data and variables described in the previous
section. The virtual items that a player purchases or the game time that the player spends
playing the game in internet cafés may also affect the player’s gaming behaviors during the same
period, which raises concerns about potential endogeneity. To correct for endogeneity,
we use the corresponding lagged terms of the four factors that represent players’ gaming behaviors as
instrumental variables in model estimation. We use the data from March to June 2010 as the
calibration sample and data from July to August 2010 as the holdout sample.
In addition to the proposed SAR model, we also estimate two benchmark models:
1. Individual Characteristics Only Model: a model that only accounts for the individual
characteristics but ignores the network effects on customer revenues. The model can
be written as:
$$Y = X\beta + \varepsilon$$
2. Social Influence Only Model: a model that only accounts for the network effects but
ignores the effects of individual characteristics on customer revenues. The model can
be written as:
$$Y = \rho W Y + \varepsilon$$
Table 4 lists the log-likelihood and Akaike Information Criteria (AIC) of the proposed
model and the two benchmark models on both calibration and holdout samples respectively. The
comparison shows that the proposed model outperforms both benchmark models. We report the
estimation results from the proposed model in Table 5.
[Insert Tables 4 and 5 Here]
The social influence parameter $\rho$, which represents the degree of overall interdependence
of revenues among players in the network, is equal to 0.388 and significant. This implies
substantial network interdependence in players’ revenues. The estimated $\rho$ from our model is
similar in magnitude to the spatial lag parameters obtained in other spatial studies in marketing
(e.g., Yang and Allenby 2003; Bezawada et al. 2009). For the coefficient estimates of
the individual demographic and gaming behavior variables ($\beta$), six out of eight are significant
and also have expected signs. The results show that the younger a player is, the more revenues
he/she generates. Players who live in areas with higher house value also generate more revenues.
The coefficient of Virtual Money is insignificant, implying that the game money that players
spend on certain virtual items does not have a significant impact on the revenues that they
generate for the firm. All the parameter estimates for gaming behaviors (i.e., Guild Activity,
Game Status, Action on Other Players and Action on Virtual World) are significant and have the
expected positive effects on players’ revenues.
Policy Simulations regarding Customer Value Measures
Next, we conduct policy simulations to demonstrate how the proposed customer value
measure, Customer Network Value, can help firms identify high value customers in
a network. Again, we assume that the firm plans to select a group of m customers for
an intervention. The intervention produces an immediate spending increase for each selected
customer (before taking into account network interactions). The impact of the intervention on
the whole network can be measured by the difference between the total expected revenues across
all customers before and after the intervention, as calculated using the formula in equation (10).
To compare the performance of the proposed measure with that of other value measures, we use
the same formula to calculate the impact of the intervention on the whole network, but select
the respective sets of m customers using the following alternative measures:²
1. Centrality Measures: These measures are commonly used in the sociology literature to
represent the positions of individuals in a network based on how the individuals are
connected with each other. The basic assumption of centrality measures is that
individuals who occupy more important positions in a network tend to have better access
to relevant resources and thus have more influence on others (Freeman 1979; Keller and
Barry 2003). We construct two centrality measures, Out-Degree and Betweenness, as our
benchmarks. Appendix A explains the two measures in detail.
2. Measures Based on Observed Customer Characteristics: The measures are constructed
based on the values of five customer characteristic variables used in the proposed model
respectively. The variables include a demographic variable - House Value, and four
gaming behavior variables – Guild Activity, Game Status, Action on Other Players and
Action on Virtual World. The basic premise of these measures is that the observed
customer characteristics provide useful information regarding customer value in the
3. Observed Customer Revenues: In a scenario where no model is used to describe the
customer revenue generating process, one naïve way to evaluate customer value is to
use the revenue information observed in the past. We calculate the average revenue for
each customer observed in the study period in the data and use it to infer the customer
2 Note that when m = N, all customers in the network are included for intervention; thus the impacts of intervention are
the same for all measures.
4. Random Selection: When there is no information about customer characteristics,
revenues generated in the past or network structure, one strategy that a firm can use is
to select a group of customers randomly. We call this
measure Random Selection. To ensure a fair comparison, in Appendix B we derive the
expected (average) impact over all possible random sets of m customers drawn from
a network of N customers.
For our policy simulation, we need to specify the direct spending increase for the simulated
intervention with this network of customers. As stated in the modeling section, this increase
represents customers’ individual responses to the intervention and is a function of X in the
general setting. For the simulation we first examine a representative case, in which the increase
in spending yielded by the firm’s intervention is assumed to be proportional to the long
run mean revenues of customers. In other words, the intervention shifts each customer’s baseline
revenue by a fixed proportion f > 0 of his/her long run mean revenue, where f is a
proportionality constant. For ease of exposition, we call this case Baseline Shifting Intervention.
Given this specification of the direct spending increase, we calculate the revenue impact on the
whole network of targeting various sets of m customers selected according to the proposed and
the benchmark customer value measures. As Random Selection represents the
average impact that a firm can expect without using any customer or network information, we
use it as the baseline for comparison. Specifically, we normalize the total revenue impact
calculated under each measure by dividing it by the corresponding revenue
impact under the Random Selection measure.
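The simulation logic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the impact of customer i takes the form w_i z_i used in Appendix B, with w' = 1'(I − ρW)^(-1) as in a standard SAR reduced form, and it obtains w by a simple fixed-point iteration instead of a matrix inverse. The four-customer network, the mean revenues and f are hypothetical; only ρ = 0.388 comes from the estimation results.

```python
def network_multipliers(W, rho, tol=1e-12, max_iter=10_000):
    """Return w with w' = 1'(I - rho*W)^(-1), via the iteration w <- 1 + rho * W'w.

    w[i] is the total change in network revenue per unit of direct spending
    increase by customer i. The iteration converges when the spectral radius
    of rho*W is below 1 (for example, row-normalized W and |rho| < 1).
    """
    n = len(W)
    w = [1.0] * n
    for _ in range(max_iter):
        new_w = [1.0 + rho * sum(w[j] * W[j][i] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    raise RuntimeError("multiplier iteration did not converge")


def relative_impact(scores, w, z, m):
    """Network impact of targeting the m highest-scoring customers,
    divided by the expected impact of m randomly selected customers."""
    n = len(w)
    impact = [wi * zi for wi, zi in zip(w, z)]        # w_i * z_i per customer
    chosen = sorted(range(n), key=lambda i: scores[i], reverse=True)[:m]
    expected_random = (m / n) * sum(impact)           # Random Selection baseline
    return sum(impact[i] for i in chosen) / expected_random


# Hypothetical network: W[i][j] is the influence of j on i, and rows sum to 1.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
rho = 0.388                              # social influence estimate from the text
mean_revenue = [10.0, 8.0, 11.0, 1.0]    # hypothetical long run mean revenues
f = 0.1                                  # hypothetical proportionality constant

w = network_multipliers(W, rho)
z = [f * r for r in mean_revenue]        # Baseline Shifting Intervention
network_value = [wi * zi for wi, zi in zip(w, z)]

for m in (1, 2, 4):
    print(m, relative_impact(network_value, w, z, m),
          relative_impact(mean_revenue, w, z, m))
```

In this toy network, customer 1 influences everyone else, so ranking by w_i z_i selects customer 1 first even though customer 2 has the highest mean revenue. With m equal to the network size, every measure yields a ratio of 1.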
In Table 6, we report the comparison of revenue impact on the whole network for
Baseline Shifting Intervention. We can see that the proposed customer value measure, the
Customer Network Value, performs the best. Though it is not surprising that the proposed
measure outperforms the benchmark measures, what makes it stand out is the extent to which it
does so. The comparison in Table 6 shows that the network impact of the
intervention based on the Customer Network Value can be as high as 2.24 times that based on
Random Selection (when selecting 10 customers for intervention) and is consistently much higher
than the respective network impact based on other measures. The table also shows that measures
that use only the information about network structure or customer characteristics often perform
worse than Random Selection, which does not use any information. For example, the network
impact of an intervention based on Betweenness and Out-Degree is consistently smaller than that
based on Random Selection when 100 or more customers are selected for
intervention. Among the measures that are based on individual customer characteristics,
House Value, which can be considered a proxy for household income, performs the best, but
its network impact is only equal to or slightly higher than that based on Random Selection.
The measure that ranks customers based on observed revenues performs well when the number
of customers selected for intervention is small (200 or fewer), but its performance deteriorates
as more customers are selected for intervention.
[Insert Table 6 Here]
Next we examine a special case in which individual customers respond to the
intervention with the same direct spending increase k, where k is a constant. We call this
case Constant Intervention. In this case, our proposed measure, the Customer Network Value,
still accounts for the spillover and magnified network effects even though the direct individual
response of each customer is the same and is not affected by his/her individual characteristics.
We expect that measures based on network structure, such as the centrality measures,
will perform better than they do under Baseline Shifting Intervention. We report the
comparison of network impact of intervention based on various measures in Table 7. The table
shows that the proposed measure is a significant improvement over the Random Selection
measure. For example, the network impact of the intervention based on the proposed measure is
2.96 times that based on Random Selection when 10 customers are selected for intervention.
As in Baseline Shifting Intervention, the proposed measure outperforms the other benchmark
measures consistently across different m. As expected, the two centrality measures,
Betweenness and Out-Degree, which capture the network structure, perform significantly
better than they do under Baseline Shifting Intervention. The network
impact of intervention based on both measures is consistently higher than the respective network
impact based on Random Selection; for example, the network impact of intervention based on
Out-Degree is 1.78 times that based on Random Selection when 10 customers are selected for
intervention. However, the fact that their performance is still significantly dominated by that of
our proposed Customer Network Value measure demonstrates that when assessing network
effects it is important to account for not only the network structure but also the interaction
between customers in the network (even though customers have equal weights). This is
consistent with findings by Trusov, Bodapati and Bucklin (2010) and Stephen and Toubia (2010).
[Insert Table 7 Here]
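The difference between the two interventions can be made explicit with a short worked comparison. As a sketch, let S be the set of m targeted customers, let w_i z_i be the network impact of customer i as in Appendix B, and let \bar{y}_i stand for customer i's long run mean revenue; S, \Delta R and \bar{y}_i are notation assumed here for illustration.

```latex
\begin{align*}
\text{Total impact of targeting } S: \quad & \Delta R(S) = \sum_{i \in S} w_i z_i \\
\text{Baseline Shifting Intervention } (z_i = f\,\bar{y}_i): \quad & \Delta R(S) = f \sum_{i \in S} w_i\,\bar{y}_i \\
\text{Constant Intervention } (z_i = k): \quad & \Delta R(S) = k \sum_{i \in S} w_i
\end{align*}
```

Under the Constant Intervention the ranking of customers depends only on w_i, which is driven by the network and the social influence parameter. This is why purely structural measures come closer to the proposed measure in Table 7 than in Table 6, while still falling short of it.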
Profiling of High Value Customers
It is often of interest for firms to identify customer characteristics that high value
customers may exhibit. This knowledge can help them focus on these customers before they
have more information to compute customer value according to the proposed value measure. For
this purpose, we first rank the customers based on their value calculated using the proposed
Customer Network Value measure. Then, we select two groups of customers: the top 10% of
customers, who have the highest customer value, and the bottom 10% of customers, who have the
lowest customer value. For each group, we calculate the average values of observed customer
demographic characteristics such as age, gender and house value, as well as game behavior
characteristics such as Guild Activity, Game Status, Action on Other Players and Action on
Virtual World. Next, we conduct a t-test to examine whether there are significant differences
between these two groups for these characteristics. The results for Baseline Shifting Intervention
are reported in Table 8.³ From the table, we can see that the high value customers tend to be
older, have higher income (higher house value), and score higher on Game Status and Guild
Activity, although these differences are significant only at the 10% level. The differences in
Gender, Action on Other Players and Action on Virtual World are not significant.
[Insert Table 8 Here]
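The profiling step can be sketched as follows. This is an illustration only: the text does not state which variant of the t-test is used, so the unequal-variance (Welch) form is an assumption here, and the function names and data are hypothetical.

```python
import math
import statistics


def top_bottom_groups(values, percent=10):
    """Indices of the top and bottom `percent` of customers ranked by value."""
    n_group = max(1, len(values) * percent // 100)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n_group], order[-n_group:]


def welch_t(a, b):
    """Welch t statistic and approximate degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical customer values and ages for 40 customers.
customer_value = [float(i % 13 + i / 10) for i in range(40)]
age = [20 + (i * 7) % 25 for i in range(40)]

top, bottom = top_bottom_groups(customer_value)
t_stat, df = welch_t([age[i] for i in top], [age[i] for i in bottom])
print(len(top), len(bottom), t_stat, df)
```

The p-value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package.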
SUMMARY AND FUTURE RESEARCH
In this study, we construct a spatial autoregressive model to explicitly show how network
structure and customers’ own characteristics interact in generating revenues for a firm. We
propose a measure of customer value that takes into account both individual and network effects.
This measure enables us to identify customers whose purchase behavior most impacts the total
revenues in the network.
We estimate the proposed model using a unique user-level dataset from a popular online
gaming company in Korea. The data contain information about game players’ demographics and
gaming behavior, as well as information about the communication between players, which we
use to construct the network structure. The estimated network effect (i.e., the social influence
parameter) is in line with what is found in the literature, and other parameter estimates are also
consistent with our expectations. We empirically show that network effects play an important
role in customer purchase behaviors.
3 The results for Constant Intervention are similar. We do not report them here for brevity. They are
available upon request from the authors.
We conduct policy simulations to demonstrate how our proposed value measure, which uses
the model estimates, can be used to identify customers who have the most impact on the revenues
of the whole network. We compare our proposed measure with a variety of benchmark measures
that are commonly used to measure customers’ value in the network (e.g., centrality measures,
observed customer characteristics, observed revenues and random selection). Our proposed
measure outperforms all benchmark measures, and does so by a large margin. Of particular
interest is the fact that the proposed measure outperforms centrality measures, which take into
account network structure but ignore the interaction between network structure and purchase
behavior, even in the case when individual customer characteristics are assumed not to affect
purchase behavior. This suggests that the most influential customers are not necessarily
those in the center of the network. This is consistent with findings by Trusov, Bodapati and
Bucklin (2010) and Stephen and Toubia (2010). Our results support the view that knowledge of
network structure alone is not sufficient for identifying the most influential customers. We also
find that with information on network structure or customer characteristics alone, firms may not
necessarily do much better (and often do worse) than if customers are selected randomly for
intervention.
We also profile the high value customers in the empirical application. The
ability to identify characteristics of high value customers can help firms target these customers
even before they have more information to compute customer value according to the proposed
value measure.
In this paper, we empirically show that network structure interacts with individual
characteristics to affect purchase behaviors of customers connected in a network. We also
demonstrate how spatial statistical models can be used to address a social network problem that
has direct marketing relevance. In particular, we show how the spatial weight matrix in a
spatial autoregressive model can be constructed to incorporate some important features of
social networks (e.g., the frequencies of interaction between members may determine the
strength of connections; asymmetric influence relationships may exist between members of a
social network) while ensuring that the model is appropriately specified for estimation. Finally,
this study contributes to marketing knowledge in customer relationship management (CRM). Our
proposed measure for customer valuation in a network substantially outperforms other
commonly used customer value measures. It can help firms identify high value customers and
thereby achieve greater improvement in the total revenues of their customer networks.
There are several potential areas for future research. First, the spatial weight matrix (or
the contiguity matrix) may be extended to incorporate more individual heterogeneity. It is
plausible that individual characteristics affect not only customer purchases directly but also
customers’ influence on others through the network structure (and thus further affect network
revenues); e.g., a guild master in the online gaming community may have greater influence on a
regular guild member than vice versa. Second, it will be of interest to explore the dynamic aspect
of social influence. It often takes time for social influence to have an effect. It is important to
understand the network effect, as well as its interaction with customer characteristics, on customer
value from a long-term perspective. Third, the observed correlated behaviors among customers in
the network can be attributed to reasons besides social influence, such as homophily and
contextual effects (Manski 1993; Moffitt 2001; Hartmann et al. 2008). It is important to identify
the different sources of the behavioral correlations.
Real-world-like virtual worlds such as online games provide great opportunities to study
consumer behavior in the real world. The behavioral and relational data from virtual worlds
are more accessible and complete than those from the real world, making it more feasible to test
various marketing theories. The work reported here is an initial step, and we hope that it
spurs further research in this stream.
REFERENCES
Anselin, L. (1988), Spatial Econometrics: Methods and Models. Dordrecht: Kluwer Academic
Argo, J. J., Dahl, D. W., and Morales, A. C. (2006), “Consumer Contamination: How Consumers
React to Products Touched by Others,” Journal of Marketing, 70 (2), 81–94.
Argo, J. J., Dahl, D. W., and Morales, A. C. (2008), “Positive Consumer Contagion: Responses
to Attractive Others in a Retail Context.” Journal of Marketing Research, 45 (6), 690–701.
Bavelas, A. (1950), "Communication Patterns in Task Oriented Groups," Journal of the
Acoustical Society of America, 57, 271–282.
Beauchamp, M. A. (1965), "An Improved Index of Centrality," Behavioral Science, 10, 161–163.
Bell, D. R. and Song, S. (2007), “Neighborhood Effects and Trial on the Internet: Evidence from
Online Grocery Retailing,” Quantitative Marketing and Economics, 5 (4), 361–400.
Bezawasda, R., Balachander, S., Kannan, P. K., and Shankar, V. (2009), “Cross-Category
Effects of Aisle and Display Placements: A Spatial Modeling Approach and Insights,”
Journal of Marketing, 73 (May), 99–117.
Bhatia, R., Kittaneh, F., and Li, R. (1998), “Eigenvalues of Symmetrizable Matrices,” BIT
Numerical Mathematics, 38 (1), 1–11.
Brock, W. and Durlauf, S. (2001), "Discrete Choice with Social Interactions," Review of
Economic Studies, 68, 235–260.
Burt, R. S. (1987), “Social Contagion and Innovation: Cohesion Versus Structural Equivalence,”
American Journal of Sociology, 92, 1287–1335.
Burt, R. S. (1992), Structural Holes: The Structure of Competition. Cambridge, MA: Harvard
Choi, J., Hui, S. K., and Bell, S. K. (2010), “Spatiotemporal Analysis of Imitation Behavior
across New Buyers at an Online Grocery Retailer,” Journal of Marketing Research, 47
Datcher, L. (1982), “Effects of Community and Family Background on Achievement,” Review of
Economics and Statistics, 64 (1), 32–41.
Debreu, G. and Herstein, I. N. (1953), “Nonnegative Square Matrices,” Econometrica, 21
Durlauf, S. and Seshadri, A. (2003), “Is Assortative Matching Efficient?,” Economic Theory, 21
Freeman, L. C. (1977), “A Set of Measures of Centrality based upon Betweenness,” Sociometry,
Freeman, L. C. (1979), “Centrality in Social Networks: Conceptual Clarification,” Social
Networks, 1, 215–239.
Godes, D., and Mayzlin, D. (2004), “Using Online Conversations to Study Word-of-Mouth
Communication,” Marketing Science, 23 (4), 545–560.
Godes, D., Mayzlin, D., Chen, Y., Das, S., Dellarocas, C., Pfeiffer, B., Libai, B., Sen, S., Shi, M.,
and Verlegh, P. (2005), “The Firm’s Management of Social Interactions,” Marketing Letters,
16 (3), 415–428.
Goldenberg, J., Han, S., Lehmann, D. R. and Hong, J. (2009), “The Role of Hubs in the
Adoption Process,” Journal of Marketing, 73 (March), 1–13.
Graham, B. and J. Hahn, (2003), Identification and Estimation of Linear-in-Means Models.
mimeo, Department of Economics, Harvard University.
Hartmann, W., Manchanda, P., Nair, H., Bothner, M., Dodds, P., Godes, D., Hosanagar, K., and
Tucker, C. (2008), "Modeling Social Interactions: Identification, Empirical Methods and
Policy Implications," Marketing Letters, 19 (3-4), 287–304.
Hartmann, W. R. (2010), “Demand Estimation with Social Interactions and the Implications for
Targeted Marketing,” Marketing Science, 29 (4), 585–601.
Iacobucci, D. and Nigel Hopkins (1992), “Modeling Dyadic Interactions and Networks in
Marketing,” Journal of Marketing Research, 29 (February), 5–17.
Iacobucci, D. (1996), Networks in Marketing. Thousand Oaks, CA: Sage Publications
Iacobucci, D. (1998), “Interactive Marketing and the Meganet: Network of Networks,” Journal
of Interactive Marketing, 12 (Winter), 5–16.
Iyengar, R., Han, S., and Gupta, S. (2009), “Do Friends Influence Purchases in a Social
Network?” Working Paper, University of Pennsylvania.
Iyengar, R., Van den Bulte, C., and Valente, T.W. (2010), “Opinion Leadership and Social
Contagion in New Product Diffusion,” Marketing Science, 30 (2), 195–212.
Iyengar, R., Van den Bulte, C., and Choi, J. (2011), “Distinguishing between Drivers of Social
Contagion: Insights from Combining Social Network and Co-location Data,” Working Paper,
University of Pennsylvania.
Kelejian, H. H., G. S. Tavlas, and G. Hondronyiannis, (2006), “A Spatial Modeling Approach to
Contagion Among Emerging Economies,” Open Economies Review, 17 (4), 423–442.
Keller, E. and Barry, J. (2003), The Influentials: One American in Ten Tells the Other Nine How
to Vote, Where to Eat, and What to Buy. New York: The Free Press.
LeSage, J. P. and Pace, K. (2009), Introduction to Spatial Econometrics. Chapman and Hall:
Leenders, R. (2002), "Modeling Social Influence through Network Autocorrelation: Constructing
the Weight Matrix," Social Networks, 24 (1), 21–47.
Manchanda, P., Xie, Y., and Youn, N. (2008), “The Role of Targeted Communication and
Contagion in Product Adoption.” Marketing Science, 27 (6), 961–76.
Manski, C. F. (1993), “Identification of Endogenous Social Effects: the Reflection Problem,”
Review of Economic Studies, 60, 531–542.
Moffitt, R. (2001), Policy Interventions Low-level Equilibria, and Social Interactions.
Cambridge, MA, MIT Press.
Nair, H. S., Manchanda, P., and Bhatia, T. (2010), “Asymmetric Social Interactions in Physician
Prescription Behavior: The Role of Opinion Leaders.” Journal of Marketing Research, 47
Nam, S., Manchanda, P. and Chintagunta, P. (2008), “The Effects of Service Quality and Word
of Mouth on Customer Acquisition, Retention and Usage,” Working Paper, University of
Opsahl, T., Filip, A., and Skvoretz, J. (2010), "Node Centrality in Weighted Networks:
Generalizing Degree and Shortest Paths," Social Networks, 32, 245–251.
Robins, G., Pattison, P., and Elliott, P. (2001), “Network Models for Social Influence Processes,”
Psychometrika, 66 (2), 161–90.
Soetevent, A. R. and Kooreman, P. (2007), “A Discrete-choice Model with Social Interactions:
With an Application to High School Teen Behavior,” Journal of Applied Econometrics, 22
Solon, G. (1999), Intergenerational Mobility in the Labor Market. Orley Ashenfelter and David
Card, eds., Handbook of Labor Economics, 3A, 1761–1800, Amsterdam: North-Holland.
Sorensen, A. (2006), “Social Learning and Health Plan Choice,” Journal of Economics, 37, 929–
Stephen, A. and Toubia, O. (2010), “Deriving Value from Social Commerce Networks,” Journal
of Marketing Research, 47 (April), 215–228.
Strang D. and Tuma, N. B. (1993), “Spatial and Temporal Heterogeneity in Diffusion,”
American Journal of Sociology, 99, 614–639.
Trusov, M., Bucklin, R. E., and Pauwels, K. (2009), “Effects of Word-of-Mouth Versus
Traditional Marketing: Findings from an Internet Social Networking Site,” Journal of
Marketing, 73, 90–102.
Trusov, M., Bodapati, A.V., and Bucklin, R.E. (2010), “Determining Influential Users in Internet
Social Networks,” Journal of Marketing Research, 47 (August), 643–658.
Van den Bulte, C. and Wuyts, S. (2007), Social Networks and Marketing. Cambridge, MA:
Marketing Science Institute.
Watts, D. J. and Dodds, P. S. (2007), “Influentials, Networks, and Public Opinion Formation,”
Journal of Consumer Research, 34 (December), 441–458.
Yang, S. and Allenby, G. M. (2003), “Modeling Interdependent Consumer Preferences,”
Journal of Marketing Research, 40 (August), 282–294.
Table 1 -- Players’ Game-Playing Behavior Variables
Variable Names Description
Number of guild members
Number of participating wars with other guilds
Guild membership: 1-in a guild; 0-not in a guild
Experience score of a player. The scores are generally awarded for
completion of quests, overcoming obstacles and opponents,
successful role-playing, and hunting monsters
Reputation level of a player. Reputation is divided into a number of
different levels for which players must earn reputation points to
progress through. Unlike experience, it is possible to lose reputation
points either by killing group members or by assisting rivals
Level of the game character. According to the experience score that a
player reaches, he/she reaches different levels of game character.
Average time (in minutes) that it takes for a player to level up to the
next level of the game.
Whether a player is the leader of a guild. 1 - Yes; 0 - No
Number of players killed or attacked by a player. A
player can only attack and kill others in designated
areas. Killing or attacking other players outside the
designated arena is not allowed, but PK accounts for
the number of players killed or attacked outside of the
designated areas by a player.
Number of times a player is on the “wanted” list. If the number of PK that a
player has is over the allowed limit or he/she does PK in a restricted
area, the player is on the “wanted” list for a certain period of time
depending on the seriousness of those PK.
Ratings gained through Player-to-Player combats. By killing players in a
rivalry group, a player gains higher ratings.
Number of monsters hunted. A Monster is a computer generated and
controlled non-player character in the virtual game.
Build Number of the buildings constructed in the virtual world.
Table 2 -- Correlations Between Players’ Game-Playing Behavior Variables
GuildWar 0.529 1.000
GuildMem 0.132 0.109 1.000
ExpScore 0.187 0.317 0.091 1.000
RepLevel 0.355 0.174 0.075 0.116 1.000
RoleLevel 0.243 0.158 -0.005 0.098 0.636 1.000
TimeLevelUp 0.184 0.109 -0.011 0.054 0.432 0.639 1.000
Master 0.207 0.197 0.026 0.071 0.068 0.045 0.054 1.000
Wanted 0.135 0.064 -0.015 0.020 0.021 0.000 0.057 0.322 1.000
PK 0.109 0.040 -0.014 0.015 0.022 0.030 0.032 0.195 0.787 1.000
DeathMatch 0.413 0.429 0.082 0.245 0.119 0.071 0.062 0.303 0.320 0.159 1.000
Hunt 0.156 0.033 -0.007 -0.017 0.119 0.023 0.045 0.023 -0.011 0.018 -0.036 1.000
Build 0.280 0.110 0.038 0.080 0.183 0.092 0.089 0.045 0.029 0.019 0.071 0.389 1.000
Table 3 -- Principal Component Analysis
Guild Activity   Game Status   Action on Other Players   Action on Virtual World
.059 -.002 -.008
-.013 .049 -.007
.165 .002 .168
Master -.039 .326 -.082 .016
Wanted .010 -.006
PK .029 -.104
DeathMatch .015 .306 .472 .014
Hunt .015 -.062 -.002
Building .088 .122 .015
Note: Factor loadings above .500 in absolute value are written in bold.
Table 4 -- Comparison of Model Performance
Model                                    Calibration Sample        Holdout Sample
                                         Log-likelihood   AIC      Log-likelihood   AIC
Proposed SAR Model                       -3290            6598     -2691            5397
Individual Characteristics Only Model    -3394            6804     -2713            5443
Social Influence Only Model              -4433            8869     -3319            6640
Table 5 – Model Parameters
Variable Estimate a Standard Error
Social Influence (ρ)
Action on Other Players
Action on Virtual World
a Numbers in bold indicate significance at 5% level.
Table 6 – Relative Revenue Impact for Baseline Shifting Intervention
Measures Based on Observed Customer
2.24 1.19 1.31 1.30
1.03 1.01 1.20
2.07 1.18 1.19 1.09
1.01 1.07 1.11 1.15
1.05 1.03 1.06
1.02 1.01 1.01 1.01
0.98 0.99 0.99 1.00 0.98
0.92 0.93 1.00 0.98 1.00
0.96 0.94 1.00 0.98 0.99
0.98 0.97 1.00 0.99 1.00
0.99 0.98 1.00 1.00 1.00 1.00 1.00 0.98
1000 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Note: Relative revenue impact is the ratio of total revenue impact of the customers selected based on a measure to the expected revenue of the same
number of randomly selected customers. When m=1000 all customers are included and thus all measures are equivalent. Therefore, the ratio = 1.
Table 7 – Relative Revenue Impact for Constant Intervention
Measures Based on Observed Customer
2.96 1.61 1.78 1.26 1.13
1.08 1.16 1.51
2.71 1.44 1.50 1.27 1.04
1.12 1.22 1.39
2.31 1.27 1.19 1.12
1.08 1.10 1.30
1.98 1.18 1.13 1.02 1.04 1.04 1.05 1.08 1.16
1.67 1.15 1.14 1.02 1.04 1.04 1.05 1.05 1.14
1.49 1.13 1.13 1.03 1.04 1.03 1.02 1.02 1.12
1.37 1.10 1.10 1.01 1.03 1.02 1.03 1.01 1.09
1.27 1.09 1.09 1.01 1.05 1.03 1.01 1.01 1.08
1.20 1.07 1.08 1.01 1.03 1.02
1.13 1.04 1.07 1.01 1.03 1.01
1.08 1.03 1.04
1.03 1.02 1.02 1.01
1.00 0.99 1.00 1.00
1000 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Note: Relative revenue impact is the ratio of total revenue impact of the customers selected based on a measure to the expected revenue of the same
number of randomly selected customers. When m=1000 all customers are included and thus all measures are equivalent. Therefore, the ratio = 1.
Table 8 -- Comparison of Customer Characteristics between Customers with Top 10% and
Bottom 10% of Customer Network Value
Characteristic             Mean (Top 10%)   Mean (Bottom 10%)   p-value of t test
Age                        34.000           31.780              0.072
Gender                     0.800            0.780               0.730
House Value                7.490            7.408               0.091
Game Status                0.041            -0.087              0.067
Guild Activity             0.243            0.000               0.051
Action on Other Players    0.001            -0.095              0.235
Action on Virtual World    -0.146           -0.052              0.446
Appendix A -- Centrality Measures
To describe the importance of an individual’s location in a network, centrality measures such
as degree, closeness and betweenness have been widely used in sociology. In this study, we
choose two types of centrality measure, out-degree and betweenness, to compare with the
proposed value measure.
Degree centrality is defined by the number of ties between individuals (Freeman 1977,
1979), which can be expressed as:
$$D(i) = \sum_{j=1}^{N} a_{ij} \qquad \text{(A.1)}$$
where $a_{ij}$ represents the link from individual i to individual j.
In the current study, in-degree ($\sum_{j} a_{ji}$) differs from out-degree ($\sum_{j} a_{ij}$)
because relations between individuals i and j are asymmetric. Both measures provide very similar
results in our analysis, though.
Betweenness centrality is defined by “how between an individual is to all others in the
network” (Freeman 1977, 1979). It is based on the assumption that an individual is central if
he/she lies between others on their geodesics. To have a large betweenness centrality, an
individual must be between many other people on their geodesics. Let $g_{jk}$ be the number of
geodesics linking two individuals, j and k. For three individuals, i, j and k, $g_{jk}(i)$ is the
number of geodesics between j and k that contain i. Betweenness centrality is then summed over
all individuals j and k, which takes the form in (A.2):
$$B(i) = \sum_{j \neq k \neq i} \left( \frac{g_{jk}(i)}{g_{jk}} \right) \qquad \text{(A.2)}$$
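The two centrality definitions above can be illustrated on a small directed network. This is a simplified sketch: it treats ties as unweighted, whereas the network in this study is built from communication frequencies, and the function names and the example network are hypothetical.

```python
from collections import deque


def out_degree(adj, i):
    """Out-degree: the number of individuals that i links to."""
    return len(adj[i])


def shortest_paths(adj, source):
    """Breadth-first search: distance and number of geodesics from source."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u lies on a geodesic to v
                count[v] += count[u]
    return dist, count


def betweenness(adj, i):
    """Sum over ordered pairs (j, k), both different from i, of g_jk(i) / g_jk."""
    paths = {s: shortest_paths(adj, s) for s in adj}
    dist_i, count_i = paths[i]
    total = 0.0
    for j in adj:
        dist_j, count_j = paths[j]
        if j == i or i not in dist_j:
            continue
        for k in adj:
            if k in (i, j) or k not in dist_j or k not in dist_i:
                continue
            if dist_j[i] + dist_i[k] == dist_j[k]:   # i is on a j-to-k geodesic
                total += count_j[i] * count_i[k] / count_j[k]
    return total


# Hypothetical star network: every leaf communicates only with the hub.
adj = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(out_degree(adj, "hub"), betweenness(adj, "hub"), betweenness(adj, "x"))
```

Every key of `adj` must list its outgoing ties, including individuals with none (an empty list). Both measures depend only on who is connected to whom, which is why they cannot reflect differences in customers' own spending responses.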
Appendix B – Expected Impact of Random Selection
We next calculate the average impact of randomly assigning m customers for intervention.
The probability of a customer being randomly chosen as one of the m customers from the total N
customers is as follows:
$$\Pr(\text{being chosen as one of } m \text{ customers}) = \frac{\binom{N-1}{m-1}}{\binom{N}{m}} = \frac{(N-1)!}{(m-1)!\,(N-m)!} \cdot \frac{m!\,(N-m)!}{N!} = \frac{m}{N}$$
If customer i is selected for intervention, according to equation (10), his/her impact on
the network can be written as $w_i z_i$, where $w_i$ and $z_i$ are the $i$th elements of the
vectors $w$ and $z$ respectively. Thus, the expected impact of customer i on the network with
random selection of m customers for intervention can be written as:
$$w_i z_i \cdot \Pr(\text{being chosen as one of } m \text{ customers}) + 0 \cdot \left[ 1 - \Pr(\text{being chosen as one of } m \text{ customers}) \right] = \frac{m}{N}\, w_i z_i$$
Thus, the expected total impact of randomly selected m customers among N customers is:
$$\sum_{i=1}^{N} \frac{m}{N}\, w_i z_i = \frac{m}{N} \sum_{i=1}^{N} w_i z_i$$
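The result that random selection yields m/N of the total impact can be verified by brute force on a small example. The sketch below is illustrative; the function names and the impact values (standing in for the products w_i z_i) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def selection_probability(n, m):
    """Probability that a given customer is among m drawn at random from n."""
    return Fraction(comb(n - 1, m - 1), comb(n, m))


def expected_random_impact(impacts, m):
    """Expected total impact of m random customers: (m / N) * sum of impacts."""
    return selection_probability(len(impacts), m) * sum(impacts)


def brute_force_average(impacts, m):
    """Average total impact over every possible set of m customers."""
    sets = list(combinations(impacts, m))
    return Fraction(sum(sum(s) for s in sets), len(sets))


impacts = [3, 1, 4, 1, 5]   # hypothetical values of w_i * z_i for N = 5 customers
print(selection_probability(5, 2))
print(expected_random_impact(impacts, 2), brute_force_average(impacts, 2))
```

Exact fractions are used so that the closed-form expression and the enumeration can be compared without rounding error.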