
Mapping the Customer Journey: A Graph-Based Framework for Online Attribution Modeling

Advertisers employ various channels to reach consumers over the Internet but often do not know to what degree each channel actually contributes to their marketing success. This attribution challenge is of great managerial interest, yet academic approaches to it developed in marketing academia have not found wide application in practice. To increase practical acceptance, the authors introduce a graph-based framework to analyze multichannel online customer path data as first- and higher-order Markov walks. According to a comprehensive set of criteria for attribution models, embracing both scientific rigor and practical applicability, four model variations are evaluated on four, large, real-world data sets from different industries. Results indicate substantial differences to existing heuristics such as “last click wins” and demonstrate that insights into channel effectiveness cannot be generalized from single data sets. The proposed framework offers support to practitioners by facilitating objective budget allocation and improving team decisions, and allows for future applications such as real-time bidding.
Electronic copy available at: http://ssrn.com/abstract=2343077
Putting Attribution to Work: A Graph-Based Framework for
Attribution Modeling in Managerial Practice
Eva Anderl*, Ingo Becker, Florian v. Wangenheim, Jan H. Schumann
October 23, 2013
Eva Anderl is a Ph.D. Candidate at Passau University, Chair of Marketing and Innovation, Innstr. 27, D-94032 Passau, Germany,
eva.anderl@uni-passau.de.
Ingo Becker is a Ph.D. Candidate at ETH Zurich, Chair of Technology Marketing, Weinbergstr. 56/58, CH-8092 Zurich, Switzerland,
ibecker@student.ethz.ch.
Florian v. Wangenheim is Professor of Technology Marketing, ETH Zurich, Chair of Technology Marketing, Weinbergstr. 56/58, CH-8092
Zurich, Switzerland, fwangenheim@ethz.ch.
Jan H. Schumann is Professor of Marketing and Innovation, Passau University, Chair of Marketing and Innovation, Innstr. 27, D-94032
Passau, Germany, jan.schumann@uni-passau.de.
The authors gratefully acknowledge support from the German Federal Ministry of Education and Research (Project “Fre(E)S – Productivity
of Free E-Services”, FKZ 01FL10038) and the Wharton Customer Analytics Initiative.
ABSTRACT
Advertisers employ various channels to reach consumers over the Internet yet often do not
know to what degree each channel actually contributes to their marketing success. This
attribution challenge is of great managerial interest, yet theoretical approaches to it developed
in marketing academia have not been reflected in practical applications. To increase practical
acceptance, the authors introduce a comprehensive set of criteria for attribution models,
embracing both scientific rigor and practical applicability as guidelines for developing a new
adaptive model. A graph-based framework analyzes multichannel online customer path data
as first- and higher-order Markov walks. According to the criteria, 16 model variations are
evaluated on four, large, real-world data sets from different industries. Results indicate
substantial differences to existing heuristics such as “last click wins” and demonstrate that
insights into channel effectiveness cannot be generalized from single data sets. The proposed
framework offers support to practitioners by facilitating objective budget allocation and
improving team decisions; it also has been successfully implemented in practice at a German
multichannel tracking provider.
Keywords: online advertising, attribution, marketing models, Markov models, multichannel
Online advertising is essential to many industries’ promotional mix (Raman et al. 2012). In
the United States, online marketing accounts for more than 20% of overall marketing
spending, amounting to $36.6 billion in 2012 (PriceWaterhouseCoopers 2013). Advertisers
use a variety of online marketing channels,¹ including paid search and display marketing, as
well as e-mail, mobile, and social media advertising and affiliate marketing to reach
consumers. This proliferation of channels makes budget allocation decisions increasingly
complex (Raman et al. 2012). Furthermore, consumers may be exposed to advertisements
through various channels, yet advertisers lack transparency about the degree to which each
channel or campaign contributes to their companies’ success.
Marketing executives thus call for performance measures of the contributions of each
online marketing channel (Econsultancy 2012a; Ramsey 2009). The challenge of attributing
credit to different channels (Neslin and Shankar 2009) involves finding ways to measure “the
partial value of each interactive marketing contact that contributed to a desired outcome”
(Osur 2012, p. 3). To award such credit, many advertisers apply simple heuristics, such as
“last click wins,” such that the value gets attributed solely to the marketing channel that
directly preceded the conversion (Econsultancy 2012a), and any prior customer interactions
are disregarded.
Modern technological advancements, though, enable the recording of online customer
journeys,² including all advertising exposures for an individual customer over all online
marketing channels. These data offer new ways to address the attribution problem (Bucklin
and Sismeiro 2009). The market for such attribution technologies has gained momentum
(Osur 2012; Tucker 2012), such that the use of attribution models has doubled since 2008
(Econsultancy 2012a; Riley 2009), and nearly 75% of marketers believe that attribution
measures can improve the allocations of their budgets across channels, which might enhance
their return on investments as well. Although some software tool providers now offer
multitouch attribution solutions,³ last and first click wins heuristics remain among the most
widely used attribution methods in practice (Econsultancy 2012a). Furthermore, even the
multitouch attribution tools largely rely on simple heuristics, such as weights predefined by
the advertiser, who assigns a particular weight to each position or channel over the course of
successful customer journeys. Other popular heuristics include linear attribution approaches,
which split the contribution evenly across all channels included in a successful journey, and
time-decay methods, for which contacts closer to the conversion receive more credit
(Econsultancy 2012a, 2012b; Osur 2012). Only three major vendors offer statistical or
algorithmic attribution methodologies (Osur 2012), but their mechanisms remain publicly
unavailable and irreproducible (Dalessandro et al. 2012).
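The heuristics described above can be illustrated with a minimal Python sketch. The function names and the `decay` parameter are assumptions introduced here for illustration; they do not reproduce any vendor's tool, and actual time-decay implementations may weight contacts by elapsed time rather than by position.

```python
from collections import defaultdict


def last_click(journey):
    """Assign all credit to the channel directly preceding the conversion."""
    return {journey[-1]: 1.0}


def first_click(journey):
    """Assign all credit to the first channel in the journey."""
    return {journey[0]: 1.0}


def linear(journey):
    """Split credit evenly across all contacts in the journey."""
    credit = defaultdict(float)
    for channel in journey:
        credit[channel] += 1.0 / len(journey)
    return dict(credit)


def time_decay(journey, decay=0.5):
    """Give contacts closer to the conversion more credit.

    Assumption: a contact d positions before the last one gets weight
    decay**d, and weights are normalized to sum to one.
    """
    n = len(journey)
    weights = [decay ** (n - 1 - pos) for pos in range(n)]
    total = sum(weights)
    credit = defaultdict(float)
    for channel, weight in zip(journey, weights):
        credit[channel] += weight / total
    return dict(credit)


# Hypothetical successful journey, ordered from first to last contact.
journey = ["DISPLAY", "SEO", "SEA"]
print(last_click(journey))
print(linear(journey))
print(time_decay(journey))
```

Every rule here fixes the credit shares in advance from position alone, which is the property the objectivity criterion introduced later calls into question.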
In turn, despite its relevance, the attribution problem only recently has become a focus
for marketing researchers (Abhishek, Fader, and Hosanagar 2012; Haan, Wiesel, and Pauwels
2013; Kireyev, Pauwels, and Gupta 2013; Li and Kannan 2013), likely related to the
increasing availability of high-quality clickstream data (Bucklin and Sismeiro 2009).
However, to the best of our knowledge, sophisticated attribution approaches have not found
wide application in practice. This gap, such that academically developed methods do not find
their way into managerial applications, is a widely lamented problem (Reibstein, Day, and
Wind 2009). Acceptance and adoption of marketing models demand more than analytical
rigor (Lehmann, McAlister, and Staelin 2011; Little 1970; Wübben and Wangenheim 2008);
managers hesitate to base their decisions on mechanisms whose results are not available
when they need them (Lodish 2001) or if they do not understand how the insights are
generated (Little 1970, 2004). In this sense, “simplicity and transparency beats complex
optimization every time … because the managers with responsibility [are] unwilling to
implement recommendations they [do] not understand” (Little 2004, p. 1857). Thus, though
the available academic frameworks are appealing and innovative, they do not address
important criteria for real-world acceptance, such as ease of interpretation, versatility, or
algorithmic efficiency.
In response, we introduce a set of criteria to evaluate attribution models, building on
existing research on managerial decision models (Lehmann, McAlister, and Staelin 2011;
Lilien 2011; Little 1970, 2004; Lodish 2001). Our novel attribution framework is based on
Markovian graph-based data mining techniques inspired by Archak, Mirrokni, and
Muthukrishnan (2010); we evaluate it according to our proposed set of criteria. We compare
our suggested framework against existing attribution models and apply it to a real-life system
implemented at a German multichannel tracking provider. In this way we extend existing
discussions of attribution and contribute to marketing theory and practice.
First, by providing marketing managers with a set of evaluation criteria, we reduce the
thresholds for applying and selecting attribution techniques in managerial practice
(Econsultancy 2012a), foster standardization and cross-industry acceptance (Dalessandro et
al. 2012), and improve fairness in evaluating the contribution of online marketing channels
(Dalessandro et al. 2012). This effort also could support adequate remunerations of
advertising-financed publishers (Jordan et al. 2011). Second, we propose a novel framework
for analyzing multichannel, online customer path data, according to our proposed set of
criteria. We model and analyze individual-level multichannel customer journeys as first- and
higher-order Markov walks. Applying our proposed criteria, we evaluate 16 model variations
using four, large, real-life data sets from different industries, then compare the attribution
results against existing heuristics. Third, our research provides solutions to several explicit
problems that practitioners confront. It facilitates an objective and independent logic for
deriving budget optimization processes and strategic decisions, such as channel selection.
The framework also can update the mental models of decision makers, by building up the
expertise of marketing managers. Because it is purely data-driven, it reduces organizational
hierarchies and can consecutively improve team decisions (Lilien 2011). The versatility of
our framework makes it suitable across industries and marketing contexts and allows for
future applications. Fourth, we contribute new insights into online marketing effectiveness in
a multichannel setting. We find that higher-order models outperform first-order,
“memory-free” Markov models in their predictive accuracy. This result adds to the existing evidence
that one-click heuristics, such as last click wins, are not sufficient to capture the contributions
of online channels. In line with prior research, certain channels such as display are
undervalued by existing attribution heuristics, whereas the contributions of other channels are
overestimated (Abhishek, Fader, and Hosanagar 2012; Li and Kannan 2013). With our four,
large-scale data sets across three different industries, we enable cross-industry comparisons
and find that insights on channel effectiveness cannot be generalized from results obtained
from a single data set. This finding enhances the demand for an easily applicable, versatile
attribution framework. Fifth, we introduce alternative measures to evaluate predictive
accuracy for domains including online marketing with skewed class distributions and unequal
classification error costs. Although analyses of receiver operating characteristic (ROC)
curves have appeared in a variety of disciplines, they have not been applied frequently in
marketing research (Baesens et al. 2002; Fawcett 2006). Sixth, our study responds to research
requests to develop marketing impact models and techniques based on individual-level,
single-source data (Rust, Lemon, and Zeithaml 2004) and provides a new perspective on
analyzing path data in marketing (Hui, Fader, and Bradlow 2009).
The rest of this article is organized as follows. We first propose multiple criteria that a
successful attribution model should fulfill to be both scientifically valid and applicable in
practice. After we locate our research in the context of existing literature, we introduce our
clickstream data, develop our framework, introduce several model variations, and present the
evaluation results. Next, we introduce the practical implementation and discuss the impacts
of our research for marketing theory and practice. Finally, we conclude with a discussion of
limitations and directions for further research.
CRITERIA FOR ATTRIBUTION MODELING IN THEORY AND PRACTICE
Putting academic marketing models to work in practice is challenging, because
building scientific models that improve productivity is an art (Lodish 2001, p. S45). The
most complex model is not necessarily the one that will affect an organization’s productivity
(Little 1970; Lodish 2001). We therefore conceptualize marketing attribution modeling with
a catalogue of six criteria that do not focus solely on scientific rigor and complexity but also
include aspects relevant to the implementation in practice, which should help practitioners
reduce barriers to evaluating and implementing attribution. We build on prior research into
the acceptance of marketing decision models (Lilien 2011; Little 1970, 1979, 2004; Lodish
2001; Reibstein, Day, and Wind 2009) and connect them with criteria previously discussed in
the context of attribution modeling (Dalessandro et al. 2012; Shao and Li 2011).
Objectivity
Marketing decision models should enable the computation of the relative impacts of
different decision variables and enable objectivity in budget decisions (Lilien 2011). In the
case of attribution, models need to be able to assign credit to individual channels or
campaigns, in accordance with their factual ability to generate value, such as by contributing
to conversions or increasing revenues (Dalessandro et al. 2012). We refer to this property as
objectivity. Models from related disciplines have demonstrated their ability to improve the
objective quality of decisions (Lilien 2011). If objectivity is not ensured, marketing decisions
become random, leading to suboptimal outcomes. Although this criterion seems to be an
obvious prerequisite, most models applied in practice break this rule. For example, models
that condense user journeys to one click (e.g., first- or last-click heuristics) omit any
additional marketing contacts, and more complex models based on predefined weights by the
advertiser fail to attribute credit fairly across channels.
Predictive Accuracy
Although attribution primarily takes a retrospective view, attribution models should
be able to correctly classify conversion events (Shao and Li 2011). In addition to ensuring
scientific rigor, this classification helps persuade managers of the model’s credibility (Lodish
2001). We therefore introduce predictive accuracy as a second criterion. However, standard
accuracy measures, such as percentage correctly classified, are poor metrics for measuring
classification performance in the case of unequal misclassification costs or if the class
distribution is skewed. Therefore, we extend this discussion by applying ROC curves,
widely accepted in related disciplines (Baesens et al. 2002; Fawcett 2006), as an alternative
measure. The ROC curve analysis decouples classification performance from class
distribution, which makes it especially attractive for online marketing data with their highly
unbalanced distribution of conversion and non-conversion events.
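The difference between percentage correctly classified and an ROC-based measure can be sketched as follows. The data are hypothetical (100 journeys with a 5% conversion rate), and the area under the ROC curve is computed here through its rank interpretation (the probability that a randomly drawn converting journey receives a higher score than a randomly drawn non-converting one), which is an illustrative choice rather than the evaluation procedure used in this study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Share of journeys classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    journey is scored higher; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical skewed sample: 5 conversions, 95 non-conversions.
labels = [1] * 5 + [0] * 95

# A model that scores every journey identically ("never converts").
constant = [0.0] * 100

# A model that ranks all conversions above all non-conversions,
# although every score stays below the 0.5 threshold.
informative = [0.4] * 5 + [0.1] * 95

print(accuracy(labels, constant), auc(labels, constant))
print(accuracy(labels, informative), auc(labels, informative))
```

Both hypothetical models reach the same accuracy because they predict the majority class for every journey, whereas the area under the ROC curve separates the uninformative model from the one that ranks conversions correctly.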
Robustness
We include robustness as another important metric to evaluate model fitness (Little
1970, 2004; Shao and Li 2011). Robustness indicates the ability of a model to deliver stable
and reproducible results if the model runs multiple times. To gain practical acceptance,
decision models need to be robust to avoid bad, unstable results (Little 1970). Stable and
reproducible results for performance metrics such as value contributions are indispensable for
clear and sustainable decisions, because advertisers apply them to judge and tactically assign
budgets to different channels.
Interpretability
To ensure managerial acceptance, models need to be simple and easy to communicate
(Little 1970, 2004). Managers should be capable of adjusting inputs and understanding
outputs with relative ease. The interpretability of the results is of utmost importance for
practical acceptance, because managers often refuse to use black box approaches that conceal
how they work or how they generate results (Lilien 2011; Little 1970; Lodish 2001).
Therefore, the structure of an attribution model should be transparent to all stakeholders with
reasonable effort. Output metrics also should be intuitively understandable and easy to
communicate, such that they enable users to transform the results directly into managerial
decisions (Little 1970). Following Dalessandro et al. (2012, p. 2), who define interpretability
as the acceptance of the attribution system “by all parties with material interest,” we
introduce interpretability as an important criterion for attribution models, incorporating both
simplicity and ease of communication.
Versatility
Little (1970) posits that models should be adaptive and easy to control. Adaptability
encompasses the capability of incorporating new information that becomes available over
time. Models that are easy to control enable users to adjust inputs to derive appropriate
outputs. These two aspects are strongly related from a methodological point of view, so we
combine them into a single criterion, which we call versatility. In an online environment,
advertisers include different channels in their marketing mix, and the set of available
channels is constantly evolving (Evans 2009). An attribution framework therefore should be
able to include varying channels and different conversion events and should easily be adapted
to user preferences or extended toward innovative forms of advertising. Furthermore, the
framework should allow for different aggregation levels, because managers are likely to
neglect results if the measures are not accessible at the right level of aggregation (Lodish
2001).
Algorithmic Efficiency
Finally, we introduce algorithmic efficiency, or the speed with which the model
computes outputs when requested, as a sixth criterion. Repeatedly, Little (1970, 2004) has
called for completeness in relevant issues, meaning that model structures should be able to
handle many phenomena without getting bogged down. Although Little did not explicitly
refer to algorithmic efficiency, today, his criterion can be extended by this dimension. About
a decade after his 1970 publication, Little reacted to the emergence of new computation
technologies by asserting that “with the right software … a management scientist or other
person can perform a wide scope of analysis smoothly, quickly, and efficiently” (Little 1979,
p. 11). However, with recent advances in online tracking technologies, the size of available
data sets is growing strongly and rapidly, posing new challenges for algorithmic efficiency.
Clickstream data sets can be of tremendous size, comprising hundreds of thousands of clicks
or even billions of impressions (Bucklin and Sismeiro 2009). An attribution methodology
must be able to handle these volumes quickly and efficiently, as a basic precondition, to be
suitable for practical purposes (Archak, Mirrokni, and Muthukrishnan 2010), because
practitioners will not apply results that are not available when required (Lodish 2001).
Using these criteria, derived both from research on marketing model acceptance and
recent work on attribution modeling, we provide a comprehensive framework to evaluate
attribution models that includes requirements from both academia and practice. Table 1
provides an overview of the six criteria we propose and their relation to prior literature.
[INSERT TABLE 1 AROUND HERE]
RESEARCH BACKGROUND
Academic research on attribution is still scarce (Raman et al. 2012; Tucker 2012) but
can build on prior studies pertaining to online advertising effectiveness. Most existing
research focuses on single channels, such as search (Ghose and Yang 2009; Rutz and Bucklin
2011; Yang and Ghose 2010) or display (Braun and Moe 2013; Goldfarb and Tucker 2011).
Studies comparing the short- and long-term effectiveness of different online advertising
channels based on aggregate data relate to the attribution problem (Breuer and Brettel 2012;
Breuer, Brettel, and Engelen 2011), yet they do not attempt to award credit for conversions.
Jordan et al. (2011) examine allocation decisions for publishers, using multiple attribution
approaches, and derive allocation and pricing rules for publishers selling advertising slots. In
a study of the economic welfare consequences of the use of attribution technologies, Tucker
(2012) finds evidence for more conversions at lower costs, due to the ability to systematically
substitute towards selected campaigns across advertising platforms. These findings underline
the potential impact of attribution on marketing effectiveness, though the attribution
methodology they use remains undisclosed and not subject to examination. Practice-oriented
literature on attribution mainly highlights the relevance of the topic or summarizes ongoing
industry activities, without providing methodological details (Chandler-Pepelnjak 2008;
Econsultancy 2012a, 2012b; Lovett 2009; Osur 2012; Riley 2009).
Furthermore, we know of few academic studies that address the online attribution
problem: Shao and Li (2011) introduce two attribution approaches, a bagged logistic
regression model and a simple probabilistic model. Building on their work, Dalessandro et al.
(2012) propose a more complex, causally motivated attribution methodology based on
cooperative game theory. Abhishek, Fader, and Hosanagar (2012) suggest a dynamic hidden
Markov model, based on individual consumer behavior, that captures a consumer’s
deliberation process along the typical stages of the purchase funnel: dormant, awareness,
consideration, and conversion. Li and Kannan (2013) propose a Bayesian attribution model to
measure the carryover and spillover effects of multiple channels using individual conversion
path data. A multivariate time-series model based on aggregate data by Kireyev, Pauwels,
and Gupta (2013) analyzes attribution dynamics for display and search advertising. Finally,
Haan, Wiesel, and Pauwels (2013) propose a structural vector autoregressive model, also
based on aggregate data, to determine the effectiveness of different online advertising
channels. We place this existing literature in the context of the evaluation criteria we have
identified for attribution modeling in Table 2.
[INSERT TABLE 2 AROUND HERE]
Two existing models (Abhishek, Fader, and Hosanagar 2012; Li and Kannan 2013)
objectively assign credit to individual contacts in accordance with their factual ability to
generate value. In contrast, the approaches by Shao and Li (2011) and Dalessandro et al.
(2012) neglect the frequency of channels in a customer journey; models based on aggregate
data ignore the influence of individual contacts (Haan, Wiesel and Pauwels 2013; Kireyev,
Pauwels, and Gupta 2013). All of the cited studies evaluate predictive accuracy using a
variety of measures, such as log-likelihood (Abhishek, Fader, and Hosanagar 2012; Li and
Kannan 2013) or mean absolute percentage error (Haan, Wiesel, and Pauwels 2013; Li and
Kannan 2013). The hidden Markov model proposed by Abhishek, Fader, and Hosanagar
(2012) outperforms a simple logit model on its root mean squared error and log-likelihood.
Yet no overall comparison of predictive accuracy for the existing approaches is possible,
because the data sets used and implementation details are not publicly available, and the
measures differ across studies. Only Shao and Li (2011) evaluate robustness, which they call
variability. No other studies explicitly analyze robustness, though Li and Kannan (2013)
provide additional validation using a field experiment. With standard statistical methods, the
approaches adopted by Shao and Li (2011) and Dalessandro et al. (2012) are relatively easy
to interpret, even without profound knowledge of marketing modeling techniques. The degree
of complexity of the other models likely makes it difficult for practitioners to follow their
calculation logic, leading to limited interpretability. In addition, though some models are
highly flexible (Dalessandro et al. 2012; Shao and Li 2011), the versatility of other
approaches is limited by their explicit assumptions about the customer decision process and
channel characteristics (Abhishek, Fader, and Hosanagar 2012; Li and Kannan 2013), as well
as their restrictions regarding specific channels. For example, Haan, Wiesel, and Pauwels
(2013) do not include channels with performance-based payment models, such as affiliates,
to avoid endogeneity. No authors mention algorithmic efficiency, possibly because some of
the samples used were relatively small, such that efficiency considerations became less
relevant. Because reliable statements about algorithmic efficiency are hard to make from an
outside perspective, we deliberately choose not to evaluate this criterion.
Overall, this review of existing literature on attribution modeling indicates important
progress from an academic perspective but also shows that practical considerations are often
not reflected. We therefore seek to develop a model that meets all of the suggested criteria
and evaluate it using real-life data sets.
DATA
Our research is based on four clickstream data sets provided by online advertisers, in
collaboration with a multichannel tracking provider. Clickstream data record each user's
Internet activity and thus trace the navigation path he or she takes (Bucklin and Sismeiro
2009). For each visit to the advertiser’s website during the observation period, the data
include detailed information about the source of the click and an exact timestamp. Clicks
either represent a direct behavioral response to an advertising exposure or result from the user
entering the advertiser’s URL directly into the browser, so these sources comprise all online
marketing channels, as well as direct type-ins. We also know for each visit whether it was
directly followed by a conversion, in this case a purchase transaction. We use this
information to construct customer journeys that describe the click pattern of individual
consumers across all online marketing channels and their purchase behavior. Thus, we not
only track successful journeys ending with a conversion but also journeys that never lead to a
conversion, within a timeframe of 30 days of the last exposure.
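The construction of customer journeys from visit-level records can be sketched as follows. The record layout `(cookie_id, timestamp, channel, converted)` and the function name are assumptions made for illustration; the actual export format of the tracking provider is not described here. The sketch closes a journey at each conversion and treats any remaining clicks as a non-converting journey; it does not implement the 30-day window.

```python
from collections import defaultdict


def build_journeys(clicks):
    """Group visit records into per-cookie journeys.

    clicks: iterable of (cookie_id, timestamp, channel, converted) tuples,
    a hypothetical layout. Returns a list of
    (cookie_id, [channel, ...], converted) tuples.
    """
    by_cookie = defaultdict(list)
    for cookie_id, timestamp, channel, converted in clicks:
        by_cookie[cookie_id].append((timestamp, channel, converted))

    journeys = []
    for cookie_id, visits in by_cookie.items():
        # Order each cookie's visits chronologically.
        visits.sort(key=lambda visit: visit[0])
        path = []
        for _, channel, converted in visits:
            path.append(channel)
            if converted:
                # A conversion closes the current journey.
                journeys.append((cookie_id, path, True))
                path = []
        if path:
            # Remaining clicks form a journey without conversion.
            journeys.append((cookie_id, path, False))
    return journeys


# Hypothetical records; timestamps are plain integers for brevity.
clicks = [
    ("a", 3, "SEA", True),
    ("a", 1, "DISPLAY", False),
    ("b", 2, "SEO", False),
    ("a", 5, "NEWSLETTER", False),
]
for journey in build_journeys(clicks):
    print(journey)
```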
The data collection occurs at the cookie-level, such that we identify individual
consumers or, more accurately, individual devices. The use of cookie data suffers from several
limitations, such as an inability to track multidevice usage or bias due to cookie deletion
(Chatterjee, Hoffman, and Novak 2003; Rutz, Trusov, and Bucklin 2011), yet cookies remain
the industry standard for multichannel tracking (Tucker 2012). In contrast with prior research
(Breuer and Brettel 2012; Breuer, Brettel, and Engelen 2011; Lohtia, Donthu, and Yaveroglu
2007), we use cross-sectional field data, which allow insights into the interaction of
individual advertising exposures.
The advertisers that provide the data sets for this study operate in different online
industries: fashion retail, luggage retail, and travel. All are pure online players, so we can
exclude online/offline cross-channel effects (Wiesel, Pauwels, and Arts 2011). Sophisticated
attribution is not required for journeys consisting of just a single click, so we exclude one-
click journeys from our analysis. In case of a single click, both the “last click wins” and the
“first click wins” heuristics deliver objective results that would satisfy our criteria.
Each data set includes a minimum of 105,000 journeys of length > 1 per advertiser.
Their average length ranges from 2.9 to 5.3 contacts, and between 2.8% and 7.3% of all journeys lead to a
successful conversion. All advertisers included in the evaluation distinguish eight different
online channels, though the channels used differ partly across firms. Search engine
advertising (SEA), search engine optimization (SEO), affiliate, and newsletter information
appears in all four data sets. Other channels used by the advertisers include display, price
comparison, social media advertising, and retargeting. In Table 3, we present detailed
descriptions of our data sets.
[INSERT TABLE 3 ABOUT HERE]
MODEL DEVELOPMENT
We propose a graph-based Markovian framework to analyze customer journeys and
derive an attribution model, adapting an approach proposed by Archak, Mirrokni, and
Muthukrishnan (2010) in the context of search engine advertising. Markov chains are
probabilistic models that can represent dependencies between sequences of observations of a
random variable. They have a long history in marketing (Styan and Smith 1964) and have
been used frequently to model customer relationships (Homburg, Steiner, and Totzek 2009;
Pfeifer and Carraway 2000). Other applications include advertising frequency decisions
(Bronnenberg 1998) and brand loyalty (Che and Seetharaman 2009). In our model, we
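represent customer journeys as chains in Markov graphs.⁴ A Markov graph $M = \langle S, W \rangle$ is
defined by a set of states
$$S = \{s_1, \ldots, s_n\}$$
and a transition matrix W with edge weights
$$w_{ij} = P(X_t = s_j \mid X_{t-1} = s_i), \quad 0 \le w_{ij} \le 1, \quad \sum_{j=1}^{n} w_{ij} = 1 \;\; \forall i.$$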

State Definitions
Customer journeys contain one or more contacts across a variety of channels,
represented in the set of states S. We construct four types of Markov graphs that differ in
their definitions of the model states. All the graphs contain three special states: a START state
that represents the starting point of a customer journey; a CONVERSION state representing a
successful conversion; and an absorbing NULL state for customer journeys that have not
ended in a conversion. The definition of the remaining states varies across model types; in the
base model, each state corresponds to one channel. The transition probability $w_{ij}$ corresponds
to the probability that a contact in channel i is followed by a contact in channel j. Cycles in
the graph are possible, such as when a sequence of two identical channels appears. Similar to
Archak, Mirrokni, and Muthukrishnan (2010), we complement this “simple” model with
several more elaborated models that incorporate the position of the contact in the customer
journey. Incorporating journey positions in the state definitions is equivalent to transforming
a non-homogeneous Markov process to a time-homogeneous one (Çinlar 1975), and it results
in a directed acyclic graph.
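For the simple model, the transition probabilities can be estimated from observed journeys as in the following sketch. Estimating each probability as a relative frequency of observed transitions is an assumption made here for illustration, as are the function name and the input layout.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys):
    """Estimate first-order transition probabilities for the simple model.

    journeys: list of (channels, converted) pairs, a hypothetical layout.
    Returns a nested dict w with w[i][j] = share of transitions leaving
    state i that lead to state j.
    """
    counts = defaultdict(Counter)
    for channels, converted in journeys:
        # Frame each journey with the three special states.
        end_state = "CONVERSION" if converted else "NULL"
        path = ["START", *channels, end_state]
        for source, target in zip(path, path[1:]):
            counts[source][target] += 1

    return {
        source: {
            target: n / sum(targets.values())
            for target, n in targets.items()
        }
        for source, targets in counts.items()
    }


# Hypothetical journeys; the third contains a repeated channel (a cycle).
journeys = [
    (["SEA", "DISPLAY"], True),
    (["DISPLAY", "SEA"], False),
    (["SEA", "SEA"], False),
]
w = transition_probabilities(journeys)
print(w["START"])
print(w["SEA"])
```

Each row of the estimated matrix sums to one, and the repeated SEA contact in the third journey produces a self-loop, which illustrates why the simple model can contain cycles.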
In the “forward” model, states are defined by the channel and position in the customer
journey, counted from the beginning. Consider two customer journeys: SEA → DISPLAY →
CONVERSION and DISPLAY → SEA → CONVERSION. Whereas in the simple model, the set
$S_{\text{simple}} = \{\text{START}, \text{CONVERSION}, \text{NULL}, \text{SEA}, \text{DISPLAY}\}$ contains five states, the forward
model considers the different positions of SEA and DISPLAY, leading to $S_{\text{forward}} = \{\text{START},
\text{CONVERSION}, \text{NULL}, \text{DISPLAY}_1, \text{DISPLAY}_2, \text{SEA}_1, \text{SEA}_2\}$ with seven elements. The
“backward” model similarly incorporates the channel and the position in the journey, counted
from the last observation. Because many advertisers emphasize both the first and the last
contact in a journey (Robinson 2012), we also include a “bathtub” model that distinguishes
these two positions. All other positions are aggregated in an intermediate state per channel,
which can be regarded as a normalization of journey lengths.⁵ An exemplary customer
journey SEA → DISPLAY → SEO → SEA → CONVERSION then leads to a set $S_{\text{bathtub}} = \{\text{START},
\text{CONVERSION}, \text{NULL}, \text{SEA}_{\text{first}}, \text{DISPLAY}_{\text{intermediate}}, \text{SEO}_{\text{intermediate}}, \text{SEA}_{\text{last}}\}$. The term
“bathtub” refers to the “bathtub heuristic,” which extends the last and first click wins
heuristics (Robinson 2012) and thus should help practitioners put our approach into context.
Figure 1 provides a graphical structure of the simple model for data set 1.
[INSERT FIGURE 1 ABOUT HERE]
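The four state definitions can be sketched as a relabeling of each journey's channel sequence. The following Python fragment is a hedged illustration only: the function name `label_states`, the label format, and the treatment of single-contact journeys in the bathtub model (labeled as "first" here) are assumptions, because the text does not specify them.

```python
def label_states(channels, model):
    """Map a channel sequence to state labels for one of the four model types."""
    n = len(channels)
    if model == "simple":
        return list(channels)
    if model == "forward":
        # Position counted from the beginning of the journey.
        return [f"{c}{i}" for i, c in enumerate(channels, start=1)]
    if model == "backward":
        # Position counted from the last observation (last contact = 1).
        return [f"{c}{n - i}" for i, c in enumerate(channels)]
    if model == "bathtub":
        labels = []
        for i, c in enumerate(channels):
            if i == 0:
                pos = "first"  # assumption: a single contact counts as "first"
            elif i == n - 1:
                pos = "last"
            else:
                pos = "intermediate"
            labels.append(f"{c}_{pos}")
        return labels
    raise ValueError(f"unknown model type: {model}")

journey = ["SEA", "DISPLAY", "SEO", "SEA"]
forward_states = label_states(journey, "forward")
bathtub_states = label_states(journey, "bathtub")
```

Applied to the exemplary journey above, the bathtub labeling yields one first state, two intermediate states, and one last state, matching the set Sbathtub apart from the special states.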
Model Order
First-order Markov models, as used by Archak, Mirrokni, and Muthukrishnan (2010), assume
that the present depends only on the first lag and do not incorporate earlier observations.
Because previous research suggests that clickstreams should not be regarded as first-order
Markovian (Chierichetti et al. 2012; Montgomery et al. 2004), we introduce alternative
higher-order Markov models, in which the present depends on the last k observations. Transition
probabilities thus can be defined as follows:
$$P(X_t = s_t \mid X_{t-1} = s_{t-1}, X_{t-2} = s_{t-2}, \ldots, X_1 = s_1) = P(X_t = s_t \mid X_{t-1} = s_{t-1}, X_{t-2} = s_{t-2}, \ldots, X_{t-k} = s_{t-k})$$
For our implementation, we exploit the knowledge that a Markov chain of order k, over some
alphabet A, is equivalent to a first-order Markov chain over the alphabet A^k of k-tuples.
States in higher-order models therefore include k-tuples of states in the first-order models.
Unfortunately, the number of independent parameters increases exponentially with the order
of the Markov chain and quickly becomes too large to be estimated efficiently with real-
world data sets (Berchtold and Raftery 2002). Considering these implementation issues in
relation to algorithmic efficiency, we limit our analyses to Markov chains of a maximum
order of four. By combining the different types of state definitions with model orders
between one and four, we analyze 16 different models in total.
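The k-tuple construction can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function name `lift_to_order_k` is hypothetical, and journeys shorter than k yield no tuples here, whereas a full implementation would need a padding rule that the text does not specify.

```python
def lift_to_order_k(path, k):
    """Rewrite a state path as overlapping k-tuples (first-order chain over A^k)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sliding window of length k; consecutive tuples overlap in k - 1 states.
    return [tuple(path[i:i + k]) for i in range(len(path) - k + 1)]

path = ["START", "SEA", "DISPLAY", "SEA", "CONVERSION"]
second_order = lift_to_order_k(path, 2)

# Upper bound on the number of k-tuple states for a hypothetical set of
# four channels, illustrating the exponential growth with model order.
channels = ["SEA", "DISPLAY", "SEO", "AFFILIATE"]
state_bounds = {k: len(channels) ** k for k in range(1, 5)}
```

With four channels, the bound grows from 4 states in the first-order model to 256 in the fourth-order model, which illustrates why the order is capped.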
Ad Factors
The representation as Markov graphs allows identifying structural correlations in the
customer journey data that can be used to develop an attribution model. Archak, Mirrokni,
and Muthukrishnan (2010) propose a set of ad factors to capture the role of each state, such
as Eventual Conversion(si), or the probability of reaching conversion from a given state si.
Visit(si) is the probability of passing si on a random walk beginning in the START state. For
attribution modeling, we propose using the ad factor Removal Effect(si), defined as the
change in probability of reaching the CONVERSION state from the START state when we
remove si from the graph. We assume that all incoming edges of the state si that we remove
are redirected to the absorbing NULL state. Using this assumption, Removal Effect(si) is
equivalent to the product of Visit(si) and Eventual Conversion(si) and can be efficiently
calculated using matrix multiplication or applying local algorithms provided by Archak,
Mirrokni, and Muthukrishnan (2010).
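The definition of Removal Effect(si) can be illustrated with a small Python sketch that computes the conversion probability from START directly, once on the full graph and once with all edges into the removed state redirected to NULL. This is a sketch under assumptions, not the matrix-based or local algorithms referred to above: it uses a simple fixed-point iteration, and the function name `conversion_probability` and the transition probabilities are hypothetical.

```python
def conversion_probability(w, removed=None, tol=1e-12, max_iter=10_000):
    """P(reach CONVERSION from START); edges into `removed` lead to NULL."""
    p = {s: 0.0 for s in w}
    p.update({"CONVERSION": 1.0, "NULL": 0.0})
    for _ in range(max_iter):
        delta = 0.0
        for s, edges in w.items():
            if s == removed:
                continue  # the removed state is never entered
            new = sum(
                prob * (0.0 if t == removed else p.get(t, 0.0))
                for t, prob in edges.items()
            )
            delta = max(delta, abs(new - p[s]))
            p[s] = new
        if delta < tol:
            break
    return p["START"]

# Hypothetical transition probabilities for a two-channel simple model.
w = {
    "START": {"SEA": 0.6, "DISPLAY": 0.4},
    "SEA": {"CONVERSION": 0.2, "DISPLAY": 0.3, "NULL": 0.5},
    "DISPLAY": {"CONVERSION": 0.1, "SEA": 0.2, "NULL": 0.7},
}
base = conversion_probability(w)
removal_effect = {
    s: base - conversion_probability(w, removed=s) for s in w if s != "START"
}
```

For larger graphs, solving the corresponding linear system would likely be preferable to iteration; the iteration is used here only to keep the sketch free of dependencies.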
RESULTS
We evaluate our models according to the previously established criteria: objectivity,
predictive accuracy, robustness, interpretability, versatility, and algorithmic efficiency. We
also compare our results against existing attribution heuristics.
Application of Evaluation Criteria
Because it includes all contacts in the analysis and makes no previous assumptions
about the importance of individual channels or channel order, the graph-based framework we
propose satisfies the objectivity criterion. In contrast with existing practical applications, the
analyses are completely data driven, and the mechanics of model building and ad factor
calculation are fully disclosed and reproducible.
Predictive accuracy measures how many conversion events get classified correctly.
However, standard metrics for classification accuracy, such as percentage correctly classified or
log likelihood, are poor measures of classification performance when misclassification costs
are unequal or when the class distribution is skewed (He and Garcia 2009;
Provost, Fawcett, and Kohavi 1998). Journey conversion rates in our data sets are
approximately 5%, making class distributions highly unbalanced. We therefore choose the
area under the ROC curve (AUC) as an alternative measure for predictive performance that
decouples classification performance from class distributions and misclassification costs. A
ROC curve is a two-dimensional graph; the true positive rate is plotted on the y-axis, while
the false positive rate appears on the x-axis (Bradley 1997; Fawcett 2006). To compare our
models, we reduce ROC performance to a single scalar value, the area under the ROC curve
(Bradley 1997). This AUC can take values between 0 and 1, though a realistic classifier should
have an AUC of more than .5, the value reached by random guessing. Figure 2 contains the ROC
curves for the simple models.
[INSERT FIGURE 2 ABOUT HERE]
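For readers unfamiliar with the measure, the AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen converting journey receives a higher score than a randomly chosen non-converting journey. The Python sketch below uses this rank-based formulation; the function name `auc`, the quadratic pairwise comparison, and the toy scores are assumptions for illustration and would not scale to large data sets.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count pairs in which the positive outranks the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted conversion probabilities and observed outcomes.
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 0, 0]
example_auc = auc(scores, labels)
```

A classifier that assigns the same score to every journey obtains an AUC of .5 under this definition, regardless of how skewed the class distribution is.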
To ensure practical applicability, we measured predictive performance both within and out of
sample. We use 10-fold cross-validation, which is superior to leave-one-out validation or
bootstrapping, because all the data serve as the holdout once (Kohavi 1995; Sood, James, and
Tellis 2009). The cross-validation results are in Table 4.
[INSERT TABLE 4 ABOUT HERE]
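The cross-validation scheme can be sketched as follows. The Python fragment is a minimal illustration, assuming that journeys are assigned to folds at random; the function name `k_fold_indices`, the seed, and the fold assignment rule are hypothetical choices, not details reported for this study.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, holdout) index lists; each item is held out exactly once."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of nearly equal size
    for i, holdout in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, holdout

# 25 hypothetical journeys split into 10 folds.
splits = list(k_fold_indices(25))
```

In each repetition, the model would be estimated on the training journeys and the AUC computed on both the training journeys (within sample) and the holdout journeys (out of sample).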
Although the overall predictive accuracy varies significantly between data sets, the relative
performance of the 16 model types is comparable, leading to similar rankings of the model
types. Within- and out-of-sample performance is nearly identical, indicating high suitability
for forecasting future events and a low risk of overfitting. Increasing the memory capacity of
the models improves predictive performance. Second-order models outperform first-order
models. The largest performance increase results from moving from second- to third-order
models. Increasing the memory capacity to four lags further improves predictive
performance, though only marginally in most cases. State definitions have little impact on
predictive accuracy, except for forward models. For the lower-order models, the forward
variant exceeds the other models in predictive accuracy, though this difference decreases with
model order and disappears for fourth-order models.
The third evaluation criterion, robustness, applies to two measures. First, predictive
accuracy should be robust across all cross-validation repetitions. Table 4 lists the standard
deviations of the area under the ROC curve for each model. The standard deviations and
coefficients of variation increase for higher-order models, though the mean coefficient of
variation of .013 (within and out-of-sample) implies low overall variation. Second, the
variable used for attribution modeling should provide stable attribution results that offer a
reliable basis for managerial decisions, such as budget shifts. Therefore, we specifically test
the robustness of the Removal Effect(si) ad factor. For each model state, we compute the
average standard deviation of the removal effect across 10 cross-validation repetitions. We
report the stability of the removal effects as percentages of the average removal effect across
all states, as the number of states per model and correspondingly the mean Removal Effect(si)
varies. We summarize these validation results in Table 5.
[INSERT TABLE 5 ABOUT HERE]
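The stability measure can be sketched as follows. This Python fragment is an illustration under assumptions: the function name `relative_instability` and the toy values are hypothetical, and the use of the sample standard deviation (rather than the population version) is a choice the text does not specify.

```python
from statistics import mean, stdev

def relative_instability(removal_by_fold):
    """Average per-state SD of the removal effect, as % of the mean removal effect.

    removal_by_fold: list of {state: removal effect} dicts, one per repetition.
    """
    states = list(removal_by_fold[0])
    per_state_sd = [stdev([fold[s] for fold in removal_by_fold]) for s in states]
    overall_mean = mean([fold[s] for fold in removal_by_fold for s in states])
    return 100.0 * mean(per_state_sd) / overall_mean

# Hypothetical removal effects from three repetitions of a two-state model.
folds = [
    {"SEA": 0.10, "DISPLAY": 0.04},
    {"SEA": 0.12, "DISPLAY": 0.04},
    {"SEA": 0.11, "DISPLAY": 0.04},
]
instability_pct = relative_instability(folds)
```

Dividing by the mean removal effect makes models with different numbers of states comparable, because the mean removal effect per state shrinks as the number of states grows.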
For all data sets in our sample, the average standard deviation as a percentage of the average
removal effect increases with model order. For first-order models, variation is lowest for
simple and backward models and highest for forward models.
Although objectivity, predictive accuracy, and robustness represent necessary
conditions for attribution models, additional criteria such as interpretability must be fulfilled
to foster acceptance and application in practice. Even without advanced statistical knowledge,
managers prefer to comprehend how the models work and generate results (Lilien 2011;
Little 1970; Lodish 2001). The graphical representation (Figure 1) of our framework can help
online marketing managers understand the basic concept. In discussions with online
marketing managers, we discovered that despite their initial skepticism toward algorithmic
attribution approaches in general, the proposed framework was regarded as easy to interpret
and well accepted. The output metrics can be provided in the same format as existing
heuristics and are intuitively interpretable and easy to communicate to other stakeholders.
Because it requires no preliminary assumptions about channels or decision processes,
our framework is highly versatile. The only prerequisite for building graphical models is the
availability of historical, individual-level tracking data. Our framework can evaluate various
conversion types, including sales, sign-ups, or leads, and easily integrate new online
marketing channels. Analyses might run on different aggregation levels, such that users can
analyze not only channels but also advertising campaigns or even different creatives.
Considering the large data volumes in online marketing (Bucklin and Sismeiro 2009)
and practitioners’ requests for regular updates (Econsultancy 2012a), algorithmic efficiency
has become a decisive criterion for attribution models. Using matrix multiplication or
applying local algorithms, all the models can be calculated efficiently and allow for frequent
updates. However, as the number of independent parameters increases exponentially with the
order of the Markov chain, lower-order models may be more suitable for analyses on
campaign or creative levels or for updating models in real-time.
Combining objectivity and measures of model fit with practical considerations, we
recommend using first- and second-order models for standard attribution analyses. The
simple first-order model can be compared most easily with existing attribution heuristics that
yield one coefficient per channel and offer a good trade-off between accuracy and stability.
Alternative state definitions (forward, backward, bathtub) can differentiate the effects for
particular positions in the customer journey. Using higher-order models also yields additional
insights into channel interactions, further increasing managers’ understanding of the interplay
across channels. We illustrate these findings with exemplary analyses next.
Comparison with Existing Attribution Heuristics
We compare the attribution results of our proposed framework with the last and first
click wins heuristics, that is, the versions most widely used in industry practice (Econsultancy
2012a). To present comparable results, we used simple first-order Markov models. The
removal effects per state appear as percentages of the sum of all removal effects, excluding
the special states START, CONVERSION, and NULL. Our analyses are based on overall
data sets, without any hold-out samples. Figure 3 shows the value contribution of each
channel towards final conversions according to last and first click wins heuristics and the
simple first-order Markov model. The columns to the right show the percentage difference
between attribution computed by our graph-based model and existing heuristics.
[INSERT FIGURE 3 ABOUT HERE]
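The comparison can be illustrated with a short Python sketch that computes the last and first click wins shares and expresses removal effects as percentages of their sum. All names and numbers are hypothetical, and the final difference is computed in percentage points, which is one possible reading of the "percentage difference" shown in the figure.

```python
from collections import Counter

def click_wins_shares(converting_journeys, position):
    """Credit each conversion to one contact: position 0 (first) or -1 (last)."""
    credit = Counter(channels[position] for channels in converting_journeys)
    total = sum(credit.values())
    return {channel: 100.0 * n / total for channel, n in credit.items()}

def removal_effect_shares(removal_effects):
    """Express removal effects as percentages of their sum."""
    total = sum(removal_effects.values())
    return {channel: 100.0 * v / total for channel, v in removal_effects.items()}

# Hypothetical converting journeys and removal effects (special states excluded).
converting_journeys = [["SEA", "DISPLAY"], ["DISPLAY", "SEA"], ["SEO", "SEA"]]
last_click = click_wins_shares(converting_journeys, position=-1)
first_click = click_wins_shares(converting_journeys, position=0)
markov = removal_effect_shares({"SEA": 0.16, "DISPLAY": 0.08, "SEO": 0.06})

# Difference in percentage points between the Markov model and last click wins.
gap_vs_last = {c: markov[c] - last_click.get(c, 0.0) for c in markov}
```

In this toy example, SEO never appears as the last contact and therefore receives no credit under last click wins, although its removal effect is positive.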
We observe substantial differences between the results of the Markov model and those of
the last and first click wins heuristics. Both SEO and display are consistently undervalued by
the heuristic attribution approaches. For the other channels, the picture is less homogeneous.
Affiliate marketing seems overvalued by the last click wins approach but undervalued by first
click wins, though not in all cases. Similar tendencies hold for direct type-ins, when users
directly access the company website. The remaining channels (SEA, price comparison,
newsletter, referrer, and social media) leave a more ambiguous picture, such that the
implications need to be derived and verified individually for each data set.
For a more detailed view, models with higher orders or different state definitions
should be taken into consideration; in Table 6 we provide the results for the second-order
simple model for data set 1. For many channels, including SEA and newsletter, the increase
in overall purchase probability is highest right after the START state, near the beginning of
the journey. Sequences of identical channels show high removal effects, which might indicate
channel preferences for some users. For example, affiliate preceded by affiliate has a
percentage removal effect of 5.94%, whereas the average removal effect for affiliate
preceded by another channel is only 1.57%. Although SEO and affiliate are comparable in
their total effects, the removal effect of SEA preceded by affiliate is significantly lower than
that of SEA preceded by SEO. Furthermore, SEO seems to work especially well if preceded
by another interaction in a search context (SEO or SEA).
[INSERT TABLE 6 ABOUT HERE]
We summarize the results of the first-order bathtub model applied to data set 1 in
Table 7. Both affiliate and SEO are very strong in the last position. The removal effect of
SEA is highest in the first position, lower in the intermediate position, and higher again in the
last position. Display, in contrast, seems less effective at the end of the journey but stronger
in the intermediate and first positions. This result is in line with findings that imply that
display tends to be located at earlier stages of the conversion funnel, influencing the progress
from one stage to another rather than ultimate sales (Abhishek, Fader, and Hosanagar 2012).
[INSERT TABLE 7 ABOUT HERE]
Applying models with higher model orders or different state definitions, in addition to
the simple first-order model that can easily be compared with existing heuristics, enables
advertisers to gain a more detailed understanding of the interplay across channels and the
effectiveness of different positions in the customer journey.
IMPLEMENTATION AND IMPACT
We implemented our attribution framework in a real industry environment, such that
we can illustrate how our approach contributes to marketing practice and theory.
Implementation
We developed a prototype of our framework, including all 16 model types, and
evaluated it as described in the previous sections before implementing a real-life system at a
German multichannel tracking provider. The provider offers the attribution tool as part of its
multichannel tracking services. Thus far, several companies from diverse sectors, including
mobile network operators and online retailers, have applied our attribution tool in
practice, confirming its high usability and practical impact. In transferring the prototype, we
took special care to enhance algorithmic efficiency. The features and functionality of the
model were designed in close cooperation with practitioners (i.e., selected clients of the
multichannel tracking provider) to meet their expectations for their day-to-day business. The
tool assists practitioners in enhancing the transparency of their online activities, enabling
data-driven and more rational budget decisions. To facilitate adoption, advertisers can assess
model results in relation to well-established attribution heuristics. Prior research has shown
that putting model results in the context of previous situations also can improve managerial
decision making (Hoch and Schkade 1996). Following the specific needs and requests of the
practitioners, we assigned a special focus to the versatility of the implementation. Advertisers
can analyze different conversion types and individually define conversion events of interest,
such as purchases or newsletter sign-ups. The model also allows users to decompose and
aggregate channels and campaigns, reflecting the individual splits of the advertiser. Because
of the overall versatility of the underlying framework, this tool can be applied in various
industries and environments and include emerging online advertising channels. The only
prerequisite for using it is the availability of customer journey tracking data on the individual
cookie level. These data must include a unique journey identifier to differentiate individual
journeys, as well as a channel identifier and time stamp for each customer interaction to
establish the chronological order.
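The data prerequisite can be made concrete with a minimal Python sketch that turns raw tracking rows into ordered journeys. The row layout (journey identifier, time stamp, channel identifier), the function name `build_journeys`, and the sample rows are assumptions for illustration; the time stamps are ISO 8601 strings, which sort chronologically as text.

```python
from collections import defaultdict

def build_journeys(rows):
    """Group (journey_id, timestamp, channel) rows into time-ordered channel lists."""
    grouped = defaultdict(list)
    for journey_id, timestamp, channel in rows:
        grouped[journey_id].append((timestamp, channel))
    return {
        journey_id: [channel for _, channel in sorted(events, key=lambda e: e[0])]
        for journey_id, events in grouped.items()
    }

# Hypothetical tracking rows, deliberately out of chronological order.
rows = [
    ("j1", "2013-01-02T10:00:00", "DISPLAY"),
    ("j2", "2013-01-01T09:30:00", "SEO"),
    ("j1", "2013-01-01T08:00:00", "SEA"),
]
journeys = build_journeys(rows)
```

The resulting channel sequences, together with a conversion flag per journey, are the only input required to estimate the graphs.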
Impact
Our framework contributes to marketing theory and practice in several ways. First, we
introduce a comprehensive set of six criteria required for successful attribution models.
Building on existing literature related to the acceptance of marketing decision models in
practice (Lehmann, McAlister, and Staelin 2011; Lilien 2011; Little 1970, 1979, 2004;
Lodish 2001), we ensure scientific rigor by assessing objectivity, predictive accuracy, and
robustness; we also include criteria to encourage application in practice, namely,
interpretability, versatility, and algorithmic efficiency. Whereas previous studies have
discussed selected properties for attribution methods (Dalessandro et al. 2012; Shao and Li
2011), we present the first exhaustive set of criteria that acknowledges practitioners’
requirements and fully reflects them in the attribution framework. Clear criteria reduce the
barriers to applying attribution techniques in managerial practice (Econsultancy 2012a) and
foster standardization and cross-industry acceptance (Dalessandro et al. 2012). Increased
objectivity in evaluating the contribution of online marketing channels also should result in
fairer remuneration for advertising-financed publishers (Jordan et al. 2011). The incentives of
advertisers and other market actors, such as publishers or agencies, are seldom congruent
(Abou-Nabout et al. 2012), which creates the demand for independent, objective criteria to
assess attribution models.
Second, we propose a novel, graph-based framework for analyzing multichannel
online customer journeys, represented as Markov walks in directed graphs. In addition to a
simple, non-homogeneous model, we use three time-homogeneous models (forward,
backward, and bathtub), to incorporate the position of the contact in the customer journey.
We further introduce higher-order Markov models, in which the present depends on the last k
observations, in an online marketing context. The representation in directed Markov graphs
supports the calculation of ad factors (Archak, Mirrokni, and Muthukrishnan 2010) that
capture specific properties of each state. We use the Removal Effect(si) ad factor, defined as
the change in probability of reaching the CONVERSION state from the START state when si is
removed from the graph, to derive state and channel contributions, respectively. In total, we
rigorously evaluate 16 different models according to our criteria for successful attribution
models using four large-scale, real-life data sets. A comparison of the results against existing
attribution heuristics shows substantial differences between the results of the Markov model
and the last and first click wins approaches. Thus we provide a practice-oriented alternative
to widely used, often misleading attribution heuristics. We also extend existing attribution
literature (Abhishek, Fader, and Hosanagar 2012; Dalessandro et al. 2012; Haan, Wiesel, and
Pauwels 2013; Kireyev, Pauwels, and Gupta 2013; Li and Kannan 2013; Shao and Li 2011),
by introducing an approach that meets both academic standards of objectivity, predictive
accuracy, and robustness and the criteria relevant for implementation in practice.
Third, scientifically validated attribution models help resolve several managerial
problems. Decision making is a complex task for online advertisers, in that it spans various
online marketing channels and goals (Raman et al. 2012). Our framework can facilitate
independent managerial (budget) decisions by providing easy-to-interpret, objective
information that factors out subjective influences. Budgets should be allocated across
channels, according to their value contribution. Certain channels may be underrepresented;
others contribute little to the company’s success at relatively high costs. Tucker (2012) finds,
in a related study, that attribution can enable advertisers to substitute towards more successful
campaigns, leading to more conversions at lower costs. Furthermore, to shape digital
marketing strategies, advertisers need to step into the shoes of (potential) customers to
understand and anticipate their online behavior. In other words, advertisers need to know
where to meet customers online to make strategic channel decisions. Our framework reveals
which channels customers use and to what degree they drive marketing effectiveness. Thus
advertisers can use it to constantly review and adjust their current strategic orientation.
Such usage also should enhance decision makers’ expertise and update their mental
models, which are prone to systematic errors and biases (Tversky and Kahneman 1974).
Online marketing managers often base their decisions on simple heuristics, combined with
personal expertise. Daily work with our model and its results would help them gradually
build new knowledge and better understand the interplay of online marketing measures with
their success drivers. Even when detailed tracking data are available, personal preferences
likely still affect budget and channel decisions. By setting our model in the context of well-
known approaches such as last click wins heuristics, we give decision makers a means to
calibrate their marketing measures, anticipate their impact, and sharpen their expertise, such
that they should improve their future marketing decisions.
The introduction of data-driven attribution also suggests effects on hierarchical
structures within organizations and group decision making. Budget decisions are often group
decisions, resulting from multiple meetings that are influenced by hierarchical superiority or
other influences, such as individual agendas or company politics (Fischer et al. 2011; Sinha
and Zoltners 2001). Our approach is purely data driven and acts as an unbiased instance of
support for marketing decisions. In meeting the objectivity criterion, it is devoid of personal
assumptions, preferences, and emotions that could adversely affect the accuracy of the
results, in marked contrast with existing, widely used attribution methodologies that rely on
the (pre)definition of channel or position weights by advertisers (Econsultancy 2012a). As a
result, team-based budget decisions can be made in discussions that leverage both practitioners’
experience and data-driven analyses.
Moreover, the versatility of our framework makes it generalizable to many industries
and applications, unlike attribution techniques whose highly sophisticated solutions cannot
be transferred easily to other firms or contexts (Abhishek, Fader, and Hosanagar 2012; Haan,
Wiesel, and Pauwels 2013; Kireyev, Pauwels, and Gupta 2013; Li and Kannan 2013).
Regarding its output, our model flexibly and efficiently evaluates various conversion types
(e.g., sales, sign-ups, leads), depending on the advertiser’s specific aim. In this sense, our
model sheds light onto multiple functions across the company’s value chain, not just sales.
For example, recruiters need to understand where to meet candidates online, where they lose
them, and how to develop appropriate countermeasures. Compared with other attribution
models that are purely retrospective, our proposed graph-based framework can be used
prospectively. Thus a possible application is real-time bidding in ad exchanges, where
advertisers can bid on advertising slots for specific users using information such as the user’s
location or previous surfing behavior (Muthukrishnan 2009). Advertising exchanges serve as
intermediaries between online publishers and advertisers. When a user visits a webpage with
an open (display) advertising slot, the publisher posts the slot in the exchange. Relying on
information provided about the user, such as his or her cookie history, advertisers can bid on
the slots. After the winning advertiser has been determined by auction, the publisher serves
the winning advertiser’s creative content to the user. This entire process happens in
milliseconds, between the time the user requests a page and the time the page is rendered on
the screen (Muthukrishnan 2009). Using our framework, advertisers can more accurately
calculate the conversion probability Eventual Conversion(si) of a customer, given his or her
previous customer journey.⁶ The predicted change in Eventual Conversion(si) when the
advertiser wins the auction for the slot and the advertisement is shown to the user also can be
used to calculate the value of this slot and thus determine a maximum cost-effective bid for
this user. The short timeframe for determining a bid means that the primary system-related
restriction for real-time bidding is algorithmic efficiency. Once the model we propose has
been fully built from historical data, however, the calculation of ad factors such as Eventual
Conversion(si) reduces to a single look-up, which makes the framework highly attractive
for real-time applications in ad exchanges.
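The look-up logic can be sketched as follows. This Python fragment is a hedged illustration only: the function name `max_bid`, the precomputed table of Eventual Conversion values, the state labels, and the monetary value per conversion are all hypothetical, and the sketch ignores auction mechanics and costs.

```python
def max_bid(eventual_conversion, state_before, state_after, conversion_value):
    """Value of an ad slot: predicted uplift in conversion probability times value."""
    uplift = eventual_conversion[state_after] - eventual_conversion[state_before]
    # Never bid for a slot that is predicted to lower the conversion probability.
    return max(0.0, uplift * conversion_value)

# Hypothetical precomputed table for a second-order model: each key is a state,
# each value the probability of eventually converting from that state.
eventual_conversion = {
    ("START", "SEA"): 0.040,
    ("SEA", "DISPLAY"): 0.055,
}
bid = max_bid(
    eventual_conversion,
    state_before=("START", "SEA"),
    state_after=("SEA", "DISPLAY"),
    conversion_value=80.0,
)
```

Because both probabilities are read from a table built offline, the work at bidding time consists of two dictionary look-ups and one multiplication.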
Fourth, our evaluation results offer new insights into online marketing effectiveness in
a multichannel setting. The higher-order models significantly outperform first-order models,
which indicates that channels in customer journeys should not be analyzed in isolation.
Similarly, prior findings show that browsing patterns within a website are not first-order
Markovian and can be predicted better by higher-order Markov models (Chierichetti et al.
2012); existing studies also reveal the poor prediction results of first-order Markov models
for within-site clickstreams (Montgomery et al. 2004). Thus our results add to the evidence
that last click wins attribution heuristics cannot capture the full contribution of online
channels. In line with other studies (Abhishek, Fader, and Hosanagar 2012; Li and Kannan
2013), we assert that some channels, such as display or SEO, are undervalued by existing
attribution heuristics, whereas the contributions of other channels, such as SEA, may be
overestimated. Using four large-scale data sets in three different industries, we affirm some
results in previous studies that used only a single industry and were based on substantially
smaller data sets. However, the variation in our results (e.g., for price comparison and
newsletter or the general importance of channels) shows that findings pertaining to online
channel effectiveness and attribution should not be generalized from findings based on a
single data set; they need to be analyzed on a case-by-case basis. This outcome reemphasizes
the need for an easily applicable, versatile attribution framework.
Fifth, we introduce alternative measurement methods for predictive accuracy in online
marketing research. With average journey conversion rates of less than 5% for click journeys,
class distributions in online marketing are highly skewed. Traditional measures used in
similar studies, such as average misclassification error rates or log-likelihood, have low
discriminatory power in such imbalanced contexts (He and Garcia 2009), so we introduce
ROC analysis as an alternative measurement method. Although widely used in other
disciplines, ROC curves have not been applied frequently in the marketing domain (Baesens
et al. 2002; Fawcett 2006) and to our knowledge have not been introduced previously to
online marketing research.
Sixth and finally, our research answers a call for marketing impact models based on
individual-level, single-source data, which help identify optimal levels of marketing
expenditures for each channel (Rust et al. 2004). Methodologically, we provide a new
perspective on path data in marketing (Hui, Fader, and Bradlow 2009) and present efficient
methods for handling large, real-world advertising data sets (Bucklin and Sismeiro 2009).
OUTLOOK
Our research has several limitations that may stimulate research on attribution and
online marketing effectiveness. Although we used four data sets from different industries,
some findings may be company specific. The customer journeys in these data sets were short
on average, so these attribution results only show limited differences to existing heuristics,
whereas longer journeys increase advertisers’ need to understand channel contributions. We
therefore recommend applying this framework to other industries and including not just
clicks but views as well. We also did not include the varying costs of different online
advertising channels and potential differences in conversion revenues in the analysis. To
evaluate the effectiveness of online marketing channels, companies should consider costs
incurred per channel, profits from conversions, and, potentially in a second step, the
customer lifetime value of customers acquired. As Chan, Wu, and Xie (2011) show,
customers acquired through different online marketing channels differ in customer lifetime
value. Further research should extend our framework to include this information and develop
advanced attribution models. We believe our graph-based approach is well suited for such
extensions with additional data. We also do not offer an algorithm for budget optimization.
The attribution problem is by definition endogenous; it measures the relative effectiveness of
channels in a given setting (Li and Kannan 2013), so the results are conditional on a number
of management decisions, such as channels used, budget limits per channel, or ad creatives
employed. We thus cannot suggest general recommendations for budget optimization.
However, objective attribution is a necessary prerequisite for managers to optimize their
budget decisions; subject to data availability, it could serve as a basis for developing
optimization algorithms. Finally, a strict causal interpretation of customer journeys is
difficult, because alternative explanations may exist for the correlations between conversions
and advertising exposures. Some channels, such as retargeting, explicitly try to target
customers who have a higher propensity to purchase (Lambrecht and Tucker 2013). Even
without special targeting, observed correlations might be due to selection effects, such as an
activity bias (Lewis, Rao, and Reiley 2011). To establish a strict causal relationship between
advertising and purchase behavior, large-scale field experiments with randomized exposure
are required. Such experiments are hard to implement in practice, especially in multichannel
settings, but comparing our attribution modeling framework against experimental results
would be a valuable follow-up to the present study.
We thus urge marketing researchers to continue to analyze (online) advertising
effectiveness in multichannel settings to make sense of the newly available wealth of data
gained from new tracking technologies. We hope this work contributes to ongoing efforts to
bridge the gap between academic research and managerial practice and to establish rigorous,
practically applicable models for measuring marketing effectiveness.
REFERENCES
Abhishek, Vibhanshu, Peter S. Fader, and Kartik Hosanagar (2012), “The Long Road to
Online Conversion: A Model of Multi-Channel Attribution,” Carnegie Mellon University /
University of Pennsylvania.
Abou-Nabout, Nadia, Bernd Skiera, Tanja Stepanchuk, and Eva Gerstmeier (2012), “An
Analysis of the Profitability of Fee-Based Compensation Plans for Search Engine
Marketing,” International Journal of Research in Marketing, 29 (1), 68–80.
Archak, Nikolay, Vahab S. Mirrokni, and S. Muthukrishnan (2010), “Mining Advertiser-
Specific User Behavior Using Adfactors,” in Proceedings of the 19th International
Conference on World Wide Web, WWW 2010, Michael Rappa, Paul Jones, Juliana Freire
and Soumen Chakrabarti, eds.: ACM, 31–40.
Baesens, Bart, Stijn Viaene, Dirk van Den Poel, Jan Vanthienen, and Guido Dedene (2002),
“Bayesian Neural Network Learning for Repeat Purchase Modelling in Direct Marketing,”
European Journal of Operational Research, 138 (1), 191–211.
Berchtold, André and Adrian E. Raftery (2002), “The Mixture Transition Distribution Model
for High-Order Markov Chains and Non-Gaussian Time Series,” Statistical Science, 17
(3), 328356.
Bradley, Andrew P. (1997), “The Use of the Area under the ROC Curve in the Evaluation of
Machine Learning Algorithms,” Pattern Recognition, 30 (7), 11451159.
Braun, Michael and Wendy Moe (2013), “Online Display Advertising. Modeling the Effects
of Multiple Creatives and Individual Impression Histories,” Marketing Science,
forthcoming.
Breuer, Ralph and Malte Brettel (2012), “Short- and Long-Term Effects of Online
Advertising. Differences between New and Existing Customers,” Journal of Interactive
Marketing, 26 (3), 155–166.
Breuer, Ralph, Malte Brettel, and Andreas Engelen (2011), “Incorporating Long-Term
Effects in Determining the Effectiveness of Different Types of Online Advertising,”
Marketing Letters, 22 (4), 327–340.
Bronnenberg, Bart J. (1998), “Advertising Frequency Decisions in a Discrete Markov
Process under a Budget Constraint,” Journal of Marketing Research, 35 (3), 399–406.
Bucklin, Randolph E. and Catarina Sismeiro (2009), “Click Here for Internet Insight.
Advances in Clickstream Data Analysis in Marketing,” Journal of Interactive Marketing,
23 (1), 35–48.
Chan, Tat Y., Chunhua Wu, and Ying Xie (2011), “Measuring the Lifetime Value of
Customers Acquired from Google Search Advertising,” Marketing Science, 30 (5),
837–850.
Chandler-Pepelnjak, John (2008), “Measuring ROI Beyond The Last Ad,” May 2008, Atlas
Institute.
Chatterjee, Patrali, Donna L. Hoffman, and Thomas P. Novak (2003), “Modeling the
Clickstream. Implications for Web-Based Advertising Efforts,” Marketing Science, 22 (4),
520–541.
Che, Hai and P. B. Seetharaman (2009), “‘Speed of Replacement’: Modeling Brand Loyalty
Using Last-Move Data,” Journal of Marketing Research, 46 (4), 494–505.
Chierichetti, Flavio, Ravi Kumar, Prabhakar Raghavan, and Tamás Sarlós (2012), “Are Web
Users Really Markovian?,” in Proceedings of WWW 2012: ACM, 609–618.
Çinlar, Erhan (1975), Introduction to Stochastic Processes. New Jersey: Prentice Hall.
Dalessandro, Brian, Claudia Perlich, Ori Stitelman, and Foster Provost (2012), “Causally
Motivated Attribution for Online Advertising,” M6D Research, New York, NY.
Econsultancy (2012a), “Marketing Attribution. Valuing the Customer Journey,” London.
——— (2012b), “Quarterly Digital Intelligence Briefing. Making Sense of Marketing
Attribution,” London.
Edelman, David C. (2010), “Branding in the Digital Age,” Harvard Business Review, 88
(12), 62–69.
Evans, David S. (2009), “The Online Advertising Industry. Economics, Evolution, and
Privacy,” Journal of Economic Perspectives, 23 (3), 37–60.
Fawcett, Tom (2006), “An Introduction to ROC Analysis,” Pattern Recognition Letters, 27
(8), 861–874.
Fischer, Marc, Sönke Albers, Nils Wagner, and Monika Frie (2011), “Dynamic Marketing
Budget Allocation across Countries, Products, and Marketing Activities,” Marketing
Science, 30 (4), 568–585.
Ghose, Anindya and Sha Yang (2009), “An Empirical Analysis of Search Engine
Advertising. Sponsored Search in Electronic Markets,” Management Science, 55 (10),
1605–1622.
Goldfarb, Avi and Catherine Tucker (2011), “Online Display Advertising. Targeting and
Obtrusiveness,” Marketing Science, 30 (3), 389–404.
Haan, Evert de, Thorsten Wiesel, and Koen Pauwels (2013), “Which Advertising Forms
Make a Difference in Online Path to Purchase?,” Marketing Science Institute Working
Paper Series No. 13.
He, Haibo and E. A. Garcia (2009), “Learning from Imbalanced Data,” IEEE Transactions
on Knowledge and Data Engineering, 21 (9), 1263–1284.
Hoch, Stephen J. and David A. Schkade (1996), “A Psychological Approach to Decision
Support Systems,” Management Science, 42 (1), 51–64.
Homburg, Christian, Viviana V. Steiner, and Dirk Totzek (2009), “Managing Dynamics in a
Customer Portfolio,” Journal of Marketing, 73 (5), 70–89.
Hui, Sam K., Peter S. Fader, and Eric T. Bradlow (2009), “Path Data in Marketing. An
Integrative Framework and Prospectus for Model Building,” Marketing Science, 28 (2),
320–335.
Jansen, Bernard J. and Simone Schuster (2011), “Bidding on the Buying Funnel for
Sponsored Search and Keyword Advertising,” Journal of Electronic Commerce Research,
12 (1), 1–18.
Jordan, Patrick, Mohammad Mahdian, Sergei Vassilvitskii, and Erik Vee (2011), “The
Multiple Attribution Problem in Pay-Per-Conversion Advertising,” in Proceedings of
SIGIR 2011: ACM.
Kireyev, Pavel, Koen Pauwels, and Sunil Gupta (2013), “Do Display Ads Influence Search?
Attribution and Dynamics in Online Advertising,” Harvard Business School, Boston, MA.
Kohavi, Ron (1995), “A Study of Cross-Validation and Bootstrap for Accuracy Estimation
and Model Selection,” in Proceedings of the Fourteenth International Joint Conference on
Artificial Intelligence: Morgan Kaufmann, 1137–1145.
Lambrecht, Anja and Catherine Tucker (2013), “When Does Retargeting Work? Information
Specificity in Online Advertising,” Journal of Marketing Research, forthcoming.
Lehmann, Donald R., Leigh McAlister, and Richard Staelin (2011), “Sophistication in
Research in Marketing,” Journal of Marketing, 75 (July), 155–165.
Lewis, Randall A., Justin M. Rao, and David H. Reiley (2011), “Here, There, and
Everywhere. Correlated Online Behaviors Can Lead to Overestimates of the Effects of
Advertising,” in Proceedings of SIGIR 2011: ACM.
Li, Hongshuang A. and P. K. Kannan (2013), “Attribution Modeling: Understanding the
Influence of Channels in the Online Purchase Funnel,” Marketing Science Institute
Working Paper Series No. 12.
Lilien, Gary L. (2011), “Bridging the Academic–Practitioner Divide in Marketing Decision
Models,” Journal of Marketing, 75 (4), 196–210.
Little, John D. C. (1970), “Models and Managers. The Concept of a Decision Calculus,”
Management Science, 16 (8), B-466–B-485.
——— (1979), “Decision Support Systems for Marketing Managers,” Journal of Marketing,
43 (3), 9–26.
——— (2004), “Comments on "Models and Managers. The Concept of a Decision
Calculus",” Management Science, 50 (12 Supplement), 1854–1860.
Lodish, Leonard M. (2001), “Building Marketing Models That Make Money,” Interfaces, 31
(3), S45.
Lohtia, Ritu, Naveen Donthu, and Idil Yaveroglu (2007), “Evaluating the Efficiency of
Internet Banner Advertisements,” Journal of Business Research, 60 (4), 365–370.
Lovett, John (2009), “A Framework for Multicampaign Attribution Measurement,” Forrester,
Cambridge, MA.
Montgomery, Alan L., Shibo Li, Kannan Srinivasan, and John C. Liechty (2004), “Modeling
Online Browsing and Path Analysis Using Clickstream Data,” Marketing Science, 23 (4),
579–595.
Muthukrishnan, S. (2009), “Ad Exchanges: Research Issues,” in Internet and Network
Economics. Lecture Notes in Computer Science, Vol. 5929, Stefano Leonardi, ed. Berlin
Heidelberg: Springer, 1–12.
Neslin, Scott A. and Venkatesh Shankar (2009), “Key Issues in Multichannel Customer
Management: Current Knowledge and Future Directions,” Journal of Interactive
Marketing, 23 (1), 70–81.
Osur, Ari (2012), “The Forrester Wave. Interactive Attribution Vendors, Q2 2012,” Forrester,
Cambridge, MA.
Pfeifer, Phillip E. and Robert L. Carraway (2000), “Modeling Customer Relationships as
Markov Chains,” Journal of Interactive Marketing, 14 (2), 43–55.
PriceWaterhouseCoopers (2013), “IAB Internet Advertising Revenue Report 2012,” New
York, NY.
Provost, Foster, Tom Fawcett, and Ron Kohavi (1998), “The Case against Accuracy
Estimation for Comparing Induction Algorithms,” in Proceedings of the 15th International
Conference on Machine Learning (ICML-98), J. Shavlik, ed., 445–453.
Raman, Kalyan, Murali K. Mantrala, Shrihari Sridhar, and Yihui Tang (2012), “Optimal
Resource Allocation with Time-Varying Marketing Effectiveness, Margins and Costs,”
Journal of Interactive Marketing, 26 (1), 43–52.
Ramsey, Geoff (2009), “Online Brand Measurement. Connecting the Dots,” eMarketer, New
York, NY.
Reibstein, David J., George Day, and Jerry Wind (2009), “Guest Editorial. Is Marketing
Academia Losing Its Way?,” Journal of Marketing, 73 (4), 1–3.
Riley, Emily (2009), “The Forrester Wave. Interactive Attribution, Q4 2009,” Forrester,
Cambridge, MA.
Robinson, Dan (2012), “Online Media – How Can We Tell What Really Works?,” (accessed
May 8, 2013), [available at
http://www.meaningful-brands.co.uk/2012/07/online-media-how-can-we-tell-what-really-works/].
Rust, Roland T., Tim Ambler, Gregory S. Carpenter, V. Kumar, and Rajendra K. Srivastava
(2004), “Measuring Marketing Productivity. Current Knowledge and Future Directions,”
Journal of Marketing, 68 (4), 76–89.
Rust, Roland T., Katherine N. Lemon, and Valarie A. Zeithaml (2004), “Return on Marketing.
Using Customer Equity to Focus Marketing Strategy,” Journal of Marketing, 68 (1),
109–127.
Rutz, Oliver J. and Randolph E. Bucklin (2011), “From Generic to Branded. A Model of
Spillover in Paid Search Advertising,” Journal of Marketing Research, 48 (1), 87–102.
Rutz, Oliver J., Michael Trusov, and Randolph E. Bucklin (2011), “Modeling Indirect Effects
of Paid Search Advertising. Which Keywords Lead to More Future Visits?,” Marketing
Science, 30 (4), 646–665.
Shao, Xuhui and Lexin Li (2011), “Data-Driven Multi-Touch Attribution Models,” in
Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining: ACM, 258–264.
Sinha, Prabhakant and Andris A. Zoltners (2001), “Sales-Force Decision Models: Insights
from 25 Years of Implementation,” Interfaces, 31 (3), S8.
Sood, Ashish, Gareth M. James, and Gerard J. Tellis (2009), “Functional Regression. A New
Model for Predicting Market Penetration of New Products,” Marketing Science, 28 (1),
36–51.
Styan, George P. H. and Harry Smith (1964), “Markov Chains Applied to Marketing,”
Journal of Marketing Research, 1 (1), 50–55.
Tucker, Catherine (2012), “The Implications of Improved Attribution and Measurability for
Online Advertising Markets,” in Competition in the Online Environment.
Tversky, Amos and Daniel Kahneman (1974), “Judgment Under Uncertainty: Heuristics and
Biases,” Science, 185 (4157), 1124–1131.
Wiesel, Thorsten, Koen Pauwels, and Joep Arts (2011), “Marketing's Profit Impact.
Quantifying Online and Off-Line Funnel Progression,” Marketing Science, 30 (4),
604–611.
Wübben, Markus and Florian Wangenheim (2008), “Instant Customer Base Analysis.
Managerial Heuristics Often ‘Get It Right’,” Journal of Marketing, 72 (3), 82–93.
Yang, Sha and Anindya Ghose (2010), “Analyzing the Relationship between Organic and
Sponsored Search Advertising. Positive, Negative, or Zero Interdependence?,” Marketing
Science, 29 (4), 602–623.
FOOTNOTES
1 We use the term “online marketing channels” to cover different online marketing
instruments, including search engine advertising, display, or social media advertising.
2 Similar concepts have been referred to as the buying funnel (Jansen and Schuster 2011),
purchasing funnel (Jordan et al. 2011), or the consumer decision journey (Edelman 2010).
3 Examples of companies used in this context include, but are not limited to, Adclean,
Adometry, Atlas, C3 Metrics, ClearSaleing, Coremetrics (IBM), Google, Theorem,
Trueeffect, Visual IQ, Icrossing, and [x+1].
4 Called adgraphs by Archak, Mirrokni, and Muthukrishnan (2010).
5 Two-click journeys only have a first and a last, but no intermediary position.
6 For forward-looking applications, simple third-order models are recommended, because
they offer a good combination of classification accuracy and stability. Backward and
bathtub models should not be applied in prospective contexts, because positions in
unfinished journeys cannot be determined unambiguously.
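The point of footnotes 5 and 6 can be illustrated with a minimal Python sketch of position labeling for a bathtub-style model. The function name `bathtub_positions` and the treatment of one-click journeys (labeled "first") are assumptions made for illustration, not definitions taken from the paper.

```python
def bathtub_positions(journey):
    """Label each click of a FINISHED journey as first, intermediate, or last.

    Assumption: a one-click journey is labeled "first"; the paper's own
    convention for that case is not stated in this section.
    """
    last_index = len(journey) - 1
    labels = []
    for index, channel in enumerate(journey):
        if index == 0:
            position = "first"
        elif index == last_index:
            position = "last"
        else:
            position = "intermediate"
        labels.append((channel, position))
    return labels


# A two-click journey has a first and a last position, but no intermediate one.
two_clicks = bathtub_positions(["SEA", "SEO"])
three_clicks = bathtub_positions(["Display", "SEA", "Affiliate"])
```

The "last" label depends on knowing the journey's final length, which is why such labels are ambiguous for journeys that are still in progress.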
TABLE 1: EVALUATION CRITERIA FOR ATTRIBUTION MODELS

| Criterion | Our Definition | Relation to Prior Research: Description |
|---|---|---|
| Objectivity | Models must be able to assign credit to individual channels or campaigns in accordance with their factual ability to generate value, such as contributing to conversions or increasing revenues. | Models should allow for computing the relative impact of decision variables and enable objectivity in evaluating decision options. |
| | | Attribution systems should reward an individual channel in accordance with its ability to affect the likelihood of conversion (fairness). |
| Predictive accuracy | Models should be able to predict conversion events correctly. | Predictive validity is important to persuade managers of a model's credibility. |
| | | Attribution models should have high accuracy in predicting active or inactive users (accuracy, A-metric). |
| Robustness | Models should deliver stable and reproducible results if they are run numerous times. | Models should be robust to avoid bad, unstable results. |
| | | Attribution models should deliver stable estimates (variability, v-metric). |
| Interpretability | Model structure should be transparent to all stakeholders with reasonable effort, and the results should be interpretable with relative ease. | Model users should be able to transfer model results directly into managerial decisions. |
| | | Models should be simple and easy to communicate. |
| | | Models should be easy to interpret, because managers refuse to apply black box approaches. |
| | | An attribution system needs to be accepted "by all parties with material interest" based on its "statistical merit" and the "intuitive understanding" of the system's components. |
| Versatility | Versatility combines adaptability and ease of control. Adaptability is the capability to incorporate new information that becomes available over time. Ease of control enables users to adjust inputs to fit company-specific requirements and derive appropriate outputs. | Models should be "adaptive" and "easy to control." "Adaptive" describes the capability to update the model as soon as new information becomes available; "easy to control" enables the user to adjust inputs to modify outputs. |
| | | Models should deliver an adequate level of aggregation to achieve acceptance by managers. |
| Algorithmic efficiency | The speed of computing model outputs when they are requested. | Model structures should be complete in relevant issues and able to handle many phenomena without being bogged down. |
| | | Models need to provide results as soon as managers require them to be applicable in practice. |
| | | As a basic precondition for practical purposes, a methodology must be able to handle large data volumes fast and efficiently. |
TABLE 2: EXISTING RESEARCH ON ATTRIBUTION MODELING

| Study | Methodology | Objectivity | Predictive accuracy¹ | Robustness | Interpretability | Versatility | Algorithmic efficiency |
|---|---|---|---|---|---|---|---|
| Shao and Li (2011) | (1) Bagged logistic regression; (2) Simple probabilistic model | No; frequency of contacts and positions not considered | Yes | Yes | Yes | Yes | Not available |
| Dalessandro et al. (2012) | Causally motivated methodology based on cooperative game theory (Shapley value) combined with logistic regression | No; frequency of contacts not considered | Yes | Not measured | Yes | Yes | Not available |
| Abhishek, Fader, and Hosanagar (2012) | Dynamic Hidden Markov Model | Yes | Yes | Not measured | Limited | Limited; assumptions on channels and structure of decision process | Not available |
| Li and Kannan (2013) | Bayes | Yes | Yes | Not measured | Limited | Limited; assumptions on channels and structure of decision process | Not available |
| Kireyev, Pauwels, and Gupta (2013) | Multivariate time-series model (persistence modeling) | No; not based on individual data | Yes | Not measured | Limited | Limited; application based on 2 channels (display and SEO) | Not available |
| Haan, Wiesel, and Pauwels (2013) | Structural vector autoregression | No; not based on individual data | Yes | Not measured | Limited | Limited; not suited for performance-based channels (e.g. affiliate) | Not available |
| Our study | Markov walks (first- and higher-order) | Yes | Yes | Yes | Yes | Yes | Yes |

¹ This table only indicates whether predictive accuracy is evaluated in the respective study. The data sets used and implementation details are not publicly available, and the measures vary, so a comparison of predictive accuracy across studies is not possible.
TABLE 3: DESCRIPTIONS

| | DS 1 | DS 2 | DS 3 | DS 4 |
|---|---|---|---|---|
| Industry | Travel | Fashion Retail | Fashion Retail | Luggage Retail |
| Number of different channels | 8 | 8 | 8 | 8 |
| Number of clicks | 1,083,901 | 625,798 | 405,906 | 314,890 |
| Number of journeys | 206,519 | 170,914 | 142,039 | 105,031 |
| Journey length | 5.25 (14.743) | 3.66 (4.402) | 2.86 (2.535) | 3.00 (8.850) |
| Number of conversions | 5,763 | 7,226 | 10,395 | 4,910 |
| Journey conversion rate | 2.79% | 4.23% | 7.32% | 4.67% |

Notes: Standard deviations are in parentheses.
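The derived rows of Table 3 can be recomputed from its count rows. The short Python sketch below assumes that journey length is the mean number of clicks per journey and that the journey conversion rate is conversions divided by journeys; the dictionary layout and the function name `summarize` are illustrative only.

```python
# Counts copied from Table 3.
datasets = {
    "DS 1": {"clicks": 1_083_901, "journeys": 206_519, "conversions": 5_763},
    "DS 2": {"clicks": 625_798, "journeys": 170_914, "conversions": 7_226},
    "DS 3": {"clicks": 405_906, "journeys": 142_039, "conversions": 10_395},
    "DS 4": {"clicks": 314_890, "journeys": 105_031, "conversions": 4_910},
}


def summarize(counts):
    """Return mean journey length and conversion rate (in percent), rounded to 2 decimals."""
    return {
        "journey_length": round(counts["clicks"] / counts["journeys"], 2),
        "conversion_rate_pct": round(100 * counts["conversions"] / counts["journeys"], 2),
    }


summary = {name: summarize(counts) for name, counts in datasets.items()}
```

Under these assumptions the recomputed values agree with the reported journey lengths and conversion rates for all four data sets.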
TABLE 4: AVERAGE AREA UNDER ROC CURVE

| Order | Model | Within Sample: DS 1 | Within Sample: DS 2 | Within Sample: DS 3 | Within Sample: DS 4 | Out of Sample: DS 1 | Out of Sample: DS 2 | Out of Sample: DS 3 | Out of Sample: DS 4 |
|---|---|---|---|---|---|---|---|---|---|
| 1st order | Simple | .8505 (.0030) | .7390 (.0031) | .6549 (.0057) | .6361 (.0086) | .8519 (.0043) | .7350 (.0053) | .6539 (.0019) | .6427 (.0092) |
| | Forward | .8946 (.0048) | .8221 (.0032) | .7883 (.0085) | .7151 (.0062) | .8932 (.0029) | .8238 (.0038) | .7846 (.0066) | .7208 (.0054) |
| | Backward | .8505 (.0030) | .7392 (.0032) | .6566 (.0072) | .6359 (.0093) | .8510 (.0043) | .7350 (.0054) | .6547 (.0029) | .6430 (.0097) |
| | Bathtub | .8505 (.0030) | .7393 (.0032) | .6566 (.0072) | .6361 (.0086) | .8510 (.0043) | .7350 (.0054) | .6546 (.0029) | .6430 (.0096) |
| 2nd order | Simple | .8629 (.0033) | .7590 (.0028) | .6835 (.0032) | .6568 (.0077) | .8639 (.0053) | .7540 (.0051) | .6812 (.0029) | .6630 (.0083) |
| | Forward | .9037 (.0064) | .8359 (.0040) | .7975 (.0074) | .7243 (.0084) | .9039 (.0037) | .8340 (.0056) | .7948 (.0076) | .7274 (.0072) |
| | Backward | .8633 (.0035) | .7621 (.0030) | .6835 (.0031) | .6570 (.0082) | .8641 (.0050) | .7569 (.0053) | .6808 (.0036) | .6608 (.0099) |
| | Bathtub | .8633 (.0035) | .7621 (.0031) | .6835 (.0030) | .6568 (.0083) | .8642 (.0050) | .7569 (.0053) | .6809 (.0036) | .6606 (.0100) |
| 3rd order | Simple | .9005 (.0041) | .8120 (.0023) | .7699 (.0054) | .7141 (.0100) | .8997 (.0040) | .8092 (.0053) | .7663 (.0058) | .7186 (.0064) |
| | Forward | .9060 (.0110) | .8417 (.0122) | .7962 (.0102) | .7194 (.0209) | .9052 (.0103) | .8390 (.0148) | .7937 (.0117) | .7243 (.0153) |
| | Backward | .9002 (.0070) | .8169 (.0076) | .7804 (.0064) | .7114 (.0145) | .8978 (.0074) | .8147 (.0104) | .7767 (.0067) | .7190 (.0097) |
| | Bathtub | .9003 (.0070) | .8169 (.0076) | .7804 (.0064) | .7115 (.0144) | .8978 (.0074) | .8146 (.0104) | .7767 (.0067) | .7190 (.0097) |
| 4th order | Simple | .9091 (.0056) | .8338 (.0063) | .7886 (.0079) | .7197 (.0173) | .9077 (.0064) | .8304 (.0105) | .7864 (.0081) | .7254 (.0105) |
| | Forward | .9038 (.0153) | .8367 (.0292) | .7878 (.0206) | .7136 (.0341) | .9020 (.0170) | .8354 (.0307) | .7858 (.0225) | .7169 (.0292) |
| | Backward | .9057 (.0124) | .8255 (.292) | .7932 (.0145) | .7130 (.0314) | .9027 (.0154) | .8200 (.0357) | .7913 (.0149) | .7199 (.0229) |
| | Bathtub | .9057 (.0124) | .8254 (.0293) | .7932 (.0145) | .7131 (.0312) | .9027 (.0155) | .8200 (.0358) | .7913 (.0149) | .7199 (.0229) |

Notes: Standard deviations are in parentheses.
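For readers unfamiliar with the metric in Table 4, the area under the ROC curve equals the probability that a randomly chosen converting journey receives a higher predicted score than a randomly chosen non-converting one, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy scores are illustrative assumptions, and the quadratic loop is meant for small examples rather than for data sets of the size reported here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (rank interpretation).

    labels: 1 for converting journeys, 0 for non-converting journeys.
    scores: predicted conversion scores, higher meaning more likely to convert.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: four journeys with made-up scores.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

A value of .5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the entries in Table 4.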
TABLE 5: REMOVAL EFFECT: AVERAGE STANDARD DEVIATION AS % OF AVERAGE REMOVAL EFFECT

| Order | Model | DS 1 | DS 2 | DS 3 | DS 4 |
|---|---|---|---|---|---|
| 1st order | Simple | 1.18% | .92% | .81% | 1.10% |
| | Forward | 1.75% | 1.60% | 1.40% | 3.07% |
| | Backward | 1.01% | .88% | .84% | 1.65% |
| | Bathtub | 1.28% | 1.03% | .87% | 1.28% |
| 2nd order | Simple | 1.54% | 1.40% | 1.20% | 1.57% |
| | Forward | 2.72% | 2.48% | 2.08% | 3.49% |
| | Backward | 1.71% | 1.78% | 1.55% | 3.08% |
| | Bathtub | 2.03% | 1.62% | 1.46% | 1.87% |
| 3rd order | Simple | 2.81% | 2.37% | 1.86% | 2.52% |
| | Forward | 3.59% | 3.74% | 2.95% | 4.54% |
| | Backward | 3.46% | 3.56% | 2.94% | 5.40% |
| | Bathtub | 3.55% | 3.06% | 2.44% | 3.25% |
| 4th order | Simple | 4.40% | 4.27% | 3.15% | 3.85% |
| | Forward | 4.27% | 5.25% | 3.77% | 5.67% |
| | Backward | 5.04% | 5.82% | 4.64% | 6.99% |
| | Bathtub | 5.80% | 5.46% | 4.22% | 4.93% |
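One plausible reading of the Table 5 metric is sketched below in Python: for each state, take the standard deviation of its removal effect across repeated model runs, average those standard deviations over states, and divide by the average removal effect. This reading, the use of the sample standard deviation, the function name `relative_sd_pct`, and the toy numbers are all assumptions; the exact computation is not spelled out in this section.

```python
from statistics import mean, stdev


def relative_sd_pct(removal_effects_by_state):
    """Average standard deviation as a percentage of the average removal effect.

    removal_effects_by_state maps each state to its removal effects from
    repeated runs (for example, cross-validation folds); needs >= 2 runs per state.
    """
    average_sd = mean(stdev(runs) for runs in removal_effects_by_state.values())
    average_effect = mean(mean(runs) for runs in removal_effects_by_state.values())
    return 100 * average_sd / average_effect


# Hypothetical removal effects for two states over two runs.
toy_runs = {"SEA": [0.10, 0.12], "SEO": [0.20, 0.20]}
toy_pct = relative_sd_pct(toy_runs)
```

Lower percentages indicate more stable estimates, which is how the rows of Table 5 relate to the robustness criterion.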
TABLE 6: REMOVAL EFFECTS (SECOND ORDER, SIMPLE), DATA SET 1

Rows: channel. Columns: preceded by.

| Channel | START | Affiliate | Display | Newsletter | Price Comparison | Retargeting | SEA | SEO | Undefined |
|---|---|---|---|---|---|---|---|---|---|
| Affiliate | 5.94% | 6.79% | .22% | .28% | .30% | .08% | 3.40% | 1.30% | .21% |
| Display | .74% | .03% | .55% | .02% | .01% | .00% | .11% | .02% | .01% |
| Newsletter | 1.64% | .10% | .05% | 1.26% | .03% | .03% | .35% | .16% | .01% |
| Price Comparison | 2.19% | .04% | .03% | .05% | 1.77% | .01% | .29% | .06% | .05% |
| Retargeting | .33% | .04% | .02% | .03% | .03% | .30% | .25% | .07% | .00% |
| SEA | 24.66% | 1.09% | .46% | .54% | .42% | .21% | 18.64% | 3.21% | .62% |
| SEO | 5.45% | .52% | .09% | .15% | .15% | .07% | 5.90% | 6.19% | .11% |
| Undefined | 1.11% | .03% | .03% | .03% | .02% | .01% | .36% | .08% | .65% |
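The states in Table 6 combine a channel with the channel that preceded it. A minimal Python sketch of how a click sequence might be turned into such second-order states follows; the tuple representation and the function name `to_second_order` are illustrative assumptions, not the authors' implementation.

```python
def to_second_order(channels):
    """Turn a click sequence into (preceded_by, channel) states.

    The first click has no predecessor, so it is paired with "START",
    matching the START column of Table 6.
    """
    preceded_by = ["START", *channels[:-1]]
    return list(zip(preceded_by, channels))


# Hypothetical journey: SEA, then SEO, then SEA again.
states = to_second_order(["SEA", "SEO", "SEA"])
```

With eight channels this yields up to 8 × 9 = 72 distinct states, which is the size of the grid shown in Table 6.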
TABLE 7: REMOVAL EFFECTS (FIRST ORDER, BATHTUB), DATA SET 1

Rows: channel. Columns: position.

| Channel | First | Intermediate | Last |
|---|---|---|---|
| Affiliate | 4.13% | 2.89% | 9.34% |
| Display | 2.40% | 2.03% | .04% |
| Newsletter | 1.22% | .59% | 1.22% |
| Price Comparison | 2.14% | .85% | 1.65% |
| Retargeting | .23% | .27% | .34% |
| SEA | 23.28% | 7.81% | 19.47% |
| SEO | 5.09% | 5.41% | 7.06% |
| Undefined | 1.23% | .69% | .60% |
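To make the notion of a removal effect concrete, the Python sketch below implements one common formulation for a first-order Markov graph: estimate transition probabilities from observed journeys, compute the probability of reaching the conversion state from the start state, and measure the relative drop in that probability when walks entering a given channel are treated as lost. The state names, the relative (rather than absolute) definition, the fixed number of iteration sweeps, and the toy journeys are assumptions for illustration and may differ from the exact definition and normalization behind Tables 6 and 7.

```python
from collections import defaultdict

START, CONVERSION, NULL = "START", "CONVERSION", "NULL"


def transition_probabilities(journeys):
    """Estimate first-order transition probabilities from (channels, converted) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in journeys:
        path = [START, *channels, CONVERSION if converted else NULL]
        for source, target in zip(path, path[1:]):
            counts[source][target] += 1
    return {
        source: {t: n / sum(targets.values()) for t, n in targets.items()}
        for source, targets in counts.items()
    }


def conversion_probability(probs, removed=None, sweeps=1000):
    """Probability of reaching CONVERSION from START.

    Walks that would enter `removed` are treated as lost. The fixed number of
    sweeps is an assumption that is ample for small graphs.
    """
    reach = defaultdict(float)  # states not yet evaluated (and NULL) count as 0.0
    reach[CONVERSION] = 1.0
    for _ in range(sweeps):
        for state, targets in probs.items():
            if state != removed:
                reach[state] = sum(
                    p * reach[t] for t, p in targets.items() if t != removed
                )
    return reach[START]


def removal_effects(journeys):
    """Relative drop in conversion probability when each channel is removed."""
    probs = transition_probabilities(journeys)
    base = conversion_probability(probs)
    if base == 0:
        raise ValueError("no converting journeys in the data")
    return {
        channel: 1 - conversion_probability(probs, removed=channel) / base
        for channel in probs
        if channel != START
    }


# Hypothetical toy journeys: (clicked channels, converted?).
toy_journeys = [
    (["SEA"], True),
    (["SEA", "SEO"], True),
    (["Display"], False),
    (["Display", "SEA"], False),
]
toy_effects = removal_effects(toy_journeys)
```

For position-based or higher-order variants, the same functions could be applied after relabeling each click with its position or its predecessor, so that every label becomes its own state.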
FIGURE 1: MARKOV GRAPH (FIRST ORDER, SIMPLE), DATA SET 1
FIGURE 2: ROC CURVES FOR SIMPLE MODELS (WITHIN SAMPLE)
[Panels: Data set 1, Data set 2, Data set 3, Data set 4]
FIGURE 3: COMPARISON WITH EXISTING ATTRIBUTION HEURISTICS
[Panels: Data set 1, Data set 2, Data set 3, Data set 4]
... More recent work along these lines include scalable Shapley value based methods [9], unfortunately, this approach does not use the non-converting paths within the modeling step. Markov chain based attribution models have been proposed [10,11]. While appealing, these also suffer from limitations in how big the cardinality of the dimension can be (being quadratic in the cardinality for the simplest model). ...
... We assume a wide non-negative prior on the base magnitudes β a ∼ exp (10). The sign-restriction on βs reflects the knowledge that all ads have a non-negative effect when they occur without any interactions. ...
Preprint
Full-text available
In a multi-channel marketing world, the purchase decision journey encounters many interactions (e.g., email, mobile notifications, display advertising, social media, and so on). These impressions have direct (main effects), as well as interactive influence on the final decision of the customer. To maximize conversions, a marketer needs to understand how each of these marketing efforts individually and collectively affect the customer's final decision. This insight will help her optimize the advertising budget over interacting marketing channels. This problem of interpreting the influence of various marketing channels to the customer's decision process is called marketing attribution. We propose a Bayesian model of marketing attribution that captures established modes of action of advertisements, including the direct effect of the ad, decay of the ad effect, interaction between ads, and customer heterogeneity. Our model allows us to incorporate information from customer's features and provides usable error bounds for parameters of interest, like the ad effect or the half-life of an ad. We apply our model on a real-world dataset and evaluate its performance against alternatives in simulations.
... where PCRx is the pollution contribution rate of TTEx. Herein, the removal effect (RE) is introduced on the basis of a formula that originates from Markov models; that is, when the EF of a TTE is ruled out, the total pollution assessment value used in the calculation of the PLI parameters will change to consider the remaining TTEs (Anderl et al., 2013). By calculating the removal effect coefficient of each TTE, the channel contribution value is obtained according to the proportion of the removal effect coefficient in the sum of the total coefficients. ...
... where PCRx is the pollution contribution rate of TTEx. Herein, the removal effect (RE) is introduced on the basis of a formula that originates from Markov models; that is, when the EF of a TTE is ruled out, the total pollution assessment value used in the calculation of the PLI parameters will change to consider the remaining TTEs (Anderl et al., 2013). By calculating the removal effect coefficient of each TTE, the channel contribution value is obtained according to the proportion of the removal effect coefficient in the sum of the total coefficients. ...
Article
Toxic trace elements represent an ongoing environmental problem in aquatic ecosystems. However, a lack of quantitative analysis and accurate evaluation have led to unguided control and water management strategies. Lake Yangzong is the main freshwater resource for nearly one million people in Yunnan Province in southwestern China. It has been heavily contaminated in recent years by significant anthropogenic activities including an industrial phosphor-gypsum spill, sewage effluent, and chemical remediation processes. Herein, we combine eco-environmental indices with multiple statistical analyses to determine the ecological risk and degree of contamination of 11 toxic trace elements in the upper sediments of the lakebed. Local geochemical background concentrations were determined using robust regression models developed from sediment core data. Pollution indices (EF/PLI) indicate that severe As contamination was centralized in the southwestern part of the lake. Other toxic trace elements (e.g., Cd, Cu, Pb) are slightly to moderately enriched, and progressively decrease from the northwestern to the southeastern areas of the lake. A more accurate and sensitive index (PCR) was proposed herein, suggesting that contamination was dominated by As and Pb in different lake sections. The northern section of the lake and the southwestern bay exhibited higher contaminant levels than other regions of the lake. Bio-toxic indices (ERF/PERI) indicate that As and Cd pose a high ecological risk, whereas Cu and Pb pose a low risk to biota. Statistical analyses (PCA/PMF) demonstrate that metal contaminants originated from three types of anthropogenic sources: the smelting of metal ores, the leakage of tailings effluent, and coal consumption.
... Consumers will typically search for a product across several related competitor websites and search intermediaries (Holland et al., 2016). This online search trajectory is captured with online panels, and the related nature of an individual's search behavior is recorded (Anderl, Becker, Wangenheim, & Schumann, 2014;Edelman, 2010). Commercial examples of online panels include ComScore, Alexa and GfK. ...
Article
Full-text available
This paper utilizes market-level data to explore the relative performance of individual companies amongst defined competitors. We show the potential of using consumer clickstream data, an important type of big data, to create a new set of B2B analytical frameworks. In the markets where complex interactions between competitors, search intermediaries and consumers create a network, B2B relationships can be inferred from consumer search patterns, and can then be modeled to gauge the online performance. A commercial dataset from ComScore’s US panel of one million users is used to illustrate a new approach to measure and evaluate the online performance of competitors in the US airline market. The methodology and associated performance framework demonstrate the potential for new forms of market intelligence based on the visualization of market networks, online performance calculated from matrix algorithms, the measurement of the impact of search intermediaries, and the identification of latent relationships. This research makes theoretical and empirical contributions to the debate on the use of big data for B2B market analytics. B2B managers can use this approach to extend their network horizon from an egocentric to a network view of competition and map out their competitive landscape from the perspective of the customer.
Article
Full-text available
Data generation is currently expanding at an astonishing pace, and the function of marketing is becoming increasingly sophisticated and customized. Companies seek to understand their internal corporate environment and externalities and to exponentially enhance their marketing power. This study aims to understand the influence of Big data analysis on digital marketing. The methodologies used to approach this issue were: (a) a systematic literature review based on articles dated between 2014 and 2020; and (b) a bibliometric analysis of articles dated between 2000 and 2020 using the software VOSviewer. The literature review allowed us to conclude that in the next decades, the business world in general, and marketing in particular, will define more oriented strategies based on a more profound knowledge of consumer behavior. Artificial intelligence agents driven by machine learning methods, technology, and Big data will be a conditioning factor in defining these strategies.
Chapter
During the last decades, the internet has become an increasingly important channel for businesses to sell products and communicate with customers. Web analytics helps companies to understand customer behavior and optimizes processes to satisfy the customer needs but there is still room for improvement in real-time visualization in the context of business content. In this paper, we describe a graph-based visualization showing the entirety of the website activities at a glance. To increase the tangibility of customer behavior, the graph adapts to the website interactions in real time using smooth transitions from one state to another. Furthermore, we incorporate machine learning in our data integration process to deal with the dynamics of change of website content over time. Finally, we conduct an evaluation in the form of expert interviews revealing that our approach is suitable to optimize digitalized business processes, initiate marketing campaigns, increase the tangibility to the customer, and put a stronger focus on customer needs.
Chapter
Der bedeutende Beitrag der Automobilindustrie zum Klimawandel und die damit verbundenen CO2-Emissionen haben zu einem Umdenken bei Konsumentinnen und Konsumenten, Politik und Unternehmen geführt. Dadurch wird die Elektrifizierung des Straßenverkehrs vorangetrieben. Dieser Beitrag untersucht auf Basis des aktuellen Forschungsstands und einer durchgeführten Befragung die Anforderungen und Bedürfnisse von Kundinnen und Kunden rein batteriebetriebener Fahrzeuge in Bezug auf die Customer Journey sowie deren digitale und analoge Customer Touchpoints.
Conference Paper
Full-text available
An important task of the military education system is the quality professional training of officers for the formations of the country's Armed Forces, building psychological readiness and abilities for professional and competent performance of official duties. According to the law on higher education, training is carried out in specialized higher education institutions, whose role is to prepare personnel necessary for the country's defense and national security for the various levels of government (tactical, operational and strategic). The education acquired in the military educational establishments of the Republic of Bulgaria corresponds to the requirements of the European legislation and the normative regulation of the country. It provides graduates with the necessary qualifications, including vocational education, professional knowledge, skills and experience, competence to practice skills and their application in practice, as well as competencies representing a set of interrelated knowledge, skills and attitudes necessary to perform the military profession. . The provision of interaction between the military educational establishments and the users of personnel from the system of the Armed Forces of the Republic of Bulgaria is of key importance for the connection of the military education with the practice. In this way, students will learn what they need to know and can about their future realization in the field.
Thesis
The main subject of the dissertation are the digital communication configurations of organizations in the adventure tourism sector in Bulgaria. The theoretical review analyzes current trends in digital marketing communications, types of communication models and the evolution of web technologies. A review of theoretical statements concerning network theory and configuration approach is performed. An author's definition of a digital communication configuration and its basic template are proposed. A conceptual model C3E for DCC design has been derived. Complex analyzes and researches on the market of adventure tourism in Bulgaria have been carried out, based on the author's methodology. Research includes user survey, content analysis on social media content, digital audit of the communication channels of organizations in the adventure tourism sector, in-depth interviews with experts. Profiles of the main consumer segments in the sector are presented, as well as a network model of the communications in the adventure tourism sector. Based on the conducted quantitative and qualitative research, the dissertation offers a functional model for the design of DCC, which is operationalized for the adventure tourism sector. This functional model has been tested for two business organizations in different market situations - already offering adventure travel services and one that is planning to introduce them. Network analysis is applied and by constructing proximity matrices the designed digital communication configurations are visualized.
Article
Marketers are currently focused on proper budget allocation to maximize ROI from online advertising. They use conversion attribution models to assess the impact of specific media channels (display, search engine ads, social media, etc.). Marketers use the data gathered from paid, owned, and earned media and do not take into consideration customer activities in category media, which are covered by the OPEC (owned, paid, earned, category) media model that the author of this paper proposes. The aim of this article is to provide a comprehensive review of the scientific literature on marketing attribution for the period 2010-2019 and to present the theoretical implications of not including data from category media in marketers' analyses of conversion attribution. The results of the review and the analysis provide information about the development of the subject, the popularity of particular conversion attribution models, and ideas for overcoming the obstacles that result from data being absent from analyses. A direction for further research on online consumer behavior is also presented.
Conference Paper
We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment (over half a million runs of C4.5 and a Naive-Bayes algorithm) to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
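As a rough illustration of the ten-fold stratified cross-validation this abstract recommends, the following minimal sketch assigns samples to folds so that each fold keeps approximately the same class proportions as the full data set. Everything in it is an assumption made for illustration: the function names `stratified_folds` and `cross_validate`, the toy labels, and the majority-class baseline standing in for a real learner such as C4.5 or Naive Bayes. It also skips shuffling, which a real experiment would normally include.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class out round-robin
    so that every fold has roughly the same class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

def cross_validate(labels, k, fit_and_score):
    """Average the score of fit_and_score(train_idx, test_idx) over k folds."""
    folds = stratified_folds(labels, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)

# Hypothetical toy data: 20 positive and 30 negative samples.
labels = ["pos"] * 20 + ["neg"] * 30

def majority_baseline(train_idx, test_idx):
    """Placeholder learner: predict the most frequent training class."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    hits = sum(1 for i in test_idx if labels[i] == majority)
    return hits / len(test_idx)

folds = stratified_folds(labels, 10)
accuracy = cross_validate(labels, 10, majority_baseline)
```

With these toy labels every fold contains two positive and three negative samples, so the baseline's estimated accuracy equals the share of the majority class.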
Article
Many decisions are based on beliefs concerning the likelihood of uncertain events such as the outcome of an election, the guilt of a defendant, or the future value of the dollar. Occasionally, beliefs concerning uncertain events are expressed in numerical form as odds or subjective probabilities. In general, the heuristics are quite useful, but sometimes they lead to severe and systematic errors. The subjective assessment of probability resembles the subjective assessment of physical quantities such as distance or size. These judgments are all based on data of limited validity, which are processed according to heuristic rules. However, the reliance on this rule leads to systematic errors in the estimation of distance. This chapter describes three heuristics that are employed in making judgments under uncertainty. The first is representativeness, which is usually employed when people are asked to judge the probability that an object or event belongs to a class or event. The second is the availability of instances or scenarios, which is often employed when people are asked to assess the frequency of a class or the plausibility of a particular development, and the third is adjustment from an anchor, which is usually employed in numerical prediction when a relevant value is available.
Article
The authors study brand loyalties for durable goods using automobile survey data that are peculiarly censored and track only elapsed times since transitions but not the transition times themselves. This censoring problem is typical of commercially available durable goods survey data. However, little attention has been paid to such "last-move" data in the statistics or marketing literature on the analysis of transition times. The authors propose a multistate, continuous-time, nonstationary Markov model with a parsimonious brand loyalty structure and also propose an estimation approach to recover the parameters of the proposed model using the automobile survey data. The proposed model fits observed brand choice outcomes even better than a model with a fully unrestricted (and, therefore, highly parameterized) transition structure. The authors also obtain several substantive findings. For example, Chrysler is significantly "weaker" than General Motors and Ford insofar as it has the lowest brand loyalty during the study period. The authors illustrate the managerial implications by predicting time-varying market shares of brands in periods subsequent to the period of analysis.
Article
As firms increasingly rely on online media to acquire consumers, marketing managers rely on online metrics such as click-through rate (CTR) and cost per acquisition (CPA). However, these standard online advertising metrics are plagued with attribution problems and do not account for synergy or dynamics. These issues can easily lead firms to overspend on some actions and thus waste money and/or underspend in others, leaving money on the table. We develop a multivariate time series model to investigate the dynamic interaction between paid search and display ads and calibrate the model using data from a bank that uses online ads to acquire new checking account customers. The model suggests that both search and display ads exhibit dynamics that improve their effectiveness and ROI over time. Moreover, our results suggest that display ads increase search conversion. However, display ads may also increase search clicks, thereby increasing search advertising costs. After accounting for these three effects, we estimate that each $1 invested in display and search leads to a return of $1.24 for display and $1.75 for search ads. These ROI estimates are respectively 10% and 38% higher than those obtained by standard metrics, which may have led the company to under-invest. We use these results to show how optimal budget allocation may shift after accounting for attribution and dynamics. Although display benefits from synergy attribution, the strong dynamic effects of search call for an increase in search advertising budget share by up to 36% in our context.
Article
The classical approach to market behavioral analysis rarely uses data provided by the transitional, or switching, habits of the consumer. In this article, the authors have taken types of laundry powders purchased by a housewife to define the state space of a Markov chain. Using this model, future purchase behavior is predicted, and statistical inferences on the switching habits are made.
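The idea in this abstract, treating purchased brands as states of a Markov chain and predicting future purchases from observed switching, can be sketched as follows. This is a minimal first-order example under stated assumptions, not the article's actual procedure: the brand labels "A" and "B", the purchase sequences, and the function names `estimate_transition_matrix` and `predict_distribution` are all invented for illustration.

```python
from collections import Counter, defaultdict

def estimate_transition_matrix(sequences):
    """Estimate first-order transition probabilities as relative
    frequencies of observed brand-to-brand switches."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for state, row in counts.items():
        total = sum(row.values())
        matrix[state] = {nxt: n / total for nxt, n in row.items()}
    return matrix

def predict_distribution(matrix, start, steps):
    """Propagate a purchase that starts in `start` forward by `steps`
    purchases. States with no observed outgoing switch drop out, so this
    assumes every reachable state appears as a row in the matrix."""
    dist = {start: 1.0}
    for _ in range(steps):
        next_dist = defaultdict(float)
        for state, p in dist.items():
            for nxt, q in matrix.get(state, {}).items():
                next_dist[nxt] += p * q
        dist = dict(next_dist)
    return dist

# Hypothetical purchase histories of two households (brands A and B).
purchases = [
    ["A", "A", "B", "A"],
    ["B", "B", "A", "A"],
]

P = estimate_transition_matrix(purchases)
two_ahead = predict_distribution(P, "A", 2)
```

Here `P["A"]["B"]` is the estimated probability that a buyer of brand A switches to brand B at the next purchase, and `two_ahead` gives the predicted brand shares two purchases after an initial purchase of A.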
Article
(This article originally appeared in Management Science, April 1970, Volume 16, Number 8, pp. B-466–B-485, published by The Institute of Management Sciences.) A manager tries to put together the various resources under his control into an activity that achieves his objectives. A model of his operation can assist him but probably will not unless it meets certain requirements. A model that is to be used by a manager should be simple, robust, easy to control, adaptive, as complete as possible, and easy to communicate with. By simple is meant easy to understand; by robust, hard to get absurd answers from; by easy to control, that the user knows what input data would be required to produce desired output answers; adaptive means that the model can be adjusted as new information is acquired; completeness implies that important phenomena will be included even if they require judgmental estimates of their effect; and, finally, easy to communicate with means that the manager can quickly and easily change inputs and obtain and understand the outputs. Such a model consists of a set of numerical procedures for processing data and judgments to assist managerial decision making and so will be called a decision calculus. An example from marketing is described. It is an on-line model for use by product managers on advertising budgeting questions. The model is currently in trial use by several product managers.
Article
Building models that help marketers make productive decisions and that they actually use is hard. I learned some lessons in my 30+ years of building and applying models for sizing and deploying sales forces, for estimating brand health, and for estimating the impact on revenue of marketing mix and of the attractiveness to consumers of product attributes. One is that building scientific models that improve productivity is an art. Other lessons include these: to balance model complexity versus ease of understanding and estimation; to involve managers in any subjective estimates for models they will implement; to make measures available to managers when they need them and at the level of organization they need; to use the predictive validity of a hold-out sample to persuade managers of a model's credibility; and to recognize that even for empirical models, subjective estimates about the future may be necessary. Productive marketing models may have different attributes than those published in prestigious academic journals.