Article

Beyond the Last Touch: Attribution in Online Advertising

Abstract

Online advertisers often utilize multiple publishers to deliver ads to multi-homing consumers. These ads often generate externalities and their exposure is uncertain, which impacts advertising effectiveness across publishers. We analytically examine the inefficiencies created by externalities and uncertainty when information is symmetric between advertisers and publishers, in contrast to most previous research that assumes information asymmetry. Although these inefficiencies cannot be resolved through publisher-side actions, attribution methods that measure the campaign uncertainty can serve as an alternative solution to help advertisers adjust their strategies. Attribution creates a virtual competition between publishers, resulting in a team compensation problem. The equilibrium may increase the aggressiveness of advertiser bidding, leading to increased advertiser profits. The popular last-touch method is shown to over-incentivize ad exposures, often resulting in lower advertiser profits. The Shapley value achieves an increase in profits compared to last-touch. Popular publishers and those that appear early in the conversion funnel benefit the most from advertisers using last-touch attribution. The increase in advertiser profits comes at the expense of total publisher profits and often results in decreased ad allocation efficiency. We also find that the prices paid in the market will decrease when more sophisticated attribution methods are adopted.

... Importantly, the effect of a touchpoint on customer decision is dependent on the amount of marketing communications customers are exposed to (Anderl, Becker et al., 2016;Heath, Cluley, & O'Malley, 2017). Exposure to a range of touchpoints does not always have a cumulative effect (Berman, 2018). Nevertheless, the sequence of experienced touchpoints can have synergic or antagonistic effects on customer decision-making, thereby, increasing or ruining the effect of marketing communications (Nottorf, 2014;Sinha, Mehta et al., 2015). ...
... Attribution as a concept has been recognised as a marketing effectiveness analytics tool that can outperform widely adopted tools such as marketing channel performance analytics, and marketing mix modelling, in optimising marketing budget allocation (Berman, 2018). Such potential has triggered a stream of empirical research (Table B1 in Appendix B). ...
... The application of Big Data enables the dynamic elaboration and adjustments of the standardised principles, leading to sophisticated mathematic models to be introduced (Larson & Chang, 2016). It can utilise linear or logistic regressions (Shao & Li, 2011;Wiesel et al., 2011) and incorporate analytical tools such as machine learning (Abhishek et al., 2012;Li & Kannan, 2014) and cooperative game theory to generate results (Abakus, 2013;Berman, 2018). The dependence of such methods on both data and sophisticated computational methods often leads to the interchangeable application of the terms 'algorithmic' and, sometimes, 'data-driven' attribution. ...
Article
The integration of technology in business strategy increases the complexity of marketing communications and creates a need for advanced marketing performance analytics. Rapid advancements in marketing attribution methods created gaps in the systematic description of the methods and explanation of their capabilities. This paper contrasts theoretically elaborated facilitators and the capabilities of data-driven analytics against the empirically identified classes of marketing attribution. It proposes a novel taxonomy, which serves as a tool for systematic naming and describing marketing attribution methods. The findings allow reflection on the contemporary attribution methods’ capabilities to account for the specifics of the customer journey, thereby creating a currently lacking theoretical backbone for advancing the accuracy of value attribution.
... Attribution modeling is defined as the science of using advanced analytics to allocate appropriate credit for a desired customer action to each marketing touchpoint across all online and offline channels (Kannan, Reinarz, and Verhoef 2016; Moffett, Pilecki, and McAdams 2014). The importance of understanding multiple touchpoint attribution in today's digitized business environment is underscored by attribution being identified as the number one research priority for marketing managers (MSI Research Priorities 2016-2018). ...
... For probability models that are typically nonlinear, this lift will be different than when the exposure order is reversed. If we only knew that a consumer was exposed to media 1 and 2 and did not know the order, we would have to cycle through all possible sequences (2 in this case) and assign attribution based on the Shapley (1953) value, as explained in Berman (2017) and Li and Kannan (2014). ...
... 3. Note that computing the weights as ratios of increments is similar in spirit to the Shapley Value used in attribution literature (Berman 2017;Li and Kannan 2014). The difference is that we do not have to use permutations over different orders of touchpoints because we observe the exact order of touchpoints. ...
Article
Full-text available
Media attribution is the assignment of a percentage weight to each media touchpoint a consumer is exposed to prior to purchasing. Many firms consider using attribution to allocate media budgets, particularly for digital media, but an important question is whether this is appropriate. An initial hurdle when answering this question is that, despite the surge in interest for media attribution in marketing academia and practice, attribution does not have an agreed-on formal definition. Therefore, this article proposes an attribution formulation based on the relative incremental contribution that each medium makes to a purchase, taking into account advertising carryover and interaction effects. The formulation shows that attribution is proportional to the marginal effectiveness of a medium times its number of exposures. This means that often-used media will have high attribution weights. However, the profit-maximizing allocation for a fixed budget is a function of advertising effectiveness, but not a function of past exposure levels. By offering analytical derivations and studying simulated and empirical data, the paper shows how attribution can offer misleading insights on how to allocate resources across media. Moreover, the empirical example demonstrates that substantial gains in purchase probability can be made using profit-maximizing allocation compared with attribution-based allocation.
... Thus, for luggage products and midlevel fashion, both affiliates and referrers may serve as relevant gateways for catalyzing purchase events. The explanation may route back to the question of whether these channels initialize purchase paths or finalize purchase transactions, as they often include coupons that allow affiliate websites to "free-ride" on previous contacts, at least when leveraged by more advanced online users (Berman 2016). For higher-priced fashion products, both direct type-in and generic paid search represent the most promising channel homogeneous click sequences (Model 2a, DS1: TypeIn-TypeIn b = 0.239, p < .01), ...
... Model 2c: SEAgeneric-Affiliate-Affiliate-Affiliate b = 1.481, p < .01). This supports the initial idea that affiliates may serve to finalize purchase events, in part, at the cost of potential advertiser revenues, as affiliates regularly include discounts (Berman 2016). Along these lines, affiliate contacts emerge in click sequences consisting of three or more clicks and play only a subordinate role in shorter click patterns. ...
... However, referrer appears to indicate initialized browsing sequences, while affiliate is seen when users are prone to conclude a purchase decision. This distinction may arise because affiliate websites often include coupons to catalyze purchase decisions, though at the expense of potential revenues (Berman 2016). In contrast, our results show that price comparison is less industry specific and is associated with user price sensitivity (Mehta, Rajiv, and Srinivasan 2003), such that effects for the high-priced fashion retailer appear similar to the effects for the travel company. ...
Article
Full-text available
Though research literature addresses a broad range of advertising impact models, studies on the channel preferences of online purchasers have received little attention, regarding both multichannel settings and channel interplay in click sequences. To provide advertisers a method for better evaluating customer channel preference, this study investigates the path to purchase by building on four multichannel clickstream data sets from three industries, recorded with cookie-tracking technologies. Applying a Cox model and clustering techniques supports delineation of empirical generalizations and industry-specific findings on channel exposure, including their antecedents and distinct channel click sequences. Across data sets, online users show idiosyncratic channel preferences for a limited set of one or two channels rather than multiple online vehicles. Both channel homogeneous click sequences and combinations of two channels (including branded contacts) are effective as purchase predictors. Our study also presents industry-specific results regarding the influence of click sequences on purchase intent, thereby providing insights for advertising research, particularly as are suited to optimization of online advertising activities.
... Despite the widespread and ongoing practice of many advertisers to apply comparatively simple heuristics (e.g., last click attribution), such that the value is attributed solely to the marketing channel directly preceding the conversion (The CMO Club, & Visual IQ, Inc, 2014), this challenge of attributing credit to different channels (Neslin & Shankar, 2009) has recently begun to receive increased attention in academia and practice alike (Berman, 2015). Academics have proposed a variety of substantiated analytical attribution frameworks, including logistic regression models (Shao & Li, 2011), game theory-based approaches (Berman, 2015;Dalessandro, Perlich, Stitelman, & Provost, 2012), Bayesian models (Li & Kannan, 2014), mutually exciting point process models (Xu, Duan, & Whinston, 2014), VAR models (Kireyev, Pauwels, & Gupta, 2016), and hidden Markov models (Abhishek, Fader, & Hosanagar, 2015). Furthermore, several industry players such as Adometry (Google), Convertro (AOL), or VisualIQ have introduced a range of attribution methodologies (Moffett, 2014). ...
... Third, we propose a novel variant that adds to existing advanced attribution modeling techniques (Abhishek et al., 2015; Berman, 2015; Haan, Wiesel, & Pauwels, 2016; Kireyev et al., 2016; Li & Kannan, 2014; Xu et al., 2014) by representing customer path data as first- and higher-order Markov walks. This graph-based approach, adapted from research on paid search (Archak et al., 2010), represents a useful addition to the emerging attribution literature. ...
... The need to assign credit for a sale at the individual customer level, as well as to more than one channel, has led to an increase in attribution modelling by marketers. Attribution modelling seeks to assign an individual e-commerce sale to one or more digital media channels (e.g., search, display, social and mobile) for the sale or business outcome being measured (Berman 2018;Li and Kannan 2014). Attribution modelling assigns credit based on a predetermined basis, such as 'last interaction', 'last click' and 'time decay', among others, to assess the brand's marketing/media mix (Dalessandro et al. 2012). ...
... Attribution modelling assigns credit based on a predetermined basis, such as 'last interaction', 'last click' and 'time decay', among others, to assess the brand's marketing/media mix (Dalessandro et al. 2012). Despite the advantages of attribution modelling, it is limited due to its digital-only variables, as well as omitting long-term or intermediate effects (Anderl et al. 2014; Berman 2018). ...
Chapter
Full-text available
As brand managers and marketing professionals struggle to optimize marketing spend in a post-digital world, many are using the prominent paid, owned and earned (POE) media model to plan and execute marketing communications. However, when it comes to quantifying the contribution and efficiency of POE media on business outcomes (like revenues), marketing professionals struggle to quantify the effects of owned and earned media, which often have low or no direct effect on revenues because they are typically created to have an intermediate or ‘long-term’ effect on brand attitudes. While marketers quantify paid media effects on revenues using either market response models, which are econometric models, or attribution models, measuring intermediate effects with these methods is not permitted. By creating a new method called customer journey modelling (CJM), the authors used brand attachment as a proxy for the intermediate effect. The resulting CJM is an objective approach to link POE media to demand and describe contribution and efficiency in terms of both revenue and brand attachment. This approach enables data-driven insights to evaluate and optimize advertiser’s POE marketing communications efforts more accurately than traditional methods.
... As a result, the change in the conversion probability is also known as the removal effect. In the past decade, researchers have developed a variety of models to describe consumer behavior, such as regression models (e.g., Shao & Li 2011, Breuer et al. 2011, Danaher & van Heerde 2018, Zhao et al. 2019), Markov models (Yang & Ghose 2010, Anderl et al. 2016, Berman 2018, Kakalejčík et al. 2018), Bayesian models (e.g., Li & Kannan 2014), time series models (Kireyev et al. 2016, De Haan et al. 2016), survival theory-based models (Zhang et al. 2014, Ji et al. 2016), deep learning models (Li et al. 2018, Kumar et al. 2020), and so on. The main novelty of previous work in this line comes from modeling user behavior. ...
... Due to the nature of the Shapley value, it typically provides channel-level but not path-level attribution or touchpoint-wise attribution scores. In addition, existing methods based on the Shapley value (Dalessandro et al. 2012, De Haan et al. 2016, Kireyev et al. 2016, Berman 2018, Singal et al. 2022) did not take into account the temporal distance between touchpoints in the path-to-purchase data. For example, the most recent work by Singal et al. (2022) used a discrete Markov chain model to describe the transitions in a customer's state along the customer journey through the conversion funnel, which does not incorporate the temporal distance when the customer moves from one state to another in a single transition. ...
Preprint
Marketers employ various online advertising channels to reach customers, and they are particularly interested in attribution for measuring the degree to which individual touchpoints contribute to an eventual conversion. The availability of individual customer-level path-to-purchase data and the increasing number of online marketing channels and types of touchpoints bring new challenges to this fundamental problem. We aim to tackle the attribution problem with finer granularity by conducting attribution at the path level. To this end, we develop a novel graphical point process framework to study the direct conversion effects and the full relational structure among numerous types of touchpoints simultaneously. Utilizing the temporal point process of conversion and the graphical structure, we further propose graphical attribution methods to allocate proper path-level conversion credit, called the attribution score, to individual touchpoints or corresponding channels for each customer's path to purchase. Our proposed attribution methods consider the attribution score as the removal effect, and we use the rigorous probabilistic definition to derive two types of removal effects. We examine the performance of our proposed methods in extensive simulation studies and compare their performance with commonly used attribution models. We also demonstrate the performance of the proposed methods in a real-world attribution application.
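The removal effect mentioned in the snippets above (the change in conversion probability when a touchpoint is removed from a path) can be illustrated with a minimal sketch. This is not the graphical point process framework of the preprint: the logistic conversion model, the `EFFECT` and `BASELINE` values, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-channel effects and baseline (illustrative values only).
EFFECT = {"search": 1.2, "display": 0.4, "email": 0.7}
BASELINE = -3.0


def conversion_probability(path):
    """Toy logistic model: conversion probability for a list of touchpoints."""
    score = BASELINE + sum(EFFECT[channel] for channel in path)
    return 1.0 / (1.0 + math.exp(-score))


def removal_effects(path):
    """For each touchpoint, the drop in conversion probability when it is removed."""
    full = conversion_probability(path)
    effects = []
    for i, channel in enumerate(path):
        reduced = path[:i] + path[i + 1:]  # same path without touchpoint i
        effects.append((channel, full - conversion_probability(reduced)))
    return effects


path = ["display", "search", "email"]
for channel, effect in removal_effects(path):
    print(f"{channel}: {effect:.4f}")
```

In this toy model the touchpoint with the largest assumed effect also has the largest removal effect. A model that accounts for touchpoint order or timing, as the preprint aims to do, would need a conversion function that depends on more than the set of channels.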
... The articles [11,14] briefly but clearly describe strategies for using machine learning in marketing. ...
... An important factor is the interpretability of its results. The influence of each factor is clearly expressed by the value of the coefficient b, which makes it possible to clearly determine which of them have a positive effect and to what extent they influence decision-making [11,26]. ...
Conference Paper
Full-text available
Abstract: To analyze and investigate the use of Artificial Intelligence (AI) in advertising and marketing, it is necessary to understand precisely how the process of creating advertising occurs, and what technologies are used for this. This work will consider the structure of AI algorithms implemented in advertising, as well as how they affect the increased profitability of campaign investments, better relations with clients and personalization in full-time mode, widespread marketing campaigns, the possibility of faster implementation. Keywords: Artificial Intelligence; Real Time Bidding; Supply / Sell Side platform; Demand-Side Platform; Data Management Platform; Pay Per Click; Over-The-Top; Google News Initiative; Chief Marketing Officer.
... Artificial intelligence (AI) coupled with promising machine learning (ML) techniques well known from computer science is broadly affecting many aspects of various fields including science and technology, industry, and even our day-to-day life [28]. Nowadays, instead of attributing the ad touchpoints by heuristic rules [6], data-driven methods [3,8,13,21,23,30] which estimate the attribution credits according to the historical data have become the mainstream techniques. These methods learn a conversion prediction model with all observed historical data and then generate the counterfactual ad journeys by removing or replacing some touchpoints. ...
... Data-driven multi-touch attribution. Previously, marketers have applied simple rules, e.g., the last touch, to attribute the influence of touched ads [6], which either ignore the effects of other channels or neglect the channel difference. To overcome these drawbacks, researchers proposed data-driven methods. ...
... Academic studies applied various methods for attribution modelling in advertising: game theory (Dalessandro et al., 2012), logistic regression (Shao and Li, 2011), Bayesian method, Markov chains (Abhishek et al., 2012;Anderl et al., 2016), Shapley value (Berman, 2014) and others. These studies have discovered that alternative methods produce channels contributions, which are different from GA's 'last-Interaction model'. ...
... The question of attribution of value to traffic channels can be considered in future studies on the examples of more complicated probabilistic models (Shao and Li, 2011), game theory approach (Berman, 2014), VAR models (Kireyev et al., 2016) and other. With the help of the existing data, furthermore, a more complicated Markov chains model of second and third order can be built. ...
... Nowadays, instead of attributing the ad touchpoints by heuristic rules (Berman 2018), data-driven methods (Shao and Li 2011; Dalessandro et al. 2012; Ji and Wang 2017; Ren et al. 2018; Arava et al. 2018; Yang, Dyer, and Wang ...
... Data-driven multi-touch attribution. Previously, marketers have applied simple rules, e.g., the last touch, to attribute the influence of touched ads (Berman 2018), which either ignore the effects of other channels or neglect the channel difference. To overcome these drawbacks, researchers proposed data-driven attribution methods. ...
Preprint
Full-text available
Multi-touch attribution (MTA), aiming to estimate the contribution of each advertisement touchpoint in conversion journeys, is essential for budget allocation and automated advertising. Existing methods first train a model to predict the conversion probability of the advertisement journeys with historical data and calculate the attribution of each touchpoint using counterfactual predictions. An assumption of these works is that the conversion prediction model is unbiased, i.e., it can give accurate predictions on any randomly assigned journey, including both the factual and counterfactual ones. Nevertheless, this assumption does not always hold as the exposed advertisements are recommended according to user preferences. This confounding bias of users would lead to an out-of-distribution (OOD) problem in the counterfactual prediction and cause concept drift in attribution. In this paper, we define the causal MTA task and propose CausalMTA to eliminate the influence of user preferences. It systematically eliminates the confounding bias from both static and dynamic preferences to learn the conversion prediction model using historical data. We also provide a theoretical analysis to prove CausalMTA can learn an unbiased prediction model with sufficient data. Extensive experiments on both public datasets and the impression data in an e-commerce company show that CausalMTA not only achieves better prediction performance than the state-of-the-art method but also generates meaningful attribution credits across different advertising channels.
... "Last touch" attribution, a form of rules-based attribution in which the entire purchase is attributed to the most recent touchpoint, gained early popularity not because it was correct, but because it was easy to track on the basis of the referring URL. Berman (2018) shows that last touch attribution incentivizes inefficient oversupply of ad impressions, due to competition among advertisers [3]. ...
Preprint
Full-text available
Subscription services face a difficult problem when estimating the causal impact of content launches on acquisition. Customers buy subscriptions, not individual pieces of content, and once subscribed they may consume many pieces of content in addition to the one(s) that drew them to the service. In this paper, we propose a scalable methodology to estimate the incremental acquisition impact of content launches in a subscription business model when randomized experimentation is not feasible. Our approach uses simple assumptions to transform the problem into an equivalent question: what is the expected consumption rate for new subscribers who did not join due to the content launch? We estimate this counterfactual rate using the consumption rate of new subscribers who joined just prior to launch, while making adjustments for variation related to subscriber attributes, the in-product experience, and seasonality. We then compare our counterfactual consumption to the actual rate in order to back out an acquisition estimate. Our methodology provides top-line impact estimates at the content / day / region grain. Additionally, to enable subscriber-level attribution, we present an algorithm that assigns specific individual accounts to add up to the top-line estimate. Subscriber-level attribution is derived by solving an optimization problem to minimize the number of subscribers attributed to more than one piece of content, while maximizing the average propensity to be incremental for subscribers attributed to each piece of content. Finally, in the absence of definitive ground truth, we present several validation methods which can be used to assess the plausibility of impact estimates generated by these methods.
... An alternative way to address the attribution problem is based on advanced data analytics methodologies, known as algorithmic approaches, which have attracted increasing attention over the past decade (Abhishek et al., 2012;Anderl et al., 2013;Berman, 2013;Dalessandro et al., 2012;Li and Kannan, 2013;Shao and Li, 2011;Tucker, 2012;Wiesel et al., 2011;Yadagiri et al., 2015;Zhang et al., 2014). Shao and Li (2011) proposed the "simple probabilistic model". ...
... The coalition's contribution and marginal contribution can be measured in multiple ways to gain insight from different aspects. It can be related to the number of purchases, the number of reaches as well as the total revenue (Berman, 2013). In the following sections, we discuss the definition of contribution and marginal contribution. ...
Article
Full-text available
This paper re-examines the Shapley value methods for attribution analysis in the area of online advertising. As a credit allocation solution in cooperative game theory, Shapley value method directly quantifies the contribution of online advertising inputs to the advertising key performance indicator (KPI) across multiple channels. We simplify its calculation by developing an alternative mathematical formulation. The new formula significantly improves the computational efficiency and therefore extends the scope of applicability. Based on the simplified formula, we further develop the ordered Shapley value method. The proposed method is able to take into account the order of channels visited by users. We claim that it provides a more comprehensive insight by evaluating the attribution of channels at different stages of user conversion journeys. The proposed approaches are illustrated using a real-world online advertising campaign dataset.
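As a rough illustration of the Shapley value idea referred to in the snippets above (cycling through all possible exposure orders and averaging each channel's marginal contribution), a minimal sketch follows. It implements the textbook permutation form, not the simplified or ordered formulas proposed in this paper; the channel names and the conversion probabilities in `CONVERSION` are hypothetical.

```python
from itertools import permutations


def shapley_values(channels, value):
    """Average each channel's marginal contribution over all orderings.

    `value` maps a frozenset of channels to the KPI achieved by that coalition.
    """
    totals = {channel: 0.0 for channel in channels}
    orderings = list(permutations(channels))
    for order in orderings:
        seen = frozenset()
        for channel in order:
            with_channel = seen | {channel}
            totals[channel] += value(with_channel) - value(seen)
            seen = with_channel
    return {channel: total / len(orderings) for channel, total in totals.items()}


# Hypothetical conversion probability for each set of exposures.
CONVERSION = {
    frozenset(): 0.0,
    frozenset({"search"}): 0.10,
    frozenset({"display"}): 0.04,
    frozenset({"search", "display"}): 0.20,
}

credit = shapley_values(["search", "display"], CONVERSION.__getitem__)
print(credit)  # approximately {'search': 0.13, 'display': 0.07}
```

The credits sum to the value of the full coalition (0.20 here). Enumerating permutations grows factorially with the number of channels, which is the computational cost that motivates simplified formulations such as the one this paper describes.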
... The second focus is on multichannel ad attribution strategies, which examine the interactions across multiple touchpoints (Barajas et al. 2016;Han et al. 2013;Li and Kannan 2014;Xu et al. 2014;Zantedeschi et al. 2017), the consumer purchase funnel (Anderl et al. 2016;Ghose and Todri 2016;Zhuang et al. 2021), budget and profitability (Berman 2018;Danaher and Dagger 2013;Danaher and Van Heerde 2018;Li et al. 2016), and the long-term effects of marketing media (De Haan et al. 2016). These studies cover both traditional offline channels, such as television, radio, print, and direct mail, and online channels, including email, banners, referral programs, and search ads. ...
Article
Full-text available
With infeed advertising becoming an increasingly popular advertising tool for advertisers to reach mobile consumers, the authors propose an integrated model of contemporaneous, carryover, and spillover effects to measure the incremental contributions of infeed ads in multiple types of mobile apps: newsfeed, social, and video. They empirically examine the three proposed effects of newsfeed ads, social feed ads, and video feed ads using data from a large-scale ad campaign for a new mobile game. The data set contains 10,115,801 impressions, 286,506 clicks, and 12,706 conversions. First, the findings show that social feed ads have the strongest contemporaneous effects on both ad clicks and conversions; social feed ads are 1.87 times more likely to generate a click and 1.69 times more likely to generate a conversion than newsfeed ads, followed by video feed ads (clicks: 1.73 times; conversions: 1.55 times). Second, video feed ads have the strongest carryover effect, followed by social feed ads, while newsfeed ads have a negative carryover effect. Third, the spillover effect is the strongest for newsfeed ads; prior newsfeed ad exposures are more effective than prior social or video feed ad exposures at promoting clicks and conversions upon subsequent exposure in other channels.
... As a result, MTA has emerged as a preferred approach for large enterprises that aim to align marketing and sales efforts with precise, data-driven insights. By accurately capturing the contribution of each touchpoint, MTA allows enterprises to enhance their marketing return on investment (ROI) and achieve more effective cross-channel coordination [3]. ...
Article
Multi-touch attribution (MTA) offers a sophisticated, data-driven approach to distributing credit across touchpoints within the customer journey [1] [2], thereby enabling more precise insights into channel and campaign performance. However, implementing MTA at scale for a global enterprise's sales and marketing team presents unique challenges, including the need to integrate disparate data sources, handle vast data volumes, apply advanced attribution models, and provide insights in real-time. This paper presents the design and implementation of a scalable MTA framework tailored to the needs of a global sales and marketing team, leveraging cloud-native infrastructure, big data processing, and machine learning algorithms to achieve accurate and actionable insights at scale. The proposed MTA solution addresses data heterogeneity by creating a unified data pipeline that integrates structured and unstructured data from CRM systems, web analytics, social media, paid media platforms, and offline sources. Keywords: Multi-touch attribution, sales and marketing, data integration, machine learning, cloud computing, enterprise analytics.
... Recent years have seen a seismic shift in the marketing measurement landscape: whereas advertisers previously relied on third-party cookies to track consumers online, and thus make attributions about how their ads are driving the path-to-purchase (e.g., Li and Kannan 2014; Li et al. 2016; Zantedeschi et al. 2017; Berman 2018), new privacy laws and restrictions have made doing so nearly impossible (Cui et al. 2021). This, combined with a growing understanding that such individual-level attribution models are often fundamentally flawed (e.g., Gordon et al. 2023; Tian et al. 2024), has shifted attention back to a classic marketing measurement tool: aggregate-level marketing mix models, or MMMs (Hanssens et al. 2003). This shift in industry focus is reflected in both a flurry of research being done in practice, including from companies like Google and Uber (e.g., Jin et al. 2017; Wang et al. 2017; Ng et al. 2021), and in the open source tools being released by these companies, like Google's Meridian (Google 2024) and Meta's Robyn (Meta 2024), all of which implement highly sophisticated MMMs. ...
Preprint
Full-text available
Recent years have seen a resurgence in interest in marketing mix models (MMMs), which are aggregate-level models of marketing effectiveness. Often these models incorporate nonlinear effects, and either implicitly or explicitly assume that marketing effectiveness varies over time. In this paper, we show that nonlinear and time-varying effects are often not identifiable from standard marketing mix data: while certain data patterns may be suggestive of nonlinear effects, such patterns may also emerge under simpler models that incorporate dynamics in marketing effectiveness. This lack of identification is problematic because nonlinearities and dynamics suggest fundamentally different optimal marketing allocations. We examine this identification issue through theory and simulations, wherein we explore the exact conditions under which conflation between the two types of models is likely to occur. In doing so, we introduce a flexible Bayesian nonparametric model that allows us to both flexibly simulate and estimate different data-generating processes. We show that conflating the two types of effects is especially likely in the presence of autocorrelated marketing variables, which are common in practice, especially given the widespread use of stock variables to capture long-run effects of advertising. We illustrate these ideas through numerous empirical applications to real-world marketing mix data, showing the prevalence of the conflation issue in practice. Finally, we show how marketers can avoid this conflation, by designing experiments that strategically manipulate spending in ways that pin down model form.
... One approach widely adopted by the marketing domain [25] is to weigh each ad view based on its "importance", and ensure that the sum of weights for each join key value (uid) is 1 to counteract join fanouts. The choice of weights is necessarily based on the analyst's domain knowledge: one analyst may believe that the first ad view is the most important, while another prioritizes the last one [23]. The following illustrates weights where all ad views are equally important: For each user, the weights are uniform among ad views. ...
Preprint
Analysts often struggle with analyzing data from multiple tables in a database due to their lack of knowledge on how to join and aggregate the data. To address this, data engineers pre-specify "semantic layers" which include the join conditions and "metrics" of interest with aggregation functions and expressions. However, joins can cause "aggregation consistency issues". For example, analysts may observe inflated total revenue caused by double counting from join fanouts. Existing BI tools rely on heuristics for deduplication, resulting in imprecise and challenging-to-understand outcomes. To overcome these challenges, we propose "weighing" as a core primitive to counteract join fanouts. "Weighing" has been used in various areas, such as market attribution and order management, ensuring metrics consistency (e.g., total revenue remains the same) even for many-to-many joins. The idea is to assign equal weight to each join key group (rather than each tuple) and then distribute the weights among tuples. Implementing weighing techniques necessitates user input; therefore, we recommend a human-in-the-loop framework that enables users to iteratively explore different strategies and visualize the results.
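The weighing idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `uid`/ad-view tuples, and the sample figures are hypothetical. Each user's ad views get uniform weights that sum to 1, so revenue joined to ad views is split rather than double counted.

```python
from collections import defaultdict

def uniform_weights(ad_views):
    """Give each ad view weight 1/n, where n is the number of views for its uid."""
    counts = defaultdict(int)
    for uid, _ in ad_views:
        counts[uid] += 1
    return [(uid, ad, 1.0 / counts[uid]) for uid, ad in ad_views]

def weighted_revenue_by_ad(ad_views, purchases):
    """Join purchases to ad views on uid, splitting each user's revenue by weight."""
    revenue_by_uid = defaultdict(float)
    for uid, amount in purchases:
        revenue_by_uid[uid] += amount
    totals = defaultdict(float)
    for uid, ad, weight in uniform_weights(ad_views):
        totals[ad] += weight * revenue_by_uid.get(uid, 0.0)
    return dict(totals)

# Hypothetical data: u1 saw two ads (a join fanout of 2), u2 saw one.
ad_views = [("u1", "search"), ("u1", "display"), ("u2", "display")]
purchases = [("u1", 90.0), ("u2", 40.0)]

by_ad = weighted_revenue_by_ad(ad_views, purchases)
total = sum(by_ad.values())  # stays equal to the true total revenue of 130.0
```

An unweighted join would count u1's 90.0 twice and report 220.0; with uniform weights the per-ad figures still add up to 130.0. A first-view or last-view policy would only change how the weights within each `uid` are chosen.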
... While it appears that machine learning-powered attribution models are a viable and reliable alternative to classic heuristic models, it's worth mentioning that we underestimated how intrusive and disempowering this data collection method may be for customers. These advanced and freshly created procedures are often based on path analysis, which entails identifying the user (typically through personal data) in order to duplicate path sequences properly (Berman et al., 2013). ...
Conference Paper
Full-text available
Marketers face significant obstacles as technology evolves and gets more complex. Attribution is one of the most important research problems in the field of marketing and studying it may assist media investment optimization strategies, as well as understand customer behavior across channels and platforms. This article provides a historical bibliometric analysis of Multi-Channel Attribution (MCA). A thorough study of the 156 (one hundred and fifty-six) papers aided in identifying the state-of-the-art methodologies being explored by researchers, as well as the most relevant authors, sources, and nations. Additionally, co-citation analyses were conducted in order to establish the conceptual and intellectual network.
... Given an ad click, the goal is to assign credit to the different preceding exposures of the same item to the user, e.g., previous ad exposures, emails, or other media. Multiple methods have been proposed to estimate the attribution such as attributing all to the last exposure [2], an average over all exposures, or using probabilistic models to model the click data as a function of the input exposures [24; 13]. Recent methods utilize the game-theoretic attribution score using Shapley values that summarizes the attribution over multiple simulations of input variables, with [5] or without a causal interpretation [25]. ...
Preprint
Given an unexpected change in the output metric of a large-scale system, it is important to answer why the change occurred: which inputs caused the change in metric? A key component of such an attribution question is estimating the counterfactual: the (hypothetical) change in the system metric due to a specified change in a single input. However, due to inherent stochasticity and complex interactions between parts of the system, it is difficult to model an output metric directly. We utilize the computational structure of a system to break up the modelling task into sub-parts, such that each sub-part corresponds to a more stable mechanism that can be modelled accurately over time. Using the system's structure also helps to view the metric as a computation over a structural causal model (SCM), thus providing a principled way to estimate counterfactuals. Specifically, we propose a method to estimate counterfactuals using time-series predictive models and construct an attribution score, CF-Shapley, that is consistent with desirable axioms for attributing an observed change in the output metric. Unlike past work on causal shapley values, our proposed method can attribute a single observed change in output (rather than a population-level effect) and thus provides more accurate attribution scores when evaluated on simulated datasets. As a real-world application, we analyze a query-ad matching system with the goal of attributing observed change in a metric for ad matching density. Attribution scores explain how query volume and ad demand from different query categories affect the ad matching density, leading to actionable insights and uncovering the role of external events (e.g., "Cheetah Day") in driving the matching density.
... Chan and Perry (2017) discuss challenges in MMM. In online advertising, Shapley values are now being used instead of simply attributing a sale or other customer conversion to the last click prior to that event (Berman, 2018). Shapley values are also being used in financial profit and loss attribution (Moehle et al., 2021). ...
Preprint
The most popular methods for measuring importance of the variables in a black box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple subjects. These inputs can be unlikely, physically impossible, or even logically impossible. As a result, the predictions for such cases can be based on data very unlike any the black box was trained on. We think that users cannot trust an explanation of the decision of a prediction algorithm when the explanation uses such values. Instead we advocate a method called Cohort Shapley that is grounded in economic game theory and unlike most other game theoretic methods, it uses only actually observed data to quantify variable importance. Cohort Shapley works by narrowing the cohort of subjects judged to be similar to a target subject on one or more features. A feature is important if using it to narrow the cohort makes a large difference to the cohort mean. We illustrate it on an algorithmic fairness problem where it is essential to attribute importance to protected variables that the model was not trained on. For every subject and every predictor variable, we can compute the importance of that predictor to the subject's predicted response or to their actual response. These values can be aggregated, for example over all Black subjects, and we propose a Bayesian bootstrap to quantify uncertainty in both individual and aggregate Shapley values.
... Considerations to avoid temporal confounding A particular characteristic of the current advertising systems is that they target users dynamically based on observed interactions over time [16]. This means that even in a randomized control ...
Thesis
Full-text available
Uplift modeling is a machine learning-based technique for treatment effect prediction at the individual level, which has become one of the main trends in application areas where personalization is key, such as personalized medicine, performance marketing, social sciences, etc. This thesis is intended to expand the scope of uplift modeling for experimental data by developing new theory and solutions for several open challenges in the field, inspired by the online advertising applications perspective. Firstly, we release a publicly available collection of 13.9 million samples collected from several randomized control trials, scaling up available datasets by a 210x factor. We formalize how uplift modeling can be performed with this data, along with relevant evaluation metrics. Then, we propose synthetic response surfaces and treatment assignment providing a general set-up for Conditional Average Treatment Effect (CATE) prediction and report experiments to validate key traits of the dataset. Secondly, we assume imbalanced treatment conditions and propose two new data representation-based methods inspired by cascade and multi-task learning paradigms. We then provide a series of experimental results over several large-scale real-world collections to check the benefits of the proposed approaches. We then cover the problem of direct optimization of the Area Under the Uplift Curve (AUUC), a popular metric in the field. Using the relations between uplift modeling and bipartite ranking we provide a generalization bound for the AUUC and derive an algorithm optimizing this bound, usable with linear and deep models. We empirically study the tightness of the proposed bound, its efficacy for hyperparameter tuning, and investigate the performance of the method compared to a range of baselines on two real-world uplift modeling benchmarks. Finally, we consider the problem of learning uplift models from aggregated data.
We propose a principled way to learn group-based uplift models from data aggregated according to a given set of groups that define a partition of the user space, using different unsupervised aggregation techniques, such as feature binning by value or by quantile. We proceed by introducing a bias-variance decomposition of the Precision when Estimating Heterogeneous Effect (PEHE) metric for models learned on a given grouping and show how this decomposition enables us to derive a theoretical optimal number of groups as a function of data size. Experimental results highlight the bias-variance trade-off and confirm theoretical insights concerning the optimal number of groups. In addition, we show that group-based uplift models can have comparable performance to baselines with full access to the data.
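As a rough illustration of group-based uplift estimation from aggregated data, the sketch below bins users by a feature value and keeps only per-group counts. It assumes randomized treatment and estimates each group's uplift as the difference between treated and control conversion rates. This simple difference-in-rates estimator, the function name, and the sample records are assumptions for illustration, not the method proposed in the thesis.

```python
from bisect import bisect_right
from collections import defaultdict

def group_uplift(records, bin_edges):
    """Estimate uplift per feature bin from (feature, treated, converted) records.

    Only aggregate counts per bin are kept, mimicking learning from grouped data.
    Returns {bin_index: treated_rate - control_rate} for bins that contain
    both treated and control users.
    """
    # Per bin: [n_treated, conversions_treated, n_control, conversions_control]
    agg = defaultdict(lambda: [0, 0, 0, 0])
    for feature, treated, converted in records:
        cell = agg[bisect_right(bin_edges, feature)]  # binning by value
        if treated:
            cell[0] += 1
            cell[1] += converted
        else:
            cell[2] += 1
            cell[3] += converted
    uplift = {}
    for group, (n_t, c_t, n_c, c_c) in agg.items():
        if n_t and n_c:  # skip bins missing one of the two arms
            uplift[group] = c_t / n_t - c_c / n_c
    return uplift

# Hypothetical records: (feature value, treated flag, converted flag)
records = [
    (0.1, 1, 1), (0.2, 1, 0), (0.3, 0, 0), (0.4, 0, 0),  # bin 0
    (0.6, 1, 1), (0.9, 1, 1), (0.7, 0, 1), (0.8, 0, 1),  # bin 1
]
uplift = group_uplift(records, bin_edges=[0.5])
```

Fewer, larger bins give more stable rate estimates but hide heterogeneity within a bin, which is the bias-variance trade-off the abstract refers to when discussing the number of groups.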
... Considerations to avoid temporal confounding A particular characteristic of current advertising systems is that they target users dynamically based on observed interactions over time [6]. This means that even in a randomized control trial (A/B test) interactions with the system influence subsequent ad exposure via adjustments of the bids based on user reactions. ...
Preprint
Full-text available
Individual Treatment Effect (ITE) prediction is an important area of research in machine learning which aims at explaining and estimating the causal impact of an action at the granular level. It represents a problem of growing interest in multiple sectors of application such as healthcare, online advertising or socioeconomics. To foster research on this topic we release a publicly available collection of 13.9 million samples collected from several randomized control trials, scaling up previously available datasets by a healthy 210x factor. We provide details on the data collection and perform sanity checks to validate the use of this data for causal inference tasks. First, we formalize the task of uplift modeling (UM) that can be performed with this data, along with the relevant evaluation metrics. Then, we propose synthetic response surfaces and heterogeneous treatment assignment providing a general set-up for ITE prediction. Finally, we report experiments to validate key characteristics of the dataset leveraging its size to evaluate and compare - with high statistical significance - a selection of baseline UM and ITE prediction methods.
... • Hierarchical Bayesian models (Li & Kannan, 2014); (Abhishek et al., 2012) • Game theoretical models (Berman, 2018); ...
Conference Paper
Full-text available
This study presents a data-driven attribution model applied in the context of the Higher Education customer journey (HECJ), employing a novel technique based on panel data from online and offline channels, including detailed data on social media engagement. It aims to contribute to the extant knowledge on attribution models applied to the HECJ. Through a case study at a Brazilian HEI, a total of 185,631 customer journeys were organized into two data sets corresponding to the Covid-19 pre-pandemic and pandemic periods. The data are modeled in a graph-based attribution model. The research finds that channels such as email, online chat, call center, sales, and inbound marketing drive more than 70% of conversions. Instagram, sales promotion, online advertising, and instant messages grew 38% in the pandemic period. The HECJ became longer, from 3.8 months in the pre-pandemic period to 7.8 months on average during the pandemic period. This research provides a practical guide based on the proposed model application that permits a more accurate evaluation of marketing channels in the context of the HECJ.
... For example, Jordan et al. (2011) show that when an advertiser buys impressions from multiple publishers and does not consider externalities between ads, it ends up allocating most of its budget to publishers closer to the demand. Similarly, Berman (2018) finds that when externalities exist between publishers, the advertiser's choice of attribution model constitutes a strategic choice which directly impacts its own profit as well as the publishers' profits. An empirical descriptive analysis also showed that attribution modeling has an impact on advertising prices and ultimately on consumer welfare (Tucker, 2013). ...
Preprint
This paper analyzes externalities generated by offline advertising campaigns on online ads performance. Using a fixed-effect IV regression on a panel of firms in the hotel industry, we quantify the elasticity of Google and Facebook ads to offline advertisements. Our study demonstrates a positive effect of traditional mass-media campaigns on Google search ads effectiveness. A 1% increase in traditional advertising stock raises clicks on Google ads by 4.95%. Further analyses found that this effect benefited Google through (i) a higher advertising cost related to the increase in ads performance and (ii) a higher share of advertising budget in the long run. Although we find similar positive offline effects on Facebook ads, they were not significant.
... Archak et al. (2012) focus on positive spill-over effects. Furthermore, the performance of different advertisers is analysed as well (Berman, 2015). The interactive effects of online and offline activities and their interaction (Naik and Peters, 2009;Wiesel et al., 2011) are well observed and studied. ...
... Finally, our paper also relates to a recent literature on advertising attribution, that is, the ex post allocation of advertising revenue to publishers when advertisers multi-home (Kireyev et al. 2016). Consistent with our findings, Berman (2018) shows that increasing the efficiency in the allocation of ads may increase profits for publishers, but decreases those of advertisers. Our approach is complementary to this literature, because we study the ad networks' role of improving the (ex ante) allocation of ads on multiple publishers. ...
Article
We study the role of ad networks in the online advertising market. Our baseline model considers two publishers that can outsource the sale of their ad inventories to an ad network, in a market where consumers and advertisers multi-home. The ad network increases total advertising revenue by tracking consumers across outlets and reduces competition between publishers by centralizing the sale of ads. Consequently, outsourcing to the ad network benefits the publishers, but may penalize the advertisers. We show that the ad network’s ability to track consumers may either expand or reduce the provision of ads, depending on consumers’ preferences for the publishers and how advertisers use tracking information. Specifically, tracking is more likely to expand (respectively, reduce) the provision of ads when consumers’ preferences for the publishers are positively (respectively, negatively) correlated. Tracking is also more likely to expand (respectively, reduce) the provision of ads when advertisers use tracking information to cap the frequency of impressions (respectively, target specific consumers). Furthermore, we study the implications of consumers’ choice to block tracking. Generally, blocking negatively impacts the advertising industry by making ad allocation less effective. Blocking also entails an externality on consumers, which is negative when tracking reduces the provision of ads. Given these conditions, regulatory restrictions on tracking may reduce consumer surplus as well as advertising revenue. These findings contrast with the presumption that regulation should make it easier for consumers to avoid tracking. We propose further extensions, including competing ad networks, more than two publishers, and networks that do not sell ads, but only tracking information to the advertisers. This paper was accepted by Juanjuan Zhang, marketing.
... In online advertising, conversion attribution is commonly calculated by rule-based methods, such as first-touch and last-touch, after which the return on investment (ROI) is computed from the resulting attribution, which may introduce some bias [6]. In recent years, many works based on multi-touch attribution (MTA) have been proposed for modeling the attribution for the sequential touch points over various channels [4,25]. Shao and Li [24] proposed the first data-driven multi-touch attribution model, which estimates the conversion rate based on the viewed ads of the user with a bagged logistic regression model. ...
Conference Paper
Full-text available
In online advertising, Internet users may be exposed to a sequence of different ad campaigns, i.e., display ads, search, or referrals from multiple channels, before being led to a final sales conversion and transaction. For both campaigners and publishers, it is fundamentally critical to estimate the contribution from ad campaign touch-points during the customer journey (conversion funnel) and assign the right credit to the right ad exposure accordingly. However, the existing research on the multi-touch attribution problem lacks a principled way of utilizing the users' pre-conversion actions (i.e., clicks), and quite often fails to model the sequential patterns among the touch points from a user's behavior data. To make matters worse, current industry practice merely employs a set of arbitrary rules as the attribution model, e.g., the popular last-touch model assigns 100% credit to the final touch-point regardless of actual attributions. In this paper, we propose a Dual-attention Recurrent Neural Network (DARNN) for the multi-touch attribution problem. It learns the attribution values through an attention mechanism directly from the conversion estimation objective. To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation. To quantitatively benchmark attribution models, we also propose a novel yet practical attribution evaluation scheme through the proxy of budget allocation (under the estimated attributions) over ad channels. The experimental results on two real datasets demonstrate the significant performance gains of our attribution model against the state of the art.
... Currently, this type of approach is prevalent in fields such as personalized medicine, where one tries to infer which treatment would have the biggest positive impact for a particular patient. Similar questions are starting to be asked in the field of Performance Advertising [14], where the problem of measuring the incremental effect of an advertising campaign on the user shopping propensity is very important for proper credit attribution in the context of single and multi-channel advertising [2]. In terms of associated rewards, this marks a natural evolution of the field from measuring advertising performance in terms of number of ad displays (CPM, cost per thousand ad displays), to measuring it in terms of ad clicks (CPC), to post-click sales/conversions (CPA), and possibly to incremental sales (incremental CPA, the value of a sale times the associated individual treatment effect). ...
Conference Paper
Many current applications use recommendations in order to modify the natural user behavior, such as to increase the number of sales or the time spent on a website. This results in a gap between the final recommendation objective and the classical setup where recommendation candidates are evaluated by their coherence with past user behavior, by predicting either the missing entries in the user-item matrix, or the most likely next event. To bridge this gap, we optimize a recommendation policy for the task of increasing the desired outcome versus the organic user behavior. We show this is equivalent to learning to predict recommendation outcomes under a fully random recommendation policy. To this end, we propose a new domain adaptation algorithm that learns from logged data containing outcomes from a biased recommendation policy and predicts recommendation outcomes according to random exposure. We compare our method against state-of-the-art factorization methods, in addition to new approaches of causal recommendation and show significant improvements.
... In online advertising, conversion attribution is commonly calculated by rule-based methods, such as first-touch and last-touch, after which the return on investment (ROI) is computed from the resulting attribution, which may introduce some bias [6]. In recent years, many works based on multi-touch attribution (MTA) have been proposed for modeling the attribution for the sequential touch points over various channels [4,25]. Shao and Li [24] proposed the first data-driven multi-touch attribution model, which estimates the conversion rate based on the viewed ads of the user with a bagged logistic regression model. ...
Preprint
Full-text available
In online advertising, Internet users may be exposed to a sequence of different ad campaigns, i.e., display ads, search, or referrals from multiple channels, before being led to a final sales conversion and transaction. For both campaigners and publishers, it is fundamentally critical to estimate the contribution from ad campaign touch-points during the customer journey (conversion funnel) and assign the right credit to the right ad exposure accordingly. However, the existing research on the multi-touch attribution problem lacks a principled way of utilizing the users' pre-conversion actions (i.e., clicks), and quite often fails to model the sequential patterns among the touch points from a user's behavior data. To make matters worse, current industry practice merely employs a set of arbitrary rules as the attribution model, e.g., the popular last-touch model assigns 100% credit to the final touch-point regardless of actual attributions. In this paper, we propose a Dual-attention Recurrent Neural Network (DARNN) for the multi-touch attribution problem. It learns the attribution values through an attention mechanism directly from the conversion estimation objective. To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation. To quantitatively benchmark attribution models, we also propose a novel yet practical attribution evaluation scheme through the proxy of budget allocation (under the estimated attributions) over ad channels. The experimental results on two real datasets demonstrate the significant performance gains of our attribution model against the state of the art.
... To consider this fact when analyzing acquisition source effectiveness, you should define attribution as a rule for assigning the target action to an acquisition source, or as a rule for distributing the conversion value between the sources. Last relevant click attribution [5] is usually used for CPA networks without taking free traffic sources (direct traffic, unpaid search traffic) into account, i.e., last paid click attribution. ...
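A minimal Python sketch of such a rule, crediting the most recent paid click and ignoring free traffic sources, could look as follows. The source labels in `PAID_SOURCES`, the function name, and the sample journey are hypothetical; a real CPA network would define its own list of paid sources and its own attribution window.

```python
# Hypothetical labels for paid acquisition sources; free sources such as
# "direct" or "organic_search" are deliberately left out.
PAID_SOURCES = {"cpa_network", "paid_search", "display"}

def last_paid_click(touchpoints, paid_sources=PAID_SOURCES):
    """Return the source of the latest paid click before a conversion.

    touchpoints: list of (timestamp, source) pairs for one converting user.
    Returns None when the journey contains no paid click at all.
    """
    paid = [tp for tp in touchpoints if tp[1] in paid_sources]
    if not paid:
        return None
    return max(paid, key=lambda tp: tp[0])[1]

# Hypothetical journey: the two most recent clicks come from free sources.
journey = [
    (1, "paid_search"),
    (2, "cpa_network"),
    (3, "direct"),
    (4, "organic_search"),
]
credited = last_paid_click(journey)  # "cpa_network"
```

Under a plain last-click rule the conversion would go to `organic_search`; skipping free sources moves the full credit to the last paid touchpoint.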
... For probability models that are typically nonlinear this lift will be different than when the exposure order is reversed. If we only knew that a consumer was exposed to media 1 and 2 and did not know the order, we would have to cycle through all possible sequences (2 in this case) and assign attribution based on the Shapley value, as explained in Berman (2017) and Li and Kannan (2014). ...
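The cycling over exposure orders can be sketched in Python as below. This is an illustration under assumed numbers, not code from the cited papers: `conv_prob` stands in for a fitted probability model, and every probability in the table is made up. Each medium is credited with its marginal lift in conversion probability, averaged over all possible orders; when the model does not depend on order, this average is the Shapley value.

```python
from itertools import permutations

def order_averaged_attribution(media, conv_prob):
    """Average each medium's marginal lift over every possible exposure order.

    conv_prob maps a tuple of media (in exposure order) to a conversion
    probability; the empty tuple gives the baseline without any exposure.
    """
    orders = list(permutations(media))
    credit = {m: 0.0 for m in media}
    for order in orders:
        for i, medium in enumerate(order):
            # Lift from adding this medium after the exposures preceding it.
            credit[medium] += conv_prob(order[: i + 1]) - conv_prob(order[:i])
    return {m: total / len(orders) for m, total in credit.items()}

# Made-up, order-dependent conversion probabilities for two media.
PROBS = {
    (): 0.02,
    ("m1",): 0.10,
    ("m2",): 0.06,
    ("m1", "m2"): 0.20,
    ("m2", "m1"): 0.16,
}

attribution = order_averaged_attribution(["m1", "m2"], PROBS.__getitem__)
```

With these numbers m1 receives (0.08 + 0.10) / 2 = 0.09 and m2 receives (0.10 + 0.04) / 2 = 0.07. The credits sum to 0.16, the average final conversion probability minus the baseline. The number of orders grows factorially with the number of media, so exact enumeration is only practical for short paths.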
Article
Full-text available
Media attribution is the assignment of a percentage weight to each media touchpoint a consumer is exposed to prior to purchasing. Many firms consider using attribution to allocate media budgets, particularly for digital media, but an important question is whether this is appropriate. An initial hurdle when answering this question is that, despite the surge in interest for media attribution in marketing academia and practice, attribution does not have an agreed formal definition. Therefore, this paper proposes an attribution formulation based on the relative incremental contribution that each medium makes to a purchase, taking into account advertising carryover and interaction effects. The formulation shows that attribution is proportional to the marginal effectiveness of a medium times its number of exposures. This means that often-used media will have high attribution weights. However, the profit-maximizing allocation for a fixed budget is a function of advertising effectiveness, but not a function of past exposure levels. By offering analytical derivations and studying simulated and empirical data, the paper shows how attribution can offer misleading insights on how to allocate resources across media. Moreover, the empirical example demonstrates that substantial gains in purchase probability can be made using profit-maximizing allocation compared with attribution-based allocation.
... Predicting the 30-day conversion probability is thus not enough, as different events may change the attribution probability until the conversion actually happens: the competition (or other channels such as search-based advertising) may capture the attribution by generating clicks on their side. This characteristic of last-click attribution has been shown to drive more effort on average from advertising platforms and thus to be more effective than other mechanisms in the case of cost-per-mille (CPM) payment models [2]. However, from the same study we observe that this is not obvious for performance advertising (CPC or CPA). ...
Conference Paper
Predicting click and conversion probabilities when bidding on ad exchanges is at the core of the programmatic advertising industry. Two separated lines of previous works respectively address i) the prediction of user conversion probability and ii) the attribution of these conversions to advertising events (such as clicks) after the fact. We argue that attribution modeling improves the efficiency of the bidding policy in the context of performance advertising. Firstly we explain the inefficiency of the standard bidding policy with respect to attribution. Secondly we learn and utilize an attribution model in the bidder itself and show how it modifies the average bid after a click. Finally we produce evidence of the effectiveness of the proposed method on both offline and online experiments with data spanning several weeks of real traffic from Criteo, a leader in performance advertising.
... Predicting the 30-day conversion probability is thus not enough, as different events may change the attribution probability until the conversion actually happens: the competition (or other channels such as search-based advertising) may capture the attribution by generating clicks on their side. This characteristic of last-click attribution has been shown to drive more effort on average from advertising platforms and thus to be more effective than other mechanisms in the case of cost-per-mille (CPM) payment models (Berman, 2015). However, from the same study we observe that this is not obvious for performance advertising (CPC or CPA). ...
Article
Predicting click and conversion probabilities when bidding on ad exchanges is at the core of the programmatic advertising industry. Two separated lines of previous works respectively address i) the prediction of user conversion probability and ii) the attribution of these conversions to advertising events (such as clicks) after the fact. We argue that attribution modeling improves the efficiency of the bidding policy in the context of performance advertising. Firstly we explain the inefficiency of the standard bidding policy with respect to attribution. Secondly we learn and utilize an attribution model in the bidder itself and show how it modifies the average bid after a click. Finally we produce evidence of the effectiveness of the proposed method on both offline and online experiments with data spanning several weeks of real traffic from Criteo, a leader in performance advertising.
Article
In marketing, consumer behavior is a crucial factor in the placement of products in the market and is often the subject of study and research by large companies to identify the needs of citizens and their behavior as consumers in the buying decision process. Consumer buying behavior refers to the buying behavior of final consumers—individuals and households that buy goods and services for personal consumption (Kotler et al., 1999). A company that truly understands how consumers will respond to different product features, pricing, and advertising appeals has a significant advantage over its competitors. The factors that influence consumer behavior are the key elements that companies analyze and aim to break down in order to "attract" customers. This paper will examine the factors of consumer behavior and their impact on increasing/decreasing imports in trade with several countries with which Kosovo has international trade relations. The phenomenon of ethnocentrism will also be examined, a phenomenon that has emerged in every nation in recent years and is more pronounced in the Republic of Kosovo. Finally, an empirical analysis will be presented, highlighting the relationship between imports and import prices.
Thesis
Online advertising significantly lowers the costs of targeting individuals. This thesis studies the contributions and limitations of online advertising through four empirical studies based on advertiser data. Chapter 1 shows that sponsored links (search ads) benefit greatly from offline ads. I show that when increasing its offline advertising activity by 1%, a brand generates up to 0.95% additional clicks on its sponsored links. Chapter 2 focuses on the substitutability between offline and digital ads. Using a translog model, I find that offline and digital ads are limited substitutes, whereas digital ad formats (display and search) are highly substitutable. Chapter 3 focuses on information asymmetries in the placement of online ads. I show that cost-per-impression (CPM) contracts do not provide incentives for advertisers to make ads visible compared to cost-per-view contracts. Programmatic buying, which relies on advertising intermediaries, exposes the advertiser to lower visibility and audience quality compared to direct buying. In addition, matching ads with website content results in 69% higher click-through rates than ads that only target consumers regardless of context. Finally, while ads bought from premium inventories are not clicked more often, they seem to drive low-quality ad spaces out of the standard inventories. Context effects are also discussed in Chapter 4. Using difference-in-differences and counterfactual estimations, I show that the circulation of controversial content and the degradation of Facebook’s credibility during the July 2020 boycott altered the value of ads on the platform. From June to July 2020, Facebook ads recorded 5,000 to 10,000 fewer clicks compared to the brand’s other display campaigns. Their price also dropped. This thesis concludes that online advertising is more a complement to traditional advertising than a substitute. I also advocate for a better contextualization of advertising. This appears to be essential as regulation limits the use of personal data for advertising purposes.
Preprint
Dysphonia is one of the early symptoms of Parkinson's disease (PD). Most existing methods use feature selection to find the optimal subset of voice features for all PD patients to improve prediction performance. Few have considered the heterogeneity between patients, which implies the need to provide specific prediction models for different patients. However, building a prediction model for each patient faces the challenge of small sample size, which limits its generalization ability. Instance transfer is an effective way to make up for this deficiency. Therefore, this paper proposes a patient-specific game-based transfer (PSGT) method for PD severity prediction. First, a selection mechanism is used to select PD patients with similar disease trends to the target patient from the source domain, which greatly reduces the scope of instance transfer and reduces the risk of negative transfer. Then, the contribution of the transferred subjects and their instances to the disease estimation of the target subject is fairly evaluated by the Shapley value, which improves the interpretability of the method. Next, the proportion of valid instances is determined according to the contribution of transferred subjects, and the instances with higher contribution are transferred based on this proportion to further reduce the difference between the transferred instance subset and the target subject. Finally, the selected subset of instances is added to the training set of the target subject, and the extended data is fed into a random forest to improve the performance of the PD severity prediction method. The Parkinson's telemonitoring dataset is used to evaluate the method's feasibility and effectiveness. Experimental results show that the proposed PSGT method performs better in both prediction error and stability than the compared methods.
Article
In programmatic advertising, firms outsource the bidding for ad impressions to ad platforms. Although firms are interested in targeting consumers that respond positively to advertising, ad platforms are usually rewarded for targeting consumers with high overall purchase probability. We develop a theoretical model that shows if consumers with high baseline purchase probability respond more positively to advertising, then firms and ad platforms agree on which consumers to target. If, conversely, consumers with low baseline purchase probability are the ones for which ads work best, then ad platforms target consumers that firms do not want to target—the incentives are misaligned. We conduct a large-scale randomized field experiment, targeting 208,538 individual consumers, in a display retargeting campaign. Our unique data set allows us to both causally identify advertising effectiveness and estimate the degree of incentive misalignments between the firm and ad platform. In accordance with the contracted incentives, the ad platform targets consumers that are more likely to purchase. Importantly, we find no evidence that ads are more effective for consumers with higher baseline purchase probability, rendering the ad platform’s bidding suboptimal for the firm. A welfare analysis suggests that the ad platform’s bidding optimization leads to a loss in profit for the firm and an overall decline in welfare. To remedy the incentive misalignment, we propose a solution in which the firm restricts the ad platform to target only consumers that are profitable based on individual consumer-level estimates for baseline purchase probability and ad effectiveness. This paper was accepted by Anandhi Bharadwaj, information systems.
Thesis
In this era of tremendous internet growth, social media sites have become very important to the youth. The aim of this research was to study the influence of social media marketing on consumer buying behaviour amongst the youth at the University of Nairobi. A descriptive survey method was used in the study. Researchers administered an online structured questionnaire with closed-ended questions to the four academic classes in ranking to get specific data that would enable them to determine the influence. The findings indicate that social media marketing is positively related to consumer behaviour. Since the relationship between the independent and dependent variables is statistically significant, any reduction of social media marketing has a negative effect on consumer behaviour, hence affecting the purchasing decisions of the youth in Kenya. The study further recommends that businesses targeting the youth consider the most effective social media platforms, the best timing, and the content to take up based on popularity to ensure effective social media marketing strategies.
Article
Marketers are currently focused on proper budget allocation to maximize ROI from online advertising. They use conversion attribution models assessing the impact of specific media channels (display, search engine ads, social media, etc.). Marketers use the data gathered from paid, owned, and earned media and do not take into consideration customer activities in category media, which are covered by the OPEC (owned, paid, earned, category) media model that the author of this paper proposes. The aim of this article is to provide a comprehensive review of the scientific literature related to the topic of marketing attribution for the period of 2010-2019 and to present the theoretical implications of not including the data from category media in marketers' analyses of conversion attribution. The results of the review and the analysis provide information about the development of the subject, the popularity of particular conversion attribution models, and ideas for overcoming obstacles that result from data being absent from analyses. A direction for further research on online consumer behavior is also presented.
Article
The authors clarify the effect of the price image that retailers refer to as “good cospa” (“cospa” abbreviates “cost performance,” meaning “value for money”) on consumer reviews and purchases. In the empirical analysis, consumer-generated content is created experimentally to capture differences in word-of-mouth (WOM) behavior under different expressions of a retailer's price image. The authors also capture differences in buying behavior at the target stores before and after the WOM behavior. Participants exposed to posts related to “good cospa” (1) responded more frequently than those exposed to posts related to “cheap,” (2) had more contact with the topic of “price” through “quality,” (3) had more contact with the topic of “fun” through “quality,” and (4) increased their purchase amount and unit purchase price at the target stores. The results show the effectiveness of the keyword “good cospa” in stimulating WOM and purchasing related to retail stores.
Chapter
In many digital companies, marketing has evolved from a creative process into “engineering marketing,” in which quantitative analytics are used to optimize the marketing-mix “machine” in the short term. This development, rooted above all in the strong measurability of marketing effectiveness and efficiency, offers numerous new opportunities but should also be examined critically. This chapter presents various drivers and characteristics of “engineering marketing” and derives from them the opportunities and risks of a more quantitatively oriented marketing approach. Advantages such as increased transparency or a better choice of marketing instruments are set against various disadvantages, such as the tendency toward short-term optimization instead of long-term brand building.
Conference Paper
One of the central challenges in online advertising is attribution, namely, assessing the contribution of individual advertiser actions including emails, display ads and search ads to eventual conversion. Several heuristics are used for attribution in practice; however, there is no formal justification for them and many of these fail even in simple canonical settings. The main contribution in this work is to develop an axiomatic framework for attribution in online advertising. In particular, we consider a Markovian model for the user journey through the conversion funnel, in which ad actions may have disparate impacts at different stages. We propose a novel attribution metric, that we refer to as counterfactual adjusted Shapley value, which inherits the desirable properties of the traditional Shapley value. Furthermore, we establish that this metric coincides with an adjusted “unique-uniform” attribution scheme. This scheme is efficiently computable and implementable and can be interpreted as a correction to the commonly used uniform attribution scheme.
Article
Attribution modeling (AM) has a crucial role in measuring the impact of advertising inputs in driving actions (clicks, conversions, purchases, homepage visits, etc.). A misattributing attribution model, such as last touch, allows publishers to ride freely on others’ efforts. This, in turn, powers futile optimizations with no realized performance lift. Shapley value and logistic regression stand out as reliable attribution models with a reputation across industry verticals. AM using coalitional game theory (Shapley values) can fairly distribute both gains and costs to inputs that work together but contribute unequally. AM using Shapley values, however, faces a scalability challenge for most practical applications. Meanwhile, existing scalable AM methods not only lack interpretability but also blur the contrast between efficiency and contribution. This study demonstrates a scalable way to approximate Shapley values, mainly with successive orders of probabilistic models, which also provide additional insights into the efficiency and contribution of interacting advertising inputs. © 2018, World Advertising Research Center. All rights reserved.
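To make the contrast between last-touch and Shapley-value attribution concrete, here is a minimal sketch of the exact Shapley computation for two channels. The channel names and conversion rates are hypothetical illustration values, not data from any of the studies listed here, and the exact enumeration shown is the textbook definition rather than the scalable approximation the abstract describes.

```python
from itertools import combinations
from math import factorial


def shapley_values(channels, value):
    """Exact Shapley value of each channel for a coalition value function."""
    n = len(channels)
    result = {}
    for ch in channels:
        others = [c for c in channels if c != ch]
        total = 0.0
        for k in range(n):  # size of the coalition that ch joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of ch when added to coalition s.
                total += weight * (value(s | {ch}) - value(s))
        result[ch] = total
    return result


# Hypothetical conversion rates by the set of channels a user was exposed to.
CONVERSION_RATE = {
    frozenset(): 0.00,
    frozenset({"display"}): 0.02,
    frozenset({"search"}): 0.04,
    frozenset({"display", "search"}): 0.10,
}


def value(coalition):
    return CONVERSION_RATE[frozenset(coalition)]


credit = shapley_values(["display", "search"], value)

# Last touch on a display -> search path gives all credit to the final channel.
last_touch = {"display": 0.0, "search": value({"display", "search"})}

print(credit)      # Shapley splits the 0.10 between both channels
print(last_touch)  # last touch assigns the full 0.10 to search
```

In this toy setting the Shapley credits sum to the joint conversion rate, while last touch ignores the display channel entirely. Exact enumeration grows exponentially with the number of channels, which is the scalability issue the abstract refers to.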
Article
I consider the incentives of special interest websites to participate in behavioral advertising intermediaries. Participation in the intermediary reveals valuable audience data and allows the intermediary to use those data to target the site’s audience on general interest websites—thus expanding the supply of impressions and decreasing average revenue per impression. I explore monopoly and duopoly settings and highlight the trade-off between sharing audience data and displaying higher-value ads, as well as the strategic interaction between sites serving the same advertising market. The model generates empirical predictions about the choice of intermediary technologies within advertising markets. I also find that higher concentration among special interest websites benefits consumer privacy.
Chapter
Big Data applications abound in all disciplines. In this chapter we consider the practical applications of Big Data analytics in Marketing. Businesses spend significant human and financial resources in marketing their products and services to potential customers. In this chapter we look at how businesses use the data gathered from multiple sources and use that to promote their products and services to customers who are more likely to benefit from them. Typically, marketing involves communicating with the potential customer through multiple advertising and promotion mediums. Much of the information sent to the potential customer is either in print or electronic form. Businesses who use appropriate target marketing feel that a person within that market would eventually turn out to be a customer. With this in mind, the approaches businesses take are geared towards sending the right information to the right person. To gain this type of knowledge, businesses use extensive data from multiple sources. With the advancements in computing power, affordable resources and social media, businesses are in a better position to target their materials at the potential customer. Even though the cost of information dissemination is very small, if the information is sent to the wrong person then that person is not only going to discard the information but may resent being bombarded with unwanted information. In this chapter we show the various techniques real businesses use to target the right customer and send the information that will be used. In this effort Big Data techniques are helpful. We point out how the data was used in marketing and its success.
Article
Firms use different attribution strategies such as last-click or first-click attribution to assign conversion credits to search keywords that appear in their consumers’ paths to purchase. These attributed credits impact a firm’s future bidding and budget allocations among keywords and, in turn, determine the overall return-on-investment of search campaigns. In this paper, we model the relationship among the advertiser’s bidding decision for keywords, the search engine’s ranking decision for these keywords, and the consumer’s click-through rate and conversion rate on each keyword, and analyze the impact of the attribution strategy on the overall return-on-investment of paid search advertising. We estimate our simultaneous equations model using six-month panel data on several hundred keywords from an online jewelry retailer. The data comprise a quasi-experiment, as the firm changed attribution strategy from last-click to first-click attribution halfway through the data window. Our results show that returns for keyword investments vary significantly under the different attribution strategies. For the focal firm, first-click attribution leads to lower revenue returns and a more pronounced decrease for more specific keywords. Our policy simulation exercise shows how the firm can increase its overall returns by better attributing the real contribution of keywords. We discuss how an appropriate attribution strategy can help firms to better target customers and lower acquisition costs in the context of paid search advertising.
Article
Online Display Advertising's importance as a marketing channel is partially due to its ability to attribute conversions to campaigns. Current industry practice to measure ad effectiveness is to run randomized experiments using placebo ads, assuming external validity for future exposures. We identify two different effects: a strategic effect of the campaign presence in marketplaces, and a selection effect due to user targeting, which are confounded in current practices. We propose two novel randomized designs to: 1) estimate the overall campaign attribution without placebo ads, 2) disaggregate the campaign presence and the ad effects. Using the Potential Outcomes Causal Model, we address the selection effect by estimating the probability of selecting influenceable users. We show the ex-ante value of continuing evaluation to enhance the user selection for ad exposure mid-flight. We analyze two performance-based (CPA) and one Cost-Per-Impression (CPM) campaigns with 20M+ users each. We estimate a negative CPM campaign presence effect due to cross product spillovers. Experimental evidence suggests that CPA campaigns incentivize the selection of converting users regardless of the ad, up to 96% more than CPM campaigns, thus challenging the standard practice of targeting most likely converting users. Code: https://github.com/joelbz/DispAdvAttr-in-Mrkt-ExpDgn-Est External link: https://users.soe.ucsc.edu/~jbarajas/publications/paper_MarketingScience.pdf
Article
Technology enables a firm to produce a granular record of every touchpoint consumers make in their online purchase journey before they convert at the firm's website. However, firms still depend on aggregate measures to guide their marketing investments in multiple online channels (e.g., display, paid search, referral, e-mail). This article introduces a methodology to attribute the incremental value of each marketing channel in an online environment using individual-level data of customers' touches. The authors propose a measurement model to analyze customers' (1) consideration of online channels, (2) visits through these channels over time, and (3) subsequent purchases at the website to estimate the carryover and spillover effects of prior touches at both the visit and purchase stages. The authors use the estimated carryover and spillover effects to attribute the conversion credit to different channels and find that these channels' relative contributions are significantly different from those found by other currently used metrics. A field study validates the proposed model's ability to estimate the incremental impact of a channel on conversions. In targeting customers with different patterns of touches in their purchase funnel, these estimates help identify cases in which retargeting strategies may actually decrease conversion probabilities.
Article
Advertisers employ various channels to reach consumers over the Internet but often do not know to what degree each channel actually contributes to their marketing success. This attribution challenge is of great managerial interest, yet approaches developed in marketing academia have not found wide application in practice. To increase practical acceptance, the authors introduce a graph-based framework to analyze multichannel online customer path data as first- and higher-order Markov walks. According to a comprehensive set of criteria for attribution models, embracing both scientific rigor and practical applicability, four model variations are evaluated on four large, real-world data sets from different industries. Results indicate substantial differences from existing heuristics such as “last click wins” and demonstrate that insights into channel effectiveness cannot be generalized from single data sets. The proposed framework offers support to practitioners by facilitating objective budget allocation and improving team decisions, and allows for future applications such as real-time bidding.
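The first-order Markov idea can be sketched as follows, assuming the commonly used "removal effect" measure: a channel's credit is the relative drop in conversion probability when that channel is removed from the graph. The transition probabilities, state names, and the simple fixed-point solver are hypothetical illustration choices, not the authors' implementation or data.

```python
# Hypothetical first-order transition probabilities between states.
# "conversion" and "null" are absorbing states.
TRANSITIONS = {
    "start":   {"display": 0.6, "search": 0.4},
    "display": {"search": 0.5, "conversion": 0.1, "null": 0.4},
    "search":  {"conversion": 0.3, "null": 0.7},
}


def conversion_probability(transitions, removed=None, sweeps=200):
    """Probability of reaching 'conversion' from 'start'.

    Uses repeated in-place sweeps; a removed channel behaves like 'null'.
    """
    p = {state: 0.0 for state in transitions}
    p["conversion"] = 1.0
    p["null"] = 0.0
    for _ in range(sweeps):
        for state, row in transitions.items():
            if state == removed:
                continue  # a removed channel never leads to a conversion
            p[state] = sum(
                prob * p[nxt] for nxt, prob in row.items() if nxt != removed
            )
    return p["start"]


base = conversion_probability(TRANSITIONS)
channels = ["display", "search"]

# Removal effect: share of conversions lost when a channel disappears.
effects = {
    ch: 1.0 - conversion_probability(TRANSITIONS, removed=ch) / base
    for ch in channels
}

# Normalise removal effects so the credits sum to one.
total_effect = sum(effects.values())
credit = {ch: effects[ch] / total_effect for ch in channels}

print(base, effects, credit)
```

Unlike last-click attribution, this sketch gives the display channel credit for the paths in which it hands users over to search. Higher-order variants condition each transition on more than one previous state.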
Article
In many online advertising campaigns, multiple vendors, publishers or search engines (herein called channels) are contracted to serve advertisements to internet users on behalf of a client seeking specific types of conversion. In such campaigns, individual users are often served advertisements by more than one channel. The process of assigning conversion credit to the various channels is called "attribution," and is a subject of intense interest in the industry. This paper presents a causally motivated methodology for conversion attribution in online advertising campaigns. We discuss the need for the standardization of attribution measurement and offer three guiding principles to contribute to this standardization. Stemming from these principles, we position attribution as a causal estimation problem and then propose two approximation methods as alternatives for when the full causal estimation cannot be done. These approximate methods derive from our causal approach and incorporate prior attribution work in cooperative game theory. We argue that in cases where causal assumptions are violated, these approximate methods can be interpreted as variable importance measures. Finally, we show examples of attribution measurement on several online advertising campaign data sets.
Article
Firms' incentives to manufacture biased user reviews impede review usefulness. We examine the differences in reviews for a given hotel between two sites: Expedia.com (only a customer can post a review) and TripAdvisor.com (anyone can post). We argue that the net gains from promotional reviewing are highest for independent hotels with single-unit owners and lowest for branded chain hotels with multiunit owners. We demonstrate that the hotel neighbors of hotels with a high incentive to fake have more negative reviews on TripAdvisor relative to Expedia; hotels with a high incentive to fake have more positive reviews on TripAdvisor relative to Expedia.
Article
We consider the problem of online keyword advertising auctions among multiple bidders with limited budgets, and propose a bidding heuristic to optimize the utility for bidders by equalizing the return-on-investment for each bidder across all keywords. We show that natural auction mechanisms combined with this heuristic can experience chaotic cycling (as is the case with many current advertisement auction systems), and therefore propose a modified class of mechanisms with small random perturbations. This perturbation is reminiscent of the small time-dependent perturbations employed in the dynamical systems literature to convert many types of chaos into attracting motions. We show that our perturbed mechanism provably converges in the case of first-price auctions and experimentally converges in the case of second-price auctions. Moreover, we show that our bidder-optimal system does not decrease the revenue of the auctioneer in the sense that it converges to the unique market equilibrium in the case of first-price auctions. In the case of second-price auctions, we conjecture that it converges to the non-unique “supply-aware” market equilibrium. We also observe that our perturbed auction scheme is useful in a broader context: in general, it can allow bidders to “share” a particular item, leading to stable allocations and pricing for the bidders, and improved revenue for the auctioneer.
Article
Core-selecting auctions were proposed as alternatives to the Vickrey–Clarke–Groves (VCG) mechanism for environments with complementarities. In this paper, we consider a simple incomplete-information model that allows correlations among bidders’ values. We perform a full equilibrium analysis of three core-selecting auction formats as applied to the “local-local-global” model. We show that seller revenues and efficiency from core-selecting auctions can improve as correlations among bidders’ values increase, producing outcomes that are closer to the true core than are the VCG outcomes. Thus, there may be a theoretical justification for policymakers to utilize core-selecting auctions rather than the VCG mechanism in certain environments.
Article
Motivated by recent auctions of licenses for the radio frequency spectrum, we consider situations where multiple objects are auctioned simultaneously by means of a second-price, sealed-bid auction. For some buyers, called global bidders, the value of multiple objects exceeds the sum of the objects' values separately. Others, called local bidders, are interested in only one object. In a simple independent private values setting, we (a) characterize an equilibrium that is symmetric among the global bidders; (b) show that the addition of bidders often leads to less aggressive bidding; and (c) compare the revenues obtained from the simultaneous auction to those from its sequential counterpart.
Chapter
Composed in honour of the sixty-fifth birthday of Lloyd Shapley, this volume makes accessible the large body of work that has grown out of Shapley's seminal 1953 paper. Each of the twenty essays concerns some aspect of the Shapley value. Three of the chapters are reprints of the 'ancestral' papers: Chapter 2 is Shapley's original 1953 paper defining the value; Chapter 3 is the 1954 paper by Shapley and Shubik applying the value to voting models; and chapter 19 is Shapley's 1969 paper defining a value for games without transferable utility. The other seventeen chapters were contributed especially for this volume. The first chapter introduces the subject and the other essays in the volume, and contains a brief account of a few of Shapley's other major contributions to game theory. The other chapters cover the reformulations, interpretations and generalizations that have been inspired by the Shapley value, and its applications to the study of coalition formulation, to the organization of large markets, to problems of cost allocation, and to the study of games in which utility is not transferable.
Article
We study the role of ad networks in the online advertising market. Our baseline model considers two publishers that can outsource the sale of their ad inventories to an ad network, in a market where consumers and advertisers multi-home. The ad network increases total advertising revenue by tracking consumers across outlets and reduces competition between publishers by centralizing the sale of ads. Consequently, outsourcing to the ad network benefits the publishers, but may penalize the advertisers. We show that the ad network’s ability to track consumers may either expand or reduce the provision of ads, depending on consumers’ preferences for the publishers and how advertisers use tracking information. Specifically, tracking is more likely to expand (respectively, reduce) the provision of ads when consumers’ preferences for the publishers are positively (respectively, negatively) correlated. Tracking is also more likely to expand (respectively, reduce) the provision of ads when advertisers use tracking information to cap the frequency of impressions (respectively, target specific consumers). Furthermore, we study the implications of consumers’ choice to block tracking. Generally, blocking negatively impacts the advertising industry by making ad allocation less effective. Blocking also entails an externality on consumers, which is negative when tracking reduces the provision of ads. Given these conditions, regulatory restrictions on tracking may reduce consumer surplus as well as advertising revenue. These findings contrast with the presumption that regulation should make it easier for consumers to avoid tracking. We propose further extensions, including competing ad networks, more than two publishers, and networks that do not sell ads, but only tracking information to the advertisers. This paper was accepted by Juanjuan Zhang, marketing.
Article
Advertisers seek to maximize profits by investing in advertising. We propose a “cost-per-incremental-action” (CPIA) pricing model which incorporates the causal contribution of advertising in order to achieve the advertisers' objectives such as profit maximization. CPIA pricing aligns marketplace incentives among all participants to help advertisers achieve their objectives via ad effectiveness and, by doing so, eliminates the adverse behaviors resulting from the misaligned incentives of commonly used pricing models. CPIA pricing can be implemented by adapting cost-per-action (CPA) bidding by either the ad platform or advertiser's bidding agent. We discuss CPIA pricing in the context of several examples, including recent empirical studies measuring the causal effects of advertising within the context of existing pricing models.
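The difference between paying for actions and paying for incremental actions can be illustrated with a small sketch. It assumes simple expected-value bidding; the function names, consumer segments, probabilities, and value per action are hypothetical and are not taken from the paper.

```python
def cpa_bid(value_per_action, p_convert_with_ad):
    """Expected-value bid when every attributed action is paid for."""
    return value_per_action * p_convert_with_ad


def cpia_bid(value_per_action, p_convert_with_ad, p_convert_without_ad):
    """Expected-value bid when only the causal lift of the ad is paid for."""
    lift = max(p_convert_with_ad - p_convert_without_ad, 0.0)
    return value_per_action * lift


# Hypothetical segments: (P(convert | ad), P(convert | no ad)).
consumers = {
    "loyal": (0.11, 0.10),  # likely to buy anyway, small lift
    "new": (0.04, 0.01),    # unlikely to buy, larger lift
}
VALUE = 50.0  # assumed value of one conversion to the advertiser

cpa = {name: cpa_bid(VALUE, w) for name, (w, _) in consumers.items()}
cpia = {name: cpia_bid(VALUE, w, wo) for name, (w, wo) in consumers.items()}

print(cpa)   # CPA bids favour the "loyal" segment
print(cpia)  # CPIA bids favour the "new" segment
```

Under these assumed numbers, CPA-style bidding ranks the consumer who would have bought anyway highest, whereas bidding on incremental actions reverses the ranking.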
Article
Advertisers employ various channels to reach customers over the Internet, who often get in touch with multiple channels along their “customer journey.” However, evaluating the degree to which each channel contributes to marketing success and the ways in which channels influence one another remains challenging. Although advanced attribution models have been introduced in academia and practice alike, generalizable insights on channel effectiveness in multichannel settings, and on the interplay of channels, are still lacking. In response, the authors introduce a novel attribution framework reflecting the sequential nature of customer paths as first- and higher-order Markov walks. Applying this framework to four large customer-level data sets from various industries, each entailing at least seven distinct online channels, allows for deriving empirical generalizations and industry-related insights. The results show substantial differences from currently applied heuristics such as last click attribution, confirming and refining previous research on singular data sets. Moreover, the authors identify idiosyncratic channel preferences (carryover) and interaction effects both within and across channel categories (spillover). In this way, the study can help advertisers develop integrated online marketing strategies.
Article
Advances in data collection have made it increasingly easy to collect information on advertising exposures. However, translating this seemingly rich data into measures of advertising response has proven difficult, largely because of concerns that advertisers target customers with a higher propensity to buy or increase advertising during periods of peak demand. We show how this problem can be addressed by studying a setting where a firm randomly held out customers from each campaign, creating a sequence of randomized field experiments that mitigates (many) potential endogeneity problems. Exploratory analysis of individual holdout experiments shows positive effects for both email and catalog; however, the estimated effect for any individual campaign is imprecise, because of the small size of the holdout. To pool data across campaigns, we develop a hierarchical Bayesian model for advertising response that allows us to account for individual differences in purchase propensity and marketing response. Building on the traditional ad-stock framework, we are able to estimate separate decay rates for each advertising medium, allowing us to predict channel-specific short- and long-term effects of advertising and use these predictions to inform marketing strategy. We find that catalogs have substantially longer-lasting impact on customer purchase than emails. We show how the model can be used to score and target individual customers based on their advertising responsiveness, and we find that targeting the most responsive customers increases the predicted returns on advertising by approximately 70% versus traditional recency, frequency, and monetary value–based targeting. This paper was accepted by Pradeep Chintagunta, marketing.
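The ad-stock idea with medium-specific decay can be sketched with the standard geometric form, in which each period's stock is the new exposure plus a decayed share of the previous stock. The decay rates below are hypothetical and chosen only to mirror the qualitative finding that catalogs last longer than emails; the paper's hierarchical Bayesian estimation is not reproduced here.

```python
def adstock(exposures, decay):
    """Geometric ad-stock: stock_t = exposure_t + decay * stock_{t-1}."""
    stock, carried = [], 0.0
    for x in exposures:
        carried = x + decay * carried
        stock.append(carried)
    return stock


# One exposure in the first period, none afterwards.
exposures = [1, 0, 0, 0]

# Hypothetical decay rates per medium (assumptions, not estimates).
email_stock = adstock(exposures, decay=0.3)
catalog_stock = adstock(exposures, decay=0.8)

print(email_stock)    # fades quickly
print(catalog_stock)  # retains more of the initial exposure
```

With a single exposure, the remaining stock after t periods is decay to the power t, so a higher decay parameter corresponds to a longer-lasting effect.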
Article
The multibillion-dollar online advertising industry continues to debate whether to use the cost per click (CPC) or cost per action (CPA) pricing model as an industry standard. This paper applies the economic framework of incentive contracts to study how these pricing models can lead to risk sharing between the publisher and the advertiser and incentivize them to make efforts that improve the performance of online ads. We find that, compared with the CPC model, the CPA model can better incentivize the publisher to make efforts that can improve the purchase rate. However, the CPA model can cause an adverse selection problem: the winning advertiser tends to have a lower profit margin under the CPA model than under the CPC model. We identify the conditions under which the CPA model leads to higher publisher (or advertiser) payoffs than the CPC model. Whether publishers (or advertisers) prefer the CPA model over the CPC model depends on the advertisers’ risk aversion, uncertainty in the product market, and the presence of advertisers with low immediate sales ratios. Our findings indicate a conflict of interest between publishers and advertisers in their preferences for these two pricing models. We further consider which pricing model offers greater social welfare. This paper was accepted by J. Miguel Villas-Boas, marketing.
Article
Twenty-five large field experiments with major U.S. retailers and brokerages, most reaching millions of customers and collectively representing $2.8 million in digital advertising expenditure, reveal that measuring the returns to advertising is difficult. The median confidence interval on return on investment is over 100 percentage points wide. Detailed sales data show that relative to the per capita cost of the advertising, individual-level sales are very volatile; a coefficient of variation of 10 is common. Hence, informative advertising experiments can easily require more than 10 million person-weeks, making experiments costly and potentially infeasible for many firms. Despite these unfavorable economics, randomized control trials represent progress by injecting new, unbiased information into the market. The inference challenges revealed in the field experiments also show that selection bias, due to the targeted nature of advertising, is a crippling concern for widely employed observational methods. JEL Codes: L10, M37, C93.
Article
As firms increasingly rely on online media to acquire consumers, marketing managers rely on online metrics such as click-through rate (CTR) and cost per acquisition (CPA). However, these standard online advertising metrics are plagued with attribution problems and do not account for synergy or dynamics. These issues can easily lead firms to overspend on some actions and thus waste money and/or underspend in others, leaving money on the table. We develop a multivariate time series model to investigate the dynamic interaction between paid search and display ads and calibrate the model using data from a bank that uses online ads to acquire new checking account customers. The model suggests that both search and display ads exhibit dynamics that improve their effectiveness and ROI over time. Moreover, our results suggest that display ads increase search conversion. However, display ads may also increase search clicks, thereby increasing search advertising costs. After accounting for these three effects, we estimate that each $1 invested in display and search leads to a return of $1.24 for display and $1.75 for search ads. These ROI estimates are respectively 10% and 38% higher than those obtained by standard metrics, which may have led the company to under-invest. We use these results to show how optimal budget allocation may shift after accounting for attribution and dynamics. Although display benefits from synergy attribution, the strong dynamic effects of search call for an increase in search advertising budget share by up to 36% in our context.
Article
Internet advertising has been the fastest growing advertising channel in recent years, with paid search ads comprising the bulk of this revenue. We present results from a series of large-scale field experiments done at eBay that were designed to measure the causal effectiveness of paid search ads. Because search clicks and purchase intent are correlated, we show that returns from paid search are a fraction of non-experimental estimates. As an extreme case, we show that brand keyword ads have no measurable short-term benefits. For non-brand keywords, we find that new and infrequent users are positively influenced by ads but that more frequent users whose purchasing behavior is not influenced by ads account for most of the advertising expenses, resulting in average returns that are negative.
Article
Consumers are exposed to advertisers across a number of channels. As a result, a conversion or a sale may be the result of a series of ads that were displayed to the consumer. This raises the key question of attribution: which ads get credit for a conversion and how much credit does each of these ads get? This is one of the most important issues facing the advertising industry. Although the issue is well documented, current solutions are often simplistic. Current practices apply simplistic methods, such as attributing the sale to the most recent ad exposure, that penalize prior exposures and give undue credit to ad exposures further down in the conversion funnel. In this paper, we address the problem of attribution using a unique data-set from the online campaign of a car launch. We present a Hidden Markov Model of an individual consumer's behavior based on the concept of a conversion funnel that captures the consumer's deliberation process. We observe that different ad formats, e.g. display and search ads, affect the consumers differently and in different states of their decision process. Display ads usually have an early impact on the consumer, moving him from a state of dormancy to a state where he is aware of the product and it might enter his consideration set. However, when the consumer actively interacts with these ads (e.g. by clicking on them), his likelihood to convert considerably increases. Secondly, we present an attribution scheme based on the proposed model that assigns credit to an ad based on the incremental impact it has on the consumer's probability to convert.
Article
Firms can now serve personalized recommendations to consumers who return to their website, based on their earlier browsing history on that website. At the same time, online advertising has greatly advanced in its use of external browsing data across the web to target internet ads. 'Dynamic Retargeting' integrates these two advances, by using information from earlier browsing on the firm's website to improve internet advertising content on external websites. When surfing the wider web, consumers who previously visited the firm's website are shown ads that contain images of products they have looked at before on the firm's own website. To examine whether this is more effective than simply showing generic brand ads, we use data from a field experiment conducted by an online travel firm. We find, surprisingly, that dynamic retargeted ads are on average less effective than their generic equivalent. However, when consumers exhibit browsing behavior such as visiting review websites that suggests their product preferences have evolved, dynamic retargeted ads no longer underperform.
Chapter
Composed in honour of the sixty-fifth birthday of Lloyd Shapley, this volume makes accessible the large body of work that has grown out of Shapley's seminal 1953 paper. Each of the twenty essays concerns some aspect of the Shapley value. Three of the chapters are reprints of the 'ancestral' papers: Chapter 2 is Shapley's original 1953 paper defining the value; Chapter 3 is the 1954 paper by Shapley and Shubik applying the value to voting models; and chapter 19 is Shapley's 1969 paper defining a value for games without transferable utility. The other seventeen chapters were contributed especially for this volume. The first chapter introduces the subject and the other essays in the volume, and contains a brief account of a few of Shapley's other major contributions to game theory. The other chapters cover the reformulations, interpretations and generalizations that have been inspired by the Shapley value, and its applications to the study of coalition formulation, to the organization of large markets, to problems of cost allocation, and to the study of games in which utility is not transferable.
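To make the contrast between last-touch and Shapley-value attribution concrete, the following is a minimal sketch of the standard Shapley computation: each player's credit is its marginal contribution averaged over all arrival orders. The two publishers (`display`, `search`) and the conversion probabilities in `CONVERSION` are hypothetical numbers chosen for illustration; they are not taken from any of the works listed here.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all arrival orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = frozenset()
        for p in order:
            with_p = seen | {p}
            totals[p] += value(with_p) - value(seen)
            seen = with_p
    return {p: totals[p] / len(orders) for p in players}


# Hypothetical conversion probability for each set of publishers that showed the ad.
CONVERSION = {
    frozenset(): 0.0,
    frozenset({"display"}): 0.02,
    frozenset({"search"}): 0.04,
    frozenset({"display", "search"}): 0.10,
}

credit = shapley_values(["display", "search"], lambda s: CONVERSION[s])
print(credit)  # roughly 0.04 for display and 0.06 for search
```

With these assumed numbers the two credits sum to the joint conversion probability of 0.10, whereas a last-touch rule would assign the full 0.10 to whichever publisher happened to show the final ad.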
Conference Paper
In digital advertising, attribution is the problem of assigning credit to one or more advertisements for driving the user to the desirable actions such as making a purchase. Rather than giving all the credit to the last ad a user sees, multi-touch attribution allows more than one ad to get the credit based on their corresponding contributions. Multi-touch attribution is one of the most important problems in digital advertising, especially when multiple media channels, such as search, display, social, mobile and video are involved. Due to the lack of a statistical framework and a viable modeling approach, true data-driven methodology does not exist today in the industry. While predictive modeling has been thoroughly researched in recent years in the digital advertising domain, the attribution problem focuses more on accurate and stable interpretation of the influence of each user interaction on the final user decision rather than just user classification. Traditional classification models fail to achieve those goals. In this paper, we first propose a bivariate metric: one component measures the variability of the estimate, and the other measures the accuracy of classifying the positive and negative users. We then develop a bagged logistic regression model, which we show achieves a classification accuracy comparable to that of a usual logistic regression, but a much more stable estimate of individual advertising channel contributions. We also propose an intuitive and simple probabilistic model to directly quantify the attribution of different advertising channels. We then apply both the bagged logistic model and the probabilistic model to a real-world data set from a multi-channel advertising campaign for a well-known consumer software and services brand. The two models produce consistent general conclusions and thus offer useful cross-validation. The results of our attribution models also yield several important insights that have been validated by the advertising team. We have implemented the probabilistic model in the production advertising platform of the first author's company, and plan to implement the bagged logistic regression in the next product release. We believe availability of such data-driven multi-touch attribution metric and models is a breakthrough in the digital advertising industry.
Conference Paper
In recent years the online advertising industry has witnessed a shift from the more traditional pay-per-impression model to the pay-per-click and more recently to the pay-per-conversion model. Such models require the ad allocation engine to translate the advertiser's value per click/conversion to value per impression. This is often done through simple models that assume that each impression of the ad stochastically leads to a click/conversion independent of other impressions of the same ad, and therefore any click/conversion can be attributed to the last impression of the ad. However, this assumption is unrealistic, especially in the context of pay-per-conversion advertising, where it is well known in the marketing literature that the consumer often goes through a purchasing funnel before they make a purchase. Decisions to buy are rarely spontaneous, and therefore are not likely to be triggered by just the last ad impression. In this paper, we observe how the current method of attribution leads to inefficiency in the allocation mechanism. We develop a fairly general model to capture how a sequence of impressions can lead to a conversion, and solve the optimal ad allocation problem in this model. We will show that this allocation can be supplemented with a payment scheme to obtain a mechanism that is incentive compatible for the advertiser and fair for the publishers.
Article
Facebook and Google offer hybrid advertising auctions that allow advertisers to bid on a per-impression or a per-click basis for the same advertising space. This paper studies the properties of equilibrium and considers how to increase efficiency in this new auction format. Rational expectations require the publisher to consider past bid types in order to prevent revenue losses to strategic advertiser behavior. The equilibrium results contradict publisher statements and suggest that, conditional on setting rational expectations, publishers should consider offering multiple bid types to advertisers. For a special case of the model, we provide a payment scheme that achieves the socially optimal allocation of advertisers to slots and maximizes publisher revenues within the class of socially optimal payment schemes. When this special case does not hold, no payment scheme will always achieve the social optimum.
Article
Click fraud is the practice of deceptively clicking on search ads with the intention of either increasing third-party website revenues or exhausting an advertiser's budget. Search advertisers are forced to trust that search engines detect and prevent click fraud even though the engines get paid for every undetected fraudulent click. We find conditions under which it is in a search engine's interest to allow some click fraud. Under full information in a second-price auction, if x% of clicks are fraudulent, advertisers will lower their bids by x%, leaving the auction outcome and search engine revenues unchanged. However, if we allow for uncertainty in the amount of click fraud or change the auction type to include a click-through component, search engine revenues may rise or fall with click fraud. A decrease occurs when the keyword auction is relatively competitive because advertisers lower their budgets to hedge against downside risk. If the keyword auction is less competitive, click fraud may transfer surplus from the winning advertiser to the search engine. Our results suggest that the search advertising industry would benefit from using a neutral third party to audit search engines' click fraud detection algorithms.
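The full-information neutrality claim above can be checked with a small worked example. The sketch below assumes that advertisers know the fraud share exactly, that every advertiser faces the same share, and that billed clicks equal genuine clicks divided by one minus the fraud share. The function name, valuations, and click counts are hypothetical and serve only to illustrate the arithmetic.

```python
def second_price_outcome(values_per_real_click, real_clicks, fraud_share):
    """Return (winner index, price per billed click, search engine revenue)."""
    # Only (1 - fraud_share) of billed clicks are genuine, so each
    # advertiser shades its per-click bid by the fraud share.
    bids = [v * (1.0 - fraud_share) for v in values_per_real_click]
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    price = bids[runner_up]  # second-price rule: winner pays the runner-up's bid
    billed_clicks = real_clicks / (1.0 - fraud_share)  # genuine plus fraudulent
    return winner, price, price * billed_clicks


values = [2.0, 1.5, 1.0]  # hypothetical value per genuine click
clean = second_price_outcome(values, real_clicks=1000, fraud_share=0.0)
fraud = second_price_outcome(values, real_clicks=1000, fraud_share=0.2)
print(clean)
print(fraud)
```

Under these assumptions the same advertiser wins in both cases. The per-click price falls by the fraud share while the number of billed clicks rises by the offsetting factor, so revenue is the same up to floating-point rounding.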
Article
This article studies moral hazard with many agents. The focus is on two features that are novel in a multiagent setting: free riding and competition. The free-rider problem implies a new role for the principal: administering incentive schemes that do not balance the budget. This new role is essential for controlling incentives and suggests that firms in which ownership and labor are partly separated will have an advantage over partnerships in which output is distributed among agents. A new characterization of informative (hence valuable) monitoring is derived and applied to analyze the value of relative performance evaluation. It is shown that competition among agents (due to relative evaluations) has merit solely as a device to extract information optimally. Competition per se is worthless. The role of aggregate measures in relative performance evaluation is also explored, and the implications for investment rules are discussed.
Article
Corruption in the public sector erodes tax compliance and leads to higher tax evasion. Moreover, corrupt public officials abuse their public power to extort bribes from the private agents. In both types of interaction with the public sector, the private agents are bound to face uncertainty with respect to their disposable incomes. To analyse effects of this uncertainty, a stochastic dynamic growth model with the public sector is examined. It is shown that deterministic excessive red tape and corruption deteriorate the growth potential through income redistribution and public sector inefficiencies. Most importantly, it is demonstrated that the increase in corruption via higher uncertainty exerts adverse effects on capital accumulation, thus leading to lower growth rates.
Dynamics of bid optimization in online advertisement auctions
  • Christian Borgs
  • Jennifer Chayes
  • Nicole Immorlica
  • Kamal Jain
  • Omid Etesami
  • Mohammad Mahdian
Borgs, Christian, Jennifer Chayes, Nicole Immorlica, Kamal Jain, Omid Etesami, Mohammad Mahdian. 2007. Dynamics of bid optimization in online advertisement auctions. Proceedings of the 16th international conference on World Wide Web. ACM, 531-540.
Multi-touch attribution based budget allocation in online advertising
  • Sahin Cem Geyik
  • Abhishek Saxena
  • Ali Dasdan
Geyik, Sahin Cem, Abhishek Saxena, Ali Dasdan. 2014. Multi-touch attribution based budget allocation in online advertising. Proceedings of the Eighth International Workshop on Data Mining for Online Advertising. ACM, 1-9.
A cascade model for externalities in sponsored search
  • David Kempe
  • Mohammad Mahdian
Kempe, David, Mohammad Mahdian. 2008. A cascade model for externalities in sponsored search. Internet and Network Economics 585-596.
The implications of improved attribution and measurability for antitrust and privacy in online advertising markets
  • Catherine Tucker
Tucker, Catherine. 2012. The implications of improved attribution and measurability for antitrust and privacy in online advertising markets. Geo. Mason L. Rev. 20 1025.
2016 is the year of attribution
  • A Bager
  • J Laszlo
Bager A, Laszlo J (2015) 2016 is the year of attribution. Accessed July 18, 2018, https://www.clickz.com/2016-is-the-year-of-attribution/89517/.
P&G cuts more than $100 million in 'largely ineffective' digital ads
  • A Bruell
  • S Terlep
Bruell A, Terlep S (2017) P&G cuts more than $100 million in 'largely ineffective' digital ads. Wall Street Journal (July 27).
Marketing attribution: Valuing the customer journey
  • Econsultancy
Econsultancy (2012) Marketing attribution: Valuing the customer journey. Accessed February 1, 2012, https://www.thinkwithgoogle.com/_qs/documents/703/marketing-attribution-valuing-the-customer-journey_research-studies.pdf.
Your guide to Facebook bid strategy
  • Facebook
Facebook (2017) Your guide to Facebook bid strategy. Accessed July 17, 2018, https://go.fb.com/rs/267-PVB-941/images/BiddingStrategyGuide_FINAL.pdf.
A Value for n-Person Games
  • L S Shapley
Shapley LS (1952) A Value for n-Person Games (RAND Corporation, Santa Monica, CA).