Article

Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

To measure the effects of advertising, marketers must know how consumers would behave had they not seen the ads. The authors develop a methodology they call ‘Ghost Ads,’ which facilitates this comparison by identifying the control-group counterparts of the exposed consumers in a randomized experiment. The authors show that, relative to Public Service Announcement (PSA) and Intent-to-Treat A/B tests, ‘Ghost Ads’ can reduce the cost of experimentation, improve measurement precision, deliver the relevant strategic baseline, and work with modern ad platforms that optimize ad delivery in real-time. The authors also describe a variant ‘Predicted Ghost Ad’ methodology that is compatible with online display advertising platforms; their implementation records more than 100 million predicted ghost ads per day. The authors demonstrate the methodology with an online retailer's display retargeting campaign. They show novel evidence that retargeting can work as the ads lifted website visits by 17.2% and purchases by 10.5%. Compared to Intent-to-Treat or PSA experiments, advertisers can measure ad lift just as precisely while spending at least an order of magnitude less.
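The comparison described in the abstract can be illustrated with a minimal sketch on made-up data. The `User` record, the `would_see_ad` flag, and the numbers in `example_users` are hypothetical and are not taken from the paper. The sketch only contrasts an Intent-to-Treat comparison (everyone assigned to each arm) with a ghost-ad style comparison (exposed treatment users versus control users flagged as their would-be-exposed counterparts).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class User:
    treated: bool        # randomly assigned to the treatment arm
    would_see_ad: bool   # exposed (treatment) or ghost-ad flagged (control)
    converted: bool      # outcome of interest, e.g. a purchase


def conversion_rate(users: List[User]) -> float:
    """Share of users who converted; 0.0 for an empty group."""
    return sum(u.converted for u in users) / len(users) if users else 0.0


def itt_effect(users: List[User]) -> float:
    """Intent-to-Treat: compare everyone assigned to each arm."""
    treat = [u for u in users if u.treated]
    control = [u for u in users if not u.treated]
    return conversion_rate(treat) - conversion_rate(control)


def ghost_ad_effect(users: List[User]) -> float:
    """Compare exposed treatment users with ghost-ad control users."""
    exposed = [u for u in users if u.treated and u.would_see_ad]
    ghosts = [u for u in users if not u.treated and u.would_see_ad]
    return conversion_rate(exposed) - conversion_rate(ghosts)


def example_users() -> List[User]:
    """Hypothetical data: (treated, would_see_ad, converted, count)."""
    spec = [
        (True, True, True, 3), (True, True, False, 1),
        (True, False, True, 1), (True, False, False, 5),
        (False, True, True, 1), (False, True, False, 3),
        (False, False, True, 1), (False, False, False, 5),
    ]
    return [User(t, s, c) for t, s, c, n in spec for _ in range(n)]
```

On this toy data the Intent-to-Treat difference is about 0.2, while the ghost-ad comparison gives 0.5, because the unexposed users in both arms dilute the Intent-to-Treat estimate without carrying any ad effect.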


... The approach most widely accepted by the research community and the ad tech industry is to measure incrementality using randomized experiments (A/B testing) [3,7,10,13,18]. At a fundamental level, running an experiment for incrementality measurement sets aside a randomly selected hold-out group of users (control group) that receives no ads. This group provides a counterfactual view of the user response without the ad. ...
... That is mainly because identifying counterfactual user impressions in the hold-out group is extremely challenging. Without ads in the control group, the key challenge is to precisely identify the counterfactual ad exposed users [3,13]. ...
... ITT-based solutions do not touch the control group, resulting in the most unbiased campaign counterfactual [3]. Ghost-bidding-based solutions rely on the ability to log would-be (ghost) ad opportunity events at the last controllable step by the ad serving engine, typically submitted bids to ad exchanges [7,13]. The fundamental challenges with these solutions in the literature are: 1) the serving engine as treatment administrator is not blind to the user treatment assignment, 2) ITT and ghost bidding solutions result in large variability in the effect estimates, leading to decreased precision. ...
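The precision problem mentioned for ITT designs can be made explicit with a short worked identity. The notation is an assumption introduced here, not taken from the cited papers: $Z$ is the random assignment, $D$ indicates whether a user would be exposed if assigned to treatment, $Y$ is the outcome, and $\pi = \Pr(D = 1)$. The identity holds under the assumption that the campaign cannot affect users who would never be exposed.

```latex
\begin{align*}
\mathrm{ITT} &= E[Y \mid Z = 1] - E[Y \mid Z = 0], \\
\mathrm{ATT} &= E[Y \mid Z = 1, D = 1] - E[Y \mid Z = 0, D = 1], \\
\mathrm{ITT} &= \pi \cdot \mathrm{ATT} + (1 - \pi) \cdot 0 = \pi \cdot \mathrm{ATT}.
\end{align*}
```

When $\pi$ is small, the ITT signal shrinks in proportion while the never-exposed users still contribute outcome noise. A design that identifies the $D = 1$ users in the control group, as ghost ads or ghost bids aim to do, can estimate the ATT directly on the exposed subpopulation.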
Conference Paper
Full-text available
Measuring the incremental value of advertising (incrementality) is critical for financial planning and budget allocation by advertisers. Running randomized controlled experiments is the gold standard in marketing incrementality measurement. Current literature and industry practices to run incrementality experiments focus on running placebo, intention-to-treat (ITT), or ghost bidding based experiments. A fundamental challenge with these is that the serving engine as treatment administrator is not blind to the user treatment assignment. Similarly, ITT and ghost bidding solutions provide greatly decreased precision since many experiment users never see ads. We present a novel randomized design solution for incrementality testing based on ghost bidding with improved measurement precision. Our design provides faster and cheaper results including double-blind, to the users and to the serving engine, post-auction experiment execution without ad targeting bias. We also identify ghost impressions in open ad exchanges by matching the bidding values or ads sent to external auctions with held-out bid values. This design leads to larger precision than ITT or current ghost bidding solutions. Our proposed design has been fully deployed in a real production system within a commercial programmatic ad network combined with a Demand Side Platform (DSP) that places ad bids in third-party ad exchanges. We have found reductions of up to 85% of the advertiser budget to reach statistical significance with typical ghost bids conversion and winner rates. Moreover, the highest statistical power at 50% control size design of this current practice is reached at 8% of our proposed design.
By deploying this design, for an advertiser in the insurance industry, to measure the incrementality of display and native programmatic advertising, we have found conclusive evidence that the last-touch attribution framework (current industry standard) undervalues these channels by 87% when compared to the incremental conversions derived from the experiment.
... (1) Substantively, as Table 1 shows, we are among the first to reveal a causal adverse incremental impact of immediate retargeting on customer purchases. Advancing prior research on retargeting (Bleier and Eisenbeiss 2015; Johnson, Lewis, and Nubbemeyer 2017; Lambrecht and Tucker 2013; Sahni, Narayanan, and Kalyanam 2019), we not only conceptually differentiate early ECR from late ECR but also empirically demonstrate the double-edged effects of ECR ads and explore the moderated effects. (2) Methodologically speaking, we leverage a multistudy, multisetting research design with three large-scale randomized field experiments based on a fine-grained hourly level of retargeted ads and over 64,000 customers from different companies, which can rigorously test the causal incremental effects of early and late ECR ads and attain a higher generalizability of our findings. ...
... Our article; Lambrecht and Tucker (2013); Bleier and Eisenbeiss (2015); Hoban and Bucklin (2015); Johnson, Lewis, and Nubbemeyer (2017); Sahni et al. (2019). Nonretargeting: Aaker and Bruzzone (1985); Yoo and Kim (2005); Goldstein et al. (2014); Jenkins et al. (2016); Bettman (1979); Alba and Chattopadhyay (1985); Tellis (1988); Lewis and Reiley (2014); Van Heerde et al. (2004) ... than without delivering them, thus wasting online advertising budgets. Prudent advertisers ought to match the timing of ECR ads with the retargeted cart features (for detailed research and managerial implications, see the "Discussion and Implications" section). ...
... A recent stream of research in marketing has examined retargeting ads (Bleier and Eisenbeiss 2015;Johnson, Lewis, and Nubbemeyer 2017;Lambrecht and Tucker 2013;Sahni, Narayanan, and Kalyanam 2019). As Table 2 shows, researchers investigated with days or weeks after abandonment. ...
Article
Full-text available
Consumers often abandon e-commerce carts, so companies are shifting their online advertising budgets to immediate e-commerce cart retargeting (ECR). They presume that early reminder ads, relative to late ones, generate more click-throughs and web revisits. The authors develop a conceptual framework of the double-edged effects of ECR ads and empirically support it with a multistudy, multisetting design. Study 1 involves two field experiments on over 40,500 customers who are randomized to either receive an ECR ad via email and app channels (treatment) or not receive it (control) across different hourly blocks after cart abandonment. The authors find that customers who received an early ECR ad within 30 minutes to one hour after cart abandonment are less likely to make a purchase compared with the control. These findings reveal a causal negative incremental impact of immediate retargeting. In other words, delivering ECR ads too early can engender worse purchase rates than without delivering them, thus wasting online advertising budgets. By contrast, a late ECR ad received one to three days after cart abandonment has a positive incremental impact on customer purchases. In Study 2, another field experiment on 23,900 customers not only replicates the double-edged impact of ECR ads delivered by mobile short message service but also explores cart characteristics that amplify both the negative impact of early ECR ads and positive impact of late ECR ads. These findings offer novel insights into customer responses to online retargeted ads for researchers and managers alike.
... However, z_i(t) is not, in general, proportional to the potential ad stock from winning because the probability of winning the auction can be very heterogeneous across each bid opportunity j and user i. This implies a simple adjustment to the model to unambiguously improve its predictive power, similar to "predicted Ghost Ads" (Johnson, Lewis, and Nubbemeyer 2017), by using a predicted win probability for each auction, , given the ...
... In order to restore conditional independence, we construct the analogous exogenous W regressors via "ghost bids" or "predicted ghost ads" (Johnson, Lewis, and Nubbemeyer 2017). Here, these continuous-time "ghost bid stock" are defined based on the user's context ; we ...
... Then, exogeneity of ξ_i(t) is violated for subsequent bids if our bidding algorithm depends on past exposures, either explicitly via frequency capping or implicitly via changes in any user behavior caused by the advertising that changes future model bids (e.g., a form of endogeneity referred to as "covariate shift" in the machine learning literature). Johnson, Lewis, and Nubbemeyer (2017) provide examples of this bias in their implementation of predicted ghost ads. This weakness of predicted ghost ads in user-level randomization should discourage the use of user-level randomization as the primary source of exogeneity for estimating statistically precise incrementality bidding or attribution models. ...
Preprint
Full-text available
The causal effect of showing an ad to a potential customer versus not, commonly referred to as "incrementality", is the fundamental question of advertising effectiveness. In digital advertising three major puzzle pieces are central to rigorously quantifying advertising incrementality: ad buying/bidding/pricing, attribution, and experimentation. Building on the foundations of machine learning and causal econometrics, we propose a methodology that unifies these three concepts into a computationally viable model of both bidding and attribution which spans the randomization, training, cross validation, scoring, and conversion attribution of advertising's causal effects. Implementation of this approach is likely to secure a significant improvement in the return on investment of advertising.
... Online advertising is a main pillar of web-based economic activity and opens up new frontiers for the study of firms' use of data-driven decision-making (Brynjolfsson and McElheran 2016;Jankowski et al. 2016;Rao and Simonov 2019). Many online advertising platforms offer "experimentation-as-a-service" (Lin et al. 2019) that commonly allows advertisers to run experiments to assess the causal impact of advertising treatments on performance measures (Johnson, Lewis and Nubbemeyer 2017;Kalyanam et al. 2018;Gordon et al. 2019). In an organizational reinforcement learning framing (March 1991;Sutton and Barto 2018), experimentation enables firms to explore -to test new and innovative advertising approaches -to then use the generated insights to reinforce their advertising policies, i.e., to adopt innovative approaches and allocate advertising investment accordingly and more profitably (Fischer et al. 2011;Schwartz, Fader and Bradlow 2017;García-Galicia, Carsteanu and Clempner 2019). ...
... On online advertising platforms, advertising space is usually allocated to advertisers in an auction, based on their respective bid (Johnson, Lewis and Nubbemeyer 2017;Gordon et al. 2019;Lin et al. 2019). Experiments are implemented in different flavors, e.g., through PSA-style or "ghost ad" holdout treatments (see Johnson, Lewis and Nubbemeyer (2017) for an overview), generally providing the ability to learn with high precision due to full randomization and a holdout group without any exposure to the advertising policy (Johnson, Lewis and Nubbemeyer 2017;Gordon et al. 2019;Lin et al. 2019). This style of experimentation is equivalent to random exploration (also called off-policy learning) of (contextual) bandit approaches in reinforcement learning (Sutton and Barto 2018). ...
Article
It is widely assumed that firms experiment with their online advertising to identify more profitable approaches to then increase their investment in more profitable advertising, increasing their overall performance. Generalizable evidence on the actual use of such experiment-based learning by firms is sparse. The study herein addresses this shortcoming – detailing the extent to which large advertisers are utilizing experimentation along with evidence on the benefits of doing so. The findings are gleaned from firms’ marketing and experimentation practices on a large online advertising platform and indicate that, while experimentation is utilized by some, adoption is far from perfect. Among the few firms making use of experiments, even fewer invest a significant share of their advertising spend in experimentation. This finding is surprising in light of broadly assumed regular experimentation by firms. Experimenting firms further experience higher concurrent and subsequent performance, suggesting that leading firms indeed successfully use experiment-based learning to improve their advertising policies – and that many firms may fall short of their potential by not (yet) using experiments in advertising.
... The value of advertising, that is, the increment in profits caused by advertising, although well defined from a conceptual point of view, is difficult to assess in practice, especially for firms. Causally identifying the value of advertising requires platforms to run sophisticated randomized experiments using data not easily available to firms (Johnson et al. 2017a). Besides, ad platforms typically report only aggregate success measures such as the absolute number of purchases or clicks associated with an ad, which do not capture the actual benefit of advertising to firms, as they confound the causal impact of advertising with consumers' baseline probability to respond positively to ads. ...
... If, on the other hand, consumers with low baseline purchase probability are the ones for which ads work best, then the ad platform will target consumers that the firm does not want to target; the incentives will be misaligned. Although some studies have found evidence consistent with a positive relationship between baseline purchase probability and ad effectiveness (Johnson et al. 2017a), others have found no such pattern (Blake et al. 2015). Thus, the magnitude of the incentive misalignment between firms and ad platforms depends on the relationship between consumers' baseline purchase probability and ad effectiveness and is ultimately an empirical question. ...
... Second, our empirical case allows us to investigate the contingencies that determine the implications of the incentives specified in CPA contracts. Our unique data from a large-scale randomized field experiment allows us to observe variation in the bids placed by the ad platform's optimization algorithm under a CPA contract without introducing selection bias into treatment and control group, overcoming a major issue for the identification of ad effectiveness discussed in related work (Johnson et al. 2017a, Gordon et al. 2019). Our analysis provides an empirical example of how firms are affected by CPA contracts contingent on consumers' baseline purchase probability and advertising effectiveness. ...
Article
Full-text available
In programmatic advertising, firms outsource the bidding for ad impressions to ad platforms. Although firms are interested in targeting consumers that respond positively to advertising, ad platforms are usually rewarded for targeting consumers with high overall purchase probability. We develop a theoretical model that shows if consumers with high baseline purchase probability respond more positively to advertising, then firms and ad platforms agree on which consumers to target. If, conversely, consumers with low baseline purchase probability are the ones for which ads work best, then ad platforms target consumers that firms do not want to target—the incentives are misaligned. We conduct a large-scale randomized field experiment, targeting 208,538 individual consumers, in a display retargeting campaign. Our unique data set allows us to both causally identify advertising effectiveness and estimate the degree of incentive misalignments between the firm and ad platform. In accordance with the contracted incentives, the ad platform targets consumers that are more likely to purchase. Importantly, we find no evidence that ads are more effective for consumers with higher baseline purchase probability, rendering the ad platform’s bidding suboptimal for the firm. A welfare analysis suggests that the ad platform’s bidding optimization leads to a loss in profit for the firm and an overall decline in welfare. To remedy the incentive misalignment, we propose a solution in which the firm restricts the ad platform to target only consumers that are profitable based on individual consumer-level estimates for baseline purchase probability and ad effectiveness. This paper was accepted by Anandhi Bharadwaj, information systems.
... Retargeting is where marketers collect data about products that individuals view and deliver ads for those products later, on another website (Pegoraro, 2014). Historically, advertising targeted to appeal to specific consumers has shown positive effects (Bleier & Eisenbeiss, 2015), and marketers assume this will be the case with retargeted ads (Johnson et al., 2017). However, retargeting may result in ads that are too specific to one's individual actions, with unintended negative consequences that may undermine persuasive effects. ...
... While marketers believe that retargeted advertising results in direct positive effects on purchases (Johnson et al., 2017), this does not take into account potential negative effects when ads' use of personal data cues consumers that marketers tracked them. Behaviorally targeted ads should be more likely than general product ads to negatively affect purchase intent indirectly, by cueing perceived marketing surveillance, threat, reactance, and negative attitudes toward the ad. ...
Article
Consumers have described retargeted ads as "creepy," possibly because these ads cue consumers that marketers are collecting personal data. Participants (N = 280) were either exposed to an ad that was targeted to past online behaviors or a general product ad. Behavioral targeting had a positive direct effect on purchase intent, but it also set off a negative indirect effect. Those exposed to behavioral targeting experienced increased perceived marketing surveillance, which led to increased threat, increased psychological reactance, negative attitudes toward the ad, and negative purchase intention. The indirect cost of perceived marketing surveillance on purchase intent was 4.5 percent.
... More broadly, marketers appreciate that the treatment effect of marketing campaigns is the best measure of a campaign's effectiveness (Hohnhold et al., 2015;Johnson et al., 2017). Indeed randomized trials (A/B experiments) are widely used to evaluate marketing campaigns (Radcliffe, 2007;Gordon et al., 2019). ...
... However, for online digital advertising, there are a number of different causal effects that might be of interest. For example, one might be interested in comparing bidding on an ad location versus not, or whether a user was shown your ad versus a competitor's ad (Johnson et al., 2017). Choosing the right causal effect, and designing the appropriate study for measuring that effect is important, and we will assume that such a decision has already been made. ...
Preprint
There are a number of available methods that can be used for choosing whom to prioritize treatment, including ones based on treatment effect estimation, risk scoring, and hand-crafted rules. We propose rank-weighted average treatment effect (RATE) metrics as a simple and general family of metrics for comparing treatment prioritization rules on a level playing field. RATEs are agnostic as to how the prioritization rules were derived, and only assess them based on how well they succeed in identifying units that benefit the most from treatment. We define a family of RATE estimators and prove a central limit theorem that enables asymptotically exact inference in a wide variety of randomized and observational study settings. We provide justification for the use of bootstrapped confidence intervals and a framework for testing hypotheses about heterogeneity in treatment effectiveness correlated with the prioritization rule. Our definition of the RATE nests a number of existing metrics, including the Qini coefficient, and our analysis directly yields inference methods for these metrics. We demonstrate our approach in examples drawn from both personalized medicine and marketing. In the medical setting, using data from the SPRINT and ACCORD-BP randomized control trials, we find no significant evidence of heterogeneous treatment effects. On the other hand, in a large marketing trial, we find robust evidence of heterogeneity in the treatment effects of some digital advertising campaigns and demonstrate how RATEs can be used to compare targeting rules that prioritize estimated risk vs. those that prioritize estimated treatment benefit.
... They also make back-of-house decisions involving the testing of new products (Thomke et al., 1998), production methods, internal processes, and management practices (Ghosh et al., 2020). The rise of the digital era is rapidly expanding the scope and nature of experiments that businesses can run, as well as creating an increased recognition of the benefits it can offer (Kohavi et al., 2009, 2013, 2020). ...
... Another technique relies on the concept of ghost ads: the practice of identifying and using consumers comparable to those that were exposed to an ad in order to predict ad lift. This method is shown to provide reliable estimates of ad effectiveness while reducing ad spend (Johnson et al., 2017). Media mix modeling also enables variance in ad campaigns and spending to be treated as a form of experimentation that can be used to infer causality (Wigren & Cornell, 2019). ...
Article
Full-text available
Marketers know that running experiments is a proven way to improve results and gain competitive advantage against rivals. Despite this knowledge - and the fact that experiments are now easier to conduct than ever before - data shows that marketers consistently under-experiment. This paper examines why this gap exists, and what can be done to close it. We do so by connecting with senior-level marketing professionals representing seven consumer-facing industries in two phases. First, through a series of interviews, we gain initial understanding of the concerns, challenges, and realities of those working in the industry. Following this phase, we then survey a larger group to corroborate and extend our initial findings, comparing between cases to identify challenges and the strategies used to overcome them. We present our findings as a series of experimentation myths before closing with a broader perspective on how organizations can infuse experimentation into their culture.
... CTR measures the proportion of effectively allocated ads or the ratio of the clicks on an ad to its number of impressions [15]. Other work has proposed sophisticated measurements of online ad effectiveness such as using ghost ads and experimental approaches [43]. ...
Conference Paper
Full-text available
Showing ads delivers revenue for online content distributors, but ad exposure can compromise user experience and cause user fatigue and frustration. Correctly balancing ads with other content is imperative. Currently, ad allocation relies primarily on demographics and inferred user interests, which are treated as static features and can be privacy-intrusive. This paper uses person-centric and momentary context features to understand optimal ad-timing. In a quasi-experimental study on a three-month longitudinal dataset of 100K Snapchat users, we find ad timing influences ad effectiveness. We draw insights on the relationship between ad effectiveness and momentary behaviors such as duration, interactivity, and interaction diversity. We simulate ad reallocation, finding that our study-driven insights lead to greater value for the platform. This work advances our understanding of ad consumption and bears implications for designing responsible ad allocation systems, improving both user and platform outcomes. We discuss privacy-preserving components and the ethical implications of our work.
... For instance, knowing who recently bought prenatal vitamins is likely to increase the accuracy of predicting who will respond to ads for diapers (Duhigg, 2012). Likewise, knowing who previously browsed a product website increases the ability to accurately predict who will buy the product in response to an ad for it (i.e., behavioral re-targeting, Johnson et al., 2017b; Sahni et al., 2019). Such increases in predictive accuracy are not due to any particular methodological sophistication or "big data" per se, but due to the availability of data relevant to predicting the particular decision, similar to how marketers have used consumer purchase data to target offers for decades. ...
Article
Recent technology advances (e.g., tracking and “AI”) have led to claims and concerns regarding the ability of marketers to anticipate and predict consumer preferences with great accuracy. Here, we consider the predictive capabilities of both traditional techniques (e.g., conjoint analysis) and more recent tools (e.g., advanced machine learning methods) for predicting consumer choices. Our main conclusion is that for most of the more interesting consumer decisions, those that are “new” and non‐habitual, prediction remains hard. In fact, in many cases, prediction has become harder due to the increasing influence of just‐in‐time information (user reviews, online recommendations, new options, etc.) at the point of decision that can neither be measured nor anticipated ex ante. Sophisticated methods and “big data” can in certain contexts improve predictions, but usually only slightly, and prediction remains very imprecise—so much so that it is often a waste of effort. We suggest marketers focus less on trying to predict consumer choices with great accuracy and more on how the information environment affects the choice of their products. We also discuss implications for consumers and policymakers.
... The notion of Shapley value (Berman 2018, Singal et al. 2019) has also been used to split the credit fairly among different channels. Johnson et al. (2017), Lewis et al. (2011), Lewis and Rao (2015) all study models related to causality and (external) attribution. Their approach is based on econometric parametric models. ...
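As a rough illustration of how a Shapley value can split credit among channels, the sketch below averages each channel's marginal contribution over all orderings. It is a generic textbook computation, not the method of any cited paper. The channel names and the conversion counts in `CONVERSIONS` are hypothetical, and the brute-force enumeration is only practical for a handful of channels.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    channels: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Average marginal contribution of each channel over all orderings."""
    totals = {c: 0.0 for c in channels}
    orderings = list(permutations(channels))
    for order in orderings:
        seen: FrozenSet[str] = frozenset()
        for c in order:
            with_c = seen | {c}
            # Credit the channel with what it adds to the channels before it.
            totals[c] += value(with_c) - value(seen)
            seen = with_c
    return {c: t / len(orderings) for c, t in totals.items()}


# Hypothetical conversions observed for each set of active channels.
CONVERSIONS = {
    frozenset(): 0.0,
    frozenset({"display"}): 10.0,
    frozenset({"search"}): 20.0,
    frozenset({"display", "search"}): 40.0,
}

credit = shapley_values(["display", "search"], lambda s: CONVERSIONS[s])
```

With these made-up numbers, `credit` assigns 15 conversions to display and 25 to search, which together add up to the 40 conversions of the full channel set.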
Preprint
Full-text available
Attribution, the mechanism that assigns conversion credits to individual marketing interactions, is one of the foremost digital marketing research topics. Indeed, attribution is used everywhere in online marketing whether for budget decisions or for the design of algorithmic bidders that the advertiser relies on to buy inventory. Still, an attribution definition on which everyone in the marketing community agrees is yet to be found. In this paper, we focus on the problem faced by a bidder who is subject to an attribution black box decided by the advertiser and needs to convert it into a bid. This naturally introduces a second level of attribution, performed internally by the bidder. We first formalize an often implicitly used solution concept for this task. Second, we characterize the solution of this problem which we call the core internal attribution. Moreover, we show it can be computed using a fixed-point method. Interestingly, the core attribution is based on a notion of marginality that differs from previously used definitions such as the counterfactual marginality.
... Deeper examination of this finding, reinforced with evidence from a lab experiment, shows that consumer preferences evolve during the purchase decision process, implying that the most effective advertising content changes during the consumer's journey. Recognizing the critical importance of randomized experimentation for effective display advertising in modern ad platforms, Johnson, Lewis, and Nubbemeyer's (2017) paper, which won the Green Award, develops a "ghost ads" methodology that reduces the cost of experimentation and improves measurement precision. ...
Article
The authors study the nature of articles published in the Journal of Marketing Research (JMR) during the seven-year period 2013–2019. Consistent with the broad positioning of JMR, they find substantial diversity in domains, topics, methods, and sources of data among the published articles. They observe the emergence of new substantive topics, such as social media, social networks, and prosocial behavior, which reinforce the continued relevance of JMR. Notably, they observe increasing convergence across articles in the behavioral, quantitative, and strategy domains, reflecting more shared substantive topics of interest and common use of methods and sources of data. This trend bodes well for JMR, given its historical position as a diverse journal in the field.
... First, future experiments should examine the effects of viewing time on consequential choices (Morales et al., 2017) or actual purchases (e.g., Ghose & Todri-Adamopoulos, 2016) rather than purchase intentions. Second, our results need replication using field data, but controlling for variables such as product involvement and relevance can be difficult or requires sophisticated modeling (e.g., Johnson et al., 2017). Replication should extend these results to different forms of video advertising (e.g., prerolls), different ad durations (e.g., 6-, 15-, and 60-seconds), different viewing conditions (e.g., tablet computers, mobile phones), and additional measures such as actual sales (e.g., Taylor et al., 2013). ...
Article
The Media Rating Council recommends weighting advertising exposure by viewing time. Prior research shows viewing time has diminishing returns, implying that effectiveness equivalent to a 100% complete exposure could be achieved by a lower threshold. Results from four laboratory experiments, extending prior banner-ad research to dynamic video ads, suggest 75% viewed is a potential threshold. A second contribution identifies different viewing time distributions for television and online video, due to differences in ad avoidance. More television ads have viewing times exceeding the 75% threshold, and so are more effective than the typical online video ad. Online networks could charge fees equivalent to television ads for video ads that exceed the 75% threshold. A third contribution is the use of interval outcome estimation (IOE), which revealed asymmetric effects of viewing time and that brand familiarity rather than viewing time is the only necessary explanation of ad effectiveness measured by recall.
... This offers multiple implications for researchers and practitioners alike: first, the present "viewability" model might be an insufficient measure of ad attention (i.e., the 50% policy). Further, a more granular reading of the viewport might also be a relevant input variable for the assessment of ad effectiveness, which currently still relies on data generated through the policy-model (Ghose and Todri-Adamopoulos 2016; Johnson et al. 2017). Further, our findings indicate that the viewport offers relevant data to analyze advertising effects, besides clickstreams (Bucklin and Sismeiro 2009) and, thus, constitute another promising source of "big data" for studying mobile advertising (Grewal et al. 2016). ...
Conference Paper
Full-text available
Advertisers have to pay publishers for "viewable" ads, irrespective of whether the users paid active attention. In this paper, we suggest that a granular analysis of users' viewing patterns can help us to progress beyond mere "viewability" and toward actual differentiation of whether a user has paid attention to an ad or not. To this end, we use individual viewport trajectories, which measure the sequence of locations and times an object (e.g., an ad) is visible on the display of a device (desktop or mobile). To validate our model and benchmark it against the extant models, such as the "viewability" policy (50% threshold) model, we use data from an eye-tracking experiment. Findings confirm the improved model fit, highlight distinct viewing patterns in the data, and inform information processing on mobile phones. Consequently, implications are relevant to publishers, advertisers, and consumer researchers.
... Combining experimental variation and simulated outcomes, they show that the use of cookie-level data results in an attenuation bias under a "clean" environment. Their setting abstracts away from commonly observed confounds resulting from consumer behavior, including activity bias (Lewis, Rao and Reiley, 2015; Johnson, Lewis and Nubbemeyer, 2017) and purchase substitution across channels (Goldfarb and Tucker, 2011). This setting allows them to construct an experiment-based de-biasing approach to reconstruct the user-level effect from cookie-level estimates. ...
Preprint
Consumers interact with firms across multiple devices, browsers, and machines; these interactions are often recorded with different identifiers for the same individual. The failure to correctly match different identities leads to a fragmented view of exposures and behaviors. This paper studies the identity fragmentation bias, referring to the estimation bias resulting from using fragmented data. Using a formal framework, we decompose the contributing factors of the estimation bias caused by data fragmentation and discuss the direction of bias. Contrary to conventional wisdom, this bias cannot be signed or bounded under standard assumptions. Instead, upward biases and sign reversals can occur even in experimental settings. We then propose and compare several corrective measures, and demonstrate their performance using an empirical application.
... The relevance of online advertising as one of the major sources of revenue in the Internet has attracted the attention of researchers in two major areas: (i) improving the efficiency of the online advertising ecosystem [22][23][24][25][26][27] and (ii) increasing the transparency and protection of Internet users exposed to online advertising. Our work fits in the second area. ...
Article
Full-text available
Online advertising is a wealthy industry that generated more than $100B in 2018 in the US alone and delivers billions of ads to Internet users every day. These impressive numbers have also attracted the attention of malicious players that try to exploit the online advertising ecosystem for their own benefit. In particular, one of the most harmful practices refers to malicious users that act as advertisers to deliver unsafe ads. The goal of these ads is to compromise the security of the users that receive those ads. This practice is referred to as Malvertising. Some reports have estimated the economic loss caused by malvertising to the online advertising sector at $1.1B in 2017. This paper is the first work that analyses and quantifies the impact of malvertising in Facebook. To accomplish this study, we rely on a dataset that includes more than 5 M ads delivered to 3 K Facebook users from 126 K advertisers between October 2016 and May 2018. Our results reveal that although the portion of advertisers (0.68%) and ads (0.17%) associated with malvertising is very low, 1/3 of the users in our study were exposed to malvertising. Finally, we also propose a novel solution to block malvertising ads in real-time in Facebook.
... For example, a retailer might buy its brand name as a search term, but a customer who searches for the brand name online may find the brand without the retailer having to pay for the search term. See the work of Lewis and Reiley (2014); Johnson, Lewis, and Nubbemeyer (2017); and Olaya, Coussement, and Verbeke (2020) for recent surveys and benchmarking studies. ...
Article
Full-text available
Computational advertising (CA) is a rapidly growing field, but there are numerous challenges related to measuring its effectiveness. Some of these are classic challenges where CA offers a new aspect to the challenge (e.g., multi-touch attribution, bias), and some are brand-new challenges created by CA (e.g., fake data and ad fraud, creeping out customers). In this article, we present a measurement system framework for CA to provide a common starting point for advertising researchers to begin addressing these challenges, and we also discuss future research questions and directions for advertising researchers. We identify a larger role for measurement: It is no longer something that happens at the end of the advertising process; instead, measurements of consumer behaviors become integral throughout the process of creating, executing, and evaluating advertising programs.
... For example, a retailer might buy its brand name as a search term, but a customer who searches for the brand name may find the brand without the retailer having to pay for the search term. See the work of Reiley (2014), Johnson, Lewis and Nubbemeyer (2017), and Olaya, Coussement and Verbeke (2020) for recent surveys and benchmarking studies. ...
Preprint
Full-text available
Computational advertising (CA) is a rapidly growing field, but there are numerous challenges related to measuring its effectiveness. Some of these are classic challenges where CA offers a new aspect to the challenge (e.g., multi touch attribution, bias), and some are brand new challenges created by CA (e.g., fake data and ad fraud, creeping out customers). In this paper, we present a measurement system framework for CA to provide a common starting point for advertising researchers to begin addressing these challenges, and we also discuss future research questions and directions for advertising researchers. We identify a larger role for measurement: it is no longer something that happens at the end of the advertising process, but instead measurements of consumer behaviors become integral throughout the process of creating, executing, and evaluating advertising programs.
... This design has proven useful in a wide variety of settings. In the context of Internet advertising, designs using ghost advertisements serve this function by delivering treatment or placebo advertisements to everyone who visits a given web page (Johnson et al., 2017). Other examples include attempts to persuade people about social issues at their doorsteps (Kalla and Broockman, 2018) or to mobilize them to vote via phone calls (Gerber et al., 2010). ...
Article
Full-text available
Education–entertainment refers to dramatizations designed to convey information and to change attitudes. Buoyed by observational studies suggesting that education–entertainment strongly influences beliefs, attitudes and behaviours, scholars have recently assessed education–entertainment by using rigorous experimental designs in field settings. Studies conducted in developing countries have repeatedly shown the effectiveness of radio and film dramatizations on outcomes ranging from health to group conflict. One important gap in the literature is estimation of social spillover effects from those exposed to the dramatizations to others in the audience members’ social network. In theory, the social diffusion of media effects could greatly amplify their policy impact. The current study uses a novel placebo‐controlled design that gauges both the direct effects of the treatment on audience members as well as the indirect effects of the treatment on others in their family and in the community. We implement this design in two large cluster‐randomized experiments set in rural Uganda using video dramatizations on the topics of violence against women, teacher absenteeism and abortion stigma. We find several instances of sizable and highly significant direct effects on the attitudes of audience members, but we find little evidence that these effects diffused to others in the villages where the videos were aired.
... Lower costs of experimentation might increase the number of ad experiments that are run (e.g., Schwartz, Bradlow and Fader, 2017). Johnson, Lewis and Nubbemeyer (2017a) proposed "ghost ads" as a method to "identify ads in the control group that would have been the focal advertiser's ads had the consumer been in the treatment group." Google implemented ghost ads and has reduced experimentation costs by an order of magnitude. ...
Article
Full-text available
Digital advertising markets are growing and attracting increased scrutiny. This article explores four market inefficiencies that remain poorly understood: ad effect measurement, frictions between and within advertising channel members, ad blocking, and ad fraud. Although these topics are not unique to digital advertising, each manifests in unique ways in markets for digital ads. The authors identify relevant findings in the academic literature, recent developments in practice, and promising topics for future research.
... Two kinds of randomized experiments have been proposed: cookie-level experiments, where a large number of cookies are randomly assigned to two different ad serving conditions (treatment and control), and geo experiments, where experimental units are nonoverlapping ad-targetable geographical areas; see Adwords [2021] for a list of "geo targets" supported by Google. While a cookie experiment is relatively easier to analyze due to large sample [Johnson et al., 2017, Gordon et al., 2019, Kohavi et al., 2020], it is limited to online metrics only (e.g., online conversion) and its measurement may be inaccurate due to technical issues such as cross-devices, sign-in/out, cookie churn, etc. (Yen et al. [2012] and Coey and Bailey [2016]). More importantly, such experiments may be impossible to execute properly due to data protection laws (e.g. ...
Preprint
Full-text available
How to measure the incremental Return On Ad Spend (iROAS) is a fundamental problem for the online advertising industry. A standard modern tool is to run randomized geo experiments, where experimental units are non-overlapping ad-targetable geographical areas (Vaver & Koehler 2011). However, how to design a reliable and cost-effective geo experiment can be complicated, for example: 1) the number of geos is often small, 2) the response metric (e.g. revenue) across geos can be very heavy-tailed due to geo heterogeneity, and furthermore 3) the response metric can vary dramatically over time. To address these issues, we propose a robust nonparametric method for the design, called Trimmed Match Design (TMD), which extends the idea of Trimmed Match (Chen & Au 2019) and furthermore integrates the techniques of optimal subset pairing and sample splitting in a novel and systematic manner. Some simulation and real case studies are presented. We also point out a few open problems for future research.
... They find that rigorous experiments are tremendously expensive to set up, but it does suggest that they are possible. Johnson et al. (2017) suggest a unique methodology they call "ghost ads" to create counterfactuals, allowing researchers to estimate what would have happened without the advertisements. And more and more researchers are using Facebook to conduct surveys, suggesting the possibility of recruiting participant panels to conduct experiments on third-party platforms like SurveyMonkey or Qualtrics. ...
... A recent study of the effects of TV advertising, which was comprehensive and methodologically superior to its predecessors, revealed that the effects of the vast majority of TV ads were small at best, with "a sizable percentage of statistically insignificant or negative estimates" (Shapiro et al., 2019). Another recent and large-scale study, with billions of observations, of online display ad effectiveness also points to very small or null effects for the vast majority of ad campaigns, with a few positive outliers, and half of the (limited) effect of advertising being to merely make people buy earlier a product they would have bought anyway (Johnson et al., 2017). ...
Article
Are we gullible? Can we be easily influenced by what others tell us, even if they do not deserve our trust? Many strands of research, from social psychology to cultural evolution, suggest that humans are by nature conformist and eager to follow prestigious leaders. By contrast, an evolutionary perspective suggests that humans should be vigilant towards communicated information, so as not to be misled too often. Work in experimental psychology shows that humans are equipped with sophisticated mechanisms that allow them to carefully evaluate communicated information. These open vigilance mechanisms lead us to reject messages that clash with our prior beliefs, unless the source of the message has earned our trust, or provides good arguments, in which case we can adaptively change our minds. These mechanisms make us largely immune to mass persuasion, explaining why propaganda, political campaigns, advertising, and other attempts at persuading large groups nearly always fall on deaf ears. However, some false beliefs manage to spread through communication. I argue that most popular false beliefs are held reflectively, which means that they have little effect on our thoughts and behaviors, and that many false beliefs can be socially beneficial. Accepting such beliefs thus reflects a much weaker failure in our evaluation of communicated information than might at first appear.
... A substantial body of research demonstrates the effectiveness of behavioral advertising. Specifically for retargeting, studies show how such ads can significantly improve ad effectiveness in terms of click-through or view-through compared to generic or non-personalized ads (e.g., Bleier and Eisenbeiss 2015a; Hoban and Bucklin 2015; Johnson, Lewis, and Nubbemeyer 2017; Lambrecht and Tucker 2013), and even help to bring consumers back to the advertiser's online store weeks after an initial visit (Sahni, Narayanan, and Kalyanam 2019). ...
... For example, in a cross-section of 55 randomized advertising field experiments for consumer packaged goods (fast-moving pre-packaged, consumer nondurables including food, beverages, health and beauty, and cleaning products), Lodish et al. (1995) not only find that the effects of successful television ad campaigns persist more than two years, but the longer-term magnitudes are more than double the immediate-run effects. On the other hand, using a cross-section of 432 digital display-advertising field experiments, Johnson, Lewis, and Nubbemeyer (2017) find that advertising decays at an astonishingly rapid rate of 23 percent per day. ...
Article
US companies invested over $500 billion in 2021 in intangible brand capital, over 2% of GDP. During the past decade, US companies have also been growing their internal marketing capabilities, an often overlooked source of human capital. We discuss the private and social benefits of these intangible brand capital stocks. While the private returns to companies are fairly clear, the academic literature has been divided over the social benefits and costs of advertising and promotion, the two key investment vehicles. We also discuss the implications of brand capital for measured productivity.
... One such inefficiency concerns online ad measurement, and it is suggested that the chasm between marketing practitioners and academicians is around the issues of endogeneity (Rutz & Watson, 2019). Despite the prevalence of field experiments in Marketing, often presented as gold standards to create causal insights (Johnson et al., 2017), problems concerning A/B testing remain. Feit and Berman (2019) in their research reframe A/B tests as tools to manage the trade-off between the opportunity cost of the test and the potential losses associated with deploying a suboptimal treatment to the entire population and propose an alternative that theoretically achieves the same performance as MAB implementation. ...
Article
Full-text available
One of the core challenges in digital marketing is that the business conditions continuously change, which impacts the reception of campaigns. A winning campaign strategy can become unfavored over time, while an old strategy can gain new traction. In data-driven digital marketing and web analytics, A/B testing is the prevalent method of comparing digital campaigns, choosing the winning ad, and deciding targeting strategy. A/B testing is suitable when testing variations on similar solutions and having one or more metrics that are clear indicators of success or failure. However, when faced with a complex problem or working on future topics, A/B testing fails to deliver, and achieving long-term impact from experimentation is demanding and resource intensive. This study proposes a reinforcement learning based model and demonstrates its application to digital marketing campaigns. We argue and validate with real-world data that reinforcement learning can help overcome some of the critical challenges that A/B testing and popular Machine Learning methods currently used in digital marketing campaigns face. We demonstrate the effectiveness of the proposed technique on real data for a digital marketing campaign collected from a firm.
... To increase purchase behavior at the purchase decision stage of the purchase funnel, retailers should consider displaying messages that would increase purchase urgency and item desirability, such as time or quantity restrictions (Inman et al., 1997), and emphasizing other positive cues, such as the number of deals already sold (Kukar-Kinney & Xia, 2017). Another managerial implication is to use retargeting (Johnson et al., 2017; Sahni et al., 2019) as a way to remind customers of the clearance item(s) previously placed into the cart. ...
Article
This research investigates online consumer behavior in an e-commerce context with a focus on consumer online shopping cart use and subsequent cart abandonment. A model rooted in the Uses and Gratifications Theory, the Unified Theory of Acceptance and Use of Technology, and the concept of the purchase funnel is developed to explain the predicted relationships. Empirical findings based on clickstream data show that returning to an existing cart increases the subsequent cart use and decreases cart abandonment. Conversely, viewing clearance pages and viewing a large number of product reviews increases both cart use and cart abandonment. Browsing product pages decreases cart use, and increases cart abandonment. The moderating role of smartphone-based shopping is also examined, with the moderating effects primarily occurring early in the purchase funnel affecting cart use, and influencing cart abandonment to a smaller degree. Theoretical contributions and managerial implications for digital marketers are provided.
... This quasi-experimental variation is induced by the limited budget and the probabilistic throttling algorithm that is widely used by many ad publishers. Our method is related to Johnson et al. (2017), which enables ad publishers to more efficiently measure advertising effect through logging the ghost ads. Similar to their setting, our paper focuses on measuring advertising effect for ad publishers who often have direct access to the algorithm that generates advertising exposure. ...
Preprint
Full-text available
Causally identifying the effect of digital advertising is challenging, because experimentation is expensive, and observational data lacks random variation. This paper identifies a pervasive source of naturally occurring, quasi-experimental variation in user-level ad-exposure in digital advertising campaigns. It shows how this variation can be utilized by ad-publishers to identify the causal effect of advertising campaigns. The variation pertains to auction throttling, a probabilistic method of budget pacing that is widely used to spread an ad-campaign's budget over its deployed duration, so that the campaign's budget is not exceeded or overly concentrated in any one period. The throttling mechanism is implemented by computing a participation probability based on the campaign's budget spending rate and then including the campaign in a random subset of available ad-auctions each period according to this probability. We show that access to logged-participation probabilities enables identifying the local average treatment effect (LATE) in the ad-campaign. We present a new estimator that leverages this identification strategy and outline a bootstrap estimator for quantifying its variability. We apply our method to ad-campaign data from JD.com, which uses such throttling for budget pacing. We show our estimate is statistically different from estimates derived using other standard observational methods such as OLS and two-stage least squares estimators based on auction participation as an instrumental variable.
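The throttling step described in this abstract can be illustrated with a minimal sketch. The pacing rule below (remaining budget divided by projected unthrottled spend, clipped to [0, 1]) and the names `participation_probability` and `throttle` are assumptions made for illustration only; the abstract does not specify the rule used in the paper or at JD.com.

```python
import random


def participation_probability(remaining_budget, projected_spend):
    """Hypothetical pacing rule (an assumption, not the paper's formula):
    the share of auctions the campaign can afford to enter this period,
    clipped to the interval [0, 1]."""
    if projected_spend <= 0:
        # Nothing would be spent anyway, so no throttling is needed.
        return 1.0
    return max(0.0, min(1.0, remaining_budget / projected_spend))


def throttle(auction_ids, prob, rng):
    """Enter each available auction independently with probability `prob`.

    Returns (auction_id, participated, logged_probability) tuples. Logging
    the probability next to each decision is the piece of data the abstract
    says makes the treatment effect identifiable.
    """
    return [(auction_id, rng.random() < prob, prob) for auction_id in auction_ids]
```

Because each auction's participation is an independent random draw with a known, logged probability, the auctions that were skipped form a randomized comparison group for those that were entered.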
... Given that a control group is to be used, another issue is what to show the control group. Johnson, et al. (2017) advocate the use of "ghost" ads, which, roughly speaking, are the ads that would have been shown in the absence of the experiment. Since field experiments involve the opportunity costs of setting non-optimal levels of the treatment variables, they can be very costly. ...
Article
The fast-paced growth of e-commerce is rapidly changing consumers’ shopping habits and shaping the future of the retail industry. While online retailing has allowed companies to overcome geographic barriers to selling and helped them achieve operational efficiencies, offline retailers have struggled to compete with online retailers, and many retailers have chosen to operate both online and offline. This paper presents a review of the literature on the interaction between e-commerce and offline retailing, highlighting empirical findings and generalizable insights, and discussing their managerial implications. Our review includes studies published in more than 50 different academic journals spanning various disciplines from the inception of the internet to present. We organize our paper around three main research questions. First, what is the relationship between online and offline retail channels including competition and complementarity between online and offline sellers as well as online and offline channels of an omnichannel retailer? Under this question we also try to understand the impact of e-commerce on market structure and what factors impact the intensity of competition /complementarity. Second, what is the impact of e-commerce on consumer behavior? We specifically investigate how e-commerce has impacted consumer search, its implications for price dispersion, and user generated content. Third, how has e-commerce impacted retailers’ key managerial decisions? The key research questions under this heading include: (i) What is the impact of big data on retailing? (ii) What is the impact of digitization on retailer outcomes? (iii) What is the impact of e-commerce on sales concentration? (iv) What is the impact of e-commerce and platforms on pricing? And (v) How should retailers manage product returns across online and offline channels? 
Under each section, we also develop detailed recommendations for future research which we hope will inspire continued interest in this domain.
... This is relatively easy for the tech giants (such as Google, Booking and Netflix), but can be more complex for firms in other industries. As digital environments facilitate randomization, controlled experiments, known as A/B tests, have become an increasingly popular part of a firm's analytics capabilities (Schwartz et al., 2017; Johnson et al., 2017). Thus, we have chosen to conduct a cross-industry analysis to gain an all-round understanding and highlight aspects from which they can learn from each other. ...
Conference Paper
Companies' capabilities to experiment continuously and ambidextrously are an enabler for growth. Scientific literature has focused on experimentation and ambidexterity separately, thus not investigating how effective experimentation can be implemented to achieve ambidexterity. Moreover, most of the empirical evidence comes from digital companies, thus missing opportunities to provide contributions regarding other companies. This study offers a cross-industry perspective on how different companies are dealing with experimentation to achieve ambidexterity showing the crucial role of organizational.
... We test the relationship between image type and display ad CTRs (i.e., the ratio of all consumers clicking on an ad divided by the total number of ad impressions) by partnering with a leading content discovery platform that distributes ads on various publisher sites and tracks audience behavior. We use CTR as the primary dependent variable because it is one of the most commonly used and important metrics to track the performance of online advertising (Aribarg and Schwartz 2020; Johnson, Lewis, and Nubbemeyer 2017; Melumad and Meyer 2020). ...
Article
Full-text available
Smartphones have made it nearly effortless to share images of branded experiences. This research classifies social media brand imagery and studies user response. Aside from packshots (standalone product images), two types of brand-related selfie images appear online: consumer selfies (featuring brands and consumers’ faces) and an emerging phenomenon the authors term “brand selfies” (invisible consumers holding a branded product). The authors use convolutional neural networks to identify these archetypes and train language models to infer social media response to more than a quarter-million brand-image posts (185 brands on Twitter and Instagram). They find that consumer-selfie images receive more sender engagement (i.e., likes and comments), whereas brand selfies result in more brand engagement, expressed by purchase intentions. These results cast doubt on whether conventional social media metrics are appropriate indicators of brand engagement. Results for display ads are consistent with this observation, with higher click-through rates for brand selfies than for consumer selfies. A controlled lab experiment suggests that self-reference is driving the differential response to selfie images. Collectively, these results demonstrate how (interpretable) machine learning helps extract marketing-relevant information from unstructured multimedia content and that selfie images are a matter of perspective in terms of actual brand engagement.
Preprint
Full-text available
Online advertisements delivered via social media platforms function in a similar way to phishing emails. In recent years there has been a growing awareness that political advertisements are being microtargeted and tailored to specific demographics, which is analogous to many social engineering attacks. This has led to calls for total bans on this kind of focused political advertising. Additionally, there is evidence that phishing may be entering a more developed phase using software known as Phishing as a Service to collect information on phishing or social engineering, potentially facilitating microphishing campaigns. To help understand such campaigns, a set of well-defined metrics can be borrowed from the field of digital marketing, providing novel insights which inform phishing email analysis. Our work examines in what ways digital marketing is analogous to phishing and how digital marketing metric techniques can be used to complement existing phishing email analysis. We analyse phishing email datasets collected by the University of Houston in comparison with Corporate junk email and microtargeting Facebook Ad Library datasets, thus comparing these approaches and their results using Weka, URL mismatch and visual metrics analysis. Our evaluation of the results demonstrates that phishing emails can be joined up in unexpected ways which are not revealed using traditional phishing filters. However such microphishing may have the potential to gather, store and analyse social engineering information to be used against a target at a later date in a similar way to microtargeting.
Article
To inform product release and distribution strategies, research has analyzed cross-market spillovers in new product adoption. However, models that examine these effects for digital and viral media are still evolving. Given resistance to advertising, firms often seek to promote their own viral content to boost brand awareness. However, a key shortcoming of virality is its ephemeral nature. To gain insight into sustaining virality, we develop a quasi-experimental approach that estimates the backward spillover onto a focal platform by introducing a piece of content onto a new platform. We posit that introducing content to the audience of a new platform can generate word of mouth, which may affect its consumption within an earlier platform. We estimate these spillovers using data on 381 viral videos on 26 platforms (e.g., YouTube, Vimeo) and observe how consumption of videos on an initial “lead” platform is affected by their subsequent introduction onto “lag” platforms. This spillover is estimated as follows: for each multiplatform video, we compare its view growth after being introduced onto a new platform to that of a synthetic control based on similar single-platform videos. Analysis of 275 such spillover scenarios reveals that introducing a video onto a lag platform roughly doubles its subsequent view growth in the lead platform. This positive cross-platform spillover is persistent, bursty, and strongest in the first 42 days. We find that spillover is boosted when the video is consumed more in the lag platform, when the consumption rate peaks earlier in the lag platform, and when the lag platform targets a foreign market. Our findings suggest that firms can sustain the popularity of their viral content by introducing it onto additional platforms (e.g., Vimeo) after posting it on a focal platform (e.g., YouTube). As a result of their posting on the latter platforms, firms can expect subsequent view growth on the focal platform to roughly double. 
The aforementioned benefits persist for up to five lag platforms. Platforms should also consider that a positive cross-platform spillover may help platforms reinforce each other’s usage, rather than cannibalize each other.
Article
By means of a meta-analysis, we synthesize the findings of over two decades of research from 88 empirical studies regarding four well established and theoretically rooted determinants of consumers’ attitude towards digital advertising: informativeness, entertainment, irritation, and credibility. Among other findings, we show that the effects of these determinants have changed over the past 20 years as the internet has developed. We also find that the effects differ depending on which type of online touchpoint was considered. In particular, we differentiate between the most prominent online touchpoints: email advertising, social media advertising, search engine advertising, web display banner advertising, electronic word-of-mouth communication, and corporate website advertising. Additionally, we extend the well-established determinants by more recent ones accounting for the ongoing digitalization and advances in online touchpoints (i.e. personalization, privacy concerns and interactivity). We also derive important managerial implications and several fundamental directions for future research.
Chapter
This research systematically reviews the directions in existing research in the digital marketing domain and unveils the irresponsibility in the digital advertising domain. The inefficiencies inherited from traditional advertising are enhanced or magnified by digital channels. This research reviews previous studies on advertising efficiency and states the enhanced challenges in the digital era: agency problem, advertising effect measurement, and the black box of programmatic advertising. Further, this research proposes the data as one potential direction for future study in the digital advertising domain.
Keywords: Irresponsibility; Digital advertising; Advertising efficiency; Agency problem
Chapter
Online advertising has historically been approached as an ad-to-user matching problem within sophisticated optimization algorithms. As the research and ad tech industries have progressed, advertisers have increasingly emphasized the causal effect estimation of their ads (incrementality) using controlled experiments (A/B testing). With low lift effects and sparse conversion, the development of incrementality testing platforms at scale suggests tremendous engineering challenges in measurement precision. Similarly, the correct interpretation of results addressing a business goal requires significant data science and experimentation research expertise. We propose a practical tutorial in the incrementality testing landscape, including:
  • The business need
  • Literature solutions and industry practices
  • Designs in the development of testing platforms
  • The testing cycle, case studies, and recommendations
  • Paid search effectiveness in the marketplace
  • Emerging privacy challenges for incrementality testing and research solutions
We provide first-hand lessons based on the development of such a platform in a major combined DSP and ad network, and after running several tests for up to two months each over recent years. With increasing privacy constraints, we survey literature and current practices. These practices include private set union and differential privacy for conversion modeling, and geo-testing combined with synthetic control techniques.
Article
Display advertising is a $50 billion industry in which advertisers’ (e.g., P&G, Geico) demand for impressions is matched to publishers’ (e.g., Facebook, Wall Street Journal) supply of them. An ideal match is one wherein the publisher’s ad impression is assigned to the advertiser with the highest value for it. Intermediaries (e.g., Google) facilitate this match between advertisers and publishers by managing data and providing optimization tools and algorithms for serving ads. Although these markets exhibit high allocative efficiency, we argue there is considerable scope for improvement.
Chapter
Digital marketing is one of the fastest-growing advertising channels and crossed the $330 billion mark in 2019. With exponentially increasing budgets, measuring the impact of marketing investments and driving effectiveness becomes essential for brands. The complexity of the digital ad-tech ecosystem is constantly evolving with brands running marketing activities across multiple channels, new targeting capabilities, and different formats. Due to this intricacy, traditional digital measurement metrics like cost per click, return on investment, cost per conversion, etc. just scratch the surface while measuring the actual impact of marketing strategies remains unsettled. We bridged this gap in marketing measurement by using the incremental lift as a metric to measure the impact of a marketing strategy. Incrementality testing is a mathematical approach to differentiate between correlation and causation. We formulated the Viewability Lift method by applying the concepts of A/B testing which can be implemented in the digital marketing ecosystem. In this method, we measure the effectiveness of an ad by comparing the users who are exposed to an ad versus users that are not exposed to an ad. Our methodology covers concepts of test environment setup, randomization, bias handling, hypothesis testing, primary output and understanding different ways of using this output. We used this output for digital marketing strategy planning and campaign optimizations leading to improved campaign efficiency.
Article
The Identity Fragmentation Bias
Article
A vast majority of digital display advertisers rely on large digital ad platforms to run their ad campaigns. Although ad platforms managing real-time bidding systems offer state-of-the-art services to enhance the performance of ad campaigns, their inner workings are largely opaque to customers. As a result, advertisers who seek to value their campaigns in collaboration with third-party platforms must necessarily contend with the problem of estimation bias attributable to these algorithms in addition to the high cost of implementation. We propose an alternative approach to valuation for advertisers who choose to bypass automated performance optimizers of ad platforms. We show that external frequency caps that set upper limits on the number of ad impressions outside the purview of bidding algorithms can serve this purpose effectively. Eliminating performance optimizers allows the advertiser to value ads without relying on the support services of the DSP, with the added benefit of a broader customer reach and a markedly lower cost.
Article
This study provides a comprehensive assessment of the impact of Advertising Creative Strategy (ACS) on advertising elasticity, founded on an integrative framework which distinguishes between the Function (content) and the Form (execution) of an advertising creative. Function is evaluated using a three-dimensional representation of content (Experience, Affect, Cognition), whereas the representation of Form accounts for both executional elements and the use of creative templates. The distinction between Function and Form allows for the investigation of potential synergies between content and execution, previously unaccounted for in the literature. The ACS framework also facilitates the calculation of composite metrics that capture holistic aspects of the creative strategy, such as Focus, or the extent of the emphasis on a specific content dimension, and Variation i.e., changes in content and execution over time. The empirical application focuses on a Dynamic Linear Model analysis of 2251 television advertising creatives from 91 brands in 16 consumer packaged goods categories. The findings suggest that in terms of Function, experiential content has the biggest effect on elasticity, followed by cognitive and affective content. Function and Form produce synergies that can be leveraged by advertisers to increase returns. Finally, Focus, Variation and the use of templates increase advertising elasticity.
Chapter
Online advertisements delivered via social media platforms function in a similar way to phishing emails. In recent years there has been a growing awareness that political advertisements are being microtargeted and tailored to specific demographics, which is analogous to many social engineering attacks. This has led to calls for total bans on this kind of focused political advertising. Additionally, there is evidence that phishing may be entering a more developed phase using software known as Phishing as a Service to collect information on phishing or social engineering, potentially facilitating microphishing campaigns. To help understand such campaigns, a set of well-defined metrics can be borrowed from the field of digital marketing, providing novel insights which inform phishing email analysis. Our work examines in what ways digital marketing is analogous to phishing and how digital marketing metric techniques can be used to complement existing phishing email analysis. We analyse phishing email datasets collected by the University of Houston in comparison with Corporate junk email and microtargeting Facebook Ad Library datasets, thus comparing these approaches and their results using Weka, URL mismatch and visual metrics analysis. Our evaluation of the results demonstrates that phishing emails can be joined up in unexpected ways which are not revealed using traditional phishing filters. However such microphishing may have the potential to gather, store and analyse social engineering information to be used against a target at a later date in a similar way to microtargeting.
Article
A large-scale comparison of experimental advertising effects and those obtained using two state-of-the-art methods.
Article
Online Display Advertising's importance as a marketing channel is partially due to its ability to attribute conversions to campaigns. Current industry practice to measure ad effectiveness is to run randomized experiments using placebo ads, assuming external validity for future exposures. We identify two different effects: a strategic effect of the campaign presence in marketplaces, and a selection effect due to user targeting, which are confounded in current practices. We propose two novel randomized designs to: 1) estimate the overall campaign attribution without placebo ads, 2) disaggregate the campaign presence and the ad effects. Using the Potential Outcomes Causal Model, we address the selection effect by estimating the probability of selecting influenceable users. We show the ex-ante value of continuing evaluation to enhance the user selection for ad exposure mid-flight. We analyze two performance-based (CPA) and one Cost-Per-Impression (CPM) campaigns with 20M+ users each. We estimate a negative CPM campaign presence effect due to cross product spillovers. Experimental evidence suggests that CPA campaigns incentivize the selection of converting users regardless of the ad, up to 96% more than CPM campaigns, thus challenging the standard practice of targeting most likely converting users. Code: https://github.com/joelbz/DispAdvAttr-in-Mrkt-ExpDgn-Est External link: https://users.soe.ucsc.edu/~jbarajas/publications/paper_MarketingScience.pdf
Article
Firms track consumers’ shopping behaviors in their online stores to provide individually personalized banners through a method called retargeting. We use data from two large-scale field experiments and two lab experiments to show that, although personalization can substantially enhance banner effectiveness, its impact hinges on its interplay with timing and placement factors. First, personalization increases click-through especially at an early information state of the purchase decision process. Here, banners with a high degree of content personalization (DCP) are most effective when a consumer has just visited the advertiser’s online store, but quickly lose effectiveness as time passes since that last visit. We call this phenomenon overpersonalization. Medium DCP banners, on the other hand, are initially less effective, but more persistent, so that they outperform high DCP banners over time. Second, personalization increases click-through irrespective of whether banners appear on motive congruent or incongruent display websites. In terms of view-through, however, personalization increases ad effectiveness only on motive congruent websites, but decreases it on incongruent websites. We demonstrate in the lab how perceptions of ad informativeness and intrusiveness drive these results depending on consumers’ experiential or goal-directed Web browsing modes.
Article
An analysis is performed on the results of 241 real world TV advertising tests conducted by Information Resources, Inc. between 1989 and 2003 to partially update the findings of Lodish et al. [Journal of Marketing Research 32, 2 (1995): 125-39]. Two types of market test results, BehaviorScan and Matched-Market, are analyzed. Overall, the improvement of TV advertising sales effectiveness because of media weight increase is significantly larger than zero for established products, which is different from Lodish et al.'s finding. A further analysis indicates that such significance is mainly driven by more recent tests. A comparison between the new results and Lodish et al. reveals a significant increase in the average advertising effectiveness for tests completed after 1995. The new data still suggest (as did the original data) that it is of great managerial interest to identify advertising effectiveness before launching advertising campaigns.
Article
Social advertising uses information about consumers' peers, including peer affiliations with a brand, product, organization, etc., to target ads and contextualize their display. This approach can increase ad efficacy for two main reasons: peers' affiliations reflect unobserved consumer characteristics, which are correlated along the social network; and the inclusion of social cues (i.e., peers' association with a brand) alongside ads affects responses via social influence processes. For these reasons, responses may be increased when multiple social signals are presented with ads, and when ads are affiliated with peers who are strong, rather than weak, ties. We conduct two very large field experiments that identify the effect of social cues on consumer responses to ads, measured in terms of ad clicks and the formation of connections with the advertised entity. In the first experiment, we randomize the number of social cues present in word-of-mouth advertising, and measure how responses increase as a function of the number of cues. The second experiment examines the effect of augmenting traditional ad units with a minimal social cue (i.e., displaying a peer's affiliation below an ad in light grey text). On average, this cue causes significant increases in ad performance. Using a measurement of tie strength based on the total amount of communication between subjects and their peers, we show that these influence effects are greatest for strong ties. Our work has implications for ad optimization, user interface design, and central questions in social science research.
Conference Paper
Display advertising has traditionally been sold via guaranteed contracts; a guaranteed contract is a deal between a publisher and an advertiser to allocate a certain number of impressions over a certain period, for a pre-specified price per impression. However, as spot markets for display ads, such as the RightMedia Exchange, have grown in prominence, the advertisements to show on a given page are increasingly chosen based on price, using an auction. As the number of participants in the exchange grows, the price of an impression becomes a signal of its value. This correlation between price and value means that a seller implementing the contract through bidding should offer the contract buyer a range of prices, and not just the cheapest impressions necessary to fulfill its demand. Implementing a contract using a range of prices is akin to creating a mutual fund of advertising impressions, and requires randomized bidding. We characterize what allocations can be implemented with randomized bidding, namely those where the desired share obtained at each price is a non-increasing function of price. In addition, we provide a full characterization of when a set of campaigns are compatible and how to implement them with randomized bidding strategies.
Article
We investigate conditions sufficient for identification of average treatment effects using instrumental variables. First we show that the existence of valid instruments is not sufficient to identify any meaningful average treatment effect. We then establish that the combination of an instrument and a condition on the relation between the instrument and the participation status is sufficient for identification of a local average treatment effect for those who can be induced to change their participation status by changing the value of the instrument. Finally we derive the probability limit of the standard IV estimator under these conditions. It is seen to be a weighted average of local average treatment effects.
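The identification result in the abstract above can be illustrated with the standard Wald (instrumental variables) ratio for a binary instrument. The sketch below is only an illustration under stated assumptions: the function name `wald_late` and the toy data are hypothetical, and the data are built so that half of each arm are compliers whose outcome rises by 2 when treated.

```python
from statistics import mean

def wald_late(z, d, y):
    """Wald (IV) ratio: effect of the instrument on the outcome divided by
    its effect on participation. z and d are 0/1 lists; y holds outcomes."""
    y1 = mean(yi for zi, yi in zip(z, y) if zi == 1)
    y0 = mean(yi for zi, yi in zip(z, y) if zi == 0)
    d1 = mean(di for zi, di in zip(z, d) if zi == 1)
    d0 = mean(di for zi, di in zip(z, d) if zi == 0)
    first_stage = d1 - d0
    if first_stage == 0:
        raise ValueError("instrument does not shift participation")
    return (y1 - y0) / first_stage

# Hypothetical toy data: four units per arm, half of them compliers.
# Compliers participate only when z == 1 (baseline outcome 5, treated outcome 7);
# the remaining units never participate (outcome 3).
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]
y = [7.0, 7.0, 3.0, 3.0, 5.0, 5.0, 3.0, 3.0]

itt = mean(y[:4]) - mean(y[4:])   # intent-to-treat effect on the outcome
late = wald_late(z, d, y)         # effect for those induced to participate
```

In this toy case the intent-to-treat difference is 1.0 while the ratio recovers the compliers' effect of 2.0, because only half of the assigned units change their participation status.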
Article
The authors analyze results of 389 BehaviorScan® matched household, consumer panel, split cable, real world T.V. advertising weight, and copy tests. Additionally, study sponsors—packaged goods advertisers, T.V. networks, and advertising agencies—filled out questionnaires on 140 of the tests, which could test common beliefs about how T.V. advertising works, to evaluate strategic, media, and copy variables unavailable from the BehaviorScan® results. Although some of the variables did indeed identify T.V. advertising that positively affected sales, many of the variables did not differentiate among the sales effects of different advertising treatments. For example, increasing advertising budgets in relation to competitors does not increase sales in general. However, changing brand, copy, and media strategy in categories with many purchase occasions in which in-store merchandising is low increases the likelihood of T.V. advertising positively affecting sales. The authors’ data do not show a strong relationship between standard recall and persuasion copy test measures and sales effectiveness. The data also suggest different variable formulations for choice and market response models that include advertising.
Conference Paper
Identifying the same internet user across devices or over time is often infeasible. This presents a problem for online experiments, as it precludes person-level randomization. Randomization must instead be done using imperfect proxies for people, like cookies, email addresses, or device identifiers. Users may be partially treated and partially untreated as some of their cookies are assigned to the test group and some to the control group, complicating statistical inference. We show that the estimated treatment effect in a cookie-level experiment converges to a weighted average of the marginal effects of treating more of a user's cookies. If the marginal effects of cookie treatment exposure are positive and constant, it underestimates the true person-level effect by a factor equal to the number of cookies per person. Using two separate datasets---cookie assignment data from Atlas and advertising exposure and purchase data from Facebook---we empirically quantify the differences between cookie and person-level advertising effectiveness experiments. The effects are substantial: cookie tests underestimate the true person-level effects by a factor of about three, and require two to three times the number of people to achieve the same power as a test with perfect treatment assignment.
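The dilution claim in the abstract above can be checked on a deliberately simplified model. The sketch below assumes (these are illustration assumptions, not the paper's setup) that one person owns `k` cookies, each cookie is assigned to test or control with equal probability, the person-level outcome is attributed to every one of that person's cookies, and each treated cookie adds a constant amount `b` to the outcome. The function name `cookie_level_estimate` is hypothetical.

```python
from itertools import product

def cookie_level_estimate(k, b):
    """Exact treated-minus-control difference in mean outcome across cookies,
    enumerating all equally likely assignments of one person's k cookies."""
    treated, control = [], []
    for assignment in product([0, 1], repeat=k):
        outcome = b * sum(assignment)  # constant marginal effect per treated cookie
        for a in assignment:
            # The person's outcome is logged against each of their cookies.
            (treated if a else control).append(outcome)
    return sum(treated) / len(treated) - sum(control) / len(control)

k, b = 3, 2.0
person_level_effect = k * b            # all cookies treated versus none
estimate = cookie_level_estimate(k, b)
ratio = person_level_effect / estimate
```

Under these assumptions the cookie-level comparison recovers only the marginal effect of one cookie, so the person-level effect is larger by a factor equal to the number of cookies per person.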
Article
Yahoo! Research partnered with a nationwide retailer to study the effects of online display advertising on both online and in-store purchases. We use a randomized field experiment on 3 million Yahoo! users who are also past customers of the retailer. We find statistically significant evidence that the retailer ads increase sales 3.6% relative to the control group. We show that control ads boost measurement precision by identifying and removing the half of in-campaign sales data that are unaffected by the ads. Less data give us 31% more precision in our estimates, equivalent to increasing our sample to 5.3 million users. By contrast, we only improve precision by 5% when we include additional covariate data to reduce the residual variance in our experimental regression. The covariate-adjustment strategy disappoints despite exceptional consumer-level data including demographics, ad exposure levels, and two years’ worth of past purchase history. Data, as supplemental material, are available at http://dx.doi.org/10.1287/mksc.2016.0998.
Article
The author analyzes the impact of online ads on the advertiser's competitors, using data from randomized field experiments on a restaurant-search website. He finds that ads increase the chances of sales for nonadvertised restaurants significantly. The spillover benefits are concentrated on restaurants that serve the advertiser's cuisine and have a high rating on the restaurant-search website. The extent of spillovers also depends on the intensity of the advertising effort. The spillovers are largest when the intensity (frequency) of advertising is low. As the intensity increases, the spillovers disappear and the advertiser gains more sales. These patterns are consistent with the following mechanism: ads increase the chance of consumers buying the advertised product but also remind consumers of similar (nonadvertised) options. Higher ad intensity leads to a stronger direct effect favoring the advertiser and can offset the spillover caused by the broader reminder.
Book
Most questions in social and biomedical sciences are causal in nature: what would happen to individuals, or to groups, if part of their environment were changed? In this groundbreaking text, two world-renowned experts present statistical methods for studying such questions. This book starts with the notion of potential outcomes, each corresponding to the outcome that would be realized if a subject were exposed to a particular treatment or regime. In this approach, causal effects are comparisons of such potential outcomes. The fundamental problem of causal inference is that we can only observe one of the potential outcomes for a particular subject. The authors discuss how randomized experiments allow us to assess causal effects and then turn to observational studies. They lay out the assumptions needed for causal inference and describe the leading analysis methods, including matching, propensity-score methods, and instrumental variables. Many detailed applications are included, with special focus on practical aspects for the empirical researcher.
Article
Twenty-five large field experiments with major U.S. retailers and brokerages, most reaching millions of customers and collectively representing $2.8 million in digital advertising expenditure, reveal that measuring the returns to advertising is difficult. The median confidence interval on return on investment is over 100 percentage points wide. Detailed sales data show that relative to the per capita cost of the advertising, individual-level sales are very volatile; a coefficient of variation of 10 is common. Hence, informative advertising experiments can easily require more than 10 million person-weeks, making experiments costly and potentially infeasible for many firms. Despite these unfavorable economics, randomized control trials represent progress by injecting new, unbiased information into the market. The inference challenges revealed in the field experiments also show that selection bias, due to the targeted nature of advertising, is a crippling concern for widely employed observational methods. JEL Codes: L10, M37, C93.
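To give a sense of why a coefficient of variation of 10 leads to such large experiments, the sketch below applies the textbook sample-size formula for a two-sided, two-sample z-test on means. It is a rough illustration, not the authors' calculation: the function name `sample_size_per_arm`, the 1% relative lift, the 5% significance level, and the 80% power are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(cv, relative_lift, alpha=0.05, power=0.80):
    """Units per arm for a two-sided, two-sample z-test with equal variances.

    cv            : standard deviation of the outcome divided by its mean
    relative_lift : effect to detect, as a fraction of the mean outcome
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    noise_to_signal = cv / relative_lift  # sigma / delta
    return ceil(2 * (z_alpha + z_power) ** 2 * noise_to_signal ** 2)

# Hypothetical scenario: outcome CV of 10, lift equal to 1% of mean sales.
n = sample_size_per_arm(cv=10, relative_lift=0.01)
```

With these assumed inputs the formula asks for roughly 15 to 16 million observations per arm, the same order of magnitude as the figure quoted in the abstract. Because the requirement scales with the square of the noise-to-signal ratio, halving the detectable lift quadruples the sample.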
Article
This study examines the effects of Internet display advertising using cookie-level data from a field experiment at a financial tools provider. The experiment randomized assignment of cookies to treatment (firm ads) and control conditions (charity ads), enabling the authors to handle different sources of selection bias, including targeting algorithms and browsing behavior. They analyze display ad effects for users at different stages of the company's purchase funnel (i.e., nonvisitor, visitor, authenticated user, and converted customer) and find that display advertising positively affects visitation to the firm's website for users in most stages of the purchase funnel, but not for those who previously visited the site without creating an account. Using a binary logit model, the authors calculate marginal effects and elasticities by funnel stage and analyze the potential value of reallocating display ad impressions across users at different stages. Expected visits increase almost 10% when display ad impressions are partially reallocated from nonvisitors and visitors to authenticated users. The authors also show that results from the controlled experiment data differ significantly from those computed using standard correlational approaches.
Article
We find display advertising influences customer search for both the advertised brand and its competitors. We exploit a natural experiment that randomizes ad delivery on 500 million visits to the Yahoo! homepage and compare visitors’ subsequent activities on Yahoo! Search. In three advertisers’ campaigns, display ads increase searches for advertised brands by 30-45% and for competitors’ brands by up to 23%. Strikingly, the total number of incremental searches for competitors is 2-8 times the increase for advertisers’ brands. We discuss how these spillovers create strategic complementarities for search advertisers and reduce firms’ investments in advertising.
Article
Mobile advertising is one of the fastest-growing advertising formats. In 2013, global spending on mobile advertising was approximately $16.7 billion, and it is expected to exceed $62.8 billion by 2017. The most prevalent type of mobile advertising is mobile display advertising (MDA), which takes the form of banners on mobile web pages and in mobile applications. This article examines which product characteristics are likely to be associated with MDA campaigns that are effective in increasing consumers' (1) favorable attitudes toward products and (2) purchase intentions. Data from a large-scale test-control field experiment covering 54 U.S. MDA campaigns that ran between 2007 and 2010 and involved 39,946 consumers show that MDA campaigns significantly increased consumers' favorable attitudes and purchase intentions only when the campaigns advertised products that were higher (vs. lower) involvement and utilitarian (vs. hedonic). The authors explain this finding using established theories of information processing and persuasion and suggest that when MDAs work effectively, they do so by triggering consumers to recall and process previously stored product information.
Article
Internet advertising has been the fastest growing advertising channel in recent years, with paid search ads comprising the bulk of this revenue. We present results from a series of large-scale field experiments done at eBay that were designed to measure the causal effectiveness of paid search ads. Because search clicks and purchase intent are correlated, we show that returns from paid search are a fraction of non-experimental estimates. As an extreme case, we show that brand keyword ads have no measurable short-term benefits. For non-brand keywords, we find that new and infrequent users are positively influenced by ads but that more frequent users whose purchasing behavior is not influenced by ads account for most of the advertising expenses, resulting in average returns that are negative.
Article
A randomized experiment with 1.6 million customers measures positive causal effects of online advertising for a major retailer. The advertising profitably increases purchases by 5%. 93% of the increase occurs in brick-and-mortar stores; 78% of the increase derives from consumers who never click the ads. Our large sample reaches the statistical frontier for measuring economically relevant effects. We improve econometric efficiency by supplementing our experimental variation with non-experimental variation caused by consumer browsing behavior. Our experiment provides a specification check for observational difference-in-differences and cross-sectional estimators; the latter exhibits a large negative bias three times the estimated experimental effect.
Article
Online experiments are widely used to compare specific design alternatives, but they can also be used to produce generalizable knowledge and inform strategic decision making. Doing so often requires sophisticated experimental designs, iterative refinement, and careful logging and analysis. Few tools exist that support these needs. We thus introduce a language for online field experiments called PlanOut. PlanOut separates experimental design from application code, allowing the experimenter to concisely describe experimental designs, whether common "A/B tests" and factorial designs, or more complex designs involving conditional logic or multiple experimental units. These latter designs are often useful for understanding causal mechanisms involved in user behaviors. We demonstrate how experiments from the literature can be implemented in PlanOut, and describe two large field experiments conducted on Facebook with PlanOut. For common scenarios in which experiments are run iteratively and in parallel, we introduce a namespaced management system that encourages sound experimental practice.
Article
The Lipid Research Clinics Coronary Primary Prevention Trial (LRC-CPPT) measured the effectiveness of the drug cholestyramine for lowering cholesterol levels. The patients in the study were measured for compliance (the proportion of the intended dose actually taken) and for cholesterol decrease. The compliance-response regression for the Treatment group shows a smooth increasing effect of the drug in cholesterol level with increasing compliance. However, a similar, though less dramatic, compliance-response regression is seen in the Control group. This article investigates the recovery of the true dose-response curve from the Treatment and Control compliance-response curves. A simple model is proposed, analyzed, and applied to the LRC-CPPT data. Under this model, part but not all of the true dose-response curve can be estimated.
Article
We use a controlled field experiment to investigate the dynamic effects of retail advertising. The experimental design overcomes limitations hindering previous investigations of this issue. Our study uncovers dynamic advertising effects that have not been considered in previous literature. We find that current advertising does affect future sales, but surprisingly, the effect is not always positive; for the firm's best customers, the long-run outcome may be negative. This finding reflects two competing effects: brand switching and intertemporal substitution. We also find evidence of cross-channel substitution, with the firm's best customers switching demand to the ordering channel that corresponds to the advertising. (JEL L2, L81, M3) Copyright (c) 2008 Western Economic Association International.
Where Ads Might Appear in the Display Network
  • Google
  • Sahni Navdeep
  • Goldfarb Avi