arXiv:1502.06657v1 [cs.AI] 24 Feb 2015
Multi-Touch Attribution Based Budget Allocation
in Online Advertising
Sahin Cem Geyik *
Applied Science Division
Turn Inc.
Redwood City, CA 94063
sgeyik@turn.com
Abhishek Saxena *
Applied Science Division
Turn Inc.
Redwood City, CA 94063
asaxena@turn.com
Ali Dasdan
Applied Science Division
Turn Inc.
Redwood City, CA 94063
adasdan@turn.com
ABSTRACT
Budget allocation in online advertising deals with distribut-
ing the campaign (insertion order) level budgets to different
sub-campaigns which employ different targeting criteria and
may perform differently in terms of return-on-investment
(ROI). In this paper, we present the efforts at Turn on how
to best allocate campaign budget so that the advertiser or
campaign-level ROI is maximized. To do this, it is cru-
cial to be able to correctly determine the performance of
sub-campaigns. This determination is highly related to the
action-attribution problem, i.e. to be able to find out the set
of ads, and hence the sub-campaigns that provided them to
a user, that an action should be attributed to. For this pur-
pose, we employ both last-touch (last ad gets all credit) and
multi-touch (many ads share the credit) attribution method-
ologies. We present the algorithms deployed at Turn for the
attribution problem, as well as their parallel implementation
on the large advertiser performance datasets. We conclude
the paper with our empirical comparison of last-touch and
multi-touch attribution-based budget allocation in a real on-
line advertising setting.
Categories and Subject Descriptors
J.0 [Computer Applications]: General
General Terms
Algorithms, Application
Keywords
Online advertising, Multi-touch attribution, Budget alloca-
tion
* The authors contributed to this work equally.
This paper has been published in:
ADKDD'14, August 24, 2014, New York City, New York, U.S.A.
1. INTRODUCTION
In online advertising, our goal is to serve the best ad for
a given user in an online context. Advertisers often set con-
straints which affect the applicability of the ads, e.g., an
advertiser might want to target only the users of a certain
geographic area visiting web pages of certain types for a spe-
cific campaign. Furthermore, the objective of advertisers in
general is to receive as many actions as possible utilizing dif-
ferent campaigns in parallel. Actions are advertiser defined
and can be one of inquiring about or purchasing a product,
filling out a form, visiting a certain page, etc. [9].
An ad from an advertiser can be shown to a user on a
publisher (website, mobile app etc.) only if the value for
the ad impression opportunity is high enough to win in a
real-time auction [5]. Advertisers signal their value via bids,
which are calculated as the action probability given a user in a
certain online context multiplied by the cost-per-action goal
an advertiser wants to meet or beat. Once an advertiser,
or the demand-side platform that acts on their behalf, wins
the auction (i.e. submits the highest bid), it is responsible for
paying the amount of the second highest bid (i.e. second-price
auction). Due to this, each advertiser needs to carefully
manage their budget which dictates their capability to bid.
In this paper, we are focusing on the problem of distribut-
ing a campaign’s budget to its sub-campaigns (with different
targeting criteria) so that the return-on-investment (ROI,
i.e. value received compared to the amount spent on adver-
tising) is maximized, since the sub-campaigns may have dif-
ferent performances and spending capabilities due to their
targeting. Furthermore, we will focus on the problem of
action attribution in determining a sub-campaign’s perfor-
mance (which helps with setting its budget), i.e. when an
action is received by an advertiser, finding out which sub-campaign(s)
showed the ads that caused that action.
We examine both last-touch attribution (LTA, i.e. a user’s
action is attributed to the last ad s/he sees) and multi-touch
attribution (MTA, i.e. a user’s action is attributed fraction-
ally to a subset of the ads s/he sees). The contributions of
the paper can be summarized as:
•A budget allocation scheme that distributes money
from the campaign top-level to the sub-campaigns ac-
cording to their performance,
•Examination of two action-attribution approaches to
determine sub-campaign performance: last-touch and
multi-touch, with an emphasis on the latter,
•A methodology on finding multi-touch attribution of
actions to sub-campaigns on large advertiser perfor-
mance datasets (i.e. spending of campaigns and user
data of impressions as well as the actions received),
and its efficient parallel implementation. This imple-
mentation has enabled us to process real-world online
advertising datasets (tens of terabytes of user profile
data, and multiple billions of virtual users) that are
bigger than other published efforts dealing with multi-
touch attribution so far,
•An empirical comparison of last-touch versus multi-
touch attribution based budget allocation on a real
advertising system. To the best of our knowledge, this
is the first paper to show how ROI is impacted by the
choice of attribution method, and demonstrate the ef-
fect of MTA on a real-world online advertising cam-
paign.
The rest of the paper is organized as follows. §2 gives background
on both budget allocation and action-attribution in the advertising
domain, as well as previous work in the literature on these
subjects. §3 will give the definition of the problem we
would like to solve in this paper. We present our method-
ology on both budget allocation, as well as sub-campaign
performance determination using both last and multi-touch
action attribution schemes in §4. The implementation de-
tails of the methodology (system design as well as parallel
implementation) are given in §5, which is followed by our pre-
liminary results on different attribution methods for budget
allocation given in §6. Finally, we conclude the paper and
present some potential future work in §7. As a side note,
we will be using the terms campaign and insertion order
(IO), as well as sub-campaign and line item interchangeably
throughout the paper. While the latter terms are more spe-
cific to online advertising domain, they are commonly used
to describe a certain hierarchy within an advertiser.
2. BACKGROUND AND PREVIOUS WORK
In this section, we will give some preliminary information
on the subject matter, as well as previous work in the liter-
ature.
2.1 Budget Allocation in Online Advertising
In online advertising, the advertisers aim to show their ad
to a user on a publisher (web site, mobile app etc.), so that
they get the highest number of actions for the money they
spend. To be able to utilize the market more efficiently, they
employ different tactics, i.e. different campaigns with differ-
ent targeting rules. For example, a sports goods company
can decide to set up a campaign to show their golf equip-
ment ads to users above a certain age or income, while their
sneaker ads may be directed towards a wider audience. This
inherently constructs a hierarchy for the advertisers. In our
model, advertisers have different campaigns (e.g. each cam-
paign is the advertising for a certain type of product) which
we call insertion orders, but each campaign can also have
sub-campaigns (with different targeting, or different medi-
ums (media channels), such as social, video, mobile etc.),
which we call line items. A simple example of such a hier-
archy is given in Figure 1.
Figure 1: Example of an Advertiser Hierarchy (an advertiser holds campaigns/insertion orders 1 through n, e.g. one per product, and each campaign holds sub-campaigns/line items with their own targeting).

Budget allocation deals with the distribution of the daily insertion order budget to the line items under it (since we assume advertisers set up insertion-order-level budgets manually), and has to take into account both the spending capabilities (i.e. whether a line item's targeting allows it to reach enough users to be able to spend the money that is assigned to it), as well as the performance (i.e. if a line item spends a certain amount of money, what is the value of the actions that will be received), which is its return-on-investment (ROI). Please see Figure 2 for an explanation of the budget allocation problem. In the example, the insertion order has a daily budget of $B$, and the line items are assigned daily budgets $B_i$ such that $\sum_i B_i = B$. Each line item has an ROI of $R_i$, and a maximum spending capability (due to targeting, bidding, etc.) of $S_i$. During budget allocation, the spending capability should be considered so that for each line item $i$ we have $B_i \le S_i$ (so that no line item is assigned more money than it can spend). The overall return from the allocation given in Figure 2 can then be calculated as $\sum_i R_i \min(S_i, B_i)$. These calculations of course assume that we have the ROI and spending capability information, which is not the case in real settings (indeed, the main focus of this paper is learning this information). The formal problem definition (in §3) gives further details on the budget allocation problem.

Figure 2: Budget Allocation Example (an insertion order with daily budget $B$ and line items 1 through m, each with a daily budget $B_i$, spending capability $S_i$, and ROI $R_i$).
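As a quick illustration of this objective (the numbers are made up purely for exposition): suppose $B = \$100$ and there are two line items with $R_1 = 3$, $S_1 = \$60$ and $R_2 = 1$, $S_2 = \$100$. The allocation $B_1 = \$60$, $B_2 = \$40$ yields an overall return of $3 \cdot 60 + 1 \cdot 40 = \$220$, while the reverse split $B_1 = \$40$, $B_2 = \$60$ yields only $3 \cdot 40 + 1 \cdot 60 = \$180$; saturating the higher-ROI line item first is better, which is exactly what the greedy procedure in §3 does.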
2.2 Action-Attribution Problem in Online Advertising
As aforementioned, the aim of the advertiser is to receive
as many actions as possible. Furthermore, the advertiser
needs to know which sub-campaign contributed to how many
actions, hence realizing the effectiveness of the different tac-
tics utilized. The big problem for this task is the fact that
the action usually happens much later than showing the
ad to the user, e.g. user sees many ads online, and then
purchases an item, hence it is hard to attribute actions to
sub-campaigns. A very simple example for this action at-
tribution problem is given in Figure 3. In the example, we
present two methodologies, last-touch attribution (the most
commonly used method, attributes the action fully to the
last seen ad), and multi-touch attribution (MTA, the action
is attributed to many ads seen from the same advertiser).
Please note that in the figure, we presented a very simple
case of MTA, where each ad gets an equal proportion of the
action, which is rarely the case in the real setting.
Figure 3: Action Attribution Example. A user sees ads from Line Items 1 through 4 over time and then takes an action. Under last-touch attribution, the last ad (from Line Item 4) receives the full action (1 action) and the other ads receive 0. Under the (simple) multi-touch attribution shown, each of the four line items receives 0.25 of the action.
Naturally, action attribution and budget allocation are
closely related. To be able to correctly allocate budget to
sub-campaigns, we need to know how effective they are,
i.e. how many actions they contributed to versus how much
money was spent on them. This contribution is calculated
by the action attribution methodology we employ (presented
in §4).
2.3 Previous Work
In this section we will present some previous efforts in the
literature on both budget allocation and action-attribution.
2.3.1 Previous Efforts in Budget Allocation
Budget allocation at the campaign level for online advertising is not a very broadly examined subject in the
literature. Most of the papers so far focus on the topic of budget
optimization, i.e. given a budget constraint, how to set the
bid values as well as spending profile to maximize utility
(i.e. budget allocation per impression, rather than per cam-
paign). This is significantly different than the problem we
are working on, since our aim is actually to set these budget
constraints. Therefore the efforts in budget optimization are
complementary to our work: After we set the budgets in the
campaign level, budget optimization can take place to allo-
cate these budgets in the impression level. As an example of
budget optimization, we can list [4], where the user behav-
ior is modeled as a Markov chain. This modeling takes into
account that advertising for a specific campaign type affects
future behavior of the user, changing state transition prob-
abilities. The authors model budget optimization task as a
constrained optimal control problem for a Markov Decision
Process (MDP).
Most of the budget allocation efforts so far aimed to max-
imize click revenue, since their focus stayed within the do-
main of search advertising. For example, the authors of
[10] propose a combined model of bid price and budget de-
termination for keywords. They assume that click-through
rate (CTR) is a function of bid price and take into account
the marginal gains by increasing the bid amount, hence the
budget. The solution of the optimization problem gives the
optimal budget allocation. However, [10] does not take into
account the ability to deliver, which is crucial, and we focus
on allocation based on actions, which is difficult due to the
attribution problem. Due to the nature of action-based online
advertising, a large portion of our discussion is devoted to solving the
attribution problem, which makes CTR-based methods inappropriate for our setting.
In [7], the authors discuss the assignment of budgets to
two types of search portals, generic and specialized. The
authors model the allocation as an optimal control problem,
and solve it using dynamic programming. The biggest handicap
of that approach is the assumption that the underlying
parameters for earnings and clicks are known, which
does not hold in practice and makes the methodology
inapplicable to real-world online advertising scenarios. We try
to actually learn the performance of multiple sub-campaigns
(which is similar to different search portals, if we take the
search portal utilization as a targeting constraint) utilizing
the multi-touch attribution.
The closest approach to the one proposed in this paper
is given in [15]. The authors aim to do combined budget
allocation and bid optimization for each campaign in an ac-
count, and employ quadratic programming method to max-
imize revenue. Our work differs in two ways. Firstly, [15]
utilizes clicks to decide on the utility of campaigns, whereas
we utilize actions. While clicks are straightforward to at-
tribute to campaigns, one of the main contributions of our
work is the combined focus on attribution (which is a hard
task for actions) and allocation. Again, as previously stated,
any CTR-based allocation scheme is not appropriate for the
domain we are focusing on. Our second difference is that we
separate the budget allocation from bid optimization. The
authors of [15] argue that these two should be combined
since there can be well-performing keywords under overall
low-performing campaigns. While such an argument is valid
for search advertising, which [15] focuses on, this is not the
case for online display advertising. Furthermore, due to the
complicated (much more convoluted than pure keyword tar-
geting) targeting rules involved in online display advertising
campaigns, such combined optimization is often not feasible.
Finally, for a more theoretical approach, we can list [3],
which focuses on the budget allocation problem to maxi-
mize the set of influenced target nodes (users). The authors
model media channels (which can be taken as campaigns)
and users as a bipartite graph, and the budget allocated to
a media channel directly affects the number of users that are
influenced by this media channel. Although this paper is not
extremely relevant to ours since we aim to improve revenue
(by either clicks or actions), we believe the influenced users
would map nicely onto the set of buyers/clickers.
2.3.2 Previous Efforts in Action-Attribution
While there have been simple models utilized in the indus-
try to perform multi-touch attribution, the first published
work for data-driven allocation is given in [11]. The authors
provide both a bagged logistic regression model, and an in-
tuitive probabilistic model (which uses second-order proba-
bility estimation) for attribution.
The authors of [6] utilize Shapley value [12] for attribu-
tion. It is also shown in [6] that the simple probabilistic
scheme employed by [11] is equivalent to a Shapley value
formulation after rescaling, and under certain simplifying
assumptions. This paper also argues that it is hard to eval-
uate whether one attribution of actions is better than an-
other. Our proposed budget allocation methodology can be
taken as a way to evaluate attribution methodologies, an
additional contribution by our paper.
Abhishek et al. [2] model user behavior as a hidden Markov
model (since user states are not observable, but only the out-
come is, such as clicks). They later propose to utilize this
behavior model to perform attribution, by attributing ac-
tions to ads that cause the user to change his/her latent
state.
Finally, in [14], the authors claim that, given no other
importance information on channels, the first touch-point
as well as the touch-points closer to the last one (including
the last touch-point, which gets higher credit than first) get
the higher credit. This attribution resembles an asymmetric
bathtub shape, and the authors utilize a Beta distribution
over time. Since the paper only deals with user journeys
that end in action, the authors also aim at detecting the im-
portance of initiating, intermediary, and terminating nodes
for sequences within each journey, hence this way mapping
channels to relevance values.
3. PROBLEM DEFINITION
Let us give the formal definition of the budget allocation
problem. Given the total budget $B$ for an insertion order,
the set of line items $L = \{l_1, \ldots, l_n\}$ under this IO, the maximum
spending capability of each line item $S = \{S_1, \ldots, S_n\}$, and the
return-on-investment (ROI) of each line item $R = \{R_1, \ldots, R_n\}$
(the amount of dollars received by the line item, due to actions,
for each dollar spent by the line item on advertising, using the
specific targeting of the line item):

$$\text{maximize } U = \sum_{i=1}^{n} R_i B_i \quad \text{subject to} \quad \forall j \in [1, n]\; B_j \le S_j \quad \text{and} \quad \sum_{i=1}^{n} B_i \le B .$$
Please note that, as presented in Section 2.3.1, this is significantly
different from the so-called budget optimization problem.
If we have the correct values for the sets $S$ and $R$, a
very simple greedy approach actually optimizes the above
problem:
1. $B_{remaining} = B$.
2. Sort the line items in $L$ according to $R_i$ (descending) into
a new list $L_{sorted}$.
3. While there is budget left:
•For each next line item $l_i$ in $L_{sorted}$:
(a) Assign $l_i$ the budget $B_i = \min(B_{remaining}, S_i)$.
(b) $B_{remaining} = B_{remaining} - B_i$.
(c) If $B_{remaining} \le 0$, then return.
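To make the procedure concrete, here is a minimal Python sketch of this greedy allocation, assuming the ROI values and spending capabilities are already known; the function and variable names are ours, for illustration only:

    def greedy_allocate(budget, roi, capability):
        """Allocate an insertion order budget to line items greedily by ROI.

        budget      -- total daily budget B of the insertion order
        roi         -- dict {line_item_id: R_i}
        capability  -- dict {line_item_id: S_i}, estimated spending capability
        Returns a dict {line_item_id: B_i} with sum(B_i) <= budget and B_i <= S_i.
        """
        remaining = budget
        allocation = {li: 0.0 for li in roi}
        # Visit line items in descending order of ROI.
        for li in sorted(roi, key=roi.get, reverse=True):
            if remaining <= 0:
                break
            allocation[li] = min(remaining, capability[li])
            remaining -= allocation[li]
        return allocation

    # Example with illustrative numbers only:
    # greedy_allocate(100.0, {"li1": 3.0, "li2": 1.0}, {"li1": 60.0, "li2": 100.0})
    # -> {"li1": 60.0, "li2": 40.0}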
The problem we focus on in this paper is exactly the fact that
we do not know the values $R_i$ and $S_i$ for a line item. In the
next section, we show that we solve the spending capability
estimation by a simple adaptive budget assignment scheme,
and return-on-investment estimation via multi-touch attri-
bution.
4. METHODOLOGY
As mentioned in §3, budget allocation can be reduced
to two problems: (i) spending capability calculation for a
sub-campaign, and (ii) return-on-investment calculation for
a sub-campaign. In this section, we will separate these two
problems, and examine ways to solve them.
4.1 Spending Capability Calculation for a Sub-Campaign
As aforementioned, sub-campaigns (line items) apply dif-
ferent targeting criteria to show ads to potential buyers of a
product. Clearly, different targeting criteria reach different numbers of
users, and hence allow different amounts of advertising budget to be
spent. We certainly do not want to assign a lot of money, no matter how
high the return-on-investment may be, to a specific sub-campaign that
cannot reach enough users to spend the money assigned to it.
It is however a hard problem to estimate exactly how much
money a sub-campaign may spend, since it depends on both
the reach of users, as well as the bid price (i.e. if a sub-
campaign bids low, it will not be able to win ad auctions
and not receive impressions, hence not be able to spend the
money assigned to it). In our budget allocation approach,
we apply a simple adaptive budget assignment scheme. This
methodology can be summarized as follows.
•If a sub-campaign is new, i.e. if we have no idea of
how much it will spend, assign a learning budget that
is high enough to give it a starting boost,
•If a sub-campaign has spending data, then assign it
always a bit more (e.g. increase it with a certain per-
centage), to explore its spending limits.
Please note that at any point the sum of the
current spending limits (calculated according to the above
adaptive scheme) of the sub-campaigns may be smaller than the
overall campaign budget (i.e. a case of incomplete budget
delivery). This usually happens when the budget assigned to
a campaign simply cannot be spent by its sub-campaigns,
hence underspend (i.e. total spend falling short of the
total budget) may occur. In the case of incomplete budget
delivery, one solution that we utilize is to assign the remain-
ing (unassigned) budget fractionally among sub-campaigns
(according to their previous allocation). Although under-
spend may still occur, this assignment is still helpful in fur-
ther calculating the spending limits of sub-campaigns, since
we assign a little bit more budget to the sub-campaign than
our adaptive approach suggests.
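A minimal sketch of this adaptive scheme is given below; the learning budget, the growth percentage, and all names are illustrative assumptions rather than the exact parameters of the deployed system:

    LEARNING_BUDGET = 50.0   # starting boost for a brand-new sub-campaign (assumed value)
    EXPLORE_FACTOR  = 1.10   # always assign ~10% more than observed spend (assumed value)

    def estimate_spending_limit(observed_spend):
        """Estimate a line item's spending capability from its recent spend."""
        if observed_spend is None:               # new sub-campaign, no spending data yet
            return LEARNING_BUDGET
        return observed_spend * EXPLORE_FACTOR   # explore slightly beyond what it spent

    def redistribute_leftover(allocation, campaign_budget):
        """Assign any unallocated campaign budget to sub-campaigns
        proportionally to their current allocation."""
        allocated = sum(allocation.values())
        leftover = campaign_budget - allocated
        if leftover <= 0 or allocated == 0:
            return allocation
        return {li: b + leftover * (b / allocated) for li, b in allocation.items()}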
It can be seen that this simple adaptive assignment method
actually tries to assign as much as possible to the sub-
campaigns that perform better (high return-on-investment).
This in turn approximates the greedy algorithm given in
§3: since we order the sub-campaigns/line items according
to their ROI and then assign as much as possible to
the higher-ranking line items, the most important part
of the approach is calculating the ROI accurately, which is
covered in the next section.
4.2 ROI Calculation for a Sub-Campaign
We calculate the return-on-investment for a line item as
follows:

$$\text{ROI}_{l_i} = \frac{\sum_{\forall a_j} p(l_i \mid a_j)\, v(a_j)}{\text{Money spent by } l_i}. \qquad (1)$$
Above, $v(a_j)$ is the monetary value that is received by action
$a_j$ (e.g. the profit that the advertiser earns by selling
that specific product). In this work, we deal with CPA (cost
per action) campaigns, where the advertiser provides the
demand-side platform with the values of the actions that
they want to receive, hence the return-on-investment is
calculated as the ratio of the value of the actions received to the
amount of money spent on advertising. The attribution
component in the above formulation is the term
$p(l_i \mid a_j)$, which determines the percentage of the action $a_j$
that is attributed to line item $l_i$ (while for LTA, $p(l_i \mid a_j)$ is 0
or 1, for MTA, $p(l_i \mid a_j) \in [0,1]$ since we allow partial attribution
of a single action to many sub-campaigns). Since the
above formulation is quite straightforward, we will focus on
the attribution problem (i.e. determining $p(l_i \mid a_j)$) for the
rest of this section.
We have already stated that one of the most common at-
tribution methods used is last-touch attribution, which as-
signs the whole action to the last ad seen by the user. In this
paper, our emphasis is on multi-touch attribution, and we
utilize the probabilistic model given in [11], which also origi-
nated at Turn. The methodology given in [11] first calculates
the empirical action probability of line items (referred to as
advertising channels in the paper):
$$p(a \mid l_i) = \frac{N^{+}(l_i)}{N^{+}(l_i) + N^{-}(l_i)},^{*}$$
as well as of pairs of line items:
$$p(a \mid l_i, l_j) = \frac{N^{+}(l_i, l_j)}{N^{+}(l_i, l_j) + N^{-}(l_i, l_j)}.$$
In the formulation, $N^{+}$ denotes the number of times that
any user in the system has observed an ad sequence with an
ad from line item $l_i$ (or ads from the pair of line items $l_i$ and
$l_j$) that ended in action, whereas $N^{-}$ denotes the number of
sequences that did not end in action (and had line item $l_i$,
or the pair $l_i$ and $l_j$, in it). This formulation basically gives
the probability that a sequence of ads shown to a user will
end in conversion if it has an ad from $l_i$ (or the pair $l_i$ and
$l_j$) in it. In our deployed system, we only consider actions
from the last $t_{action}$ days to be attributed to the impressions
and clicks (i.e. the ad sequence) that the user experienced
up to $t_{association}$ days before each action. Different
values can be employed for these two variables.
Once the action probabilities are calculated, the contribu-
tion weight (to be normalized to calculate actual attribution)
for a line item is calculated in [11] as:
$$w(l_i) = p(a \mid l_i) + \frac{1}{2(N-1)} \sum_{i \ne j} \left\{ p(a \mid l_i, l_j) - p(a \mid l_i) - p(a \mid l_j) \right\},$$
where $N$ is the total number of line items under the advertiser
that $l_i$ belongs to.
* As a side note, in this setting, the probability of action for a sequence (regardless of the line items in it) is
$$p(a) = \frac{N^{+}}{N^{+} + N^{-}},$$
where $N^{+}$ is the total number of sequences (regardless of line items) that ended in action and $N^{-}$ is the total number of sequences that did not. This can be written in terms of action probabilities conditioned on line items as
$$p(a) = \sum_{S \in \{\mathcal{P}(L) - \emptyset\}} f(S)\, p(a \mid S)\, p(S),$$
where $L$ is the set of all line items and $\mathcal{P}(L)$ is the power set of $L$ (all subsets; we further remove the empty set $\emptyset$). $p(S)$ is the probability of a set of line items appearing together in a sequence (the marginal probability of the set), which is calculated as $\frac{N^{+}(S) + N^{-}(S)}{N^{+} + N^{-}}$, i.e. the total number of sequences that have set $S$ in them, divided by the total number of sequences. $p(a \mid S)$ is the conditional probability of action given set $S$, and $f(S)$ is a function which gives $+1$ if set $S$ has an odd number of line items in it, and $-1$ if set $S$ has an even number of line items in it. This is the probability of the union of conditional action events, where the line items are not independent of each other.
Algorithm 1 Second Step of Multi-Touch Attribution: Calculates the Attribution for Each Action and the ROI for Each Line Item

    t_action = action window
    t_association = impression/click association window
    // tp: touch-point, l_i: line item
    for each user u_i do
        keep only the imps and clicks for the time period [today - (t_action + t_association), today]
        keep only the actions for the time period [today - t_action, today]
    end for
    action sequence set S_action = ∅
    // only look at action sequences, since we are doing attribution
    add each tp sequence S_i that ended in action (i.e. within the t_association window of an action) into S_action
    for each S_i ∈ S_action do
        weightSum = Σ_{l_j : l_j has a touch-point in S_i} w(l_j)
        for each l_j that has a touch-point in sequence S_i do
            actionAttributed_{l_j} = w(l_j) / weightSum
            totalActionAttributed_{l_j} += actionAttributed_{l_j}
            totalActionValue_{l_j} += actionAttributed_{l_j} × valueOfActionPrecededBy(S_i)
        end for
    end for
    for each line item l_j do
        output totalActionAttributed_{l_j}    // total number of actions attributed to l_j
        output totalActionValue_{l_j}         // total value of actions attributed to l_j
        ROI_{l_j} = totalActionValue_{l_j} / cost_{l_j}
        output ROI_{l_j}                      // return-on-investment of l_j
    end for
Our experience with the current advertising system built at Turn
is that the second term, i.e. the second-order calculations,
does not give enough advantage in accuracy to justify the
increase in processing time required to train the model
(calculating the pair-wise probabilities as well as using these
probabilities for the contribution weight), hence we utilize the first-order probabilities
to calculate weights for the line items (although both first-
order and second-order calculations are supported in our
system). Therefore, the weight of each line item utilized for
attribution is given as:
$$w(l_i) = p(a \mid l_i) = \frac{N^{+}(l_i)}{N^{+}(l_i) + N^{-}(l_i)}. \qquad (2)$$
For the first step in attribution, we go through each user
(i.e. web user, whose data consists of a set of impressions,
clicks and actions), and only process data for a certain pe-
riod (keep the actions for the last $t_{action}$ days, and the impressions
for the last $t_{action} + t_{association}$ days, since we only
attribute an action to an impression if the impression happened
up to $t_{association}$ days before the action). Later, we
extract the sequences of touch-points for the users, both
those that end in an action, and those that do not. Since a
sequence can have multiple touch-points from the same line
item, we deduplicate those touch-points, and in the end we
calculate the probability of a line item being in a sequence
that ends in action as its weight (i.e. equation 2 above),
which will be used for attribution in the second step of our
employed MTA algorithm. During the first step, we also cal-
culate the amount of money spent by each line item, which
is crucial to calculate ROI.
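For concreteness, a simplified single-machine sketch of this first step is shown below; it assumes the user data has already been grouped into touch-point sequences labeled as action or no-action sequences, and all identifiers are illustrative:

    from collections import defaultdict

    def first_step(user_sequences):
        """Compute w(l_i) = N+(l_i) / (N+(l_i) + N-(l_i)) and per-line-item cost.

        user_sequences -- iterable of (touchpoint_line_items, cost_per_line_item, ended_in_action)
                          where touchpoint_line_items is a list of line item ids in the sequence
                          (possibly with repeats) and cost_per_line_item is a dict
                          {line_item_id: impression cost inside this sequence}.
        """
        n_pos = defaultdict(int)   # N+(l_i): sequences containing l_i that ended in action
        n_neg = defaultdict(int)   # N-(l_i): sequences containing l_i that did not
        cost = defaultdict(float)  # total money spent by each line item

        for touchpoints, seq_cost, ended_in_action in user_sequences:
            for li, c in seq_cost.items():
                cost[li] += c
            # Deduplicate: a line item is counted at most once per sequence.
            for li in set(touchpoints):
                if ended_in_action:
                    n_pos[li] += 1
                else:
                    n_neg[li] += 1

        weight = {li: n_pos[li] / (n_pos[li] + n_neg[li])
                  for li in set(n_pos) | set(n_neg)}
        return weight, cost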
Figure 4: Implementation Details of Employed MTA Algorithm. (a) First Step: Calculation of the Weights for Each Line Item. Users stored on HDFS are sharded across mappers, which calculate over the sequences of each user and emit {line_item_id (key), cost, action_seq (0/1), no-action_seq (0/1)}; reducers aggregate over all keys and output {line_item_id, total_cost, total_action_seq, total_no-action_seq, weight}. (b) Second Step: Calculation of the Attribution for Each Action, and ROI for Each Line Item. Users are again sharded across mappers, which calculate over only the action sequences of each user (using the output of the first job) and emit {line_item_id (key), total_cost, attributed_action (in [0,1]), attributed_action_value}; reducers aggregate over all keys and produce the final output {line_item_id, total_cost, total_attributed_action, total_attributed_action_value, return-on-investment}.
The second step in our employed action attribution scheme
is given in Algorithm 1. Since we already calculated the
weights (w(li)) for the line items in the previous step, now
all we have to do is assign each action to the line items
that showed at least one ad before it (within a $t_{association}$
window), according to their weights (i.e. the normalized weight
for each line item is the fraction of the action that is at-
tributed to it). For this purpose, we only look at the se-
quences that ended in action (contrary to first step, but this
is needed to calculate the weights, and total cost), and in
the end return the total values of the fractional actions at-
tributed to each line item. We also calculate ROI as given
in equation 1 (please note that $cost_{l_j}$ is the total amount of
money spent by line item $l_j$ on advertising, over both action
and no-action sequences, and is calculated in the first step
of our attribution scheme).
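A simplified single-machine sketch of this second step (mirroring Algorithm 1, with the weights and costs assumed to come from the first step; names are illustrative) could look as follows:

    from collections import defaultdict

    def second_step(action_sequences, weight, cost):
        """Fractionally attribute each action and compute per-line-item ROI (equation 1).

        action_sequences -- iterable of (line_items_in_sequence, action_value), one entry
                            per touch-point sequence that ended in an action
        weight           -- dict {line_item_id: w(l_i)} from the first step
        cost             -- dict {line_item_id: money spent} from the first step
        """
        total_attributed = defaultdict(float)     # fractional actions per line item
        total_action_value = defaultdict(float)   # attributed monetary value per line item

        for line_items, action_value in action_sequences:
            line_items = set(line_items)          # deduplicate touch-points per sequence
            weight_sum = sum(weight.get(li, 0.0) for li in line_items)
            if weight_sum == 0.0:
                continue                          # no credit can be assigned
            for li in line_items:
                share = weight.get(li, 0.0) / weight_sum
                total_attributed[li] += share
                total_action_value[li] += share * action_value

        roi = {li: total_action_value[li] / cost[li]
               for li in total_action_value if cost.get(li, 0.0) > 0}
        return total_attributed, total_action_value, roi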
Please note that both of the above steps are easily paral-
lelizable, and we present some details in the next section on
how we implement our attribution and allocation system.
5. IMPLEMENTATION DETAILS
As aforementioned, the attribution scheme we employed
as given in §4.2 is easily parallelizable and we have imple-
mented the two-step algorithm on Hadoop [13]. This par-
allel implementation is necessary due to the large (multiple
billions of virtual users, where each user is a set of cookies)
number of users, and since we have to process the action and
no-action sequences for each of them. Indeed, the amount
of data we process (tens of terabytes of user profile data)
is larger than in other published works so far, and is representative
of the scale of real-world online advertising systems.
The two-step MTA algorithm is run every day, for each ad-
vertiser, and is scheduled by Oozie Workflow Scheduler [1].
The current implementation at Turn takes ≈40 seconds per
mapper for each of the first and second steps. The overall
job (both steps) takes around two hours to complete every
day in our production system.
The overview of our MTA implementation is given in Fig-
ure 4, which gives the details of the two steps separately.
In Figure 4(a), we present the implementation of first step
in our deployed attribution algorithm, which calculates the
attribution weights for each line item. The parallel process-
ing works as follows. First, we shard the whole set of users
into many mappers, which extract the action and no-action
sequences, and for each sequence emits the line item id as
the key, and the following values: (i) the cost for the impres-
sions (touch-points) of the line item inside the sequence,
(ii) whether this sequence is an action sequence (0/1 value),
and (iii) whether this sequence is a no-action sequence (0/1
value). These <key, value> pairs are sent to the re-
ducers, and the pairs with the same key end up in the same
reducer which allows for aggregation. In the end, each re-
ducer outputs the line item id key, and the aggregated total
number of action and no-action sequences which are used to
calculate the weight.
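Roughly, the map and reduce functions of this first step can be sketched as below (plain Python in the spirit of Hadoop Streaming; the record layout and the extract_sequences helper are illustrative assumptions, not the actual Turn implementation):

    def first_step_map(user_record):
        """Map: one user -> one (key, value) tuple per (sequence, line item) pair."""
        for sequence in extract_sequences(user_record):   # hypothetical helper
            for li in set(tp.line_item_id for tp in sequence.touchpoints):
                li_cost = sum(tp.cost for tp in sequence.touchpoints
                              if tp.line_item_id == li)
                yield li, (li_cost,
                           1 if sequence.ended_in_action else 0,   # action_seq (0/1)
                           0 if sequence.ended_in_action else 1)   # no-action_seq (0/1)

    def first_step_reduce(line_item_id, values):
        """Reduce: aggregate cost and sequence counts per line item, compute weight."""
        values = list(values)   # values: iterable of (cost, action_seq, no_action_seq)
        total_cost = sum(c for c, _, _ in values)
        total_action_seq = sum(a for _, a, _ in values)
        total_no_action_seq = sum(n for _, _, n in values)
        weight = total_action_seq / (total_action_seq + total_no_action_seq)
        return (line_item_id, total_cost, total_action_seq, total_no_action_seq, weight)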
The implementation of the second step of our deployed at-
tribution scheme, where the actual action attribution as well
as the line item level return-on-investment (ROI) are calcu-
lated, is presented in Figure 4(b). Similar to Figure 4(a),
we first shard the users into mappers, and in each map-
per we only go over the action sequences. Furthermore,
we send the output of the first job (line item weights, as
well as total costs) into the mappers, since these values are
used to determine the action attribution and ROI for each
line item. For each action sequence, the mappers emit
the line item id (for each line item that had a touch-
point inside this sequence that ended in an action) as key,
and the following values: (i) total cost of line item (this is
only for continuity, copied exactly from the output of first
job), (ii) percentage of the action (that concludes this se-
quence) that is attributed to line item (attributed action
which is within the interval [0,1]), and (iii) the value of
the action (that concludes this sequence) ×attributed action
(attributed action value ), which represents the money made
by the help of advertising under this line item. Again, the
same keys are collected within the same reducer, and the
reducer aggregates the values to calculate the total action
value (total attributed action value) received by a line item,
as well as the ROI for the line item (which uses both to-
tal attributed action value and total cost for this line item,
and calculates ROI according to equation 1).
Figure 5: MTA-based Budget Allocation Architecture. The multi-touch attribution job, scheduled by Oozie, writes line item (sub-campaign) performance data to HDFS; the control server (responsible for budget allocation, spending rate control, and budget control) reads this performance data, sends spending rates for line items to the ad servers, receives spending info for line items back from them, and sends start or stop spending signals for line items.
The architecture we employ for MTA-based budget allo-
cation is given in Figure 5. The budget allocation algo-
rithm runs on the control server which picks up the MTA-
performance information from the Hadoop Distributed File
System (HDFS), which is populated by the MTA Oozie job.
Then, the control server calculates the daily budgets for line
items, and calculates the spending rates [8] for time periods
within the day. These spending rates are sent to ad servers,
which do the spending, and send the money spent for each
line item back to the control server. The control server starts or
stops line items from further spending (this signal is also
sent to the ad servers) if the line item has depleted its budget
for the day.
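Tying the pieces together, the daily control-server logic can be sketched as follows (a simplification that reuses the illustrative helpers sketched in §3 and §4.1; read_mta_output, send_spending_rates, and send_stop_signal are placeholders for the actual HDFS and ad-server interfaces):

    def daily_budget_allocation(io_budget, yesterday_spend):
        """One control-server iteration: read MTA-based ROI, allocate, push spending rates."""
        roi = read_mta_output()                          # placeholder: {line_item_id: ROI} from HDFS
        capability = {li: estimate_spending_limit(yesterday_spend.get(li))
                      for li in roi}                     # adaptive limits, as in §4.1
        allocation = greedy_allocate(io_budget, roi, capability)   # greedy scheme from §3
        allocation = redistribute_leftover(allocation, io_budget)
        send_spending_rates(allocation)                  # placeholder: push rates to ad servers
        return allocation

    def on_spend_report(line_item_id, spent_today, allocation):
        """Stop a line item once it has depleted its daily budget."""
        if spent_today >= allocation[line_item_id]:
            send_stop_signal(line_item_id)               # placeholder: start/stop signal to ad servers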
6. RESULTS
For our evaluations, we have set up two campaigns in
a real online advertising environment, with the same cam-
paign level budget, to run over 12 days within the month of
November, in 2013. Both campaigns have four identical line
items that run on differing targeting criteria. The only dif-
ference in the two campaigns is that the budget allocation
in one is calculated utilizing the ROI values generated by
MTA, and LTA in the other case. Please note that although
MTA-based budget allocation is used commonly within our
platform due to its advantages, we present the results of a
single experiment. This is due to the fact that this kind of
A/B testing requires an exact setup of two campaigns to compare,
hence it requires an experimentation budget (i.e. money,
since we assign the same amount of money to both cam-
paigns to allocate among sub-campaigns and then spend on
advertising). We are providing results in terms of return-
on-investment (ROI), effective cost per action (eCPA) and
effective cost per click (eCPC) metrics, which are calculated
at the campaign level. Our aim is to show that by allocating
budgets differently to sub-campaigns according to different
attribution methodologies, we improve the performance of
the overall campaign. While we have explained the ROI
metric throughout the paper, the latter two metrics can be
described as follows:
•Effective Cost per Action (eCPA): What is the average amount of money that is spent by an advertiser (on advertising) to receive one action (i.e. purchase etc.)? This metric can be calculated as $\frac{\text{Advertising Cost}}{\#\text{ of Actions}}$.
•Effective Cost per Click (eCPC): What is the average amount of money that is spent by an advertiser (on advertising) to receive one click (on its ad)? This metric can be calculated as $\frac{\text{Advertising Cost}}{\#\text{ of Clicks}}$.
The results for the return-on-investment of the budget allocation
applying the two attribution methodologies (LTA
and MTA) are given in Figure 6. Due to privacy concerns, we
have scaled the actual ROI values by a constant factor.
Figure 6: Comparison of ROI performance (y-axis: ROI, x-axis: day) for the two budget allocation algorithms utilizing different action attribution methodologies (Last-Touch Attr. vs. Multi-Touch Attr.) over 12 days. The higher ROI achieved by the proposed methodology indicates better performance.
Since we receive actions at the campaign level (i.e. when we
receive an action, we know it belongs to a certain campaign,
attribution to sub-campaigns comes afterwards), it is easier
to calculate the overall ROI for the two identical campaigns
run, to evaluate the results. It can be seen that we have
much higher ROI when the MTA scheme is utilized, which sig-
nifies that the ranking information (estimated ROI) is more
accurate for MTA.
Figure 7: Comparison of eCPA performance (y-axis: eCPA, x-axis: day) for the two budget allocation algorithms utilizing different action attribution methodologies (Last-Touch Attr. vs. Multi-Touch Attr.) over 12 days. The lower eCPA achieved by the proposed methodology indicates better performance.
Figure 8: Comparison of eCPC performance (y-axis: eCPC, x-axis: day) for the two budget allocation algorithms utilizing different action attribution methodologies (Last-Touch Attr. vs. Multi-Touch Attr.) over 12 days. The lower eCPC achieved by the proposed methodology indicates better performance.
The results in terms of eCPA and eCPC are given in Fig-
ure 7 and Figure 8, respectively (again, the values are scaled
by a constant factor). It can be seen that the
budget allocation based on the MTA performs much bet-
ter compared to the one that applies LTA. Please note that
these eCPA and eCPC values are closely related to ROI
(if the action values are the same for all actions, low eCPA
means high ROI), but we see that the MTA-based allocation
is much better in terms of ROI, compared to eCPA. This is
due to the fact that we were able to get many more “high
quality” (high value) actions by the MTA-based budget allo-
cation scheme. Finally, although budget allocation was op-
timized towards actions via MTA, we can observe that since
the MTA gives us the overall more effective sub-campaigns,
eCPC has also improved.
The final set of results for our experiment is given in Fig-
ure 9, which reinforces our conclusion that MTA leads to bet-
ter determination of sub-campaign utilities, and to improved
budget allocation. In the figure, we present the percentage
of the total budget allocated to each line item, alongside
the ROI received from that line item during the run of
the experiment. Although we can see that the ROIs received
by identical campaigns are slightly different (this difference
is expected, considering different budgets are assigned), we
see a remarkable correlation between the allocation achieved
by the MTA-based budget allocation and the actual ROIs
recorded. One more point of interest in the figure is the line item with
the highest allocated budget in the LTA case (LI 3, i.e. line
item 3). This line item is actually a retargeting sub-campaign
(i.e. it targets users who have already interacted in some way with
this product, e.g. visited the homepage, clicked, etc.), hence it
is very likely to provide the last push for a user before buying a
product. This of course leads to an unfair assignment of actions
in the LTA case, unlike MTA.
MTA budgets: LI 1 (ROI: 31.85, Budget: 63.5%), LI 2 (ROI: 7.94, Budget: 16.2%), LI 3 (ROI: 7.12, Budget: 12.7%), LI 4 (ROI: 0.46, Budget: 7.6%).
LTA budgets: LI 3 (ROI: 3.01, Budget: 40.5%), LI 1 (ROI: 34.01, Budget: 23.9%), LI 2 (ROI: 7.86, Budget: 18.5%), LI 4 (ROI: 0.20, Budget: 17.1%).
Figure 9: Comparison of how the budget was distributed (with the ROI received) among sub-campaigns for both budget allocation schemes. It is apparent that the MTA-based budget allocation was able to determine the ROI of the sub-campaigns with much higher accuracy and has delivered the overall budget to the sub-campaigns accordingly.
7. CONCLUSIONS AND FUTURE WORK
In this paper, we have focused on the problem of budget
allocation in the online advertising domain. We have shown that
sub-campaign performance values, calculated via multi-touch
attribution, lead to better allocation of budgets. This
has been demonstrated empirically in our real-world online
advertising platform. We also gave a detailed explanation
of the algorithms utilized for both budget allocation and
multi-touch attribution, as well as their implementation.
Our future work mainly focuses on employing improved
multi-touch attribution algorithms. Furthermore, we plan
to apply MTA to bidding as well, i.e. the bid
is calculated utilizing the past performance values generated
by the MTA algorithm.
8. REFERENCES
[1] Apache Oozie Workflow Scheduler for Hadoop.
http://oozie.apache.org/. Accessed: 2014-01-24.
[2] V. Abhishek, P. S. Fader, and K. Hosanagar. Media
exposure through the funnel: A model of multi-stage
attribution. In Proc. WISE, 2013.
[3] N. Alon, I. Gamzu, and M. Tennenholtz. Optimizing
budget allocation among channels and influencers. In
Proc. ACM WWW, 2012.
[4] N. Archak, V. S. Mirrokni, and S. Muthukrishnan.
Budget optimization for online advertising campaigns
with carryover effects. In Proc. ACM Workshop on Ad
Auctions, 2010.
[5] C. Borgs, J. Chayes, O. Etesami, N. Immorlica,
K. Jain, and M. Mahdian. Dynamics of bid
optimization in online advertisement auctions. In
Proc. ACM WWW, pages 531–540, 2007.
[6] B. Dalessandro, C. Perlich, O. Stitelman, and
F. Provost. Causally motivated attribution for online
advertising. In Proc. ACM ADKDD, 2012.
[7] G. E. Fruchter and W. Dou. Optimal budget
allocation over time for keyword ads in web portals. J.
Optimization Theory and Applications,
124(1):157–174, 2005.
[8] K.-C. Lee, A. Jalali, and A. Dasdan. Real time bid
optimization with smooth budget delivery in online
advertising. arXiv preprint, http://arxiv.org/pdf/1305.3011v1.pdf,
pages 1–13, 2013.
[9] K.-C. Lee, B. Orten, A. Dasdan, and W. Li.
Estimating conversion rate in display advertising from
past performance data. In Proc. ACM SIGKDD Conf.
on Knowledge Discovery and Data Mining, pages
768–776, 2012.
[10] O. Ozluk and S. Cholette. Allocating expenditures
across keywords in search advertising. J. Revenue and
Pricing Management, 6(4):347–356, 2007.
[11] X. Shao and L. Li. Data-driven multi-touch attribution
models. In Proc. ACM SIGKDD Conf. on Knowledge
Discovery and Data Mining, pages 258–264, 2011.
[12] L. S. Shapley. A value for n-person games. Annals of
Mathematical Studies, 28:307–317, 1953.
[13] T. White. Hadoop: The Definitive Guide. O’Reilly
Media, Sebastopol, CA, 2012.
[14] D. A. Wooff and J. M. Anderson. Time-weighted
multi-touch attribution and channel relevance in the
customer journey to online purchase. J. Statistical
Theory and Practice, 2013.
[15] W. Zhang, Y. Zhang, B. Gao, Y. Yu, X. Yuan, and
T.-Y. Liu. Joint optimization of bid and budget
allocation in sponsored search. In Proc. ACM
SIGKDD Conf. on Knowledge Discovery and Data
Mining, 2012.