
Multi-Touch Attribution Based Budget Allocation in Online Advertising


Abstract

Budget allocation in online advertising deals with distributing the campaign (insertion order) level budgets to different sub-campaigns which employ different targeting criteria and may perform differently in terms of return-on-investment (ROI). In this paper, we present the efforts at Turn on how to best allocate campaign budget so that the advertiser or campaign-level ROI is maximized. To do this, it is crucial to be able to correctly determine the performance of sub-campaigns. This determination is highly related to the action-attribution problem, i.e. to be able to find out the set of ads, and hence the sub-campaigns that provided them to a user, that an action should be attributed to. For this purpose, we employ both last-touch (last ad gets all credit) and multi-touch (many ads share the credit) attribution methodologies. We present the algorithms deployed at Turn for the attribution problem, as well as their parallel implementation on the large advertiser performance datasets. We conclude the paper with our empirical comparison of last-touch and multi-touch attribution-based budget allocation in a real online advertising setting.
arXiv:1502.06657v1 [cs.AI] 24 Feb 2015
Sahin Cem Geyik *
Applied Science Division
Turn Inc.
Redwood City, CA 94063
Abhishek Saxena *
Applied Science Division
Turn Inc.
Redwood City, CA 94063
Ali Dasdan
Applied Science Division
Turn Inc.
Redwood City, CA 94063
Categories and Subject Descriptors
J.0 [Computer Applications]: General
General Terms
Algorithms, Application
Keywords
Online advertising, Multi-touch attribution, Budget allocation
* The authors contributed to this work equally.
This paper has been published in:
ADKDD’14, August 24, New York City, New York, U.S.A.
In online advertising, our goal is to serve the best ad for
a given user in an online context. Advertisers often set con-
straints which affect the applicability of the ads, e.g., an
advertiser might want to target only the users of a certain
geographic area visiting web pages of certain types for a spe-
cific campaign. Furthermore, the objective of advertisers in
general is to receive as many actions as possible utilizing dif-
ferent campaigns in parallel. Actions are advertiser defined
and can be one of inquiring about or purchasing a product,
filling out a form, visiting a certain page, etc. [9].
An ad from an advertiser can be shown to a user on a
publisher (website, mobile app, etc.) only if the value for
the ad impression opportunity is high enough to win a
real-time auction [5]. Advertisers signal their value via bids,
each calculated as the action probability given a user in a
certain online context, multiplied by the cost-per-action goal
the advertiser wants to meet or beat. Once an advertiser,
or the demand-side platform acting on its behalf, wins
the auction (i.e. submits the highest bid), it must pay
the amount of the second-highest bid (i.e. a second-price
auction). Because of this, each advertiser needs to carefully
manage its budget, which dictates its capability to bid.
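As a rough illustration of the bidding rule described above, the following sketch computes a bid as the action probability times the cost-per-action goal and then settles a second-price auction. The function names and all numbers are hypothetical, and details of real exchanges (reserve prices, fees, ties) are deliberately ignored.

```python
def compute_bid(action_probability, cpa_goal):
    """Bid value = P(action | user, context) * cost-per-action goal."""
    return action_probability * cpa_goal


def second_price_auction(bids):
    """Return (winner, price): the highest bidder pays the second-highest bid.

    `bids` maps a bidder name to its bid; at least two bidders are assumed.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Hypothetical bidders: (action probability, CPA goal in dollars).
bids = {
    "advertiser_a": compute_bid(0.002, 50.0),
    "advertiser_b": compute_bid(0.001, 80.0),
    "advertiser_c": compute_bid(0.0005, 100.0),
}
winner, price = second_price_auction(bids)
```

Here `advertiser_a` submits the highest bid but pays only the bid of `advertiser_b`, which is why the amount actually spent depends on the competition as well as on the advertiser's own valuation.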
In this paper, we are focusing on the problem of distribut-
ing a campaign’s budget to its sub-campaigns (with different
targeting criteria) so that the return-on-investment (ROI,
i.e. value received compared to the amount spent on adver-
tising) is maximized, since the sub-campaigns may have dif-
ferent performances and spending capabilities due to their
targeting. Furthermore, we focus on the problem of
action attribution in determining a sub-campaign’s
performance (which helps with setting its budget), i.e. when an
advertiser receives an action, finding out which
sub-campaigns' ads caused that action.
We examine both last-touch attribution (LTA, i.e. a user’s
action is attributed to the last ad s/he sees) and multi-touch
attribution (MTA, i.e. a user’s action is attributed fraction-
ally to a subset of the ads s/he sees). The contributions of
the paper can be summarized as:
• A budget allocation scheme that distributes money
  from the campaign top level to the sub-campaigns
  according to their performance,
• An examination of two action-attribution approaches to
  determine sub-campaign performance: last-touch and
  multi-touch, with an emphasis on the latter,
• A methodology for finding the multi-touch attribution of
  actions to sub-campaigns on large advertiser performance
  datasets (i.e. spending of campaigns and user
  data of impressions as well as the actions received),
  and its efficient parallel implementation. This
  implementation has enabled us to process real-world online
  advertising datasets (tens of terabytes of user profile
  data, and multiple billions of virtual users) that are
  bigger than those of other published efforts dealing with
  multi-touch attribution so far,
• An empirical comparison of last-touch versus multi-touch
  attribution based budget allocation on a real
  advertising system. To the best of our knowledge, this
  is the first paper to show how ROI is impacted by the
  choice of attribution method, and to demonstrate the
  effect of MTA on a real-world online advertising campaign.
The rest of the paper is organized as follows. §2 gives background
on both budget allocation and action attribution in the
advertising domain, as well as previous work in the literature on these
subjects. §3 defines the problem we
would like to solve in this paper. We present our methodology
for budget allocation, as well as for sub-campaign
performance determination using both last-touch and multi-touch
action attribution schemes, in §4. The implementation details
of the methodology (system design as well as parallel
implementation) are given in §5, followed by our
preliminary results on different attribution methods for budget
allocation in §6. Finally, we conclude the paper and
present some potential future work in §7. As a side note,
we use the terms campaign and insertion order
(IO), as well as sub-campaign and line item, interchangeably
throughout the paper. While the latter terms are more
specific to the online advertising domain, they are commonly used
to describe a certain hierarchy within an advertiser.
In this section, we give some preliminary information
on the subject matter, as well as previous work in the literature.
2.1 Budget Allocation in Online Advertising
In online advertising, the advertisers aim to show their ad
to a user on a publisher (web site, mobile app etc.), so that
they get the highest number of actions for the money they
spend. To be able to utilize the market more efficiently, they
utilize different tactics, i.e. different campaigns with differ-
ent targeting rules. For example, a sports goods company
can decide to set up a campaign to show their golf equip-
ment ads to users above a certain age or income, while their
sneaker ads may be directed towards a wider audience. This
inherently constructs a hierarchy for the advertisers. In our
model, advertisers have different campaigns (e.g. each cam-
paign is the advertising for a certain type of product) which
we call insertion orders, but each campaign can also have
sub-campaigns (with different targeting, or different medi-
ums (media channels), such as social, video, mobile etc.),
which we call line items. A simple example of such a hier-
archy is given in Figure 1.
Figure 1: Example of an Advertiser Hierarchy
(Campaigns 1 to n, each an insertion order for one product; Campaign 2 is
expanded into Sub-campaigns 2.1 to 2.m, each a line item with its own targeting.)

Budget allocation deals with the distribution of the daily
insertion order budget to the line items under it (since we
assume advertisers set up insertion order level budgets
manually), and has to take into account both the spending
capabilities (i.e. whether a line item’s targeting allows it to reach
enough users to be able to spend the money that is assigned
to it), as well as performance issues (i.e. if a line item spends
a certain amount of money, what is the value of actions that
will be received), which is its return-on-investment (ROI).
Please see Figure 2 for an explanation of the budget
allocation problem. In the example, the insertion order has
a daily budget of $B$, and the line items are assigned daily
budgets $B_i$ such that $\sum_i B_i = B$. Each line item has an
ROI of $R_i$, and a maximum spending capability (due to
targeting, bidding etc.) of $S_i$. During budget allocation, the
spending capability should be considered so that for each
line item $i$, we have $B_i \le S_i$ (so that no line item is
assigned more money than it can spend). The overall return
from the allocation given in Figure 2 can also be calculated
as $\sum_i R_i \min(S_i, B_i)$. These calculations of course assume
that we have the ROI and spending capability information,
which is not so in real settings (indeed, the main focus
of this paper is learning this information). The formal
problem definition (in §3) gives further details on the budget
allocation problem.
Figure 2: Budget Allocation Example
(An insertion order with daily budget $B$ is split among Line Items 1 to m;
line item $i$ has daily budget $B_i$, spending capability $S_i$, and ROI $R_i$.)
2.2 Action-Attribution Problem in Online Advertising
As mentioned above, the aim of the advertiser is to receive
as many actions as possible. Furthermore, the advertiser
needs to know which sub-campaign contributed to how many
actions, and hence how effective the different tactics are.
The main difficulty in this task is that
the action usually happens much later than the ad is shown
to the user, e.g. a user sees many ads online and then
purchases an item; hence it is hard to attribute actions to
sub-campaigns. A very simple example of this action
attribution problem is given in Figure 3. In the example, we
present two methodologies: last-touch attribution (the most
commonly used method, which attributes the action fully to the
last seen ad), and multi-touch attribution (MTA, where the action
is attributed to many ads seen from the same advertiser).
Please note that the figure presents a very simple
case of MTA, where each ad gets an equal proportion of the
action, which is rarely the case in a real setting.
Figure 3: Action Attribution Example
(User 1 sees ads from Line Items 1, 2, 3, and 4, and then performs an action.
Last-Touch Attribution: 0, 0, 0, and 1 action, with the full action credited to the
last ad, from Line Item 4. (Simple) Multi-Touch Attribution: 0.25 action to each ad.)
Naturally, action attribution and budget allocation are
closely related. To be able to correctly allocate budget to
sub-campaigns, we need to know how effective they are,
i.e. how many actions they contributed to versus how much
money was spent on them. This contribution is calculated
by the action attribution methodology we employ (presented
in §4).
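The two cases of Figure 3 can be sketched in a few lines. This is only an illustration of last-touch versus the simple equal-share multi-touch case; the function names and line item identifiers are hypothetical, and the equal split is the simplified case from the figure, not the weighted scheme presented later.

```python
def last_touch(touch_points):
    """Give the whole action to the line item of the last ad seen."""
    credit = {li: 0.0 for li in touch_points}
    credit[touch_points[-1]] = 1.0
    return credit


def equal_multi_touch(touch_points):
    """Split one action equally among the distinct line items seen."""
    line_items = list(dict.fromkeys(touch_points))  # dedupe, keep order
    share = 1.0 / len(line_items)
    return {li: share for li in line_items}


# Ads seen by one user, in time order, before a single action.
sequence = ["line_item_1", "line_item_2", "line_item_3", "line_item_4"]
lta = last_touch(sequence)
mta = equal_multi_touch(sequence)
```

Under last-touch, three of the four line items appear to have contributed nothing, which would starve them of budget in any allocation driven by attributed actions.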
2.3 Previous Work
In this section we will present some previous efforts in the
literature on both budget allocation and action-attribution.
2.3.1 Previous Efforts in Budget Allocation
Budget allocation at the campaign level for online
advertising is not a very broadly examined subject in the
literature. Most of the papers so far focus on the topic of budget
optimization, i.e. given a budget constraint, how to set the
bid values as well as spending profile to maximize utility
(i.e. budget allocation per impression, rather than per cam-
paign). This is significantly different than the problem we
are working on, since our aim is actually to set these budget
constraints. Therefore the efforts in budget optimization are
complementary to our work: After we set the budgets in the
campaign level, budget optimization can take place to allo-
cate these budgets in the impression level. As an example of
budget optimization, we can list [4], where the user behav-
ior is modeled as a Markov chain. This modeling takes into
account that advertising for a specific campaign type affects
future behavior of the user, changing state transition prob-
abilities. The authors model budget optimization task as a
constrained optimal control problem for a Markov Decision
Process (MDP).
Most of the budget allocation efforts so far aimed to max-
imize click revenue, since their focus stayed within the do-
main of search advertising. For example, the authors of
[10] propose a combined model of bid price and budget de-
termination for keywords. They assume that click-through
rate (CTR) is a function of bid price and take into account
the marginal gains by increasing the bid amount, hence the
budget. The solution of the optimization problem gives the
optimal budget allocation. However, [10] does not take into
account the ability to deliver, which is crucial, and we focus
on allocation based on actions, which is difficult due to the
attribution problem. As it can be seen, due to the nature of
action-based online advertising, a big portion of our discus-
sions are to solve the attribution problem, which makes the
methods based on CTR not appropriate.
In [7], the authors discuss the assignment of budgets to
two types of search portals, generic and specialized. The
authors model the allocation as an optimal control problem,
and solve using dynamic programming. The biggest hand-
icap with that approach is the assumption that the under-
lying parameters for earnings and clicks are known, which
does not hold true and causes the methodology to be not
applicable in real-world online advertising scenarios. We try
to actually learn the performance of multiple sub-campaigns
(which is similar to different search portals, if we take the
search portal utilization as a targeting constraint) utilizing
the multi-touch attribution.
The closest approach to the one proposed in this paper
is given in [15]. The authors aim to do combined budget
allocation and bid optimization for each campaign in an ac-
count, and employ quadratic programming method to max-
imize revenue. Our work differs in two ways. Firstly, [15]
utilizes clicks to decide on the utility of campaigns, where
we utilize actions. While clicks are straight-forward to at-
tribute to campaigns, one of the main contributions of our
work is the combined focus on attribution (which is a hard
task for actions) and allocation. Again, as previously stated,
any CTR-based allocation scheme is not appropriate for the
domain we are focusing on. Our second difference is that we
separate the budget allocation from bid optimization. The
authors of [15] argue that these two should be combined
since there can be well-performing keywords under overall
low-performing campaigns. While such an argument is valid
for search advertising, which [15] focuses on, this is not the
case for online display advertising. Furthermore, due to the
complicated (much more convoluted than pure keyword tar-
geting) targeting rules involved in online display advertising
campaigns, such combined optimization is often not feasible.
Finally, for a more theoretic approach, we can list [3],
which focuses on the budget allocation problem to maxi-
mize the set of influenced target nodes (users). The authors
model media channels (which can be taken as campaigns)
and users as a bipartite graph, and the budget allocated to
a media channel directly affects the number of users that are
influenced by this media channel. Although this paper is not
extremely relevant to ours since we aim to improve revenue
(by either clicks or actions), we believe the influenced users
would map nicely onto the set of buyers/clickers.
2.3.2 Previous Efforts in Action-Attribution
While there have been simple models utilized in the indus-
try to perform multi-touch attribution, the first published
work for data-driven allocation is given in [11]. The authors
provide both a bagged logistic regression model, and an in-
tuitive probabilistic model (which uses second-order proba-
bility estimation) for attribution.
The authors of [6] utilize Shapley value [12] for attribu-
tion. It is also shown in [6] that the simple probabilistic
scheme employed by [11] is equivalent to a Shapley value
formulation after rescaling, and under certain simplifying
assumptions. This paper also argues that it is hard to eval-
uate whether one attribution of actions is better than an-
other. Our proposed budget allocation methodology can be
taken as a way to evaluate attribution methodologies, an
additional contribution by our paper.
Abhishek et al. [2] model user behavior as a hidden Markov
model (since user states are not observable, but only the
outcome is, such as clicks). They later propose to utilize this
behavior model to perform attribution, by attributing
actions to ads that cause the user to change his/her latent state.
Finally, in [14], the authors claim that, given no other
importance information on channels, the first touch-point
as well as the touch-points closer to the last one (including
the last touch-point, which gets higher credit than first) get
the higher credit. This attribution resembles an asymmetric
bathtub shape, and the authors utilize a Beta distribution
over time. Since the paper only deals with user journeys
that end in action, the authors also aim at detecting the im-
portance of initiating, intermediary, and terminating nodes
for sequences within each journey, hence this way mapping
channels to relevance values.
Let us give the formal definition of the budget allocation
problem. Given the total budget $B$ for an insertion order,
the set of line items $L = \{l_1, ..., l_n\}$ under this IO, the maximum
spending capability of each line item $S = \{S_1, ..., S_n\}$, and the
return-on-investment (ROI) of each line item $R = \{R_1, ..., R_n\}$
(the amount of dollars received by the line item, due to
actions, for each dollar spent by the line item for advertising,
using the specific targeting of the line item):

$$\text{maximize } U = \sum_{i=1}^{n} R_i B_i \quad \text{subject to} \quad \forall j \in [1, n]: \; B_j \le S_j \quad \text{and} \quad \sum_{i=1}^{n} B_i \le B .$$

Please note that, as presented in Section 2.3.1, this is
significantly different from the so-called budget optimization
problem. If we have the correct values for the sets $S$ and $R$, a
very simple greedy approach actually optimizes the above problem:
1. $B_{remaining} = B$
2. Sort the line items in $L$ according to $R_i$ (descending) into
   a new list $L_{sorted}$.
3. While there is budget left
   For each next line item $l_i$ in $L_{sorted}$
   (a) Assign $l_i$ the budget $B_i$ as $\min(B_{remaining}, S_i)$
   (b) $B_{remaining} = B_{remaining} - B_i$
   (c) If $B_{remaining} \le 0$, then return.
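The greedy steps above can be sketched as follows, assuming (unrealistically, as the text goes on to note) that the ROI and spending capability of every line item are known. The function name `greedy_allocate` and the example numbers are hypothetical.

```python
def greedy_allocate(total_budget, roi, spend_cap):
    """Greedy budget allocation, assuming ROI and spending capability are known.

    roi[i] and spend_cap[i] describe line item i. Returns (budgets, utility),
    where utility = sum_i roi[i] * budgets[i].
    """
    remaining = total_budget
    budgets = [0.0] * len(roi)
    # Visit line items from the highest ROI to the lowest.
    for i in sorted(range(len(roi)), key=lambda k: roi[k], reverse=True):
        if remaining <= 0:
            break
        budgets[i] = min(remaining, spend_cap[i])  # never exceed capability
        remaining -= budgets[i]
    utility = sum(r * b for r, b in zip(roi, budgets))
    return budgets, utility


# Three hypothetical line items sharing a daily budget of 100.
budgets, utility = greedy_allocate(
    total_budget=100.0,
    roi=[1.5, 3.0, 2.0],
    spend_cap=[80.0, 40.0, 30.0],
)
```

In this example the two best-performing line items are filled up to their spending capability, and the line item with the lowest ROI receives only what is left.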
The problem we focus on in this paper is exactly the fact that
we do not know the values $R_i$ and $S_i$ for a line item. In the
next section, we show that we solve the spending capability
estimation by a simple adaptive budget assignment scheme,
and return-on-investment estimation via multi-touch attribution.
As mentioned in §3, budget allocation can be reduced
to two problems: (i) spending capability calculation for a
sub-campaign, and (ii) return-on-investment calculation for
a sub-campaign. In this section, we will separate these two
problems, and examine ways to solve them.
4.1 Spending Capability Calculation for a Sub-Campaign
As mentioned above, sub-campaigns (line items) apply
different targeting criteria to show ads to potential buyers of a
product. Different targeting criteria reach different numbers of
users, and hence differ in how much advertising budget they
are able to spend. We certainly do
not want to assign a lot of money, no matter how high the
return-on-investment may be, to a specific campaign that
cannot reach enough users to be able to spend the money.
It is however a hard problem to estimate exactly how much
money a sub-campaign may spend, since it depends on both
the reach of users, as well as the bid price (i.e. if a sub-
campaign bids low, it will not be able to win ad auctions
and not receive impressions, hence not be able to spend the
money assigned to it). In our budget allocation approach,
we apply a simple adaptive budget assignment scheme. This
methodology can be summarized as follows.
• If a sub-campaign is new, i.e. if we have no idea of
  how much it will spend, assign a learning budget that
  is high enough to give it a starting boost,
• If a sub-campaign has spending data, then always assign
  it a bit more (e.g. increase its budget by a certain
  percentage), to explore its spending limits.
Please note that, it is possible that at any point the sum of
current spending limits (calculated according to the above
adaptive scheme) of sub-campaigns may be smaller than the
overall campaign budget (i.e. a case of incomplete budget
delivery). This usually happens if the budget assigned to
a campaign is simply not possible to be spent by the sub-
campaigns, hence underspend (i.e. total spend not satisfying
total budget) may occur. In the case of incomplete budget
delivery, one solution that we utilize is to assign the remain-
ing (unassigned) budget fractionally among sub-campaigns
(according to their previous allocation). Although under-
spend may still occur, this assignment is still helpful in fur-
ther calculating the spending limits of sub-campaigns, since
we assign a little bit more budget to the sub-campaign than
our adaptive approach suggests.
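A minimal sketch of the adaptive scheme described above is given below. The paper does not state the learning budget or the exploration percentage, so `learning_budget` and `explore_rate` are assumed placeholder values, and the leftover budget is split in proportion to the current assignment as a stand-in for "previous allocation".

```python
def spending_limits(last_spend, learning_budget=50.0, explore_rate=0.10):
    """Estimate a spending limit per sub-campaign.

    last_spend maps sub-campaign id -> most recent daily spend, or None for a
    new sub-campaign. learning_budget and explore_rate are assumed values.
    """
    limits = {}
    for sc, spend in last_spend.items():
        if spend is None:
            limits[sc] = learning_budget               # new: starting boost
        else:
            limits[sc] = spend * (1.0 + explore_rate)  # probe a bit higher
    return limits


def distribute_leftover(limits, campaign_budget):
    """If the limits sum to less than the campaign budget, spread the
    unassigned remainder proportionally to the current assignment."""
    assigned = sum(limits.values())
    leftover = campaign_budget - assigned
    if leftover <= 0 or assigned == 0:
        return dict(limits)
    return {sc: b + leftover * b / assigned for sc, b in limits.items()}


# Two sub-campaigns with spending history and one new sub-campaign.
limits = spending_limits({"li_1": 100.0, "li_2": 40.0, "li_3": None})
budgets = distribute_leftover(limits, campaign_budget=300.0)
```

Because every sub-campaign ends up with somewhat more than its estimated limit, the next day's spend reveals whether the limit was real or merely an artifact of a previously small budget.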
It can be seen that this simple adaptive assignment method
actually tries to assign as much as possible to the sub-
campaigns that perform better (high return-on-investment).
This in turn tries to achieve the greedy algorithm given in
§3. Since we order the sub-campaigns/line items accord-
ing to their ROI, and then assign as much as possible to
the higher ranking line items, then the most important leg
of the approach is calculating the ROI accurately, which is
given in the next section.
4.2 ROI Calculation for a Sub-Campaign
We calculate the return-on-investment for a line item as

$$ROI(l_i) = \frac{\sum_{a_j} p(l_i|a_j)\, v(a_j)}{\text{Money spent by } l_i} . \quad (1)$$

Above, $v(a_j)$ is the monetary value that is received by
action $a_j$ (e.g. the profit that the advertiser earns by selling
that specific product). In this work, we deal with CPA (cost
per action) campaigns, where the advertiser provides the
demand-side platform with the values of the actions that
they want to receive, hence the return-on-investment is
calculated as the ratio of the value of actions received to the
amount of money spent for advertising. We also give the
attribution component in the above formulation by the term
$p(l_i|a_j)$. This determines the percentage of the action $a_j$
that is attributed to line item $l_i$ (while for LTA, $p(l_i|a_j)$ is 0
or 1, for MTA, $p(l_i|a_j) \in [0,1]$ since we allow partial
attribution of a single action to many sub-campaigns). Since the
above formulation is quite straightforward, we will focus on
the attribution problem (i.e. determining $p(l_i|a_j)$) for the
rest of the current section.
We have already stated that one of the most common at-
tribution methods used is last-touch attribution, which as-
signs the whole action to the last ad seen by the user. In this
paper, our emphasis is on multi-touch attribution, and we
utilize the probabilistic model given in [11], which also origi-
nated at Turn. The methodology given in [11] first calculates
the empirical action probability of line items (referred to as
advertising channels in the paper):
$$p(a|l_i) = \frac{N^+(l_i)}{N^+(l_i) + N^-(l_i)},$$

as well as pairs of line items:

$$p(a|l_i, l_j) = \frac{N^+(l_i, l_j)}{N^+(l_i, l_j) + N^-(l_i, l_j)}.$$

In the formulation, $N^+$ denotes the number of times that
any user in the system has observed an ad sequence with an
ad from line item $l_i$ (or ads from the pair of line items $l_i$ and
$l_j$) that ended in action, whereas $N^-$ denotes the number of
sequences that did not end in action (and had line item $l_i$,
or the pair $l_i$ and $l_j$, in it). This formulation basically gives
the probability that a sequence of ads shown to a user will
end in conversion if it has an ad from $l_i$ (or the pair $l_i$ and
$l_j$) in it. In our deployed system, we only consider actions
for the last $t_{action}$ days to be attributed to the impressions
and clicks (i.e. ad sequence) that the user experienced
up to $t_{association}$ days before each action. Different
values can be employed for the above two variables.
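The two time windows can be illustrated with a small sketch. It assumes events are plain `datetime` timestamps and that both windows are measured in whole days; the function names and the example dates are hypothetical.

```python
from datetime import datetime, timedelta


def filter_events(impressions, actions, today, t_action, t_association):
    """Keep actions from the last t_action days and impressions/clicks from
    the last t_action + t_association days (all timestamps are datetimes)."""
    action_start = today - timedelta(days=t_action)
    imp_start = today - timedelta(days=t_action + t_association)
    kept_actions = [a for a in actions if action_start <= a <= today]
    kept_imps = [i for i in impressions if imp_start <= i <= today]
    return kept_imps, kept_actions


def associated(impression_time, action_time, t_association):
    """True if the impression happened up to t_association days before the action."""
    gap = action_time - impression_time
    return timedelta(0) <= gap <= timedelta(days=t_association)


today = datetime(2014, 8, 24)
impressions = [datetime(2014, 8, 1), datetime(2014, 7, 1)]
actions = [datetime(2014, 8, 20), datetime(2014, 8, 1)]
kept_imps, kept_actions = filter_events(
    impressions, actions, today, t_action=7, t_association=30
)
```

Impressions are kept for the longer window because an action on the first day of the action window may still be associated with an impression that happened up to `t_association` days before it.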
Once the action probabilities are calculated, the contribution
weight (to be normalized to calculate actual attribution)
for a line item is calculated in [11] as:

$$w(l_i) = p(a|l_i) + \frac{1}{2(N-1)} \sum_{j \neq i} \left\{ p(a|l_i, l_j) - p(a|l_i) - p(a|l_j) \right\},$$

where $N$ is the total number of line items under the
advertiser that $l_i$ belongs to. Our experience with the current
advertising system built at Turn is that the second term,
i.e. the second-order calculations, does not give enough
advantage in accuracy to justify the increase in processing time
required to train the model (calculating the pair-wise
probabilities as well as using these probabilities for the
contribution weight), hence we utilize the first-order probabilities
to calculate weights for the line items (although both
first-order and second-order calculations are supported in our
system). Therefore, the weight of each line item utilized for
attribution is given as:

$$w(l_i) = p(a|l_i) = \frac{N^+(l_i)}{N^+(l_i) + N^-(l_i)}. \quad (2)$$

* As a side note, in this setting, the probability of action for a
sequence (regardless of the line items in it) is $p(a) = \frac{N^+}{N^+ + N^-}$,
where $N^+$ is the total number of sequences (regardless of line
items) that ended in action, and $N^-$ is the total number of sequences
that did not. This can be written in terms of action probabilities
conditioned on line items as:

$$p(a) = \sum_{S \in P(L) \setminus \{\emptyset\}} f(S)\, p(S)\, p(a|S),$$

where $L$ is the set of all line items and $P(L)$ is the power set
of $L$ (all subsets, from which we further remove the empty set, $\emptyset$). $p(S)$
is the probability of a set of line items appearing together in a
sequence (marginal probability of the set), which is calculated as
$\frac{N^+(S) + N^-(S)}{N^+ + N^-}$, i.e. the total number of sequences which have set $S$
in it, divided by the total number of sequences. $p(a|S)$ is the
conditional probability of action given set $S$, and $f(S)$ is a
function which gives $+1$ if set $S$ has an odd number of line items in it,
and $-1$ if set $S$ has an even number of line items in it. This is the
probability of a union of conditional action events, where line items
are not independent of each other.

Algorithm 1 Second Step of Multi-Touch Attribution: Calculates the
Attribution for Each Action and ROI for Each Line Item
  t_action = action window
  t_association = impression/click association window
  // tp: touch-point, l_i: line item
  for each user u_i do
    Keep only the imps and clicks for the time period:
      [today - (t_action + t_association), today]
    Keep only the actions for the time period:
      [today - t_action, today]
  end for
  action sequence set S_action = ∅
  // only look at action sequences, since we are doing attribution
  add each tp sequence S_i that ended in action (i.e. within
    t_association window of an action) into S_action
  for each S_i ∈ S_action do
    weightSum = Σ w(l_j) over each l_j that has a touch-point in S_i
    for each l_j that has a touch-point in sequence S_i do
      actionAttributed_{l_j} = w(l_j) / weightSum
      totalActionAttributed_{l_j} += actionAttributed_{l_j}
      totalActionValue_{l_j} += actionAttributed_{l_j} × (value of the action ending S_i)
    end for
  end for
  for each line item l_j do
    output totalActionAttributed_{l_j}  // total number of actions attributed to l_j
    output totalActionValue_{l_j}  // total value of actions attributed to l_j
    output ROI_{l_j} = totalActionValue_{l_j} / cost_{l_j}  // return-on-investment of l_j
  end for
For the first step in attribution, we go through each user
(i.e. web user, whose data consists of a set of impressions,
clicks and actions), and only process data for a certain
period (keep the actions for the last $t_{action}$ days, and the
impressions for the last $t_{action} + t_{association}$ days, since we only
attribute an action to an impression if the impression
happened up to $t_{association}$ days before the action). Next, we
extract the sequences of touch-points for the users, both
those that end in an action, and those that do not. Since a
sequence can have multiple touch-points from the same line
item, we deduplicate those touch-points, and in the end we
calculate the probability of a line item being in a sequence
that ends in action as its weight (i.e. equation 2 above),
which will be used for attribution in the second step of our
employed MTA algorithm. During the first step, we also
calculate the amount of money spent by each line item, which
is crucial to calculate ROI.
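The weight calculation of the first step can be sketched on a single machine as follows. This is only an illustration of equation 2 with per-sequence deduplication; the input layout (a list of touch-point lists with an action flag) and the name `line_item_weights` are assumptions, and cost tracking is omitted here.

```python
from collections import Counter


def line_item_weights(sequences):
    """First-order weights w(l) = N+(l) / (N+(l) + N-(l)).

    `sequences` is a list of (touch_points, ended_in_action) pairs, where
    touch_points is a list of line item ids (repeats allowed).
    """
    n_pos, n_neg = Counter(), Counter()
    for touch_points, ended_in_action in sequences:
        for li in set(touch_points):       # deduplicate within a sequence
            if ended_in_action:
                n_pos[li] += 1
            else:
                n_neg[li] += 1
    return {li: n_pos[li] / (n_pos[li] + n_neg[li])
            for li in set(n_pos) | set(n_neg)}


# Hypothetical sequences for line items A, B, and C.
sequences = [
    (["A", "B", "A"], True),
    (["A"], False),
    (["A"], False),
    (["B", "C"], False),
    (["C"], True),
    (["C"], True),
]
weights = line_item_weights(sequences)
```

Note that line item A appears twice in the first sequence but is counted once, so its weight is 1/3 (one action sequence out of three sequences containing it).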
(a) First Step: Calculation of the Weights for Each Line Item
(b) Second Step: Calculation of the Attribution for Each Action, and ROI for Each Line Item
Figure 4: Implementation Details of Employed MTA Algorithm
The second step in our employed action attribution scheme
is given in Algorithm 1. Since we already calculated the
weights ($w(l_i)$) for the line items in the previous step, now
all we have to do is to assign each action to the line items
that showed at least one ad before it (within a $t_{association}$
window), according to their weights (i.e. the normalized weight
for each line item is the fraction of the action that is
attributed to it). For this purpose, we only look at the
sequences that ended in action (contrary to the first step, which
needs both kinds of sequences to calculate the weights and the
total cost), and in the end return the total values of the
fractional actions attributed to each line item. We also calculate ROI as given
in equation 1 (please note that $cost_{l_j}$ is the total amount of
money spent by line item $l_j$ for advertising, over both action
and no-action sequences, and is calculated in the first step
of our attribution scheme).
Please note that both of the above steps are easily paral-
lelizable, and we present some details in the next section on
how we implement our attribution and allocation system.
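The second step can be sketched in the same single-machine style. The sketch assumes the weights and costs from the first step are available as dictionaries; the function name, the input layout, and the numbers are hypothetical, and the handling of a zero weight sum is an added safeguard rather than something the paper specifies.

```python
from collections import defaultdict


def attribute_actions(action_sequences, weights, cost):
    """Split each action among the line items in its sequence by weight.

    action_sequences: list of (touch_points, action_value) pairs.
    weights: line item id -> w(l) from the first step.
    cost: line item id -> total money spent (action and no-action sequences).
    Returns (total_attributed, total_value, roi), each keyed by line item id.
    """
    total_attributed = defaultdict(float)
    total_value = defaultdict(float)
    for touch_points, action_value in action_sequences:
        line_items = set(touch_points)
        weight_sum = sum(weights[li] for li in line_items)
        if weight_sum == 0:
            continue  # assumed safeguard: leave this action unattributed
        for li in line_items:
            share = weights[li] / weight_sum   # p(l | a), sums to 1 per action
            total_attributed[li] += share
            total_value[li] += share * action_value
    roi = {li: total_value[li] / cost[li] for li in total_value}
    return dict(total_attributed), dict(total_value), roi


weights = {"A": 0.25, "B": 0.75, "C": 0.5}
cost = {"A": 10.0, "B": 20.0, "C": 5.0}
action_sequences = [(["A", "B"], 40.0), (["C"], 10.0), (["B", "C", "B"], 20.0)]
attributed, value, roi = attribute_actions(action_sequences, weights, cost)
```

The attributed fractions sum to the number of actions (three here), so no action is double counted even though each is shared among several line items.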
As mentioned above, the attribution scheme we employ
(given in §4.2) is easily parallelizable, and we have
implemented the two-step algorithm on Hadoop [13]. This
parallel implementation is necessary due to the large number of
users (multiple billions of virtual users, where each user is a
set of cookies), and since we have to process the action and
no-action sequences for each of them. Indeed, the amount
of data we process (tens of terabytes of user profile data)
is bigger than in other works published so far, and is
representative of real-world online advertising systems.
The two-step MTA algorithm is run every day, for each ad-
vertiser, and is scheduled by Oozie Workflow Scheduler [1].
The current implementation at Turn takes 40 seconds per
mapper for each of the first and second steps. The overall
job (both steps) takes around two hours to complete every
day in our production system.
The overview of our MTA implementation is given in Fig-
ure 4, which gives the details of the two steps separately.
In Figure 4(a), we present the implementation of the first step
of our deployed attribution algorithm, which calculates the
attribution weights for each line item. The parallel processing
works as follows. First, we shard the whole set of users across
many mappers, which extract the action and no-action sequences
and, for each sequence, emit the line item id as the key, and
the following values: (i) cost for the impressions
(touch-points) of the line item inside the sequence,
(ii) whether this sequence is an action sequence (0/1 value),
and (iii) whether this sequence is a no-action sequence (0/1
value). These <key, value tuple> pairs are sent to the reducers,
and the pairs with the same key end up in the same reducer,
which allows for aggregation. In the end, each reducer outputs
the line item id key, the aggregated total cost, and the
aggregated total numbers of action and no-action sequences,
which are used to calculate the weight.
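The first step can be sketched in a few lines of single-process Python that mimics the mapper and reducer roles. This is an illustrative sketch, not the Hadoop job deployed at Turn: the function names, the dictionary layout of a sequence, and in particular the weight formula (actions divided by actions plus no-actions) are assumptions made for the example.

```python
from collections import defaultdict


def map_step_one(sequences):
    """Mapper sketch: emit (line_item_id, (cost, is_action, is_no_action)).

    Each sequence is a hypothetical dict with keys:
      "touchpoints": list of (line_item_id, impression_cost)
      "ended_in_action": bool
    """
    for seq in sequences:
        # Sum the impression cost of each line item inside this sequence.
        cost_by_item = defaultdict(float)
        for line_item_id, cost in seq["touchpoints"]:
            cost_by_item[line_item_id] += cost
        is_action = 1 if seq["ended_in_action"] else 0
        for line_item_id, cost in cost_by_item.items():
            yield line_item_id, (cost, is_action, 1 - is_action)


def reduce_step_one(pairs):
    """Reducer sketch: aggregate per line item and derive a weight."""
    totals = defaultdict(lambda: [0.0, 0, 0])
    for key, (cost, is_action, is_no_action) in pairs:
        totals[key][0] += cost
        totals[key][1] += is_action
        totals[key][2] += is_no_action
    result = {}
    for key, (cost, actions, no_actions) in totals.items():
        # ASSUMPTION: weight = share of sequences containing this line item
        # that ended in an action. The paper's exact weight formula is
        # defined outside this section.
        weight = actions / (actions + no_actions)
        result[key] = {"cost": cost, "actions": actions,
                       "no_actions": no_actions, "weight": weight}
    return result


example_sequences = [
    {"touchpoints": [("li1", 1.0), ("li2", 2.0), ("li1", 0.5)],
     "ended_in_action": True},
    {"touchpoints": [("li1", 1.0)], "ended_in_action": False},
    {"touchpoints": [("li2", 1.0)], "ended_in_action": False},
    {"touchpoints": [("li2", 3.0)], "ended_in_action": False},
]
step_one_output = reduce_step_one(map_step_one(example_sequences))
```

In a real MapReduce job the shuffle phase groups pairs by key before they reach the reducers; here a single dictionary plays that role.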
The implementation of the second step of our deployed
attribution scheme, where the actual action attribution as well
as the line item level return-on-investment (ROI) are
calculated, is presented in Figure 4(b). Similar to Figure 4(a),
we first shard the users across mappers, and in each mapper we
only go over the action sequences. Furthermore, we send the
output of the first job (line item weights, as well as total
costs) to the mappers, since these values are used to determine
the action attribution and ROI for each line item. For each
action sequence, the mappers emit the line item id (for each
line item that had a touch-point inside this sequence that ended
in an action) as the key, and the following values: (i) total
cost of the line item (this is only for continuity, copied
exactly from the output of the first job), (ii) the fraction of
the action (that concludes this sequence) that is attributed to
the line item (attributed action, which is within the interval
[0,1]), and (iii) the value of the action (that concludes this
sequence) × attributed action (attributed action value), which
represents the money made with the help of advertising under
this line item. Again, the same keys are collected within the
same reducer, and the reducer aggregates the values to calculate
the total action value (total attributed action value) received
by a line item, as well as the ROI for the line item (which uses
both the total attributed action value and the total cost for
this line item, and calculates ROI according to equation 1).
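The second step can be sketched in the same single-process style. This is a hedged illustration rather than the production job: the names and data layout are hypothetical, and ROI is assumed here to be the total attributed action value divided by the total cost, since equation 1 is defined outside this section.

```python
def map_step_two(action_sequences, step_one):
    """Mapper sketch: emit (line_item_id, (total_cost, share, share * value)).

    action_sequences: hypothetical dicts with keys
      "line_items": line item ids with a touch-point in the sequence
      "action_value": value of the action that ends the sequence
    step_one: {line_item_id: {"weight": float, "cost": float}} from job one.
    """
    for seq in action_sequences:
        items = sorted(set(seq["line_items"]))
        norm = sum(step_one[i]["weight"] for i in items)
        if norm == 0:
            continue  # no usable weights, nothing to attribute
        for i in items:
            share = step_one[i]["weight"] / norm  # attributed action in [0, 1]
            yield i, (step_one[i]["cost"], share, share * seq["action_value"])


def reduce_step_two(pairs):
    """Reducer sketch: total attributed actions, value and ROI per line item."""
    totals = {}
    for key, (cost, share, value) in pairs:
        entry = totals.setdefault(key, {"cost": cost, "actions": 0.0,
                                        "value": 0.0})
        entry["actions"] += share
        entry["value"] += value
    for entry in totals.values():
        # ASSUMPTION: ROI = total attributed action value / total cost.
        entry["roi"] = entry["value"] / entry["cost"] if entry["cost"] > 0 else 0.0
    return totals


first_job = {"a": {"weight": 0.5, "cost": 10.0},
             "b": {"weight": 0.25, "cost": 5.0}}
actions = [{"line_items": ["a", "b", "a"], "action_value": 30.0},
           {"line_items": ["b"], "action_value": 10.0}]
step_two_output = reduce_step_two(map_step_two(actions, first_job))
```

In this toy input, line item "a" receives two thirds of the first action, while "b" receives the remaining third plus all of the second action.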
Figure 5: MTA-based Budget Allocation Architecture.
[Diagram: the Multi-Touch Attribution Scheduled Job provides
Line Item Performance Data to the Control Server (Budget
Allocation, Spending Rate Control, Budget Control); the Control
Server sends the spending rate and the start or stop spending
signal for line items to the Ad Servers, which return spending
info for line items.]
The architecture we employ for MTA-based budget allocation is
given in Figure 5. The budget allocation algorithm runs on the
control server, which picks up the MTA performance information
from the Hadoop Distributed File System (HDFS), which is
populated by the MTA Oozie job. Then, the control server
calculates the daily budgets for line items, and calculates the
spending rates [8] for time periods within the day. These
spending rates are sent to the ad servers, which do the
spending, and send the money spent for each line item back to
the control server. The control server starts or stops line
items from further spending (this signal is also sent to the ad
servers) depending on whether the line item has depleted its
budget for the day.
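The control loop described above can be illustrated with a small sketch. Everything named here is hypothetical: the `ControlServer` class, its methods, and especially the uniform split of the daily budget across time periods, which is only a placeholder for the spending rate calculation of [8].

```python
from dataclasses import dataclass, field


@dataclass
class ControlServer:
    """Toy model of the control server's budget control role."""
    daily_budgets: dict          # line item id -> daily budget
    periods_per_day: int = 24    # ASSUMPTION: hourly time periods
    spent: dict = field(default_factory=dict)

    def spending_rates(self):
        # ASSUMPTION: uniform rate per period, a stand-in for the
        # spending rate calculation referenced as [8].
        return {li: budget / self.periods_per_day
                for li, budget in self.daily_budgets.items()}

    def report_spend(self, line_item, amount):
        # Ad servers report money spent; the reply is the start/stop signal.
        self.spent[line_item] = self.spent.get(line_item, 0.0) + amount
        return self.signal(line_item)

    def signal(self, line_item):
        remaining = self.daily_budgets[line_item] - self.spent.get(line_item, 0.0)
        return "start" if remaining > 0 else "stop"


server = ControlServer(daily_budgets={"li1": 24.0, "li2": 12.0})
rates = server.spending_rates()
first_signal = server.report_spend("li2", 5.0)
second_signal = server.report_spend("li2", 7.0)
```

After the second report, line item "li2" has spent its full daily budget of 12.0, so the sketch returns a stop signal for it while "li1" may keep spending.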
For our evaluations, we set up two campaigns in a real online
advertising environment, with the same campaign-level budget,
to run over 12 days in November 2013. Both campaigns have four
identical line items that run on differing targeting criteria.
The only difference between the two campaigns is that the
budget allocation is calculated utilizing the ROI values
generated by MTA in one, and by LTA in the other. Please note
that although MTA-based budget allocation is used commonly
within our platform due to its advantages, we present the
results of a single experiment. This is because this kind of
A/B testing requires the exact setup of two campaigns to
compare, hence it requires an experimentation budget (i.e.
money, since we assign the same amount of money to both
campaigns to allocate among sub-campaigns and then spend on
advertising). We provide results in terms of the
return-on-investment (ROI), effective cost per action (eCPA)
and effective cost per click (eCPC) metrics, which are
calculated at the campaign level. Our aim is to show that by
allocating budgets differently to sub-campaigns according to
different attribution methodologies, we improve the performance
of the overall campaign. While we have explained the ROI metric
throughout the paper, the latter two metrics can be described
as follows:
Effective Cost per Action (eCPA): What is the average amount
of money that is spent by an advertiser (on advertising) to
receive one action (i.e. purchase etc.)? This metric can be
calculated as (Advertising Cost) / (# of Actions).
Effective Cost per Click (eCPC): What is the average amount
of money that is spent by an advertiser (on advertising) to
receive one click (on its ad)? This metric can be calculated
as (Advertising Cost) / (# of Clicks).
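The two definitions above translate directly into code. The sketch below uses hypothetical function names and made-up input numbers chosen only for illustration; they are not values from the experiment.

```python
def ecpa(advertising_cost, num_actions):
    """Effective cost per action: advertising cost / number of actions."""
    # With zero actions the cost per action is unbounded.
    return advertising_cost / num_actions if num_actions else float("inf")


def ecpc(advertising_cost, num_clicks):
    """Effective cost per click: advertising cost / number of clicks."""
    return advertising_cost / num_clicks if num_clicks else float("inf")


# Illustrative campaign: 1000.0 spent, 50 actions, 400 clicks.
example_ecpa = ecpa(1000.0, 50)
example_ecpc = ecpc(1000.0, 400)
```

Lower values are better for both metrics, since they mean less money was spent per action or per click.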
The results for the return-on-investment of the budget
allocation applying the two attribution methodologies (LTA
and MTA) are given in Figure 6. Due to privacy issues, we
have scaled the actual ROI values by a constant factor.
Figure 6: Comparison of ROI Performance for the two budget
allocation algorithms utilizing different action attribution
methodologies over 12 Days. Higher ROI that has been achieved
by the proposed methodology indicates better performance.
[Plot: "Comparison of the Budget Allocation Schemes Utilizing
Two Action Attribution Methodologies in Terms of ROI"; series:
Last-Touch Attr., Multi-Touch Attr.; x-axis ticks 0 to 13.]
Since we receive actions at the campaign level (i.e. when we
receive an action, we know it belongs to a certain campaign,
and attribution to sub-campaigns comes afterwards), it is
straightforward to calculate the overall ROI for the two
identical campaigns in order to evaluate the results. It can be
seen that the campaign utilizing the MTA scheme has a much
higher ROI, which signifies that the ranking information
(estimated ROI) is more accurate for MTA.
Figure 7: Comparison of eCPA Performance for the two budget
allocation algorithms utilizing different action attribution
methodologies over 12 Days. Lower eCPA that has been achieved
by the proposed methodology indicates better performance.
[Plot: "Comparison of the Budget Allocation Schemes Utilizing
Two Action Attribution Methodologies in Terms of eCPA"; series:
Last-Touch Attr., Multi-Touch Attr.; x-axis ticks 0 to 13.]
Figure 8: Comparison of eCPC Performance for the two budget
allocation algorithms utilizing different action attribution
methodologies over 12 Days. Lower eCPC that has been achieved
by the proposed methodology indicates better performance.
[Plot: "Comparison of the Budget Allocation Schemes Utilizing
Two Action Attribution Methodologies in Terms of eCPC"; series:
Last-Touch Attr., Multi-Touch Attr.; x-axis ticks 0 to 13.]
The results in terms of eCPA and eCPC are given in Figure 7
and Figure 8, respectively (again, the values are scaled by a
constant factor). Again, it can be seen that the budget
allocation based on MTA performs much better than the one that
applies LTA. Please note that these eCPA and eCPC values are
closely related to ROI (if the action values are the same for
all actions, low eCPA means high ROI), but we see that the
improvement from MTA-based allocation is much larger in terms
of ROI than in terms of eCPA. This is because we were able to
get many more "high quality" (high value) actions with the
MTA-based budget allocation scheme. Finally, although budget
allocation was optimized towards actions via MTA, we can
observe that eCPC has also improved, since MTA identifies the
sub-campaigns that are more effective overall.
The final set of results for our experiment is given in
Figure 9, which reinforces our conclusion that MTA leads to a
better determination of sub-campaign utilities, and to improved
budget allocation. In the figure, we present the percentage of
the total budget allocated to each line item, along with the
ROI received from that line item during the run of the
experiment. Although we can see that the ROIs received by
identical line items in the two campaigns are slightly
different (this difference is expected, considering different
budgets are assigned), we see a remarkable correlation between
the allocation achieved by the MTA-based budget allocation and
the actual ROIs recorded. One more point of interest in the
graph is the highest allocated budget in the LTA case (LI 3,
i.e. line item 3). This line item is actually a retargeting
sub-campaign (i.e. it tries to target users who have already
interacted with this product in some way, e.g. went to the
homepage, clicked etc.), hence it is very likely to provide the
last push for a user before buying a product. This of course
leads to an unfair assignment of actions in the LTA case,
unlike MTA.
MTA Budgets
Line item   ROI     Budget
LI 1        31.85   63.5%
LI 2         7.94   16.2%
LI 3         7.12   12.7%
LI 4         0.46    7.6%

LTA Budgets
Line item   ROI     Budget
LI 3         3.01   40.5%
LI 1        34.01   23.9%
LI 2         7.86   18.5%
LI 4         0.20   17.1%

Figure 9: Comparison of how the budget is distributed (with the
ROI received) among sub-campaigns for both budget allocation
schemes. It is apparent that the MTA-based budget allocation
was able to determine the ROI of campaigns with much higher
accuracy and has delivered the overall budget to sub-campaigns
accordingly.
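One way to quantify how well each allocation follows the recorded ROIs is a rank correlation between budget share and ROI, computed from the Figure 9 values. The sketch below is not part of the original evaluation: the choice of Spearman's rank correlation and the helper names are assumptions, and with only four line items per campaign the result is indicative at best.

```python
def ranks(values):
    """Rank values from largest (rank 1) to smallest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# (budget share in %, ROI) per line item, as reported in Figure 9.
mta = {"LI 1": (63.5, 31.85), "LI 2": (16.2, 7.94),
       "LI 3": (12.7, 7.12), "LI 4": (7.6, 0.46)}
lta = {"LI 1": (23.9, 34.01), "LI 2": (18.5, 7.86),
       "LI 3": (40.5, 3.01), "LI 4": (17.1, 0.20)}

rho_mta = spearman([b for b, _ in mta.values()], [r for _, r in mta.values()])
rho_lta = spearman([b for b, _ in lta.values()], [r for _, r in lta.values()])
```

With these values the budget ranking and the ROI ranking coincide for the MTA campaign (coefficient 1.0), while the LTA campaign gives 0.4, driven by the large budget assigned to the retargeting line item LI 3. Because the ROI values are scaled by a constant factor, only rank-based comparisons like this one are unaffected by the scaling.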
In this paper, we have focused on the problem of budget
allocation in the online advertising domain. We have shown that
sub-campaign performance values calculated via multi-touch
attribution lead to better allocation of budgets. This has been
demonstrated empirically in our real-world online advertising
platform. We also gave a detailed explanation of the algorithms
utilized for both budget allocation and multi-touch
attribution, as well as their implementation.
Our future work mainly focuses on employing improved
multi-touch attribution algorithms. Furthermore, we plan to
apply MTA to bidding as well, i.e. to calculate the bid
utilizing the past performance values generated by the MTA
algorithm.
[1] Apache Oozie Workflow Scheduler for Hadoop. Accessed: 2014-01-24.
[2] V. Abhishek, P. S. Fader, and K. Hosanagar. Media
exposure through the funnel: A model of multi-stage
attribution. In Proc. WISE, 2013.
[3] N. Alon, I. Gamzu, and M. Tennenholtz. Optimizing
budget allocation among channels and influencers. In
Proc. ACM WWW, 2012.
[4] N. Archak, V. S. Mirrokni, and S. Muthukrishnan.
Budget optimization for online advertising campaigns
with carryover effects. In Proc. ACM Workshop on Ad
Auctions, 2010.
[5] C. Borgs, J. Chayes, O. Etesami, N. Immorlica,
K. Jain, and M. Mahdian. Dynamics of bid
optimization in online advertisement auctions. In
Proc. ACM WWW, pages 531–540, 2007.
[6] B. Dalessandro, C. Perlich, O. Stitelman, and
F. Provost. Causally motivated attribution for online
advertising. In Proc. ACM ADKDD, 2012.
[7] G. E. Fruchter and W. Dou. Optimal budget
allocation over time for keyword ads in web portals. J.
Optimization Theory and Applications,
124(1):157–174, 2005.
[8] K.-C. Lee, A. Jalali, and A. Dasdan. Real time bid
optimization with smooth budget delivery in online
advertising. In,
pages 1–13, 2013.
[9] K.-C. Lee, B. Orten, A. Dasdan, and W. Li.
Estimating conversion rate in display advertising from
past performance data. In Proc. ACM SIGKDD Conf.
on Knowledge Discovery and Data Mining, pages
768–776, 2012.
[10] O. Ozluk and S. Cholette. Allocating expenditures
across keywords in search advertising. J. Revenue and
Pricing Management, 6(4):347–356, 2007.
[11] X. Shao and L. Li. Data-driven multi-touch attribution
models. In Proc. ACM SIGKDD Conf. on Knowledge
Discovery and Data Mining, pages 258–264, 2011.
[12] L. S. Shapley. A value for n-person games. Annals of
Mathematical Studies, 28:307–317, 1953.
[13] T. White. Hadoop: The Definitive Guide. O’Reilly
Media, Sebastopol, CA, 2012.
[14] D. A. Wooff and J. M. Anderson. Time-weighted
multi-touch attribution and channel relevance in the
customer journey to online purchase. J. Statistical
Theory and Practice, 2013.
[15] W. Zhang, Y. Zhang, B. Gao, Y. Yu, X. Yuan, and
T.-Y. Liu. Joint optimization of bid and budget
allocation in sponsored search. In Proc. ACM
SIGKDD Conf. on Knowledge Discovery and Data
Mining, 2012.