
Online Learning in Online Auctions*

Avrim Blum

Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA¹

Vijay Kumar

Strategic Planning and Optimization Team, Amazon.com, Seattle, WA

Atri Rudra

Department of Computer Science, University of Texas at Austin, Austin, TX²

Felix Wu

Computer Science Division, University of California at Berkeley, Berkeley, CA³

Abstract

We consider the problem of revenue maximization in online auctions, that is, auc-

tions in which bids are received and dealt with one-by-one. In this paper, we demon-

strate that results from online learning can be usefully applied in this context, and

we derive a new auction for digital goods that achieves a constant competitive

ratio with respect to the optimal (offline) fixed price revenue. This substantially

improves upon the best previously known competitive ratio for this problem of

$O(\exp(\sqrt{\log\log h}))$ [4]. We also apply our techniques to the related problem of designing online posted price mechanisms, in which the seller declares a price for each

of a series of buyers, and each buyer either accepts or rejects the good at that price.

Despite the relative lack of information in this setting, we show that online learning

techniques can be used to obtain results for online posted price mechanisms which

are similar to those obtained for online auctions.

*Portions of this work appeared as an extended abstract in Proceedings of

SODA’03 [5].

Email addresses: avrim@cs.cmu.edu (Avrim Blum), vijayk@amazon.com

(Vijay Kumar), atri@cs.utexas.edu (Atri Rudra), felix@cs.berkeley.edu

(Felix Wu).

¹Supported in part by National Science Foundation grants CCR-0105488 and IIS-0121678.

²Work done while the author was at IBM India Research Lab, New Delhi, India.

³Supported in part by National Science Foundation ITR grant CCR-0121555.

Preprint submitted to Elsevier Science, 16 September 2003


1 Introduction

Auctions are traditional and well-studied economic mechanisms, and economists

have long studied the design of auctions intended to satisfy various goals, in-

cluding that of maximizing the total revenue obtained by the auctioneer from

the auction. Traditionally, however, economists have analyzed auctions under

the assumption that statistical information about the participating bidders is

available. Recent work in computer science has been directed toward designing

auctions in the absence of such statistical assumptions, using instead a form

of worst-case competitive analysis [3,4,7,10,12].

The proliferation of Internet auctions and the increasing availability of media

on the Internet has prompted particular attention to the design of auctions for

digital goods, that is, goods available in unlimited supply [7,10]. In this paper,

we focus on such goods, though our techniques may also be useful in the case

of limited supply goods. A key property of digital goods is that it will often

be useful to conduct auctions of such goods over time, with bidders arriving

one-by-one, rather than as a group. Hence, we are interested here in designing

online auctions for digital goods, a problem first described by Bar-Yossef et

al. [4].

In the model of Bar-Yossef et al. [4], n bidders arrive in a sequence. Each

bidder i is interested in one copy of the good, and values this copy at $v_i$. The valuations are normalized to some range [1,h], so that h is the ratio between the highest and lowest possible valuations. Bidder i places bid $b_i$, and the auction must then determine whether to sell the good to bidder i, and if so, at what price $s_i \le b_i$. This is equivalent to determining a sales price $s_i$, such that if $s_i \le b_i$, bidder i wins the good and pays $s_i$; otherwise, bidder i does

not win the good and pays nothing.

The utility of a bidder is then given by $v_i - s_i$ if bidder i wins; 0 if bidder i

does not win. As in Bar-Yossef et al. [4], we are interested in auctions which

are incentive-compatible, that is, auctions in which each bidder’s utility is

maximized by bidding truthfully and setting $b_i = v_i$. As shown in that paper,

this condition is equivalent to the condition that each $s_i$ depends only on

the first i − 1 bids, and not on the ith bid. Hence, the auction mechanism is

essentially trying to guess the ith valuation, based on the first i−1 valuations.
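The protocol above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `run_online_auction`, `PricingRule`, and `max_so_far` are hypothetical, and `max_so_far` is an arbitrary example rule chosen only to show that the price for bidder i is fixed before bid i is read.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a pricing rule maps the bids seen so far to the next sales price.
PricingRule = Callable[[Sequence[float]], float]


def run_online_auction(bids: Sequence[float], pricing_rule: PricingRule) -> float:
    """Return the total revenue of an online auction over the given bid sequence."""
    revenue = 0.0
    history: List[float] = []
    for b in bids:
        # s_i is computed from the first i-1 bids only, never from b_i itself.
        s = pricing_rule(history)
        if s <= b:
            revenue += s  # bidder i wins the good and pays s_i
        history.append(b)  # b_i becomes visible only after s_i is fixed
    return revenue


def max_so_far(history: Sequence[float]) -> float:
    """Illustrative rule (an assumption, not from the paper): highest bid seen so far, or 1."""
    return max(history, default=1.0)
```

Because `pricing_rule` never receives the current bid, a bidder cannot lower the price it pays by misreporting, which is the incentive-compatibility condition described above.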

Note that in an online auction, the sales prices $s_i$ are not actually revealed to

the bidders, since we need the bidders to declare their valuations, so that the

auction can use this information in dealing with future bidders. In auctions

conducted remotely over networks, however, the bidders may not trust the

auctioneer to set sales prices before seeing the next bid. Buyers would clearly

prefer to receive these sales prices directly and then to make a decision accordingly whether or not to purchase the good. (Buyers purchase if and only if

$s_i \le v_i$.) We call such a mechanism a posted price mechanism [11]. The trade-off in using such a mechanism is that in exchange for the greater trust of the

buyers, the seller loses the complete information about the buyers’ valuations.
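The information gap between the two settings can be made concrete with a small sketch. The function names below are hypothetical and the dictionaries are only an illustrative encoding of what the seller learns in each round.

```python
from typing import Dict, Sequence


def full_information_gains(candidates: Sequence[float], bid: float) -> Dict[float, float]:
    """Online auction: the bid b_i = v_i reveals the gain of every candidate price."""
    return {x: (x if x <= bid else 0.0) for x in candidates}


def posted_price_gain(posted: float, valuation: float) -> Dict[float, float]:
    """Posted price: the seller sees only accept/reject at the one price it posted."""
    accepted = posted <= valuation  # the buyer purchases iff s_i <= v_i
    return {posted: posted if accepted else 0.0}
```

In the first case the seller can score all candidate prices after each bidder; in the second it learns the outcome of a single price per buyer.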

As in previous papers [4,10,12], we will use competitive analysis to analyze the

performance of any given auction or mechanism. That is, we are interested in

the worst-case ratio (over all sequences of valuations) between the revenue of

the “optimal offline” auction and the revenue of the online auction. Following

previous papers [4,10], we take the optimal offline auction to be the one which

optimally sets a single fixed price for all bidders. Thus, our goal is what is

sometimes called “static optimality.” The revenue of the optimal fixed price

auction is given by $F(v) = \max_i \{ v_i n_i \}$, where $n_i = |\{ j \mid v_j \ge v_i \}|$ is the number of bidders with valuation at least $v_i$. An online auction A with revenue $R_A(v)$ is said to be c-competitive if for any sequence v, $R_A(v) \ge F(v)/c$. We take $R_A$ to be the expected revenue if A is randomized.
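As a worked illustration of the benchmark, the following sketch computes the optimal fixed price revenue F(v) directly from its definition and checks the c-competitive condition on one sequence. The function names are hypothetical, and checking a single sequence is of course weaker than the worst-case guarantee in the definition.

```python
from typing import Sequence


def optimal_fixed_price_revenue(valuations: Sequence[float]) -> float:
    """F(v) = max_i v_i * n_i, where n_i counts bidders with valuation >= v_i."""
    ordered = sorted(valuations, reverse=True)
    # In decreasing order, rank + 1 bidders sit at or before position rank, so
    # rank + 1 <= n_i, with equality at the last copy of each tied value.
    # Taking the max over all positions therefore yields max_i v_i * n_i.
    return max((v * (rank + 1) for rank, v in enumerate(ordered)), default=0.0)


def is_c_competitive_on(valuations: Sequence[float], revenue: float, c: float) -> bool:
    """Check R_A(v) >= F(v) / c for one given sequence v."""
    return revenue >= optimal_fixed_price_revenue(valuations) / c
```

For example, on valuations (4, 1, 1, 1, 1) the best fixed price is 1, which sells to all five bidders for a revenue of 5, beating the price 4 that sells only once.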

In Section 2, we present an asymptotically constant-competitive online auction

for digital goods. By asymptotically, we mean that our auction achieves a

revenue which is a constant fraction of F, but minus an additive term. (In our

case, this term is O(h ln ln h).) Hence, as F becomes large, this additive term

becomes negligible. Nevertheless, it is important to minimize this term, since it

roughly corresponds to the size of the smallest auctions for which we can give

good revenue bounds. Theorem 4 gives a general lower bound showing that

our additive constant is nearly optimal: in particular, any constant-competitive

algorithm must have an additive constant Ω(h).

In Section 3, we derive a similar result for the problem of designing online

posted price mechanisms. (Offline posted price mechanisms have been previ-

ously studied by Hartline [11].) Such mechanisms provide much less informa-

tion to the auctioneer about the bidders’ valuations, but surprisingly, we are

still able to obtain results very similar to those obtained in the online auction

setting.

Our results are based on application of machine learning techniques to the

online auction problem. Setting a single fixed price for the auction can be

thought of as following the advice of a single “expert” who predicts that fixed

price for every bidder. Performing well relative to the optimal fixed price is

then equivalent to performing well relative to the best of these experts, a prob-

lem well-studied in learning theory [2,6,8,9,13]. The posted price setting then

corresponds to a version of the “bandit” problem [2], in which the informa-

tion received depends on the expert chosen at each step. Our algorithms are

derived by adapting these techniques to the online auction setting.


2 Online auctions: the full information game

We use a variant of Littlestone and Warmuth’s weighted majority (WM)

algorithm [13] given in Auer et al. [1,2]. In our context, let $X = \{x_1, \ldots, x_\ell\}$ be a set of candidate fixed prices, corresponding to a set of experts. Let $r_k(v)$ be the revenue obtained by setting the fixed price $x_k$ for the valuation sequence v, and let $F_X(v) = \max_k r_k(v)$ be the optimal fixed price revenue on sequence v, when restricted to fixed prices in X.

Given a parameter α ∈ (0,1], define weights $w_k(i) = (1+\alpha)^{r_k(v_1,\ldots,v_i)/h}$. Clearly, the weights can be easily maintained using a multiplicative update. Then, for bidder i, the auction chooses $s_i \in X$ with probability

$$p_k(i) = \Pr[s_i = x_k] = \frac{w_k(i-1)}{\sum_{j=1}^{\ell} w_j(i-1)}.$$

This algorithm is shown in Figure 1.

Algorithm WM
Parameters: Reals α ∈ (0,1] and $X \in [1,h]^{\ell}$.
Initialization: For each expert k, initialize $r_k() = 0$, $w_k(0) = 1$.
For each bidder i = 1,...,n:
    Set the sales price $s_i$ to be $x_k$ with probability $p_k(i) = w_k(i-1) / \sum_{j=1}^{\ell} w_j(i-1)$.
    Observe $b_i = v_i$.
    For each expert k, update $r_k(v_1,\ldots,v_i)$ and $w_k(i) = (1+\alpha)^{r_k(v_1,\ldots,v_i)/h}$.

Fig. 1. WM in our setting
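The steps of Figure 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function name `wm_auction`, its return convention, and the seeded default generator are hypothetical choices. Alongside the sampled (realized) revenue it also accumulates the exact expected revenue, which is the quantity the analysis bounds.

```python
import random
from typing import Optional, Sequence, Tuple


def wm_auction(
    valuations: Sequence[float],
    prices: Sequence[float],
    alpha: float,
    h: float,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Run WM over truthful bids; return (realized revenue, expected revenue)."""
    rng = rng or random.Random(0)  # assumption: fixed seed for reproducibility
    cumulative = [0.0] * len(prices)  # r_k(v_1, ..., v_{i-1}) for each expert k
    realized = 0.0
    expected = 0.0
    for v in valuations:
        # w_k(i-1) = (1 + alpha)^(r_k / h), computed before bid i is observed.
        weights = [(1.0 + alpha) ** (r / h) for r in cumulative]
        total = sum(weights)
        # g_k(i): what fixed price x_k would earn from this bidder.
        gains = [x if x <= v else 0.0 for x in prices]
        expected += sum(w * g for w, g in zip(weights, gains)) / total
        # Sample s_i = x_k with probability p_k(i) = w_k(i-1) / sum_j w_j(i-1).
        k = rng.choices(range(len(prices)), weights=weights)[0]
        realized += gains[k]
        # Observe b_i = v_i and update every expert's cumulative revenue.
        cumulative = [r + g for r, g in zip(cumulative, gains)]
    return realized, expected
```

Recomputing the weights from the cumulative revenues each round mirrors the definition in Figure 1; an equivalent version would multiply each weight by $(1+\alpha)^{g_k(i)/h}$ after each bidder.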

The following theorem appears in Auer et al., with the proof adapted from

proofs appearing in Freund and Schapire [9] and Littlestone and Warmuth [13].

Theorem 1 [1, Theorem 3.2] For any sequence of valuations v, the revenue of auction WM is at least:

$$R_{WM}(v) \ge \left(1 - \frac{\alpha}{2}\right) F_X(v) - \frac{h \ln \ell}{\alpha}.$$

For completeness, we provide a proof here.

Proof. Let $g_k(i)$ denote the revenue gained by the kth expert from bidder i, that is, $g_k(i) = x_k$ if $v_i \ge x_k$, and $g_k(i) = 0$ otherwise. Then, $r_k(v_1,\ldots,v_i) = g_k(i) + r_k(v_1,\ldots,v_{i-1})$. Let $W(i) = \sum_{k=1}^{\ell} w_k(i)$ be the sum of the weights after bidder i.


The expected revenue of the auction from bidder i + 1 is given by

$$g_{WM}(i+1) = \frac{\sum_{k=1}^{\ell} w_k(i)\, g_k(i+1)}{W(i)}.$$

We can then relate the change in W(i) to the expected revenue of the auction

as follows:

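$$\begin{aligned}
W(i+1) &= \sum_{k=1}^{\ell} w_k(i)\,(1+\alpha)^{g_k(i+1)/h} \\
&\le \sum_{k=1}^{\ell} w_k(i)\left(1 + \alpha\, g_k(i+1)/h\right) \\
&= W(i) + \alpha \sum_{k=1}^{\ell} w_k(i)\, g_k(i+1)/h \\
&= W(i)\left(1 + \alpha\, g_{WM}(i+1)/h\right),
\end{aligned}$$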

where for the inequality, we used the fact that for x ∈ [0,1], $(1+\alpha)^x \le 1 + \alpha x$.

Since $W(0) = \ell$, we have

$$W(n) \le \ell \cdot \prod_{i=1}^{n} \left(1 + \alpha\, g_{WM}(i)/h\right).$$

On the other hand, the sum of the final weights is at least the value of the

maximum final weight. Hence, $W(n) \ge (1+\alpha)^{F_X/h}$.

Taking logs, we have

$$\frac{F_X}{h} \ln(1+\alpha) \le \ln \ell + \sum_{i=1}^{n} \ln\left(1 + \alpha\, g_{WM}(i)/h\right).$$

For x ∈ [0,1], $x - x^2/2 \le \ln(1+x) \le x$; hence,

$$\frac{F_X}{h}\left(\alpha - \frac{\alpha^2}{2}\right) \le \ln \ell + \frac{\alpha}{h} R_{WM}.$$

Rearranging this inequality yields the theorem.
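For readers who want the last step spelled out, the rearrangement can be written as follows. It uses only quantities defined in the proof, together with the observation that the expected per-bidder revenues sum to the total, $\sum_{i=1}^{n} g_{WM}(i) = R_{WM}$.

```latex
\begin{align*}
\frac{F_X}{h}\left(\alpha - \frac{\alpha^2}{2}\right)
  &\le \ln \ell + \frac{\alpha}{h} R_{WM}
  && \text{(bound from the proof)} \\
\left(1 - \frac{\alpha}{2}\right) F_X
  &\le \frac{h \ln \ell}{\alpha} + R_{WM}
  && \text{(multiply both sides by } h/\alpha > 0\text{)} \\
R_{WM}
  &\ge \left(1 - \frac{\alpha}{2}\right) F_X - \frac{h \ln \ell}{\alpha}
  && \text{(move the additive term across)}
\end{align*}
```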

□

Now let X consist of all powers of (1 + β) between 1 and h. If we take α =

β =?

3, we get the following theorem.
