Online Learning in Online Auctions

University of California, Berkeley, Berkeley, California, United States
Theoretical Computer Science (Impact Factor: 0.52). 08/2003; 324(2-3). DOI: 10.1016/j.tcs.2004.05.012
Source: CiteSeer

ABSTRACT We consider the problem of revenue maximization in online auctions, that is, auctions in which bids are received and dealt with one-by-one. In this note, we demonstrate that results from online learning can be usefully applied in this context, and we derive a new auction for digital goods that achieves a constant competitive ratio with respect to the best possible (offline) fixed price revenue. This substantially improves upon the best previously known competitive ratio [3] of $O(\exp(\sqrt{\log \log h}))$ for this problem. We apply our techniques to the related problem of online posted price mechanisms, where the auctioneer declares a price and a bidder only communicates his acceptance/rejection of the price. For this problem we obtain results that are (somewhat surprisingly) similar to the online auction problem.
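To make the setting concrete, the following is a minimal sketch of how an online learning method can drive an online auction. The abstract does not spell out the algorithm, so everything here is an assumption for illustration: bids are taken to lie in [1, h], candidate prices form a geometric grid, and the price offered to each bidder is drawn by a standard exponential-weights rule from the revenue each candidate price would have earned on earlier bids. The function names and the parameter `alpha` are hypothetical. Because each price depends only on earlier bids, a bidder cannot influence the price he is offered.

```python
import random


def best_fixed_price_revenue(bids):
    """Offline benchmark: revenue of the best single price in hindsight."""
    return max((p * sum(1 for b in bids if b >= p) for p in set(bids)), default=0.0)


def price_grid(h, alpha=0.1):
    """Candidate prices 1, (1+alpha), (1+alpha)^2, ... not exceeding h (assumed grid)."""
    prices, p = [], 1.0
    while p <= h:
        prices.append(p)
        p *= 1.0 + alpha
    return prices


def exp_weights_auction(bids, h, alpha=0.1, seed=0):
    """Process bids one by one; return the total revenue collected."""
    rng = random.Random(seed)
    prices = price_grid(h, alpha)
    gains = [0.0] * len(prices)  # revenue each candidate price would have earned so far
    revenue = 0.0
    for b in bids:
        # Weight each price by (1+alpha)^(gain/h); shift by the max gain for stability.
        top = max(gains)
        weights = [(1.0 + alpha) ** ((g - top) / h) for g in gains]
        # The price is fixed before the current bid is examined.
        price = rng.choices(prices, weights=weights, k=1)[0]
        if b >= price:
            revenue += price
        # Full-information update: the bid is revealed, so every price's gain is known.
        for i, p in enumerate(prices):
            if b >= p:
                gains[i] += p
    return revenue
```

Comparing `exp_weights_auction(bids, h)` with `best_fixed_price_revenue(bids)` on a bid sequence gives an empirical feel for the competitive ratio the abstract refers to. In the posted price variant only acceptance or rejection is observed, so the full-information update in the last loop would not be available and a bandit-style update would be needed instead.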

  • Source
    ABSTRACT: We consider pricing in settings where a consumer discovers his value for a good only as he uses it, and the value evolves with each use. We explore simple and natural pricing strategies for a seller in this setting, under the assumption that the seller knows the distribution from which the consumer's initial value is drawn, as well as the stochastic process that governs the evolution of the value with each use. We consider the differences between up-front or "buy-it-now" pricing (BIN), and "pay-per-play" (PPP) pricing, where the consumer is charged per use. Our results show that PPP pricing can be a very effective mechanism for price discrimination, and thereby can increase seller revenue. But it can also be advantageous to the buyers, as a way of mitigating risk. Indeed, this mitigation of risk can yield a larger pool of buyers. We also show that the practice of offering free trials is largely beneficial. We consider two different stochastic processes for how the buyer's value evolves: In the first, the key random variable is how long the consumer remains interested in the product. In the second process, the consumer's value evolves according to a random walk or Brownian motion with reflection at 1, and absorption at 0.
  • Source
    ABSTRACT: We consider a stochastic continuum armed bandit problem where the arms are indexed by the $\ell_2$ ball $B_{d}(1+r)$ of radius $1+r$ in $\mathbb{R}^d$. The reward functions $r :B_{d}(1+r) \rightarrow \mathbb{R}$ are considered to intrinsically depend on $k \ll d$ unknown linear parameters so that $r(\mathbf{x}) = g(\mathbf{A} \mathbf{x})$ where $\mathbf{A}$ is a full rank $k \times d$ matrix. Assuming the mean reward function to be smooth we make use of results from low-rank matrix recovery literature and derive an efficient randomized algorithm which achieves a regret bound of $O(C(k,d) n^{\frac{1+k}{2+k}})$ with high probability. Here $C(k,d)$ is at most polynomial in $d$ and $k$ and $n$ is the number of rounds or the sampling budget which is assumed to be known beforehand.
  • Source
    ABSTRACT: Inspired by real-time ad exchanges for online display advertising, we consider the problem of inferring a buyer's value distribution for a good when the buyer is repeatedly interacting with a seller through a posted-price mechanism. We model the buyer as a strategic agent, whose goal is to maximize her long-term surplus, and we are interested in mechanisms that maximize the seller's long-term revenue. We define the natural notion of strategic regret --- the lost revenue as measured against a truthful (non-strategic) buyer. We present seller algorithms that are no-(strategic)-regret when the buyer discounts her future surplus --- i.e. the buyer prefers showing advertisements to users sooner rather than later. We also give a lower bound on strategic regret that increases as the buyer's discounting weakens and shows, in particular, that any seller algorithm will suffer linear strategic regret if there is no discounting.

