Kevin Leyton-Brown
  • PhD, Stanford University
  • Professor (Full) at University of British Columbia

About

236 Publications
77,559 Reads
15,696 Citations
Introduction
Much of my work is at the intersection of computer science and microeconomics, addressing computational problems in economic contexts and incentive issues in multiagent systems. I also study the application of machine learning to the automated design and analysis of algorithms for solving hard computational problems.
Current institution
University of British Columbia
Current position
  • Professor (Full)
Additional affiliations
January 2004 - present
University of British Columbia
September 1998 - June 2003
Stanford University

Publications

Publications (236)
Preprint
Full-text available
Conspicuous consumption occurs when a consumer derives value from a good based on its social meaning as a signal of wealth, taste, and/or community affiliation. Common conspicuous goods include designer footwear, country club memberships, and artwork; conspicuous goods also exist in the digital sphere, with non-fungible tokens (NFTs) as a prominent...
Preprint
Full-text available
Models of human behavior in game-theoretic settings often distinguish between strategic behavior, in which a player both reasons about how others will act and best responds to these beliefs, and "level-0" non-strategic behavior, in which they do not respond to explicit beliefs about others. The state of the art for predicting human behavior on unre...
Preprint
With the rise of online applications, recommender systems (RSs) often encounter constraints in balancing exploration and exploitation. Such constraints arise when exploration is carried out by agents whose individual utility should be balanced with overall welfare. Recent work suggests that recommendations should be individually rational. Specifica...
Preprint
Full-text available
Motivated by the rapid ascent of Large Language Models (LLMs) and debates about the extent to which they possess human-level qualities, we propose a framework for testing whether any agent (be it a machine or a human) understands a subject matter. In Turing-test fashion, the framework is based solely on the agent's performance, and specifically on...
Preprint
Full-text available
Utilitarian algorithm configuration is a general-purpose technique for automatically searching the parameter space of a given algorithm to optimize its performance, as measured by a given utility function, on a given set of inputs. Recently introduced utilitarian configuration procedures offer optimality guarantees about the returned parameterizati...
Article
Researchers building behavioral models, such as behavioral game theorists, use experimental data to evaluate predictive models of human behavior. However, there is little agreement about which loss function should be used in evaluations, with error rate, negative log-likelihood, cross-entropy, Brier score, and squared L2 error all being common choi...
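For illustration only (not drawn from the paper), a minimal Python sketch of several of the loss functions named in this abstract, scored against a single observed action; for a one-hot observation, cross-entropy coincides with negative log-likelihood and the Brier score with squared L2 error. The example probabilities are hypothetical.

    import numpy as np

    def evaluate_prediction(pred, observed_action):
        """Score a predicted distribution over actions against one observed action."""
        pred = np.asarray(pred, dtype=float)
        onehot = np.zeros_like(pred)
        onehot[observed_action] = 1.0
        return {
            # 0/1 error of the model's modal prediction
            "error_rate": float(np.argmax(pred) != observed_action),
            # negative log-likelihood (= cross-entropy for a single observation)
            "neg_log_likelihood": -np.log(pred[observed_action]),
            # Brier score / squared L2 distance to the one-hot outcome
            "squared_L2": float(np.sum((pred - onehot) ** 2)),
        }

    # Hypothetical example: model puts 60% on action 0, the human played action 1.
    print(evaluate_prediction([0.6, 0.3, 0.1], observed_action=1))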
Article
Mobile gaming is a rapidly growing and incredibly profitable sector; having grown seven-fold over the past 10 years, it now grosses over $100 billion annually. This growth was due in large part to a shift in monetization strategies: rather than charging players an upfront cost ("pay-to-play"), games often request optional microtransactions througho...
Article
Full-text available
This paper revisits the descending clock “reverse” auction design used in the U.S. Federal Communications Commission’s 2016–2017 “incentive auction.” We use extensive computational simulations to investigate the quantitative significance of various aspects of the design, leveraging a reverse auction simulator and realistic models of bidder values....
Article
Full-text available
Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, were shown to significantly improve language modeling performance. In addition, they can mitigate the problem of factually inaccurate text generation and provide natural source attribution mech...
Preprint
Full-text available
Before deploying a language model (LM) within a given domain, it is important to measure its tendency to generate factually incorrect information in that domain. Existing factual generation evaluation methods focus on facts sampled from the LM itself, and thus do not control the set of evaluated facts and might under-represent rare and unlikely fac...
Article
Peer grading systems aggregate noisy reports from multiple students to approximate a "true" grade as closely as possible. Most current systems either take the mean or median of reported grades; others aim to estimate students’ grading accuracy under a probabilistic model. This paper extends the state of the art in the latter approach in three key w...
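A minimal sketch (not the paper's model) of the baseline aggregators this abstract mentions, plus a precision-weighted mean standing in for the probabilistic approach; the grades and weights are hypothetical.

    import numpy as np

    def aggregate_grades(reports, grader_precisions=None):
        """Combine noisy peer-reported grades for one submission into a single estimate."""
        reports = np.asarray(reports, dtype=float)
        estimates = {
            "mean": reports.mean(),
            "median": float(np.median(reports)),
        }
        if grader_precisions is not None:
            # Crude stand-in for a probabilistic model: weight graders by reliability.
            w = np.asarray(grader_precisions, dtype=float)
            estimates["precision_weighted"] = float(np.sum(w * reports) / np.sum(w))
        return estimates

    # Hypothetical example: three peers grade the same assignment.
    print(aggregate_grades([72, 80, 95], grader_precisions=[2.0, 1.0, 0.5]))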
Preprint
Full-text available
Behavioral game theorists all use experimental data to evaluate predictive models of human behavior. However, they differ greatly in their choice of loss function for these evaluations, with error rate, negative log-likelihood, cross-entropy, Brier score, and L2 error all being common choices. We attempt to offer a principled answer to the question...
Preprint
Full-text available
Retrieval-Augmented Language Modeling (RALM) methods, that condition a language model (LM) on relevant documents from a grounding corpus during generation, have been shown to significantly improve language modeling while also providing a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture in order t...
Preprint
Full-text available
For applications that require processing large amounts of text at inference time, Large Language Models (LLMs) are handicapped by their limited context windows, which are typically 2048 tokens. In-context learning, an emergent phenomenon in LLMs in sizes above a certain parameter threshold, constitutes one significant example because it can only le...
Preprint
Full-text available
We introduce Monte Carlo Forest Search (MCFS), an offline algorithm for automatically synthesizing strong tree-search solvers for proving unsatisfiability on given distributions, leveraging ideas from the Monte Carlo Tree Search (MCTS) algorithm that led to breakthroughs in AlphaGo. The crucial difference between proving unsatisfiability and...
Preprint
Full-text available
In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Peter Stone of the U...
Preprint
Full-text available
Peer grading systems aggregate noisy reports from multiple students to approximate a true grade as closely as possible. Most current systems either take the mean or median of reported grades; others aim to estimate students' grading accuracy under a probabilistic model. This paper extends the state of the art in the latter approach in three key way...
Article
Full-text available
Formulating real-world optimization problems often begins with making predictions from historical data (e.g., an optimizer that aims to recommend fast routes relies upon travel-time predictions). Typically, learning the prediction model used to generate the optimization problem and solving that problem are performed in two separate stages. Recent w...
Preprint
Full-text available
When trying to solve a computational problem we are often faced with a choice among algorithms that are all guaranteed to return the right answer but that differ in their runtime distributions (e.g., SAT solvers, sorting algorithms). This paper aims to lay theoretical foundations for such choices by formalizing preferences over runtime distribution...
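A sketch of the underlying comparison (not the paper's axiomatization): scoring two hypothetical runtime distributions by expected utility under a utility function that decays with runtime, with runs capped at a captime. All distributions and parameters below are invented.

    import numpy as np

    rng = np.random.default_rng(0)

    def expected_utility(runtimes, utility, captime=None):
        """Monte Carlo estimate of the expected utility of an algorithm's runtime.

        runtimes: sampled runtimes (seconds) from the algorithm's runtime distribution.
        utility: maps a runtime to a utility value.
        captime: if set, runs longer than this are treated as capped at captime.
        """
        t = np.minimum(runtimes, captime) if captime is not None else runtimes
        return float(np.mean([utility(x) for x in t]))

    # Hypothetical comparison: solver A is usually fast but heavy-tailed,
    # solver B is slower but consistent; utility decays exponentially in runtime.
    u = lambda t: np.exp(-t / 10.0)
    solver_a = rng.lognormal(mean=1.0, sigma=1.5, size=10_000)
    solver_b = rng.lognormal(mean=2.0, sigma=0.2, size=10_000)
    print(expected_utility(solver_a, u, captime=60),
          expected_utility(solver_b, u, captime=60))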
Preprint
Full-text available
Huge language models (LMs) have ushered in a new era for AI, serving as a gateway to natural-language-based knowledge tasks. Although an essential element of modern AI, LMs are also inherently limited in a number of ways. We discuss these limitations and how they can be avoided by adopting a systems approach. Conceptualizing the challenge as one th...
Preprint
Full-text available
Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionalities across disparate applications. However, current leading techniques for leveraging a "frozen" LM -- i.e., leaving its we...
Preprint
This paper studies a novel reviewer-paper matching approach that was recently deployed in the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), and has since been adopted by other conferences including AAAI 2022 and ICML 2022. This approach has three main elements: (1) collecting and processing input data to identify problematic matches...
Article
In many real-world auctions, a bidder does not know her exact value for an item, but can perform a costly deliberation to reduce her uncertainty. Relatively little is known about such deliberative environments, which are fundamentally different from classical auction environments. In this paper, we propose a new approach that allows us to leverage...
Article
Full-text available
Stackelberg security games form the backbone of systems like ARMOR, IRIS and PROTECT, which are in regular use by the Los Angeles International Airport Police, the US Federal Air Marshal Service and the US Coast Guard, respectively. An understanding of the runtime required by algorithms that power such systems is critical to furthering the application of game t...
Preprint
Supervised learning models often make systematic errors on rare subsets of the data. However, such systematic errors can be difficult to identify, as model performance can only be broken down across sensitive groups when these groups are known and explicitly labelled. This paper introduces a method for discovering systematic errors, which we call t...
Preprint
Full-text available
Formulating real-world optimization problems often begins with making predictions from historical data (e.g., an optimizer that aims to recommend fast routes relies upon travel-time predictions). Typically, learning the prediction model used to generate the optimization problem and solving that problem are performed in two separate stages. Recent w...
Preprint
Full-text available
Mechanical TA 2 (MTA2) is an open source web-based peer grading application that leverages trusted TA graders to incentivize high-quality peer review. A previous, prototype implementation of MTA proved the value of the concept, but was neither suitable for use at scale nor easily extensible; MTA2 is a complete reimplementation of the system that ov...
Chapter
We study a dynamic non-bipartite matching problem. There is a fixed set of agent types, and agents of a given type arrive and depart according to type-specific Poisson processes. The value of a match is determined by the types of the matched agents. We present an online algorithm that is (1/8)-competitive with respect to the value of the optimal-in...
Preprint
Full-text available
We study a dynamic non-bipartite matching problem. There is a fixed set of agent types, and agents of a given type arrive and depart according to type-specific Poisson processes. The value of a match is determined by the types of the matched agents. We present an online algorithm that is (1/6)-competitive with respect to the value of the optimal-in...
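To make the setting concrete, here is a small simulation sketch of type-specific Poisson arrivals with exponential departures and a naive greedy matching baseline; this is only an illustration of the model, not the paper's competitive algorithm, and all rates and values are hypothetical.

    import heapq
    import random

    def simulate_greedy_matching(arrival_rates, lifetime, value, horizon, seed=0):
        """Greedy baseline for the dynamic non-bipartite matching setting.

        arrival_rates: dict mapping agent type -> Poisson arrival rate.
        lifetime: mean (exponential) time an unmatched agent waits before departing.
        value: value(type_a, type_b) of matching two waiting agents.
        """
        rng = random.Random(seed)
        events = []
        for t_type, rate in arrival_rates.items():
            t = rng.expovariate(rate)
            while t < horizon:
                heapq.heappush(events, (t, t_type))
                t += rng.expovariate(rate)

        waiting, total = [], 0.0          # waiting: (departure_time, type) pairs
        while events:
            now, t_type = heapq.heappop(events)
            waiting = [(d, w) for d, w in waiting if d > now]   # drop departed agents
            if waiting:
                best = max(waiting, key=lambda w: value(t_type, w[1]))
                if value(t_type, best[1]) > 0:
                    waiting.remove(best)
                    total += value(t_type, best[1])
                    continue
            waiting.append((now + rng.expovariate(1.0 / lifetime), t_type))
        return total

    # Hypothetical two-type example: matching unlike types is worth more.
    val = lambda a, b: 2.0 if a != b else 1.0
    print(simulate_greedy_matching({"x": 1.0, "y": 0.5}, lifetime=2.0, value=val, horizon=100))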
Preprint
Full-text available
We consider the problem of wisely using a limited budget to label a small subset of a large unlabeled dataset. We are motivated by the NLP problem of word sense disambiguation. For any word, we have a set of candidate labels from a knowledge base, but the label set is not necessarily representative of what occurs in the data: there may exist labels...
Preprint
Full-text available
Masking tokens uniformly at random constitutes a common flaw in the pretraining of Masked Language Models (MLMs) such as BERT. We show that such uniform masking allows an MLM to minimize its training objective by latching onto shallow local signals, leading to pretraining inefficiency and suboptimal downstream performance. To address this flaw, we...
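A simplified sketch of the contrast this abstract draws: i.i.d. uniform token masking versus masking contiguous spans so a masked token cannot be guessed from its immediate neighbours alone. This ignores BERT's 80/10/10 replacement rule and is not the paper's proposed selection scheme (which chooses spans by pointwise mutual information); the sentence and mask rate are hypothetical.

    import random

    MASK, RATE = "[MASK]", 0.15

    def uniform_masking(tokens, rng):
        """The baseline the abstract criticizes: each token is masked independently."""
        return [MASK if rng.random() < RATE else t for t in tokens]

    def span_masking(tokens, rng, span_len=3):
        """Mask whole contiguous spans instead of isolated tokens (illustrative only)."""
        tokens = list(tokens)
        n_to_mask = max(1, int(RATE * len(tokens)))
        masked = 0
        while masked < n_to_mask:
            start = rng.randrange(0, max(1, len(tokens) - span_len))
            for i in range(start, min(start + span_len, len(tokens))):
                if tokens[i] != MASK:
                    tokens[i] = MASK
                    masked += 1
        return tokens

    rng = random.Random(0)
    sentence = "empire state building is in new york city".split()
    print(uniform_masking(sentence, rng))
    print(span_masking(sentence, rng))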
Conference Paper
Full-text available
In many settings, an effective way of evaluating objects of interest is to collect evaluations from dispersed individuals and to aggregate these evaluations together. Some examples are categorizing online content and evaluating student assignments via peer grading. For this data science problem, one challenge is to motivate participants to conduct...
Preprint
Full-text available
Instrumental variable methods provide a powerful approach to estimating causal effects in the presence of unobserved confounding. But a key challenge when applying them is the reliance on untestable "exclusion" assumptions that rule out any relationship between the instrument variable and the response that is not mediated by the treatment. In this...
Preprint
Full-text available
A recent body of work addresses safety constraints in explore-and-exploit systems. Such constraints arise where, for example, exploration is carried out by individuals whose welfare should be balanced with overall welfare. In this paper, we adopt a model inspired by recent work on a bandit-like setting for recommendations. We contribute to this lin...
Article
Full-text available
Strangely enough, it is possible to use machine learning models to predict the satisfiability status of hard SAT problems with accuracy considerably higher than random guessing. Existing methods have relied on extensive, manual feature engineering and computationally complex features (e.g., based on linear programming relaxations). We show for the...
Preprint
Full-text available
On-street parking is convenient, but has many disadvantages: on-street spots come at the expense of other road uses such as traffic lanes, transit lanes, bike lanes, or parklets; drivers looking for parking contribute substantially to traffic congestion and hence to greenhouse gas emissions; safety is reduced both due to the fact that drivers looki...
Article
In many settings, an effective way of evaluating objects of interest is to collect evaluations from dispersed individuals and to aggregate these evaluations together. Some examples are categorizing online content and evaluating student assignments via peer grading. For this data science problem, one challenge is to motivate participants to conduct...
Conference Paper
Full-text available
We consider the problem of a nonprofit organization ("center") that must divide resources among subsidiaries ("agents"), based on agents' reported demand forecasts, with the aim of maximizing social good (agents' valuations for the allocation minus any payments that are imposed on them). We investigate the impact of a common feature of the nonprofi...
Preprint
Full-text available
Peer grading systems make large courses more scalable, provide students with faster and more detailed feedback, and help students to learn by thinking critically about the work of others. A key obstacle to the broader adoption of peer grading is motivating students to provide accurate grades. To incentivize accurate grading, previous work has consi...
Chapter
Full-text available
Many different machine learning algorithms exist; taking into account each algorithm’s hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters. We show that this problem can be addressed by a fully automated appro...
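A toy sketch of the combined algorithm-selection-and-hyperparameter-optimization problem using random search over a tiny scikit-learn space; Auto-WEKA itself attacks this joint space with Bayesian optimization over WEKA learners, so the learners, grids, and budget below are purely illustrative.

    import random
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Joint search space: (learning algorithm, its hyperparameter grid).
    SEARCH_SPACE = [
        (RandomForestClassifier, {"n_estimators": [10, 50, 200], "max_depth": [3, 5, None]}),
        (LogisticRegression, {"C": [0.01, 0.1, 1.0, 10.0], "max_iter": [1000]}),
    ]

    def random_cash_search(X, y, budget=20, seed=0):
        """Random search over algorithms and hyperparameters jointly (baseline only)."""
        rng = random.Random(seed)
        best = (None, None, -float("inf"))
        for _ in range(budget):
            cls, grid = rng.choice(SEARCH_SPACE)
            params = {k: rng.choice(v) for k, v in grid.items()}
            score = cross_val_score(cls(**params), X, y, cv=3).mean()
            if score > best[2]:
                best = (cls.__name__, params, score)
        return best

    X, y = load_iris(return_X_y=True)
    print(random_cash_search(X, y))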
Preprint
Full-text available
Recommendation systems often face exploration-exploitation tradeoffs: the system can only learn about the desirability of new options by recommending them to some user. Such systems can thus be modeled as multi-armed bandit settings; however, users are self-interested and cannot be made to follow recommendations. We ask whether exploration can neve...
Article
Behavioral game theory seeks to describe the way actual people (as compared to idealized, "rational" agents) act in strategic situations. Our own recent work has identified iterative models, such as quantal cognitive hierarchy, as the state of the art for predicting human play in unrepeated, simultaneous-move games. Iterative models predict that ag...
Preprint
Full-text available
Algorithm configuration methods optimize the performance of a parameterized heuristic algorithm on a given distribution of problem instances. Recent work introduced an algorithm configuration procedure ('Structured Procrastination') that provably achieves near optimal performance with high probability and with nearly minimal runtime in the worst ca...
Article
Full-text available
In 2017, the U.S. Federal Communications Commission completed the world's first two-sided spectrum auction, reclaiming radio frequency spectrum from television broadcasters to meet exploding demand for mobile broadband, 5G, and other wireless services. Operations research, including a customized series of optimization models, decompositions, cuts, h...
Preprint
Full-text available
Research in multiagent systems often does not draw a clear distinction between strategic behavior and other forms of intentional (e.g., boundedly rational) action, which are broadly lumped under the term "nonstrategic". If we are already confident that all agents in a system are perfectly rational, as often envisioned by game theory, this issue may...
Conference Paper
Full-text available
Assessing the progress made in AI and contributions to the state of the art is of major concern to the community. Recently, Frechette et al. [2016] advocated performing such analysis via the Shapley value, a concept from coalitional game theory. In this paper, we argue that while this general idea is sound, it unfairly penalizes older algorithms th...
Article
Full-text available
Every day we read in the scientific and popular press about advances in AI and how AI is changing our lives. Things are moving at a fast pace, with no obvious end in sight.
Chapter
In recent years the availability of parallel computation resources has grown rapidly. Nevertheless, even for the most widely studied constraint programming problems such as SAT, solver development and applications remain largely focussed on sequential rather than parallel approaches. To ease the burden usually associated with designing, implementin...
Article
Full-text available
We use deep learning to model interactions across two or more sets of objects, such as user-movie ratings or protein-drug bindings. The canonical representation of such interactions is a matrix (or tensor) with an exchangeability property: the encoding's meaning is not changed by permuting rows or columns. We argue that models should hence be Permu...
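A minimal sketch of one permutation-equivariant layer in the spirit described here: each entry of the interaction matrix is mixed with its row mean, column mean, and the global mean, so permuting rows or columns of the input permutes the output identically. The weights and matrix below are hypothetical, and a real model would stack such layers with nonlinearities.

    import numpy as np

    def exchangeable_matrix_layer(X, w, b=0.0):
        """One permutation-equivariant layer for an interaction matrix X; w = (w1, w2, w3, w4)."""
        w1, w2, w3, w4 = w
        row_mean = X.mean(axis=1, keepdims=True)   # one value per row
        col_mean = X.mean(axis=0, keepdims=True)   # one value per column
        all_mean = X.mean()                        # a single global value
        return w1 * X + w2 * row_mean + w3 * col_mean + w4 * all_mean + b

    # Check equivariance on a hypothetical 3x4 ratings matrix.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(3, 4))
    perm = rng.permutation(3)
    out = exchangeable_matrix_layer(X, w=(1.0, 0.5, 0.5, 0.25))
    out_perm = exchangeable_matrix_layer(X[perm], w=(1.0, 0.5, 0.5, 0.25))
    print(np.allclose(out[perm], out_perm))  # True: permuting rows commutes with the layer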
Article
Full-text available
The optimization of algorithm (hyper-)parameters is crucial for achieving peak performance across a wide range of domains, ranging from deep neural networks to solvers for hard combinatorial problems. The resulting algorithm configuration (AC) problem has attracted much attention from the machine learning community. However, the proper evaluation o...
Conference Paper
Full-text available
Algorithm configuration methods have achieved much practical success, but to date have not been backed by meaningful performance guarantees. We address this gap with a new algorithm configuration framework, Structured Procrastination. With high probability and nearly as quickly as possible in the worst case, our framework finds an algorithm configu...
Article
Full-text available
The recent “incentive auction” of the US Federal Communications Commission was the first auction to reallocate radio frequencies between two different kinds of uses: from broadcast television to wireless Internet access. The design challenge was not just to choose market rules to govern a fixed set of potential trades but also to determine the bro...
Article
Full-text available
We investigate the economic outcomes that result under simulated bidder behavior in a model of the FCC's reverse auction for radio spectrum. In our simulations, limiting our notion of efficiency to the reverse auction in isolation, the reverse clock auction achieves very efficient solutions, the FCC's scoring rule greatly reduces the total payments...
Article
Full-text available
Over 13 months in 2016-17 the FCC conducted an "incentive auction" to repurpose radio spectrum from broadcast television to wireless internet. In the end, the auction yielded $19.8 billion, $10.05 billion of which was paid to 175 broadcasters for voluntarily relinquishing their licenses across 14 UHF channels. Stations that continued broadcasting w...
Preprint
Over 13 months in 2016-17 the FCC conducted an "incentive auction" to repurpose radio spectrum from broadcast television to wireless internet. In the end, the auction yielded $19.8 billion, $10.05 billion of which was paid to 175 broadcasters for voluntarily relinquishing their licenses across 14 UHF channels. Stations that continued broadcasting w...
Preprint
The optimization of algorithm (hyper-)parameters is crucial for achieving peak performance across a wide range of domains, ranging from deep neural networks to solvers for hard combinatorial problems. The resulting algorithm configuration (AC) problem has attracted much attention from the machine learning community. However, the proper evaluation o...
Article
After experimentation with other designs, major search engines converged on weighted, generalized second-price auctions (wGSPs) for selling keyword advertisements. Theoretical analysis is still not able to settle the question of why they found this design preferable to other alternatives. We approach this question in a new way, adopting an analytic...
Article
Full-text available
WEKA is a widely used, open-source machine learning platform. Due to its intuitive interface, it is particularly popular with novice users. However, such users often find it hard to identify the best approach for their particular dataset among the many available. We describe the new version of Auto-WEKA, a system designed to help such users by auto...
Article
Computational mechanism analysis is a recent approach to economic analysis in which a mechanism design setting is analyzed entirely by a computer. For games with non-trivial numbers of players and actions, the approach is only feasible when these games can be encoded compactly, e.g., as Action-Graph Games. Such encoding is currently a manual proces...
Article
In many real-world systems, strategic agents' decisions can be understood as complex - i.e., consisting of multiple sub-decisions - and hence can give rise to an exponential number of pure strategies. Examples include network congestion games, simultaneous auctions, and security games. However, agents' sets of strategies are often structured, allow...
Article
Full-text available
We are in the middle of a remarkable rise in the use and capability of artificial intelligence. Much of this growth has been fueled by the success of deep learning architectures: models that map from observables to outputs via multiple layers of latent representations. These deep learning algorithms are effective tools for unstructured prediction,...
Conference Paper
In many games, players’ decisions consist of multiple sub-decisions, and hence can give rise to an exponential number of pure strategies. However, this set of pure strategies is often structured, allowing it to be represented compactly, as in network congestion games, security games, and extensive form games. Reduction to the standard normal form g...
Article
Full-text available
Behavioral game theory seeks to describe the way actual people (as compared to idealized, "rational" agents) act in strategic situations. Our own recent work has identified iterative models (such as quantal cognitive hierarchy) as the state of the art for predicting human play in unrepeated, simultaneous-move games (Wright & Leyton-Brown 2012, 2016...
Article
Full-text available
In many settings, an effective way of evaluating objects of interest is to collect evaluations from dispersed individuals and to aggregate these evaluations together. Some examples are categorizing online content and evaluating student assignments via peer grading. For this data science problem, one challenge is to motivate participants to conduct...
Conference Paper
A natural way of attacking a new, computationally challenging problem is to find a novel way of combining design elements introduced in existing algorithms. For example, this approach was made systematic in SATenstein [15], a highly parameterized stochastic local search (SLS) framework for SAT that unifies techniques across a wide range of well-kno...
Article
Since 2004, increases in computational power described by Moore's law have substantially been realized in the form of additional cores rather than through faster clock speeds. To make effective use of modern hardware when solving hard computational problems, it is therefore necessary to employ parallel solution strategies. In this work, we demonstr...
Article
Algorithms for NP-complete problems often have different strengths and weaknesses, and thus algorithm portfolios often outperform individual algorithms. It is surprisingly difficult to quantify a component algorithm's contribution to such a portfolio. Reporting a component's standalone performance wrongly rewards near-clones while penalizing algorithms...
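A small exact Shapley-value sketch illustrating the near-clone point: a subset's value is taken here to be the (negated) mean of the per-instance best runtime in the subset, and contributions are averaged over all join orders. The runtimes are invented and the valuation is only one simple choice, not necessarily the paper's.

    from itertools import permutations

    def shapley_values(algorithms, runtimes):
        """Exact Shapley value of each algorithm's contribution to a portfolio."""
        def value(subset):
            if not subset:
                return 0.0
            n_inst = len(next(iter(runtimes.values())))
            # Portfolio performance: per-instance best runtime, negated so bigger is better.
            return -sum(min(runtimes[a][i] for a in subset) for i in range(n_inst)) / n_inst

        shapley = {a: 0.0 for a in algorithms}
        orders = list(permutations(algorithms))
        for order in orders:
            so_far = []
            for a in order:
                shapley[a] += value(so_far + [a]) - value(so_far)
                so_far.append(a)
        return {a: v / len(orders) for a, v in shapley.items()}

    # Hypothetical portfolio: a near-clone of "fast" adds little marginal value.
    runtimes = {"fast": [1, 9, 1], "fast_clone": [1, 9, 2], "complementary": [9, 1, 9]}
    print(shapley_values(list(runtimes), runtimes))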
Article
We investigate the problem of repacking stations in the FCC's upcoming, multi-billion-dollar "incentive auction". Early efforts to solve this problem considered mixed-integer programming formulations, which we show are unable to reliably solve realistic, national-scale problem instances. We describe the result of a multi-year investigation of alter...
Article
Full-text available
The task of algorithm selection involves choosing an algorithm from a set of algorithms on a per-instance basis in order to exploit the varying performance of algorithms over a set of instances. The algorithm selection problem is attracting increasing attention from researchers and practitioners in AI. Years of fruitful applications in a number of...
Article
Full-text available
It is well known that different solution strategies work well for different types of instances of hard combinatorial problems. As a consequence, most solvers for the propositional satisfiability problem (SAT) expose parameters that allow them to be customized to a particular family of instances. In the international SAT competition series, these pa...
Article
Full-text available
We describe Mechanical TA, an automated peer review system, and report on our experience using it over three years. Mechanical TA differs from many other peer review systems by involving human teaching assistants (TAs) as a way to assure review quality. Human TAs both evaluate the peer reviews of students who have not yet demonstrated reviewing pro...
Article
Hyperparameter optimization is crucial for achieving peak performance with many machine learning algorithms; however, the evaluation of new optimization techniques on real-world hyperparameter optimization problems can be very expensive. Therefore, experiments are often performed using cheap synthetic test functions with characteristics rather diff...
Article
We study two-sided matching markets in which participants are initially endowed with partial preference orderings, lacking precise information about their true, strictly ordered list of preferences. We wish to reason about matchings that are stable with respect to agents' true preferences, and which are furthermore optimal for one given side of the...
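For background only: the classical deferred-acceptance (Gale-Shapley) procedure that underlies such markets, run with complete, strict preferences. The paper's setting starts from partial preference orders refined by interviews, which this full-information baseline does not capture; the agents and preference lists below are hypothetical.

    def deferred_acceptance(proposer_prefs, reviewer_prefs):
        """Gale-Shapley deferred acceptance; returns a stable, proposer-optimal matching."""
        rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in reviewer_prefs.items()}
        next_choice = {p: 0 for p in proposer_prefs}
        match = {}                      # reviewer -> proposer
        free = list(proposer_prefs)
        while free:
            p = free.pop()
            r = proposer_prefs[p][next_choice[p]]
            next_choice[p] += 1
            if r not in match:
                match[r] = p
            elif rank[r][p] < rank[r][match[r]]:
                free.append(match[r])   # r prefers p; its old partner re-enters the pool
                match[r] = p
            else:
                free.append(p)          # r rejects p; p proposes further down its list
        return {p: r for r, p in match.items()}

    # Hypothetical two-sided market with three agents per side.
    workers = {"w1": ["f1", "f2", "f3"], "w2": ["f1", "f3", "f2"], "w3": ["f2", "f1", "f3"]}
    firms   = {"f1": ["w2", "w1", "w3"], "f2": ["w1", "w3", "w2"], "f3": ["w3", "w2", "w1"]}
    print(deferred_acceptance(workers, firms))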
Article
Behavioral game theory seeks to describe the way actual people (as compared to idealized, "rational" agents) act in strategic situations. Our own recent work has identified iterative models (such as quantal cognitive hierarchy) as the state of the art for predicting human play in unrepeated, simultaneous-move games [Wright and Leyton-Brown 2012]....
Article
Algorithmic Game Theory (AGT) studies problems at the interface between computer science and microeconomics. Research in this area typically attacks very general settings using theoretical tools. There are great advantages to such an approach: in particular, the field has amassed an impressive range of sweeping impossibility, optimality, and approx...
Conference Paper
We study two-sided matching markets in which participants are initially endowed with partial preference orderings, lacking precise information about their true, strictly ordered list of preferences. We wish to reason about matchings that are stable with respect to agents' true preferences, and which are furthermore optimal for one given side of the...
Article
State-of-the-art planners often exhibit substantial runtime variation, making it useful to be able to efficiently predict how long a given planner will take to run on a given instance. In other areas of AI, such needs are met by building so-called empirical performance models (EPMs), statistical models derived from sets of problem instances and per...
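A bare-bones sketch of the empirical performance model (EPM) idea: a regression model mapping instance features to (log) runtime. The features and runtimes below are synthetic, and real EPMs use rich, domain-specific instance features rather than random ones.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Hypothetical data: per-instance features and measured runtimes for one planner.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 5))          # stand-ins for instance size, structure, etc.
    log_runtime = features @ np.array([1.0, 0.5, 0.0, -0.3, 2.0]) + rng.normal(0, 0.1, 200)

    # An EPM is just a statistical model from instance features to (log) runtime.
    epm = RandomForestRegressor(n_estimators=100, random_state=0)
    epm.fit(features[:150], log_runtime[:150])
    print("predicted log-runtimes:", epm.predict(features[150:155]).round(2))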
Article
Full-text available
Problems are intractable when they "can be solved, but not fast enough for the solution to be usable" [13]. NP-complete problems are commonly said to be intractable; however, the reality is more complex. All known algorithms for solving NP-complete problems require exponential time in the worst case; however, these algorithms nevertheless solve many p...
Article
Full-text available
This article makes two contributions. First, we present a platform for running and analyzing multiagent reinforcement learning experiments. Second, to demonstrate this platform we undertook and evaluated an empirical test of multiagent reinforcement learning algorithms from the literature, which to our knowledge is the largest such test ever conduc...
Conference Paper
Full-text available
Configuring algorithms automatically to achieve high performance is becoming increasingly relevant and important in many areas of academia and industry. Algorithm configuration methods take a parameterized target algorithm, a performance metric and a set of example data, and aim to find a parameter configuration that performs as well as possible on...
Conference Paper
Full-text available
Modern solvers for hard computational problems often expose parameters that permit customization for high performance on specific instance types. Since it is tedious and time-consuming to manually optimize such highly parameterized algorithms, recent work in the AI literature has developed automated approaches for this algorithm configuration probl...
Article
Full-text available
There exist many algorithms for learning how to play repeated bimatrix games. Most of these algorithms are justified in terms of some sort of theoretical guarantee. On the other hand, little is known about the empirical performance of these algorithms. Most such claims in the literature have been based on small experiments, which has hampered unders...
