Stuart Russell

University of California, Berkeley, Berkeley, California, United States

Publications (80) · 12.61 Total Impact

  • Stuart Russell
    ABSTRACT: Stuart Russell shares his views on some of the latest developments in unifying logic and probability. He states that open-universe probability models show merit in such unifying efforts. The key benefit of first-order logic is its expressive power, which leads to concise and learnable models. New languages for defining open-universe probability models appear to provide the desired unification in a natural way. They support probabilistic reasoning about the existence and identity of objects, which is important for any system trying to understand the world through perceptual or textual inputs.
    No preview · Article · Jul 2015 · Communications of the ACM
  • Sharad Vikram · Lei Li · Stuart Russell
    ABSTRACT: Recent technologies in vision sensors are capable of capturing 3D finger positions and movements. We propose a novel way to control and interact with computers by moving fingers in the air. The positions of fingers are precisely captured by a computer vision device. By tracking the moving patterns of fingers, we can then recognize users' intended control commands or input information. We demonstrate this human input approach through an example application of handwriting recognition. By treating the input as a time series of 3D positions, we propose a fast algorithm using dynamic time warping to recognize characters in online fashion. We employ various optimization techniques to recognize in real time as one writes. Experiments show promising recognition performance and speed.
    No preview · Conference Paper · Apr 2013
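The recognizer above is described only at a high level; a minimal sketch of the underlying dynamic-time-warping comparison on 3D finger trajectories, with a hypothetical `classify` helper for nearest-template matching (not the paper's actual implementation), might look like this:

```python
import math

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two sequences of 3D points."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # Euclidean distance in 3D
            # extend the cheapest of the three admissible alignments
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify(stroke, templates):
    """Return the label of the template stroke closest under DTW."""
    return min(templates, key=lambda label: dtw_distance(stroke, templates[label]))
```

Because DTW aligns sequences nonlinearly in time, the same character written faster or slower still matches its template closely.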
  • Nimar S. Arora · Stuart Russell · Erik Sudderth
    ABSTRACT: The automated processing of multiple seismic signals to detect and localize seismic events is a central tool in both geophysics and nuclear treaty verification. This paper reports on a project, begun in 2009, to reformulate this problem in a Bayesian framework. A Bayesian seismic monitoring system, NET-VISA, has been built comprising a spatial event prior and generative models of event transmission and detection, as well as an inference algorithm. The probabilistic model allows for seamless integration of various disparate sources of information. Applied in the context of the International Monitoring System (IMS), a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT), NET-VISA achieves a reduction of around 60% in the number of missed events compared to the currently deployed system. It also finds events that are missed by the human analysts who post-process the IMS output.
    No preview · Article · Apr 2013 · Bulletin of the Seismological Society of America
  • Source
    Nir Friedman · Stuart Russell
    ABSTRACT: "Background subtraction" is an old technique for finding moving objects in a video sequence, for example, cars driving on a freeway. The idea is that subtracting the current image from a time-averaged background image will leave only nonstationary objects. It is, however, a crude approximation to the task of classifying each pixel of the current image; it fails with slow-moving objects and does not distinguish shadows from moving objects. The basic idea of this paper is that we can classify each pixel using a model of how that pixel looks when it is part of different classes. We learn a mixture-of-Gaussians classification model for each pixel using an unsupervised technique: an efficient, incremental version of EM. Unlike the standard image-averaging approach, this automatically updates the mixture component for each class according to likelihood of membership; hence slow-moving objects are handled perfectly. Our approach also identifies and eliminates shadows much more effectively than other techniques such as thresholding. Application of this method as part of the Roadwatch traffic surveillance project is expected to result in significant improvements in vehicle identification and tracking.
    Preview · Article · Feb 2013
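The per-pixel classifier can be sketched as an online mixture-of-Gaussians update. This is an illustrative single-intensity-channel version with assumed learning-rate updates, not the paper's exact incremental EM:

```python
import math

class PixelMixture:
    """Per-pixel mixture of Gaussians, updated online (illustrative sketch;
    the paper's incremental EM differs in its details)."""

    def __init__(self, means, var=25.0, lr=0.05):
        self.means = list(means)                        # one mean per class
        self.vars = [var] * len(means)
        self.weights = [1.0 / len(means)] * len(means)
        self.lr = lr

    def responsibilities(self, x):
        """Soft class membership of intensity x under the current mixture."""
        dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(self.weights, self.means, self.vars)]
        z = sum(dens) or 1e-12
        return [d / z for d in dens]

    def update(self, x):
        """One incremental step: soft-assign x, then move each component
        in proportion to its likelihood of membership."""
        r = self.responsibilities(x)
        for k, rk in enumerate(r):
            self.weights[k] += self.lr * (rk - self.weights[k])
            eta = self.lr * rk
            self.means[k] += eta * (x - self.means[k])
            self.vars[k] += eta * ((x - self.means[k]) ** 2 - self.vars[k])
        return max(range(len(r)), key=r.__getitem__)    # most likely class index
```

Because each component moves only in proportion to its membership likelihood, a slow-moving object keeps being assigned to the foreground component instead of being averaged into the background.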
  • Source
    Nir Friedman · Kevin Murphy · Stuart Russell
    ABSTRACT: Dynamic probabilistic networks are a compact representation of complex stochastic processes. In this paper we examine how to learn the structure of a DPN from data. We extend structure scoring rules for standard probabilistic networks to the dynamic case, and show how to search for structure when some of the variables are hidden. Finally, we examine two applications where such a technology might be useful: predicting and classifying dynamic behaviors, and learning causal orderings in biological processes. We provide empirical results that demonstrate the applicability of our methods in both domains.
    Full-text · Article · Jan 2013
  • Source
    Arnaud Doucet · Nando de Freitas · Kevin Murphy · Stuart Russell
    ABSTRACT: Particle filters (PFs) are powerful sampling-based inference/learning algorithms for dynamic Bayesian networks (DBNs). They allow us to treat, in a principled way, any type of probability distribution, nonlinearity and non-stationarity. They have appeared in several fields under such names as "condensation", "sequential Monte Carlo" and "survival of the fittest". In this paper, we show how we can exploit the structure of the DBN to increase the efficiency of particle filtering, using a technique known as Rao-Blackwellisation. Essentially, this samples some of the variables, and marginalizes out the rest exactly, using the Kalman filter, HMM filter, junction tree algorithm, or any other finite dimensional optimal filter. We show that Rao-Blackwellised particle filters (RBPFs) lead to more accurate estimates than standard PFs. We demonstrate RBPFs on two problems, namely non-stationary online regression with radial basis function networks and robot localization and map building. We also discuss other potential application areas and provide references to some finite dimensional optimal filters.
    Full-text · Article · Jan 2013
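Rao-Blackwellisation is easiest to see on a toy switching linear-Gaussian model (a hypothetical example, not one from the paper): each particle samples only the discrete regime, while the continuous state is marginalized exactly by a one-dimensional Kalman filter.

```python
import math
import random

def rbpf(ys, trans, drift, q=0.1, r=0.5, n_particles=200, seed=0):
    """Rao-Blackwellised particle filter sketch for a switching model:
    x_t = x_{t-1} + drift[d_t] + N(0, q),  y_t = x_t + N(0, r),
    where the regime d_t follows the transition matrix `trans`.
    Returns the posterior mean estimate of x_T."""
    rng = random.Random(seed)
    # each particle: [regime, Kalman mean, Kalman variance]
    parts = [[rng.randrange(len(drift)), 0.0, 1.0] for _ in range(n_particles)]
    for y in ys:
        weights = []
        for p in parts:
            d = rng.choices(range(len(drift)), weights=trans[p[0]])[0]  # sample regime
            m_pred, v_pred = p[1] + drift[d], p[2] + q                  # Kalman predict
            s = v_pred + r                                              # innovation variance
            # particle weight = exact marginal likelihood of y under this regime path
            w = math.exp(-(y - m_pred) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)
            k = v_pred / s                                              # Kalman gain
            p[0], p[1], p[2] = d, m_pred + k * (y - m_pred), (1 - k) * v_pred
            weights.append(w)
        # multinomial resampling
        parts = [list(rng.choices(parts, weights=weights)[0]) for _ in range(n_particles)]
    return sum(p[1] for p in parts) / n_particles
```

Only the regime is sampled; the continuous posterior per particle is carried analytically, which is exactly the variance reduction the paper exploits.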
  • Source
    ABSTRACT: We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a block Metropolis-Hastings (MH) kernel with a variational approximation as proposal distribution. The MH kernel allows one to locate regions of high probability efficiently. The Metropolis kernel allows us to explore the vicinity of these regions. This algorithm outperforms variational approximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates the regions of high probability quickly, thus speeding up convergence. We demonstrate this algorithm on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.
    Full-text · Article · Jan 2013
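The two-kernel mixture can be sketched on a one-dimensional toy target (an illustration of the kernel structure, not the paper's belief-network setting): with some probability take an independence MH step from a fixed "variational" Gaussian proposal, otherwise take a random-walk Metropolis step.

```python
import math
import random

def log_target(x):
    # toy unnormalized 1D target, symmetric about zero
    return -abs(x) ** 1.5

def mixture_mh(steps=5000, p_indep=0.3, seed=1):
    """Mixture of two MCMC kernels: an independence MH kernel with a fixed
    N(0, 1) 'variational' proposal (finds high-probability regions) and a
    random-walk Metropolis kernel (explores their vicinity)."""
    rng = random.Random(seed)

    def log_q(x):  # log density of the variational proposal N(0, 1)
        return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

    x, samples = 0.0, []
    for _ in range(steps):
        if rng.random() < p_indep:
            y = rng.gauss(0.0, 1.0)                    # independence proposal
            # MH ratio for an independence proposal: p(y) q(x) / (p(x) q(y))
            log_a = (log_target(y) + log_q(x)) - (log_target(x) + log_q(y))
        else:
            y = x + rng.gauss(0.0, 0.5)                # random-walk proposal
            log_a = log_target(y) - log_target(x)      # symmetric, q terms cancel
        if math.log(rng.random() + 1e-300) < log_a:
            x = y
        samples.append(x)
    return samples
```

Because the independence kernel can jump directly to the mode while the random-walk kernel fills in local detail, the mixture corrects the variational proposal's underestimated variance.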
  • Source
    Bhaskara Marthi · Hanna Pasula · Stuart Russell · Yuval Peres
    ABSTRACT: Filtering---estimating the state of a partially observable Markov process from a sequence of observations---is one of the most widely studied problems in control theory, AI, and computational statistics. Exact computation of the posterior distribution is generally intractable for large discrete systems and for nonlinear continuous systems, so a good deal of effort has gone into developing robust approximation algorithms. This paper describes a simple stochastic approximation algorithm for filtering called decayed MCMC. The algorithm applies Markov chain Monte Carlo sampling to the space of state trajectories using a proposal distribution that favours flips of more recent state variables. The formal analysis of the algorithm involves a generalization of standard coupling arguments for MCMC convergence. We prove that for any ergodic underlying Markov process, the convergence time of decayed MCMC with inverse-polynomial decay remains bounded as the length of the observation sequence grows. We show experimentally that decayed MCMC is at least competitive with other approximation algorithms such as particle filtering.
    Full-text · Article · Dec 2012
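The decayed proposal can be sketched on a binary-state HMM (a simplified illustration of the idea, not the paper's experimental setup): MCMC runs over whole trajectories, and the position to flip is chosen with probability proportional to an inverse-polynomial decay that favours recent time steps.

```python
import math
import random

def decayed_mcmc_filter(obs, p_stay=0.9, p_correct=0.8, steps=5000, seed=2):
    """Decayed-MCMC sketch for a binary HMM: estimate P(X_T = 1 | obs) by
    MCMC over trajectories, flipping position t with probability
    proportional to 1 / (T - t)^2 (more recent states flipped more often)."""
    rng = random.Random(seed)
    T = len(obs)
    decay = [1.0 / (T - t) ** 2 for t in range(T)]     # inverse-polynomial decay

    def log_joint(x):
        lp = 0.0
        for t in range(T):
            if t > 0:  # transition term
                lp += math.log(p_stay if x[t] == x[t - 1] else 1 - p_stay)
            lp += math.log(p_correct if x[t] == obs[t] else 1 - p_correct)
        return lp

    x = list(obs)                                      # initialize at the observations
    count_last = 0
    for _ in range(steps):
        t = rng.choices(range(T), weights=decay)[0]    # pick a position to flip
        y = list(x)
        y[t] = 1 - y[t]
        if math.log(rng.random() + 1e-300) < log_joint(y) - log_joint(x):
            x = y
        count_last += x[-1]
    return count_last / steps
```

Because the decay concentrates proposals on recent variables, the per-step work needed for the current-state marginal does not grow with the trajectory length, matching the paper's bounded-convergence-time result in spirit.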
  • Source
    Eric P. Xing · Michael I. Jordan · Stuart Russell
    ABSTRACT: Mean field methods, which approximate intractable probability distributions variationally with distributions from a tractable family, enjoy high efficiency and guaranteed convergence, and provide lower bounds on the true likelihood. However, because they require model-specific derivation of the optimization equations and offer unclear inference quality across models, they are not widely used as generic approximate inference algorithms. In this paper, we discuss a generalized mean field theory for variational approximation of a broad class of intractable distributions by a rich set of tractable distributions, via constrained optimization over distribution spaces. We present a class of generalized mean field (GMF) algorithms for approximate inference in complex exponential family models, which limits the optimization to the class of cluster-factorizable distributions. GMF is a generic method requiring no model-specific derivations. It factors a complex model into a set of disjoint variable clusters and uses a set of canonical fixed-point equations to iteratively update the cluster distributions, converging to locally optimal cluster marginals that preserve the original dependency structure within each cluster and thereby fully decomposing the overall inference problem. We empirically analyze the effect of different tractable families (clusters of different granularity) on inference quality, and compare GMF with belief propagation (BP) on several canonical models. A possible extension to higher-order mean field approximation is also discussed.
    Full-text · Article · Oct 2012
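The fixed-point flavour of GMF is easiest to see in its simplest special case, naive mean field on a small Ising model, where every cluster is a single variable (an illustrative reduction; the paper's clusters are larger and preserve intra-cluster structure):

```python
import math

def mean_field_ising(J, h, iters=100):
    """Naive mean-field fixed-point iteration for an Ising model
    p(s) ~ exp(sum_ij J[i][j] s_i s_j + sum_i h[i] s_i), s_i in {-1, +1}.
    Approximates p by a fully factorized q with marginals m_i = E_q[s_i],
    updated by m_i = tanh(h_i + sum_j J[i][j] m_j)."""
    n = len(h)
    m = [0.01] * n                         # small nonzero start to break symmetry
    for _ in range(iters):
        for i in range(n):
            # each variable sees its neighbours only through their current means
            m[i] = math.tanh(h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i))
    return m
```

GMF replaces these singleton updates with cluster-level updates of the same fixed-point form, so each cluster sees its neighbours only through their current marginals.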
  • Shaunak Chatterjee · Stuart Russell
    ABSTRACT: Standard temporal models assume that observation times are correct, whereas in many real-world settings (particularly those involving human data entry) noisy time stamps are quite common. Serious problems arise when these time stamps are taken literally. This paper introduces a modeling framework for handling uncertainty in observation times and describes inference algorithms that, under certain reasonable assumptions about the nature of time-stamp errors, have linear time complexity.
    No preview · Chapter · Sep 2012
  • Source
    ABSTRACT: Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive conjecture that an optimal policy will necessarily reach a decision in all cases. We then derive heuristic approximations in both Bayesian and distribution-free settings and demonstrate their superiority to bandit-based heuristics in one-shot decision problems and in Go.
    Preview · Article · Jul 2012
  • Source
    Eric P. Xing · Michael I. Jordan · Stuart Russell
    ABSTRACT: An autonomous variational inference algorithm for arbitrary graphical models requires the ability to optimize variational approximations over the space of model parameters as well as over the choice of tractable families used for the variational approximation. In this paper, we present a novel combination of graph partitioning algorithms with a generalized mean field (GMF) inference algorithm. This combination optimizes over disjoint clustering of variables and performs inference using those clusters. We provide a formal analysis of the relationship between the graph cut and the GMF approximation, and explore several graph partition strategies empirically. Our empirical results provide rather clear support for a weighted version of MinCut as a useful clustering algorithm for GMF inference, which is consistent with the implications from the formal analysis.
    Full-text · Article · Jul 2012
  • Source
    Bhaskara Marthi · Stuart Russell · David Andre
    ABSTRACT: Previous work in hierarchical reinforcement learning has faced a dilemma: either ignore the values of different possible exit states from a subroutine, thereby risking suboptimal behavior, or represent those values explicitly thereby incurring a possibly large representation cost because exit values refer to nonlocal aspects of the world (i.e., all subsequent rewards). This paper shows that, in many cases, one can avoid both of these problems. The solution is based on recursively decomposing the exit value function in terms of Q-functions at higher levels of the hierarchy. This leads to an intuitively appealing runtime architecture in which a parent subroutine passes to its child a value function on the exit states and the child reasons about how its choices affect the exit value. We also identify structural conditions on the value function and transition distributions that allow much more concise representations of exit state distributions, leading to further state abstraction. In essence, the only variables whose exit values need be considered are those that the parent cares about and the child affects. We demonstrate the utility of our algorithms on a series of increasingly complex environments.
    Preview · Article · Jun 2012
  • Source
    Brian Milch · Stuart Russell
    ABSTRACT: Tasks such as record linkage and multi-target tracking, which involve reconstructing the set of objects that underlie some observed data, are particularly challenging for probabilistic inference. Recent work has achieved efficient and accurate inference on such problems using Markov chain Monte Carlo (MCMC) techniques with customized proposal distributions. Currently, implementing such a system requires coding MCMC state representations and acceptance probability calculations that are specific to a particular application. An alternative approach, which we pursue in this paper, is to use a general-purpose probabilistic modeling language (such as BLOG) and a generic Metropolis-Hastings MCMC algorithm that supports user-supplied proposal distributions. Our algorithm gains flexibility by using MCMC states that are only partial descriptions of possible worlds; we provide conditions under which MCMC over partial worlds yields correct answers to queries. We also show how to use a context-specific Bayes net to identify the factors in the acceptance probability that need to be computed for a given proposed move. Experimental results on a citation matching task show that our general-purpose MCMC engine compares favorably with an application-specific system.
    Preview · Article · Jun 2012
  • Source
    Shaunak Chatterjee · Stuart Russell
    ABSTRACT: Hierarchical problem abstraction, when applicable, may offer exponential reductions in computational complexity. Previous work on coarse-to-fine dynamic programming (CFDP) has demonstrated this possibility using state abstraction to speed up the Viterbi algorithm. In this paper, we show how to apply temporal abstraction to the Viterbi problem. Our algorithm uses bounds derived from analysis of coarse timescales to prune large parts of the state trellis at finer timescales. We demonstrate improvements of several orders of magnitude over the standard Viterbi algorithm, as well as significant speedups over CFDP, for problems whose state variables evolve at widely differing rates.
    Preview · Article · May 2012
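The baseline this paper accelerates is the standard Viterbi algorithm over the full state trellis. A compact dictionary-based sketch of that baseline (toy parameter names, not from the paper) shows the dynamic program whose trellis the coarse-to-fine bounds prune:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Standard Viterbi: most likely state sequence for a discrete HMM,
    by dynamic programming over the full state trellis."""
    # each trellis cell holds (best probability so far, best path so far)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        layer = {}
        for s in states:
            prob, path = max(
                (V[-1][ps][0] * trans_p[ps][s] * emit_p[s][o], V[-1][ps][1] + [s])
                for ps in states)
            layer[s] = (prob, path)
        V.append(layer)
    return max(V[-1].values())[1]   # path of the highest-probability final cell
```

The cost is O(T |S|^2); the paper's temporal abstraction uses coarse-timescale bounds to avoid expanding most of these T |S| trellis cells when state variables change at very different rates.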
  • Source
    ABSTRACT: Languages for open-universe probabilistic models (OUPMs) can represent situations with an unknown number of objects and identity uncertainty. While such cases arise in a wide range of important real-world applications, existing general-purpose inference methods for OUPMs are far less efficient than those available for more restricted languages and model classes. This paper goes some way to remedying this deficit by introducing, and proving correct, a generalization of Gibbs sampling to partial worlds with possibly varying model structure. Our approach draws on and extends previous generic OUPM inference methods, as well as auxiliary variable samplers for nonparametric mixture models. It has been implemented for BLOG, a well-known OUPM language. Combined with compile-time optimizations, the resulting algorithm yields very substantial speedups over existing methods on several test cases, and substantially improves the practicality of OUPM languages generally.
    Full-text · Article · Mar 2012
  • Source
    Nimar S. Arora · Stuart Russell · Paul Kidwell · Erik B. Sudderth
    ABSTRACT: The automated processing of multiple seismic signals to detect and localize seismic events is a central tool in both geophysics and nuclear treaty verification. This paper reports on a project, begun in 2009, to reformulate this problem in a Bayesian framework. A Bayesian seismic monitoring system, NET-VISA, has been built comprising a spatial event prior and generative models of event transmission and detection, as well as an inference algorithm. Applied in the context of the International Monitoring System (IMS), a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT), NET-VISA achieves a reduction of around 60% in the number of missed events compared to the currently deployed system. It also finds events that are missed even by the human analysts who post-process the IMS output.
    Full-text · Conference Paper · Jan 2011
  • Source
    Shaunak Chatterjee · Stuart Russell

    Preview · Conference Paper · Jan 2011
  • Source
    Ronan J Le Bras · Stuart Russell · Nimar Arora · Vera Miljanovic
    ABSTRACT: Since 2009, an initiative to investigate the potential of machine learning methods to improve automatic data processing at the CTBTO, and in particular the recall and accuracy of the automatic bulletins, has begun to bear fruit beyond the research stage. It has entered the domain of development and testing, with the goal of operational testing for one of the projects (FEI) by the end of 2011. The prospect for FEI is that the tool will support analysts in their decision-making when they judge whether a (mostly smaller) event is real or false; it is thus an enhancement of the current analysis system. The VISA projects are more ambitious and aim at replacing key components of the processing system. The first-generation prototype, which aims at replacing the current automatic association tool (GA), is being evaluated on the vDEC collaborative platform of the CTBTO. Results show much improved accuracy using VISA as compared to the SEL3 for the same recall value, or much improved recall using VISA as compared to the SEL3 for the same processing accuracy. A consequence is a significant decrease in either the number of false alarms or the number of missed events, depending on the setting of the processing parameters. OBJECTIVES: The objective of this project is to evaluate the applications of machine learning techniques to the processing of waveform data at the IDC of the CTBTO in a quasi-operational environment. The ISS09 project initiated by the
    Full-text · Article · Jan 2011
  • Source
    Nimar S. Arora · Stuart Russell · Paul Kidwell · Erik B. Sudderth
    ABSTRACT: The International Monitoring System (IMS) is a global network of sensors whose purpose is to identify potential violations of the Comprehensive Nuclear-Test-Ban Treaty (CTBT), primarily through detection and localization of seismic events. We report on the first stage of a project to improve on the current automated software system with a Bayesian inference system that computes the most likely global event history given the record of local sensor data. The new system, VISA (Vertically Integrated Seismological Analysis), is based on empirically calibrated, generative models of event occurrence, signal propagation, and signal detection. VISA exhibits significantly improved precision and recall compared to the current operational system and is able to detect events that are missed even by the human analysts who post-process the IMS output.
    Full-text · Conference Paper · Dec 2010

Publication Stats

6k Citations
12.61 Total Impact Points

Institutions

  • 1970-2015
    • University of California, Berkeley
      • Computer Science Division
      • Department of Electrical Engineering and Computer Sciences
      Berkeley, California, United States
  • 2010
    • Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO)
      Vienna, Austria