Florian Wenzel
Google Brain Berlin

MS

About

24 Publications
2,784 Reads
151 Citations (since 2017; 21 research items)
[Chart: citations per year, 2017–2023]
Introduction
I'm a postdoctoral researcher at Google Brain Berlin (powered by Adecco). I'm interested in Bayesian deep learning, approximate inference and probabilistic models.

Publications (24)
Preprint
Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, lead to strong performance. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that these two approaches have complementary features...
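For intuition, here is a minimal NumPy sketch of a sparse MoE layer with top-k gating, one of the two model classes mentioned above. It is purely illustrative: the layer sizes, gating scheme, and names are made up, and the paper's combination of sparse MoEs with ensembles is not reproduced.

```python
# Minimal sketch (not the paper's model): a single sparse MoE layer with
# top-k gating, written in NumPy for illustration only.
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 8, 4          # feature sizes (illustrative)
num_experts, k = 4, 2       # route each input to its top-k experts

# One linear map per expert; the gating network is linear as well.
W_experts = rng.normal(size=(num_experts, d_in, d_out)) * 0.1
W_gate = rng.normal(size=(d_in, num_experts)) * 0.1

def sparse_moe(x):
    """x: (batch, d_in) -> (batch, d_out), mixing only the top-k experts."""
    logits = x @ W_gate                          # (batch, num_experts)
    top = np.argsort(logits, axis=1)[:, -k:]     # indices of the k largest gates
    out = np.zeros((x.shape[0], d_out))
    for i in range(x.shape[0]):
        sel = top[i]
        gate = np.exp(logits[i, sel] - logits[i, sel].max())
        gate /= gate.sum()                       # softmax over the selected experts
        for g, e in zip(gate, sel):
            out[i] += g * (x[i] @ W_experts[e])
    return out

x = rng.normal(size=(3, d_in))
print(sparse_moe(x).shape)   # (3, 4)
```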
Preprint
Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distr...
Preprint
High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often l...
Preprint
Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to...
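A minimal sketch of the ensembling idea, assuming a toy dataset and scikit-learn's MLPClassifier: members differ both in random seed and in one hyperparameter (the L2 strength alpha), and their predicted probabilities are averaged. The paper's actual hyper-deep-ensemble construction (candidate generation and greedy member selection) is not shown.

```python
# Minimal sketch: average predictions over members trained with different
# random seeds *and* different hyperparameters (here, L2 strength `alpha`).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

members = []
for alpha in (1e-4, 1e-3, 1e-2):          # hyperparameter axis
    for seed in (0, 1, 2):                 # random-initialization axis
        clf = MLPClassifier(hidden_layer_sizes=(32,), alpha=alpha,
                            max_iter=2000, random_state=seed)
        members.append(clf.fit(X, y))

# The ensemble prediction is the mean of the members' predicted probabilities.
probs = np.mean([m.predict_proba(X) for m in members], axis=0)
print("ensemble accuracy:", (probs.argmax(axis=1) == y).mean())
```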
Preprint
We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a...
Preprint
During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early...
Conference Paper
Full-text available
We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conju...
Preprint
We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conju...
Conference Paper
Full-text available
Normalizing flows provide a general approach to construct flexible variational posteriors. The parameters are learned by stochastic optimization of the variational bound, but inference can be slow due to high variance of the gradient estimator. We propose Quasi-Monte Carlo (QMC) flows which reduce the variance of the gradient estimator by one order...
Conference Paper
Full-text available
We present AugmentedGaussianProcesses.jl, a software package for augmented stochastic variational inference (ASVI) for Gaussian process models with non-conjugate likelihood functions. The idea of ASVI is to find an augmentation of the original GP model which renders the model conditionally conjugate and perform inference in the augmented model. We...
Preprint
Full-text available
Many machine learning problems involve Monte Carlo gradient estimators. As a prominent example, we focus on Monte Carlo variational inference (MCVI) in this paper. The performance of MCVI crucially depends on the variance of its stochastic gradients. We propose variance reduction by means of Quasi-Monte Carlo (QMC) sampling. QMC replaces N i.i.d. s...
Conference Paper
Full-text available
Many machine learning problems involve Monte Carlo gradient estimators. As a prominent example, we focus on Monte Carlo variational inference (MCVI) in this paper. The performance of MCVI crucially depends on the variance of its stochastic gradients. We propose variance reduction by means of Quasi-Monte Carlo (QMC) sampling. QMC replaces N i.i.d....
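The variance-reduction idea behind the two QMC entries above can be sketched in a few lines: drive a reparameterization gradient estimator with points from a scrambled Sobol sequence (mapped through the normal inverse CDF) instead of i.i.d. Gaussian samples. The toy objective and sample sizes below are made up; the full QMCVI algorithm and its theory are not reproduced.

```python
# Minimal sketch of the variance-reduction idea: feed the reparameterization
# gradient estimator with (scrambled) Sobol points instead of i.i.d. normals.
# Toy setting: d/dmu E_{z ~ N(mu, 1)}[z**2] = 2*mu, estimated from N samples.
import numpy as np
from scipy.stats import norm, qmc

mu, N, repeats = 1.5, 64, 200
rng = np.random.default_rng(0)

def grad_estimate(eps):
    z = mu + eps                    # reparameterization: z = mu + sigma * eps
    return np.mean(2.0 * z)         # d/dmu of f(z) = z**2 under the samples

mc = [grad_estimate(rng.standard_normal(N)) for _ in range(repeats)]

qmc_grads = []
for r in range(repeats):
    sobol = qmc.Sobol(d=1, scramble=True, seed=r)
    eps = norm.ppf(sobol.random(N)).ravel()   # quasi-random normal samples
    qmc_grads.append(grad_estimate(eps))

print("true gradient:", 2 * mu)
print("MC  estimator variance:", np.var(mc))
print("QMC estimator variance:", np.var(qmc_grads))   # typically much smaller
```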
Article
Full-text available
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular...
Conference Paper
Full-text available
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular...
Article
Full-text available
We propose an efficient stochastic variational approach to GP classification building on Pólya-Gamma data augmentation and inducing points, which is based on closed-form updates of natural gradients. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to three orders of magnitude f...
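A minimal sketch of the underlying augmentation, assuming the simpler setting of Bayesian logistic regression rather than the paper's sparse GP model: Pólya-Gamma auxiliary variables make the model conditionally conjugate, so coordinate-ascent variational inference reduces to closed-form Gaussian updates. The inducing points and natural-gradient machinery of the paper are not reproduced.

```python
# Minimal sketch: Polya-Gamma-augmented coordinate-ascent VI for Bayesian
# logistic regression with a N(0, s0*I) prior on the weights.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

s0 = 10.0                          # prior variance of the weights
kappa = y - 0.5                    # from the PG identity: kappa_i = y_i - 1/2
m = np.zeros(d)                    # variational posterior q(w) = N(m, V)
V = np.eye(d)

for _ in range(50):
    # q(omega_i) = PG(1, c_i) with c_i^2 = E_q[(x_i^T w)^2]
    c = np.sqrt(np.einsum("ij,jk,ik->i", X, V + np.outer(m, m), X))
    E_omega = np.tanh(c / 2.0) / (2.0 * c)          # mean of PG(1, c)
    # Closed-form Gaussian update for q(w)
    V = np.linalg.inv(X.T @ (E_omega[:, None] * X) + np.eye(d) / s0)
    m = V @ (X.T @ kappa)

print("posterior mean of w:", m)   # should roughly recover w_true
```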
Conference Paper
Full-text available
This paper proposes a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. This form of likelihood allows for a latent variable augmentation that leads to a conditionally conjugate model and enables efficient variational inference via block coordinate ascent updates. Our experim...
Conference Paper
Full-text available
We propose an efficient stochastic variational approach to Gaussian Process (GP) classification building on Pólya-Gamma data augmentation and inducing points, which is based on closed-form updates of natural gradients. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two order...
Conference Paper
Full-text available
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that topics change continuously over time and therefore impose continuous stochastic process priors on their model parameters. In this paper, we extend the class of tractable priors from Wiener processes to...
Article
Full-text available
Linear mixed models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow one to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for lin...
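As a loose illustration only (this is not the paper's LMM): the two ingredients of the setup, correcting for confounders and selecting a sparse set of features, can be mimicked by regressing the confounders out of both the phenotype and the features and then fitting an L1-penalized model. The data, confounders, and effect sizes below are synthetic; an LMM instead models confounding via random effects.

```python
# Minimal sketch (not the paper's method): residualize confounders, then Lasso.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(0)
n, p = 300, 50
C = rng.normal(size=(n, 2))                     # confounders (e.g. age, ancestry)
X = rng.normal(size=(n, p)) + C @ rng.normal(size=(2, p)) * 0.5
beta = np.zeros(p); beta[[3, 17, 29]] = [2.0, -1.5, 1.0]   # sparse true effects
y = X @ beta + C @ np.array([1.0, -1.0]) + rng.normal(size=n) * 0.5

def residualize(A, C):
    """Remove the part of A explained by the confounders C."""
    return A - LinearRegression().fit(C, A).predict(C)

lasso = LassoCV(cv=5).fit(residualize(X, C), residualize(y, C))
print("selected features:", np.flatnonzero(np.abs(lasso.coef_) > 1e-3))
```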
Article
Full-text available
We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors s...
Conference Paper
Full-text available
We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors s...
Article
Full-text available
We develop a variational inference (VI) scheme for the recently proposed Bayesian kernel support vector machine (SVM) and a stochastic version (SVI) for the linear Bayesian SVM. We compute the SVM's posterior, paving the way to apply attractive Bayesian techniques, as we exemplify in our experiments by means of automated model selection.
Research
Full-text available
Previous work on inference for dynamic mixture models has so far focused on models that follow a simple Brownian-motion diffusion over time and has pursued a batch inference approach. We generalize the underlying dynamics model to follow a Gaussian process, introducing a novel class of dynamic priors for mixture models. Further, we propose a stoc...
Article
Full-text available
A large class of problems in statistical genetics amounts to finding a sparse linear effect in a binary classification setup, such as finding a small set of genes that most strongly predict a disease. Very often, these signals are spurious and obfuscated by confounders such as age, ethnicity or population structure. In the probit regression model,...

Projects (5)
Project
Develop low-variance gradient estimators to speed up stochastic optimization. In particular, we focus on making Monte Carlo-based variational inference faster.