Botond Szabo
Università Commerciale Luigi Bocconi · Department of Decision Sciences

PhD in Mathematical Statistics

About

39 Publications
3,295 Reads
617 Citations
Additional affiliations
February 2015 - February 2016
University of Amsterdam
Position
  • PostDoc Position
September 2014 - February 2015
Budapest University of Technology and Economics
Position
  • Assistant Professor
March 2014 - August 2014
MINES ParisTech
Position
  • PostDoc Position
Education
February 2010 - February 2014
Eindhoven University of Technology
Field of study
  • Nonparametric Bayesian Statistics
September 2008 - June 2009
Vrije Universiteit Amsterdam
Field of study
  • Stochastics and Financial Mathematics
September 2004 - June 2010
Eötvös Loránd University
Field of study
  • Applied mathematics

Publications

Publications (39)
Article
Full-text available
We investigate the frequentist coverage properties of Bayesian credible sets in a general, adaptive, nonparametric framework. It is well known that the construction of adaptive and honest confidence sets is not possible in general. To overcome this problem we introduce an extra assumption on the functional parameters, the so-called "general polishe...
Preprint
Full-text available
We study a mean-field variational Bayes (VB) approximation to Bayesian model selection priors, which include the popular spike-and-slab prior, in the sparse high-dimensional linear regression model. Under suitable conditions on the design matrix, the mean-field VB approximation is shown to converge to the sparse truth at the optimal rate for $\ell_...
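For readers new to spike-and-slab variational Bayes, the sketch below shows coordinate-ascent mean-field updates for a simplified version of the model (known noise variance, Gaussian slab, fixed inclusion weight w), in the style of Carbonetto and Stephens (2012); it illustrates the general idea only and is not the exact prior or algorithm analysed in the paper.
```python
# Minimal coordinate-ascent sketch of mean-field spike-and-slab VB for sparse
# linear regression; hyperparameters (known noise variance sigma2, Gaussian
# slab variance tau2, inclusion weight w) are simplifying assumptions.
import numpy as np

def spike_slab_vb(X, y, sigma2=1.0, tau2=1.0, w=0.1, n_iter=100):
    n, p = X.shape
    xtx = np.sum(X**2, axis=0)
    gamma = np.full(p, w)                  # q(z_j = 1): posterior inclusion probability
    mu = np.zeros(p)                       # slab mean of q(theta_j | z_j = 1)
    s2 = sigma2 / (xtx + sigma2 / tau2)    # slab variance (fixed across iterations)
    r = X @ (gamma * mu)                   # current fitted values
    for _ in range(n_iter):
        for j in range(p):
            r -= X[:, j] * gamma[j] * mu[j]           # remove coordinate j
            mu[j] = s2[j] / sigma2 * X[:, j] @ (y - r)
            logit = (np.log(w / (1 - w))
                     + 0.5 * np.log(s2[j] / tau2)
                     + mu[j]**2 / (2 * s2[j]))
            gamma[j] = 1 / (1 + np.exp(-logit))
            r += X[:, j] * gamma[j] * mu[j]           # put coordinate j back
    return gamma, mu

# Toy usage: n = 100 observations, p = 200 coefficients, 5 of them nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 200))
theta = np.zeros(200); theta[:5] = 3.0
y = X @ theta + rng.standard_normal(100)
gamma, mu = spike_slab_vb(X, y)
```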
Preprint
Full-text available
In this paper we study the problem of signal detection in Gaussian noise in a distributed setting. We derive a lower bound on the size that the signal needs to have in order to be detectable. Moreover, we exhibit optimal distributed testing strategies that attain the lower bound.
Preprint
Full-text available
We study the theoretical properties of a variational Bayes method in the Gaussian Process regression model. We consider the inducing variables method introduced by Titsias (2009a) and derive sufficient conditions for obtaining contraction rates for the corresponding variational Bayes (VB) posterior. As examples we show that for three particular cov...
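As a companion illustration of the inducing variables idea, the following minimal NumPy sketch computes a Titsias-style variational predictive mean for sparse GP regression with a squared exponential kernel; the toy data, kernel hyperparameters and inducing-point locations are assumptions made for the example, not settings from the paper.
```python
# Minimal sketch of sparse GP regression with inducing points in the spirit of
# Titsias (2009); all numerical choices below are illustrative assumptions.
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel matrix between row sets A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def sgpr_predict_mean(X, y, Z, Xstar, noise_var=0.1):
    """Variational predictive mean using inducing inputs Z."""
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))   # jitter for numerical stability
    Kmn = rbf(Z, X)
    Ksm = rbf(Xstar, Z)
    Sigma = Kmm + Kmn @ Kmn.T / noise_var     # Kmm + sigma^-2 Kmn Knm
    return Ksm @ np.linalg.solve(Sigma, Kmn @ y) / noise_var

# Toy usage: n = 500 observations, m = 20 inducing points.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (500, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.3 * rng.standard_normal(500)
Z = np.linspace(0, 1, 20)[:, None]
Xstar = np.linspace(0, 1, 100)[:, None]
mean = sgpr_predict_mean(X, y, Z, Xstar)
```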
Preprint
Full-text available
We derive minimax testing errors in a distributed framework where the data is split over multiple machines and their communication to a central machine is limited to $b$ bits. We investigate both the $d$- and infinite-dimensional signal detection problem under Gaussian white noise. We also derive distributed testing algorithms reaching the theoreti...
Preprint
Full-text available
Gaussian Processes (GP) are widely used for probabilistic modeling and inference for nonparametric regression. However, their computational complexity scales cubically with the sample size, rendering them infeasible for large data sets. To speed up the computations, various distributed methods were proposed in the literature. These methods have, howeve...
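One simple distributed recipe, sketched below purely for illustration, fits an exact GP on each data shard and averages the local predictive means; the shard count, kernel and averaging rule are assumptions of this sketch and not necessarily among the methods compared in the paper.
```python
# Minimal sketch of a naive distributed GP scheme: fit an exact GP per shard,
# then average the local posterior means; all settings are illustrative.
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict_mean(X, y, Xstar, noise_var=0.1):
    """Exact GP posterior mean on a single shard."""
    K = rbf(X, X) + noise_var * np.eye(len(X))
    return rbf(Xstar, X) @ np.linalg.solve(K, y)

def distributed_gp_mean(X, y, Xstar, n_machines=10, noise_var=0.1):
    """Average the local GP posterior means computed on disjoint shards."""
    shards = np.array_split(np.arange(len(X)), n_machines)
    local = [gp_predict_mean(X[i], y[i], Xstar, noise_var) for i in shards]
    return np.mean(local, axis=0)

# Toy usage with 2000 observations split over 10 machines.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (2000, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.3 * rng.standard_normal(2000)
Xstar = np.linspace(0, 1, 50)[:, None]
f_hat = distributed_gp_mean(X, y, Xstar)
```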
Preprint
Full-text available
We propose a new, two-step empirical Bayes-type of approach for neural networks. We show in the context of the nonparametric regression model that the procedure (up to a logarithmic factor) provides optimal recovery of the underlying functional parameter of interest and provides Bayesian credible sets with frequentist coverage guarantees. The approach...
Preprint
Full-text available
In the era of big data, it is necessary to split extremely large data sets across multiple computing nodes and construct estimators using the distributed data. When designing distributed estimators, it is desirable to minimize the amount of communication across the network because transmission between computers is slow in comparison to computations...
Article
Full-text available
Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an e...
Preprint
Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We show how this...
Article
Full-text available
We study a mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gi...
Preprint
Multi-view stacking is a framework for combining information from different views (i.e. different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and their predictions are then combined by a meta-learner algorithm. In a previous study, stacked penalized logistic regre...
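A minimal sketch of the multi-view stacking pipeline is given below, using scikit-learn with an L1-penalized logistic base learner per view and out-of-fold predictions as meta-features; the unconstrained logistic meta-learner and the synthetic two-view data are simplifying assumptions, not the exact StaPLR configuration studied in these papers.
```python
# Minimal multi-view stacking sketch: one penalized base learner per view,
# out-of-fold predicted probabilities as meta-features, logistic meta-learner.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 200
views = [rng.standard_normal((n, 20)), rng.standard_normal((n, 50))]  # two feature sets
y = (views[0][:, 0] + 0.5 * rng.standard_normal(n) > 0).astype(int)

# Step 1: out-of-fold predictions per view, so the meta-learner is not fit
# on in-sample base-learner fits.
base = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
meta_features = np.column_stack([
    cross_val_predict(base, V, y, cv=5, method="predict_proba")[:, 1]
    for V in views
])

# Step 2: the meta-learner combines the view-specific predictions; its
# coefficients indicate how much each view contributes.
meta = LogisticRegression().fit(meta_features, y)
print("view weights:", meta.coef_.ravel())
```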
Preprint
Full-text available
Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference. We study a mean-field spike and slab VB approximation of widely used Bayesian model selection priors in sparse high-dimensional logistic regression. We provide non-asymptotic theoretical guarantees for the VB posterior in both $\ell_2$ and p...
Preprint
Full-text available
We investigate whether in a distributed setting, adaptive estimation of a smooth function at the optimal rate is possible under minimal communication. It turns out that the answer depends on the risk considered and on the number of servers over which the procedure is distributed. We show that for the $L_\infty$-risk, adaptively obtaining optimal ra...
Article
In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data...
Preprint
Full-text available
Bayesian approaches have become increasingly popular in causal inference problems due to their conceptual simplicity, excellent performance and in-built uncertainty quantification ('posterior credible sets'). We investigate Bayesian inference for average treatment effects from observational data, which is a challenging problem due to the missing co...
Preprint
Full-text available
We investigate the frequentist coverage properties of credible sets resulting from Gaussian process priors with squared exponential covariance kernel. First we show that by selecting the scaling hyper-parameter using the maximum marginal likelihood estimator in the (slightly modified) squared exponential covariance kernel the corresponding credi...
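The empirical Bayes idea of tuning a covariance hyper-parameter by maximizing the marginal likelihood can be illustrated with a short grid-search sketch; the grid, kernel parametrisation and toy data below are assumptions for the illustration, not the exact estimator analysed in the paper.
```python
# Minimal sketch: pick the squared exponential lengthscale by maximizing the
# GP log marginal likelihood over a grid (empirical Bayes illustration).
import numpy as np

def rbf(A, B, lengthscale):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def log_marginal_likelihood(X, y, lengthscale, noise_var=0.1):
    """Log density of N(y; 0, K + noise_var * I) under a zero-mean GP prior."""
    K = rbf(X, X, lengthscale) + noise_var * np.eye(len(X))
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (y @ np.linalg.solve(K, y) + logdet + len(y) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (200, 1))
y = np.sin(4 * np.pi * X[:, 0]) + 0.2 * rng.standard_normal(200)

grid = np.logspace(-2, 0, 30)                      # candidate lengthscales
lml = [log_marginal_likelihood(X, y, ls) for ls in grid]
print("empirical Bayes lengthscale:", grid[int(np.argmax(lml))])
```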
Preprint
Full-text available
In multi-view learning, features are organized into multiple sets called views. Multi-view stacking (MVS) is an ensemble learning framework which learns a prediction function from each view separately, and then learns a meta-function which optimally combines the view-specific predictions. In case studies, MVS has been shown to increase prediction a...
Preprint
Full-text available
We consider exact algorithms for Bayesian inference with model selection priors (including spike-and-slab priors) in the sparse normal sequence model. Because the best existing exact algorithm becomes numerically unstable for sample sizes over n=500, there has been much attention to alternative approaches like approximate algorithms (Gibbs samplin...
Preprint
Full-text available
In the sparse normal means model, coverage of adaptive Bayesian posterior credible sets associated to spike and slab prior distributions is considered. The key sparsity hyperparameter is calibrated via marginal maximum likelihood empirical Bayes. First, adaptive posterior contraction rates are derived with respect to $d_q$-type distances for $q\l...
Article
Full-text available
We study distributed estimation methods under communication constraints in a distributed version of the nonparametric signal-in-white-noise model. We derive minimax lower bounds and exhibit methods that attain those bounds. Moreover, we show that adaptive estimation is possible in this setting.
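To make the distributed setting concrete, the sketch below implements a naive protocol in a regression stand-in for the signal-in-white-noise model: each machine computes a truncated cosine-series estimate on its shard and the central machine averages the coefficient vectors. The truncation level and toy signal are illustrative assumptions; the minimax-optimal and adaptive procedures studied in the paper are more refined.
```python
# Minimal sketch of a naive distributed series estimator: split the data over
# m machines, estimate truncated cosine coefficients locally, average centrally.
import numpy as np

def local_coefficients(x, y, n_basis):
    """Least-squares coefficients in a cosine basis, computed on one shard."""
    Phi = np.column_stack([np.cos(np.pi * k * x) for k in range(n_basis)])
    return np.linalg.lstsq(Phi, y, rcond=None)[0]

rng = np.random.default_rng(2)
n, m, n_basis = 3000, 10, 15
x = rng.uniform(0, 1, n)
y = np.exp(-x) * np.sin(6 * x) + 0.5 * rng.standard_normal(n)

shards = np.array_split(np.arange(n), m)           # data split over m machines
local = [local_coefficients(x[i], y[i], n_basis) for i in shards]
theta_hat = np.mean(local, axis=0)                 # central machine averages
```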
Article
Full-text available
We investigate and compare the fundamental performance of several distributed learning methods that have been proposed recently. We do this in the context of a distributed version of the classical signal-in-Gaussian-white-noise model, which serves as a benchmark model for studying performance in this setting. The results show how the design and tun...
Article
Full-text available
The estimation of a log-concave density on $\mathbb{R}$ is a canonical problem in the area of shape-constrained nonparametric inference. We present a Bayesian nonparametric approach to this problem based on an exponentiated Dirichlet process mixture prior and show that the posterior distribution converges to the log-concave truth at the (near-) min...
Article
Full-text available
We investigate the frequentist properties of Bayesian procedures for estimation based on the horseshoe prior in the sparse multivariate normal means model. Previous theoretical results assumed that the sparsity level, that is, the number of signals, was known. We drop this assumption and characterize the behavior of the maximum marginal likelihood...
Article
We investigate the credible sets and marginal credible intervals resulting from the horseshoe prior in the sparse multivariate normal means model. We do so in an adaptive setting without assuming knowledge of the sparsity level (number of signals). We consider both the hierarchical Bayes method of putting a prior on the unknown sparsity level and t...
Chapter
Full-text available
We consider the problem of constructing Bayesian based confidence sets for linear functionals in the inverse Gaussian white noise model. We work with a scale of Gaussian priors indexed by a regularity hyper-parameter and apply the data-driven (slightly modified) marginal likelihood empirical Bayes method for the choice of this hyper-parameter. We s...
Article
Full-text available
We investigate the frequentist coverage of Bayesian credible sets in a nonparametric setting. We consider a scale of priors of varying regularity and choose the regularity by an empirical Bayes method. Next we consider a central set of prescribed posterior probability in the posterior distribution of the chosen regularity. We show that such an adap...
Article
Full-text available
Rejoinder of "Frequentist coverage of adaptive nonparametric Bayesian credible sets" by Szabó, van der Vaart and van Zanten [arXiv:1310.4489v5].
Article
Full-text available
We consider the asymptotic behaviour of the marginal maximum likelihood empirical Bayes posterior distribution in a general setting. First we characterize the set where the maximum marginal likelihood estimator is located with high probability. Then we provide upper and lower bounds for the contraction rates of the empirical Bayes posterior. We demon...
Article
Full-text available
We study empirical and hierarchical Bayes approaches to the problem of estimating an infinite-dimensional parameter in mildly ill-posed inverse problems. We consider a class of prior distributions indexed by a hyperparameter that quantifies regularity. We prove that both methods we consider succeed in automatically selecting this parameter optimall...
Article
We investigate the problem of constructing Bayesian credible sets that are honest and adaptive for the L2-loss over a scale of Sobolev classes with regularity ranging between [D, 2D], for some given D, in the context of the signal-in-white-noise model. We consider a scale of prior distributions indexed by a regularity hyper-parameter and choose the...
Article
In the nonparametric Gaussian sequence space model an $\ell^2$-confidence ball $C_n$ is constructed that adapts to unknown smoothness and Sobolev-norm of the infinite-dimensional parameter to be estimated. The confidence ball has exact and honest asymptotic coverage over appropriately defined `self-similar' parameter spaces. It is shown by informat...
Article
Full-text available
The performance of nonparametric estimators is heavily dependent on a bandwidth parameter. In nonparametric Bayesian methods this parameter can be specified as a hyperparameter of the nonparametric prior. The value of this hyperparameter may be made dependent on the data. The empirical Bayes method is to set its value by maximizing the marginal lik...
Article
The determination of rate parameters of gas-phase elementary reactions is usually based on direct measurements. The rate parameters obtained in many independent direct measurements are then used in reaction mechanisms, which are tested against the results of indirect experiments, like time-to-ignition or laminar flame velocity measurements. We sugg...
Conference Paper
Full-text available
In this report, several methods are investigated to rapidly compute the light intensity function, either in the far field or on a finite-distance screen, of light emanating from a light fixture with a given shape. Different shapes are considered, namely polygonal and (piecewise) smooth. In the first case, analytic methods are sought to circumvent t...
Article
Full-text available
The temperature dependence of rate coefficient k is usually described by the Arrhenius expression $\ln k = \ln A - (E/R)T^{-1}$. Chemical kinetics databases contain the recommended values of Arrhenius parameters A and E, the uncertainty parameter f(T) of the rate coefficient and temperature range of validity of this information. Taking ln k as a random...
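For concreteness, the Arrhenius expression can be evaluated with a few lines of code; the parameter values and temperature range below are placeholders, not recommended values from any database.
```python
# Evaluate the Arrhenius expression ln k = ln A - (E/R)/T over a temperature
# range; A, E and the temperature range are assumed purely for illustration.
import numpy as np

R = 8.314          # gas constant, J/(mol K)
A = 1.0e13         # pre-exponential factor (assumed)
E = 1.5e5          # activation energy in J/mol (assumed)

T = np.linspace(800.0, 2000.0, 5)          # assumed temperature range of validity, K
ln_k = np.log(A) - (E / R) / T             # Arrhenius expression for ln k
print(np.column_stack([T, ln_k]))
```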
