Youssef M Marzouk

Massachusetts Institute of Technology | MIT · Department of Aeronautics and Astronautics

PhD

About

203
Publications
24,967
Reads
4,860
Citations
Additional affiliations
January 2009 - June 2020
Massachusetts Institute of Technology
Position
  • Professor
May 2007 - December 2008
Sandia National Laboratories
Position
  • Senior Member of the Technical Staff
October 2004 - May 2007
Sandia National Laboratories
Position
  • Truman Postdoctoral Fellow

Publications (203)
Article
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines...
Article
Full-text available
In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the p...
Article
Full-text available
We construct a new framework for accelerating MCMC algorithms for sampling from posterior distributions in the context of computationally intensive models. We proceed by constructing local surrogates of the forward model within the Metropolis-Hastings kernel, borrowing ideas from deterministic approximation theory, optimization, and experimental de...
Article
We present a new approach to Bayesian inference that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure. Existence and uniqueness of a suitable measure-preserving map is established by formulating the problem in the context of optimal transport theory. We discuss various mea...
Article
Full-text available
The optimal selection of experimental conditions is essential to maximizing the value of data for inference and prediction, particularly in situations where experiments are time-consuming and expensive to conduct. We propose a general mathematical framework and an algorithmic approach for optimal experimental design with nonlinear simulation-based...
Article
Full-text available
We develop several inference methods to estimate the position of dislocations from images generated using dark-field X-ray microscopy (DFXM)—achieving superresolution accuracy and principled uncertainty quantification. Using the framework of Bayesian inference, we incorporate models of the DFXM contrast mechanism and detector measurement noise, alo...
Article
Full-text available
Many Bayesian inference problems involve target distributions whose density functions are computationally expensive to evaluate. Replacing the target density with a local approximation based on a small number of carefully chosen density evaluations can significantly reduce the computational expense of Markov chain Monte Carlo (MCMC) sampling. Moreo...
Preprint
Full-text available
We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the pushforward of a chosen reference distribution under a transport map, where the map is chosen via a maximum likelihood objective (equivalently, minimizing...
Preprint
Full-text available
We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an "informed" subspace of the parameters and an "informative" subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improv...
Preprint
Full-text available
We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables,...
Article
Full-text available
For two probability measures $\rho$ and $\pi$ on $[-1,1]^{\mathbb{N}}$ we investigate the approximation of the triangular Knothe–Rosenblatt transport $T:[-1,1]^{\mathbb{N}}\rightarrow [-1,1]^{\mathbb{N}}$ that pushes forward $\rho$ to $\pi$. Under suitable ass...
Article
Full-text available
For two probability measures $\rho$ and $\pi$ with analytic densities on the $d$-dimensional cube $[-1,1]^d$, we investigate the approximation of the unique triangular monotone Knothe–Rosenblatt transport $T:[-1,1]^d\rightarrow [-1,1]^d$, such that the pushforward $T_\sharp\rho$...
Preprint
Full-text available
We study a distributionally robust optimization formulation (i.e., a min-max game) for problems of nonparametric estimation: Gaussian process regression and, more generally, linear inverse problems. We choose the best mean-squared error predictor on an infinite-dimensional space against an adversary who chooses the worst-case model in a Wasserstein...
Poster
Full-text available
Motivation: How can one compute entropy, Kullback-Leibler (KL), and other divergences if the probability density function (pdf) is not available? The task considered here was the numerical computation of characterising statistics of high-dimensional pdfs, as well as their divergences and distances, where the pdf in the numerical implementation was assume...
Preprint
Full-text available
In this work, we develop several inference methods to estimate the position of dislocations from images generated using dark-field X-ray microscopy (DFXM) -- achieving superresolution accuracy and principled uncertainty quantification. Using the framework of Bayesian inference, we incorporate models of the DFXM contrast mechanism and detector measu...
Preprint
Full-text available
We propose a regularization method for ensemble Kalman filtering (EnKF) with elliptic observation operators. Commonly used EnKF regularization methods suppress state correlations at long distances. For observations described by elliptic partial differential equations, such as the pressure Poisson equation (PPE) in incompressible fluid flows, distan...
Article
We exploit the relationship between the stochastic Koopman operator and the Kolmogorov backward equation to construct importance sampling schemes for stochastic differential equations. Specifically, we propose using eigenfunctions of the stochastic Koopman operator to approximate the Doob transform for an observable of interest (e.g., associated wi...
Preprint
Full-text available
Bayesian inference provides a systematic means of quantifying uncertainty in the solution of the inverse problem. However, solution of Bayesian inverse problems governed by complex forward models described by partial differential equations (PDEs) remains prohibitive with black-box Markov chain Monte Carlo (MCMC) methods. We present hIPPYlib-MUQ, an...
Preprint
Full-text available
Very often, in the course of uncertainty quantification tasks or data analysis, one has to deal with high-dimensional random variables (RVs). A high-dimensional RV can be described by its probability density (pdf) and/or by the corresponding probability characteristic functions (pcf), or by a polynomial chaos (PCE) or similar expansion. Here the in...
Preprint
Full-text available
We discuss approaches to computing eigenfunctions of the Ornstein-Uhlenbeck (OU) operator in more than two dimensions. While the spectrum of the OU operator and theoretical properties of its eigenfunctions have been well characterized in previous research, the practical computation of general eigenfunctions has not been resolved. We review special...
Preprint
We introduce a novel geometry-informed irreversible perturbation that accelerates convergence of the Langevin algorithm for Bayesian computation. It is well documented that there exist perturbations to the Langevin dynamics that preserve its invariant measure while accelerating its convergence. Irreversible perturbations and reversible perturbation...
Preprint
For two probability measures $\rho$ and $\pi$ on $[-1,1]^{\mathbb{N}}$ we investigate the approximation of the triangular Knothe-Rosenblatt transport $T:[-1,1]^{\mathbb{N}}\to [-1,1]^{\mathbb{N}}$ that pushes forward $\rho$ to $\pi$. Under suitable assumptions, we show that $T$ can be approximated by rational functions without suffering from the cu...
Article
Full-text available
Estimating parameters of chaotic geophysical models is challenging due to their inherent unpredictability. These models cannot be calibrated with standard least squares or filtering methods if observations are temporally sparse. Obvious remedies, such as averaging over temporal and spatial data to characterize the mean behavior, do not capture the...
Preprint
Model misspecification constitutes a major obstacle to reliable inference in many inverse problems. Inverse problems in seismology, for example, are particularly affected by misspecification of wave propagation velocities. In this paper, we focus on a specific seismic inverse problem - full-waveform moment tensor inversion - and develop a Bayesian...
Article
We propose a differential geometric approach for building families of low-rank covariance matrices, via interpolation on low-rank matrix manifolds. In contrast with standard parametric covariance classes, these families offer significant flexibility for problem-specific tailoring via the choice of “anchor” matrices for interpolation, for instance o...
Preprint
Full-text available
We introduce a method for the nonlinear dimension reduction of a high-dimensional function $u:\mathbb{R}^d\rightarrow\mathbb{R}$, $d\gg1$. Our objective is to identify a nonlinear feature map $g:\mathbb{R}^d\rightarrow\mathbb{R}^m$, with a prescribed intermediate dimension $m\ll d$, so that $u$ can be well approximated by $f\circ g$ for some profil...
Preprint
We exploit the relationship between the stochastic Koopman operator and the Kolmogorov backward equation to construct importance sampling schemes for stochastic differential equations. Specifically, we propose using eigenfunctions of the stochastic Koopman operator to approximate the Doob transform for an observable of interest (e.g., associated wi...
Conference Paper
Full-text available
Robustly estimating the separated flow about an airfoil is critical in the design of any closed-loop controller. Darakananda et al. (Phys. Rev. Fluids, 2018) successfully used an ensemble Kalman filter (EnKF) to sequentially estimate the flow using an inviscid vortex model and distributed surface pressure readings. To tackle challenging inference p...
Conference Paper
Full-text available
Lightweight aerial vehicles can be strongly affected by environmental disturbances (gusts). It is important to have tools for estimating their aerodynamic response to such gusts and to other disturbances, such as agile maneuvers and flow actuators. In recent work, a framework has been developed for predicting the state of disturbed aerodynamic flow...
Preprint
Full-text available
Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been stu...
Article
Full-text available
Estimating parameters of chaotic geophysical models is challenging due to these models' inherent unpredictability. With temporally sparse long-range observations, these models cannot be calibrated using standard least squares or filtering methods. Obvious remedies, such as averaging over temporal and spatial data to characterize the mean behavior,...
Preprint
We propose a general framework to robustly characterize joint and conditional probability distributions via transport maps. Transport maps or "flows" deterministically couple two distributions via an expressive monotone transformation. Yet, learning the parameters of such transformations in high dimensions is challenging given few samples from the...
Article
Full-text available
© 2020 IOP Publishing Ltd. This paper suggests a framework for the learning of discretizations of expensive forward models in Bayesian inverse problems. The main idea is to incorporate the parameters governing the discretization as part of the unknown to be estimated within the Bayesian machinery. We numerically show that in a variety of inverse prob...
Article
© 2020 IAEA, Vienna. We present a fully Bayesian approach for the inference of radial profiles of impurity transport coefficients and compare its results to neoclassical, gyrofluid and gyrokinetic modeling. Using nested sampling, the Bayesian impurity transport inference (BITE) framework can handle complex parameter spaces with multiple possible solut...
Article
Full-text available
Satellite remote sensing provides a global view to processes on Earth that has unique benefits compared to making measurements on the ground, such as global coverage and enormous data volume. The typical downsides are spatial and temporal gaps and potentially low data quality. Meaningful statistical inference from such data requires overcoming thes...
Conference Paper
Full-text available
The Bayesian framework is commonly used to quantify uncertainty in seismic inversion. To perform Bayesian inference, Markov chain Monte Carlo (MCMC) algorithms are regarded as the gold standard technique for sampling from the posterior probability distribution. Consistent MCMC methods have trouble with complex, high-dimensional models, and most meth...
Preprint
Let $\rho$ and $\pi$ be two probability measures on $[-1,1]^d$ with positive and analytic Lebesgue densities. We investigate the approximation of the unique triangular monotone (Knothe-Rosenblatt) transport $T:[-1,1]^d\to [-1,1]^d$, such that the pushforward $T_\sharp\rho$ equals $\pi$. It is shown that for $d\in\mathbb{N}$ there exist approximatio...
Preprint
We present a new approach for sampling conditional measures that enables uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the probability measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called m...
Preprint
We present a fully Bayesian approach for the inference of radial profiles of impurity transport coefficients and compare its results to neoclassical, gyrofluid and gyrokinetic modeling. Using nested sampling, the Bayesian Impurity Transport InferencE (BITE) framework can handle complex parameter spaces with multiple possible solutions, offering gre...
Preprint
We propose and analyze batch greedy heuristics for cardinality constrained maximization of non-submodular non-decreasing set functions. Our theoretical guarantees are characterized by the combination of submodularity and supermodularity ratios. We argue how these parameters define tight modular bounds based on incremental gains, and provide a novel...
Preprint
Many Bayesian inference problems involve target distributions whose density functions are computationally expensive to evaluate. Replacing the target density with a local approximation based on a small number of carefully chosen density evaluations can significantly reduce the computational expense of Markov chain Monte Carlo (MCMC) sampling. Moreo...
Preprint
We propose a differential geometric construction for families of low-rank covariance matrices, via interpolation on low-rank matrix manifolds. In contrast with standard parametric covariance classes, these families offer significant flexibility for problem-specific tailoring via the choice of "anchor" matrices for the interpolation. Moreover, their...
Article
Acoustic emission (AE) is a widely used technology to study source mechanisms and material properties during high-pressure rock failure experiments. It is important to understand the physical quantities that acoustic emission sensors measure, as well as the response of these sensors as a function of frequency. This study calibrates the newly built...
Preprint
Full-text available
This paper suggests a framework for the learning of discretizations of expensive forward models in Bayesian inverse problems. The main idea is to incorporate the parameters governing the discretization as part of the unknown to be estimated within the Bayesian machinery. We numerically show that in a variety of inverse problems arising in mechanica...
Preprint
Statistical modeling of spatiotemporal phenomena often requires selecting a covariance matrix from a covariance class. Yet standard parametric covariance families can be insufficiently flexible for practical applications, while non-parametric approaches may not easily allow certain kinds of prior knowledge to be incorporated. We propose instead to...
Article
© 2020 Remi Lam, Olivier Zahm, Youssef Marzouk, Karen Willcox. We propose a multifidelity dimension reduction method to identify a low-dimensional structure present in many engineering models. The structure of interest arises when functions vary primarily on a low-dimensional subspace of the high-dimensional input space, while varying little along th...
Article
© 2020 Society for Industrial and Applied Mathematics. Optimization-based samplers such as randomize-then-optimize (RTO) [J. M. Bardsley et al., SIAM J. Sci. Comput., 36 (2014), pp. A1895-A1910] provide an efficient and parallelizable approach to solving large-scale Bayesian inverse problems. These methods solve randomly perturbed optimization probl...
Article
© 2020 Society for Industrial and Applied Mathematics. Markov chain Monte Carlo (MCMC) samplers are numerical methods for drawing samples from a given target probability distribution. We discuss one particular MCMC sampler, the MALA-within-Gibbs sampler, from the theoretical and practical perspectives. We first show that the acceptance ratio and step...
Article
Full-text available
Markov chain Monte Carlo (MCMC) sampling of posterior distributions arising in Bayesian inverse problems is challenging when evaluations of the forward model are computationally expensive. Replacing the forward model with a low-cost, low-fidelity model often significantly reduces computational cost; however, employing a low-fidelity model alone mea...
Article
Full-text available
Satellite remote sensing provides a global view to processes on Earth that has unique benefits compared to measurements made on the ground. The global coverage and the enormous amounts of data produced come, however, with the price of spatial and temporal gaps and less than perfect data quality. Meaningful statistical inference from such data requi...
Preprint
Markov chain Monte Carlo (MCMC) samplers are numerical methods for drawing samples from a given target probability distribution. We discuss one particular MCMC sampler, the MALA-within-Gibbs sampler, from the theoretical and practical perspectives. We first show that the acceptance ratio and step size of this sampler are independent of the overall...
Preprint
We consider filtering in high-dimensional non-Gaussian state-space models with intractable transition kernels, nonlinear and possibly chaotic dynamics, and sparse observations in space and time. We propose a novel filtering methodology that harnesses transportation of measures, convex optimization, and ideas from probabilistic graphical models to y...
Preprint
Full-text available
Acoustic emission (AE) is a widely used technology to study source mechanisms and material properties during high-pressure rock failure experiments. It is important to understand the physical quantities that acoustic emission sensors measure, as well as the response of these sensors as a function of frequency. This study calibrates the newly built...
Preprint
Full-text available
We propose a framework for the greedy approximation of high-dimensional Bayesian inference problems, through the composition of multiple low-dimensional transport maps or flows. Our framework operates recursively on a sequence of "residual" distributions, given by pulling back the posterior through the previously computed transport maps. T...
Preprint
We develop a new computational approach for "focused" optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers uncertainty in the full set of model parameters, but employs a design objective that can exploit learning trade-offs a...
Preprint
Optimization-based samplers provide an efficient and parallelizable approach to solving large-scale Bayesian inverse problems. These methods solve randomly perturbed optimization problems to draw samples from an approximate posterior distribution. "Correcting" these samples, either by Metropolization or importance sampling, enables characterizatio...
Article
We present an overview of optimization under uncertainty efforts under the DARPA Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) ScramjetUQ project. We introduce the mathematical frameworks and computational tools employed for performing this task. In particular, we provide details in the optimization and multilevel uncertainty...
Preprint
Full-text available
We develop new approximation algorithms and data structures for representing and computing with multivariate functions using the functional tensor-train (FT), a continuous extension of the tensor-train (TT) decomposition. The FT represents functions using a tensor-train ansatz by replacing the three-dimensional TT cores with univariate matrix-value...
Article
We develop new approximation algorithms and data structures for representing and computing with multivariate functions using the functional tensor-train (FT), a continuous extension of the tensor-train (TT) decomposition. The FT represents functions using a tensor-train ansatz by replacing the three-dimensional TT cores with univariate matrix-value...
Article
Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In...
Preprint
Full-text available
We propose a multifidelity dimension reduction method to identify a low-dimensional structure present in many engineering models. The structure of interest arises when functions vary primarily on a low-dimensional subspace of the high-dimensional input space, while varying little along the complementary directions. Our approach builds on the gradie...
Preprint
Full-text available
Markov chain Monte Carlo (MCMC) sampling of posterior distributions arising in Bayesian inverse problems is challenging when evaluations of the forward model are computationally expensive. Replacing the forward model with a low-cost, low-fidelity model often significantly reduces computational cost; however, employing a low-fidelity model alone mea...
Article
Full-text available
Uncertainty quantification in expensive turbulent combustion simulations usually adopts response surface techniques to accelerate Monte Carlo sampling. However, it is computationally intractable to build response surfaces for high-dimensional kinetic parameters. We employ the active subspaces approach to reduce the dimension of the parameter space,...
Preprint
Full-text available
We propose a dimension reduction technique for Bayesian inverse problems with nonlinear forward operators, non-Gaussian priors, and non-Gaussian observation noise. The likelihood function is approximated by a ridge function, i.e., a map which depends non-trivially only on a few linear combinations of the parameters. We build this ridge approximatio...
Preprint
Full-text available