# Carles Domingo-EnrichNew York University | NYU · Department of Computer Science

Carles Domingo-Enrich

## About

21

Publications

545

Reads

**How we measure 'reads'**

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more

19

Citations

Introduction

**Skills and Expertise**

## Publications

Publications (21)

Stochastic optimal control (SOC) aims to direct the behavior of noisy systems and has widespread applications in science, engineering, and artificial intelligence. In particular, reward fine-tuning of diffusion and flow matching models and sampling from unnormalized methods can be recast as SOC problems. A recent work has introduced Adjoint Matchin...

Dynamical generative models that produce samples through an iterative process, such as Flow Matching and denoising diffusion models, have seen widespread use, but there has not been many theoretically-sound methods for improving these models with reward fine-tuning. In this work, we cast reward fine-tuning as stochastic optimal control (SOC). Criti...

We investigate the optimal transport problem between probability measures when the underlying cost function is understood to satisfy a least action principle, also known as a Lagrangian cost. These generalizations are useful when connecting observations from a physical system where the transport dynamics are influenced by the geometry of the system...

We examine how transformers cope with two challenges: learning basic integer arithmetic, and generalizing to longer sequences than seen during training. We find that relative position embeddings enable length generalization for simple tasks, such as addition: models trained on $5$-digit numbers can perform $15$-digit sums. However, this method fail...

The theory of statistical learning has focused on variational objectives expressed on functions. In this note, we discuss motivations to write similar objectives on measures, in particular to discuss out-of-distribution generalization and weakly-supervised learning. It raises a natural question: can one cast usual statistical learning results to ob...

Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underly...

Let $V_* : \mathbb{R}^d \to \mathbb{R}$ be some (possibly non-convex) potential function, and consider the probability measure $\pi \propto e^{-V_*}$. When $\pi$ exhibits multiple modes, it is known that sampling techniques based on Wasserstein gradient flows of the Kullback-Leibler (KL) divergence (e.g. Langevin Monte Carlo) suffer poorly in the r...

Kernel two-sample testing provides a powerful framework for distinguishing any pair of distributions based on $n$ sample points. However, existing kernel tests either run in $n^2$ time or sacrifice undue power to improve runtime. To address these shortcomings, we introduce Compress Then Test (CTT), a new framework for high-powered kernel testing ba...

When solving finite-sum minimization problems, two common alternatives to stochastic gradient descent (SGD) with theoretical benefits are random reshuffling (SGD-RR) and shuffle-once (SGD-SO), in which functions are sampled in cycles without replacement. Under a convenient stochastic noise approximation which holds experimentally, we study the stat...

Differential privacy (DP) is the de facto standard for private data release and private machine learning. Auditing black-box DP algorithms and mechanisms to certify whether they satisfy a certain DP guarantee is challenging, especially in high dimension. We propose relaxations of differential privacy based on new divergences on probability distribu...

Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via minimizing integral probability metrics (IPMs). In this paper, we expand this learning paradigm to stochastic orders, namely, the convex or Choquet order between probability measures. Towards this end, we introduce the Choquet-Toland dis...

Min-max optimization problems arise in several key machine learning setups, including adversarial learning and generative modeling. In their general form, in absence of convexity/concavity assumptions, finding pure equilibria of the underlying two-player zero-sum game is computationally hard [Daskalakis et al., 2021]. In this work we focus instead...

We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial...

A well-known line of work (Barron, 1993; Breiman, 1993; Klusowski & Barron, 2018) provides bounds on the width $n$ of a ReLU two-layer neural network needed to approximate a function $f$ over the ball $\mathcal{B}_R(\R^d)$ up to error $\epsilon$, when the Fourier based quantity $C_f = \int_{\R^d} \|\xi\|^2 |\hat{f}(\xi)| \ d\xi$ is finite. More rec...

Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is nonconvex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational princip...

Several works in implicit and explicit generative modeling empirically observed that feature-learning discriminators outperform fixed-kernel discriminators in terms of the sample quality of the models. We provide separation results between probability metrics with fixed-kernel and feature-learning discriminators using the function classes $\mathcal...

Energy-based models (EBMs) are a simple yet powerful framework for generative modeling. They are based on a trainable energy function which defines an associated Gibbs measure, and they can be trained and sampled from via well-established statistical tools, such as MCMC. Neural networks may be used as energy function approximators, providing both a...

Advances in generative modeling and adversarial learning have given rise to renewed interest in smooth games. However, the absence of symmetry in the matrix of second derivatives poses challenges that are not present in the classical minimization framework. While a rich theory of average-case analysis has been developed for minimization problems, l...

Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror descent. Ye...

Data-driven model training is increasingly relying on finding Nash equilibria with provable techniques, e.g., for GANs and multi-agent RL. In this paper, we analyse a new extra-gradient method, that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits the same rate of converge...

Data controllers accumulate more and more data on people, of which a substantial proportion are personally identifiable and hence sensitive data. Storing and processing those data in local premises is increasingly inconvenient, but resorting to cloud storage and processing raises security and privacy issues. We show how to use untrusted clouds to c...