# Michael Gastpar

École Polytechnique Fédérale de Lausanne | EPFL · School of Computer and Communication Sciences

Dr. ès sc. (EPFL, 2002)

## About

330 publications · 10,705 reads · 12,148 citations

Additional affiliations

July 2009 - March 2015

January 2003 - July 2015

## Publications

We prove that every online learnable class of functions of Littlestone dimension $d$ admits a learning algorithm with finite information complexity. Towards this end, we use the notion of a globally stable algorithm. Generally, the information complexity of such a globally stable algorithm is large yet finite, roughly exponential in $d$. We also sh...

Most of our lives are conducted in the cyberspace. The human notion of privacy translates into a cyber notion of privacy on many functions that take place in the cyberspace. This article focuses on three such functions: how to privately retrieve information from cyberspace (privacy in information retrieval), how to privately leverage large-scale di...

Inspired by Sibson's alpha-mutual information, we introduce a new class of universal predictors that depend on a real parameter greater than one. This class interpolates two well-known predictors, the mixture estimator, that includes the Laplace and the Krichevsky-Trofimov predictors, and the Normalized Maximum Likelihood (NML) estimator. We point...
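As a point of reference for the two mixture predictors named in this abstract, here is a minimal sketch of the Laplace and Krichevsky-Trofimov rules written as add-constant estimators (add 1 and add 1/2 to each symbol count, respectively). The function names are hypothetical, and the sketch does not implement the new parametrized class or the NML estimator discussed in the paper.

```python
from fractions import Fraction


def add_beta_predictor(counts, beta):
    """Next-symbol probabilities of an add-beta estimator.

    counts[i] is how often symbol i has been seen so far;
    beta is the constant added to every count (an illustrative parameter name).
    """
    total = sum(counts) + beta * len(counts)
    return [(c + beta) / total for c in counts]


def laplace(counts):
    # Laplace rule: add 1 to every count.
    return add_beta_predictor(counts, Fraction(1))


def krichevsky_trofimov(counts):
    # Krichevsky-Trofimov rule: add 1/2 to every count.
    return add_beta_predictor(counts, Fraction(1, 2))


# Binary sequence observed so far: three 0s and one 1.
counts = [3, 1]
print(laplace(counts))              # probabilities 2/3 and 1/3
print(krichevsky_trofimov(counts))  # probabilities 7/10 and 3/10
```

Exact fractions are used so the two rules can be compared without floating-point noise.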

In this work, we connect the problem of bounding the expected generalisation error with transportation-cost inequalities. Exposing the underlying pattern behind both approaches we are able to generalise them and go beyond Kullback-Leibler Divergences/Mutual Information and sub-Gaussian measures. In particular, we are able to provide a result showin...

We explore a family of information measures that stems from Rényi's $\alpha$-Divergences with $\alpha<0$. In particular, we extend the definition of Sibson's $\alpha$-Mutual Information to negative values of $\alpha$ and show several properties of these objects. Moreover, we highlight how this family of information measures is related to function...

We consider the problem of parameter estimation in a Bayesian setting and propose a general lower-bound that includes part of the family of $f$-Divergences. The results are then applied to specific settings of interest and compared to other notable results in the literature. In particular, we show that the known bounds using Mutual Information can...

In the distributed remote (CEO) source coding problem, many separate encoders observe independently noisy copies of an underlying source. The rate loss is the difference between the rate required in this distributed setting and the rate that would be required in a setting where the encoders can fully cooperate. In this sense, the rate loss characte...

The Gray-Wyner network subject to a fidelity criterion is studied. Upper and lower bounds for the trade-offs between the private sum-rate and the common rate are obtained for arbitrary sources subject to mean-squared error distortion. The bounds meet exactly, leading to the computation of the rate region, when the source is jointly Gaussian. They m...

This paper presents explicit solutions for two related non-convex information extremization problems due to Gray and Wyner in the Gaussian case. The first problem is the Gray-Wyner network subject to a sum-rate constraint on the two private links. Here, our argument establishes the optimality of Gaussian codebooks and hence, a closed-form formula f...

Compute-forward is a coding technique that enables receiver(s) in a network to directly decode one or more linear combinations of the transmitted codewords. Initial efforts focused on Gaussian channels and derived achievable rate regions via nested lattice codes and single-user (lattice) decoding as well as sequential (lattice) decoding. Recently,...

In this work, the probability of an event under some joint distribution is bounded by measuring it with the product of the marginals instead (which is typically easier to analyze) together with a measure of the dependence between the two random variables. These results find applications in adaptive data analysis, where multiple dependencies are int...

An important notion of common information between two random variables is due to Wyner. In this paper, we derive a lower bound on Wyner's common information for continuous random variables. The new bound improves on the only other general lower bound on Wyner's common information, which is the mutual information. We also show that the new lower bou...

Wyner’s common information is a measure that quantifies and assesses the commonality between two random variables. Based on this, we introduce a novel two-step procedure to construct features from data, referred to as Common Information Components Analysis (CICA). The first step can be interpreted as an extraction of Wyner’s common information. The...

Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for compression o...

Algebraic network information theory is an emerging facet of network information theory, studying the achievable rates of random code ensembles that have algebraic structure, such as random linear codes. A distinguishing feature is that linear combinations of codewords can sometimes be decoded more efficiently than codewords themselves. The present...

We give an information-theoretic interpretation of Canonical Correlation Analysis (CCA) via (relaxed) Wyner's common information. CCA makes it possible to extract from two high-dimensional data sets low-dimensional descriptions (features) that capture the commonalities between the data sets, using a framework of correlations and linear transforms. Our interp...

We consider the problem of source coding subject to a fidelity criterion for the Gray-Wyner network that connects a single source with two receivers via a common channel and two private channels. General lower bounds are derived for jointly Gaussian sources subject to the mean-squared error criterion, leveraging convex duality and an argument invol...

The aim of this work is to provide bounds connecting two probability measures of the same event using Rényi $\alpha$-Divergences and Sibson's $\alpha$-Mutual Information, generalizations of, respectively, the Kullback-Leibler Divergence and Shannon's Mutual Information. A particular case of interest can be found when the two probability measures c...

A natural relaxation of Wyner's Common Information is studied. Specifically, the constraint of conditional independence is replaced by an upper bound on the conditional mutual information. While of interest in its own right, this relaxation has operational significance in a source coding problem that models coded caching. For the special case of jo...

The feedback sum-rate capacity is established for the symmetric J-user Gaussian multiple-access channel (GMAC). The main contribution is a converse bound that combines the dependence-balance argument of Hekstra and Willems (1989) with a variant of the factorization of a convex envelope of Geng and Nair (2014). The converse bound matches the achieva...

A new scheme for the problem of centralized coded caching with non-uniform demands is proposed. The distinguishing feature of the proposed placement strategy is that it admits equal sub-packetization for all files while allowing the users to allocate more cache to the files which are more popular. This creates natural broadcasting opportunities in...

The following problem is considered: given a joint distribution $P_{XY}$ and an event $E$, bound $P_{XY}(E)$ in terms of $P_XP_Y(E)$ (where $P_XP_Y$ is the product of the marginals of $P_{XY}$) and a measure of dependence of $X$ and $Y$. Such bounds have direct applications in the analysis of the generalization error of learning algorithms, where $...

There is an increasing concern that most current published research findings are false. The main cause seems to lie in the fundamental disconnection between theory and practice in data analysis. While the former typically relies on statistical independence, the latter is an inherently adaptive process: new hypotheses are formulated based on the out...

The distributed remote source coding (so-called CEO) problem is studied in the case where the underlying source, not necessarily Gaussian, has finite differential entropy and the observation noise is Gaussian. The main result is a new lower bound for the sum-rate-distortion function under arbitrary distortion measures. When specialized to the case...

Consider a receiver in a multi-user network that wishes to decode several messages. Simultaneous joint typicality decoding is one of the most powerful techniques for determining the fundamental limits at which reliable decoding is possible. This technique has historically been used in conjunction with random i.i.d. codebooks to establish achievable...

We consider a setup in which confidential i.i.d. samples $X_1,\dotsc,X_n$ from an unknown finite-support distribution $\boldsymbol{p}$ are passed through $n$ copies of a discrete privatization channel (a.k.a. mechanism) producing outputs $Y_1,\dotsc,Y_n$. The channel law guarantees a local differential privacy of $\epsilon$. Subject to a prescribed...
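To make the notion of a privatization channel with local differential privacy $\epsilon$ concrete, the sketch below builds the textbook $k$-ary randomized-response channel and checks its privacy level numerically. This mechanism is chosen purely for illustration; it is an assumption, not necessarily the channel studied in the paper, and the function names are hypothetical.

```python
import math


def randomized_response_channel(k, epsilon):
    """Row-stochastic matrix W[x][y] = P(Y = y | X = x) for k-ary randomized response.

    The true symbol is reported with probability e^eps / (e^eps + k - 1);
    every other symbol is reported with probability 1 / (e^eps + k - 1).
    """
    e = math.exp(epsilon)
    stay = e / (e + k - 1)
    move = 1.0 / (e + k - 1)
    return [[stay if x == y else move for y in range(k)] for x in range(k)]


def ldp_level(channel):
    """Smallest epsilon for which the channel is epsilon-locally differentially private.

    This is the maximum over outputs y and input pairs (x, x') of
    log(W[x][y] / W[x'][y]); all entries are assumed to be strictly positive.
    """
    worst = 0.0
    for y in range(len(channel[0])):
        column = [row[y] for row in channel]
        worst = max(worst, math.log(max(column) / min(column)))
    return worst


W = randomized_response_channel(k=4, epsilon=1.0)
print(ldp_level(W))  # approximately 1.0, the epsilon the channel was built for
```

The check works because the definition of local differential privacy only constrains likelihood ratios within each output column of the channel matrix.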

The feedback sum-rate capacity is established for the symmetric $J$-user Gaussian multiple-access channel (GMAC). The main contribution is a converse bound that combines the dependence-balance argument of Hekstra and Willems (1989) with a variant of the factorization of a convex envelope of Geng and Nair (2014). The converse bound matches the achie...

We present a practical strategy that aims to attain rate points on the dominant face of the multiple access channel capacity using a standard low complexity decoder. This technique is built upon recent theoretical developments of Zhu and Gastpar on compute-forward multiple access (CFMA) which achieves the capacity of the multiple access channel usi...

Multi-server single-message private information retrieval is studied in the presence of side information. In this problem, $K$ independent messages are replicatively stored at $N$ non-colluding servers. The user wants to privately download one message from the servers without revealing the index of the message to any of the servers, leveraging its...

This paper presents a joint typicality framework for encoding and decoding nested linear codes in multi-user networks. This framework provides a new perspective on compute-forward within the context of discrete memoryless networks. In particular, it establishes an achievable rate region for computing a linear combination over a discrete memoryless...

We propose a novel caching strategy for the problem of centralized coded caching with non-uniform demands. Our placement strategy can be applied to an arbitrary number of users and files, and can be easily adapted to the scenario where file popularities are user-specific. The distinguishing feature of the proposed placement strategy is that it allo...

We study the problem of single-server multi-message private information retrieval with side information. One user wants to recover $N$ out of $K$ independent messages which are stored at a single server. The user initially possesses a subset of $M$ messages as side information. The goal of the user is to download the $N$ demand messages while not l...

The distributed remote source coding (so-called CEO) problem is studied in the case where the underlying source has finite differential entropy and the observation noise is Gaussian. The main result is a new lower bound for the sum-rate-distortion function under arbitrary distortion measures. When specialized to the case of mean-squared error, it i...

Despite significant progress in the caching literature concerning the worst case and uniform average case regimes, the algorithms for caching with nonuniform demands are still at a basic stage and mostly rely on simple grouping and memory-sharing techniques. In this work we introduce a novel centralized caching strategy for caching with nonuniform...

The cooperative data exchange problem is studied for the fully connected network. In this problem, each node initially only possesses a subset of the $K$ packets making up the file. Nodes make broadcast transmissions that are received by all other nodes. The goal is for each node to recover the full file. In this paper, we present a polynomial-time...

We introduce the Fixed Cluster Repair System (FCRS) as a novel architecture for Distributed Storage Systems (DSS) that achieves a small repair bandwidth while guaranteeing a high availability. Specifically we partition the set of servers in a DSS into $s$ clusters and allow a failed server to choose any cluster other than its own as its repair grou...

In this paper, we consider a cache aided network in which each user is assumed to have individual caches, while upon users’ requests, an update message is sent through a common link to all users. First, we formulate a general information theoretic setting that represents the database as a discrete memoryless source, and the users’ requests as side...

Computation codes in network information theory are designed for the scenarios where the decoder is not interested in recovering the information sources themselves, but only a function thereof. Körner and Marton showed for distributed source coding that such function decoding can be achieved more efficiently than decoding the full information sou...

The classical distributed storage problem can be modeled by a $k$-uniform *complete* hyper-graph where vertices represent servers and hyper-edges represent users. Hence each hyper-edge should be able to recover the full file using only the memories of the vertices associated with it. This paper considers the generalization of this problem to...

From a subset of the n-dimensional integer lattice, we independently pick two points uniformly at random. A sumset is formed by adding these two points component-wise and a sumset is called typical, if the sum falls inside this set with high probability. In this note we characterize the asymptotic size of the typical sumsets for large n, and show t...

We study the problem of coded caching when the server has access to several libraries and each user makes independent requests from every library. The single-library scenario has been well studied and it has been proved that coded caching can significantly improve the delivery rate compared to uncoded caching. In this work we show that when all the...

We study a generalization of Wyner's Common Information to Watanabe's Total Correlation. The first minimizes the description size required for a variable that can make two other random variables conditionally independent. If independence is unattainable, Watanabe's total (conditional) correlation is a measure of just how independent they have be...

Given two identical linear codes $C$ with rate $R$ over $\mathbb{F}_q$ of length $n$, we independently pick one codeword from each codebook uniformly at random. A sumset is formed by adding these two codewords entry-wise as integer vectors and a sumset is called typical, if the sum falls inside this set with high probability. In this paper we show that the asymptot...

This paper presents a joint typicality framework for encoding and decoding nested linear codes for multi-user networks. This framework provides a new perspective on compute-forward within the context of discrete memoryless networks. In particular, it establishes an achievable rate region for computing the weighted sum of nested linear codewords ove...

In this paper, we consider a cache aided network in which each user is assumed to have individual caches, while upon users' requests, an update message is sent through a common link to all users. First, we formulate a general information theoretic setting that represents the database as a discrete memoryless source, and the users' requests as side i...

The classical problem in network coding theory considers communication over multicast networks. Multiple transmitters send independent messages to multiple receivers that decode the same set of messages. In this paper, computation over multicast networks is considered: each receiver decodes an identical function of the original messages. For a co...

We study a caching problem that resembles a lossy Gray–Wyner network: A source produces vector samples from a Gaussian distribution, but the user is interested in the samples of only one component. The encoder first sends a cache message without any knowledge of the user’s preference. Upon learning her request, a second message is provided in the u...

An information-theoretic lower bound is developed for the caching system studied by Maddah-Ali and Niesen. By comparing the proposed lower bound with the decentralized coded caching scheme of Maddah-Ali and Niesen, the optimal memory-rate tradeoff is characterized to within a multiplicative gap of $4.7$ for the worst case, improving the previous a...