Sergio Verdú's research while affiliated with Princeton University and other institutions

Publications (232)

Article
Full-text available
Over the last six decades, the representation of error exponent functions for data transmission through noisy channels at rates below capacity has seen three distinct approaches: (1) through Gallager’s E0 functions (with and without cost constraints); (2) large deviations form, in terms of conditional relative entropy and mutual information; (3) th...
Article
The Brascamp-Lieb inequality in functional analysis can be viewed as a measure of the “uncorrelatedness” of a joint probability distribution. We define the smooth Brascamp-Lieb (BL) divergence as the infimum of the best constant in the Brascamp-Lieb inequality under a perturbation of the joint probability distribution. An information spectrum upper...
Article
Full-text available
Rényi-type generalizations of entropy, relative entropy and mutual information have found numerous applications throughout information theory and beyond. While there is consensus that the ways A. Rényi generalized entropy and relative entropy in 1961 are the “right” ones, several candidates have been put forth as possible mutual informations of ord...
Article
A fundamental tool in network information theory is the covering lemma, which lower bounds the probability that there exists a pair of random variables, among a given number of independently generated candidates, falling within a given set. We use a weighted sum trick and Talagrand’s concentration inequality to prove new mutual covering bounds. We...
Article
We introduce a definition of perfect and quasi-perfect codes for discrete symmetric channels based on the packing and covering properties of generalized spheres whose shape is tilted using an auxiliary probability measure. This notion generalizes previous definitions of perfect and quasi-perfect codes and encompasses maximum distance separable code...
Preprint
Full-text available
Verdú reformulated the covering problem in the non-asymptotic information-theoretic setting as a lower bound on the covering probability for any set which has a large probability under a given joint distribution. The covering probability is the probability that there exists a pair of random variables among a given number of independently g...
Preprint
Full-text available
A strong converse shows that no procedure can beat the asymptotic (as blocklength $n\to\infty$) fundamental limit of a given information-theoretic problem for any fixed error probability. A second-order converse strengthens this conclusion by showing that the asymptotic fundamental limit cannot be exceeded by more than $O(\tfrac{1}{\sqrt{n}})$. Whi...
Article
Full-text available
Inspired by the forward and the reverse channels from the image-size characterization problem in network information theory, we introduce a functional inequality that unifies both the Brascamp-Lieb inequality and Barthe’s inequality, which is a reverse form of the Brascamp-Lieb inequality. For Polish spaces, we prove its equivalent entropic formula...
Preprint
We introduce a definition of perfect and quasi-perfect codes for symmetric channels parametrized by an auxiliary output distribution. This notion generalizes previous definitions of perfect and quasi-perfect codes and encompasses maximum distance separable codes. The error probability of these codes, whenever they exist, is shown to coincide with t...
Preprint
Inspired by the forward and the reverse channels from the image-size characterization problem in network information theory, we introduce a functional inequality which unifies both the Brascamp-Lieb inequality and Barthe's inequality, which is a reverse form of the Brascamp-Lieb inequality. For Polish spaces, we prove its equivalent entropic formul...
Article
Full-text available
The redundancy for universal lossless compression of discrete memoryless sources in Campbell’s setting is characterized as a minimax Rényi divergence, which is shown to be equal to the maximal α-mutual information via a generalized redundancy-capacity theorem. Special attention is placed on the analysis of the asymptotics of minimax Rényi divergenc...
Article
Full-text available
In this work we relax the usual separability assumption made in rate-distortion literature and propose f-separable distortion measures, which are well suited to model non-linear penalties. The main insight behind f-separable distortion measures is to define an n-letter distortion measure to be an f-mean of single-letter distortions. We prove a rate...
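As a minimal illustration of the f-mean construction just described (function and variable names are hypothetical; Hamming distortion serves as the single-letter measure), the n-letter distortion can be sketched as:

```python
import math

def f_separable_distortion(x, y, d, f, f_inv):
    """n-letter distortion as the f-mean of single-letter distortions:
    d_n(x, y) = f^{-1}( (1/n) * sum_i f(d(x_i, y_i)) )."""
    n = len(x)
    return f_inv(sum(f(d(a, b)) for a, b in zip(x, y)) / n)

# Hamming distortion per letter
hamming = lambda a, b: 0.0 if a == b else 1.0

x = [0, 1, 1, 0, 1]
y = [0, 1, 0, 0, 0]   # two mismatches out of five

# f(t) = t recovers the usual separable (arithmetic-mean) distortion
avg = f_separable_distortion(x, y, hamming, lambda t: t, lambda t: t)

# a convex f such as exp penalizes large single-letter distortions more heavily
exp_mean = f_separable_distortion(x, y, hamming, math.exp, math.log)

print(avg)       # 0.4
print(exp_mean)  # exceeds 0.4: the exponential f-mean dominates the arithmetic mean
```

With f(t) = t this is exactly the classical separable distortion; other choices of f model nonlinear penalties as the abstract describes.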
Article
A basic two-terminal secret key generation model is considered, where the interactive communication rate between the terminals may be limited, and in particular may not be enough to achieve the maximum key rate. We first prove a multiletter characterization of the key-communication rate region (where the number of auxiliary random variables depend...
Article
This paper presents conditional versions of the Lempel-Ziv (LZ) algorithm for settings where the compressor and decompressor have access to the same side information. We propose a fixed-length-parsing LZ algorithm with side information, motivated by the Willems algorithm, and prove its optimality for any stationary process. In addition, we suggest strate...
Article
This paper considers the problem of lossy source coding with a specific distortion measure: logarithmic loss. The focus of this paper is on the single-shot approach, which exposes crisply the connection between lossless source coding with list decoding and lossy source coding with log-loss. Fixed-length and variable-length bounds are presented. Fixe...
Article
This paper quantifies the fundamental limits of variable-length transmission of a general (possibly analog) source over a memoryless channel with noiseless feedback, under a distortion constraint. We consider excess distortion, average distortion and guaranteed distortion (d-semifaithful codes). In contrast to the asymptotic fundamental limit, a ge...
Article
Full-text available
We introduce an inequality which may be viewed as a generalization of both the Brascamp-Lieb inequality and its reverse (Barthe's inequality), and prove its information-theoretic (i.e., entropic) formulation. This result leads to a unified approach to functional inequalities such as the variational formula of Rényi entropy, hypercontractivity and...
Article
Full-text available
This paper gives upper and lower bounds on the minimum error probability of Bayesian $M$-ary hypothesis testing in terms of the Arimoto-Rényi conditional entropy of an arbitrary order $\alpha$. The improved tightness of these bounds over their specialized versions with the Shannon conditional entropy ($\alpha=1$) is demonstrated. In particular, i...
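A hedged sketch of the α → ∞ endpoint of this connection: the minimum (MAP) error probability of guessing X from Y is determined exactly by the Arimoto conditional entropy of order ∞. The joint distribution below is a hypothetical toy example, chosen only for illustration.

```python
import math

# Toy joint distribution P(x, y) on a 3x2 alphabet (rows: x, columns: y)
P = [[0.30, 0.10],
     [0.05, 0.25],
     [0.20, 0.10]]

# MAP decision: for each observed y, guess the x maximizing P(x, y)
p_correct = sum(max(row[y] for row in P) for y in range(2))
eps_map = 1.0 - p_correct

# Arimoto conditional entropy of order alpha -> infinity (in bits):
# H_inf(X|Y) = -log2( sum_y max_x P(x, y) )
h_inf = -math.log2(p_correct)

# The two are linked exactly: eps_map = 1 - 2^{-H_inf(X|Y)}
print(eps_map, 1 - 2 ** (-h_inf))  # both 0.45
```

For finite α the paper's bounds are inequalities; only the α = ∞ endpoint collapses to this exact identity.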
Article
Full-text available
The redundancy for universal lossless compression in Campbell's setting is characterized as a minimax Rényi divergence, which is shown to be equal to the maximal $\alpha$-mutual information via a generalized redundancy-capacity theorem. Special attention is placed on the analysis of the asymptotics of minimax Rényi divergence, which is determin...
Article
Full-text available
This paper develops systematic approaches to obtain f-divergence inequalities, dealing with pairs of probability measures defined on arbitrary alphabets. Functional domination is one such approach, where special emphasis is placed on finding the best possible constant upper bounding a ratio of f-divergences. Another approach used for the derivati...
Conference Paper
Full-text available
This paper considers derivation of f-divergence inequalities via the approach of functional domination. Bounds on an f-divergence based on one or several other f-divergences are introduced, dealing with pairs of probability measures defined on arbitrary alphabets. In addition, a variety of bounds are shown to hold under boundedness assumptions on t...
Article
Full-text available
This paper considers derivation of $f$-divergence inequalities via the approach of functional domination. Bounds on an $f$-divergence based on one or several other $f$-divergences are introduced, dealing with pairs of probability measures defined on arbitrary alphabets. In addition, a variety of bounds are shown to hold under boundedness assumption...
Article
Full-text available
We generalize a result by Carlen and Cordero-Erausquin on the equivalence between the Brascamp-Lieb inequality and the subadditivity of relative entropy by allowing for random transformations (a broadcast channel). This leads to a unified perspective on several functional inequalities that have been gaining popularity in the context of proving impo...
Article
Full-text available
We study the infimum of the best constant in a functional inequality, the Brascamp-Lieb-like inequality, over auxiliary measures within a neighborhood of a product distribution. In the finite alphabet and the Gaussian cases, such an infimum converges to the best constant in a mutual information inequality. Implications for strong converse propertie...
Article
Full-text available
The basic two-terminal common randomness (CR) or key generation model is considered, where the communication between the terminals may be limited, and in particular may not be enough to achieve the maximal CR/key rate. We introduce a general framework of $XY$-absolutely continuous distributions and $XY$-concave functions, and characterize the first...
Article
Full-text available
The conventional channel resolvability refers to the minimum rate needed for an input process to approximate an output distribution of a channel in total variation distance. In this paper we study $E_{\gamma}$-resolvability, which replaces total variation distance by the more general $E_{\gamma}$ distance. A general one-shot achievability bound for...
Conference Paper
Variable-length channel codes over discrete memoryless channels subject to probabilistic delay guarantees are examined in the non-vanishing error probability regime. Fundamental limits of these codes in several different settings, which depend on the availability of noiseless feedback and a termination option, are investigated. In stark contrast wi...
Article
Full-text available
This paper studies bounds among various $f$-divergences, dealing with arbitrary alphabets and deriving bounds on the ratios of various distance measures. Special attention is placed on bounds in terms of the total variation distance, including "reverse Pinsker inequalities," as well as on the $E_\gamma$ divergence, which generalizes the total varia...
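On a finite alphabet the $E_\gamma$ divergence admits the simple form $E_\gamma(P\|Q) = \sum_x (P(x) - \gamma Q(x))^+$, which reduces to total variation at $\gamma = 1$. A small sketch (the distributions are illustrative):

```python
def E_gamma(P, Q, gamma):
    """E_gamma(P||Q) = max_A [P(A) - gamma*Q(A)] = sum_x (P(x) - gamma*Q(x))^+ ."""
    return sum(max(p - gamma * q, 0.0) for p, q in zip(P, Q))

P = [0.5, 0.3, 0.2]
Q = [0.2, 0.3, 0.5]

tv = E_gamma(P, Q, 1.0)       # gamma = 1 recovers the total variation distance
print(tv)                     # 0.3
print(E_gamma(P, Q, 2.0))     # 0.1: E_gamma is nonincreasing in gamma
```

Increasing γ relaxes the distance, which is what makes E_γ useful for the "reverse Pinsker" regime the abstract mentions.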
Article
We investigate the minimum transmitted energy required to reproduce k source samples with a given fidelity after transmission over a memoryless Gaussian channel. In particular, we analyze the reduction in transmitted energy that accrues thanks to the availability of noiseless feedback. Allowing a nonvanishing excess distortion probability ε boosts...
Article
In information theory, the packing and covering lemmas are conventionally used in conjunction with the typical sequence approach in order to prove the asymptotic achievability results for discrete memoryless systems. In contrast, the single-shot approach in information theory provides non-asymptotic achievability and converse results, which are use...
Article
We study the amount of randomness needed for an input process to approximate a given output distribution of a channel in the $E_{\gamma}$ distance. A general one-shot achievability bound for the precision of such an approximation is developed. In the i.i.d. setting where $\gamma=\exp(nE)$, a (nonnegative) randomness rate above $\inf_{Q_{\sf U}: D(Q...
Article
Full-text available
A new model of multi-party secret key agreement is proposed, in which one terminal called the communicator can transmit public messages to other terminals before all terminals agree on a secret key. A single-letter characterization of the achievable region is derived in the stationary memoryless case. The new model generalizes some other (old and n...
Article
Full-text available
By developing one-shot mutual covering lemmas, we derive a one-shot achievability bound for broadcast with a common message which recovers Marton's inner bound (with three auxiliary random variables) in the i.i.d. case. The encoder employed is deterministic. The relationship between the mutual covering lemma and a new type of channel resolvability prob...
Conference Paper
This paper provides a necessary condition that good rate-distortion codes must satisfy. Specifically, it is shown that as the blocklength increases, the distribution of the input given the output of a good lossy code converges to the distribution of the input given the output of the joint distribution achieving the rate-distortion function, in terms of...
Article
Full-text available
We show that for product sources, rate splitting is optimal for secret key agreement using limited one-way communication between two terminals. This yields an alternative proof of the tensorization property of a strong data processing inequality originally studied by Erkip and Cover and amended recently by Anantharam et al. We derive a "water-filli...
Conference Paper
This paper gives non-asymptotic converse bounds on the cumulant generating function of the encoded lengths in variable-rate lossy compression and in variable-to-fixed channel coding. The results are given in terms of the Rényi mutual information and the d-tilted Rényi entropy. We also illustrate the application of the non-asymptotic bounds to obtai...
Conference Paper
This paper analyzes the distribution of the codeword lengths of the optimal lossless compression code without prefix constraints both in the non-asymptotic regime and in the asymptotic regime. The technique we use is based on upper and lower bounding the cumulant generating function of the optimum codeword lengths. In the context of prefix codes, t...
Article
Full-text available
We show that for product sources, rate splitting is optimal for secret key agreement using limited one-way communication at two terminals. This yields an alternative proof of the tensorization property of a strong data processing inequality originally studied by Erkip and Cover and amended recently by Anantharam et al. We derive a "water-filling" s...
Conference Paper
We give explicit expressions and upper and lower bounds on the total variation distance between P and Q in terms of the distribution of the random variables log dP/dQ(X) and log dP/dQ(Y), where X and Y are distributed according to P and Q, respectively.
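For discrete P and Q, one such explicit expression is TV(P, Q) = P(log dP/dQ > 0) − Q(log dP/dQ > 0), which the following sketch checks numerically (the distributions are illustrative):

```python
import math

P = [0.5, 0.3, 0.2]
Q = [0.2, 0.3, 0.5]

# Total variation directly: (1/2) * L1 distance
tv_direct = 0.5 * sum(abs(p - q) for p, q in zip(P, Q))

# Via the log-likelihood ratio: TV = P(log dP/dQ > 0) - Q(log dP/dQ > 0),
# i.e. the optimal set is A* = {x : P(x) > Q(x)}
A = [i for i in range(len(P)) if math.log(P[i] / Q[i]) > 0]
tv_llr = sum(P[i] for i in A) - sum(Q[i] for i in A)

print(tv_direct, tv_llr)  # both 0.3
```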
Article
Full-text available
This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length strictly lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precis...
Article
Full-text available
This paper shows new general nonasymptotic achievability and converse bounds and performs their dispersion analysis for the lossy compression problem in which the compressor observes the source through a noisy channel. While this problem is asymptotically equivalent to a noiseless lossy source coding problem with a modified distortion function, non...
Article
Full-text available
This paper shows the strong converse and the dispersion of memoryless channels with cost constraints and performs refined analysis of the third order term in the asymptotic expansion of the maximum achievable channel coding rate, showing that it is equal to $\frac{1}{2}\frac{\log n}{n}$ in most cases of interest. The analysis is based on a new non-...
Article
The dependence-testing (DT) bound is one of the strongest achievability bounds for the binary erasure channel (BEC) in the finite block length regime. In this paper, we show that maximum likelihood decoded regular low-density parity-check (LDPC) codes with at least 5 ones per column almost achieve the DT bound. Specifically, using quasi-regular LDP...
Conference Paper
This work deals with the fundamental limits of strictly-lossless variable-length compression of known sources without prefix constraints. The source dispersion characterizes the time-horizon over which it is necessary to code in order to approach the entropy rate within a pre-specified tolerance. We show that for a large class of sources, the disp...
Article
Full-text available
This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precise, quanti...
Conference Paper
Invoking random coding, but not typical sequences, we give non-asymptotic achievability results for the major setups in multiuser information theory. No limitations, such as memorylessness or discreteness, on sources/channels are imposed. All the bounds given are powerful enough to yield the constructive side of the (asymptotic) capacity regions in...
Conference Paper
We revisit the dilemma of whether one should or should not code when operating under delay constraints. In those curious cases when the source and the channel are probabilistically matched so that symbol-by-symbol coding is optimal in terms of the average distortion achieved, we show that it also achieves the dispersion of joint source-channel codi...
Conference Paper
This paper shows new finite-blocklength converse bounds applicable to lossy source coding as well as joint source-channel coding, which are tight enough not only to prove the strong converse, but to find the rate-dispersion functions in both setups. In order to state the converses, we introduce the d-tilted information, a random variable whose expe...
Conference Paper
This paper considers the distribution of the optimum rate of fixed-to-variable lossless compression. It shows that in the non-asymptotic regime the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are tightly coupled.
Conference Paper
Full-text available
Consider a Bernoulli-Gaussian complex n-vector whose components are X_i B_i, with B_i ~ Bernoulli(q) and X_i ~ CN(0, σ²), i.i.d. across i and mutually independent. This random q-sparse vector is multiplied by a random matrix U, and a randomly chosen subset of the components of average size np, p ∈ [0, 1]...
Conference Paper
We give a general formula for the degrees of freedom of the K-user real additive-noise interference channel involving maximization of information dimension. Previous results are recovered, and even generalized in certain cases with simplified proofs. Connections to fractal geometry are drawn.
Conference Paper
The backoff from capacity due to finite blocklength can be assessed accurately from the channel dispersion. This paper analyzes the dispersion of a single-user, scalar, coherent fading channel with additive Gaussian noise. We obtain a convenient two-term expression for the channel dispersion which shows that, unlike the capacity, it depends crucial...
Conference Paper
Full-text available
This paper studies the minimum achievable source coding rate as a function of blocklength n and tolerable distortion level d. Tight general achievability and converse bounds are derived that hold at arbitrary fixed blocklength. For stationary memoryless sources with separable distortion, the minimum rate achievable is shown to be closely approxim...
Article
If N is standard Gaussian, the minimum mean square error (MMSE) of estimating a random variable X based on √(snr)·X + N vanishes at least as fast as 1/snr as snr → ∞. We define the MMSE dimension of X as the limit as snr → ∞ of the product of snr and the MMSE. MMSE dimension is also shown to be the asymptotic ratio of nonlinear MMSE to linear MM...
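For a standard Gaussian X the MMSE is available in closed form, which makes the definition easy to sanity-check (a minimal sketch; the discrete case, whose MMSE dimension is 0, is omitted):

```python
# For X ~ N(0,1) observed as sqrt(snr)*X + N, the MMSE is 1/(1 + snr) in closed
# form, so snr * mmse(snr) -> 1: the MMSE dimension of a standard Gaussian is 1.
def mmse_gaussian(snr):
    return 1.0 / (1.0 + snr)

for snr in [1e2, 1e4, 1e6]:
    print(snr * mmse_gaussian(snr))   # approaches 1
```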
Article
Full-text available
The minimum expected length for fixed-to-variable length encoding of an n-block memoryless source with entropy H grows as nH + O(1), where the term O(1) lies between 0 and 1. However, this well-known performance is obtained under the implicit constraint that the code assigned to the whole n-block is a prefix code. Dropping the prefix constraint, wh...
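A sketch of the construction without the prefix constraint: sort the 2^n source strings by decreasing probability and assign the k-th most probable string the k-th shortest binary string, of length ⌊log₂ k⌋ (the empty string, for k = 1, has length 0). The resulting expected length falls below the n-block entropy, consistent with the abstract (parameters are illustrative):

```python
import math
from itertools import product

# Minimum expected length of one-to-one (non-prefix) encoding of an n-block
# Bernoulli(p) source: rank the 2^n strings by probability and give the k-th
# most probable string a codeword of length floor(log2(k)).
n, p = 10, 0.2
probs = sorted((p ** sum(x) * (1 - p) ** (n - sum(x))
                for x in product([0, 1], repeat=n)), reverse=True)
exp_len = sum(pr * math.floor(math.log2(k))
              for k, pr in enumerate(probs, start=1))

nH = n * (-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
print(exp_len, nH)   # the expected length is strictly below the n-block entropy
```

Any prefix code, by contrast, would need an expected length of at least nH bits.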
Article
We explore the duality between lossy compression and channel coding in the operational sense: whether a capacity-achieving encoder-decoder sequence achieves the rate-distortion function of the dual problem when the channel decoder [encoder] is the source compressor [decompressor, resp.], and vice versa. We show that, if used as a lossy compressor,...
Article
Full-text available
Channel dispersion plays a fundamental role in assessing the backoff from capacity due to finite blocklength. This paper analyzes the channel dispersion for a simple channel with memory: the Gilbert-Elliott communication model in which the crossover probability of a binary symmetric channel evolves as a binary symmetric Markov chain, with and witho...
Article
Full-text available
The objectives of this article are twofold: first, to present the problem of joint source and channel (JSC) coding from a graphical model perspective, and second, to propose a structure that uses a new graphical model for jointly encoding and decoding a redundant source. In the first part of the article, relevant contributions to JSC coding, rangin...
Article
Fano's inequality relates the error probability of guessing a finitely-valued random variable X given another random variable Y to the conditional entropy of X given Y. It is not necessarily tight when the marginal distribution of X is fixed. This paper gives a tight upper bound on the conditional entropy of X given Y in terms of the error probabi...
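A numerical sanity check of Fano's inequality H(X|Y) ≤ h(Pe) + Pe·log₂(|X|−1), with Pe the minimum (MAP) error probability; the joint distribution is a hypothetical toy example:

```python
import math

# Toy joint distribution P(x, y) on a 3x2 alphabet (illustrative values only)
P = [[0.30, 0.10],
     [0.05, 0.25],
     [0.20, 0.10]]
M = 3

Py = [sum(P[x][y] for x in range(M)) for y in range(2)]

# Conditional entropy H(X|Y) in bits
H_XgY = -sum(P[x][y] * math.log2(P[x][y] / Py[y])
             for x in range(M) for y in range(2))

# Minimum (MAP) error probability of guessing X from Y
Pe = 1.0 - sum(max(P[x][y] for x in range(M)) for y in range(2))

# Fano's inequality: H(X|Y) <= h(Pe) + Pe * log2(M - 1)
h = lambda t: -t * math.log2(t) - (1 - t) * math.log2(1 - t)
print(H_XgY, h(Pe) + Pe * math.log2(M - 1))  # left side does not exceed right side
```

The gap between the two printed numbers is exactly the slack the paper's tighter bound addresses when the marginal of X is fixed.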
Conference Paper
Denote by C_m(snr) the Gaussian channel capacity with signal-to-noise ratio snr and input cardinality m. We show that as m grows, C_m(snr) approaches C(snr) = 1/2 log(1 + snr) exponentially fast. Lower and upper bounds on the exponent are given as functions of snr. We propose a family of input constellations based on the roots o...
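For m = 2 (antipodal inputs), C_m(snr) can be evaluated by one-dimensional numerical integration and compared against C(snr) = ½log₂(1 + snr). A sketch under the model Y = √snr·X + N with X ∈ {−1, +1} equiprobable and N ~ N(0, 1) (the grid parameters are arbitrary):

```python
import math

def bpsk_mutual_info(snr, grid=4000, half_width=8.0):
    """I(X;Y) in bits for equiprobable X in {-1,+1}, Y = sqrt(snr)*X + N, N ~ N(0,1).
    By symmetry: I = 1 - E_{y ~ N(sqrt(snr), 1)}[ log2(1 + exp(-2*sqrt(snr)*y)) ]."""
    s = math.sqrt(snr)
    lo, hi = s - half_width, s + half_width
    dy = (hi - lo) / grid
    acc = 0.0
    for i in range(grid):       # midpoint rule over the Gaussian around +sqrt(snr)
        y = lo + (i + 0.5) * dy
        phi = math.exp(-0.5 * (y - s) ** 2) / math.sqrt(2 * math.pi)
        acc += phi * math.log2(1 + math.exp(-2 * s * y)) * dy
    return 1.0 - acc

snr = 1.0
C = 0.5 * math.log2(1 + snr)   # unconstrained Gaussian capacity = 0.5 bit
mi = bpsk_mutual_info(snr)
print(mi, C)                   # the binary-input rate is close to, but below, C
```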
Article
A random variable with distribution P is observed in Gaussian noise and is estimated by a minimum mean-square error estimator that assumes that the distribution is Q. This paper shows that the integral over all signal-to-noise ratios of the excess mean-square estimation error incurred by the mismatched estimator is twice the relative entropy D(P∥Q). This...
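For scalar Gaussians the identity can be verified numerically in a few lines, since both the matched and the mismatched mean-square errors have closed forms (a sketch; the prior variances p, q are illustrative):

```python
import math

# If X ~ N(0, p) is observed as Y = sqrt(s)*X + N, N ~ N(0, 1), but the estimator
# is the one optimal for X ~ N(0, q), the excess MSE integrated over all snr
# values s equals 2 * D(N(0, p) || N(0, q)) in nats.
p, q = 1.0, 2.0

def mse_matched(s):      # Bayes MMSE under the true prior N(0, p)
    return p / (1 + s * p)

def mse_mismatched(s):   # estimator optimal for N(0, q), applied when X ~ N(0, p)
    return (p + s * q * q) / (1 + s * q) ** 2

# integrate the excess over s in (0, S] with a log-spaced midpoint rule
S, lo, grid = 1e6, 1e-9, 100000
step = (math.log(S) - math.log(lo)) / grid
integral = 0.0
for i in range(grid):
    s = math.exp(math.log(lo) + (i + 0.5) * step)
    integral += (mse_mismatched(s) - mse_matched(s)) * s * step   # ds = s d(log s)

two_D = math.log(q / p) + p / q - 1   # 2 * D(N(0,p) || N(0,q)) in nats
print(integral, two_D)                # both approximately 0.1931
```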
Conference Paper
We explore the duality between the Gelfand-Pinsker problem of channel coding with side information at the transmitter and the Wyner-Ziv problem of lossy compression with side information at the decompressor in the operational sense: whether a capacity-achieving encoder-decoder sequence achieves the rate distortion function of the dual problem when...
Article
We consider the Wyner-Ziv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by Lempel-Ziv (LZ) compression, while the decom...
Article
In this paper, we investigate the linear precoding and power allocation policies that maximize the mutual information for general multiple-input multiple-output (MIMO) Gaussian channels with arbitrary input distributions, by capitalizing on the relationship between mutual information and minimum mean-square error (MMSE). The optimal linear precoder...
Conference Paper
Full-text available
The energy-distortion tradeoff for lossy transmission of sources over multi-user networks is studied. The energy-distortion function E(D) is defined as the minimum energy required to transmit a source to the receiver within the target distortion D, when there is no restriction on the number of channel uses per source sample. For point-to-point chan...
Conference Paper
Full-text available
We find the capacity of discrete-time channels subject to both frequency-selective and time-selective fading, where the channel output is observed in additive Gaussian noise. A coherent model is assumed where the fading coefficients are known at the receiver. Capacity depends on the first-order distributions of the fading processes in frequency and...
Conference Paper
Full-text available
The energy-distortion function E(D) for the joint source-channel coding problem in networks is defined and studied. The energy-distortion function E(D) is defined as the minimum energy required to transmit a source to a receiver within the target distortion D, when there is no restriction on the number of channel uses per source sample. For point-t...
Conference Paper
This paper considers a three-terminal communication problem with one source node which broadcasts a common message to two destination nodes over a wireless medium. The destination nodes can cooperate over bidirectional wireless links. We study the minimum energy per information bit for this setup when there is no constraint on the available bandwid...
Conference Paper
In Shannon theory, lossless source coding deals with the optimal compression of discrete sources. Compressed sensing is a lossless coding strategy for analog sources by means of multiplication by real-valued matrices. In this paper we study almost lossless analog compression for analog memoryless sources in an information-theoretic framework, in wh...
Conference Paper
Full-text available
The minimum block-length required to achieve a given rate and error probability can be easily and tightly approximated from two key channel parameters: the capacity and the channel dispersion. The channel dispersion gauges the variability of the channel relative to a deterministic bit pipe with the same capacity. This paper finds the dispersion of...
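A sketch of this approximation for the binary symmetric channel BSC(δ), whose capacity and dispersion are C = 1 − h(δ) and V = δ(1−δ)·log₂²((1−δ)/δ); the +log₂(n)/(2n) third-order refinement is included and all numerical parameters are illustrative:

```python
import math
from statistics import NormalDist

def h2(x):
    """Binary entropy function, in bits."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def approx_rate(n, delta, eps):
    """Normal approximation R ~ C - sqrt(V/n)*Qinv(eps) + log2(n)/(2n)
    for the best rate of the BSC(delta) at blocklength n, error probability eps."""
    C = 1 - h2(delta)
    V = delta * (1 - delta) * (math.log2((1 - delta) / delta)) ** 2
    Qinv = NormalDist().inv_cdf(1 - eps)   # inverse Gaussian tail function
    return C - math.sqrt(V / n) * Qinv + math.log2(n) / (2 * n)

n, delta, eps = 1000, 0.11, 1e-3
rate = approx_rate(n, delta, eps)
print(rate, 1 - h2(delta))   # finite-blocklength rate vs the capacity C ~ 0.5
```

The gap between the two printed values is the backoff from capacity that the dispersion quantifies at this blocklength.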
Conference Paper
We consider scaling laws for maximal energy efficiency of communicating a message to all the nodes in a random wireless network, as the number of nodes in the network becomes large. Two cases of large wireless networks are studied - dense random networks and constant density (extended) random networks. We first establish an information-theoretic lo...
Conference Paper
Full-text available
Conventional wisdom states that the minimum expected length for fixed-to-variable length encoding of an n-block memoryless source with entropy H grows as nH+O(1). However, this performance is obtained under the constraint that the code assigned to the whole n-block is a prefix code. Dropping this unnecessary constraint, we show that the minimum expe...
Article
We propose a scheme for lossy compression of discrete memoryless sources: The compressor is the decoder of a nonlinear channel code, constructed from a sparse graph. We prove asymptotic optimality of the scheme for any separable (letter-by-letter) bounded distortion criterion. We also present a suboptimal compression algorithm, which exhibits near-...

Citations

... The seventh paper by Verdú [46] is a research and tutorial paper on error exponents and α-mutual information. Similarly to [23] (the second paper in this Special Issue), it relates to Rényi's generalization of the relative entropy and mutual information. ...
... Previous solutions for the Gaussian [11,12] and the binary symmetric [13] cases were based on rearrangement inequalities or constrained optimal transport in 2-norm, which are specialized to those channel distributions. Other approaches based on measure concentration [14][15][16][17] and reverse hypercontractivity [18] (building on a method of [19][20][21]) apply for general channels but are not strong enough in the regime of interest for Cover's problem. ...
... In recent years, tools from noncommutative analysis have proven fruitful in understanding the primitives of Quantum Information Theory (QIT), where-mostly due to the noncommutative nature of the theory-even finding viable quantum analogs of certain information-theoretic quantities turns out to be nontrivial, and various challenges arise to be overcome (see, e.g., [11,10,40,5,2]). Analytical methods have been exploited in classical information theory just as well, where the quantum phenomena are nonexistent (see, e.g., [35,23,20,24]). ...
... We note that while the conditional relative entropy in (1.6) is the expectation of D(P Y |X (·|X) Q Y |X (·|X)) over X ∼ P X , the conditional Rényi divergence in (1.7) depends on D 1+s (P Y |X (·|X) Q Y |X (·|X)) in a more involved way; indeed, it is a generalized mean of the random variable D 1+s (P Y |X (·|X) Q Y |X (·|X)) evaluated at s. For a more detailed discussion on this point, the reader is referred to Cai and Verdú [32]. We also note that there are other definitions of the conditional Rényi divergence but we will use the definition in (1.7) in this monograph; see [20], [43], [155]. ...
... Further, in order to extend the linear distortion measure (the average distortion) to nonlinear distortion measures with respect to the distortion of each data point, f-separable distortion measures based on the f-mean have been proposed. In particular, for this distortion measure, the rate-distortion function characterizing the limit of lossy compression was elucidated [17]. ...
... (5.8.19)] for the BSC. The notion of perfect and quasi-perfect codes was generalized beyond binary alphabets in [10]. These codes, whenever they exist, attain the hypothesis-testing bound [7, Th. ...
... , L} are selected upon observing all the random variables U M and V L . A duality between covering and sampling recently observed in [7] shows that the approximation error in total variation is precisely characterized by the left side of (4). Based on this duality observation, we derive the exact error exponent as well as the second-order rates (for a nonvanishing error) of joint distribution simulation of stationary memoryless sources. ...
... More or less motivated by this, the Gaussian optimality problem has spurred a lot of research interest recently, e.g., [5], [13], [35], [11], [41], [26], [3]. It was known that Gaussian inputs are optimal for computing the corners of the region [44], [9], [43], [5], [11], [24], [26], [25], but the full region or even the precise slope at Costa's corner point remains open [26]. ...
... Proof: We start from equation (12). The asymptotics of the α-mutual information term indirectly follows from the proof of Theorem 2 in [17]: from [17, Equation (80)] onwards, it is proved that ...
... [t] in (36) can be eliminated, and the resulting directed information is zero because different components of the vector Y_i^k are independent due to (18), (19); (38) is by substituting (34); (39) ...