ABSTRACT: We consider the communication capacity of wireline networks for a two-unicast
traffic pattern. The network has two sources and two destinations with each
source communicating a message to its own destination, subject to the capacity
constraints on the directed edges of the network. We propose a simple outer
bound for the problem that we call the Generalized Network Sharing (GNS) bound.
We show this bound is the tightest edge-cut bound for two-unicast networks and
is tight in several bottleneck cases, though it is not tight in general. We
also show that the problem of computing the GNS bound is NP-complete. Finally,
we show that despite its seeming simplicity, the two-unicast problem is as hard
as the most general network coding problem. As a consequence, linear coding is
insufficient to achieve capacity for general two-unicast networks, and
non-Shannon inequalities are necessary for characterizing capacity of general
two-unicast networks.
ABSTRACT: We consider the following non-interactive simulation problem: Alice and Bob
observe sequences $X^n$ and $Y^n$ respectively where $\{(X_i, Y_i)\}_{i=1}^n$
are drawn i.i.d. from $P(x,y)$, and they output $U$ and $V$ respectively, whose
joint law is required to be close in total variation to a specified
$Q(u,v)$. It is known that the maximal correlation of $U$ and $V$ must
necessarily be no bigger than that of $X$ and $Y$ if this is to be possible.
Our main contribution is to bring hypercontractivity to bear as a tool on this
problem. In particular, we show that if $P(x,y)$ is the doubly symmetric binary
source, then hypercontractivity provides stronger impossibility results than
maximal correlation. Finally, we extend these tools to provide impossibility
results for the $k$-agent version of this problem.
ABSTRACT: In this paper, we consider the AWGN channel with a power constraint called
the $(\sigma, \rho)$-power constraint, which is motivated by energy harvesting
communication systems. Given a codeword, the constraint imposes a limit of
$\sigma + k \rho$ on the total power of any $k\geq 1$ consecutive transmitted
symbols. Such a channel has infinite memory and evaluating its exact capacity
is a difficult task. Consequently, we establish an $n$-letter capacity
expression and seek bounds for the same. We obtain a lower bound on capacity by
considering the volume of ${\cal S}_n(\sigma, \rho) \subseteq \mathbb{R}^n$,
which is the set of all length $n$ sequences satisfying the $(\sigma,
\rho)$-power constraints. For a noise power of $\nu$, we obtain an upper bound
on capacity by considering the volume of ${\cal S}_n(\sigma, \rho) \oplus
B_n(\sqrt{n\nu})$, which is the Minkowski sum of ${\cal S}_n(\sigma, \rho)$ and
the $n$-dimensional Euclidean ball of radius $\sqrt{n\nu}$. We analyze this
bound using a result from convex geometry known as Steiner's formula, which
gives the volume of this Minkowski sum in terms of the intrinsic volumes of
${\cal S}_n(\sigma, \rho)$. We show that as the dimension $n$ increases, the
logarithm of the sequence of intrinsic volumes of $\{{\cal S}_n(\sigma,
\rho)\}$ converges to a limit function under an appropriate scaling. The upper
bound on capacity is then expressed in terms of this limit function. We derive
the asymptotic capacity in the low and high noise regime for the $(\sigma,
\rho)$-power constrained AWGN channel, with strengthened results for the
special case of $\sigma = 0$, which is the amplitude constrained AWGN channel.
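Operationally, the $(\sigma, \rho)$-power constraint is a family of window constraints and is easy to check directly. As a minimal sketch (the function name and the brute-force prefix-sum check are mine, not from the paper), the following verifies that every run of $k \geq 1$ consecutive symbols of a codeword has total power at most $\sigma + k\rho$:

```python
def satisfies_sigma_rho(x, sigma, rho):
    """Check the (sigma, rho)-power constraint: every k >= 1 consecutive
    symbols must have total power at most sigma + k*rho."""
    n = len(x)
    # prefix sums of per-symbol powers x_i^2
    P = [0.0]
    for xi in x:
        P.append(P[-1] + xi * xi)
    for left in range(n):
        for right in range(left + 1, n + 1):
            k = right - left
            if P[right] - P[left] > sigma + k * rho + 1e-12:
                return False
    return True
```

For instance, the all-ones codeword meets the $(0, 1)$-power constraint with equality in every window, while any single symbol of power above $\sigma + \rho$ is infeasible.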
ABSTRACT: We are motivated by applications that need rich model classes to represent
them. Examples of rich model classes include distributions over large,
countably infinite supports, slow mixing Markov processes, etc. But such rich
classes may be too complex to admit estimators that converge to the truth with
convergence rates that can be uniformly bounded over the entire model class as
the sample size increases (uniform consistency). However, these rich classes
may still allow for estimators with pointwise guarantees whose performance can
be bounded in a model-dependent way. The pointwise angle of course has the
drawback that the estimator performance is a function of the very unknown model
that is being estimated, and is therefore unknown. Thus, even if the
estimator is consistent, how well it is doing may not be clear no matter what
the sample size is. Departing from the dichotomy of uniform and pointwise
consistency, a new analysis framework is explored by characterizing rich model
classes that may only admit pointwise guarantees, yet all the information about
the model needed to gauge estimator accuracy can be inferred from the sample at
hand. To retain focus, we analyze the universal compression problem in this
data-driven pointwise consistency framework.
ABSTRACT: In this paper, we revisit the structure of infeasibility results in network information theory, based on a notion of information state. We also discuss ideas for generalizing a known outer bound for lossless transmission of independent sources over a network to one of lossy transmission of dependent sources over the same network. To concretely demonstrate this, we apply our ideas and prove new results for lossy transmission of dependent sources by generalizing: 1) the cut-set bound; 2) the best known outer bound on the capacity region of a general broadcast channel; and 3) the outer bound part of the result of Maric, Yates, and Kramer on strong interference channels with a common message.
IEEE Transactions on Information Theory 10/2014; 60(10):5992-6004. DOI:10.1109/TIT.2014.2347301
ABSTRACT: Consider a family of Boolean models, indexed by integers $n \ge 1$, where the
$n$-th model features a Poisson point process in ${\mathbb{R}}^n$ of intensity
$e^{n \rho_n}$ with $\rho_n \to \rho$ as $n \to \infty$, and balls with
independent and identically distributed radii, each distributed like $\bar X_n
\sqrt{n}$, where $\bar X_n$ satisfies a large deviations principle. It is shown
that there exist three deterministic thresholds: $\tau_d$ the degree threshold;
$\tau_p$ the percolation threshold; and $\tau_v$ the volume fraction threshold;
such that asymptotically as $n$ tends to infinity, in a sense made precise in
the paper: (i) for $\rho < \tau_d$, almost every point is isolated, namely its
ball intersects no other ball; (ii) for $\tau_d< \rho< \tau_p$, almost every
ball intersects an infinite number of balls and nevertheless there is no
percolation; (iii) for $\tau_p< \rho< \tau_v$, the volume fraction is 0 and
nevertheless percolation occurs; (iv) for $\tau_d< \rho< \tau_v$, almost every
ball intersects an infinite number of balls and nevertheless the volume
fraction is 0; (v) for $\rho > \tau_v$, the whole space is covered. The analysis
of this asymptotic regime is motivated by related problems in information
theory, and may be of interest in other applications of stochastic geometry.
ABSTRACT: We are motivated by applications that need rich model classes to represent the application, such as the set of all discrete distributions over large, countably infinite supports. But such rich classes may be too complex to admit estimators that converge to the truth with convergence rates that can be uniformly bounded over the entire model class as the sample size increases (uniform consistency). However, these rich classes may still allow for estimators with pointwise guarantees whose performance can be bounded in a model-dependent way. But the pointwise angle has a drawback as well—estimator performance is a function of the very unknown model that is being estimated, and is therefore unknown. Therefore, even if an estimator is consistent, how well it is doing may not be clear no matter what the sample size. Departing from the uniform/pointwise dichotomy, a new analysis framework is explored by characterizing rich model classes that may only admit pointwise guarantees, yet all information about the unknown model needed to gauge estimator accuracy can be inferred from the sample at hand. To bring focus, we analyze the universal compression problem in this data driven, pointwise consistency framework. Today, data accumulated in many biological, financial, and other statistical problems stands out not just because of its nature or size, but also because the questions we ask of it are unlike anything we asked before. There is often a tension in these big data problems between the need for rich model classes to better represent the application and our ability to handle these classes at all from a mathematical point of view. Consider an example of insuring the risk of exposure to the Internet as opposed to the simple credit monitoring tools available today. Given the significant number of identity thefts, security breaches, and privacy concerns, insurance of this nature may be highly desirable. How would one model loss here?
After all, losses suffered can range from direct loss of property to more intangible, yet very significant damage resulting from lowered credit scores. Designing insurance policies with ceilings on claim payments keeps us in familiar territory mathematically, but also misses the point of why one may want this sort of insurance. We therefore want a richer set of candidate loss models that do not impose artificial ceilings on loss. But we will run into a fundamental roadblock here. Richness of model classes is often quantified by metrics such as the VC-dimension [1], the Rademacher complexity [2], [3], [4], or the strong compression redundancy [5], [6], [7], [8], [9]. Typically, one looks for estimation algorithms with model-agnostic guarantees based on the sample size—indeed this is the uniform consistency dogma that underlies most formulations of engineering applications today. But any such guarantee on estimators on a model class depends on the complexity metrics above—the more complex a class, the worse the guarantees. In fact, the insurance problem above and many applications in the "big data" regime force us to consider model classes that are too complex to admit estimators with reasonable model-agnostic guarantees (or uniformly consistent estimators). Instead the best we can often do is to have guarantees dependent on not just the sample size but on the underlying model in addition (pointwise consistent). This is not very helpful either—our gauge of how well the estimator is doing is dependent on the very quantity being estimated! As in [10], we challenge the dichotomy of uniform and pointwise consistency in the analysis of statistical estimators. Neither uniform nor pointwise guarantees are particularly suited to the big data problems we have in mind. The former precludes the desired richness of model classes. While the latter allows for rich model classes, it does not provide practical guarantees that can be used in applications. 
Instead, we consider a new paradigm positioned in between these two extremes. This framework modifies the world of pointwise consistent estimators—keeping as much of the richness of model classes as possible but
IEEE Symposium on Information Theory, Honolulu, HI; 06/2014
ABSTRACT: In this paper we provide the correct tight constant to a data-processing inequality claimed by Erkip and Cover. The correct constant turns out to be a particular hypercontractivity parameter of (X,Y), rather than their squared maximal correlation. We also provide alternate geometric characterizations for both maximal correlation as well as the hypercontractivity parameter that characterizes the data-processing inequality.
2014 IEEE International Symposium on Information Theory (ISIT); 06/2014
ABSTRACT: In applications involving estimation, the relevant model classes of probability distributions are often too complex to admit estimators that converge to the truth with convergence rates that can be uniformly bounded over the entire model class as the sample size increases (uniform consistency). While it is often possible to get pointwise guarantees, so that the convergence rate of the estimator can be bounded in a model-dependent way, such pointwise guarantees are unsatisfactory - estimator performance is a function of the very unknown quantity that is being estimated. Therefore, even if an estimator is consistent, how well it is doing may not be clear no matter what the sample size. Departing from this traditional uniform/pointwise dichotomy, a new analysis framework is explored by characterizing model classes of probability distributions that may only admit pointwise guarantees, yet where all the information about the unknown model needed to gauge estimator accuracy can be inferred from the sample at hand. To provide a focus to this suggested broad new paradigm, we analyze the universal compression problem in this data-driven pointwise consistency framework.
2014 IEEE International Symposium on Information Theory (ISIT); 06/2014
ABSTRACT: In energy harvesting communication systems, the transmitter can harvest energy in each time slot. The harvested energy is either used right away or is stored in a battery to facilitate future transmissions. We consider the problem of determining the Shannon capacity of an energy harvesting transmitter communicating over an additive white Gaussian noise (AWGN) channel, where the amount of energy harvested per time slot is a constant ρ and the battery has capacity σ. This imposes a new kind of power constraint on the transmitted codewords, and we call the resulting constrained channel a (σ, ρ) power constrained AWGN channel. When σ is 0 or ∞, the capacity of this channel is known. For the finite battery case, we obtain an expression for the channel capacity. We obtain bounds on capacity by considering the volume of Sn(σ, ρ) ⊆ ℝn, which is the set of all length n sequences satisfying the (σ, ρ) constraints.
2014 IEEE International Symposium on Information Theory (ISIT); 06/2014
ABSTRACT: We look at irreducible continuous time Markov chains with a finite or countably infinite number of states, and a unique stationary distribution π. If the Markov chain has distribution μt at time t, its relative entropy to stationarity is denoted by h(μt|π). This is a monotonically decreasing function of time, and decays to 0 at an exponential rate in most natural examples of Markov chains arising in applications. In this paper, we focus on the second derivative properties of h(μt|π). In particular we examine when relative entropy to stationarity exhibits convex decay, independent of the starting distribution. It has been shown that convexity of h(μt|π) in a Markov chain can lead to sharper bounds on the rate of relative entropy decay, and thus on the mixing time of the Markov chain. We study certain finite state Markov chains as well as countable state Markov chains arising from stable Jackson queueing networks.
2014 48th Annual Conference on Information Sciences and Systems (CISS); 03/2014
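As a concrete instance of the quantity h(μt|π) studied above, the following sketch (the two-state chain and its rates are my illustrative choices, not from the paper) evaluates relative entropy to stationarity along the closed-form trajectory of a two-state continuous time Markov chain and checks that it decays monotonically:

```python
import math

def relative_entropy(mu, pi):
    """D(mu || pi) in nats."""
    return sum(m * math.log(m / p) for m, p in zip(mu, pi) if m > 0)

def two_state_distribution(p0, a, b, t):
    """Distribution at time t of a two-state CTMC with rate a (0 -> 1)
    and rate b (1 -> 0), started from (p0, 1 - p0). The generator has
    nonzero eigenvalue -(a + b), giving this closed form."""
    pi0 = b / (a + b)
    pt0 = pi0 + (p0 - pi0) * math.exp(-(a + b) * t)
    return (pt0, 1.0 - pt0)

a, b = 1.0, 2.0
pi = (b / (a + b), a / (a + b))
hs = [relative_entropy(two_state_distribution(1.0, a, b, t), pi)
      for t in (0.0, 0.5, 1.0, 2.0)]
# relative entropy to stationarity decreases along the trajectory
assert all(h1 > h2 for h1, h2 in zip(hs, hs[1:]))
```

Whether the decay is moreover convex, independent of the starting distribution, is exactly the second-derivative question the paper investigates.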
ABSTRACT: Hypercontractivity has had many successful applications in mathematics, physics, and theoretical computer science. In this work we use recently established properties of the hypercontractivity ribbon of a pair of random variables to study a recent conjecture regarding the mutual information between binary functions of the individual marginal sequences of a sequence of pairs of random variables drawn from a doubly symmetric binary source.
2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton); 10/2013
ABSTRACT: In this paper we provide a new geometric characterization of the
Hirschfeld-Gebelein-R\'{e}nyi maximal correlation of a pair of random variables
$(X,Y)$, as well as of the chordal slope of the nontrivial boundary of the
hypercontractivity ribbon of $(X,Y)$ at infinity. The new characterizations
lead to simple proofs for some of the known facts about these quantities. We
also provide a counterexample to a data processing inequality claimed by Erkip
and Cover, and find the correct tight constant for this kind of inequality.
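For finite alphabets, a standard computational route (background, not the paper's new characterization) obtains the maximal correlation as the second-largest singular value of the matrix $B(x,y) = P(x,y)/\sqrt{p(x)q(y)}$. A minimal sketch:

```python
import numpy as np

def maximal_correlation(P):
    """Hirschfeld-Gebelein-Renyi maximal correlation of a finite-alphabet
    pair with joint pmf matrix P: the second-largest singular value of
    B[x, y] = P[x, y] / sqrt(p(x) * q(y))."""
    P = np.asarray(P, dtype=float)
    p = P.sum(axis=1)  # marginal of X
    q = P.sum(axis=0)  # marginal of Y
    B = P / np.sqrt(np.outer(p, q))
    s = np.linalg.svd(B, compute_uv=False)  # sorted descending; s[0] == 1
    return s[1]

# doubly symmetric binary source with crossover eps: rho_m = 1 - 2*eps
eps = 0.1
dsbs = [[(1 - eps) / 2, eps / 2], [eps / 2, (1 - eps) / 2]]
```

For the doubly symmetric binary source with crossover probability $\epsilon \le 1/2$ this recovers the known value $1 - 2\epsilon$; for a product distribution the second singular value is 0.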
ABSTRACT: Marton's region is the best known inner bound for a general discrete memoryless broadcast channel. We establish improved bounds on the cardinalities of the auxiliary random variables. We combine the perturbation technique with a representation using concave envelopes to achieve this improvement. As a corollary of this result, we show that a randomized time division strategy achieves the entire Marton's region for binary input broadcast channels, extending the previously known result for the sum-rate and validating a previous conjecture due to the same authors.
Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on; 01/2013
ABSTRACT: Motivated by problems in insurance, our task is to predict finite upper
bounds on a future draw from an unknown distribution $p$ over the set of
natural numbers. We can only use past observations generated independently and
identically distributed according to $p$. While $p$ is unknown, it is known to
belong to a given collection ${\cal P}$ of probability distributions on the
natural numbers.
The support of the distributions $p \in {\cal P}$ may be unbounded, and the
prediction game goes on for \emph{infinitely} many draws. We are allowed to
make observations without predicting upper bounds for some time. But we must,
with probability 1, start and then continue to predict upper bounds after a
finite time irrespective of which $p \in {\cal P}$ governs the data.
If it is possible, without knowledge of $p$ and for any prescribed confidence
however close to 1, to come up with a sequence of upper bounds that is never
violated over an infinite time window with confidence at least as big as
prescribed, we say the model class ${\cal P}$ is \emph{insurable}.
We completely characterize the insurability of any class ${\cal P}$ of
distributions over the natural numbers by means of a condition on the
neighborhoods of distributions in ${\cal P}$ that is both necessary and
sufficient.
ABSTRACT: In a peer-to-peer file sharing system based on random contacts where the upload capacity of the seed is small, a single chunk of the file may become rare, causing an accumulation of peers who lack the rare chunk. To prevent this from happening, we propose a protocol where each peer samples a small population of peers and makes an intelligent decision to pick which chunk to download based on this sample. We prove that the resulting system is stable under any arrival rate of peers even if the seed has small, bounded upload capacity.
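The abstract does not spell out the chunk-selection rule, so the following is only a hypothetical illustration (all names and the tie-breaking are mine) of one "rarest-in-sample" decision a peer could make from a small sampled population of peers:

```python
from collections import Counter

def pick_chunk(my_chunks, sampled_peers, num_chunks):
    """Hypothetical rarest-in-sample rule: count how often each chunk the
    peer is missing appears among the sampled peers' chunk sets, then
    download a chunk that is rarest in the sample."""
    missing = set(range(num_chunks)) - my_chunks
    if not missing:
        return None  # nothing left to download
    counts = Counter({c: 0 for c in missing})
    for peer_chunks in sampled_peers:
        for c in peer_chunks & missing:
            counts[c] += 1
    # min() breaks ties by chunk index; a real protocol might randomize
    return min(missing, key=lambda c: counts[c])
```

A rule of this type biases downloads toward chunks that are scarce in the sampled population, the kind of bias the abstract describes for keeping any one chunk from becoming rare.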
ABSTRACT: Shannon's Entropy Power Inequality can be viewed as characterizing the
minimum differential entropy achievable by the sum of two independent random
variables with fixed differential entropies. The entropy power inequality has
played a key role in resolving a number of problems in information theory. It
is therefore interesting to examine the existence of a similar inequality for
discrete random variables. In this paper we obtain an entropy power inequality
for random variables taking values in an abelian group of order 2^n, i.e. for
such a group G we explicitly characterize the function f_G(x,y) giving the
minimum entropy of the sum of two independent G-valued random variables with
respective entropies x and y. Random variables achieving the extremum in this
inequality are thus the analogs of Gaussians in this case, and these are also
determined. It turns out that f_G(x,y) is convex in x for fixed y and, by
symmetry, convex in y for fixed x. This is a generalization to abelian groups
of order 2^n of the result known as Mrs. Gerber's Lemma.
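For the base case $G = \mathbb{Z}_2$, Mrs. Gerber's Lemma gives $f_{\mathbb{Z}_2}(x,y) = h(h^{-1}(x) \star h^{-1}(y))$, where $h$ is the binary entropy function, $h^{-1}$ its inverse on $[0, 1/2]$, and $a \star b = a(1-b) + b(1-a)$ is binary convolution. A minimal numerical sketch (the bisection inverse is my implementation choice):

```python
import math

def h(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def h_inv(x):
    """Inverse of h on [0, 1/2], by bisection (h is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if h(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f_Z2(x, y):
    """Minimum entropy of the sum of two independent Z_2-valued random
    variables with entropies x and y: h of the binary convolution of the
    extremal parameters (Mrs. Gerber's Lemma)."""
    a, b = h_inv(x), h_inv(y)
    return h(a * (1 - b) + b * (1 - a))
```

The extremizers here are Bernoulli distributions, the $\mathbb{Z}_2$ analogues of Gaussians in the classical entropy power inequality; note for instance that $f_{\mathbb{Z}_2}(x, 0) = x$ and $f_{\mathbb{Z}_2}(1, y) = 1$.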
ABSTRACT: A positive recurrent, aperiodic Markov chain is said to be long-range dependent
(LRD) when the indicator function of a particular state is LRD. This happens if
and only if the return time distribution for that state has infinite variance.
We investigate the question of whether other instantaneous functions of the
Markov chain also inherit this property. We provide conditions under which the
function has the same degree of long-range dependence as the chain itself. We
illustrate our results through three examples in diverse fields: queueing
networks, source compression, and finance.
Journal of Applied Probability 06/2012; 49(2012). DOI:10.1239/jap/1339878798
ABSTRACT: Stability and convergence properties of stochastic approximation algorithms are analyzed when the noise includes a long range dependent component (modeled by a fractional Brownian motion) and a heavy tailed component (modeled by a symmetric stable process), in addition to the usual ‘martingale noise’. This is motivated by the emergent applications in communications. The proofs are based on comparing suitably interpolated iterates with a limiting ordinary differential equation. Related issues such as asynchronous implementations, Markov noise, etc. are briefly discussed.
Queueing Systems 06/2012; 71(1-2). DOI:10.1007/s11134-012-9283-0
ABSTRACT: Seventeen years ago at the ITW that was held in Moscow, I organized
a similar panel on the future of Information Theory with the participation
of Dick Blahut, Imre Csiszár, Dave Forney, Prakash Narayan and
Mark Pinsker. In preparation for this panel I have asked our panelists to
read the transcript of that panel (published in the December 1994 issue
of this newsletter) and discuss the ways in which that panel’s predictions
were and were not accurate.