Motivation | Almost Lossless Analog Compression | Rényi's Information Dimension | Theorems | Conclusion

Fundamental Limits of Almost Lossless Analog Compression

Yihong Wu and Sergio Verdú

Department of Electrical Engineering

Princeton University

June 29, 2009

Dimension compression rate: discrete sources

Let $X$ be a random variable on a discrete alphabet $\mathcal{A}$.

Let $X^n = [X_1, \ldots, X_n]^T$ be i.i.d. with $X_i \sim X$.

An $(n,k)$-code:

Encoder $f : \mathcal{A}^n \to \mathcal{A}^k$.

Decoder $g : \mathcal{A}^k \to \mathcal{A}^n$.

Block error probability: $\epsilon_n = P\{g(f(X^n)) \neq X^n\}$.

Dimension compression rate: $R = k/n$.

Fundamental limit: $H(X) / \log|\mathcal{A}|$.
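
As a quick numerical check of the fundamental limit $H(X)/\log|\mathcal{A}|$, the following minimal sketch evaluates the ratio for a given probability mass function. The function name `dimension_rate_limit` and the two example distributions are illustrative assumptions, not part of the talk.

```python
import math

def dimension_rate_limit(pmf):
    """Return H(X) / log|A| for a pmf over a finite alphabet A with |A| >= 2."""
    if len(pmf) < 2:
        raise ValueError("alphabet must contain at least two symbols")
    if abs(sum(pmf) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Shannon entropy in nats; zero-probability symbols contribute nothing.
    entropy = -sum(p * math.log(p) for p in pmf if p > 0)
    # Dividing by log|A| makes the ratio independent of the logarithm base.
    return entropy / math.log(len(pmf))

uniform = [0.25, 0.25, 0.25, 0.25]   # hypothetical example: incompressible
skewed = [0.7, 0.1, 0.1, 0.1]        # hypothetical example: compressible
print(dimension_rate_limit(uniform))
print(dimension_rate_limit(skewed))
```

A uniform source gives a ratio of 1 (no dimension reduction is possible), while any non-uniform source gives a ratio strictly below 1.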

As $|\mathcal{A}| \to \infty$

Question

What is the fundamental limit $H(X)/\log|\mathcal{A}|$ when the alphabet becomes a continuum, where $H(X) \to \infty$ and $\log|\mathcal{A}| \to \infty$?

Dimension compression rate: analog sources

Let $X$ be a random variable on $\mathbb{R}$.

Let $X^n = [X_1, \ldots, X_n]^T$ be i.i.d. with $X_i \sim X$.

An $(n,k)$-code:

Encoder $f : \mathbb{R}^n \to \mathbb{R}^k$.

Decoder $g : \mathbb{R}^k \to \mathbb{R}^n$.

Block error probability: $\epsilon_n = P\{g(f(X^n)) \neq X^n\}$.

Dimension compression rate: $R = k/n$.

Fundamental limit: 0!

Zero Achievable Rate

$\mathbb{R}^n$ has the same cardinality as $\mathbb{R}$ (Cantor 1891)

⇒ Rate: $R = 1/n \to 0$
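
One classical way to see why $\mathbb{R}^2$ and $\mathbb{R}$ have the same cardinality is to interleave decimal digits. The sketch below is only an illustration under stated assumptions and is not the construction used in the talk: it works on finite digit strings of points in $[0,1)^2$ and ignores the ambiguity of expansions ending in repeated 9s, so it is an injection on truncated expansions rather than a true bijection. The function names are hypothetical.

```python
from typing import Tuple

def interleave(x_digits: str, y_digits: str) -> str:
    """Merge two equal-length digit strings into one: x1 y1 x2 y2 ..."""
    if len(x_digits) != len(y_digits):
        raise ValueError("digit strings must have equal length")
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def deinterleave(z_digits: str) -> Tuple[str, str]:
    """Invert interleave: even positions give x, odd positions give y."""
    return z_digits[0::2], z_digits[1::2]

# Hypothetical example: fractional digits of two numbers in [0, 1).
x, y = "141592", "718281"
z = interleave(x, y)
print("0." + z)              # one real number encoding the pair
print(deinterleave(z))       # the pair is recovered from the single number
```

Digit-based maps of this kind are discontinuous at points with terminating expansions, which is the sort of irregularity that constraints on the encoder and decoder are meant to exclude.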

Observations

The bijection $f : \mathbb{R}^n \to \mathbb{R}$ is highly irregular.

$f$ and $f^{-1}$ can both be Borel measurable but not continuous.

Goal: to seek lossless and graceful compression schemes, i.e., with regularity constraints on encoder/decoder:

Low complexity.

Noise resilience, i.e., robust reconstruction.

Compressed Sensing (CS): an analog compression paradigm

Block diagram: $x^n \to$ ENC ($A$) $\to y^k = A x^n \to$ DEC ($g$) $\to \hat{x}^n$

For sparse vectors: $x^n$ has a given fraction of non-zero components.

Linear encoder: $A \in \mathbb{R}^{k \times n}$.

Non-linear decoder $g$: $\ell_1$-minimization, $\ell_2$-regularization, matching pursuit, basis pursuit, ...

CS: achievability result

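Block diagram: $x^n \to$ ENC ($A$) $\to y^k = A x^n \to$ DEC ($g$) $\to \hat{x}^n$

Exact measurement [Candès et al., 2006a]:

$\forall x^n \in \mathbb{R}^n : \|x^n\|_0 \le s$.

$A$ randomly chosen (e.g., from a Gaussian ensemble).

$k = \Theta(s \log n)$.

$\epsilon_n \to 0$ as $n \to \infty$.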
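
The encoder/decoder pipeline can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the decoder analyzed in the cited result: it uses a random Gaussian matrix as the linear encoder and, for simplicity, orthogonal matching pursuit (a greedy member of the matching-pursuit family) instead of $\ell_1$-minimization. The sizes `n=200`, `s=5`, `k=100`, the seed, and the function names are arbitrary choices made for illustration.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: estimate a sparse x such that y = A x."""
    n = A.shape[1]
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat

def demo(n=200, s=5, k=100, seed=0):
    """Encode an s-sparse vector with a Gaussian matrix and decode with OMP."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    idx = rng.choice(n, size=s, replace=False)
    # Non-zero entries with magnitude in [1, 2) and random sign.
    x[idx] = rng.choice([-1.0, 1.0], size=s) * (1.0 + rng.random(s))
    A = rng.standard_normal((k, n)) / np.sqrt(k)   # linear encoder
    y = A @ x                                      # k measurements
    return np.linalg.norm(omp(A, y, s) - x)        # reconstruction error

print(demo())
```

For most random draws with this many measurements the printed error should be close to machine precision, although a greedy decoder can fail on an unlucky draw.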

CS: robustness result

Block diagram: $x^n \to$ ENC ($A$) $\to y^k = A x^n \to (+\, e^k) \to \hat{y}^k \to$ DEC ($g$) $\to \hat{x}^n$

Noisy measurement [Candès et al., 2006b]:

$g$: $\ell_1$-regularization decoder.

The RIP constant of $A$ satisfies certain conditions.

$\|\hat{x}^n - x^n\|_2 \le C_s \|\hat{y}^k - y^k\|_2$

⇔ $g$ is Lipschitz continuous!

Reflections on CS

|                   | Data Compression | Compressed Sensing | Analog Compression    |
|-------------------|------------------|--------------------|-----------------------|
| Source            | Random           | Deterministic      | Random                |
| Analysis          | Average          | Worst-case         | Average               |
| Codes             | Bits             | Reals              | Reals                 |
| Compressor        | -                | Linear             | Regularity conditions |
| Decompressor      | -                | Lipschitz          | Regularity conditions |
| Goal              | Vanishing block error probability (common to all three columns) |
| Randomness        | Source           | Matrix             | Source                |
| Fundamental limit | $H$              | 2 × sparsity       | ?                     |

Almost Lossless Analog Compression

Definition 1

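$\{X_i : i \in \mathbb{N}\}$: a stochastic process on $(\mathbb{R}^{\mathbb{N}}, \mathcal{B}^{\mathbb{N}})$.

Minimum $\epsilon$-achievable rate: the infimum of $R > 0$ such that there exists a sequence of $(n, \lfloor Rn \rfloor)$-codes $(f_n, g_n)$ with

$\limsup_{n \to \infty} P\{g_n(f_n(X^n)) \neq X^n\} \le \epsilon$.

$f_n$ and $g_n$ are constrained according to the following table: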

Regularity conditions

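| Encoder    | Decoder         | Minimum $\epsilon$-achievable rate |
|------------|-----------------|------------------------------------|
| Borel      | Continuous      | $R_0(\epsilon)$                    |
| Continuous | Continuous      | $\tilde{R}(\epsilon)$              |
| Linear     | Continuous      | $R^*(\epsilon)$                    |
| Borel      | Lipschitz       | $R(\epsilon)$                      |
| Borel      | $\Delta$-stable | $R(\epsilon, \Delta)$              |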

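General ordering of $\epsilon$-achievable rates

Theorem 2

$0 = R_0(\epsilon) \le \tilde{R}(\epsilon) \le R^*(\epsilon) \le R(\epsilon)$

holds for any source and any $0 < \epsilon \le 1$.

|     | $R_0$      | $\tilde{R}$ | $R^*$      | $R$       |
|-----|------------|-------------|------------|-----------|
| ENC | Borel      | continuous  | linear     | Borel     |
| DEC | continuous | continuous  | continuous | Lipschitz |

Proof

$R_0 = 0$: Hahn-Mazurkiewicz theorem.

$R^* \le R$: Minkowski dimension compression + random projection.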

Rényi’s Information Dimension

Definition 3 (Information Dimension [Rényi, 1959])

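Let $X$ be a real-valued random variable. Denote for an integer $m$ the quantized version of $X$:

$\langle X \rangle_m = \frac{\lfloor mX \rfloor}{m}$.

Define lower information dimension

$\underline{d}(X) = \liminf_{m \to \infty} \frac{H(\langle X \rangle_m)}{\log m}$

and upper information dimension

$\overline{d}(X) = \limsup_{m \to \infty} \frac{H(\langle X \rangle_m)}{\log m}$.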

Rényi’s Information Dimension

If $\underline{d}(X) = \overline{d}(X)$, we say that the information dimension of $X$ is $d(X)$, i.e.,

$d(X) = \lim_{m \to \infty} \frac{H(\langle X \rangle_m)}{\log m}$.

Examples

Discrete distribution: $d(X) = 0$.

Continuous distribution: $d(X) = 1$.

Singular distribution: e.g. Cantor distribution:

Ternary expansion:

$X = \sum_{i=1}^{\infty} (X)_i 3^{-i}$,

with $(X)_i$ i.i.d., $P\{(X)_i = 0\} = P\{(X)_i = 2\} = \frac{1}{2}$.

Information dimension:

$d(X) = H((X)_i) = \log_3 2 \approx 0.63$.
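
The value $\log_3 2$ can be checked directly along the subsequence $m = 3^k$. The short derivation below is a sketch under that assumption (it evaluates the ratio only for powers of 3 and does not by itself establish the full limit), writing $\langle X \rangle_m = \lfloor mX \rfloor / m$ for the quantized version of $X$.

```latex
% Quantizing the Cantor variable at resolution m = 3^k keeps its first k ternary digits,
% because the tail \sum_{i>k} (X)_i 3^{-i} lies in [0, 3^{-k}] and equals 3^{-k}
% only on an event of probability zero.
\begin{align*}
\langle X \rangle_{3^k} &= \frac{\lfloor 3^k X \rfloor}{3^k}
  = \sum_{i=1}^{k} (X)_i \, 3^{-i} \quad \text{(almost surely)}, \\
H\!\left(\langle X \rangle_{3^k}\right) &= H\big((X)_1, \ldots, (X)_k\big) = k \log 2, \\
\frac{H\!\left(\langle X \rangle_{3^k}\right)}{\log 3^k} &= \frac{k \log 2}{k \log 3}
  = \log_3 2 \approx 0.63 .
\end{align*}
```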

[Figure: plot of $H(\langle X \rangle_m) / \log m \to d(X)$. Horizontal axis: $m$ (ticks from 5 to 25). Vertical axis: "H([X]m) / log m" (ticks from 0 to 3.5). Curves: Exponential, Gaussian, Cantor, Geometric.]

Information dimension = Entropy rate of the binary expansion

Binary expansion:

$X(\omega) = \sum_{i=-\infty}^{\infty} (X)_i(\omega) \, 2^{-i}$, (1)

$d(X)$ coincides with the entropy rate of the binary expansion of the fractional part of $X$.

Dichotomy of finiteness and infinity

Theorem 4

If $E \log(|X| + 1) < \infty$, then

$0 \le \underline{d}(X) \le \overline{d}(X) \le 1$.

If $E \log(|X| + 1) = \infty$, then

$\underline{d}(X) = \overline{d}(X) = \infty$.

Discrete/Continuous Mixture

Theorem 5 (Discrete-continuous mixed sources [Rényi, 1959])

If $d(X)$ is finite, and

$P_X = (1 - \rho) P_X^d + \rho P_X^c$,

where $P_X^d$ is discrete and $P_X^c$ is absolutely continuous (a.c.), then

$d(X) = \rho$.
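
The statement $d(X) = \rho$ can be illustrated in closed form for one specific mixture. The sketch below assumes, purely for illustration, that $X = 0$ with probability $1 - \rho$ and $X$ is uniform on $[0, 1)$ with probability $\rho$; then the quantized variable $\lfloor mX \rfloor / m$ puts mass $(1 - \rho) + \rho/m$ on $0$ and mass $\rho/m$ on each of the other $m - 1$ cells. The function name is hypothetical.

```python
import math

def quantized_entropy_ratio(rho: float, m: int) -> float:
    """H(floor(mX)/m) / log m for X = 0 w.p. 1-rho, Uniform[0,1) w.p. rho."""
    if not (0.0 < rho <= 1.0) or m < 2:
        raise ValueError("need 0 < rho <= 1 and m >= 2")
    p_zero = (1.0 - rho) + rho / m   # atom at 0 plus the cell [0, 1/m)
    p_cell = rho / m                 # each of the remaining m - 1 cells
    entropy = -p_zero * math.log(p_zero) - (m - 1) * p_cell * math.log(p_cell)
    return entropy / math.log(m)

# The ratio approaches rho = 0.3 as the quantization gets finer.
for exponent in (4, 10, 20, 40):
    print(exponent, quantized_entropy_ratio(0.3, 2 ** exponent))
```

In this example the gap to $\rho$ shrinks only on the order of $1/\log m$, so the convergence is slow.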

Example: A Statistical Model for Sparse Sources

Let $f$ be a PDF and

$f_X(x) = (1 - \rho)\delta(x) + \rho f(x)$.

$\rho$ constant ⇒ $X^n$ has linear sparsity with high probability.

$d(X) = \rho$.
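
The linear-sparsity claim can be checked by simulation. The sketch below assumes, for illustration only, that $f$ is the standard Gaussian density and that $\rho = 0.3$; the function names and the seed are arbitrary. It draws i.i.d. samples from the mixture and reports the fraction of non-zero components, which by the law of large numbers concentrates around $\rho$ as $n$ grows.

```python
import numpy as np

def sample_sparse_source(n, rho, rng):
    """Draw n i.i.d. samples: 0 w.p. 1 - rho, standard Gaussian w.p. rho."""
    active = rng.random(n) < rho
    return np.where(active, rng.standard_normal(n), 0.0)

def nonzero_fraction(x):
    """Fraction of non-zero components, i.e. ||x||_0 / n."""
    return np.count_nonzero(x) / x.size

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    x = sample_sparse_source(n, rho=0.3, rng=rng)
    print(n, nonzero_fraction(x))
```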

Linear compression: discrete-continuous mixture

Theorem 6

For memoryless sources, if $X$ has a discrete-continuous mixed distribution, then

$R^*(\epsilon) = d(X)$, $\forall\, 0 < \epsilon < 1$.

Setting: linear encoder, continuous decoder; rate $R^*$.

Proof ideas: achievability

Let $d(X) = \rho$.

Probability concentrated on:

$W = \{x^n : (\rho - \delta) n \le \|x^n\|_0 \le (\rho + \delta) n\}$.

Proof ideas: converse

Lemma 7 ([Steinhaus, 1920])

$C \subset \mathbb{R}^n$.

$\mathrm{Leb}(C) > 0$.

Then

$C - C = \{x - y : x, y \in C\}$

contains an open neighborhood of the origin.

Linear compression: general achievability

Theorem 8

For memoryless sources,

$R^*(\epsilon) \le \hat{d}(X) = \lim_{\alpha \uparrow 1} d_\alpha(X)$, $\forall\, 0 < \epsilon < 1$.

Moreover,

1. For Lebesgue-a.e. linear encoder, block error probability $\epsilon$ is achievable with a uniformly continuous decoder.

2. The decoder can be chosen to be $\beta$-Hölder continuous for all $0 < \beta < \frac{R - \hat{d}(X)}{R}$, where $R > \hat{d}(X)$ is the compression rate.

Lipschitz decompression: discrete-continuous mixture

Theorem 9

For memoryless sources, if $X$ has a discrete-continuous mixed distribution, then

$R(\epsilon) = d(X)$, $\forall\, 0 < \epsilon < 1$.

Setting: Borel encoder, Lipschitz decoder; rate $R$.

Lipschitz decompression: self-similar sources

Theorem 10

Let the distribution of $X$ be a self-similar measure generated by i.i.d. $M$-ary digits with common distribution $P$. Then

$R(\epsilon) = d(X) = \frac{H(P)}{\log M}$, $\forall\, 0 < \epsilon < 1$.

Moreover, if $P$ is equiprobable on its support, then the above holds even for $\epsilon = 0$.

Lipschitz decompression: general converse

Theorem 11

For memoryless sources, if $d(X) < \infty$, then

$R(\epsilon) \ge d(X)$, $\forall\, 0 < \epsilon < 1$.

Stable decompression

Theorem 12 (Stable decompression)

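Let the underlying metric be the $\ell_\infty$ distance.

Then for memoryless sources,

$\limsup_{\Delta \downarrow 0} R(\epsilon, \Delta) = d(X)$, $\forall\, 0 < \epsilon < 1$.

Setting: Borel encoder, $\Delta$-stable decoder; rate $R(\epsilon, \Delta)$.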

Example: Cantor distribution

Information dimension:

$d(X) = \log_3 2 \approx 0.63$.

Lossless compression with Lipschitz decompressor or linear

compressor can be achieved at rate ∼ 63%.

Conclusion

Propose an information theoretic framework of lossless analog

compression.

Analogy:

compressed sensing ↔ coding theory

analog compression ↔ information theory

Probabilistic dimension reduction problem with smooth

embedding.

New operational characterization of information dimension.

For discrete-continuous mixed sources, fundamental limit:

fraction of analog symbols in source realizations.

Discreteness is the key.

