Pearson Codes ∗†‡
Jos H. Weber, Kees A. Schouhamer Immink, and Simon R. Blackburn
September 1, 2015
Abstract—The Pearson distance has been advocated for improving the error performance of noisy channels with unknown gain and offset. The Pearson distance can only fruitfully be used for sets of q-ary codewords, called Pearson codes, that satisfy specific properties. We will analyze constructions and properties of optimal Pearson codes. We will compare the redundancy of optimal Pearson codes with the redundancy of prior art T-constrained codes, which consist of q-ary sequences in which T predetermined reference symbols appear at least once. In particular, it will be shown that for q ≤ 3 the 2-constrained codes are optimal Pearson codes, while for q ≥ 4 these codes are not optimal.

Key words: flash memory, digital optical recording, Non-Volatile Memory, NVM, Pearson distance.
1 Introduction
In non-volatile memories, such as floating gate memories, the data is
represented by stored charge, which can leak away from the floating gate.
∗Jos H. Weber is with Delft University of Technology, Delft, The Netherlands.
E-mail: j.h.weber@tudelft.nl.
†Kees A. Schouhamer Immink is with Turing Machines Inc, Willemskade 15b-d,
3016 DK Rotterdam, The Netherlands. E-mail: immink@turing-machines.com.
‡Simon R. Blackburn is with the Department of Mathematics, Royal Holloway
University of London, Egham, Surrey TW20 0EX, United Kingdom. E-mail:
S.Blackburn@rhul.ac.uk
This leakage may result in a shift of the offset or threshold voltage of
the memory cell. The amount of leakage depends on the time elapsed
between writing and reading the data. As a result, the offset between
different groups of cells may be very different so that prior art automatic
offset or gain control, which estimates the mismatch from the previously
received data, cannot be applied. Methods to solve these difficulties in
Flash memories have been discussed in, for example, [4], [5], [6], [7]. In
optical disc media, such as the popular Compact Disc, DVD, and Blu-
ray disc, the retrieved signal depends on the dimensions of the written
features and upon the quality of the light path, which may be obscured by
fingerprints or scratches on the substrate. Fingerprints and scratches result in rapidly varying offset and gain of the retrieved signal. Automatic gain and offset control in combination with dc-balanced codes is applied, albeit at the cost of redundancy [2], and thus improvements to the art are welcome.
Immink & Weber [3] showed that detectors that use the Pearson dis-
tance offer immunity to offset and gain mismatch. The Pearson distance
can only be used for a set of codewords with special properties, called a
Pearson set or Pearson code. Let Sbe a codebook of chosen q-ary code-
words x= (x1, x2, . . . , xn) over the q-ary alphabet Q={0,1, . . . , q −1},
q≥2, where n, the length of x, is a positive integer. Note that the alpha-
bet symbols are to be treated as being just integers rather than elements
of Zq. A Pearson code with maximum possible size given the parameters
qand nis said to be optimal.
In Section 2, we set the stage with a description of Pearson distance
detection and the properties of the constrained codes used in conjunction
with it. Section 3 gives a description of T-constrained codes, a type of
code described in the prior art [3], used in conjunction with the Pearson
distance detector, while Section 4 offers a general construction of optimal
Pearson codes and a computation of their cardinalities. The rates of T-constrained codes will be compared with the rates of optimal Pearson codes. In Section 5, we present our conclusions.
2 Preliminaries
We use the shorthand notation $av + b = (av_1 + b, av_2 + b, \ldots, av_n + b)$. In [3], the authors suppose a situation where the sent codeword, $x$, is received as the vector $r = a(x + \nu) + b$, $r_i \in \mathbb{R}$. The basic assumptions are that $x$ is scaled by an unknown factor, called the gain, $a > 0$, offset by an unknown (dc-)offset, $b$, where $a$ and $b \in \mathbb{R}$, and corrupted by additive noise $\nu = (\nu_1, \ldots, \nu_n)$, where the $\nu_i \in \mathbb{R}$ are noise samples with distribution $N(0, \sigma^2)$. Both quantities, gain and offset, do not vary from symbol to symbol, but are the same for all $n$ symbols.
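As an aside, this channel model is easily simulated. The following Python sketch is purely illustrative (the gain, offset, and noise values chosen here are arbitrary assumptions, not values from the paper).

```python
import numpy as np

def channel(x, a=1.2, b=0.4, sigma=0.1, rng=None):
    """Return r = a*(x + nu) + b for a codeword x.

    a > 0 is the unknown gain, b the unknown (dc-)offset, and nu contains
    i.i.d. noise samples with distribution N(0, sigma^2); gain and offset
    are the same for all n symbols of the codeword.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    nu = rng.normal(0.0, sigma, size=x.shape)
    return a * (x + nu) + b

# Example: a quaternary codeword of length 6.
print(channel([0, 3, 1, 2, 0, 3]))
```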
The receiver’s ignorance of the channel’s momentary gain and offset
may lead to massive performance degradation as shown, for example,
in [3] when a traditional detector, such as a threshold or maximum likelihood detector, is used. In the prior art, various methods have been
proposed to overcome this difficulty. In a first method, data reference, or
‘training’, patterns are multiplexed with the user data in order to ‘teach’
the data detection circuitry the momentary values of the channel’s char-
acteristics such as impulse response, gain, and offset. In a channel with
unknown gain and offset, we may use two reference symbol values, where
in each codeword, a first symbol is set equal to the lowest signal level
and a second symbol equal to the highest signal level. The positions and
amplitudes of the two reference symbols are known to the receiver. The
receiver can straightforwardly measure the amplitude of the retrieved
reference symbols, and normalize the amplitudes of the remaining sym-
bols of the retrieved codeword before applying detection. Clearly, the
redundancy of the method is two symbols per codeword.
In a second prior art method, codes satisfying equal balance and energy constraints [8], which are immune to gain and offset mismatch, have been advocated. The redundancy of these codes, denoted by $r_0$, is given by [8]
$$r_0 \approx \log_q n + \log_q\left((q^2 - 1)\sqrt{q^2 - 4}\right) + \log_q \frac{\pi}{12\sqrt{15}}. \tag{1}$$
In a recent contribution, Pearson distance detection is advocated since its redundancy is much less than that of balanced codes [3]. The Pearson distance between the vectors $x$ and $\hat{x}$ is defined by
$$\delta(x, \hat{x}) = 1 - \rho_{x,\hat{x}}, \tag{2}$$
where
$$\rho_{x,\hat{x}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(\hat{x}_i - \bar{\hat{x}})}{\sigma_x \sigma_{\hat{x}}} \tag{3}$$
is the (Pearson) correlation coefficient, and
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \tag{4}$$
and
$$\sigma_x^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2. \tag{5}$$
Note that $\sigma_x$ is closely related to, but not the same as, the standard deviation of $x$. The Pearson distance and Pearson correlation coefficient are well-known concepts in statistics and cluster analysis. Note that we have $|\rho_{x,\hat{x}}| \leq 1$ by a corollary of the Cauchy-Schwarz Inequality [9, Section IV.4.6], which implies that $0 \leq \delta(x, \hat{x}) \leq 2$.
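A direct transcription of definitions (2)-(5) into code may help to fix ideas; the sketch below is illustrative only and not part of the paper.

```python
import numpy as np

def pearson_distance(x, xhat):
    """Pearson distance delta(x, xhat) = 1 - rho, following (2)-(5)."""
    x, xhat = np.asarray(x, dtype=float), np.asarray(xhat, dtype=float)
    xc, xhatc = x - x.mean(), xhat - xhat.mean()        # x_i - mean, cf. (4)
    sigma_x = np.sqrt(np.sum(xc ** 2))                  # sigma_x, cf. (5)
    sigma_xhat = np.sqrt(np.sum(xhatc ** 2))
    rho = np.sum(xc * xhatc) / (sigma_x * sigma_xhat)   # correlation, cf. (3)
    return 1.0 - rho                                    # distance, cf. (2)

# Scaled and shifted copies are at distance ~0; constant words (sigma = 0)
# make the distance undefined and are excluded from Pearson codes below.
print(pearson_distance([0, 1, 2, 3], [5, 7, 9, 11]))    # ~0.0
```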
A minimum Pearson distance detector outputs the codeword
$$x_o = \arg\min_{\hat{x} \in S} \delta(r, \hat{x}).$$
As the Pearson distance is translation and scale invariant, that is,
$$\delta(x, \hat{x}) = \delta(ax + b, \hat{x}),$$
we conclude that the Pearson distance between the vectors $x$ and $\hat{x}$ is independent of the channel's gain or offset mismatch, so that, as a result, the error performance of the minimum Pearson distance detector is immune to gain and offset mismatch. This virtue implies, however, that the minimum Pearson distance detector cannot be used in conjunction with arbitrary codebooks, since
$$\delta(r, \hat{x}) = \delta(r, \hat{y})$$
if $\hat{y} = c_1\hat{x} + c_2$, $c_1, c_2 \in \mathbb{R}$ and $c_1 > 0$. In other words, since a minimum Pearson detector cannot distinguish between the words $\hat{x}$ and $\hat{y} = c_1\hat{x} + c_2$, the codewords must be taken from a codebook $S \subseteq Q^n$ that guarantees unambiguous detection with the Pearson distance metric (2).
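The detector itself is a one-line search over the codebook. The sketch below (illustrative; the toy codebook is our own choice, not one from the paper) demonstrates that the decision is unchanged when the channel applies a gain and an offset.

```python
import numpy as np

def pearson_distance(x, xhat):
    x, xhat = np.asarray(x, float), np.asarray(xhat, float)
    xc, xhatc = x - x.mean(), xhat - xhat.mean()
    return 1.0 - np.sum(xc * xhatc) / np.sqrt(np.sum(xc**2) * np.sum(xhatc**2))

def detect(r, codebook):
    """Minimum Pearson distance detector: arg min over the codebook."""
    return min(codebook, key=lambda xhat: pearson_distance(r, xhat))

# Toy codebook; its words satisfy Properties A and B introduced below.
S = [(0, 1, 2, 3), (0, 3, 1, 2), (3, 0, 2, 1), (1, 0, 3, 2)]
x = (0, 3, 1, 2)
r = 1.7 * np.array(x, dtype=float) + 0.9   # gain and offset mismatch, no noise
print(detect(r, S))                         # (0, 3, 1, 2): decision unaffected
```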
It is a well-known property of the Pearson correlation coefficient, $\rho_{x,\hat{x}}$, that
$$\rho_{x,\hat{x}} = 1$$
if and only if
$$\hat{x} = c_1 + c_2 x,$$
where the coefficients $c_1$ and $c_2 > 0$ are real numbers [9, Section IV.4.6]. It is further immediate, see (3), that the Pearson distance is undefined for codewords $x$ with $\sigma_x = 0$. We coined the name Pearson code for a set of codewords that can be uniquely decoded by a minimum Pearson distance detector. We conclude that codewords in a Pearson code must satisfy two conditions, namely

• Property A: If $x \in S$ then $c_1 + c_2 x \notin S$ for all $c_1, c_2 \in \mathbb{R}$ with $(c_1, c_2) \neq (0, 1)$ and $c_2 > 0$.

• Property B: $x = (c, c, \ldots, c) \notin S$ for all $c \in \mathbb{R}$.
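A direct way to test a candidate codebook against these two conditions is sketched below (an illustrative Python sketch, not an algorithm from the paper). It uses the observation that, once constant words are excluded, a relation $\hat{y} = c_1 + c_2 x$ with $c_2 > 0$ forces $c_2 = (M(\hat{y}) - m(\hat{y}))/(M(x) - m(x))$ and $c_1 = m(\hat{y}) - c_2\, m(x)$, so Property A can be checked pairwise.

```python
import numpy as np

def is_pearson_code(S):
    """Check Properties A and B for a codebook S of integer-valued words."""
    words = [np.asarray(w, dtype=float) for w in S]
    # Property B: no constant word (c, c, ..., c).
    if any(w.max() == w.min() for w in words):
        return False
    # Property A: no word may equal c1 + c2*x for another word x with c2 > 0.
    for x in words:
        for y in words:
            if y is x:
                continue
            c2 = (y.max() - y.min()) / (x.max() - x.min())   # forced slope
            c1 = y.min() - c2 * x.min()                       # forced offset
            if np.allclose(y, c1 + c2 * x):
                return False
    return True

print(is_pearson_code([(0, 1, 2), (0, 2, 1), (1, 0, 2)]))   # True
print(is_pearson_code([(0, 1, 2, 2), (0, 2, 4, 4)]))        # False: scaled pair
```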
In the remaining part of this paper, we will study constructions and prop-
erties of Pearson codes. In particular, we are interested in Pearson codes
that are optimal in the sense of having the largest number of codewords
for given parameters $n$ and $q$. We will commence with a description of
prior art T-constrained codes, a first example of Pearson codes.
3 T-constrained codes

T-constrained codes [1], denoted by $S_{q,n}(a_1, \ldots, a_T)$, consist of q-ary n-length codewords, where $T$, $0 < T \leq q$, preferred or reference symbols $a_1, \ldots, a_T \in Q$ must each appear at least once in a codeword. Thus, each codeword, $(x_1, x_2, \ldots, x_n)$, in a T-constrained code satisfies
$$|\{i : x_i = j\}| > 0 \ \text{ for each } j \in \{a_1, \ldots, a_T\}.$$
The number of n-length q-ary sequences, $N_T(q, n)$, where $T$, $T \leq q$, distinct symbols occur at least once in the n-sequence, equals [1]
$$N_T(q, n) = \sum_{i=0}^{T} (-1)^i \binom{T}{T-i} (q-i)^n, \quad n \geq T. \tag{6}$$
For example, we easily find for $T = 1$ and $T = 2$ that
$$N_1(q, n) = q^n - (q-1)^n \tag{7}$$
and
$$N_2(q, n) = q^n - 2(q-1)^n + (q-2)^n. \tag{8}$$
Clearly, the number of T-constrained sequences is not affected by the choice of the specific $T$ symbols we like to favor.
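The count (6) is straightforward to evaluate and to verify by exhaustive enumeration for small parameters, as in the following illustrative Python sketch (function names are ours).

```python
from itertools import product
from math import comb

def N(T, q, n):
    """Number of q-ary length-n words containing T given symbols, per (6)."""
    return sum((-1)**i * comb(T, T - i) * (q - i)**n for i in range(T + 1))

def N_brute(refs, q, n):
    """Brute-force count of words over {0,...,q-1} containing every symbol in refs."""
    return sum(all(a in w for a in refs) for w in product(range(q), repeat=n))

print(N(1, 4, 5), N_brute({0}, 4, 5))       # 781 781, cf. (7)
print(N(2, 4, 5), N_brute({0, 3}, 4, 5))    # 570 570, cf. (8)
```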
For the binary case, $q = 2$, we simply find that $S_{2,n}(0)$ is obtained by removing the all-'1' word from $Q^n$, that $S_{2,n}(1)$ is obtained by removing the all-'0' word from $Q^n$, and that $S_{2,n}(0,1)$ is obtained by removing both the all-'1' and all-'0' words from $Q^n$, where $Q = \{0, 1\}$. Hence, indeed,
$$N_1(2, n) = 2^n - 1$$
and
$$N_2(2, n) = 2^n - 2.$$
The 2-constrained code $S_{q,n}(0, q-1)$ is a Pearson code as it satisfies Properties A and B [3]. There are more examples of 2-constrained sets that are Pearson codes, such as $S_{q,n}(0, 1)$. Note, however, that not all 2-constrained sets are Pearson codes. For example, $S_{q,n}(0, 2)$ does not satisfy Property A if $q \geq 5$, since, e.g., both $(0, 1, 2, \ldots, 2)$ and $(0, 2, 4, \ldots, 4) = 2 \times (0, 1, 2, \ldots, 2)$ are codewords.
It is obvious from Property B that the code $S_{2,n}(0, 1)$ of size $2^n - 2$ is the optimal binary Pearson code. For the ternary case, $q = 3$, it can easily be argued that $S_{3,n}(0, 1)$, $S_{3,n}(0, 2)$, and $S_{3,n}(1, 2)$ are all optimal Pearson codes of size $3^n - 2^{n+1} + 1$.
However, for $q > 3$ the 2-constrained sets such as $S_{q,n}(0, 1)$, $S_{q,n}(0, q-1)$, and $S_{q,n}(q-2, q-1)$ are not optimal Pearson codes, except when $n = 2$. For example, for $q = 4$, it can be easily checked that the set $S_{4,n}(0, 3) \cup S_{3,n}(0, 1, 2)$ is a Pearson code. Its size equals $N_2(4, n) + N_3(3, n) = 4^n - 3^n - 2^{n+1} + 3$, which turns out to be the maximum possible size of any Pearson code for $q = 4$, as shown in the next section, where we will address the problem of constructing optimal Pearson codes for any value of $q$.
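For small $n$ this is quickly verified by computer; the sketch below (illustrative, using a brute-force test of Property A) reproduces the count $4^n - 3^n - 2^{n+1} + 3$ for $n = 4$.

```python
from itertools import product
import numpy as np

def affinely_related(x, y):
    """True if y = c1 + c2*x for some c2 > 0 (x, y non-constant)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    c2 = (y.max() - y.min()) / (x.max() - x.min())
    return np.allclose(y, (y.min() - c2 * x.min()) + c2 * x)

n = 4
# Union of S_{4,n}(0,3) and S_{3,n}(0,1,2).
code = [w for w in product(range(4), repeat=n)
        if (0 in w and 3 in w) or (0 in w and 1 in w and 2 in w and 3 not in w)]
print(len(code), 4**n - 3**n - 2**(n + 1) + 3)     # 146 146
# Property A: no two distinct codewords are affinely related with positive slope.
print(any(x != y and affinely_related(x, y) for x in code for y in code))   # False
```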
4 Optimal Pearson codes
For $x = (x_1, x_2, \ldots, x_n) \in Q^n$, let $m(x)$ and $M(x)$ denote the smallest and largest value, respectively, among the $x_i$. Furthermore, in case $x$ is not the all-zero word, let $\mathrm{GCD}(x)$ denote the greatest common divisor of the $x_i$. For integers $n, q \geq 2$, let $\mathcal{P}_{q,n}$ denote the set of all q-ary sequences $x$ of length $n$ satisfying the following properties:

1. $m(x) = 0$;
2. $M(x) > 0$;
3. $\mathrm{GCD}(x) = 1$.
Theorem 1 For any $n, q \geq 2$, $\mathcal{P}_{q,n}$ is an optimal Pearson code.

Proof. We will first show that $\mathcal{P}_{q,n}$ is a Pearson code. Property B is satisfied since any word in $\mathcal{P}_{q,n}$ contains at least one '0' and at least one symbol unequal to '0'. It can be shown that Property A holds by supposing that $x \in \mathcal{P}_{q,n}$ and $\hat{x} = c_1 + c_2 x \in \mathcal{P}_{q,n}$ for some $c_1, c_2 \in \mathbb{R}$ with $c_2 > 0$. Clearly $c_1 = 0$, since $c_1 \neq 0$ implies that $m(\hat{x}) \neq 0$. Then, since $\hat{x} = c_2 x$, we infer that $\mathrm{GCD}(\hat{x}) = c_2 \times \mathrm{GCD}(x) = c_2$. Since, by definition, $\mathrm{GCD}(\hat{x}) = 1$, we have $c_2 = 1$ and conclude $\hat{x} = x$, which proves that also Property A is satisfied. We conclude $\mathcal{P}_{q,n}$ is a Pearson code.

We will now show that $\mathcal{P}_{q,n}$ is the greatest among all Pearson codes. To that end, let $S$ be any q-ary Pearson code of length $n$. We map all $x \in S$ to $x - m(x)$ and call the resulting code $S'$. Then, we map all words $x'$ in $S'$ to $x'/\mathrm{GCD}(x')$. Note that both mappings are injective due to Property A and we don't end up with the all-zero word due to Property B. In fact, all words in the resulting code $S''$ satisfy Properties 1)-3), and thus $S''$ of size $|S|$ is a subset of $\mathcal{P}_{q,n}$, which proves that $\mathcal{P}_{q,n}$ is optimal.
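Both the construction of $\mathcal{P}_{q,n}$ and the normalization map used in the proof are easy to program. The following Python sketch (illustrative only; function names are ours) enumerates $\mathcal{P}_{4,4}$ and maps the 2-constrained Pearson code $S_{4,4}(0,3)$ into it.

```python
from itertools import product
from functools import reduce
from math import gcd

def optimal_pearson_code(q, n):
    """Enumerate P_{q,n}: all q-ary words with m(x) = 0, M(x) > 0, GCD(x) = 1."""
    return [x for x in product(range(q), repeat=n)
            if min(x) == 0 and max(x) > 0 and reduce(gcd, x) == 1]

def normalize(x):
    """The map x -> (x - m(x)) / GCD used in the proof of Theorem 1."""
    shifted = [xi - min(x) for xi in x]
    d = reduce(gcd, shifted)
    return tuple(xi // d for xi in shifted)

P = set(optimal_pearson_code(4, 4))
print(len(P))                                    # 146 = 4^4 - 3^4 - 2^5 + 3
# The Pearson code S_{4,4}(0,3) maps injectively into P_{4,4}.
S = [x for x in product(range(4), repeat=4) if 0 in x and 3 in x]
img = {normalize(x) for x in S}
print(len(S), len(img), img <= P)                # 110 110 True
```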
From the definitions of T-constrained sets and $\mathcal{P}_{q,n}$ it follows that
$$S_{q,n}(0, 1) \subseteq \mathcal{P}_{q,n} \subseteq S_{q,n}(0). \tag{9}$$
In the following subsections, we will consider the cardinality and redundancy of $\mathcal{P}_{q,n}$, and compare these to the corresponding results for T-constrained codes.
4.1 Cardinality

In this subsection, we study the size $P_{q,n}$ of $\mathcal{P}_{q,n}$. From (9), we have
$$N_2(q, n) \leq P_{q,n} \leq N_1(q, n). \tag{10}$$
From Property B we have the trivial upper bound
$$P_{q,n} \leq q^n - q, \tag{11}$$
which is tight in case $q = 2$ as indicated in Section 3, i.e.,
$$P_{2,n} = 2^n - 2. \tag{12}$$
In order to present expressions for larger values of $q$, we first prove the following lemma. We define $P_{1,n} = 0$.
Lemma 1 For any $n \geq 2$ and $q \geq 3$,
$$\sum_{\substack{i=2 \\ i-1 \mid q-1}}^{q} (P_{i,n} - P_{i-1,n}) = q^n - 2(q-1)^n + (q-2)^n, \tag{13}$$
where the summation is over all integers $i$ in the indicated range such that $i-1$ is a divisor of $q-1$.

Proof. For each $i$ such that $2 \leq i \leq q$ and $i-1$ is a divisor of $q-1$, we define $\mathcal{D}_{i,n}$ as the set of all $i$-ary sequences $y$ of length $n$ satisfying $m(y) = 0$, $M(y) = i-1$, and $\mathrm{GCD}(y) = 1$. Let $\mathcal{D}$ denote the union of all these $\mathcal{D}_{i,n}$.

The mapping $\psi$ from $S_{q,n}(0, q-1)$ to $\mathcal{D}$, defined by dividing $x \in S_{q,n}(0, q-1)$ by $\mathrm{GCD}(x)$, is a bijection. This follows by observing that, on one hand, $\psi(x)$ is a unique member of $\mathcal{D}_{(q-1)/\mathrm{GCD}(x)+1,\,n}$, while, on the other hand, any sequence $y \in \mathcal{D}_{i,n}$ is the image of $((q-1)/(i-1))\,y \in S_{q,n}(0, q-1)$ under $\psi$.

Finally, the lemma follows by observing that $|\mathcal{D}_{i,n}| = P_{i,n} - P_{i-1,n}$ and $|S_{q,n}(0, q-1)| = N_2(q, n) = q^n - 2(q-1)^n + (q-2)^n$.
We thus have with (13) a recursive expression for $P_{q,n}$. Starting from the result for $q = 2$ in (12), we can find $P_{q,n}$ for any $n$ and $q$. Expressions for the size of optimal Pearson codes, $P_{q,n}$, for $2 \leq q \leq 8$ are tabulated in Table 1. The next theorem offers a closed formula for the size of optimal Pearson codes, $P_{q,n}$. We start with a definition.
For a positive integer $d$, the Möbius function $\mu(d)$ is defined [10, Chapter XVI] to be 0 if $d$ is divisible by the square of a prime; otherwise $\mu(d) = (-1)^k$, where $k$ is the number of (distinct) prime divisors of $d$.
Theorem 2 Let $n$ and $q$ be positive integers. Let $P_{q,n}$ be the cardinality of an optimal q-ary Pearson code of length $n$. Then
$$P_{q,n} = \sum_{d=1}^{q-1} \mu(d)\left(\left(\left\lfloor\frac{q-1}{d}\right\rfloor + 1\right)^{n} - \left\lfloor\frac{q-1}{d}\right\rfloor^{n} - 1\right). \tag{14}$$
We use the following well-known theorem (see [10, Section 16.5], for
example) in our proof of Theorem 2.
Theorem 3 Let $F : \mathbb{R} \to \mathbb{R}$ and $G : \mathbb{R} \to \mathbb{R}$ be functions such that
$$G(x) = \sum_{d=1}^{\lfloor x \rfloor} F(x/d)$$
for all positive $x$. Then
$$F(x) = \sum_{d=1}^{\lfloor x \rfloor} \mu(d)\, G(x/d). \tag{15}$$
Proof. (of Theorem 2) For a non-negative real number $x$, define
$$I_x = \{0, 1, \ldots, \lfloor x \rfloor\} = \mathbb{Z} \cap [0, x].$$
Let $V_x$ be the set of vectors of length $n$ with entries in $I_x$ that have at least one zero entry and at least one non-zero entry. Define $G(x) = |V_x|$. There are $|I_x|^n$ length-$n$ vectors with entries in $I_x$, $(|I_x| - 1)^n$ of these vectors have no zero entries, and only the all-zero vector has no non-zero entry. Since $|I_x| = \lfloor x \rfloor + 1$, we find that
$$G(x) = |I_x|^n - (|I_x| - 1)^n - 1 = (\lfloor x \rfloor + 1)^n - \lfloor x \rfloor^n - 1. \tag{16}$$
Table 1: Size of optimal Pearson codes, $P_{q,n}$, for $2 \leq q \leq 8$.

  q   P_{q,n}
  2   $2^n - 2$
  3   $3^n - 2^{n+1} + 1$
  4   $4^n - 3^n - 2^{n+1} + 3$
  5   $5^n - 4^n - 3^n + 2$
  6   $6^n - 5^n - 3^n - 2^n + 4$
  7   $7^n - 6^n - 4^n + 2^n + 1$
  8   $8^n - 7^n - 4^n + 3$
For a positive integer $d$, let $V_{x,d}$ be the set of vectors $c \in V_x$ such that $\mathrm{GCD}(c) = d$. Since $c \neq \mathbf{0}$, we see that $1 \leq \mathrm{GCD}(c) \leq \max_i\{c_i\} \leq \lfloor x \rfloor$, and so $V_x$ can be written as the disjoint union
$$V_x = \bigcup_{d=1}^{\lfloor x \rfloor} V_{x,d}.$$
Moreover, $|V_{x,d}| = |V_{x/d,1}|$, since the map taking $c \in V_{x,d}$ to $(1/d)c \in V_{x/d,1}$ is a bijection.

Define $F(x) = |V_{x,1}|$, so $F(x)$ is the number of vectors $c \in V_x$ such that $\mathrm{GCD}(c) = 1$. Now,
$$G(x) = |V_x| = \sum_{d=1}^{\lfloor x \rfloor} |V_{x,d}| = \sum_{d=1}^{\lfloor x \rfloor} |V_{x/d,1}| = \sum_{d=1}^{\lfloor x \rfloor} F(x/d).$$
So, by Theorem 3, we deduce that (15) holds. Theorem 2 now follows from the fact that $P_{q,n} = F(q-1)$, by combining (15) and (16).
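The closed formula (14) can be checked numerically against a brute-force enumeration of $\mathcal{P}_{q,n}$; the Python sketch below (illustrative only) does so for a few small parameter pairs and agrees with Table 2.

```python
from itertools import product
from functools import reduce
from math import gcd

def mobius(d):
    """Mobius function mu(d) via trial factorization (fine for small d)."""
    mu, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0          # divisible by the square of a prime
            mu = -mu
        p += 1
    return -mu if d > 1 else mu

def P_closed(q, n):
    """Size of an optimal Pearson code via the closed formula (14)."""
    return sum(mobius(d) * (((q - 1) // d + 1)**n - ((q - 1) // d)**n - 1)
               for d in range(1, q))

def P_brute(q, n):
    """Brute-force count of words with m(x) = 0, M(x) > 0, GCD(x) = 1."""
    return sum(min(x) == 0 and max(x) > 0 and reduce(gcd, x) == 1
               for x in product(range(q), repeat=n))

for q, n in [(4, 4), (5, 4), (6, 5)]:
    print(q, n, P_closed(q, n), P_brute(q, n))   # 146, 290, 4380, cf. Table 2
```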
After perusing Table 1, it appears that for $q \geq 4$, $P_{q,n}$ is roughly $q^n - (q-1)^n$. An intuitive justification is that among the $q^n$ q-ary sequences of length $n$ there are $(q-1)^n$ sequences that do not contain 0, which is the most significant condition to avoid. All this is confirmed by the next corollary.
Corollary 1 For any positive integer $q$, we have that
$$P_{q,n} = q^n - (q-1)^n + O(\lceil q/2 \rceil^n)$$
as $n \to \infty$.

Proof. The $d = 1$ term in the sum on the right hand side of (14) is $q^n - (q-1)^n - 1$, and the absolute values of the remaining terms are each bounded by $\lceil q/2 \rceil^n$, since
$$\lfloor (q-1)/d \rfloor + 1 \leq \lfloor (q-1)/2 \rfloor + 1 \leq \lceil q/2 \rceil.$$
As discussed above, the 2-constrained codes $S_{q,n}(0, 1)$ and $S_{q,n}(0, q-1)$ are Pearson codes. Therefore, it is of interest to compare $P_{q,n}$ with the cardinality $N_2(q, n) = q^n - 2(q-1)^n + (q-2)^n$ of 2-constrained codes. For $q \leq 3$, we simply have $P_{q,n} = |S_{q,n}(0, q-1)|$. For $q \geq 4$, we infer from (8) and Corollary 1 that $N_2(q, n) < P_{q,n}$, with a possible exception for very small values of $n$. For all $q \geq 2$ it holds that
$$P_{q,2} = N_2(q, 2) = 2 \tag{17}$$
and
$$P_{q,3} = 6\sum_{j=1}^{q-1} \phi(j), \tag{18}$$
where $\phi(j)$ is Euler's totient function that counts the totatives of $j$, i.e., the positive integers less than or equal to $j$ that are relatively prime to $j$.
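A quick numerical check of (17) and (18) for, say, $q = 8$ (an illustrative sketch; the totient is computed naively):

```python
from math import gcd

def phi(j):
    """Euler's totient: the number of integers 1 <= k <= j coprime to j."""
    return sum(gcd(k, j) == 1 for k in range(1, j + 1))

# For q = 8, n = 3: Table 1 gives P_{8,3} = 8^3 - 7^3 - 4^3 + 3 = 108, and
# (18) gives 6 * (phi(1) + ... + phi(7)) = 6 * 18 = 108.
print(8**3 - 7**3 - 4**3 + 3, 6 * sum(phi(j) for j in range(1, 8)))   # 108 108
```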
We have computed the cardinalities $N_1(q, n)$, $N_2(q, n)$, and $P_{q,n}$ by invoking (7), (8), and the expressions in Table 1. Table 2 lists the results of our computations for selected values of $q$ and $n$.
4.2 Redundancy

As usual, the redundancy of a q-ary code $C$ of length $n$ is defined by $n - \log_q |C|$. From (7), it follows that the redundancy of a 1-constrained code is
$$r_1 = n - \log_q\left(q^n - (q-1)^n\right) = -\log_q\left(1 - \left(\frac{q-1}{q}\right)^{n}\right) \approx \frac{1}{\ln(q)}\left(\frac{q-1}{q}\right)^{n}, \tag{19}$$
Table 2: $N_2(q, n)$, $P_{q,n}$, and $N_1(q, n)$ for selected values of $q$ and $n$.

  n  q  N_2(q,n)  P_{q,n}  N_1(q,n)
  4  4  110       146      175
  4  5  194       290      369
  4  6  302       578      671
  5  4  570       720      781
  5  5  1320      1860     2101
  5  6  2550      4380     4651
  6  4  2702      3242     3367
  6  5  8162      10802    11529
  6  6  19502     30242    31031
  7  4  12138     13944    14197
  7  5  47544     59556    61741
  7  6  140070    199500   201811
where the approximation follows from the well-known fact that $\ln(1+a) \approx a$ when $a$ is close to 0. Similarly, from (8) we infer the redundancy of a 2-constrained code, namely
$$r_2 = n - \log_q\left(q^n - 2(q-1)^n + (q-2)^n\right) = -\log_q\left(1 - 2\left(\frac{q-1}{q}\right)^{n} + \left(\frac{q-2}{q}\right)^{n}\right) \approx \frac{1}{\ln(q)}\left(2\left(\frac{q-1}{q}\right)^{n} - \left(\frac{q-2}{q}\right)^{n}\right). \tag{20}$$
Since the 2-constrained code $S_{q,n}(0, 1)$ is optimal for $q = 2, 3$, the expression for $r_2$ gives the minimum redundancy for any binary or ternary Pearson code. From Corollary 1, it follows for $q \geq 4$ that the redundancy
of optimal Pearson codes equals
$$r_P = n - \log_q\left(q^n - (q-1)^n + O\!\left(\left(\frac{q+1}{2}\right)^{n}\right)\right) = -\log_q\left(1 - \left(\frac{q-1}{q}\right)^{n} + O\!\left(\left(\frac{q+1}{2q}\right)^{n}\right)\right) \approx \frac{1}{\ln(q)}\left(\left(\frac{q-1}{q}\right)^{n} + O\!\left(\left(\frac{q+1}{2q}\right)^{n}\right)\right). \tag{21}$$
In conclusion, for sufficiently large $n$, we have
$$r_P = r_2 \approx 2 r_1 \tag{22}$$
if $q = 2, 3$, while
$$r_P \approx r_1 \approx r_2/2 \tag{23}$$
if $q \geq 4$. Figure 1 shows, as an example, the redundancies $r_1$, $r_2$, and $r_P$ versus $n$ for $q = 8$ (the quantity $r_P$ was computed using the expression listed in Table 1). Note that the redundancy $r_2$ decreases while the redundancy of prior art balanced codes, $r_0$, see (1), increases with increasing codeword length $n$. The curve $r_0$ versus $n$ was not plotted in Figure 1 as the redundancy of balanced codes is much higher than that of Pearson codes. For example, an evaluation of (1) shows that the redundancy $r_0 = 2.79$ for $q = 8$ and $n = 10$, while $r_P = 0.147$ for the same parameters.
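These numbers are reproduced by the short illustrative sketch below, which simply evaluates (1), (19), (20), and the $q = 8$ expression of Table 1.

```python
from math import log, pi, sqrt

q, n = 8, 10

r0 = log(n, q) + log((q**2 - 1) * sqrt(q**2 - 4), q) + log(pi / (12 * sqrt(15)), q)
r1 = n - log(q**n - (q - 1)**n, q)                   # 1-constrained codes, cf. (19)
r2 = n - log(q**n - 2*(q - 1)**n + (q - 2)**n, q)    # 2-constrained codes, cf. (20)
rP = n - log(8**n - 7**n - 4**n + 3, q)              # optimal Pearson codes, Table 1

print(round(r0, 2), round(rP, 3))   # 2.79 0.147, the values quoted above
print(round(r1, 3), round(r2, 3))   # 0.147 0.305: r_P ~ r_1 ~ r_2 / 2, cf. (23)
```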
5 Conclusions
We have studied sets of q-ary codewords of length n, coined Pearson
codes, that can be detected unambiguously by a detector based on the
Pearson distance. We have formulated the properties of codewords in
Pearson codes. We have presented constructions of optimal Pearson codes
and evaluated their cardinalities and redundancies. We conclude that,
except for small values of qand/or n, the redundancy of optimal Pearson
codes is almost the same as the redundancy of 1-constrained codes.
[Figure 1: plot of the redundancies $r_1$, $r_P$, and $r_2$ versus $n$, for $2 \leq n \leq 20$ and $q = 8$.]

Figure 1: Redundancy $r_1$, $r_2$, and $r_P$ versus $n$ for $q = 8$.
References
[1] K. A. S. Immink, “Coding Schemes for Multi-Level Flash Memories
that are Intrinsically Resistant Against Unknown Gain and/or Offset
Using Reference Symbols”, Electronics Letters, vol. 50, pp. 20-22,
2014.
[2] K. A. S. Immink and J. H. Weber, “Very Efficient Balanced Codes”,
IEEE Journal on Selected Areas in Communications, vol. 28, pp.
188-192, 2010.
[3] K. A. S. Immink and J. H. Weber, “Minimum Pearson Distance
Detection for Multi-Level Channels with Gain and/or Offset Mis-
match”, IEEE Trans. Inform. Theory, vol. 60, pp. 5966-5974, Oct.
2014.
[4] A. Jiang, R. Mateescu, M. Schwartz, and J. Bruck, “Rank Modula-
tion for Flash Memories”, IEEE Trans. Inform. Theory, vol. IT-55,
no. 6, pp. 2659-2673, June 2009.
[5] F. Sala, R. Gabrys, and L. Dolecek, “Dynamic Threshold Schemes
for Multi-Level Non-Volatile Memories”, IEEE Trans. on Commun.,
vol. 61, pp. 2624-2634, July 2013.
[6] H. Zhou, A. Jiang, and J. Bruck, “Error-correcting schemes with
dynamic thresholds in nonvolatile memories”, IEEE Int. Symposium
on Inform. Theory (ISIT), St Petersburg, July 2011.
[7] F. Sala, K. A. S. Immink, and L. Dolecek, “Error Control Schemes
for Modern Flash Memories: Solutions for Flash Deficiencies”, IEEE
Consumer Electronics Magazine, vol. 4 (1), pp. 66-73, Jan. 2015.
[8] K. A. S. Immink, “Coding Schemes for Multi-Level Channels with
Unknown Gain and/or Offset Using Balance and Energy con-
straints”, IEEE International Symposium on Information Theory,
(ISIT), Istanbul, July 2013.
[9] A. M. Mood, F. A. Graybill, and D. C. Boes, Introduction to the
Theory of Statistics, Third Edition, McGraw-Hill, 1974.
[10] G. H. Hardy and E. M. Wright, An Introduction to the Theory of
Numbers, (5th Edition), Oxford University Press, Oxford, 1979.