
Maximum Likelihood Decoding for Multi-Level Cell Memories with Scaling and Offset Mismatch

Renfei Bu

Applied Mathematics Dept., Optimization Group

Delft University of Technology

Delft, Netherlands

R.Bu@tudelft.nl

Jos H. Weber

Applied Mathematics Dept., Optimization Group

Delft University of Technology

Delft, Netherlands

J.H.Weber@tudelft.nl

Abstract—Reliability is a critical issue for modern multi-level cell memories. We consider a multi-level cell channel model such that the retrieved data is not only corrupted by Gaussian noise, but hampered by scaling and offset mismatch as well. We assume that the intervals from which the scaling and offset values are taken are known, but no further assumptions on the distributions on these intervals are made. We derive maximum likelihood (ML) decoding methods for such channels, based on finding a codeword that has closest Euclidean distance to a specified set defined by the received vector and the scaling and offset parameters. We provide geometric interpretations of scaling and offset and also show that certain known criteria appear as special cases of our general setting.

Index Terms—multi-level cell memories, maximum likelihood decoding, Euclidean distance, Pearson distance, scaling and offset mismatch

I. INTRODUCTION

As the on-going data revolution demands storage systems that can store large quantities of data, multi-level cell memories are gaining attention. A multi-level cell is a memory element capable of storing more than a single bit of information, compared to a single-level cell which can store only one bit per memory element [1]. For example, in multi-level cell NAND flash technology, information is stored by introducing more voltage levels that are used to represent more than one bit [2].

It is obvious that, as the number of levels increases, the storage capacity of multi-level cell memories is enhanced. However, due to the increase in the per-cell storage density, the reliability of multi-level cell memories experiences a diverse set of short-term and long-term variations.

Short-term variations exacerbate unpredictable stochastic errors. For example, random errors occur in the programming/reading process, and sometimes it is hard to initialize a cell with the exact voltage. As a result, error correcting techniques are usually considered and applied in multi-level cell memories, such as BCH codes [3], Reed-Solomon codes [4], LDPC codes [5], trellis coded modulation [6], and so on.

In the long term, the performance of multi-level cell memories degrades with age. As documented in [7], the number of electrons of a cell decreases and some cells even become defective over time. The amount of electron leakage depends on various physical parameters, e.g., the device's temperature, the magnitude of the charge, the quality of the gate oxide or dielectric, and the time elapsed between writing and reading the data. It is hard to precisely model these long-term effects on multi-level cell memories. In this paper, we focus on the mean change over time, while variance issues were discussed in [8].

Scaling and offset can weaken the cell's state strength by moving its level closer to the next reference voltage. Various techniques have been proposed to improve the detector's resilience to scaling and offset mismatch. Estimation of the unknown shifts may be achieved by using reference cells, but this is very expensive with respect to redundancy. Also, coding techniques can be applied to strengthen the detector's reliability in case of scaling and offset mismatch; these include rank modulation [9], balanced codes [10], and composition check codes [11]. However, these methods often suffer from large redundancy and high complexity.

Immink and Weber [12] advocate the use of Pearson distance decoding instead of traditional Euclidean distance decoding, in situations which require resistance towards scaling and/or offset mismatch. We use the same channel model as used in [12]: besides the noise, which varies from symbol to symbol, a multiplicative factor $a$ and an additive term $b$ specify the scaling and offset mismatch, respectively, which are assumed to be constant within one block of code symbols, but may be different for the next block. Even though this model neglects certain aspects of multi-level cell memories, such as inter-cell coupling or dependent noise, it still captures key properties of the data corruption process in multi-level cell memories.

The contribution of this work is two-fold. Firstly, in Section III, we derive a maximum likelihood (ML) decoding criterion for multi-level cell channels with Gaussian noise that also suffer from scaling $a$ and offset $b$, which are known to be within certain ranges, specifically $0 < a_1 \le a \le a_2$ and $b_1 \le b \le b_2$. The ML decoding criterion will also be illustrated with geometric interpretations. Secondly, the proposed ML criterion provides a general framework, including the scaling-only case and the offset-only case. Some known criteria [13], [14] are shown to be special cases of this framework for particular $a_1$, $a_2$, $b_1$, and $b_2$ settings.

978-1-5386-8088-9/19/$31.00 ©2019 IEEE

This paper aims to generalize ML decoding for multi-level cell channels with Gaussian noise and scaling and offset mismatch. We start by providing the multi-level cell channel model in Section II, starting with several definitions and ending with the Euclidean distance-based and Pearson distance-based decoding criteria. In Section III, we show how to achieve ML decoding for this channel. We continue in Section IV considering several special cases, which relate to known results in this area. We wrap up the paper with some comments and ideas for future work in Section V.

II. PRELIMINARIES AND CHANNEL MODEL

We start by introducing some notations. For any vector $u = (u_1, u_2, \ldots, u_n) \in \mathbb{R}^n$, let
$$\bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i$$
denote the average symbol value, let
$$\sigma_u = \left( \sum_{i=1}^{n} (u_i - \bar{u})^2 \right)^{1/2}$$
denote the unnormalized symbol standard deviation, and let
$$\|u\| = \left( \sum_{i=1}^{n} |u_i|^2 \right)^{1/2}$$
denote the (Euclidean) norm. We write $\langle u, v \rangle$ for the standard inner product (the dot product) of two vectors $u$ and $v$, i.e.,
$$\langle u, v \rangle = \sum_{i=1}^{n} u_i v_i = \|u\| \|v\| \cos\theta,$$
where $\theta$ is the angle between $u$ and $v$. Note that $\langle u, u \rangle = \|u\|^2$.

Consider transmitting a codeword $x = (x_1, x_2, \ldots, x_n)$ from a codebook $S$ over the $q$-ary alphabet $Q = \{0, 1, \ldots, q-1\}$, $q \ge 2$, where $n$ is a positive integer. This is based on the fact that each cell is initialized with one of a finite discrete set of voltages. The transmitted symbols $x_i$ are distorted by additive noise $v_i$, by a factor $a > 0$, called scaling/gain, and by an additive term $b$, called offset, i.e., the received symbols $r_i$ read
$$r_i = a(x_i + v_i) + b,$$
for $i = 1, \ldots, n$. The parameters $v_i \in \mathbb{R}$ are zero-mean i.i.d. Gaussian noise samples with variance $\sigma^2$, that is, the noise vector $v$ has distribution
$$\phi(v) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} e^{-v_i^2/(2\sigma^2)}. \tag{1}$$
The scaling and offset (unknown to both the sender and the receiver) may slowly vary in time due to various factors in multi-level cells. So we assume they may differ from codeword to codeword, but do not vary within a codeword. The received vector when a codeword $x$ is transmitted is
$$r = a(x + v) + b\mathbf{1}, \tag{2}$$
where $\mathbf{1} = (1, 1, \ldots, 1)$ is the real all-one vector of length $n$.
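For readers who want to experiment, the channel (2) is straightforward to simulate. The sketch below is our own illustration (in Python with NumPy, a tooling choice not made in the paper; the helper name `mlc_channel` is hypothetical): it draws i.i.d. Gaussian noise and applies one fixed scaling and offset per codeword.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlc_channel(x, a, b, sigma):
    """Channel model (2): r = a(x + v) + b*1, with i.i.d. zero-mean Gaussian noise v."""
    x = np.asarray(x, dtype=float)
    v = rng.normal(0.0, sigma, size=x.shape)  # noise variance sigma^2
    return a * (x + v) + b  # a and b are constant within the codeword

r = mlc_channel([0, 1, 2, 3], a=1.07, b=0.07, sigma=0.1)
```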

A. Euclidean Distance-Based Decoding

A well-known decoding criterion upon receipt of the vector $r$ is to choose a codeword $\hat{x} \in S$ which minimizes the (squared) Euclidean distance between the received vector $r$ and codeword $\hat{x}$, i.e.,
$$L_e(r, \hat{x}) = \|r - \hat{x}\|^2 = \sum_{i=1}^{n} (r_i - \hat{x}_i)^2. \tag{3}$$
This criterion is ML with regard to handling Gaussian noise, but not optimal in situations which require resistance towards scaling and/or offset mismatch.
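A minimal brute-force implementation of criterion (3) over a small codebook might look as follows (our own sketch; the function name and the toy codebook are illustrative, not from the paper):

```python
import numpy as np

def euclidean_decode(r, codebook):
    """Return the codeword minimizing the squared Euclidean distance (3) to r."""
    r = np.asarray(r, dtype=float)
    dists = [np.sum((r - np.asarray(x, dtype=float)) ** 2) for x in codebook]
    return codebook[int(np.argmin(dists))]

codebook = [(0, 1, 2), (2, 1, 0), (1, 1, 1)]
print(euclidean_decode((1.9, 1.1, 0.2), codebook))  # -> (2, 1, 0)
```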

B. Pearson Distance-Based Decoding

The Pearson distance measure [12] naturally lends itself to immunity to scaling and/or offset mismatch. The Pearson distance between the received vector $r$ and a codeword $\hat{x} \in S$ is defined as
$$L_p(r, \hat{x}) = 1 - \rho_{r,\hat{x}}, \tag{4}$$
where $\rho_{r,\hat{x}}$ is the Pearson correlation coefficient
$$\rho_{r,\hat{x}} = \frac{\langle r - \bar{r}\mathbf{1},\ \hat{x} - \bar{\hat{x}}\mathbf{1} \rangle}{\sigma_r \sigma_{\hat{x}}}. \tag{5}$$
A Pearson decoder chooses a codeword which minimizes this distance. As shown in [12], a modified Pearson distance-based criterion leading to the same result in the minimization process reads
$$L'_p(r, \hat{x}) = \sum_{i=1}^{n} (r_i - \hat{x}_i + \bar{\hat{x}})^2, \tag{6}$$
if there is no scaling mismatch, i.e., $a = 1$. Use of the Pearson distance requires that the set of codewords satisfies certain special properties [12].

A geometric meaning for the Pearson distance is provided in [14]. Since the offset $b$ changes the mean of a vector, it seems reasonable to consider the normalized vectors $\hat{x} - \bar{\hat{x}}\mathbf{1}$ and $r - \bar{r}\mathbf{1}$ rather than $\hat{x}$ and $r$. On the other hand, scaling a vector of mean 0 by $a$ only changes its standard deviation by a factor of $a$. So it seems reasonable to scale the normalized vectors so that they have standard deviation 1. It is not difficult to show that the inner product of the resulting vectors equals $\rho_{r,\hat{x}}$.
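As a sketch (ours, not the paper's), the Pearson distance (4)-(5) can be computed directly; note that $\sigma_r > 0$ and $\sigma_{\hat{x}} > 0$ are required, which is one of the codeword-set properties alluded to above.

```python
import numpy as np

def pearson_distance(r, x):
    """Pearson distance (4): L_p = 1 - rho_{r,x}, with rho as in (5).

    Requires sigma_r > 0 and sigma_x > 0 (no constant vectors).
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    rc, xc = r - r.mean(), x - x.mean()
    rho = np.dot(rc, xc) / (np.linalg.norm(rc) * np.linalg.norm(xc))
    return 1.0 - rho
```

Because $\rho_{r,\hat{x}}$ is invariant under $r \mapsto ar + b\mathbf{1}$ with $a > 0$, so is this distance, which is exactly the immunity property exploited in [12].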

III. MAXIMUM LIKELIHOOD DECODING

If a vector $r$ is received, optimum decoding must determine a codeword $\hat{x} \in S$ maximizing $P(\hat{x} \mid r)$. If all codewords are equally likely to be sent, then, by Bayes' theorem, this scheme is equivalent to maximizing $P(r \mid \hat{x})$, that is, the probability that $r$ is received, given that $\hat{x}$ is sent.

From (2), we know $v = (r - b\mathbf{1})/a - \hat{x}$ when $a$ and $b$ are fixed, and since $a$ is nonzero, the likelihood $P(r \mid \hat{x})$ in this case is $\phi((r - b\mathbf{1})/a - \hat{x})$.

[Fig. 1. Subdivision of $U_0 = \{cr + d\mathbf{1} \mid c, d \in \mathbb{R}\}$.]

Here, we consider the situation that the scaling and the offset take their values within certain ranges, specifically $0 < a_1 \le a \le a_2$ and $b_1 \le b \le b_2$, but do not make any further assumptions on the distributions on these intervals. Thus, in order to achieve ML decoding, the criterion to maximize among all candidate codewords $\hat{x}$ is
$$\max_{0 < a_1 \le a \le a_2,\ b_1 \le b \le b_2} \phi((r - b\mathbf{1})/a - \hat{x}). \tag{7}$$

Since the logarithm function is strictly increasing on the positive real numbers and $\phi$ is a positive function, an equivalent formulation of the problem is to find $\hat{x} \in S$ that maximizes
$$\max_{0 < a_1 \le a \le a_2,\ b_1 \le b \le b_2} \log \phi((r - b\mathbf{1})/a - \hat{x}).$$

Since
$$\log \phi((r - b\mathbf{1})/a - \hat{x}) = -n \log(\sigma\sqrt{2\pi}) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( (r_i - b)/a - \hat{x}_i \right)^2 \tag{8}$$
has a component $-n \log(\sigma\sqrt{2\pi})$ that is independent of $\hat{x}$ and $r$, and since $\frac{1}{2\sigma^2}$ is a positive constant, a maximum likelihood decoder finds a codeword $\hat{x}$ that minimizes
$$\min_{0 < a_1 \le a \le a_2,\ b_1 \le b \le b_2} \sum_{i=1}^{n} \left( (r_i - b)/a - \hat{x}_i \right)^2,$$
i.e., it minimizes the squared Euclidean distance between the candidate codeword $\hat{x}$ and the points in
$$U = \{ (r - b\mathbf{1})/a \mid 0 < a_1 \le a \le a_2,\ b_1 \le b \le b_2 \},$$
which is a subset of the subspace
$$U_0 = \{ cr + d\mathbf{1} \mid c, d \in \mathbb{R} \}$$
in $\mathbb{R}^n$.

The squared Euclidean distance between a vector $\hat{x}$ and the set $U$ is defined as
$$L_e(U, \hat{x}) = \sum_{i=1}^{n} (p_i - \hat{x}_i)^2,$$
where $p = (p_1, p_2, \ldots, p_n)$ is the point in $U$ that is closest to $\hat{x}$. The most likely candidate codeword $x_o$ for a received vector has the smallest $L_e(U, \hat{x})$, that is,
$$x_o = \arg\min_{\hat{x} \in S} L_e(U, \hat{x}). \tag{9}$$
Hence, the $\hat{x} \in S$ closest to $U$ is chosen as the ML decoder output.

In order to calculate $L_e(U, \hat{x})$ for a codeword $\hat{x}$, we first find the point in $U_0$ that is closest to $\hat{x}$ and then check if this point is in $U$. Applying the first derivative test gives that the closest point in $U_0$ to $\hat{x}$ is $p_0 = c_0 r + d_0\mathbf{1}$ with
$$c_0 = \frac{\langle r, \hat{x} \rangle - n\bar{r}\bar{\hat{x}}}{\langle r, r \rangle - n\bar{r}^2}$$
and
$$d_0 = \frac{\langle r, r \rangle \bar{\hat{x}} - \langle r, \hat{x} \rangle \bar{r}}{\langle r, r \rangle - n\bar{r}^2}.$$
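This projection onto $U_0$ is easy to verify numerically. The sketch below (our illustration; `project_onto_U0` is an assumed name) implements the closed-form $c_0$ and $d_0$; the residual $p_0 - \hat{x}$ should come out orthogonal to both $r$ and $\mathbf{1}$, as the first derivative test demands.

```python
import numpy as np

def project_onto_U0(r, x):
    """Closest point p0 = c0*r + d0*1 in the subspace U0 = {c*r + d*1} to x."""
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(r)
    denom = np.dot(r, r) - n * r.mean() ** 2  # zero only if r is a multiple of 1
    c0 = (np.dot(r, x) - n * r.mean() * x.mean()) / denom
    d0 = (np.dot(r, r) * x.mean() - np.dot(r, x) * r.mean()) / denom
    return c0, d0
```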

In Fig. 1, we depict the subset $U$ in gray when $a_1 < 1 < a_2$ and $b_1 < 0 < b_2$. Four vertices $A$, $B$, $C$, $D$ are also shown in the picture:
$$A = (r - b_1\mathbf{1})/a_1, \quad B = (r - b_2\mathbf{1})/a_1, \quad C = (r - b_2\mathbf{1})/a_2, \quad D = (r - b_1\mathbf{1})/a_2.$$
Perpendicular lines (blue dashed) in $U_0$ to sides of $U$ through vertices are pictured in Fig. 1. These perpendicular lines and the sides of $U$ separate $U_0$ into 9 subsets, namely, $R_1, R_2, \ldots, R_9$. For instance, the perpendicular lines to side $BC$ and $BC$ itself form the boundaries of $R_5$. We use the notation $R_9$ in Fig. 1 for the subset $U$ for clerical convenience.

Theorem 1. If $p_0$ is in the subset $R_i$, $i = 1, \ldots, 9$, then the closest point in $U$ to $\hat{x}$ is
$$p = \begin{cases}
\dfrac{\langle r - b_1\mathbf{1}, \hat{x} \rangle}{\|r - b_1\mathbf{1}\|^2} (r - b_1\mathbf{1}) & \text{if } i = 1, \\[2mm]
\dfrac{\langle r - b_2\mathbf{1}, \hat{x} \rangle}{\|r - b_2\mathbf{1}\|^2} (r - b_2\mathbf{1}) & \text{if } i = 5, \\[2mm]
(r - (\bar{r} - a_1\bar{\hat{x}})\mathbf{1})/a_1 & \text{if } i = 3, \\
(r - (\bar{r} - a_2\bar{\hat{x}})\mathbf{1})/a_2 & \text{if } i = 7, \\
A & \text{if } i = 2, \\
B & \text{if } i = 4, \\
C & \text{if } i = 6, \\
D & \text{if } i = 8, \\
p_0 & \text{if } i = 9.
\end{cases} \tag{10}$$
The ML decoding criterion is minimizing $L_e(p, \hat{x})$ among all candidate codewords.
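Theorem 1's case analysis can be cross-checked numerically: $L_e(U, \hat{x})$ is the minimum of $\sum_i ((r_i - b)/a - \hat{x}_i)^2$ over the $(a, b)$ box, which a dense grid search approximates. The sketch below (our own check, not the paper's closed-form algorithm; function names are illustrative) uses this to build a brute-force ML decoder per (9).

```python
import numpy as np

def dist_to_U_grid(r, x, a1, a2, b1, b2, steps=200):
    """Approximate L_e(U, x): grid-search min of sum(((r - b)/a - x)^2) over the box."""
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    a = np.linspace(a1, a2, steps)[:, None, None]  # scaling grid
    b = np.linspace(b1, b2, steps)[None, :, None]  # offset grid
    d = np.sum(((r - b) / a - x) ** 2, axis=-1)    # distances over the (a, b) grid
    return float(d.min())

def ml_decode_box(r, codebook, a1, a2, b1, b2):
    """Brute-force ML decoder (9): pick the codeword closest to the set U."""
    dists = [dist_to_U_grid(r, x, a1, a2, b1, b2) for x in codebook]
    return codebook[int(np.argmin(dists))]
```

With $a_1 = a_2 = 1$ and $b_1 = b_2 = 0$ the set $U$ collapses to $\{r\}$ and the result reduces to the Euclidean criterion (3), a convenient sanity check.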

[Fig. 2. The distance of a candidate codeword $\hat{x}$ to the subset $\{r/a \mid 0 < a_1 \le a \le a_2\}$: three cases in (11), (a) $\langle r, \hat{x} \rangle > \langle r, r \rangle / a_1$, (b) $\langle r, \hat{x} \rangle < \langle r, r \rangle / a_2$, and (c) otherwise, assuming $a_1 < 1 < a_2$.]

Proof. If $p_0$ is in the subset $R_1$, maximizing (7) is equivalent to minimizing the smallest squared Euclidean distance from the codeword $\hat{x}$ to the line segment
$$AD = \{ (r - b_1\mathbf{1})/a \mid 0 < a_1 \le a \le a_2 \},$$
which is shown in Fig. 1. Let $\theta$ be the angle between $\hat{x}$ and $r - b_1\mathbf{1}$. The point on $AD$ closest to $\hat{x}$ is $p = \alpha(r - b_1\mathbf{1})$ with
$$\alpha = (\|\hat{x}\|\cos\theta)/\|r - b_1\mathbf{1}\| = \langle r - b_1\mathbf{1}, \hat{x} \rangle / \|r - b_1\mathbf{1}\|^2.$$
Similarly, when $p_0$ is in the subset $R_5$, the point on $BC = \{ (r - b_2\mathbf{1})/a \mid 0 < a_1 \le a \le a_2 \}$ closest to $\hat{x}$ is $p = \alpha(r - b_2\mathbf{1})$ with
$$\alpha = \langle r - b_2\mathbf{1}, \hat{x} \rangle / \|r - b_2\mathbf{1}\|^2.$$
If $p_0$ is in the subset $R_3$, the point $p \in U$ that is closest to $\hat{x}$ must be on the line segment
$$AB = \{ (r - b\mathbf{1})/a_1 \mid b_1 \le b \le b_2 \},$$
which is shown in Fig. 1. The point on $AB$ that is closest to $\hat{x}$ is $p = (r - \beta\mathbf{1})/a_1$, with $\beta = \bar{r} - a_1\bar{\hat{x}}$, which follows from the first derivative test. The proof is similar when $p_0$ is in the subset $R_7$, with the line segment $CD$ taking the role of $AB$.

If $p_0$ is in the subset $R_2$, then the closest point in $U$ to $\hat{x}$ is the vertex $A = (r - b_1\mathbf{1})/a_1$, as can be observed from Fig. 1. Similar results are found for the situations that $p_0$ is in the subset $R_4$, $R_6$, and $R_8$, where the closest point in $U$ to $\hat{x}$ is $B$, $C$, and $D$, respectively.

Obviously, the closest point in $U$ to $\hat{x}$ is $p_0$ itself when $p_0$ is in the subset $R_9 = U$.

IV. SPECIAL CASES

Several special values of $a_1$, $a_2$, $b_1$, and $b_2$ are considered, leading to typical cases for maximizing (7); these include the scaling-only and offset-only cases. We discuss not only ML decoding criteria, but also the conventional decoding criteria introduced in Section II.

A. Scaling-Only Case

In the scaling-only case, i.e., $b = 0$, we simply have
$$r = a(x + v),$$
where the scaling, $a$, is unknown to both sender and receiver. In Theorem 2 of [13], the following ML criterion was presented for the case that there is bounded scaling ($0 < a_1 \le a \le a_2$) and no offset mismatch ($b = 0$):
$$L_{a_1,a_2}(r, \hat{x}) = \begin{cases}
L_e(r/a_1, \hat{x}) & \text{if } \langle r, \hat{x} \rangle > \langle r, r \rangle / a_1, \\
L_e(r/a_2, \hat{x}) & \text{if } \langle r, \hat{x} \rangle < \langle r, r \rangle / a_2, \\
\|\hat{x}\|^2 - \dfrac{\langle r, \hat{x} \rangle^2}{\|r\|^2} & \text{otherwise}.
\end{cases} \tag{11}$$

This result can also be simply found from the general framework presented in the previous section, by setting $b_1 = b_2 = 0$ in Theorem 1. Note that this gives indeed that $p = r/a_1$ if $p_0 \in R_2 \cup R_3 \cup R_4$, which corresponds to the situation that
$$\frac{\|\hat{x}\|\cos\varphi}{\|r\|} = \frac{\langle r, \hat{x} \rangle}{\langle r, r \rangle} > 1/a_1,$$
where $\varphi$ is the angle between $\hat{x}$ and $r$. Similarly, note that $p = r/a_2$ if $p_0 \in R_6 \cup R_7 \cup R_8$, which corresponds to the situation that
$$\frac{\|\hat{x}\|\cos\varphi}{\|r\|} = \frac{\langle r, \hat{x} \rangle}{\langle r, r \rangle} < 1/a_2.$$
Finally, note that $p = \frac{\langle r, \hat{x} \rangle}{\|r\|^2} r$ if $p_0 \in R_1 \cup R_5 \cup R_9$, which corresponds to the 'otherwise' case in (11), and that
$$L_e(p, \hat{x}) = L_e\!\left( \frac{\langle r, \hat{x} \rangle}{\|r\|^2} r,\ \hat{x} \right) = \|\hat{x}\|^2 - \frac{\langle r, \hat{x} \rangle^2}{\|r\|^2}.$$

In Fig. 2, we draw the three cases in (11), where the subset $\{r/a \mid 0 < a_1 \le a \le a_2\}$ is a line segment in the direction of $r$. The circle points are the closest points on this line segment to $\hat{x}$.
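Criterion (11) translates directly into code. The sketch below is ours (the name `L_scaling` is illustrative); it can be validated against a direct one-dimensional minimization over $a$ on the interval $[a_1, a_2]$.

```python
import numpy as np

def L_scaling(r, x, a1, a2):
    """Scaling-only ML criterion (11): bounded gain 0 < a1 <= a <= a2, no offset."""
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    rr, rx = np.dot(r, r), np.dot(r, x)
    if rx > rr / a1:
        return np.sum((r / a1 - x) ** 2)   # optimum pinned at a = a1
    if rx < rr / a2:
        return np.sum((r / a2 - x) ** 2)   # optimum pinned at a = a2
    return np.dot(x, x) - rx ** 2 / rr     # interior optimum: projection onto r
```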

Next, we consider the situation that $a_1 \to 0$ and $a_2 \to \infty$, i.e., the only knowledge on the gain $a$ is that it is a positive number, without further limitations. The subset $\{r/a \mid a \in \mathbb{R},\ a > 0\}$ is a ray from the origin in the direction of $r$. In this case, it follows from the above that ML decoding can be achieved by minimizing
$$L_a(r, \hat{x}) = \begin{cases}
\|\hat{x}\|^2 - \dfrac{\langle r, \hat{x} \rangle^2}{\|r\|^2} & \text{if } \dfrac{\langle r, \hat{x} \rangle}{\langle r, r \rangle} > 0, \\
\|\hat{x}\|^2 & \text{otherwise}.
\end{cases} \tag{12}$$
One reason for this choice is that it behaves well with respect to an affine scaling function ($a > 0$), since
$$L_a(r, \hat{x}) = L_a(r/a, \hat{x}).$$
That is, scaling a vector $r$ by $a$ does not change the angle $\varphi$ between $\hat{x}$ and $r$.

[Fig. 3. The distance of a candidate codeword $\hat{x}$ to the line segment $\{r - b\mathbf{1} \mid b_1 \le b \le b_2\}$: two cases in (13), (a) $\bar{r} - \bar{\hat{x}} < b_1$ and (b) $\bar{r} - \bar{\hat{x}} > b_2$, assuming $b_1 < 0 < b_2$.]

B. Offset-Only Case

In the offset-only case, i.e., $a = 1$, we simply have
$$r = x + v + b\mathbf{1},$$
where the offset $b$ is unknown to both sender and receiver. In Theorem 1 of [13], the following ML criterion was presented for the case that $a = 1$ and $b_1 \le b \le b_2$:
$$L_{b_1,b_2}(r, \hat{x}) = \begin{cases}
L_e(r - b_1\mathbf{1}, \hat{x}) & \text{if } \bar{r} - \bar{\hat{x}} < b_1, \\
L_e(r - b_2\mathbf{1}, \hat{x}) & \text{if } \bar{r} - \bar{\hat{x}} > b_2, \\
L_e(r - (\bar{r} - \bar{\hat{x}})\mathbf{1}, \hat{x}) & \text{otherwise}.
\end{cases} \tag{13}$$

This result also follows from the general setting presented in the previous section, by substituting $a_1 = a_2 = 1$. Note that the first case in (13) corresponds to the situation that $p_0 \in R_1 \cup R_2 \cup R_8$, the second case to $p_0 \in R_4 \cup R_5 \cup R_6$, and the last case to $p_0 \in R_3 \cup R_7 \cup R_9$.

We illustrate the first two situations of $L_{b_1,b_2}(r, \hat{x})$ in Fig. 3 and the last one in Fig. 4, where $\{r - b\mathbf{1} \mid b_1 \le b \le b_2\}$ is shown by a line segment passing through $r$ with direction $\mathbf{1}$. The point in $\{r - b\mathbf{1} \mid b_1 \le b \le b_2\}$ that is closest to $\hat{x}$ is $r - b_1\mathbf{1}$ or $r - b_2\mathbf{1}$ for the situations in Fig. 3. For the 'otherwise' case in (13), we consider in Fig. 4 the normalized vectors $\hat{x} - \bar{\hat{x}}\mathbf{1}$ and $r - \bar{r}\mathbf{1}$ rather than $\hat{x}$ and $r$.
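Computationally, criterion (13) amounts to clamping the mean difference $\bar{r} - \bar{\hat{x}}$ to the interval $[b_1, b_2]$ before subtracting it as the offset estimate, as the following sketch (ours; `L_offset` is an illustrative name) shows.

```python
import numpy as np

def L_offset(r, x, b1, b2):
    """Offset-only ML criterion (13): a = 1, offset b known to lie in [b1, b2]."""
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    b = np.clip(r.mean() - x.mean(), b1, b2)  # the three cases of (13) as one clamp
    return np.sum((r - b - x) ** 2)
```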

By letting $b_1 \to -\infty$ and $b_2 \to \infty$, we obtain from (13) the criterion
$$L_b(r, \hat{x}) = L_e(r - (\bar{r} - \bar{\hat{x}})\mathbf{1}, \hat{x}) = \sum_{i=1}^{n} (r_i - \hat{x}_i + \bar{\hat{x}})^2 - n\bar{r}^2 = L'_p(r, \hat{x}) - n\bar{r}^2$$
for the case that there is no knowledge at all of the magnitude of the offset [13]. Noting that the last term $n\bar{r}^2$ is irrelevant in the minimization process, we conclude that the modified Pearson criterion $L'_p(r, \hat{x})$ achieves ML decoding in this case.

C. Unbounded Scaling and Offset Case

In this subsection, an ML decoding criterion derived by Blackburn [14] for the situation when both the scaling $a$ and the offset $b$ are unbounded ($a_1 \to 0$, $a_2 \to \infty$, $b_1 \to -\infty$, $b_2 \to \infty$) is reconsidered as a special case of the results presented in Section III. In [14], Blackburn shows that an ML decoder chooses a codeword $\hat{x}$ minimizing
$$l_r(\hat{x}) = \begin{cases}
\sigma_{\hat{x}}^2 (1 - \rho_{r,\hat{x}}^2) & \text{when } \rho_{r,\hat{x}} > 0, \\
\sigma_{\hat{x}}^2 & \text{otherwise}.
\end{cases} \tag{14}$$

[Fig. 4. The distance of a candidate codeword $\hat{x}$ to the line segment $\{r - b\mathbf{1} \mid b_1 \le b \le b_2\}$ for the 'otherwise' case in (13), assuming $b_1 < 0 < b_2$.]

His argument was that when the scaling factor $a$ and the offset term $b$ are fully unknown, except for the sign of $a$, then maximizing (7) is equivalent to minimizing the smallest squared Euclidean distance from the codeword $\hat{x}$ to the subset
$$U^+ = \{ (r - b\mathbf{1})/a \mid a, b \in \mathbb{R},\ a > 0 \},$$
which is a half-subspace of $U_0 \subset \mathbb{R}^n$. Note that when $a_1 \to 0$, $a_2 \to \infty$, $b_1 \to -\infty$, $b_2 \to \infty$, our $U$ is indeed equal to Blackburn's set $U^+$. Note that $p_0 = c_0 r + d_0\mathbf{1}$ is either in $R_9 = U = U^+$ or in $R_7$. By (5), $c_0$ and $d_0$ can be rewritten as
$$c_0 = \rho_{r,\hat{x}} \frac{\sigma_{\hat{x}}}{\sigma_r} \tag{15}$$
and
$$d_0 = \bar{\hat{x}} - c_0 \bar{r}. \tag{16}$$

In case $p_0 \in R_9$, which happens if and only if $\rho_{r,\hat{x}} > 0$, then Theorem 1 says $p = p_0 = c_0 r + d_0\mathbf{1}$. Note that
$$\begin{aligned}
L_e(c_0 r + d_0\mathbf{1}, \hat{x}) &= \sum_{i=1}^{n} \left[ c_0 r_i + d_0 - \hat{x}_i \right]^2 \\
&= \sum_{i=1}^{n} \left[ c_0 (r_i - \bar{r}) - (\hat{x}_i - \bar{\hat{x}}) \right]^2 \\
&= \sum_{i=1}^{n} \left[ c_0^2 (r_i - \bar{r})^2 - 2 c_0 (r_i - \bar{r})(\hat{x}_i - \bar{\hat{x}}) + (\hat{x}_i - \bar{\hat{x}})^2 \right] \\
&= c_0^2 \sigma_r^2 - 2 c_0 \rho_{r,\hat{x}} \sigma_r \sigma_{\hat{x}} + \sigma_{\hat{x}}^2 \\
&= \left( \rho_{r,\hat{x}} \frac{\sigma_{\hat{x}}}{\sigma_r} \right)^2 \sigma_r^2 - 2 \rho_{r,\hat{x}} \frac{\sigma_{\hat{x}}}{\sigma_r} \rho_{r,\hat{x}} \sigma_{\hat{x}} \sigma_r + \sigma_{\hat{x}}^2 \\
&= \sigma_{\hat{x}}^2 (1 - \rho_{r,\hat{x}}^2),
\end{aligned}$$
which is indeed the same as in (14) when $\rho_{r,\hat{x}} > 0$.
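The identity just derived can also be confirmed numerically: for $\rho_{r,\hat{x}} > 0$, Blackburn's criterion (14) coincides with $L_e(c_0 r + d_0\mathbf{1}, \hat{x})$. The sketch below is ours (names are illustrative):

```python
import numpy as np

def blackburn(r, x):
    """Blackburn's criterion (14): sigma_x^2 * (1 - rho^2) if rho > 0, else sigma_x^2."""
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    rc, xc = r - r.mean(), x - x.mean()
    sx2 = np.dot(xc, xc)                      # unnormalized sigma_x^2
    rho = np.dot(rc, xc) / np.sqrt(np.dot(rc, rc) * sx2)
    return sx2 * (1.0 - rho ** 2) if rho > 0 else sx2

# Consistency check against L_e(c0*r + d0*1, x) from Section III:
r = np.array([0.2, 1.3, 2.6, 0.7])
x = np.array([0.0, 1.0, 2.0, 1.0])
rc, xc = r - r.mean(), x - x.mean()
c0 = np.dot(rc, xc) / np.dot(rc, rc)   # equals rho * sigma_x / sigma_r, eq. (15)
d0 = x.mean() - c0 * r.mean()          # eq. (16)
assert abs(blackburn(r, x) - np.sum((c0 * r + d0 - x) ** 2)) < 1e-9
```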

[Fig. 5. Word error rate (WER) against signal-to-noise ratio (SNR $= -20\log_{10}\sigma$, in dB) for Pearson distance decoding, Euclidean distance decoding, and ML decoding, when $q = 4$, $n = 8$, $a = 1.07$, and $b = 0.07$.]

In case $p_0 \in R_7$, Theorem 1 says $p = \bar{\hat{x}}\mathbf{1}$ since $a_2 \to \infty$. Hence,
$$L_e(p, \hat{x}) = L_e(\bar{\hat{x}}\mathbf{1}, \hat{x}) = \sigma_{\hat{x}}^2.$$
This shows that Blackburn's criterion (14) indeed appears as a special case of our general setting.

D. Simulation Results

Thus far, we have discussed ML decoding for Gaussian

noise channels with scaling and offset mismatch, and have

mentioned that Euclidean distance decoding is ML decoding

for Gaussian noise channels in Section II, while the Pearson

distance criterion (4) is optimal for channels with scaling and

offset mismatch, due to its intrinsic immunity to both scaling

and offset mismatch.

Figure 5 shows simulation results of Pearson distance decoding, Euclidean distance decoding, and ML decoding (14) when $q = 4$ and $n = 8$. The word error rate (WER) of 10,000 trials is shown as a function of the signal-to-noise ratio (SNR $= -20\log_{10}\sigma$). Results are given for 2-constrained codes [12], [15], while $a = 1.07$ and $b = 0.07$. The simulations indicate that for this case Pearson distance decoding has performance comparable to ML decoding, while Euclidean distance decoding performs considerably worse.
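A small Monte Carlo experiment in the spirit of Fig. 5 can be set up as below. This sketch is ours: it uses a hypothetical toy codebook rather than the paper's 2-constrained codes with $q = 4$, $n = 8$, and compares Euclidean decoding with Blackburn's criterion (14); exact WER values therefore differ from Fig. 5.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy codebook (illustrative); the paper instead uses 2-constrained codes.
CODEBOOK = [(0, 1, 2, 3), (3, 2, 1, 0), (0, 2, 1, 3), (3, 1, 2, 0)]

def euclidean_decode(r, codebook):
    # criterion (3)
    return min(codebook, key=lambda x: float(np.sum((r - np.asarray(x, float)) ** 2)))

def blackburn_decode(r, codebook):
    # criterion (14), ML under unbounded scaling and offset
    rc = r - r.mean()
    def score(x):
        xc = np.asarray(x, float) - np.mean(x)
        sx2 = float(np.dot(xc, xc))
        rho = float(np.dot(rc, xc)) / np.sqrt(float(np.dot(rc, rc)) * sx2)
        return sx2 * (1 - rho ** 2) if rho > 0 else sx2
    return min(codebook, key=score)

def wer(decode, a, b, sigma, trials=2000):
    """Monte Carlo word error rate over the channel (2)."""
    errors = 0
    for _ in range(trials):
        x = CODEBOOK[rng.integers(len(CODEBOOK))]
        r = a * (np.asarray(x, float) + rng.normal(0.0, sigma, len(x))) + b
        errors += tuple(decode(r, CODEBOOK)) != x
    return errors / trials
```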

V. CONCLUSION

We have derived a maximum likelihood decoding criterion for multi-level cell memories with Gaussian noise and scaling and/or offset mismatch. In our channel model, scaling and offset are restricted to certain ranges, $0 < a_1 \le a \le a_2$ and $b_1 \le b \le b_2$, which is a generalization of several prior art settings. For instance, by letting $a_1 \to 0$, $a_2 \to \infty$, $b_1 \to -\infty$, $b_2 \to \infty$, we obtain the same ML decoding criterion as proposed by Blackburn for the case of unbounded gain and offset. We also provided geometric interpretations illustrating the main ideas.

Scaling and offset mismatch are important issues in multi-level cell memories, but not the only ones. As future work, one could try to derive ML decoding criteria for multi-level cell memories for which the channel model includes dependent noise and/or inter-cell interference as well.

REFERENCES

[1] S. Aritome, NAND Flash Memory Technologies. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2015, ch. 4.
[2] K. Takeuchi, T. Tanaka, and H. Nakamura, "A double-level-V select gate array architecture for multilevel NAND flash memories," IEEE J. Solid-State Circuits, vol. 31, no. 4, pp. 602–609, Apr. 1996.
[3] W. Liu, J. Rho, and W. Sung, "Low-power high-throughput BCH error correction VLSI design for multi-level cell NAND flash memories," in Proc. IEEE Workshop on Signal Processing Systems Design and Implementation (SIPS'06), Banff, Canada, 2006, pp. 303–308.
[4] B. Chen, X. Zhang, and Z. Wang, "Error correction for multi-level NAND flash memory using Reed-Solomon codes," in Proc. IEEE Workshop on Signal Processing Systems, Washington, USA, 2008, pp. 94–99.
[5] F. Zhang, H. D. Pfister, and A. Jiang, "LDPC codes for rank modulation in flash memories," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Austin, Texas, USA, 2010, pp. 859–863.
[6] S. Solda, D. Vogrig, A. Bevilacqua, A. Gerosa, and A. Neviani, "Analog decoding of trellis coded modulation for multi-level flash memories," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), Seattle, WA, USA, 2008, pp. 744–747.
[7] G. Atwood, A. Fazio, D. Mills, and B. Reaves, "Intel StrataFlash memory technology overview," Intel Technology Journal, pp. 1–8, 1997.
[8] X. Huang, A. Kavcic, G. D. X. Ma, and T. Zhang, "Multilevel flash memories: channel modeling, capacities and optimal coding rates," Int. Journal on Advances in Systems and Measurements, vol. 6, no. 3, pp. 364–373, 2013.
[9] A. Jiang, R. Mateescu, M. Schwartz, and J. Bruck, "Rank modulation for flash memories," IEEE Trans. Inf. Theory, vol. 55, no. 6, pp. 2659–2673, Jun. 2009.
[10] H. Zhou and J. Bruck, "Balanced modulation for nonvolatile memories," arXiv preprint arXiv:1209.0744, Sep. 2012.
[11] K. A. S. Immink and K. Cai, "Composition check codes," IEEE Trans. Inf. Theory, vol. 64, no. 1, pp. 249–256, Jan. 2018.
[12] K. A. S. Immink and J. H. Weber, "Minimum Pearson distance detection for multilevel channels with gain and/or offset mismatch," IEEE Trans. Inf. Theory, vol. 60, no. 10, pp. 5966–5974, Oct. 2014.
[13] J. H. Weber and K. A. S. Immink, "Maximum likelihood decoding for Gaussian noise channels with gain or offset mismatch," IEEE Commun. Lett., vol. 22, no. 6, pp. 1128–1131, Jun. 2018.
[14] S. R. Blackburn, "Maximum likelihood decoding for multilevel channels with gain and offset mismatch," IEEE Trans. Inf. Theory, vol. 62, no. 3, pp. 1144–1149, Mar. 2016.
[15] J. H. Weber, K. A. S. Immink, and S. R. Blackburn, "Pearson codes," IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 131–135, Jan. 2016.