
Dynamic Threshold Detection Based on Pearson Distance Detection

Kees A. Schouhamer Immink, Kui Cai, and Jos H. Weber

Abstract—We consider the transmission and storage of encoded strings of symbols over a noisy channel, where dynamic threshold detection is proposed for achieving resilience against unknown scaling and offset of the received signal. We derive simple rules for dynamically estimating the unknown scale (gain) and offset. The estimates of the actual gain and offset so obtained are used to adjust the threshold levels or to re-scale the received signal within its regular range. Then, the re-scaled signal, brought into its standard range, can be forwarded to the final detection/decoding system, where optimum use can be made of the distance properties of the code by applying, for example, the Chase algorithm. A worked example of a spin-torque transfer magnetic random access memory (STT-MRAM) with an application to an extended (72, 64) Hamming code is described, where the retrieved signal is perturbed by additive Gaussian noise and unknown gain or offset.

Index Terms—Constrained coding, storage systems, non-volatile memories, Pearson distance, Euclidean distance, channel mismatch, Pearson code.

I. INTRODUCTION

In mass data storage devices, the user data are translated

into physical features that can be either electronic, magnetic,

optical, or of other nature. Due to process variations, the

magnitude of the physical effect may deviate from the nominal

values, which may affect the reliable read-out of the data.

We may distinguish between two stochastic effects that de-

termine the process variations. On the one hand, we have

the unpredictable stochastic process variations, and on the

other hand, we may observe long-term effects, also stochastic,

due to various physical effects. For example, in non-volatile

memories (NVMs), such as ﬂoating gate memories, the data

is represented by stored charge. The stored charge can leak

away from the ﬂoating gate through the gate oxide or through

the dielectric. The amount of leakage depends on various

physical parameters, for example, the device temperature, the

magnitude of the charge, the quality of the gate oxide or

dielectric, and the time elapsed between writing and reading

the data.

Spin-torque transfer magnetic random access memory (STT-MRAM) [1] is another type of emerging NVM with nanosecond reading/writing speed, virtually unlimited endurance, and zero standby power. In STT-MRAM, the binary input user data is stored as the two resistance states of a memory cell. Process variation causes a wide distribution of both the low and high resistance states, and the overlap between the two distributions results in read errors. Furthermore, it has been observed that, with increasing temperature, the low resistance hardly changes while the high resistance decreases, leading to a drift of the high resistance toward the low resistance [2], which may lead to a serious degradation of the data reliability for conventional detection.

Kees A. Schouhamer Immink is with Turing Machines Inc., Willemskade 15d, 3016 DK Rotterdam, The Netherlands. E-mail: immink@turing-machines.com.
Kui Cai is with Singapore University of Technology and Design (SUTD), 8 Somapah Rd, 487372, Singapore. E-mail: cai_kui@sutd.edu.sg.
Jos Weber is with Delft University of Technology, Delft, The Netherlands. E-mail: j.h.weber@tudelft.nl.
This work is supported by a Singapore Agency for Science, Technology and Research (A*STAR) PSF research grant and Singapore Ministry of Education Academic Research Fund Tier 2 grant MOE2016-T2-2-054.

The probability distribution of the recorded features changes over time; specifically, the mean and the variance of the distribution may change. The long-term effects are hard to predict as they depend on, for example, the (average) temperature of the storage device. An increase of the variance over time may be seen as an increase of the noise level of the storage channel, and it has a bearing on the detection quality. The mean offsets can be estimated using an aging model, but, clearly, the offset depends on unpredictable parameters such as temperature, humidity, etc., so that the prediction is inaccurate. Various techniques have been advocated to improve the detector resilience in case of channel mismatch, when the mean and the variance of the recorded features distribution have changed.

For example, estimation of the unknown offsets may be achieved by using reference cells, i.e., redundant cells with known stored data. This method is often considered too expensive in terms of redundancy, and alternative methods with lower redundancy have been sought.

Coding techniques can also be applied to ease detection in case of channel mismatch. Specifically, balanced codes [3], [4], [5] and composition check codes [6], [7], preferably in conjunction with Slepian's optimal detection [8], have been shown to offer solace in the face of channel mismatch. These coding methods are often considered too expensive in terms of coding hardware and redundancy when high-speed applications are considered.

Immink and Weber [9] advocated detectors that use the Pearson distance instead of the traditional Euclidean distance as a measure of similarity. The authors assume that the offset is constant (uniform) for all symbols in the codeword. In [10], it is assumed that the offset varies linearly over the codeword symbols, where the slope of the offset is unknown. The error performance of Pearson-distance-based detectors is intrinsically resistant to both offset and gain mismatch.

Although minimum Pearson distance detection restores the error performance loss due to channel mismatch without much redundant overhead, it remains an important open problem to optimally combine it with error correcting codes. Source data are usually encoded to improve the error reliability, which means that the codewords have good (Hamming) distance properties using structures such as, for example, Hamming or BCH codes. Exhaustive optimal detection of such codes is usually impractical as it requires the distance comparison of all valid codewords. The celebrated Chase algorithm [11] has been recommended as it enables the trading of decoder complexity versus error performance of conventional error correcting codes. The Chase algorithm makes preliminary hard decisions of reliable symbols based on a given threshold level, which reduces the exhaustive search over all symbols in the codeword to a search over only a small number of unreliable symbols. In case of channel mismatch, however, due to incorrectly tuned threshold levels, the hard decisions made are unreliable, and the Chase algorithm fails to deliver reliable detection.

In this paper, we present new dynamic threshold detection techniques used to estimate the channel's unknown gain and offset. The estimates of the actual gain and offset so obtained are used to scale the received signal or to dynamically adjust the threshold levels on a word-by-word basis. Then, the corrected signal, brought into its standard range, can be forwarded to the final detection/decoding system, where optimum use can be made of the distance properties of the code.

We set the scene in Section II with preliminaries and a description of the mismatched channel model. In Section III, we analyze the case where it is assumed that only the offset is unknown and the gain is known. In Section IV, we discuss the general case, where both gain and offset are unknown. In Section V, we study the principal case of our paper, where it is assumed that an error correcting code is applied to improve the error performance of the channel. We start by showing that channel mismatch has a detrimental effect on the error performance of the extended Hamming code decoded by a Chase decoder. We then show that the presented dynamic threshold detector (DTD) restores the error performance close to the situation with an ideal, well-informed, receiver. Section VI concludes the paper.

II. PRELIMINARIES AND CHANNEL MODEL

We consider a communication codebook, $S \subseteq Q^n$, of selected codewords $x = (x_1, x_2, \ldots, x_n)$ over the binary alphabet $Q = \{0, 1\}$, where $n$, the length of $x$, is a positive integer. We pursue the binary case here as it is the most important one in storage practice. The number of computations grows rapidly for larger alphabets, see [9], (37), which may complicate the detector design. The codeword, $x \in S$, is translated into physical features, where logical '0's are written at an average (physical) level $b_0$ and logical '1's are written at an average (physical) level $1 + b_1$, where $b_0, b_1 \in \mathbb{R}$. Both $b_0$ and $b_1$ are average deviations, or 'offsets', from the nominal levels, and are relatively small with respect to the assumed unity difference (or amplitude) between the two physical signal levels. The offsets $b_0$ and $b_1$ may be different for each codeword, but do not vary within a codeword. For unambiguous detection, the average of the physical levels associated with the logical '0's, $b_0$, is assumed to be less than that associated with the '1's, $1 + b_1$. In other words, we have the premise
$$b_0 < 1 + b_1. \quad (1)$$

Assume a codeword, $x$, is sent. The symbols of the received vector $r = (r_1, \ldots, r_n)$ are distorted by additive noise and given by
$$r_i = x_i + f(x_i; b_0, b_1) + \nu_i, \quad (2)$$
where we define the switch function
$$f(x; b_0, b_1) = (1 - x)b_0 + x b_1,$$
and $x \in \{0, 1\}$ is a dummy integer. We assume that the received vector, $r$, is corrupted by additive Gaussian noise $\nu = (\nu_1, \ldots, \nu_n)$, where the $\nu_i \in \mathbb{R}$ are zero-mean independent and identically distributed (i.i.d.) noise samples with normal distribution $N(0, \sigma^2)$. The quantity $\sigma^2 \in \mathbb{R}$ denotes the noise variance. We may rewrite (2) and obtain
$$r_i = a x_i + b + \nu_i, \quad (3)$$
where
$$b = b_0 \mbox{ and } a = 1 + b_1 - b_0. \quad (4)$$
The mean levels, $b_0$ and $b_1$, may slowly vary (drift) in time due to charge leakage or temperature change. As a result, the coefficient $a = 1 + b_1 - b_0$, usually called the gain of the channel, and the offset, $b = b_0$, are both unknown to sender and receiver. From the premise (1) we simply have $a > 0$. Note that in [9] the authors study a slightly different channel model, $r_i = a(x_i + \nu_i) + b$, where also the noise component, $\nu_i$, is scaled with the gain $a$.

We start, in the next section, with the simplest case, namely the offset-only case, $a = 1$.
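The mismatched channel (3) is straightforward to simulate. The sketch below is an illustrative Python rendering (the function name and default parameter values are ours, not taken from the paper) that produces a received vector for a given gain $a$, offset $b$, and noise level $\sigma$.

```python
import random

def transmit(x, a=0.85, b=0.15, sigma=0.1, rng=None):
    """Simulate the mismatched channel r_i = a*x_i + b + nu_i of (3)."""
    rng = rng or random.Random(42)  # fixed seed for reproducibility
    return [a * xi + b + rng.gauss(0.0, sigma) for xi in x]

# a length-6 codeword sent over a channel with the default (hypothetical) mismatch
r = transmit([1, 1, 0, 0, 1, 0])
```

Setting `sigma=0.0` makes the channel deterministic, which is convenient for checking the detection and estimation rules derived in the following sections.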

III. OFFSET-ONLY CASE

In the offset-only case, $b_0 = b_1 = b$ and $a = 1$, we simply have
$$r_i = x_i + b + \nu_i, \quad (5)$$
where the quantity $b$ is an offset unknown to both sender and receiver. For detection in the above offset-only situation, Immink and Weber [9] proposed the modified Pearson distance instead of the Euclidean distance between the received vector $r$ and a candidate codeword $\hat{x} \in S$. The modified Pearson distance is defined by
$$\delta(r, \hat{x}) = \sum_{i=1}^{n} (r_i - \hat{x}_i + \overline{\hat{x}})^2, \quad (6)$$
where we define the mean of an $n$-vector of reals $z$ by
$$\overline{z} = \frac{1}{n} \sum_{i=1}^{n} z_i. \quad (7)$$

For clerical convenience we drop the variable $r$ in (6). A minimum Pearson distance detector operates in the same way as the traditional minimum Euclidean distance detector, that is, it outputs the codeword $x_o$ 'closest', as measured in terms of Pearson distance, to the received vector, $r$, or in other words
$$x_o = \arg \min_{\hat{x} \in S} \delta(\hat{x}). \quad (8)$$
Immink and Weber showed that the error performance of the above detection rule is independent of the unknown offset $b$. The evaluation of (8) is in principle an exhaustive search for finding $x_o$, but for a structured codebook, $S$, the search is much less complex. We continue our discussion with the definition of a useful concept.

Let $S_w$ denote the set of codewords of weight $w$, that is,
$$S_w = \left\{ x \in Q^n : \sum_{i=1}^{n} x_i = w \right\}, \quad w = 0, \ldots, n.$$
A set $S_w$ is often called a constant weight code of weight $w$. We study examples where the codebook, $S$, is the union of $|V|$ constant weight codes defined by
$$S = \bigcup_{w \in V} S_w, \quad (9)$$
where the index set $V \subseteq \{0, 1, \ldots, n\}$.

After working out (6), we obtain
$$\delta(\hat{x}) = \sum_{i=1}^{n} (r_i - \hat{x}_i)^2 + n \overline{\hat{x}} (2\overline{r} - \overline{\hat{x}}), \quad (10)$$
where the first term is the square of the Euclidean distance between $r$ and $\hat{x}$, and the second term, $n \overline{\hat{x}} (2\overline{r} - \overline{\hat{x}})$, makes the distance measure, $\delta(\hat{x})$, independent of the unknown offset $b$. The exhaustive search (8) can be simplified by the following observations. The decoder hypothesizes that $x \in S_w$. Then we have
$$\delta(\hat{x} \in S_w) = \sum_{i=1}^{n} (r_i - \hat{x}_i)^2 + w \left( 2\overline{r} - \frac{w}{n} \right). \quad (11)$$

Since (8) is a minimization process, we may delete irrelevant (scaling) constants, and obtain
$$\delta(\hat{x} \in S_w) = \sum_{i=1}^{n} r_i^2 - 2 \sum_{i=1}^{n} \hat{x}_i r_i + \sum_{i=1}^{n} \hat{x}_i^2 + w \left( 2\overline{r} - \frac{w}{n} \right) \equiv w \left( 1 + 2\overline{r} - \frac{w}{n} \right) - 2 \sum_{i=1}^{n} \hat{x}_i r_i. \quad (12)$$

The symbol $\equiv$ is used to denote equivalence of the expressions (11) and (12) after deleting (scaling) constants irrelevant to the minimization operation defined in (8). Note that the term
$$w \left( 1 + 2\overline{r} - \frac{w}{n} \right)$$
depends on the number of '1's, $w$, of $\hat{x}$ and, thus, not on the specific positions of the '1's of $\hat{x}$. The only degree of freedom the detector has for minimizing $\delta(\hat{x} \in S_w)$ is permuting the symbols in $\hat{x}$ to maximize the inner product $\sum_{i=1}^{n} \hat{x}_i r_i$. Slepian [8] showed that the inner product $\sum_{i=1}^{n} \hat{x}_i r_i$, $\hat{x} \in S_w$, is maximized by pairing the largest symbol of $r$ with the largest symbol of $\hat{x}$, the second largest symbol of $r$ with the second largest symbol of $\hat{x}$, and so on.

To that end, the $n$ received symbols, $r_i$, are sorted, largest to smallest, in the same way as taught in Slepian's prior art. Let $(r'_1, r'_2, \ldots, r'_n)$ be a permutation of the received vector $(r_1, r_2, \ldots, r_n)$ such that $r'_1 \geq r'_2 \geq \ldots \geq r'_n$. Then, since the $w$ largest received symbols, $r'_i$, $1 \leq i \leq w$, are paired with '1's (and the smallest symbols $r'_i$, $w + 1 \leq i \leq n$, with '0's), we obtain
$$\delta_w = w \left( 1 + 2\overline{r} - \frac{w}{n} \right) - 2 \sum_{i=1}^{w} r'_i = \sum_{i=1}^{w} \left[ -2 (r'_i - \overline{r}) + \frac{n + 1 - 2i}{n} \right], \quad (13)$$
where for convenience we use the short-hand notation
$$\delta_w = \min_{\hat{x}} \delta(\hat{x} \in S_w).$$

Since, as is immediate from (13), $\delta_0 = \delta_n = 0$, the detector cannot distinguish between the all-'0' and the all-'1' codewords. To enable unique detection, one of the two (or both) codewords must be barred from the codebook $S$. In other words, either $V \subseteq \{1, \ldots, n\}$ or $V \subseteq \{0, 1, \ldots, n - 1\}$. Such constrained codes, $S$, called Pearson codes, have been described in [9]. In order to reduce the computational load, we may rewrite (13) in recursive form, and obtain for $1 \leq w \leq n$,
$$\delta_w = \delta_{w-1} - 2 (r'_w - \overline{r}) + \frac{n + 1 - 2w}{n}, \quad (14)$$
where we initialize with $\delta_0 = 0$. The value $w \in V$ that minimizes $\delta_w$ is denoted by $\hat{w}$, or
$$\hat{w} = \arg \min_{w \in V} \delta_w. \quad (15)$$
Once we have obtained $\hat{w}$, we may obtain an estimate of the sent codeword, $x$, by applying Slepian's algorithm, and, subsequently, we find an estimate of the offset, $b$. The estimate of the offset, denoted by $\hat{b}$, is obtained by averaging (5), or
$$\hat{b} = \frac{1}{n} \sum_{i=1}^{n} (r_i - \hat{x}_i) = \overline{r} - \frac{\hat{w}}{n}. \quad (16)$$

The retrieved vector, $r$, is re-scaled by subtracting the estimated offset, $\hat{b}$, so that
$$\hat{r}_i = r_i - \hat{b} = r_i - \left( \overline{r} - \frac{\hat{w}}{n} \right), \quad 1 \leq i \leq n, \quad (17)$$
where $\hat{r}$ denotes the corrected vector. Note that, instead of re-scaling the received signal as done above, we can adjust the threshold levels used in a Chase decoder to discriminate between reliable and unreliable symbols. For asymptotically small noise variance, $\sigma^2$, we may assume with high probability that $\hat{x}_i = x_i$, so that the variance of the offset estimate, $\hat{b}$, can be approximated by
$$E[(b - \hat{b})^2] \approx \frac{\sigma^2}{n}, \quad \sigma \ll 1, \quad (18)$$
where $E[\cdot]$ denotes the expectation operator. The next example illustrates the detection algorithm.

Example 1: Let $n = 6$, $x = (110010)$, $\sigma = 0.125$, and offset $b = 0.2$. The received word is $r = (1.194, 1.233, -0.024, 0.331, 1.402, 0.263)$, and after sorting we have $r' = (1.402, 1.233, 1.194, 0.331, 0.263, -0.024)$. We simply find $\overline{r} = 0.733$. The next table shows $\delta_w$ versus $w$ using (14).

  w    r'_w      delta_w
  1     1.402    -0.505
  2     1.233    -1.005
  3     1.194    -1.761
  4     0.331    -1.123
  5     0.263    -0.682
  6    -0.024     0.000

We find $\hat{w} = 3$. The estimated offset equals
$$\hat{b} = \overline{r} - \hat{w}/n = 0.733 - 3/6 = 0.233.$$
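The recursion (14), the minimization (15), and the offset estimate (16) fit in a few lines of code. The following Python sketch is an illustrative rendering (not the authors' implementation) that reproduces Example 1.

```python
def dtd_offset_only(r, V):
    """Offset-only dynamic threshold detection: returns (w_hat, b_hat)
    via the recursion (14), the minimization (15), and the estimate (16)."""
    n = len(r)
    rp = sorted(r, reverse=True)        # r'_1 >= r'_2 >= ... >= r'_n
    rbar = sum(r) / n
    delta, best_w, best_delta = 0.0, None, float('inf')
    for w in range(1, n + 1):           # recursion (14), delta_0 = 0
        delta += -2.0 * (rp[w - 1] - rbar) + (n + 1 - 2 * w) / n
        if w in V and delta < best_delta:
            best_w, best_delta = w, delta
    b_hat = rbar - best_w / n           # offset estimate (16)
    return best_w, b_hat

# the received word of Example 1
r = [1.194, 1.233, -0.024, 0.331, 1.402, 0.263]
w_hat, b_hat = dtd_offset_only(r, V=set(range(1, 7)))
```

Running this on the received word above gives $\hat{w} = 3$ and $\hat{b} \approx 0.233$, matching the table of Example 1.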

Example 2: Let $S$, $n$ even, be the union of two constant weight codes, that is,
$$S = S_{w_0} \cup S_{w_1}, \quad (19)$$
where $w_0 = \frac{n}{2} - 1$ and $w_1 = \frac{n}{2} + 1$. We find from (13) that
$$\delta_{w_0} = -2 \sum_{i=1}^{w_0} r'_i + w_0 \left( 1 + 2\overline{r} - \frac{w_0}{n} \right)$$
and
$$\delta_{w_1} = -2 \sum_{i=1}^{w_1} r'_i + w_1 \left( 1 + 2\overline{r} - \frac{w_1}{n} \right),$$
so that
$$\delta_{w_1} - \delta_{w_0} = -2 \left( r'_{n/2} + r'_{n/2+1} \right) + 4\overline{r}. \quad (20)$$
We define the median of the received vector, $\tilde{r}$, as the average of the two middle values ($n$ even) [12], that is,
$$\tilde{r} = \frac{1}{2} \left( r'_{n/2} + r'_{n/2+1} \right). \quad (21)$$
The receiver decides that $\hat{w} = w_1$ if
$$\delta_{w_1} - \delta_{w_0} < 0,$$
or, equivalently, if
$$\tilde{r} > \overline{r}. \quad (22)$$
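For this two-weight codebook, the whole weight decision therefore reduces to comparing the median with the mean. A minimal sketch (the function name is ours, for illustration only):

```python
def decide_weight(r):
    """Decide between w0 = n/2 - 1 and w1 = n/2 + 1 via rule (22):
    choose w1 iff the median (21) exceeds the mean (n even)."""
    n = len(r)
    rp = sorted(r, reverse=True)
    median = 0.5 * (rp[n // 2 - 1] + rp[n // 2])   # (21): r'_{n/2}, r'_{n/2+1}
    mean = sum(r) / n
    return n // 2 + 1 if median > mean else n // 2 - 1
```

For instance, a noiseless weight-4 word of length 6 with offset 0.3, received as (1.3, 1.3, 1.3, 1.3, 0.3, 0.3), has median 1.3 above mean 0.967, so the rule correctly decides $\hat{w} = w_1 = 4$, independently of the offset.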

In the next section, we take a look at the general case where we face both gain and offset mismatch, $a \neq 1$ and $b \neq 0$.

IV. PEARSON DISTANCE DETECTION

We consider the general situation as in (3), where the symbols of the received vector $r = (r_1, \ldots, r_n)$ are given by
$$r_i = a x_i + \nu_i + b, \quad (23)$$
where both quantities $a$, $a > 0$, and $b$ are unknown. Immink and Weber proposed the Pearson distance as an alternative to the Euclidean distance in case the receiver is ignorant of the actual channel's gain and offset [9]. The Pearson distance between the $n$-vectors $r$ and $\hat{x}$ is defined by
$$\delta_p(\hat{x}) = 1 - \rho_{r, \hat{x}}, \quad (24)$$
where
$$\rho_{r, \hat{x}} = \frac{\sum_{i=1}^{n} (r_i - \overline{r})(\hat{x}_i - \overline{\hat{x}})}{\sigma_r \sigma_{\hat{x}}} \quad (25)$$
is the Pearson correlation coefficient. The (unnormalized) variance of a vector $z$ is defined by
$$\sigma_z^2 = \sum_{i=1}^{n} (z_i - \overline{z})^2. \quad (26)$$

A minimum Pearson distance detector operates in the same way as the minimum Euclidean distance detector, that is, it outputs the codeword $x_o$ 'closest', as measured in terms of Pearson distance, to the received vector, or in other words
$$x_o = \arg \min_{\hat{x} \in S} \delta_p(\hat{x}). \quad (27)$$
The minimum Pearson distance detector estimates the sent codeword $x$, and implicitly it offers an estimate of the gain, $a$, and offset, $b$, using (23). We start by evaluating (24) and (27). Since (27) is a minimization process, we may delete irrelevant (scaling) constants, and obtain
$$\delta_p(\hat{x}) \equiv -\frac{1}{\sigma_{\hat{x}}} \sum_{i=1}^{n} r_i (\hat{x}_i - \overline{\hat{x}}). \quad (28)$$
As in the previous section, we consider a code $S = \bigcup_{w \in V} S_w$, where the index set $V \subseteq \{0, 1, \ldots, n\}$. Let $\hat{x} \in S_w$; then
$$\delta_p(\hat{x} \in S_w) \equiv \frac{1}{\sqrt{w - \frac{w^2}{n}}} \left( w \overline{r} - \sum_{i=1}^{n} r_i \hat{x}_i \right). \quad (29)$$
Note that $\delta_p(\hat{x} \in S_w)$ is undefined for $w = 0$ and $w = n$, and we must bar both the all-'0' and all-'1' words from $S$ for unique detection. Clearly, $V \subseteq \{1, \ldots, n - 1\}$.

Except for the inner product $\sum r_i \hat{x}_i$, the above expression depends on the number of '1's, $w$, of $\hat{x}$ and, thus, not on the specific positions of the '1's of $\hat{x}$. For maximizing the inner product $\sum r_i \hat{x}_i$ we must pair the $w$ largest symbols $r_i$ with the $w$ '1's of $\hat{x}$. Let $(r'_1, r'_2, \ldots, r'_n)$ be a permutation of the received vector $(r_1, r_2, \ldots, r_n)$ such that $r'_1 \geq r'_2 \geq \ldots \geq r'_n$. Since the $w$ '1's are paired with the largest symbols, $r'_i$, $1 \leq i \leq w$, we have [8]
$$\delta_{p,w} = -\frac{1}{\sqrt{w - \frac{w^2}{n}}} \sum_{i=1}^{w} (r'_i - \overline{r}), \quad (30)$$
where $\delta_{p,w}$ is a short-hand notation of $\min_{\hat{x}} \delta_p(\hat{x} \in S_w)$. The detector evaluates $\delta_{p,w}$ for all $w \in V$. Define
$$\hat{w} = \arg \min_{w \in V} \delta_{p,w}. \quad (31)$$
The decoder decides that the $\hat{w}$ largest received signal amplitudes, $r'_i$, $1 \leq i \leq \hat{w}$, are associated with a 'one', and the $n - \hat{w}$ smallest received signal amplitudes, $r'_i$, $\hat{w} + 1 \leq i \leq n$, are associated with a 'zero'.

The estimates of the gain, $\hat{a}$, and offset, $\hat{b}$, of the received vector $r$ are found by using (4). Let $\hat{b}_0$ and $\hat{b}_1$ denote the estimates of $b_0$ and $b_1$, respectively. Then we find
$$\hat{b}_0 = \frac{1}{n - \hat{w}} \sum_{i = \hat{w} + 1}^{n} r'_i$$
and
$$\hat{b}_1 = -1 + \frac{1}{\hat{w}} \sum_{i=1}^{\hat{w}} r'_i,$$

TABLE I
Simulation results of $10^5$ samples for $\sigma = 0.1$ and $n = 6$. The values in parentheses are computed using (35) and (36), respectively.

  w    sigma^2_{b,w}/sigma^2    sigma^2_{a,w}/sigma^2
  1    0.201 (0.200)            1.201 (1.200)
  2    0.250 (0.250)            0.745 (0.750)
  3    0.333 (0.333)            0.668 (0.667)
  4    0.497 (0.500)            0.751 (0.750)
  5    1.011 (1.000)            1.198 (1.200)

so that, after using (4),
$$\hat{a} = 1 + \hat{b}_1 - \hat{b}_0 = \frac{1}{\hat{w}} \sum_{i=1}^{\hat{w}} r'_i - \frac{1}{n - \hat{w}} \sum_{i = \hat{w} + 1}^{n} r'_i \quad (32)$$
and
$$\hat{b} = \hat{b}_0 = \frac{1}{n - \hat{w}} \sum_{i = \hat{w} + 1}^{n} r'_i. \quad (33)$$
The normalized vector $\hat{r}$ is found after scaling and offsetting with the estimated gain, $\hat{a}$, and offset, $\hat{b}$, that is,
$$\hat{r}_i = \frac{r_i - \hat{b}}{\hat{a}}, \quad 1 \leq i \leq n. \quad (34)$$
After the above normalization, the normalized vector, $\hat{r}$, is corrected to its standard range, and may be forwarded to the second part of the decoder, where the vector is processed, decoded, and quantized.
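Steps (30) through (34) translate directly into code. The following Python sketch (illustrative, not the authors' implementation; it loops exhaustively over $V$) detects $\hat{w}$, estimates $\hat{a}$ and $\hat{b}$, and normalizes the received vector.

```python
import math

def dtd_gain_offset(r, V):
    """Minimum Pearson distance detection per (30)-(31), followed by the
    gain/offset estimates (32)-(33) and the normalization (34)."""
    n = len(r)
    rp = sorted(r, reverse=True)            # r'_1 >= ... >= r'_n
    rbar = sum(r) / n

    def delta_p(w):                         # (30)
        return -sum(rp[i] - rbar for i in range(w)) / math.sqrt(w - w * w / n)

    w_hat = min(V, key=delta_p)             # (31)
    a_hat = (sum(rp[:w_hat]) / w_hat
             - sum(rp[w_hat:]) / (n - w_hat))   # gain estimate (32)
    b_hat = sum(rp[w_hat:]) / (n - w_hat)       # offset estimate (33)
    r_norm = [(ri - b_hat) / a_hat for ri in r]  # normalization (34)
    return w_hat, a_hat, b_hat, r_norm

# noiseless example: x = (110100) sent with a = 0.8, b = 0.2
w_hat, a_hat, b_hat, r_norm = dtd_gain_offset(
    [1.0, 1.0, 0.2, 1.0, 0.2, 0.2], V=range(1, 6))
```

In the noiseless example, the detector recovers $\hat{w} = 3$, $\hat{a} = 0.8$, and $\hat{b} = 0.2$ exactly, and the normalized vector coincides with the sent codeword.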

The variances of the estimates $\hat{a}$ and $\hat{b}$ depend on the numbers of '1's and '0's in the sent codeword $x$. For asymptotically small noise variance, $\sigma^2$, so that we may assume with high probability that $\hat{x}_i = x_i$, the variance of the offset estimate, denoted by $\sigma^2_{\hat{b},w}$, can be approximated by
$$\sigma^2_{\hat{b},w} = E[(b - \hat{b})^2] = E\left[ \left( b - \frac{1}{n - \hat{w}} \sum_{i = \hat{w} + 1}^{n} r'_i \right)^2 \right] = \frac{\sigma^2}{n - w}, \quad \sigma \ll 1. \quad (35)$$
Similarly, the variance of the estimate of the gain $a$, denoted by $\sigma^2_{\hat{a},w}$, is given by
$$\sigma^2_{\hat{a},w} = E[(a - \hat{a})^2] = \frac{n}{w(n - w)} \sigma^2, \quad \sigma \ll 1. \quad (36)$$
The above findings are intuitively appealing as they show that the quality of the estimates of the quantities $a$ and $b$ depends on the numbers, $n - w$ and $w$, of '0's and '1's in the sent codeword, respectively. We have verified the above estimator quality using computer simulations. Results of our simulations are collected in Table I, where we assumed the case $\sigma = 0.1$ and $n = 6$. We now consider the general case of uncoded i.i.d. input data, so that the sent codeword does not have a specified weight. The codeword's weight is in the range $\{1, \ldots, n - 1\}$. For the i.i.d. case, the variance of the

TABLE II
Simulation results of $10^5$ samples for $\sigma = 0.1$. The values in parentheses are computed using (37) and (38), respectively.

  n      sigma^2_b/sigma^2    sigma^2_a/sigma^2
  8      0.297 (0.296)        0.5919 (0.5919)
  16     0.135 (0.135)        0.2700 (0.2699)
  32     0.064 (0.065)        0.1293 (0.1293)
  64     0.031 (0.032)        0.0634 (0.0635)
  128    0.017 (0.016)        0.0314 (0.0315)

estimates $\hat{a}$ and $\hat{b}$, denoted by $\sigma^2_{\hat{a}}$ and $\sigma^2_{\hat{b}}$, can be found as the weighted average of $\sigma^2_{\hat{a},w}$ and $\sigma^2_{\hat{b},w}$, or
$$\sigma^2_{\hat{b}} = \frac{\sigma^2}{2^n - 2} \sum_{w=1}^{n-1} \binom{n}{w} \frac{1}{n - w} \quad (37)$$
and
$$\sigma^2_{\hat{a}} = \frac{\sigma^2}{2^n - 2} \sum_{w=1}^{n-1} \binom{n}{w} \frac{n}{w(n - w)}. \quad (38)$$
Results of computations and simulations are shown in Table II, where we assumed the case $\sigma = 0.1$. We have computed the relative variances of the estimators, $\sigma^2_{\hat{b}}/\sigma^2$ and $\sigma^2_{\hat{a}}/\sigma^2$, for different values of the noise level, $\sigma$, and observed that (37) and (38) are accurate up to a level where the detector is close to failure (word error rate $> 0.1$).
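Equations (37) and (38) are easily evaluated numerically. The sketch below (the function name is ours) reproduces the predicted entries of Table II.

```python
from math import comb

def predicted_variances(n):
    """Relative estimator variances (37) and (38) for uncoded i.i.d. data:
    binomially weighted averages of sigma^2_{b,w} and sigma^2_{a,w}."""
    norm = 2 ** n - 2                      # number of words of weight 1..n-1
    var_b = sum(comb(n, w) / (n - w) for w in range(1, n)) / norm
    var_a = sum(comb(n, w) * n / (w * (n - w)) for w in range(1, n)) / norm
    return var_b, var_a

vb, va = predicted_variances(8)
# matches the parenthesized Table II entries for n = 8: 0.296 and 0.5919
```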

In the next section, we show results of computer simulations with the newly developed DTD algorithms applied to the decoding of an extended Hamming code.

V. APPLICATION TO AN EXTENDED HAMMING CODE

Error correction is needed to guarantee excellent error performance over the memory's life span. To be compatible with the fast read access time of STT-MRAM, the error correction code adopted needs to have a low redundancy of around ten percent, and it must have a short codeword length. A (71, 64) regular Hamming code is used for Everspin's 16 MB MRAM, where straightforward hard decision detection is used [13]. Cai and Immink [14] propose a (72, 64) extended Hamming code with a two-stage hybrid decoding algorithm that incorporates hard decision detection for the first stage plus a Chase II decoder [11] for the second stage of the decoding routine.

In the next subsection, we show, using computer simulations, that the application of DTD in the above scenario offers resilience against unknown charge leakage or temperature change. We show results of computer simulations with the (72, 64) Hamming code, which is applied to a simple channel with additive noise.

A. Evaluation of the Hamming code

An $(n, n - r)$ Hamming code is characterized by two positive integer parameters, $r$ and $n$, where the redundancy $r > 1$ is a design parameter and $n$, $n \leq 2^r - 1$, is the length of the code [13]. The payload is of length $n - r$. The minimum Hamming distance of a regular Hamming code equals $d_H = 3$. An extended Hamming code is a regular $(n, n - r)$ Hamming code plus an overall parity check. The minimum Hamming distance of an extended Hamming code equals $d_H = 4$.

The word error rate of binary words transmitted over an ideal, matched, channel, using a Hamming code under maximum likelihood soft decision decoding, denoted by WER$_H$, equals (union bound estimate)
$$\mbox{WER}_H \approx A_H(n, r) \, Q\!\left( \frac{\sqrt{d_H}}{2\sigma} \right), \quad \sigma \ll 1, \quad (39)$$
where $A_H$ denotes the average number of codewords at minimum distance $d_H$, and the $Q$-function is defined by
$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-\frac{u^2}{2}} \, du. \quad (40)$$
For a regular Hamming code, we have
$$A_H(n, r) = \frac{n(n-1)}{6}, \quad n = 2^r - 1. \quad (41)$$
For a shortened Hamming code, $n < 2^r - 1$, since the weight distribution of many types of linear codes, including Hamming codes, is asymptotically binomial [15] for $n \gg 1$, we can use the approximation
$$A_H(n, r) \approx \binom{n}{3} \frac{1}{2^r}, \quad (42)$$
and for an extended Hamming code (only even weights)
$$A_H(n, r) \approx \binom{n}{4} \frac{1}{2^{r-1}}. \quad (43)$$
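The union bound (39) with the multiplicity approximation (43) is a one-liner to evaluate. The sketch below (function names are ours; $Q$ is computed via the standard identity $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$) evaluates it for an extended Hamming code.

```python
from math import comb, erfc, sqrt

def q_func(x):
    """Gaussian tail function Q(x) of (40), via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

def wer_union_bound(n, r, sigma, d_h=4):
    """Union bound estimate (39) for an extended Hamming code, with the
    minimum-distance multiplicity A_H approximated by (43)."""
    a_h = comb(n, 4) / 2 ** (r - 1)
    return a_h * q_func(sqrt(d_h) / (2.0 * sigma))

# for the extended (72, 64) code, r = 8:
a_h = comb(72, 4) / 2 ** 7    # about 8037 codewords at distance 4
```

For instance, at $-20 \log \sigma = 16$ (i.e., $\sigma \approx 0.158$), `wer_union_bound(72, 8, 0.158)` lands near the $10^{-6}$ word error rates reported for the ideal channel.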

Exhaustive optimal detection of long Hamming codes, such as the extended (72, 64) code, is impractical as it requires the distance comparison of $2^{64}$ valid codewords. Sub-optimal detection can be accomplished with, for example, the well-known Chase algorithm [11], [14].

The Chase algorithm selects the $T$ least reliable bits by selecting the symbols, $r_i$, having the smallest absolute channel value with respect to the decision level. The remaining $n - T$ symbols, that is, the most reliable ones, are quantized. Then, the $T$ unreliable symbols are selected, using exhaustive search, in such a way that the word so obtained is a valid codeword of the Hamming code at hand and that the word minimizes the Euclidean distance to the received vector $r$. The error performance of the Chase algorithm is worse than the counterpart error performance of the full-fledged maximum likelihood detector given by (39). The loss in performance depends on the parameter $T$.
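The selection-and-flip structure just described can be sketched in a few lines. The toy Python code below is our illustration, not the decoder of [11] or [14]: it runs on a tiny codebook whose validity check is an exhaustive lookup, whereas a real (72, 64) decoder would check membership via the code's parity-check matrix.

```python
from itertools import product

def chase_decode(r, codebook, T=2, threshold=0.5):
    """Toy Chase-style decoder: hard-quantize the n - T most reliable
    symbols, exhaustively flip the T least reliable ones, and keep the
    valid codeword closest to r in Euclidean distance."""
    n = len(r)
    hard = [1 if ri > threshold else 0 for ri in r]
    # reliability = distance of the channel value to the threshold
    weak = sorted(range(n), key=lambda i: abs(r[i] - threshold))[:T]
    best, best_d = None, float('inf')
    for bits in product([0, 1], repeat=T):
        cand = hard[:]
        for pos, bit in zip(weak, bits):
            cand[pos] = bit
        if tuple(cand) in codebook:   # validity check (lookup only for toy codes)
            d = sum((ri - ci) ** 2 for ri, ci in zip(r, cand))
            if d < best_d:
                best, best_d = cand, d
    return best

# toy codebook: all even-weight words of length 4 (single parity check code)
codebook = {c for c in product([0, 1], repeat=4) if sum(c) % 2 == 0}
out = chase_decode([0.9, 0.45, 0.1, 0.05], codebook)   # -> [1, 1, 0, 0]
```

Note how the quality of the fixed `threshold` governs the hard decisions on the $n - T$ reliable positions; this is precisely where an incorrectly tuned threshold, caused by gain or offset mismatch, makes the Chase decoder fail.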

As the parameter $T$ determines the complexity of the search, it is usually quite small in practice. The majority of symbols are thus quantized using hard decision detection, where a pre-fixed threshold is used. The error performance of the Chase decoder therefore depends heavily on the accuracy of the threshold with respect to mismatch of the gain and offset of the signal received. This means that the Chase decoder loses a major part of its error performance in case of channel mismatch.

Using computer simulations, we computed the error performance of the Chase decoder in the presence of offset or gain mismatch versus the noise level $-20 \log_{10} \sigma$. We simulated the error performance of an extended (72, 64) Hamming code decoded by the Chase decoder, where we selected, in Figure 1, the offset mismatch case, $a = 1$ and $b = 0.15$. Figure 2 shows the gain mismatch case, $a = 0.85$ and $b = 0$ ($b_0 = 0$, $b_1 = -0.15$). Both diagrams show the significant loss in performance due to channel mismatch. Combinations of offset and gain mismatch give similarly devastating results [9]. The word error rate found by our simulations of the ideal channel (without mismatch) is quite close to the theoretical performance given by the union bound estimate (39) and (43).

In order to improve the detection quality, we applied DTD, as presented in the previous sections, followed by (standard) Chase decoding. Before discussing the simulation results, we note two observations. The all-'1' word is not a valid codeword, and the all-'0' word is a valid codeword of the (72, 64) Hamming code. The probability of occurrence of the all-'0' word, assuming equiprobable codewords, is $2^{-64} \approx 10^{-19}$, which is small enough to be ignored in most practical situations. The weight of a codeword of an extended Hamming code is even, so that the number of evaluations of $\delta_w$ or $\delta_{p,w}$, see (13) and (30), can be reduced.

Figure 1 shows the word error rate in case DTD is applied in the offset-only case, $a = 1$ and $b = 0.15$. We notice that DTD restores the error performance close to the error performance of the ideal offset-free situation.

Figure 2, the gain mismatch case, shows that the error performance with DTD (Curve 2) is worse than that of the ideal case, $a = 1$, without DTD (Curve 4). This can easily be understood: in case $a = 0.85$ ($b_0 = 0$, $b_1 = -0.15$), the average levels, $b_0$ and $1 + b_1$, of the recorded data, $x_i$, are closer to each other than in the ideal case, $a = 1$. Curve 3 shows that the error performance with DTD is close to the situation where the receiver is informed about the actual gain, $a = 0.85$. This gives a fairer comparison, and we observe that the WER of DTD almost overlaps with the simulated, matched-channel performance. This demonstrates the efficacy of DTD for the case $a = 0.85$ ($b_0 = 0$, $b_1 = -0.15$).

Figure 3 shows the WER as a function of the offset mismatch, $b$, where $a = 1$ and $-20 \log \sigma = 15$, using a Chase decoder, $T = 4$. The error performance of the DTD is unaffected by the offset mismatch, $b$, and is close to the performance without mismatch. Figure 4 shows the WER as a function of $b_1$, where the gain $a = 1 + b_1$, $-20 \log \sigma = 15.5$, and $b_0 = 0$, using a Chase decoder, $T = 4$. Curve 3 shows the situation where the receiver is informed about the actual gain (no mismatch), and we infer that the error performances of a receiver of the matched channel and a receiver of the mismatched channel combined with DTD are very similar.

Above we have shown simulation results of dynamic threshold detection used in conjunction with an extended Hamming code and a Chase decoder. We remark that although in this paper we exemplify DTD on an extended Hamming code, the hybrid DTD/decoding algorithm is a general tool that can be applied to other (extended) BCH codes, LDPC codes, polar codes, etc., for applications in both data storage and transmission systems.

[Figure 1: WER versus noise level -20 log sigma for a = 1. Curves: (1) b = 0.15, w/o DTD; (2) b = 0.15, with DTD; (3) b = 0, w/o DTD; (4) b = 0, with DTD; (5) union bound.]

Fig. 1. Word error rate (WER) of the extended (72, 64) Hamming code with and without dynamic threshold detection (DTD), and with and without an offset, b = 0.15, using a Chase decoder, T = 4. The union bound estimate of the word error rate for the ideal channel, a = 1 and b = 0, given by (39), is plotted as a reference (Curve 5).

[Figure 2: WER versus noise level -20 log sigma. Curves: (1) a = 0.85 (b0 = 0, b1 = -0.15), w/o DTD; (2) a = 0.85, with DTD; (3) a = 0.85, with known a, b0, b1 for detection/decoding; (4) a = 1 (b0 = 0, b1 = 0), w/o DTD; (5) a = 1, with DTD; (6) union bound.]

Fig. 2. Word error rate (WER) of the extended (72, 64) Hamming code with and without dynamic threshold detection (DTD), and with and without a gain mismatch, a = 0.85 (b0 = 0, b1 = -0.15), using a Chase decoder, T = 4. The union bound estimate, Curve 6, of the word error rate for the ideal channel, a = 1 and b = 0, given by (39), is plotted as a reference. Curves 2 and 3 show that the error performance with DTD is close to the situation where the receiver is informed about the actual gain, a = 0.85.

VI. CONCLUSIONS

We have considered the transmission and storage of encoded strings of binary symbols over a storage or transmission channel, where a new dynamic threshold detection system has been presented, which is based on the Pearson distance. Dynamic threshold detection is used for achieving resilience against unknown signal-dependent offset and corruption with additive noise. We have presented two algorithms, namely a first one for estimating an unknown offset only, and a second one for estimating both unknown offset and gain. As an

[Figure 3: WER versus offset b for -20 log sigma = 15. Curves: (1) w/o DTD; (2) with DTD.]

Fig. 3. Word error rate (WER) of the extended (72, 64) Hamming code with and without dynamic threshold detection (DTD), versus the offset mismatch b, where a = 1 and -20 log sigma = 15, using a Chase decoder, T = 4.

−0.2 −0.15 −0.1 −0.05 0

10−5

10−4

10−3

10−2

10−1

b1

WER

−20 logσ = 15.5

a=1+b1, b0=0

1 w/o DTD

2 with DTD

3 with known a, b0, b1 for detection/decoding

Fig. 4. Word error rate (WER) of the extended (72, 64) Hamming code with and without dynamic threshold detection (DTD), versus the gain mismatch a = 1 + b1, b0 = 0, where −20 log σ = 15.5, using a Chase decoder, T = 4. Curve 3 shows the situation where the receiver is informed about the actual gain.

example to assess the benefit of the new dynamic threshold detection, we have investigated the error performance of an extended (72, 64) Hamming code using a Chase decoder. The Chase algorithm makes hard decisions on reliable symbols, namely those sufficiently above or below a given threshold level. In case of channel mismatch, however, the threshold levels are incorrectly tuned, the hard decisions become unreliable, and as a result the Chase algorithm fails. We have shown that the error performance of the extended Hamming code degrades significantly in the face of an unknown offset or gain mismatch. The presented threshold detector dynamically adjusts the threshold levels (or re-scales the received signal) by estimating the unknown offset or gain, and thereby restores the error performance close to that of the mismatch-free channel. A worked example of a spin-torque transfer magnetic random access memory (STT-MRAM) with an application to an extended (72, 64) Hamming code has been described, where the retrieved signal is perturbed by additive Gaussian noise and unknown gain or offset.
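To make the detection chain summarized above concrete, the following sketch illustrates the general idea in Python. It is not the paper's exact estimator: the median split and cluster-mean gain/offset estimates, the nominal 0/1 signal levels, and all parameter values are illustrative assumptions; the sketch also assumes the word contains both symbol values.

```python
import numpy as np

def rescale(r):
    """Estimate unknown gain a and offset b of a received binary (0/1)
    signal and bring it back to its standard [0, 1] range.
    Illustrative method: split the samples at their median and use the
    two cluster means as estimates of the low and high signal levels."""
    t = np.median(r)                    # provisional split point
    m0 = r[r <= t].mean()               # estimated level of the 0-symbols
    m1 = r[r > t].mean()                # estimated level of the 1-symbols
    a_hat = m1 - m0                     # gain estimate
    b_hat = m0                          # offset estimate
    return (r - b_hat) / a_hat          # re-scaled signal

def hard_decisions_and_flips(r_scaled, T=4):
    """Chase-style step: hard-slice the re-scaled signal at 1/2 and mark
    the T least reliable positions (closest to the threshold), which the
    Chase algorithm would flip in its trial decodings."""
    hard = (r_scaled > 0.5).astype(int)
    reliability = np.abs(r_scaled - 0.5)
    flip_positions = np.argsort(reliability)[:T]
    return hard, flip_positions

rng = np.random.default_rng(1)
c = rng.integers(0, 2, 72)              # length-72 word (stand-in codeword)
a, b, sigma = 0.85, 0.1, 0.05           # mismatched channel parameters
r = a * c + b + sigma * rng.standard_normal(72)

r_scaled = rescale(r)
hard, flips = hard_decisions_and_flips(r_scaled, T=4)
```

The point of the re-scaling step is that the hard-decision threshold of 1/2 is only meaningful in the standard range; slicing the raw signal r at 1/2 under this mismatch would misclassify many symbols, which is exactly the failure mode of the Chase decoder without DTD.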

REFERENCES

[1] M. Hosomi, H. Yamagishi, T. Yamamoto, K. Bessho, Y. Higo, K. Yamane, H. Yamada, M. Shoji, H. Hachino, C. Fukumoto, H. Nagao, and H. Kano, “A novel nonvolatile memory with spin torque transfer magnetization switching: Spin-RAM,” Tech. Dig. Intl. Electron Devices Meeting (IEDM), Washington, USA, pp. 459-462, Dec. 2005.

[2] X. Kou, J. Schmalhorst, A. Thomas, and G. Reiss, “Temperature dependence of the resistance of magnetic tunnel junctions with MgO barrier,” Appl. Phys. Lett., vol. 88, pp. 212-215, 2006.

[3] K. A. S. Immink, “Coding Schemes for Multi-Level Channels with Unknown Gain and/or Offset Using Balance and Energy Constraints,” IEEE International Symposium on Information Theory (ISIT), Istanbul, pp. 709-713, July 2013.

[4] H. Zhou, A. Jiang, and J. Bruck, “Balanced Modulation for Nonvolatile Memories,” arXiv:1209.0744, Sept. 2012.

[5] B. Peleato, R. Agarwal, J. M. Cioffi, M. Qin, and P. H. Siegel, “Adaptive Read Thresholds for NAND Flash,” IEEE Trans. Commun., vol. COM-63, pp. 3069-3081, Sept. 2015.

[6] F. Sala, R. Gabrys, and L. Dolecek, “Dynamic Threshold Schemes for Multi-Level Non-Volatile Memories,” IEEE Trans. Commun., vol. COM-61, pp. 2624-2634, July 2013.

[7] K. A. S. Immink and K. Cai, “Composition Check Codes,” IEEE Trans. Inform. Theory, vol. IT-64, pp. 249-256, Jan. 2018.

[8] D. Slepian, “Permutation Modulation,” Proc. IEEE, vol. 53, pp. 228-236, March 1965.

[9] K. A. S. Immink and J. H. Weber, “Minimum Pearson Distance Detection for Multi-Level Channels with Gain and/or Offset Mismatch,” IEEE Trans. Inform. Theory, vol. IT-60, pp. 5966-5974, Oct. 2014.

[10] K. A. S. Immink and V. Skachek, “Minimum Pearson Distance Detection Using Mass-Centered Codewords in the Presence of Unknown Varying Offset,” IEEE Journal on Selected Areas in Communications, vol. 34, pp. 2510-2517, 2016.

[11] D. Chase, “A Class of Algorithms for Decoding Block Codes with Channel Measurement Information,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 170-179, Jan. 1972.

[12] R. V. Hogg and A. T. Craig, Introduction to Mathematical Statistics, 5th ed. New York: Macmillan, 1995.

[13] W. E. Ryan and S. Lin, Channel Codes: Classical and Modern, Cambridge University Press, 2009.

[14] K. Cai and K. A. S. Immink, “Cascaded Channel Model, Analysis, and Hybrid Decoding for Spin-Torque Transfer Magnetic Random Access Memory (STT-MRAM),” IEEE Trans. Magn., vol. MAG-53, pp. 1-11, Nov. 2017.

[15] V. M. Sidel'nikov, “Weight spectrum of binary Bose-Chaudhuri-Hocquenghem codes,” Probl. Peredachi Inform., vol. 7, no. 1, pp. 14-22, Jan.-Mar. 1971.

Kees A. Schouhamer Immink (M'81-SM'86-F'90) received his PhD degree from the Eindhoven University of Technology. From 1994 till 2014 he was an adjunct professor at the Institute for Experimental Mathematics, Essen-Duisburg University, Germany. In 1998, he founded Turing Machines Inc., an innovative start-up focused on novel signal processing for solid-state (Flash) memories, where he currently holds the position of president. Immink designed coding techniques for digital audio and video recording products such as Compact Disc, CD-ROM, DCC, DVD, and Blu-ray Disc. He received a Knighthood in 2000, a personal Emmy award in 2004, the 2017 IEEE Medal of Honor, the 1999 AES Gold Medal, the 2004 SMPTE Progress Medal, the 2014 Eduard Rhein Prize for Technology, and the 2015 IET Faraday Medal. He received the Golden Jubilee Award for Technological Innovation from the IEEE Information Theory Society in 1998. He was inducted into the Consumer Electronics Hall of Fame, and elected into the Royal Netherlands Academy of Sciences and the (US) National Academy of Engineering. He received an honorary doctorate from the University of Johannesburg in 2014. He served the profession as President of the Audio Engineering Society Inc., New York, in 2003.

Kui Cai received her B.E. degree in information and control engineering from Shanghai Jiao Tong University, Shanghai, China, and a joint Ph.D. degree in electrical engineering from Technical University of Eindhoven, The Netherlands, and National University of Singapore. Currently, she is an Associate Professor with Singapore University of Technology and Design (SUTD). She received the 2008 IEEE Communications Society Best Paper Award in Coding and Signal Processing for Data Storage. She is an IEEE Senior Member, and served as the Vice-Chair (Academia) of the IEEE Communications Society Data Storage Technical Committee (DSTC) during 2015 and 2016. Her main research interests are in the areas of coding theory, information theory, and signal processing for various data storage systems and digital communications.

Jos H. Weber (S'87-M'90-SM'00) was born in Schiedam, The Netherlands, in 1961. He received the M.Sc. (in mathematics, with honors), Ph.D., and MBT (Master of Business Telecommunications) degrees from Delft University of Technology, Delft, The Netherlands, in 1985, 1989, and 1996, respectively. Since 1985 he has been with the Faculty of Electrical Engineering, Mathematics, and Computer Science of Delft University of Technology. Currently, he is an associate professor in the Department of Applied Mathematics. He is the chairman of the WIC (Werkgemeenschap voor Informatie- en Communicatietheorie in the Benelux) and the secretary of the IEEE Benelux Chapter on Information Theory. He was a Visiting Researcher at the University of California at Davis, USA, the University of Johannesburg, South Africa, the Tokyo Institute of Technology, Japan, and EPFL, Switzerland. His main research interests are in the area of channel coding.