Channel Coding
J.H. Weber (TU Delft)
L.M.G.M. Tolhuizen (Philips Research Eindhoven)
K.A. Schouhamer Immink (University of Essen/Turing Machines)
Channel coding plays a fundamental role in digital communication and in digital
storage systems. The position of channel coding in such a system is depicted in
Figure 4.1 overleaf. The channel encoder adds redundancy to the (possibly source
encoded and encrypted) messages generated by the information source, in order to
make them more resistant to noise and other disturbances affecting the modulated
signals during transmission over the channel. The channel decoder exploits the
redundancy when trying to retrieve the original information based on the
demodulator output. The choice of a channel coding scheme for a particular application
is a trade-off between various factors, such as the rate (the ratio between the
number of information symbols and the number of code symbols), the reliability (the
bit or message error probability), and the complexity (the number of calculations
required to perform the encoding and decoding operations).
1This chapter covers references [346] – [450].
90 Chapter 4 – Channel Coding
Figure 4.1: Channel coding as a component in a communication or storage system.
(Block diagram. Transmit chain: source, source encoder, encrypter, channel encoder,
modulator. Receive chain: demodulator, channel decoder, decrypter, source decoder, user.)
In his landmark paper [3], Shannon showed that virtually error-free communication
is possible at any rate below the channel capacity. However, his result did not
include explicit constructions and allowed for infinite bandwidth and complexity.
Hence, ever since 1948, scientists and engineers have been working to further
develop coding theory and to find practically implementable coding schemes. The
paper of Costello, Hagenauer, Imai and Wicker gives a good overview of applications
of error-control coding. Some of the codes emerging from coding theory as
developed in the 1950s and 1960s have been applied in mass consumer products
like the CD (developed jointly by Philips and Sony in the 1970s and 1980s) and
GSM (1990s). Classical reference works are the book of MacWilliams and Sloane
[42] and that of Blahut [62]. A more recent reference is the 2-volume work [108].
The above books focus mainly on block codes; the book of Johannesson and
Zigangirov [110] deals exclusively (and expertly!) with convolutional codes. In
[109], a comprehensive overview is given of (modulation) codes particularly
designed for data storage systems, such as optical and magnetic recording products.
The introduction of turbo codes in 1993 [90] caused a true revolution in
error-control coding. These codes allow transmission rates that closely approach channel
capacity. Also, the re-discovery of Gallager’s low-density parity-check (LDPC)
codes [15] contributed to the large present interest in iterative decoding, both
theoretically and practically (iterative decoders are being applied in UMTS).
Sessions on channel coding have been part of the Symposia on Information The-
ory held in the Benelux since 1980. On average, about four channel coding papers
were presented per symposium. Among the highlights of the many Benelux con-
tributions to this field are the celebrated Roos bound on the minimum distance of
cyclic codes [352], Best’s work on the performance evaluation of convolutional
codes on the binary symmetric channel [368], and the comprehensive survey papers
by Delsarte on association schemes in the context of coding theory [400], [444].
In this chapter, we briefly describe the over one hundred papers on channel coding
presented at the Symposia on Information Theory in the Benelux. Some structure
has been pursued by classifying each paper into one of the following categories:
constructions and properties of (block) codes (Section 4.1), decoding techniques
(Section 4.2), codes for data storage systems (Section 4.3), codes for special chan-
nels (Section 4.4), and, finally, applications (Section 4.5). Some categories have
been divided further into subcategories. The classification is not always unambigu-
ous, since many papers deal with more than one aspect (e.g., a paper presenting
a code construction together with an accompanying decoding method). The final
choice represents the main contribution of the paper in the opinion of the authors
of this chapter.
4.1 Block Codes
4.1.1 Constructions
In this section, we discuss papers that deal with the construction of block codes.
Some papers in this section might just as well have been discussed in the next
section, as they aim at constructing codes with special properties, e.g. a large min-
imum distance.
The well-known Griesmer bound states that the length $n$ of a binary $[n, k, d]$ code
satisfies the following inequality:
$$n \ge g(k, d) := \sum_{i=0}^{k-1} \left\lceil \frac{d}{2^i} \right\rceil. \tag{4.1}$$
In [349], Van Tilborg and Helleseth explicitly construct, for each k4, a binary
k38] code that is readily seen to meet the
Griesmer bound with equality. It is claimed that for k8, up to equivalence,
the constructed codes are the only ones with these parameters. In [380], Kapralov
and Tonchev construct self-dual binary codes from the known 2-(21,10,3) designs
without ovals, and study the automorphism groups of these codes.
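The right-hand side of the Griesmer bound is easy to evaluate numerically. The following is a minimal sketch, not taken from any of the cited papers; the function names `griesmer_length` and `meets_griesmer` are illustrative choices.

```python
def griesmer_length(k: int, d: int) -> int:
    """Return g(k, d) = sum_{i=0}^{k-1} ceil(d / 2^i)."""
    # -(-d // m) is integer ceiling division for positive m
    return sum(-(-d // (1 << i)) for i in range(k))


def meets_griesmer(n: int, k: int, d: int) -> bool:
    """True if a binary [n, k, d] code would meet the bound with equality."""
    return n == griesmer_length(k, d)


if __name__ == "__main__":
    # The binary simplex codes [2^k - 1, k, 2^(k-1)] meet the bound with equality.
    for k in range(2, 8):
        print(k, meets_griesmer(2**k - 1, k, 2 ** (k - 1)))
```

For instance, the [7,3,4] simplex code gives g(3,4) = 4 + 2 + 1 = 7, so it meets the bound with equality.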
In [401], Ericson and Zinoviev give three methods for constructing spherical codes
(i.e., sets of unit norm vectors in $\mathbb{R}^n$) from binary constant weight codes. Bounds
are given on the dimensionality, the minimum squared Euclidean distance, and the
cardinality of the resulting spherical codes, and numerical examples are given.
In [403], Peirani studies a class of codes obtained by application of the well-known
$(u, u+v)$ construction to a simplex code $U$ and a code $V$ from a class of codes
with asymptotically normal weight distribution. It is shown that the resulting codes
have an asymptotically normal weight distribution as well, by using properties of
the dual of the $(u, u+v)$ code, the MacWilliams identity, and the central limit
theorem.
According to the Singleton bound, the cardinality of a code $C$ of length $n$ and
minimum distance $d$ over a $q$-ary alphabet $Q$ is at most $q^{n-d+1}$. In case of
equality, $C$ is called an MDS code. Examples of MDS codes are Reed–Solomon codes,
which are defined if $Q$ is endowed with the structure of a finite field (and hence $q$ is
a power of a prime). In [404], Vanroose studies MDS codes over the alphabet $\mathbb{Z}_m$.
His main results are the following. Let $N^L_m(k)$ denote the largest length of a linear
MDS code over $\mathbb{Z}_m$ with $m^k$ words. Then $N^L_m(2) = p + 1$, and $N^L_m$ [...],
where $p$ is the largest prime factor of $m$. Note that the demand that the code is
linear over $\mathbb{Z}_m$ is quite restrictive: if $m$ is the power of a prime, doubly extended
Reed-Solomon codes are $[m+1, k, m+2-k]$ codes for each $k \in \{1, 2, \ldots, m+1\}$.
In [422], Van Dijk and Keuning describe a construction of binary quasi-cyclic
codes from quaternary BCH codes. The length and dimension of the binary code
is determined by the generator polynomial of its originating quaternary code; its
minimum distance is at least the minimum distance of the quaternary code. For
some example codes obtained with this construction, the true minimum distance
(found by computer search) equals the best known minimum distance for binary
linear codes of the given length and dimension.
An $(n, w, \lambda)$ optical orthogonal code $C$ is a set of binary sequences of length $n$ and
weight $w$ such that for each $x \in C$ and integer $\tau \in \{1, 2, \ldots, n-1\}$,
$$\sum_{t=0}^{n-1} x_t x_{t+\tau} \le \lambda, \tag{4.2}$$
and for any two distinct $x, y \in C$ and each integer $\tau \in \{0, 1, \ldots, n-1\}$,
$$\sum_{t=0}^{n-1} x_t y_{t+\tau} \le \lambda. \tag{4.3}$$
The subscripts are to be taken modulo $n$. Optical orthogonal codes can be used to
allow multi-user optical communication.
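The two correlation conditions can be checked by brute force for small codes. The sketch below assumes that the sums in (4.2) and (4.3) run over one full period, t = 0, ..., n-1, with indices taken modulo n; the function names and the example codewords are illustrative and not taken from the cited papers.

```python
from itertools import combinations


def correlation(x, y, tau):
    """Periodic correlation sum_t x[t] * y[(t + tau) mod n]."""
    n = len(x)
    return sum(x[t] * y[(t + tau) % n] for t in range(n))


def is_optical_orthogonal(code, w, lam):
    """Check the weight, auto-correlation (4.2) and cross-correlation (4.3) conditions."""
    n = len(code[0])
    if any(len(x) != n or sum(x) != w for x in code):
        return False
    for x in code:
        # auto-correlation: all non-zero cyclic shifts
        if any(correlation(x, x, tau) > lam for tau in range(1, n)):
            return False
    for x, y in combinations(code, 2):
        # cross-correlation: all cyclic shifts, including tau = 0
        if any(correlation(x, y, tau) > lam for tau in range(n)):
            return False
    return True


def from_support(n, support):
    """Binary sequence of length n with ones at the given positions."""
    return [1 if t in support else 0 for t in range(n)]


if __name__ == "__main__":
    # Supports {0,1,4} and {0,2,7} have disjoint, repetition-free difference sets mod 13.
    code = [from_support(13, {0, 1, 4}), from_support(13, {0, 2, 7})]
    print(is_optical_orthogonal(code, w=3, lam=1))
```

Checking each unordered pair once suffices, because the correlation of (y, x) at shift τ equals that of (x, y) at shift n - τ.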
In [426], Stam and Vinck give a good overview of the known results in this area.
They also introduce a property they call “super cross-correlation”: for all distinct
$x$, $y$ and $z$ in $C$, and integer $\tau \in \{0, 1, \ldots, n-1\}$, it is demanded that
xtzt+τλ. (4.4)
Codes satisfying this extra property could be used in applications with partial syn-
chronization between different codewords and where the mutually synchronized
words typically are not sent simultaneously. In [436], Martirosyan and Vinck
describe a construction of optical orthogonal codes with $\lambda = 1$. If a certain parameter
in their construction is small enough, their code contains, in a first-order approx-
imation, as many words as possible. Specific examples of good codes resulting
from the construction are tabulated.
Figure 4.2: Citation of the Roos bound in a textbook from 2003.
4.1.2 Properties
Over the years, properties like the length, cardinality, minimum distance, or weight
distribution of codes belonging to a particular family have been studied exten-
sively. In this section we review miscellaneous results in this area as presented at
the various Benelux Information Theory symposia.
In [352], Roos states and proves what in present textbooks (see Figure 4.2) is
referred to as the “Roos bound” for the minimum distance of cyclic codes. The
bound reads as follows. Let $\alpha$ be an $n$-th primitive root of unity in GF($q$). Let
$b$, $c_1$, $c_2$, $\delta$ and $s$ be integers such that $\delta \ge 2$, $(n, c_1) = 1$, and
$(n, c_2) < \delta$, and let
$$N := \{\alpha^{b + i_1 c_1 + i_2 c_2} \mid 0 \le i_1 \le \delta - 2,\ 0 \le i_2 \le s\}. \tag{4.5}$$
Let $C$ be a cyclic code over GF($q$) such that each element of $N$ is a zero of $C$.
That is, for each word $c = (c_0, c_1, \ldots, c_{n-1}) \in C$ and each $\beta \in N$, we have that
$\sum_{i=0}^{n-1} c_i \beta^i = 0$. Then the minimum distance of $C$ is at least $\delta + s$. The Roos
bound is often applied to prove a lower bound on the minimum distance of a subfield
subcode of $C$. For example, let $\alpha$ be a primitive 51st root of unity in GF($2^8$). Let $B$ be
the binary cyclic code with zeroes $\alpha$, $\alpha^5$ and $\alpha^9$. The conjugacy constraints imply
that all elements of $N = \{\alpha^i \mid i \in \{7, 8, 9, 10, 13, 14, 15, 16\}\}$ are zeroes of $B$. It
follows from the Roos bound, with $b = 7$, $c_1 = 1$, $c_2 = 6$, $\delta = 5$ and $s = 1$, that
the code
$$C := \Big\{(c_0, c_1, \ldots, c_{50}) \in (\mathrm{GF}(2^8))^{51} \;\Big|\; \sum_{i=0}^{50} c_i \beta^i = 0 \text{ for all } \beta \in N\Big\}$$
has minimum distance at least six. As $B$ is a subcode of $C$, its minimum distance
is surely at least six.
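The example can be checked mechanically by computing binary cyclotomic cosets modulo 51. The sketch below is an illustration only; the parameter choice b = 7, c1 = 1, c2 = 6, δ = 5, s = 1 and the condition gcd(n, c2) < δ are assumptions, chosen because they reproduce the exponent set {7, 8, 9, 10, 13, 14, 15, 16} and the stated bound of six.

```python
from math import gcd


def cyclotomic_coset(i, q, n):
    """Return {i, i*q, i*q^2, ...} reduced modulo n."""
    coset, j = set(), i % n
    while j not in coset:
        coset.add(j)
        j = (j * q) % n
    return coset


n, q = 51, 2

# Exponents of all zeroes of B: the conjugates of alpha, alpha^5 and alpha^9.
zero_exponents = set().union(*(cyclotomic_coset(i, q, n) for i in (1, 5, 9)))

# Assumed parameters for the example (see the lead-in).
b, c1, c2, delta, s = 7, 1, 6, 5, 1
roos_exponents = {
    (b + i1 * c1 + i2 * c2) % n
    for i1 in range(delta - 1)  # 0 <= i1 <= delta - 2
    for i2 in range(s + 1)      # 0 <= i2 <= s
}

if __name__ == "__main__":
    print(sorted(roos_exponents))
    print(roos_exponents <= zero_exponents)  # are all of them zeroes of B?
    print(gcd(n, c1) == 1, gcd(n, c2) < delta, delta + s)
```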
In [354], De Vroedt considers formally self-dual codes. For such codes, with
the property that all weights are multiples of some constant $t > 1$, he derives the
weight enumerator through computation of the eigenvalues and eigenvectors of the
so-called Krawtchouk matrix, rather than by using the traditional method based on
invariant theory.
In [357], Bussbach, Gerretzen and Van Tilborg study properties of $[g(k, d), k, d]$
codes, i.e., codes that meet the Griesmer bound from Equation (4.1) with equality.
It is shown that the maximum number of times a coordinate in $C$ is repeated equals
$s := \lceil d/2^{k-1} \rceil$. Moreover, it is shown that the covering radius $\rho$ of such codes is at
most $d - \lceil s/2 \rceil$, with equality if and only if a $[g(k+1, d), k+1, d]$ code exists. For
$s \le 2$, all $[g(k, d), k, d]$ codes with $\rho = d - \lceil s/2 \rceil$ are described; for fixed $k$ and
sufficiently large $d$, there exist $[g(k, d), k, d]$ codes with $\rho = d - \lceil s/2 \rceil$.
In [400], Delsarte gives a comprehensive survey of some of the main applications
and generalizations of the MacWilliams transform relevant to coding theory. The
author, one of the world’s most respected contributors to this area, considers in this
paper both the generalized MacWilliams identities for inner distributions of dual
codes and the generalized MacWilliams inequalities for the inner distributions of
unrestricted codes. The latter leads to the linear programming bound in general
coding theory. The paper also contains an introduction to association scheme the-
ory, which is an appropriate framework for non-constructive coding theory. In
[444], again a survey paper by Delsarte, the Hamming space, particularly impor-
tant to coding theory, is viewed as an association scheme. The paper provides an
extensive overview of those parts of association scheme theory that are especially
relevant to coding problems. Special emphasis is put on several forms of dual-
ity inherent in the theory. The Hamming space is also considered by Canogar in
[424]. The author studies an example of a non-trivial partition design of the 10-
dimensional Hamming space. He shows that this partition can be reconstructed
from its adjacency matrix.
Gillot derives in [402] bounds on the codeword weights of cyclic codes by using
bounds on exponential sums. In particular, the author pays attention to a family of
codes defined by Wolfmann, for which the parameters can be expressed in terms
of numbers of solutions of trace equations.
Maximum-likelihood decoding of a linear block code can be efficiently performed
with a trellis. An important parameter for judging the complexity of trellis
decoding is the state complexity of the code. In [419], Tolhuizen shows that a binary
linear code of length $n$, dimension $k$, Hamming distance $d$ and state complexity at
most $k - 3$ has length $n \ge 2d + 2\lceil d/2 \rceil - 1$, and constructs a [15,7,5] code
attaining this bound with equality.
A superimposed code in $n$-dimensional Euclidean space is a subset of vectors with
the property that all possible sums of any $m$ or fewer of these vectors form a set of
points which are separated by a certain minimum distance $d$. Since known bounds
on the rate of such a code are not so useful for small values of $m$, Vangheluwe
[425] studies experimentally the case $m = 2$ using visualization software packages,
leading to plots for both the random-coding bound and the sphere-packing
bound.
4.1.3 Cooperating Codes
Two (or more) error-correcting codes can be combined into a new code, which has
good error correction capabilities for combinations of random and burst errors.
The new (long) code can make use of the encoding and decoding algorithms of the
(short) constituent codes, so the encoding and decoding complexity can be kept
rather low. Product Codes and Concatenated Codes are two important classes of
such cooperating codes. In the product coding concept, two (or more) codes over
the same alphabet are combined. In the concatenated coding concept, a hierarchi-
cal coding structure is established by combining an inner code over a low-order
(mostly binary) alphabet with an outer code over a high-order alphabet.
The product coding concept was introduced by Elias in 1954 [9]. In the
two-dimensional case, the codewords are arrays in which the rows are codewords from
a code $C_1$, while the columns are codewords from a code $C_2$. After (row-wise)
transmission, the received symbols are collected in a similar array, in which first
the rows are decoded according to $C_1$ and next the columns according to $C_2$. In this
way, random errors are likely to be corrected by the row decoder, while remaining
burst errors, which have been distributed over various columns due to interleaving,
are to be corrected by the column decoder.
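As a toy illustration of the product concept, the sketch below takes both constituent codes to be even-weight (single parity check) codes. This choice and the function names are assumptions made for the example; it is not the diagonal read-out scheme of [370]. The resulting product code has minimum distance 4, and a single error is located at the intersection of the one failing row check and the one failing column check.

```python
def encode(data_rows):
    """Append a parity bit to every row, then a parity row over all columns."""
    rows = [list(r) + [sum(r) % 2] for r in data_rows]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]


def correct_single_error(array):
    """Flip the bit at the intersection of the failing row and column checks."""
    bad_rows = [i for i, row in enumerate(array) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*array)) if sum(col) % 2]
    fixed = [list(row) for row in array]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


if __name__ == "__main__":
    codeword = encode([[1, 0, 1], [0, 1, 1]])
    received = [list(row) for row in codeword]
    received[1][2] ^= 1  # one channel error
    print(correct_single_error(received) == codeword)
```

With stronger constituent codes, the rows and columns would be decoded one after the other as described above; the parity-only case is used here because it needs no algebraic decoder.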
In [370], Blaum, Farrell and Van Tilborg consider simple product codes using
even-weight codes (requiring only a single parity-check bit) as constituent codes.
They propose a diagonal read-out structure (instead of the traditional row-wise
procedure) together with an efficient decoding algorithm, which enables the cor-
rection of relatively long burst errors.
In [385], Tolhuizen and Baggen show that a product code is much more powerful
than commonly expected. Product codes generally have a poor minimum distance,
i.e., there may exist codes of the same length and dimension with a higher
minimum distance. Nevertheless, they may still offer good performance, since many
error patterns of a weight exceeding half the minimum distance can be decoded
correctly, even with relatively simple algorithms. The authors derive upper bounds
on the number of error patterns of low weight that a nearest neighbor decoder does
not necessarily decode correctly. Further, they also present a class of error patterns
which are decoded correctly by a nearest neighbor decoder. This class suggests
possibilities beyond those already known in 1989 for the simultaneous correction
of burst and random errors.
Concatenated codes were introduced by Forney [18] in 1966. The classical
concatenated coding scheme consists of a binary inner code with $2^k$ words and an
outer code over GF($2^k$), typically a Reed-Solomon code. Information is first
encoded using the outer encoder. Next, each of the generated symbols is considered
as a binary vector of length $k$, which is encoded using the inner code. After
transmission, the received bits are decoded by the inner decoder, leading to symbols
which are decoded using the outer decoder. In order to further increase the burst
error correction capabilities, one can insert an interleaver between the outer and
inner encoder, and a corresponding de-interleaver between the inner and outer
decoder. A popular concatenated coding scheme (e.g., for deep space missions) uses
a rate 1/2 convolutional inner code of constraint length $k = 7$, and a Reed-Solomon
outer code over GF(256) of length 255 and dimension 223.
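The role of the interleaver can be illustrated with a simple block interleaver: outer codewords are written as the rows of an array and read out column by column, so that a channel burst is spread over different codewords. This is a generic sketch under that assumption; the function names are illustrative and the scheme is not tied to any particular system mentioned here.

```python
def interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave: write column by column, read row by row."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    k = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = symbols[k]
            k += 1
    return out


if __name__ == "__main__":
    rows, cols = 3, 4              # 3 codewords of length 4
    stream = list(range(rows * cols))
    sent = interleave(stream, rows, cols)
    burst = sent[3:6]              # a burst of 3 consecutive channel symbols
    # Each hit symbol belongs to a different codeword (row index = value // cols).
    print(sorted(x // cols for x in burst))
    print(deinterleave(sent, rows, cols) == stream)
```

A burst of at most `rows` consecutive channel symbols then hits every codeword in at most one position.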
In [373], Van der Moolen proposes a decoding scheme for a concatenated coding
system with a convolutional inner code and a Reed-Solomon (RS) outer code, with
block interleaving. For bursty channels, if a symbol error occurs in an RS word,
the symbols at the corresponding positions in the previous and next codewords are
suspicious. Based on this observation, Van der Moolen develops a “decoding with
memory” strategy. The basic idea is that if the RS decoder succeeds, then at all
the locations of the (corrected) symbol errors, the Viterbi decoder is (re-)started
to decode the corresponding symbols of the subsequent codewords with the new
initial states. Furthermore, the author gives a 12-state Markov model describing
the process of decoding with memory for the concatenated coding system.
In the same year, Tolhuizen [375] considered the generalized concatenation
construction proposed by Blokh and Zyablov in 1974. The BZ construction uses a
code $A_1$ over GF($q$) of dimension $k$ and $r$ (outer) codes $B_i$ over GF($q^{a_i}$), where
$\sum_{i=1}^{r} a_i = k$. The author indicates how these ingredients should be chosen to
obtain a good code, i.e., a code with high minimum distance given its length and
dimension.
At the 1989 symposium in Houthalen, Prof. T. Ericson from Linköping University
in Sweden gave an invited lecture on recent developments in concatenated
coding [386]. In particular, he discussed decoding principles, the construction
of optimal codes via concatenation (e.g., a construction of the Golay code using
a Reed-Solomon outer code and a trivial distance-1 inner code), and asymptotic
In the late 1990s, Weber and Abdel-Ghaffar studied decoder optimization issues
for concatenated coding schemes. Instead of exploiting the full error correction
capability of the inner decoder with Hamming distance $d$, they use this capability
only partly, thus leaving more erasures but fewer errors for the outer decoder. Since
it is easier to correct an erasure than an error, there is a trade-off problem to be
solved in order to determine the optimal choice. In [420], the inner code
error-correction radius $t$ is optimized over all possible values $0 \le t \le \lfloor (d-1)/2 \rfloor$,
either by maximizing the number of correctable errors or by minimizing the
unsuccessful decoding probability. For small channel error probabilities, a strategy
that is optimal in the latter respect is also optimal in the former respect. However,
for large channel error probabilities, a strategy that is optimal in one respect may
be suboptimal in the other. In [430], the erasing strategy is not determined by the
inner code error-correction radius, but it is made adaptive to the actual reliability
values of the inner decoder outputs. The authors also determine the maximum
number of channel errors for which correction is guaranteed under such an opti-
mized erasing strategy.
In 1995, Baggen and Tolhuizen [409] introduced a new class of cooperating codes:
Diamond codes. The two constituent codes, $C_1$ and $C_2$, have the same length $n$
and are defined over the same alphabet. As illustrated in Figure 4.3, the Diamond
code consists of the bi-infinite strips of height $n$, where each column is in $C_1$ and
each slant line with a given slope is in $C_2$. In contrast to CIRC (Cross Interleaved
Reed-Solomon Code, used in the CD system), all symbols of the Diamond code
are checked by both codes. In the area of optical recording, the application of
Diamond codes can enhance storage densities significantly. In the accompanying
paper [410], Tolhuizen and Baggen consider block variations of Diamond codes in
order to make these more suited for rewritable, block-oriented applications.
Figure 4.3: The format of Diamond codes (columns are $C_1$-words, slant lines are $C_2$-words).
4.2 Decoding Techniques
In the previous sections, we considered papers dealing with properties of codes
and constructions of codes. In the present section, we review papers on the decod-
ing of error-correcting codes, both block codes and convolutional codes. Various
contributions to the decoding of convolutional codes are described.
4.2.1 Hard-Decision Decoding
Hard-decision decoders operate on the symbol estimates delivered by the demodu-
lator. A hard-decision decoder may decode up to a pre-specified number of errors
and declare a decoding failure otherwise; in that case, we speak of a bounded-
distance decoder.
In [359], Simons and Roefs describe algorithms for the encoding and decoding of
$[255, 255 - 2T, 2T + 1]$ Reed-Solomon codes over GF(256) that allow an efficient
implementation in digital signal processors. The decoding algorithms contain the
following conventional steps: syndrome computation, solving the key equation,
and error location and evaluation. Significant savings in the number of computations
are reported for Fast Fourier Transform techniques (strongly advocated in the
then recent book of Blahut [62]) used for encoding, syndrome computations and
for determining the error values.
In [379], Stevens shows that the BCH algorithm can be used to decode up to a
particular instance of the Hartmann-Tzeng bound. By applying this result while
trying all values of a set of judiciously chosen syndromes, he obtains an algorithm
for decoding cyclic codes up to half their minimum distance. For various code
parameters, the cardinality of this set of syndrome values to be tried is minimized,
and thus efficient decoding algorithms are obtained.
In [387], Van Tilburg describes a probabilistic algorithm for decoding an arbitrary
linear $[n, k]$ code. It refines the following well-known method. A set of $k$ of the
$n$ received bits is selected at random. It is hoped that these $k$ bits are error free.
If the positions corresponding to these $k$ bits form an information set, the unique
codeword corresponding to these $k$ bits is determined, and it is checked whether
the codeword so obtained is sufficiently close to the received word. If not, another
group of $k$ bits is selected. The method proposed by Van Tilburg features a
systematic way of checking, and a random bit swapping procedure.
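The well-known baseline method described above (plain random information-set decoding, not Van Tilburg's refinement) can be sketched as follows. All names, the trial limit and the example [7,4] Hamming generator matrix are assumptions made for this illustration.

```python
import random


def solve_message(G, positions, bits):
    """Solve m * G[:, positions] = bits over GF(2).

    Returns the message m, or None if the positions are not an information set.
    """
    k = len(G)
    # One equation per selected position; unknowns are m_0 .. m_{k-1}.
    rows = [[G[i][p] for i in range(k)] + [b] for p, b in zip(positions, bits)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if rows[r][col]), None)
        if pivot is None:
            return None  # singular system: not an information set
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(k):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


def encode(G, m):
    """Codeword m * G over GF(2)."""
    return [sum(m[i] & G[i][j] for i in range(len(G))) % 2 for j in range(len(G[0]))]


def information_set_decode(G, received, max_weight, max_trials=1000, rng=random):
    """Try random k-subsets until a codeword within max_weight of received is found."""
    k, n = len(G), len(G[0])
    for _ in range(max_trials):
        positions = rng.sample(range(n), k)
        m = solve_message(G, positions, [received[p] for p in positions])
        if m is None:
            continue
        candidate = encode(G, m)
        if sum(a != b for a, b in zip(candidate, received)) <= max_weight:
            return candidate
    return None  # decoding failure within the trial budget


# Example generator matrix of a [7,4,3] Hamming code (illustrative choice).
HAMMING_G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

if __name__ == "__main__":
    sent = encode(HAMMING_G, [1, 0, 1, 1])
    received = list(sent)
    received[2] ^= 1
    print(information_set_decode(HAMMING_G, received, max_weight=1) == sent)
```

The decoder is probabilistic: it may fail if no error-free information set is drawn within `max_trials` attempts.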
In [415], Heijnen considers binary $[mk, k]$ codes that are quasi-cyclic. That is, if
$(c_1 \mid c_2 \mid \cdots \mid c_m)$, consisting of $m$ blocks $c_i$ of length $k$,
is a codeword, then the vector obtained by simultaneously applying a cyclic shift
on each of the $m$ blocks
is a codeword as well. Three general decoding methods are compared: comparison
to all codewords, syndrome decoding (where the quasi-cyclic property allows
reduction of the number of coset leaders to be stored), and “error division”. The
latter method is based on the observation that an error vector of weight $t$ has a
weight of at most $s = \lfloor t/m \rfloor$ in at least one of its $m$ blocks. For each $i$, $1 \le i \le m$,
and each vector $e$ of length $k$ and weight at most $s$, the codeword is computed that
in the $i$-th block equals the sum of $e$ and the $i$-th block of the received word. The
Hamming distance of the codeword so obtained and the received vector is used to
select the codeword to decode to.
4.2.2 Soft-Decision Decoding
While hard-decision decoders do their job solely based on the symbol estimates
delivered by the demodulator, soft-decision decoders also take into consideration
the reliability of those estimates. This leads to better performance, at the expense
of higher complexity. Over the years, many soft-decision decoding techniques
have been proposed. Although a maximum-likelihood (ML) decoding algorithm
minimizes the decoding error probability, other algorithms are of interest as well,
due to the (prohibitively) high computational complexity of ML decoding for long
codes.
Generalized Minimum Distance (GMD) decoding, as introduced by Forney [17]
in 1966, permits flexible use of reliability information in algebraic decoding
algorithms for error correction. In subsequent trials, an increasing number of the most
unreliable symbols in the received sequence is erased, and the resulting sequence is
supplied to an algebraic error-erasure decoder, until the decoding result and the
received sequence satisfy a certain distance criterion. In Forney’s original algorithm,
the unique codeword (if one exists) satisfying the generalized minimum distance
criterion is found in at most $\lceil d/2 \rceil$ trials, where $d$ is the Hamming distance of the
code. In 1972, Chase [28] presented a similar class of decoding algorithms for
binary block codes, in which unreliable symbols are inverted (instead of erased)
in various decoding trials. From the list of generated codewords the most likely
one is chosen as the decoding result. Although the Forney and Chase decoding
approaches are rather old, they are still highly relevant. The resulting decoders
are not only used as stand-alone decoders, but also as constituent components in
modern techniques like iterative decoding of product codes.
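The idea of inverting unreliable symbols in several decoding trials can be sketched for the [7,4,3] Hamming code. The example below is written in the spirit of Chase's approach and is not a faithful reproduction of any specific algorithm from [28]. The BPSK mapping (bit 0 to +1, bit 1 to -1), the correlation metric, the number of test positions and all function names are assumptions made for this illustration.

```python
from itertools import product

# Parity-check matrix of the [7,4,3] Hamming code:
# column j (1-based) holds the binary expansion of j.
H = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]


def hard_decode(word):
    """Syndrome decoder: corrects a single error and always returns a codeword."""
    syndrome = 0
    for b, row in enumerate(H):
        syndrome |= (sum(h * w for h, w in zip(row, word)) % 2) << b
    out = list(word)
    if syndrome:
        out[syndrome - 1] ^= 1  # the syndrome equals the 1-based error position
    return out


def chase_decode(soft, num_unreliable=2):
    """Try all inversion patterns on the least reliable positions; keep the best codeword."""
    hard = [0 if y >= 0 else 1 for y in soft]
    weakest = sorted(range(len(soft)), key=lambda i: abs(soft[i]))[:num_unreliable]
    best, best_metric = None, float("-inf")
    for flips in product([0, 1], repeat=num_unreliable):
        trial = list(hard)
        for pos, f in zip(weakest, flips):
            trial[pos] ^= f
        candidate = hard_decode(trial)
        # Correlation between the received values and the BPSK image of the candidate.
        metric = sum(y * (1 - 2 * c) for y, c in zip(soft, candidate))
        if metric > best_metric:
            best, best_metric = candidate, metric
    return best


if __name__ == "__main__":
    # All-zero codeword sent; two weakly received symbols have the wrong sign.
    soft = [0.9, -0.1, 1.1, -0.2, 0.8, 1.0, 1.2]
    hard = [0 if y >= 0 else 1 for y in soft]
    print(hard_decode(hard))   # hard-decision decoding alone
    print(chase_decode(soft))  # with test patterns on the two weakest positions
```

In this example the hard decisions contain two errors, which exceeds what the hard-decision decoder can correct, while one of the four test patterns leads back to the transmitted word.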
In [391], Hollmann and Tolhuizen present a new condition on GMD decoding
to guarantee correct decoding. They apply their weakened condition to the
decoding of product codes, and describe a class of error patterns that is corrected
by a slightly adapted version of the GMD-based Wainberg algorithm for decoding
product codes. This class of error patterns equals the class that Tolhuizen
and Baggen [385] showed to be correctable by a nearest neighbor decoder
two years before, cf. Section 4.1.3.
In the early 2000s, Weber and Abdel-Ghaffar considered reduced GMD decoders.
They studied the degradation in performance resulting from limiting the number
of decoding trials and/or restricting (e.g., quantizing) the set of reliability values.
In [431], they focus on single-trial methods with fixed erasing strategies, threshold
erasing strategies, and optimized erasing strategies. The ratios between the
realizable distances and the code’s Hamming distance for these strategies are about 2/3,
2/3, and 3/4, respectively. A particular class of reliability values is emphasized,
allowing a link to the field of concatenated coding. In [437], asymptotic results on
the error-correction radius of reduced GMD decoders are derived.
Recently, limited-trial versions of the Chase algorithm were introduced as well.
The least complex version of the original Chase algorithms (“Chase 3”) [28] uses
roughly $d/2$ trials, where $d$ is the code’s Hamming distance. In [442], Kossen and
Weber show that decoders exist with lower complexity and better performance
than the Chase 3 decoder. It also turns out that optimization of the settings of the
trials depends on the nature of the channel, i.e., AWGN and Rayleigh fading
channels may require different arrangements. In [449], Weber considers Chase-like
algorithms achieving bounded-distance (BD) decoding, i.e., decoders for which
the error-correction radius (in Euclidean space) is equal to that of a decoder that
maps every point in Euclidean space to the nearest codeword. He proposes two
Chase-like BD decoders: a static method requiring about $d/6$ trials, and a
dynamic method requiring only about $d/12$ trials. Hence, the complexity is reduced
by factors of three and six, respectively, compared to the Chase 3 algorithm.
4.2.3 Decoding of Convolutional Codes
The Viterbi algorithm [110, Ch. 4] is a well-known method for decoding convo-
lutional codes that minimizes the sequence-error probability. It is the most pop-
ular decoding algorithm for decoding convolutional codes with a short constraint
length. In literature, quite some attention has been paid to implementation aspects
of the algorithm. Also some contributions to the WIC symposia dealt with imple-
mentation aspects of the Viterbi algorithm.
In [369], Nouwens and Verlijsdonk discuss (in Dutch) soft-decision Viterbi
decoding of a rate $R = 1/2$, $K = 3$ convolutional code with generator polynomials
$1 + D + D^2$ and $1 + D^2$ that is used on an AWGN channel. The effect of
quantization of the bit reliabilities that serve as input to the Viterbi decoder is studied.
An equally-spaced quantizer is assumed, and the level spacing is determined to
optimize the union bound on the error probability after decoding.
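For readers unfamiliar with the Viterbi algorithm, the sketch below decodes a rate-1/2, constraint-length-3 code on its four-state trellis. It assumes the common generator pair 1 + D + D^2 and 1 + D^2, hard decisions with the Hamming distance as branch metric (rather than the quantized soft decisions studied in [369]), and a message terminated with two zero tail bits. All names are illustrative.

```python
G = (0b111, 0b101)  # 1 + D + D^2 and 1 + D^2, applied to (u_t, u_{t-1}, u_{t-2})


def parity(x):
    return bin(x).count("1") % 2


def encode(bits):
    """Encode starting from the all-zero state; two output bits per input bit."""
    state, out = 0, []  # state = (u_{t-1} << 1) | u_{t-2}
    for u in bits:
        reg = (u << 2) | state
        out += [parity(reg & g) for g in G]
        state = reg >> 1
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a zero-terminated sequence."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]  # the encoder starts in state 0
    paths = [[], [], [], []]
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_metrics, new_paths = [inf] * 4, [None] * 4
        for state in range(4):
            if metrics[state] == inf:
                continue
            for u in (0, 1):
                reg = (u << 2) | state
                branch = [parity(reg & g) for g in G]
                m = metrics[state] + sum(a != b for a, b in zip(branch, r))
                nxt = reg >> 1
                if m < new_metrics[nxt]:  # keep the survivor per state
                    new_metrics[nxt], new_paths[nxt] = m, paths[state] + [u]
        metrics, paths = new_metrics, new_paths
    return paths[0]  # terminated code: the survivor ending in state 0


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0]  # last two bits are the zero tail
    received = encode(message)
    received[3] ^= 1              # one channel error
    print(viterbi_decode(received) == message)
```

Storing complete survivor paths keeps the sketch short; practical decoders use a trace-back memory instead.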
Baggen, Egner and Vanderwiele [448] discuss quantization for a Viterbi decoder
used on a Rayleigh fading channel. Also here, an equally-spaced quantizer is con-
sidered. The level spacing is now computed in such a way that the cut-off rate
of the discrete channel resulting from this quantization is optimized. The optimal
spacing depends only weakly on the average SNR, and it is better to choose one that
is too large than one that is too small. Simulation results suggest that the spacing
that maximizes the cut-off rate is optimal for Viterbi decoding as well.
Quantization of the bit reliabilities is not the only important practical aspect of
Viterbi decoding; one also has to determine which numerical range suffices for
performing the required computations. In [393] and [397], Hekstra gives results
on the maximum difference between path metrics in Viterbi decoders. From this
maximum difference, he derives consequences for reduction of the required nu-
merical range.
The Viterbi algorithm operates on a trellis that has a number of states that is
exponential in the encoder constraint length. Consequently, the implementation of
the Viterbi algorithm is impractical for convolutional codes with a large constraint
length. In this case, sequential decoding [110, Ch. 6], which can be seen as a
backtracking decoding method, can be applied. In the basic stack algorithm, a search is
performed in a tree, while a list is maintained of paths of different lengths ordered
according to their metrics. The path with the highest metric is extended and
subsequently removed, while the new paths are placed within the ordered list (stack).
The stack algorithm may suffer from incomplete decoding when the stack is full (“stack
overflow”). Its number of required computations depends on the actual noise
sequence. In [351], Schalkwijk describes several ways of reducing the complexity
of sequential decoders, using the syndrome of the received vector. One of the
observations is that extension of a noise sequence with a “zero” digit is much more
likely than extension with a “one” digit, and that one has to consider more noise
digits at each decoding step to obtain two a-priori equally likely extensions.
Simulation results are given.
The m-algorithm is a list decoding algorithm [110, Ch. 5]. It is a non-backtracking
method and, in contrast to sequential decoding, its decoding complexity does not
depend on the actually received sequence. The idea of the algorithm is that at
each time instant, a list of the m most promising initial (equal length) parts of the
codewords is extended. In [383], Van der Vleuten and Vinck describe an
implementation of the m-algorithm. Paths for which the metric is below the median are
extended; the other paths are not. As finding the median of m numbers is linear in
m, the time complexity of the algorithm is linear in m. Their ingenious trace-back
method allows use of a small trace-back memory.
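The list-extension idea can be illustrated with the following sketch. It is not the implementation of [383]: it simply sorts the extended list and keeps the m best paths (which costs m log m per step rather than the linear-time median selection described above), it uses the Hamming distance as metric (lower is better), and the (7,5) example code and all function names are assumptions.

```python
G = (0b111, 0b101)   # assumed example code: rate 1/2, generators (7,5) octal

def branch(state, bit):
    """Return (next_state, output_pair) for an encoder with two memory cells."""
    reg = (bit << 2) | state
    out = tuple(bin(reg & g).count("1") % 2 for g in G)
    return reg >> 1, out

def encode(bits):
    state, out = 0, []
    for b in bits:
        state, pair = branch(state, b)
        out.extend(pair)
    return out

def m_algorithm(received, m):
    """Keep the m best equal-length paths at every time instant."""
    survivors = [(0, 0, [])]                  # (hamming_metric, state, bits)
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        extended = []
        for metric, state, path in survivors:
            for bit in (0, 1):
                nxt, out = branch(state, bit)
                d = (out[0] != r[0]) + (out[1] != r[1])
                extended.append((metric + d, nxt, path + [bit]))
        extended.sort(key=lambda s: s[0])     # simple selection by sorting
        survivors = extended[:m]
    return survivors[0][2]
```

Every time instant extends at most m paths, so the work per decoded bit does not depend on the received sequence, in contrast to sequential decoding.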
Assume that we generate the list of the m most likely transmitted words from
a convolutional code, given the received sequence. If messages include a CRC
checksum, the most likely codeword in the list that has a correct CRC checksum
can be selected as the final decoding result. In this way, a significant decoding gain
over conventional Viterbi decoding (m = 1) can be obtained. In [447], Hekstra
proposes to generate an unordered list of all words for which the path metric
exceeds that of the most likely path by at most B. In this way, sorting of paths
according to their path metrics is avoided. An algorithm for generating this list
is given. The length of the list is a random variable. A strategy is described for
choosing B in such a way that the list size remains reasonable. Simulation results
are presented, showing a decoding gain of about 1.5 dB for the coding scheme
employed in GSM/GPRS on a static AWGN channel.
In 1983, Best [353] described a convolutional decoder that outputs reliability
information. This decoder seems to be a rediscovery of the BCJR algorithm, or
forward-backward algorithm, described by Bahl, Cocke, Jelinek and Raviv in 1974
[34] and largely forgotten until its use in the decoding of turbo codes in the 1990s.
Best considers such a decoder “not useful for practical purposes because of speed
limitations”, but he does find it useful for theoretical insight into what happens in
decoding. He mentions that the likelihood of a state in a most likely path is almost
always equal to one, until the decoder is forced to choose between two paths with
almost the same metric. In that case, the probability drops to about one half, and
remains at that value until paths merge. As a result, Best was led to modify a
Viterbi decoder so that it outputs both alternative paths in case of a close decision.
In a concatenated code system, the outer code then can decide which path is the
correct one.
The Viterbi algorithm minimizes the sequence error probability, while the BCJR
algorithm [34] minimizes the bit error probability. In concatenated coding schemes,
it seems more important to minimize the error probability of the symbols entering
the outer decoder. Willems and Pašić [413] describe an implementation of such a
decoder with a complexity much lower than that achieved before, but still
significantly larger than that of a Viterbi decoder. Simulations with a specific
convolutional code show that the output symbol error rate of the proposed decoder is only
negligibly lower than with Viterbi decoding. The proposed decoder has the
advantage of generating soft-output information about the symbols, which can possibly
be used by the outer decoder.
We conclude this section by discussing papers dealing with the performance of
Maximum-Likelihood (ML) decoded convolutional codes employed on a binary
symmetric channel with error probability p.
Post [346] describes an upper bound for the first error event probability of ML
decoding. First, with the aid of the codeword enumerator of the code, he derives
lower bounds on the weights of error patterns of a given length that an ML
decoder does not decode correctly. Next, by analyzing a related random walk, he
determines the probability of occurrence of error patterns satisfying these lower
bounds. For small p, the well-known union bound is sharper, but for larger p,
Post’s bound is sharper.
Schalkwijk [348] describes a syndrome decoder for ML decoding of convolutional
codes with the aim of analyzing the first error event probability. A diagram
incorporating metrics and states is studied, and a Markov chain technique is applied for
estimating the error event probability. This approach was continued and extended
by Best, who shows in [368] that a convolutional coding scheme with ML
decoding over a discrete memoryless channel can be modeled as a Markov chain. This
model allows exact analysis of the statistical behavior of the errors. The method
is illustrated with an R = 1/2 code with constraint length 1, used over a binary
symmetric channel. Unfortunately, the amount of computation grows rapidly with
the constraint length of the code. For example, according to the author, for the
“standard code” with constraint length 3 and generator polynomials 1+Dand
2used on a binary symmetric channel, the Markov model has as many
as 104 states. In 1995, this work was reported on in [94], dedicated to the memory
of Mark Best – see Figure 4.4.
4.2.4 Iterative Decoding
The introduction of turbo codes [90] in 1993 caused a true revolution in the field
of error control coding. In their original form, turbo codes combine two recursive
convolutional codes along with a pseudo-random interleaver in a parallel
concatenated coding scheme. Through a maximum a posteriori (MAP) iterative decoding
process, performance very close to the Shannon limit is achieved. As
mentioned by Wicker in [108, Ch. 25, Sect. 11], turbo codes initially met with some
skepticism, but already four years after their introduction, a turbo code
experimental package was launched into space aboard the Cassini spacecraft. Further
research on iteratively decodable codes resulted in the rediscovery of Gallager’s
low-density parity-check (LDPC) codes [15], dating from the 1960s. Currently,
both turbo codes and LDPC codes are studied extensively and are considered
the most promising candidate codes for many application areas. For example,
turbo codes have been implemented in UMTS, the third-generation mobile
communication standard.
Figure 4.4: Paper in IEEE Transactions on Information Theory based on [368].
In [421], Tolhuizen and Hekstra-Nowacka consider turbo coding schemes employ-
ing serial (instead of parallel) concatenation. They focus on the word error rate
after decoding, for which they give the average union bound. In order to compute
this bound, one needs the input-output weight enumerator of the inner decoder.
The authors provide an explicit formula for this enumerator, and apply it to some
specific examples.
Dielissen and Huisken [432] explain four implementation techniques for the
soft-input soft-output (SISO) decoding module of a third-generation mobile
communication turbo decoder. They compare the performance and implementation costs
(in terms of silicon area and power dissipation). The final choice is not trivial, but
a trade-off between different aspects.
The inputs and outputs of an a-posteriori probability (APP) decoder as used in
turbo decoding can be represented as log-likelihood ratios (LLRs). Hagenauer’s
box function log((1 + e^{x+y})/(e^x + e^y)) can be used to establish an explicit
input-output relation of an APP decoder. Janssen and Koppelaar [433] consider turbo
codes with BPSK modulation over an AWGN channel. They show that the random
variable z that is the output of the box function exhibits the LLR property,
that is, for each z,
\log \frac{p_z(z \mid b=0)}{p_z(z \mid b=1)} = z. \quad (4.8)
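For concreteness, the box function can be evaluated as in the sketch below. The function names are ours, and the log-sum-exp rewriting is merely one numerically safe way to compute the expression; it is not claimed to be the form used in [433].

```python
import math

def _logaddexp(a, b):
    """log(e^a + e^b), computed without overflow."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))

def box(x, y):
    """Hagenauer's box function log((1 + e^(x+y)) / (e^x + e^y))."""
    return _logaddexp(0.0, x + y) - _logaddexp(x, y)
```

The same quantity equals 2 atanh(tanh(x/2) tanh(y/2)), and for inputs of large magnitude it approaches sign(x) sign(y) min(|x|, |y|); for example, box(800, 3) evaluates to approximately 3.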
They study the effect of mismatched inputs to the box function, and give upper
and lower bounds on the LLR at the output of the box function as a function of
Le Bars, Le Dantec and Piret [443] focus on the design of the interleavers in a
turbo coding scheme. The authors present an algebraic interleaver construction
method leading to codes with a high minimum distance. The performance of these
codes is very good at high signal-to-noise ratios.
In [435], Balakirsky describes a realization of the Maximum-Likelihood (ML)
decoding algorithm for messages encoded by an LDPC code and transmitted over a
binary symmetric channel. The algorithm is based on the introduction of a tree
structure in a space consisting of all possible noise vectors and principles of se-
quential decoding with the use of a special metric function. The author derives an
upper bound on the exponent of the expected number of computations in the en-
semble of low-density codes and shows that it is much smaller than the exponent
for the exhaustive search. It should be noted that this work is based on a (Russian)
paper by the author dating from 1991, i.e., from well before the world-wide redis-
covery of LDPC codes!
Steendam and Moeneclaey [441] derive the ML performance of LDPC codes, con-
sidering BPSK and QPSK transmission over a Gaussian channel. They compare
the theoretical ML performance with that of the iterative decoding algorithm. It
turns out that the performance of the iterative decoding algorithm is close to the
ML performance when the girth of the code is sufficiently high.
4.3 Codes for Data Storage Systems
Given the continuing demand for increased data storage capacity, it is not
surprising that interest in coding techniques for mass data storage systems, such as optical
and magnetic recording products, has continued unabated ever since the day when
the first mechanical computer memories were introduced in the 1950s. Evidently,
technological advances such as improved materials, heads, mechanics, and so on
have been the driving force behind the “ever” increasing data storage capacity, but
state-of-the-art storage densities are also a function of improvements in channel
coding, the topic addressed in this section. The book by Immink [109] and the
survey article by Immink, Siegel and Wolf [107] offer a comprehensive description
of the literature on this topic.
Optical recording, developed in the late 1960s and early 1970s, is the enabling
technology of a series of very successful products for digital consumer electronics
systems such as Compact Disc (CD), CD-ROM, CD-R, and Digital Video Disc
(DVD). The design of codes for optical recording systems is essentially the design
of combined dc-free, run-length limited (DC-RLL) codes.
An encoder accepts a series of information words as an input and transforms them
into a series of output words, called codewords. Binary sequences generated by
a (d, k) RLL encoder have, by definition, at least d and at most k 0s between
consecutive 1s. Let the integers m and n denote the information word length and
codeword length, respectively. The code rate, R = m/n, is a measure of the
code’s efficiency. The maximum rate of an RLL code, given values of d and k, is
called the Shannon capacity, and it is denoted by C(d, k) [3].
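As a numerical illustration, C(d, k) can be computed from the observation that a (d, k) sequence is a concatenation of phrases consisting of i zeros followed by a one, with d ≤ i ≤ k. The capacity is then the base-2 logarithm of the largest real root z of the equation z^-(d+1) + ... + z^-(k+1) = 1. The sketch below solves this equation by bisection; the function name and the use of math.inf for an absent k constraint are our own conventions.

```python
import math

def rll_capacity(d, k=math.inf, iters=60):
    """Shannon capacity C(d, k) in bits per channel symbol."""
    def phrase_sum(z):
        if math.isinf(k):
            # geometric series: sum over j >= d+1 of z^-j
            return z ** -(d + 1) / (1.0 - 1.0 / z)
        return sum(z ** -j for j in range(d + 1, int(k) + 2))

    lo, hi = 1.0, 2.0            # the root always lies in this interval
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phrase_sum(mid) > 1.0:
            lo = mid             # phrase_sum is decreasing in z
        else:
            hi = mid
    return math.log2(0.5 * (lo + hi))
```

For d = 1 without a k constraint the root is the golden ratio, so rll_capacity(1) is about 0.694, while rll_capacity(0) is 1 (no constraint at all).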
Early examples of RLL codes were given by Berkoff [16] some forty years
ago, and since then the quest of code designers around the world has been
the creation of “practical” RLL codes whose rate approaches Shannon’s
theoretical rate limit. Hundreds of examples of RLL codes have been published and/or
patented over the years. Dc-free codes, as their name suggests, have no spectral
components at the zero frequency and suppressed spectral content near the zero
frequency.
4.3.1 RLL Block Codes
One approach that has proved very successful for the conversion of source
information into constrained sequences is the use of block codes. The
source sequence is partitioned into blocks of length m, called source words, and
under the code rules such blocks are mapped onto words of n channel symbols,
called codewords. In order to clarify the concept of block-decodable codes, we
have written down a simple illustrative case of a rate 3/5, (1,∞) block code. The
codeword assignment of Table 4.1 provides a simple block code that converts source
words of bit length m = 3 into codewords of length n = 5. The two left-most
columns tabulate the eight possible source words along with their decimal
representation. We have enumerated all eight words of length four that comply with the
d = 1 constraint. The eight codewords, tabulated in the right-hand column, are
found by adding one leading zero to the eight 4-bit words, so that the codewords
can be freely cascaded without violating the d = 1 constraint.
The code rate is m/n = 3/5 < C(1,∞) ≈ 0.694, where C(1,∞) denotes the
maximum rate possible for any d = 1 code irrespective of the complexity of such an
encoder. The code efficiency, expressed as the quotient of code rate and Shannon
capacity of the (d, k)-constrained channel having the same run length constraints,
is R/C(d, k) ≈ 0.6/0.694 ≈ 0.86. Thus the very simple block code considered is
sufficient to attain 86% of the rate that is maximally possible.
Table 4.1: Simple (d = 1) block code.
decimal  source  output
0        000     00000
1        001     00001
2        010     00010
3        011     00100
4        100     00101
5        101     01000
6        110     01001
7        111     01010
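The table can be regenerated mechanically, which also makes the efficiency figure easy to check. In the sketch below the names d1_words, codewords and table are ours, and the assignment of source words to codewords in increasing order is simply the one shown in the table.

```python
import math
from itertools import product

def d1_words(length):
    """All binary words of the given length without two adjacent 1s (d = 1)."""
    words = ("".join(w) for w in product("01", repeat=length))
    return [w for w in words if "11" not in w]

# One leading zero lets the codewords be cascaded freely.
codewords = ["0" + w for w in d1_words(4)]
table = {format(i, "03b"): cw for i, cw in enumerate(codewords)}

rate = 3 / 5
capacity = math.log2((1 + math.sqrt(5)) / 2)   # C(1, infinity), about 0.694
efficiency = rate / capacity                   # about 0.86
```

Here table["011"] is "00100", in agreement with row 3 of the table.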
It is straightforward to generalize the preceding implementation example to en-
coder constructions that generate sequences with an arbitrary value of the mini-
mum run length d. To that end, choose some appropriate codeword length n. Write
down all d-constrained words that start with d zeros. The number of codewords
that meet the given run length condition is N_d(n − d), which can be computed
with generating functions or recursive relations [23].
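A recursive relation of this kind is easy to state in code. In the sketch below, count_d(d, n) is our name for N_d(n), the number of d-constrained words of length n: a word either ends in a zero, or ends in d zeros followed by a one. The brute-force counter is included only as a cross-check.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def count_d(d, n):
    """Number of binary words of length n with at least d 0s between 1s."""
    if n <= d + 1:
        return n + 1              # at most a single 1 fits
    return count_d(d, n - 1) + count_d(d, n - d - 1)

def brute_force(d, n):
    """Count the same words by exhaustive enumeration."""
    total = 0
    for w in product("01", repeat=n):
        inner_runs = "".join(w).split("1")[1:-1]   # zero runs between two 1s
        total += all(len(run) >= d for run in inner_runs)
    return total
```

For the rate 3/5 example above, count_d(1, 5 - 1) gives the eight codewords of Table 4.1.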
A maximum run length constraint, k, can be incorporated in the code rules in a
straightforward manner. For instance, in the (d = 1) code previously described,
the first codeword symbol is at all times preset to zero. If, however, the last
symbol of the preceding codeword and the second symbol of the actual codeword to be
conveyed are both zero, then the first codeword symbol can be set to one without
violating the d = 1 channel constraint. This extra rule, which governs the
selection of the first symbol, the merging rule, can be implemented quite smoothly with
some extra hardware. It is readily verified that with this additional “merging”
rule the (1,∞) code turns into a (1,6) code. The process of decoding is exactly
the same as that for the simple (1,∞) code, since the first bit, the “merging” bit, is
redundant, and in decoding it is skipped anyway. The (1,6) code is a good
illustration of a code that uses state-dependent encoding (the actual codeword transmitted
depends on the previous codeword) and state-independent decoding (the source
word can be retrieved by observing just a single codeword, that is, without
knowledge of previous or upcoming codewords or the channel state).
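The merging rule and the resulting k = 6 constraint can be checked with a small sketch. The start-up convention (the very first merging bit is zero) and the function names are assumptions of this example.

```python
WORDS = ["0000", "0001", "0010", "0100", "0101", "1000", "1001", "1010"]

def encode(source_words):
    """State-dependent encoding: 4-bit word preceded by a merging bit."""
    stream, prev_last = "", "1"   # assumed start-up: first merging bit is 0
    for s in source_words:
        w = WORDS[s]
        merge = "1" if prev_last == "0" and w[0] == "0" else "0"
        stream += merge + w
        prev_last = w[-1]
    return stream

def decode(stream):
    """State-independent decoding: skip the merging bit of each codeword."""
    return [WORDS.index(stream[i + 1:i + 5]) for i in range(0, len(stream), 5)]

def longest_zero_run(stream):
    return max(len(run) for run in stream.split("1"))
```

For instance, the source words 1, 0, 5 are encoded as 00001 00000 01000, which contains a run of six zeros; an exhaustive check over short inputs finds no longer run.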
The first article describing RLL block codes was written by Tang and Bahl [23] in
1970. It describes a method where (d, k) constrained info blocks of length n′ are
cascaded with merging blocks of length d + 2. Twelve years later, it was shown
by Beenker and Immink [60] that their method can be made more efficient by
constraining the maximum number of zeros at the beginning and end of the (d, k)
constrained info blocks to k − d. Then merging blocks of length d are sufficient
to cascade (glue) the info blocks. The authors presented two constructions. In the
first construction, the merging block is the all-zero word (as in Table 4.1), while in
the second (more efficient) construction, the merging blocks depend on the two
neighboring info words.
The methods described by Weber and Abdel-Ghaffar [392] [395] offer a more
flexible and efficient method for cascading RLL blocks than that described in the early
literature, specifically for the case where k is rather small. The method presented
by Tjalkens [394] does not use “merging bits” to cascade the RLL info blocks;
instead, Tjalkens shows that with the set of (d, k) constrained codewords
that start with at least d zeros and end with at most k − 1 zeros one may construct
an RLL block code of maximum size. Later constructions showed that merging blocks
of length less than d can be used, where the merging algorithm can alter both the
merging block and (small) parts of the info word.
The article by Hollmann and Immink [390] addresses the problem of generating
RLL sequences, where we have the additional demand that a certain, prescribed,
sequence of run lengths is not allowed to be generated. Said specific sequence of
run lengths that should be avoided is called a prefix, which is normally used in
recording practice as a synchronization pattern.
In essence all articles mentioned above discuss block codes. The article by Holl-
mann [398] uses a completely different approach, as codes generated by his con-
structions must be decoded by sliding-block decoders. A sliding-block decoder
observes the n-bit codeword plus r preceding n-bit codewords plus q trailing n-bit
codewords. Such a sliding-block concept leads to codes that have a high efficiency,
require little hardware, and usually have few significant drawbacks.
A drawback of codes that are decoded by a sliding-block decoder is error
propagation, as the decoding operation depends on r + q + 1 consecutive
codewords. In practice, the increased efficiency and reduced hardware of a
sliding-block decoder outweigh the extra load on the error correction unit. There are
various coding formats and design methods with which we can construct such codes.
Immink [114] has recently shown that very efficient sliding block codes can be
designed. For example, a rate 9/13, (1,18) 5-state encoder has a redundancy of
0.2%, while a rate 6/11, (2,15) 9-state encoder has a redundancy of 0.84%.
The article by Abdel-Ghaffar and Weber [412] addresses run-length-constrained
channels, where there is, as in the prior art, a maximum run length constraint,
and additionally a maximum run length constraint on both the odd and the even
positions of the encoded sequence. These codes are often called (0, G/I)
constrained, where G denotes the maximum run length constraint on the sequence,
and I denotes the maximum run length imposed on the symbols at the odd and
even positions. Abdel-Ghaffar and Weber study block codes, where they show
results on the maximal size of a set of (0, G/I) constrained codewords of length n
that can be freely concatenated without violating the specified (0, G/I) constraint.
Closing Remark by the Editors
The work described in several WIC papers of Schouhamer Immink et al.
summarized in this subsection on RLL codes has found its way into consumer electronics
products, such as CD and DVD. His contributions to these products have gained
him acknowledgment from several international institutions and societies.
4.3.2 Dc-Free Codes
Dc-balanced or dc-free codes, as they are often called, have a long history and
their application is certainly not confined to recording practice. Since the early
days of digital communication over cable, dc-balanced codes have been employed
to counter the effects of low-frequency cut-off due to coupling components, isolat-
ing transformers, and so on. In optical recording, dc-balanced codes are employed
to circumvent or reduce interaction between the data written on the disc and the
servo systems that follow the track. Low-frequency disturbances, for example due
to fingerprints, may cause completely wrong read-out if the signal falls below the
decision level. Errors of this type are avoided by high-pass filtering, which is
only permissible provided that the encoded sequence itself does not contain low-
frequency components, or, in other words, provided that it is dc-balanced.
Rejection of LF components is usually achieved by bounding the accumulated
sum of the transmitted symbols. Common sense tells us that a certain rate has to be
sacrificed in order to convert arbitrary user data into a dc-balanced sequence. The
quantification of the maximum rate, the capacity, of a sequence given the fact that
it contains no low-frequency components has been reported by Chien [22]. The
articles by Immink [358] and De With [360] provide a description of key charac-
teristics of dc-free sequences generated by a Markov information source having
maximum entropy. Given the fact that a Markov source, which describes a dc-
balanced sequence, is maxentropic, we can substitute the maxentropic transition
probabilities. Then computation of the spectrum is straightforward. Knowledge
of ideal, “maxentropic” sequences with a spectral null at dc is essential for
understanding the basic trade-offs between the rate of a code and the amount of
suppression of low-frequency components. The results obtained in [358] and [360] allow
us to derive a figure of merit of implemented dc-balanced codes that takes into
account both the redundancy and the emergent frequency range with suppressed
components (notch width).
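The “bounded accumulated sum” idea can be made concrete with a small sketch. It assumes the usual convention that bit 0 is transmitted as −1 and bit 1 as +1, and it uses the rate-1/2 bi-phase mapping (assumed here as 0 → 01, 1 → 10) merely as a simple example of a dc-balanced code; neither choice is taken from [358] or [360].

```python
from itertools import accumulate

def running_digital_sum(bits):
    """Accumulated sum of the channel symbols, with 0 -> -1 and 1 -> +1."""
    return list(accumulate(2 * b - 1 for b in bits))

def biphase(source_bits):
    """Bi-phase mapping, assumed here as 0 -> 01 and 1 -> 10."""
    out = []
    for b in source_bits:
        out.extend((b, 1 - b))
    return out
```

For uncoded data the accumulated sum can grow without bound (three consecutive ones already give the values 1, 2, 3), whereas after bi-phase encoding it never leaves the set {−1, 0, 1}, at the price of halving the rate.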
Beenker and Immink [367] present a category of dc-free codes called dc2-free
codes. This type of code offers a larger rejection of low-frequency components
than is possible with the traditional codes discussed in the prior art. Besides the
trivial fact that they are dc-balanced, an additional property of dc2-free codes is
that the second (and even higher) derivative of the code spectrum also vanishes at
zero frequency (note that the odd derivatives of the spectrum at zero frequency are
zero because the spectrum is an even function of the frequency). The imposition of
this additional channel constraint results in a substantial decrease of the power at
the very low frequencies for a fixed code redundancy as compared with the designs
based on the conventional ‘bounded accumulated sum’ concept. The drawback of
this new scheme is the implementation of the codes, as it demands significantly
more hardware and large codewords at high coding rates.
4.3.3 Error-Detecting Constrained Codes
The paper by Immink [374] offers coding techniques for simple partial-response
channels. He showed that the simple bi-phase code can be used as an inner code
of an outer code designed for maximum (free) Hamming distance. The paper by
Weber and Abdel-Ghaffar [389] discloses a class of run-length-limited codes that
can detect asymmetric errors made during transmission. Baggen and Balakirsky
[450] consider data transmission over so-called bit shift channels with (2,∞) RLL
constraints, and obtain bounds on the entropy of the output sequences.
4.4 Codes for Special Channels
4.4.1 Coding for Memories with Defects
In 1974, Kusnetsov and Tsybakov introduced [35] the following model for coding
for memories with stuck-at defects. In some memory cells, known to the encoder,
only one particular symbol (known to the encoder) can be written. The decoder
does not know in which positions stuck-at errors occur. The question is how much
information can be stored in such a memory with stuck-at defects. Kusnetsov and
Tsybakov [35] gave upper bounds on the rate that can be obtained if a fraction
p of the positions contain stuck-at errors. With a random coding argument, they
obtained the surprising result that the capacity of a stuck-at channel with stuck-at
probability p equals 1 − p.
Some ten years later, coding for stuck-at defects was a popular subject at vari-
ous WIC symposia. In 1985, Van Pul [361] described an explicit construction for
obtaining the capacity of the stuck-at channel with stuck-at probability p. In the
same year, Baggen [362] showed that MDS codes achieve the upper bound on the
information rate, given the number of stuck-at errors combined with random
errors. Vinck [363] varies on the theme by using convolutional codes for correcting
bursts of defect errors, separated by guard spaces. In [382], Peek and Vinck give
an explicit algorithm for the binary stuck-at channel. Bounds for the bit error rate
and the decoding complexity are also obtained. Schalkwijk and Post [381] take
an information-theoretic approach to coding for stuck-at errors. Indeed, suppose
that information is stored in elementary blocks of n bits. The memory with known
defects is then equivalent to a noisy channel with input and output alphabets of
size 2^n. This “superchannel” can be described by a strategy in which an n-bit
input block is to be used for a particular input message and defect pattern. In a
memory with known defects, the bit values that are eventually read out become
available at the moment of storing. In other words, the equivalent superchannel
has perfect feedback, and repetition feedback strategies can be used [26] – see also
Section 4.4.5. Strategies for small n are described.
Vinck and Post [376] discuss the following combined test and error-correction
procedure. A message m of even length is initially written in memory as
x(m) = (0, m, P), where P is the parity of m. Upon reading a word z from memory, we
check if it has an even number of ones. If so, we leave it unchanged; if not, we
invert all its bits and obtain z′. If z originates from x(m) by a single stuck-at
error, then all bits of z except for the stuck-at bit are actually inverted; the stuck-at
bit keeps its value that is incorrect for x(m). Consequently, z′ is the complement
of x(m). We see that m can be represented by two messages, namely x(m) and
its complement, as long as at most one stuck-at error occurs in the bits of the word.
Note that both x(m) and its complement have an even number of ones. We keep
applying the same procedure. A next single stuck-at error that occurs in the course
of time is detected, as inversion of the word leads to a 0 in the leftmost bit. Upper
and lower bounds on the mean time before a memory fails with this procedure are
given, and an extension of the procedure for combination with coding for random
(non-permanent) errors is indicated.
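The first step of this procedure (one stuck-at error) can be simulated with the sketch below. The memory model, the class and function names, and the decoding rule (complement the word if its leftmost bit is 1) are our own reading of the description above and are not taken from [376].

```python
class StuckAtMemory:
    """Toy memory word in which individual cells can become stuck."""
    def __init__(self, n):
        self.cells = [0] * n
        self.stuck = {}                      # position -> stuck value

    def write(self, word):
        for i, b in enumerate(word):
            self.cells[i] = self.stuck.get(i, b)   # stuck cells ignore writes

    def read(self):
        return list(self.cells)

    def break_cell(self, pos, value):
        self.stuck[pos] = value
        self.cells[pos] = value

def encode(m):
    """x(m) = (0, m, P) with P the parity of m; m must have even length."""
    assert len(m) % 2 == 0
    return [0] + list(m) + [sum(m) % 2]

def refresh(mem):
    """Read the word; if its number of ones is odd, write back its inverse."""
    z = mem.read()
    if sum(z) % 2 == 1:
        mem.write([1 - b for b in z])

def decode(word):
    if word[0] == 1:                         # the complement of x(m) is stored
        word = [1 - b for b in word]
    return word[1:-1]
```

With m = [1, 0, 1, 1], x(m) = [0, 1, 0, 1, 1, 1]; if the cell at position 2 becomes stuck at 1, the refresh step leaves [1, 0, 1, 0, 0, 0] in memory, which is exactly the complement of x(m).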
In 1989, Bassalygo, Gelfand and Pinsker [76] introduced the model of localized
errors. In this model, the encoder knows a set E of codeword positions in
which an error may occur; outside E, no errors occur. The decoder does not know
E. Coding for this model received quite some attention in the early nineties, as
indicated by Bratatjandra and Weber in their paper from 1997 [417]. In this paper,
the authors take for E a set of multiple burst errors, that is, E is the union of a
collection of disjoint sets of consecutive positions. In the literature, the main attention is
on the sets E consisting of all sets of positions up to a certain cardinality.
Bratatjandra and Weber assume that both encoder and decoder know an upper bound m on
the number of bursts, and an upper bound b on the length of each burst. They give a
“fixed-rate” scheme for this situation. They also give a “variable-rate” scheme that
allows the transmitter to send more information if the actual number
of burst errors is below m, or one or more of the burst lengths is below b.
4.4.2 Asymmetric/Unidirectional Error Control Codes
Most classes of error control codes have been designed for use on binary
symmetric channels, on which 0→1 crossovers and 1→0 crossovers occur with equal
probability (symmetric errors). However, in certain applications, such as optical
communications, the error probability from 1 to 0 may be significantly higher than
the error probability from 0 to 1. These applications can be modeled by an
asymmetric channel, on which only 1→0 transitions can occur (asymmetric errors).
Further, some memory systems behave like a unidirectional channel, on which
both 1→0 and 0→1 errors are possible, but per transmission, all errors are of
the same type (unidirectional errors).
Codes that detect and/or correct symmetric errors have been studied extensively
since the 1940s. Of course, these codes can also be used to detect and/or correct
asymmetric or unidirectional errors. However, it seemed likely that it should be
possible to design codes that detect and/or correct asymmetric or unidirectional
errors which need less redundancy than a comparable symmetric error correcting
code. Pioneering work in this area was done by Varshamov [33] in the 1960s and
1970s. In the Benelux, the topic was further explored by Weber and various
co-authors in the late 1980s and early 1990s.
In [377], Weber, De Vroedt and Boekee propose a method to construct codes
correcting up to t asymmetric errors by expurgating and puncturing codes of
Hamming distance 2t + 1. The resulting codes are often of higher cardinality than their
symmetric error-correcting counterparts, but are mostly nonlinear. The same group
of authors derived bounds on the sizes of codes that correct unidirectional errors
[378], and they determined necessary and sufficient conditions for a block code to
be capable of correcting/detecting any combination of symmetric, unidirectional,
and asymmetric errors [384].
For practical purposes it is highly desirable that a code is systematic, i.e., that
the message is to be found unchanged in the codeword. In [399], Weber and Kaag
present a construction method for systematic codes which are able to correct up to
t asymmetric errors and detect from t + 1 up to d asymmetric errors.
Finally, in [405], Weber studies the asymptotic behavior of the rates of optimal
codes correcting and/or detecting combinations of symmetric, unidirectional, and/or
asymmetric errors. The main conclusion is that, without losing rate
asymptotically, one can upgrade any error control combination to simultaneous symmetric
error correction/detection and all unidirectional error detection.
4.4.3 Codes for Combined Bit and Symbol Error Correction
In 1983, Piret introduced [355] binary codes for compound channels where both
bit errors and symbol errors occur, where a symbol is a fixed group of bit positions.
He introduces a distance profile to measure the error control capabilities and gives
some examples of codes for combined bit and symbol error control.
Two years later, Van Gils published the first of a series of 3 papers dealing with
the construction of codes for combined bit and symbol error correction. In the
application that Van Gils has in mind, a symbol corresponds to a module in a
processor. An erased symbol thus corresponds to a module that is detected to be in
error, while an erroneous symbol corresponds to a malfunctioning module that is
not detected to be in error. In [366], Van Gils announces binary [3k, k] codes for
k = 4, 8, 16 that can correct one single symbol error (i.e., one of the three groups
of k bits is in error), up to k/4 + 1 bit errors, and one single symbol erasure plus
up to k/4 bit errors (for k = 4, 8) or 3 bit errors (for k = 8). In addition, for
k = 8 and k = 16, k/4 + 2 bit errors can be detected. In [371], he describes a
binary [27,16] code, with symbol size 9, that can correct single bit errors, detect
single (9-bit) symbol errors and detect up to four bit errors. Finally, in [372], Boly
and Van Gils suggest constructing codes for controlling bit and symbol errors by
representing the symbols from a symbol-error correcting code with respect to a
judiciously chosen basis.
4.4.4 Coding for Informed Decoders
In 2001, Van Dijk, Baggen and Tolhuizen introduced informed decoding [438].
This concept was inspired by the following practical application. The address of
a sector of an optical disc is part of a header that is protected by its own error-
correcting code. In many circumstances, the location of the reading/writing head
is approximately known. The question is whether it is somehow possible to use
this information on the actual sector address for retrieving the header more reliably.
With informed decoding, it is assumed that the decoder is informed about the value
of some information symbols of the transmitted codeword. The authors show that
with judicious encoding, the decoder can employ such information to effectively
decode to a subcode with a larger minimum distance. Three ways to encode
well-known codes that lead to favorable decoding capabilities are presented.
In [440], Tolhuizen, Hekstra, Cai and Baggen discuss two aspects of coding for
informed decoding. Firstly, they propose to use a certain Gray code for address-
ing sectors in such a way that all addresses of sectors close to a target sector have
many coordinates in common. In this manner, it is ensured that whenever the read-
ing/writing head lands close to the target sector, many coordinates of the address
of the sector in which the head actually lands are known. It is claimed that the
proposed method yields the maximum number of common coordinates for each
maximum deviation of the target sector. The other aspect aims to improve decoding
for data encoded using a form of informed decoding, but where no information
about known information symbols is supplied to the encoder. This is done by
combining the codewords of several consecutive sectors, which usually have many
information symbols in common.
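The addressing idea can be made concrete with a small experiment. The particular Gray code proposed in [440] is not specified here, so the sketch below uses the standard binary-reflected Gray code as a stand-in and makes no claim of optimality; the function names are hypothetical. It counts the address coordinates that are shared by all sectors within a given deviation of the target, and compares this with plain binary addressing.

```python
def gray(address):
    """Binary-reflected Gray code label of an address (stand-in for [440])."""
    return address ^ (address >> 1)


def binary(address):
    """Plain binary addressing, for comparison."""
    return address


def common_coordinates(label, target, deviation, nbits):
    """Number of bit positions on which all labels in the window agree."""
    lo = max(0, target - deviation)
    hi = min((1 << nbits) - 1, target + deviation)
    words = [label(a) for a in range(lo, hi + 1)]
    return sum(
        1
        for b in range(nbits)
        if len({(w >> b) & 1 for w in words}) == 1
    )


# 8-bit addresses, target sector 128, head lands at most 1 sector away.
print(common_coordinates(gray, 128, 1, 8))    # coordinates known with Gray labels
print(common_coordinates(binary, 128, 1, 8))  # coordinates known with binary labels
```

Because consecutive Gray labels differ in exactly one position, a window of 2d + 1 addresses leaves at least nbits - 2d coordinates fixed, whereas a binary carry (here from 127 to 128) can change every coordinate at once.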
4.4.5 Coding for Channels with Feedback
Already in 1956, Shannon proved [10] the surprising fact that feedback does not
increase the capacity of a discrete memoryless channel. Feedback may, however,
significantly reduce the complexity that is required to obtain reliable communica-
tion. In 1971, Schalkwijk presented simple fixed-length feedback strategies for the
binary symmetric channel with error probability p [26]. It is assumed that the feedback
is error-free and instantaneous, that is, immediately after the transmission of
a bit, the transmitter knows which bit value has been received. Schalkwijk’s strate-
gies achieve an upper bound on the rate below which reliable communication is
possible and can be described as follows. A message index s is pre-coded to an
n-bit message m that does not contain a run of k equal symbols. The transmitter
consecutively transmits the bits of m until the feedback reports the occurrence of
an error. In such a case, the bit that was meant to be transmitted is repeated k
times and transmission continues until the next error occurs. If all bits of m have
been transmitted successfully, a tail is added until n bits have been transmitted.
The receiver decodes as follows. Working its way back from the last received bit,
it replaces subsequences $01^k$ by 1 and $10^k$ by 0, respectively (where $1^k$
denotes k consecutive ones), and afterwards, it
removes the tail.
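The repeat-and-reduce mechanism just described can be sketched in a few lines. This is a simplified, variable-length illustration and not Schalkwijk's fixed-length strategy from [26]: the tail and the fixed block length n are omitted, and the channel errors are supplied explicitly as a set of transmission indices (`flip_at`) rather than drawn at random. All function names are hypothetical.

```python
from collections import deque


def has_run(bits, k):
    """True if bits contains k or more equal consecutive symbols."""
    run, prev = 0, None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        if run >= k:
            return True
    return False


def transmit_with_feedback(message, k, flip_at):
    """Send message over a binary channel that flips the transmissions whose
    index is in flip_at. Error-free, instantaneous feedback is assumed."""
    if has_run(message, k):
        raise ValueError("pre-coded message must not contain a run of k equal bits")
    queue = deque(message)
    received, t = [], 0
    while queue:
        bit = queue.popleft()
        out = bit ^ 1 if t in flip_at else bit
        received.append(out)
        if out != bit:
            # Feedback reports an error: repeat the intended bit k times.
            queue.extendleft([bit] * k)
        t += 1
    return received


def decode(received, k):
    """Scan from right to left, replacing 0 1^k by 1 and 1 0^k by 0."""
    reduced = []  # reduced suffix, stored in reverse order
    for y in reversed(received):
        reduced.append(y)
        if len(reduced) > k and all(s == 1 - y for s in reduced[-k - 1:-1]):
            del reduced[-k - 1:]
            reduced.append(1 - y)
    reduced.reverse()
    return reduced


message = [0, 1, 1, 0, 1, 0, 0, 1]
received = transmit_with_feedback(message, k=3, flip_at={1, 2, 7})
print(received)
print(decode(received, k=3) == message)
```

In this example the second error hits a repeated bit, so the repetitions nest; the right-to-left scan undoes the most recent correction first and therefore resolves the nesting in the correct order.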
In the 1990s, Veugen and Bargh, two Ph.D. students of Schalkwijk, built further
on his research on channels with feedback. The remainder of this section describes
their work as presented at various WIC symposia.
A possible choice for the tails in Schalkwijk’s strategy is the alternating sequence
0101.... In [407], Veugen studies conditions on the tails that are sufficient for
correct operation of Schalkwijk’s strategies. In [396], he introduces the following
generalization of Schalkwijk’s scheme. Each bit of the message m is transmitted
c times in c consecutive transmissions. If not all c received bits are equal, the
receiver ignores them, and the transmitter again transmits the intended message bit
c times, until c equal bits are received. If the receiver decodes incorrectly, which
happens if the channel produces c consecutive errors, the transmitter acts as
in Schalkwijk’s scheme: it inserts the last message bit k times in the message m.
This scheme reduces to Schalkwijk’s scheme if c = 1. For c > 1, it introduces
large redundancies, so it is not suitable for small p. For each p < 1/2, a strategy
can be found that has a positive rate. The schemes need less than 1 bit feedback
per transmitted bit, as for each c bits, the encoder only needs to know if they were
all zero, all one, or not all equal.
In [406], Veugen considers the following extension of Schalkwijk’s scheme to
non-binary channels. If the transmitter observes that symbol j was received, although
it sent symbol i, it immediately repeats symbol i $k_{ij}$ times. A pre-coder
takes care that in the data stream to be transmitted, subsequences of the form
$j\,i^{k_{ij}}$ (with $i \neq j$) do not occur. Veugen considers decoding with a fixed
delay D. That is, suppose the sequence $(x_n)_{n \geq 0}$ is transmitted, and the
sequence $(y_n)_{n \geq 0}$ is received. Symbol $y_n$ will be decoded as follows. The
sequence $y_n, y_{n+1}, \ldots, y_{n+D}$ is scanned from right to left, and each
subsequence $j\,i^{k_{ij}}$ is replaced by i. The leftmost symbol of the resulting
sequence is the estimate $\hat{x}_n$. By comparing $\hat{x}$ and
y, the pre-coder inverse can locate the errors and eliminate the error correction
symbols. Veugen studies the error probabilities for these schemes. Combining
calculations on random walks and a plausible conjecture, he computes the error
exponent of the strategy.
In [414], Schalkwijk and Bargh consider the situation where the feedback link is
without delay and noiseless, but operates at a smaller rate than the forward chan-
nel. They combine Ungerboeck’s set partitioning technique and feedback schemes
for full-rate feedback. The feedback scheme is used to see if the received signal
was in the correct subset of signal points. If so, convolutional decoding is expected
to retrieve the remaining information correctly. If not, the label of the subset of
signal points is repeated. An example with feedback rate 1/2 and a ν = 2 convolutional
code shows a much better performance than a much more complicated
ν = 6 convolutional code.
In [423], Bargh and Schalkwijk compare the block coding strategies discussed
above with a recursive scheme. In the latter case, decoding takes place after a fixed
delay D. A new strategy is discussed, and results on the rate and error exponent
are obtained. In [428], Bargh and Schalkwijk introduce Soft-Repetition Feedback
Coding and its recursive decoding method for binary-input, soft-output symmetrical
Discrete Memoryless Channels. The method is explained with a binary-input,
quaternary output channel.
In [429], Bargh and Schalkwijk give an overview of error correction schemes in
DMCs and AWGN channels with noiseless, instantaneous and full-rate feedback.
They distinguish between two classes. In the first class, which they call “repeat to
resolve uncertainty”, the transmitter conceptually reconstructs the list of candidate
codewords for the decoder, and aims to reduce this list size with every transmission.
In the second class of schemes, called “repeat to correct erroneous reception”,
the transmitter repeats a message segment if it is received incorrectly. In such
schemes, a mechanism is required to signal to the receiver whether transmission is
repeated, or a new segment is transmitted.
4.5 Applications
Channel coding theory is applied in a wide range of areas: deep space communi-
cation, satellite communication, data transmission, data storage, mobile commu-
nication, file transfer, digital audio/video transmission, etc. For an overview of
applications in the first fifty years following Shannon’s 1948 “noisy channel cod-
ing theorem”, we refer to [105]. One of the most notable success stories for the
Benelux in this respect is the development of the compact disc (CD) in the late
1970s and early 1980s [109]. In this section we provide an overview of various
applications reported at the symposia on Information Theory in the Benelux.
In [347], Roefs discusses candidate concatenated coding schemes (cf. Section 4.1.3)
for European Space Agency (ESA) telemetry applications in the early 1980s. The
inner code is fixed as the standard rate 1/2 convolutional code of constraint length 7,
but several candidates for the outer code are considered: Reed-Solomon codes
with interleaving, Gallager’s burst-correcting scheme, and Tong’s burst-trapping
scheme. Their performances are compared for dense burst channels with widely
varying burst and guard space lengths. This work is continued in [350]. In this pa-
per, Best and Roefs again take as inner code the conventional rate 1/2 convolutional
code of constraint length 7. As outer code, they use a [256,224] Reed-Solomon
code C over GF(257). To be more precise, they propose to encode 224 non-zero
symbols (in GF(257)) systematically into a word from C. If a generated parity
symbol happens to be zero, it is replaced by the element 1 (in GF(257)). The au-
thors argue that the encoding error probability introduced by this replacement is
negligible compared to the symbol error probability of the Viterbi decoder. The
choice for GF(257) instead of GF(256) is motivated by the resulting possibility to
employ the Fermat Number Transform for more efficient encoding and decoding.
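To indicate why a prime of the form 2^8 + 1 is convenient, the sketch below implements a radix-2 number-theoretic transform of length 256 modulo 257 and uses it to evaluate a message polynomial at all 256 non-zero field elements, which yields a word of a [256,224] Reed-Solomon code. This is a non-systematic illustration under stated assumptions, not the systematic encoder of [350]: the primitive root 3 and all function names are choices made here.

```python
P = 257    # Fermat prime 2^8 + 1
ROOT = 3   # assumed primitive element of GF(257); its order is 256

assert pow(ROOT, 128, P) == P - 1  # ROOT^128 = -1, so ROOT has order 256


def ntt(a, root):
    """Radix-2 transform: A[j] = sum_i a[i] * root^(i*j) mod P.
    len(a) must be a power of two and root must have order len(a)."""
    n = len(a)
    if n == 1:
        return list(a)
    root_sq = root * root % P
    even = ntt(a[0::2], root_sq)
    odd = ntt(a[1::2], root_sq)
    out = [0] * n
    w = 1
    for j in range(n // 2):
        t = w * odd[j] % P
        out[j] = (even[j] + t) % P
        out[j + n // 2] = (even[j] - t) % P  # uses root^(n/2) = -1
        w = w * root % P
    return out


def inverse_ntt(spectrum, root):
    """Inverse transform: use root^-1 and scale by n^-1 (Fermat inverses)."""
    n = len(spectrum)
    inv_root = pow(root, P - 2, P)
    inv_n = pow(n, P - 2, P)
    return [x * inv_n % P for x in ntt(spectrum, inv_root)]


def rs_encode_nonsystematic(message):
    """Evaluate the degree-<224 message polynomial at ROOT^0, ..., ROOT^255."""
    if len(message) != 224:
        raise ValueError("expected 224 information symbols")
    return ntt(list(message) + [0] * 32, ROOT)


message = [(i % 256) + 1 for i in range(224)]   # 224 non-zero symbols
codeword = rs_encode_nonsystematic(message)
coefficients = inverse_ntt(codeword, ROOT)
print(coefficients[:224] == message, coefficients[224:] == [0] * 32)
```

The 32 vanishing high-order coefficients recovered by the inverse transform play the role of the parity checks: a received word whose inverse transform is non-zero in those positions contains errors.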
Van Gils [364] describes dot codes for product identification (as an alternative
to the well-known bar codes). As a product carrying a dot code word can have
several orientations with respect to the read-out device, the same product is identified
by several dot code words. It is indicated that for certain error-correcting
codes, this ambiguity can be efficiently resolved.
At the time when telephony, telegraphy, and postal services were still all carried
out by the PTT, Haemers considered the protection of a binary representation of the
postal code, as printed on envelopes, against read-out errors. In [365] he proposes
the use of an (extended) Hamming code for this purpose, with a small modification
in order to increase the burst error detection capability.
Belgian bank account numbers consist of 12 digits, $a_9 a_8 \ldots a_0 c_1 c_0$, where
$c_0$ and $c_1$ are such that
$\sum_{i=0}^{9} a_i 10^i \equiv 10 c_1 + c_0 \pmod{97}$. The check digits $c_0$ and
$c_1$ serve to detect the most common errors made by humans when processing digit
strings (single errors, transpositions of consecutive symbols). Stevens [388] shows
that replacing the modulus 97 by 93 slightly increases the error detection probability.
Another slight increase is obtained if it is stipulated that the bank account
number be divisible by 93, i.e., that
$\sum_{i=0}^{9} a_i 10^{i+2} + 10 c_1 + c_0 \equiv 0 \pmod{93}$.
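The modulus-97 congruence above can be tried out directly. The sketch below follows only the congruence as stated in the text and does not model any further conventions that real account numbers may obey; the function names and the sample account digits are hypothetical. It appends check digits to ten account digits and then tests which single-digit errors and adjacent transpositions the check catches.

```python
def check_digits(account, modulus=97):
    """Two check digits c1 c0 with 10*c1 + c0 = (account digits) mod modulus."""
    if len(account) != 10 or not account.isdigit():
        raise ValueError("expected 10 decimal digits")
    return f"{int(account) % modulus:02d}"


def is_valid(number, modulus=97):
    """Test sum a_i 10^i = 10*c1 + c0 (mod modulus) on a 12-digit string."""
    if len(number) != 12 or not number.isdigit():
        raise ValueError("expected 12 decimal digits")
    return (int(number[:10]) - int(number[10:])) % modulus == 0


def single_errors(number):
    """All strings that differ from number in exactly one digit."""
    for pos, old in enumerate(number):
        for new in "0123456789":
            if new != old:
                yield number[:pos] + new + number[pos + 1:]


def adjacent_transpositions(number):
    """All strings obtained by swapping two different neighbouring digits."""
    for pos in range(len(number) - 1):
        if number[pos] != number[pos + 1]:
            yield (number[:pos] + number[pos + 1] + number[pos]
                   + number[pos + 2:])


account = "1234567890"                      # hypothetical account digits
number = account + check_digits(account)
missed = [w for w in single_errors(number) if is_valid(w)]
missed += [w for w in adjacent_transpositions(number) if is_valid(w)]
print(number, is_valid(number), len(missed))
```

Passing `modulus=93` to both functions gives the first variant studied by Stevens, so the same harness can be used to compare the two moduli on other error patterns.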
Offermans, Breeuwer, Weber and Van Willigen [408] consider error-correction
strategies for Eurofix, an integrated radio navigation system that combines terres-
trial Loran-C and the satellite-based Global Positioning System (GPS). Differen-
tial GPS messages are transported via the Loran-C data link, which is disturbed by
continuous wave interference, cross-rate interference, atmospheric noise, etc. In
order to combat these phenomena, the authors propose a coding scheme based on
the concatenation of a Reed-Solomon code and a parity check code.
In [411], Hekstra considers the following synchronization problem. Suppose that
when a bit string $x = (x_1, x_2, \ldots, x_n)$ is written down, then either x or one of its
cyclic shifts, i.e., a string of the form $(x_{1+i}, x_{2+i}, \ldots, x_n, x_1, \ldots, x_i)$, could be
read out. The problem is how to efficiently encode much information into strings
such that all cyclic shifts of two distinct information strings are different. The author
proposes the following method for efficient encoding of nearly the maximum
amount of information. Suppose that $n = 2^m - 1$. Then encode $k = n - m$
information bits systematically to a cyclic Hamming code of length n, and subsequently
invert the leftmost parity symbol. Synchronization is re-established by
single-error correction, followed by shifting the received sequence until the error
position corresponds to the leftmost parity bit.
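The method can be followed step by step for m = 3, i.e., n = 7 and k = 4. The sketch below is an illustration under assumptions not taken from [411]: the cyclic Hamming code is generated by x^3 + x + 1, list index j holds the coefficient of x^j, the parity bits occupy indices 0 to 2, and index 0 is taken as the parity position that gets inverted (which position counts as "leftmost" depends on the ordering convention). All names are hypothetical.

```python
from itertools import product

M = 3                     # number of parity bits
N = (1 << M) - 1          # code length n = 2^m - 1
PRIMITIVE_POLY = 0b1011   # x^3 + x + 1 (assumed generator polynomial)
MARK = 0                  # inverted parity position (assumed convention)


def alpha_powers():
    """Powers alpha^0 .. alpha^(n-1) in GF(2^m), as m-bit integers."""
    powers, value = [], 1
    for _ in range(N):
        powers.append(value)
        value <<= 1
        if value >> M:
            value ^= PRIMITIVE_POLY
    return powers


ALPHA = alpha_powers()
LOG = {value: j for j, value in enumerate(ALPHA)}


def syndrome(word):
    """Evaluate the word polynomial at alpha; zero exactly for codewords."""
    s = 0
    for j, bit in enumerate(word):
        if bit:
            s ^= ALPHA[j]
    return s


def rotate(word, i):
    """Cyclic shift: result[j] = word[(j + i) mod n]."""
    i %= len(word)
    return word[i:] + word[:i]


def encode(info):
    """Systematic Hamming encoding followed by inversion of one parity bit."""
    word = [0] * M + list(info)
    s = syndrome(word)
    for j in range(M):            # alpha^j = 2^j for j < m, so the parity
        word[j] = (s >> j) & 1    # bits are simply the bits of s
    word[MARK] ^= 1
    return word


def resynchronize(read_out):
    """Locate the inverted bit by single-error correction, undo the shift."""
    s = syndrome(read_out)
    if s == 0:
        raise ValueError("no inverted bit found")
    position = LOG[s]
    word = rotate(read_out, position - MARK)
    word[MARK] ^= 1               # restore the original parity bit
    return word[M:]


info = [1, 0, 1, 1]
print(all(resynchronize(rotate(encode(info), i)) == info for i in range(N)))
```

The sketch relies on the code being cyclic: every shift of a codeword is again a codeword, so the deliberately inverted bit always appears to the decoder as a single error whose position reveals the shift.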
In [418], De Bart shows that the channel coding scheme of the Digital Video
Broadcasting (DVB) satellite system, based on the concatenation of a Reed-Solomon
code and a convolutional code, has to deal with ambiguities that cannot be solved
by the Viterbi decoder. The channel and the QPSK demodulator may cause transformations
(rotations, shifts, etc.) yielding an incorrect sequence that resembles a
codeword of the original convolutional code. Joint synchronization of the Viterbi
and Reed-Solomon decoders should solve the problem.
A method for error correction in IC implementations of Boolean functions is pro-
posed by Muurling, Kleihorst, Benschop, Van der Vleuten and Simonis [434]. The
method corrects both manufacturing hard errors and temporary soft errors during
circuit operation. A systematic Hamming code is used, which can be implemented
through additional logic or even through software tools.
Desset [439] considers error control coding for Wireless Personal Area Networks
(WPAN) in 2002. In a Wireless Personal Area Network, power consumption plays
a very important role. High-performance channel coding strategies can be used to
obtain coding gain and thus reduce transmit power. The average energy required
per bit in a typical situation is about 15 nJ/bit. In addition, power consumption due
to the complexity of encoding and decoding has to be considered. The complexity
of Hamming codes, Reed-Muller codes, Reed-Solomon codes and Convolutional
and Turbo codes has been analyzed. The two constraints are in contradiction and
an optimum solution has to be found. The paper proposes a strategy to select error
correcting codes for WPANs. For applications with different average bit energies
ranging from 100 pJ/bit to 10 nJ/bit, the authors recommend Hamming codes,
short constraint-length convolutional codes, and turbo coding, respectively.