Knuth's Balancing of Codewords Revisited

Jos H. Weber
TU Delft, IRCTR/CWPC,
Mekelweg 4, 2628 CD Delft The Netherlands
Email: J.H.Weber@ewi.tudelft.nl
Kees A. Schouhamer Immink
Turing Machines Inc.
Willemskade 15b-d, 3016 DK Rotterdam The Netherlands
Email: immink@turing-machines.com
Abstract: In 1986, Don Knuth published a very simple algorithm for constructing sets of bipolar codewords with equal numbers of '1's and '-1's, called balanced codes. Because it needs no look-up tables, Knuth's algorithm is well suited for use with large codewords. The redundancy of Knuth's balanced codes is a factor of two larger than that of a code comprising the full set of balanced codewords. In this paper we present results of our attempts to improve the performance of Knuth's balanced codes.
Key words: magnetic recording, optical recording, channel
capacity, constrained code, balanced code.
I. INTRODUCTION
Sets of bipolar codewords that have equal numbers of ’1’s
and ’-1’s are usually called balanced codes. Such codes have
found application in cable transmission, optical and magnetic
recording. A survey of properties and methods for constructing
balanced codes can be found in [1]. A simple encoding
technique for generating balanced codewords, which is capable
of handling (very) large blocks was described by Knuth [2] in
1986.
Knuth's algorithm is extremely simple. An $m$-bit user word, $m$ even, consisting of bipolar symbols valued $\pm 1$, is forwarded to the encoder. The encoder inverts the first $k$ bits of the user word, where $k$ is chosen in such a way that the modified word has equal numbers of '1's and '-1's. Knuth showed that such an index $k$ can always be found. The index $k$ is represented by a (preferably) balanced word of length $p$. The $p$-bit prefix word and the modified $m$-bit user word are both transmitted, so that the rate of the code is $m/(m+p)$. The receiver can easily undo the inversion of the first $k$ bits received. Neither the encoder nor the decoder requires look-up tables, and Knuth's algorithm is therefore very attractive for constructing long balanced codewords. Modifications of the generic scheme are discussed in Knuth [2], Alon et al. [3], Al-Bassam & Bose [4], and Tallini, Capocelli & Bose [5].
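The balancing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the index is returned as a plain integer, and the encoding of that index into a balanced prefix is deliberately left out.

```python
from typing import List, Sequence, Tuple


def first_balancing_index(word: Sequence[int]) -> int:
    """Smallest k (1-based) such that inverting the first k symbols balances `word`."""
    if len(word) % 2:
        raise ValueError("word length must be even")
    disparity = sum(word)
    running_sum = 0
    for k, symbol in enumerate(word, start=1):
        running_sum += symbol
        # Inverting the first k symbols changes the disparity by -2 * running_sum.
        if disparity - 2 * running_sum == 0:
            return k
    raise ValueError("no balancing index found; symbols must be +1 or -1")


def knuth_balance(word: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (k, balanced word) using the first (greedy) balancing index."""
    k = first_balancing_index(word)
    return k, [-s for s in word[:k]] + list(word[k:])


def knuth_restore(k: int, balanced: Sequence[int]) -> List[int]:
    """Undo the inversion of the first k symbols."""
    return [-s for s in balanced[:k]] + list(balanced[k:])


if __name__ == "__main__":
    k, balanced = knuth_balance([1, 1, 1, -1, 1, 1])
    print(k, balanced, sum(balanced))
```

Both directions are a single pass over the word, which is why no look-up table is needed.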
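Knuth showed that in his best construction $m$ is roughly equal to $2^p$, so that the redundancy of Knuth's construction is approximately [2]
$$p \approx \log_2 m, \quad m \gg 1. \tag{1}$$
The cardinality of a full set of balanced codewords of length $m$ equals
$$\binom{m}{m/2} \approx \frac{2^m}{\sqrt{\pi m/2}},$$
where the approximation of the central binomial coefficient is due to Stirling. Then the redundancy of a full set of balanced codewords is
$$m - \log_2 \binom{m}{m/2} \approx \frac{1}{2}\log_2 m + 0.326, \quad m \gg 1. \tag{2}$$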
We conclude that the redundancy of a balanced code generated
by Knuth’s algorithm falls a factor of two short with respect
to a code that uses ’full’ balanced code sets. Clearly, the
loss in redundancy is the price one has to pay for a simple
construction. There are two features of Knuth’s construction
that could help to explain the difference in performance, and
they offer opportunities for code improvement.
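As a numerical check of this factor-of-two gap, the following sketch compares the exact redundancy of the full balanced set, $m - \log_2\binom{m}{m/2}$, with its Stirling estimate $\frac{1}{2}\log_2 m + \frac{1}{2}\log_2(\pi/2)$ (the constant is about 0.326) and with $\log_2 m$, taken here as the approximate redundancy of Knuth's scheme. The function names are illustrative assumptions.

```python
import math


def full_set_redundancy(m: int) -> float:
    """Exact redundancy (bits) of the code made of all balanced words of even length m."""
    return m - math.log2(math.comb(m, m // 2))


def stirling_estimate(m: int) -> float:
    """Stirling-based estimate 0.5*log2(m) + 0.5*log2(pi/2)."""
    return 0.5 * math.log2(m) + 0.5 * math.log2(math.pi / 2)


if __name__ == "__main__":
    for m in (64, 256, 1024):
        exact = full_set_redundancy(m)
        print(m, round(exact, 3), round(stirling_estimate(m), 3), math.log2(m))
```

For finite $m$ the ratio $\log_2 m$ to the exact full-set redundancy stays somewhat below two because of the additive constant; it approaches two only as $m$ grows.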
The first feature that may offer a possibility of improving the code's performance stems from the fact that Knuth's algorithm is greedy: it takes the very first opportunity for balancing the codeword [1]. That is, in Knuth's basic scheme, the smallest index where balance is reached is selected. In case there is more than one position where balance can be achieved, the encoder will thus favor smaller values of the position index. As a result, we may expect that smaller values of the index are more probable than larger ones. Then, if the index distribution is non-uniform, we may conclude that the average length of the prefix required to transmit the position information is less than $\log_2 m$. A practical embodiment of a scheme that takes advantage of this feature is characterized by the fact that the length of the prefix word is not fixed, but user data dependent. The prefix assigned to a position with a smaller, more probable, index has a smaller length than a prefix assigned to a position with a larger index.
Secondly, it has been shown by Knuth that there is always a position where balance can be reached. It can be verified that there is, for some user words, more than one suitable position where balance of the word can be realized. It will be shown later that the number of positions where words can be balanced lies between 1 and $m/2$. This freedom offers a possibility to improve the redundancy of Knuth's basic construction. An enhanced Knuth's algorithm may transmit auxiliary data by using the freedom of selecting from the balancing positions possible. Assume there are $v$ positions, $1 \le v \le m/2$, where the encoder can balance the user word; then the encoder can convey an additional $\log_2 v$ bits. The number $v$ depends on the user word, and therefore the amount of auxiliary data that can be transmitted is user data dependent.
We start, in Section II, with a survey of known properties of Knuth's coding method. Thereafter, we will compute the distribution of the transmitted index in Section III. Given the distribution of the index, we will compute the entropy of the index, and evaluate the performance of a suitably modified scheme. In Section IV, we will compute the amount of additional data that can be conveyed in a modification of Knuth's basic scheme. Section V concludes this article.
ISIT 2008, Toronto, Canada, July 6 - 11, 2008
978-1-4244-2571-6/08/$25.00 ©2008 IEEE
II. KNUTH'S BASIC SCHEME
Knuth's balancing algorithm is based on the idea that there is a simple translation between the set of all $m$-bit bipolar user words, $m$ even, and the set of all $(m+p)$-bit codewords. The translation is achieved by selecting a bit position $k$ within the $m$-bit word that defines two segments, each having the same, but opposite, disparity. A zero-disparity or balanced block is now generated by the inversion of the first $k$ bits (or the last $m-k$ bits). The position digit $k$ is encoded in the $p$-bit prefix. The rate of the code is simply $m/(m+p)$.
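The proof that there is at least one position, $k$, where balance in any even-length user word can be achieved is due to Knuth. Let the user word be $w = (w_1, \ldots, w_m)$, $w_i \in \{-1, 1\}$, and let $d(w)$ be the sum, or disparity, of the user symbols, or
$$d(w) = \sum_{i=1}^{m} w_i. \tag{3}$$
Let $z_k(w)$ be the running digital sum of the first $k$, $1 \le k \le m$, bits of $w$, or
$$z_k(w) = \sum_{i=1}^{k} w_i, \tag{4}$$
and let $w^{(k)}$ be the word $w$ with its first $k$ bits inverted. For example, let $w$ = (-1, 1, 1, 1, -1, 1, -1, 1, 1, -1); then we have $d(w) = 2$ and $w^{(4)}$ = (1, -1, -1, -1, -1, 1, -1, 1, 1, -1). We let $\sigma_k(w)$ stand for $d(w^{(k)})$; then the quantity $\sigma_k(w)$ is
$$\sigma_k(w) = -\sum_{i=1}^{k} w_i + \sum_{i=k+1}^{m} w_i = d(w) - 2 z_k(w). \tag{5}$$
It is immediate that $\sigma_0(w) = d(w)$ (no symbols inverted) and $\sigma_m(w) = -d(w)$ (all symbols inverted). We may, as $\sigma_{k+1}(w) = \sigma_k(w) \pm 2$, conclude that every word $w$, $m$ even, can be associated with at least one position $k$ for which $\sigma_k(w) = 0$, or $w^{(k)}$ is balanced. This concludes the proof.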
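The argument can be checked exhaustively for short words. The sketch below computes the disparity obtained after inverting the first $k$ symbols, for every $k$ from 0 to the word length, and confirms that the sequence starts at the disparity of the word, ends at its negative, moves in steps of $\pm 2$, and therefore passes through zero. The function names and the word lengths tried are illustrative assumptions.

```python
from itertools import product
from typing import List, Sequence


def disparity_after_inversion(word: Sequence[int]) -> List[int]:
    """Return the disparities of `word` with its first k symbols inverted, k = 0..len(word)."""
    sigma = sum(word)            # k = 0: nothing inverted
    profile = [sigma]
    for symbol in word:
        sigma -= 2 * symbol      # inverting one more symbol changes the sum by -2*symbol
        profile.append(sigma)
    return profile


def always_balanceable(m: int) -> bool:
    """Check every +/-1 word of even length m for at least one balancing index in 1..m."""
    return all(
        0 in disparity_after_inversion(w)[1:]
        for w in product((-1, 1), repeat=m)
    )


if __name__ == "__main__":
    print(disparity_after_inversion([1, 1, 1, -1, 1, 1]))
    print(always_balanceable(8))
```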
The value of $k$ is encoded in a (preferably) balanced word of length $p$, $p$ even. The maximum length $m$ of the user word $w$ is, since the prefix has an equal number of '1's and '-1's, governed by
$$m \le \binom{p}{p/2}. \tag{6}$$
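The bound rests on a counting argument: a balanced prefix of even length $p$ can take $\binom{p}{p/2}$ distinct values, so it can label $m$ different indices only if that number is at least $m$. A small sketch, with a hypothetical function name, finds the shortest such prefix for a given user word length.

```python
import math


def min_balanced_prefix_length(m: int) -> int:
    """Smallest even p whose balanced words, C(p, p/2) of them, can label m indices."""
    if m < 1:
        raise ValueError("m must be positive")
    p = 2
    while math.comb(p, p // 2) < m:
        p += 2
    return p


if __name__ == "__main__":
    for m in (4, 64, 256):
        print(m, min_balanced_prefix_length(m))
```

For a user word of 256 symbols this gives a prefix of 12 symbols, because the 252 balanced words of length 10 fall just short of 256.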
In this article, we follow Knuth's generic format, where $1 \le k \le m$. Note that in a slightly different format, we may opt for $0 \le k \le m$, where the encoder has the option to invert or not to invert the codeword in case the user word is balanced. For small values of $m$, this will lead to slightly different results, though for very large values of $m$, the differences between the two formats are small. Knuth described some variations on the general framework. For example, if $m$ and $p$ are both odd, we can use a similar construction. The redundancy of Knuth's most efficient construction is
III. DISTRIBUTION OF THE TRANSMITTED INDEX
The basic Knuth algorithm, as described above, progressively scans the user word until it finds the first suitable position, $k$, where the word can be balanced. In case there is more than one position where balance can be obtained, it is expected that the encoder will favor smaller values of the position index. Then the distribution of the index is not uniform, and, thus, the entropy of the index is less than $\log_2 m$, which opens the door for a more efficient scheme. A practical embodiment of a more efficient scheme would imply that the prefix assigned to a smaller index has a smaller length than a prefix assigned to a larger index. We will compute the entropy of the index sent by the basic Knuth encoder, and in order to do so we first compute the probability distribution of the transmitted index. In our analysis it is assumed that all information words are equiprobable and independent. Let $\Pr(k)$ denote the probability that the transmitted index equals $k$, $1 \le k \le m$.
Theorem 1: The distribution of the transmitted index $k$, $1 \le k \le m$, is given by ( )
Proof. This result can be shown using the On-Line Encyclopedia of Integer Sequences, A33820.
Invoking Stirling's approximation, we have
For , we have , and
for , we have .
Figure 1 shows two examples of the distribution, $\Pr(k)$, for $m = 64$ and $m = 256$. The entropy of the transmitted index, denoted by $H_p(m)$, is
$$H_p(m) = -\sum_{k=1}^{m} \Pr(k) \log_2 \Pr(k). \tag{7}$$
Given the distribution, it is now straightforward to compute the entropy, $H_p(m)$, of the index. Figure 2 shows a few results of computations. The diagram shows that $H_p(m)$ is only slightly less than $\log_2 m$, and we conclude that the above proposed modification of Knuth's scheme using a variable-length prefix can offer only a small improvement in redundancy within the range of codeword lengths investigated. In particular, the proposed variable prefix-length scheme cannot bridge the factor of two in redundancy between the basic Knuth scheme and that of full-set balanced codes.
Fig. 1. Distribution of the (normalized) transmitted index $k/m$ for $m = 64$ and $m = 256$. (Plot of Pr(j) versus j/m.)
Fig. 2. Entropy $H_p(m)$ versus $\log_2 m$.
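For short words, the distribution of the transmitted index and its entropy can be obtained by brute force rather than from a closed form. The sketch below enumerates all user words of a given even length, records the first balancing index of each, and computes the entropy of the resulting distribution under the paper's assumption of equiprobable user words. The function names are hypothetical, and the enumeration is only practical for small lengths.

```python
import math
from itertools import product
from typing import List


def first_index_counts(m: int) -> List[int]:
    """counts[k-1] = number of +/-1 words of even length m whose first balancing index is k."""
    counts = [0] * m
    for word in product((-1, 1), repeat=m):
        target = sum(word)           # balance at k  <=>  2 * (running sum at k) == disparity
        running_sum = 0
        for k, symbol in enumerate(word, start=1):
            running_sum += symbol
            if 2 * running_sum == target:
                counts[k - 1] += 1
                break
    return counts


def index_entropy(m: int) -> float:
    """Entropy (bits) of the transmitted index for equiprobable user words."""
    total = 2 ** m
    return -sum(c / total * math.log2(c / total) for c in first_index_counts(m) if c)


if __name__ == "__main__":
    for m in (4, 8, 12):
        print(m, first_index_counts(m), round(index_entropy(m), 3), math.log2(m))
```

For words of length 4 the sixteen user words split as 6, 6, 2, 2 over the indices 1 to 4, so the index entropy is about 1.81 bits against 2 bits for a fixed-length prefix.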
IV. ENCODING AUXILIARY DATA
There is at least one position and there are at most $m/2$ positions within an $m$-bit word, $m$ even, where a word can be balanced. The 'at least one' position, which makes Knuth's algorithm possible, was proved by Knuth (see above). The 'at most' bound is shown in the next theorem.
Theorem 2: There are at most $m/2$ positions within an $m$-bit word, $m$ even, where a word can be balanced.
Proof. Let $k$ denote a position where balance can be made. Then, at the neighboring positions $k-1$ or $k+1$ such a balance cannot be made, because inverting one symbol more or one symbol fewer changes the disparity by $\pm 2$. We conclude that the number of positions where balance can be made is less than or equal to $m/2$.
Note that the indices of a word with $m/2$ balance positions are either all even or all odd. It can easily be verified that there are three groups of words that can be balanced at $m/2$ positions. Namely
a) the $2^{m/2}$ words consisting of the cascade of $m/2$ di-bits (+1,-1) or (-1,+1),
b) the $2^{m/2-1}$ words beginning with a +1 followed by $m/2-1$ di-bits (+1,-1) or (-1,+1), followed by a +1, and
c) the $2^{m/2-1}$ inverted words of case b).
Fig. 3. Distribution of the (normalized) number, $v/m$, of possible balancing positions for $m = 64$ and $m = 256$. (Plot of Pr(j) versus j/m.)
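The bound and the three groups of extremal words can be checked by enumeration for small word lengths. The following sketch lists all balancing positions of a word and collects the words that reach the maximum; the function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def balancing_positions(word: Sequence[int]) -> List[int]:
    """All indices k in 1..m such that inverting the first k symbols balances `word`."""
    target = sum(word)
    running_sum = 0
    positions = []
    for k, symbol in enumerate(word, start=1):
        running_sum += symbol
        if 2 * running_sum == target:
            positions.append(k)
    return positions


def max_balancing_positions(m: int) -> int:
    """Largest number of balancing positions over all +/-1 words of even length m."""
    return max(len(balancing_positions(w)) for w in product((-1, 1), repeat=m))


def extremal_words(m: int) -> List[Tuple[int, ...]]:
    """Words of even length m that have m/2 balancing positions."""
    return [w for w in product((-1, 1), repeat=m)
            if len(balancing_positions(w)) == m // 2]


if __name__ == "__main__":
    print(balancing_positions([1, -1, 1, -1]))   # cascade of di-bits
    print(balancing_positions([1, 1, -1, 1]))    # +1, one di-bit, +1
    print(max_balancing_positions(6), len(extremal_words(6)))
```

For length 6 the enumeration finds 16 extremal words: 8 cascades of di-bits, 4 words of the second group, and their 4 inverses.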
Since, on average, the encoder has the freedom of selecting from more than one balance position, it has the possibility to transmit auxiliary data. Assume there are $v$ positions, $1 \le v \le m/2$, where the encoder can balance the user word; then the encoder can convey an additional $\log_2 v$ bits. The number $v$ depends on the user word at hand, and therefore the amount of auxiliary data that can be transmitted is user data dependent.
Let $\Pr(v)$ denote the probability that the encoder may choose between $v$, $1 \le v \le m/2$, possible positions where balancing is possible.
Theorem 3: The distribution of the number of positions, $v$, where an $m$-bit word, $m$ even, can be balanced is given by
(8)
Proof. See Appendix. Theorem 3 follows from Lemma 3 and the fact that there are $2^m$ sequences of length $m$, which are assumed to be equally probable.
Figure 3 shows two examples of the distribution, namely for $m = 64$ and $m = 256$. The average amount of information, $H_a(m)$, that can be conveyed via the choice in the position data is
$$H_a(m) = \sum_{v=1}^{m/2} \Pr(v) \log_2 v. \tag{9}$$
Fig. 4. The average amount of information, $H_a(m)$, that can be conveyed via the choice in the index as a function of $\log_2 m$.
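For small word lengths, both the distribution of the number of balancing positions and the average in (9) can be computed by direct enumeration, without relying on a closed-form expression. The sketch below does this under the assumption of equiprobable user words and counts $\log_2 v$ bits for a word with $v$ balancing positions; the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def balancing_position_histogram(m: int) -> Counter:
    """Map v -> number of +/-1 words of even length m with exactly v balancing positions."""
    histogram = Counter()
    for word in product((-1, 1), repeat=m):
        target = sum(word)
        running_sum = 0
        positions = 0
        for symbol in word:
            running_sum += symbol
            if 2 * running_sum == target:
                positions += 1
        histogram[positions] += 1
    return histogram


def average_auxiliary_bits(m: int) -> float:
    """Average of log2(v) over all equiprobable words of even length m."""
    histogram = balancing_position_histogram(m)
    return sum(count * math.log2(v) for v, count in histogram.items()) / 2 ** m


if __name__ == "__main__":
    for m in (4, 6, 8, 10):
        print(m, dict(balancing_position_histogram(m)), round(average_auxiliary_bits(m), 3))
```

For length 6, the 64 words split as 24, 24, and 16 over one, two, and three balancing positions, giving an average of about 0.77 auxiliary bits per word.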
Results of computations are shown in Figure 4. We can recursively compute by invoking
For large and , we have
where . We approximate
so that
Now, for large $m$, we can approximate $H_a(m)$ by
(10)
(11)
(12)
where $\gamma \approx 0.5772$ is Euler's constant. We conclude that the average amount of information that can be conveyed by exploiting the choice of index compensates for the loss in rate between codes based on Knuth's algorithm and codes based on full balanced codeword sets.
V. CONCLUSIONS
We have investigated some characteristics and possible improvements of Knuth's algorithm for constructing bipolar codewords with equal numbers of '1's and '-1's. An $(m+p)$-bit codeword is obtained after a small modification of the $m$-bit user word plus appending a fixed-length $p$-bit prefix. The $p$-bit prefix represents the position index within the codeword where the modification has been made.
We have derived the distribution of the index (assuming equiprobable user words), and have computed the entropy of the transmitted index. Our computations show that a modification of Knuth's generic scheme using a variable-length prefix of the position index will only offer a small improvement in redundancy.
The transmitter can, in general, choose from a plurality of indices, so that the transmitter can transmit additional information. The number of possible indices depends on the given user word, so that the amount of extra information that can be transmitted is data dependent. We have derived the distribution of the number of positions where a word can be balanced. We have computed the average information that can be conveyed by using the freedom of choosing from multiple indices. The average information rate can, for large user word length, $m$, be approximated by . This compensates for the loss in code rate between codes based on Knuth's algorithm and codes based on full balanced codeword sets.
VI. APPENDIX
In this appendix we give a combinatorial proof of Theorem 3. We also refer the reader to the On-Line Encyclopedia of Integer Sequences, A112326. Let denote the set of bipolar sequences of even length which can be balanced in positions ( ). Define . We will derive an explicit expression for (in Lemma 3), from which Theorem 3 immediately follows. Let denote the set of all balanced sequences of length without internal balancing positions, i.e., there are no balancing positions with . Define . Any sequence with balancing positions can be uniquely decomposed as , where is of length , with and . Note that is in for all and that is in . From these observations, we can easily derive the recursive relation
(13)
for all . Further, we have, for all , the trivial equality
(14)
A Dyck word of length $2n$ is a balanced bipolar sequence of length $2n$ such that no initial segment has more '1's than '-1's [6], or in other words, $w$ is a Dyck word if the running digital sum $z_k(w) \le 0$ for all $1 \le k \le 2n$. The number of Dyck words of length $2n$ is equal to
$$C_n = \frac{1}{n+1}\binom{2n}{n}, \tag{15}$$
which is the $n$-th Catalan number [6]. For example, 0011 and 0101 are the Dyck words of length 4, and 000111, 001011, 001101, 010011, and 010101 are the Dyck words of length 6, where for clerical convenience we have written '0' instead of '-1'. Note that a sequence is in if and only if it has (the inverse of) the format , where is a Dyck word of length . Hence, for all ,
(16)
For example, , , , , which is indeed the result provided by (16).
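The Dyck word definition and the Catalan count can be checked by enumeration. The sketch below generates, for a given half-length, all balanced sequences in which no initial segment has more '1's than '-1's, writing '0' for '-1' as in the text, and compares their number with the Catalan number. The function names are hypothetical.

```python
import math
from itertools import product
from typing import List


def dyck_words(n: int) -> List[str]:
    """Dyck words of length 2n, written with '0' for -1 and '1' for +1."""
    words = []
    for bits in product("01", repeat=2 * n):
        running_sum = 0
        for bit in bits:
            running_sum += 1 if bit == "1" else -1
            if running_sum > 0:      # an initial segment with more 1s than -1s
                break
        else:
            if running_sum == 0:     # balanced overall
                words.append("".join(bits))
    return words


def catalan(n: int) -> int:
    """n-th Catalan number, C(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, len(dyck_words(n)), catalan(n))
```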
Lemma 1: For all and satisfying , it holds that
(17)
Proof. Any bipolar sequence of length containing 'ones' can be uniquely written as , where is a Dyck word of length , with , and is a bipolar sequence of length containing 1's. Using (15) for Dyck word enumeration, a simple counting argument gives the stated result.
Lemma 2: For all , it holds that
(18)
Proof. Any bipolar sequence of length having more than 1's can be uniquely written as , where is of length , with , and is of length and has 1's. Any bipolar sequence of length containing fewer than 1's can be uniquely written as , where is of length , with , and is of length and has 1's. Hence,
(19)
which concludes the proof.
Lemma 3: For all , it holds that
(20)
Proof. Assuming that the statement holds for all , we will show that it also holds for . For all , we have
(21)
where the first equality follows from (13), the second from (20) and (16), and the third from Lemma 1 (with and ). Further, we have
(22)
where the first equality follows from (14) (with ), the second from (21), and the third from Lemma 2 (with ). Hence, if the statement in the lemma holds for all , then it holds for as well. Since (14) gives that , (20) holds for , and the lemma follows by induction on .
REFERENCES
[1] K.A.S. Immink, Codes for Mass Data Storage Systems, Second Edition, ISBN 90-74249-27-2, Shannon Foundation Publishers, Eindhoven, Netherlands, 2004.
[2] D.E. Knuth, 'Efficient Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-32, no. 1, pp. 51-53, Jan. 1986.
[3] N. Alon, E.E. Bergmann, D. Coppersmith, and A.M. Odlyzko, 'Balancing Sets of Vectors', IEEE Trans. Inform. Theory, vol. IT-34, no. 1, pp. 128-130, Jan. 1988.
[4] S. Al-Bassam and B. Bose, 'On Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-36, no. 2, pp. 406-408, March 1990.
[5] L.G. Tallini, R.M. Capocelli, and B. Bose, 'Design of Some New Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-42, no. 3, pp. 790-802, May 1996.
[6] R.P. Stanley, Enumerative Combinatorics, Vol. 2, Cambridge University Press, 1999.