Markers to construct DC free (d, k)
constrained balanced block codes using
Knuth’s inversion
H.C. Ferreira, J.H. Weber, C.H. Heymann and
K.A.S. Immink
In this reported work, Knuth’s balancing scheme, which was originally
developed for unconstrained binary codewords, is adapted. Presented is
a simple method to balance the NRZ runlength constrained block codes
corresponding to (d, k) constrained NRZI sequences. A short marker
violating the maximum runlength or k constraint is used to indicate
the balancing point for Knuth’s inversion. The marker requires fewer
overhead bits and less implementation complexity than indexing the
balancing point’s address by mapping it onto a (d, k) or runlength
constrained prefix, as is done when applying Knuth’s original scheme
more directly. The new code construction may be attractive for future
magnetic and especially optical recording schemes. In fact, the current
optical storage media, such as the CD, DVD and Blu-ray Disc, all
attempt to achieve some suppression of low frequency components
of the constrained codes by exploiting a limited degree of freedom
within the set of candidate (d, k) words.
Introduction: Binary (d, k) constrained codes have been used since the
1960s in magnetic recording and are currently very prominent in optical
recording [1]. Here d denotes the minimum number of 0s and k the
maximum number of 0s between any two consecutive 1s in the NRZI
representation of the coded sequence. Hence there is a minimum
runlength of d + 1 same symbols and a maximum runlength of k + 1 same
symbols in the NRZ representation of the binary runlength constrained
channel sequence. By balancing the binary channel symbols {0, 1} as in
[2] and mapping them onto bipolar symbols {−1, +1}, a balanced and
hence DC free code with suppressed low frequency components can
be constructed. Previously, investigations of DC free (d, k) constrained
codes focused mainly on short finite state machine codes for rotating
head magnetic tape recorders as in [3], or medium length block codes
as currently implemented in optical storage media [1]. Long block
codes as in [4] may suffer from large error propagation. However,
long block codes are attractive because they approach channel capacity
more closely and hence increase storage density. There is very
little literature on constructing such long block codes. We now present
a simple method, requiring fewer overhead bits and also less
implementation complexity than the more direct application of Knuth’s
algorithm in [5].
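The relationship between the NRZI (d, k) constraint and the NRZ
runlengths can be illustrated with a short sketch. Python is used here
purely for exposition; the function names nrzi_to_nrz and run_lengths
are hypothetical, and the starting NRZ level of 0 is an assumption.

```python
from itertools import groupby

def nrzi_to_nrz(nrzi, start_level=0):
    """Convert NRZI bits to NRZ: a 1 toggles the level, a 0 holds it."""
    level, out = start_level, []
    for bit in nrzi:
        level ^= bit
        out.append(level)
    return out

def run_lengths(bits):
    """Lengths of the maximal runs of identical symbols, in order."""
    return [len(list(group)) for _, group in groupby(bits)]

# (d, k) = (1, 3) NRZI example: 1 to 3 zeros between consecutive 1s.
nrzi = [1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
nrz = nrzi_to_nrz(nrzi)
runs = run_lengths(nrz)  # every run lies between d + 1 = 2 and k + 1 = 4
```

Each 1 in the NRZI sequence starts a new NRZ run, so z zeros between two
1s give an NRZ run of z + 1 symbols.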
Code construction: We modify and apply Knuth’s balancing scheme
[2], which was originally developed for unconstrained binary
codewords. The main idea in Knuth’s construction is to convert the original
data sequence into a balanced sequence by inverting all symbols beyond
a certain balancing index. This index is represented in a (balanced)
prefix, which is also communicated. The receiver obtains the index
from the prefix and retrieves the original data sequence by inverting
all symbols beyond the balancing index. In the construction for balanced
constrained sequences which is presented in this Letter, an alternative to
the prefix method is proposed. Specifically, a short marker violating the
maximum runlength constraint is now used to indicate the balancing
point for Knuth’s inversion. Note that while the d constraint is still
critical for recording density and reliability, the k constraint increased in
steps from k = 3 in the 1960s, when only simple tank oscillator
electronic synchronisation circuitry was available, to k = 7 in the 1970s
and later k = 11. So far there has been little interest in further increasing
k, since this will only yield an insignificant increase in practical
recording density, as can be shown by numerical evaluation of the information
theoretic channel capacity as in e.g. [1]. With state of the art phase
locked loops, occasional violations are tolerable. In fact, it can also be
expected that in future synchronisation technology may be further
improved. In this Letter we exploit occasional violations to create
markers.
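The diminishing return from increasing k can be checked numerically. The
sketch below is not taken from this Letter: it uses the standard result
that the capacity of a (d, k) constraint is log2 of the largest real root
of sum_{i=d+1}^{k+1} z^(-i) = 1, and finds that root by bisection. The
function name dk_capacity is hypothetical.

```python
import math

def dk_capacity(d, k, tol=1e-12):
    """Capacity in bits per channel bit of a (d, k) constraint, 0 <= d < k.

    Every constrained sequence is a concatenation of phrases of length
    d + 1 .. k + 1, so the growth rate z solves sum z**(-i) = 1.
    """
    if not 0 <= d < k:
        raise ValueError("need 0 <= d < k with finite k")

    def f(z):
        return sum(z ** (-i) for i in range(d + 1, k + 2)) - 1.0

    lo, hi = 1.0, 2.0  # f is decreasing, f(1) > 0 and f(2) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2((lo + hi) / 2)

# Capacity gain per step in k for d = 1.
gain_3_to_7 = dk_capacity(1, 7) - dk_capacity(1, 3)
gain_7_to_11 = dk_capacity(1, 11) - dk_capacity(1, 7)
```

Comparing gain_3_to_7 with gain_7_to_11 shows how quickly the benefit of
a larger k flattens out for d = 1.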
We use prior art such as in [1] or [6] to generate a high rate NRZI
(d, k) constrained block code. Represent the NRZ (d + 1, k + 1)
runlength constrained channel sequence x with

$$x_1 \ldots x_j \, x_{j+1} \ldots x_n, \qquad x_i \in \{0, 1\} \qquad (1)$$

where j is the balancing index. Consequently, the sequence can be
balanced by inverting the last $n - j$ bits $x_{j+1} \ldots x_n$. Knuth
showed that every binary word has at least one such balancing index j,
where $1 \le j \le n$. Let $\bar{x}_i$ be the binary complement of $x_i$.
We now insert a balanced marker sequence $M = m_1 \ldots m_z$ of length
z bits and transmit

$$x_1 \ldots x_j \, m_1 \ldots m_z \, \bar{x}_{j+1} \ldots \bar{x}_n \qquad (2)$$
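The search for the balancing index j can be sketched as follows. This is
a minimal illustration, assuming an even word length n and a list of 0/1
integers as input; the names balancing_index and balance are
hypothetical.

```python
def balancing_index(x):
    """Smallest j in 1..n such that x_1..x_j followed by the complement
    of x_{j+1}..x_n contains as many 1s as 0s (n must be even)."""
    n = len(x)
    if n == 0 or n % 2:
        raise ValueError("word length must be even and non-zero")
    weight = n - sum(x)  # number of 1s when every bit is inverted (j = 0)
    for j in range(1, n + 1):
        # Moving the boundary one place to the right restores x_j.
        weight += 1 if x[j - 1] else -1
        if 2 * weight == n:
            return j
    raise AssertionError("unreachable for even n")

def balance(x):
    """Return (j, balanced word) with the last n - j bits inverted."""
    j = balancing_index(x)
    return j, x[:j] + [1 - b for b in x[j:]]

j, word = balance([1, 1, 0, 0, 1, 1, 1, 1])
```

The weight moves in steps of one between n − w and w as j runs from 0 to
n, where w is the weight of x, so it must pass through n/2. This is why
an index with 1 ≤ j ≤ n always exists for even n.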
We want to extend the last run of the sequence $x_1 \ldots x_j$ to create a
maximum runlength constraint violation, i.e. a run of k + 2 bits or
longer, without violating the minimum runlength of d + 1 channel
symbols anywhere. To this end, we propose the following marker. Let
$y^u$ denote a run of u same symbols y and set

$$M = x_j^{k+1} \, \bar{x}_j^{k+1} \, x_{j+1}^{d+1} \, \bar{x}_{j+1}^{d+1} \qquad (3)$$

If j = n, then we use the first bit of the next word as $x_{j+1}$. As a
simple example for exposition, the set of NRZ candidate markers for a
(d, k) = (1, 3) code is
{000011110011, 000011111100, 111100001100, 111100000011}
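The candidate marker set can be reproduced with a short sketch. The
function name make_marker is hypothetical. The order of the last two runs
(a run of $x_{j+1}$ followed by a run of its complement) is an assumption
inferred from the runlength discussion in the text; the set of candidates
is the same for either order.

```python
def make_marker(x_j, x_j1, d, k):
    """Marker of (3): runs of x_j and its complement (k + 1 bits each),
    then runs of x_{j+1} and its complement (d + 1 bits each).
    The order of the last two runs is an assumption."""
    return ([x_j] * (k + 1) + [1 - x_j] * (k + 1)
            + [x_j1] * (d + 1) + [1 - x_j1] * (d + 1))

d, k = 1, 3
candidates = {
    "".join(map(str, make_marker(a, b, d, k)))
    for a in (0, 1) for b in (0, 1)
}
```

Every candidate holds equally many 0s and 1s by construction, since each
run is paired with a complementary run of the same length.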
Note that M is balanced and that the overall length of the marker is
z = 2k + 2d + 4. The first maximum runlength violation, which includes
$x_j$, ranges between k + 2 and 2k + 2 symbols. Note that if
$\bar{x}_j = x_{j+1}$, a second maximum runlength violation, of length
k + d + 2, occurs inside the marker. A third runlength violation, of
length at most k + d + 2, may occur where the marker M merges with the
inverted portion of the sequence. However, the decoder only searches for
the first violation. When the first maximum runlength violation is detected
the decoder counts back k + 1 bits from the end of this violated run to
locate the start of the marker. It then removes the marker, inverts
$\bar{x}_{j+1} \ldots \bar{x}_n$ and uses prior art to decode and retrieve
the data.
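The marker insertion and the decoder search described above can be
sketched end to end. This is an illustrative sketch under stated
assumptions, not the authors' implementation: the input is an NRZ word of
even length whose runs all lie between d + 1 and k + 1, the marker follows
(3) with the run of $x_{j+1}$ preceding the run of its complement, and
next_bit is a hypothetical parameter standing in for the first bit of the
next word when j = n.

```python
from itertools import groupby

def encode(x, d, k, next_bit=0):
    """Balance x by Knuth's inversion and insert the marker at index j."""
    n = len(x)
    weight = n - sum(x)  # weight with all bits inverted (j = 0)
    for j in range(1, n + 1):
        weight += 1 if x[j - 1] else -1
        if 2 * weight == n:
            break
    else:
        raise ValueError("word length must be even and non-zero")
    x_j = x[j - 1]
    x_j1 = x[j] if j < n else next_bit
    marker = ([x_j] * (k + 1) + [1 - x_j] * (k + 1)
              + [x_j1] * (d + 1) + [1 - x_j1] * (d + 1))
    return x[:j] + marker + [1 - b for b in x[j:]]

def decode(y, d, k):
    """Find the first run longer than k + 1, count back k + 1 bits from
    its end to the marker start, drop the marker, invert the remainder."""
    z = 2 * k + 2 * d + 4
    end = 0
    for _, group in groupby(y):
        run = len(list(group))
        end += run
        if run >= k + 2:
            start = end - (k + 1)
            return y[:start] + [1 - b for b in y[start + z:]]
    raise ValueError("no runlength violation found")

x = [1, 1, 0, 0, 1, 1, 1, 1]   # runs 2, 2, 4 satisfy (d, k) = (1, 3)
y = encode(x, d=1, k=3)
restored = decode(y, d=1, k=3)
```

The decoder needs no knowledge of j: the constrained prefix
$x_1 \ldots x_j$ cannot contain a run longer than k + 1, so the first
violation always ends exactly k + 1 bits after the start of the marker.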
The proposed marker is a very simple one. More advanced markers
can be designed for specific purposes, at the expense of a higher
complexity. For example, constructions of variable-length markers with a
shorter average length are possible by taking into account the lengths
of the runs preceding and following the balancing index. Also, the
marker construction may be optimised to bring down the maximum
runlength violation. Note that for the proposed marker a run of length
2k + 2 can occur in the transmitted sequence, which may be considered as
being too long. By extending the set of candidate markers the
maximum value of this runlength violation may be set closer to k + 2.
However, such advanced markers are beyond the scope of this Letter,
where we focus on the introduction of the marker concept and the
simple design presented in (3).
Fig. 1 Comparison of power spectral densities between MFM (solid curve)
and proposed construction, with (d, k) = (1, 3), for 16 bit (dashed curve)
and 128 bit (dotted curve) codewords as input to balancing
[Plot: PSD, dB (–20 to 10) against frequency f (0 to 0.5). Legend: pure
MFM; proposed construction, 16 bit; proposed construction, 128 bit]
Results: With the above marker construction, the overhead is 2k + 2d +
4 bits per codeword. We compare this to the previous construction,
presented in [5], where Knuth’s algorithm is directly applied to constrained
sequences, which requires a fixed length interfix of d + 1 bits as well as
a prefix, of length dependent on (d, k) and n, as tabulated in [5]. Note that
the overhead of the marker construction does not depend on n.
Therefore, except for small values of n, the proposed method based
on the marker has less overhead as well as lower complexity. For
example, if (d, k) = (1, 3) the marker presented here represents less
overhead if the prefix in [5] is longer than (2k + 2d + 4) − (d + 1) =
12 − 2 = 10 bits.
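The break-even arithmetic can be written out for arbitrary (d, k). This is
a small sketch with hypothetical function names; the prefix lengths
themselves are tabulated in [5] and are not reproduced here.

```python
def marker_overhead(d, k):
    """Overhead in bits of the marker of (3); independent of n."""
    return 2 * k + 2 * d + 4

def break_even_prefix_length(d, k):
    """Prefix length in bits above which the prefix plus the (d + 1)-bit
    interfix of [5] costs more than the marker."""
    return marker_overhead(d, k) - (d + 1)

overhead = marker_overhead(1, 3)
threshold = break_even_prefix_length(1, 3)
```

For (d, k) = (1, 3) this reproduces the 12 bit marker and the 10 bit
threshold quoted in the text.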
The power spectral densities for (d, k) = (1, 3) coded sequences with
markers corresponding to the proposed construction are compared to the
(d, k) = (1, 3) MFM (Miller code, [1]) in Figs. 1 and 2. In Fig. 1, the
influence of the codeword length is shown as well. In Fig. 2, it can be
seen how the low frequency components are suppressed.
ELECTRONICS LETTERS 13th September 2012 Vol. 48 No. 19
Fig. 2 Low frequency performance comparison for MFM (solid curve) and
proposed construction (dotted curve), with (d, k) = (1, 3), for 128 bit
codewords as input to balancing
[Plot: PSD, dB (–20 to 10) against frequency f (logarithmic axis, 10^–3
to 10^–1). Legend: pure MFM; proposed construction, 128 bit]
Conclusion: We have presented an efficient construction for long
balanced and DC free (d, k) constrained block codes. In the context of
constrained codes, Knuth’s original prefix furthermore mandates an
additional interfix under (d, k) constraints. Modifying Knuth’s original
scheme by using markers instead of this prefix may require
considerably fewer overhead symbols and lower implementation
complexity.
© The Institution of Engineering and Technology 2012
19 April 2012
doi: 10.1049/el.2012.1220
H.C. Ferreira and C.H. Heymann (Department of Electrical and
Electronic Engineering Science, University of Johannesburg, APK
Campus, Box 524, Auckland Park, 2006, South Africa)
E-mail: hcferreira@uj.ac.za
J.H. Weber (Delft University of Technology, The Netherlands)
K.A.S. Immink (Turing Machines, Rotterdam, The Netherlands)
References
1 Immink, K.A.S.: ‘Codes for mass data storage systems’ (Shannon
Foundation Publishers, Eindhoven, The Netherlands, 2004, 2nd edn),
ISBN 90-74249-27-2
2 Knuth, D.E.: ‘Efficient balanced codes’, IEEE Trans. Inf. Theory, 1986,
IT-32, (1), pp. 51–53
3 Ferreira, H.C.: ‘On dc free magnetic recording codes generated by finite
state machines’, IEEE Trans. Magn., 1983, MAG-19, (6),
pp. 2691–2693
4 Braun, V., and Immink, K.A.S.: ‘An enumerative coding technique for
DC-free runlength-limited sequences’, IEEE Trans. Commun., 2000,
48, (12), pp. 2024–2031
5 Immink, K.A.S., Weber, J.H., and Ferreira, H.C.: ‘Balanced runlength
limited codes using Knuth’s algorithm’. Proc. of IEEE Int. Symp. on
Information Theory, St Petersburg, Russia, July–August 2011,
pp. 282–285
6 Adler, R., Coppersmith, D., and Hassner, M.: ‘Algorithms for sliding
block codes – an application of symbolic dynamics to information
theory’, IEEE Trans. Inf. Theory, 1983, 29, (1), pp. 5–22