Efficient dc-free RLL codes for optical recording

Runlength-limited (RLL) codes, generically designated as (d, k) RLL codes, have been widely and successfully applied in modern magnetic and optical recording systems. The design of codes for optical recording is essentially the design of combined dc-free and runlength limited (DCRLL) codes. We will discuss the development of very efficient DCRLL codes, which can be used in upcoming generations of high-density optical recording products.
326 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 51, NO. 3, MARCH 2003
Efficient dc-Free RLL Codes for Optical Recording
Kees A. Schouhamer Immink, Fellow, IEEE, Jin-Yong Kim, Sang-Woon Suh, and Seong Keun Ahn
Abstract—We will report on new dc-free runlength-limited
codes (DCRLL) intended for the next generation of DVD. The
efficiency of the newly developed DCRLL schemes is extremely
close to the theoretical maximum, and as a result, significant
density gains can be obtained with respect to prior art coding
schemes. With a newly developed DCRLL code we can
achieve a 9% higher overall rate than that of DVD’s EFMPlus.
Index Terms—Channel capacity, constrained code, dc-free
code, sequence, optical recording, runlength-limited (RLL)
sequence.
I. INTRODUCTION
The design of codes for optical recording is essentially
the design of combined dc-free and runlength-limited
(DCRLL) codes [1]. Eight-to-Fourteen Modulation (EFM),
invented by Immink and Ogawa in the early 1980s [2],
and EFMPlus [3] were adopted as the recording codes for the
compact disc (CD) and DVD, respectively.
Binary sequences generated by a $(d, k)$ RLL encoder have
at least $d$ and at most $k$, $k > d$, “zeros” between successive
“ones”. The series of encoded bits is converted, via a modulo-2
integration operation called precoding, to a corresponding
modulated signal formed by bit cells having a high or low signal
value, a “one” being represented in the modulated signal by a
change from a high to a low signal value or vice versa. A “zero”
is represented by the absence of a change in the modulated signal.
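The precoding step described above can be illustrated with a minimal Python sketch. The function names `precode` and `satisfies_dk` are hypothetical helpers introduced here for illustration; the runlength check only inspects runs of zeros enclosed between two ones, not the leading or trailing run.

```python
from typing import List


def precode(bits: List[int], start_level: int = 0) -> List[int]:
    """Modulo-2 integration: a 1 toggles the output level, a 0 keeps it."""
    level = start_level
    out = []
    for b in bits:
        level ^= b
        out.append(level)
    return out


def satisfies_dk(bits: List[int], d: int, k: int) -> bool:
    """Check that every run of 0s between successive 1s has length in [d, k]."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    gaps = [j - i - 1 for i, j in zip(ones, ones[1:])]
    return all(d <= g <= k for g in gaps)


encoded = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # runs of two and three 0s between the 1s
modulated = precode(encoded)               # level changes exactly where encoded has a 1
```

Each “one” in `encoded` produces a transition in `modulated`, so the distance between transitions on the disc is one more than the corresponding run of zeros.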
Specifically, codes with minimum runlength parameter $d = 2$
have been widely employed in optical recording, while codes
with $d = 1$ have been proposed for future systems [4]. For
that reason we will focus our design efforts on efficient RLL
codes with minimum runlength parameter $d = 1$ and $d = 2$.
Thereafter we will discuss the development of a new DCRLL
coding arrangement that employs a highly efficient RLL inner
code which is extended by a second coding mechanism, such
as, for example, Guided Scrambling [5], used for spectral
shaping (and other) purposes. We start with the development of
the new RLL codes.
II. VERY EFFICIENT RLL CODING SCHEMES
Let the integers $m$ and $n$ denote the information word length
and codeword length, respectively. The maximum rate,
$R = m/n$, of an RLL code, given values of $d$ and $k$, is called the
Paper approved by V. K. Bhargava, the Editor for Coding and Communication
Theory of the IEEE Communications Society. Manuscript received May 2001;
revised January 2, 2002. This paper was presented in part at the International
Symposium on Information Theory (ISIT), Washington, DC, June 2001.
K. A. S. Immink is with Turing Machines Inc., 3016 DK Rotterdam, The
Netherlands (e-mail: immink@turing-machines.com).
J.-Y. Kim, S.-W. Suh, and S. K. Ahn are with the DCT Team, Multi-Media
Labs, LG Electronics Inc., Seocho-Gu, Seoul 137-724, Korea.
Digital Object Identifier 10.1109/TCOMM.2003.809752
TABLE I
CAPACITY $C(1,k)$ AND $C(2,k)$ AS A FUNCTION OF $k$
Shannon capacity, and it is denoted by $C(d,k)$. Table I tabulates
$C(1,k)$ and $C(2,k)$ for relevant values of $k$. The
efficiency of an RLL code is usually measured by a quantity
called code efficiency, $\eta$, defined by
$$\eta = \frac{R}{C(d,k)}. \qquad (1)$$
For ease of presentation we will first focus on the design of RLL
codes with $d = 1$. Later we will extend the ideas to the design
of codes with $d = 2$.
Up till now, small $d = 1$ codes with a rate exceeding
two-thirds have not been published. There are only two
approaches for constructing a $d = 1$ RLL code whose rate is
larger than two-thirds. Firstly, we may relax the maximum
runlength $k$ to a value larger than 7. Note that a (1,7) code was
first put to practical use in the early seventies, and that since the
advent of hard-disk drives (HDDs), significant improvements
in signal processing for timing recovery circuits have made
it possible to employ codes with a much larger maximum
runlength $k$. Secondly, on top of that, we may endeavor to
design a more efficient code. The efficiency of the rate 2/3,
(1,7) code is $\eta = (2/3)/C(1,7) \approx 0.981$, which reveals that we
can gain at most 1.9% in rate by an alternative, more efficient,
code redesign. If we fully relax the $k$ constraint, i.e., set $k = \infty$,
we can at most gain 3.97% in code rate. In other words, a viable
improvement in code rate of a $d = 1$ encoder ranges from
1.9 to 3.97%.
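The capacity figures quoted above can be checked numerically. The sketch below relies on the standard characterization of the capacity of a $(d,k)$ constraint as the base-2 logarithm of the largest real root $z$ of $\sum_{i=d+1}^{k+1} z^{-i} = 1$; the function names and the use of a large finite $k$ to approximate $k = \infty$ are choices made here for illustration.

```python
import math


def capacity(d: int, k: int) -> float:
    """C(d,k): log2 of the largest real root z of sum_{i=d+1}^{k+1} z**-i = 1."""
    def f(z: float) -> float:
        return sum(z ** (-i) for i in range(d + 1, k + 2)) - 1.0

    lo, hi = 1.0, 2.0            # f(lo) > 0 > f(hi) whenever k > d
    for _ in range(100):         # bisection; f is decreasing on [1, 2]
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2(0.5 * (lo + hi))


def efficiency(m: int, n: int, d: int, k: int) -> float:
    """Code efficiency of a rate m/n code under the (d,k) constraint, as in (1)."""
    return (m / n) / capacity(d, k)


eta_17 = efficiency(2, 3, 1, 7)            # rate 2/3, (1,7) code
eta_inf = efficiency(2, 3, 1, 1000)        # k = 1000 stands in for k = infinity
```

With these definitions, `1 - eta_17` and `1 - eta_inf` give the two ends of the head room discussed in the text.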
In the remainder of this paper we will show how to create a
(1,14) code, whose rate is 3.85% better than the traditional rate
2/3, (1,7) code. We start, in the next subsection, with a simple
problem, namely finding integers $m$ and $n$ that improve the rate,
2/3, of the industry standard code.
A. Suitable Integers $m$ and $n$ for $d = 1$
We will start with a simple exercise, namely a search for pairs
of integers $m$ and $n$ that are suitable candidates for a coding rate
exceeding 2/3. Obviously, the “best” code is a code with a rate,
$R = m/n$, that exactly equals the capacity $C(d,k)$ for desired values
of $d$ and $k$. One is tempted to ask if it is possible to choose
the integers $m$ and $n$ such that $m/n = C(d,k)$. The answer,
a resounding no, was given by Ashley and Siegel [6], who
TABLE II
INTEGERS AND SUCH THAT .THE
QUANTITY EXPRESSES THE CODE EFFICIENCY
showed that, besides a very few trivial exceptions, the capacity
$C(d,k)$ is an irrational number. Thus, as the rate of a code, $R = m/n$,
where $m$ and $n$ are integers, is rational, the capacity can only be
approached.
In order to get a feeling for whether there are many “practical”
pairs of such integers $m$ and $n$, we wrote a one-line computer
program for searching integers $m$ and $n$ that satisfy the
inequalities , where for reasons of implementation we set .
All pairs of integers found are shown in
Table II. Surprisingly, there are just six $m$ and $n$ pairs whose
quotient is larger than 2/3.¹ Perusal of the table reveals that the
code rate $R = 9/13$ is highly attractive as it is just 0.28%
below the Shannon capacity $C(1,\infty)$. The next better code of
rate 34/49 is far less attractive as it is much more complex and
adds a minute 0.2% to the density gain with respect to a rate
9/13 code. We therefore concentrated our attention on a rate 9/13
code. The fact that the rate 9/13 is less than capacity does not
mean that a code with that rate can be practically constructed. In
the next subsection, we will show how a rate 9/13, (1,14) code
can be created using a new design technique.
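A search of this kind is easy to sketch. The thresholds used below (efficiency above 0.99 with respect to $C(1,\infty)$, and $n < 50$) are assumptions chosen for illustration, since the exact bounds of the search are not legible here; the result is therefore not claimed to reproduce Table II. Multiples of smaller pairs are reduced to lowest terms.

```python
import math
from fractions import Fraction

C_1_INF = math.log2((1 + math.sqrt(5)) / 2)   # capacity of the d = 1, k = infinity constraint


def candidate_rates(capacity, min_efficiency, max_n):
    """List reduced pairs (m, n), n < max_n, with min_efficiency < m/(n*capacity) < 1."""
    pairs = set()
    for n in range(1, max_n):
        for m in range(1, n):
            rate = m / n
            if min_efficiency * capacity < rate < capacity:
                frac = Fraction(m, n)                  # reduces 18/26 to 9/13, etc.
                pairs.add((frac.numerator, frac.denominator))
    return sorted(pairs, key=lambda p: p[0] / p[1])


# Assumed thresholds: efficiency above 0.99 and codeword length below 50.
pairs = candidate_rates(C_1_INF, min_efficiency=0.99, max_n=50)
for m, n in pairs:
    print(f"{m}/{n}: rate {m / n:.4f}, efficiency {m / (n * C_1_INF):.4f}")
```

Under these assumed thresholds the list is short, and both 9/13 and 34/49 appear in it, with 34/49 only marginally closer to capacity than 9/13.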
B. Encoder Description
In this section, we will describe a finite-state encoder that
generates sequences satisfying the $d = 1$ constraint (the $k$
constraint is ignored for a while for ease of presentation). We start
with a few definitions. A codeword is a binary string of length
$n$ that satisfies the $d = 1$ constraint. The set of codewords, $E$,
is divided into four subsets $E_{00}$, $E_{01}$, $E_{10}$, and $E_{11}$. The four
subsets are characterized as follows. Codewords in $E_{00}$ start and
end with a “0”, codewords in $E_{01}$ start with a “0” and end with
a “1”, etc. The encoder has $r$ states, which are divided into two
state subsets of a first and second type. The encoder has $r_1$ states
of the first type and $r_2 = r - r_1$ states of the second type. The
two types of coding states are characterized by the fact that all
codewords in the states of the first type must start with a “0”,
while codewords in the states of the second type are free to
start with a “1” or a “0”.
The encoder state-transition rules are now easily described.
Codewords that end with a “0”, i.e., codewords in subsets
$E_{00}$ and $E_{10}$, may enter any of the $r$ encoder states. Codewords
that end with a “1” may enter only the $r_1$ states of the first
type (and not the states of the second type). Note that, by
definition, the codewords in states of the first type start with a
“0”, and codewords in states of the second type may start with a
¹ We omitted trivial pairs, such as 18 and 26, etc., that are multiples of given
smaller pairs. This, by the way, does not mean the omitted pairs are irrelevant
for a specific code design.
Fig. 1. Codewords that end with a “0” may be followed by codewords in the
states of the first type and the states of the second type, while words that
end with a ‘1’ may only be followed by codewords in the states of the first
type.
“1”, which prohibits a codeword ending with “1” from entering
states of the second type. The encoder concept is schematically
represented in Fig. 1. It is essential that the sets of codewords
that belong to a given state (of any type) do not have codewords
in common (i.e., sets of codewords associated with coding states
are disjoint). This attribute implies that any codeword can
unambiguously be identified with the state from which it emerged.
Then, as we will show, it is possible to assign the same codeword
to more than one information word (the miraculous multiplication
of codewords). The sliding-block decoder can, by observing
both the current and the next codeword (the latter for identifying
the next state), uniquely decide which of the information words was
actually transmitted. Codewords in subsets $E_{00}$ and $E_{10}$ can, as
codewords in these two subsets end with a “0”, be followed by
codewords in any of the $r = r_1 + r_2$ states, and can, thus, be
assigned $r$ times to different information words. Similarly,
codewords in $E_{01}$ and $E_{11}$ can only be followed by the $r_1$
states of the first type, and can therefore be assigned $r_1$ times
to different information words. Given the above encoder model,
we can write down two necessary conditions for the existence of
such a rate $m/n$, $d = 1$ code.
Let $|E_{xy}|$ denote the size of $E_{xy}$. Then, following the above
arguments, there are at maximum $r|E_{00}| + r_1|E_{01}|$ codewords
leaving the $r_1$ states of the first type. For a rate $m/n$ code, there
should be at least $r_1 2^m$ codewords leaving the $r_1$ states of the
first type. Thus we can write down the first condition
$$r|E_{00}| + r_1|E_{01}| \ge r_1 2^m. \qquad (2)$$
Similarly, the second condition follows from the fact that there
should be a sufficient number of codewords leaving all $r$ states.
We find
$$r\left(|E_{00}| + |E_{10}|\right) + r_1\left(|E_{01}| + |E_{11}|\right) \ge r 2^m. \qquad (3)$$
Note that the inequalities (2) and (3) have the form of the
approximate eigenvector inequality, which plays an essential role in a
variety of code constructions, such as the state-splitting method
[7]. There is, however, quite a difference as the two inequalities
above imply a very specific encoder structure (including size),
while, in general, the approximate eigenvector merely gives a
TABLE III
VALUES OF AND THAT SATISFY CONDITIONS (2) AND (3)
TABLE IV
DISTRIBUTION OF THE VARIOUS SUBSETS AND STATES
loose upper bound to the encoder size of the code found by the
state-splitting method.
With a small computer program we can, given $m$ and $n$, easily find
integers $r$ and $r_1$ that satisfy the above two conditions. An
example of a rate 9/13 code will show the effectiveness of the new
construction.
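A minimal sketch of such a search is given below. It counts the $d = 1$ codewords by their first and last bit and then tests the two counting conditions for every split of the states into first-type and second-type states. The function names are hypothetical, the argument `code_size` plays the role of $2^m$ (it may also be set to a size that is not a power of two), and the conditions are necessary rather than sufficient: an actual allocation of codewords to states must still be found.

```python
from itertools import product


def subset_sizes(n):
    """Count length-n words without adjacent 1s, keyed by (first bit, last bit)."""
    sizes = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for word in product((0, 1), repeat=n):
        if all(not (x and y) for x, y in zip(word, word[1:])):
            sizes[(word[0], word[-1])] += 1
    return sizes


def feasible_splits(n, code_size, max_states):
    """Return all (r, r1) meeting both counting conditions for code_size words per state."""
    e = subset_sizes(n)
    found = []
    for r in range(1, max_states + 1):
        for r1 in range(0, r + 1):
            # words starting with 0: those ending in 0 count r times, those ending in 1 count r1 times
            first_type = r * e[(0, 0)] + r1 * e[(0, 1)]
            all_states = r * (e[(0, 0)] + e[(1, 0)]) + r1 * (e[(0, 1)] + e[(1, 1)])
            if first_type >= r1 * code_size and all_states >= r * code_size:
                found.append((r, r1))
    return found


sizes = subset_sizes(13)                               # the four subset sizes for n = 13
splits = feasible_splits(13, 2 ** 9, max_states=13)    # rate 9/13
```

For $n = 13$ and $m = 9$ the smallest number of states admitted by the two conditions is five, and `feasible_splits(13, 520, 13)` shows that 13 states leave room for a code size of 520.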
C. Rate 9/13, $d = 1$ Codes
Assume the construction of a rate 9/13 encoder. Then
$|E_{00}| = 233$, $|E_{01}| = |E_{10}| = 144$, and $|E_{11}| = 89$. Table III shows
values of $r$, $r_1$, and $r_2$ that satisfy Conditions (2)
and (3). After finding suitable values of $r$ and $r_1$, the next step
in the code construction is the distribution of the various
codewords among the various states. In order to find such a
distribution, a trial and error approach has been used. Table IV shows,
for $r = 5$ and $r_1 = 3$ as an example, how the codewords in the various
subsets can be allocated to the various states. (Note that the
distribution given is not unique; there are many other ways of
allocating the codewords to the states.) From Table IV, we
discern that the subset $E_{00}$ of size 233 has 72 words in States
1 and 2, 87 words in State 3, and 2 words in State 5. Thus in
total: $2 \times 72 + 87 + 2 = 233$. Similarly, it can be verified
that the four row sums equal the number of codewords in each
of the four subsets. Codewords that end with a “0”, i.e.,
codewords in $E_{00}$ and $E_{10}$, can be assigned $r = 5$ times to different
information words, while codewords that end with a “1”, i.e.,
codewords in $E_{01}$ and $E_{11}$, can be assigned $r_1 = 3$ times to
different information words. Thus, the total number of
information words that can be assigned to the codewords in State 1 is
. Similarly, it can be verified that from
any of the $r = 5$ encoder states there are at least 516 information
words that can be assigned to codewords, which shows that the
code can accommodate 9-bit information words. An enumeration
table such as Table IV suffices to construct a code by
assigning codewords to the coding states and source words.
It can be verified with the procedure outlined above that a
13-state encoder of (code) size 520 can be created. The maximum
size of any 13-bit $d = 1$ code equals $\lfloor 2^{13\,C(1,\infty)} \rfloor = 521$,
and we therefore conclude that the above code is quite efficient
TABLE V
INTEGERS AND SUCH THAT .THE
QUANTITY EXPRESSES THE CODE EFFICIENCY
particularly considering that the encoder has a relatively
small number, 13, of states. For 12-bit codewords,
we find that a 13-state encoder achieves the maximum code size,
321, possible. These codes are supposedly the most efficient in
existence in terms of relative performance. Such extremely
efficient codes could up till now only be constructed with “large”
codewords, but as shown here, selected “small” codes can also
have a rate which is very close to the channel capacity.
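The maximum code sizes quoted in this section are consistent with the bound $\lfloor 2^{n C(d,\infty)} \rfloor$, where $2^{C(d,\infty)}$ is the largest real root of $z^{d+1} = z^d + 1$. Assuming that this is the bound meant, a short sketch reproduces the figures; the function names are introduced here for illustration only.

```python
import math


def growth_factor(d: int) -> float:
    """Largest real root of z**(d+1) = z**d + 1, i.e. 2**C(d, infinity)."""
    lo, hi = 1.0, 2.0
    for _ in range(200):                       # bisection; the polynomial is increasing on [1, 2]
        mid = 0.5 * (lo + hi)
        if mid ** (d + 1) - mid ** d - 1 < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_code_size(n: int, d: int) -> int:
    """Assumed upper bound floor(2**(n*C(d, infinity))) on the size of an n-bit code."""
    return math.floor(growth_factor(d) ** n)


for n, d in [(12, 1), (13, 1), (16, 2)]:
    print(n, d, max_code_size(n, d))
```

Under this assumption a code size of 520 for $n = 13$, $d = 1$ falls one short of the bound.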
As the above code can accommodate more than the required
512 words, surplus “worst-case” codewords can be deleted
for minimizing the $k$ constraint. After a judicious process of
deleting codewords that end or start with “long” runs of “0”s,
we constructed a 5-state (1,18) code, and a 13-state (1,14) code.
Note, in Table I, that the smallest $k$ possible for a rate 9/13
code equals 12.
A few words are in order about the decoder. A decoder
must observe both the current and the upcoming codeword to
uniquely decode the encoded sequence of codewords into a
sequence of information words. Single channel bit errors can
thus corrupt at most two decoded $m$-bit symbols. The decoder
comprises two look-up tables: the next-state look-up table and
the data look-up table. The next-state look-up table has the
next codeword as an input, and the state to which this word
belongs as an output. The data look-up table has the output of
the next-state look-up table and the current codeword as an
input, and the output of the data look-up table is the decoded
information word.
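The interplay of encoder states and the two decoder look-up tables can be illustrated with a deliberately tiny example. The code below is a hypothetical 2-state, rate 2/3, $d = 1$ toy code built along the lines of the construction above ($r = 2$, $r_1 = 1$, no $k$ constraint); it is not the rate 9/13 code of this paper, and its tables were chosen by hand for illustration.

```python
# Hypothetical 2-state toy code (rate 2/3, d = 1); NOT the paper's rate 9/13 code.
# State 0 is of the first type (all words start with 0); state 1 is of the second type.
# Each entry maps an information word to (codeword, next state).
ENCODE = {
    0: {"00": ("000", 0), "01": ("000", 1), "10": ("010", 0), "11": ("010", 1)},
    1: {"00": ("100", 0), "01": ("100", 1), "10": ("001", 0), "11": ("101", 0)},
}

# Next-state look-up table: codeword -> state that owns it (the state sets are disjoint).
STATE_OF = {cw: s for s, table in ENCODE.items() for cw, _ in table.values()}

# Data look-up table: (current codeword, state of next codeword) -> information word.
DATA = {(cw, nxt): info for table in ENCODE.values() for info, (cw, nxt) in table.items()}


def encode(words, state=0):
    out = []
    for w in words + ["00"]:        # one dummy word so the last real word has a successor
        cw, state = ENCODE[state][w]
        out.append(cw)
    return out


def decode(codewords):
    """Decode every codeword except the trailing dummy from (current, next) pairs."""
    return [DATA[(cur, STATE_OF[nxt])] for cur, nxt in zip(codewords, codewords[1:])]


message = ["11", "01", "10", "00", "11"]
channel = encode(message)
recovered = decode(channel)
```

Codewords “000”, “010”, and “100” end with a “0” and are therefore used twice (once per successor state), whereas “001” and “101” are used once; the decoder resolves the ambiguity by looking up the state of the next codeword. Since each decoded word depends on two consecutive codewords, a single corrupted codeword can affect two decoded words.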
III. EFFICIENT $d = 2$ CODES
Up till now we have concentrated on the design of efficient
$d = 1$ codes, and as both code parameters, $d = 1$ and $d = 2$,
are of great practical interest for optical recording, we will now
repeat the exercise for the case $d = 2$.
A. Suitable Integers $m$ and $n$ for $d = 2$
RLL codes with minimum runlength parameter $d = 2$ have
been widely published. Table I tabulates $C(2,k)$ as a function
of $k$, and from this table the reader can easily discern the head
room available for the design of a $d = 2$ code of rate
. The rate 8/15 is, see Table I, 3.3% below channel capacity
$C(2,\infty)$. Table V shows values of $m$ and $n$, where
and . The $m$ and $n$ pairs are ordered
according to their quotient $m/n$. Clearly, the quotients 11/20,
6/11, and 7/13 are suitable candidate rates for the creation of
small $d = 2$ codes. In terms of efficiency, the $d = 2$ code of rate
17/31 is also attractive, but the code is far too complex for
current implementation. Kim [8] has been granted a U.S. Patent on
an embodiment of a rate 7/13, (2,25) code, which operates with
a single merging bit (3PM principle [9]). In the next subsection,
we will describe in detail how the very efficient $d = 2$ codes
with the above mentioned rates can be constructed.
B. Encoder Description
In this section we will describe a finite-state encoder that
generates sequences that satisfy the $d = 2$ constraint (note that the
$k$ constraint will be ignored for a while). We start with a few
definitions. The encoder is assumed to have $r$ states, which are
divided into three state subsets of states of a first, second, and third
type. The state subsets are of size $r_1$, $r_2$, and $r_3 = r - r_1 - r_2$,
respectively. A codeword is a binary string of length $n$ that
satisfies the $d = 2$ constraint. The set of codewords is divided
into nine subsets denoted by $E_{0000}$, $E_{0001}$, $E_{0010}$, $E_{0100}$,
etc., where the first two symbols of the subset subscript denote
the first two symbols of the codeword, and the last two
symbols of the subset subscript denote the last two symbols of the
codeword. Thus, codewords in $E_{0000}$ start and end with “00”;
codewords in $E_{0001}$ start with “00” and end with “01”, etc.
The codewords in the various subsets are distributed over the
various states of the three types such that
codewords in states of the first type start with “00”;
codewords in states of the second type start with “01” or
“00”;
codewords in states of the third type start with “10,” “01,”
or “00”.
The state-transition rules are now easily described. Codewords
that end with the string “00,” i.e., codewords in subsets $E_{0000}$,
$E_{0100}$, and $E_{1000}$, may enter any of the $r$ encoder states.
Codewords that end with a “10” may not be followed by codewords
in a state of the third type. Similarly, codewords that end with a
“1” may only be followed by codewords belonging to states of
the first type. The state sets of codewords from which a selection
is to be made do not have codewords in common. As a result, it
is possible to assign the same codeword to different information
words. For example, codewords that end with “00,” i.e.,
codewords in subsets $E_{0000}$, $E_{0100}$, and $E_{1000}$, may enter any state
so that these codewords can be assigned $r = r_1 + r_2 + r_3$ times
to different information words. Codewords that end with “10”,
i.e., words in subsets $E_{0010}$, $E_{0110}$, and $E_{1010}$, may enter states of
the first and second type so that these codewords can be assigned
$r_1 + r_2$ times to different information words. Similarly,
codewords that end with a “1”, i.e., words in the remaining subsets
$E_{0001}$, $E_{0101}$, and $E_{1001}$, can be assigned $r_1$ times. Given the
above encoder model, it is straightforward to write down three
conditions for the existence of such a rate $m/n$ code. Define
and
TABLE VI
EXAMPLE OF THE DISTRIBUTION OF THE VARIOUS SUBSETS AND
STATES OF A RATE 6/11, CODE
TABLE VII
SURVEY OF NEWLY DEVELOPED CODES
Then the conditions are
(4)
(5)
(6)
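The three conditions can be explored numerically. The sketch below does not reproduce (4)–(6) verbatim; it is a reconstruction under the assumption that they are counting conditions analogous to (2) and (3): for each state type, the codewords allowed in states up to that type, each weighted by the number of states it may be followed by, must supply at least $2^m$ assignments per state. All function and variable names are hypothetical.

```python
from itertools import product


def subset_sizes(n, d=2):
    """Count length-n d-constrained words by (leading zeros, trailing zeros), both capped at d."""
    sizes = {}
    for word in product((0, 1), repeat=n):
        ones = [i for i, b in enumerate(word) if b]
        if any(j - i - 1 < d for i, j in zip(ones, ones[1:])):
            continue                                   # two 1s too close together
        lead = min(ones[0], d) if ones else d
        trail = min(n - 1 - ones[-1], d) if ones else d
        sizes[(lead, trail)] = sizes.get((lead, trail), 0) + 1
    return sizes


def feasible_state_counts(n, m, r, d=2):
    """All (r1, ..., r_{d+1}) summing to r that meet the assumed nested counting conditions."""
    sizes = subset_sizes(n, d)
    found = []
    for counts in product(range(r + 1), repeat=d + 1):
        if sum(counts) != r:
            continue
        # reach[z]: number of states that may follow a word with z trailing zeros
        reach = [sum(counts[: t + 1]) for t in range(d + 1)]
        ok = True
        for t in range(d + 1):
            # states of types 1..t+1 only use words with at least d - t leading zeros
            supply = sum(cnt * reach[trail]
                         for (lead, trail), cnt in sizes.items() if lead >= d - t)
            ok = ok and supply >= reach[t] * 2 ** m
        if ok:
            found.append(counts)
    return found


nine_state = feasible_state_counts(11, 6, 9)    # rate 6/11, d = 2, nine states
```

For $d = 1$ the same function reduces to conditions (2) and (3), which gives a consistency check on the reconstruction. As before, the conditions are necessary only; a distribution table must still be found by trial and error.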
In a similar vein as with the $d = 1$ codes discussed
previously, we have experimented with the selection of suitable
values of $m$, $n$, $r_1$, $r_2$, and $r_3$. Many good codes have been
found. As a typical example, which is amenable to a hand
check, we will show results of a 9-state, $d = 2$ code of rate 6/11.
Given the choice of the code rate, we use a small computer
program to find suitable values of $r_1$, $r_2$, and $r_3$ that satisfy
conditions (4)–(6). A possible distribution of the various codeword
sets, where we opted for , , and , is shown
in Table VI. Such a distribution table suffices to construct the
code. After judiciously barring worst-case codewords from the
coding table, we were able to construct a rate 6/11, (2,15) code.
Note, see Table I, that is the smallest value possible
for the given rate 6/11. Using the above construction methods,
we built a 9-state rate 11/20, (2,23) code, whose efficiency is
0.25% less than unity. In addition, we constructed a rate 7/13,
(2,11) code, whose efficiency is 1.1% less than unity.
Table VII summarizes the new RLL codes, $d = 1$ and $d = 2$,
we have found. As we can see, the efficiency of the majority of
the new codes is just a few tenths of a percent below capacity.
The efficiency of the new construction technique can be
further exemplified by a second example, where the code size, ,
is not equal to a power of two. The “spare” codewords can be
used as alternative channel representations for suppressing the
low-frequency (lf) components. The codeword length equals $n = 16$. Table VIII
shows the efficiency, , as a function of the
number of encoder states, $r$. It shows that the construction
technique yields fine results as the codes obtained reach
efficiencies that are only a tenth of a percent below capacity. Note that
TABLE VIII
CODE SIZE, , FOR , , AND SELECTED
VALUES OF THE NUMBER OF ENCODER STATES
the maximum size of a $d = 2$ code with codewords of length
$n = 16$ equals 453.
At this juncture, we have completed the description of the
new RLL codes, and we are in a position to describe how we
can turn the newly developed RLL codes into DCRLL codes.
IV. GUIDED SCRAMBLING
In Guided Scrambling (GS), each information word can be
represented by a member of a selection set consisting of
, , codewords. The encoder generates the selection
set, and the “best” (according to a predefined penalty function)
codeword in the selection set is selected for transmission. The
penalty function weighs each element of the selection set
according to its spectral and other properties such as maximum
runlength and so on. The maximum runlength constraint, $k$,
imposed by the GS penalty function can be made smaller than that
of the inner RLL code. Naturally, the GS method cannot fully
guarantee the constraints, but the probability of occurrence
of such vexatious subsequences can be made extremely small.
Other (runlength) constraints, such as MTR, can be added to the
penalty function if required.
In the preferred GS format, user bits are multiplexed with
redundant bits, which are a part of the input of the channel
encoder. The redundant bits are used to generate a selection
set of size . In the proposed coding format, the channel
encoder input comprises redundant bits plus user bits that
form a super block. The -bit super block is scrambled
using a self-synchronizing (feedback register) scrambler
(for more details, see [1, Chapter 13]). Then, under the rules of
the RLL code, the -bit scrambled super block
is translated into channel bits. The above scrambling/encoding
step is repeated times for all possible combinations
of the redundant bits. The encoder transmits the sequence that
best matches the channel constraints such as lf content,
$k$-constraint, etc., discussed above.
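The selection procedure can be sketched in a few lines. The example below is a simplified illustration, not the scheme simulated in this paper: the feedback taps of the scrambler are hypothetical, the inner RLL encoding step is omitted (the scrambled bits are precoded directly), and the penalty is a simple running-digital-sum measure standing in for the penalty function.

```python
from itertools import product

TAPS = (3, 5)   # hypothetical feedback delays; not the polynomial used in the paper


def scramble(bits):
    """Self-synchronizing scrambler: y[i] = x[i] ^ y[i-3] ^ y[i-5], zero initial register."""
    y = []
    for i, x in enumerate(bits):
        fb = 0
        for t in TAPS:
            if i - t >= 0:
                fb ^= y[i - t]
        y.append(x ^ fb)
    return y


def descramble(bits):
    """Inverse of scramble(): x[i] = y[i] ^ y[i-3] ^ y[i-5]."""
    out = []
    for i, y in enumerate(bits):
        fb = 0
        for t in TAPS:
            if i - t >= 0:
                fb ^= bits[i - t]
        out.append(y ^ fb)
    return out


def penalty(bits):
    """Sum of squared running digital sum after precoding (a simple proxy for lf content)."""
    level, rds, total = 1, 0, 0
    for b in bits:
        if b:
            level = -level          # a 1 toggles the recorded level
        rds += level
        total += rds * rds
    return total


def gs_encode(user_bits, n_redundant=2):
    """Try every redundant prefix, scramble, and keep the candidate with the lowest penalty."""
    candidates = [scramble(list(prefix) + user_bits)
                  for prefix in product((0, 1), repeat=n_redundant)]
    return min(candidates, key=penalty)


def gs_decode(channel_bits, n_redundant=2):
    """Descramble and discard the redundant bits; no side information is needed."""
    return descramble(channel_bits)[n_redundant:]
```

Because the scrambler has feedback, a change in the redundant prefix propagates through the whole super block, which is what makes the candidates in the selection set differ substantially from one another. The decoder simply descrambles and drops the redundant bits.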
The integers and are integers chosen such that
(7)
where is an integer that denotes the number of -bit infor-
mation words in a super block. In a practical environment of a
byte-oriented system, is a multiple of eight, i.e.
. Thus the overall rate of the code is
(8)
In the next subsection, we will select values of and , and
show results of computer simulations.
Fig. 2. Simulation results of a PSD function of a ( ,)
code of overall rate . The spectrum was computed on
the basis of 10 million channel bits. The straight line is a “best fit”
estimate of the low-frequency part of the spectrum. We simply discern that
dB.
A. Results and Comparison with Prior Art Methods
We have written a computer program to simulate the per-
formance of GS in conjunction with the newly developed RLL
codes. The power spectral density (PSD), , and other rel-
evant characteristics can easily be computed.
As a typical example, we will show results obtained with the
rate 9/13, (1,14) RLL code. Fig. 2 shows the spectrum, ,
versus (channel) frequency, , for , , and .
The overall code rate is . Note that the overall code is byte ori-
ented as is a multiple of eight. The scrambler
polynomial used in our simulations is . In the run-
length penalty function, we set the maximum “zero” runlength
to , which means that the code essentially behaves as a
(, ) code. The spectrum, , versus frequency
has a parabolic shape in the low-frequency range, which shows
as a straight line as a result of the logarithmic frequency axis
used. Using a computer simulation of the encoding process, we
compute the PSD of an encoded sequence. Then using a best-fit
(LMS) estimation technique involving the low-frequency com-
ponents of the spectrum, we derive an estimate of the low-fre-
quency performance. For example, in Fig. 2, we estimate that
dB. In similar vein, we estimated
as a function of the overall rate, . Results are shown in Fig. 3
for and . The maximum runlength in the GS
penalty function was set to . Similar curves can be plotted
for other values of $k$, and, obviously, the more relaxed the $k$
constraint, the more lf suppression. In order to compare our results
with the maximum theoretical performance of DCRLL codes,
we invoked the algorithms derived by Braun and Janssen [10],
which compute the maxentropic performance of codes.
The maxentropic performance sets a theoretical limit to the per-
formance of any implemented DCRLL code. Fig. 3 shows that
the implemented codes operate very close to the best theoretical
performance. For the implemented codes are 2–3 dB,
(for , 1–2 dB) below the theoretical ceiling. As a further
Fig. 3. The two upper curves show the lf suppression, , as a function
of the overall code rate . The upper curve shows results for , and the
lower curve for . The maximum imposed runlength for both cases is
. As a comparison we plotted the theoretical ceiling, ,of
maxentropic ( , ) sequences [1]. The curve denoted by (1,7)PP
gives results of a prior art code [4].
Fig. 4. The two upper curves show the lf suppression, , as a function
of the overall rate . The upper curve is for , and the lower curve is
for . The maximum imposed runlength for both cases is .As
a comparison we plotted the theoretical ceiling, , of maxentropic
(, ) sequences.
comparison we plotted the performance of a prior art rate 2/3,
(1,7) code [4], which is extended with dc-control bits on data
sequence level.
We proceed with a second example. Fig. 4 shows the lf spec-
tral performance of the rate 6/11, (2,15) code in conjunction
with Guided Scrambling. Results are given for and .
As reported in the above case, the combination of an efficient
RLL code and GS works quite satisfactorily as only 2–3 dB
can be gained with respect to the theoretical ceiling.
V. CONCLUSIONS
We have studied the construction of extremely efficient RLL
codes. We have shown that there is a very limited number of
pairs of integers $m$ and $n$ whose quotient $m/n$ forms a suitable
coding rate for $d = 1$ and $d = 2$ RLL codes that are more
efficient than prior art codes. Suitable values for the rate of a
$d = 1$ code are 9/13 and 11/16, while for $d = 2$ codes we
have 11/20, 7/13, and 6/11.
We have disclosed a novel technique for designing very
efficient RLL codes, whose rate is only a few tenths of a percent
below capacity. For example, we have constructed a 13-state rate 9/13,
(1,14) RLL code, whose rate is only 0.2% below channel
capacity . In addition, we have constructed a new rate
6/11, (2,15) code, a rate 11/20, (2,23) code, and a rate 7/13,
(2,11) code.
Results of computer simulations have shown that the ar-
rangement of the newly developed RLL codes in conjunction
with Guided Scrambling (GS) is extremely efficient in terms
of overall rate and spectral performance as we have shown
that only a few dB in spectral performance can be gained with
respect to the theoretical ceiling. With a newly developed
code we achieved a 9% higher overall rate than that of DVD’s
EFMPlus.
REFERENCES
[1] K. A. S. Immink, Codes for Mass Data Storage Systems. Geldrop, The
Netherlands: Shannon Foundation, 1999.
[2] K. A. S. Immink and H. Ogawa, “Method for Encoding Binary Data,”
U.S. Patent 4501000, Feb. 19, 1985.
[3] K. A. S. Immink, “EFMPlus: the coding format of the multimedia com-
pact disc,” IEEE Trans. Consumer Electron., vol. 41, pp. 491–497, Aug.
1995.
[4] T. Narahara, S. Kobayashi, M. Hattori, Y. Shimpuku, G. van den Enden,
J. A. Kahlman, M. van Dijk, and R. Woudenberg, “Optical disc system
for digital video recording,” in Proc. Joint Int. Symp. on Optical Memory
and Optical Data Storage, Hawaii, July 11–15, 1999.
[5] I. J. Fair, W. D. Gover, W. A. Krzymien, and R. I. MacDonald, “Guided
scrambling: a new line coding technique for high bit rate fiber optic
transmission systems,” IEEE Trans. Commun., vol. 39, pp. 289–297,
Feb. 1991.
[6] J. J. Ashley and P. H. Siegel, “A note on the Shannon capacity of
run-length-limited codes,” IEEE Trans. Inform. Theory, vol. IT-33, pp.
601–605, July 1987.
[7] R. L. Adler, D. Coppersmith, and M. Hassner, “Algorithms for sliding
block codes. An application of symbolic dynamics to information
theory,” IEEE Trans. Inform. Theory, vol. IT-29, pp. 5–22, Jan. 1983.
[8] M. J. Kim, “7/13 Channel coding and decoding method using RLL(2,25)
code,” U.S. Patent 6188336, Feb. 13, 2001.
[9] G. V. Jacoby, “Method and apparatus for encoding and recovering binary
digital data,” U.S. Patent 4323931, Apr. 6, 1982.
[10] V. Braun and A. J. E. M. Janssen, “On the low-frequency suppression
performance of DC-free runlength-limited modulation codes,” IEEE
Trans. Consumer Electron., vol. 42, no. 4, pp. 939–945, Nov. 1996.
... We also highlight that complexity is not solely governed by code length. For example, [45] introduces (d, k) RLL codes with smaller lengths at the same rate compared with our method. However, the technique in [45] is based on lookup tables; thus, the complexity of encoding and decoding is mainly governed by large lookup table sizes, which is a significantly higher complexity than what we offer (see also [11]). ...
... For example, [45] introduces (d, k) RLL codes with smaller lengths at the same rate compared with our method. However, the technique in [45] is based on lookup tables; thus, the complexity of encoding and decoding is mainly governed by large lookup table sizes, which is a significantly higher complexity than what we offer (see also [11]). In addition to offering low-complexity encoding and decoding, LOCO codes can be easily reconfigured, which is a unique feature. ...
... resulting in N 8 (0) 1. Substituting this result in (46) gives N 8 (−1) 1/6. Substituting that in (45) gives N 8 (−2) 1/36, which completes the proof. ...
Article
Full-text available
Constrained codes are used to prevent errors from occurring in various data storage and data transmission systems. They can help in increasing the storage density of magnetic storage devices, in managing the lifetime of solid-state storage devices, and in increasing the reliability of data transmission over wires. Over the years, designing practical (complexity-wise) capacity-achieving constrained codes has been an area of research gaining significant interest. We recently designed various constrained codes based on lexicographic indexing. We introduced binary symmetric lexicographically-ordered constrained (S-LOCO) codes, q-ary asymmetric LOCO (QA-LOCO) codes, and a class of two-dimensional LOCO (TD-LOCO) codes. These families of codes achieve capacity with simple encoding and decoding, and they are easy to reconfigure. We demonstrated that these codes can contribute to notable density and lifetime gains in magnetic recording (MR) and Flash systems, and they find application in other systems too. In this paper, we generalize our work on LOCO codes by presenting a systematic method that guides the code designer to build any constrained code based on lexicographic indexing once the finite set of data patterns to forbid is known. In particular, we connect the set of forbidden patterns directly to the cardinality of the LOCO code and most importantly to the rule that uncovers the index associated with a LOCO codeword. By doing that, we reveal the secret arithmetic of patterns, and make the design of such constrained codes significantly easier. We give examples illustrating the method via codes based on lexicographic indexing from the literature. We then design optimal (rate-wise) constrained codes for the new two-dimensional magnetic recording (TDMR) technology. Over a practical TDMR model, we show notable performance gains as a result of solely applying the new codes. 
Moreover, we show how near-optimal constrained codes for TDMR can be designed and used to further reduce complexity and error propagation. All the newly introduced LOCO codes are designed using the proposed general method, and they inherit all the desirable properties in our previously designed LOCO codes.
... We also highlight that complexity is not solely governed by code length. For example, [45] introduces (d, k) RLL codes with smaller lengths at the same rate compared with our method. However, the technique in [45] is based on lookup tables; thus, the complexity of encoding and decoding is mainly governed by large lookup table sizes, which is a significantly higher complexity than what we offer (see also [11]). ...
... For example, [45] introduces (d, k) RLL codes with smaller lengths at the same rate compared with our method. However, the technique in [45] is based on lookup tables; thus, the complexity of encoding and decoding is mainly governed by large lookup table sizes, which is a significantly higher complexity than what we offer (see also [11]). In addition to offering low-complexity encoding and decoding, LOCO codes can be easily reconfigured, which is a unique feature. ...
... resulting in $N_8(0) = 1$. Substituting this result in (46) gives $N_8(-1) = 1/6$. Substituting that in (45) gives $N_8(-2) = 1/36$, which completes the proof. ...
Preprint
Constrained codes are used to prevent errors from occurring in various data storage and data transmission systems. They can help in increasing the storage density of magnetic storage devices, in managing the lifetime of electronic storage devices, and in increasing the reliability of data transmission over wires. We recently introduced families of lexicographically-ordered constrained (LOCO) codes. These codes achieve capacity with simple encoding and decoding, and they are easy to reconfigure. In this paper, we generalize our work on LOCO codes by presenting a systematic method that guides the code designer to build any constrained code based on lexicographic indexing once the finite set of data patterns to forbid is known. In particular, we connect the set of forbidden patterns directly to the cardinality of the code and to the rule that uncovers the index associated with a codeword. By doing that, we reveal the secret arithmetic of patterns, and make the code design significantly easier. We design optimal (rate-wise) constrained codes for the new two-dimensional magnetic recording (TDMR) technology. We show notable performance gains as a result of solely applying the new codes. Moreover, we show how near-optimal constrained codes can be designed and used to further reduce complexity.
... Moreover, the technique in [23] does not readily generalize to $T_x$-constrained codes. While techniques based on lookup tables, e.g., [26], offer a better rate-length trade-off, they incur significant encoding and decoding complexity. ...
... From Table IV, the C-LOCO code $C^{c}_{90,1}$ has rate 0.6923 and adder size 63 bits. The same rate is achieved in [26] for an RLL code with $d = 1$ at code (resp., message) length 13 (resp., 9) bits. However, the technique in [26] is based on lookup tables; thus, the complexity of the encoding and decoding is governed by lookup tables of size $2^9 \times 13 = 6656$ bits. ...
... The same rate is achieved in [26] for an RLL code with $d = 1$ at code (resp., message) length 13 (resp., 9) bits. However, the technique in [26] is based on lookup tables; thus, the complexity of the encoding and decoding is governed by lookup tables of size $2^9 \times 13 = 6656$ bits. Note that in the case of $d = 2$, the size of these lookup tables governing the complexity can reach 40960 bits. ...
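The lookup-table size quoted in the snippet above follows from storing one codeword per possible message: a table addressed by a 9-bit message with 13-bit entries holds 2^9 entries of 13 bits each. A minimal sketch of that arithmetic, where the helper name `lut_size_bits` is hypothetical and not taken from the cited work:

```python
def lut_size_bits(message_bits: int, codeword_bits: int) -> int:
    """Bits needed to store one codeword for every possible message.

    Hypothetical helper for illustration: the table has 2**message_bits
    rows, each holding a codeword of codeword_bits bits.
    """
    return (2 ** message_bits) * codeword_bits


# Parameters quoted in the snippet: 9-bit messages, 13-bit codewords.
size = lut_size_bits(9, 13)
rate = 9 / 13
print(size)            # 6656
print(round(rate, 4))  # 0.6923
```

The same helper makes it easy to see how quickly table size grows with message length, which is the complexity argument the snippet makes against table-based encoders.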
Article
Line codes make it possible to mitigate interference, to prevent short pulses, and to generate streams of bipolar signals with no direct-current (DC) power content through balancing. They find application in magnetic recording (MR) devices, in Flash devices, in optical recording devices, and in some computer standards. This paper introduces a new family of fixed-length, binary constrained codes, named lexicographically-ordered constrained codes (LOCO codes), for bipolar non-return-to-zero signaling. LOCO codes are capacity-achieving, the lexicographic indexing enables simple, practical encoding and decoding, and this simplicity is demonstrated through analysis of circuit complexity. LOCO codes are easy to balance, and their inherent symmetry minimizes the rate loss with respect to unbalanced codes having the same constraints. Furthermore, LOCO codes that forbid certain patterns can be used to alleviate inter-symbol interference in MR systems and inter-cell interference in Flash systems. Numerical results demonstrate a gain of up to 10% in rate achieved by LOCO codes with respect to other practical constrained codes, including run-length-limited codes, designed for the same purpose. Simulation results suggest that it is possible to achieve a channel density gain of about 20% in MR systems by using a LOCO code to encode only the parity bits, limiting the rate loss, of a low-density parity-check code before writing.
... With |v| = 6, the parameters of the proposed networks are shown in row three of Table II. The MLP network we trained for frame-by-frame decoding has three hidden layers [32, 16, 8] and 924 trainable parameters. Its BER performance is shown in Fig. 3, which shows that DNN-based decoding achieves a BER that is very close to MAP decoding of CS codes, and outperforms the conventional LUT decoding described above by ∼2.2 dB. ...
... These low-level features can be extracted to enable CNNs to efficiently learn the weights of the kernels, which results in a smaller number of weights that need to be trained in the training phase compared to MLP networks. For example, although similar BER performance is achieved by the [32, 16, 8] MLP network and the [6, 10, 6] CNN, the number of weights in the CNN is only 82% of that in the MLP network. With larger networks the reduction in the number of weights that need to be trained is more significant, as we show in the next subsection. ...
Preprint
Constrained sequence (CS) codes, including fixed-length CS codes and variable-length CS codes, have been widely used in modern wireless communication and data storage systems. Sequences encoded with constrained sequence codes satisfy constraints imposed by the physical channel to enable efficient and reliable transmission of coded symbols. In this paper, we propose using deep learning approaches to decode fixed-length and variable-length CS codes. Traditional encoding and decoding of fixed-length CS codes rely on look-up tables (LUTs), which are prone to errors that occur during transmission. We introduce fixed-length constrained sequence decoding based on multilayer perceptron (MLP) networks and convolutional neural networks (CNNs), and demonstrate that we are able to achieve low bit error rates that are close to maximum a posteriori probability (MAP) decoding as well as improve the system throughput. Further, implementation of capacity-achieving fixed-length codes, where the complexity is prohibitively high with LUT decoding, becomes practical with deep learning-based decoding. We then consider CNN-aided decoding of variable-length CS codes. Different from conventional decoding where the received sequence is processed bit-by-bit, we propose using CNNs to perform one-shot batch-processing of variable-length CS codes such that an entire batch is decoded at once, which improves the system throughput. Moreover, since the CNNs can exploit global information with batch-processing instead of only making use of local information as in conventional bit-by-bit processing, the error rates can be reduced. We present simulation results that show excellent performance with both fixed-length and variable-length CS codes that are used in the frontiers of wireless communication systems.
... The DC-free codes are a subclass of spectrum shaping codes. The codes have been employed in many applications of digital transmission and recording systems [12]-[14]. For example, in optical memory devices, the effects of low-frequency noise caused by scratches on the disc surface are easily avoided by using DC-free codes. ...
Article
An efficient concatenation of error correction codes with constrained codes is proposed in this paper. Generally, constrained codes are designed to match specified channels, whereas error correction coding schemes are designed to correct the channel errors. They both play important roles in ensuring the integrity of data in data storage systems. In this study, we first investigate the design of k constrained codes combined with a low-density parity-check (LDPC) code, and then we extend the idea to the design of DC-free k constrained LDPC codes. Simulation results show that the proposed designs achieve an improved bit error rate (BER) performance, compared to prior art schemes. In particular, the proposed design for the DC-free k constrained codes not only fully eliminates the effect of error propagation in a reverse configuration, but also achieves significant DC suppression.
... This implies that at most N − 1 consecutive logic ones or logic zeros can exist in the coded sequence. In some systems, DC-free RLL constraints place limits on runlength other than those implied by the RDS bounds [30] [31]. ...
Article
We consider the construction of capacity-approaching variable-length constrained sequence codes based on multi-state encoders that permit state-independent decoding. Based on the finite state machine description of the constraint, we first select the principal states and establish the minimal sets. By performing partial extensions and normalized geometric Huffman coding, efficient codebooks that enable state-independent decoding are obtained. We then extend this multi-state approach to a construction technique based on n-step FSMs. We demonstrate the usefulness of this approach by constructing capacity-approaching variable-length constrained sequence codes with improved efficiency and/or reduced implementation complexity to satisfy a variety of constraints, including the runlength-limited (RLL) constraint, the DC-free constraint, and the DC-free RLL constraint, with an emphasis on their application in visible light communications.
Article
We consider noisy communications and storage systems that are hampered by varying offset of unknown magnitude such as low-frequency signals of unknown amplitude added to the sent signal. We study and analyze a new detection method whose error performance is independent of both unknown base offset and offset’s slew rate. The new method requires, for a codeword length n ≥ 12, less than 1.5 dB more noise margin than Euclidean distance detection. The relationship with constrained codes based on mass-centered codewords and the new detection method is discussed.
Article
We study the ability of recently developed variable-length constrained sequence codes to determine codeword boundaries in the received sequence upon initial receipt of the sequence and if errors in the received sequence cause synchronization to be lost. We first investigate construction of these codes based on the finite state machine description of a given constraint, and develop new construction criteria to achieve high synchronization probabilities. Given these criteria, we propose a guided partial extension algorithm to construct variable-length constrained sequence codes with high synchronization probabilities. With this algorithm we construct new codes and determine the number of codewords and coded bits that are needed to recover synchronization once synchronization is lost. We consider a large variety of constraints including the runlength limited (RLL) constraint, the DC-free constraint, the Pearson constraint and constraints for inter-cell interference mitigation in flash memories. Simulation results show that the codes we construct exhibit excellent synchronization properties, often resynchronizing within a few bits.
Article
Bi-modal (respectively, multi-modal) constrained coding refers to an encoding model whereby a user input block can be mapped to two (respectively, multiple) codewords. In current storage applications, such as optical disks, multi-modal coding makes it possible to achieve DC control, in addition to satisfying the runlength limited (RLL) constraint specified by the recording channel. In this work, a study is initiated on bi-modal fixed-length constrained encoders. Necessary and sufficient conditions are presented for the existence of such encoders for a given constraint. It is also shown that under somewhat stronger conditions, one can guarantee a bi-modal encoder with finite decoding delay.
Article
This paper studies a deep learning (DL) framework to solve distributed non-convex constrained optimizations in wireless networks where multiple computing nodes, interconnected via backhaul links, desire to determine an efficient assignment of their states based on local observations. Two different configurations are considered: First, an infinite-capacity backhaul enables nodes to communicate in a lossless way, thereby obtaining the solution by centralized computations. Second, a practical finite-capacity backhaul leads to the deployment of distributed solvers equipped along with quantizers for communication through capacity-limited backhaul. The distributed nature and the non-convexity of the optimizations render the identification of the solution unwieldy. To handle them, deep neural networks (DNNs) are introduced to approximate an unknown computation for the solution accurately. In consequence, the original problems are transformed to training tasks of the DNNs subject to non-convex constraints where existing DL libraries fail to extend straight-forwardly. A constrained training strategy is developed based on the primal-dual method. For distributed implementation, a novel binarization technique at the output layer is developed for quantization at each node. Our proposed distributed DL framework is examined in various network configurations of wireless resource management. Numerical results verify the effectiveness of our proposed approach over existing optimization techniques.
Patent
A system for block encoding words of a digital signal achieves a maximum of error compaction and ensures reliability of a self-clocking decoder, while minimizing any DC in the encoded signal. Data words of m bits are translated into information blocks of n1 bits (n1 > m) that satisfy a (d,k)-constraint in which at least d "0" bits, but no more than k "0" bits occur between consecutive "1" bits. The information blocks are concatenated by inserting separation blocks of n2 bits therebetween, selected so that the (d,k)-constraint is satisfied over the boundary between any two information words. For each information word, the separation block that will yield the lowest net digital sum value is selected. Then, the encoded signal is modulated as an NRZ-M signal in which a "1" becomes a transition and a "0" becomes an absence of a transition. A unique synchronizing block is inserted periodically. A decoder circuit, using the synchronizing blocks to control its timing, disregards the separation blocks, but detects the information blocks and translates them back into reconstituted data words of m bits. The foregoing technique can be used to advantage in recording digitized music on an optical disc.
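The separation-block selection described in this abstract can be sketched as follows. This is an illustrative sketch only, not the patented circuit: the function names, the toy blocks, and the reading of "lowest net digital sum value" as smallest magnitude are all assumptions. The parameters d = 2, k = 10, and n2 = 3 are chosen for the example and are not stated in the abstract.

```python
from itertools import product


def satisfies_dk(bits, d, k):
    """True if every run of 0s between two 1s has length >= d and no run
    of 0s anywhere exceeds k. Leading/trailing runs are checked against k only."""
    run, seen_one = 0, False
    for b in bits:
        if b == 1:
            if seen_one and run < d:
                return False
            seen_one, run = True, 0
        else:
            run += 1
            if run > k:
                return False
    return True


def nrzi(bits, level=1):
    """NRZ-M precoding: a 1 toggles the signal level, a 0 holds it.
    Returns the list of +1/-1 signal values."""
    out = []
    for b in bits:
        if b == 1:
            level = -level
        out.append(level)
    return out


def choose_separation(prev_block, next_block, n2, d, k, dsv=0, level=1):
    """Try every n2-bit separation block; keep those that preserve the
    (d,k)-constraint across the boundary and return the one giving the
    smallest |digital sum value| as (block, new_dsv, final_level)."""
    best = None
    for sep in product((0, 1), repeat=n2):
        joined = list(prev_block) + list(sep) + list(next_block)
        if not satisfies_dk(joined, d, k):
            continue
        wave = nrzi(list(sep) + list(next_block), level)
        new_dsv = dsv + sum(wave)
        if best is None or abs(new_dsv) < abs(best[1]):
            best = (sep, new_dsv, wave[-1])
    return best


# Toy blocks (not real codewords from any standard).
prev_block = (0, 1, 0, 0, 1, 0, 0, 0)
next_block = (0, 0, 1, 0, 0, 0, 1, 0)
print(choose_separation(prev_block, next_block, n2=3, d=2, k=10))
```

In a running encoder the returned digital sum value and final signal level would be fed back in as `dsv` and `level` for the next block, so that the choice at each boundary accounts for the accumulated imbalance.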
Book
Preface to the Second Edition About five years after the publication of the first edition, it was felt that an update of this text would be inescapable as so many relevant publications, including patents and survey papers, have been published. The author's principal aim in writing the second edition is to add the newly published coding methods, and discuss them in the context of the prior art. As a result about 150 new references, including many patents and patent applications, most of them younger than five years old, have been added to the former list of references. Fortunately, the US Patent Office now follows the European Patent Office in publishing a patent application after eighteen months of its first application, and this policy clearly adds to the rapid access to this important part of the technical literature. I am grateful to many readers who have helped me to correct (clerical) errors in the first edition and also to those who brought new and exciting material to my attention. I have tried to correct every error that I found or was brought to my attention by attentive readers, and seriously tried to avoid introducing new errors in the Second Edition. China is becoming a major player in the art of constructing, designing, and basic research of electronic storage systems. A Chinese translation of the first edition has been published early 2004. The author is indebted to prof. Xu, Tsinghua University, Beijing, for taking the initiative for this Chinese version, and also to Mr. Zhijun Lei, Tsinghua University, for undertaking the arduous task of translating this book from English to Chinese. Clearly, this translation makes it possible that a billion more people will now have access to it. Kees A. Schouhamer Immink Rotterdam, November 2004
Article
Basic trade-offs between the rate of combined DC-free runlength-limited (DCRLL) modulation codes and the amount of suppression of low-frequency components are presented. The main results are obtained by means of a numerical study of dependencies between statistical properties of ideal, “maxentropic” DCRLL sequences. The numerical results are mathematically founded by proving the observed behavior of the Shannon capacity of the DCRLL constraints for asymptotically large values of digital sum variation. Presented characteristics of maxentropic DCRLL sequences comply with the corresponding properties of maxentropic pure DC-free sequences, as previously considered by Justesen (1982) and Immink (1991). Knowledge of the maxentropic bounds enables us to evaluate the performances of implemented DCRLL codes with respect to their low-frequency suppression capability. Among the considered codes are the EFM code as applied in the compact disc system, and the EFMPlus code which has been adopted as the coding format of the DVD system.
Article
We report on an alternative to Eight-to-Fourteen Modulation (EFM), called EFMPlus, which has been adopted as coding format of the MultiMedia Compact Disc proposal. The rate of the new code is 8/16, which means that a 6-7% higher information density can be obtained. EFMPlus is the spitting image of EFM (same minimum and maximum runlength, clock content etc). Computer simulations have shown that the low-frequency content of the new code is only slightly larger than its conventional EFM counterpart.
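The quoted 6-7% density gain can be checked with a short calculation. The rate 8/16 for EFMPlus is stated in the abstract above; the rate 8/17 for EFM (14 channel bits plus 3 merging bits per 8-bit byte) is supplied here as an assumption, since the abstract does not state it.

```python
from fractions import Fraction

# Assumption: EFM maps 8 data bits to 14 channel bits plus 3 merging bits.
efm_rate = Fraction(8, 17)
# Stated in the abstract: EFMPlus has rate 8/16.
efmplus_rate = Fraction(8, 16)

# Relative gain in information density at equal channel-bit length.
gain = efmplus_rate / efm_rate - 1
print(gain, float(gain))  # 1/16 0.0625
```

Under that assumption the gain is 6.25%, consistent with the 6-7% figure in the abstract.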
Article
Ideas which have origins in C. E. Shannon's work in information theory have arisen independently in a mathematical discipline called symbolic dynamics. These ideas have been refined and developed in recent years to a point where they yield general algorithms for constructing practical coding schemes with engineering applications. In this work we prove an extension of a coding theorem of B. Marcus and trace a line of mathematics from abstract topological dynamics to concrete logic network diagrams.
Article
We have developed a new error correction method (Picket: a combination of a long distance code (LDC) and a burst indicator subcode (BIS)), a new channel modulation scheme (17PP, or (1, 7) RLL parity preserve (PP)-prohibit repeated minimum transition runlength (RMTR) in full), and a new address format (zoned constant angular velocity (ZCAV) with headers and wobble, and practically constant linear density) for a digital video recording system (DVR) using a phase change disc with 9.2 GB capacity with the use of a red ($\lambda = 650$ nm) laser and an objective lens with a numerical aperture (NA) of 0.85 in combination with a thin cover layer. Despite its high density, this new format is highly reliable and efficient. When extended for use with blue-violet ($\lambda \approx 405$ nm) diode lasers, the format is well suited to be the basis of a third-generation optical recording system with over 22 GB capacity on a single layer of a 12-cm-diameter disc.
Article
The technique introduced has relatively simple encoding and decoding procedures which can be implemented at the high bit rates used in optical fiber communication systems. Because it is similar to the established technique of self-synchronizing scrambling but is also capable of guiding the scrambling process to produce a balanced encoded bit stream, the technique is called guided scrambling, (GS). The concept of GS coding is explained, and design parameters which ensure good line code characteristics are discussed. The performance of a number of guided scrambling configurations is reported in terms of maximum consecutive like-encoded bits, encoded stream disparity, decoder error extension, and power spectral density of the encoded signal. Comparison of guided scrambling with conventional line code techniques indicates a performance which approaches that of alphabetic lookup table codes with an implementation complexity similar to that of current nonalphabetic coding techniques.
Article
It is proven that 100-percent efficient fixed-rate codes for run-length-limited (RLL) $(d, k)$ and RLL charge-constrained $(d, k; c)$ channels are possible in only two cases, namely $(d, k; c) = (0, 1; 1)$ and $(1, 3; 3)$. Specifically, the binary Shannon capacity of RLL $(d, k)$ constrained systems is shown to be irrational for all values of $(d, k)$, $0 \leq d < k$. For RLL charge-constrained systems with parameters $(d, k; c)$, the binary capacity is irrational for all values of $(d, k; c)$, $0 \leq d < k$, $2c \geq k + 1$, except $(0, 1; 1)$ and $(1, 3; 3)$, which both have binary capacity $1/2$.
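The capacity referred to in this abstract can be estimated numerically for the plain $(d, k)$ constraint. The following is a minimal sketch, not the method of the cited paper: it models the constraint as a graph whose state is the number of 0s written since the last 1, and uses power iteration to approximate the largest eigenvalue of that graph. The function name `rll_capacity` and the iteration count are hypothetical choices, and the charge constraint $c$ is not modeled.

```python
import math


def rll_capacity(d: int, k: int, iterations: int = 5000) -> float:
    """Estimate the capacity (bits per channel bit) of the (d, k) constraint.

    State i counts the 0s written since the last 1. Capacity is log2 of
    the largest eigenvalue of the state-transition graph, approximated
    here by power iteration.
    """
    if not 0 <= d < k:
        raise ValueError("requires 0 <= d < k")
    v = [1.0] * (k + 1)
    growth = 1.0
    for _ in range(iterations):
        w = [0.0] * (k + 1)
        for i, x in enumerate(v):
            if i < k:    # writing a 0 is allowed: the run grows by one
                w[i + 1] += x
            if i >= d:   # writing a 1 is allowed: the run resets
                w[0] += x
        total = sum(w)
        growth = total / sum(v)       # converges to the largest eigenvalue
        v = [x / total for x in w]    # renormalize to avoid overflow
    return math.log2(growth)


for d, k in [(0, 1), (1, 3), (2, 10)]:
    print((d, k), rll_capacity(d, k))
```

For $(d, k) = (0, 1)$ the estimate matches $\log_2$ of the golden ratio, an irrational number, in line with the abstract's claim that no fixed-rate code can reach 100 percent efficiency for a plain $(d, k)$ constraint.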
M. J. Kim, "7/13 Channel coding and decoding method using RLL(2,25) code," U.S. Patent 6 188 336, Feb. 13, 2001.
G. V. Jacoby, "Method and apparatus for encoding and recovering binary digital data," U.S. Patent 4 323 931, Apr. 6, 1982.