
Describes a new technique for constructing fixed-length (d,k)
runlength-limited block codes. The new codes are very close to
block-decodable codes, as decoding of the retrieved sequence can be
accomplished by observing (part of) the received codeword plus a very
small part (usually only a single bit) of the previous codeword. The
basic idea of the new construction is to uniquely represent each source
word by a (d,k) sequence with specific predefined properties, and to
construct a bridge of β, 1⩽β⩽d, merging bits between
every pair of adjacent words. An essential element of the new coding
principle is look ahead. The merging bits are governed by the state of
the encoder (the history), the present source word to be translated, and
by the upcoming source word. The new constructions have the virtue that
only one look-up table is required for encoding and decoding.


... It is obvious that when the codeword length is large, the encoder/decoder mapping should be such that they can be implemented by an algorithmic procedure rather than by table lookup. Block-code constructions that can be used in conjunction with such an algorithm have been presented by Tang and Bahl [8], Beenker and Immink [9], Gu and Fuja [10], and Immink [11]. A survey and comparison of the various block-code constructions have been given by Abdel-Ghaffar and Weber [12]. ...

... Gu and Fuja's method is optimal: no other block code can achieve a higher rate. Recently, Immink [11] presented a code construction which is "almost" block-decodable. Gu and Fuja [10] showed that Beenker and Immink's "Construction 2" is optimal for d=1 and is not optimal for d>1. Construction 2, though not optimal for all parameters, has the advantage that only one table is needed plus simple logic to cascade the sequences. ...

... Cascading (merging) can be done with the following constructions [9], [11]. ...

A new coding technique is proposed that translates user
information into a constrained sequence using very long codewords. Huge
error propagation resulting from the use of long codewords is avoided by
reversing the conventional hierarchy of the error control code and the
constrained code. The new technique is exemplified by focusing on (d,
k)-constrained codes. A storage-effective enumerative encoding scheme is
proposed for translating user data into long dk sequences and vice
versa. For dk runlength-limited codes, estimates are given of the
relationship between coding efficiency versus encoder and decoder
complexity. We show that for most common d, k values, a code rate of
less than 0.5% below channel capacity can be obtained by using hardware
mainly consisting of a ROM lookup table of size 1 kbyte. For selected
values of d and k, the size of the lookup table is much smaller. The
paper is concluded by an illustrative numerical example of a rate
256/466, (d=2, k=15) code, which provides a serviceable 10% increase in
rate with respect to its traditional rate 1/2, (2, 7) counterpart.
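The rate and capacity figures quoted in this abstract can be checked numerically. The sketch below is not the paper's enumerative scheme; it only evaluates the standard capacity of the (d,k) constraint, assuming the usual phrase description (each phrase is a run of d to k zeros followed by a one). The function name `dk_capacity` is a hypothetical helper introduced here for illustration.

```python
import math


def dk_capacity(d: int, k: int, iterations: int = 200) -> float:
    """Capacity, in user bits per channel bit, of the (d,k) constraint.

    A (d,k) sequence is a concatenation of phrases "0...01" holding between
    d and k zeros, so the phrase lengths are d+1 .. k+1.  The growth factor
    lam is the root in [1, 2) of  sum_{j=d+1}^{k+1} lam**(-j) = 1,
    and the capacity is log2(lam).
    """
    def f(z: float) -> float:
        return sum(z ** (-j) for j in range(d + 1, k + 2)) - 1.0

    lo, hi = 1.0, 2.0  # f(1) >= 0 and f(2) < 0 whenever 0 <= d <= k
    for _ in range(iterations):  # plain bisection; f is decreasing in z
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2((lo + hi) / 2.0)


rate_long = 256 / 466          # rate of the (d=2, k=15) example code
rate_classic = 1 / 2           # rate of the traditional (2, 7) code
print("rate 256/466        :", round(rate_long, 4))
print("capacity C(2, 15)   :", round(dk_capacity(2, 15), 4))
print("capacity C(2, 7)    :", round(dk_capacity(2, 7), 4))
print("gain over rate 1/2  :", round(rate_long / rate_classic - 1, 4))
```

The last line evaluates 512/466 - 1, which is the roughly 10% rate increase mentioned in the abstract.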

... The traditional methods [1] for RLL coding and decoding use the enumeration principle based on the weighted sum of the symbols in the codeword. A new and more efficient method for cascading (d,k)-sequences was described by Beenker and Immink [2], [3], using a special class of runlength-limited codes: (d,k,l,r)-sequences. The two new constraints are: (i) the number of consecutive leading zeros of the sequence is at most l, and (ii) the number of consecutive trailing zeros of the sequence is at most r. ...

The construction procedure for (d,k,l,r)-sequences by traditional methods (based on the enumeration principle) requires two sets of weighting coefficients. Based on a set of parameters and recursive relationships, the proposed algorithm with just one set of weighting coefficients is presented. A new formula to determine the number of the messages permitted on constrained channels is introduced.

... For some recent information on these methods, we refer to [18]. A similar idea is used in [19] to construct almost-block-decodable (d, k)-codes. ...

Modulation codes such as runlength-limited codes have been widely employed in magnetic and optical data storage systems. We review the main techniques involved in the design and use of these codes: the maximal code rate or capacity, graphical presentations of constraints, encoders and decoders, and code construction methods such as the ACH state-splitting algorithm. We conclude this survey by discussing some recent developments and research trends.

... The EFM15 code [17] is an example of a rate 8/15, (2, 14) DCRLL code. An alternative rate 8/15 construction [18] is possible that requires, in contrast with classic EFM, only one merging bit. We could, in principle, employ the same 14-bit word assignment as in classic EFM. ...

Codes were designed for optical disk recording systems, and future options were explored. The designed code was a combination of dc-free and runlength-limited (DCRLL) codes. The design increased the minimum feature size for replication and gave sufficient rejection of low-frequency components, enabling simple, noise-free tracking. Error-burst-correcting Reed-Solomon codes were suggested for the resolution of read errors. The features of DCRLL and runlength-limited (RLL) sequences were presented, and practical codes were devised to satisfy the given channel constraints. The mechanism of RLL codes suppressed the components of the generated sequences. The construction and performance of alternative eight-to-fourteen modulation (EFM)-like codes were studied.

... This philosophy marries nicely with modern developments in magnetic recording, such as (a) the use of low-density parity-check codes, which come without cast-iron guarantees but work very well empirically [7]; and (b) the idea of digital fountain codes [1], which can store a large file on disc by writing thousands of packets on the disc, each packet being a random function of the original file, and the original file being recoverable from (almost) any sufficiently large subset of the packets, in which case occasional packet loss is unimportant. The ideas presented here are similar to, but different from, those presented by Immink [4][5][6], Deng and Herro [2], and Markarian et al. [9]. ...

Standard runlength-limiting codes (nonlinear codes defined by trellises) have the disadvantage that they disconnect the outer error-correcting code from the bit-by-bit likelihoods that come out of the channel. I present two methods for creating transmissions that, with probability extremely close to 1, both are runlength-limited and are codewords of an outer linear error-correcting code (or are within a very small Hamming distance of a codeword). The cost of these runlength-limiting methods, in terms of loss of rate, is significantly smaller than that of standard runlength-limiting codes. The methods can be used with any linear outer code; low-density parity-check codes are discussed as an example.
The cost of the method, in terms of additional redundancy, is very small: a reduction in rate of less than 1% is sufficient for a code with blocklength 4376 bits and maximum runlength 14.

Symbolic dynamics is a mature yet rapidly developing area of dynamical systems. It has established strong connections with many areas, including linear algebra, graph theory, probability, group theory, and the theory of computation, as well as data storage, statistical mechanics, and $C^*$-algebras. This Second Edition maintains the introductory character of the original 1995 edition as a general textbook on symbolic dynamics and its applications to coding. It is written at an elementary level and aimed at students, well-established researchers, and experts in mathematics, electrical engineering, and computer science. Topics are carefully developed and motivated with many illustrative examples. There are more than 500 exercises to test the reader's understanding. In addition to a chapter in the First Edition on advanced topics and a comprehensive bibliography, the Second Edition includes a detailed Addendum, with companion bibliography, describing major developments and new research directions since publication of the First Edition.

Constrained codes are a key component in the digital recording devices that have become ubiquitous in computer data storage and electronic entertainment applications. This paper surveys the theory and practice of constrained coding, tracing the evolution of the subject from its origins in Shannon's classic 1948 paper to present-day applications in high-density digital recorders. Open problems and future research directions are also addressed.

We report on an alternative to Eight-to-Fourteen Modulation (EFM), called EFMPlus, which has been adopted as coding format of the MultiMedia Compact Disc proposal. The rate of the new code is 8/16, which means that a 6-7% higher information density can be obtained. EFMPlus is the spitting image of EFM (same minimum and maximum runlength, clock content etc). Computer simulations have shown that the low-frequency content of the new code is only slightly larger than its conventional EFM counterpart.

The terms (d,k)-constrained sequence and runlength-limited (RLL) sequence are usually used as synonyms, and traditionally the design of encoders that generate RLL sequences is almost always conducted by designing encoders that generate (d,k)-constrained sequences followed by a precoder. It is generally believed that this design procedure does not entail a loss of performance in terms of coder complexity and error propagation. In this paper, however, we will show that it is surprisingly profitable in terms of error propagation to design RLL encoders directly, i.e. without the intermediate step of a (d,k)-constrained sequence.

Since the early 1980s we have witnessed the digital audio and video revolution: the Compact Disc (CD) has become a commodity audio system. CD-ROM and DVD-ROM have become the de facto standard for the storage of large computer programs and files. Growing fast in popularity are the digital audio and video recording systems called DVD and BluRay Disc. The above mass storage products, which form the backbone of modern electronic entertainment industry, would have been impossible without the usage of advanced coding systems.
Pulse Code Modulation (PCM) is a process in which an analogue, audio or video, signal is encoded into a digital bit stream. The analogue signal is sampled, quantized and finally encoded into a bit stream. The origins of digital audio can be traced as far back as 1937, when Alec H. Reeves, a British scientist, invented pulse code modulation. The advantages of digital audio and video recording have been known and appreciated for a long time. The principal advantage that digital implementation confers over analog systems is that in a well-engineered digital recording system the sole significant degradation takes place at the initial digitization, and the quality lasts until the point of ultimate failure. In an analog system, quality is diminished at each stage of signal processing and the number of recording generations is limited. The quality of analog recordings, like the proverbial 'old soldier', just fades away. The advent of ever-cheaper and faster digital circuitry has made feasible the creation of high-end digital video and audio recorders, an impracticable possibility using previous generations of conventional analog hardware.
The general subject of coding for digital recorders is very broad, with its roots deep set in history. In digital recording (and transmission) systems, channel encoding is employed to improve the efficiency and reliability of the channel. Channel coding is commonly accomplished in two successive steps: (a) error-correction code followed by (b) recording (or modulation) code. Error-correction control is realized by adding extra symbols to the conveyed message. These extra symbols make it possible for the receiver to correct errors that may occur in the received message.
In the second coding step, the input data are translated into a sequence with special properties that comply with the given "physical nature" of the recorder. Of course, it is very difficult to define precisely the area of recording codes and it is even more difficult to be in any sense comprehensive. The special attributes that the recorded sequences should have to render them compatible with the physical characteristics of the available transmission channel are called channel constraints. For instance, in optical recording a '1' is recorded as a pit and a '0' is recorded as a land. For physical reasons, the pits or lands should be neither too long nor too short. Thus, one records only those messages that satisfy a run-length-limited constraint. This requires the construction of a code which translates arbitrary source data into sequences that obey the given constraints. Many commercial recorder products, such as Compact Disc and DVD, use an RLL code.
The main part of this book is concerned with the theoretical and practical aspects of coding techniques intended to improve the reliability and efficiency of mass recording systems as a whole. The successful operation of any recording code is crucially dependent upon specific properties of the various subsystems of the recorder. There are no techniques, other than experimental ones, available to assess the suitability of a specific coding technique. It is therefore not possible to provide a cookbook approach for the selection of the 'best' recording code.
In this book, theory has been blended with practice to show how theoretical principles are applied to design encoders and decoders. The practitioner's view will predominate: we shall not be content with proving that a particular code exists and ignore the practical detail that the decoder complexity is only a billion times more complex than the largest existing computer. The ultimate goal of all work, application, is never once lost from sight. Much effort has gone into the presentation of advanced topics such as in-depth treatments of code design techniques, hardware consequences, and applications. The list of references (including many US Patents) has been made as complete as possible and suggestions for 'further reading' have been included for those who wish to pursue specific topics in more detail.
The decision to update Coding Techniques for Digital Recorders, published by Prentice-Hall (UK) in 1991, was made in Singapore during my stay in the winter of 1998. The principal reason for this decision was that during the last ten years or so, we have witnessed a success story of coding for constrained channels. The topic of this book, once the province of industrial research, has become an active research field in academia as well. During the IEEE International Symposia on Information Theory (ISIT) and the IEEE International Conference on Communications (ICC), for example, there are now usually three sessions entirely devoted to aspects of constrained coding. As a result, very exciting new material, in the form of (conference) articles and theses, has become available, and an update became a necessity.
The author is indebted to the Institute for Experimental Mathematics, University of Duisburg-Essen, Germany, the Data Storage Institute (DSI) and National University of Singapore (NUS), both in Singapore, and Princeton University, US, for the opportunity offered to write this book. Among the many people who helped me with this project, I would like to thank Dr. Ludo Tolhuizen, Philips Research Eindhoven, for reading and providing useful comments and additions to the manuscript.
Preface to the Second Edition
About five years after the publication of the first edition, it was felt that an update of this text would be inescapable as so many relevant publications, including patents and survey papers, have been published. The author's principal aim in writing the second edition is to add the newly published coding methods, and discuss them in the context of the prior art. As a result about 150 new references, including many patents and patent applications, most of them younger than five years old, have been added to the former list of references. Fortunately, the US Patent Office now follows the European Patent Office in publishing a patent application after eighteen months of its first application, and this policy clearly adds to the rapid access to this important part of the technical literature.
I am grateful to many readers who have helped me to correct (clerical) errors in the first edition and also to those who brought new and exciting material to my attention. I have tried to correct every error that I found or was brought to my attention by attentive readers, and seriously tried to avoid introducing new errors in the Second Edition.
China is becoming a major player in the art of constructing, designing, and basic research of electronic storage systems. A Chinese translation of the first edition was published in early 2004. The author is indebted to Prof. Xu, Tsinghua University, Beijing, for taking the initiative for this Chinese version, and also to Mr. Zhijun Lei, Tsinghua University, for undertaking the arduous task of translating this book from English to Chinese. Clearly, this translation makes it possible that a billion more people will now have access to it.
Kees A. Schouhamer Immink, Rotterdam, November 2004

Many modulation systems used in magnetic and optical recording are based on binary run-length-limited codes. We generalize the concept of dk-limited sequences of length n introduced by Tang and Bahl by imposing constraints on the maximum number of consecutive zeros at the beginning and the end of the sequences. It is shown that the encoding and decoding procedures are similar to those of Tang and Bahl. The additional constraints allow a more efficient merging of the sequences. We demonstrate two constructions of run-length-limited codes with merging rules of increasing complexity and efficiency and compare them to Tang and Bahl's method.

Ideas which have origins in C. E. Shannon's work in information theory have arisen independently in a mathematical discipline called symbolic dynamics. These ideas have been refined and developed in recent years to a point where they yield general algorithms for constructing practical coding schemes with engineering applications. In this work we prove an extension of a coding theorem of B. Marcus and trace a line of mathematics from abstract topological dynamics to concrete logic network diagrams.

In magnetic or optical storage devices, it is often required to
map the data into runlength-limited sequences. To ensure that cascading
such sequences does not violate the runlength constraints, a number of
merging bits are inserted between two successive sequences. A theory is
developed in which the minimum number of merging bits is determined, and
the efficiency of a runlength-limited fixed-length coding scheme is
considered.
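To make the role of merging bits concrete, here is a minimal brute-force sketch. It does not reproduce the theory of the paper; it simply searches, for one given pair of words, for the shortest bridge that keeps the cascaded sequence within the (d,k) constraint. The names `satisfies_dk` and `min_merging_bits`, and the search limit `max_beta`, are hypothetical helpers introduced for illustration.

```python
from itertools import product


def satisfies_dk(bits, d, k):
    """True if at least d zeros separate consecutive ones and no run of
    zeros anywhere in the word (including both ends) is longer than k."""
    run = 0            # length of the current run of zeros
    seen_one = False   # becomes True once the first one has been read
    for b in bits:
        if b == 0:
            run += 1
            if run > k:
                return False
        else:
            if seen_one and run < d:
                return False
            seen_one = True
            run = 0
    return True


def min_merging_bits(left, right, d, k, max_beta=8):
    """Return (beta, bridge) for the shortest bridge such that
    left + bridge + right is a (d,k) sequence, or None if no bridge of
    at most max_beta bits works.  Exhaustive search, for illustration."""
    for beta in range(max_beta + 1):
        for bridge in product((0, 1), repeat=beta):
            if satisfies_dk(list(left) + list(bridge) + list(right), d, k):
                return beta, bridge
    return None


# Two words that meet with ones at the boundary need d zeros in between.
print(min_merging_bits([0, 1, 0, 0, 1], [1, 0, 0, 0], d=2, k=7))
# Two words that meet with long zero runs need a one to break the run.
print(min_merging_bits([1, 0, 0, 0, 0], [0, 0, 0, 0, 1], d=2, k=7))
```

The two calls illustrate the two ways a boundary can fail: a violation of the minimum runlength d, and a violation of the maximum runlength k.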

Magnetic recording has been around for a long time already. Nevertheless, it is one of the important methods for storing large amounts of data for all kinds of purposes. During the last decades an enormous increase in the recording density has occurred. The main reasons for the increase are the improved properties of recording materials, recording heads and recording mechanics. In these notes the fundamentals of digital magnetic recording will be discussed. Information is stored by writing transitions in the recording layer. The stored data are retrieved by detection of output pulses in the read-back wave form. The information is decoded to yield the original binary sequence of information. The properties of the signals depend upon the write and read process. The effect of some of the basic parameters will be treated by a discussion of the properties of the output signal based on theoretical considerations using the Maxwell equations for electromagnetic fields. Topics are: the reciprocity principle for the read process, the shape of the head field and its consequence for the output waveform, and the influence of recording layer thickness, head-to-medium distance, gap size and transition width. Some examples from literature will be used to illustrate the topics.

A systematic approach to the analysis and construction of channel codes for digital baseband transmission is presented. The structure of the codes is dominated by the set of requirements imposed by channel characteristics and system operation. These requirements may be translated into symbol sequence properties which, in turn, specify a set of permissible sequence states. State-dependent coding of both fixed and variable length is a direct result. Properties of such codes are discussed and two examples are presented.

Consider a restricted channel whose constraints may be characterized by a finite state machine model. Conventional coding techniques for such channels result in codes where the choice of a word to be transmitted is only a function of the current state and the information to be represented by this word. This paper develops techniques for constructing codes where the code word choice may also depend on future information to be transmitted. It is shown that such future-dependent codes exist for channels and coding rates where no conventional code may be constructed.

Algorithms are described for constructing synchronous (fixed rate) codes for discrete noiseless channels where the constraints can be modeled by finite state machines. The methods yield two classes of codes with minimum delay or look-ahead.

Most recording systems encode their data using binary run-length-limited (RLL) codes. Statistics such as the density of 1s, the probabilities of specific code strings or run lengths, and the power spectrum are useful in analyzing the performance of RLL codes in these applications. These statistics are easy to compute for ideal run-length-limited codes, those whose only constraints are the run-length limits, but ideal RLL codes are not usable in practice because their code rates are irrational. Implemented RLL codes achieve rational rates by not using all code sequences which satisfy the run-length constraints, and their statistics are different from those of the ideal RLL codes. Little attention has been paid to the computation of statistics for these practical codes. In this paper a method is presented for computing statistics of implemented codes. The key step is to develop an exact description of the code sequences which are used. A consequence of the code having rational rate is that all the code-string and run-length probabilities are rational. The method is illustrated by applying it to three codes of practical importance: MFM, (2, 7), and (1, 7).
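As a small illustration of why the statistics of an implemented code come out rational, the sketch below encodes data with MFM and computes the density of 1s exactly by enumerating all data words of a given length. This is plain exhaustive enumeration, not the method of the paper. The MFM rule used is the common one (a clock 1 is written only between two data 0s), and the assumption that the bit preceding the block is 0 is mine; `mfm_encode`, `interior_zero_runs`, and `ones_density` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product


def mfm_encode(data, prev=0):
    """MFM: each data bit b becomes (clock, b), where clock is 1 only when
    both the previous and the current data bit are 0.  `prev` is the data
    bit assumed to precede the block (an assumption of this sketch)."""
    code = []
    for b in data:
        clock = 1 if (prev == 0 and b == 0) else 0
        code += [clock, b]
        prev = b
    return code


def interior_zero_runs(code):
    """Lengths of the zero runs between consecutive ones."""
    ones = [i for i, c in enumerate(code) if c == 1]
    return [b - a - 1 for a, b in zip(ones, ones[1:])]


def ones_density(m):
    """Exact fraction of 1s in the code stream, averaged over all 2**m
    equally likely data words of length m."""
    total = sum(sum(mfm_encode(data)) for data in product((0, 1), repeat=m))
    return Fraction(total, 2 * m * 2 ** m)


runs = set()
for data in product((0, 1), repeat=8):
    runs.update(interior_zero_runs(mfm_encode(data)))
print(sorted(runs))        # [1, 2, 3], i.e. a (1, 3) constraint
print(ones_density(8))     # a rational number close to 3/8
```

Under these assumptions the density for m data bits works out to 3/8 + 1/(8m): each data bit is a 1 with probability 1/2, each clock bit after the first with probability 1/4, and the first clock bit with probability 1/2.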

A special case with binary sequences was presented at the IEEE 1969 International Symposium on Information Theory in a paper titled “Run-Length-Limited Codes.”

This paper describes a novel run-length limited code, termed 3PM. A group of three data bits is converted into six code bits which are represented by the presence or absence of signal transitions. At least two zeros are maintained between two consecutive ones, that is a minimum distance of three positions between transitions, resulting in great reduction of pulse crowding. The minimum distance is assured by a unique merging rule at the boundary of adjacent code words. This rule distinguishes the code from both fixed and variable length codes and results in very simple encoding and decoding algorithms. An actual 50% density increase has been accomplished in saturation recording by using the 3PM code in combination with other electronic techniques. The new code is used in a current ISS/Univac high density disk storage system, featuring 2500 bits/cm (6300 BPI) linear density, 10 Mbits/sec data rate, 338 MByte capacity and one bit in 10 billion raw error rate on conventional Mod-11 head/disk interface.

The paper describes a technique for constructing fixed-length
block codes for (d, k)-constrained channels. The codes described are of
the simplest variety: codes for which the encoder restricted to any
particular channel state is a one-to-one mapping and which is not
permitted to “look ahead” to future messages. Such codes can
be decoded with no memory and no anticipation and are thus an example of
what Schouhamer Immink (1992) has referred to as block-decodable. For a
given blocklength n and given values of (d, k), the procedure constructs
a code with the highest possible rate among all such block codes, and it
does so without the iterative search that is typically used (i.e.,
Franaszek's recursive elimination algorithm). The technique used is
similar to Beenker and Immink's (1983) “Construction 2” in
that every message is associated with a (d, k, l, r) sequence of length
n-d; however the values used in the present approach are l=k-d and
r=k-1, as opposed to Beenker and Schouhamer Immink's values of l=r=k-d.
Thus the present approach demonstrates that “Construction 2”
is optimal for d=1 but is suboptimal for d>1. Furthermore, the
structure of the present codes permits enumerative coding techniques to
simplify encoding and decoding.
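The comparison between the two parameter choices can be explored by counting. The sketch below counts, by exhaustive enumeration, the (d, k, l, r) sequences of length n-d for l=r=k-d (the values attributed above to "Construction 2") and for l=k-d, r=k-1 (the values of the present approach). It only counts candidate sequences and does not implement either construction; `is_dklr`, `count_dklr`, and `compare` are hypothetical helper names, and the treatment of the all-zeros word is an assumption of this sketch.

```python
from itertools import product


def is_dklr(bits, d, k, l, r):
    """True if bits is a (d,k) sequence with at most l leading zeros and
    at most r trailing zeros."""
    n = len(bits)
    ones = [i for i, b in enumerate(bits) if b == 1]
    if not ones:
        # All-zeros word: one run that is leading and trailing at once.
        return n <= min(k, l, r)
    if ones[0] > min(l, k) or (n - 1 - ones[-1]) > min(r, k):
        return False
    return all(d <= b - a - 1 <= k for a, b in zip(ones, ones[1:]))


def count_dklr(length, d, k, l, r):
    """Number of (d, k, l, r) sequences of the given length (brute force)."""
    return sum(is_dklr(w, d, k, l, r) for w in product((0, 1), repeat=length))


def compare(n, d, k):
    """Counts for l=r=k-d versus l=k-d, r=k-1, both at length n-d."""
    m = n - d
    return count_dklr(m, d, k, k - d, k - d), count_dklr(m, d, k, k - d, k - 1)


for n, d, k in [(10, 1, 3), (12, 2, 7), (12, 3, 8)]:
    print((n, d, k), compare(n, d, k))
```

For d=1 the two parameter sets coincide, so the counts are equal; for d>1 the larger value of r can only enlarge the set of admissible sequences, which is consistent with the optimality statement in the abstract.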

- K. A. S. Immink, "Block-decodable runlength-limited codes via look-ahead technique," Philips J. Res., vol. 46, no. 6, pp. 293-310, 1991.
- J. Gu and T. Fuja, "A new approach to constructing optimal block codes for runlength-limited channels," IEEE Trans. Inform. Theory, vol. 40, no. 3, pp. 774-785, May 1994.
- P. A. Franaszek, "Sequence-state encoding for digital transmission," Bell Syst. Tech. J., vol. 47, pp. 143-157, Jan. 1968.
- G. V. Jacoby, "A new look-ahead code for increasing data density," IEEE Trans. Magn., vol. MAG-13, no. 5, pp. 1202-1204, Sept. 1977. See also GB Patent 1590404, June 1981.
- S. Tanaka, "Method and apparatus for encoding binary signals," EP Patent 0 178 813, Oct. 1985.
- S. B. McClelland, "Compatible digital magnetic recording system," U.S. Patent 4 261 019, Apr. 1981.
