Book

Codes for Mass Data Storage Systems

Authors:
  • Turing Machines Inc

Abstract

Preface to the Second Edition

About five years after the publication of the first edition, it was felt that an update of this text would be inescapable, as so many relevant publications, including patents and survey papers, have been published. The author's principal aim in writing the second edition is to add the newly published coding methods and to discuss them in the context of the prior art. As a result, about 150 new references, including many patents and patent applications, most of them less than five years old, have been added to the former list of references. Fortunately, the US Patent Office now follows the European Patent Office in publishing a patent application eighteen months after its first application, and this policy clearly adds to the rapid access to this important part of the technical literature.

I am grateful to the many readers who have helped me to correct (clerical) errors in the first edition and also to those who brought new and exciting material to my attention. I have tried to correct every error that I found or that was brought to my attention by attentive readers, and have seriously tried to avoid introducing new errors in the Second Edition.

China is becoming a major player in the design, construction, and basic research of electronic storage systems. A Chinese translation of the first edition was published in early 2004. The author is indebted to Prof. Xu, Tsinghua University, Beijing, for taking the initiative for this Chinese version, and also to Mr. Zhijun Lei, Tsinghua University, for undertaking the arduous task of translating this book from English to Chinese. Clearly, this translation makes it possible for a billion more people to have access to it.

Kees A. Schouhamer Immink
Rotterdam, November 2004
... The spectral radius of the channel graph has been known to bound the channel capacity (Cohn [8]). Further, the maximum entropy of a unifilar Markov information source can be expressed in terms of the largest eigenvalue of its connection matrix (Immink [19]). The least eigenvalue provides information about the independence number and chromatic number of the graph, and interlacing gives information about its substructures (Fan et al. [12], and Tan and Fan [42]). ...
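As a minimal sketch of the eigenvalue statement above, the capacity of a constrained channel can be estimated as the base-2 logarithm of the largest eigenvalue of its connection matrix. The function names, the power-iteration approach, and the example constraint (binary sequences with no two consecutive 1s) are illustrative assumptions, not taken from the cited works; the iteration assumes a primitive (irreducible, aperiodic) matrix.

```python
import math

def spectral_radius(matrix, iterations=200):
    """Estimate the largest eigenvalue of a non-negative primitive matrix
    by power iteration (hypothetical helper, pure Python)."""
    n = len(matrix)
    vec = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        lam = max(nxt)                 # current eigenvalue estimate
        vec = [x / lam for x in nxt]   # renormalise so the largest entry is 1
    return lam

def capacity(matrix):
    """Capacity in bits per symbol: log2 of the largest eigenvalue."""
    return math.log2(spectral_radius(matrix))

# Assumed example: state 0 = last symbol was 0, state 1 = last symbol was 1.
# A 1 may not follow a 1, so there is no transition from state 1 to state 1.
GOLDEN_MEAN = [[1, 1],
               [1, 0]]

print(round(capacity(GOLDEN_MEAN), 4))
```

For this example the largest eigenvalue is the golden ratio, so the capacity is log2((1 + √5)/2) bits per symbol.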
Article
Interconnection networks are a boon to human-made large computing systems, which are often designed according to need. In this paper, we address the question of obtaining an interconnection network to suit a specific need, which is a cumbersome procedure. Methods for constructing infinitely large interconnection networks from any simple, undirected graph \(G\) are illustrated and their properties are discussed. Further, we study the behaviour of certain vertex-degree-based graph invariants of the constructed networks under transformations by means of graph operations and obtain explicit relations for energy and some degree-based topological indices. Also, we show that they can be computed in polynomial time for these constructed networks. By doing so, we also obtain new methods of constructing infinite families of integral graphs.
... The second largest eigenvalue of a regular graph can be used in coding theory to represent the minimum Hamming distance of a linear code [16,17]. According to Shannon information theory, the eigenvalues of the channel graph can be used to represent the channel capacity, which is the maximum amount of information that can be communicated over a channel or stored in a storage medium [17,18]. For a given code, an encoder or decoder is constructed based on the spectral radius of the channel graph. ...
Article
Graph energy is defined as the p-norm of the adjacency matrix associated with the graph for p = 1, that is, the sum of the absolute values of the eigenvalues of the adjacency matrix. The graph's spectral radius is the largest absolute eigenvalue of the adjacency matrix. Applications of graph energies and spectral radii can be found in both molecular computing and computer science. Along similar lines, Inverse Sum Indeg (ISI) energies and ISI spectral radii can be constructed. This article's main focus is the ISI energies and ISI spectral radii of the generalized splitting and shadow graphs constructed on any regular graph. These graphs can represent many physical models, such as networks, molecules and macromolecules, chains, or channels. We compute relations between the ISI energies and ISI spectral radii of the newly created graphs and those of the original graph.
... Constrained coding is a fundamental field with a wide range of applications including magnetic and optical storage, flash memory, DNA storage, wireless communication, and satellite communication [Imm04,MRS01,Imm22]. These applications are subject to unique challenges and constraints that must be satisfied by the data encoding. ...
Preprint
Constrained coding is a fundamental field in coding theory that tackles efficient communication through constrained channels. While channels with fixed constraints have a general optimal solution, there is increasing demand for parametric constraints that are dependent on the message length. Several works have tackled such parametric constraints through iterative algorithms, yet they require complex constructions specific to each constraint to guarantee convergence through monotonic progression. In this paper, we propose a universal framework for tackling any parametric constrained-channel problem through a novel simple iterative algorithm. By reducing an execution of this iterative algorithm to an acyclic graph traversal, we prove a surprising result that guarantees convergence with efficient average time complexity even without requiring any monotonic progression. We demonstrate the effectiveness of this universal framework by applying it to a variety of both local and global channel constraints. We begin by exploring the local constraints involving illegal substrings of variable length, where the universal construction essentially iteratively replaces forbidden windows. We apply this local algorithm to the minimal periodicity, minimal Hamming weight, local almost-balanced Hamming weight and the previously-unsolved minimal palindrome constraints. We then continue by exploring global constraints, and demonstrate the effectiveness of the proposed construction on the repeat-free encoding, reverse-complement encoding, and the open problem of global almost-balanced encoding. For reverse-complement, we also tackle a previously-unsolved version of the constraint that addresses overlapping windows. Overall, the proposed framework generates state-of-the-art constructions with significant ease while also enabling the simultaneous integration of multiple constraints for the first time.
... A balanced codeword has the same number of 1s and 0s. Reference [189] shows that there are $\bar{n}$ balanced codewords for $n$ output bits, where $\bar{n} = n!/[(n/2)!]^2$. To design a codebook of a $k/n$-balanced block code, for a given $k$ input bits ($k = \lfloor \log_2 \bar{n} \rfloor$, where $\lfloor a \rfloor$ is the largest integer less than or equal to $a$), the frequency spectrum of each of the resulting $\bar{n}$ balanced codewords is calculated and the $2^k$ codewords with the deepest spectral nulls at DC are kept; the remaining codewords are discarded. ...
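The counting step in this excerpt can be illustrated with a short sketch: the number of balanced codewords of even length n is n!/[(n/2)!]^2, and the largest usable number of input bits is the floor of its base-2 logarithm. The function names are hypothetical, and the spectral selection of the deepest-null codewords is not modelled here.

```python
import math

def balanced_codeword_count(n):
    """Number of length-n binary words with equal numbers of 1s and 0s
    (n must be even): n! / ((n/2)!)^2."""
    if n % 2:
        raise ValueError("n must be even for a balanced codeword")
    return math.comb(n, n // 2)

def max_input_bits(n):
    """Largest k with 2**k <= number of balanced codewords, computed
    exactly with integer arithmetic (floor of log2)."""
    return balanced_codeword_count(n).bit_length() - 1

for n in (4, 8, 10):
    print(n, balanced_codeword_count(n), max_input_bits(n))
```

For n = 8 there are 70 balanced codewords, so k = 6 and 64 of them would be kept for the codebook.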
Article
Backscatter communication (BackCom) networks enable passive/battery-free Internet-of-Thing devices, providing reliable, massive connectivity while ensuring self-sustainability, low maintenance, and low costs. Effective channel codes and decoding algorithms are necessary to achieve these objectives. However, a comprehensive survey/review paper on such techniques for BackCom networks has not been available. This paper aims to fill this gap. Because tags have limited computational resources, traditional coding techniques may not suit them. We first describe the basics of BackCom, channel codes and their relevant design parameters, and codes for general communication networks. We address the BackCom limitations, requirements, and channel characteristics. As conventional codes may not seamlessly move to the BackCom arena, we identify the potential BackCom coding techniques and multiple access schemes. We further highlight potential approaches for addressing code implementation complexity and reliability. Finally, we discuss open issues, challenges, and potential future research directions.
... S(n, p) is the set of all one-dimensional sequences that satisfy the p-bounded constraint. One may use enumeration coding [27], [28] to construct rank_p : S(n, p) → [|S(n, p)|] and unrank_p : [|S(n, p)|] → S(n, p). The redundancy of this encoding algorithm is then λ(n, p) = n − log |S(n, p)| µ(n, p) (bits). ...
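One way to picture the rank/unrank pair in this excerpt is a lexicographic enumerative coder over binary sequences of bounded weight. The sketch below is an illustrative assumption, not the construction of references [27], [28]: the weight bound is passed directly as an integer `max_weight` (standing in for ⌊pn⌋), and all function names are hypothetical.

```python
import math

def count(length, max_weight):
    """Number of binary sequences of the given length with weight <= max_weight."""
    if max_weight < 0:
        return 0
    return sum(math.comb(length, i) for i in range(min(length, max_weight) + 1))

def rank(bits, max_weight):
    """Lexicographic index of a weight-bounded sequence among all such sequences."""
    n, index, budget = len(bits), 0, max_weight
    for i, bit in enumerate(bits):
        if bit:
            # Skip every valid sequence that has a 0 in this position.
            index += count(n - i - 1, budget)
            budget -= 1
    return index

def unrank(index, n, max_weight):
    """Inverse of rank: rebuild the sequence from its lexicographic index."""
    bits, budget = [], max_weight
    for i in range(n):
        zeros = count(n - i - 1, budget)
        if index < zeros:
            bits.append(0)
        else:
            bits.append(1)
            index -= zeros
            budget -= 1
    return bits

def redundancy(n, max_weight):
    """n - log2 |S|, in bits."""
    return n - math.log2(count(n, max_weight))

print(count(5, 2), unrank(7, 5, 2), redundancy(5, 2))
```

With n = 5 and a weight bound of 2 there are 16 valid sequences, so four information bits fit into five coded bits.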
Article
In this work, we study two types of constraints on two-dimensional binary arrays. Given p ∈ [0, 1] and ϵ ∈ [0, 1/2], we study:
  • The p-bounded constraint: a binary vector of size n is said to be p-bounded if its weight is at most pn.
  • The ϵ-balanced constraint: a binary vector of size n is said to be ϵ-balanced if its weight is within [(1/2 − ϵ)n, (1/2 + ϵ)n].
Such constraints are crucial in several data storage systems that regard the information data as two-dimensional (2D) instead of one-dimensional (1D), such as crossbar resistive memory arrays and holographic data storage. In this work, efficient encoding/decoding algorithms are presented for binary arrays so that the weight constraint (either the p-bounded constraint or the ϵ-balanced constraint) is enforced over every row and every column, regarded as 2D row-column (RC) constrained codes; or over every window (where each window refers to a subarray consisting of consecutive rows and consecutive columns), regarded as 2D sliding-window (SW) constrained codes. While low-complexity designs have been proposed in the literature, mostly focusing on 2D RC constrained codes where p = 1/2 and ϵ = 0, this work provides efficient coding methods that work for both 2D RC constrained codes and 2D SW constrained codes, and, more importantly, the methods are applicable for arbitrary values of p and ϵ. Furthermore, for certain values of p and ϵ, we show that, for sufficiently large array size, there exists a linear-time encoding/decoding algorithm that incurs at most one redundant bit.
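The two weight constraints defined in this abstract, and their row-column (RC) form, can be checked with a few lines of code. This is only a verifier sketch with hypothetical function names; it does not reproduce the paper's encoding or decoding algorithms.

```python
def is_p_bounded(vector, p):
    """True if the weight of the binary vector is at most p * n."""
    return sum(vector) <= p * len(vector)

def is_eps_balanced(vector, eps):
    """True if the weight lies within [(1/2 - eps) n, (1/2 + eps) n]."""
    n = len(vector)
    return (0.5 - eps) * n <= sum(vector) <= (0.5 + eps) * n

def satisfies_rc(array, predicate):
    """True if the predicate holds for every row and every column."""
    rows_ok = all(predicate(row) for row in array)
    cols_ok = all(predicate(list(col)) for col in zip(*array))
    return rows_ok and cols_ok

# Assumed example: a 4 x 4 checkerboard, balanced in every row and column.
CHECKERBOARD = [[1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1]]

print(satisfies_rc(CHECKERBOARD, lambda v: is_eps_balanced(v, 0.0)))
print(satisfies_rc(CHECKERBOARD, lambda v: is_p_bounded(v, 0.25)))
```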
Article
On a finite sequence of binary (0-1) trials we define a random variable enumerating patterns of length subject to certain constraints. For sequences of independent and identically distributed binary trials, exact probability mass functions are established in closed form by means of combinatorial analysis. An explicit expression for the mean value of this random variable is obtained. The results associated with the probability mass functions are extended to sequences of exchangeable binary trials. An application in information theory concerning the counting of a class of run-length-limited binary sequences is provided as a direct byproduct of our study. Illustrative numerical examples further exemplify the results.
Conference Paper
In this work, we propose efficient constrained coding schemes to significantly reduce the sneak path interference (SPI), a fundamental and challenging problem, in crossbar resistive memory arrays. In particular, we attempt to combat the sneak path effect locally as follows. For arrays of size $n \times n$, we study coding methods that enforce every sliding window of size $m \times m$ (where each window refers to a subarray consisting of consecutive rows and consecutive columns), for some $m < n$, to be: (i) $(m, \delta)$ bounded weight: the number of 1s in every sliding window is at most $m^2/2 - \delta$ for $\delta \ge 0$; this constraint is called the locally bounded weight constraint. (ii) Sneak path free: the written bits in every window do not induce any sneak path to any cell; this constraint is called the locally sneak path free constraint. In this work, we study the maximum information rate that can be achieved (or channel capacity) and design codes for each constraint. In particular, for the first constraint, for arbitrary $m$ and $\delta \le m/2$, we provide an efficient construction of codes with code rate $1 - 2/m$, while the highest rate found in the prior art is approximately $1 - 1/m$, which was only applicable for $m = n - o(n)$ and $\delta = 0$.
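The locally bounded weight constraint described above can be checked by brute force over all sliding windows. The sketch below is a verifier only, with a hypothetical function name; it is not the paper's code construction and makes no attempt at efficiency.

```python
def locally_bounded_weight(array, m, delta):
    """True if every m x m window of the square binary array contains
    at most m*m/2 - delta ones."""
    n = len(array)
    limit = m * m / 2 - delta
    for top in range(n - m + 1):
        for left in range(n - m + 1):
            weight = sum(array[top + i][left + j]
                         for i in range(m) for j in range(m))
            if weight > limit:
                return False
    return True

# Assumed example: every 2 x 2 window of a checkerboard holds exactly two 1s.
BOARD = [[1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1]]

print(locally_bounded_weight(BOARD, 2, 0))  # limit is 2 ones per window
print(locally_bounded_weight(BOARD, 2, 1))  # limit is 1 one per window
```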
Conference Paper
High temperatures in electronic devices may have a negative effect on their performance. Various techniques have been proposed and studied to address this thermal challenge. To guarantee that the peak temperature of the devices is bounded by some maximum temperature, the transmitted signal has to satisfy certain constraints. With this motivation, we study the constrained channel that only accepts sequences satisfying prescribed thermal constraints. The main goal of this paper is to compute the capacity of this channel. We provide the exact capacity of the channel for certain parameters and present bounds on the capacity in various other cases. Finally, we consider the model in which multiple wires are available and determine the smallest number of wires required to satisfy the thermal constraints.
Article
Partial geometries form an interesting branch in combinatorial mathematics, and recently they have been shown to be very effective in the construction of LDPC codes with distinct geometric and algebraic structures. This paper presents three specific cyclic classes of partial geometries. Based on these three classes of partial geometries, three classes of LDPC codes and three classes of constant-weight codes are constructed. Codes in these two categories are either cyclic or quasi-cyclic. Designs and constructions of these codes are straightforward and flexible without a need for extensive computer search. It is shown that long high-rate LDPC codes constructed based on the three classes of cyclic partial geometries perform well over the additive white Gaussian noise channel (AWGNC) with iterative decoding algorithm based on belief propagation. They can achieve low error-rates without visible error-floor and their decoding converges rapidly. These LDPC codes also perform well over the binary erasure channel (BEC) and are very effective in correcting phased-bursts of erasures. Based on cyclic partial geometries, a special type of constant-weight codes, called balanced codes, can be constructed. The constant-weight codes constructed based on partial geometries are either optimal or nearly optimal.
Article
Balanced sequences and balanced codes have attracted a lot of research in the last seventy years due to their diverse applications in information theory as well as other areas of computer science and engineering. There have been some methods to classify balanced sequences. This work suggests two new hierarchies to classify these sequences. The first one is based on the largest $\ell$ for which each $\ell$-tuple appears the same number of times in the sequence. This property is a generalization of the property required for de Bruijn sequences. The second hierarchy is based on the number of balanced derivatives of the sequence. Enumeration for each such family of sequences and efficient encoding and decoding algorithms are provided in this paper.
Article
The sound from a Compact Disc system, encoded into data bits and modulated into channel bits, is sent along the 'transmission channel' consisting of write laser, master disk, user disk, and optical pick-up. The maximum information density on the disk is determined by the diameter d of the laser light spot on the disk and the 'number of data bits per light spot'. The effect of making d smaller is to greatly reduce the manufacturing tolerances for the player and the disk. The compromise adopted is d ≈ 1 μm, giving very small tolerances for objective disk tilt, disk thickness and defocusing.
Article
The systematic design of DC-constrained codes based on codewords of fixed length is considered. Simple recursion relations for enumerating the number of codewords satisfying a constraint on the maximum unbalance of ones and zeros in a codeword are derived. An enumerative scheme for encoding and decoding maximum-unbalance-constrained codewords with binary symbols is developed. Examples of constructions of transmission systems based on unbalance-constrained codewords are given. A worked example of an 8b10b channel code is given, which is of particular interest because of its practical simplicity and relative efficiency.
Article
A (1,6) runlength-limited (RLL) code is described in which the encoder converts unconstrained data strings into (1,6) RLL constrained strings at rate 2/3. The decoder, which has limited error propagation, recovers the original data strings. One error in the encoded string can result in no more than eleven errors in the decoded data.
Article
A code that converts unconstrained data strings into (5, 12) run-length-limited constrained strings is presented. Its decoder recovers the original data strings, and its error propagation is limited. One error in the encoded string can result in no more than five errors in the decoded data.
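The (1,6) and (5, 12) constraints named in the two abstracts above follow the usual (d, k) convention: consecutive 1s are separated by at least d and at most k 0s. The sketch below only checks that property; the function name is hypothetical, and it assumes the minimum d applies to runs between two 1s while the maximum k applies to every run, including leading and trailing ones. It is not the encoder or decoder of either paper.

```python
def satisfies_rll(bits, d, k):
    """Check a string of '0'/'1' characters against a (d, k) run-length
    constraint on the runs of 0s."""
    runs = bits.split("1")        # runs of 0s, including leading/trailing
    interior = runs[1:-1]         # runs enclosed by two 1s
    if any(len(run) < d for run in interior):
        return False
    return all(len(run) <= k for run in runs)

print(satisfies_rll("1001", 1, 6))          # two 0s between the 1s
print(satisfies_rll("1101", 1, 6))          # adjacent 1s violate d = 1
print(satisfies_rll("100000100000", 5, 12)) # runs of five 0s
```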
Article
The stochastic process appearing at the output of a digital encoder is investigated. Based upon the statistics of the code being employed, a systematic procedure is developed by means of which the average power spectral density of the process can be determined. The method is readily programmed on the digital computer, facilitating the calculation of the spectral densities for large numbers of codes. As an example of its use, the procedure is applied in the case of a specific multi-alphabet, multi-level code.
Article
This paper analyzes a block-coding scheme designed to suppress spectral energy near f = 0 for any binary message sequence. In this scheme, the polarity of each block is either maintained or reversed, depending on which decision drives the accumulated digit sum toward zero. The polarity of the block's last digit informs the decoder as to which decision was made. Our objective is to derive the average power spectrum of the coded signal when the message is a random sequence of +1's and −1's and the block length (M) is odd. The derivation uses a mixture of theoretical analysis and computer simulation. The theoretical analysis leads to a spectrum description in terms of a set of correlation coefficients, {ρq}, q = 1, 2, etc., with the ρq's functions of M. The computer simulation uses FFT algorithms to estimate the power spectrum and autocorrelation function of the block-coded signal. From these results, {ρq} is estimated for various M. A mathematical approximation to ρq in terms of q and M is then found which permits a closed-form evaluation of the power spectrum. Comparisons between the final formula and simulation results indicate an accuracy of ±5 percent (±0.2 dB) or better. The block-coding scheme treated here is of particular interest because of its practical simplicity and relative efficiency. The methods used to analyze it can be applied to other block-coding schemes as well, some of which are discussed here for purposes of comparison.
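The polarity-switching idea in this abstract can be sketched as follows. Several details are assumptions made for illustration and are not stated in the abstract: each block carries M − 1 message digits plus one appended indicator digit that starts as +1, ties are resolved by keeping the polarity, and the message length is a multiple of M − 1. The function names are hypothetical.

```python
def encode(message, block_len):
    """Polarity-switch block coding over +1/-1 digits (block_len odd)."""
    data_len = block_len - 1
    running_sum, coded = 0, []
    for start in range(0, len(message), data_len):
        block = list(message[start:start + data_len]) + [1]  # indicator digit
        block_sum = sum(block)
        # Reverse the polarity if that brings the accumulated sum closer to zero.
        if abs(running_sum - block_sum) < abs(running_sum + block_sum):
            block = [-x for x in block]
            block_sum = -block_sum
        running_sum += block_sum
        coded.extend(block)
    return coded

def decode(coded, block_len):
    """The last digit of each block tells whether the block was reversed."""
    message = []
    for start in range(0, len(coded), block_len):
        block = coded[start:start + block_len]
        sign = block[-1]           # +1: maintained, -1: reversed
        message.extend(sign * x for x in block[:-1])
    return message

message = [1, -1, 1, 1] * 6
coded = encode(message, 5)
print(decode(coded, 5) == message)
```

Under these assumptions the accumulated sum at block boundaries never exceeds M in magnitude, which is what keeps the low-frequency content of the coded signal small.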
Article
A method for determining maximum-size block codes, with the property that no concatenation of codewords violates the input restrictions of a given channel, is presented. The class of channels considered is essentially that of Shannon (1948) in which input restrictions are represented through use of a finite-state machine. The principal results apply to channels of finite memory and codes of length greater than the channel memory but shorter codes and non-finite memory channels are discussed briefly. A partial ordering is first defined over the set of states. On the basis of this ordering, complete terminal sets of states are determined. Use is then made of Mason's general gain formula to obtain a generating function for the size of the code which is associated with each complete terminal set. Comparison of coefficients for a particular block length establishes an optimum terminal set and codewords of the maximum-size code are then obtained directly. Two important classes of binary channels are then considered. In the first class, an upper bound is placed on the separation of 1's during transmission while, in the second class, a lower bound is placed on this separation. Universal solutions are obtained for both classes.
Article
A special family $J$ of prefix codes is considered. It is verified that if $A \in J$ does not have a certain synchronizing property, then $A = C^p$ ($p > 1$), where $C$ is another code from the same family.
Article
We derive the limiting efficiencies of dc-constrained codes. Given bounds on the running digital sum (RDS), the best possible coding efficiency $\eta$, for a $K$-ary transmission alphabet, is $\eta = \log_2 \lambda_{\max} / \log_2 K$, where $\lambda_{\max}$ is the largest eigenvalue of a matrix which represents the transitions of the allowable states of RDS. Numerical results are presented for the three special cases of binary, ternary and quaternary alphabets.
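The efficiency formula above can be tried out for the binary case (K = 2), where each symbol moves the RDS up or down by one, so the transition matrix connects neighbouring RDS values only. The sketch below rests on that assumed state model; the function names are hypothetical, and the power iteration is run on the matrix plus the identity because the RDS graph is periodic and plain power iteration would oscillate.

```python
import math

def rds_transition_matrix(num_states):
    """Binary case: RDS state i connects only to states i - 1 and i + 1."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(num_states)]
            for i in range(num_states)]

def largest_eigenvalue(matrix, iterations=2000):
    """Power iteration on (matrix + identity); subtract 1 at the end."""
    n = len(matrix)
    vec = [1.0] * n
    lam = 1.0
    for _ in range(iterations):
        nxt = [vec[i] + sum(matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        lam = max(nxt)
        vec = [x / lam for x in nxt]
    return lam - 1.0

def binary_efficiency(num_states):
    """eta = log2(lambda_max) / log2(K) with K = 2; needs num_states >= 2."""
    lam = largest_eigenvalue(rds_transition_matrix(num_states))
    return math.log2(lam) / math.log2(2)

for num_states in (3, 5, 7):
    print(num_states, round(binary_efficiency(num_states), 4))
```

For this path-shaped matrix the largest eigenvalue is 2 cos(π/(N + 1)) for N allowed RDS values, so three states give λmax = √2 and an efficiency of 0.5.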