Conference Paper

Coding for Reliable Data Storage on Different Hardware Platforms

Abstract

When systems are designed to tolerate faulty components, application data must be protected against loss. This is achieved by distributing the data and adding redundant elements according to an erasure-tolerant code. In this paper, we elaborate architectures for such fault-tolerant data storage. The concepts originate in distributed systems and are mostly implemented in software. We extend them for use in system-on-chip architectures. On the one hand, systems on chip and multi-core systems are employed as platforms for code calculation; on the other hand, such architectures use these techniques to fulfill their own functionality. We explain how data coding is (i) mapped to multi-core CPU structures and (ii) implemented in a specialized design on an FPGA. We compare the coding time on these architectures for a Cauchy-Reed/Solomon code and a classical Reed/Solomon code.
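The idea of distributing data and adding redundant elements can be illustrated with the simplest erasure-tolerant code, a single XOR parity block. This is only a minimal sketch and not the Reed/Solomon or Cauchy-Reed/Solomon codes compared in the paper; the function names `xor_blocks`, `encode`, and `recover` are hypothetical and chosen for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data: bytes, k: int):
    """Split data into k equal blocks (zero-padded) and append one parity block."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return blocks + [xor_blocks(blocks)]

def recover(blocks):
    """Rebuild the single missing block (marked None) from the survivors."""
    missing = blocks.index(None)
    survivors = [block for block in blocks if block is not None]
    repaired = list(blocks)
    # The XOR of all k + 1 blocks is zero, so the missing block
    # equals the XOR of the remaining ones.
    repaired[missing] = xor_blocks(survivors)
    return repaired

stored = encode(b"fault-tolerant storage", k=4)
damaged = list(stored)
damaged[2] = None  # simulate one failed storage node
assert recover(damaged) == stored
```

A single parity block tolerates exactly one erasure. Tolerating more failures requires codes such as those named in the abstract, which replace the plain XOR with arithmetic in a finite field.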

Conference Paper
In the past few years, all manner of storage applications, ranging from disk array systems to distributed and wide-area systems, have started to grapple with the reality of tolerating multiple simultaneous failures of storage nodes. Unlike the single failure case, which is optimally handled with RAID Level-5 parity, the multiple failure case is more difficult because optimal general purpose strategies are not yet known. Erasure Coding is the field of research that deals with these strategies, and this field has blossomed in recent years. Despite this research, the decades-old Reed-Solomon erasure code remains the only space-optimal (MDS) code for all but the smallest storage systems. The best performing implementations of Reed-Solomon coding employ a variant called Cauchy Reed-Solomon coding, developed in the mid-1990s (4). In this paper, we present an improvement to Cauchy Reed-Solomon coding that is based on optimizing the Cauchy distribution matrix. We detail an algorithm for generating good matrices and then evaluate the performance of encoding using all implementations of Reed-Solomon codes, plus the best MDS codes from the literature. The improvements over the original Cauchy Reed-Solomon codes are as much as 83% in realistic scenarios, and average roughly 10% over all cases that we tested.
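For orientation, a Cauchy distribution matrix has entries 1/(x_i + y_j) for two disjoint sets of field elements. The following is a minimal sketch of that construction only, not the matrix optimization described in the abstract. The field GF(2^4), the polynomial x^4 + x + 1, the element choice, and the names `gf16_mul`, `gf16_inv`, and `cauchy_matrix` are all assumptions made for illustration.

```python
def gf16_mul(a: int, b: int) -> int:
    """Multiply in GF(2^4), reducing modulo x^4 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13  # 0x13 encodes x^4 + x + 1
        b >>= 1
    return result

def gf16_inv(a: int) -> int:
    """Multiplicative inverse by exhaustive search over the 15 nonzero elements."""
    return next(x for x in range(1, 16) if gf16_mul(a, x) == 1)

def cauchy_matrix(xs, ys):
    """Entry (i, j) is 1 / (xs[i] + ys[j]); addition in GF(2^w) is XOR."""
    assert len(set(xs)) == len(xs) and len(set(ys)) == len(ys)
    assert not set(xs) & set(ys), "xs and ys must be disjoint"
    return [[gf16_inv(x ^ y) for y in ys] for x in xs]

# Two redundancy rows for four data columns (illustrative element choice).
matrix = cauchy_matrix(xs=[0, 1], ys=[2, 3, 4, 5])
```

Because the two sets are disjoint, every sum x_i + y_j is nonzero and therefore invertible. Every square submatrix of a Cauchy matrix is itself invertible, which is what makes the resulting code MDS.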
Article
We present a novel method that we call EVENODD for tolerating up to two disk failures in RAID architectures. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. This redundant storage is optimal, in the sense that two failed disks cannot be retrieved with less than two redundant disks. A major advantage of EVENODD is that it only requires parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The most commonly used scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed-Solomon (RS) error-correcting codes. This scheme requires computation over finite fields and results in a more complex implementation. For example, we show that the complexity of implementing EVENODD in a disk array with 15 disks is about 50% of the one required when using the RS scheme. The new scheme is not limited to RAID architectures: it can be used in any system requiring large symbols and relatively short codes, for instance, in multitrack magnetic recording. To this end, we also present a decoding algorithm for one column (track) in error.
Article
An (m, n, b, r)-erasure-resilient coding scheme consists of an encoding algorithm and a decoding algorithm with the following properties. The encoding algorithm produces a set of n packets each containing b bits from a message of m packets containing b bits. The decoding algorithm is able to recover the message from any set of r packets. Erasure-resilient codes have been used to protect real-time traffic sent through packet based networks against packet losses. In this paper we describe an erasure-resilient coding scheme that is based on a version of Reed-Solomon codes and which has the property that r = m. Both the encoding and decoding algorithms run in quadratic time and have been customized to give the first real-time implementations of Priority Encoding Transmission (PET) [2],[1] for medium quality video transmission on Sun SPARCstation 20 workstations. 1 Introduction Most existing and proposed networks are packet based, where a packet is a fixed length indivisible unit of inform...
Conference Paper
Distributed storage systems often have to guarantee data availability despite failures or temporary downtimes of storage nodes. For this purpose, a deletion-tolerant code is applied that allows missing parts of a codeword to be reconstructed, i.e. a given number of failures to be tolerated. The Reed/Solomon (R/S) code is the most general deletion-tolerant code and can be adapted to a required number of tolerable failures. In terms of information overhead, R/S is optimal, but it consumes significantly more computation power than parity-based codes. Reconfigurable hardware with specialized arithmetic can be employed for particular finite-field operations in R/S coding, so that the higher computation effort is compensated by faster and parallel operations. We present architectures for an application-specific acceleration by FPGAs. In this paper, strategies for an efficient communication with the accelerating FPGA and a performance comparison between a pure software-based solution and the accelerated system are provided.
Conference Paper
Distributed storage systems apply erasure-tolerant codes to guarantee reliable access to data despite failures of storage resources. While many codes can be mapped to XOR operations and efficiently implemented on common microprocessors, only a certain number of codes are usually implemented in a certain system (out of a wide variety of different codes). The ability to include new codes easily, to exchange codes and finally to select codes for several types of data is desirable. To provide this flexibility, a parameterization is used which allows the definition of different XOR based codes and, beyond that, different styles of encoding and decoding. The parameters include (i) the assignment of data and redundancy elements to the storage resources and (ii) a description of encoding and decoding algorithms with XOR based equations. The parameters of a certain code can be changed and in addition a wide variety of codes can be described and included in a storage system implementation. The proposed parameterization adopts the ability of codes like EVENODD, Cauchy-R/S and HoVer codes to map to distributed resources. Furthermore, encoding and decoding algorithms can be described differently, either for minimal coding cost or for minimal coding time on parallel systems.
Conference Paper
We present a new family of XOR-based erasure codes primarily targeted for use in disk arrays. These codes have a unique data/parity layout, with both horizontal and vertical parity arrangements giving rise to the name HoVer codes. We give constructions that tolerate up to four disk failures. Though the codes are only approximately maximum distance separable (MDS), they have performance advantages over other codes at many common array sizes. In addition, they have fewer parameter constraints than many other codes, which enables greater choices and flexibility in efficiency and performance trade-offs.
Article
A bit parallel structure for a multiplier with low complexity in Galois fields is introduced. The multiplier operates over composite fields GF((2^n)^m), with k=nm. The Karatsuba-Ofman algorithm (A. Karatsuba and Y. Ofman, 1963) is investigated and applied to the multiplication of polynomials over GF(2^n). It is shown that this operation has a complexity of order O(k^(log_2 3)) under certain constraints regarding k. A complete set of primitive field polynomials for composite fields is provided which perform modulo reduction with low complexity. As a result, multipliers for fields GF(2^k) up to k=32 with low gate counts and low delays are listed. The architectures are highly modular and thus well suited for VLSI implementation.
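The Karatsuba-Ofman recursion mentioned above can be sketched in software for polynomials over GF(2), stored as integer bit masks. This is a simplified illustration under stated assumptions: it works over GF(2) rather than the composite fields of the article, it omits the modulo reduction by a field polynomial, and the names `clmul_schoolbook` and `clmul_karatsuba` are hypothetical.

```python
def clmul_schoolbook(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def clmul_karatsuba(a: int, b: int, bits: int = 32) -> int:
    """Karatsuba-Ofman product of two GF(2) polynomials below 2**bits.

    `bits` is assumed to be a power of two.
    """
    if bits <= 4:
        return clmul_schoolbook(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    low = clmul_karatsuba(a_lo, b_lo, half)
    high = clmul_karatsuba(a_hi, b_hi, half)
    # Over GF(2), addition and subtraction are both XOR, so the middle
    # term needs only one extra half-size multiplication.
    middle = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, half) ^ low ^ high
    return low ^ (middle << half) ^ (high << bits)

# (x + 1)^2 = x^2 + 1 over GF(2)
assert clmul_karatsuba(0b11, 0b11) == 0b101
```

Each level replaces four half-size multiplications with three, which is the source of the O(k^(log_2 3)) complexity stated in the abstract.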
Article
It is well-known that Reed-Solomon codes may be used to provide error correction for multiple failures in RAID-like systems. The coding technique itself, however, is not as well-known. To the coding theorist, this technique is a straightforward extension to a basic coding paradigm and needs no special mention. However, to the systems programmer with no training in coding theory, the technique may be a mystery. Currently, there are no references that describe how to perform this coding that do not assume that the reader is already well-versed in algebra and coding theory. This paper is intended for the systems programmer. It presents a complete specification of the coding algorithm plus details on how it may be implemented. This specification assumes no prior knowledge of algebra or coding theory. The goal of this paper is for a systems programmer to be able to implement Reed-Solomon coding for reliability in RAID-like systems without needing to consult any external references.
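The finite-field arithmetic that Reed-Solomon coding rests on is commonly implemented with logarithm and antilogarithm tables. The following is a minimal sketch of that table-driven approach for GF(2^8), assuming the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1. It is not taken from the paper; the names `GF_EXP`, `GF_LOG`, `gf_mul`, and `gf_div` are hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

GF_EXP = [0] * 510  # antilog table, doubled so index sums need no modulo
GF_LOG = [0] * 256  # GF_LOG[0] is unused: zero has no logarithm

value = 1
for power in range(255):
    GF_EXP[power] = GF_EXP[power + 255] = value
    GF_LOG[value] = power
    value <<= 1             # multiply by the generator x
    if value & 0x100:
        value ^= PRIM_POLY  # reduce modulo the primitive polynomial

def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements by adding their logarithms."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def gf_div(a: int, b: int) -> int:
    """Divide two field elements by subtracting their logarithms."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return GF_EXP[GF_LOG[a] - GF_LOG[b] + 255]

# Addition and subtraction in GF(2^8) are both plain XOR.
assert gf_div(gf_mul(0x53, 0xCA), 0xCA) == 0x53
```

With these operations, a checksum word is a sum (XOR) of data words multiplied by fixed coefficients, and recovery amounts to solving the resulting linear equations over the field.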