IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 28, NO. 2, FEBRUARY 2010 1
Very Efficient Balanced Codes
Kees A. Schouhamer Immink and Jos H. Weber
Abstract—The prior art construction of sets of balanced codewords by Knuth is attractive for its simplicity and absence of look-up tables, but the redundancy of the balanced codes generated by Knuth's algorithm falls a factor of two short with respect to the minimum required. We present a new construction, which is simple, does not use look-up tables, and is less redundant than Knuth's construction. In the new construction, the user word is modified in the same way as in Knuth's construction, that is, by inverting a segment of user symbols. The prefix that indicates which segment has been inverted, however, is encoded in a different, more efficient, way.

Index Terms—Magnetic recording, optical recording, channel capacity, constrained code, dc-free code, balanced code.
I. INTRODUCTION

Sets of bipolar codewords that have equal numbers of '1's and '-1's are usually called balanced codes. Balanced codes have been widely used in storage and communication channels. A survey of properties and methods for constructing balanced codes can be found in [1]. There is a trend towards high-rate codes, which are made possible by codes with longer codewords. The implementation of such a code is not a simple task, as look-up tables for translating user words into channel words and vice versa are impractically large. Knuth published a simple algorithm for constructing balanced codes [2], which is very well suited for use with long codewords, since look-up tables are absent. Modifications and improvements of the generic scheme are discussed by Alon et al. [3], Al-Bassam & Bose [4], Tallini, Capocelli & Bose [5], and Weber & Immink [6]. The redundancy of a balanced code generated by Knuth's algorithm falls a factor of two short with respect to the minimum required, i.e., the redundancy of a code that uses all balanced codewords of a given length [1]. An attempt by Weber & Immink [6] to compress the fixed-length prefix to a variable-length prefix with less redundancy was futile.
In this paper, we will study simple balanced code designs that require minimum redundancy, while, as in Knuth's construction, the balanced codeword is obtained by a simple, algorithmic modification of the user word, and a prefix is added that carries sufficient information for the recipient to uniquely retrieve the user word. The way the user word is modified in the new construction is the same as in Knuth's construction, that is, by inverting a segment of user symbols. The prefix, however, is encoded and decoded in a different, more efficient, way. We start, in Section II, with a brief description of the prior art code construction by Knuth, followed, in Section III, by a description of the new construction. In Section IV we will compute some statistics, which will enable us, in Section V, to compute the redundancy of the new code construction. In Section VI, we will present some details regarding the implementation of the new algorithm. Section VII concludes the paper.

Manuscript received 12 January 2009; revised 17 July 2009. This project was supported by grant Theory and Practice of Coding and Cryptography, Award Number NRF-CRP2-2007-03.

Kees A. Schouhamer Immink is with Turing Machines Inc., Willemskade 15b-d, 3016 DK Rotterdam, The Netherlands, and Nanyang Technological University, Singapore (e-mail: immink@turing-machines.com).

Jos H. Weber is with TU Delft, IRCTR/CWPC, Mekelweg 4, 2628 CD Delft, The Netherlands (e-mail: J.H.Weber@ewi.tudelft.nl).

Digital Object Identifier 10.1109/JSAC.2010.1002xx.
II. KNUTH'S CODE CONSTRUCTION

The conventional Knuth algorithm runs as follows. The user data is arranged as a bipolar m-tuple u = (u_1, ..., u_m), u_i ∈ {−1, 1}, m even. (Knuth also presented code constructions for odd m, but they will not be discussed here.) We define, for 1 ≤ j ≤ m, the bipolar m-tuple u^j = (−u_1, −u_2, ..., −u_j, u_{j+1}, u_{j+2}, ..., u_m), that is, u with its first j symbols inverted. Knuth showed that for any user data u an index j can be found such that the codeword u^j is balanced, that is,

$$-\sum_{i=1}^{j} u_i + \sum_{i=j+1}^{m} u_i = 0.$$

The index j is not necessarily unique, as in general there are more positions where balance can be obtained [6]. Let the smallest index j where u^j is balanced be denoted by I(u). In other words, I(u) is the smallest index i, 1 ≤ i ≤ m, for which u^i is balanced. The balanced codeword x = u^{I(u)} plus a prefix, which suitably represents the index I(u), is transmitted to the receiver. The receiver can retrieve the index I(u) from the prefix, and can thus uniquely undo the encoding step by forming u = x^{I(u)}. For an efficient code, the redundant prefix should be as small as possible. Knuth showed that in his best construction the redundancy p is roughly equal to [2]

$$p \approx \log_2 m, \quad m \gg 1. \qquad (1)$$

The redundancy of a full set of balanced codewords of length m, H_0, equals

$$H_0 = m - \log_2 \binom{m}{m/2}, \qquad (2)$$

and can be approximated by [2]

$$H_0 \approx \frac{1}{2}\log_2 m + 0.326, \quad m \gg 1. \qquad (3)$$

We notice that, for large values of the codeword length m, the redundancy of Knuth-based codes is twice as high as that of codes that use 'full' sets of balanced codewords. It has been a continuing desideratum in data communication and storage systems to increase the capacity by using more efficient coding methods. In the next section, we will present the new coding technique.
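Knuth's encoding and decoding steps are easy to state in code. The following is a minimal Python sketch (our own illustration with hypothetical helper names; the paper gives no code for this step):

```python
from itertools import product

def knuth_balance(u):
    """Return (I(u), x): the smallest index j such that inverting the
    first j symbols of u yields a balanced word x = u^j."""
    total, z = sum(u), 0
    for j in range(1, len(u) + 1):
        z += u[j - 1]                  # z_j, the running sum of u
        if total - 2 * z == 0:         # the sum of u^j equals total - 2*z_j
            return j, tuple(-s for s in u[:j]) + u[j:]
    raise ValueError("m must be even for a balancing index to exist")

def knuth_unbalance(x, j):
    """Invert the first j symbols again to recover the user word."""
    return tuple(-s for s in x[:j]) + x[j:]
```

For example, `knuth_balance((1, 1, 1, 1))` returns `(2, (-1, -1, 1, 1))`; inverting the first two symbols again recovers the user word, which is the decoding step of Section II.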
III. NEW PREFIX CODING SCHEME FOR BALANCING CODEWORDS

Let x = u^{I(u)} be a codeword balanced by Knuth's method described above. Clearly, we have u = x^{I(u)} ∈ {x^1, ..., x^m}, and the index received makes it possible to uniquely single out the right member from the m possible ones. The new encoder is based upon the observation that not all m members of the set {x^1, ..., x^m} can be legally associated with x, since, by definition, Knuth's encoder takes the smallest index for balancing a user word. An m-tuple x^j with j > I(x^j) is not a bona fide word under Knuth's rules. We now define the set of user words legally associated with x by

$$\sigma_x = \{x^j : j = I(x^j)\}.$$

The cardinality of σ_x will be denoted by d(x) = |σ_x|.

Example: Let m = 6. Then it can be verified that σ_{000111} = {100111, 110111, 111111, 111000}, where a '0' denotes the symbol value '−1'. In a similar way, σ_{010101} = {110101, 100101}. Hence d(000111) = 4 and d(010101) = 2.

An efficient encoder will transmit the balanced word x = u^{I(u)} plus an index that resolves the ambiguity about which word in σ_x is meant. The average number of bits required to represent the index depends on the way the code construction is implemented. For a fixed-length-prefix construction it depends on the maximum size of σ_x, while in a variable-length-prefix construction it will depend on the average size of σ_x. Before formulating the key theorem, we offer some definitions. Let z_k be the running sum of the first k, k ≤ m, symbols of x, or

$$z_k = \sum_{i=1}^{k} x_i,$$

and let z_max = max{z_k} and z_min = min{z_k}.

Theorem 1: d(x) = z_max − z_min + 1.

Proof: We have x^i ∉ σ_x if there is a j, 1 ≤ j < i ≤ m, where (x^i)^j is balanced. Since (x^i)^j = (x_1, ..., x_j, −x_{j+1}, ..., −x_i, x_{i+1}, ..., x_m), we conclude that the sum of the elements of (x^i)^j equals

$$\sum_{k=1}^{m} x_k - 2\sum_{k=j+1}^{i} x_k.$$

Then, as $\sum_{k=1}^{m} x_k = 0$, we notice that x^i ∉ σ_x if there is a j, 1 ≤ j < i, such that

$$\sum_{k=j+1}^{i} x_k = 0.$$
Or, in other words, x^i ∉ σ_x if there is a j, 1 ≤ j < i, such that z_i = z_j. Let Z denote the set of sum values z_i, 1 ≤ i ≤ m, and let V(Z) denote the number of distinct values of Z. Then it is immediate that d(x) = V(Z). Since all possible values of z_i between z_min and z_max are in Z, we conclude that d(x) = V(Z) = z_max − z_min + 1. This concludes the proof.

The following theorem gives a lower and an upper bound on the size d(x).

Theorem 2: 2 ≤ d(x) ≤ m/2 + 1.

Proof: Clearly, z_1 = x_1 ≠ 0 and z_m = 0. Thus d(x) = z_max − z_min + 1 ≥ 2. We now proceed with the upper bound d(x) ≤ m/2 + 1. Note that, since z_0 = z_m = 0 and z_i = z_{i−1} ± 1 for all i, all values in Z occur at least twice, with the exception of a single maximum z_max and/or a single minimum z_min. Thus

$$d(x) \le 2 + \frac{m-2}{2} = \frac{m}{2} + 1.$$

This concludes the proof.

The conventional Knuth scheme, which requires a prefix of length ⌈log_2 m⌉ bits, is less efficient than a new scheme based on Theorem 2, which shows that the maximum number of bits required to represent the index equals ⌈log_2(m/2 + 1)⌉. The average number of bits required to represent the index will be computed in the next section.

IV. COMPUTATION OF THE DISTRIBUTION OF d(x)

In this section, we will compute the distribution of d(x), so that we can, in Section V, compute the redundancy of the new schemes. Let P(u, m) denote the number of binary balanced words x of length m with d(x) = u. By definition and invoking Theorem 2, we have

$$\sum_{u=2}^{m/2+1} P(u, m) = \binom{m}{m/2}.$$

The computation of the distribution of d(x) is related to the problem of computing the number of sequences (random walks) whose running sum remains within given limits, a problem first studied by Chien [7]. Chien studied bipolar sequences {x_i}, x_i ∈ {−1, 1}, where the running sum z_i = z_{i−1} + x_i, for any i, remains within the limits N_1 and N_2, where N_1 and N_2 are two (finite) constants, N_2 > N_1. The range of sum values a sequence may assume, denoted by

$$N = N_2 - N_1 + 1, \qquad (4)$$

is often called the digital sum variation. Taking z_i at any instant i as the state of the stream {x_i}, the bounds on z_i define a set of N allowable states.
For the N-state source, an N × N connection matrix D_N is defined by D_N(i, j) = 1 if a transition from state σ_i to state σ_j is allowable, and D_N(i, j) = 0 otherwise. The connection matrix D_N for the channel having a bound on the number of assumed sum values is given by

$$D_N(i+1, i) = D_N(i, i+1) = 1, \; i = 1, 2, \ldots, N-1, \quad D_N(i, j) = 0 \text{ otherwise}. \qquad (5)$$

The (i, j)-th entry of the m-th power of D_N will be denoted by D_N^m(i, j). The following theorem will be helpful in computing P(u, m).

Theorem 3: The number of balanced words x of length m with d(x) = u, P(u, m), 2 ≤ u ≤ m/2 + 1, is given by

$$P(u, m) = \sum_{i=1}^{u} D_u^m(i, i) - 2\sum_{i=1}^{u-1} D_{u-1}^m(i, i) + \sum_{i=1}^{u-2} D_{u-2}^m(i, i), \quad 2 \le u \le \frac{m}{2} + 1.$$

Proof: In order to calculate P(u, m), we must count the number of balanced sequences x of length m whose running sum span equals u. The matrix entries D_N^m(i, i) give the number of balanced sequences of length m whose running sum span is at most N and that start and end with a given sum value i. Thus the count D_N^m(i, i) includes words x with d(x) < N. We may resolve this difficulty by observing that a balanced word x with d(x) = N has a unique starting (and ending) state. Namely, assume a word x with the property d(x) = N, and let

$$z_k = z_0 + \sum_{i=1}^{k} x_i,$$

where 1 ≤ z_0 ≤ N denotes the initial value of the running sum. Then, by definition, max{z_i} − min{z_i} + 1 = N. The limiting values z_max and z_min are by definition the maximum and minimum sum values allowed within the N-state machine. Other values of z_0 are not allowed, as they will lead to too high or too low a value of the running sum. We conclude that there is a unique starting (state) value z_0 for a sequence having the maximum running sum span N. Similarly, a word x with d(x) = N − 1 may have two possible starting states, a word x with d(x) = N − 2 has three possible starting states, etc. As a result, we find

$$\sum_{i=1}^{u} D_u^m(i, i) = P(u, m) + 2P(u-1, m) + 3P(u-2, m) + \ldots = \sum_{k=0}^{u-2} (k+1) P(u-k, m).$$
Then, after a simple manipulation, we find

$$P(u, m) = \sum_{i=1}^{u} D_u^m(i, i) - 2\sum_{i=1}^{u-1} D_{u-1}^m(i, i) + \sum_{i=1}^{u-2} D_{u-2}^m(i, i), \quad 2 \le u \le \frac{m}{2} + 1. \qquad (6)$$

This proves the theorem.

A useful property to compute powers of D_N was derived by Salkuyeh [8], namely

$$D_N^m(i, j) = \frac{2}{N+1} \sum_{k=1}^{N} \lambda_k^m \sin\frac{ik\pi}{N+1} \sin\frac{jk\pi}{N+1}, \qquad (7)$$

where

$$\lambda_i = 2\cos\frac{\pi i}{N+1}, \quad 1 \le i \le N,$$

are the eigenvalues of D_N [1]. The number $\sum_{i=1}^{N} D_N^m(i, i)$ can be calculated by invoking relation (7). After rearranging some terms, we find

$$\sum_{i=1}^{N} D_N^m(i, i) = \frac{2}{N+1} \sum_{k=1}^{N} \lambda_k^m \sum_{i=1}^{N} \sin^2\frac{ik\pi}{N+1} = \sum_{i=1}^{N} \lambda_i^m = 2^m \sum_{i=1}^{N} \cos^m\frac{\pi i}{N+1}. \qquad (8)$$

With the above relations it is now straightforward to compute P(u, m). For special values of u, we could derive simple relations that offer more insight. The first two cases were discussed previously. There are two codewords that achieve the minimum bound d(x) = 2, namely x = (+1, −1, +1, −1, ...) and its inverse. There are m codewords that achieve the upper bound d(x) = m/2 + 1, namely the codeword starting with the maximum runlength of m/2 '+1's followed by m/2 '−1's, and the m − 1 circular shifts of that codeword. There are 2(2^{m/2} − 2) codewords x with d(x) = 3, namely the 2^{m/2} codewords formed of combinations of the 2-bit words (+1, −1) and (−1, +1), minus the two codewords with d(x) = 2, plus the one-bit circular shifts of those codewords. There are m(m − 4) codewords x with d(x) = m/2, m > 4, namely the m/2 − 2 codewords starting with a runlength of m/2 − 1 '+1's followed by i '−1's, a '+1', and m/2 − i '−1's, 1 ≤ i ≤ m/2 − 2, their inverses, and the m − 1 circular shifts of those codewords. A survey of the above findings is shown in Table I.

TABLE I
P(u, m) VERSUS u

| u       | P(u, m)           |
|---------|-------------------|
| 2       | 2                 |
| 3       | 2(2^{m/2} − 2)    |
| m/2     | m(m − 4), m > 4   |
| m/2 + 1 | m                 |

V. PERFORMANCE COMPUTATIONS

We first compute the average number of bits, H, required to represent the index. The quantity H sets a theoretical limit, as it is not assumed that the prefix is balanced or has an integer number of bits.
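The special-case counts of Table I can be checked by brute-force enumeration for small m. A Python sketch (our own illustration, not the authors' code), using Theorem 1 to evaluate d(x):

```python
import math
from itertools import combinations

def distribution(m):
    """P(u, m): the number of balanced length-m words x with d(x) = u,
    counted by enumerating all binom(m, m/2) balanced words."""
    P = {}
    for pos in combinations(range(m), m // 2):   # positions of the +1s
        x = [-1] * m
        for i in pos:
            x[i] = 1
        z, sums = 0, []
        for s in x:
            z += s
            sums.append(z)
        u = max(sums) - min(sums) + 1            # d(x), by Theorem 1
        P[u] = P.get(u, 0) + 1
    return P
```

For m = 8 this yields P(2) = 2, P(3) = 2(2^4 − 2) = 28, P(4) = 8(8 − 4) = 32, and P(5) = 8, in agreement with Table I; the counts sum to binom(8, 4) = 70.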
There are d(x) different user words that are transformed into the balanced word x, so that we conclude

$$\sum_{u=2}^{m/2+1} u P(u, m) = 2^m. \qquad (9)$$

The average number of bits, H, required to represent the index is given by

$$H = 2^{-m} \sum_{u=2}^{m/2+1} u P(u, m) \log_2 u. \qquad (10)$$

Results of computations of the average prefix length, H, versus user word length m are listed in Table II. As a reference, we listed the base-line redundancy of full sets of balanced codewords, H_0. The difference between the average prefix length H and H_0 is less than 1 percent.

TABLE II
AVERAGE PREFIX LENGTH, H, AND MINIMUM REDUNDANCY, H_0 = m − log_2 binom(m, m/2), VERSUS USER WORD LENGTH m

| m    | H      | H_0    |
|------|--------|--------|
| 64   | 3.3641 | 3.3314 |
| 128  | 3.8616 | 3.8286 |
| 256  | 4.3603 | 4.3272 |
| 512  | 4.8597 | 4.8265 |
| 1024 | 5.3594 | 5.3261 |
| 2048 | 5.8592 | 5.8259 |
| 4096 | 6.3591 | 6.3258 |
| 8192 | 6.8591 | 6.8258 |

The redundancy of the variable-length (VL) balanced code, H, as shown in Table II, is a theoretical minimum. As in Knuth's prior art construction, the VL prefix should be balanced (or should compensate the unbalance of the codeword). To that end, for every integer p, p > 0, we define the integer function B(p) as the smallest even integer q such that $\binom{q}{q/2} \ge p$. Then, assuming that the VL index is mapped onto a balanced prefix, we find with a slight modification of (10) that

$$\hat{H} = 2^{-m} \sum_{u=2}^{m/2+1} u P(u, m) B(u), \qquad (11)$$

where $\hat{H}$ denotes the redundancy of the new construction having balanced prefixes. Figure 1 shows results of computations.

[Fig. 1. Average prefix length as a function of log_2 m of the VL prefix Knuth scheme with balanced prefix.]

As a reference we plotted the curves log_2(m) and ⌈log_2(m)⌉, which show the minimum redundancy and the integer-valued redundancy of Knuth's construction.
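The entries of Table II, and the balanced-prefix redundancy of (11), can be reproduced numerically from Theorem 3. Below is a Python sketch (our own illustration; all names are ours) that computes trace(D_N^m) with exact integer matrix powers and then evaluates (2), (10), and (11):

```python
import math

def path_traces(m, n_max):
    """trace(D_N^m) for N = 0..n_max, where D_N is the N x N tridiagonal
    connection matrix of (5); exact integer arithmetic throughout."""
    out = {0: 0}
    for N in range(1, n_max + 1):
        def matmul(A, B):
            return [[sum(A[i][k] * B[k][j] for k in range(N))
                     for j in range(N)] for i in range(N)]
        P = [[1 if abs(i - j) == 1 else 0 for j in range(N)] for i in range(N)]
        R = [[int(i == j) for j in range(N)] for i in range(N)]
        e = m
        while e:                       # repeated squaring: R = D_N^m
            if e & 1:
                R = matmul(R, P)
            P = matmul(P, P)
            e >>= 1
        out[N] = sum(R[i][i] for i in range(N))
    return out

def B(p):
    """Smallest even q with binom(q, q/2) >= p: the shortest balanced
    block able to represent p distinct index values."""
    q = 0
    while math.comb(q, q // 2) < p:
        q += 2
    return q

def redundancies(m):
    """P(u, m) via Theorem 3, then H of (10), H0 of (2), H-hat of (11)."""
    t = path_traces(m, m // 2 + 1)
    P = {u: t[u] - 2 * t[u - 1] + t[u - 2] for u in range(2, m // 2 + 2)}
    H = sum(u * c * math.log2(u) for u, c in P.items()) / 2 ** m
    H0 = m - math.log2(math.comb(m, m // 2))
    Hhat = sum(u * c * B(u) for u, c in P.items()) / 2 ** m
    return P, H, H0, Hhat
```

For m = 64 this reproduces the first row of Table II (H ≈ 3.3641, H_0 ≈ 3.3314); since B(u) ≥ log_2 u, the balanced-prefix redundancy of (11) always satisfies Ĥ ≥ H.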
We may observe that for m < 64 the redundancy of the fixed-prefix Knuth scheme and that of the VL scheme do not significantly differ. For m > 64, we notice that the average redundancy of the new scheme is less than that of the classic Knuth scheme. We may approach the theoretical minimum by combining a plurality of codewords, say K. Each of the K m-bit user words is balanced as discussed above, and the K prefixes are combined into one 'super prefix'. The super prefix is balanced by a look-up table, or, in case the super prefix is too long for table look-up, we apply Knuth's method. Figure 2 shows some results of computations, where it is assumed that the prefixes of K = 2 and K = 4 words are combined. As a reference we plotted the curve of minimum redundancy as defined by (10). We note that the curve showing the average redundancy of K = 4 combined prefixes is only one bit away from the bound.

[Fig. 2. Average prefix length as a function of log_2 m of the VL prefix Knuth scheme combining K = 1, 2, or 4 m-bit codewords; all schemes have a balanced prefix.]

We will now take a look at the implementation of both the encoder and decoder that exploits the findings of Theorem 2.

VI. IMPLEMENTATION ISSUES

The two coding schemes, which are based on Theorem 2, may use a) a fixed-length or b) a variable-length (VL) prefix. In the first scheme, the prefix length is fixed, as in the conventional scheme. Then the prefix must be able to uniquely encode at most m/2 + 1 indices, requiring only ⌈log_2(m/2 + 1)⌉ bits, so that it is less redundant than the conventional method. In the second scheme, where the prefix length depends on the user data, the prefix length varies between 1 and ⌈log_2(m/2 + 1)⌉ bits. On the average, the VL coding scheme will be more efficient than the first scheme. We will first describe the implementation of the encoder and decoder.

Encoder description: Assume the user data u enters the encoder.
The encoder computes, as in the classic Knuth method, the balancing position I(u), and transmits the balanced word x = u^{I(u)}. The computation of the prefix is more involved. To that end, we first specify an order relation on σ_x. Then we compute the rank, I_u, 0 ≤ I_u ≤ d(x) − 1, of u in the ordered set σ_x, and uniquely translate the rank I_u into a (preferably) balanced prefix. In a scheme with a fixed prefix length, the prefix must accommodate in the worst case m/2 + 1 values of I_u. For the scheme with variable prefix length, the prefix length depends on x, and must accommodate the d(x) possible values of I_u. The (balanced) m-tuple x and the prefix are transmitted to the receiver.

Decoder description: The decoder receives the codeword x = u^{I(u)} and a suitable representation of I_u. In a scheme with fixed prefix length, the decoder can identify the prefix I_u. Then, the decoder generates the ordered set σ_x, retrieves from σ_x the member with rank I_u, and outputs that member. In a VL scheme, the decoder first receives x, and computes d(x). The value of d(x) identifies the prefix length. Then we decode the index I_u, generate the ordered set σ_x, and retrieve the member of σ_x with index I_u. Note that in the VL scheme, the 'prefix' must follow the codeword, and would therefore normally be called a 'suffix'. For heritage reasons, we will use Knuth's term 'prefix', while it should be appreciated that in this context the 'prefix' follows the codeword.

Essential parts of the encoding and decoding operation are the computation of d(x), the generation of the ordered set σ_x, and the computation of the index I_u. The complexity of a straightforward algorithm for computing d(x) grows quadratically with increasing codeword length m. We will discuss some more efficient methods. Let x = u^{I(u)} be a balanced word.
The complexity of the following algorithm grows linearly with m. Define the set of running sum values as Z = {z_k}, and define the binary vector v = (v_{−m/2}, ..., v_{m/2}) as v_i = 1 if i ∈ Z and v_i = 0 if i ∉ Z. As d(x) equals the number of different values that z_i assumes, which equals the weight of v, we find d(x) = Σ v_i. The following code implements the above in MatLab notation:

Algorithm 1
Input: word x(i), 1 ≤ i ≤ m. Output: d(x).

```matlab
v = zeros(1, m+1);          % registers which running-sum values occur
z(1) = x(1);
v(z(1)+m/2+1) = 1;          % the offset m/2+1 is added since MatLab
for i = 2:m                 % does not allow non-positive array indices
    z(i) = z(i-1) + x(i);
    v(z(i)+m/2+1) = 1;
end
dx = sum(v);
```

where z(i) denotes the running sum, x(i) ∈ {−1, +1} denotes the entries of codeword x, and v(i) are the entries of a binary vector registering the occurrence of the running sum values. The sum of the entries of v equals d(x).

We now define the ordering of the members of σ_x. Let x^i and x^j be elements of σ_x. We call x^i less than x^j, in short x^i < x^j, if i < j. The rank of v ∈ σ_x, denoted by I_v, is defined to be the position of v in the ordered list of members of σ_x, i.e., I_v is the number of all y in σ_x with y < v. The following MatLab algorithm finds the rank, I_u, of the user word u = x^{I(u)} in the ordered set σ_x by counting the words less than x^{I(u)}.

Algorithm 2
Input: I(u), v(i), z(i), 1 ≤ i ≤ m, as defined in Algorithm 1. Output: I_u = p.

```matlab
p = 0;
for i = 1:I(u)-1
    if v(z(i)+m/2+1) == 1   % first occurrence of this sum value
        p = p + 1;
        v(z(i)+m/2+1) = 0;  % count each sum value only once
    end
end
```

After execution of the routine, we find I_u = p.

VII. CONCLUSIONS

We have presented a new method for constructing sets of balanced bipolar codewords. The new construction is attractive as it does not use look-up tables and is less redundant than Knuth's prior art construction. We have presented simple algorithms for computing the prefix, encoding, and decoding. We have analyzed the distribution of the prefix lengths, and determined the average efficiency of the new construction.

REFERENCES

[1] K.A.S.
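Algorithms 1 and 2, together with the decoder described above, combine into a complete VL encoder/decoder pair. The following Python transcription is our own illustration; the rank and decoding logic follow the first-occurrence argument used in the proof of Theorem 1 (an index i belongs to σ_x exactly when z_i takes a value not seen earlier):

```python
from itertools import product

def balancing_index(u):
    """I(u): the smallest j such that inverting u_1..u_j balances u."""
    total, z = sum(u), 0
    for j in range(1, len(u) + 1):
        z += u[j - 1]
        if total - 2 * z == 0:
            return j
    raise ValueError("m must be even")

def invert(x, j):
    return tuple(-s for s in x[:j]) + x[j:]

def encode(u):
    """Return (x, r): the balanced word and the rank of u in sigma_x,
    mirroring Algorithms 1 and 2 (count distinct sums before I(u))."""
    j = balancing_index(u)
    x = invert(u, j)
    seen, z, r = set(), 0, 0
    for i in range(1, j):
        z += x[i - 1]
        if z not in seen:
            seen.add(z)
            r += 1
    return x, r

def decode(x, r):
    """Recover u by inverting x at the (r+1)-th first-occurrence
    position, i.e., the (r+1)-th member of sigma_x."""
    seen, z = set(), 0
    for i in range(1, len(x) + 1):
        z += x[i - 1]
        if z not in seen:
            seen.add(z)
            if len(seen) == r + 1:
                return invert(x, i)
```

A round trip over every user word of a given even length returns the original word, which is the uniqueness property the rank I_u was designed to provide.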
Immink, Codes for Mass Data Storage Systems, Second Edition, Shannon Foundation Publishers, Eindhoven, The Netherlands, 2004. ISBN 90-74249-27-2.
[2] D.E. Knuth, 'Efficient Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-32, no. 1, pp. 51-53, Jan. 1986.
[3] N. Alon, E.E. Bergmann, D. Coppersmith, and A.M. Odlyzko, 'Balancing Sets of Vectors', IEEE Trans. Inform. Theory, vol. IT-34, no. 1, pp. 128-130, Jan. 1988.
[4] S. Al-Bassam and B. Bose, 'On Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-36, no. 2, pp. 406-408, March 1990.
[5] L.G. Tallini, R.M. Capocelli, and B. Bose, 'Design of Some New Balanced Codes', IEEE Trans. Inform. Theory, vol. IT-42, pp. 790-802, May 1996.
[6] J.H. Weber and K.A.S. Immink, 'Knuth's Balancing of Codewords Revisited', IEEE International Symposium on Information Theory (ISIT 2008), Toronto, pp. 1567-1571, 6-11 July 2008.
[7] T.M. Chien, 'Upper Bound on the Efficiency of Dc-constrained Codes', Bell Syst. Tech. J., vol. 49, pp. 2267-2287, Nov. 1970.
[8] D.K. Salkuyeh, 'Positive Integer Powers of the Tridiagonal Toeplitz Matrices', International Mathematical Forum, vol. 1, no. 22, pp. 1061-1065, 2006.

Kees Schouhamer Immink received his PhD degree from the Eindhoven University of Technology. He founded and was named president of Turing Machines Inc. in 1998. He is, since 1994, an adjunct professor at the Institute for Experimental Mathematics, Essen University, Germany, and is affiliated with the Nanyang Technological University of Singapore. Immink designed coding techniques for a wealth of digital audio and video recording products, such as the Compact Disc, CD-ROM, CD-Video, the Digital Compact Cassette (DCC) system, the Digital Versatile Disc (DVD), the Video Disc Recorder, and the Blu-ray Disc. He received a Knighthood in 2000, a personal Emmy award in 2004, the 1996 IEEE Masaru Ibuka Consumer Electronics Award, the 1998 IEEE Edison Medal, the 1999 AES Gold and Silver Medals, and the 2004 SMPTE Progress Medal.
He was named a fellow of the IEEE, AES, and SMPTE, was inducted into the Consumer Electronics Hall of Fame, and was elected into the Royal Netherlands Academy of Sciences and the US National Academy of Engineering. He served the profession as President of the Audio Engineering Society, New York, in 2003.

Jos H. Weber (S'87-M'90-SM'00) was born in Schiedam, The Netherlands, in 1961. He received the M.Sc. (in mathematics, with honors), Ph.D., and MBT (Master of Business Telecommunications) degrees from Delft University of Technology, Delft, The Netherlands, in 1985, 1989, and 1996, respectively. Since 1985 he has been with the Faculty of Electrical Engineering, Mathematics, and Computer Science of Delft University of Technology. Currently, he is an associate professor at the Wireless and Mobile Communications Group. He is the chairman of the WIC (Werkgemeenschap voor Informatie- en Communicatietheorie in the Benelux) and the secretary of the IEEE Benelux Chapter on Information Theory. He was a Visiting Researcher at the University of California at Davis, USA, the University of Johannesburg, South Africa, and the Tokyo Institute of Technology, Japan. His main research interests are in the areas of channel and network coding.
The number of balanced words y of length n with r(y) = u, denoted by P (u, n), 2 ≤ u ≤ n/2 + 1, is given by [22] P (u, n) = D(u, n) − 2D(u − 1, n) + D(u − 2, n), (19) where ... ... where v = n/(2u + 2) . The redundancy of the VL tag scheme, denoted by H, equals [22] H = 2 −n n/2+1 u=2 uP (u, n) log 2 u. ... Preprint Full-text available We present and analyze a new systematic construction of bipolar balanced codes where each code word contains equally many −1's and +1's. The new code is minimally modified as the number of symbol changes made to the source word for translating it into a balanced code word is as small as possible. The balanced codes feature low redundancy and time complexity. Large look-up tables are avoided. ... Therefore, in [9], [10], Immink and Weber proposed balancing schemes that transmit variable-length prefixes and studied the average redundancy of their proposals. Specifically, in [9], Weber and Immink provided two variable-length balancing schemes whose average redundancy are asymptotically equal to log 2 n and 1 2 log 2 n + 0.936, respectively. ... ... Specifically, in [9], Weber and Immink provided two variable-length balancing schemes whose average redundancy are asymptotically equal to log 2 n and 1 2 log 2 n + 0.936, respectively. Later in [10], Immink and Weber proposed another variable-length balancing scheme which we study closely in this paper. In [10], Immink and Weber provided closed formulas for the average redundancy of their scheme and computed these values for n 8192. ... ... Later in [10], Immink and Weber proposed another variable-length balancing scheme which we study closely in this paper. In [10], Immink and Weber provided closed formulas for the average redundancy of their scheme and computed these values for n 8192. While numerically the redundancy values are close to the optimal value given in (3), a tight asymptotic analysis was not provided. ... 
Preprint Full-text available We study and propose schemes that map messages onto constant-weight codewords using variable-length prefixes. We provide polynomial-time computable formulas that estimate the average number of redundant bits incurred by our schemes. In addition to the exact formulas, we also perform an asymptotic analysis and demonstrate that our scheme uses$\frac12 \log n+O(1)$redundant bits to encode messages into length-$n$words with weight$(n/2)+{\sf q}$for constant${\sf q}$. ... The generating function offers a tool for enumerating the balanced codes [44,45]. Encoding/decoding of balanced codes has attracted a considerable amount of research and engineering attention [46,47]. ... ... MAXIMUM LIKELIHOOD DECODING 5 83 dc/dc 2 -balanced codes in [45]. Encoding/decoding of balanced codes has attracted a considerable amount of research and engineering attention [46,47]. ... ... Improvements of the traditional Knuth's algorithm are considered in [16], [17]. Of interest is the new prefix coding technique for balancing codewords in [17] with reduced redundant than Knuth's construction. ... ... Improvements of the traditional Knuth's algorithm are considered in [16], [17]. Of interest is the new prefix coding technique for balancing codewords in [17] with reduced redundant than Knuth's construction. The number of bits required to represent the index was achieved by either a fixed prefix length (FPL) method or variable prefix length (VPL) method. ... Article Full-text available Visible light communication (VLC) offers wireless communication within short-range based on wavelength converters and light-emitting diode (LED). In the VLC system, conventional forward error correction (FEC) codes are not guaranteed to provide flicker mitigation and dimming support. Consequently, modified coding schemes are introduced for reliable VLC. 
These methods require complicated coding structures, use of lookup tables, and the addition of large redundancy, resulting to increased computational complexity and low transmission efficiency. In this article, we propose a coding scheme that is flicker-free and enhances the transmission efficiency for VLC systems. The proposed scheme is based on polar codes (PC) and Knuth balancing code with enhanced prefix coding technique. The results show that the proposed algorithm exhibits improved transmission efficiency compared to the PC without and with run-length limited code, for dimming values 75% (or 25%) and 87.5% (or 12.5%). Also, the proposed scheme presents a significant bit error rate (BER) performance gain compared to the schemes in literature. The proposed scheme is flicker-free, provides a simple encoding structure, does not utilize lookup tables, generates minimal number of redundancies for energy efficiency. Thus, the approach is flexible, and it is more suitable for real-time VLC systems. INDEX TERMS Forward error correction, Knuth balancing codes, light-emitting diode, polar codes, visible light communication. ... Note that the phrase "balanced codes" might be used for other concepts in literature, e.g., in [21]. ... Preprint Full-text available This is a manuscript of a chapter prepared for a book. The good codes possess large information length and large minimum distance. A class of codes is said to be asymptotically good if there exists a positive real$\delta$such that, for any positive integer$N$we can find a code in the class with code length greater than$N$, and with both the rate and the relative minimum distance greater than$\delta$. The linear codes over any finite field are asymptotically good. More interestingly, the (asymptotic) GV-bound is a phase transition point for the linear codes; i.e., asymptotically speaking, the parameters of most linear codes attain the GV-bound. 
It is a long-standing open question: whether or not the cyclic codes over a finite field (which are an important class of codes) are asymptotically good? However, from a long time ago the quasi-cyclic codes of index$2$were proved to be asymptotically good. This chapter consists of some of our studies on the asymptotic properties of several classes of quasi-group codes. We'll explain the studies in a consistent and self-contained style. We begin with the classical results on linear codes. In many cases we consider the quasi-group codes over finite abelian groups (including the cyclic case as a subcase of course), and study their asymptotic properties along two directions: (1) the order of the group (the coindex) is fixed while the index is going to infinity; (2) the index is small while the order of the group (the coindex) is going to infinity. Finally we describe the story on dihedral codes. The dihedral groups are non-abelian but near to cyclic groups (they have cyclic subgroups of index$2\$). The asymptotic goodness of binary dihedral codes was obtained in the beginning of this century, and extended to the general dihedral codes recently.
... The main reason is due to the provable difficulty of 2D-constraints compared to 1D-constraints. For example, consider certain weight-constrained codes such as the balanced codes or constant-weight codes, there are several efficient prior-art coding methods for designing 1Dcodes with optimal or almost optimal redundancy [14]- [17]. Here, almost optimal refers to the cases that the encoder's redundancy is at most a constant bit away from the optimal redundancy. ...
Conference Paper
Full-text available
In this work, given n, p > 0, efficient encoding/decoding algorithms are presented for mapping arbitrary data to and from n×n binary arrays in which the weight of every row and every column is at most pn. Such a constraint, referred to as a p-bounded-weight constraint, is crucial for reducing the parasitic currents in crossbar resistive memory arrays and has also been proposed for certain applications of holographic data storage. While low-complexity designs have been proposed in the literature only for the case p = 1/2, this work provides efficient coding methods that work for arbitrary values of p. The coding rate of our proposed encoder approaches the channel capacity for all p.
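As an illustration of the constraint itself (not of the cited construction), a minimal sketch in Python, assuming a 0/1-valued n×n array and the at-most-pn row/column weight bound described above:

```python
def satisfies_weight_bound(arr, p):
    """Check the p-bounded-weight constraint: every row and every
    column of the n x n binary array has weight (number of 1s) <= p*n."""
    n = len(arr)
    limit = p * n
    rows_ok = all(sum(row) <= limit for row in arr)
    cols_ok = all(sum(arr[r][c] for r in range(n)) <= limit
                  for c in range(n))
    return rows_ok and cols_ok
```

For example, the 4×4 identity matrix has row and column weight 1, so it satisfies the bound for any p ≥ 1/4; the cited work concerns encoders whose output always satisfies such a bound, which is a harder problem than checking it.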
... In the serial or sequential scheme, the prefix comprises the weight of the original sequence; complementing is then performed on the overall sequence (prefix and original sequence) up to the balancing point. Improvements and embellishments of Knuth's binary methods can be found in [5]-[10]. ...
Article
A simplified and efficient algorithm with parallel decoding capability was presented by Knuth for balancing binary sequences (sequences of zeros and ones). This study proposes a generalization of this algorithm to q-ary sequences. The new approach retains the simplicity and parallel decoding of Knuth's method for q-ary balanced codes. Furthermore, it has a fixed redundancy, for both short and long sequences, of log_q k, where k is the sequence length, and no lookup tables are required.
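For context, Knuth's binary balancing step can be sketched in a few lines of Python. This is a hedged illustration of the classic binary method (invert an ever-longer prefix until the weight reaches k/2), not the q-ary generalization described above; the function names are ours:

```python
def knuth_balance(word):
    """Knuth's balancing step for a binary word of even length k:
    inverting the first i bits, for some 0 <= i <= k, always yields
    a word of weight exactly k/2 (the weight changes by +/-1 per extra
    inverted bit, so it must pass through k/2). Returns (i, balanced)."""
    k = len(word)
    assert k % 2 == 0, "Knuth's method assumes even length"
    for i in range(k + 1):
        candidate = [1 - b for b in word[:i]] + word[i:]
        if sum(candidate) == k // 2:
            return i, candidate
    raise RuntimeError("unreachable: a balancing index always exists")


def knuth_unbalance(i, balanced):
    """Decoder: re-invert the first i bits to recover the original word."""
    return [1 - b for b in balanced[:i]] + balanced[i:]
```

The index i is transmitted as a (balanced) prefix; the abstract above, like the surrounding article, is concerned with encoding that prefix more efficiently.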
... Since Shannon's 1948 paper [5], the design of CS codes has been an active research area, and efficient CS codes satisfying a great variety of constraints have been proposed. Although most CS codes in the literature are fixed-length codes [1]-[4], [6]-[16], recent advances show that variable-length CS codes have the potential to achieve higher code rates with simpler codebooks [17]-[26]. Since CS codes typically do not have strong error-correction capabilities, decoding of CS codes may result in error propagation. ...
Article
Full-text available
We study the ability of recently developed variable-length constrained sequence codes to determine codeword boundaries in the received sequence, both upon initial receipt of the sequence and when errors in the received sequence cause synchronization to be lost. We first investigate the construction of these codes based on the finite state machine description of a given constraint, and develop new construction criteria to achieve high synchronization probabilities. Given these criteria, we propose a guided partial extension algorithm to construct variable-length constrained sequence codes with high synchronization probabilities. With this algorithm we construct new codes and determine the number of codewords and coded bits needed to recover synchronization once it is lost. We consider a large variety of constraints, including the run-length limited (RLL) constraint, the DC-free constraint, the Pearson constraint, and constraints for inter-cell interference mitigation in flash memories. Simulation results show that the codes we construct exhibit excellent synchronization properties, often resynchronizing within a few bits.
Article
Bazzi and Mitter [4] showed that binary dihedral group codes are asymptotically good. In this paper we prove that the dihedral group codes with the strong duality property over any finite field are asymptotically good. If the characteristic of the field is even, self-dual dihedral group codes are asymptotically good. If the characteristic of the field is odd, maximal self-orthogonal dihedral group codes and LCD dihedral group codes are asymptotically good.
Book
Full-text available
Preface - The advantages of digital audio and video recording have been appreciated for a long time and, of course, computers have long been operated in the digital domain. The advent of ever-cheaper and faster digital circuitry has made feasible the creation of high-end digital video and audio recorders, an impracticable possibility using previous generations of conventional analog hardware. The principal advantage that digital implementation confers over analog systems is that in a well-engineered digital recording system the sole significant degradation takes place at the initial digitization, and the quality lasts until the point of ultimate failure. In an analog system, quality is diminished at each stage of signal processing and the number of recording generations is limited. The quality of analog recordings, like the proverbial 'old soldier', just fades away.
Article
Full-text available
Let $n$ be an arbitrary integer and let $p$ be a prime factor of $n$. Denote by $\omega_1$ the $p$-th primitive root of unity, $\omega_1 := e^{2\pi i/p}$. Define $\omega_i := \omega_1^i$ for $0 \le i \le p-1$ and $B := \{1, \omega_1, \ldots, \omega_{p-1}\}^n \subset \mathbb{C}^n$. Denote by $K(n;p)$ the minimum $k$ for which there exist vectors $v_1, \ldots, v_k \in B$ such that for any vector $w \in B$ there is an $i$, $1 \le i \le k$, with $v_i \cdot w = 0$, where $v \cdot w$ is the usual scalar product of $v$ and $w$. Gröbner basis methods and a linear-algebra proof give the lower bound $K(n;p) \ge n(p-1)$. Galvin posed the following problem: let $m = m(n)$ denote the minimal integer such that there exist subsets $A_1, \ldots, A_m$ of $\{1, \ldots, 4n\}$ with $|A_i| = 2n$ for each $i$, such that for any subset $B \subset [4n]$ with $2n$ elements there is at least one $i$, $1 \le i \le m$, with $A_i \cap B$ having exactly $n$ elements. We obtain here the result $m(p) \ge p$ in the case of primes $p > 3$.
Book
Full-text available