arXiv:1401.4650v1 [cs.IT] 19 Jan 2014
A Gray Code for cross-bifix-free sets
A. Bernini, S. Bilotta, R. Pinzani, V. Vajnovszki
January 12, 2015
Abstract
A cross-bifix-free set of words is a set in which no prefix of any
length of any word is the suffix of any other word in the set. A con-
struction of cross-bifix-free sets has recently been proposed by Chee
et al. in 2013 within a constant factor of optimality. We propose a
trace partitioned Gray code for these cross-bifix-free sets and a CAT
algorithm generating it.
1 Introduction
A cross-bifix-free set of words is a set where, given any two words over
an alphabet, possibly the same, any prefix of the first one is not a suffix of
the second one and vice versa. Cross-bifix-free sets are involved in the study
of distributed sequences for frame synchronization [11]. The problem of
determining such sets is also related to several other scientific applications,
for instance in pattern matching [6] and automata theory [3].
Once the cardinality $q$ of the alphabet and the length $n$ of the words are fixed, a natural problem is the construction of a cross-bifix-free set of cardinality as large as possible. An interesting method has been proposed in [1] for words over a binary alphabet. In a recent paper [5] the authors revisit the construction of [1] and generalize it, obtaining cross-bifix-free sets of greater cardinality over an alphabet of arbitrary size. They also show that their cross-bifix-free sets have a cardinality close to the maximum possible; to our knowledge this is the best result in the literature on the size of cross-bifix-free sets.
It is worth mentioning that an intermediate step between the original method [1] and its generalization in [5] has been proposed in [4]: it consists of a different construction of binary cross-bifix-free sets, based on lattice paths, which yields sets of greater cardinality than those in [1].

Dipartimento di Matematica e Informatica “U. Dini”, Università degli Studi di Firenze, Viale G.B. Morgagni 65, 50134 Firenze, Italy. antonio.bernini@unifi.it, stefano.bilotta@unifi.it, renzo.pinzani@unifi.it
LE2I, Université de Bourgogne, BP 47 870, 21078 Dijon Cedex, France. vvajnov@u-bourgogne.fr
Once a class of objects is defined, in our case words, it is often useful to list or generate them according to a particular criterion. A special way to do this is to generate them so that any two consecutive words differ as little as possible, i.e., in Gray code order [8]. When the objects are words, as in our case, we can specialize the concept of Gray code by saying that it is an infinite set of word-lists with unbounded word-length such that the Hamming distance between any two adjacent words is bounded independently of the word-length [18] (the Hamming distance is the number of positions in which two successive words differ [9]). Gray codes find useful applications in circuit testing, signal encoding, data compression, telegraphy, error correction in digital communication and others. They are also widely studied in the context of combinatorial objects such as permutations [10], Motzkin and Schröder words [16], derangements [2], involutions [17], compositions, combinations, set-partitions [12, 14], and so on.
In this work we propose a Gray code for the cross-bifix-free set $S^{(k)}_{n,q}$ defined in [5]. It is formed by length $n$ words over the $q$-ary alphabet $A=\{0,1,\ldots,q-1\}$ containing a particular sub-word avoiding $k$ consecutive 0's (for more details see the next section). First we propose a Gray code for $S^{(k)}_{n,2}$ over the binary alphabet $\{0,1\}$, then we expand each binary word to the alphabet $A$. The expansion of a binary word $\alpha$ is obtained by replacing all the 1's with the symbols of $A$ different from 0, producing a set of words with the same trace $\alpha$. The Gray code we get is trace partitioned in the sense that all the words with the same trace are consecutive.
2 Definitions and tools
Let $n \ge 3$, $q \ge 2$ and $1 \le k \le n-2$. The cross-bifix-free set $S^{(k)}_{n,q}$ defined in [5] is the set of all length $n$ words $s_1 s_2 \cdots s_n$ over the alphabet $\{0,\ldots,q-1\}$ satisfying:
• $s_1 = \cdots = s_k = 0$;
• $s_{k+1} \neq 0$;
• $s_n \neq 0$;
• the subword $s_{k+2} \ldots s_{n-1}$ does not contain $k$ consecutive 0's.
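The four conditions above can be checked by brute force for small parameters. The following is a minimal sketch, not part of the construction in [5]: the names `in_S`, `S_set` and `is_cross_bifix_free` are hypothetical, and words are represented as 0-indexed tuples of integers.

```python
from itertools import product

def in_S(s, q, k):
    """Membership test for S^(k)_{n,q}; s is a tuple over 0..q-1 (q only documents the alphabet)."""
    n = len(s)
    middle = s[k + 1:n - 1]          # s_{k+2} ... s_{n-1} in the 1-based notation of the text
    has_run = any(all(x == 0 for x in middle[i:i + k])
                  for i in range(len(middle) - k + 1))
    return all(x == 0 for x in s[:k]) and s[k] != 0 and s[-1] != 0 and not has_run

def S_set(n, q, k):
    """All words of S^(k)_{n,q}, found by filtering the q^n candidates (small n only)."""
    return [s for s in product(range(q), repeat=n) if in_S(s, q, k)]

def is_cross_bifix_free(words):
    """True if no proper prefix of a word equals a suffix of any word (itself included)."""
    n = len(words[0])
    suffixes = {w[n - t:] for w in words for t in range(1, n)}
    return all(w[:t] not in suffixes for w in words for t in range(1, n))
```

For $n=8$ and $k=3$ this enumeration gives 7 binary words and 104 ternary words, and `is_cross_bifix_free` confirms the defining property on both sets.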
Throughout this paper we use several standard notations which are typical in the framework of sets and lists of words. For the sake of clarity we summarize the ones used here.
For a set of words $L$ over an alphabet $A$ we denote by $\mathcal{L}$ an ordered list for $L$, and
• $\overline{\mathcal{L}}$ denotes the list obtained by covering $\mathcal{L}$ in reverse order;
• if $\mathcal{L}'$ is another list, then $\mathcal{L} \circ \mathcal{L}'$ is the concatenation of the two lists, obtained by appending the words of $\mathcal{L}'$ after those of $\mathcal{L}$;
• $\mathrm{first}(\mathcal{L})$ and $\mathrm{last}(\mathcal{L})$ are the first and the last word of $\mathcal{L}$, respectively;
• if $u$ is a word in $A^*$, then $u \cdot \mathcal{L}$ (resp. $\mathcal{L} \cdot u$) is a new list where each word has the form $u\omega$ (resp. $\omega u$) where $\omega$ is any word of $\mathcal{L}$;
• if $u$ is a word in $A^*$, then $|u|$ is its length, and $u^n = \underbrace{uu \cdots u}_{n}$.
For our purpose we need a Gray code list for the set of words of a certain length over the $(q-1)$-ary alphabet $\{1,2,\ldots,q-1\}$, $q \ge 3$. An obvious generalization of the Binary Reflected Gray Code [8] to the alphabet $\{1,2,\ldots,q-1\}$ is the list $\mathcal{G}_{n,q}$ for the set of words $\{1,2,\ldots,q-1\}^n$ defined in [7, 19], where it is also shown that it is a Gray code with Hamming distance 1. The authors defined this list as:
$$
\mathcal{G}_{n,q} =
\begin{cases}
\lambda & \text{if } n = 0,\\
1 \cdot \mathcal{G}_{n-1,q} \circ 2 \cdot \overline{\mathcal{G}_{n-1,q}} \circ \cdots \circ (q-1) \cdot \mathcal{G}'_{n-1,q} & \text{if } n > 0,
\end{cases}
\qquad (1)
$$
where $\mathcal{G}'_{n-1,q}$ is $\mathcal{G}_{n-1,q}$ or $\overline{\mathcal{G}_{n-1,q}}$ according to whether $q$ is even or odd. The reader can easily verify the following proposition.
Proposition 2.1. For $q \ge 3$,
• $\mathrm{first}(\mathcal{G}_{n,q}) = 1^n$;
• $\mathrm{last}(\mathcal{G}_{n,q}) = (q-1)1^{n-1}$ if $q$ is odd, and $(q-1)^n$ if $q$ is even.
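Relation (1) and Proposition 2.1 can be illustrated with a direct recursive transcription. This is only a sketch for small parameters; the function names `gray_list` and `hamming` are assumptions introduced here, and words are tuples over $\{1,\ldots,q-1\}$.

```python
def gray_list(n, q):
    """Reflected Gray code G_{n,q} over {1, ..., q-1}, following relation (1)."""
    if n == 0:
        return [()]                      # the list containing only the empty word
    sub = gray_list(n - 1, q)
    out = []
    for a in range(1, q):
        # blocks with an even leading symbol use the reversed sublist
        block = sub if a % 2 == 1 else sub[::-1]
        out.extend((a,) + w for w in block)
    return out

def hamming(x, y):
    """Number of positions where two equal-length words differ."""
    return sum(s != t for s, t in zip(x, y))
```

For instance, `gray_list(3, 3)` lists the eight words over $\{1,2\}$ used in the example of this section, and consecutive words are at Hamming distance 1.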
Now we present another tool needed in the paper. If $\beta$ is a binary word of length $n$ such that $|\beta|_1 = t$ (the number of 1's in $\beta$), we define the expansion of $\beta$, denoted by $\epsilon(\beta)$, as the list of $(q-1)^t$ words, where the $i$-th word is obtained by replacing the $t$ 1's of $\beta$ by the $t$ symbols (read from left to right) of the $i$-th word in $\mathcal{G}_{t,q}$. For example, if $q = 3$ and $\beta = 01011$ (the trace), then $\mathcal{G}_{3,3} = (111, 112, 122, 121, 221, 222, 212, 211)$ and $\epsilon(\beta) = (01011, 01012, 01022, 01021, 02021, 02022, 02012, 02011)$. Notice that in particular $\mathrm{first}(\epsilon(\beta)) = \beta$ and all the words of $\epsilon(\beta)$ have the same trace.
We observe that $\epsilon(\beta)$ is the list obtained from $\mathcal{G}_{t,q}$ by inserting some 0's, each time in the same positions. Since $\mathcal{G}_{t,q}$ is a Gray code and the insertion of the 0's does not change the Hamming distance between two successive words of $\epsilon(\beta)$ (which is 1), the following proposition holds.
Proposition 2.2. For any $q \ge 3$ and binary word $\beta$, the list $\epsilon(\beta)$ is a Gray code.
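The expansion operation can be sketched in a few lines on top of the recursive definition of $\mathcal{G}_{t,q}$. The names `gray_list` and `expansion` are hypothetical, and the trace $\beta$ is assumed to be given as a tuple of 0's and 1's.

```python
def gray_list(n, q):
    """Reflected Gray code G_{n,q} over {1, ..., q-1}, relation (1)."""
    if n == 0:
        return [()]
    sub = gray_list(n - 1, q)
    out = []
    for a in range(1, q):
        block = sub if a % 2 == 1 else sub[::-1]
        out.extend((a,) + w for w in block)
    return out

def expansion(beta, q):
    """The list epsilon(beta): the 1's of the trace beta are replaced, left to
    right, by the symbols of each word of G_{t,q}, where t = number of 1's."""
    t = sum(beta)
    out = []
    for g in gray_list(t, q):
        symbols = iter(g)
        out.append(tuple(next(symbols) if bit else 0 for bit in beta))
    return out
```

Calling `expansion((0, 1, 0, 1, 1), 3)` reproduces the eight words of the example with trace 01011.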
3 Trace partitioned Gray code for $S^{(k)}_{n,q}$
Our construction of a Gray code for the set $S^{(k)}_{n,q}$ of cross-bifix-free words is based on two other lists:
• $\mathcal{F}^{(k)}_n$, a Gray code for the set of binary words of length $n$ avoiding $k$ consecutive 0's, and
• $\mathcal{H}^{(k)}_{n,q}$, a Gray code for the set of $q$-ary words of length $n$ which begin and end with a non-zero value and avoid $k$ consecutive 0's. In particular, $\mathcal{H}^{(k)}_{n,2} = 1 \cdot \mathcal{F}^{(k)}_{n-2} \cdot 1$.
Finally, we will define the Gray code list $\mathcal{S}^{(k)}_{n,q}$ for the set $S^{(k)}_{n,q}$ as $0^k \cdot \mathcal{H}^{(k)}_{n-k,q}$.
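3.1 The list $\mathcal{F}^{(k)}_n$
Let $\mathcal{C}_n$ be the list of binary words defined as:
$$
\mathcal{C}_n =
\begin{cases}
\lambda & \text{if } n = 0,\\
1 \cdot \overline{\mathcal{C}_{n-1}} \circ 0 \cdot \mathcal{C}_{n-1} & \text{if } n \ge 1,
\end{cases}
\qquad (2)
$$
with $\lambda$ the empty word. The list $\mathcal{C}_n$ is a Gray code for the set $\{0,1\}^n$ and it is a slight modification of the original Binary Reflected Gray Code list defined in [8].
By the definition of $\mathcal{C}_n$ given in relation (2), we have for $n \ge 1$,
• $\mathrm{last}(\mathcal{C}_n) = 0 \cdot \mathrm{last}(\mathcal{C}_{n-1}) = 0^n$;
• $\mathrm{first}(\mathcal{C}_n) = 1 \cdot \mathrm{first}(\overline{\mathcal{C}_{n-1}}) = 1 \cdot \mathrm{last}(\mathcal{C}_{n-1}) = 10^{n-1}$.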
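Let us now define the list $\mathcal{F}^{(k)}_n$ of length $n$ binary words as:
$$
\mathcal{F}^{(k)}_n =
\begin{cases}
\mathcal{C}_n & \text{if } 0 \le n < k,\\
1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}} \circ 01 \cdot \overline{\mathcal{F}^{(k)}_{n-2}} \circ 001 \cdot \overline{\mathcal{F}^{(k)}_{n-3}} \circ \cdots \circ 0^{k-1}1 \cdot \overline{\mathcal{F}^{(k)}_{n-k}} & \text{if } n \ge k.
\end{cases}
\qquad (3)
$$
For $k \ge 2$ and $n \ge 0$, $\mathcal{F}^{(k)}_n$ is a list for the set of length $n$ binary words with no $k$ consecutive 0's, and Proposition 3.2 says that it is a Gray code (actually, $\mathcal{F}^{(k)}_n$ is an adaptation of a similar list defined earlier [15]).
It is easy to see that the number of binary words in $\mathcal{F}^{(k)}_n$ is given by $f^{(k)}_n$, the well known $k$-Fibonacci integer sequence defined by:
$$
f^{(k)}_n =
\begin{cases}
2^n & \text{if } 0 \le n < k,\\
f^{(k)}_{n-1} + f^{(k)}_{n-2} + \cdots + f^{(k)}_{n-k} & \text{if } n \ge k,
\end{cases}
$$
and the words in $\mathcal{F}^{(k)}_n$ are called $k$-generalized Fibonacci words. For example, the list $\mathcal{F}^{(3)}_3$ for the length 3 binary words avoiding 3 consecutive 0's is
$\mathcal{F}^{(3)}_3 = (100, 101, 111, 110, 010, 011, 001)$.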
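Relations (2) and (3) translate directly into a recursive construction, which can be used to reproduce the example and to compare the list sizes with the $k$-Fibonacci numbers. This is an illustrative sketch for small $n$ only; `C`, `F` and `kfib` are names chosen here, and words are tuples of bits.

```python
def C(n):
    """The list C_n of relation (2): 1 . reverse(C_{n-1}) followed by 0 . C_{n-1}."""
    if n == 0:
        return [()]
    prev = C(n - 1)
    return [(1,) + w for w in reversed(prev)] + [(0,) + w for w in prev]

def F(n, k):
    """The list F^(k)_n of relation (3)."""
    if n < k:
        return C(n)
    out = []
    for j in range(1, k + 1):
        head = (0,) * (j - 1) + (1,)          # prefix 0^{j-1} 1
        out.extend(head + w for w in reversed(F(n - j, k)))
    return out

def kfib(n, k):
    """k-Fibonacci number f^(k)_n."""
    return 2 ** n if n < k else sum(kfib(n - j, k) for j in range(1, k + 1))
```

With these definitions, `F(3, 3)` yields the seven words listed above, and `len(F(n, k))` agrees with `kfib(n, k)` for the small values tried.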
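Proposition 3.1.
• $\mathrm{first}(\mathcal{F}^{(k)}_n)$ is the length $n$ prefix of the infinite periodic word $(10^{k-1}1)(10^{k-1}1)\ldots$;
• $\mathrm{last}(\mathcal{F}^{(k)}_n)$ is the length $n$ prefix of the infinite periodic word $(0^{k-1}11)(0^{k-1}11)\ldots$.
Proof. For the first point, if $1 \le n < k$, then $\mathrm{first}(\mathcal{F}^{(k)}_n) = \mathrm{first}(\mathcal{C}_n) = 10^{n-1}$; and if $n = k$, then $\mathrm{first}(\mathcal{F}^{(k)}_n) = 1 \cdot \mathrm{first}(\overline{\mathcal{F}^{(k)}_{n-1}}) = 1 \cdot \mathrm{last}(\mathcal{C}_{n-1}) = 10^{k-1}$, and the statement holds in both cases.
Now, if $n > k$, by the definition of $\mathcal{F}^{(k)}_n$ we have
$$
\begin{aligned}
\mathrm{first}(\mathcal{F}^{(k)}_n) &= 1 \cdot \mathrm{first}(\overline{\mathcal{F}^{(k)}_{n-1}})\\
&= 1 \cdot \mathrm{last}(\mathcal{F}^{(k)}_{n-1})\\
&= 10^{k-1}1 \cdot \mathrm{last}(\overline{\mathcal{F}^{(k)}_{n-k-1}})\\
&= 10^{k-1}1 \cdot \mathrm{first}(\mathcal{F}^{(k)}_{n-k-1}),
\end{aligned}
$$
and recursion on $n$ completes the proof.
For the second point, if $1 \le n < k$, then $\mathrm{last}(\mathcal{F}^{(k)}_n) = \mathrm{last}(\mathcal{C}_n) = 0^n$; and if $n = k$, then $\mathrm{last}(\mathcal{F}^{(k)}_n) = 0^{k-1}1$, and the statement holds in both cases.
Now, if $n > k$, we have
$$
\begin{aligned}
\mathrm{last}(\mathcal{F}^{(k)}_n) &= 0^{k-1}1 \cdot \mathrm{last}(\overline{\mathcal{F}^{(k)}_{n-k}})\\
&= 0^{k-1}1 \cdot \mathrm{first}(\mathcal{F}^{(k)}_{n-k}),
\end{aligned}
$$
and by the first point of the present proposition, recursion on $n$ completes the proof.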
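Proposition 3.2. The list $\mathcal{F}^{(k)}_n$ is a Gray code where two consecutive strings differ in a single position.
Proof. It is enough to prove that there is a 'smooth' transition between any two consecutive lists in the definition of $\mathcal{F}^{(k)}_n$ given in relation (3), that is, for any $\ell$, $1 \le \ell \le k-1$, the words
$$\alpha = 0^{\ell-1}1 \cdot \mathrm{last}(\overline{\mathcal{F}^{(k)}_{n-\ell}}) = 0^{\ell-1}1 \cdot \mathrm{first}(\mathcal{F}^{(k)}_{n-\ell})$$
and
$$\beta = 0^{\ell}1 \cdot \mathrm{first}(\overline{\mathcal{F}^{(k)}_{n-\ell-1}}) = 0^{\ell}1 \cdot \mathrm{last}(\mathcal{F}^{(k)}_{n-\ell-1})$$
differ in a single position. By Proposition 3.1,
$$\alpha = 0^{\ell-1}1\alpha'$$
and
$$\beta = 0^{\ell}1\beta'$$
with $\alpha'$ and $\beta'$ prefixes of appropriate length of $(10^{k-1}1)(10^{k-1}1)\ldots$ and $(0^{k-1}11)(0^{k-1}11)\ldots$, respectively. Since the first periodic word is the letter 1 followed by the second one, and $|\alpha'| = |\beta'| + 1$, we have $\alpha' = 1\beta'$, and so $\alpha$ and $\beta$ differ only in position $\ell$.
As a by-product of the proof of the previous proposition we have the following remark, which is critical for procedure process used by the generating algorithm in Section 4.2.
Remark 1. If $\alpha = a_1 a_2 \ldots a_n$ and $\beta = b_1 b_2 \ldots b_n$ are two successive words in $\mathcal{F}^{(k)}_n$ which differ in position $\ell$, then either $\ell = n$ or $a_{\ell+1} = b_{\ell+1} = 1$.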
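Proposition 3.1, Proposition 3.2 and Remark 1 can be checked numerically on small instances. The sketch below rebuilds the list from relations (2) and (3); the helper names (`C`, `F`, `periodic_prefix`, `check`) are hypothetical, and positions are 0-based in the code while the text uses 1-based positions.

```python
def C(n):
    """List C_n of relation (2)."""
    if n == 0:
        return [()]
    prev = C(n - 1)
    return [(1,) + w for w in reversed(prev)] + [(0,) + w for w in prev]

def F(n, k):
    """List F^(k)_n of relation (3)."""
    if n < k:
        return C(n)
    out = []
    for j in range(1, k + 1):
        out.extend((0,) * (j - 1) + (1,) + w for w in reversed(F(n - j, k)))
    return out

def periodic_prefix(block, n):
    """Length-n prefix of the infinite word block block block ..."""
    return tuple(block[i % len(block)] for i in range(n))

def check(n, k):
    """True if F^(k)_n satisfies Proposition 3.1, Proposition 3.2 and Remark 1."""
    words = F(n, k)
    ok = words[0] == periodic_prefix((1,) + (0,) * (k - 1) + (1,), n)
    ok = ok and words[-1] == periodic_prefix((0,) * (k - 1) + (1, 1), n)
    for a, b in zip(words, words[1:]):
        diff = [i for i in range(n) if a[i] != b[i]]
        # exactly one changed position; it is the last one or is followed by a 1
        ok = ok and len(diff) == 1 and (diff[0] == n - 1 or a[diff[0] + 1] == 1)
    return ok
```

A call such as `check(7, 3)` exercises all three statements at once on the 81 words of $\mathcal{F}^{(3)}_7$.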
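3.2 The list $\mathcal{H}^{(k)}_{n,q}$
Let $\mathcal{H}^{(k)}_{n,q}$ be the list defined by:
$$
\mathcal{H}^{(k)}_{n,q} = \epsilon(\alpha_1) \circ \overline{\epsilon(\alpha_2)} \circ \epsilon(\alpha_3) \circ \overline{\epsilon(\alpha_4)} \circ \cdots \circ \epsilon'(\alpha_{f^{(k)}_{n-2}})
\qquad (4)
$$
with $\alpha_i = 1\phi_i1$, where $\phi_i$ is the $i$-th binary word in the list $\mathcal{F}^{(k)}_{n-2}$, and $\epsilon'(\alpha_{f^{(k)}_{n-2}})$ is $\epsilon(\alpha_{f^{(k)}_{n-2}})$ or $\overline{\epsilon(\alpha_{f^{(k)}_{n-2}})}$ according to whether $f^{(k)}_{n-2}$ is odd or even.
Clearly, $\mathcal{H}^{(k)}_{n,q}$ is a list for the set of $q$-ary words of length $n$ which begin and end with a non-zero value, and with no $k$ consecutive 0's. In particular, $\mathcal{H}^{(k)}_{n,2} = 1 \cdot \mathcal{F}^{(k)}_{n-2} \cdot 1$.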
Proposition 3.3. The list $\mathcal{H}^{(k)}_{n,q}$ is a Gray code.
Proof. From Proposition 2.2 it follows that consecutive words in each list $\epsilon(\alpha_i)$ and $\overline{\epsilon(\alpha_i)}$ differ in a single position (and by $+1$ or $-1$ in this position). To prove the statement it is enough to show that, for two consecutive binary words $\phi_i$ and $\phi_{i+1}$ in $\mathcal{F}^{(k)}_{n-2}$, both pairs of words
• $\mathrm{last}(\epsilon(1\phi_i1))$ and $\mathrm{first}(\overline{\epsilon(1\phi_{i+1}1)}) = \mathrm{last}(\epsilon(1\phi_{i+1}1))$, and
• $\mathrm{last}(\overline{\epsilon(1\phi_i1)}) = \mathrm{first}(\epsilon(1\phi_i1))$ and $\mathrm{first}(\epsilon(1\phi_{i+1}1))$
differ in a single position.
In the first case, by Proposition 2.1, the first symbols of $\mathrm{last}(\epsilon(1\phi_i1))$ and of $\mathrm{last}(\epsilon(1\phi_{i+1}1))$ are both $(q-1)$, and the other non-zero symbols are either 1 if $q$ is odd, or $(q-1)$ if $q$ is even; and since $\phi_i$ and $\phi_{i+1}$ differ in a single position, the result holds.
In the second case, $\mathrm{first}(\epsilon(1\phi_i1)) = 1\phi_i1$ and $\mathrm{first}(\epsilon(1\phi_{i+1}1)) = 1\phi_{i+1}1$, and again the result holds.
3.3 The list $\mathcal{S}^{(k)}_{n,q}$
Now we define the list $\mathcal{S}^{(k)}_{n,q}$ as
$$\mathcal{S}^{(k)}_{n,q} = 0^k \cdot \mathcal{H}^{(k)}_{n-k,q},$$
and clearly, $\mathcal{S}^{(k)}_{n,q}$ is a list for the set of cross-bifix-free words $S^{(k)}_{n,q}$. In particular,
$$\mathcal{S}^{(k)}_{n,2} = 0^k1 \cdot \mathcal{F}^{(k)}_{n-k-2} \cdot 1.$$
For example, the list $\mathcal{S}^{(3)}_{8,2}$ of length 8 binary cross-bifix-free words which begin with 000 is
$\mathcal{S}^{(3)}_{8,2} = 0001 \cdot \mathcal{F}^{(3)}_3 \cdot 1 = (00011001, 00011011, 00011111, 00011101, 00010101, 00010111, 00010011)$.
A consequence of Proposition 3.3 is the next proposition.
Proposition 3.4. The list $\mathcal{S}^{(k)}_{n,q}$ is a Gray code.
For the sake of clarity, we illustrate the previous construction for the Gray code list $\mathcal{S}^{(3)}_{8,3}$ on the alphabet $A = \{0,1,2\}$. We have:
$\mathcal{G}_{3,3} = (111, 112, 122, 121, 221, 222, 212, 211)$;
$\mathcal{G}_{4,3} = (1111, 1112, 1122, 1121, 1221, 1222, 1212, 1211, 2211, 2212, 2222, 2221, 2121, 2122, 2112, 2111)$;
$\mathcal{G}_{5,3} = (11111, \ldots, 12111, 22111, \ldots, 21111)$;
and
$\mathcal{S}^{(3)}_{8,3}$ = (00011001, 00011002, 00012002, 00012001, 00022001, 00022002, 00021002, 00021001, 00021011, ..., 00011011, 00011111, ..., 00021111, 00021101, ..., 00011101, 00010101, 00010102, 00010202, 00010201, 00020201, 00020202, 00020102, 00020101, 00020111, ..., 00010111, 00010011, 00010012, 00010022, 00010021, 00020021, 00020022, 00020012, 00020011).
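The worked example can be reproduced by composing the definitions of Sections 2 and 3 literally: relation (1) for $\mathcal{G}$, the expansion $\epsilon$, relations (2) and (3) for $\mathcal{F}$, relation (4) for $\mathcal{H}$, and the prefix $0^k$. The sketch below builds whole lists in memory, so it is meant for checking small cases and is not the constant amortized time algorithm of Section 4; all function names are assumptions made here.

```python
def gray_list(n, q):
    """G_{n,q}, relation (1)."""
    if n == 0:
        return [()]
    sub = gray_list(n - 1, q)
    return [(a,) + w for a in range(1, q)
            for w in (sub if a % 2 == 1 else sub[::-1])]

def expansion(beta, q):
    """epsilon(beta): fill the 1's of beta with the words of G_{t,q}."""
    out = []
    for g in gray_list(sum(beta), q):
        symbols = iter(g)
        out.append(tuple(next(symbols) if bit else 0 for bit in beta))
    return out

def C(n):
    """C_n, relation (2)."""
    if n == 0:
        return [()]
    prev = C(n - 1)
    return [(1,) + w for w in reversed(prev)] + [(0,) + w for w in prev]

def F(n, k):
    """F^(k)_n, relation (3)."""
    if n < k:
        return C(n)
    return [(0,) * (j - 1) + (1,) + w
            for j in range(1, k + 1) for w in reversed(F(n - j, k))]

def H(n, q, k):
    """H^(k)_{n,q}, relation (4): expansions of 1.phi.1, every second one reversed."""
    out = []
    for i, phi in enumerate(F(n - 2, k)):
        block = expansion((1,) + phi + (1,), q)
        out.extend(block if i % 2 == 0 else block[::-1])
    return out

def S_list(n, q, k):
    """S^(k)_{n,q} = 0^k . H^(k)_{n-k,q}."""
    return [(0,) * k + w for w in H(n - k, q, k)]
```

`S_list(8, 3, 3)` has 104 words; its first nine and last eight words coincide with those displayed above, and `S_list(8, 2, 3)` gives the binary list $\mathcal{S}^{(3)}_{8,2}$.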
4 Algorithmic considerations
In this section we give a generating algorithm for the binary words in the list $\mathcal{F}^{(k)}_n$ and an algorithm expanding binary words; then, combining them, we obtain a generating algorithm for the list $\mathcal{H}^{(k)}_{n,q}$, and finally, by prepending $0^k$ to each word in $\mathcal{H}^{(k)}_{n-k,q}$, the list $\mathcal{S}^{(k)}_{n,q}$ is obtained. The given algorithms are shown to be efficient.
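The list $\mathcal{F}^{(k)}_n$ defined in (3) does not have a straightforward algorithmic implementation. We now explain how $\mathcal{F}^{(k)}_n$ can be defined recursively as the concatenation of at most two lists, and then we give a generating algorithm for it. Let $\mathcal{F}^{(k)}_n(u)$, $0 \le u \le k-1$, be the sublist of $\mathcal{F}^{(k)}_n$ formed by the strings beginning with at most $u$ 0's. By the definition of $\mathcal{F}^{(k)}_n$, it follows that $\mathcal{F}^{(k)}_n = \mathcal{F}^{(k)}_n(k-1)$, and
$$
\begin{aligned}
\mathcal{F}^{(k)}_n(0) &= 1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}}\\
&= 1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}(k-1)},
\end{aligned}
$$
and for $u > 0$,
$$
\begin{aligned}
\mathcal{F}^{(k)}_n(u) &= 1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}} \circ 01 \cdot \overline{\mathcal{F}^{(k)}_{n-2}} \circ \cdots \circ 0^{u}1 \cdot \overline{\mathcal{F}^{(k)}_{n-u-1}}\\
&= 1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}} \circ 0 \cdot \left(1 \cdot \overline{\mathcal{F}^{(k)}_{n-2}} \circ \cdots \circ 0^{u-1}1 \cdot \overline{\mathcal{F}^{(k)}_{n-u-1}}\right)\\
&= 1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}} \circ 0 \cdot \mathcal{F}^{(k)}_{n-1}(u-1).
\end{aligned}
$$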
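By the above considerations we have the following proposition.
Proposition 4.1. Let $k \ge 2$, $0 \le u \le k-1$, and $\mathcal{F}^{(k)}_n(u)$ be the list defined as:
$$
\mathcal{F}^{(k)}_n(u) =
\begin{cases}
\lambda & \text{if } n = 0,\\
1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}(k-1)} & \text{if } n > 0 \text{ and } u = 0,\\
1 \cdot \overline{\mathcal{F}^{(k)}_{n-1}(k-1)} \circ 0 \cdot \mathcal{F}^{(k)}_{n-1}(u-1) & \text{if } n, u > 0.
\end{cases}
\qquad (5)
$$
Then $\mathcal{F}^{(k)}_n(k-1)$ is the list $\mathcal{F}^{(k)}_n$ defined by relation (3).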
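Proposition 4.1 can be checked on small cases by implementing relation (5) next to relation (3) and comparing the two lists. This is a verification sketch only; `F_u`, `F` and `C` are names introduced here, with words as tuples of bits.

```python
def C(n):
    """C_n, relation (2)."""
    if n == 0:
        return [()]
    prev = C(n - 1)
    return [(1,) + w for w in reversed(prev)] + [(0,) + w for w in prev]

def F(n, k):
    """F^(k)_n, relation (3)."""
    if n < k:
        return C(n)
    return [(0,) * (j - 1) + (1,) + w
            for j in range(1, k + 1) for w in reversed(F(n - j, k))]

def F_u(n, u, k):
    """F^(k)_n(u), relation (5): words of F^(k)_n beginning with at most u 0's."""
    if n == 0:
        return [()]
    head = [(1,) + w for w in reversed(F_u(n - 1, k - 1, k))]
    if u == 0:
        return head
    return head + [(0,) + w for w in F_u(n - 1, u - 1, k)]
```

For example, `F_u(3, 2, 3)` and `F(3, 3)` return the same seven words in the same order.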
Now we explain how relation (5) defining the list $\mathcal{F}^{(k)}_n(u)$ can be implemented in a generating algorithm. It is easy to check that $\mathcal{F}^{(k)}_n = \mathcal{F}^{(k)}_n(k-1)$ has the following properties: for $\alpha = a_1 a_2 \ldots a_n$ and $\beta = b_1 b_2 \ldots b_n$ two consecutive binary words in $\mathcal{F}^{(k)}_n$, there is a $p$ such that
• $a_i = b_i$ for all $i$, $1 \le i \le n$, except $b_p = 1 - a_p$,
• $0^{k-1}$ cannot be a suffix of $a_1 a_2 \ldots a_{p-1} = b_1 b_2 \ldots b_{p-1}$,
• the sublist of $\mathcal{F}^{(k)}_n$ formed by the strings with the prefix $b_1 b_2 \ldots b_p$ is $b_1 b_2 \ldots b_p \cdot \mathcal{L}$, where $\mathcal{L}$ is $\mathcal{F}^{(k)}_{n-p}(u-1)$ or $\overline{\mathcal{F}^{(k)}_{n-p}(u-1)}$ according to whether the prefix $b_1 b_2 \ldots b_p$ has an even or odd number of 1's, and $u$ is equal to $k$ minus the length of the maximal 0 suffix of $b_1 b_2 \ldots b_p$.
Let us consider procedure gen_fib in Figure 1, where process switches the value of b[pos] (that is, b[pos] := 1 − b[pos]) and prints the obtained binary string b. By the above remarks and relation (5) in Proposition 4.1 it follows that, after the initialization of b by the first string in $\mathcal{F}^{(k)}_m$ (given in Proposition 3.1) and printing it out, the call of gen_fib(1, k − 1, 0) produces the list $\mathcal{F}^{(k)}_m$. Moreover, as we will show below, for $m = n-1$ and after the appropriate initialization of $b = b_1 b_2 \ldots b_n$, the call of gen_fib(k + 2, k − 1, 0) produces the list $0^k1 \cdot \mathcal{F}^{(k)}_{n-k-2} \cdot 1 = \mathcal{S}^{(k)}_{n,2}$.
Procedure gen_fib is an efficient generating procedure. Indeed, each recursive call induced by gen_fib is either
• a terminal call (which does not produce other calls), or
• a call producing two recursive calls, or
• a call producing one recursive call, which in turn is in one of the previous two cases.
By the 'CAT' principle in [13] it follows that procedure gen_fib runs in constant amortized time.
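Procedure gen_fib of Figure 1 can be transcribed almost line by line. In the sketch below the global variables of the pseudocode become local variables of a wrapper function, `process` only flips one bit and records the word, and the wrapper name `gen_fib_list` as well as the parameter name `direction` (for dir) are assumptions made for this illustration.

```python
def gen_fib_list(m, k):
    """Words of F^(k)_m in the order produced by the call gen_fib(1, k-1, 0)."""
    # first word: length-m prefix of (1 0^{k-1} 1)(1 0^{k-1} 1)..., see Proposition 3.1
    period = [1] + [0] * (k - 1) + [1]
    b = [None] + [period[j % (k + 1)] for j in range(m)]   # 1-indexed, b[0] unused
    out = [tuple(b[1:])]

    def process(pos):
        b[pos] = 1 - b[pos]            # switch one bit and output the new word
        out.append(tuple(b[1:]))

    def gen_fib(pos, u, direction):
        if pos > m:
            return
        if u == 0:
            gen_fib(pos + 1, k - 1, 1 - direction)
        elif direction == 0:
            gen_fib(pos + 1, k - 1, 1)
            process(pos)
            gen_fib(pos + 1, u - 1, 0)
        else:
            gen_fib(pos + 1, u - 1, 1)
            process(pos)
            gen_fib(pos + 1, k - 1, 0)

    gen_fib(1, k - 1, 0)
    return out
```

`gen_fib_list(3, 3)` returns the words 100, 101, 111, 110, 010, 011, 001, that is, the list $\mathcal{F}^{(3)}_3$ of Section 3.1. Collecting the words in a list is a convenience for testing; a generator that yields each word would stay closer to the printing behaviour of the pseudocode.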
4.1 Generating $\mathcal{S}^{(k)}_{n,2}$
After the initialization of $b_1 b_2 \ldots b_n$ by $0^k1 \cdot \mathrm{first}(\mathcal{F}^{(k)}_{n-k-2}) \cdot 1$, with $\mathrm{first}(\mathcal{F}^{(k)}_{n-k-2})$ given in Proposition 3.1, and printing it out, the call of gen_fib(k + 2, k − 1, 0) where
• $m = n-1$, and
• procedure process called by gen_fib switches the value of b[pos] (that is, b[pos] := 1 − b[pos]) and prints b
produces, in constant amortized time, the list $0^k1 \cdot \mathcal{F}^{(k)}_{n-k-2} \cdot 1 = 0^k \cdot \mathcal{H}^{(k)}_{n-k,2}$ which is, as mentioned before, the list $\mathcal{S}^{(k)}_{n,2}$.
4.2 Generating $\mathcal{S}^{(k)}_{n,q}$, $q > 2$
Before discussing the expansion algorithm expand needed to produce the list $\mathcal{S}^{(k)}_{n,q}$ when $q > 2$, we show that procedure gen_tuple in Figure 2, on which expand is based, is an efficient generating algorithm for the list $\mathcal{G}_{n,q}$ defined in relation (1). Procedure gen_tuple is a 'naive' odometer principle based algorithm, see again [13], and we have the next proposition.
procedure gen_fib(pos, u, dir)
global b, k, m;
if pos ≤ m
then if u = 0
     then gen_fib(pos + 1, k - 1, 1 - dir);
     else if dir = 0
          then gen_fib(pos + 1, k - 1, 1);
               process(pos);
               gen_fib(pos + 1, u - 1, 0);
          else gen_fib(pos + 1, u - 1, 1);
               process(pos);
               gen_fib(pos + 1, k - 1, 0);
          end if
     end if
end if
end procedure.
Figure 1: Algorithm producing the list $\mathcal{F}^{(k)}_n$ or $\mathcal{S}^{(k)}_{n,q}$, according to the initial values of m, b and the definition of the process procedure.
Proposition 4.2. After the initialization of $v$ by $11 \cdots 1$, the first word in $\mathcal{G}_{n,q}$, and $d_i$ by 1, for $1 \le i \le n$, procedure gen_tuple produces the list $\mathcal{G}_{n,q}$ in constant amortized time.
Proof. The total amount of computation of gen_tuple is proportional to the number of times the statement i := i − 1 is performed in the inner while loop; for given $q$ and $n$ let $c_n$ denote this number. So, the average complexity (per generated word) of gen_tuple is $\frac{c_n}{q^n}$. Clearly, $c_1 = q-1$ and $c_n = (q-1) \cdot n + q \cdot c_{n-1}$, and a simple recursion shows that $c_n = q \cdot \frac{q^n - 1}{q-1} - n$, and finally the average complexity of gen_tuple is $\frac{c_n}{q^n} \le \frac{q}{q-1}$.
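The odometer procedure gen_tuple of Figure 2 can be written as follows. The sketch returns the generated words together with the number of executions of the statement i := i − 1, so the cost discussed in the proof can be measured on small instances; the counter `moves` and the list-returning interface are additions made for illustration and do not appear in the pseudocode.

```python
def gen_tuple(n, q):
    """Words of G_{n,q} in odometer order, plus the number of 'i := i - 1' steps."""
    v = [0] + [1] * n            # 1-indexed current word, v[0] unused
    d = [0] + [1] * n            # d[i] = +1 or -1: direction in which v[i] moves
    out, moves = [tuple(v[1:])], 0
    while True:
        i = n
        # skip positions that sit at the end of their range and reverse them
        while i >= 1 and ((v[i] == q - 1 and d[i] == 1) or (v[i] == 1 and d[i] == -1)):
            d[i] = -d[i]
            i -= 1
            moves += 1
        if i < 1:
            return out, moves
        v[i] += d[i]
        out.append(tuple(v[1:]))
```

`gen_tuple(3, 3)` returns the eight words 111, 112, 122, 121, 221, 222, 212, 211 and a count of 7 pointer moves, which is consistent with the upper bound on the average cost stated in the proof.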
Now we adapt algorithm gen_tuple in order to obtain procedure expand producing the expansion of a word; like gen_tuple, procedure expand has a constant average time complexity. More precisely, for a word $b = b_1 b_2 \ldots b_n$ in $\{0,1,\ldots,q-1\}^n$, with $b_{\ell+1}, b_n \neq 0$, let $b'$ denote the trace of $b_{\ell+1} b_{\ell+2} \ldots b_n$, that is, the word obtained from $b_{\ell+1} b_{\ell+2} \ldots b_n$ by replacing each non-zero value by 1, and $b''$ the word obtained by erasing each 0 letter in $b_{\ell+1} b_{\ell+2} \ldots b_n$. Procedure expand produces the list:
• $b_1 b_2 \ldots b_\ell \cdot \epsilon(b')$ if the initial value of $b$ is such that $b''$ is the first word in $\mathcal{G}_{|b''|,q}$, or
• $b_1 b_2 \ldots b_\ell \cdot \overline{\epsilon(b')}$ if the initial value of $b$ is such that $b''$ is the last word in $\mathcal{G}_{|b''|,q}$.
procedure gen_tuple
global v, d, n;
output v;
do i := n;
   while i ≥ 1 and
         (v[i] = q - 1 and d[i] = 1 or v[i] = 1 and d[i] = -1)
      d[i] := -d[i];
      i := i - 1;
   end while
   if i ≥ 1 then v[i] := v[i] + d[i]; output v; end if
while i ≥ 1
end procedure.
Figure 2: Odometer algorithm producing the list $\mathcal{G}_{n,q}$.
The initial values of $d_{\ell+1}, d_{\ell+2}, \ldots, d_n$ are given by: if $b_i = 1$, then $d_i = 1$; and if $b_i = q-1$, then $d_i = -1$; otherwise $d_i$ is not defined. In order to access in constant time, from a position $i$ in the current word $b$ with $b_i \neq 0$, the previous non-zero position, additional data structures are used. The array prec is defined by: if $b_i \neq 0$, then $\mathrm{prec}_i = j$, where $j$ is the rightmost position in $b$ to the left of $i$ with $b_j \neq 0$; and for convenience $\mathrm{prec}_i = 0$ if $i$ is the leftmost non-zero position in $b$.
procedure expand
global b, d, ℓ, n, prec;
output b;
do i := n;
   while i ≥ ℓ + 1 and
         (b[i] = q - 1 and d[i] = 1 or b[i] = 1 and d[i] = -1)
      d[i] := -d[i];
      i := prec[i];
   end while
   if i ≥ ℓ + 1 then b[i] := b[i] + d[i]; output b; end if
while i ≥ ℓ + 1
end procedure.
Figure 3: Algorithm expanding a word b and mimicking procedure gen_tuple.
Now we explain procedure process; it calls expand, and we will show that when gen_fib in turn calls procedure process in Figure 4, it produces the list $\mathcal{S}^{(k)}_{n,q}$, with $q > 2$. The parameter pos of process is given by the corresponding call of gen_fib, and it gives the position in the current word $b_1 b_2 \ldots b_n$ in $\mathcal{S}^{(k)}_{n,q}$ where $b_{pos}$ changes from a non-zero value to 0, or vice versa. By Remark 1 and the definition of the list $\mathcal{S}^{(k)}_{n,q}$ from $\mathcal{H}^{(k)}_{n-k,q}$, and so from $\mathcal{F}^{(k)}_{n-k-2}$, it follows that $b_{pos+1} \neq 0$. Procedure process sets $b_{pos}$ to 0 if previously $b_{pos} \neq 0$, and sets $b_{pos}$ to $b_{pos+1}$ if previously $b_{pos} = 0$, which according to Proposition 2.1, Remark 1 and the definition of the expansion operation is the new value of $b_{pos}$. In order to access in constant time the previous non-zero position from a non-zero position in the array b, process uses the array prec of procedure expand and the array succ, defined as: $\mathrm{succ}_i = j$, with $j$ the leftmost position in $b$ to the right of $i$ with $b_j \neq 0$, and $\mathrm{succ}_i$ is not defined if $i$ is the rightmost non-zero position. In addition, procedure process updates both arrays prec and succ.

procedure process(pos)
global b, d, succ, prec;
if b[pos] = 0
then a := prec[pos + 1]; succ[a] := pos; succ[pos] := pos + 1;
     prec[pos] := a; prec[pos + 1] := pos;
     b[pos] := b[pos + 1];
     d[pos] := d[pos + 1];
else a := prec[pos]; z := succ[pos];
     prec[z] := a; succ[a] := z;
     b[pos] := 0;
end if
expand;
end procedure.
Figure 4: Procedure process called by gen_fib in order to generate the list $\mathcal{S}^{(k)}_{n,q}$.
For given $q > 2$, $k \ge 2$ and $n \ge k+2$, after the initialization of $b_1 b_2 \ldots b_n$ by $0^k1 \cdot \mathrm{first}(\mathcal{F}^{(k)}_{n-k-2}) \cdot 1$, as for generating $\mathcal{S}^{(k)}_{n,2}$, the call of gen_fib(k + 2, k − 1, 0) where
• $m = n-1$, and
• procedure process is that in Figure 4, and
• procedure expand is that in Figure 3, with $\ell = k+1$
produces, in constant amortized time, the list $\mathcal{S}^{(k)}_{n,q}$.
5 Conclusion and further works
The cross-bifix-free sets $S^{(k)}_{n,q}$ defined in [5] have cardinality close to the optimum. They consist of particular words $s_1 s_2 \ldots s_n$ of length $n$ over a $q$-ary alphabet. Each word has the form $0^k s_{k+1} s_{k+2} \ldots s_n$ where $s_{k+1}$ and $s_n$ are different from 0 and $s_{k+1} s_{k+2} \ldots s_{n-1}$ does not contain $k$ consecutive 0's. We have provided a Gray code for $S^{(k)}_{n,q}$ by defining a Gray code for the words $s_{k+1} s_{k+2} \ldots s_n$ and then prepending the prefix $0^k$ to them. Moreover, an efficient generating algorithm for the obtained Gray code is given. We note that this Gray code is trace partitioned in the sense that all the words with the same trace are consecutive. To this aim we used a Gray code for restricted binary strings [15], suitably replacing the bits 1 with the symbols of the alphabet different from 0.
A future investigation could be the definition of a Gray code which is prefix partitioned, where all the words with the same prefix are consecutive. Actually, the definition of the sets $S^{(k)}_{n,q}$ shows that it is sufficient to define a prefix partitioned Gray code for the subwords $s_{k+1} s_{k+2} \ldots s_n$.
An interesting question arising when one deals with a Gray code $\mathcal{L}$ on a set is the possibility of defining it in such a way that the Hamming distance between $\mathrm{last}(\mathcal{L})$ and $\mathrm{first}(\mathcal{L})$ is 1 (circular Gray code). Usually it is not easy to obtain a circular Gray code unless the elements of the set are subject to no constraints; in our case it is worth studying whether the ground-set we are dealing with (which is a cross-bifix-free set) admits a circular Gray code.
References
[1] Bajic, D. (2007) On Construction of Cross-Bifix-Free Kernel Sets. 2nd MCM COST 2100, TD(07)237, Lisbon, Portugal.
[2] Baril, J. and Vajnovszki, V. (2004) Gray code for derangements. Discrete Applied Mathematics 140 207–221.
[3] Berstel, J., Perrin, D. and Reutenauer, C. (2009) Codes and Automata (Encyclopedia of Mathematics and its Applications). Cambridge University Press.
[4] Bilotta, S., Pergola, E. and Pinzani, R. (2012) A new approach to cross-bifix-free sets. IEEE Transactions on Information Theory 58 4058–4063.
[5] Chee, Y. M., Kiah, H. M., Purkayastha, P. and Wang, C. (2013) Cross-bifix-free codes within a constant factor of optimality. IEEE Transactions on Information Theory 59 4668–4674.
[6] Crochemore, M., Hancart, C. and Lecroq, T. (2007) Algorithms on strings. Cambridge University Press, Cambridge.
[7] Er, M. C. (1984) On generating the N-ary reflected Gray code. IEEE Transactions on Computers 33 739–741.
[8] Gray, F. (1953) Pulse Code Communication. U.S. Patent 2 632 058.
[9] Hamming, R. W. (1950) Error detecting and error correcting codes. Bell System Technical Journal 29 147–160.
[10] Johnson, S. M. (1963) Generation of permutations by adjacent transpositions. Mathematics of Computation 17 282–285.
[11] de Lind van Wijngaarden, A. J. and Willink, T. J. (2000) Frame synchronization using distributed sequences. IEEE Transactions on Communications 48 2127–2138.
[12] Ruskey, F. (1993) Simple combinatorial Gray codes constructed by reversing sublist. Lecture Notes in Computer Science 762 201–208.
[13] Ruskey, F. Combinatorial generation. Book in preparation.
[14] Sagan, B. E. (2010) Pattern avoidance in set partitions. Ars Combinatoria 94 79–96.
[15] Vajnovszki, V. (2001) A loopless generation of bitstrings without p consecutive ones. Discrete Mathematics and Theoretical Computer Science, Springer, 227–240.
[16] Vajnovszki, V. (2001) Gray visiting Motzkin. Acta Informatica 38 793–811.
[17] Walsh, T. (2001) Gray codes for involutions. Journal of Combinatorial Mathematics and Combinatorial Computing 36 95–118.
[18] Walsh, T. (2003) Generating Gray Codes in O(1) worst-case time per word. Lecture Notes in Computer Science 2731 73–88.
[19] Williamson, S. G. (1985) Combinatorics for computer science. Computer Science Press, Rockville, Maryland.
... Recently, non-overlapping codes have been important in the application of DNA storage systems [15,25]. For further constructions of fixed-length non-overlapping codes, see [4][5][6][7]9]. Additionally, recent advances in non-overlapping codes and their extensions are discussed in [1,[10][11][12][20][21][22]. ...
Article
Full-text available
Non-overlapping codes over a given alphabet are defined as a set of words satisfying the property that no prefix of any length of any word is a suffix of any word in the set, including itself. When the word lengths are variable, it is additionally required that no word is contained as a subword within any other word. In this paper, we present a new construction of variable-length non-overlapping codes that generalizes the construction by Bilotta. Subsequently, we derive the generating function and an enumerative formula for our constructed code, and establish upper bound on their cardinalities. A comparison with the bound provided by Bilotta shows that the newly constructed code offers improved performance in the code size.
... Furthermore, the author of [7] extended Construction I for q > 2 and showed that if q is a multiple of n then this extension results with strictly optimal codes. Lastly, we note that in [4], [5] Gray codes were presented for listing the vectors of the code C 1 (n, q, k). ...
Preprint
Mutually Uncorrelated (MU) codes are a class of codes in which no proper prefix of one codeword is a suffix of another codeword. These codes were originally studied for synchronization purposes and recently, Yazdi et al. showed their applicability to enable random access in DNA storage. In this work we follow the research of Yazdi et al. and study MU codes along with their extensions to correct errors and balanced codes. We first review a well known construction of MU codes and study the asymptotic behavior of its cardinality. This task is accomplished by studying a special class of run-length limited codes that impose the longest run of zeros to be at most some function of the codewords length. We also present an efficient algorithm for this class of constrained codes and show how to use this analysis for MU codes. Next, we extend the results on the run-length limited codes in order to study (dh,dm)(d_h,d_m)-MU codes that impose a minimum Hamming distance of dhd_h between different codewords and dmd_m between prefixes and suffixes. In particular, we show an efficient construction of these codes with nearly optimal redundancy. We also provide similar results for the edit distance and balanced MU codes. Lastly, we draw connections to the problems of comma-free and prefix synchronized codes.
... Non-overlapping codes have found important applications in communication systems, and recently DNA storage systems [17,21]. For more constructions on fixed-length non-overlapping codes, for example, see [1,[3][4][5]7]. In addition, we refer to [8,9,[18][19][20] for a series of recent advances in non-overlapping codes and their extensions. ...
Article
Full-text available
Non-overlapping codes are a set of codewords such that any nontrivial prefix of each codeword is not a nontrivial suffix of any codeword in the set, including itself. If the lengths of the codewords are variable, it is additionally required that every codeword is not contained in any other codeword as a subword. Let C(n, q) be the maximum size of a fixed-length non-overlapping code of length n over an alphabet of size q. The upper bound on C(n, q) has been well studied. However, the nontrivial upper bound on the maximum size of variable-length non-overlapping codes whose codewords have length at most n remains open. In this paper, by establishing a link between variable-length non-overlapping codes and fixed-length ones, we are able to show that the size of a q-ary variable-length non-overlapping code is upper bounded by C(n, q). Furthermore, we prove that the minimum average codeword length of a q-ary variable-length non-overlapping code with cardinality C~\tilde{C}, is asymptotically no shorter than n2n-2 as q approaches \infty , where n is the smallest integer such that C(n1,q)<C~C(n,q)C(n-1, q) < \tilde{C} \le C(n,q).
... Another further line of research could consider the possibility to list the paths of D (h,2) n in a Gray code sense using the tools developed by Barcucci, Bernini et al. [1,4,5,7,6]. As mentioned in Section 2, these paths can be encoded by strings on the alphabet {U, D}, so the problem of defining a Gray code could be addressed by starting from the techniques developed by Vajnovszki et al. [21]. ...
Preprint
Full-text available
Dyck paths having height at most h and without valleys at height h1h-1 are combinatorially interpreted by means of 312-avoding permutations with some restrictions on their \emph{left-to-right maxima}. The results are obtained by analyzing a restriction of a well-known bijection between the sets of Dyck paths and 312-avoding permutations. We also provide a recursive formula enumerating these two structures using ECO method and the theory of production matrices. As a further result we obtain a family of combinatorial identities involving Catalan numbers.
... This note is devoted to proving this fact, which apart from its interest en soi has practical counterparts. Indeed, words in B n (1 k ) play a critical role in some telecommunication frame synchronization protocols, see for example [1,3,5], or in particular Fibonacci-like interconnection networks [8]. ...
Article
Full-text available
It is known that binary words containing no k consecutive 1s are enumerated by k-step Fibonacci numbers. In this note we discuss the expected value of a random bit in a random word of length n having this property.
... Moreover, the construction we proposed, in the case of fixed dimension matrices, gives the possibility to list them in a Gray code sense, following the studies started in [2,5,7,8,9,10] where different Gray codes are defined for several set of strings and matrices. ...
Preprint
Full-text available
We propose a method for the construction of sets of variable dimension strong non-overlapping matrices basing on any strong non-overlapping set of strings.
Article
A cross-bifix-free code of length n over Zq\mathbb {Z}_{q} is a non-empty subset of Zqn\mathbb {Z}_{q}^{n} such that the prefix set of each codeword is disjoint from the suffix set of every codeword. To achieve good performance in communication systems, it is desirable to construct cross-bifix-free codes with large size. Recently, Wang and Wang generalized the classical cross-bifix-free codes presented by Levenshtein, Gilbert and Chee et al. by constructing a new family of cross-bifix-free codes SI,J(k)(n)S_{I,J}^{(k)}(n) . The code SI,J(k)(n)S_{I,J}^{(k)}(n) is nearly optimal in terms of its size and non-expandable if k=n1k=n-1 or 1k<n/21\leq k < n/2 . There are three major ingredients in this paper. The first is to improve the results in Chee et al. (2013) and Wang and Wang (2022) in which we prove that the code SI,J(k)(n)S_{I,J}^{(k)}(n) is non-expandable if and only if k=n1k=n-1 or 1k<n/21\leq k < n/2 . The second ingredient contributes to a new family of cross-bifix-free codes UI,J(t)(n)U^{(t)}_{I,J}(n) . This new code enables us to construct non-expandable cross-bifix-free codes SI,J(k)(n)UI,J(t)(n)S_{I,J}^{(k)}(n)\bigcup U^{(t)}_{I,J}(n) whenever SI,J(k)(n)S_{I,J}^{(k)}(n) is expandable. The union of UI,J(t)(n)U^{(t)}_{I,J}(n) and SI,J(k)(n)S_{I,J}^{(k)}(n) enlarges the size of SI,J(k)(n)S_{I,J}^{(k)}(n) . Finally, we give an explicit formula for the size of SI,J(k)(n)UI,J(t)(n)S_{I,J}^{(k)}(n)\bigcup U^{(t)}_{I,J}(n) .
A (1, k)-overlap-free code, motivated by applications in DNA-based data storage systems and synchronization between communication devices, is a set of words in which no prefix of length t of any word is the suffix of any word for every integer t such that 1 ≤ t ≤ k. A (1, n − 1)-overlap-free code of length n is said to be non-overlapping. We provide a construction for q-ary (1, k)-overlap-free codes of length 2k, which can be viewed as a generalization of the Zero Block Construction presented by Blackburn, Esfahani, Kreher and Stinson recently over a binary alphabet, and analyze the asymptotic behavior of their sizes. When n ≥ 2k, an explicit general lower bound and an asymptotic lower bound for the size of an optimal q-ary (1, k)-overlap-free code of length n are presented. The exact value of the maximum size of q-ary (1, 2)-overlap-free codes of length n is determined for any n ≥ 4, and a construction for q-ary (1, k)-overlap-free codes of length k + 2 is given.
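The definition above can be checked directly on small examples. The following is a minimal sketch, assuming the words are given as Python strings of equal length; the names `is_overlap_free` and `is_non_overlapping` are hypothetical and do not come from the cited papers.

```python
def is_overlap_free(words, k):
    """Return True if, for every t with 1 <= t <= k, no length-t prefix of
    any word equals the length-t suffix of any word (possibly the same one)."""
    for t in range(1, k + 1):
        prefixes = {w[:t] for w in words if len(w) >= t}
        suffixes = {w[-t:] for w in words if len(w) >= t}
        if prefixes & suffixes:
            return False
    return True


def is_non_overlapping(words):
    """Cross-bifix-free (non-overlapping) check for words of common length n,
    i.e. the (1, n - 1)-overlap-free condition."""
    n = len(words[0])
    # Assumption of this sketch: all words share the same length n.
    assert all(len(w) == n for w in words)
    return is_overlap_free(words, n - 1)
```

For instance, the set {11000, 11010} passes `is_non_overlapping`, while {1100, 1000} is (1, 2)-overlap-free but not (1, 3)-overlap-free, because the prefix 100 of 1000 is a suffix of 1100.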
A cross-bifix-free code is a set of words in which no prefix of any length of any word is the suffix of any word in the set. Cross-bifix-free codes arise in the study of distributed sequences for frame synchronization. We provide a new construction of cross-bifix-free codes which generalizes the construction in Bajic (2007) to longer code lengths and to any alphabet size. The codes are shown to be nearly optimal in size. We also establish new results on Fibonacci sequences, that are used in estimating the size of the cross-bifix-free codes.
This text and reference on string processes and pattern matching presents examples related to the automatic processing of natural language, to the analysis of molecular sequences and to the management of textual databases. Algorithms are described in a C-like language, with correctness proofs and complexity analysis, to make them ready to implement. The book will be an important resource for students and researchers in theoretical computer science, computational linguistics, computational biology, and software engineering.
A set of words with the property that no prefix of any word is the suffix of any other word is called a cross-bifix-free set. We provide an efficient generating algorithm producing Gray codes for a remarkable family of cross-bifix-free sets.
The author was led to the study given in this paper from a consideration of large scale computing machines in which a large number of operations must be performed without a single error in the end result. This problem of “doing things right” on a large scale is not essentially new; in a telephone central office, for example, a very large number of operations are performed while the errors leading to wrong numbers are kept well under control, though they have not been completely eliminated. This has been achieved, in part, through the use of self-checking circuits. The occasional failure that escapes routine checking is still detected by the customer and will, if it persists, result in customer complaint, while if it is transient it will produce only occasional wrong numbers. At the same time the rest of the central office functions satisfactorily. In a digital computer, on the other hand, a single failure usually means the complete failure, in the sense that if it is detected no more computing can be done until the failure is located and corrected, while if it escapes detection then it invalidates all subsequent operations of the machine. Put in other words, in a telephone central office there are a number of parallel paths which are more or less independent of each other; in a digital machine there is usually a single long path which passes through the same piece of equipment many, many times before the answer is obtained.
Let F_n(p) be the set of all n-length bitstrings such that there are no p consecutive 1s. F_n(p) is counted with the pth order Fibonacci numbers and it may be regarded as the subsets of {1, 2, …, n} without p consecutive elements, and bitstrings in F_n(p) code a particular class of trees or compositions of an integer. In this paper we give a Gray code for F_n(p) which can be implemented in a recursive generating algorithm, and finally in a loopless generating algorithm.