Locality and Centrality: The Variety ZG

Abstract

We study the variety ZG of monoids where the elements that belong to a group are central, i.e., commute with all other elements. We show that ZG is local, that is, the semidirect product ZG*D of ZG by definite semigroups is equal to LZG, the variety of semigroups where all local monoids are in ZG. Our main result is thus: ZG*D = LZG. We prove this result using Straubing's delay theorem, by considering paths in the category of idempotents. In the process, we obtain the characterization ZG = MNil \vee Com, and also characterize the ZG languages, i.e., the languages whose syntactic monoid is in ZG: they are precisely the languages that are finite unions of disjoint shuffles of singleton languages and regular commutative languages.
Antoine Amarilli
LTCI, Télécom Paris, Institut polytechnique de Paris, France
Charles Paperman
LINKS, CRIStAL, Université de Lille, INRIA, France
2012 ACM Subject Classification Theory of computation → Formal languages and automata theory
Keywords and phrases regular language, variety, locality
Acknowledgements We thank Jean-Éric Pin and Jorge Almeida for their fruitful advice.
1 Introduction
In this paper, we study a variety of monoids called ZG. It is defined by enforcing that the elements of the monoid that belong to a group are central, i.e., commute with all other elements of the monoid. The notation ZG thus stands for Zentral Group, inspired by the classical notion of centrality in group theory. We can also define ZG with the equation $x^{\omega+1} y = y x^{\omega+1}$ on all elements $x$ and $y$, where $\omega$ is the idempotent power of the monoid.
The variety ZG has been introduced by Auinger [5] as a subvariety of interest of the broader class ZE of semigroups where the idempotent elements are central. The study of ZE was initiated by Straubing [21]. Straubing shows in particular that the variety MNil (called simply V in the paper) of regular languages generated by finite languages is exactly the variety of aperiodic monoids in ZE. From this, a systematic investigation of the subclasses of ZE was started by Almeida and pursued by Auinger: see [2, page 211] and [5, 4].
Our specific motivation to explore ZG comes from our study of the dynamic membership problem for regular languages. In this problem [18], we want to handle update operations on an input word while maintaining the information of whether it belongs to a fixed regular language. In a companion paper also submitted to ICALP'21 [3], we identify a variant of ZG as a plausible tractability boundary characterizing the languages for which every update operation can be handled in $O(1)$. Specifically, this variant can be defined as the so-called semidirect product of ZG by definite (D) semigroups, which we denote ZG ∗ D.
This semidirect product operation on varieties, which we use to define ZG ∗ D, intuitively corresponds to composing finite automata via a kind of cascade operation. Its study is the subject of a large portion of semigroup theory, inspired by the classical study of the semidirect product in group theory. There are also known results to understand specifically the semidirect product by D. For instance, the Derived category theorem [28] studies it as a decisive step towards proving the decidability of membership to an arbitrary semidirect product, i.e., deciding if a given monoid belongs to the product. The product by D also arises naturally in several other contexts: the dot-depth hierarchies [22], the circuit complexity of regular languages [23], or the study of the successor relations in first-order logic [26, 27, 14].
arXiv:2102.07724v1 [cs.FL] 15 Feb 2021
Understanding this product with D is notoriously complicated. For instance, it requires specific dedicated work for some varieties like J or Com [13, 12, 25]. Also, this product does not preserve the decidability of membership: Auinger [6] proved that there are varieties V such that membership in V is decidable, but the analogous problem for V ∗ D is undecidable. For the specific case of the varieties ZG, ZE, or even MNil, we are not aware of prior results describing their semidirect product with D.
Locality. Existing work has nevertheless identified some cases where the ∗ D operator can be simplified to a much nicer local operator that preserves the decidability of membership and is easier to understand. For any semigroup $S$, the local monoids of $S$ are the subsemigroups of $S$ of the shape $eSe$ with $e$ an idempotent element of $S$. For a variety V, we say that a semigroup belongs to LV if all its local monoids are in V. It is not hard to notice that the variety V ∗ D is always a subvariety of LV, i.e., that every semigroup in V ∗ D must also be in LV. In some cases, we can show a locality result stating that the other direction also holds, so that V ∗ D = LV. In those cases we say that the variety V is local. The locality of the variety of monoids DA [2] is a famous result that has deep implications in logic and complexity [26, 9, 11] and has inspired recent follow-up work [17]. Locality results are also known for other varieties, for instance the variety of semi-lattice monoids (monoids that are both idempotent and commutative) [15, 8], any sub-variety of groups [22, Theorem 10.2], or the R-trivial variety [20, 19, 24]. This suggests an angle of attack to understanding the variety ZG ∗ D: establishing a locality result of this type for ZG.
Contributions. Our main result in this paper is to show the locality of the variety ZG:
Theorem 1.1. We have LZG = ZG ∗ D.
In the process of showing this result, we obtain a characterization of ZG-congruences, i.e., congruences $\sim$ on $\Sigma^*$ where the quotient $\Sigma^*/{\sim}$ is a monoid of ZG. We show that they are always refined by a so-called $n$-congruence, which identifies the number of occurrences of the frequent letters (the ones occurring $> n$ times in the word) modulo $n$, and also identifies the exact subword formed by the rare letters (the ones occurring $\le n$ times). Thanks to this (Theorem 3.4), we also obtain a characterization of the languages of ZG, i.e., the languages whose syntactic monoid is in ZG: they are exactly the finite unions of disjoint shuffles of singleton languages and commutative languages (Corollary 3.5). We also characterize ZG as a variety of monoids: ZG = MNil ∨ Com, for MNil defined in [21] and Com the variety of commutative monoids.
Paper structure. We give preliminaries in Section 2 and formally define the variety ZG. We then give in Section 3 our characterizations of ZG via the so-called $n$-congruence. We then define in Section 4 the varieties ZG ∗ D and LZG used in Theorem 1.1, which we prove in the rest of the paper. We first introduce the framework of Straubing's delay theorem used for our proof in Section 5, and rephrase our result as a claim (Claim 5.6) on paths in the category of idempotents. We then study in Section 6 how to pick a sufficiently large value of $n$ as a choice of our $n$-congruence, show in Section 7 two lemmas on paths in the idempotent category, and finish the proof in Section 8. We conclude in Section 9.
2 Preliminaries
For a complete presentation of the basic concepts (automata, monoids, semigroups, groups, etc.) the reader can refer to the book of J. E. Pin [16] or to the more recent lecture notes [29].
All semigroups, groups, and monoids that we consider are finite.
Semigroups and varieties. For a semigroup $S$, we call $x \in S$ idempotent if $x^2 = x$. We call the idempotent power of $x \in S$ the unique idempotent element which is a power of $x$. (This means that $x$ is idempotent iff it is its own idempotent power.) Now, the idempotent power of $S$ is an integer $\omega$ such that for any element $x \in S$, the element $x^\omega$ is the idempotent power of $x$. We write $x^{\omega+k}$ for any $k \in \mathbb{Z}$ to mean $x^{\omega+k'}$ where $k'$ is the remainder of $k$ in the integer division by $\omega$.
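The definitions above can be made concrete on a small example. The following is a minimal sketch, assuming a finite semigroup is given as a multiplication table `table[a][b]` over elements `0..n-1`; this encoding and the names `least_idempotent_exponent`, `idempotent_power`, and `omega_plus` are illustrative choices, not notation from the paper. It computes one valid idempotent power (the least common multiple of the per-element exponents) and evaluates $x^{\omega+k}$ using the remainder convention.

```python
from math import lcm  # math.lcm requires Python 3.9+


def least_idempotent_exponent(table, x):
    """Least k >= 1 such that x^k is idempotent, where table[a][b] = a*b."""
    current, k = x, 1
    while table[current][current] != current:
        current = table[current][x]  # current becomes x^(k+1)
        k += 1
    return k


def idempotent_power(table):
    """An integer omega such that x^omega is idempotent for every x.

    Any multiple of the least exponent of x also yields an idempotent,
    so the least common multiple works for all elements at once.
    """
    return lcm(*(least_idempotent_exponent(table, x) for x in range(len(table))))


def omega_plus(table, x, k):
    """Compute x^(omega + k') where k' is the remainder of k modulo omega."""
    omega = idempotent_power(table)
    result = x
    for _ in range(omega + (k % omega) - 1):
        result = table[result][x]
    return result


# Example: the cyclic group Z_4 (addition modulo 4), whose identity is 0.
Z4 = [[(a + b) % 4 for b in range(4)] for a in range(4)]
```

On `Z4`, the element 1 has order 4, so the computed idempotent power is 4, and `omega_plus(Z4, 1, -1)` evaluates $1^{\omega-1}$, i.e., the group inverse of 1.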
We will use notions from formal language theory for some of our definitions. We denote by $\Sigma$ an alphabet and by $\Sigma^*$ the set of all finite words on $\Sigma$. We denote by $\epsilon$ the empty word. For $w \in \Sigma^*$, we denote by $|w|$ the length of $w$. For $u, v \in \Sigma^*$, we say that $u$ is a subword of $v$ if there is $0 \le n \le |u|$ and $1 \le i_1 < \cdots < i_n \le |v|$ such that $u = v_{i_1} \cdots v_{i_n}$. For $w \in \Sigma^*$ and $a \in \Sigma$, we denote by $|w|_a$ the number of occurrences of $a$ in $w$. A language $L$ is a subset of $\Sigma^*$. A variety of (regular) languages is a class of regular languages which is closed under Boolean operations, left and right derivatives, and inverse homomorphisms. A non-erasing variety is closed under the same operations, except we only require closure under inverse non-erasing homomorphisms, that is, those that do not map any letter of $\Sigma$ to $\epsilon$.
A variety of monoids (resp., variety of semigroups) is a class of monoids (resp., semigroups) closed under direct product, quotient, and submonoid (resp., subsemigroup). We recall that Eilenberg's theorem [10] gives a one-to-one correspondence between varieties of languages and varieties of monoids. A similar one-to-one correspondence exists between non-erasing varieties of languages and varieties of semigroups. In the following, we abuse notation and identify varieties of monoids with varieties of languages following this correspondence. To be more precise, we say that a monoid $M$ recognizes a language $L$ if there exists a morphism $\eta : \Sigma^* \to M$ such that $L = \eta^{-1}(\eta(L))$. Eilenberg's correspondence then says that, when considering a variety V of monoids, the languages recognized by monoids in V belong to the corresponding variety of languages. Eilenberg's correspondence extends to a correspondence between varieties of semigroups and non-erasing varieties of languages.
Congruences. A finite index congruence on a finite alphabet $\Sigma$ is a congruence $\sim$ on $\Sigma^*$ that has a finite number of equivalence classes. For a given finite index congruence $\sim$, the quotient $\Sigma^*/{\sim}$ is a finite monoid, whose law corresponds to concatenation over $\Sigma^*$ and whose neutral element is the class of the empty word. The syntactic monoid of a regular language over $\Sigma^*$ is the quotient by the syntactic congruence for the language, which is a finite index congruence because the language is regular. Letting V be a variety of monoids, we say that a finite index congruence $\sim$ on $\Sigma$ is a V-congruence if the quotient $\Sigma^*/{\sim}$ is a monoid in V. For a given V-congruence $\sim$, the map $\eta : \Sigma^* \to \Sigma^*/{\sim}$, defined by associating each word with its equivalence class, is a morphism onto a monoid of V. Hence, each equivalence class is a language of V, since it is recognized by $\Sigma^*/{\sim}$.
The variety ZG. In this paper, we study the variety of monoids ZG defined by the equation: $x^{\omega+1} y = y x^{\omega+1}$ for all $x, y \in M$. Intuitively, this says that the elements of the form $x^{\omega+1}$ are central, i.e., commute with all other elements. This clearly implies the same for elements of the form $x^{\omega+k}$ for any $k \in \mathbb{Z}$, as we will implicitly use throughout the paper:
Claim 2.1. For any monoid $M$ in ZG, for $x, y \in M$, and $k \in \mathbb{Z}$, we have: $x^{\omega+k} y = y x^{\omega+k}$.
Note that these elements are precisely the monoid elements that are within a (possibly trivial) subgroup. This motivates the name ZG, which stands for "Zentral Group": it follows the traditional notation $Z(\cdot)$ for central subgroups, and extends the variety ZE introduced in [1, p211] which only requires idempotents to be central. Thus, we have ZG ⊆ ZE, and non-commutative groups are examples of monoids that are in ZE but not in ZG.
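As an illustration of the defining equation, here is a minimal sketch of a brute-force membership test, assuming a finite monoid is encoded as a multiplication table `table[a][b]`. The encoding and the helper names (`power`, `idempotent_power`, `is_in_zg`, `is_in_ze`) are hypothetical and chosen only for this example. It checks $x^{\omega+1} y = y x^{\omega+1}$ (for ZG) and $x^{\omega} y = y x^{\omega}$ (for ZE) on all pairs.

```python
from itertools import permutations


def power(table, x, k):
    """Compute x^k for k >= 1, where table[a][b] = a*b."""
    result = x
    for _ in range(k - 1):
        result = table[result][x]
    return result


def idempotent_power(table):
    """Smallest omega >= 1 such that x^omega is idempotent for every x."""
    omega = 1
    while True:
        powers = [power(table, x, omega) for x in range(len(table))]
        if all(table[p][p] == p for p in powers):
            return omega
        omega += 1


def is_central_power(table, exponent):
    """Check that x^exponent commutes with every y, for every x."""
    elems = range(len(table))
    return all(
        table[power(table, x, exponent)][y] == table[y][power(table, x, exponent)]
        for x in elems
        for y in elems
    )


def is_in_zg(table):
    return is_central_power(table, idempotent_power(table) + 1)


def is_in_ze(table):
    return is_central_power(table, idempotent_power(table))


# Z_3 is a commutative group; S_3 is the non-commutative group of permutations
# of three points, with table[p][q] the composition "p after q".
Z3 = [[(a + b) % 3 for b in range(3)] for a in range(3)]
PERMS = list(permutations(range(3)))
INDEX = {p: i for i, p in enumerate(PERMS)}
S3 = [[INDEX[tuple(p[q[i]] for i in range(3))] for q in PERMS] for p in PERMS]
```

With these tables, `Z3` passes both tests, whereas `S3` passes `is_in_ze` (its only idempotent is the identity) but fails `is_in_zg`, matching the remark on non-commutative groups.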
By Eilenberg's theorem, ZG then defines a variety of regular languages, namely the languages whose syntactic monoid is in ZG. Note that any regular commutative language is in ZE and in ZG (as is clear from the equation), and any finite language is also in ZE and in ZG (in its syntactic monoid, the only group element other than the identity is a zero, so it commutes with everything).
3 Characterizations of ZG
In this section, we present our characterizations of ZG, which we will use to prove Theorem 1.1. We will show that ZG is intimately linked to a congruence on words called the $n$-congruence. Intuitively, two words are identified by this congruence if the subwords of the rare letters (occurring at most $n$ times) are the same, and the numbers of occurrences of the frequent letters (occurring more than $n$ times) are congruent modulo $n$. Formally:
Definition 3.1 (Rare and frequent letters, $n$-congruence). Fix an alphabet $\Sigma$ and a word $w \in \Sigma^*$. Given a threshold $n \in \mathbb{N}$, we call $a \in \Sigma$ rare in $w$ if $|w|_a \le n$, and frequent in $w$ if $|w|_a > n$. We define the rare subword $w_{\le n}$ to be the subword of $w$ obtained by keeping only the rare letters of $w$, i.e., the subword of $w$ where we keep precisely the letters of the (possibly empty) rare alphabet $\{a \in \Sigma \mid |w|_a \le n\}$.
For $n > 0$, the $n$-congruence $\sim_n$ is defined by writing $u \sim_n v$ for $u, v \in \Sigma^*$ iff:
- The rare subwords are equal: $u_{\le n} = v_{\le n}$;
- The rare alphabets are the same: for all $a \in \Sigma$, we have $|u|_a > n$ iff $|v|_a > n$;
- The numbers of occurrences modulo $n$ are the same: for all $a \in \Sigma$ such that $|u|_a > n$ (and $|v|_a > n$), we have that $|u|_a$ and $|v|_a$ are congruent modulo $n$.
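The three conditions of Definition 3.1 translate directly into code. The following is a minimal sketch, assuming words are Python strings over single-character letters; the function names `rare_subword` and `n_equivalent` are illustrative and not taken from the paper.

```python
from collections import Counter


def rare_subword(w, n):
    """Subword of w keeping only the letters that occur at most n times in w."""
    counts = Counter(w)
    return "".join(a for a in w if counts[a] <= n)


def n_equivalent(u, v, n):
    """Decide whether u and v are identified by the n-congruence (n > 0)."""
    cu, cv = Counter(u), Counter(v)
    # Condition 1: the rare subwords are equal.
    if rare_subword(u, n) != rare_subword(v, n):
        return False
    for a in set(cu) | set(cv):
        # Condition 2: the letter is frequent in u iff it is frequent in v.
        if (cu[a] > n) != (cv[a] > n):
            return False
        # Condition 3: frequent letters have the same count modulo n.
        if cu[a] > n and (cu[a] - cv[a]) % n != 0:
            return False
    return True
```

For instance, with threshold $n = 2$, the words `abaca` and `bcaaaaa` are equivalent: both have rare subword `bc`, and the frequent letter `a` occurs 3 and 5 times, which agree modulo 2. In contrast, `cbaaa` is not equivalent to `abaca`, because its rare subword is `cb`.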
We first remark that two $n$-equivalent words are also $m$-equivalent for any divisor $m$ of $n$:
Claim 3.2. For any alphabet $\Sigma$, for any $n > 0$, for any $m > 0$, if $m$ is a multiple of $n$ then the $m$-congruence refines the $n$-congruence.
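Claim 3.2 can be sanity-checked by brute force on short words. The sketch below is an illustration only, not a proof: it assumes words are Python strings, re-implements the $n$-congruence test of Definition 3.1 under the hypothetical name `n_equivalent`, and enumerates all words up to a bounded length.

```python
from collections import Counter
from itertools import product


def n_equivalent(u, v, n):
    """n-congruence of Definition 3.1 on strings (n > 0)."""
    cu, cv = Counter(u), Counter(v)
    rare_u = [a for a in u if cu[a] <= n]
    rare_v = [a for a in v if cv[a] <= n]
    if rare_u != rare_v:
        return False
    return all(
        (cu[a] > n) == (cv[a] > n) and (cu[a] <= n or (cu[a] - cv[a]) % n == 0)
        for a in set(u) | set(v)
    )


def all_words(alphabet, max_len):
    """All words over the alphabet of length at most max_len."""
    return ["".join(t) for k in range(max_len + 1) for t in product(alphabet, repeat=k)]


def refines(m, n, words):
    """True if, on the given words, m-equivalence implies n-equivalence."""
    return all(
        n_equivalent(u, v, n)
        for u in words
        for v in words
        if n_equivalent(u, v, m)
    )
```

On all words over `ab` of length at most 6, the 4-congruence refines the 2-congruence, as the claim predicts. The hypothesis that $m$ is a multiple of $n$ matters: the words `aaaa` and `aaaaaaa` are 3-equivalent but not 2-equivalent, so the 3-congruence does not refine the 2-congruence.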
What is more, observe that n-congruences are a particular case of ZG-congruences:
Claim 3.3. For any alphabet $\Sigma$ and $n > 0$, the $n$-congruence over $\Sigma^*$ is a ZG-congruence.
Proof sketch. Considering any equivalence class of the $n$-congruence, we can enforce in ZG the (commutative) conditions on the number of occurrences of the frequent letters, and interleave this with the requirement on the rare subword.
The goal of this section is to show the following result. Intuitively, it states that ZG-congruences are always refined by a sufficiently large $n$-congruence. Formally:
Theorem 3.4. Consider any ZG-congruence $\sim$ over $\Sigma^*$ and consider its associated monoid $M := \Sigma^*/{\sim}$. Let $n := (|M| + 1) \cdot \omega$ with $\omega$ the idempotent power of $M$. Then the congruence $\sim$ is refined by the $n$-congruence on $\Sigma$.
Before proving this result, we spell out some of its consequences. The most important one is that Theorem 3.4 implies a characterization of languages in ZG, which is similar to the one obtained by Straubing in [21] for the variety MNil. To define MNil, define a nilpotent semigroup $S$ to be a semigroup satisfying the equation $x^\omega y = y x^\omega = x^\omega$, and let $S^1$ be the monoid obtained from $S$ by adding an identity element $1$ to $S$ (i.e., an element with $1x = x1 = x$ for all $x \in S$) if $S$ does not have one. The variety MNil is generated by semigroups of the form $S^1$ for $S$ a nilpotent semigroup. It was shown in [21] that the languages of MNil are disjoint monomials, that is, Boolean combinations of languages of the shape $B^* a_1 B^* a_2 \cdots a_k B^*$ with $B \cap \{a_1, \ldots, a_k\} = \emptyset$.
Our analogous characterization for ZG is the following, obtained via Theorem 3.4:
Corollary 3.5. Any ZG language $L$ can be expressed as a finite union of languages of the form $B^* a_1 B^* a_2 \cdots a_k B^* \cap K$ where $\{a_1, \ldots, a_k\} \cap B = \emptyset$ and $K$ is a regular commutative language.
Equivalently, we can say that every language of
ZG
is a finite union of disjoint shuffles
of a singleton language (containing only one word) and of a regular commutative language,
where the disjoint shuffle operator interleaves two languages (i.e., it describes the sets of
words that can be achieved as interleavings of one word in each language) while requiring
that the two languages are on disjoint alphabets. We sketch the proof of Corollary 3.5:
Proof sketch. We know that the syntactic congruence of a ZG language is a ZG-congruence, so by Theorem 3.4 it is refined by an $n$-congruence; and the equivalence classes of an $n$-congruence can be expressed as stated.
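To make the shape of the languages in Corollary 3.5 concrete, here is a minimal sketch of a membership test for a single disjoint shuffle of a singleton language $\{a_1 \cdots a_k\}$ with a commutative language over a disjoint alphabet $B$. It assumes words are Python strings and that the commutative language is given as a predicate on letter counts; the function name `in_disjoint_shuffle` and this representation are illustrative assumptions, not the paper's construction.

```python
from collections import Counter


def in_disjoint_shuffle(word, singleton, commutative_alphabet, commutative_test):
    """Test membership of `word` in the disjoint shuffle of {singleton} with
    the commutative language over `commutative_alphabet` described by
    `commutative_test`, a predicate on the Counter of letter occurrences."""
    if set(singleton) & set(commutative_alphabet):
        raise ValueError("the two alphabets must be disjoint")
    # The letters outside B must spell exactly the singleton word, in order.
    outside = "".join(a for a in word if a not in commutative_alphabet)
    if outside != singleton:
        return False
    # The letters of B are only constrained through their numbers of occurrences.
    counts = Counter(a for a in word if a in commutative_alphabet)
    return commutative_test(counts)


def even_number_of_x(counts):
    """Example commutative condition: the letter x occurs an even number of times."""
    return counts["x"] % 2 == 0
```

For example, with singleton word `ab`, alphabet $B = \{x, y\}$, and the condition `even_number_of_x`, the word `xaxyb` is accepted, while `xbxa` (wrong order of `a` and `b`) and `xab` (odd number of `x`) are rejected.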
This corollary also implies a characterization of the variety of monoids ZG. To define it, we use the join of two varieties V and W, denoted by V ∨ W, which is the variety of monoids generated by the monoids of V and those of W. Alternatively, the join is the smallest variety containing both varieties. We then have:
Corollary 3.6. The variety ZG is generated by commutative monoids and monoids of the shape $S^1$ with $S$ a nilpotent semigroup. In other words, we have: ZG = MNil ∨ Com.
Proof. Clearly ZG contains both Com and MNil. Furthermore, by Corollary 3.5, any language in ZG is a union of intersections of a language in MNil and a language in Com. Hence, it is in the variety generated by them, concluding the proof.
On a different note, we will also use Theorem 3.4 to show a technical result that will be
useful later. It intuitively allows us to regroup and move arbitrary elements:
Corollary 3.7. For any monoid $M$ in ZG, letting $n \ge (|M| + 1) \cdot \omega$, for any element $m$ of $M$ and elements $m_1, \ldots, m_n$ of $M$, we have
$m \cdot m_1 \cdot m \cdot m_2 \cdot m \cdots m \cdot m_n \cdot m = m^{n+1} \cdot m_1 \cdots m_n$.
Having spelled out the consequences of Theorem 3.4, we turn to its proof. It crucially relies on a general result about ZG that we will use in several proofs, and which is shown simply by manipulating equations:
Lemma 3.8. Let $M$ be a monoid of ZG, let $\omega$ be the idempotent power, and let $x, y \in M$. Then we have: $(xy)^\omega = x^\omega y^\omega$.
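Lemma 3.8 can be checked exhaustively on a small non-commutative example. The sketch below assumes a multiplication-table encoding and uses, as a hypothetical test case, the five-element monoid with elements $1, a, b, ab, 0$ in which $a \cdot b = ab$ and every other product of two non-identity elements is $0$ (a nilpotent semigroup with an identity added). The helper names are illustrative.

```python
def power(table, x, k):
    """Compute x^k for k >= 1, where table[a][b] = a*b."""
    result = x
    for _ in range(k - 1):
        result = table[result][x]
    return result


def idempotent_power(table):
    """Smallest omega >= 1 such that x^omega is idempotent for every x."""
    omega = 1
    while True:
        powers = [power(table, x, omega) for x in range(len(table))]
        if all(table[p][p] == p for p in powers):
            return omega
        omega += 1


def satisfies_lemma_3_8(table):
    """Check (xy)^omega = x^omega y^omega for all pairs of elements."""
    omega = idempotent_power(table)
    elems = range(len(table))
    return all(
        power(table, table[x][y], omega)
        == table[power(table, x, omega)][power(table, y, omega)]
        for x in elems
        for y in elems
    )


# Indices: 0 is the identity, 1 is a, 2 is b, 3 is ab, 4 is the zero.
AB_MONOID = [
    [0, 1, 2, 3, 4],
    [1, 4, 3, 4, 4],
    [2, 4, 4, 4, 4],
    [3, 4, 4, 4, 4],
    [4, 4, 4, 4, 4],
]
```

Here the idempotent power is 2, the monoid is not commutative ($ab \neq ba$), and yet the identity of the lemma holds for every pair, since every power $x^{\omega}$ is either the identity or the zero.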
We can then sketch the proof of Theorem 3.4:
Proof sketch. From Lemma 3.8, we can rewrite any word, up to the ZG-congruence, into a normal form where frequent letters are moved to the end of the word. Looking at this form, we can show that $n$-equivalence implies equivalence by the ZG-congruence.
4 Defining ZG ∗ D and LZG, and Result Statement
We have given our characterizations of ZG and presented some preliminary results. We now move to the definition of LZG and ZG ∗ D, to show that LZG = ZG ∗ D (Theorem 1.1).
ZG ∗ D. We denote by D the variety of the definite semigroups, i.e., the semigroups satisfying the equation $y x^\omega = x^\omega$. The variety of semigroups ZG ∗ D is intuitively defined by taking the semidirect product of monoids in ZG and semigroups in D. Although we will not directly use its definition in this paper, we recall it for completeness. Given two semigroups $S$ and $T$, a semigroup action of $S$ on $T$ is defined by a map $\mathrm{act} : S \times T \to T$ such that $\mathrm{act}(s_1, \mathrm{act}(s_2, t)) = \mathrm{act}(s_1 s_2, t)$ and $\mathrm{act}(s, t_1 t_2) = \mathrm{act}(s, t_1)\, \mathrm{act}(s, t_2)$. We then define the product $\circ_{\mathrm{act}}$ on the set $T \times S$ as follows: for all $s_1, s_2$ in $S$ and $t_1, t_2$ in $T$, we have: $(t_1, s_1) \circ_{\mathrm{act}} (t_2, s_2) := (t_1\, \mathrm{act}(s_1, t_2),\ s_1 s_2)$. The set $T \times S$ equipped with the product $\circ_{\mathrm{act}}$ is a semigroup called the semidirect product of $S$ by $T$, denoted $T \circ_{\mathrm{act}} S$. The variety ZG ∗ D is then the variety generated by the semidirect products of monoids in ZG and semigroups in D. Remark that this operation is equivalent to the wreath product of varieties. Furthermore we could equivalently replace D by the variety of locally trivial semigroups. We refer to [22] for a detailed presentation on this subject.
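The product formula for the semidirect product can be illustrated on a small example. The sketch below only demonstrates the formula $(t_1, s_1)(t_2, s_2) = (t_1\, \mathrm{act}(s_1, t_2),\ s_1 s_2)$; the chosen example (the group $\mathbb{Z}_2$ acting on $\mathbb{Z}_3$ by negation, written additively) is an assumption made for brevity and does not involve a ZG monoid or a definite semigroup. All names are hypothetical.

```python
from itertools import product


def semidirect_product(t_mul, s_mul, act):
    """Return the product on pairs (t, s) induced by an action of S on T."""

    def mul(p, q):
        (t1, s1), (t2, s2) = p, q
        return (t_mul(t1, act(s1, t2)), s_mul(s1, s2))

    return mul


def t_mul(a, b):
    """T = Z_3, written additively."""
    return (a + b) % 3


def s_mul(a, b):
    """S = Z_2, written additively."""
    return (a + b) % 2


def act(s, t):
    """The element 0 of S acts trivially on T; the element 1 acts by negation."""
    return t if s == 0 else (-t) % 3


def is_associative(mul, elements):
    return all(
        mul(mul(p, q), r) == mul(p, mul(q, r))
        for p in elements
        for q in elements
        for r in elements
    )


mul = semidirect_product(t_mul, s_mul, act)
ELEMENTS = list(product(range(3), range(2)))  # all pairs (t, s)
```

The resulting six-element structure is associative but not commutative, even though both factors are commutative: for instance `mul((1, 0), (0, 1))` is `(1, 1)` whereas `mul((0, 1), (1, 0))` is `(2, 1)`.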
LZG. Last, we introduce the variety LZG. This is the variety of semigroups $S$ such that, for every idempotent $e$ of $S$, the submonoid $eSe$ of elements that can be written as $ese$ for some $s \in S$ is in ZG. In other words, a semigroup is in LZG iff it satisfies the following equation: for any $x$, $y$ and $z$ in $S$, we have:
$(z^\omega x z^\omega)^{\omega+1} (z^\omega y z^\omega) = (z^\omega y z^\omega) (z^\omega x z^\omega)^{\omega+1}$.
Again, following Eilenberg's theorem, we also see LZG as a non-erasing variety of languages.
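The definition of LZG via local monoids lends itself to a direct brute-force test. The following is a minimal sketch, assuming a finite semigroup is encoded as a multiplication table `table[a][b]`; the function names are illustrative. For each idempotent $e$, it computes the local monoid $eSe$ and checks the ZG equation inside it, using the idempotent power of the whole semigroup.

```python
def power(table, x, k):
    """Compute x^k for k >= 1, where table[a][b] = a*b."""
    result = x
    for _ in range(k - 1):
        result = table[result][x]
    return result


def idempotent_power(table):
    """Smallest omega >= 1 such that x^omega is idempotent for every x."""
    omega = 1
    while True:
        powers = [power(table, x, omega) for x in range(len(table))]
        if all(table[p][p] == p for p in powers):
            return omega
        omega += 1


def local_monoid(table, e):
    """The set eSe for an idempotent e."""
    return {table[table[e][s]][e] for s in range(len(table))}


def is_in_lzg(table):
    """Check that every local monoid eSe satisfies x^(omega+1) y = y x^(omega+1)."""
    omega = idempotent_power(table)
    for e in range(len(table)):
        if table[e][e] != e:
            continue  # e is not idempotent
        local = local_monoid(table, e)
        for x in local:
            central = power(table, x, omega + 1)
            if any(table[central][y] != table[y][central] for y in local):
                return False
    return True


# Right-zero semigroup {p, q} with xy = y: every local monoid is trivial.
RIGHT_ZERO = [[0, 1], [0, 1]]
# The same semigroup with an identity added (index 0 is the identity).
RIGHT_ZERO_WITH_IDENTITY = [[0, 1, 2], [1, 1, 2], [2, 1, 2]]
```

The two-element right-zero semigroup is in LZG because each of its local monoids is a singleton. Adding an identity yields a monoid whose local monoid at the identity is the whole monoid, where the idempotents `p` and `q` do not commute, so the test fails.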
Main result. Our main result, stated in Theorem 1.1, is that ZG ∗ D and LZG are actually the same variety. To prove this result, we will first present the general framework of Straubing's delay theorem in the next section and show the easy inclusion ZG ∗ D ⊆ LZG, before embarking on the actual proof.
5 Straubing’s Delay Theorem
To show our main result, we will use Straubing's delay theorem from [22]. We first give some prerequisites to recall this result. To this end, let us first define a general notion of category:
Definition 5.1. A finite category on a set of objects $O$ is a finite multiset $C$ over $O \times O$ of arrows, each arrow going from an object to another object (possibly itself), equipped with a composition law: for any arrows $a, b \in C$ such that we can write $a = (o, o')$ and $b = (o', o'')$, the composition law gives us $ab$ which must be an arrow of the form $ab = (o, o'')$. Further, this composition law must be associative. What is more, we require that for any object $o$, there exists an arrow $(o, o)$ which is the identity for all elements with which it can be combined (hence these arrows are in particular unique).
We now define the notion of idempotent category of a semigroup. The idempotent category
of a language is then defined as that of its syntactic semigroup.
Definition 5.2 (Idempotent category). Let $S$ be a semigroup. The idempotent category $S_E$ of $S$ is the finite category defined as follows:
- The objects of $S_E$ are the idempotents of $S$.
- For any idempotents $e$ and $f$ and any element $x$ of $S$ such that $x \in eSf$, we have an arrow labeled by $x$ going from $e$ to $f$, which we will denote by $(e, x, f)$.
The composition law of the category is $(e, x, f)(f, y, g) = (e, xy, g)$. Note that it is clearly associative thanks to the associativity of the composition law on $S$.
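Definition 5.2 can be turned into a short construction. The sketch below assumes a finite semigroup encoded as a multiplication table `table[a][b]` and represents arrows as triples `(e, x, f)`; the function names and the three-element example (a right-zero semigroup $\{p, q\}$ with an identity added) are illustrative assumptions. It enumerates the arrows, composes them, and evaluates a sequence of composable arrows to a single arrow.

```python
def idempotents(table):
    """Objects of the idempotent category: the idempotents of the semigroup."""
    return [e for e in range(len(table)) if table[e][e] == e]


def arrows(table):
    """All triples (e, x, f) with e, f idempotent and x in eSf."""
    idem = idempotents(table)
    return {
        (e, table[table[e][s]][f], f)
        for e in idem
        for f in idem
        for s in range(len(table))
    }


def compose(table, a, b):
    """Composition law: (e, x, f)(f, y, g) = (e, xy, g)."""
    (e, x, f), (f2, y, g) = a, b
    if f != f2:
        raise ValueError("arrows are not composable")
    return (e, table[x][y], g)


def evaluate(table, path):
    """Compose a nonempty sequence of composable arrows into one arrow."""
    result = path[0]
    for arrow in path[1:]:
        result = compose(table, result, arrow)
    return result


# Index 0 is the identity; indices 1 and 2 are p and q with xy = y.
M = [[0, 1, 2], [1, 1, 2], [2, 1, 2]]
```

On this example every element is idempotent, so the category has three objects, and enumerating $eSf$ for each pair of objects gives 13 arrows in total. Composing the arrow `(0, 1, 1)` with `(1, 2, 2)` yields `(0, 2, 2)`, which is again an arrow of the category.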
Let us now study the idempotent category of a semigroup $S$ in more detail. Let $\mathrm{Arrows}(S_E) = \{(e, x, f) \mid x \in eSf\}$ be the set of arrows of the idempotent category. For brevity, we denote this set simply as $B$.
A path of $S_E$ is a nonempty word of $B^*$ whose sequence of arrows is valid, i.e., the end object of each arrow except the last one is equal to the starting object of the next arrow. Because $S_E$ is a category, each path is equivalent to an element of the category, i.e., composing the arrows of the path according to the composition law of the category will give one arrow of the category, whose starting and ending objects will be the starting object of the path (i.e., that of the first arrow) and the ending object of the path (i.e., of the last arrow). Two paths are coterminal if they have the same starting and end object. Two paths $p_1$ and $p_2$ are $S_E$-equal if they evaluate to the same category element, which we write $p_1 \equiv p_2$. Note that if two paths are $S_E$-equal then they must be coterminal. A loop is a path whose starting and ending objects are the same.
A congruence is an equivalence relation $\sim$ over $B^*$ which satisfies compositionality, i.e., it is compatible with the concatenation of words in the following sense: for any words $x$, $y$, $z$, and $t$ of $B^*$, if $x \sim y$ and $z \sim t$, then $xz \sim yt$. Note that the relation is also defined on words of $B^*$ that are not valid, i.e., do not correspond to paths; and compositionality also applies to such words.
Definition 5.3 (Compatible congruence). A congruence $\sim$ on $B^*$ is compatible with $S_E$ iff for any two coterminal paths $p_1$ and $p_2$ of $S_E$ such that $p_1 \sim p_2$, we have $p_1 \equiv p_2$. In other words, $\sim$ is compatible with $S_E$ iff, on words of $B^*$ that are coterminal paths, it refines $S_E$-equality.
Recall the notion of a ZG-congruence from Section 2. We are now ready to state Straubing's delay theorem. The theorem applies to any variety, but we state it specifically for ZG for our purposes. The theorem gives us an alternative characterization of ZG ∗ D:
Theorem 5.4 (Straubing's delay theorem (Theorem 5.2 of [22])). A language $L$ is in ZG ∗ D iff, writing $S_E$ the idempotent category of $L$ and defining $B := \mathrm{Arrows}(S_E)$ as above, there exists a ZG-congruence on $B^*$ which is compatible with $S_E$.
Using our notion of n-congruence, via Theorem 3.4 and Claim 3.3, we rephrase it again:
Corollary 5.5. A language $L$ is in ZG ∗ D iff, writing $S_E$ and $B$ as above, there exists an $n$-congruence on $B^*$ which is compatible with $S_E$.
Before moving on to the full proof of our main theorem (Theorem 1.1), we conclude the section by noticing that the Straubing delay theorem implies the easy direction of our result, namely, if $L$ is in ZG ∗ D then $L$ is in LZG. This easy direction follows directly from [28], but we provide a self-contained argument in Appendix C for completeness.
In the rest of this paper, we show the much harder direction, i.e., if $L$ is in LZG then $L$ is in ZG ∗ D. To prove this, using Corollary 5.5, it suffices to show:
Claim 5.6. Let $S$ be a semigroup of LZG, write $S_E$ its idempotent category and $B := \mathrm{Arrows}(S_E)$. There is $n > 0$ such that the $n$-congruence on $B^*$ is compatible with $S_E$.
This result then implies, by our rephrasing of Straubing's result (Corollary 5.5), that $S$ is in ZG ∗ D. So in the rest of this paper we prove Claim 5.6. The proof is structured in three sections. First, in Section 6, we carefully choose the value $n$ in the congruence to be "large enough". Second, in Section 7, we show auxiliary results about paths in the category of idempotents. Third, in Section 8, we conclude the proof, first by an induction on the number of rare arrow occurrences, then by a decomposition of the category using ear decompositions of multigraphs.
6 Choosing the Congruence
In this section, we define our choice of the value of $n$ to prove Claim 5.6. Intuitively, we need to choose $n$ to be large enough so that the "gap" between the number of occurrences of rare letters and of frequent letters can be made sufficiently large:
Definition 6.1. For $\Sigma$ an alphabet, $u \in \Sigma^*$, $n \in \mathbb{N}$, and $m > 0$, we say that $n$ is an $m$-distant rare-frequent threshold if, letting $\Sigma_r := \{a \in \Sigma \mid |u|_a \le n\}$ be the rare letters for $n$, then their total number of occurrences in $u$ plus 1, multiplied by $m$, is at most $n$, formally $\left(1 + \sum_{a \in \Sigma_r} |u|_a\right) \times m \le n$.
In other words, as every frequent letter occurs strictly more than $n$ times, this guarantees that every frequent letter occurs strictly more than $(r + 1)m$ times, where $r$ is the total number of occurrences of rare letters. Since the $r$ rare occurrences split the word into at most $r + 1$ contiguous blocks, this means that there must be some contiguous subword containing no rare letter where the frequent letter occurs $> m$ times. This will prove useful in pumping arguments. Specifically, we will want an $m$-distant rare-frequent threshold with $m := |S|$.
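The threshold condition and its pigeonhole consequence can be checked on concrete words. The following is a minimal sketch, assuming words are Python strings; the function names `is_distant_threshold` and `max_clean_block` are hypothetical. The first function tests the inequality of Definition 6.1, and the second computes, for a frequent letter, its largest number of occurrences inside a contiguous block that contains no rare letter.

```python
from collections import Counter


def is_distant_threshold(u, n, m):
    """Check (1 + total occurrences of rare letters) * m <= n for the word u."""
    counts = Counter(u)
    rare_total = sum(c for c in counts.values() if c <= n)
    return (1 + rare_total) * m <= n


def max_clean_block(u, n, letter):
    """Largest number of occurrences of `letter` within a contiguous block
    of u that contains no rare letter (for the threshold n)."""
    counts = Counter(u)
    best = current = 0
    for a in u:
        if counts[a] <= n:
            current = 0  # a rare letter ends the current block
        elif a == letter:
            current += 1
            best = max(best, current)
    return best


U = "aaaaabaaaaa"  # ten occurrences of a, one occurrence of b
```

For the word `U`, the value $n = 4$ is a 2-distant threshold: the only rare letter is `b` with one occurrence, and $(1 + 1) \times 2 \le 4$. Accordingly the frequent letter `a` occurs more than $m = 2$ times in a block without rare letters (here 5 times). The value $n = 10$ is not a 2-distant threshold, since `a` itself becomes rare.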
Of course, we cannot pick an $n$ which will be an $m$-distant rare-frequent threshold for any path $u$: there will always be paths $u$ where the number of arrow occurrences is close to $n$. However, remember that $n$-equivalence implies $n'$-equivalence for all $n'$ that divide $n$ (Claim 3.2). This suggests that, by choosing a large and composite enough $n$, we can ensure that, given any pair of paths, we can pick a divisor $n'$ which is an $m$-distant rare-frequent threshold. We will also want to ensure in the sequel that $n'$ is also always a multiple of the idempotent power $\omega$ and of $|S| + 1$. Let us formally state that such a choice of $n$ exists:
Lemma 6.2. For any $m \ge 1$, for any semigroup $S$, letting $S_E$ be the category of idempotents of $S$, there exists an integer $n \ge 2$ with the following property: for any paths $u_1, u_2$ in $S_E$, there exists a divisor $n'$ of $n$ which is a multiple of $\omega \times (|S| + 1)$ (for $\omega$ the idempotent power of $S$) such that $n'$ is an $m$-distant rare-frequent threshold for $u_1$ and for $u_2$.
This value of $n$ will be the one we choose in our proof of Claim 5.6. Note that the result applies to arbitrary semigroups, not only those of ZG. Before we prove Lemma 6.2, we first observe for later that if a path has an $m$-distant rare-frequent threshold (actually it suffices to have a 1-distant rare-frequent threshold) then the frequent arrows in this path for this threshold must form a so-called union of strongly connected components (SCCs):
Definition 6.3. Given $S_E$ and a subset $B'$ of its arrows $B$, we say that $B'$ is a union of SCCs if, letting $G$ be the directed graph on the objects of $S_E$ formed of the arrows of $B'$, then all connected components of $G$ are strongly connected.
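The condition of Definition 6.3 is easy to test algorithmically. The sketch below assumes the chosen arrows are given as `(source, target)` pairs of objects (labels are irrelevant for this test); the names `reachable` and `is_union_of_sccs` are illustrative. It relies on the observation that all connected components are strongly connected exactly when, for every arrow from `u` to `v`, there is a directed path back from `v` to `u`.

```python
def reachable(adjacency, start):
    """Set of objects reachable from `start` by following arrows."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_union_of_sccs(arrow_pairs):
    """True iff every connected component of the directed multigraph
    formed by the given arrows is strongly connected."""
    adjacency = {}
    for src, dst in arrow_pairs:
        adjacency.setdefault(src, []).append(dst)
    # Each arrow must be part of a cycle: its target must reach its source.
    return all(src in reachable(adjacency, dst) for src, dst in arrow_pairs)
```

For example, the arrows `[(1, 2), (2, 1), (3, 3)]` form a union of SCCs (a two-object cycle and a self-loop), whereas `[(1, 2), (2, 3), (3, 2)]` do not, because there is no way back from object 2 to object 1.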
Claim 6.4. Fix $S$ and $S_E$ and $B$, let $w \in B^*$ be a path of $S_E$, and let $n > 1$ be a 1-distant rare-frequent threshold of $w$. Then the set of frequent arrows of $w$ for $n$ is a union of SCCs.
Proof sketch. Any frequent arrow occurs $> n$ times in $w$, so $w$ must contain $n$ occurrences of a return path. Now, by 1-distance, not all of these paths can contain a rare arrow.
Hence, in the rest of this section, we prove Lemma 6.2. Let us show an abstract result
that will give our choice of value n:
Claim 6.5. For any $d > 0$ and $k > 0$ and $m \ge 1$, there exists $n \ge m$ such that for any $d$-tuple $T$ of integers, there exists an $n'' \ge m$ such that, letting $n' := n'' k$, we have that $n'$ divides $n$ and that $m \sum_{i \in F} T_i \le n'$, where $F = \{i \mid T_i \le n'\}$.
Intuitively, $d$ is the cardinality of the alphabet $B$, $k$ is $\omega \times (|S| + 1)$ (ensuring that we always work with multiples of that value), $m$ enforces a sufficiently large gap (it is the parameter of a distant rare-frequent threshold), and $n$ is the threshold that we will choose, to ensure the existence of a suitable threshold $n'$. Let us sketch the proof of Claim 6.5:
Proof sketch. By choosing a sufficiently large $n$, we can ensure that we have $2^d + 1$ divisors of $n$ that are candidate thresholds (i.e., are multiples of $k$) and are sufficiently far apart. As there are only $2^d$ possible partitions in rare and frequent alphabets, the pigeonhole principle ensures that two divisors achieve the same partition. Taking the larger one then ensures that the gap between rare and frequent letter occurrences is sufficiently large.
With Claim 6.5, it is now easy to show Lemma 6.2; details are given in the appendix.
7 The Loop Insertion and Prefix Substitution Lemmas
We now show several auxiliary results on the category of idempotents to be used in the sequel. We first show some combinatorial results on paths and loops. We then use them to establish two technical claims on paths: the loop insertion lemma, making it possible to insert any loop of frequent arrows to the power $n$ (with $n$ a sufficiently distant rare-frequent threshold) without affecting equivalence; and the prefix substitution lemma, which we can use to replace a prefix of frequent arrows by another up to inserting a loop later in the path.
Recall that $S_E$ denotes the idempotent category of the semigroup $S$ in LZG that we study, and $B$ denotes the set of arrows of $S_E$.
Basic combinatorial results. We first apply the definition of ZG to the local monoid to get a trivial result about the commutation between loops:
Claim 7.1. Let $x$ and $y$ be two coterminal loops of $S_E$, let $k \in \mathbb{Z}$, and let $\omega$ be an idempotent power of $S$. We have: $x^{\omega+k} y \equiv y x^{\omega+k}$.
We then show that frequent loops can be “recombined” without changing the category
image, simply by equation manipulation:
Claim 7.2. For $x$, $x'$ two coterminal paths in $S_E$ and $y$, $y'$ coterminal paths in $S_E$ such
that xy and xyare valid loops, we have: (xy)ω(xy)ω(xy)ω(xy)ω(xy)ω(xy)ω.
The previous lemma implies that we can freely change the initial part of a path, even if it is not a loop, when there is a coterminal path under an $\omega$ with which we can swap it. We show this again by equation manipulation, and it will be crucial for the prefix substitution lemma that we show later in the section:
Claim 7.3. For $x$, $x'$ two coterminal paths in $S_E$ and $y$, $y'$ coterminal paths in $S_E$ such that $xy$ and $x'y'$ are valid loops, and for any path $t$ coterminal with $y$, the following equation
holds: xt(xy)ω(xy)ωxt(xy)ωxy(xy)ω1.
Loop insertion lemma. We now argue that, when we have a sufficiently distant rare-frequent threshold $n$, we can insert any arbitrary loop raised to the power $n$ without changing the category element to which a path evaluates:
Lemma 7.4 (Loop insertion lemma). Let $p$ be a path, and assume that $n$ is a rare-frequent threshold for $p$ which is $|S|$-distant and a multiple of $\omega$. Let $p = rt$ be a decomposition of $p$ (with $r$ or $t$ potentially empty), let $o$ be the object between $r$ and $t$ (i.e., the final object of $r$, or the initial object of $t$ if $r$ is empty), and let $p'$ be a loop over $o$ that only uses frequent arrows. Then $p \equiv r (p')^n t$ (note that they are also $n$-equivalent by construction).
We only sketch the proof of this result here; the full proof is given in the appendix.
Proof sketch. Intuitively, we show that, at any object $o$ along a path, any idempotent $x$ that can be achieved as a loop of frequent arrows of the form $q_x^n$ on $o$ can be added without affecting equivalence. This clearly preserves $n$-equivalence by definition, so we need only argue that it preserves equivalence. This claim implies Lemma 7.4, as the loop $(p')^n$ that we spawn is then absorbed by one of these idempotents.
To establish the claim, we first show that, for any prefix $u$ of frequent arrows, we can spawn a loop starting with $u$ with some arbitrary return path. This is by induction on the length of $u$. The base case of a prefix $a$ of length 1 is shown by the pigeonhole principle: we consider all occurrences of $a$, and as it is frequent and the threshold is $|S|$-distant we can apply the pigeonhole principle to two occurrences of $a$ separated only by frequent arrows, which we then iterate to form the loop. The induction step is shown by spawning a loop with a shorter prefix, then spawning the missing arrow of the prefix within the first loop, and recombining. Thanks to this, all necessary loops can be spawned, proving the claim.
Prefix substitution lemma. We finally show that we can freely change any prefix of frequent arrows of a path, up to inserting a loop of frequent arrows elsewhere:
Lemma 7.5. Let $p = xry$ be a path, and assume that $n$ is a rare-frequent threshold for $p$ which is $|S|$-distant and a multiple of $\omega$. Let $x'$ be a path coterminal with $x$. Assume that every arrow in $x$ and in $x'$ is frequent. Assume that some object $o$ in the SCC of frequent arrows of the initial object of $r$ occurs again in $y$, say as the intermediate object of $y = y_1 y_2$. Then there exists $y' = y_1 y'' y_2$ for some loop $y''$ consisting only of frequent arrows such that $p \equiv x' r y'$ and such that $p \sim_n x' r y'$.
This claim is shown in the appendix: it uses Claim 6.4 to argue that frequent arrows are a union of SCCs, and crucially relies on Claim 7.3.
8 Concluding the Proof of $\mathbf{LZG} \subseteq \mathbf{ZG} * \mathbf{D}$: Claim 5.6
We are now ready to prove the second direction of Theorem 1.1, namely Claim 5.6. We have fixed the semigroup $S$ in LZG, its category of idempotents $S_E$, and $B := \mathrm{Arrows}(S_E)$. We take the $n$ given by Lemma 6.2. Our goal is to show that the $n$-congruence on $B^*$ is compatible with $S_E$. To do so, let $u_1$ and $u_2$ be two coterminal paths that are $n$-equivalent. We must show that $u_1 \equiv u_2$, i.e., the two paths $u_1$ and $u_2$ evaluate to the same category element in $S_E$.
To do so, we use the guarantee on $n$ ensured by Lemma 6.2 to pick an $n'$ with which to work. Considering the path $u_1$, the lemma ensures that there is a divisor $n'$ of $n$ which is a multiple of $\omega \times (|S| + 1)$ and is an $|S|$-distant rare-frequent threshold for $u_1$. As $n'$ divides $n$ and $u_1$ and $u_2$ are $n$-equivalent, by Claim 3.2 we know that they are also $n'$-equivalent. So we will only consider the $n'$-congruence, denoted $\sim$, from now on. We know that $u_1$ and $u_2$ are $n'$-equivalent, that $n'$ is a multiple of $\omega \times (|S| + 1)$, and that $n'$ is an $|S|$-distant rare-frequent threshold for $u_1$ and for $u_2$. Recall that, following Definition 3.1, now that we have fixed the threshold $n'$, we call an arrow of $B$ rare in $u_1$ (resp. in $u_2$) if it occurs $\leq n'$ times, and frequent otherwise.
We will show that $u_1 \equiv u_2$, and in fact will show that $p_1 \equiv p_2$ for pairs of paths $p_1, p_2$ more generally. More specifically, we establish the following claim by finite induction on $r$: let $p_1$ and $p_2$ be two coterminal paths that are $n'$-equivalent and which contain $r$ rare arrows each. Then $p_1 \equiv p_2$. Showing this for all $r$ establishes in particular that $u_1 \equiv u_2$.
Base case: all arrows in $p_1$ and $p_2$ are frequent. The base case of the induction is:
Claim 8.1. Let $p_1$ and $p_2$ be two coterminal paths that are $n'$-equivalent and which contain no rare arrows. Then $p_1 \equiv p_2$.
The claim is shown in Appendix F.1, so we only sketch the proof here. We consider the multigraph $G$ of all arrows occurring in $p_1$ and $p_2$ (note that these arrows for $p_1$ and $p_2$ must be the same). We prove that $p_1 \equiv p_2$ by another induction, this time on the number of frequent arrows, i.e., the number of edges of the multigraph $G$. Formally, we show the following by finite induction on the integer $\eta$: let $q_1$ and $q_2$ be two coterminal paths that are $n'$-equivalent, where all rare arrows have 0 occurrences, and where there are $\eta$ different frequent arrows. Then $q_1 \equiv q_2$. Showing this for all $\eta$ establishes in particular that $p_1 \equiv p_2$.
The base case is $\eta = 0$, in which case $q_1$ and $q_2$ must be empty and the claim is trivial, so what matters is the induction step on $G$. Assume that the claim is true for any $q_1$ and $q_2$ such that $G$ has $\eta$ edges. Consider $q_1$ and $q_2$ such that $G$ has $\eta + 1$ edges. Recall that $G$ is strongly connected: indeed, we know, as all arrows are frequent, that $G$ is a union of SCCs, and $p_1$ (or $p_2$) witnesses that $G$ is connected, so we know that $G$ is strongly connected. The induction case is shown using a decomposition result on strongly connected multigraphs following the notion of ear decomposition:
Lemma 8.2. Let $G$ be a strongly connected nonempty directed multigraph. We have:
- $G$ is a simple cycle; or
- $G$ contains a simple cycle $u_1 \to \cdots \to u_n \to u_1$ with $n \geq 1$, where all vertices $u_1, \ldots, u_n$ are pairwise distinct, such that all intermediate vertices $u_2, \ldots, u_{n-1}$ only occur in the edges of the cycle, and such that the removal of the cycle leaves the graph strongly connected (note that the case $n = 1$ corresponds to the removal of a self-loop); or
- $G$ contains a simple path $u_1 \to \cdots \to u_n$ with $n \geq 2$ where all vertices are pairwise distinct, such that all intermediate vertices $u_2, \ldots, u_{n-1}$ only occur in the edges of the path, and such that the removal of the path leaves the graph strongly connected (note that the case $n = 2$ corresponds to the removal of a single edge).
This is a known result [7], but we give a self-contained proof in Appendix F.1. We use it to distinguish three cases in the induction step of the induction on $G$, which we now sketch.
The first case is when $G$ is a simple cycle. In this case $n'$-equivalence ensures that the cycle is taken by $q_1$ and $q_2$ some number of times with the same remainder modulo $n'$, so they evaluate to the same element because $n'$ is a multiple of $\omega$.
The second case is when $G$ contains a simple cycle only connected to a single object. This time, we argue as in the previous case that the number of occurrences of the cycle must have the same remainder, and we can use Corollary 3.7 to merge all the occurrences together. However, to eliminate them, we need to use Lemma 7.5, to modify $q_1$ and $q_2$ to have the same prefix (up to and including the cycle occurrences), while preserving equivalence. This allows us to consider the rest of the paths (which contains no occurrence of the cycle), apply the induction hypothesis to them, and conclude by compositionality. A technicality is that we must ensure that removing the common prefix does not make some arrows insufficiently frequent relative to the distant rare-frequent threshold. We avoid this using Lemma 7.4 to spawn sufficiently many copies of a suitable loop.
The third case is when $G$ contains a simple path $\pi$ connecting two objects. The reasoning is similar, but we also use Lemma 7.4 to spawn a loop involving a return path for $\pi$ and a path that is parallel to $\pi$ (i.e., does not share any arrows with it). The return path in this loop can then be combined with $\pi$ to form a loop, which we handle like in the previous case.
Induction case: some arrows are rare. Let us now show the induction step for the outer induction, namely, the one on the number of occurrences of rare arrows. We assume the claim of the outer induction for $r \in \mathbb{N}$. Consider two paths $p_1$ and $p_2$ that are $n'$-equivalent and that contain $r + 1$ rare arrow occurrences. Let us partition them as $p_1 = q_1 a s_1$ and $p_2 = q_2 a s_2$ where $q_1$ and $q_2$ both consist of frequent arrows, and $a$ is the first rare arrow of $p_1$ and $p_2$ (note that $n'$-equivalence implies that the first rare arrow is the same in both paths). In this case, $q_1$ and $q_2$ are two coterminal paths consisting only of frequent arrows (or they are empty), and $s_1$ and $s_2$ are two coterminal paths (possibly empty) with $r$ rare arrow occurrences.
The full proof is given in the appendix; we only sketch it. By Claim 6.4, either the SCC at the origin of $a$ occurs again in the rest of the path, or it does not. In the first case, we argue with Lemma 7.5 that the prefix $q_1$ can be substituted for $q_2$ without affecting equivalence, reason on the rest of the path by induction hypothesis, and conclude by compositionality. In the second case, $n'$-equivalence intuitively ensures that $q_1$ and $q_2$, and $s_1$ and $s_2$, must each be $n'$-equivalent, so we can apply the induction hypothesis to each of them. This concludes the induction step of the outer induction.
Concluding the proof. We have established by induction that $p_1$ and $p_2$ evaluate to the same category element in all cases. This implies that $n$-equivalence for our choice of $n$ is compatible with $S_E$, so by Corollary 5.5 we know that $L$ is in $\mathbf{ZG} * \mathbf{D}$. Thus, $L \in \mathbf{LZG}$ implies that $L \in \mathbf{ZG} * \mathbf{D}$. We have therefore established the locality result $\mathbf{LZG} = \mathbf{ZG} * \mathbf{D}$, concluding the proof of Claim 5.6 and hence of Theorem 1.1.
9 Conclusion
In this paper, we have given a characterization of the languages of ZG, and proved that the variety ZG is local. The methodology seems to be adaptable enough to tackle $\mathbf{ZG} \cap \mathbf{A} = \mathbf{MNil}$ as well, but this would require a careful analysis of the proofs that we defer to future work.
The case of ZE is more complicated. As proved by Almeida [1], we have $\mathbf{ZE} = \mathbf{G} \vee \mathbf{Com}$, that is, ZE is the variety of monoids generated by both groups and commutative monoids. Now, Com is a specific example of a non-local variety, while G is a local variety. This being said, we do not know of general results showing the preservation or non-preservation of locality under such operators. Interestingly, however, one can check by computation that the counter-example language in LCom but not in $\mathbf{Com} * \mathbf{D}$ (illustrating that Com is not local), namely $e^* a f^* b e^* c f^*$, is in LZG.
We hope that extending our approach to a study of locality for centrally defined varieties
in general could lead to such general results on the interplay of join operations and of the
locality or non-locality for arbitrary varieties.
References
[1] Jorge Almeida. Finite semigroups and universal algebra, volume 3. World Scientific, 1994.
[2] Jorge Almeida. A syntactical proof of locality of DA. Int. J. Algebra Comput., 6(2):165–178, 1996. doi:10.1142/S021819679600009X.
[3] Antoine Amarilli, Louis Jachiet, and Charles Paperman. Dynamic membership for regular languages. Available online: https://a3nm.net/publications/amarilli2021dynamic.pdf. Also submitted to ICALP’21, 2021.
[4] K. Auinger. Join decompositions of pseudovarieties involving semigroups with commuting idempotents. Journal of Pure and Applied Algebra, 170(2):115–129, 2002.
[5] Karl Auinger. Semigroups with central idempotents. In Algorithmic problems in groups and semigroups, pages 25–33. Springer, 2000.
[6] Karl Auinger. On the decidability of membership in the global of a monoid pseudovariety. Int. J. Algebra Comput., 20(2):181–188, 2010. doi:10.1142/S0218196710005571.
[7] Jørgen Bang-Jensen and Gregory Z Gutin. Digraphs: theory, algorithms and applications. Springer Science & Business Media, 2008.
[8] J.A. Brzozowski and Imre Simon. Characterizations of locally testable events. Discrete Mathematics, 4(3):243–271, 1973. doi:10.1016/s0012-365x(73)80005-6.
[9] Luc Dartois and Charles Paperman. Alternation hierarchies of first order logic with regular predicates. In FCT, 2015.
[10] Samuel Eilenberg. Automata, languages, and machines. Vol. B. Academic Press [Harcourt Brace Jovanovich, Publishers], New York-London, 1976. With two chapters (“Depth decomposition theorem” and “Complexity of semigroups and morphisms”) by Bret Tilson, Pure and Applied Mathematics, Vol. 59.
[11] Nathan Grosshans, Pierre McKenzie, and Luc Segoufin. The power of programs over monoids in DA. In MFCS, 2017.
[12] Robert Knast. A semigroup characterization of dot-depth one languages. RAIRO Theor. Informatics Appl., 17(4):321–330, 1983. doi:10.1051/ita/1983170403211.
[13] Robert Knast. Some theorems on graph congruences. RAIRO Theor. Informatics Appl., 17(4):331–342, 1983. doi:10.1051/ita/1983170403311.
[14] Manfred Kufleitner and Alexander Lauser. Quantifier alternation in two-variable first-order logic with successor is decidable. In STACS, 2013.
[15] Robert McNaughton. Algebraic decision procedures for local testability. Mathematical Systems Theory, 8(1):60–76, 1974. doi:10.1007/bf01761708.
[16] J.-E. Pin. Varieties of formal languages. Foundations of Computer Science. Plenum Publishing Corp., New York, 1986. With a preface by M.-P. Schützenberger, Translated from the French by A. Howie. doi:10.1007/978-1-4613-2215-3.
[17] Thomas Place and Luc Segoufin. Decidable characterization of FO2(<, +1) and locality of DA. abs/1606.03217, 2016.
[18] Gudmund Skovbjerg Frandsen, Peter Bro Miltersen, and Sven Skyum. Dynamic word problems. JACM, 44(2):257–271, 1997.
[19] Benjamin Steinberg. A modern approach to some results of Stiffler. In Semigroups and Languages. World Scientific, 2004. doi:10.1142/9789812702616_0013.
[20] Price Stiffler. Chapter 1. Extension of the fundamental theorem of finite semigroups. Advances in Mathematics, 11(2):159–209, 1973. doi:10.1016/0001-8708(73)90007-8.
[21] Howard Straubing. The variety generated by finite nilpotent monoids. Semigroup Forum, 24(1):25–38, 1982. doi:10.1007/bf02572753.
[22] Howard Straubing. Finite semigroup varieties of the form V*D. Journal of Pure and Applied Algebra, 36:53–94, 1985.
[23] Howard Straubing. Finite automata, formal logic, and circuit complexity. Birkhäuser Boston Inc., Boston, MA, 1994.
[24] Howard Straubing. A new proof of the locality of R. International Journal of Algebra and Computation, 25(01n02):293–300, 2015. doi:10.1142/s0218196715400111.
[25] Denis Thérien and Alex Weiss. Graph congruences and wreath products. J. Pure Appl. Algebra, 36(2):205–215, 1985. doi:10.1016/0022-4049(85)90071-4.
[26] Denis Thérien and Thomas Wilke. Over words, two variables are as powerful as one quantifier alternation. In STOC, 1998.
[27] Denis Thérien and Thomas Wilke. Temporal logic and semidirect products: An effective characterization of the until hierarchy. SIAM J. Comput., 31(3):777–798, 2001. doi:10.1137/S0097539797322772.
[28] Bret Tilson. Categories as algebra: an essential ingredient in the theory of monoids. J. Pure Appl. Algebra, 48(1-2):83–198, 1987. doi:10.1016/0022-4049(87)90108-3.
[29] Jean Éric Pin. Mathematical foundations of automata theory. https://www.irif.fr/~jep/PDF/MPRI/MPRI.pdf, 2019.
A Proofs for Section 2 (Preliminaries)
Claim 2.1. For any monoid $M$ in ZG, for $x, y \in M$, and $k \in \mathbb{Z}$, we have: $x^{\omega+k} y = y x^{\omega+k}$.
Proof. This is simply because we can always write $x^{\omega+k}$ as $(x^{\omega+k})^{\omega+1}$, because the latter is equal to $x^{(\omega+k)\times(\omega+1)}$ which is indeed equal to $x^{\omega+k}$. Thus, by setting $x' := x^{\omega+k}$ and applying the equation, we conclude.
B Proofs for Section 3 (Characterizations of ZG)
B.1 Miscellaneous Results
Claim 3.2. For any alphabet $\Sigma$, for any $n > 0$, for any $m > 0$, if $m$ is a multiple of $n$ then the $m$-congruence refines the $n$-congruence.
Proof. The claim is trivial for $m = n$, so we assume $m > n$. As $m > n$, if two words $u$ and $v$ have the same rare alphabet for $m$, then they have the same rare alphabet for $n$, because the number of occurrences of all rare letters for $m$ is the same, so the same ones are also rare for $n$. Furthermore, if they have the same rare subword for $m$, the restriction of this same rare subword to the rare letters for $n$ yields the same word. Last, we show that the numbers of occurrences modulo $n$ are the same. For the letters that were frequent for $m$, this is the case because their number of occurrences is congruent modulo $m$, hence modulo $n$ because $n$ divides $m$. For the letters that were not frequent for $m$, this is because their number of occurrences has to be the same because the rare subwords for $m$ were the same.
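As an informal sanity check of Claim 3.2, the following Python sketch computes a canonical key for the $n$-congruence class of a word and tests refinement exhaustively on short words. It assumes the reading of Definition 3.1 that the proofs in this appendix rely on: a letter is rare when it occurs at most $n$ times and frequent otherwise, and two words are $n$-congruent when they have the same rare subword and the same frequent letters with equal occurrence counts modulo $n$. The names `n_class` and `refines` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import product


def n_class(word, n):
    """Canonical key of the n-congruence class of `word` (assumed definition).

    The key is the rare subword together with the residue modulo n of the
    number of occurrences of each frequent letter.
    """
    counts = Counter(word)
    rare_subword = "".join(c for c in word if counts[c] <= n)
    residues = tuple(sorted((c, k % n) for c, k in counts.items() if k > n))
    return rare_subword, residues


def refines(m, n, alphabet="ab", max_len=7):
    """Check, on all words up to max_len, that m-congruent words are n-congruent."""
    seen = {}
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            key_m, key_n = n_class(word, m), n_class(word, n)
            # All words sharing an m-class must share a single n-class.
            if seen.setdefault(key_m, key_n) != key_n:
                return False
    return True
```

For instance, `refines(4, 2)` exercises the claim with $m = 4$ and $n = 2$, while `refines(3, 2)` fails because the words $a^4$ and $a^7$ are 3-congruent but have different parities.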
Claim 3.3. For any alphabet $\Sigma$ and $n > 0$, the $n$-congruence over $\Sigma^*$ is a ZG-congruence.
Proof. Let $E$ be an equivalence class of the $n$-congruence, which we see as a language of $\Sigma^*$, and let us show that $E$ is a language of ZG. Let $\Sigma = A \sqcup B$ be the partition of $\Sigma$ into rare and frequent letters for the class $E$, let $u$ be the word over $A$ associated to the class $E$, and let $k$ be the $|B|$-tuple describing the modulo values for $E$. We know that the singleton language $\{u\}$ is a language of ZG, because it is finite. Hence, the language $U = B^* u_1 B^* \cdots B^* u_n B^*$ is also in ZG, because it is the inverse image of $\{u\}$ by the morphism that erases the letters of $B$ and is the identity on $A$. Similarly, the language $C$ of words of $B^*$ where the modulo values of each letter are as prescribed by $k$ and where every letter occurs at least $n$ times is a language of ZG, because it is commutative. For the same reason as for $U$, the language $C'$ of words of $\Sigma^*$ whose restriction to $B$ is in $C$ is also a language of ZG, because it is the inverse image of $C$ by the morphism that erases the letters of $A$ and is the identity on $B$. Now, we remark that $E = C' \cap U$, so $E$ is in ZG, concluding the proof.
B.2 Proofs of the Characterizations (Consequences of Theorem 3.4)
Corollary 3.5. Any ZG language $L$ can be expressed as a finite union of languages of the form $B^* a_1 B^* a_2 \cdots a_k B^* \cap K$ where $\{a_1, \ldots, a_k\} \cap B = \emptyset$ and $K$ is a regular commutative language.
Proof. Fix a language $L$ in ZG, and consider the syntactic congruence $\sim$ of $L$: it is a ZG-congruence. By Theorem 3.4, there exists $n \in \mathbb{N}$ such that $\sim$ is refined by an $n$-congruence $\sim_n$. Now, by definition of the syntactic congruence, the set of words of $\Sigma^*$ that are in $L$ is a union of equivalence classes of $\sim$, hence of $\sim_n$. This means that $L$ can be expressed as the union of the languages corresponding to these classes.
Now, an equivalence class of the $n$-congruence $\sim_n$ can be expressed as the shuffle of two languages: the singleton language containing the rare word defining the class, and the language that imposes that all frequent letters are indeed frequent (so the rare alphabet is as required) and that the modulo of their number of occurrences is as specified. The second language is commutative, and the disjointness of rare and frequent letters guarantees that the shuffle is indeed disjoint.
Thus, we have shown that $L$ is a union of disjoint shuffles of a singleton language and a regular commutative language. The form stated in the corollary is equivalent, i.e., it is the shuffle of the singleton language $\{a_1 \cdots a_k\}$ and of the commutative language obtained by restricting $K$ to the subalphabet $B$.
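To make the shape of these languages concrete, here is a minimal Python sketch of a membership test for one language of the form $B^* a_1 B^* \cdots a_k B^* \cap K$. It is only an illustration under stated assumptions: the function name `in_shuffle_language` is hypothetical, and the commutative language $K$ is represented by a predicate on letter counts (which suffices because membership in a commutative language depends only on the number of occurrences of each letter).

```python
from collections import Counter


def in_shuffle_language(word, rare_word, frequent_letters, commutative_pred):
    """Test membership in B* a_1 B* ... a_k B* intersected with K.

    rare_word        -- the word a_1 ... a_k
    frequent_letters -- the subalphabet B, which must be disjoint from rare_word
    commutative_pred -- predicate on a Counter of letters, standing for K
    """
    B = set(frequent_letters)
    if B & set(rare_word):
        raise ValueError("the shuffle must be disjoint")
    # Erasing the letters of B must leave exactly a_1 ... a_k.
    outside_B = "".join(c for c in word if c not in B)
    return outside_B == rare_word and commutative_pred(Counter(word))
```

For example, with `rare_word = "ab"`, `frequent_letters = "c"` and a predicate requiring an even number of occurrences of `c`, the word `caccbc` is accepted, while `cbcac` (wrong order of rare letters) and `cacbc` (odd number of `c`) are rejected.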
Corollary 3.7. For any monoid $M$ in ZG, letting $n \geq (|M| + 1) \cdot \omega$, for any element $m$ of $M$ and elements $m_1, \ldots, m_n$ of $M$, we have
$m \cdot m_1 \cdot m \cdot m_2 \cdot m \cdots m \cdot m_n \cdot m = m^{n+1} \cdot m_1 \cdots m_n$.
Proof. We consider the free monoid $M^*$. Let $\eta : M^* \to M$ be the onto morphism defined by $\eta(m_1 \cdot m_2) = \eta(m_1) \cdot \eta(m_2)$. Let $\sim$ be the congruence it induces over $M^*$, i.e., for $u, v \in M^*$, we have $u \sim v$ if $\eta(u) = \eta(v)$. Remark that, as $M$ is in ZG, the congruence $\sim$ is a ZG-congruence by definition. Hence, by Theorem 3.4, $\sim$ is refined by an $n$-congruence $\sim_n$ where $n = (|M| + 1)\omega$. Now, consider the two words in the equation to be shown above: they are words of $M^*$. As the letter $m$ is then a frequent letter, we know that the two words are indeed $n$-congruent, which concludes.
B.3 Proving Lemma 3.8
We show the lemma by establishing two claims:
Claim B.1. We have: $(xy)^\omega = x^\omega y^\omega (xy)^\omega$.
Proof. We have $(xy)^\omega = xy(xy)^{\omega-1}$: as $(xy)^{\omega-1}$ is central the right-hand side is equal to $x(xy)^{\omega-1}y$. By injecting an $(xy)^\omega$ in the latter, we obtain:
$(xy)^\omega = x(xy)^{\omega-1}(xy)^\omega y$.
Applying this equality $\omega$ times gives:
$(xy)^\omega = (x(xy)^{\omega-1})^\omega (xy)^\omega y^\omega$.
Now, note that we can expand $(x(xy)^{\omega-1})^\omega$, commuting the $(xy)^{\omega-1}$ to regroup the $x$ into $x^\omega$ and regroup the $(xy)^{\omega-1}$ into $((xy)^{\omega-1})^\omega$ which is equal to $(xy)^\omega$, so that the first factor of the right-hand side is equal to $x^\omega (xy)^\omega$. Thus, by commuting, we obtain: $(xy)^\omega = x^\omega y^\omega (xy)^\omega$, the desired result.
Claim B.2. We have: $x^\omega y^\omega = x^\omega y^\omega (xy)^\omega$.
Proof. We have $x^\omega y^\omega = x^{\omega-1} x y^{\omega-1} y$, so by the equation of ZG we get:
$x^\omega y^\omega = x^{\omega-1} y^{\omega-1} xy$.
Now, we have $x^{\omega-1} y^{\omega-1} = x^{\omega-1} x^\omega y^{\omega-1} y^\omega$, and by the equation of ZG we have:
$x^{\omega-1} y^{\omega-1} = x^{\omega-1} y^{\omega-1} x^\omega y^\omega$.
Inserting the second equality in the first, we have:
$x^\omega y^\omega = x^{\omega-1} y^{\omega-1} x^\omega y^\omega xy$.
Now, applying this equality $\omega$ times gives $x^\omega y^\omega = (x^{\omega-1} y^{\omega-1})^\omega x^\omega y^\omega (xy)^\omega$. As the first factor of the right-hand side is equal to $x^\omega y^\omega$, we get $x^\omega y^\omega = x^\omega y^\omega (xy)^\omega$, the desired result.
Putting Claims B.1 and B.2 together immediately establishes Lemma 3.8.
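The identity $(xy)^\omega = x^\omega y^\omega$ obtained by combining the two claims can be checked by brute force on a small example. The Python sketch below uses a monoid chosen purely for illustration (it is not taken from the paper): the product of the noncommutative nilpotent monoid of words over $\{a, b\}$ of length at most 2 with a zero, and the cyclic group $\mathbb{Z}/2\mathbb{Z}$. All names (`mul`, `power`, `idempotent_power`) are hypothetical helpers.

```python
ZERO = None  # absorbing element of the nilpotent component
WORDS = ["", "a", "b", "aa", "ab", "ba", "bb", ZERO]
ELEMENTS = [(w, g) for w in WORDS for g in (0, 1)]
IDENTITY = ("", 0)


def mul(x, y):
    """Product: truncated concatenation on words, addition modulo 2 on the group part."""
    (u, g), (v, h) = x, y
    if u is ZERO or v is ZERO or len(u) + len(v) > 2:
        w = ZERO
    else:
        w = u + v
    return (w, (g + h) % 2)


def power(x, k):
    """Compute x^k by repeated multiplication."""
    result = IDENTITY
    for _ in range(k):
        result = mul(result, x)
    return result


def idempotent_power():
    """Smallest k >= 1 such that x^k is idempotent for every element x."""
    k = 1
    while any(mul(power(x, k), power(x, k)) != power(x, k) for x in ELEMENTS):
        k += 1
    return k


OMEGA = idempotent_power()

# Equation of ZG: x^(omega+1) commutes with every element.
zg_holds = all(
    mul(power(x, OMEGA + 1), y) == mul(y, power(x, OMEGA + 1))
    for x in ELEMENTS for y in ELEMENTS
)

# Identity of Lemma 3.8: (xy)^omega = x^omega y^omega.
lemma_holds = all(
    power(mul(x, y), OMEGA) == mul(power(x, OMEGA), power(y, OMEGA))
    for x in ELEMENTS for y in ELEMENTS
)
```

The example monoid is not commutative (the products of `("a", 0)` and `("b", 0)` in the two orders differ), so the check is not vacuous, although it of course does not replace the proof.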
B.4 Concluding the Proof of Theorem 3.4
To conclude the proof of Theorem 3.4, we will now show a kind of “normal form” for ZG-congruences, by arguing that any word can be rewritten to a word where frequent letters are moved to the end of the word, without breaking equivalence for the ZG-congruence. This relies on Lemma 3.8 and allows us to get to the notion of $n$-equivalence. Specifically:
Claim B.3. Let $\sim$ be a ZG-congruence on $\Sigma^*$. Let $n := (|M| + 1) \cdot \omega$ where $M$ is the monoid associated to $\sim$. Then, for all $w \in \Sigma^*$, for every letter $a \in \Sigma$ which is frequent in $w$ (i.e., $|w|_a > n$), writing $w'$ the restriction of $w$ to $\Sigma \setminus \{a\}$, and writing $w'' := w' a^{|w|_a}$, we have: $w \sim w''$.
Proof. Define $n$ as in the claim statement, and let $\mu : \Sigma^* \to M = \Sigma^*/{\sim}$ be the morphism associated to $\sim$. Remark that by definition, for any words $u, v$, we have $u \sim v$ iff $\mu(u) = \mu(v)$.
Let us take an arbitrary $w \in \Sigma^*$ and $a \in \Sigma$ such that $a$ is frequent in $w$. We can therefore write $w = w_1 a w_2 a \cdots w_m a w_{m+1}$ with $m = |w|_a > n > |M|$. Furthermore, letting $x_l = \mu(w_1 a \cdots w_l a)$ for each $1 \leq l \leq m$, as $m > |M|$ we know by the pigeonhole principle that there exist $1 \leq i < j \leq m$ such that $x_i = x_j$. But we have $x_j = x_i (z\mu(a))$ where $z = \mu(w_{i+1} a \cdots w_j)$. By applying the equation $\omega$ times, we have that $x_i = x_i (z\mu(a))^\omega$.
Now, by Lemma 3.8, we have $(z\mu(a))^\omega = z^\omega \mu(a)^\omega$. This is equal to $z^\omega \mu(a)^\omega \mu(a)^\omega$, and by now applying the lemma in reverse we conclude that $(z\mu(a))^\omega = (z\mu(a))^\omega \mu(a)^\omega$. Finally, we obtain $x_i = x_i (z\mu(a))^\omega = x_i (z\mu(a))^\omega \mu(a)^\omega = x_i \mu(a)^\omega$.
Now, the equation of ZG ensures that $\mu(a)^\omega$ is central, so we can commute it in $\mu(w)$ and absorb all occurrences of $\mu(a)$ in $\mu(w)$, then move it to the end, while keeping the same $\mu$-image. Formally, from $x_i = x_i \mu(a)^\omega$, we have
$\mu(w) = \mu(w_1)\mu(a) \cdots \mu(w_i)\mu(a)\,\mu(a)^\omega\,\mu(w_{i+1})\mu(a) \cdots \mu(w_m)\mu(a)\mu(w_{m+1})$,
and we commute $\mu(a)^\omega$ to merge it with all $\mu(a)$ and then commute the resulting $\mu(a)^{\omega + |w|_a}$. As $|w|_a \geq \omega$, this $\mu$-image is the same as the one that we obtain from $w' a^{|w|_a}$, with $w'$ as defined in the statement of the claim. This establishes that $w \sim w''$ and concludes the proof.
We can now conclude the proof of Theorem 3.4:
Proof of Theorem 3.4. Let $\sim$ be a ZG-congruence on $\Sigma^*$, $M$ its associated monoid, and fix $n := (|M| + 1) \cdot \omega$ as in the theorem statement. Let $u$ and $v$ be two $n$-congruent words of $\Sigma^*$; we need to prove that they are indeed $\sim$-equivalent. Let $\Sigma' = \{a_1, \ldots, a_r\}$ be the subset of letters in $\Sigma$ that are frequent in $u$ (hence in $v$, as they are $n$-congruent). By successive applications of Claim B.3 for every frequent letter in $\Sigma'$, starting with $u$, we know that $u \sim u_{\leq n}\, a_1^{|u|_{a_1}} \cdots a_r^{|u|_{a_r}}$. Likewise, we have $v \sim v_{\leq n}\, a_1^{|v|_{a_1}} \cdots a_r^{|v|_{a_r}}$. Now, we know that $\omega$ divides $n$. Thus, as for any $1 \leq i \leq r$, the values $|u|_{a_i}$ and $|v|_{a_i}$ are greater than $n$ and congruent modulo $n$ (by definition of the $n$-congruence), we have $a_i^{|u|_{a_i}} \sim a_i^{|v|_{a_i}}$. We also know by definition of the $n$-congruence that $u_{\leq n} = v_{\leq n}$ (the rare subwords coincide). All of this together establishes that $u \sim v$. Thus, the $n$-congruence indeed refines the $\sim$-congruence, concluding the proof.
C Proofs for Section 5 (Straubing’s Delay Theorem)
In this appendix, we give the self-contained proof of the easy direction of our main result, namely:
Claim C.1. We have $\mathbf{ZG} * \mathbf{D} \subseteq \mathbf{LZG}$.
Proof. If $L$ is in $\mathbf{ZG} * \mathbf{D}$, then by Theorem 5.4, there exists a ZG-congruence $\sim$ compatible with $S_E$. Let us now show that $L$ is in LZG by showing that, writing $S$ the syntactic semigroup of $L$, for any idempotent $e$, the local monoid $eSe$ is in ZG. Let $e$ be an idempotent. By definition of $S_E$, the local monoid $eSe$ is isomorphic to the subset of arrows of $S_E$ going from $e$ to $e$, with their composition law. Let us denote this subset by $B_e$. Define $\sim_e$ to be the specialization of the relation $\sim$ to $B_e^*$. Remark that $N = B_e^*/{\sim_e}$ is a submonoid of $B^*/{\sim}$, and is hence in ZG because $B^*/{\sim}$ is, and ZG is a variety. Remark that since all words in $B_e^*$ are valid paths in $S_E$, the local monoid $H := eSe$ defines a congruence $\sim_2$ over $B_e^*$ where two paths are equivalent if they evaluate to the same monoid element. We know that $\sim$, hence $\sim_e$, refines this congruence $\sim_2$. Hence, $H$ is a quotient of $N$. Thus, $eSe$ is a quotient of $B_e^*/{\sim_e}$, which is a submonoid of a monoid in ZG, concluding the proof.
D Proofs for Section 6 (Choosing the Congruence)
D.1 Proving Claim 6.4 and Claim 6.5
Claim 6.4. Fix $S$ and $S_E$ and $B$, let $w \in B^*$ be a path of $S_E$, and let $n > 1$ be a 1-distant rare-frequent threshold of $w$. Then the set of frequent arrows of $w$ for $n$ is a union of SCCs.
Proof. Consider $G$ the directed graph of Definition 6.3. Let us assume by way of contradiction that $G$ has a connected component which is not strongly connected. This means that there exists an edge $(u, v)$ of $G$ such that there is no path from $v$ to $u$ in $G$. Consider any frequent arrow $a$ in $S_E$ achieving the edge $(u, v)$ of $G$. As $a$ is frequent in $w$, we know that $a$ occurs $> n$ times in $w$, hence $w$ contains $\geq n$ paths from the final object $v$ of $a$ back to the initial object $u$ of $a$. As there is no path from $v$ to $u$ in $G$, each one of these paths must contain an arrow of $B$ which is rare in $w$.
Hence, the total number of rare arrow occurrences in $w$ is at least $n$. But the 1-distant rare-frequent threshold condition imposes that the total number of rare arrow occurrences in $w$ is $\leq n - 1$. We have thus reached a contradiction.
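The statement of Claim 6.4 can be illustrated on a toy path with the following Python sketch. It rests on assumptions that are not spelled out in this appendix: arrows are modelled as triples `(source, label, target)`, the graph of Definition 6.3 is taken to have the objects as vertices and one edge per frequent arrow, and "union of SCCs" is tested as "every edge $(u, v)$ admits a path from $v$ back to $u$". The helper names are hypothetical.

```python
from collections import Counter, defaultdict


def frequent_arrows(path, n):
    """Arrows occurring more than n times in the path (assumed definition)."""
    counts = Counter(path)
    return {arrow for arrow, c in counts.items() if c > n}


def is_union_of_sccs(arrows):
    """True iff every arrow (u, label, v) lies on a cycle using only `arrows`."""
    successors = defaultdict(set)
    for source, _label, target in arrows:
        successors[source].add(target)

    def reaches(start, goal):
        # Depth-first search from start, looking for goal.
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            for nxt in successors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return all(reaches(target, source) for source, _label, target in arrows)
```

For example, a path that goes three times around a two-arrow loop between objects 0 and 1 and then leaves through a single arrow to object 2 has, for the threshold $n = 2$, exactly the two loop arrows as frequent arrows, and they form a strongly connected graph.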
Claim 6.5. For any $d > 0$ and $k > 0$ and $m \geq 1$, there exists $n \geq m$ such that for any $d$-tuple $T$ of integers, there exists an $n'' \geq m$ such that, letting $n' := n''k$, we have that $n'$ divides $n$ and that $m \sum_{i \in F} T_i \leq n'$, where $F = \{i \mid T_i \leq n'\}$.
Proof. Let us take $n := k \times ((md)^{2^d+1}!)$, which ensures $n \geq m$. This ensures that $n$ has as divisors $k\,md$, $k(md)^2, \ldots, k(md)^{2^d+1}$: these are all our possible choices of values for $n'$. Now take any $d$-tuple $T$. For any possible choice of $n''$, the rare set is the subset of coordinates of $T$ having values $\leq kn''$. By the pigeonhole principle, we can choose two of the divisors above, say $(md)^i$ and $(md)^j$ with $i < j$, having the same rare set, i.e., letting $F_i = \{i' \mid T_{i'} \leq (md)^i k\}$ and $F_j = \{i' \mid T_{i'} \leq (md)^j k\}$, we have $F_i = F_j$.
Let us set $n'' := (md)^j$, and let $n' := n''k$. By construction, we have $n'' \geq m$, and $n'$ divides $n$. Now, consider the sum $m \times \sum_{i' \in F_j} T_{i'}$. As $F_j = F_i$, we know that for every $i' \in F_j$, we have $T_{i'} \leq (md)^i k$. Thus, the sum is at most $d$ times this value because $T$ is a $d$-tuple, and multiplying by $m$ we know that the sum $m \times \sum_{i' \in F_j} T_{i'}$ is at most $(md) \times (md)^i k$, hence it is $\leq (md)^{i+1} k$, so it is $\leq n''k$ for $n'' := (md)^j$ because $i < j$. This concludes the proof.
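The pigeonhole argument above is constructive, and the choice of $n''$ can be sketched in Python as follows. This is a minimal illustration under assumptions: the tuple $T$ is taken to consist of non-negative occurrence counts, the function name `choose_threshold` is hypothetical, and instead of picking an arbitrary pair $i < j$ the sketch looks for two consecutive exponents with the same rare set (which always exist, because the rare sets can only grow with the exponent and there are only $d$ coordinates).

```python
def choose_threshold(T, m, k):
    """Return (j, n2, n1) with n2 = (m*d)**j playing the role of n''
    and n1 = n2 * k playing the role of n'."""
    d = len(T)
    base = m * d

    def rare_set(exponent):
        # Coordinates whose value is at most (md)^exponent * k.
        bound = base ** exponent * k
        return frozenset(i for i, t in enumerate(T) if t <= bound)

    # Rare sets are nested and can strictly grow at most d times,
    # so two consecutive exponents with equal rare sets always exist.
    j = 2
    while rare_set(j - 1) != rare_set(j):
        j += 1
    n2 = base ** j
    return j, n2, n2 * k


def satisfies_bound(T, m, k):
    """Check the inequality of Claim 6.5 for the threshold chosen above."""
    _j, _n2, n1 = choose_threshold(T, m, k)
    rare_total = sum(t for t in T if t <= n1)
    return m * rare_total <= n1
```

For the tuple $T = (5, 100, 7)$ with $m = 2$ and $k = 3$, we have $md = 6$: the rare sets for the exponents 2 and 3 coincide, so the sketch returns $n'' = 6^3 = 216$ and $n' = 648$, and indeed $2 \times (5 + 100 + 7) = 224 \leq 648$.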
D.2 Concluding the proof of Lemma 6.2
We are now ready to show Lemma 6.2:
Proof of Lemma 6.2. Fix $S$, $S_E$, and $B$, and the desired $m$. Take $n$ to be as given by Claim 6.5 with $d := 2|B|$, with $k := \omega \times (|S| + 1)$, with $\omega$ being the idempotent power of the semigroup, and with $m$ being the desired $m$ plus 1. Now consider any pair $u_1, u_2$ of paths of $S_E$. Let $T$ be the $d$-tuple of the letter occurrences of $u_1$, followed by those of $u_2$. The statement of Claim 6.5 ensures that there exists $n' > 1$ such that $n'k$ divides $n$ and such that the total number of rare arrows in $u_1$ plus in $u_2$ is $\leq (n'k)/(|S| + 1)$. As $n' \geq |S| + 1$ and $k \geq |S| + 1$, we have $n'k \geq |S|(|S| + 1)$, so we have $(n'k)/(|S| + 1) \leq (n'k)/|S| - 1$. Hence, the total number of rare arrows in $u_1$ plus in $u_2$ is $\leq (n'k)/|S| - 1$. So the same is true of the rare arrows in $u_1$, and of the rare arrows in $u_2$. By contrast, the frequent arrows in $u_1$ occur $> n'k$ times, and the same is true of the frequent arrows in $u_2$. Hence, by Definition 6.1, $n'k$ is an $|S|$-distant rare-frequent threshold for $u_1$ and for $u_2$. Now, note that $n'k$ is a multiple of $k$, hence of $\omega \times (|S| + 1)$. Thus, we have achieved all desired conditions and showed the result.
E Proofs for Section 7 (The Loop Insertion and Prefix Substitution Lemmas)
E.1 Proof of Basic Combinatorial Results
Claim 7.1. Let $x$ and $y$ be two coterminal loops of $S_E$, let $k \in \mathbb{Z}$, and let $\omega$ be an idempotent power of $S$. We have: $x^{\omega+k} y \equiv y x^{\omega+k}$.
Proof. Recall that the equation of ZG implies: $x^{\omega+k} y = y x^{\omega+k}$. By definition of LZG, the local monoid $eSe$, for $e$ the initial and final object of $x$ and $y$, is in ZG. As the loops $x$ and $y$ evaluate in the category to some arrow having $e$ as starting and ending object, the previous equation then concludes the proof.
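Claim 7.2. For $x$, $x'$ two coterminal paths in $S_E$ and $y$, $y'$ coterminal paths in $S_E$ such that $xy$ and $x'y'$ are valid loops, we have: $(xy)^\omega (x'y')^\omega \equiv (xy)^\omega (x'y')^\omega (x'y)^\omega (xy')^\omega$.
Proof. Let us first show that:
$(xy)^\omega (x'y')^\omega \equiv x'y\,(xy)^{\omega-1} (x'y')^{\omega-1}\, xy'$ (1)
To show Equation 1, first rewrite $(xy)^\omega$ as $x(yx)^{\omega-1}y$ and likewise for $(x'y')^\omega$, to get:
$(xy)^\omega (x'y')^\omega \equiv x (yx)^{\omega-1} y\, x' (y'x')^{\omega-1} y'$
Then, we use Claim 7.1 to move $(yx)^{\omega-1}$, so the above is equal to:
$x\, y\, x'\, (yx)^{\omega-1} (y'x')^{\omega-1}\, y'$
We again rewrite $(yx)^{\omega-1}$ to $y(xy)^{\omega-2}x$, yielding:
$x\, y\, x'\, y\, (xy)^{\omega-2}\, x\, (y'x')^{\omega-1}\, y'$
We again use Claim 7.1 to move $(xy)^{\omega-2}$, merge it with the prefix $xy$, and move it back to its place, yielding:
$x'\, y\, (xy)^{\omega-1}\, x\, (y'x')^{\omega-1}\, y'$
We rewrite $(y'x')^{\omega-1}$ to $y'(x'y')^{\omega-2}x'$, yielding:
$x'\, y\, (xy)^{\omega-1}\, x\, y'\, (x'y')^{\omega-2}\, x'\, y'$
Again by Claim 7.1, we can merge $(x'y')^{\omega-2}$ with $x'y'$ into $(x'y')^{\omega-1}$ and move it to finally get:
$x'\, y\, (xy)^{\omega-1} (x'y')^{\omega-1}\, x\, y'$
This establishes Equation 1.
Now, as $(xy)^\omega (x'y')^\omega \equiv (xy)^{2\omega} (x'y')^{2\omega}$, we can now apply Equation 1 $\omega$ times to the right-hand side and get:
$(xy)^\omega (x'y')^\omega \equiv (x'y)^\omega (xy)^\omega (x'y')^\omega (xy')^\omega$ (2)
As these elements commute (thanks to Claim 7.1), we have shown the desired equality.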
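Claim 7.3. For $x$, $x'$ two coterminal paths in $S_E$ and $y$, $y'$ coterminal paths in $S_E$ such that $xy$ and $x'y'$ are valid loops, and for any path $t$ coterminal with $y$, the following equation holds: $x\,t\,(xy)^\omega (x'y')^\omega \equiv x'\,t\,(xy)^\omega\, xy'\, (x'y')^{\omega-1}$.
Proof. We apply Claim 7.2 to show the following equality on the left-hand side:
$x\,t\,(xy)^\omega (x'y')^\omega \equiv x\,t\,(xy)^\omega (x'y')^\omega (x'y)^\omega (xy')^\omega$
By commutation of $(x'y)^\omega$ thanks to Claim 7.1, the right-hand side is equal to:
$(x'y)^\omega\, x\,t\,(xy)^\omega (x'y')^\omega (xy')^\omega$
By expanding $(x'y)^\omega = x'(yx')^{\omega-1}y$, we get:
$x'(yx')^{\omega-1}y\, x\,t\,(xy)^\omega (x'y')^\omega (xy')^\omega$
By commutation of $(xy)^\omega$ and expanding it to $x(yx)^{\omega-1}y$, we get:
$x'(yx')^{\omega-1}y\, x(yx)^{\omega-1}y\, x\,t\,(x'y')^\omega (xy')^\omega$
Combining $(yx)^{\omega-1}$ with what precedes and follows, we get:
$x'(yx')^{\omega-1}(yx)^{\omega+1}\, t\,(x'y')^\omega (xy')^\omega$
By expanding $(x'y')^\omega = x'(y'x')^{\omega-1}y'$, and commuting $(yx')^{\omega-1}$ and $(yx)^{\omega+1}$, we get:
$x'\,t\,(xy')^\omega\, x'(y'x')^{\omega-1}(yx)^{\omega+1}(yx')^{\omega-1}y'$
Now, we have $x'(y'x')^{\omega-1} = (x'y')^{\omega-1}x'$, so we get:
$x'\,t\,(xy')^\omega (x'y')^{\omega-1}\, x'(yx)^{\omega+1}(yx')^{\omega-1}y'$
Commuting $(yx')^{\omega-1}$ and doing a similar transformation, we get:
$x'\,t\,(xy')^\omega (x'y')^{\omega-1}(x'y)^{\omega-1}\, x'(yx)^{\omega+1}y'$
Now, expanding $(yx)^{\omega+1}$, we get:
$x'\,t\,(xy')^\omega (x'y')^{\omega-1}(x'y)^{\omega-1}\, x'y\,(xy)^\omega\, xy'$
Commuting $(x'y)^{\omega-1}$ and merging it with $x'y$, we finally get:
$x'\,t\,(xy')^\omega (x'y')^{\omega-1}(x'y)^\omega (xy)^\omega\, xy'$
Note that $(x'y')^{\omega-1} \equiv (x'y')^\omega (x'y')^{\omega-1}$, so applying commutation we get:
$x'\,t\,(xy')^\omega (x'y')^\omega (x'y)^\omega (xy)^\omega (x'y')^{\omega-1}\, xy'$
Now, applying Claim 7.2 in reverse (using commutation again), we can obtain:
$x'\,t\,(xy)^\omega (x'y')^\omega (x'y')^{\omega-1}\, xy'$
And commuting $(x'y')^\omega$ and merging it yields:
$x'\,t\,(xy)^\omega (x'y')^{\omega-1}\, xy'$
A final commutation of $(x'y')^{\omega-1}$ yields the desired right-hand side, establishing the result.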
E.2 Proof of the Loop Insertion Lemma (Lemma 7.4)
Lemma 7.4 (Loop insertion lemma). Let $p$ be a path, and assume that $n$ is a rare-frequent threshold for $p$ which is $|S|$-distant and a multiple of $\omega$. Let $p = rt$ be a decomposition of $p$ (with $r$ or $t$ potentially empty), let $o$ be the object between $r$ and $t$ (i.e., the final object of $r$, or the initial object of $t$ if $r$ is empty), and let $p'$ be a loop over $o$ that only uses frequent arrows. Then $p \equiv r (p')^n t$ (note that they are also $n$-equivalent by construction).
We first rephrase the claim to the following auxiliary result:
Claim E.1. Let $p = rt$ be a path, assume that $n$ is a rare-frequent threshold for $p$ which is $|S|$-distant and a multiple of $\omega$, let $o$ be the object between $r$ and $t$, and let $X$ be the set of elements $x$ of the local monoid on $o$ that can be achieved as a loop $q_x^n$ on $o$ with frequent arrows (i.e., the loop evaluates to an arrow $(o, x, o)$), noting that this implies that $x$ is idempotent as $n$ is a multiple of $\omega$. Then letting $q := \prod_{x \in X} q_x^n$, we have that $p \equiv rqt$. (Note that they are also $n$-equivalent by construction.)
We explain why Claim E.1 implies the desired claim. Indeed, when taking $p = rt$ and taking $r(p')^n t$, choosing for $o$ the object between $r$ and $t$, the sets $X$ defined in the auxiliary result will be the same for both paths (because $X$ only depends on $o$), so the rephrased claim implies that there is a loop $q$ such that $rt \equiv rqt$ and $r(p')^n t \equiv rq(p')^n t$. Now, as $(p')^n$ must correspond to an arrow of the form $(o, x, o)$ for $x \in X$, it must be the same idempotent as one of the idempotents achieved by one of the loops in the definition of $q$, and as the local monoid is in ZG these idempotents commute and $q \equiv q(p')^n$. Hence, we have $rqt \equiv rq(p')^n t$. We know that $rqt \equiv rt$, and $rq(p')^n t \equiv r(p')^n t$. Thus we obtain $rt \equiv r(p')^n t$. Thus, Lemma 7.4 is proved once we have shown Claim E.1.
Hence, all that remains is to show Claim E.1. We will do so by establishing a number of
claims, all of which will have their proofs deferred to the appendix.
We first prove a preliminary claim which uses the pigeonhole principle to insert a loop
containing an arbitrary arrow xin a word where xoccurs more than |S|times:
Claim E.2. Let $p = rt$ be a path, let $o$ be the object between $r$ and $t$, let $x$ be an arrow starting at $o$, and assume that $x$ occurs $> |S|$ times in $p$. Then we have $p \equiv r(xu)^\omega t$ for some return path $u$ using only the arrows of $p$.
Proof. As $x$ occurs $k > |S|$ times in $p$, this provides a decomposition of $p$ in the shape:
$$p = p_1 x p_2 x \cdots p_k x s.$$
By the pigeonhole principle, there exist $i < j$ such that $p_1 x \cdots p_i x \equiv p_1 x \cdots p_j x$. Hence, iterating, we obtain:
$$p_1 x \cdots p_j x \equiv p_1 x \cdots p_i x (p_{i+1} x \cdots p_j x)^\omega.$$
Moving the $\omega$, we get:
$$p_1 x \cdots p_j x \equiv p_1 x \cdots p_{i-1} x p_i (x p_{i+1} x \cdots p_j)^\omega x.$$
This proves that $p$ and $h(xu)^\omega g$ achieve the same category element by taking $u := p_{i+1} x \cdots p_j$, $h := p_1 x \cdots p_{i-1} x p_i$ and $g := x p_{j+1} x \cdots p_k x s$. Remark that the terminal object of $h$ is the same as the terminal object of $r$. Furthermore, either $h$ is a prefix of $r$ or the converse. Assume first that $h = rw$ for some path $w$. Then, $w$ and $(xu)^\omega$ belong to the local monoid of the terminal object of $r$, which is in ZG. Since idempotents commute with all elements, we have $w(xu)^\omega \equiv (xu)^\omega w$, so that $rw(xu)^\omega g \equiv r(xu)^\omega wg = r(xu)^\omega t$ since $wg = t$. The other case is symmetrical. This concludes the proof of Claim E.2.
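The pigeonhole step in the proof above can be illustrated on a toy finite monoid. The following is a minimal sketch, not the construction of the paper: it assumes a hypothetical monoid (the integers modulo 6 under multiplication) instead of the category of idempotents, and the names `repeated_prefix` and `product` are introduced here for illustration only. It looks for two nonempty prefixes with the same value, and checks that the block between them can then be iterated without changing the value of the prefix.

```python
def repeated_prefix(elements, mul, identity):
    """Return (i, j) with i < j such that the product of the first i elements
    equals the product of the first j elements, or None if no such pair exists."""
    seen = {}            # prefix product -> number of factors used
    acc = identity
    for count, e in enumerate(elements, start=1):
        acc = mul(acc, e)
        if acc in seen:
            return seen[acc], count
        seen[acc] = count
    return None


def product(elements, mul, identity):
    """Multiply a sequence of monoid elements from left to right."""
    acc = identity
    for e in elements:
        acc = mul(acc, e)
    return acc


# Hypothetical example monoid: integers modulo 6 under multiplication.
def mul6(a, b):
    return (a * b) % 6


# 7 factors, more than the 6 elements of the monoid, so a repeat must exist.
word = [2, 2, 2, 5, 3, 1, 4]
i, j = repeated_prefix(word, mul6, 1)
prefix = product(word[:i], mul6, 1)    # value of the shorter prefix
block = product(word[i:j], mul6, 1)    # value of the factor between the two prefixes
```

Since the prefix of length `i` and the prefix of length `j` have the same value, multiplying `prefix` by any positive power of `block` leaves it unchanged, which is the iteration used to introduce the power $\omega$.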
Let us extend this to a claim using the notion of distant rare-frequent threshold (this is
where we use the fact that the threshold is distant):
Claim E.3. Let $p = rt$ be a path with an $|S|$-distant rare-frequent threshold $n$, let $o$ be the object between $r$ and $t$, and let $x$ be any frequent arrow starting at $o$. Then we have $p \equiv r(xu)^\omega t$ for some return path $u$ using only frequent arrows of $p$.
Proof. As $x$ is a frequent arrow and $n$ is $|S|$-distant, it occurs $> (\rho + 1)|S|$ times in $p$, where $\rho$ is the total number of rare arrows of $p$. Hence, writing $p = p_1 a_1 \cdots p_\rho a_\rho p_{\rho+1}$ where the $a_i$ are the rare arrows and the $p_i$ are paths of frequent arrows, there must be a $p_i$ containing $> |S|$ occurrences of $x$. Write $p_i = r't'$ where the object between $r'$ and $t'$ is the initial object of $x$, which must exist as $x$ occurs in $p_i$. Applying Claim E.2 to that decomposition, we have $p_i \equiv r'(xu)^\omega t'$ for some return path $u$ using only arrows of $p_i$, hence only frequent arrows.
Now, as idempotents commute with all elements, similarly to the end of the proof of Claim E.2, we deduce that $p \equiv r(xu)^\omega t$.
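The counting argument at the start of this proof is a second pigeonhole: if $x$ occurs more than $(\rho + 1)|S|$ times and the $\rho$ rare arrows cut the path into $\rho + 1$ blocks, some block contains more than $|S|$ occurrences. A minimal sketch, assuming paths are given as lists of arrow names (the function name `block_with_many` and the sample data are hypothetical):

```python
def block_with_many(path, rare, x, bound):
    """Split `path` at every occurrence of a rare arrow and return the first
    block of frequent arrows containing more than `bound` occurrences of `x`,
    or None if no block qualifies."""
    blocks, current = [], []
    for arrow in path:
        if arrow in rare:
            blocks.append(current)   # a rare arrow closes the current block
            current = []
        else:
            current.append(arrow)
    blocks.append(current)           # last block, after the final rare arrow
    for block in blocks:
        if block.count(x) > bound:
            return block
    return None


# Two rare arrows (rho = 2) and bound 2: 7 > (2 + 1) * 2 occurrences of "x".
path = list("xxaxxxbxx")
found = block_with_many(path, rare={"a", "b"}, x="x", bound=2)
```

Here the three blocks contain 2, 3 and 2 occurrences of `x`, so the middle block is returned.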
We then prove a generalization of the previous claim, going from a single frequent arrow
to an arbitrary path of frequent arrows:
Claim E.4. Let $p = rt$ be a path with a rare-frequent threshold $n$ which is $|S|$-distant and a multiple of $\omega$. Let $o$ be the object between $r$ and $t$, and let $h$ be a path of frequent arrows of $p$ starting at $o$. Then we have $p \equiv r(hg)^\omega t$ for some return path $g$ using only frequent arrows of $p$.
Proof. We show the claim by induction on the length of $h$. The base case of the induction, with $h$ of length 0, is trivial with $g$ also having length 0.
For the inductive claim, write $h = h'a$. By induction hypothesis, there exists a $g'$ using only frequent arrows of $p$ such that:
$$p \equiv r(h'g')^\omega t.$$
Furthermore, by applying Claim E.3 to the decomposition $r' = rh'$ and $t' = g'(h'g')^{\omega-1}t$ and with the frequent arrow $a$, we get a return path $u$ using only frequent arrows of $p$ such that:
$$r(h'g')^\omega t \equiv rh'(au)^\omega g'(h'g')^{\omega-1}t.$$
So, iterating the $\omega$ power, and combining with the preceding equation, we get:
$$p \equiv rh'((au)^\omega)^\omega g'(h'g')^{\omega-1}t.$$
Now, by applying $\omega - 1$ times Claim 7.1 to each $(au)^\omega$ except the first and to each loop going from after this $(au)^\omega$ to the position between an occurrence of $h'$ and $g'$, we get that:
$$p \equiv rh'(au)^\omega g'(h'(au)^\omega g')^{\omega-1}t.$$
Note the right-hand side is equal to $r(h'(au)^\omega g')^\omega t$. So we have shown:
$$p \equiv r(h'(au)^\omega g')^\omega t.$$
So this establishes the inductive claim by taking $g := u(au)^{\omega-1}g'$.
We can extend this to a claim about inserting arbitrary loops:
Claim E.5. Let $p = rt$ be a path with an $|S|$-distant rare-frequent threshold $n$, let $o$ be the object between $r$ and $t$, and let $q$ and $q'$ be loops on $o$ using only frequent arrows. We have that $rqt \equiv rq(q')^n(q'')^n t$ for some loop $q''$ on $o$ using only frequent arrows (note that the two are also $n$-equivalent).
Proof. We use Claim E.4 with $h := q'$. This gives us the existence of a return path $g$ using only frequent arrows, which is then also a loop on $o$, such that $rqt \equiv rq(q'g)^\omega t$. Now, applying Lemma 3.8 to the local monoid on object $o$, we know that this evaluates to the same category element as $rq(q')^\omega g^\omega t$. Hence, as $n$ is a multiple of $\omega$, it evaluates to the same category element as $rq(q')^n g^n t$, which now preserves $n$-equivalence and concludes the proof of Claim E.5.
The only step left is to argue that Claim E.5 implies our rephrasing of the result that we wish to prove, Claim E.1. To do this, let $o$ be the terminal object of $r$ and let $X$ be the set of idempotents definable from frequent arrows. For each idempotent $x \in X$, we can choose some loop $q_x$ on frequent arrows that achieves it. Now, by successive applications of Claim E.5 to each $x \in X$, and using the fact that the $(q')^n$ and $(q'')^n$ commute by Claim 7.1 (remember that $n$ is a multiple of $\omega$), we know that $p = rt$ is $n$-equivalent to, and evaluates to the same category element as, the path $r \prod_{x \in X} q_x^n (q'_x)^n t$, for some $q'_x$ for each $q_x$ (corresponding to the $q''$ in that application of Claim E.5), which also only consists of frequent arrows. Now, since $(q'_x)^n$ is a loop on $o$ consisting of frequent arrows, it also achieves an idempotent in the local monoid on $o$. As this monoid is in ZG, idempotents commute, and so these idempotents can all be combined with some idempotent $q_x^n$ and absorbed by them. Thus, we get that $rt$ is $n$-equivalent to, and evaluates to the same category element as, the path $r \prod_{x \in X} q_x^n t$. This concludes the proof of Claim E.1, and thus establishes our desired result, Lemma 7.4.
E.3 Proof of the Prefix Substitution Lemma (Lemma 7.5)
Lemma 7.5. Let $p = xry$ be a path, and assume that $n$ is a rare-frequent threshold for $p$ which is $|S|$-distant and a multiple of $\omega$. Let $x'$ be a path coterminal with $x$. Assume that every arrow in $x$ and in $x'$ is frequent. Assume that some object $o$ in the SCC of frequent arrows of the initial object of $r$ occurs again in $y$, say as the intermediate object of $y = y_1 y_2$. Then there exists $y' = y_1 y'' y_2$ for some loop $y''$ consisting only of frequent arrows such that $p \equiv x'ry'$ and such that $p \sim_n x'ry'$.
Proof. As $n$ is an $|S|$-distant rare-frequent threshold, we know by Claim 6.4 that the frequent arrows occurring in $p$ are a union of SCCs. Thus, there is a return path $s$ for $x$ (i.e., $xs$ is a loop, hence $x's$ also is) where $s$ only consists of frequent arrows.
By our hypothesis on the initial object of
r
, we can decompose
y
=
y1y2
such that the
terminal object of
y1
and initial object of
y2
is the initial object of
r
. Now, take
p
to be
the loop
s
(
xs
)
ω
(
xs
)
ωx
: note that all arrows of
p
are frequent. Hence, by Lemma 7.4,
xry
evaluates to the same category element as, and is n-equivalent to,
xry1(p)ωy2=xry1(s(xs)ω(xs)ωx)ny2
By unfolding the power n, we get the following:
xry1(p)ωy2=xry1s(xs)ω(xs)ωx(s(xs)ω(xs)ωx)n1y2=xry1s(xs)ω(xs)ωxz
where we write
z:=
(
s
(
xs
)
ω
(
xs
)
ωx
)
n1y2
for convenience. We can therefore apply Claim 7.3
to obtain that:
(x(ry1s))(xs)ω(xs)ωxz (x(ry1s))(xs)ω(xs)(xs)ω1xz.
What is more, these two paths are clearly
n
-equivalent, as they only differ in terms of
frequent arrows (all arrows in
x
and
x
being frequent) and the number of these arrows is
unchanged by the transformation. This path is of the form given in the statement, taking
y:=y1s
(
xs
)
ω
(
xs
)(
xs
)
ω1xz
from which we can extract the right
y′′
. This concludes the
proof.
F Proofs for Section 8 (Concluding the Proof of LZG ⊆ ZG ∗ D: Claim 5.6)
F.1 Proof of the Base Case: Claim 8.1
Claim 8.1. Let $p_1$ and $p_2$ be two coterminal paths that are $n$-equivalent and which contain no rare arrows. Then $p_1 \equiv p_2$.
We first state and prove the lemma on graph decompositions:
Lemma 8.2. Let $G$ be a strongly connected nonempty directed multigraph. We have:
(1) $G$ is a simple cycle; or
(2) $G$ contains a simple cycle $u_1 \to \cdots \to u_n \to u_1$ with $n \geq 1$, where all vertices $u_1, \ldots, u_n$ are pairwise distinct, such that all intermediate vertices $u_2, \ldots, u_{n-1}$ only occur in the edges of the cycle, and such that the removal of the cycle leaves the graph strongly connected (note that the case $n = 1$ corresponds to the removal of a self-loop); or
(3) $G$ contains a simple path $u_1 \to \cdots \to u_n$ with $n \geq 2$ where all vertices are pairwise distinct, such that all intermediate vertices $u_2, \ldots, u_{n-1}$ only occur in the edges of the path, and such that the removal of the path leaves the graph strongly connected (note that the case $n = 2$ corresponds to the removal of a single edge).
We repeat here that the result is standard. The proof given below is only for the reader’s
convenience, and follows [7].
Proof. This result is shown using the notion of an ear decomposition of a directed multigraph. Specifically, following Theorem 7.2.2 of [7], for any nonempty strongly connected multigraph $G$, we can build a copy of it (called $G'$) by the following sequence of steps, with the invariant that $G'$ remains strongly connected:
First, take some arbitrary simple cycle in $G$ and copy it to $G'$.
Second, while there are some vertices of $G$ that have not been copied to $G'$, pick some vertex $v$ of $G$ that was not copied, such that there is an edge $(v', v)$ in $G$ with $v'$ a vertex that was copied. Now take some shortest path (hence a simple path) $v \to \cdots \to v''$ from $v$ to the subset of the vertices of $G$ that had been copied to $G'$. This path ends at a vertex $v''$ which may or may not be equal to $v'$. If $v'' \neq v'$, then we have a simple path $v' \to v \to \cdots \to v''$, which we copy to $G'$; otherwise we have a simple cycle, which we copy to $G'$. Note that, in both cases, all intermediate vertices in the simple path or simple cycle that we copy only occur in the edges of the path or cycle (as they had not been previously copied to $G'$). Further, $G'$ clearly remains strongly connected after this addition.
Third, once all vertices of $G$ have been copied to $G'$, take each edge of $G$ that has not been copied to $G'$ (including all self-loops), and copy it to $G'$ (as a simple path of length 1). These additions preserve the strong connectedness of $G'$.
At the end of this process, $G'$ is a copy of $G$.
Now, to show the result, take the graph $G$, consider how we can construct it according to the above process, and distinguish three cases:
If the process stopped at the end of the first step, then $G$ is a simple cycle (case 1 of the statement).
If the process stopped after performing a copy in the second step, then the last simple path or simple cycle that we added satisfies the conditions, and its removal from $G$ gives a graph which is still strongly connected (case 2 or case 3 of the statement).
If the process stopped after performing a copy in the third step, then the last edge that we added is a simple path of length 1, and its removal from $G$ gives a graph which is still strongly connected (case 3 of the statement).
This concludes the proof.
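The three-step construction in this proof can be sketched in code. The following is an illustrative implementation under stated assumptions, not the authors' formalization: the multigraph is assumed to be strongly connected with at least one edge and is given as a list of `(source, target)` pairs, ears are returned as lists of edge indices, and the function names are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, targets):
    """Edge indices of a shortest path from `start` to the nearest vertex of `targets` (BFS)."""
    if start in targets:
        return []
    parent = {start: None}               # vertex -> (previous vertex, edge index)
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for idx, (a, b) in enumerate(edges):
            if a == u and b not in parent:
                parent[b] = (u, idx)
                if b in targets:         # first target reached: rebuild the path
                    path, node = [], b
                    while parent[node] is not None:
                        node, e = parent[node]
                        path.append(e)
                    return path[::-1]
                queue.append(b)
    raise ValueError("no path found: the graph is not strongly connected")


def ear_decomposition(edges):
    """Return ears as lists of edge indices: first a simple cycle, then simple
    paths or cycles attached to what was already copied, then single edges."""
    u, v = edges[0]
    first = [0] + shortest_path(edges, v, {u})     # step 1: a simple cycle through edge 0
    ears, used = [first], set(first)
    copied = {u} | {edges[e][1] for e in first}
    vertices = {x for edge in edges for x in edge}
    while copied != vertices:                      # step 2: reach the uncopied vertices
        idx = next(i for i, (a, b) in enumerate(edges)
                   if a in copied and b not in copied)
        ear = [idx] + shortest_path(edges, edges[idx][1], copied)
        ears.append(ear)
        used.update(ear)
        copied.update(edges[e][0] for e in ear)    # new vertices are sources of ear edges
    ears.extend([i] for i in range(len(edges)) if i not in used)   # step 3: leftover edges
    return ears


def is_strongly_connected(edges):
    """Check strong connectivity of the graph spanned by a nonempty edge list."""
    vertices = {x for edge in edges for x in edge}

    def reach(start, forward):
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for a, b in edges:
                if not forward:
                    a, b = b, a
                if a == node and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    root = next(iter(vertices))
    return reach(root, True) == vertices and reach(root, False) == vertices
```

The last ear of the returned list plays the role of the cycle or path removed in cases 2 and 3 of the lemma: removing it leaves the union of the earlier ears, which is strongly connected by the invariant of the construction.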
As explained in the body, we do an induction on the number of edges of the strongly connected
multigraph G, whose base case is trivial. Here are the details of the three cases to consider
in the induction step, following Lemma 8.2.
Case 1: $G$ is a simple cycle. If $G$ is a simple cycle, then distinguish the initial object of $q_1$ (hence, of $q_2$) as $o_1$, and let $\alpha$ be the category element corresponding to the cycle from $o_1$ to itself, and $p$ the category element corresponding to the path from $o_1$ to the common terminal object of $q_1$ and $q_2$. We have: $q_1 = \alpha^{n_1} p$ and $q_2 = \alpha^{n_2} p$ with $n_1$ and $n_2$ being $\geq n$ and having the same remainder modulo $n$. By definition of $\alpha$, there exists an idempotent $e$ and some element $m \in eSe$ such that $\alpha = (e, m, e)$. Hence, $q_1 = \alpha^{n_1} p \equiv (e, m^{n_1}, e) p$ (resp. $q_2 = \alpha^{n_2} p \equiv (e, m^{n_2}, e) p$). Since $n$ is a multiple of the idempotent power of $S$, we have $m^{n_1} = m^{\omega + r}$ and $m^{n_2} = m^{\omega + r}$ where $r$ is the remainder modulo $n$. Thus, we have $q_1 \equiv q_2$, concluding this case.
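The arithmetic behind this case can be checked on a small example. The sketch below assumes a hypothetical stand-in for the local monoid, namely the integers modulo 12 under multiplication (a commutative monoid, hence one where all elements are central); the helper names are illustrative. It computes the idempotent power and tests whether two exponents give the same power for every element.

```python
def idempotent_power(modulus):
    """Smallest w >= 1 such that x**w is idempotent for every x modulo `modulus`."""
    w = 1
    while not all(pow(x, 2 * w, modulus) == pow(x, w, modulus)
                  for x in range(modulus)):
        w += 1
    return w


def powers_agree(modulus, n1, n2):
    """True if m**n1 == m**n2 modulo `modulus` for every element m."""
    return all(pow(m, n1, modulus) == pow(m, n2, modulus)
               for m in range(modulus))


omega = idempotent_power(12)
n = 2 * omega   # any positive multiple of the idempotent power

# Exponents that are at least n and share a remainder modulo n agree.
agree = all(powers_agree(12, n1, n2)
            for n1 in range(n, 5 * n)
            for n2 in range(n, 5 * n)
            if n1 % n == n2 % n)
```

The lower bound on the exponents matters: `powers_agree(12, 1, 1 + n)` is false in this example, because the exponent 1 is below the threshold.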
Case 2: $G$ has a simple cycle. Recall that, in this case, we know that $G$ has a simple cycle whose intermediate objects have no other incident edges and such that the removal of the simple cycle leaves the graph strongly connected. Let $\alpha$ be the simple cycle, starting from the only object $e$ of the cycle having other incident edges. We can then decompose $q_1$ and $q_2$ to isolate the occurrences of the simple cycle (which must be taken in its entirety), i.e.:
$$q_1 = x_1 \alpha x_2 \alpha x_3 \cdots x_{t-1} \alpha x_t$$
$$q_2 = y_1 \alpha y_2 \alpha y_3 \cdots y_{t'-1} \alpha y_{t'}$$
We ensure that the edges of $\alpha$ do not occur elsewhere than in the $\alpha$ factors, except possibly in $x_1$, $y_1$ and in $x_t$, $y_{t'}$ if the paths $q_1$ and/or $q_2$ start and/or end in the simple cycle. However, in that case, we know that the prefixes of $q_1$ and $q_2$ containing this incomplete subset of the cycle must be equal (same sequence of arrows), and likewise for their suffixes. For this reason, it suffices to show the claim that $q_1$ and $q_2$ evaluate to the same category element under the assumption that both their initial and terminal objects are not intermediate vertices of the cycle: the claim then extends to the case when they can be (by adding the common prefixes and suffixes to the two paths that satisfy the condition, using the fact that a congruence is compatible with concatenation). Thus, in the rest of the proof for this case, we assume that the edges of $\alpha$ only occur in the $\alpha$ factors.
We will now argue that, to show that $q_1 \equiv q_2$, it suffices to show the same of two $n$-equivalent coterminal paths from which all occurrences of the edges of the cycle have been removed and where all other edges still occur sufficiently many times. As this deals with paths where the underlying multigraph contains fewer edges, the induction hypothesis will conclude.
To do this, by Lemma 7.5, as $x_1$ and $y_1$ are coterminal and consist only of frequent arrows, and as the initial object of $\alpha$ occurs again in both paths, the path $q_1$ is $n$-equivalent to, and evaluates to the same category element as, some path:
$$q'_1 = y_1 \alpha x'_2 \alpha x'_3 \cdots x'_{t''-1} \alpha x'_{t''}$$
For this reason, up to replacing $q_1$ by $q'_1$, we can assume that $x_1 = y_1$.
Now, furthermore, $x_2, \ldots, x_{t-1}$ (resp. $y_2, \ldots, y_{t'-1}$) and $\alpha$ are coterminal cycles over the object $e$ (which by definition corresponds to an idempotent of $S$). Hence, $\alpha x_2 \alpha x_3 \cdots x_{t-1} \alpha = (e, m m_2 m m_3 \cdots m_{t-1} m, e)$ where $\alpha = (e, m, e)$, $x_i = (e, m_i, e)$ for $2 \leq i \leq t-1$ and where $m$ and all $m_i$'s are in $eSe$, which is by hypothesis a monoid in ZG. Now, by Corollary 3.7, we know that $m m_2 m m_3 \cdots m_{t-1} m = m^{t-1} m_2 m_3 \cdots m_{t-1}$, because, as the arrows of $\alpha$ are frequent, the number of occurrences of $\alpha$ is $\geq n$, and we have taken $n$ to be a multiple of $(|S| + 1) \times \omega$ (where $\omega$ is the idempotent power of $S$), which is at least $(|eSe| + 1) \times k$, where $k$ is the idempotent power of $eSe$ (which divides $\omega$, hence is $\leq \omega$). By applying the same reasoning to $q_2$, it suffices to show that the two following paths evaluate to the same category element, where $x_1 = y_1$:
$$x_1 \alpha^{t-1} x_2 x_3 \cdots x_{t-1} x_t$$
$$y_1 \alpha^{t'-1} y_2 y_3 \cdots y_{t'-1} y_{t'}$$
Now, because these two paths are $n$-equivalent, we know that $t - 1$ and $t' - 1$ have the same remainder modulo $n$. By the same reasoning as in case 1, they evaluate to the same category element as $\alpha^r$, where $r$ is the remainder. So it suffices to show that the two following paths evaluate to the same category element, with $x_1 = y_1$:
$$x_1 \alpha^r x_2 x_3 \cdots x_{t-1} x_t$$
$$y_1 \alpha^r y_2 y_3 \cdots y_{t'-1} y_{t'}$$
To ensure that the edges not in $\alpha$ still occur sufficiently many times, let $\beta$ be any loop on $e$ that visits all edges of $G$ except the ones in $\alpha$: this is doable because $G$ is still strongly connected after the removal of $\alpha$. Up to exponentiating $\beta$, we can assume that $\beta$ traverses each edge sufficiently many times to satisfy the lower bound imposed by the requirement of $n$ being an $|S|$-distant rare-frequent threshold. By Lemma 7.4, it suffices to show that the following paths evaluate to the same category element:
$$q'_1 = x_1 \alpha^r \beta^n x_2 x_3 \cdots x_{t-1} x_t$$
$$q'_2 = y_1 \alpha^r \beta^n y_2 y_3 \cdots y_{t'-1} y_{t'}$$
Now, observe that both paths start by $x_1 \alpha^r = y_1 \alpha^r$, and the arrows of $\alpha$ do not occur in the rest of the paths. Now consider the paths $\beta^n x_2 x_3 \cdots x_t$ and $\beta^n y_2 y_3 \cdots y_{t'}$. They are paths that are coterminal, $n$-equivalent because $q_1$ and $q_2$ were, where the frequent letters that are used are a strict subset of the ones used in $p_1$ and $p_2$, and where all other frequent letters occur sufficiently many times for $n$ to still be an $|S|$-distant rare-frequent threshold. Thus, by induction hypothesis, we know that these two paths evaluate to the same category element, so that $q'_1$ and $q'_2$ also do. This concludes case 2.
Case 3: $G$ has a simple path. Recall that, in this case, we know that $G$ has a simple path whose intermediate objects have no other incident edges and such that the removal of the simple path leaves the graph strongly connected. We denote by $x \neq y$ the starting and ending objects of the path. Let $\pi$ be the category element corresponding to the path. Since the removal of the path does not affect strong connectedness of the graph, there is a simple path from $x$ to $y$ sharing no edges with $\pi$; let $\kappa$ be the category element to which this “return path” evaluates. Furthermore, there is a simple path $\rho$ from $y$ to $x$ sharing no edges with $\pi$ (this is because all intermediate objects of $\pi$ only occur in the edges of $\pi$).
Like in the previous case, up to removing common prefixes and suffixes, it suffices to consider the case where $q_1$ and $q_2$ do not start or end in the intermediate vertices of $\pi$. For that reason, we can now isolate all occurrences of the edges of $\pi$, and write:
$$q_1 = x_1 \pi x_2 \pi x_3 \cdots x_{t-1} \pi x_t$$
$$q_2 = y_1 \pi y_2 \pi y_3 \cdots y_{t'-1} \pi y_{t'}$$
Like in the previous case, by Lemma 7.5, we can assume that $x_1 = y_1$.
By Lemma 7.4, we spawn a loop $(\rho\kappa)^n$ after every occurrence of $\pi$ without changing the category element and still respecting $n$-equivalence. By expanding $(\rho\kappa)^n = \rho\kappa(\rho\kappa)^{n-1}$, it suffices to show that the following paths evaluate to the same category element, with $x_1 = y_1$:
$$q'_1 = x_1 \pi\rho\kappa(\rho\kappa)^{n-1} \cdots x_{t-1} \pi\rho\kappa(\rho\kappa)^{n-1} x_t$$
$$q'_2 = y_1 \pi\rho\kappa(\rho\kappa)^{n-1} \cdots y_{t'-1} \pi\rho\kappa(\rho\kappa)^{n-1} y_{t'}$$
We can now regroup the occurrences of $\pi\rho$, which are loops such that some edges (namely, the edges of $\pi$) only occur in these factors. This means that we can conclude as in case 2 for the cycle $\pi\rho$, as this cycle contains some edges that only occur there; we can choose $\beta$ at the end of the proof to be a loop on $x$ visiting all edges of $G$ except those of $\pi$, which is again possible because $G$ is still strongly connected even after the removal of $\pi$.
This establishes case 3 and concludes the induction step of the proof.
We have thus proved by induction that $q_1$ and $q_2$ evaluate to the same category element, in the base case of the outer induction where all edges of $q_1$ and $q_2$ are frequent.
F.2 Proof of the Induction Case
We now conclude the proof of the induction step for the outer induction. Recall from the body that we have taken two $n$-equivalent paths $p_1 = q_1 a s_1$ and $p_2 = q_2 a s_2$ with $a$ being the first of the $r + 1$ rare letters of the paths.
Remember that, as $n$ is an $|S|$-distant rare-frequent threshold for $p_1$ and $p_2$, we know that the frequent arrows of $p_1$ form a union of SCCs (Claim 6.4); note that, thanks to $n$-equivalence, the same is true of $p_2$ with the same SCCs. Let $e$ be the source object of $a$, and consider the SCC $C$ of frequent arrows that contains $e$. There are two cases, depending on whether some object of $C$ occurs again in $s_1$ or not. Note that some object of $C$ occurs again in $s_1$ iff the same is true of $s_2$, because which frequent arrow components occur again is entirely determined by the terminal objects of the rare arrows of $s_1$ and $s_2$, which are identical thanks to $n$-equivalence.
Case 1: $C$ occurs again after $a$. In this case, we apply Lemma 7.5, because $q_1$ and $q_2$ only consist of frequent arrows and some object of the SCC of the initial object of $a$ occurs again in $s_1$. The claim tells us that there is a path:
$$p'_1 = q_2 a s'_1$$
which evaluates to the same category element as $p_1$ and is $n$-equivalent to it. Hence, by compositionality, it suffices to show that $s'_1$ and $s_2$ evaluate to the same category element.
To apply the induction hypothesis, we simply need to ensure that $n$ is still an $|S|$-distant rare-frequent threshold for $s'_1$ and $s_2$. To do this, we need to ensure that the arrows that are frequent in $p_1$ and $p_2$ are still frequent there, and still satisfy the $|S|$-distant condition. Fortunately, we can simply ensure this by inserting a loop using Lemma 7.4. Formally, write $s'_1 = r_1 t_1$ where the intermediate object is the object of the SCC $C$ that occurred in $s_1$ (the existence of such a decomposition is a consequence of the statement of Lemma 7.5), and write $s_2 = r_2 t_2$ in the same way (which we already discussed must be possible with $s_2$). Let $p'$ be an arbitrary loop of frequent arrows where all arrows of $C$ occur: this is possible because $C$ is strongly connected. We know by Lemma 7.4 that $s'_1 = r_1 t_1$ and $r_1 (p')^n t_1$ are both $n$-equivalent and evaluate to the same category element: this is also true with $w_1 := r_1 (p')^{kn} t_1$ for a sufficiently large $k$ such that every frequent arrow of $C$ occurs as many times as it did in $p_1$. Likewise, $s_2 = r_2 t_2$ and $w_2 := r_2 (p')^{kn} t_2$ are both $n$-equivalent and evaluate to the same category element, and we can take a sufficiently large $k$. So it suffices to consider $w_1$ and $w_2$.
Let us apply the induction hypothesis to them. They are two coterminal paths, and they are $n$-equivalent because $w_1 \sim_n p'_1 \sim_n p_1$ and $w_2 \sim_n p_2$ and by hypothesis $p_1 \sim_n p_2$. What is more, the arrows that were rare in $p_1$ and $p_2$ are still rare for them, and they have $r$ occurrences in total: this was true by construction of $s_1$ and $s_2$ and is true of $s'_1$ because $p_1 = q_1 a s_1 \sim_n q_2 a s'_1$ and all arrows of $q_2$ are frequent so the rare subwords of $s_1$ and $s'_1$ are the same. The arrows that were frequent in $p_1$ and $p_2$ are still frequent in $s_1$ and $s_2$ and occur at least as many times as they did in $p_1$ and $p_2$ respectively: we have guaranteed this for the arrows of $C$ using Lemma 7.4, and this is clear for the arrows outside of $C$ as all their occurrences in $p_1$ and $p_2$ were in $s_1$ and $s_2$ respectively, and $s'_1$ has at least as many occurrences of every letter as $s_1$ does (this is a consequence of the statement of Lemma 7.5). This ensures that $s'_1 \sim_n s_2$, and that $n$ is still an $|S|$-distant rare-frequent threshold for them.
Hence, by the induction hypothesis, we have $s'_1 \equiv s_2$, so that by compositionality we have $p_1 \equiv p_2$.
Case 2: $C$ does not occur again after $a$. We first claim that $q_1 \equiv q_2$ by the base case of the outer induction. Indeed, first note that they are two coterminal paths. Now, every arrow $x$ which is frequent in $p_1$ and $p_2$ is either in the SCC of the initial object of $a$ or not. In the first situation, all the occurrences of $x$ in $p_1$ must be in $q_1$, as any occurrence of $x$ in $s_1$ would witness that we are in Case 1; and likewise all its occurrences in $p_2$ must be in $q_2$. In the second situation, all its occurrences in $p_1$ must be in $s_1$ and all its occurrences in $p_2$ must be in $s_2$, for the same reason. Thus, $q_1$ and $q_2$ contain no letter which was rare in $p_1$ and $p_2$, some of the frequent letters of $p_1$ and $p_2$ (those of the other SCCs) do not occur there at all, and the others occur there with the same number of occurrences. Thus indeed $q_1 \sim_n q_2$, they contain no rare arrows, and $n$ is still an $|S|$-distant rare-frequent threshold for them. Thus, the base case of the outer induction concludes that they evaluate to the same category element.
We now claim that $s_1 \equiv s_2$ by the induction case of the outer induction. Indeed, they are again two coterminal paths. What is more, by the previous reasoning the arrows that are frequent in $p_1$ and $p_2$ either occur only in $s_1$ and $s_2$ or do not occur there at all. Thus, $s_1$ and $s_2$ contain $r$ rare arrows (for the arrows that were already rare in $p_1$ and $p_2$), and the frequent arrows either occur in $s_1$ and $s_2$ with the same number of occurrences as in $p_1$ and $p_2$ or not at all. This implies that $n$ is still an $|S|$-distant rare-frequent threshold for $s_1$ and $s_2$. Thus, we have $s_1 \sim_n s_2$ and the induction case of the outer induction establishes that $s_1 \equiv s_2$.
Thus, by compositionality, we know that $p_1$ and $p_2$ evaluate to the same category element. We have concluded both cases of the outer induction proof.