arXiv:1708.02399v2 [math.CO] 30 Aug 2017
THE BIDIRECTIONAL BALLOT POLYTOPE
STEVEN J. MILLER, CARSTEN PETERSON, CARSTEN SPRUNGER, AND ROGER VAN PESKI
ABSTRACT. A bidirectional ballot sequence (BBS) is a finite binary sequence with the property
that every prefix and suffix contains strictly more ones than zeros. BBSs were introduced by Zhao,
and independently by Bousquet-Mélou and Ponty as (1,1)-culminating paths. Both sets of authors
noted the difficulty in counting these objects, and to date research on bidirectional ballot sequences
has been concerned with asymptotics. We introduce a continuous analogue of bidirectional ballot
sequences which we call bidirectional gerrymanders, and show that the set of bidirectional
gerrymanders forms a convex polytope sitting inside the unit cube, which we refer to as the
bidirectional ballot polytope. We prove that every $(2n-1)$-dimensional unit cube can be partitioned
into $2n-1$ isometric copies of the $(2n-1)$-dimensional bidirectional ballot polytope. Furthermore,
we show that the vertices of this polytope are all also vertices of the cube, and that the vertices
are in bijection with BBSs. An immediate corollary is a geometric explanation of the result of Zhao
and of Bousquet-Mélou and Ponty that the number of BBSs of length $n$ is $\Theta(2^n/n)$.
1. INTRODUCTION
In [Zh1], the author introduced a family of combinatorial objects called bidirectional ballot
sequences, defined as follows.
Definition 1.1. A finite 0-1 sequence is a bidirectional ballot sequence (BBS) if every prefix and
every suffix contains strictly more ones than zeros. Let $B_n$ denote the number of bidirectional
ballot sequences of length $n$.
Bidirectional ballot sequences have a natural interpretation in terms of lattice paths. Suppose
we start at $(0,0)$ and take a finite number of steps, each of the form $(1,1)$ or $(1,-1)$. We call
such a path a standard lattice path. We define the length of the path to be the number of steps
it contains. We define the height of a point in the lattice path to be its $y$-coordinate. Bidirectional
ballot sequences of length $n$ are in bijection with standard lattice paths of length $n$ whose unique
minimum height is attained at the first point in the path, and whose unique maximum height is
attained at the last point in the path. The bijection is given by identifying the digit '0' in a BBS
with a step of the form $(1,-1)$ and the digit '1' with a step of the form $(1,1)$ (for an example of
this, see Section 4).
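The equivalence between Definition 1.1 and the lattice-path description can be checked by brute force for small $n$. The following is a minimal Python sketch; the function names (`is_bbs`, `path_heights`, `is_culminating`, `count_bbs`) are illustrative and do not appear in [Zh1] or [BP].

```python
from itertools import product

def is_bbs(bits):
    """Definition 1.1: every non-empty prefix and suffix has strictly more 1s than 0s."""
    def ok(seq):
        balance = 0
        for b in seq:
            balance += 1 if b == 1 else -1
            if balance <= 0:
                return False
        return True
    # Reading the reversed sequence turns suffixes into prefixes.
    return ok(bits) and ok(bits[::-1])

def path_heights(bits):
    """Heights of the standard lattice path: 1 is the step (1,1), 0 is the step (1,-1)."""
    heights = [0]
    for b in bits:
        heights.append(heights[-1] + (1 if b == 1 else -1))
    return heights

def is_culminating(bits):
    """Unique minimum at the first point and unique maximum at the last point."""
    h = path_heights(bits)
    return all(h[0] < x for x in h[1:]) and all(x < h[-1] for x in h[:-1])

def count_bbs(n):
    """Brute-force value of B_n; only practical for small n."""
    return sum(is_bbs(bits) for bits in product((0, 1), repeat=n))
```

For instance, `count_bbs(5)` returns 2, the two sequences being 11011 and 11111.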
From this perspective, bidirectional ballot sequences were independently introduced by [BP]
as a special type of what they call culminating paths. In particular, an $(a,b)$-culminating path is
a sequence of lattice points starting at $(0,0)$ such that each step is of the form $(1,a)$ or $(1,-b)$
and such that the unique minimum height is achieved at the first point and the unique maximum
height is achieved at the last point. Thus bidirectional ballot sequences are in bijection
with $(1,1)$-culminating paths. In [BP] it is noted that $(1,1)$-culminating paths had been used in
[FGK] with connections to theoretical physics, and general $(a,b)$-culminating paths had been used
in [AGMML], [CR], and [PL] with connections to bioinformatics.
Date: August 31, 2017.
1991 Mathematics Subject Classification. 51M20, 52B12, 05A99.
Key words and phrases. bidirectional ballot sequence, culminating paths.
The first named author was partially supported by NSF Grants DMS1265673 and DMS1561945, the other authors
by DMS1659037, Princeton University and Williams College. We thank Mel Nathanson and Kevin O’Bryant for
helpful conversations.
In both [Zh1] and [BP], it is noted that unlike other easy-to-define classes of lattice paths (e.g.
Dyck paths), the enumeration of BBSs is tricky; there is no obvious recursive structure to such
paths. Both authors focused on the asymptotics of $B_n$. In particular, [BP] obtained a generating
function in $n$ for the number of $(a,b)$-culminating paths of length $n$ with fixed height $k$
(the generating function for the $(1,1)$ case was found in [FGK]). Furthermore, they showed that
$B_n \sim 2^n/(4n)$. Independently, [Zh1] showed that $B_n = \Theta(2^n/n)$ and stated without detailed
proof that $B_n \sim 2^n/(4n)$. Additionally in [Zh1], the author conjectured an even finer asymptotic
expression for $B_n$. This conjecture was later proved by [HHPW], who refined the asymptotic expression
even further using techniques from analytic combinatorics.
The motivation for the study of culminating paths in [BP] was the observation that such paths
had been independently introduced and utilized in disparate contexts (theoretical physics and
bioinformatics) as well as a general interest in understanding subfamilies of lattice paths. However, the
motivation in [Zh1], as well as our original motivation for studying BBSs, arises from additive
combinatorics. Let $A \subset \mathbb{Z}$ be a finite set of integers. We define the sumset, $A+A$, as those
elements in $\mathbb{Z}$ expressible as $a+b$ with $a, b \in A$. Similarly, the difference set, $A-A$, is those
elements expressible as $a-b$ with $a, b \in A$. We say that $A$ is a more sums than differences (MSTD)
set if $|A+A| > |A-A|$. Because of the commutativity of addition, one may intuitively expect
that in general $|A-A| \ge |A+A|$. This intuition turns out to be correct in some contexts (see
[HM]), in particular if each element in $[n] := \{1,2,\dots,n\}$ is independently chosen to be in $A$
with some probability $p(n)$ tending to zero. Let $\rho_n$ be the proportion of subsets of $[n]$ which are
MSTD (also called sum dominant). In [MO], it was shown that $\rho_n > 2 \times 10^{-7}$ for $n \ge 15$, and in
[Zh2] it was shown that $\rho_n$ converges to a positive number as $n \to \infty$; experimental data suggests
this limit to be of order $10^{-4}$. Thus, in this sense, a positive proportion of sets are MSTD. However,
the techniques in [MO] are probabilistic, and to date no explicit constant density family of MSTD
subsets of $[n]$ as $n \to \infty$ is known.
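To make the definitions of sumset, difference set, and MSTD set concrete, here is a small Python sketch. The helper names are illustrative, and the example set is the classical 8-element MSTD set usually attributed to Conway; it is not taken from this paper.

```python
def sumset(A):
    """All a + b with a, b in A."""
    return {a + b for a in A for b in A}

def difference_set(A):
    """All a - b with a, b in A."""
    return {a - b for a in A for b in A}

def is_mstd(A):
    """True when |A + A| > |A - A|."""
    return len(sumset(A)) > len(difference_set(A))

# Classical example (assumption: attributed to Conway, not from this paper).
conway = {0, 2, 3, 4, 7, 11, 12, 14}
print(len(sumset(conway)), len(difference_set(conway)))  # prints: 26 25
```

By contrast, an interval such as $\{1,2,3\}$ has five sums and five differences, so it is not MSTD.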
The best density explicit construction of MSTD sets is due to Zhao in [Zh1] using BBSs. Let
$B$ be a binary sequence of length $n$. We can associate to $B$ the set $A \subseteq [n]$ defined as
$A := \{i : B_i = 1\}$. For example if $B = 01101$, then $A = \{2,3,5\}$. Those subsets $A$ of $[n]$ arising
from BBSs have the property that $A + A = \{i : 2 \le i \le 2n\}$, which is to say that the sumset is as
large as possible (similarly it turns out that the difference set is also as large as possible). Using
this property, Zhao was able to translate those subsets of $[n]$ arising from BBSs and append extra
elements to the fringes to obtain an MSTD set for each set arising from a BBS. From this, one
immediately gets a density $\Theta(1/n)$ family of MSTD sets.
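The association of a binary sequence with a subset of $[n]$, and the full-sumset property of sets arising from BBSs, can be illustrated as follows. This is only a sketch with hypothetical helper names; it does not implement Zhao's fringe construction from [Zh1].

```python
from itertools import product

def is_bbs(bits):
    """Every non-empty prefix and suffix has strictly more 1s than 0s."""
    def ok(seq):
        balance = 0
        for b in seq:
            balance += 1 if b == 1 else -1
            if balance <= 0:
                return False
        return True
    return ok(bits) and ok(bits[::-1])

def indicator_set(bits):
    """A = {i : B_i = 1}, with positions numbered from 1."""
    return {i for i, b in enumerate(bits, start=1) if b == 1}

def has_full_sumset(bits):
    """Check A + A = {2, 3, ..., 2n} for the set A associated to bits."""
    A = indicator_set(bits)
    n = len(bits)
    return {a + b for a in A for b in A} == set(range(2, 2 * n + 1))

print(sorted(indicator_set((0, 1, 1, 0, 1))))  # prints: [2, 3, 5]
```

For example, the BBS 11011 gives $A = \{1,2,4,5\}$ and $A + A = \{2,\dots,10\}$.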
Motivated by the use of BBSs in additive combinatorics, in this paper we study the natural
analogue of BBSs in a continuous setting, which we call bidirectional gerrymanders; in the
related paper [MP], we use ideas similar to those in this paper to study the analogue of MSTD sets
in a continuous setting.
We first set some notation and then describe our main results. Let $\mathcal{I}_n$ denote the set of all
subsets of $\mathbb{R}$ consisting of exactly $n$ disjoint open intervals such that the leftmost interval starts
at 0. Suppose $A \in \mathcal{I}_n$. If we translate $A$, then the sumset and difference set merely translate
as well. Thus, when studying additive behavior, we do not lose any generality by restricting our
attention to collections of intervals such that the leftmost interval starts at zero. We can topologize
$\mathcal{I}_n$ by identifying it with $\mathbb{R}^{2n-1}_{\ge 0}$, the non-negative orthant: let
$A = I_1 \cup I_2 \cup \cdots \cup I_n \in \mathcal{I}_n$ with $I_i$ to the left of $I_j$ for $i < j$. Suppose
$I_j = (a_j, b_j)$. We then identify $A$ with the vector
\[ v_A = [b_1 - a_1,\ a_2 - b_1,\ b_2 - a_2,\ a_3 - b_2,\ \dots,\ b_n - a_n]. \]
Thus the first entry is the length of the first
interval, the second entry is the size of the gap between the first and second intervals, the third
entry is the length of the second interval, etc. We shall find it convenient to restrict our attention
to the following set: let $\mathcal{J}_n \subset \mathcal{I}_n$ be the set of collections of $n$ non-overlapping intervals such
that the leftmost interval starts at zero, the length of each interval is between 0 and 1, and the gap
between adjacent intervals is between 0 and 1 (if we scale $A \in \mathcal{I}_n$ by $\alpha \ne 0$, then the sumset and
difference set scale by $\alpha$ as well, so $\alpha A$ has the same essential additive behavior as $A$; note that up
to scaling, every element of $\mathcal{I}_n$ is an element of $\mathcal{J}_n$). We can topologize $\mathcal{J}_n$ by identifying it with
$C_{2n-1} = [0,1]^{2n-1}$, the $(2n-1)$-dimensional unit cube. For other ways to topologize $\mathcal{I}_n$ and related
spaces, see [MP].
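The identification of a collection of intervals with its vector of alternating lengths and gaps can be sketched in a few lines of Python. The function names `gap_vector` and `intervals_from_vector` are hypothetical, and exact rational arithmetic is used only to keep the example free of rounding.

```python
from fractions import Fraction

def gap_vector(intervals):
    """Map sorted disjoint intervals [(a_1, b_1), ..., (a_n, b_n)] with a_1 = 0
    to [b_1 - a_1, a_2 - b_1, b_2 - a_2, ..., b_n - a_n]."""
    v = []
    prev_right = None
    for a, b in intervals:
        if prev_right is not None:
            v.append(a - prev_right)  # gap before this interval
        v.append(b - a)               # length of this interval
        prev_right = b
    return v

def intervals_from_vector(v):
    """Inverse map: rebuild the intervals, starting from 0."""
    pos = 0
    out = []
    for i, x in enumerate(v):
        if i % 2 == 0:                # even positions (0-based) are interval lengths
            out.append((pos, pos + x))
        pos += x
    return out

A = [(0, Fraction(3, 4)),
     (Fraction(13, 12), Fraction(19, 12)),
     (Fraction(9, 4), Fraction(13, 4))]
print(gap_vector(A))
```

Here the three intervals have lengths 3/4, 1/2, 1 and gaps 1/3, 2/3, so the resulting vector lies in the cube $[0,1]^5$.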
The bidirectional gerrymanders in $\mathcal{J}_n$ form a convex, compact polytope contained in $C_{2n-1}$
which we call the bidirectional ballot polytope, $P_n$. This polytope has a number of extraordinary
combinatorial features. In Section 2 we formally define this polytope and show that $C_{2n-1}$
can be partitioned into $2n-1$ isometric copies of $P_n$ with pairwise disjoint interiors, which in
particular shows that the volume of $P_n$ is $1/(2n-1)$. In Section 3 we show that the vertices of $P_n$
are vertices of $C_{2n-1}$. Finally in Section 4 we show that the vertices of $P_n$ are in bijection with
$B_{2n+3}$, and that a particular subset of the vertices is in bijection with $B_{2n-1}$. From this we are
able to immediately rederive geometrically that $|B_n| = \Theta(2^n/n)$.
2. THE BIDIRECTIONAL BALLOT CONE AND POLYTOPE
We first set some notation. Let $m = 2n-1$ for some $n \in \mathbb{N}$.
Definition 2.1. Let the set of left ballot vectors, $L_n$, and the set of right ballot vectors, $R_n$, be the
following sets of vectors in $\mathbb{R}^m$:
\[ L_n := \{[1,-1,0,\dots,0],\ [1,-1,1,-1,0,\dots,0],\ \dots,\ [1,-1,\dots,1,-1,0]\}, \tag{2.1} \]
\[ R_n := \{[0,\dots,0,-1,1],\ [0,\dots,0,-1,1,-1,1],\ \dots,\ [0,-1,1,\dots,-1,1]\}. \tag{2.2} \]
We define $V_n$, the set of ballot vectors, as $V_n = L_n \cup R_n$.
Definition 2.2. The bidirectional ballot cone, $\mathcal{B}_n$, is the set of $x \in \mathbb{R}^m$ such that $x \cdot w \ge 0$ for all
$w \in V_n$. When the value of $n$ is obvious, we simply refer to it as $\mathcal{B}$.
We now define the continuous analogue of BBSs, and show in Proposition 2.4 that it is the right
generalization.
Definition 2.3. Let $A \in \mathcal{I}_n$. We call $A$ a bidirectional gerrymander if $v_A \in \mathcal{B}$.
Proposition 2.4. Suppose $A = I_1 \cup \cdots \cup I_n \in \mathcal{I}_n$ with endpoints ordered as before. Suppose the
right endpoint of $I_n$ is $a$. Then $A$ is a bidirectional gerrymander if and only if
$\mu(A \cap [0,t]) \ge t/2$ and $\mu(A \cap [a-t,a]) \ge t/2$ for all $t \in [0,a]$.
Proof. The condition $\mu(A \cap [0,t]) \ge t/2$ is equivalent to the non-negativity of
$\mu(A \cap [0,t]) - \mu((\mathbb{R} \setminus A) \cap [0,t])$. For $t \in [0,a]$,
$\mu(A \cap [0,t]) - \mu((\mathbb{R} \setminus A) \cap [0,t])$ takes a local minimum exactly at the
left endpoints of the intervals $I_i$. Hence showing that it is non-negative is equivalent to the condition
that $v_A \cdot w \ge 0$ for all $w \in L_n$. The equivalence of $\mu(A \cap [a-t,a]) \ge t/2$ and
$v_A \cdot w \ge 0$ for all $w \in R_n$ follows similarly.
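Proposition 2.4 can be checked numerically on small examples. The sketch below (Python, with hypothetical names `ballot_vectors`, `in_cone`, `measure_condition`) builds the ballot vectors of Definition 2.1 and compares the dot-product test with the measure condition. It relies on the observation that $\mu(A \cap [0,t]) - t/2$ is piecewise linear in $t$, so it suffices to test the interval endpoints; entries of $v$ are assumed non-negative.

```python
from fractions import Fraction
from itertools import product

def ballot_vectors(n):
    """Left and right ballot vectors in R^m, m = 2n - 1 (Definition 2.1)."""
    m = 2 * n - 1
    left = [[1, -1] * k + [0] * (m - 2 * k) for k in range(1, n)]
    right = [[0] * (m - 2 * k) + [-1, 1] * k for k in range(1, n)]
    return left, right

def in_cone(v):
    """v . w >= 0 for every ballot vector w (Definition 2.2)."""
    n = (len(v) + 1) // 2
    left, right = ballot_vectors(n)
    return all(sum(x * y for x, y in zip(v, w)) >= 0 for w in left + right)

def measure_condition(v):
    """mu(A n [0,t]) >= t/2 and mu(A n [a-t,a]) >= t/2, tested at every endpoint."""
    def one_sided(lengths):
        covered = total = 0
        for i, x in enumerate(lengths):
            total += x
            if i % 2 == 0:          # even positions (0-based) are intervals of A
                covered += x
            if 2 * covered < total:
                return False
        return True
    return one_sided(v) and one_sided(v[::-1])

grid = (0, Fraction(1, 2), 1)
agree = all(in_cone(list(v)) == measure_condition(list(v))
            for v in product(grid, repeat=5))
print(agree)
```

The final line compares the two tests on all 243 vectors of $\{0, 1/2, 1\}^5$.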
A BBS in the sense of [Zh1] is a binary sequence for which any subsequence truncated on the
left or right contains more 1's than 0's, and Proposition 2.4 shows that a bidirectional gerrymander
is a subset of $\mathbb{R}$ contained in $[0,a]$ for which any subset obtained by truncating on the left or right
contains “more” points (in a measure theoretic sense) in the original set than points not in this set.
It is thus clear that they are a natural analogue, but, as we shall see, what is surprising is that they
can be used to prove results about standard (discrete) BBSs.
Definition 2.5. The bidirectional ballot polytope, $P_n$, is defined as $\mathcal{B}_n \cap C_m$. Equivalently, it is
those vectors $v_A$ such that $A \in \mathcal{J}_n$ is a bidirectional gerrymander. When the value of $n$ is obvious,
we shall refer to it simply as $P$.
FIGURE 1. The polytope $P_2$ (red) sitting inside $C_3$. Notice that adding two additional copies of
$P_2$, rotated about the main diagonal of the cube by $2\pi/3$ and $4\pi/3$
respectively, would result in a partition of $C_3$ (neglecting overlap of boundaries).
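Definition 2.6. Let $\mathbb{Z}_m$ be the cyclic group of order $m$ with generator $\rho$. Let $\mathbb{Z}_m$ act on $\mathbb{R}^m$ by
cyclically permuting the entries (e.g. $\rho^2([0,1,2,3,4]) = [3,4,0,1,2]$). For a given set of vectors
$V$ and $\sigma \in \mathbb{Z}_m$, let $\sigma(V) := \{\sigma(v) : v \in V\}$. For each $\sigma \in \mathbb{Z}_m$, define $\mathcal{B}_\sigma$ by
\[ \mathcal{B}_\sigma := \{v \in \mathbb{R}^m_{\ge 0} : v \cdot w \ge 0 \text{ for all } w \in \sigma(V_n)\}, \tag{2.3} \]
and $P_\sigma$ likewise. Note that $\mathcal{B}_\sigma = \sigma^{-1}(\mathcal{B})$, and that $\mathcal{B} = \mathcal{B}_{\mathrm{Id}}$ and $P = P_{\mathrm{Id}}$.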
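Theorem 2.7. The non-negative orthant, $\mathbb{R}^m_{\ge 0}$, is contained in $\bigcup_{\sigma \in \mathbb{Z}_m} \mathcal{B}_\sigma$. Furthermore, for
$\sigma_1 \ne \sigma_2$, the interiors of $\mathcal{B}_{\sigma_1}$ and $\mathcal{B}_{\sigma_2}$ are disjoint.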
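Proof. Let $\tau = \rho^2 \in \mathbb{Z}_m$ be the cyclic shift by two places. Because $m$ is odd, $\tau$ generates $\mathbb{Z}_m$. In
particular, we see that the set of left and right ballot vectors $V_n$ as defined in Definition 2.1 is equal
to
\[ V_n = \left\{ \sum_{i=0}^{k} \tau^i(w) : 0 \le k \le 2n-3 \right\}, \tag{2.4} \]
where $w = [1,-1,0,\dots,0]$. If $\ell < k \le 2n-3$ then
\[ \sum_{i=0}^{k} \tau^i(w) - \sum_{i=0}^{\ell} \tau^i(w) = \sum_{i=\ell+1}^{k} \tau^i(w) = \tau^{\ell+1} \sum_{i=0}^{k-\ell-1} \tau^i(w), \tag{2.5} \]
and since $\sum_{i=0}^{2n-2} \tau^i(w) = [0,\dots,0]$ we have similarly that, for $0 \le k \le \ell$,
\[ \sum_{i=0}^{k} \tau^i(w) - \sum_{i=0}^{\ell} \tau^i(w) = \tau^{\ell+1} \sum_{i=0}^{(2n-3)+(k-\ell)} \tau^i(w). \tag{2.6} \]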
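Then we have that
\[ \left\{ \sum_{i=0}^{k} \tau^i(w) - \sum_{i=0}^{\ell} \tau^i(w) : 0 \le k \le 2n-3 \right\} = \tau^{\ell+1}(V_n). \tag{2.7} \]
Now let $w_k = \sum_{i=0}^{k} \tau^i(w)$, take any $v \in [0,1]^m$, and choose $0 \le \ell \le 2n-3$ minimizing $v \cdot w_\ell$
(this may not be unique). Then
\[ v \cdot \left( \sum_{i=0}^{k} \tau^i(w) - \sum_{i=0}^{\ell} \tau^i(w) \right) \ge 0 \tag{2.8} \]
for all $0 \le k \le 2n-3$. Therefore $v \cdot r \ge 0$ for all $r \in \tau^{\ell+1}(V_n)$, so $v \in \mathcal{B}_{\tau^{\ell+1}}$. This shows that
$\mathbb{R}^m_{\ge 0} = \bigcup_{\sigma \in \mathbb{Z}_m} \mathcal{B}_\sigma$. Intersecting with $C$ gives the corresponding result for $P$.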
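Conversely, if $v \in \mathrm{Int}(\mathcal{B}_{\tau^{\ell+1}}) \cap \mathrm{Int}(\mathcal{B}_{\tau^{k+1}})$ and $\tau^{\ell+1} \ne \tau^{k+1}$, then (because taking the interior
simply changes the inequalities defining $\mathcal{B}_{\tau^{\ell+1}}$ to strict ones) we have both
\[ v \cdot \left( \sum_{i=0}^{k} \tau^i(w) - \sum_{i=0}^{\ell} \tau^i(w) \right) > 0 \quad \text{and} \quad v \cdot \left( \sum_{i=0}^{\ell} \tau^i(w) - \sum_{i=0}^{k} \tau^i(w) \right) > 0. \]
This is a contradiction, so the interiors of distinct regions $\mathcal{B}_{\tau^{\ell+1}}$ are disjoint, and it follows
immediately that the interiors of distinct regions $P_{\tau^{\ell+1}}$ are disjoint.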
Corollary 2.8. The unit cube $C_m$ equals $\bigcup_{\sigma \in \mathbb{Z}_m} P_\sigma$. Furthermore, for $\sigma_1 \ne \sigma_2$, the interiors of
$P_{\sigma_1}$ and $P_{\sigma_2}$ are disjoint. Consequently, the volume of $P$ is exactly $1/m$.
Proof. Intersecting the non-negative orthant and the regions $\mathcal{B}_\sigma$ with $C_m$, Theorem 2.7 yields that
$C_m$ is partitioned into $m$ regions produced by cyclically permuting the coordinates of $P$. Because the
matrix representing $\tau = \rho^2$ has determinant 1, it leaves volume invariant. Therefore $\mathrm{Vol}(P_\sigma) = \mathrm{Vol}(P)$
for all $\sigma \in \mathbb{Z}_m$, so $\mathrm{Vol}(P) = 1/m$.
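Two ingredients of the argument above lend themselves to a finite check: the description (2.4) of the ballot vectors as partial sums of shifts of $w$, and the fact that every vertex of the cube lies in at least one region $\mathcal{B}_\sigma$. The Python sketch below uses hypothetical helper names and tests only cube vertices, so it illustrates the covering statement rather than proving it.

```python
from itertools import product

def rotate(v, r):
    """Apply rho^r: cyclically move the last r entries to the front."""
    r %= len(v)
    return list(v[len(v) - r:]) + list(v[:len(v) - r])

def ballot_vectors(n):
    """All left and right ballot vectors in R^m, m = 2n - 1 (n >= 2)."""
    m = 2 * n - 1
    left = [[1, -1] * k + [0] * (m - 2 * k) for k in range(1, n)]
    right = [[0] * (m - 2 * k) + [-1, 1] * k for k in range(1, n)]
    return left + right

def partial_sums(n):
    """The vectors sum_{i=0}^{k} tau^i(w) for 0 <= k <= 2n-3, where tau = rho^2."""
    m = 2 * n - 1
    w = [1, -1] + [0] * (m - 2)
    total = [0] * m
    sums = []
    for k in range(2 * n - 2):
        total = [a + b for a, b in zip(total, rotate(w, 2 * k))]
        sums.append(total)
    return sums

def in_region(v, r, n):
    """Membership of a non-negative vector v in B_sigma for sigma = rho^r, as in (2.3)."""
    return all(sum(x * y for x, y in zip(v, rotate(w, r))) >= 0
               for w in ballot_vectors(n))

def covering_counts(n):
    """For each cube vertex, the number of sigma with the vertex in B_sigma."""
    m = 2 * n - 1
    return [sum(in_region(x, r, n) for r in range(m))
            for x in product((0, 1), repeat=m)]

print(rotate([0, 1, 2, 3, 4], 2))  # prints: [3, 4, 0, 1, 2]
```

Vertices on shared boundaries may be counted by more than one region, which is why only the interiors are claimed to be disjoint.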
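Corollary 2.9. For any vector $v \in \mathbb{R}^m_{\ge 0}$, there exists $\sigma \in \mathbb{Z}_m$ such that the vector
$v' = (v'_1, v'_2, \dots, v'_m) = \sigma(v)$ has the following property: for all $1 \le k \le n$,
\[ \sum_{i=1}^{k} \left( v'_{2i-1} - v'_{2i} \right) \ge 0 \tag{2.9} \]
and
\[ \sum_{i=1}^{k} \left( v'_{2n-(2i-1)} - v'_{2n-2i} \right) \ge 0. \tag{2.10} \]
If furthermore these are all $> 0$, then $\sigma$ is unique.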
One interpretation of the above corollary is as follows. Suppose you have a necklace with an odd
number of beads. On each bead you write a non-negative number. Then there exists some place
where you can cut the necklace such that when you lay out the necklace and think of the sequence
of values on the beads as a vector in Rm, this vector is a bidirectional gerrymander. Furthermore,
if the numbers you write on the beads are “generic”, then there is exactly one such place you can
cut the necklace.
FIGURE 2. An example “cut” of a necklace as in Corollary 2.9: the beads 1.78, 1.55, 0.76, 2.06, 3.21
are laid out as the vector [3.21, 1.78, 1.55, 0.76, 2.06].
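The necklace interpretation can be tried out directly. The following Python sketch lists every cut of a necklace whose laid-out vector satisfies the left and right ballot inequalities; the names `in_ballot_cone` and `valid_cuts` are hypothetical, and the cyclic order of the beads is assumed to be the one read off from the cut shown in Figure 2.

```python
def in_ballot_cone(v):
    """Check the left and right ballot inequalities for a vector of odd length."""
    def one_side(u):
        total = 0
        for i in range(0, len(u) - 1, 2):
            total += u[i] - u[i + 1]
            if total < 0:
                return False
        return True
    return one_side(v) and one_side(v[::-1])

def valid_cuts(beads):
    """All cyclic rotations of the necklace that land in the ballot cone."""
    rotations = [beads[i:] + beads[:i] for i in range(len(beads))]
    return [r for r in rotations if in_ballot_cone(r)]

necklace = [1.78, 1.55, 0.76, 2.06, 3.21]
print(valid_cuts(necklace))  # prints: [[3.21, 1.78, 1.55, 0.76, 2.06]]
```

For a non-generic necklace such as `[1, 1, 1]`, every cut is valid, which matches the fact that uniqueness in Corollary 2.9 requires strict inequalities.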
3. VERTICES OF THE BIDIRECTIONAL BALLOT POLYTOPE ARE VERTICES OF THE CUBE
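In this section we show that the vertices of $P_n$ are also vertices of $C_m$, the unit cube. We had
previously defined $P_n$ as the intersection of the unit cube with the ballot cone, which is equivalent
to the set of vectors $[\ell_1, g_1, \dots, g_{n-1}, \ell_n]$ satisfying the inequality below, in which the first
block of rows consists of cube vectors, the second of left ballot vectors, and the third of right ballot
vectors:
\[
\left(\begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & -1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 1 & -1 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & -1 & 1 \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & -1 & 1 & -1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}\right)
\begin{pmatrix} \ell_1 \\ g_1 \\ \ell_2 \\ g_2 \\ \vdots \\ g_{n-1} \\ \ell_n \end{pmatrix}
\ge
\begin{pmatrix} 0 \\ -1 \\ 0 \\ -1 \\ \vdots \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ \vdots \end{pmatrix}.
\tag{3.1}
\]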
The first collection of rows in the above matrix is necessary to ensure that we only deal with
points inside of the unit cube. Thus we call any vector of the form $[0,\dots,0,\pm 1,0,\dots,0]$ a cube
vector.
Before proving the main result of this section, we must review a few concepts related to convex
polytopes. We follow the terminology of [BT].
Definition 3.1. Let $P$ be a polytope in $\mathbb{R}^n$ defined by the inequalities $a_i^T x \ge b_i$ for $i \in [k]$. Let $x$
be such that for some $i$, $a_i^T x = b_i$. Then we say that the $i$th constraint is active at $x$.
Definition 3.2. A vector $x \in \mathbb{R}^n$ is called a basic solution if out of all of the constraints that are
active at $x$, there is some collection of $n$ of them which is linearly independent. If $x$ is a basic
solution that satisfies all of the constraints, then it is called a basic feasible solution.
Part of what makes the study of convex polytopes interesting is that there are several equivalent
but strikingly different ways of defining what the vertices of a polytope are. In particular, one
definition is that a point $v$ is a vertex if and only if it is a basic feasible solution.
Another definition which will be helpful in the proof of the main theorem of this section is the
following.
Definition 3.3. A matrix/vector is called flat if all of its entries are 0, 1, or $-1$.
Let $Q_n$ denote the set of vertices of the polytope $P_n$. Let $S_n$ denote the set of vertices of the unit
cube $C_m$. The main result of this section is the following.
Theorem 3.4. All of the vertices of the bidirectional ballot polytope $P_n$ are also vertices of the
unit cube $C_m$; i.e., $Q_n \subseteq S_n$.
Proof. By the above discussion, we know that we must show that all basic feasible solutions are
vertices of the cube. Throughout this proof, we let $n$ be fixed, and let $m = 2n-1$. Thus we
unambiguously let $P = P_n$, $C = C_{2n-1}$, $Q = Q_n$, and $S = S_n$. Notice that $\mathbb{Z}^m \cap P \subseteq S$. From
this observation, we now describe the strategy for proving the theorem. Suppose $x$ is a basic
solution whose corresponding constraints are $a_{i_1}, \dots, a_{i_m}$. Then $x$ satisfies
\[ \begin{pmatrix} a_{i_1} \\ \vdots \\ a_{i_m} \end{pmatrix} x = \begin{pmatrix} b_{i_1} \\ \vdots \\ b_{i_m} \end{pmatrix}. \tag{3.2} \]
Let $A$ be the matrix in (3.2). Let $b$ be the vector on the right hand side in (3.2). Thus $x = A^{-1} b$.
Note that $b \in \mathbb{Z}^m$ since its entries are a subset of the entries in the vector on the right hand side of
(3.1). If we can show that $\det(A) = \pm 1$, it will imply that $A^{-1}$ has integer entries, and thus that
$A^{-1} b \in \mathbb{Z}^m$. From the earlier observation, if $x$ is a basic feasible solution, then we must have
that $A^{-1} b = x \in S$, which would prove the theorem.
Now we must show that if $A$ is invertible, then it has determinant $\pm 1$. In order to show this,
we will show that given any such $A$, we can obtain a sequence of matrices $A_0, A_1, \dots, A_m$ with
$A_0 = A$ such that $\det(A_i) = \pm \det(A_{i-1})$. The last matrix $A_m$ will be a permutation matrix, and
thus will have determinant $\pm 1$, allowing us to conclude that $\det(A)$ is $\pm 1$.
Now, let $A$ be some invertible matrix whose rows are composed of cube vectors, left ballot
vectors, and right ballot vectors. We carry out the following process. First find the smallest index,
$j_1$, such that the 1st entry of the $j_1$th row of $A$ is non-zero (note that this entry must be $\pm 1$).
Multiply this row by $\pm 1$ so that entry $a_{j_1,1} = 1$, and then subtract off the appropriate multiple of
this row from all the other rows so that $a_{k,1} = 0$ if $k \ne j_1$. Call this new matrix $A_1$. We claim that
$A_1$ is flat. This will be proven in Lemma 3.5.
We now find the smallest index $j_2 \notin \{j_1\}$ such that $a_{j_2,2} \ne 0$ (again, it must be $\pm 1$). We
multiply this row by $\pm 1$ so that $a_{j_2,2} = 1$, and then we subtract off the appropriate multiple of this
row so that $a_{k,2} = 0$ if $k \ne j_2$. Call this new matrix $A_2$.
We repeat this above process up to $j_m$. That is, at the $p$th step we find the smallest index
$j_p \notin \{j_1, \dots, j_{p-1}\}$ such that $a_{j_p,p} \ne 0$. We multiply this row by $\pm 1$ so that $a_{j_p,p} = 1$. We then
subtract off the appropriate multiple of this row so that $a_{k,p} = 0$ if $k \ne j_p$. After $m$ steps, the
resulting matrix $A_m$ must have exactly one non-zero entry in each column, which is a one. Thus $A_m$ is a
permutation matrix, so $\det(A_m) = \pm 1$. Thus, once we prove Lemma 3.5, we are done.
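Before proving Lemma 3.5, we include an example to illustrate the method. The bolded row in
$A_i$ corresponds to row $j_{i+1}$ as described in the proof of Theorem 3.4.
\[
A_0: \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ \mathbf{1} & \mathbf{-1} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ 0 & -1 & 1 & -1 & 1 \\ 0 & 0 & 0 & -1 & 1 \end{pmatrix}
\quad
A_1: \begin{pmatrix} \mathbf{0} & \mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ 0 & 0 & 0 & 0 & 1 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & -1 & 1 & -1 & 1 \\ 0 & 0 & 0 & -1 & 1 \end{pmatrix}
\quad
A_2: \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ \mathbf{0} & \mathbf{0} & \mathbf{1} & \mathbf{-1} & \mathbf{1} \\ 0 & 0 & 0 & -1 & 1 \end{pmatrix}
\tag{3.3}
\]
\[
A_3: \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 1 \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{-1} & \mathbf{1} \end{pmatrix}
\quad
A_4: \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1} \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 \end{pmatrix}
\quad
A_5: \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}.
\tag{3.4}
\]
This example reveals that in some cases, $A_{i+1} = A_i$.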
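The column-by-column process from the proof of Theorem 3.4 can be written out as a short Python sketch. The function names `pivot_steps` and `is_flat` are hypothetical, and the matrix used is the $5 \times 5$ example above with its signs read as cube, left ballot, and right ballot vectors for $m = 5$.

```python
def is_flat(M):
    """All entries are 0, 1, or -1 (Definition 3.3)."""
    return all(x in (-1, 0, 1) for row in M for x in row)

def pivot_steps(A):
    """Return [A_0, A_1, ..., A_m] for the elimination process in the proof."""
    m = len(A)
    A = [list(row) for row in A]
    steps = [[row[:] for row in A]]
    used = set()
    for p in range(m):
        # Smallest unused row index with a non-zero entry in column p.
        j = min(r for r in range(m) if r not in used and A[r][p] != 0)
        used.add(j)
        if A[j][p] not in (1, -1):
            raise ValueError("pivot entry is not +1 or -1")
        if A[j][p] == -1:
            A[j] = [-x for x in A[j]]      # multiply the pivot row by -1
        for r in range(m):
            if r != j and A[r][p] != 0:
                c = A[r][p]
                A[r] = [x - c * y for x, y in zip(A[r], A[j])]
        steps.append([row[:] for row in A])
    return steps

A0 = [[0, 1, 0, 0, 0],
      [0, 0, 0, 0, 1],
      [1, -1, 0, 0, 0],
      [0, -1, 1, -1, 1],
      [0, 0, 0, -1, 1]]
steps = pivot_steps(A0)
print(all(is_flat(M) for M in steps))  # prints: True
```

Each step only negates a row or adds a multiple of one row to another, so the determinant changes by at most a sign, as the proof requires.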
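Lemma 3.5. All of the matrices $A_p$ in the proof of Theorem 3.4 are flat.
Proof. We proceed by induction. In particular we show that for each $k$, every row of the matrix $A_k$
is of exactly one of six types depending on the form of the first $k$ entries of that row and the last
$m-k$ entries of that row (in the sequel, we will refer to this as saying that every row is one of the
six types with respect to $k$).
We now describe these six types. Let $\alpha_n$ denote any sequence of length $n$ consisting of alternating
plus ones and minus ones (e.g. $\alpha_3 = [1,-1,1]$ or $\alpha_1 = [1]$). Let $\beta_n$ denote the sequence
of length $n$ consisting of all zeros. Let $\gamma_n$ denote any binary sequence of length $n$ containing
exactly one one (e.g. $\gamma_4 = [0,0,1,0]$). Let $\oplus$ refer to the operation of vector concatenation (e.g.
$[1,2,3] \oplus [4,5] = [1,2,3,4,5]$). The six types (with respect to $k$) are listed in the following table.
Type   First $k$     Last $m-k$                                                                        Example ($k=3$, $m=7$)
1      $\beta_k$     $\beta_{\ell \ge 1} \oplus \alpha_{j \ge 1} \oplus \beta_{m-k-j-\ell \ge 1}$      $[0,0,0] \oplus [0,1,-1,0]$
2      $\beta_k$     $\alpha_{\ell \ge 1} \oplus \beta_{m-k-\ell \ge 0}$                               $[0,0,0] \oplus [1,-1,1,0]$
3      $\beta_k$     $\beta_{\ell \ge 1} \oplus \alpha_{m-k-\ell \ge 0}$                               $[0,0,0] \oplus [0,0,-1,1]$
4      $\gamma_k$    $\beta_{\ell \ge 1} \oplus \alpha_{j \ge 0} \oplus \beta_{m-k-j-\ell \ge 1}$      $[0,1,0] \oplus [0,0,0,0]$
5      $\gamma_k$    $\alpha_{\ell \ge 1} \oplus \beta_{m-k-\ell \ge 0}$                               $[0,1,0] \oplus [1,-1,1,0]$
6      $\gamma_k$    $\beta_{\ell \ge 1} \oplus \alpha_{m-k-\ell \ge 0}$                               $[0,1,0] \oplus [0,0,-1,1]$
TABLE 1. The six types with respect to $k$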
We now go through the inductive argument. For the base case, notice that when $k = 0$, the cube
vectors are type 1, the left ballot vectors are type 2, and the right ballot vectors are type 3. Thus
the claim is proven in the base case.
Now for the inductive step, we shall show that if all rows of $A_k$ are of one of the above types
with respect to $k$, then all rows of $A_{k+1}$ are of one of the above types with respect to $k+1$. As
described in the proof of Theorem 3.4, at step $k$, we must first find some row whose first $k$ entries
are zero, and whose $(k+1)$st entry is $\pm 1$. We see then that we must select some row of type 2, call
it $T$. We then subtract $T$ from all other rows whose $(k+1)$st entry is non-zero. Thus the only types
we must worry about are types 2 and 5. Notice that when we subtract $T$ from a row of type 2, we
get a row either of type 1, type 2, or type 3 with respect to $k+1$. When we subtract $T$ from a
row of type 5, we get a row either of type 4, 5, or 6 with respect to $k+1$. All other rows remain
the same. Thus when we catalog the new rows with respect to $k+1$, we get that those of type 1
become either type 1 or type 2. As mentioned before, those of type 2 become those of type 1, 2,
or 3, except for row $T$ which becomes of type 4 or 5. Type 3 becomes type 2 or 3. Type 4 remains
type 4 or becomes type 5. As mentioned before, type 5 becomes type 4, 5, or 6. Lastly, type 6
becomes type 5 or type 6. Thus, by induction, we have proven the desired statement, implying in
particular that the matrix is flat at every step.
4. VERTICES OF THE CUBE IN THE BALLOT REGION
In this section, we demonstrate that bidirectional ballot sequences of length $2n-1$ correspond
in a natural way to $Q_n$, and we rederive the growth rate given in [Zh1] and [BP].
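Definition 4.1. A slope vector is a vector $\lambda = [\lambda_1, \dots, \lambda_m] \in \mathbb{R}^m$ with $m \in \mathbb{N}$. To a slope vector $\lambda$,
we associate the unique continuous piecewise linear function $f_\lambda : [0,m] \to \mathbb{R}$ such that $f_\lambda(0) = 0$
and $f'_\lambda(x) = \lambda_i$ for $x \in (i-1, i)$ for each $1 \le i \le m$.
Given any binary sequence $b = b_1 \cdots b_m$, we associate to this sequence the graph of the function
$f_\lambda$ where $\lambda = (\lambda_1, \dots, \lambda_m)$ with $\lambda_i := (-1)^{b_i - 1}$.
Example 4.2. The bidirectional ballot sequence 11011001111 corresponds to the path with successive
heights 0, 1, 2, 1, 2, 3, 2, 1, 2, 3, 4, 5 (figure omitted).
This is a bijection from binary sequences of length $m$ to graphs of functions $f_\lambda$ with $\lambda \in \{\pm 1\}^m$.
Recall from Section 1 that the graphs which correspond to bidirectional ballot sequences are those
of functions $f_\lambda$ where $f_\lambda(0) < f_\lambda(t) < f_\lambda(m)$ for all $0 < t < m$.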
Now we will draw a correspondence between $Q_n$ and $B_{2n+3}$ through these graphs, as well as a
correspondence between a certain subset of $Q_n$ and $B_{2n-1}$, by describing a way to interpret vectors
$v \in C_{2n-1} = [0,1]^{2n-1}$ as paths as in the discrete case in such a way that the vertices of the ballot
polytope are realized as exactly the graphs above. Given a vector $v = [v_1, \dots, v_{2n-1}] \in C_{2n-1}$,
define the slope vector $\lambda_v = [\lambda_1, \dots, \lambda_{2n-1}]$ by $\lambda_i := (-1)^{i-1}(2v_i - 1)$, and associate to $v$ the
graph of the function $f_{\lambda_v}$.
Example 4.3. The gap-parametrization vector $v = [\tfrac{3}{4}, \tfrac{1}{3}, \tfrac{1}{2}, \tfrac{2}{3}, 1] \in [0,1]^5$ gives the slope vector
$\lambda_v = [\tfrac{1}{2}, \tfrac{1}{3}, 0, -\tfrac{1}{3}, 1]$, which gives the graph of the function $f_{\lambda_v}$ on $[0,5]$, with the
distance above the $x$-axis marked at each integer point (figure omitted).
Although the function $f_{\lambda_v}$ in Example 4.3 has the property that it achieves its global minimum and
maximum values at its left and right endpoints (respectively), we will see that this is not always the
case (see Example 4.6). We determine this behavior more precisely now.
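If $v = [v_1, \dots, v_{2n-1}] \in C_{2n-1}$, then for $0 \le k \le 2n-1$ we have
\[
f_{\lambda_v}(k) = \sum_{j=1}^{k} (-1)^{j-1}(2v_j - 1) =
\begin{cases}
2 \sum_{j=1}^{k} (-1)^{j-1} v_j & k \text{ is even} \\
-1 + 2 \sum_{j=1}^{k} (-1)^{j-1} v_j & k \text{ is odd,}
\end{cases}
\tag{4.1}
\]
and similarly,
\[
f_{\lambda_v}(2n-1-k) =
\begin{cases}
f_{\lambda_v}(2n-1) - 2 \sum_{j=1}^{k} (-1)^{j-1} v_{2n-j} & k \text{ is even} \\
f_{\lambda_v}(2n-1) + 1 - 2 \sum_{j=1}^{k} (-1)^{j-1} v_{2n-j} & k \text{ is odd.}
\end{cases}
\tag{4.2}
\]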
One can see now that, even if $v \in P_n$, it is possible for the graph to fail the property stated above,
i.e., to achieve a global maximum or minimum at a point in the interior of its interval of definition
(again, see Example 4.6 for an explicit example). However, one can also see that if $v \in P_n$, it
cannot fail this property to a great extent; namely, the values at the left and right endpoints will be
within a distance of 1 from the maximum and minimum values, since the large sums in the RHS
of (4.1) and (4.2) will be non-negative. Nonetheless, we would like the graphs of the functions $f_{\lambda_v}$
with $v \in Q_n$ to match the graphs of bidirectional ballot sequences in $B_{2n+3}$, and for that reason we
give a way to modify a vector $v \in Q_n$ before associating it to a graph. Namely, we will add a sort
of buffer to each side of the vector, so that the left and right endpoints get a leg up.
Definition 4.4. If $v = [v_1, \dots, v_{2n-1}] \in C_{2n-1}$, we define
\[ \alpha(v) := [1, 0, v_1, v_2, \dots, v_{2n-2}, v_{2n-1}, 0, 1]. \]
We now present two correspondences, the first stated more naturally, and the second proven
more naturally, which are nonetheless very closely related. The first correspondence is as follows.
Theorem 4.5. The set $Q_n$ is in bijection with $B_{2n+3}$, induced by the map
\[ v \mapsto f_{\lambda_{\alpha(v)}}. \tag{4.3} \]
Before we prove Theorem 4.5, we give an example of the process that induces the bijection.
Example 4.6. Consider the gap-parameterization vector $v = [0,0,1,0,0] \in [0,1]^5$, an element
of $Q_3$. We shall obtain a bidirectional ballot sequence from $v$. We see that $v$ gives the slope
vector $\lambda_v = [-1,1,1,1,-1]$. The graph of $f_{\lambda_v}$ passes through the heights 0, $-1$, 0, 1, 2, 1 at
$x = 0, 1, \dots, 5$ (figure omitted).
This is not the graph of a bidirectional ballot sequence. Namely, the graph passes below the $x$-axis
and above the line $y = f_{\lambda_v}(5)$. Let's now consider $\alpha(v) = [1,0,0,0,1,0,0,0,1] \in [0,1]^9$, which
gives slope vector $\lambda_{\alpha(v)} = [1,1,-1,1,1,1,-1,1,1]$ and leads to a graph of $f_{\lambda_{\alpha(v)}}$ passing through
the heights 0, 1, 2, 1, 2, 3, 4, 3, 4, 5 at $x = 0, 1, \dots, 9$ (figure omitted).
The portion of the graph between $x = 2$ and $x = 7$ (marked by vertical dotted lines in the figure) is
simply the graph of $f_{\lambda_v}$ translated in the plane by the vector $[2,2]$. This graph does correspond to a
bidirectional ballot sequence, namely 110111011. We now prove that this process gives a bijection as
in the statement of the theorem.
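The computations in Example 4.6 can be reproduced with a few lines of Python. The helper names below (`slope_vector`, `heights`, `alpha`, `to_bits`) are illustrative; `to_bits` assumes all slopes are $\pm 1$, which holds when the input is a vertex of the cube.

```python
def slope_vector(v):
    """lambda_i = (-1)^(i-1) * (2 v_i - 1), with i counted from 1."""
    return [(2 * x - 1) if i % 2 == 0 else -(2 * x - 1) for i, x in enumerate(v)]

def heights(slopes):
    """Values of f_lambda at the integers 0, 1, ..., m."""
    h = [0]
    for s in slopes:
        h.append(h[-1] + s)
    return h

def alpha(v):
    """Definition 4.4: pad v with the buffer [1, 0] on the left and [0, 1] on the right."""
    return [1, 0] + list(v) + [0, 1]

def to_bits(slopes):
    """Binary sequence of a path with +1/-1 slopes: +1 is the digit 1, -1 is the digit 0."""
    return "".join("1" if s == 1 else "0" for s in slopes)

v = [0, 0, 1, 0, 0]
print(slope_vector(v))                   # prints: [-1, 1, 1, 1, -1]
print(heights(slope_vector(v)))          # prints: [0, -1, 0, 1, 2, 1]
print(to_bits(slope_vector(alpha(v))))   # prints: 110111011
```

The middle block of heights of the padded path is the original list shifted up by 2, in line with the translation by $[2,2]$ described in the example.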
Proof of Theorem 4.5. By the correspondence between bidirectional ballot sequences and graphs
of certain functions given in Example 4.2, it suffices to show that the map of (4.3) puts $Q_n$ in
bijection with
\[ F = \{ f_\mu : \mu \in \{\pm 1\}^{2n+3},\ f_\mu(0) < f_\mu(t) < f_\mu(2n+3) \text{ for all } t \in (0, 2n+3) \}. \tag{4.4} \]
If $v \in C_{2n-1}$ is any gap-parameterization vector, then, in light of (4.1), (4.2), and the fact that
$f_{\lambda_v}$ achieves maxima and minima only at integer values, we have that
$f_{\lambda_v}(0) - 1 \le f_{\lambda_v}(t) \le f_{\lambda_v}(2n-1) + 1$ for $t \in [0, 2n-1]$ if and only if $v$ is a bidirectional
gerrymander. Furthermore, if $v$ is a vertex of the cube $C_{2n-1}$, then $\alpha(v)$ is a vertex of
$C_{2n+3} = [0,1]^{2n+3}$ so that $f_{\lambda_{\alpha(v)}}$ takes integers to integers. For any $v \in C_{2n-1}$ we have
$f_{\lambda_{\alpha(v)}}(k+2) = f_{\lambda_v}(k) + 2$ for $0 \le k \le 2n-1$, $f_{\lambda_{\alpha(v)}}(i) = i$ for $i = 0, 1, 2$, and
$f_{\lambda_{\alpha(v)}}(2n+1+i) = f_{\lambda_{\alpha(v)}}(2n+1) + i$ for $i = 1, 2$. Thus if
$v$ is a vertex of $C_{2n-1}$ then $f_{\lambda_{\alpha(v)}}(0) < f_{\lambda_{\alpha(v)}}(t) < f_{\lambda_{\alpha(v)}}(2n+3)$ for all $t \in (0, 2n+3)$ if and
only if $v \in Q_n$. It follows then that, since $\lambda_{\alpha(v)} \in \{\pm 1\}^{2n+3}$ when $v \in Q_n$, we indeed have that
$f_{\lambda_{\alpha(v)}} \in F$, and so the map in (4.3) does indeed take $Q_n$ to graphs of bidirectional ballot sequences
in $B_{2n+3}$.
Injectivity of the map is clear. To show that the map is surjective, we provide an inverse. For
a bidirectional ballot sequence $b = b_1 \cdots b_{2n+3}$ of length $2n+3$, we define the vector
$w_b = [w_1, \dots, w_{2n-1}]$, where
\[
w_j := \begin{cases} 1 & \text{if } j \equiv b_{j+2} \pmod{2} \\ 0 & \text{if } j \not\equiv b_{j+2} \pmod{2}. \end{cases}
\tag{4.5}
\]
It is easily verified that the graph of $f_{\lambda_{\alpha(w_b)}}$ is the one associated to $b$. Moreover, the two statements
directly following (4.4) imply that, since $w_b \in \{0,1\}^{2n-1}$ and the graph of $f_{\lambda_{\alpha(w_b)}}$ is that of a
bidirectional ballot sequence, we must have that $w_b \in Q_n$. It is clear that this map is both a right- and
left-inverse of the map given by (4.3).
We now give the second correspondence. Let $I_n$ denote the interior of $\mathcal{B}_n$ in $\mathbb{R}^{2n-1}$. Let
$T_n = I_n \cap Q_n$, i.e. those vertices of $P_n$ in the interior of $\mathcal{B}_n$.
Corollary 4.7. $T_n$ is in bijection with $B_{2n-1}$, induced by the map
\[ v \mapsto f_{\lambda_v}. \tag{4.6} \]
Proof. The proof here is essentially the same as that of Theorem 4.5. The point here is that, when
$v \in T_n$, we already have $f_{\lambda_v}(0) < f_{\lambda_v}(t) < f_{\lambda_v}(2n-1)$ for all $0 < t < 2n-1$, following similar
reasoning as in the statements directly following (4.4).
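Both correspondences can be confirmed by exhaustive search for small $n \ge 2$. The Python sketch below is an illustration with hypothetical helper names; it uses the fact that a cube vertex lying in $P_n$ is automatically a vertex of $P_n$, so $Q_n$ is the set of 0-1 vectors satisfying the ballot inequalities and $T_n$ is the subset satisfying them strictly.

```python
from itertools import product

def is_bbs(bits):
    """Every non-empty prefix and suffix has strictly more 1s than 0s."""
    def ok(seq):
        balance = 0
        for b in seq:
            balance += 1 if b == 1 else -1
            if balance <= 0:
                return False
        return True
    return ok(bits) and ok(bits[::-1])

def ballot_sums(v):
    """The dot products v . w for all ballot vectors w, as running alternating sums."""
    def side(u):
        total, out = 0, []
        for i in range(0, len(u) - 1, 2):
            total += u[i] - u[i + 1]
            out.append(total)
        return out
    return side(v) + side(v[::-1])

def bits_of(v):
    """Binary sequence of the path with slopes (-1)^(i-1) (2 v_i - 1), for a 0-1 vector v."""
    return tuple(x if i % 2 == 0 else 1 - x for i, x in enumerate(v))

def alpha(v):
    """Definition 4.4."""
    return (1, 0) + tuple(v) + (0, 1)

def check(n):
    """Return (Theorem 4.5 holds, Corollary 4.7 holds) for this n, by brute force."""
    m = 2 * n - 1
    cube = list(product((0, 1), repeat=m))
    Q = [v for v in cube if all(s >= 0 for s in ballot_sums(v))]
    T = [v for v in cube if all(s > 0 for s in ballot_sums(v))]
    def bbs(k):
        return {b for b in product((0, 1), repeat=k) if is_bbs(b)}
    return ({bits_of(alpha(v)) for v in Q} == bbs(2 * n + 3),
            {bits_of(v) for v in T} == bbs(2 * n - 1))

print(check(2), check(3))
```

Since `bits_of` and `alpha` are injective, equality of the image with the set of BBSs is the same as the map being a bijection.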
Lastly, we use these correspondences along with our previous analysis of $P_n$ and its translates
to obtain the growth rate in [Zh1].
Corollary 4.8. For $\ell$ odd,
\[ B_\ell \ge \frac{2^\ell}{16(\ell - 4)}. \tag{4.7} \]
Proof. Suppose $m = 2n-1$. By Theorem 4.5, we know that the vertices of $P_n$ are in bijection
with $B_{m+4}$. From Corollary 2.8, we know that every vertex of $C_{2n-1}$ is contained in $P_\sigma$ for some
$\sigma \in \mathbb{Z}_m$. Since there are $m$ such copies of $P$, we get that
\[ m B_{m+4} \ge 2^m. \tag{4.8} \]
Let $\ell = m + 4$. Then by rearrangement we get
\[ B_\ell \ge \frac{2^\ell}{16(\ell - 4)}. \tag{4.9} \]
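Corollary 4.9. For $\ell$ odd,
\[ B_\ell \le \frac{2^\ell}{\ell}. \tag{4.10} \]
Proof. Suppose $\ell = 2n-1$. From Corollary 4.7, we know that the vertices of $P_n$ which are in the
interior of $\mathcal{B}_n$, namely $T_n$, are in bijection with $B_\ell$. Since the interiors of $\mathcal{B}_{\sigma_1}$ and $\mathcal{B}_{\sigma_2}$ are disjoint
if $\sigma_1 \ne \sigma_2$, we have that $\sigma_1(T_n) \cap \sigma_2(T_n) = \emptyset$ for $\sigma_1 \ne \sigma_2$. Therefore, summing over all the
vertices in $\sigma(T_n)$ for each $\sigma \in \mathbb{Z}_\ell$, we at most get every vertex of the cube once. That is,
\[ \ell B_\ell \le 2^\ell. \tag{4.11} \]
Rearranging yields
\[ B_\ell \le \frac{2^\ell}{\ell}. \tag{4.12} \]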
Corollary 4.10. For all $\ell$, the growth rate of $B_\ell$ is $\Theta(2^\ell/\ell)$.
Proof. By Corollaries 4.8 and 4.9, we know that for $\ell$ odd, the growth rate is $\Theta(2^\ell/\ell)$. The only
additional insight needed is that for all $\ell$, $B_{\ell+1} \ge B_\ell$. To see this, note that given a BBS of length
$\ell$, by appending a 1 to the end of it, we obtain a BBS of length $\ell + 1$. Thus up to fixed constants,
the inequalities in Corollaries 4.8 and 4.9 are correct for $\ell$ even as well. Thus, for all $\ell$, $B_\ell$ grows
like $\Theta(2^\ell/\ell)$.
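The two bounds and the monotonicity used in the proof can be compared with brute-force counts for small lengths. This Python sketch uses hypothetical names, counts BBSs by enumeration, and is only feasible for short sequences; it checks the inequalities in exact integer arithmetic.

```python
from itertools import product

def count_bbs(n):
    """Brute-force number of bidirectional ballot sequences of length n."""
    def ok(seq):
        balance = 0
        for b in seq:
            balance += 1 if b == 1 else -1
            if balance <= 0:
                return False
        return True
    return sum(ok(b) and ok(b[::-1]) for b in product((0, 1), repeat=n))

def bounds_hold(ell):
    """Check 16 (l - 4) B_l >= 2^l and l B_l <= 2^l for an odd length l >= 5."""
    b = count_bbs(ell)
    return 16 * (ell - 4) * b >= 2 ** ell and ell * b <= 2 ** ell

for ell in range(5, 14, 2):
    print(ell, count_bbs(ell), bounds_hold(ell))
```

For $\ell = 5$ the lower bound is attained: $B_5 = 2 = 2^5/16$.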
5. CONCLU SI ON
Our methods reveal a rich combinatorial structure underlying bidirectional ballot sequences.
In previous papers on BBSs ([Zh1], [BP], [HHPW]), analytic techniques were used to obtain
asymptotics, but our techniques reveal a geometric interpretation for the $\Theta(2^n/n)$ growth rate.
Interestingly, in the final section of [Zh1], Zhao states without detailed proof that $nB_n/2^n$ goes
to $1/4$, but describes his proof as “calculation-heavy”. He then posits that “[t]here should be some
natural, combinatorial explanation, perhaps along the lines of grouping all possible walks into
orbits of size mostly $n$ under some symmetry, so that almost every orbit contains exactly one walk
with the desired property.” Zhao’s statement is strikingly similar to the ideas presented in our
paper. Though we have made some effort, we have not been able to derive that $nB_n/2^n \to 1/4$
using the techniques of our paper, but we feel that there is hope for such a proof.
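The behaviour of $nB_n/2^n$ can at least be observed numerically. The sketch below is our own illustration and is not a method taken from [Zh1] or from this paper. It views a binary sequence as a walk with steps $\pm 1$ and runs a dynamic program over (height, running maximum) states, using the observation that a positive walk ends at a strict maximum exactly when its last step is an up-step taken from a weak maximum. The function names are hypothetical.

```python
from collections import defaultdict


def bbs_counts(max_len):
    """Return [B_1, ..., B_max_len], where B_n is the number of BBSs of length n."""
    counts = []
    # states[(h, m)] = number of walks so far that stay >= 1 after the first
    # step, currently sit at height h, and have running maximum m.
    states = {(0, 0): 1}
    for _ in range(max_len):
        # A BBS of the next length is a walk at its running maximum
        # followed by one final up-step.
        counts.append(sum(c for (h, m), c in states.items() if h == m))
        nxt = defaultdict(int)
        for (h, m), c in states.items():
            nxt[(h + 1, max(m, h + 1))] += c  # up-step (a 1)
            if h - 1 >= 1:                    # down-step (a 0) must stay positive
                nxt[(h - 1, m)] += c
        states = nxt
    return counts


def scaled_ratios(max_len):
    """Return [n * B_n / 2^n for n = 1, ..., max_len]."""
    return [n * b / 2 ** n for n, b in enumerate(bbs_counts(max_len), start=1)]


if __name__ == "__main__":
    ratios = scaled_ratios(120)
    for n in (10, 20, 40, 80, 120):
        print(n, ratios[n - 1])
```

The number of states grows quadratically in the length, so this runs comfortably for lengths in the low hundreds, well beyond what exhaustive enumeration allows.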
The second, more general takeaway from this paper is the potential for the ideas originally
presented in [MP]. The ideas in this paper in fact evolved from the ideas in [MP]. In passing to
the continuous setting, several additive number theory and combinatorial problems reveal a rich
structure which was not otherwise visible. We believe that there is even greater potential still in
such ideas and techniques.
REFERENCES
[AGMML] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic local alignment search tool, J.
Molecular Biology 215.3 (1990), 403-410.
[BP] M. Bousquet-Mélou and Y. Ponty, Culminating Paths, Discrete Math. Theor. Comput. Sci. 10.2 (2008),
125-152.
[BT] D. Bertsimas and J. N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, Belmont, MA,
1997.
[CR] A. Califano and I. Rigoutsos, Flash: A fast look-up algorithm for string homology, In Proceedings of the
1st International Conference on Intelligent Systems for Molecular Biology, pages 56-64. AAAI Press,
1993.
[FGK] P. Di Francesco, E. Guitter, and C. Kristjansen, Integrable 2D Lorentzian gravity and random walks,
Nuclear Phys. B 567.3 (2000), 515-553.
[HHPW] B. Hackl, C. Heuberger, H. Prodinger, and S. Wagner, Analysis of Bidirectional Ballot Sequences and
Random Walks Ending in their Maximum, Ann. Comb. 20 (2016), 775-797.
[HM] P. Hegarty and S. J. Miller, When almost all sets are difference dominated, Random Structures Algorithms
35.1 (2009), 118-136.
[MO] G. Martin and K. O’Bryant, Many sets have more sums than differences, Additive combinatorics, 287-
305, CRM Proc. Lecture Notes 43, Amer. Math. Soc., Providence, RI, 2007.
[MP] S. J. Miller and C. Peterson, A geometric perspective on the MSTD question, in preparation.
[PL] W. R. Pearson and D. J. Lipman, Improved tools for biological sequence comparison, Proceedings of the
National Academy of Sciences of the USA 85 (1988), 2444-2448.
[Zh1] Y. Zhao, Constructing MSTD sets using bidirectional ballot sequences, Journal of Number Theory 130.5
(2010), 1212-1220.
[Zh2] Y. Zhao, Sets characterized by missing sums and differences, Journal of Number Theory 131.11 (2011),
2107-2134.
E-mail address: sjm1@williams.edu, Steven.Miller.MC.96@aya.yale.edu
DEPARTMENT OF MATHEMATICS AND STATISTICS, WILLIAMS COLLEGE, WILLIAMSTOWN, MA 01267
E-mail address: carstenp@umich.edu
DEPARTMENT OF MATHEMATICS, UNIVERSITY OF MICHIGAN, ANN ARBOR, MI 48109
E-mail address: csprun@umich.edu
DEPARTMENT OF MATHEMATICS, UNIVERSITY OF MICHIGAN, ANN ARBOR, MI 48109
E-mail address: rpeski@princeton.edu
DEPARTMENT OF MATHEMATICS, PRINCETON UNIVERSITY, PRINCETON, NJ 08544