arXiv:1203.1113v3 [math.PR] 29 May 2013
CYCLES AND EIGENVALUES OF SEQUENTIALLY GROWING
RANDOM REGULAR GRAPHS
TOBIAS JOHNSON AND SOUMIK PAL
Abstract. Consider the sum of d many iid random permutation matrices on
n labels along with their transposes. The resulting matrix is the adjacency ma-
trix of a random regular (multi)-graph of degree 2d on n vertices. It is known
that the distribution of smooth linear eigenvalue statistics of this matrix is
given asymptotically by sums of Poisson random variables. This is in contrast
with Gaussian fluctuation of similar quantities in the case of Wigner matrices.
It is also known that for Wigner matrices the joint fluctuation of linear eigen-
value statistics across minors of growing sizes can be expressed in terms of
the Gaussian Free Field (GFF). In this article we explore joint asymptotic (in
n) fluctuation for a coupling of all random regular graphs of various degrees
obtained by growing each component permutation according to the Chinese
Restaurant Process. Our primary result is that the corresponding eigenvalue
statistics can be expressed in terms of a family of independent Yule processes
with immigration. These processes track the evolution of short cycles in the
graph. If we now take d to infinity, certain GFF-like properties emerge.
1. Introduction
We consider graphs that have labeled vertices and are regular, i.e., every ver-
tex has the same degree. We allow our graphs to have loops and multiple edges
(such graphs are sometimes called multigraphs or pseudographs). Additionally, our
graphs will be sparse in the sense that the degree will be negligible compared to
the order. Every such graph has an associated adjacency matrix whose (i,j)th
element is the number of edges between vertices i and j, with loops counted twice.
When the graph is randomly selected, the matrix is random, and we are interested
in studying the eigenvalues of the resulting symmetric matrix. Note that, due to
regularity, it does not matter whether we consider the eigenvalues of the adjacency
or the Laplacian matrix.
The precise distribution of this random regular graph is somewhat ad hoc. We
will use what is called the permutation model. Consider the permutation digraphs
generated by d many iid random permutations on n labels. We remove the direction
of the edge and collapse all these graphs on one another. This results in a 2d-regular
graph on n vertices, denoted by G(n,2d). At the matrix level this is given by adding
all the d permutation matrices and their transposes.
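For concreteness, the permutation model is easy to simulate. The following Python sketch (ours, with illustrative names, not code from the paper) builds the adjacency matrix of G(n,2d) and checks 2d-regularity:

```python
import numpy as np

def permutation_model_adjacency(n, d, rng):
    """Adjacency matrix of G(n, 2d): the sum of d iid uniform n x n
    permutation matrices and their transposes; loops count twice."""
    A = np.zeros((n, n), dtype=int)
    for _ in range(d):
        sigma = rng.permutation(n)        # sigma[i] is the image of vertex i
        P = np.zeros((n, n), dtype=int)
        P[np.arange(n), sigma] = 1
        A += P + P.T
    return A

A = permutation_model_adjacency(100, 3, np.random.default_rng(0))
# Every row sums to 2d = 6: the resulting multigraph is 2d-regular.
assert (A.sum(axis=1) == 6).all()
```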
Date: May 29, 2013.
2000 Mathematics Subject Classification. 60B20, 05C80.
Key words and phrases. Random regular graphs, eigenvalue fluctuations, Chinese Restaurant
Process, minors of random matrices.
This research is partially supported by NSF grant DMS-1007563.
Our present work is an extension of the study of eigenvalue fluctuations carried
out in [DJPP12]. We are motivated by the recent work by Borodin on joint eigen-
value fluctuations of minors of Wigner matrices and the (massless or zero-boundary)
Gaussian Free Field (GFF) [Bor10a, Bor10b]. Eigenvalues of minors are closely re-
lated to interacting particle systems [Fer10, FF10], and the KPZ universality class
of random surfaces [BF08]. See [JN06] for more on eigenvalues of minors of GUE
and [ANvM11] for those of Dyson’s Brownian motion.
Let us consider a particular but important case of Borodin’s result in [Bor10a]
(single sequence, the entire N). An n × n real symmetric Wigner matrix has iid
upper triangular off-diagonal elements with four moments identical to the standard
Gaussian. The diagonal elements are usually taken to be iid with mean zero and
variance two. Notice that every principal submatrix (called a minor in this context)
of a Wigner matrix is again a Wigner matrix of a smaller order. Thus, on some
probability space one can construct an infinite order Wigner matrix W whose n×n
minor W(n) is a Wigner matrix of order n.
Let z be a complex number in the upper half plane H. Define y = |z|² and
x = 2ℜ(z). Consider the minor W(⌊ny⌋), and let N(z) be the number of its
eigenvalues that are greater than or equal to √n x. Define the height function

(1)  H_n(z) := √(π/2) N(z).

Then Borodin shows that {H_n(z) − EH_n(z), z ∈ H}, viewed as distributions,
converges in law to a generalized Gaussian process on H with a covariance kernel

(2)  C(z,w) = (1/2π) ln |(z − w̄)/(z − w)|.
The above is the covariance kernel for the GFF on the upper half plane.
An equivalent assertion is the following. Let [n] denote the set of integers
{1,2,...,n}. Consider the Chebyshev polynomials of the first kind, {T_n, n =
0,1,2,...}, on the interval [−1,1]. These polynomials are given by the identity
T_n(cos(θ)) ≡ cos(nθ). We specialize [Bor10a, Proposition 3] for the case of GOE
(β = 1). Fix m positive real numbers t_1 < t_2 < ... < t_m. In the notation of
[Bor10a], we take L = n and B_i(n) = [⌊t_i n⌋]. Then, for any positive integers
j_1, j_2, ..., j_m, the random vector

( tr T_{j_i}( W(⌊t_i n⌋)/2√(t_i n) ) − E tr T_{j_i}( W(⌊t_i n⌋)/2√(t_i n) ),  i ∈ [m] )

converges in law, as n tends to infinity, to a centered Gaussian vector. For s ≤ t,

(3)  lim_{n→∞} Cov( tr T_i( W(⌊tn⌋)/2√(tn) ), tr T_k( W(⌊sn⌋)/2√(sn) ) ) = δ_{ik} (k/2) (s/t)^{k/2},

which gives the covariance kernel of the limiting vector. In particular, all such co-
variances are zero when i ≠ k. Note that the traces can be expressed as integrals of
the height function of the corresponding submatrices. Thus, by approximating con-
tinuous compactly supported functions of z by a function that is piecewise constant
in y and polynomial in x, one gets the kernel (2).
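The defining identity T_n(cos θ) ≡ cos(nθ) is easy to verify numerically; the short sketch below (ours, not from the paper) evaluates the polynomials via the standard three-term recurrence:

```python
import numpy as np

def T(n, x):
    """Chebyshev polynomial of the first kind, via the three-term recurrence
    T_0 = 1, T_1 = x, T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)."""
    t_prev, t_cur = np.ones_like(x), x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
    return t_cur

theta = np.linspace(0, np.pi, 7)
for n in range(6):
    assert np.allclose(T(n, np.cos(theta)), np.cos(n * theta))
```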
1.1. Main results. By a tower of random permutations we mean a sequence of
random permutations (π^(n), n ∈ N) such that
(i) π^(n) is a uniformly distributed random permutation of [n] for each n, and
(ii) for each n, if π^(n) is written as a product of cycles, then π^(n−1) is derived
from π^(n) by deletion of the element n from its cycle.
The stochastic process that grows π^(n) from π^(n−1) by sequentially inserting an
element n at random is called the Chinese Restaurant Process (CRP). We will review
the basic principles in a later section. In [KOV04] and other related work, a se-
quence of permutations satisfying condition (ii) is called a virtual permutation, and
the distribution on virtual permutations satisfying condition (i) is considered as a
substitute for Haar measure on S(∞), the infinite symmetric group. This is used
to study the representation theory of S(∞), with connections to Random Matrix
Theory. A recent extension of this idea is [BNN11].
Now suppose we construct a countable collection {Π_d, d ∈ N} of towers of ran-
dom permutations. We will denote the permutations in Π_d by {π^(n)_j, 1 ≤ j ≤ d, n ∈ N}.
Then it is possible to model every possible G(n,2d) by adding the permutation matrices
(and their transposes) corresponding to {π^(n)_j, 1 ≤ j ≤ d}. In what follows we
will keep d fixed and consider n as a growing parameter. Thus, G_n will represent
G(n,2d) for some fixed d. Here and later, G_0 will represent the empty graph. We
construct a continuous-time version of this by inserting new vertices into G_n with
rate n + 1. Formally, define independent times T_i ∼ Exp(i), and let

M_t = max{ m : Σ_{i=1}^m T_i ≤ t },

and define the continuous-time Markov chain G(t) = G_{M_t}. When d = 1, this
process is essentially just a continuous-time version of the CRP itself. Though this
case is unusual compared to the rest (for example, G(t) is likely to be disconnected
when d = 1 and connected when d is larger), our results do still hold.

Our first result is about the process of short cycles in the graph process G(t).
By a cycle of length k in a graph, we mean what is sometimes called a simple cycle:
a walk in the graph that begins and ends at the same vertex, and that otherwise
repeats no vertices. We will give a more formal definition in Section 2.2. Let
(C^(s)_k(t), k ∈ N) denote the number of cycles of various lengths k that are present
in G(s + t). This process is not Markov, but nonetheless it converges to a Markov
process (indexed by t) as s tends to infinity.

To describe the limit, define

a(d,k) = (2d − 1)^k − 1 + 2d   when k is even,
a(d,k) = (2d − 1)^k + 1        when k is odd.

Consider the set of natural numbers N = {1,2,...} with the measure

µ(k) = (1/2)[a(d,k) − a(d,k − 1)],   k ∈ N,   a(d,0) := 0.

Consider a Poisson point process χ on N × [0,∞) with an intensity measure given
on N × (0,∞) by the product measure µ ⊗ Leb, where Leb is the Lebesgue measure,
and with additional masses of a(d,k)/(2k) on (k,0) for k ∈ N.

Let P̃_x denote the law of a one-dimensional pure-birth process on N given by
the generator

Lf(k) = k (f(k + 1) − f(k)),   k ∈ N,

starting from x ∈ N. This is also known as the Yule process.
Suppose we are given a realization of χ. For any atom (k,y) of the countably
many atoms of χ, we start an independent process (X_{k,y}(t), t ≥ 0) with law P̃_k.
Define the random sequence

N_k(t) := Σ_{(j,y) ∈ χ ∩ ([k]×[0,t])} 1{X_{j,y}(t − y) = k}.

In other words, at time t, for every site k, we count how many of the processes that
started at time y ≤ t at site j ≤ k are currently at k. Note that both (N_k(·), k ∈ N)
and (N_k(·), k ∈ [K]), for some K ∈ N, are Markov processes, while N_k(·) for fixed
k is not.

Theorem 1. As s → ∞, the process (C^(s)_k(t), k ∈ N, 0 ≤ t < ∞) converges in law,
in the product topology on D^∞[0,∞), to the Markov process (N_k(t), k ∈ N, 0 ≤
t < ∞). The limiting process is stationary.

Remark 2. In fact, the same argument used to prove Theorem 1 shows that
the process (C^(s)_k(t), −∞ < t < ∞) converges in law to the Markov process
(N_k(t), −∞ < t < ∞) running in stationarity. The same conclusion holds for
all the following theorems in this section.

We now explore the joint convergence across various d's. Define C^(s)_{d,k}(t) naturally,
stressing the dependence on the parameter d.
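The limiting object is straightforward to simulate. The following Python sketch (ours; the function names are illustrative) draws the immigration atoms of χ and runs an independent Yule process from each, producing one sample of (N_1(t), ..., N_K(t)) truncated at level K:

```python
import math
import random

def a(d, k):
    """a(d,k) from the text, with a(d,0) = 0."""
    if k == 0:
        return 0
    return (2 * d - 1) ** k - 1 + 2 * d if k % 2 == 0 else (2 * d - 1) ** k + 1

def mu(d, k):
    """Intensity mu(k) = (a(d,k) - a(d,k-1)) / 2."""
    return (a(d, k) - a(d, k - 1)) / 2

def poisson(lam, rng):
    # Knuth's method; adequate for the moderate rates used here.
    L, j, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return j
        j += 1

def yule(j, t, rng):
    """Pure-birth (Yule) process started from j, observed at time t."""
    state, clock = j, 0.0
    while True:
        clock += rng.expovariate(state)  # birth rate equals the current state
        if clock > t:
            return state
        state += 1

def sample_N(d, K, t, rng):
    """One sample of (N_1(t), ..., N_K(t)): Poisson immigration with mass
    a(d,j)/(2j) at time zero and rate mu(d,j) on (0,t], each immigrant
    running an independent Yule process from site j."""
    counts = [0] * (K + 1)
    for j in range(1, K + 1):
        starts = [0.0] * poisson(a(d, j) / (2 * j), rng)
        starts += [rng.uniform(0, t) for _ in range(poisson(mu(d, j) * t, rng))]
        for s in starts:
            x = yule(j, t - s, rng)
            if x <= K:
                counts[x] += 1
    return counts[1:]

print(sample_N(d=2, K=4, t=1.0, rng=random.Random(4)))
```

Since the Yule processes only move upward, starts above level K never contribute to the truncated counts, so the truncation is exact.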
Theorem 3. There is a joint process convergence of (C^(s)_{i,k}(t), k ∈ N, i ∈ [d], t ≥ 0)
to a limiting process (N_{i,k}(t), k ∈ N, i ∈ [d], t ≥ 0). This limit is a Markov process
whose marginal law for every fixed d is described in Theorem 1. Moreover, for
any d ∈ N, the process (N_{d+1,k}(·) − N_{d,k}(·), k ∈ N) is independent of the process
(N_{i,k}(·), k ∈ N, i ∈ [d]) and evolves as a Markov process. Its generator (defined on
functions dependent on finitely many coordinates) is given by

Lf(x) = Σ_{k=1}^∞ k x_k [f(x + e_{k+1} − e_k) − f(x)] + Σ_{k=1}^∞ ν(d,k) [f(x + e_k) − f(x)],

where x is a nonnegative sequence, (e_k, k ∈ N) is the canonical orthonormal basis
of ℓ², and

ν(d,k) = (1/2)[a(d + 1,k) − a(d + 1,k − 1) − a(d,k) + a(d,k − 1)].
Remark 4. Theorems 1 and 3 show an underlying branching process structure. We
actually prove a more general decomposition where cycles are tracked by edge labels.
The additive structure also imparts a natural intertwining relationship between the
Markov operators. See [CPY98, Section 2] and [DF90, Bor10a].
We now focus on eigenvalues of G(t). Note that there is no easy exact relationship
between the eigenvalues of G_n for various n, since the eigenvectors play a role in
determining any such identity. In fact, the eigenvalues of G_n and G_{n+1} need not
be interlaced. However, one can consider linear eigenvalue statistics for the graph
G(n,2d). That is, for any d-regular graph G on n vertices and function f : R → R,
define the random variable

tr f(G) := Σ_{i=1}^n f(λ_i),
where λ_1 ≥ ... ≥ λ_n are the eigenvalues of the adjacency matrix of G divided by
2(2d − 1)^{1/2}. The scaling is necessary to take a limit with respect to d.
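As an illustration, a linear eigenvalue statistic of a sample from the permutation model can be computed directly (a sketch under our naming, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 50, 2

# Adjacency matrix of a sample from the permutation model G(n, 2d).
A = np.zeros((n, n))
for _ in range(d):
    P = np.eye(n)[rng.permutation(n)]  # random n x n permutation matrix
    A += P + P.T

# Eigenvalues scaled by 2(2d-1)^{1/2}, as in the text.
lam = np.linalg.eigvalsh(A) / (2 * np.sqrt(2 * d - 1))

def tr_f(f):
    """Linear eigenvalue statistic tr f(G) = sum_i f(lambda_i)."""
    return sum(f(x) for x in lam)

# Sanity check: tr of the identity polynomial is trace(A) up to the scaling.
assert np.isclose(tr_f(lambda x: x), np.trace(A) / (2 * np.sqrt(2 * d - 1)))
```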
By a polynomial basis we refer to a sequence of polynomials {f_0 ≡ 1, f_1, f_2, ...}
such that f_k is a polynomial of degree k in a single real variable. In the
statement below, [∞] will refer to N.

Theorem 5. There exists a polynomial basis {f_i, i ∈ N} (depending on d) such
that for any K ∈ N ∪ {∞}, the process (tr f_k(G(s + t)), k ∈ [K], t ≥ 0) converges
in law, as s tends to infinity, to the Markov process (N_k(t), k ∈ [K], t ≥ 0)
of Theorem 1. (The polynomials are given explicitly in (15).) Hence, for any
polynomial f, the process (tr f(G(s + t))) converges to a linear combination of the
coordinate processes of (N_k(t), k ∈ N).

The Markov property is especially intriguing since, to the best of our knowledge,
no similar property of eigenvalues of the standard Random Matrix ensembles is
known. For the special case of minors of the Gaussian Unitary/Orthogonal Ensem-
bles, the entire distribution of eigenvalues across minors of various sizes does satisfy
a Markov property. However, this is facilitated by the known symmetry properties
of the eigenvectors, and does not extend to other examples of Wigner matrices.

For our final result we will take d to infinity. We will make the following no-
tational convention: for any polynomial f, we will denote the limiting process of
(tr f(G(s + t)), t ≥ 0) by (tr f(G(∞ + t)), t ≥ 0). Recall that this process is a
linear combination of (N_k(t), k ∈ N, t ≥ 0).

Theorem 6. Let {T_k, k ∈ N} denote the Chebyshev orthogonal polynomials of the
first kind on [−1,1]. As d tends to infinity, the collection of processes

(tr T_k(G(∞ + t)) − E tr T_k(G(∞ + t)), t ≥ 0, k ∈ N)

converges weakly in D^∞[0,∞) to a collection of independent Ornstein-Uhlenbeck
processes (U_k(t), t ≥ 0, k ∈ N), running in equilibrium. Here the equilibrium dis-
tribution of U_k is N(0, k/2), and U_k satisfies the stochastic differential equation

dU_k(t) = −k U_k(t) dt + k dW_k(t),   t ≥ 0,

where (W_k, k ∈ N) are iid standard one-dimensional Brownian motions.

Thus, the collection of random variables (tr T_k(G(∞ + t)) − E tr T_k(G(∞ + t))),
indexed by k and t, converges as d tends to infinity to a centered Gaussian process
with covariance kernel given by

(4)  lim_{d→∞} Cov(tr T_i(G(∞ + t)), tr T_k(G(∞ + s))) = δ_{ik} (k/2) e^{k(s−t)},

for s ≤ t.
A comparison of (4) with Borodin’s result (3) shows that the above limit captures
a key property of the GFF covariance structure. The appearance of the exponential
is merely due to a deterministic time-change of the process. A somewhat more
detailed discussion can be found in the following section.
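As a quick numerical illustration (ours, not part of the paper), the stationary variance k/2 of the SDE in Theorem 6 can be checked with an Euler-Maruyama discretization:

```python
import math
import random

rng = random.Random(5)
k, dt, steps = 2, 0.01, 200_000

# Euler-Maruyama for dU = -k U dt + k dW, started from the stationary N(0, k/2).
u = rng.gauss(0.0, math.sqrt(k / 2))
second_moment = 0.0
for _ in range(steps):
    u += -k * u * dt + k * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    second_moment += u * u

est = second_moment / steps
assert 0.7 < est < 1.3  # stationary variance k/2 = 1, up to sampling error
```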
Remark 7. A common model for random regular graphs is the configuration model
or pairing model (see [Wor99] for more information). The model is defined as
follows: Start with n buckets, each containing d prevertices. Then, separate these
dn prevertices into pairs, choosing uniformly from every possible pairing. Finally,
collapse each bucket into a single vertex, making an edge between one vertex and
another if a prevertex in one bucket is paired with a prevertex in the other bucket.
This model has the advantage that choosing a graph from it conditional on it
containing no loops or parallel edges is the same as choosing a graph uniformly
from the set of graphs without loops and parallel edges. The model also allows for
graphs of odd degrees, unlike the permutation model.
It is possible to construct a process of growing random regular graphs simi-
lar to the one in this paper using a dynamic version of this model. Given some
initial pairing of prevertices labeled {1,...,dn}, extend it to a random pairing of
{1,...,dn+2} by the following procedure: Choose X uniformly from {1,...,dn+1}.
Pair dn+2 with X. If X = dn+1, leave the other pairs unchanged; if not, pair the
previous partner of X with dn + 1. This is an analogue of the CRP in the setting
of random pairings, in that if the initial pairing is uniformly chosen, then so is the
extended one.
If d is odd, we repeat this procedure a total of d times to extend a random
d-regular graph on n vertices to have n + 2 vertices (when d is odd, the number of
vertices in the graph must be even). When d is even, repeat d/2 times to add one
new vertex to a random graph. In this way, we can construct a sequence of growing
random regular graphs. We believe that all the results of this paper hold in this
model with minor changes, with similar proofs.
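The extension step described above can be sketched as follows (our code and naming, not the paper's; the pairing is stored as an involution):

```python
import random

def extend_pairing(pairing, rng):
    """Extend a perfect matching of {1, ..., m} to one of {1, ..., m+2}:
    pair m+2 with a uniform X in {1, ..., m+1}; if X <= m, re-pair the
    previous partner of X with m+1."""
    m = len(pairing)
    x = rng.randint(1, m + 1)
    new = dict(pairing)
    if x == m + 1:
        new[m + 1], new[m + 2] = m + 2, m + 1
    else:
        y = new[x]                       # previous partner of X
        new[x], new[m + 2] = m + 2, x
        new[y], new[m + 1] = m + 1, y
    return new

rng = random.Random(3)
pairing = {1: 2, 2: 1, 3: 4, 4: 3}
for _ in range(5):
    pairing = extend_pairing(pairing, rng)

# Still a perfect matching: a fixed-point-free involution of {1, ..., 14}.
assert all(pairing[pairing[i]] == i and pairing[i] != i for i in pairing)
```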
1.2. Existing literature. The study of the spectral properties of sparse regu-
lar random graphs is motivated by several different problems. These matrices do
not fall within the purview of the standard techniques of Random Matrix Theory
(RMT) due to their sparsity and lack of independence between entries. However, ex-
tensive simulations ([JMRR99]) point to conjectures that these matrices still belong
to the universality class of random matrices. For example, it is conjectured via simu-
lations ([MNS08]) that the distribution of the second largest eigenvalue (in absolute
value) is given by the Tracy-Widom distribution. In the physics literature, eigen-
values of random regular graphs have been considered as a toy model of quantum
chaos ([Smi10], [OGS09], [OS10]). Simulations suggest that the eigenvalue spacing
distribution has the same limit as that of the Wigner matrices. A limiting Gaussian
wave character of the eigenvectors has also been conjectured ([Elo08, Elo10, ES10]).
Some fine properties of eigenvalues and eigenvectors can indeed be proved for a
single permutation matrix; see [Wie00] and [BAD11].
Somewhat complicating the matter is the fact that when the degree d is kept fixed
and we let n go to infinity, several classical results about random matrix ensembles
fail. A bit more elaboration on this point is needed. The two parameters in the
ensemble of random graphs are the degree d and the order n. In the permutation
model it is possible to construct random regular graphs for every possible value of
(d,n) where d is an even positive integer and n is any positive integer. Hence one can
consider various kinds of limits of these parameters. We will refer to the procedure
of having a sequence of (d,n) where both parameters simultaneously go to infinity
as the diagonal limit. To maintain sparsity, it is usually assumed that d is at most
poly-logarithmic in n. (The non-sparse case can typically be absorbed within the
standard techniques of RMT by comparison with a corresponding Erdős-Rényi graph
whose adjacency matrix has independent entries.) No lower bound on the growth
rate of d is assumed. However, results are often easier to prove when d is kept fixed
and we let n go to infinity. Suppose for each d one gets a limiting object (say, a
probability distribution); one can now take d to infinity and explore limits of the
sequence of these objects. We will refer to this procedure (lim_{d→∞} lim_{n→∞}) as the
triangular limit. The triangular limit is often identical to the diagonal limit,
irrespective of the sequence through which the diagonal limit is taken, while
maintaining sparsity.
Moreover, these limiting statistics frequently match with those of the GOE ensemble
and the real symmetric Wigner matrices. This is true, for example, for the empirical
spectral distribution [DP12, TVW13] and fluctuations of smooth linear eigenvalue
statistics [DJPP12].
Our present result is a triangular limit result. Let us first explain the connection
with the massless GFF. We follow Definition 2.12 and the first example in Section
2.5 of [She07]. Consider the space of smooth real functions compactly supported on
H with the Dirichlet inner product ⟨f,g⟩ = ∫_H ∇f · ∇g dz. We consider the comple-
tion of this pre-Hilbert space, and call it the Sobolev space H¹. The GFF can be
thought of as a random distribution h which associates with every f ∈ H¹ a mean
zero Gaussian random variable ⟨h,f⟩, and which is an L² isometry in the following
sense: Cov(⟨h,f⟩, ⟨h,g⟩) = ⟨f,g⟩. Now, one can perform the integration of a function f
against h by first integrating their traces over semicircular arcs of a fixed radius, and
then taking a further integral over the radius. Over the semicircular arcs, Fourier trans-
forms (or Chebyshev polynomials, for real functions) provide an orthogonal basis
for this Gaussian field. As one parametrizes the radius properly, one obtains inde-
pendent Ornstein-Uhlenbeck processes for each Chebyshev polynomial. Hence these
OU processes completely determine the GFF covariance structure. This explains
the word 'equivalent' on page 2, paragraph 4, and is the essence of the calculations
done in [Bor10a]. See also [Spo98] for a similar formalism for Dyson's Brownian
motion on the circle.

One of the reasons why we cannot prove a full GFF convergence is that the
parameters d and n behave independently of one another. The degree d determines
the support of the spectral distribution, [−2√(2d − 1), 2√(2d − 1)], asymptotically in-
dependent of n. For Wigner matrices, the dimension itself determines the length
of the spectral support. This results in the parametrization of (1). It should be
possible to extend our results to a GFF convergence either by letting d grow with n
in the graph, or by letting d grow with time for the limiting Poisson structure
in Theorem 1. However, this has not been attempted in the present article.
2. Preliminaries
2.1. A primer on the Chinese Restaurant Process. The CRP, introduced by
Dubins and Pitman, is a particular example of a two-parameter family of stochas-
tic processes that sequentially constructs random exchangeable partitions of the
positive integers via the cyclic decomposition of a random permutation. Our short
description is taken from [Pit06, Section 3.1].
An initially empty restaurant has an unlimited number of circular tables num-
bered 1,2,..., each capable of seating an unlimited number of customers. Cus-
tomers numbered 1,2,... arrive one by one and are seated at the tables according
to the following plan. Person 1 sits at table 1. For n ≥ 1 suppose that n customers
have already entered the restaurant, and are seated in some arrangement, with at
least one customer at each of the tables j for 1 ≤ j ≤ k (say), where k is the
number of tables occupied by the first n customers to arrive. Let customer n + 1
choose with equal probability to sit at any of the following n+1 places: to the left
of customer j for some 1 ≤ j ≤ n, or alone at table k +1. Define π(n): [n] → [n] as
Figure 1. A cycle whose word is the equivalence class of π2π1⁻¹π2π1π2π3⁻¹ in W_6/D_12.
the permutation whose cyclic decomposition is given by the tables; that is, if after
n customers have entered the restaurant, customers i and j are seated at the same
table, with i to the left of j, then π(n)(i) = j, and if customer i is seated alone
at some table then π(n)(i) = i. The sequence (π(n)) then has features (i) and (ii)
mentioned in the first paragraph of Section 1.1.
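The seating rule above can be sketched in a few lines of Python (code and names are ours); the frequency test reflects property (i), that each π^(n) is uniform:

```python
import random

def crp_permutation(n, rng):
    """Grow a uniform random permutation of [n] by the CRP seating rule:
    customer m sits to the left of a uniformly chosen earlier customer,
    or alone at a new table, each of the m options being equally likely."""
    pi, inv = {}, {}                 # pi maps each customer to its right neighbor
    for m in range(1, n + 1):
        choice = rng.randint(1, m)
        if choice == m:              # sit alone at a new table
            pi[m], inv[m] = m, m
        else:                        # sit to the left of customer j
            j = choice
            i = inv[j]               # i currently sits to the left of j
            pi[i], inv[m] = m, i
            pi[m], inv[j] = j, m
    return pi

p = crp_permutation(5, random.Random(0))
assert sorted(p) == sorted(p.values()) == [1, 2, 3, 4, 5]   # a bijection on [5]
```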
2.2. Combinatorics on words. The graph G_n, formed from the independent
permutations π^(n)_1, ..., π^(n)_d, can be considered as a directed, edge-labeled graph in
a natural way. For convenience, drop superscripts and let π_l = π^(n)_l. If π_l(i) = j,
then by definition G_n contains an edge between i and j. When convenient, we
consider this edge to be directed from i to j and to be labeled by π_l.

Consider a walk on G_n, viewed in this way, and imagine writing down the label
of each edge as it is traversed, putting π_i or π_i⁻¹ according to the direction we walk
over the edge. We call a walk closed if it starts and ends at the same vertex, and
we call a closed walk a cycle if it never visits a vertex twice (besides the first and
last one), and it never traverses an edge more than once in either direction. Thus
the word w = w_1···w_k formed as a cycle is traversed is cyclically reduced, i.e.,
w_i ≠ w_{i+1}⁻¹ for all i, considering i modulo k. For example, following an edge and
then immediately backtracking does not form a 2-cycle, and the word formed by
this walk is π_i π_i⁻¹ or π_i⁻¹ π_i for some i, which is not cyclically reduced. We consider
two cycles equivalent if they are both walks on an identical set of edges; that is, we
ignore the starting vertex and the direction of the walk.

Let W_k denote the set of cyclically reduced words of length k. We would like to
associate each k-cycle in G_n with the word in W_k formed by the above procedure,
but since we can start the walk at any point in the cycle and walk in either of
two directions, there are actually up to 2k different words that could be formed by
it. Thus we identify elements of W_k that differ only by rotation and inversion (for
example, π1π2⁻¹π1π2 and π1⁻¹π2π1⁻¹π2⁻¹), and denote the resulting set by W_k/D_{2k},
where D_{2k} is the dihedral group acting on the set W_k in the natural way.

Definition 8 (Properties of words). For any k-cycle in G_n, the element of W_k/D_{2k}
given by walking around the cycle is called the word of the cycle (see Figure 1).
For any word w, let |w| denote the length of w. Let h(w) be the largest number
m such that w = u^m for some word u. If h(w) = 1, we call w primitive. For any
w ∈ W_k, the orbit of w under the action of D_{2k} contains 2k/h(w) elements, a fact
which we will frequently use. Let c(w) denote the number of pairs of double letters
in w, i.e., the number of integers i modulo |w| such that w_i = w_{i+1}. For example,
c(π1π1π2⁻¹π2⁻¹π1) = 3. We will also consider |·|, h(·), and c(·) as functions on
W_k/D_{2k}, since they are invariant under cyclic rotation and inversion.

To more easily refer to words in W_k/D_{2k}, choose some canonical representative
w_1···w_k ∈ W_k for every w ∈ W_k/D_{2k}. Based on this, we will often think of
elements of W_k/D_{2k} as words instead of equivalence classes, and we will make
statements about the ith letter of a word in W_k/D_{2k}. For w = w_1···w_k ∈ W_k/D_{2k},
let w^(i) refer to the word in W_{k+1}/D_{2k+2} given by w_1···w_i w_i w_{i+1}···w_k. We refer
to this operation as doubling the ith letter of w. A related operation is to halve a
pair of double letters, for example producing π1π2π3π4 from π1π2π3π4π1. (Since
we apply these operations to words identified with their rotations, we do not need
to be specific about which letter of the pair is deleted.) The following technical
lemma underpins most of our combinatorial calculations.
Lemma 9. Let u ∈ W_k/D_{2k} and w ∈ W_{k+1}/D_{2k+2}. Suppose that a letters in u
can be doubled to form w, and b pairs of double letters in w can be halved to form
u. Then

a/h(u) = b/h(w).

Remark 10. At first glance, one might expect that a = b. The example u =
π1π2π1π1π2 and w = π1π1π2π1π1π2 shows that this is wrong, since only one letter
in u can be doubled to give w, but two different pairs in w can be halved to give u.

Proof. Let Orb(u) and Orb(w) denote the orbits of u and w under the action of
the dihedral group in W_k and W_{k+1}, respectively. When we speak of halving a
pair of letters in a word in Orb(w), always delete the second of the two letters (for
example, π1π2π1 becomes π1π2, not π2π1). When we double a letter in a word
in Orb(u), put the new letter after the doubled letter (for example, doubling the
second letter of π1π2⁻¹ gives π1π2⁻¹π2⁻¹, not π2⁻¹π1π2⁻¹).

For each of the 2k/h(u) words in Orb(u), there are a doubling operations yielding
a word in Orb(w). For each of the (2k + 2)/h(w) words in Orb(w), there are b
halving operations yielding a word in Orb(u). For every halving operation on a
word in Orb(w), there is a corresponding doubling operation on a word in Orb(u)
and vice versa, except for halving operations that straddle the ends of the word, as
in π1π2π1. There are 2b/h(w) of these, giving us

2ka/h(u) = (2k + 2)b/h(w) − 2b/h(w) = 2kb/h(w),

and the lemma follows from this. □
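The statistics h(·) and c(·), the dihedral orbits, and Lemma 9 itself can all be checked by brute force on the example of Remark 10 (a sketch with our naming; letters π_j and π_j⁻¹ are encoded as the integers j and −j):

```python
def rotations(w):
    return {tuple(w[i:] + w[:i]) for i in range(len(w))}

def orbit(w):
    """Dihedral orbit of a word: all rotations of w and of its inverse,
    where the inverse walk reverses the word and inverts each letter."""
    inv = [-x for x in reversed(w)]
    return rotations(list(w)) | rotations(inv)

def h(w):
    """Largest m such that w = u^m for some word u."""
    k = len(w)
    return max(m for m in range(1, k + 1)
               if k % m == 0 and list(w) == list(w[:k // m]) * m)

def c(w):
    """Number of i (mod |w|) with w_i = w_{i+1}."""
    k = len(w)
    return sum(w[i] == w[(i + 1) % k] for i in range(k))

def doublings(u, w):
    """Number of letters of u whose doubling lands in the class of w."""
    ow = orbit(list(w))
    return sum(tuple(u[:i + 1] + u[i:]) in ow for i in range(len(u)))

def halvings(w, u):
    """Number of pairs of double letters of w whose halving lands in the
    class of u (the second letter of the pair is the one deleted)."""
    ou, k, count = orbit(list(u)), len(w), 0
    for i in range(k):
        if w[i] == w[(i + 1) % k]:
            half = w[:i + 1] + w[i + 2:] if i + 1 < k else w[1:]
            count += tuple(half) in ou
    return count

u = [1, 2, 1, 1, 2]          # pi1 pi2 pi1 pi1 pi2
w = [1, 1, 2, 1, 1, 2]       # pi1 pi1 pi2 pi1 pi1 pi2
assert c([1, 1, -2, -2, 1]) == 3          # the example after Definition 8
assert len(orbit(u)) == 2 * len(u) // h(u)
a, b = doublings(u, w), halvings(w, u)
assert (a, b) == (1, 2)                   # a != b, as claimed in Remark 10
assert a * h(w) == b * h(u)               # Lemma 9: a/h(u) = b/h(w)
```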
Let W′ = ⋃_{k=1}^∞ W_k/D_{2k}, and let W′_K = ⋃_{k=1}^K W_k/D_{2k}. We will use the previous
lemma to prove the following technical property of the c(·) statistic.

Lemma 11. In the vector space with basis {q_w}_{w ∈ W′_K},

Σ_{w ∈ W′_{K−1}} Σ_{i=1}^{|w|} (1/h(w)) q_{w^(i)} = Σ_{w ∈ W′_K} (c(w)/h(w)) q_w.
Figure 2. The vertex 6 is inserted between vertices 2 and 3 in π1, causing the above cycle to grow. (Before: π1 = (1 2 3)(4 5), π2 = (1 5)(4 3)(2); after: π1 = (1 2 6 3)(4 5), π2 = (1 5)(4 3)(2 6).)
Proof. Fix some w ∈ W_k/D_{2k}, and let a(u) denote the number of letters of u that
can be doubled to give w, for any u ∈ W_{k−1}/D_{2k−2}. We need to prove that

Σ_{u ∈ W_{k−1}/D_{2k−2}} a(u)/h(u) = c(w)/h(w).

Let b(u) be the number of pairs in w that can be halved to give u. By Lemma 9,

Σ_{u ∈ W_{k−1}/D_{2k−2}} a(u)/h(u) = Σ_{u ∈ W_{k−1}/D_{2k−2}} b(u)/h(w),

and Σ_{u ∈ W_{k−1}/D_{2k−2}} b(u) = c(w). □
3. The process limit of the cycle structure
As the graph G(t) grows, new cycles form, which we can classify into two types.
Suppose a new vertex numbered n is inserted at time t, and this insertion creates a
new cycle. If the edges entering and leaving vertex n in the new cycle have the same
edge label, then the new cycle has “grown” from a cycle with one fewer vertex, as in
Figure 2. If the edges entering and leaving n in the cycle have different labels, then
the cycle has formed “spontaneously” as in Figure 3, rather than growing from a
smaller cycle. This classification will prove essential in understanding the evolution
of cycles in G(t).
Once a cycle comes into existence in G(t), it remains until a new vertex is inserted
into one of its edges. Typically, this results in the cycle growing to a larger cycle,
as in Figure 2. If a new vertex is simultaneously inserted into multiple edges of the
same cycle, the cycle is instead split into smaller cycles as in Figure 4. These new
cycles are spontaneously formed, according to the classification of new cycles given
in the previous paragraph. Tracking the evolution of these smaller cycles in turn,
we see that as the graph evolves, a cycle grows into a cluster of overlapping cycles.
However, it will follow from Proposition 19 that for short cycles, this behavior is
not typical. Thus in our limiting object, cycles will grow only into larger cycles.
Figure 3. A cycle forms "spontaneously" when the vertex 6 is inserted into the graph. (Before: π1 = (2 3 1)(4 5), π2 = (2 1 3 4 5); after: π1 = (2 3 1 6)(4 5), π2 = (2 1 3 4 6 5).)
Figure 4. The vertex 6 is inserted into the cycle in two different places in the same step, causing the cycle to split in two. Note that each new cycle would be classified as spontaneously formed. (Before: π1 = (1 2 3)(4 5), π2 = (1 5)(4 3)(2); after: π1 = (1 2 6 3)(4 5), π2 = (1 5 6)(4 3)(2).)
3.1. Heuristics for the limiting process. We give some estimates that will
motivate the definition of the limiting process in Section 3.2. This section is entirely
motivational, and we will not attempt to make anything rigorous.
Suppose that vertex n is inserted into G(t) at some time t. First, we consider
the rate at which cycles form spontaneously with some word w ∈ W_k/D_{2k}. There
are 2k/h(w) words in the orbit of w under the action of D_{2k}, and out of these,
2(k − c(w))/h(w) have nonequal first and last letters. For each such word u =
u_1···u_k, we can give a walk on the graph by starting at vertex n and following the
edges indicated by u, going from n to u_1(n) to u_2(u_1(n)) and so on. If this walk
happens to be a cycle, the condition u_1 ≠ u_k implies that it would be spontaneously
formed.
In a short interval ∆t when G(t) has n − 1 vertices, the probability that vertex
n is inserted is about n∆t. For any word u, the walk from vertex n generated by
u is a cycle with probability approximately 1/n, since after applying the random
permutations u_1, ..., u_k in turn, we will be left at an approximately uniform random
vertex. Any new spontaneous cycle formed with word w will be counted by one
of these walks, with u in the orbit of w, and it will be counted again by the walk
generated by u_k⁻¹···u_1⁻¹. The expected number of spontaneous cycles formed in a
short interval ∆t is then approximately

(1/h(w)) (k − c(w)) n∆t · (1/n) = (1/h(w)) (k − c(w)) ∆t.

Thus we will model the spontaneous formation of cycles with word w by a Poisson
process with rate (k − c(w))/h(w).
Next, we consider how often a cycle with word w ∈ W_k grows into a larger cycle.
Suppose that G(t) has n − 1 vertices, and that it contains a cycle on vertices
s_0, s_1, s_2, ..., s_{k−1} whose successive edges carry the letters w_1, w_2, w_3, ..., w_k of w.
When vertex n is inserted into the graph, the probability that it is inserted after
s_{i−1} in permutation w_i is 1/n. Thus, after a spontaneous cycle with word w has
formed, we can model the evolution of its word as a continuous-time Markov chain
where each letter is doubled with rate one.
3.2. Formal definition of the limiting process. Consider the measure µ on W′ given by

$$\mu(w) = \frac{|w| - c(w)}{h(w)}.$$
Consider a Poisson point process χ on W′×[0,∞) with an intensity measure given
by the product measure µ ⊗ Leb, where Leb refers to the Lebesgue measure. Each
atom (w,t) of χ represents a new spontaneous cycle with word w formed at time t.
Now, we define a continuous-time Markov chain on the countable space W′ governed by the following rates: from state w ∈ W_k/D_{2k}, jump with rate one to each of the k words in W_{k+1}/D_{2k+2} obtained by doubling a letter of w. If a word can be formed in more than one way by doubling a letter in w, then it receives a correspondingly higher rate. For example, from w = π_1π_1π_2, the chain jumps to π_1π_1π_1π_2 with rate two and to π_1π_1π_2π_2 with rate one. Let P̂_w denote the law of this process started from w ∈ W′.

Suppose we are given a realization of χ. For each atom (w, s) of the countably many atoms of χ, we start an independent process (X_{w,s}(t), t ≥ 0) with law P̂_w. Define the stochastic process

$$N_w(t) := \sum_{\substack{(u,s)\in\chi \\ s \le t}} \mathbf{1}\{X_{u,s}(t-s) = w\}.$$

Interpreting these processes as in the previous section, N_w(t) counts the number of cycles formed spontaneously at some time s ≤ t that have grown to have word w at time t.
The fact that the process exists is obvious since one can define the countably
many independent Markov chains on a suitable product space. The following lemma
establishes some of its key properties.
Lemma 12. Recall that $W'_L = \bigcup_{k=1}^L W_k/D_{2k}$. We have the following conclusions:

(i) For any L ∈ N, the stochastic process {(N_w(t), w ∈ W′_L), t ≥ 0} is a time-homogeneous Markov process with respect to its natural filtration, with RCLL paths.

(ii) Recall that for w ∈ W_k/D_{2k}, the element w^{(i)} ∈ W_{k+1}/D_{2k+2} is the word formed by doubling the ith letter of w. The generator for the Markov process {(N_w(t), w ∈ W′_L), t ≥ 0} acts on f at x = (x_w, w ∈ W′_L) by

$$\mathcal{L}f(x) = \sum_{w \in W'_L} \sum_{i=1}^{|w|} x_w\bigl[f(x - e_w + e_{w^{(i)}}) - f(x)\bigr] + \sum_{w \in W'_L} \frac{|w| - c(w)}{h(w)}\bigl[f(x + e_w) - f(x)\bigr],$$

where e_w is the canonical basis vector equal to one at entry w and equal to zero everywhere else. For a word u of length greater than L, take e_u = 0.

(iii) The product measure of Poi(1/h(w)) over all w ∈ W′_L is the unique invariant measure for this Markov process.
Proof. Conclusion (i) follows from the construction, as does conclusion (ii). To prove conclusion (iii), we start with the fundamental identity of the Poisson distribution: if X ∼ Poi(λ), then for any function g, we have

$$E\,Xg(X) = \lambda\,E\,g(X + 1). \qquad (5)$$

We need to show that if the coordinates of X = (X_w, w ∈ W′_L) are independent Poisson random variables with EX_w = 1/h(w), then

$$E\,\mathcal{L}f(X) = 0. \qquad (6)$$

Since the process is an irreducible Markov chain on a countable state space, the existence of one invariant distribution shows that the chain is positive recurrent and that the invariant distribution is unique.
To argue (6) we will repeatedly apply identity (5) to functions g constructed from f by keeping all but one coordinate fixed. Thus, for any w ∈ W′_L and 1 ≤ i ≤ |w|, we condition on all X_u with u ≠ w and hold those coordinates of f fixed to obtain

$$E\,X_w f(X - e_w + e_{w^{(i)}}) = \frac{1}{h(w)}\,E f(X + e_{w^{(i)}}),$$

taking e_{w^{(i)}} = 0 when |w| = L. In the same way,

$$E\,X_w f(X) = \frac{1}{h(w)}\,E f(X + e_w).$$
By these two equalities,

$$\begin{aligned} E\sum_{w \in W'_L}\sum_{i=1}^{|w|} X_w\bigl[f(X - e_w + e_{w^{(i)}}) - f(X)\bigr] &= \sum_{w \in W'_L}\sum_{i=1}^{|w|} \frac{1}{h(w)}\,E\bigl[f(X + e_{w^{(i)}}) - f(X + e_w)\bigr] \\ &= \sum_{w \in W'_{L-1}}\sum_{i=1}^{|w|} \frac{1}{h(w)}\,E f(X + e_{w^{(i)}}) + \sum_{w \in W_L/D_{2L}} \frac{|w|}{h(w)}\,E f(X) - \sum_{w \in W'_L} \frac{|w|}{h(w)}\,E f(X + e_w). \end{aligned}$$
Specializing Lemma 11 to q_w = E f(X + e_w), the first sum is

$$\sum_{w \in W'_{L-1}}\sum_{i=1}^{|w|} \frac{1}{h(w)}\,E f(X + e_{w^{(i)}}) = \sum_{w \in W'_L} \frac{c(w)}{h(w)}\,E f(X + e_w),$$

which gives us
$$E\sum_{w \in W'_L}\sum_{i=1}^{|w|} X_w\bigl[f(X - e_w + e_{w^{(i)}}) - f(X)\bigr] = \sum_{w \in W'_L} \frac{c(w) - |w|}{h(w)}\,E f(X + e_w) + \sum_{w \in W_L/D_{2L}} \frac{|w|}{h(w)}\,E f(X).$$
All that remains in proving (6) is to show that

$$\sum_{w \in W'_L} \frac{|w| - c(w)}{h(w)} = \sum_{w \in W_L/D_{2L}} \frac{|w|}{h(w)}.$$
Specializing Lemma 11 to q_w = 1 shows that $\sum_{w \in W'_L} c(w)/h(w) = \sum_{w \in W'_{L-1}} |w|/h(w)$. Thus

$$\sum_{w \in W'_L} \frac{|w| - c(w)}{h(w)} = \sum_{w \in W'_L} \frac{|w|}{h(w)} - \sum_{w \in W'_{L-1}} \frac{|w|}{h(w)} = \sum_{w \in W_L/D_{2L}} \frac{|w|}{h(w)},$$

establishing (6) and completing the proof. □
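The Poisson identity (5) used in this proof can be sanity-checked numerically, with an arbitrary test function g and a truncated Poisson sum (the choices of λ, g, and the truncation level below are ours):

```python
import math

def poisson_pmf(lam, n):
    # P[X = n] for X ~ Poi(lam)
    return math.exp(-lam) * lam ** n / math.factorial(n)

lam = 1.7
g = lambda n: (n + 1) ** 2            # arbitrary test function
N = 80                                 # truncation; the Poi(1.7) tail beyond 80 is negligible

lhs = sum(n * g(n) * poisson_pmf(lam, n) for n in range(N))         # E[X g(X)]
rhs = lam * sum(g(n + 1) * poisson_pmf(lam, n) for n in range(N))   # lam * E[g(X+1)]
assert abs(lhs - rhs) < 1e-9
```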
From now on, we will consider the process (N_w(t), w ∈ W′, t ≥ 0) to be running under stationarity, i.e., with marginal distributions given by conclusion (iii) of the last lemma. This process is easily constructed as described above, but with additional point masses of weight 1/h(w) for each w ∈ W′ at (w, 0) added to the intensity measure of χ, thus giving us the correct distribution at time zero.
3.3. Time-reversed processes. Fix some time T > 0. We define the time-reversal ←N_w(t) := N_w(T − t) for 0 ≤ t ≤ T.

Lemma 13. For any fixed L ∈ N, the process {(←N_w(t), w ∈ W′_L), 0 ≤ t ≤ T} is a time-homogeneous Markov process with respect to its natural filtration. A trivial modification at jump times renders RCLL paths. The transition rates of this chain are given as follows. Let u ∈ W_{k−1}/D_{2k−2} and w ∈ W_k/D_{2k}, and suppose that u can be obtained from w by halving b different pairs. Let x = (x_w, w ∈ W′_L).

(i) The chain jumps from x to x + e_u − e_w with rate bx_w.
(ii) The chain jumps from x to x − e_w with rate (k − c(w))x_w.
(iii) If w ∈ W_L/D_{2L}, then the chain jumps from x to x + e_w with rate L/h(w).

Proof. Any Markov process run backwards under stationarity is Markov. If the chain has transition rate r(x, y) from state x to state y, then the transition rate of the backwards chain from x to y is r(y, x)ν(y)/ν(x), where ν is the stationary distribution. We will let ν be the stationary distribution from Lemma 12(iii) and calculate the transition rates of the backwards chain, using the rates given in Lemma 12(ii).
Let a denote the number of letters in u that give w when doubled. The transition rate of the original chain from x + e_u − e_w to x is a(x_u + 1), so the transition rate of the backwards chain from x to x + e_u − e_w is

$$\frac{a(x_u + 1)\,\nu(x + e_u - e_w)}{\nu(x)} = \frac{a\,h(w)\,x_w}{h(u)},$$

and this is equal to bx_w by Lemma 9, proving (i). A similar calculation shows that the transition rate from x to x − e_w is

$$\frac{(k - c(w))\,\nu(x - e_w)}{h(w)\,\nu(x)} = (k - c(w))\,x_w,$$

proving (ii). The transition rate from x to x + e_w for w ∈ W_L/D_{2L} is

$$\frac{\nu(x + e_w)}{\nu(x)}\,(x_w + 1)L = \frac{L}{h(w)},$$

which completes the proof. □
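The reversal formula at the heart of this proof — that a stationary chain run backwards has rates r(y, x)ν(y)/ν(x) — can be illustrated on any small generator. The 3×3 generator below is an arbitrary non-reversible example of ours, not taken from the paper:

```python
import numpy as np

# An arbitrary irreducible generator (rows sum to zero), deliberately non-reversible.
Q = np.array([[-3.0, 2.0, 1.0],
              [0.5, -2.5, 2.0],
              [1.5, 1.0, -2.5]])

# Stationary law: normalized left null vector of Q.
w, v = np.linalg.eig(Q.T)
nu = np.real(v[:, np.argmin(np.abs(w))])
nu = nu / nu.sum()
assert np.allclose(nu @ Q, 0)

# Reversed-chain generator: Qhat[x, y] = Q[y, x] * nu[y] / nu[x].
Qhat = Q.T * nu[None, :] / nu[:, None]

assert np.allclose(Qhat.sum(axis=1), 0)   # a valid generator
assert np.allclose(nu @ Qhat, 0)          # with the same stationary law
assert not np.allclose(Qhat, Q)           # and Qhat differs from Q: Q is not reversible
```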
By definition,

$$\overleftarrow{N}_w(t) = \sum_{\substack{(u,s)\in\chi \\ s \le T-t}} \mathbf{1}\{X_{u,s}(T - t - s) = w\}.$$

We will modify this slightly to define the process

$$\overleftarrow{M}_w(t) := \sum_{\substack{(u,s)\in\chi \\ s \le T-t}} \mathbf{1}\{X_{u,s}(T - t - s) = w \text{ and } |X_{u,s}(T - s)| \le L\}.$$

The idea is that ←M_w(t) is the same as ←N_w(t), except that it does not count cycles at time t that had more than L vertices at time zero. The process (←M_w(t), w ∈ W′_L) is a Markov chain with the same transition rates as (←N_w(t), w ∈ W′_L), except that it does not jump from x to x + e_w for w ∈ W_L/D_{2L}. These two chains also have the same initial distribution, but (←M_w(t), w ∈ W′_L) is not stationary (in fact, it is eventually absorbed at zero).
4. Process convergence

Recall that C^{(s)}_k(t), the number of cycles of length k in the graph G(s + t), was defined on p. 3. For w ∈ W′, let C^{(s)}_w(t) be the number of cycles in G(s + t) with word w. We will prove that (C^{(s)}_w(·), w ∈ W′) converges to a distributional limit, from which the convergence of (C^{(s)}_k(·), k ∈ N) will follow. The proof depends on knowing the limiting marginal distribution of C^{(s)}_w(t). We provide this and more in the following theorem, which should be of independent interest:

Theorem 14. Let G_n = G(n, 2d), a 2d-regular random graph on n vertices from the permutation model. For any k, let I_k be the set of all cycles of length k on the complete graph K_n with edge labels that form a cyclically reduced word; these are the possible k-cycles that might appear in G_n. Let I = ∪_{k=1}^r I_k for some integer r. For any cycle α ∈ I, let I_α = 1{G_n contains α}, and let I = (I_α, α ∈ I). Let Z = (Z_α, α ∈ I) be a vector whose coordinates are independent Poisson random variables with EZ_α = 1/[n]_k for α ∈ I_k. Then for all d ≥ 2 and n, r ≥ 1,

$$d_{TV}(I, Z) \le \frac{c\,(2d-1)^{2r-1}}{n}$$

for some absolute constant c, where d_{TV}(X, Y) denotes the total variation distance between the laws of X and Y.
Corollary 15. Let {Z_w, w ∈ W′_K} be a family of independent Poisson random variables with EZ_w = 1/h(w). For any fixed integer K and d ≥ 1,

(i) as t → ∞, (C_w(t), w ∈ W′_K) converges in law to (Z_w, w ∈ W′_K);
(ii) as t → ∞, the probability that there exist two cycles of length K or less sharing a vertex in G(t) approaches zero.
We give the proofs in the appendix, along with some further discussion. Now,
we turn to the convergence of the processes.
Theorem 16. The process (C^{(s)}_w(·), w ∈ W′) converges in law as s → ∞ to (N_w(·), w ∈ W′).

Proof. The main difficulty in turning the intuitive ideas of Section 3.1 into an actual proof is that (C^{(s)}_w(t), w ∈ W′) is not Markov. We now sketch how we evade this problem. We will run our chain backwards, defining ←G_s(t) = G(s + T − t) for some fixed T > 0. Then, we ignore all of ←G_s(0) except for the subgraph consisting of cycles of size L and smaller, which we will call ←Γ_s(0). The graph ←Γ_s(t) is the evolution of this subgraph as time runs backward, ignoring the rest of ←G_s(t). Then, we consider the number of cycles with word w in ←Γ_s(t), which we call φ_w(←Γ_s(t)).

Choose K ≪ L. Then φ_w(←Γ_s(t)) is likely to be the same as C^{(s)}_w(T − t) for any word w with |w| ≤ K. The remarkable fact that makes φ_w(←Γ_s(t)) possible to analyze is that if ←Γ_s(0) consists of disjoint cycles, then (φ_w(←Γ_s(t)), w ∈ W′_L) is a Markov chain governed by the same transition rates as (←M_w(t), w ∈ W′_L).

Another important idea of the proof is to ignore the vertex labels in ←G_s(t), so that we do not know in what order the vertices will be removed. Thus we can view ←G_s(t) as a Markov chain with the following description: assign each vertex an independent Exp(1) clock. When the clock of vertex v goes off, remove it from the graph, and patch together the π_i-labeled edges entering and leaving v for each 1 ≤ i ≤ d.

Step 1. Definitions of ←Γ_s(t) and φ_w, and analysis of (φ_w(←Γ_s(t)), w ∈ W′_L).

Fix T > 0 and define ←G_s(t) = G(s + T − t). As mentioned above, we will consider ←G_s(t) only up to relabeling of vertices, which makes it a process on the countable state space consisting of all edge-labeled graphs on finitely many unlabeled vertices. With respect to its natural filtration, it is a Markov chain in which each vertex is removed with rate one, as described above.

To formally define ←Γ_s(t), fix integers L > K and let ←Γ_s(0) be the subgraph of ←G_s(0) made up of all cycles of length L or less. We then evolve ←Γ_s(t) in parallel with ←G_s(t). When a vertex v is deleted from ←G_s(t), the corresponding vertex v in ←Γ_s(t) is deleted if it is present. If v has a π_i-labeled edge entering and leaving it in ←Γ_s(t), then these two edges are patched together. Other edges in ←Γ_s(t) adjacent to v are deleted. This makes ←Γ_s(t) a subgraph of ←G_s(t), as well as a continuous-time Markov chain on the countable state space consisting of all edge-labeled graphs on finitely many unlabeled vertices. The transition probabilities of ←Γ_s(t) do not depend on s.

From Corollary 15, we can find the limiting distribution of ←Γ_s(0). Suppose that γ is a graph in the process's state space that is not a disjoint union of cycles. By Corollary 15(ii),

$$\lim_{s\to\infty} P[\overleftarrow{\Gamma}_s(0) = \gamma] = 0.$$

Suppose instead that γ is made up of disjoint cycles, with z_w cycles of word w for each w ∈ W′_L. By Corollary 15(i),

$$\lim_{s\to\infty} P[\overleftarrow{\Gamma}_s(0) = \gamma] = \prod_{w \in W'_L} P[Z_w = z_w], \qquad (7)$$

where (Z_w, w ∈ W′_L) are independent Poisson random variables with EZ_w = 1/h(w). Thus ←Γ_s(0) converges in law as s → ∞ to a limiting distribution supported on the graphs made up of disjoint unions of cycles. For different values of s, the chains ←Γ_s(t) differ only in their initial distributions, and the convergence in law of ←Γ_s(0) as s → ∞ induces the process convergence of {←Γ_s(t), 0 ≤ t ≤ T} to a Markov chain {←Γ(t), 0 ≤ t ≤ T} with the same transition rates whose initial distribution is the limit of ←Γ_s(0).

For any finite edge-labeled graph G, let φ_w(G) be the number of cycles in G with word w. By the continuous mapping theorem, the process (φ_w(←Γ_s(t)), w ∈ W′_L) converges in law to (φ_w(←Γ(t)), w ∈ W′_L) as s → ∞.

We will now demonstrate that this process has the same law as (←M_w(t), w ∈ W′_L). The graph ←Γ(t) consists of disjoint cycles at time t = 0, and as it evolves, these cycles shrink or are destroyed. The process (φ_w(←Γ(t)), w ∈ W′_L) jumps exactly when a vertex in a cycle in ←Γ(t) is deleted. If the deleted vertex lies in a cycle between two edges with the same label, the cycle shrinks. If the deleted vertex lies in a cycle between two edges with different labels, the cycle is destroyed. The only relevant consideration in where the process will jump at time t is the number of vertices of these two types in ←Γ(t), which can be deduced from (φ_w(←Γ(t)), w ∈ W′_L). Thus this process is a Markov chain.

Consider two words u, w ∈ W′_K such that w can be obtained from u by doubling a letter. Suppose that u can be obtained from w by halving any of b pairs of letters. Suppose that the chain is at state x = (x_v, v ∈ W′_L). There are bx_w vertices that when deleted cause the chain to jump from x to x − e_w + e_u, each of which is removed with rate one. Thus the chain jumps from x to x − e_w + e_u with rate bx_w. Similarly, it jumps to x − e_w with rate (|w| − c(w))x_w. These are the same rates as the chain (←M_w(t), w ∈ W′_L) from Section 3.3. The initial distribution given by (7) is also the same as that of (←M_w(t), w ∈ W′_L), demonstrating that the two processes (φ_w(←Γ(t)), w ∈ W′_L) and (←M_w(t), w ∈ W′_L) have the same law.

Step 2. Approximation of C^{(s)}_w(t) by φ_w(←Γ_s(t)).
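The shrink-or-destroy rule above acts on a cycle's word in a simple combinatorial way. The helper below uses our own encoding of a cycle as the cyclic tuple of its edge labels (signed integers, with −a the inverse of a):

```python
def delete_vertex(word, i):
    """Delete the vertex sitting between edge i and edge (i+1) mod k.

    If the two incident edges carry the same label, they are patched
    together and the cycle shrinks by one edge; otherwise the cycle is
    destroyed (returned as None)."""
    k = len(word)
    j = (i + 1) % k
    if word[i] == word[j]:
        return word[:j] + word[j + 1:]   # shrink: one copy of the doubled letter goes
    return None                           # destroy: the labels differ

# For the cycle with word (1, 1, 2): only the vertex between the two
# 1-labeled edges is a "shrinking" vertex.
assert delete_vertex((1, 1, 2), 0) == (1, 2)
assert delete_vertex((1, 1, 2), 1) is None
assert delete_vertex((1, 1, 2), 2) is None
```

Counting the positions i for which `delete_vertex` shrinks the cycle recovers c(w), the number of cyclically adjacent equal letter pairs, matching the destruction rate (|w| − c(w))x_w above.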
We will compare the two processes {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(φ_w(←Γ_s(t)), w ∈ W′_K), 0 ≤ t ≤ T} and show that for sufficiently large L, they are identical with probability arbitrarily close to one.

Consider some cycle in ←G_s(t); we can divide its vertices into those that lie between two edges of the cycle with different labels, and those that lie between two edges with the same label. We call this second class the shrinking vertices of the cycle, because if one is deleted from ←G_s(t) as it evolves, the cycle shrinks. We define E_s(L) to be the event that for some cycle in ←G_s(0) of size l > L, at least l − K of its shrinking vertices are deleted by time T.

We claim that outside of the event E_s(L), the two processes {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(φ_w(←Γ_s(t)), w ∈ W′_K), 0 ≤ t ≤ T} are identical. Suppose that these two processes are not identical. Then there is some cycle α of size K or less present in ←G_s(t) but not in ←Γ_s(t) for some 0 < t ≤ T. As explained in Section 3, as a cycle evolves (in forward time), it grows into an overlapping cluster of cycles. Thus ←G_s(0) contains some cluster of overlapping cycles that shrinks to α at time t. One of the cycles in this cluster has some length l greater than L, or the cluster would be contained in ←Γ_s(0) and α would have been contained in ←Γ_s(t).

To see that l − K shrinking vertices must be deleted from this cycle, consider the evolution of α into the cluster of cycles in both forward and reverse time. If a vertex is inserted into a single edge of a cycle in forward time, we see in reverse time the deletion of a shrinking vertex. If a vertex is simultaneously inserted into two edges of a cycle, causing the cycle to split, we see in reverse time the deletion of a non-shrinking vertex of a cycle. As α grows, a cycle of size greater than L can form only by single-insertion of at least l − K vertices into the eventual cycle. In reverse time, this is seen as deletion of l − K shrinking vertices. This demonstrates that E_s(L) holds.

We will now show that for any ε > 0, there is an L sufficiently large that P[E_s(L)] < ε for any s. Let w ∈ W_l/D_{2l} with l > L, and let I ⊆ [l] be such that |I| = l − K and w_i = w_{i−1} for all i ∈ I, considering indices modulo l. For any cycle in ←G_s(0) with word w, the set I corresponds to a set of l − K shrinking vertices of the cycle.

We define F(w, I) to be the event that ←G_s(0) contains one or more cycles with word w, and that the vertices corresponding to I in one of these cycles are all deleted within time T. By a union bound,

$$P[E_s(L)] \le \sum_{w,I} P[F(w, I)]. \qquad (8)$$

We proceed by enumerating all pairs of w and I. For any pair w, I, deleting the letters in w at positions given by I results in a word u ∈ W_K/D_{2K}. For any given u = u_1···u_K ∈ W_K/D_{2K}, the word w ∈ W_l/D_{2l} must have the form

$$w = \underbrace{u_1 \cdots u_1}_{a_1\ \text{times}}\,\underbrace{u_2 \cdots u_2}_{a_2\ \text{times}} \cdots\cdots \underbrace{u_K \cdots u_K}_{a_K\ \text{times}},$$

with a_i ≥ 1 and a_1 + ··· + a_K = l. The number of choices for a_1, ..., a_K is $\binom{l-1}{K-1}$, the number of compositions of l into K parts, and each of these corresponds to a choice of w and I. There are fewer than a(d, K) choices for u, giving us a bound of $a(d,K)\binom{l-1}{K-1}$ choices of pairs w and I for any fixed l > L.
Next, we will show that for any pair w and I with |w| = l,

$$P[F(w, I)] \le (1 - e^{-T})^{l-K}. \qquad (9)$$

Condition on ←G_s(0) having n vertices. Consider any of the [n]_l possible sequences of l vertices. Choose some representative w′ ∈ W_l of w. For each of these sequences, the probability that it forms a cycle with word w′ is at most 1/[n]_l (recall the original definition of our random graphs in terms of random permutations). Given that the sequence forms a cycle, the probability that the vertices of the cycle at positions I are all deleted within time T is (1 − e^{−T})^{l−K}. Hence

$$P\bigl[F(w, I) \,\big|\, \overleftarrow{G}_s(0) \text{ has } n \text{ vertices}\bigr] \le [n]_l \cdot \frac{1}{[n]_l}(1 - e^{-T})^{l-K} = (1 - e^{-T})^{l-K}.$$

This holds for any n, establishing (9).

Applying all of this to (8),

$$P[E_s(L)] \le \sum_{l=L+1}^{\infty} a(d, K)\binom{l-1}{K-1}(1 - e^{-T})^{l-K}.$$

This sum converges, which means that for any ε > 0, we have P[E_s(L)] < ε for large enough L, independent of s.
Step 3. Approximation of ←N_w(t) by ←M_w(t).

Recall that we defined the processes {(←M_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(←N_w(t), w ∈ W′_K), 0 ≤ t ≤ T} on the same probability space. We will show that for sufficiently large L, the two processes are identical with probability arbitrarily close to one.

By their definitions, these two processes are identical unless one of the processes X_{u,s}(·) started at each atom of χ grows from a word of size K or less to a word of size L + 1 before time T; we call this event E(L). Let

$$Y = \bigl|\{(u, s) \in \chi : |u| \le K,\ s \le T\}\bigr|,$$

the number of processes starting from a word of size K or less before time T. Suppose that X(·) has law P̂_w for some word w ∈ W_k/D_{2k}. We can choose L large enough that P[|X(T)| > L] < ε for all k ≤ K. Then P[E(L) | Y] < εY by a union bound, and so P[E(L)] < ε EY. Since EY < ∞, we can make P[E(L)] arbitrarily small by choosing sufficiently large L.

Step 4. Weak convergence of {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} to {(←N_w(t), w ∈ W′_K), 0 ≤ t ≤ T}.

If two processes are identical with probability 1 − ε, then the total variation distance between their laws is at most ε. Thus, by Steps 2 and 3, we can choose L large enough that the laws of the processes {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(φ_w(←Γ_s(t)), w ∈ W′_K), 0 ≤ t ≤ T} are arbitrarily close in total variation distance, uniformly in s, and so that the laws of {(←N_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(←M_w(t), w ∈ W′_K), 0 ≤ t ≤ T} are arbitrarily close in total variation distance. Since total variation distance dominates the Prokhorov metric (or any other metric for the topology of weak convergence), we can choose L such that these two pairs are each within ε/3 in the Prokhorov metric. Since {(φ_w(←Γ_s(t)), w ∈ W′_K), 0 ≤ t ≤ T} converges in law to {(←M_w(t), w ∈ W′_K), 0 ≤ t ≤ T} as s → ∞, there is an s_0 such that for all s ≥ s_0, the laws of these processes are within ε/3 in the Prokhorov metric. We have thus shown that for every ε > 0, the laws of {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} and {(←N_w(t), w ∈ W′_K), 0 ≤ t ≤ T} are within ε for sufficiently large s, which proves that the first process converges in law to the second as s → ∞.

Step 5. Weak convergence of {(C^{(s)}_w(t), w ∈ W′), t ≥ 0} to {(N_w(t), w ∈ W′), t ≥ 0}.

It follows immediately from the previous step that the (not time-reversed) process {(C^{(s)}_w(t), w ∈ W′_K), 0 ≤ t ≤ T} converges in law to {(N_w(t), w ∈ W′_K), 0 ≤ t ≤ T} for any T > 0. By Theorem 16.17 in [Bil99], {(C^{(s)}_w(t), w ∈ W′_K), t ≥ 0} converges in law to {(N_w(t), w ∈ W′_K), t ≥ 0}, which also proves that {(C^{(s)}_w(t), w ∈ W′), t ≥ 0} converges in law to {(N_w(t), w ∈ W′), t ≥ 0}. □
Proof of Theorem 1. We now consider the case of short cycles in the graph. We will express these as functionals of (C^{(s)}_w(t), w ∈ W′). For example, consider the count of cycles of size k ∈ N. Then $C^{(s)}_k(t) = \sum_{w \in W_k/D_{2k}} C^{(s)}_w(t)$ is the number of k-cycles in G(s + t), and let

$$N_k(t) = \sum_{w \in W_k/D_{2k}} N_w(t).$$

It follows immediately from the continuous mapping theorem that {(C^{(s)}_k(t), k ∈ N), t ≥ 0} converges in law to {(N_k(t), k ∈ N), t ≥ 0} as s → ∞.

It is not hard to see that this limit is Markov and admits the following representation: cycles of size k appear spontaneously with rate $\sum_{w \in W_k/D_{2k}} \mu(w)$, and the size of each cycle then grows as a pure birth process with generator Lf(i) = i(f(i + 1) − f(i)). The only thing we need to verify is that

$$\sum_{w \in W_k/D_{2k}} \mu(w) = \sum_{w \in W_k/D_{2k}} \frac{k - c(w)}{h(w)} = \bigl(a(d, k) - a(d, k-1)\bigr)/2. \qquad (10)$$

However, this follows from Lemma 11 in the following way. From that lemma we get

$$\sum_{w \in W_k/D_{2k}} \frac{c(w)}{h(w)} = (k-1)\sum_{w \in W_{k-1}/D_{2(k-1)}} \frac{1}{h(w)}.$$

Thus

$$\sum_{w \in W_k/D_{2k}} \mu(w) = \sum_{w \in W_k/D_{2k}} \frac{k}{h(w)} - \sum_{w \in W_{k-1}/D_{2(k-1)}} \frac{k-1}{h(w)}.$$

However, the two terms on the right side of the above equation are simply half the total number of cyclically reduced words possible, of size k and k − 1 respectively. The total number of cyclically reduced words of size k on an alphabet of size d is a(d, k) (see the appendix of [DJPP12]). This shows (10) and completes the proof. □
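Identity (10) can be verified by brute force for small d and k. The sketch below encodes letters as signed integers with −a the inverse of a, takes the dihedral action on words to be generated by rotation and reversal-with-inversion, computes h(w) through the orbit–stabilizer relation h(w) = 2k/|orbit|, and takes c(w) to be the number of cyclically adjacent equal letter pairs; these encodings are our reading of the paper's definitions.

```python
from itertools import product

def cyclically_reduced(d, k):
    """All cyclically reduced words of length k over letters +-1, ..., +-d."""
    letters = [s * i for i in range(1, d + 1) for s in (1, -1)]
    return [w for w in product(letters, repeat=k)
            if all(w[(i + 1) % k] != -w[i] for i in range(k))]

def orbit(w):
    """Orbit of w under rotation and reversal-with-inversion (the D_2k action)."""
    k = len(w)
    out = set()
    for r in range(k):
        rot = w[r:] + w[:r]
        out.add(rot)
        out.add(tuple(-x for x in reversed(rot)))
    return frozenset(out)

def c(w):
    """Number of cyclically adjacent equal pairs (doubled positions)."""
    return sum(1 for i in range(len(w)) if w[i] == w[(i + 1) % len(w)])

for d in (1, 2):
    for k in (2, 3):
        words = cyclically_reduced(d, k)
        classes = {orbit(w) for w in words}
        # (k - c(w))/h(w) summed over classes, with h = 2k/|orbit|;
        # c is constant on each class, so any representative works.
        total = sum((k - c(min(cls))) * len(cls) / (2 * k) for cls in classes)
        expected = (len(words) - len(cyclically_reduced(d, k - 1))) / 2
        assert abs(total - expected) < 1e-9, (d, k, total, expected)
```

For instance, d = 2 and k = 2 gives a(2,2) = 12, a(2,1) = 4, and the class sum evaluates to (12 − 4)/2 = 4.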
We end with the following corollary.
Corollary 17. For any s < t and j, k ∈ N, one has

$$\operatorname{Cov}\bigl(N_k(t), N_j(s)\bigr) = \begin{cases} \dfrac{a(d,j)}{2j}\dbinom{k-1}{k-j}\, p^j (1-p)^{k-j}, \quad p = e^{s-t}, & \text{if } k \ge j, \\[4pt] 0 & \text{otherwise.} \end{cases}$$
Proof. We will refer to the Yule processes counted by N_k(t) as cycles of length k present at time t, even though these “cycles” in the limiting process have no connection to graphs. If k < j, no cycle that is of length j at time s can grow to a cycle of length k at time t. Thus, N_k(t) depends on cycles that are independent of those that make up N_j(s). Hence N_k(t) is independent of N_j(s).
If k ≥ j, notice that one has the following decomposition:

$$N_k(t) = \sum_{j=1}^{k} \alpha(j,k)N_j(s) + Z, \qquad (11)$$

where α(j,k) is the proportion of one-dimensional pure-birth Yule processes that were at state j at time s and grew to state k at time t, and Z is a random variable that counts the number of new births in the time interval (s, t) that grew to state k at time t. Note that, under our invariant distribution, all random variables {Z, N_j(s), 1 ≤ j ≤ k} are independent of one another. Thus, our conclusion follows once we show

$$E\alpha(j,k) = \binom{k-1}{k-j} p^j (1-p)^{k-j}, \qquad p = e^{s-t}. \qquad (12)$$

The expected proportion Eα(j,k) is the probability that a one-dimensional process X_{j,k}, with the law of a Yule process starting at j, is at state k at time (t − s). If ξ_j, ..., ξ_k are independent exponential random variables with rates j, ..., k, then

$$E\alpha(j,k) = P\bigl[\{\xi_j + \cdots + \xi_{k-1} \le t - s\} \cap \{\xi_j + \cdots + \xi_k > t - s\}\bigr].$$

We now use the Rényi representation: suppose Y_1, Y_2, ..., Y_k are iid Exp(1) random variables. Define the order statistics Y_{(1)} ≥ Y_{(2)} ≥ ··· ≥ Y_{(k)}. Then the following equality holds in distribution:

$$\bigl(Y_{(i)} - Y_{(i+1)},\ j \le i \le k\bigr) = (\xi_i,\ j \le i \le k).$$

Here we have defined Y_{(k+1)} ≡ 0. Thus, in distribution,

$$\xi_j + \cdots + \xi_{k-1} = Y_{(j)} - Y_{(k)}, \qquad \xi_j + \cdots + \xi_k = Y_{(j)}.$$

Thus

$$E\alpha(j,k) = P\bigl[t - s < Y_{(j)} \le Y_{(k)} + t - s\bigr].$$

Note that, by an elementary symmetry argument, for any u > t − s, we have

$$\begin{aligned} P\bigl[Y_{(j)} \in (u, u + du),\ Y_{(j)} - Y_{(k)} < t - s\bigr] &= P\bigl[Y_i = u \text{ for some } i,\ \text{exactly } j - 1 \text{ of } Y_1, \ldots, Y_k \text{ are greater than } u, \\ &\qquad\quad \text{and the rest of } Y_1, \ldots, Y_k \text{ are in } [u - t + s,\, u]\bigr]\,du \\ &= k e^{-u}\binom{k-1}{j-1} e^{-(j-1)u}\bigl(e^{-u+t-s} - e^{-u}\bigr)^{k-j}\,du \\ &= k\binom{k-1}{j-1} e^{-ku}\bigl(e^{t-s} - 1\bigr)^{k-j}\,du. \end{aligned}$$
Integrating out u over the interval (t − s, ∞), we get

$$P\bigl[t - s < Y_{(j)} \le Y_{(k)} + t - s\bigr] = \binom{k-1}{j-1}\bigl(e^{t-s} - 1\bigr)^{k-j}\int_{t-s}^{\infty} k e^{-ku}\,du = \binom{k-1}{j-1}\bigl(e^{t-s} - 1\bigr)^{k-j} e^{-k(t-s)} = \binom{k-1}{j-1}\, e^{j(s-t)}\bigl(1 - e^{s-t}\bigr)^{k-j}.$$

This shows (12) and completes the proof of the corollary. □
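Formula (12) is the negative-binomial law of a Yule process, and it can be cross-checked against a direct numerical solution of the Kolmogorov forward equations. Below, the start state j, elapsed time τ = t − s, truncation level, and step count are arbitrary choices of ours:

```python
import math

j, tau, N = 3, 0.7, 60      # start state, elapsed time t - s, truncation level

def deriv(p):
    """Forward equations of the pure-birth chain: state j+i has birth rate j+i."""
    return [-(j + i) * p[i] + ((j + i - 1) * p[i - 1] if i > 0 else 0.0)
            for i in range(len(p))]

# RK4 integration of p'(t) = deriv(p), started from the point mass at state j.
p = [1.0] + [0.0] * (N - 1)
steps = 7000
dt = tau / steps
for _ in range(steps):
    k1 = deriv(p)
    k2 = deriv([p[i] + dt / 2 * k1[i] for i in range(N)])
    k3 = deriv([p[i] + dt / 2 * k2[i] for i in range(N)])
    k4 = deriv([p[i] + dt * k3[i] for i in range(N)])
    p = [p[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(N)]

pp = math.exp(-tau)          # the parameter p = e^{s-t} of (12)
for k in range(j, j + 10):
    exact = math.comb(k - 1, k - j) * pp ** j * (1 - pp) ** (k - j)
    assert abs(p[k - j] - exact) < 1e-8
```

Since a pure-birth chain only flows upward, truncating the state space does not perturb the probabilities of the low states being checked.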
4.1. Two-dimensional convergence. So far, we have considered d as a constant. We now view it as a parameter of the graph and allow it to vary. Recall that (Π_d, d ∈ N) are independent towers of random permutations, with Π_d = (π^{(n)}_d, n ∈ N), and that G(n, 2d) is defined from π^{(n)}_1, ..., π^{(n)}_d. For each d, we follow the construction used to define G(t) and construct G_{2d}(t), a continuous-time version of (G(n, 2d), n ∈ N). Let W′(d) be the set of equivalence classes of cyclically reduced words as before, with the parameter d made explicit. Define C^{(s)}_{d,k}(t) as the number of k-cycles in G_{2d}(s + t) and consider the convergence of the two-dimensional field {(C^{(s)}_{d,k}(t), d, k ∈ N), t ≥ 0} as s → ∞.

Again, we will consider this process as a functional of another one. Define W′(∞) = ∪_{d=1}^∞ W′(d), noting that W′(1) ⊆ W′(2) ⊆ ···. For any w ∈ W′(d), the number of cycles in G_{2d′}(s + t) with word w is the same for all d′ ≥ d. We define C^{(s)}_w(t) by this, so that

$$C^{(s)}_{d,k}(t) = \sum_{\substack{w \in W'(d) \\ |w| = k}} C^{(s)}_w(t).$$

Then we will prove convergence of {(C^{(s)}_w(t), w ∈ W′(∞)), t ≥ 0} as s → ∞.

To define a limit for this process, we extend µ to a measure on all of W′(∞) and define the Poisson point process χ on W′(∞) × [0, ∞). The rest of the construction is identical to the one in Section 3.2, giving us random variables (N_w(t), w ∈ W′(∞)).

Theorem 18. The process (C^{(s)}_w(·), w ∈ W′(∞)) converges in law as s → ∞ to (N_w(·), w ∈ W′(∞)).

Proof. It suffices to prove that (C^{(s)}_w(·), w ∈ W′(d)) converges in law as s → ∞ to (N_w(·), w ∈ W′(d)) for each d, which we did in Theorem 16. □
Proof of Theorem 3. Let

$$N_{d,k}(t) = \sum_{\substack{w \in W'(d) \\ |w| = k}} N_w(t).$$

By the continuous mapping theorem, (N_{d,k}(·), d, k ∈ N) is the limit of (C^{(s)}_{d,k}(·), d, k ∈ N) as s → ∞.

Let us now describe the limiting process. It is obvious that (N_{d,k}(·), k ∈ N, d ∈ N) is jointly Markov. For every fixed d, the law of the corresponding marginal is given by Theorem 1. To understand the relationship across d, notice that cycles of size k for (d + 1) consist of cycles of size k for d together with the extra ones that contain an edge labeled by π_{d+1} or π^{−1}_{d+1}. Thus

$$N_{d+1,k}(t) - N_{d,k}(t) = \sum_{\substack{w \in W'(d+1)\setminus W'(d) \\ |w| = k}} N_w(t).$$

This process is independent of (N_{i,·}, i ∈ [d]), since the sets of words involved are disjoint. Moreover, the rates for this process are clearly the following: cycles of size k grow at rate k, and new cycles of size k appear at rate [a(d+1, k) − a(d+1, k−1) − a(d, k) + a(d, k−1)]/2. This completes the proof of the result. □
5. Process limit for linear eigenvalue statistics
Let us recall some of the basic facts established in [DJPP12, Sections 3, 5] that connect linear eigenvalue statistics with cycle counts. A closed non-backtracking walk is a walk that begins and ends at the same vertex and never follows an edge and then immediately traverses that same edge backwards. If the last step of a closed non-backtracking walk is anything other than the reverse of the first step, we say that the walk is cyclically non-backtracking. Cyclically non-backtracking walks on G_n are exactly the closed non-backtracking walks whose words are cyclically reduced. Let CNBW^{(n)}_k denote the number of closed cyclically non-backtracking walks of length k on G_n.
Cyclically non-backtracking walks are useful because they can be enumerated by linear functionals of a graph's eigenvalues. Let {T_n(x)}_{n∈N} be the Chebyshev polynomials of the first kind on the interval [−1, 1]. We define a set of polynomials

$$\Gamma_0(x) = 1, \qquad \Gamma_{2k}(x) = 2T_{2k}(x) + \frac{2d-2}{(2d-1)^k} \ \ \forall\, k \ge 1, \qquad \Gamma_{2k+1}(x) = 2T_{2k+1}(x) \ \ \forall\, k \ge 0.$$

Let A_n be the adjacency matrix of G_n, and let λ_1 ≥ ··· ≥ λ_n be the eigenvalues of (2d − 1)^{−1/2}A_n/2. Then

$$\sum_{i=1}^{n} \Gamma_k(\lambda_i) = (2d-1)^{-k/2}\,\mathrm{CNBW}^{(n)}_k. \qquad (13)$$

Now, for any cycle in G_n of length j | k, we obtain 2j non-backtracking walks of length k by choosing a starting point and direction and then walking around the cycle repeatedly. In [DJPP12, Corollary 18], it is shown that with certain conditions on the growth of d and r, all cyclically non-backtracking walks of length r or less have this form with high probability. Thus the random vectors $\bigl(\mathrm{CNBW}^{(n)}_k,\ 1 \le k \le r\bigr)$ and $\bigl(\sum_{j|k} 2jC^{(n)}_j,\ 1 \le k \le r\bigr)$ have the same limiting distribution, and the problem of finding the limiting distributions of polynomial linear eigenvalue statistics is reduced to finding limiting distributions of cycle counts. We will prove Theorem 5 by arguing that this holds for the entire process (G(t), t ≥ 0).

Call a cyclically non-backtracking walk bad if it is anything other than a repeated walk around a cycle.

Proposition 19. Fix an integer K. There is a random time T, almost surely finite, such that there are no bad cyclically non-backtracking walks of length K or less in G(t) for all t ≥ T.
Proof. We will work with the discrete-time version of our process (G_n, n ∈ N). We first define some machinery introduced in [LP10]. Consider some cyclically non-backtracking walk of length k on the edge-labeled complete graph K_n of the form

$$s_0 \xrightarrow{\,w_1\,} s_1 \xrightarrow{\,w_2\,} s_2 \xrightarrow{\,w_3\,} \cdots \xrightarrow{\,w_k\,} s_k = s_0.$$

Here, s_i ∈ [n] and w = w_1···w_k is the word of the walk (that is, each w_i is π_j or π_j^{−1} for some j, indicating which permutation provided the edge for the walk). We say that G_n contains the walk if the random permutations π_1, ..., π_d satisfy w_i(s_{i−1}) = s_i. In other words, G_n contains a walk if, considering both as edge-labeled directed graphs, the walk is a subgraph of G_n.

If (s′_i, 0 ≤ i ≤ k) is another walk with the same word, we say that the two walks are of the same category if s_i = s_j ⟺ s′_i = s′_j. In other words, two walks are of the same category if they are identical up to relabeling vertices. The probability that G_n contains a walk depends only on its category. If a walk contains e distinct edges, then G_n contains the walk with probability at most 1/[n]_e.

Let X^{(n)}_k be the number of bad walks of length k in G_n that start at vertex n. We will first prove that with probability one, X^{(n)}_k > 0 for only finitely many n. Call a category bad if the walks in the category are bad. Let T_{k,d} be the number of bad categories of walks of length k. For any particular bad category whose walks contain v distinct vertices, there are [n − 1]_{v−1} walks of that category whose first vertex is n. Any bad walk contains more edges than vertices, so

$$E X^{(n)}_k \le \frac{T_{k,d}\,[n-1]_{v-1}}{[n]_{v+1}} \le \frac{T_{k,d}}{n(n-k)}.$$

Since X^{(n)}_k takes values in the nonnegative integers, P[X^{(n)}_k > 0] ≤ EX^{(n)}_k. By the Borel–Cantelli lemma, X^{(n)}_k > 0 for only finitely many values of n.

Thus, for any fixed r + 1, there exists a random time N such that there are no bad walks on G_n of length r + 1 or less starting with vertex n, for n ≥ N. We claim that for n ≥ N, there are no bad walks at all on G_n with length r or less. Suppose that G_m contains some bad walk of length k ≤ r, for some m ≥ N. As the graph evolves, it is easy to compute that with probability one, a new vertex is eventually inserted into an edge of this walk. But at the time n > m ≥ N when this occurs, G_n will contain a bad walk of length r + 1 or less starting with vertex n, a contradiction. Thus we have proven that G_n eventually contains no bad walks of length r or less. The equivalent statement for the continuous-time version of the graph process follows easily from this. □
Proof of Theorem 5. Let CNBW_k^{(s)}(t) denote the number of cyclically non-backtracking walks of length k in G(s + t). We decompose these into those that are repeated walks around cycles of length j for some j dividing k, and the remaining bad walks, which we denote B_k^{(s)}(t), giving us

    CNBW_k^{(s)}(t) = \sum_{j|k} 2j\,C_j^{(s)}(t) + B_k^{(s)}(t).

Proposition 19 implies that

    \lim_{s→∞} P\big[ B_k^{(s)}(t) = 0 \text{ for all } k ≤ K,\ t ≥ 0 \big] = 1.

By Theorem 1 together with the continuous mapping theorem and Slutsky's theorem, as s tends to infinity,

    \big( CNBW_k^{(s)}(·),\ 1 ≤ k ≤ K \big) \xrightarrow{L} \Big( \sum_{j|k} 2j\,N_j(·),\ 1 ≤ k ≤ K \Big).   (14)

Now, we modify the polynomials Γ_k to form a new basis {f_k, k ∈ N} with the right properties, which amounts to expressing each N_k(t) as 1/2k times a linear combination of the terms \sum_{j|l} 2j\,N_j(t). We do this with the Möbius inversion formula. Define the polynomial

    f_k(x) = \sum_{j|k} μ(k/j)\,(2d − 1)^{j/2}\,Γ_j(x),   (15)

where μ is the Möbius function, given by μ(n) = (−1)^a if n is the product of a distinct primes, and μ(n) = 0 otherwise.

The theorem then follows from (13), (14), and the continuous mapping theorem. □
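The Möbius inversion used to pass between cycle counts and cyclically non-backtracking walk counts can be sanity-checked numerically: if CNBW_k = Σ_{j|k} 2jN_j, then 2kN_k = Σ_{j|k} μ(k/j) CNBW_j. A minimal sketch (the values standing in for the cycle counts N_j are arbitrary test data, not simulated from the graph process):

```python
def mobius(n):
    # Mobius function: (-1)^a if n is a product of a distinct primes, else 0
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

# arbitrary nonnegative integers standing in for the cycle counts N_j
N = {j: (3 * j + 1) % 7 for j in range(1, 13)}

# forward relation: CNBW_k = sum over j | k of 2j * N_j
CNBW = {k: sum(2 * j * N[j] for j in divisors(k)) for k in N}

# Mobius inversion recovers 2k * N_k from the CNBW counts
for k in N:
    assert sum(mobius(k // j) * CNBW[j] for j in divisors(k)) == 2 * k * N[k]
print("Mobius inversion recovers every N_k")
```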
Proof of Theorem 6. We start by recalling that, for any fixed d,

    2\,\mathrm{tr}\,T_i(G(∞ + t)) + n\,\frac{2d − 2}{(2d − 1)^{i/2}}\,1\{i \text{ is even}\} = (2d − 1)^{−i/2} \sum_{k|i} 2k\,N_k(t).

Now, we will prove finite-dimensional convergence to the stated Ornstein-Uhlenbeck process. Consider two time points s ≤ t and two positive integers i, k. We will first show that, for any i, k ∈ N, the pair ((2d − 1)^{−i/2}(N_i(s) − EN_i(s)), (2d − 1)^{−k/2}(N_k(t) − EN_k(t))) converges to a Gaussian limit as d tends to infinity. When s = t, this trivially follows via the Central Limit Theorem and their independent Poisson joint distribution.

When s < t, observe from (11) that

    N_k(t) = \sum_{j=1}^{k} α(j,k)\,N_j(s) + Z.

Here α(j,k)N_j(s), j ∈ [k], and Z are independent Poisson random variables of various means. Moreover Z is independent of the history of the process till time s. Under the stationary law, the variables (N_j(s), j ∈ N) are independent Poisson random variables. Thus, if i > k, then N_i(s) is independent of N_k(t). Otherwise, by the thinning property of the Poisson distribution, α(i,k)N_i(s) is independent of (1 − α(i,k))N_i(s). Therefore, N_k(t) − α(i,k)N_i(s), α(i,k)N_i(s), and (1 − α(i,k))N_i(s) are three independent Poisson random variables.

By the normal approximation to the Poisson distribution, we get the appropriate distributional convergence to corresponding independent Gaussian random variables. This shows the joint convergence of (N_i(s), N_k(t)) to a Gaussian limit after centering and scaling. A similar Gram-Schmidt orthogonalization can be carried out for the case of time points t_1 ≤ t_2 ≤ ... ≤ t_m and corresponding positive integers j_1, j_2, ..., j_m. This proves the joint Gaussian convergence of any finite collection of (N_{j_i}(t_i), i ∈ [m]) under centering and suitable scaling. Since the traces of Chebyshev polynomials are linear combinations of coordinates of N, the joint Gaussian convergence extends to them by an argument invoking the Continuous Mapping and Slutsky's theorems.

For a fixed d, the covariance computation follows from Corollary 17 and (13).
Hence, if s < t, then

    \mathrm{Cov}\big( \mathrm{tr}\,T_i(G(∞ + t)),\ \mathrm{tr}\,T_j(G(∞ + s)) \big) = \frac{1}{4}(2d − 1)^{−(i+j)/2} \sum_{k|i,\ l|j} 4lk\,\mathrm{Cov}(N_k(t), N_l(s)).   (16)

Here

    \mathrm{Cov}(N_k(t), N_l(s)) =
    \begin{cases}
    \dfrac{a(d,l)}{2l} \dbinom{k-1}{k-l} p^l (1 − p)^{k−l}, \quad p = e^{s−t}, & \text{if } k ≥ l, \\
    0, & \text{otherwise.}
    \end{cases}

We now fix any i, j, t, s and take d to infinity. Any term a(d,r) is asymptotically the same as (2d − 1)^r. Thus the highest order term (in d) on the right side of (16) is (2d − 1)^{min(i,j)}. Unless i = j, this term is negligible compared to (2d − 1)^{(i+j)/2}. This shows that the limiting covariance is zero unless i = j.

On the other hand, when i = j, every term on the right side of (16) vanishes, except when k = i = l = j. Hence, since (2d − 1)^{−i} a(d,i) → 1,

    \lim_{d→∞} \mathrm{Cov}\big( \mathrm{tr}\,T_i(G(∞ + t)),\ \mathrm{tr}\,T_i(G(∞ + s)) \big) = \frac{1}{4} \cdot 4i^2 \cdot \frac{p^i}{2i} = \frac{i}{2}\,e^{i(s−t)}.

Finally we prove the process convergence. One simply needs to argue tightness. Fix a K ∈ N and, for every d, consider the process

    X_k(t) := (2d − 1)^{−k/2}\big( 2k\,N_k(t) − a(d,k) \big), \quad k ∈ [K],\ t ≥ 0.

We claim that it suffices to show tightness for this process. This follows, since then, due to the unequal scaling, the difference between this process and the centered and scaled traces goes to zero in probability as d tends to infinity.

To show tightness of X, note that, by [EK86, Chapter 11, Problem 22 (c)], it suffices to show tightness for each of the individual one-dimensional processes

    \{X_k,\ k ∈ N\} \quad\text{and}\quad \{X_k + X_l,\ k, l ∈ N\}.

For each of these one-dimensional processes, it is enough to show tightness in D[0,T] (take T = 1, without loss of generality), which we will show using [Bil99, Theorem 13.5] for β = 1, α = 1.
Consider X_k. We already know that it has finite-dimensional convergence to the Ornstein-Uhlenbeck process. Now, for any r ≤ s ≤ t, we want to estimate

    E\big[ (X_k(s) − X_k(r))^2 (X_k(t) − X_k(s))^2 \big].

We claim that the above is O(t − r)^2 as d tends to infinity. To see this, first use the elementary inequality 2ab ≤ a^2 + b^2 to obtain

    E\big[ (X_k(s) − X_k(r))^2 (X_k(t) − X_k(s))^2 \big] ≤ \frac{1}{2}\big( E(X_k(s) − X_k(r))^4 + E(X_k(t) − X_k(s))^4 \big).

Consider E(X_k(t) − X_k(s))^4, which can be written as

    (2d − 1)^{−2k} (2k)^4\, E(N_k(t) − N_k(s))^4.

From the orthogonal decomposition described in the beginning of this proof, one can write N_k(t) − N_k(s) as the difference of two independent Poisson random variables Z_1 − Z_2 with the same mean ν = (1 − r(k,k))a(d,k)/2k, where r(k,k) = Eα(k,k).
Thus

    E(N_k(t) − N_k(s))^4 = E\big[ (Z_1 − ν) − (Z_2 − ν) \big]^4
    = E(Z_1 − ν)^4 − 4E(Z_1 − ν)^3 E(Z_2 − ν) + 6E(Z_1 − ν)^2 E(Z_2 − ν)^2 − 4E(Z_1 − ν) E(Z_2 − ν)^3 + E(Z_2 − ν)^4
    = 2ν(1 + 3ν) + 6ν^2 = 2ν(1 + 6ν).

Since ν = (1 − e^{k(s−t)})a(d,k)/2k, we get

    (2d − 1)^{−2k} (2k)^4\, E(N_k(t) − N_k(s))^4 ≤ C′_k (1 − e^{k(s−t)})^2 ≤ C_k (t − s)^2.

Here C_k, C′_k are constants depending on k. Similarly E(X_k(s) − X_k(r))^4 ≤ C_k(s − r)^2. Combining the two parts we get E[(X_k(s) − X_k(r))^2 (X_k(t) − X_k(s))^2] ≤ C_k(t − r)^2. This allows us to use [Bil99, Theorem 13.5, eqn. (13.14)] to show tightness of X_k.

A similar argument for the pairs X_k + X_l proves our claim and completes the proof of the theorem. □
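The central-moment identities used above (E(Z − ν)^2 = E(Z − ν)^3 = ν and E(Z − ν)^4 = ν(1 + 3ν) for Z ~ Poisson(ν), so that E(Z_1 − Z_2)^4 = 2ν(1 + 6ν) for an iid pair) can be verified by summing the Poisson mass function directly; a small numerical check:

```python
from math import exp

def poisson_central_moment(nu, power, cutoff=150):
    # exact up to truncation of the Poisson tail at `cutoff`
    prob = exp(-nu)  # P[Z = 0]
    total = 0.0
    for k in range(cutoff):
        total += (k - nu) ** power * prob
        prob *= nu / (k + 1)  # P[Z = k+1] from P[Z = k]
    return total

for nu in (0.3, 1.0, 2.5):
    m2 = poisson_central_moment(nu, 2)
    m3 = poisson_central_moment(nu, 3)
    m4 = poisson_central_moment(nu, 4)
    assert abs(m2 - nu) < 1e-9 and abs(m3 - nu) < 1e-9
    assert abs(m4 - nu * (1 + 3 * nu)) < 1e-9
    # E(Z1 - Z2)^4 for iid Poisson(nu), expanded exactly as in the proof
    fourth = 2 * m4 + 6 * m2 * m2
    assert abs(fourth - 2 * nu * (1 + 6 * nu)) < 1e-8
print("fourth moment matches 2*nu*(1 + 6*nu)")
```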
Appendix: A broad Poisson approximation result
This appendix provides the proofs of Theorem 14 and Corollary 15. A less general version of Theorem 14 can be found in [DJPP12, Theorem 11]; we show in Corollary 24i how it follows from Theorem 14. Our theorem here also improves the total variation bound from O((2d − 1)^{2r}/n) to O((2d − 1)^{2r−1}/n). We conjecture that Theorem 14 is sharp.
As in the proof of Theorem 11 in [DJPP12], the main tool is the Stein-Chen method for Poisson approximation by size-biased couplings as described in [BHJ92], which uses the following idea: Recall the definition of (I_β, β ∈ I) from Theorem 14. For each α ∈ I, let (J_{βα}, β ∈ I) be distributed as (I_β, β ∈ I) conditioned on I_α = 1. The goal is to construct a coupling of (I_β, β ∈ I) and (J_{βα}, β ∈ I) so that the two random vectors are "close together". We hope that for each α ∈ I, the cycles in I \ {α} can be partitioned into two sets I_α^− and I_α^+ such that

    J_{βα} ≤ I_β  if β ∈ I_α^−,   (17)
    J_{βα} ≥ I_β  if β ∈ I_α^+.   (18)

If this is the case, then one can approximate (I_β, β ∈ I) by a Poisson process by calculating Cov(I_α, I_β) for every α, β ∈ I, according to the following proposition.

Proposition 20 (Corollary 10.B.1 in [BHJ92]). Suppose that I = (I_α, α ∈ I) is a vector of 0-1 random variables with EI_α = p_α. Suppose that (J_{βα}, β ∈ I) is distributed as described above, and that for each α there exists a partition and a coupling of (J_{βα}, β ∈ I) with (I_β, β ∈ I) such that (17) and (18) are satisfied. Let Y = (Y_α, α ∈ I) be a vector of independent Poisson random variables with EY_α = p_α. Then

    d_{TV}(I, Y) ≤ \sum_{α∈I} p_α^2 + \sum_{α∈I} \sum_{β∈I_α^−} |\mathrm{Cov}(I_α, I_β)| + \sum_{α∈I} \sum_{β∈I_α^+} \mathrm{Cov}(I_α, I_β).   (19)

We introduce two lemmas, whose proofs we will defer to the end of the appendix. The first will let us approximate I by Z rather than by Y, and the second provides a technical bound that we need.
Lemma 21. Let Y = (Y_α, α ∈ I) and Z = (Z_α, α ∈ I) be vectors of independent Poisson random variables. Then

    d_{TV}(Y, Z) ≤ \sum_{α∈I} |EY_α − EZ_α|.

Lemma 22. Let a and b be d-dimensional vectors with nonnegative integer components, and let ⟨a, b⟩ denote the standard Euclidean inner product. Then

    \prod_{i=1}^{d} \frac{1}{[n]_{a_i+b_i}} − \prod_{i=1}^{d} \frac{1}{[n]_{a_i}[n]_{b_i}} ≤ \frac{⟨a, b⟩}{n} \prod_{i=1}^{d} \frac{1}{[n]_{a_i+b_i}}.

Proof of Theorem 14. We will give the proof in three sections: First, we make the coupling and show that it satisfies (17) and (18). Next, we apply Proposition 20 to approximate I by Y, a vector of independent Poissons with EY_α = EI_α. Last, we approximate Y by Z to prove the theorem.

If d > n^{1/2} or r > n^{1/10}, then c(2d − 1)^{2r−1}/n > 1 for a sufficiently large choice of c, and the theorem holds trivially. Thus we will assume throughout that d ≤ n^{1/2} and r ≤ n^{1/10} (the choice of 1/10 here is completely arbitrary). The expression O(f(d,r,n)) should be interpreted as a function of d, r, and n whose absolute value is bounded by Cf(d,r,n) for some absolute constant C, for all d, r, and n satisfying 2 ≤ d ≤ n^{1/2} and r ≤ n^{1/10}.

Step 1. Constructing the coupling.
Fix some α ∈ I. We will construct a random vector (J_{βα}, β ∈ I) distributed as (I_β, β ∈ I) conditioned on I_α = 1. We do this by constructing a random graph G′_n distributed as G_n conditioned to contain the cycle α. Once this is done, we will define J_{βα} = 1{G′_n contains cycle β}.

Let π_1, ..., π_d be the random permutations that give rise to G_n. We will alter them to form permutations π′_1, ..., π′_d, and we will construct G′_n from these. Let us first consider what distributions π′_1, ..., π′_d should have. For example, suppose that α is the cycle on the vertices 1, 2, 3, 4, with the edges 1-2 and 3-4 labeled π_3 and the edges 2-3 and 4-1 labeled π_1, directed so that π_3(1) = 2, π_1(3) = 2, π_3(3) = 4, and π_1(4) = 1. Then π′_1 should be distributed as a uniform random n-permutation conditioned to make π′_1(3) = 2 and π′_1(4) = 1, and π′_3 should be distributed as a uniform random n-permutation conditioned to make π′_3(1) = 2 and π′_3(3) = 4, while π′_2 should just be a uniform random n-permutation. A random graph constructed from π′_1, π′_2, and π′_3 will be distributed as G_n conditioned to contain α.

We now describe the construction of π′_1, ..., π′_d. Suppose α is the cycle

    s_0 \xrightarrow{w_1} s_1 \xrightarrow{w_2} s_2 \xrightarrow{w_3} \cdots \xrightarrow{w_k} s_k = s_0,   (20)
with each edge directed according to whether w_i(s_{i−1}) = s_i or w_i(s_i) = s_{i−1}. Fix some 1 ≤ l ≤ d, and suppose that the edge-label π_l appears M times in the cycle α. Let (a_m, b_m) for 1 ≤ m ≤ M be these directed edges. We must construct π′_l to have the uniform distribution conditioned on π′_l(a_m) = b_m for 1 ≤ m ≤ M.

We define a sequence of random transpositions by the following algorithm: Let τ_1 swap π_l(a_1) with b_1. Let τ_2 swap τ_1π_l(a_2) with b_2, and so on. We then define π′_l = τ_M ··· τ_1 π_l. This permutation satisfies π′_l(a_m) = b_m for 1 ≤ m ≤ M, and it is distributed uniformly, subject to the given constraints, which can be proven by induction on each swap. We now define G′_n from the permutations π′_1, ..., π′_d in the usual way. It is defined on the same probability space as G_n, and it is distributed as G_n conditioned to contain α, giving us a random vector (J_{βα}, β ∈ I) coupled with (I_β, β ∈ I).

Now, we will give a partition I^− ∪ I^+ = I \ {α} satisfying (17) and (18). Suppose that G_n contains an edge from s_i to v labeled w_{i+1} with v ≠ s_{i+1}, or an edge from v to s_{i+1} labeled w_{i+1} with v ≠ s_i. The graph G′_n cannot contain this edge, since it contains α. In fact, edges of this form are the only ones found in G_n but not G′_n:
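The swap algorithm above can be tested in miniature. The sketch below (an assumed encoding: permutations of [n] as Python dicts, with hypothetical constraint pairs (a_m, b_m)) applies the transpositions to every π in S_4 and checks that the output is uniform over permutations satisfying π′(a_m) = b_m:

```python
from itertools import permutations
from collections import Counter

def condition(pi, pairs):
    # tau_m swaps the values pi(a_m) and b_m, as in the algorithm in the text
    pi = dict(pi)
    for a, b in pairs:
        c = pi[a]
        if c != b:
            # find the current preimage of b and exchange the two values
            inv_b = next(x for x in pi if pi[x] == b)
            pi[a], pi[inv_b] = b, c
    return pi

n = 4
pairs = [(1, 2), (3, 4)]  # hypothetical constraints: pi'(1) = 2 and pi'(3) = 4
counts = Counter()
for vals in permutations(range(1, n + 1)):
    pi = dict(zip(range(1, n + 1), vals))
    out = condition(pi, pairs)
    assert out[1] == 2 and out[3] == 4
    counts[tuple(out[i] for i in range(1, n + 1))] += 1

# two admissible permutations remain (positions 2 and 4 get {1, 3});
# uniformity means each is hit 4!/2 = 12 times
assert len(counts) == 2 and all(v == 12 for v in counts.values())
print("conditioned permutation is uniform over the constrained set")
```

Exact enumeration over all 24 inputs confirms the uniformity claim proven by induction in the text, for this small instance.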
Lemma 23. Suppose there is an edge from i to j labeled π_l contained in G_n but not in G′_n. Then α contains either an edge from i to v labeled π_l with v ≠ j, or an edge from v to j labeled π_l with v ≠ i.

Proof. Suppose π_l(i) = j, but π′_l(i) ≠ j. Then j must have been swapped when making π′_l, which can happen only if π_l(a_m) = j or b_m = j for some m. In the first case, a_m = i and α contains the edge from i to b_m labeled π_l with b_m ≠ j, and in the second case α contains the edge from a_m to j labeled π_l with a_m ≠ i. □

Define I_α^− as all cycles in I that contain an edge from s_i to v labeled w_{i+1} with v ≠ s_{i+1} or an edge from v to s_{i+1} labeled w_{i+1} with v ≠ s_i, and define I_α^+ to be the rest of I \ {α}. Since G′_n cannot contain any cycle in I_α^−, we have J_{βα} = 0 for all β ∈ I_α^−, satisfying (17). For any β ∈ I_α^+, Lemma 23 shows that if β appears in G_n, it must also appear in G′_n. Hence J_{βα} ≥ I_β, and (18) is satisfied.

Step 2. Approximation of I by Y.
The conditions of Proposition 20 are satisfied, and we need only bound the sums in (19). Let p_α = EI_α, the probability that cycle α appears in G_n. Recall that this equals \prod_{i=1}^{d} 1/[n]_{e_i}, where e_i is the number of times π_i and π_i^{−1} appear in the word of α. This means that

    \frac{1}{n^k} ≤ p_α ≤ \frac{1}{[n]_k},   (21)

where k = |α|, the length of cycle α.
We bound the first sum in (19) by
?
α∈I
p2
α=
r
?
k=1
?
α∈Ik
p2
α≤
r
?
r
?
r
?
k=1
?
?[n]ka(d,k)
2d(2d − 1)k−1
2k[n]k
α∈Ik
1
[n]2
k
=
k=1
2k
??
1
[n]2
k
?
≤
k=1
= O
?d
n
?
.(22)
To bound the second sum in (19), we investigate the size of I_α^−. Suppose that α ∈ I_k, and α has the form given in (20). Any β ∈ I_α^− must contain an edge from s_i to v labeled w_{i+1} with v ≠ s_{i+1}, or an edge from v to s_{i+1} labeled w_{i+1} with v ≠ s_i, and there are at most 2k(n − 1) edges of this form. For any given edge, there are at most [n−2]_{j−2}(2d − 1)^{j−1} cycles in I_j that contain that edge, for any j ≥ 2. Thus for any α ∈ I_k, the number of cycles of length j ≥ 2 in I_α^− is at most 2k[n−1]_{j−1}(2d − 1)^{j−1}, and this bound also holds for j = 1.
For any β ∈ I_α^−, it holds that E[I_α I_β] = 0, so that Cov(I_α, I_β) = −p_α p_β. Putting this all together and applying (21), we have

    \sum_{α∈I} \sum_{β∈I_α^−} |\mathrm{Cov}(I_α, I_β)| = \sum_{k=1}^{r} \sum_{α∈I_k} \sum_{j=1}^{r} \sum_{β∈I_α^−∩I_j} p_α p_β
    ≤ \sum_{k=1}^{r} |I_k|\,\frac{1}{[n]_k} \sum_{j=1}^{r} |I_α^− ∩ I_j|\,\frac{1}{[n]_j}
    ≤ \sum_{k=1}^{r} \frac{a(d,k)}{2k} \sum_{j=1}^{r} \frac{2k(2d − 1)^{j−1}}{n}
    = \sum_{k=1}^{r} a(d,k)\,O\Big(\frac{(2d − 1)^{r−1}}{n}\Big) = O\Big(\frac{(2d − 1)^{2r−1}}{n}\Big).   (23)
The final sum in (19) is the most difficult to bound. We partition I_α^+ into sets I_α^+ = I_α^0 ∪ ··· ∪ I_α^{|α|−1}, where I_α^l is all cycles in I_α^+ that share exactly l labeled edges with α. For any β ∈ I_α^+,

    E[I_α I_β] = P[G \text{ contains } α \text{ and } β] = \prod_{i=1}^{d} \frac{1}{[n]_{e_i}},

where e_i is the number of π_i-labeled edges in α ∪ β. Thus for β ∈ I_α^l,

    \frac{1}{n^{|α|+|β|−l}} ≤ E[I_α I_β] ≤ \frac{1}{[n]_{|α|+|β|−l}}.   (24)

We start by seeking estimates on the size of I_α^l for l ≥ 1. Fix some choice of l edges of α. We start by counting the cycles in I_α^l that share exactly these edges with α. We illustrate this in Figure 5. Call the graph consisting of these edges H, and suppose that H has p components. Since it is a forest, H has l + p vertices. Let A_1, ..., A_p be the components of H. We can assemble any element β ∈ I_α^l that overlaps with α in H by stringing together these components in some order, with other edges in between. Each component can appear in β in one of two orientations. Since the vertices in β have no fixed ordering, we can assume without loss of generality that β begins with component A_1 with a fixed orientation. This leaves (p − 1)! 2^{p−1} choices for the order and orientation of A_2, ..., A_p in β.

Imagine now the components laid out in a line, with gaps between them, and count the number of ways to fill the gaps. Suppose that β is to have length j. Each of the p gaps must contain at least one edge, and the total number of edges in all the gaps is j − l. Thus the total number of possible gap sizes is the number of compositions of j − l into p parts, or \binom{j−l−1}{p−1}.

Now that we have chosen the number of edges to appear in each gap, we choose the edges themselves. We can do this by giving an ordered list of j − p − l vertices to go in the gaps, along with a label and an orientation for each of the j − l edges this gives. There are [n−p−l]_{j−p−l} ways to choose the vertices. We can give each new edge any orientation and label subject to the constraint that the word of the cycle we construct must be reduced. This means we have at most 2d − 1 choices for the orientation and label of each new edge, for a total of at most (2d − 1)^{j−l}.

All together, there are at most (p − 1)! 2^{p−1} \binom{j−l−1}{p−1} [n−p−l]_{j−p−l} (2d − 1)^{j−l} elements of I_j that overlap with the cycle α at the subgraph H.

[Figure 5. Assembling an element β ∈ I_α^l that overlaps with α at a given subgraph H. The figure shows the cycle α, with H dashed; H has components A_1, ..., A_p. In the example, the number of components of H is p = 3, the size of α is k = 11, the number of edges in H is l = 4, and a cycle β of length j = 10 is constructed that overlaps with α at H. Step 1: Lay out the components A_1, ..., A_p; A_2, ..., A_p can be ordered and oriented however we would like, for a total of (p − 1)! 2^{p−1} choices. Here, the components are ordered A_1, A_3, A_2, with the orientation of A_3 reversed. Step 2: Choose how many edges will go in each gap between components; each gap must contain at least one edge, and a total of j − l edges must be added, giving \binom{j−l−1}{p−1} choices. Here, one edge is added after A_1, three after A_3, and two after A_2. Step 3: The new vertices can be chosen in [n−p−l]_{j−p−l} ways, and the new edges directed and labeled in at most (2d − 1)^{j−l} ways.]

We now calculate the number of different ways to choose a subgraph H of α with l edges and p components. Suppose α is given as in (20). We first choose a vertex s_{i_0}. Then, we can specify which edges to include in H by giving a sequence a_1, b_1, ..., a_p, b_p instructing us to include in H the first a_1 edges after s_{i_0}, then to exclude the next b_1, then to include the next a_2, and so on. Any sequence for which a_i and b_i are positive integers, a_1 + ··· + a_p = l, and b_1 + ··· + b_p = k − l gives us a valid choice of l edges of α making up p components. This counts each subgraph H a total of p times, since we could begin with any component of H. Hence the number of subgraphs H with l edges and p components is (k/p)\binom{l−1}{p−1}\binom{k−l−1}{p−1}. This gives us
the bound

    |I_α^l ∩ I_j| ≤ \sum_{p=1}^{l∧(j−l)} \frac{k}{p} \binom{l−1}{p−1} \binom{k−l−1}{p−1} (p − 1)!\; 2^{p−1} \binom{j−l−1}{p−1} [n−p−l]_{j−p−l} (2d − 1)^{j−l}.
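The count (k/p)\binom{l−1}{p−1}\binom{k−l−1}{p−1} of l-edge, p-component subgraphs of the k-cycle α can be checked by brute force, identifying the edges of α with Z_k and components with maximal circular runs:

```python
from itertools import combinations
from math import comb
from fractions import Fraction

def circular_runs(edges, k):
    # number of maximal blocks of consecutive edges around the k-cycle
    s = set(edges)
    return sum(1 for e in s if (e - 1) % k not in s)

k = 9
for l in range(1, k):  # proper subsets only, so runs are well defined
    observed = {}
    for edges in combinations(range(k), l):
        p = circular_runs(edges, k)
        observed[p] = observed.get(p, 0) + 1
    for p, count in observed.items():
        predicted = Fraction(k, p) * comb(l - 1, p - 1) * comb(k - l - 1, p - 1)
        assert predicted == count
print("subgraph counts match the formula for k =", k)
```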
We apply the bounds

    \binom{l−1}{p−1},\ \binom{k−l−1}{p−1} ≤ \frac{r^{p−1}}{(p−1)!}, \qquad \binom{j−l−1}{p−1} ≤ \Big(\frac{er}{p−1}\Big)^{p−1},

to get

    |I_α^l ∩ I_j| ≤ k(2d − 1)^{j−l}[n−1−l]_{j−1−l} \Bigg( 1 + \sum_{p=2}^{l∧(j−l)} \frac{1}{p} \Big( \frac{2e^2 r^3}{(p−1)^2} \Big)^{p−1} \frac{1}{[n−1−l]_{p−1}} \Bigg).

Since r ≤ n^{1/10}, the sum in the above equation is bounded by an absolute constant.
Applying this bound and (24), for any α ∈ I_k and l ≥ 1,

    \sum_{β∈I_α^l} \mathrm{Cov}(I_α, I_β) ≤ \sum_{j=l+1}^{r} \sum_{β∈I_α^l∩I_j} \frac{1}{[n]_{k+j−l}} ≤ \sum_{j=l+1}^{r} O\Big( \frac{k(2d − 1)^{j−l}}{n^{k+1}} \Big) = O\Big( \frac{k(2d − 1)^{r−l}}{n^{k+1}} \Big).
Therefore

    \sum_{α∈I} \sum_{l≥1} \sum_{β∈I_α^l} \mathrm{Cov}(I_α, I_β) = \sum_{k=1}^{r} \sum_{α∈I_k} \sum_{l=1}^{k−1} \sum_{β∈I_α^l} \mathrm{Cov}(I_α, I_β)
    ≤ \sum_{k=1}^{r} \sum_{α∈I_k} \sum_{l=1}^{k−1} O\Big( \frac{k(2d − 1)^{r−l}}{n^{k+1}} \Big)
    = \sum_{k=1}^{r} \sum_{α∈I_k} O\Big( \frac{k(2d − 1)^{r−1}}{n^{k+1}} \Big)
    = \sum_{k=1}^{r} \frac{[n]_k\,a(d,k)}{2k}\,O\Big( \frac{k(2d − 1)^{r−1}}{n^{k+1}} \Big)
    = \sum_{k=1}^{r} O\Big( \frac{(2d − 1)^{r+k−1}}{n} \Big) = O\Big( \frac{(2d − 1)^{2r−1}}{n} \Big).   (25)

Last, we must bound \sum_{α∈I} \sum_{β∈I_α^0} \mathrm{Cov}(I_α, I_β). For any word w, let e_i^w be the number of appearances of π_i and π_i^{−1} in w. Let α and β be cycles with words w
and u respectively, and let k = |α| and j = |β|. Suppose that β ∈ I_α^0. Then

    \mathrm{Cov}(I_α, I_β) = \prod_{i=1}^{d} \frac{1}{[n]_{e_i^w + e_i^u}} − \prod_{i=1}^{d} \frac{1}{[n]_{e_i^w}[n]_{e_i^u}} ≤ \frac{⟨e^w, e^u⟩}{n} \prod_{i=1}^{d} \frac{1}{[n]_{e_i^w + e_i^u}} ≤ \frac{⟨e^w, e^u⟩}{n\,[n]_{k+j}}

by Lemma 22. For any pair of words w ∈ W_k and u ∈ W_j, there are at most [n]_k[n]_j pairs of cycles α, β ∈ I with words w and u, respectively. Enumerating over all w ∈ W_k and u ∈ W_j, we count each pair of cycles α, β exactly 4kj times.
Thus

    \sum_{α∈I_k} \sum_{β∈I_α^0∩I_j} \mathrm{Cov}(I_α, I_β) ≤ \frac{[n]_k[n]_j}{4kj\,n\,[n]_{k+j}} \sum_{w∈W_k} \sum_{u∈W_j} ⟨e^w, e^u⟩ ≤ \frac{1 + O(r^2/n)}{4kjn} \Big\langle \sum_{w∈W_k} e^w,\ \sum_{u∈W_j} e^u \Big\rangle.

The vector \sum_{w∈W_k} e^w has every entry equal by symmetry, as does \sum_{u∈W_j} e^u. Thus each entry of \sum_{w∈W_k} e^w is ka(d,k)/d, and each entry of \sum_{u∈W_j} e^u is ja(d,j)/d. The inner product in the above equation comes to kj\,a(d,k)a(d,j)/d, giving us

    \sum_{α∈I_k} \sum_{β∈I_α^0∩I_j} \mathrm{Cov}(I_α, I_β) ≤ \frac{a(d,k)a(d,j)(1 + O(r^2/n))}{4dn} = O\Big( \frac{(2d − 1)^{j+k−1}}{n} \Big).
Summing over all 1 ≤ k, j ≤ r,

    \sum_{α∈I} \sum_{β∈I_α^0} \mathrm{Cov}(I_α, I_β) = O\Big( \frac{(2d − 1)^{2r−1}}{n} \Big).   (26)

We can now combine equations (22), (23), (25), and (26) with Proposition 20 to show that

    d_{TV}(I, Y) = O\Big( \frac{(2d − 1)^{2r−1}}{n} \Big).   (27)
Step 3. Approximation of Y by Z.
By Lemma 21 and (21),

    d_{TV}(Y, Z) ≤ \sum_{α∈I} |EY_α − EZ_α| ≤ \sum_{k=1}^{r} \frac{[n]_k\,a(d,k)}{2k} \Big( \frac{1}{[n]_k} − \frac{1}{n^k} \Big) = \sum_{k=1}^{r} \frac{a(d,k)}{2k} \Big( 1 − \frac{[n]_k}{n^k} \Big).

Since [n]_k ≥ n^k(1 − k^2/2n),

    d_{TV}(Y, Z) ≤ \sum_{k=1}^{r} \frac{a(d,k)\,k}{4n} = O\Big( \frac{r(2d − 1)^{r}}{n} \Big).
Together with (27), this bounds the total variation distance between the laws of I and Z and proves the theorem. □
The distributions of any functionals of I and Z satisfy the same bound in total variation distance. This gives us several results as easy corollaries, including an improvement on [DJPP12, Theorem 11].

Corollary 24.
i) Let (Z_k, 1 ≤ k ≤ r) be a vector of independent Poisson random variables with EZ_k = a(d,k)/2k. Let C_k denote the number of k-cycles in G_n, a 2d-regular permutation random graph on n vertices. Then for some absolute constant c,

    d_{TV}\big( (C_k,\ 1 ≤ k ≤ r),\ (Z_k,\ 1 ≤ k ≤ r) \big) ≤ \frac{c(2d − 1)^{2r−1}}{n}.

ii) Let (Z_w, w ∈ W′_K) be a vector of independent Poisson random variables with EZ_w = 1/h(w). Let C_w denote the number of cycles with word w in G_n, a 2d-regular permutation random graph on n vertices. Then for some absolute constant c,

    d_{TV}\big( (C_w,\ w ∈ W′_K),\ (Z_w,\ w ∈ W′_K) \big) ≤ \frac{c(2d − 1)^{2K−1}}{n}.

Proof. Observe that C_k = \sum_{α∈I_k} I_α, and that if we define Z_k = \sum_{α∈I_k} Z_α, then (Z_k, 1 ≤ k ≤ r) is distributed as described. Thus (i) follows from Theorem 14.

To prove (ii), note that C_w = \sum_α I_α, where the sum is over all cycles in I with word w. We then define Z_w as the analogous sum over Z_α. Since the number of cycles in I with word w is [n]_k/h(w), we have EZ_w = 1/h(w), and the total variation bound follows from Theorem 14. □
We can also use Theorem 14 to bound the likelihood that G_n contains two overlapping cycles of size r or less.

Corollary 25. Let G_n be a 2d-regular permutation random graph on n vertices. Let E be the event that G_n contains two cycles of length r or less with a vertex in common. Then for some absolute constant c′, for all d ≥ 2 and n, r ≥ 1,

    P[E] ≤ \frac{c′(2d − 1)^{2r}}{n}.

Proof. Let E′ be the event that Z_α = Z_β = 1 for two cycles α, β ∈ I that have a vertex in common. By Theorem 14,

    P[E] ≤ P[E′] + \frac{c(2d − 1)^{2r−1}}{n}.

For any cycle α ∈ I_k, there are at most k[n−1]_{j−1}\,a(d,j) cycles in I_j that share a vertex with α. For any such cycle β, the chance that Z_α = 1 and Z_β = 1 is less than 1/([n]_k[n]_j). By a union bound,

    P[E′] ≤ \sum_{k=1}^{r} \frac{a(d,k)[n]_k}{2k} \sum_{j=1}^{r} \frac{k[n−1]_{j−1}\,a(d,j)}{[n]_k[n]_j} ≤ \sum_{k=1}^{r} \sum_{j=1}^{r} \frac{a(d,k)a(d,j)}{2n} = O\Big( \frac{(2d − 1)^{2r}}{n} \Big). □
Proof of Corollary 15. When d = 1, there is only one word of each length in W′_K, and statement (i) reduces to the well-known fact that the cycle counts of a random permutation converge to independent Poisson random variables (see [AT92] for much more on this subject). In this case, G(t) is made up of disjoint cycles for all times t, so that statement (ii) is trivially satisfied.

When d ≥ 2, let C_w^{(n)} be the number of cycles with word w in G_n, as in Corollary 24ii. The random vector (C_w(t), w ∈ W′_K) is a mixture of the random vectors (C_w^{(n)}, w ∈ W′_K) over different values of n. That is,

    P\big[ (C_w(t),\ w ∈ W′_K) ∈ A \big] = \sum_{n=1}^{∞} P[M_t = n]\,P\big[ (C_w^{(n)},\ w ∈ W′_K) ∈ A \big]

for any set A, recalling that G(t) = G_{M_t}. Corollary 24ii together with the fact that P[M_t > N] → 1 as t → ∞ for any N imply that (C_w(t), w ∈ W′_K) converges in law to (Z_w, w ∈ W′_K), establishing statement (i). Statement (ii) follows in the same way from Corollary 25. □
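The d = 1 fact cited from [AT92] can be illustrated in miniature: for a uniform permutation of [n], the expected number of k-cycles is exactly 1/k for every k ≤ n (the means of the limiting independent Poisson(1/k) variables). An exact enumeration over S_6, as a sketch:

```python
from itertools import permutations
from fractions import Fraction

def cycle_lengths(perm):
    # perm is a tuple giving the images of 1..n; return the cycle lengths
    n = len(perm)
    seen, lengths = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x - 1]
            length += 1
        lengths.append(length)
    return lengths

n = 6
totals = {k: 0 for k in range(1, n + 1)}
count = 0
for perm in permutations(range(1, n + 1)):
    count += 1
    for length in cycle_lengths(perm):
        totals[length] += 1

for k in range(1, n + 1):
    assert Fraction(totals[k], count) == Fraction(1, k)
print("E[# of k-cycles] = 1/k for k = 1..6")
```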
Proof of Lemma 21. We will apply the Stein-Chen method directly. Define the operator A by

    Ah(x) = \sum_{α∈I} E[Z_α]\big( h(x + e_α) − h(x) \big) + \sum_{α∈I} x_α\big( h(x − e_α) − h(x) \big)

for any h: Z_+^{|I|} → R and x ∈ Z_+^{|I|}. This is the Stein operator for the law of Z, and EAh(Z) = 0 for any bounded function h. By Proposition 10.1.2 and Lemma 10.1.3 in [BHJ92], for any set A ⊆ Z_+^{|I|}, there is a function h such that

    Ah(x) = 1\{x ∈ A\} − P[Z ∈ A],

and this function has the property that

    \sup_{x∈Z_+^{|I|},\ α∈I} |h(x + e_α) − h(x)| ≤ 1.   (28)

Thus we can bound the total variation distance between the laws of Y and Z by bounding |EAh(Y)| over all such functions h.

We write Ah(Y) as

    Ah(Y) = \sum_{α∈I} E[Y_α]\big( h(Y + e_α) − h(Y) \big) + \sum_{α∈I} Y_α\big( h(Y − e_α) − h(Y) \big) + \sum_{α∈I} \big( EZ_α − EY_α \big)\big( h(Y + e_α) − h(Y) \big).

The first two of these sums have expectation zero, so

    |EAh(Y)| ≤ \sum_{α∈I} |EZ_α − EY_α|\,E|h(Y + e_α) − h(Y)|.

By (28), |h(Y + e_α) − h(Y)| ≤ 1, which proves the lemma. □
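For a single coordinate, Lemma 21 reduces to d_TV(Poisson(λ), Poisson(μ)) ≤ |λ − μ|, which can be checked directly from the mass functions:

```python
from math import exp

def poisson_pmf(lam, kmax):
    # P[Z = 0], P[Z = 1], ..., P[Z = kmax - 1] for Z ~ Poisson(lam)
    probs, p = [], exp(-lam)
    for k in range(kmax):
        probs.append(p)
        p *= lam / (k + 1)
    return probs

def tv_distance(lam, mu, kmax=200):
    # total variation distance, exact up to the truncated tail
    p, q = poisson_pmf(lam, kmax), poisson_pmf(mu, kmax)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

for lam, mu in [(1.0, 1.5), (2.0, 2.1), (0.2, 3.0)]:
    assert 0 < tv_distance(lam, mu) <= abs(lam - mu) + 1e-9
print("d_TV(Poisson(lam), Poisson(mu)) <= |lam - mu| on the tested pairs")
```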
Proof of Lemma 22. We define a family of independent random maps σ_i and τ_i for 1 ≤ i ≤ d. Choose σ_i uniformly from all injective maps from [a_i] to [n], and choose τ_i uniformly from all injective maps from [b_i] to [n]. Effectively, σ_i and τ_i are random ordered subsets of [n]. We say that σ_i and τ_i clash if their images overlap. Then

    P[σ_i \text{ and } τ_i \text{ clash for some } i] = 1 − \prod_{i=1}^{d} \frac{[n]_{a_i+b_i}}{[n]_{a_i}[n]_{b_i}}.

For any 1 ≤ i ≤ d, 1 ≤ j ≤ a_i, and 1 ≤ k ≤ b_i, the probability that σ_i(j) = τ_i(k) is 1/n. By a union bound,

    P[σ_i \text{ and } τ_i \text{ clash for some } i] ≤ \sum_{i=1}^{d} \frac{a_i b_i}{n} = \frac{⟨a, b⟩}{n}.

We finish the proof by dividing both sides of this inequality by \prod_{i=1}^{d} [n]_{a_i+b_i}. □
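The clash computation can be checked by exact enumeration of injective maps for one coordinate (d = 1): the probability of disjoint images is [n]_{a+b}/([n]_a[n]_b), and the clash probability is at most ab/n. A small check with exact arithmetic:

```python
from itertools import permutations
from fractions import Fraction

def falling(n, k):
    out = 1
    for i in range(k):
        out *= n - i
    return out

n, a, b = 6, 2, 3
clash = total = 0
for sigma in permutations(range(n), a):      # injective maps [a] -> [n]
    for tau in permutations(range(n), b):    # injective maps [b] -> [n]
        total += 1
        if set(sigma) & set(tau):
            clash += 1

no_clash = Fraction(total - clash, total)
assert no_clash == Fraction(falling(n, a + b), falling(n, a) * falling(n, b))
assert Fraction(clash, total) <= Fraction(a * b, n)
print("P[no clash] =", no_clash)
```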
References
[ANvM11] Mark Adler, Eric Nordenstam, and Pierre van Moerbeke. The Dyson Brownian minor
process. Preprint. Available at arXiv:1006.2956, 2011.
[AT92] Richard Arratia and Simon Tavaré. The cycle structure of random permutations. Ann.
Probab., 20(3):1567–1591, 1992.
[BAD11] Gérard Ben Arous and Kim Dang. On fluctuations of eigenvalues of random permutation matrices. 2011. Preprint. Available at arXiv:1106.2108.
[BF08] Alexei Borodin and Patrik Ferrari. Anisotropic growth of random surfaces in 2+1
dimensions. Preprint. Available at arXiv:0804.3035, 2008.
[BHJ92] A. D. Barbour, Lars Holst, and Svante Janson. Poisson approximation, volume 2 of
Oxford Studies in Probability. The Clarendon Press Oxford University Press, New
York, 1992. Oxford Science Publications.
[Bil99] Patrick Billingsley. Convergence of probability measures. Wiley Series in Probability
and Statistics: Probability and Statistics. John Wiley & Sons Inc., New York, second
edition, 1999. A Wiley-Interscience Publication.
[BNN11] Paul Bourgade, Joseph Najnudel, and Ashkan Nikeghbali. A unitary extension of vir-
tual permutations. Preprint. Available at arXiv:1102.2633, 2011.
[Bor10a] Alexei Borodin. CLT for spectra of submatrices of Wigner random matrices. Preprint.
Available at arXiv:1010.0898, 2010.
[Bor10b] Alexei Borodin. CLT for spectra of submatrices of Wigner random matrices II. Sto-
chastic evolution. Preprint. Available at arXiv:1011.3544, 2010.
[CPY98] Philippe Carmona, Frédérique Petit, and Marc Yor. Beta-gamma random variables and
intertwining relations between certain Markov processes. Rev. Mat. Iberoamericana,
14(2):311–367, 1998.
[DF90] Persi Diaconis and James Allen Fill. Strong stationary times via a new form of duality.
Ann. Probab., 18(4):1483–1522, 1990.
[DJPP12] Ioana Dumitriu, Tobias Johnson, Soumik Pal, and Elliot Paquette. Functional limit
theorems for random regular graphs. Probab. Theory Related Fields, pages 1–55, 2012.
Published online, 25 August 2012.
[DP12] Ioana Dumitriu and Soumik Pal. Sparse regular random graphs: Spectral density and
eigenvectors. Ann. Probab., 40(5):2197–2235, 2012.
[EK86] Stewart N. Ethier and Thomas G. Kurtz. Markov processes. Wiley Series in Probability
and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley &
Sons Inc., New York, 1986. Characterization and convergence.
[Elo08] Yehonatan Elon. Eigenvectors of the discrete Laplacian on regular graphs—a statistical
approach. J. Phys. A, 41(43):435203, 17, 2008.
[Elo10] Yehonatan Elon. Gaussian waves on the regular tree. Preprint. Available at
arXiv:0907.5065, 2010.
[ES10] Yehonatan Elon and Uzy Smilansky. Percolating level sets of the adjacency eigenvectors
of d-regular graphs. J. Phys. A, 43(45):455209, 13, 2010.
[Fer10] Patrik L. Ferrari. From interacting particle systems to random matrices. J. Stat. Mech.
Theory Exp., (10):P10016, 15, 2010.
Page 37
CYCLES AND EIGENVALUES 37
[FF10] Patrik L. Ferrari and René Frings. On the partial connection between random matrices
and interacting particle systems. J. Stat. Phys., 141(4):613–637, 2010.
[JMRR99] Dmitry Jakobson, Stephen D. Miller, Igor Rivin, and Zeév Rudnick. Eigenvalue spacings for regular graphs. In Emerging applications of number theory (Minneapolis, MN,
1996), volume 109 of IMA Vol. Math. Appl., pages 317–327. Springer, New York, 1999.
[JN06] Kurt Johansson and Eric Nordenstam. Eigenvalues of GUE minors. Electron. J.
Probab., 11:no. 50, 1342–1371, 2006.
[KOV04] Sergei Kerov, Grigori Olshanski, and Anatoly Vershik. Harmonic analysis on the infi-
nite symmetric group. Invent. Math., 158(3):551–642, 2004.
[LP10] Nati Linial and Doron Puder. Word maps and spectra of random graph lifts. Random
Structures Algorithms, 37(1):100–135, 2010.
[MNS08] Steven J. Miller, Tim Novikoff, and Anthony Sabelli. The distribution of the largest
nontrivial eigenvalues in families of random regular graphs. Experiment. Math.,
17(2):231–244, 2008.
[OGS09] Idan Oren, Amit Godel, and Uzy Smilansky. Trace formulae and spectral statistics for
discrete Laplacians on regular graphs. I. J. Phys. A, 42(41):415101, 20, 2009.
[OS10] Idan Oren and Uzy Smilansky. Trace formulas and spectral statistics for discrete Lapla-
cians on regular graphs (II). J. Phys. A, 43(22):225205, 13, 2010.
[Pit06] Jim Pitman. Combinatorial stochastic processes, volume 1875 of Lecture Notes in
Mathematics. Springer-Verlag, Berlin, 2006. Lectures from the 32nd Summer School
on Probability Theory held in Saint-Flour, July 7–24, 2002, With a foreword by Jean
Picard.
[She07] Scott Sheffield. Gaussian free fields for mathematicians. Probability Theory and Related
Fields, 139(3-4):521–541, 2007.
[Smi10] Uzy Smilansky. Discrete graphs - a paradigm model for quantum chaos. In Séminaire
Poincaré, volume XIV, pages 1–26. 2010.
[Spo98] Herbert Spohn. Dyson's model of interacting Brownian motions at arbitrary coupling
strength. In Markov Proc. Rel. Fields, pages 649–661, 1998.
[TVW13] Linh V. Tran, Van H. Vu, and Ke Wang. Sparse random graphs: Eigenvalues and
eigenvectors. Random Structures Algorithms, 42(1):110–134, 2013.
[Wie00] Kelly Wieand. Eigenvalue distributions of random permutation matrices. Ann.
Probab., 28(4):1563–1587, 2000.
[Wor99] Nicholas C. Wormald. Models of random regular graphs. In Surveys in combinatorics,
1999 (Canterbury), volume 267 of London Math. Soc. Lecture Note Ser., pages 239–
298. Cambridge Univ. Press, Cambridge, 1999.
Department of Mathematics, University of Washington, Seattle, WA 98195
E-mail address: toby@math.washington.edu
Department of Mathematics, University of Washington, Seattle, WA 98195
E-mail address: soumik@u.washington.edu